LLM & Transformer Interview Essentials A-Z: A Silicon Valley Insider’s Guide by X. Fang
English | November 22, 2024 | ISBN: N/A | ASIN: B0DNTJ4ZNG | 366 pages | EPUB | 15 MB
Ever since the release of ChatGPT in November 2022, the advances in AI and its rise in the public consciousness have been tremendous. At the heart of these epochal changes lies a single machine learning component: the transformer. Like the transistor of the 1960s, the transformer of the 2020s has given us an efficient, composable structure for solving general problems, and it can be replicated, scaled up, modularized, and miniaturized however we might need. While the large language models (LLMs) that underpin products like ChatGPT are the most popular application, they are but one configuration of the transformer.
This book is written for the engineers, machine learning scientists, data scientists, and technologists who are either working with LLMs and transformers or trying to break into the field. The technology is so new that machine learning interviews for these positions have not yet been standardized and commoditized to the level of LeetCode, so broad familiarity with the core concepts is required. Indeed, it is possible to be an expert on one aspect of the LLM space yet still be blindsided by a comparatively rudimentary question on another.
Table of Contents
I. Architecture Fundamentals
Chapter 1. A ⇒ Attention
Chapter 2. V ⇒ Vanilla Transformer
Chapter 3. E ⇒ Embeddings
Chapter 4. C ⇒ Chinchilla Scaling Laws
Chapter 5. I ⇒ InstructGPT
Chapter 6. R ⇒ RoPE
Chapter 7. M ⇒ Mixture of Experts
II. Lossless Optimizations
Chapter 8. K ⇒ KV Cache
Chapter 9. H ⇒ H100
Chapter 10. F ⇒ FlashAttention
Chapter 11. N ⇒ NCCL
Chapter 12. P ⇒ Pipeline Parallelism
Chapter 13. T ⇒ Tensor Parallelism
Chapter 14. Z ⇒ ZeRO
III. Lossy Optimizations
Chapter 15. Q ⇒ Quantization
Chapter 16. W ⇒ WxAyKVz
Chapter 17. G ⇒ GPTQ
Chapter 18. L ⇒ LoRA
Chapter 19. B ⇒ BitNet
Chapter 20. D ⇒ Distillation
Chapter 21. S ⇒ Structured Sparsity