
    • Hands-On Large Language Models (English-language photoreprint edition)
      • Authors: Jay Alammar, Maarten Grootendorst | Responsible editor: Zhang Ye
      • Publisher: Southeast University Press
      • ISBN: 9787576617665
      • Publication date: 2025/02/01
      • Pages: 403
    • Price: ¥74.40
  • Content Overview

        Over the past few years, AI has gained surprising new language capabilities. Driven by rapid advances in deep learning, language AI systems can write and understand text better than ever before. This trend is enabling new features, new products, and even new industries. Through the book's highly visual, educational approach, readers will learn the practical tools and concepts needed to put these capabilities to use today.
        You will learn how to apply pretrained large language models to use cases such as copywriting and summarization, build semantic search systems that go beyond keyword matching, and use existing libraries and pretrained models for text classification, search, and clustering.
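        As a rough illustration of what "using existing libraries and pretrained models" means in practice, the sketch below (not taken from the book; the checkpoint name and task are illustrative assumptions) classifies a sentence with the Hugging Face transformers pipeline API:

            # Minimal sketch (illustrative, not from the book): sentiment
            # classification with a pretrained model through the Hugging Face
            # transformers pipeline API. The checkpoint is an assumption; any
            # compatible sentiment-classification model can be substituted.
            from transformers import pipeline

            classifier = pipeline(
                "sentiment-analysis",
                model="distilbert-base-uncased-finetuned-sst-2-english",
            )

            result = classifier("The visual explanations make these concepts easy to follow.")
            print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]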
  • About the Authors

  • Table of Contents

    Preface
    Part I.  Understanding Language Models
      1. An Introduction to Large Language Models
        What Is Language AI?
        A Recent History of Language AI
          Representing Language as a Bag-of-Words
          Better Representations with Dense Vector Embeddings
          Types of Embeddings
          Encoding and Decoding Context with Attention
          Attention Is All You Need
          Representation Models: Encoder-Only Models
          Generative Models: Decoder-Only Models
          The Year of Generative AI
        The Moving Definition of a "Large Language Model"
        The Training Paradigm of Large Language Models
        Large Language Model Applications: What Makes Them So Useful?
        Responsible LLM Development and Usage
        Limited Resources Are All You Need
        Interfacing with Large Language Models
          Proprietary, Private Models
          Open Models
          Open Source Frameworks
        Generating Your First Text
        Summary
      2. Tokens and Embeddings
        LLM Tokenization
          How Tokenizers Prepare the Inputs to the Language Model
          Downloading and Running an LLM
        How Does the Tokenizer Break Down Text?
        Word Versus Subword Versus Character Versus Byte Tokens
          Comparing Trained LLM Tokenizers
        Tokenizer Properties
        Token Embeddings
        A Language Model Holds Embeddings for the Vocabulary of Its Tokenizer
          Creating Contextualized Word Embeddings with Language Models
        Text Embeddings (for Sentences and Whole Documents)
        Word Embeddings Beyond LLMs
        Using Pretrained Word Embeddings
        The Word2vec Algorithm and Contrastive Training
        Embeddings for Recommendation Systems
        Recommending Songs by Embeddings
        Training a Song Embedding Model
        Summary
      3. Looking Inside Large Language Models
        An Overview of Transformer Models
        The Inputs and Outputs of a Trained Transformer LLM
        The Components of the Forward Pass
          Choosing a Single Token from the Probability Distribution (Sampling/Decoding)
        Parallel Token Processing and Context Size
        Speeding Up Generation by Caching Keys and Values
        Inside the Transformer Block
        Recent Improvements to the Transformer Architecture
        More Efficient Attention
        The Transformer Block
        Positional Embeddings (RoPE)
        Other Architectural Experiments and Improvements
        Summary
    Part II.  Using Pretrained Language Models
      4. Text Classification
        The Sentiment of Movie Reviews
        Text Classification with Representation Models
        Model Selection
        Using a Task-Specific Model
        Classification Tasks That Leverage Embeddings
        Supervised Classification
        What If We Do Not Have Labeled Data?
        Text Classification with Generative Models
        Using the Text-to-Text Transfer Transformer
        ChatGPT for Classification
        Summary
      5. Text Clustering and Topic Modeling
        ArXiv's Articles: Computation and Language
        A Common Pipeline for Text Clustering
        Embedding Documents
        Reducing the Dimensionality of Embeddings
        Cluster the Reduced Embeddings
        Inspecting the Clusters
        From Text Clustering to Topic Modeling
        BERTopic: A Modular Topic Modeling Framework
        Adding a Special Lego Block
        The Text Generation Lego Block
        Summary
      6. Prompt Engineering
        Using Text Generation Models
        Choosing a Text Generation Model
        Loading a Text Generation Model
        Controlling Model Output
        Intro to Prompt Engineering
        The Basic Ingredients of a Prompt
        Instruction-Based Prompting
        Advanced Prompt Engineering
        The Potential Complexity of a Prompt
        In-Context Learning: Providing Examples
        Chain Prompting: Breaking up the Problem
        Reasoning with Generative Models
        Chain-of-Thought: Think Before Answering
        Self-Consistency: Sampling Outputs
        Tree-of-Thought: Exploring Intermediate Steps
        Output Verification
        Providing Examples
        Grammar: Constrained Sampling
        Summary
      7. Advanced Text Generation Techniques and Tools
        Model I/O: Loading Quantized Models with LangChain
        Chains: Extending the Capabilities of LLMs
        A Single Link in the Chain: Prompt Template
        A Chain with Multiple Prompts
        Memory: Helping LLMs to Remember Conversations
        Conversation Buffer
        Windowed Conversation Buffer
        Conversation Summary
        Agents: Creating a System of LLMs
        The Driving Power Behind Agents: Step-by-step Reasoning
        ReAct in LangChain
        Summary
      8. Semantic Search and Retrieval-Augmented Generation
        Overview of Semantic Search and RAG
        Semantic Search with Language Models
        Dense Retrieval
        Reranking
        Retrieval Evaluation Metrics
        Retrieval-Augmented Generation (RAG)
        From Search to RAG
        Example: Grounded Generation with an LLM API
        Example: RAG with Local Models
        Advanced RAG Techniques
        RAG Evaluation
        Summary
      9. Multimodal Large Language Models
        Transformers for Vision
        Multimodal Embedding Models
        CLIP: Connecting Text and Images
        How Can CLIP Generate Multimodal Embeddings?
        OpenCLIP
        Making Text Generation Models Multimodal
        BLIP-2: Bridging the Modality Gap
        Preprocessing Multimodal Inputs
        Use Case 1: Image Captioning
        Use Case 2: Multimodal Chat-Based Prompting
        Summary
    Part III.  Training and Fine-Tuning Language Models
      10. Creating Text Embedding Models
        Embedding Models
        What Is Contrastive Learning?
        SBERT
        Creating an Embedding Model
          Generating Contrastive Examples
          Train Model
          In-Depth Evaluation
          Loss Functions
        Fine-Tuning an Embedding Model
          Supervised
          Augmented SBERT
        Unsupervised Learning
          Transformer-Based Sequential Denoising Auto-Encoder
          Using TSDAE for Domain Adaptation
        Summary
      11. Fine-Tuning Representation Models for Classification
        Supervised Classification
          Fine-Tuning a Pretrained BERT Model
          Freezing Layers
        Few-Shot Classification
          SetFit: Efficient Fine-Tuning with Few Training Examples
          Fine-Tuning for Few-Shot Classification
        Continued Pretraining with Masked Language Modeling
        Named-Entity Recognition
          Preparing Data for Named-Entity Recognition
          Fine-Tuning for Named-Entity Recognition
        Summary
      12. Fine-Tuning Generation Models
        The Three LLM Training Steps: Pretraining, Supervised Fine-Tuning, and Preference Tuning
        Supervised Fine-Tuning (SFT)
          Full Fine-Tuning
          Parameter-Efficient Fine-Tuning (PEFT)
        Instruction Tuning with QLoRA
          Templating Instruction Data
          Model Quantization
          LoRA Configuration
          Training Configuration
          Training
          Merge Weights
        Evaluating Generative Models
          Word-Level Metrics
          Benchmarks
          Leaderboards
          Automated Evaluation
          Human Evaluation
        Preference-Tuning / Alignment / RLHF
        Automating Preference Evaluation Using Reward Models
          The Inputs and Outputs of a Reward Model
          Training a Reward Model
          Training No Reward Model
        Preference Tuning with DPO
          Templating Alignment Data
          Model Quantization
          Training Configuration
          Training
        Summary
    Afterword

    Index