微盛 | Team Tech Blog

Large Models: Fundamentals and Application Development Frameworks in Practice

向驰

2023-07-05

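Goals of this talk

  1. Down to earth: possible application scenarios for large models in 企微管家
  2. Finding clarity: tracing the lineage of large models and where they are heading
  3. A niche amid the hype: hands-on practice with application development frameworks built on large models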

I. Case Demo 1

Scenario 1: Customer service & local knowledge base
[Screenshot]

Scenario 2: Content summarization
[Screenshots]

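II. History

Is a straight line really the shortest path between two points?

GPT's main innovation at the model level was applying the Transformer architecture (decoder-only, with masked self-attention) to text generation and pre-training it as an autoregressive language model on a large unlabeled corpus, which improved both output quality and generalization.
Large model = foundation model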

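  1. A long road ahead: quality (enterprise-ready; vertical domains), controllability (hallucination, alignment), timeliness, cost (training + inference)

    For example, BLOOM-176B, an open-source multilingual LLM that is licensed for commercial use, needs 8 × 80GB A100 GPUs just to run inference; fine-tuning BLOOM-176B takes 72 A100 GPUs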
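The GPU counts above follow from simple memory arithmetic. The sketch below is a rough lower bound only, and both per-parameter figures are assumptions: 2 bytes per parameter for 16-bit inference weights, and the commonly cited rule of thumb of about 16 bytes per parameter for mixed-precision Adam fine-tuning (weights, gradients, and optimizer states). Activations and the KV cache are ignored, so real requirements are higher.

```python
import math

GPU_MEMORY_GB = 80   # one A100 80GB card
N_PARAMS = 176e9     # BLOOM-176B


def memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for per-parameter state, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def min_gpus(total_gb: float, gpu_gb: float = GPU_MEMORY_GB) -> int:
    """Smallest number of cards whose combined memory covers total_gb."""
    return math.ceil(total_gb / gpu_gb)


inference_gb = memory_gb(N_PARAMS, 2)   # assumption: fp16/bf16 weights only
finetune_gb = memory_gb(N_PARAMS, 16)   # assumption: weights + grads + Adam states

print(inference_gb, min_gpus(inference_gb))
print(finetune_gb, min_gpus(finetune_gb))
```

Under these assumptions the weights alone occupy 352 GB (at least 5 cards), so 8 cards leave headroom for activations and the KV cache. The fine-tuning state comes to about 2,816 GB (at least 36 cards) before activations and parallelism overhead, which is the same order of magnitude as the 72 cards quoted above.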

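  2. No silver bullet. Where mature labeled data exists, fine-tuning a model may still be the best solution for traditional tasks.
    1. BERT performs better on sentiment analysis, news classification, named entity recognition, and text summarization
    2. GPT models perform better on text generation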

  3. Just part of the world: GPT-native applications...

III. Where Is the Road Heading?

Transcript of Lu Qi's latest talk: "My Worldview on Large Models"

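IV. Application Frameworks

LangChain, Toolformer, HuggingGPT, AutoGPT, BabyAGI…

LangChain is a framework for developing applications powered by language models. It offers two main capabilities:

  1. It connects an LLM to external data sources
  2. It lets the LLM interact with its environment (planning, scheduling, and cooperation)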

3.1 LLMs

Custom LLM

# Example 1: a toy LLM that returns the first n characters of the prompt
from typing import Any, List, Mapping, Optional

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM


class CustomLLM(LLM):
    n: int

    @property
    def _llm_type(self) -> str:
        return "custom"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
    ) -> str:
        if stop is not None:
            raise ValueError("stop kwargs are not permitted.")
        return prompt[: self.n]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        return {"n": self.n}


llm = CustomLLM(n=10)
llm("This is a foobar thing")  # returns 'This is a '

# Example 2: a rule-based "LLM" that does simple arithmetic or mirrors a question back
import re
from typing import Any, List, Mapping, Optional

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM


class TfboyLLM(LLM):
    @property
    def _llm_type(self) -> str:
        return "custom"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
    ) -> str:
        print("问题:", prompt)  # "Question:"
        # Search without a greedy ".*" prefix so multi-digit operands (12+34) are captured whole
        match = re.search(r"\d+[*/+-]\d+", prompt)
        if match:
            # eval is tolerable only in a demo; never eval untrusted input in production.
            # _call must return a str, so convert the numeric result.
            result = str(eval(match.group(0)))
        elif "?" in prompt:
            # Swap 我 (I) and 你 (you), drop the question particle 吗, and turn "?" into "!"
            rep_args = {"我": "你", "你": "我", "吗": "", "?": "!"}
            result = "".join(rep_args.get(c, c) for c in prompt)
        else:
            # "Sorry, please rephrase. For example: what is 1+1"
            result = "很抱歉,请换一种问法。比如:1+1等于几"
        return result

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {}
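
The second example above hands text taken from the prompt to eval, which is fine for a demo but unsafe for real user input. A minimal sketch of a safer alternative built on the standard-library ast module follows; the helper name safe_arith is hypothetical, and only numbers and the four basic operators are supported:

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Map AST operator node types to the functions that implement them.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_arith(expr: str) -> Number:
    """Evaluate an expression made only of numbers and + - * /."""

    def _eval(node: ast.AST) -> Number:
        # Plain numeric literals are allowed (bool is excluded on purpose).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        # Binary operations are allowed only for the whitelisted operators.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")

    return _eval(ast.parse(expr, mode="eval").body)


print(safe_arith("1+1"))
print(safe_arith("12*3"))
```

Anything other than a number or a whitelisted operator, such as a function call or attribute access, raises ValueError instead of being executed.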

Swapping in a different model

from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# Load a local Hugging Face model and tokenizer
model_id = 'google/flan-t5-large'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=100,
)

# Wrap the pipeline so LangChain can use it like any other LLM
local_llm = HuggingFacePipeline(pipeline=pipe)
print(local_llm('What is the capital of France? '))

template = """Question: {question} Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

# The chain is unchanged; only the llm argument differs from an OpenAI-backed chain
llm_chain = LLMChain(prompt=prompt, llm=local_llm)
question = "What is the capital of England?"
print(llm_chain.run(question))

3.2 Chain

How do we make the application more complex? How do we handle concurrency? How do chains route to one another?

You are given the below API Documentation:{api_docs}

Using this documentation, generate the full API url to call for answering the user question.
You should build the API url in order to get a response that is as short as possible, while still getting the necessary information to answer the question. Pay attention to deliberately exclude any unnecessary pieces of data in the API call.

Question:{question}
API url: {api_url}

Here is the response from the API:{api_response}

Summarize this response to answer the original question.
Summary:
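
The prompt above implies a two-step chain: the model first writes the API URL, then, once the API has been called, summarizes the response. A minimal offline sketch of that flow follows. The function run_api_chain, the stand-ins fake_llm and fake_fetch, and the example URL are all hypothetical, and the prompt text is abridged from the one above:

```python
from typing import Callable

URL_PROMPT = (
    "You are given the below API Documentation:{api_docs}\n\n"
    "Using this documentation, generate the full API url to call "
    "for answering the user question.\n\n"
    "Question:{question}\n"
    "API url:"
)
SUMMARY_PROMPT = (
    URL_PROMPT + " {api_url}\n\n"
    "Here is the response from the API:{api_response}\n\n"
    "Summarize this response to answer the original question.\n"
    "Summary:"
)


def run_api_chain(
    llm: Callable[[str], str],
    fetch: Callable[[str], str],
    api_docs: str,
    question: str,
) -> str:
    # Step 1: ask the model which URL to call.
    api_url = llm(URL_PROMPT.format(api_docs=api_docs, question=question)).strip()
    # Step 2: call the API, then ask the model to summarize the response.
    api_response = fetch(api_url)
    summary_prompt = SUMMARY_PROMPT.format(
        api_docs=api_docs,
        question=question,
        api_url=api_url,
        api_response=api_response,
    )
    return llm(summary_prompt).strip()


# Stand-ins so the sketch runs without a model or a network connection.
def fake_llm(prompt: str) -> str:
    if prompt.endswith("Summary:"):
        return "It is sunny in Paris."
    return "https://api.example.test/weather?city=Paris"


def fake_fetch(url: str) -> str:
    return '{"city": "Paris", "weather": "sunny"}'


answer = run_api_chain(
    fake_llm, fake_fetch, "GET /weather?city=<name>", "What is the weather in Paris?"
)
print(answer)
```

Passing the model and the HTTP client in as plain callables keeps each step replaceable, which is the same idea a chain abstraction packages up.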

Tagging

from langchain.chat_models import ChatOpenAI
from langchain.chains import create_tagging_chain

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

# JSON-schema-style description of the tags to extract
schema = {
    "properties": {
        "sentiment": {"type": "string", "enum": ["happy", "neutral", "sad"]},
        "aggressiveness": {
            "type": "integer",
            "enum": [1, 2, 3, 4, 5],
            "description": "describes how aggressive the statement is, the higher the number the more aggressive",
        },
        "language": {
            "type": "string",
            "enum": ["spanish", "english", "french", "german", "italian"],
        },
    },
    "required": ["language", "sentiment", "aggressiveness"],
}
chain = create_tagging_chain(schema, llm)
inp = "Estoy increiblemente contento de haberte conocido! Creo que seremos muy buenos amigos!"
chain.run(inp)
# Recorded output: {'sentiment': 'happy', 'aggressiveness': 0, 'language': 'spanish'}
# Note: aggressiveness 0 falls outside the enum declared above, so validate the result before using it.
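
The recorded output above contains aggressiveness 0, which is not one of the values the schema allows, so it is worth checking the result before trusting it. A minimal sketch of such a check in plain Python follows; the helper validate_tags is hypothetical and covers only required fields and enum membership:

```python
from typing import Any, Dict, List


def validate_tags(schema: Dict[str, Any], tags: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the tags are valid."""
    problems = []
    for name in schema.get("required", []):
        if name not in tags:
            problems.append(f"missing required field: {name}")
    for name, value in tags.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}={value!r} not in {spec['enum']}")
    return problems


# Same enums as the tagging schema above (descriptions omitted).
schema = {
    "properties": {
        "sentiment": {"type": "string", "enum": ["happy", "neutral", "sad"]},
        "aggressiveness": {"type": "integer", "enum": [1, 2, 3, 4, 5]},
        "language": {
            "type": "string",
            "enum": ["spanish", "english", "french", "german", "italian"],
        },
    },
    "required": ["language", "sentiment", "aggressiveness"],
}

output = {"sentiment": "happy", "aggressiveness": 0, "language": "spanish"}
print(validate_tags(schema, output))
```

A non-empty result can be used to retry the call or to route the item to manual review.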

Local knowledge base

from langchain.chains import RetrievalQA
from langchain.document_loaders import DirectoryLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

# Load every .txt file under the directory
loader = DirectoryLoader('/content/sample_data/data/', glob='**/*.txt')
# Convert the data into Document objects; each file becomes one Document
documents = loader.load()

# Initialize the text splitter
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
# Split the loaded documents into chunks
split_docs = text_splitter.split_documents(documents)

# Initialize the OpenAI embeddings object
embeddings = OpenAIEmbeddings()
# Embed each chunk and keep the vectors in an in-memory Chroma vector store for similarity search
docsearch = Chroma.from_documents(split_docs, embeddings)

# Build the question-answering chain (RetrievalQA replaces the deprecated VectorDBQA)
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
    return_source_documents=True,
)
# Ask a question: "What was iFlytek's revenue in the first quarter of this year?"
result = qa({"query": "科大讯飞今年第一季度收入是多少?"})
print(result)
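
To show what the chain above does under the hood (split, embed, retrieve the closest chunk, stuff it into a prompt), here is a dependency-free sketch. It is illustrative only: the bag-of-words embed function is a toy stand-in for OpenAIEmbeddings, split_text is a simplified splitter that cuts on blank lines, and the sample document is invented.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def split_text(text: str, separator: str = "\n\n") -> List[str]:
    """Simplified splitter: one chunk per non-empty paragraph."""
    return [part.strip() for part in text.split(separator) if part.strip()]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Score every chunk against the query and return the k best."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Invented sample text, for illustration only.
document = (
    "Revenue in the first quarter grew compared with last year.\n\n"
    "The company opened a new office in Hefei."
)
question = "What was the first quarter revenue?"

chunks = split_text(document)
best_score, best_chunk = top_k(question, chunks)[0]

# "Stuff" the retrieved chunk into the prompt that would be sent to the LLM.
prompt = f"Answer using only this context:\n{best_chunk}\n\nQuestion: {question}"
print(prompt)
```

A real vector store replaces the linear scan with an index, and a real embedding model captures meaning rather than word overlap, but the data flow is the same.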

Data analysis

NOTE: For data-sensitive projects, you can specify return_direct=True in the SQLDatabaseChain initialization to directly return the output of the SQL query without any additional formatting. This prevents the LLM from seeing any contents within the database. Note, however, that the LLM still has access to the database schema (i.e. dialect, table and key names) by default.

from langchain import OpenAI, SQLDatabase, SQLDatabaseChain
db = SQLDatabase.from_uri("sqlite:///../../../../notebooks/Chinook.db")
llm = OpenAI(temperature=0, verbose=True)

db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run("How many employees are there?")
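
The effect of return_direct described in the note can be illustrated with the standard-library sqlite3 module. This is a sketch of the idea, not SQLDatabaseChain's actual implementation: run_sql_chain, fake_llm, and the employees table are hypothetical stand-ins.

```python
import sqlite3
from typing import Callable


def run_sql_chain(
    llm: Callable[[str], str],
    conn: sqlite3.Connection,
    question: str,
    return_direct: bool = False,
) -> str:
    # The model sees the schema, never the rows, when writing the query.
    schema = "\n".join(
        row[0]
        for row in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    )
    sql = llm(f"Schema:\n{schema}\n\nQuestion: {question}\nSQLQuery:").strip()
    rows = conn.execute(sql).fetchall()
    if return_direct:
        # Raw result goes straight to the caller; the model never sees table contents.
        return str(rows)
    # Otherwise the rows are sent back to the model to phrase a final answer.
    return llm(f"Question: {question}\nSQLResult: {rows}\nAnswer:").strip()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employees (name) VALUES (?)", [("Ann",), ("Bob",), ("Cy",)]
)


# Stand-in model so the sketch runs offline.
def fake_llm(prompt: str) -> str:
    if prompt.endswith("SQLQuery:"):
        return "SELECT COUNT(*) FROM employees"
    return "There are 3 employees."


formatted = run_sql_chain(fake_llm, conn, "How many employees are there?")
direct = run_sql_chain(fake_llm, conn, "How many employees are there?", return_direct=True)
print(formatted)
print(direct)
```

Because the SQL is model-generated, a real deployment would also want a read-only connection or a restricted database user.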


3.3 Agent

Definition: Some applications require a flexible chain of calls to LLMs and other tools based on user input. The Agent interface provides the flexibility for such applications. An agent has access to a suite of tools, and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next.

Answer the following questions as best you can. 
You have access to the following tools:
Calculator: Useful for when you need to answer questions about math.
Weather: Useful for when you want to know about the weather.

Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Calculator, Weather]
Action Input: the input to the action
Observation: the result of the action ...
(this Thought/Action/Action Input/Observation can repeat N times)

Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: What is the weather this week? And how old will I be in ten years? I am 28 this year.
Thought:
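
The prompt format above is what lets a program drive the loop: parse the Action and Action Input lines, run the named tool, append the Observation, and repeat until a Final Answer appears. A minimal sketch with a scripted stand-in for the model follows; run_agent, scripted_llm, and the canned tool results are hypothetical:

```python
import re
from typing import Callable, Dict, List, Tuple

ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")
FINAL_RE = re.compile(r"Final Answer:\s*(.+)", re.DOTALL)


def run_agent(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    scratchpad = f"Question: {question}\nThought:"
    for _ in range(max_steps):
        reply = llm(scratchpad)
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            raise ValueError(f"could not parse model output: {reply!r}")
        name, tool_input = action.group(1), action.group(2).strip()
        if name in tools:
            observation = tools[name](tool_input)
        else:
            observation = f"unknown tool: {name}"
        # Feed the observation back so the next step can build on it.
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    raise RuntimeError("agent did not finish within max_steps")


log: List[Tuple[str, str]] = []


def weather(query: str) -> str:
    log.append(("Weather", query))
    return "sunny all week"  # canned result for the sketch


def calculator(expr: str) -> str:
    log.append(("Calculator", expr))
    return str(sum(int(part) for part in expr.split("+")))  # addition only


# Scripted replies standing in for a real model.
replies = iter([
    " I need the weather first.\nAction: Weather\nAction Input: this week",
    " Now the age.\nAction: Calculator\nAction Input: 28 + 10",
    " I now know the final answer\nFinal Answer: Sunny all week; you will be 38.",
])


def scripted_llm(prompt: str) -> str:
    return next(replies)


answer = run_agent(
    scripted_llm,
    {"Weather": weather, "Calculator": calculator},
    "What is the weather this week, and how old will I be in ten years? I am 28.",
)
print(answer)
print(log)
```

The max_steps cap matters in practice: without it, a model that never emits a Final Answer would loop indefinitely.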

Main agent categories (agent_type)

  • Action agents: at each timestep, decide on the next action using the outputs of all previous actions
    • zero-shot-react-description: chooses a tool based on each tool's description and the content of the request (the most commonly used type)
    • react-docstore: uses the ReAct framework to interact with a docstore through two tools, Search (to find a document) and Lookup (to find a term in it); the Wikipedia tool is an example
    • self-ask-with-search: uses a single tool, Intermediate Answer, which looks up factual answers (answers that already exist on the web or in text, rather than ones generated by GPT); the Google search API tool is an example
    • conversational-react-description: designed for conversational settings; its prompt is written to be conversational, it still uses the ReAct framework to decide which tool to use, and it keeps past conversation turns in memory
    • OpenAI Functions: Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a function should be called and respond with the inputs that should be passed to the function. The OpenAI Functions Agent is designed to work with these models.
  • Plan-and-execute agents: decide on the full sequence of actions up front, then execute them all without updating the plan

3.4 Tools & Toolkits

How do you define a custom tool? How do you support multiple input arguments? How do you validate arguments? How do you prevent high-risk operations?

Definition: Tools are functions that agents can use to interact with the world. These tools can be generic utilities (e.g. search), other chains, or even other agents.

Custom tools

from typing import Optional, Type

from langchain import LLMMathChain, SerpAPIWrapper
from langchain.agents import AgentType, initialize_agent
from langchain.callbacks.manager import AsyncCallbackManagerForToolRun, CallbackManagerForToolRun
from langchain.chat_models import ChatOpenAI
from langchain.tools import BaseTool
from pydantic import BaseModel, Field

# Objects the tools below rely on; the original snippet used them without defining them
llm = ChatOpenAI(temperature=0)
search = SerpAPIWrapper()
llm_math_chain = LLMMathChain.from_llm(llm, verbose=True)


class CalculatorInput(BaseModel):
    # Assumed schema: a single free-text math question
    query: str = Field(description="a math question")


class CustomSearchTool(BaseTool):
    name = "custom_search"
    description = "useful for when you need to answer questions about current events"

    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
        """Use the tool."""
        return search.run(query)

    async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("custom_search does not support async")


class CustomCalculatorTool(BaseTool):
    name = "Calculator"
    description = "useful for when you need to answer questions about math"
    args_schema: Type[BaseModel] = CalculatorInput

    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
        """Use the tool."""
        return llm_math_chain.run(query)

    async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("Calculator does not support async")


tools = [CustomSearchTool(), CustomCalculatorTool()]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")


# You can provide a custom args schema to add descriptions or custom validation
class SearchSchema(BaseModel):
    query: str = Field(description="should be a search query")
    engine: str = Field(description="should be a search engine")
    gl: str = Field(description="should be a country code")
    hl: str = Field(description="should be a language code")


# Multi-argument variant of the search tool (redefines CustomSearchTool above)
class CustomSearchTool(BaseTool):
    name = "custom_search"
    description = "useful for when you need to answer questions about current events"
    args_schema: Type[SearchSchema] = SearchSchema

    def _run(self, query: str, engine: str = "google", gl: str = "us", hl: str = "en", run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
        """Use the tool."""
        search_wrapper = SerpAPIWrapper(params={"engine": engine, "gl": gl, "hl": hl})
        return search_wrapper.run(query)

    async def _arun(self, query: str, engine: str = "google", gl: str = "us", hl: str = "en", run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("custom_search does not support async")
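
On the two questions this section opens with, argument validation and preventing high-risk operations, one possible approach is to check arguments before a tool runs and to gate dangerous tools behind an explicit approval callback. The sketch below is framework-independent; HIGH_RISK, validate_search_args, guarded_call, and the delete_customer tool are all hypothetical names:

```python
from typing import Any, Callable, Dict

# Hypothetical names of tools that must never run without approval.
HIGH_RISK = {"delete_customer"}


class ToolError(Exception):
    """Raised when tool arguments fail validation."""


def validate_search_args(args: Dict[str, Any]) -> Dict[str, str]:
    """Normalize and check the arguments of a search tool."""
    query = str(args.get("query", "")).strip()
    if not query:
        raise ToolError("query must be a non-empty string")
    gl = args.get("gl", "us")
    if not (isinstance(gl, str) and len(gl) == 2 and gl.isalpha()):
        raise ToolError("gl must be a two-letter country code")
    return {"query": query, "gl": gl.lower()}


def guarded_call(
    name: str,
    func: Callable[..., str],
    args: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> str:
    """Run a tool, asking for approval first if it is on the high-risk list."""
    if name in HIGH_RISK and not approve(name, args):
        # Return a message instead of raising so the agent can carry on.
        return f"{name} was blocked: approval denied"
    return func(**args)


def delete_customer(customer_id: int) -> str:
    return f"customer {customer_id} deleted"  # stand-in for a destructive action


print(validate_search_args({"query": " llm ", "gl": "US"}))
print(guarded_call("delete_customer", delete_customer, {"customer_id": 1},
                   approve=lambda name, args: False))
```

The approve callback could prompt a human operator or consult a policy service; returning the refusal as an observation lets the agent explain what happened rather than crash.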

# Use a ChatGPT plugin (Klarna) as a tool
from langchain.chat_models import ChatOpenAI
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools import AIPluginTool

tool = AIPluginTool.from_plugin_url("https://www.klarna.com/.well-known/ai-plugin.json")
llm = ChatOpenAI(temperature=0)
# requests_all lets the agent call the HTTP endpoints described by the plugin
tools = load_tools(["requests_all"])
tools += [tool]

agent_chain = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent_chain.run("what t shirts are available in klarna?")

# Use the Zapier toolkit
import os
os.environ["ZAPIER_NLA_API_KEY"] = ''
from langchain.llms import OpenAI
from langchain.agents import initialize_agent
from langchain.agents.agent_toolkits import ZapierToolkit
from langchain.utilities.zapier import ZapierNLAWrapper


llm = OpenAI(temperature=.3)
zapier = ZapierNLAWrapper()
toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)
agent = initialize_agent(toolkit.get_tools(), llm, agent="zero-shot-react-description", verbose=True)

# Print the tools that have been configured in Zapier
for tool in toolkit.get_tools():
    print(tool.name)
    print(tool.description)
    print("\n\n")

# Prompt: summarize in Chinese the last email "******@qq.com" sent me, then send the summary to "******@qq.com"
agent.run('请用中文总结最后一封"******@qq.com"发给我的邮件。并将总结发送给"******@qq.com"')


V. Case Demo 2

What happens when the brain gets hands and feet? A code-level walkthrough...

[Screenshots]

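VI. Our Own Practice

"If you serve To B customers, you might train a model with tens to hundreds of billions of parameters, because To B relies mainly on the large model's language understanding. For To C, or to go head-to-head with OpenAI, you need very strong AGI (artificial general intelligence) capabilities, which demands far more GPUs: one or two thousand cards may be too few, and five to ten thousand cards is probably what it takes to be competitive." (Zhou Ming, 澜舟科技 / Langboat)

Quality (enterprise-ready; vertical domains), controllability (hallucination, alignment), timeliness, cost (training + inference) -> what is our data strategy?

  • Application scenarios: …
  • Implementation cost: inference cost and performance, model fine-tuning, etc.
  • Engineering foundation: designing and rolling out an architecture that integrates Python and Java applications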

VII. References

  1. ReAct: Synergizing Reasoning and Acting in Language Models
  2. A simple Python implementation of the ReAct pattern for LLMs
  3. Could you train a ChatGPT-beating model for $85,000 and run it in a browser?
  4. Large language models are having their Stable Diffusion moment
  5. LangChain: a framework for developing applications powered by language models
  6. Full video of Lu Qi's latest talk | The New Paradigm Brought by Large Models (Bilibili)
  7. Building Custom Tools for LLM Agents
  8. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  9. Langchain_Agents_SQL_Database_Agent.ipynb
  10. In-depth long read: the technical principles behind AutoGPT from a product manager's perspective
  11. OpenAI API: all GPT models explained
  12. The development of large models and the problems they solve
  13. LangChain: the "Model as a Service" glue. Has it been killed off by ChatGPT plugins?
  14. venturi/pandas-ai
  15. AI large-model technology and application roadmap
  16. The Transformer model explained (the most complete illustrated edition)
  17. Strengths and weaknesses of large models on different tasks
Tags: Backend

Author: 向驰