百度360必应搜狗淘宝本站头条
当前位置:网站首页 > 编程字典 > 正文

LangChain手册(Python版)22模块索引相关链

toyiye 2024-06-21 12:25 9 浏览 0 评论

  • Analyze Document
  • Chat Over Documents with Chat History
  • Graph QA
  • Hypothetical Document Embeddings
  • Question Answering with Sources
  • Question Answering
  • Summarization
  • Retrieval Question/Answering
  • Retrieval Question Answering with Sources
  • Vector DB Text Generation

分析文档

AnalyzeDocumentChain 更像是链的末端。该链接收单个文档,将其拆分,然后通过 CombineDocumentsChain 运行它。这可以用作更多的端到端链。

with open("../../state_of_the_union.txt") as f:
    state_of_the_union = f.read()

总结

让我们在下面看一下它的实际应用,使用它来总结一个长文档。

from langchain import OpenAI
from langchain.chains.summarize import load_summarize_chain

llm = OpenAI(temperature=0)
summary_chain = load_summarize_chain(llm, chain_type="map_reduce")
from langchain.chains import AnalyzeDocumentChain
summarize_document_chain = AnalyzeDocumentChain(combine_docs_chain=summary_chain)
summarize_document_chain.run(state_of_the_union)
" In this speech, President Biden addresses the American people and the world, discussing the recent aggression of Russia's Vladimir Putin in Ukraine and the US response. He outlines economic sanctions and other measures taken to hold Putin accountable, and announces the US Department of Justice's task force to go after the crimes of Russian oligarchs. He also announces plans to fight inflation and lower costs for families, invest in American manufacturing, and provide military, economic, and humanitarian assistance to Ukraine. He calls for immigration reform, protecting the rights of women, and advancing the rights of LGBTQ+ Americans, and pays tribute to military families. He concludes with optimism for the future of America."

问题解答

让我们使用问答链来了解一下。

from langchain.chains.question_answering import load_qa_chain
qa_chain = load_qa_chain(llm, chain_type="map_reduce")
qa_document_chain = AnalyzeDocumentChain(combine_docs_chain=qa_chain)
qa_document_chain.run(input_document=state_of_the_union, question="what did the president say about justice breyer?")
' The president thanked Justice Breyer for his service.'

使用聊天记录聊天文档

本笔记本介绍了如何使用ConversationalRetrievalChain. 该链与RetrievalQAChain之间的唯一区别在于,它允许传递聊天历史记录,该历史记录可用于允许后续问题。

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain

装入文档。您可以将其替换为加载程序以获取您想要的任何类型的数据

from langchain.document_loaders import TextLoader
loader = TextLoader("../../state_of_the_union.txt")
documents = loader.load()

如果您有多个要合并的加载程序,您可以执行以下操作:

# loaders = [....]
# docs = []
# for loader in loaders:
#     docs.extend(loader.load())

我们现在拆分文档,为它们创建嵌入,并将它们放入向量库中。这允许我们对它们进行语义搜索。

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings)
Using embedded DuckDB without persistence: data will be transient

我们现在可以创建一个内存对象,这是跟踪输入/输出和保持对话所必需的。

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

我们现在初始化ConversationalRetrievalChain

qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query})
result["answer"]
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
query = "Did he mention who she suceeded"
result = qa({"question": query})
result['answer']
' Ketanji Brown Jackson succeeded Justice Stephen Breyer on the United States Supreme Court.'

传入聊天记录

在上面的示例中,我们使用了一个 Memory 对象来跟踪聊天记录。我们也可以显式地传递它。为此,我们需要初始化一个没有任何内存对象的链。

qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever())

这是在没有聊天记录的情况下提问的示例

chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
result["answer"]
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."

这是一个用一些聊天记录提问的例子

chat_history = [(query, result["answer"])]
query = "Did he mention who she suceeded"
result = qa({"question": query, "chat_history": chat_history})
result['answer']
' Ketanji Brown Jackson succeeded Justice Stephen Breyer on the United States Supreme Court.'

返回源文件

您还可以轻松地从 ConversationalRetrievalChain 返回源文档。当您要检查返回的文档时,这很有用。

qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
result['source_documents'][0]
Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../state_of_the_union.txt'})

ConversationalRetrievalChain 与search_distance

如果您使用的是支持按搜索距离过滤的向量存储,则可以添加阈值参数。

vectordbkwargs = {"search_distance": 0.9}
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history, "vectordbkwargs": vectordbkwargs})

ConversationalRetrievalChain 与map_reduce

我们还可以将不同类型的组合文档链与 ConversationalRetrievalChain 链一起使用。

from langchain.chains import LLMChain
from langchain.chains.question_answering import load_qa_chain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
llm = OpenAI(temperature=0)
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_chain(llm, chain_type="map_reduce")

chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = chain({"question": query, "chat_history": chat_history})
result['answer']
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, from a family of public school educators and police officers, a consensus builder, and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."

ConversationalRetrievalChain 与来源问答

您还可以将此链与源链的问题回答一起使用。

from langchain.chains.qa_with_sources import load_qa_with_sources_chain
llm = OpenAI(temperature=0)
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_with_sources_chain(llm, chain_type="map_reduce")

chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = chain({"question": query, "chat_history": chat_history})
result['answer']
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, from a family of public school educators and police officers, a consensus builder, and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \nSOURCES: ../../state_of_the_union.txt"

ConversationalRetrievalChain 流式传输到stdout

在此示例中,链的输出将逐个stdout令牌流式传输到令牌。

from langchain.chains.llm import LLMChain
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT
from langchain.chains.question_answering import load_qa_chain

# Construct a ConversationalRetrievalChain with a streaming llm for combine docs
# and a separate, non-streaming llm for question generation
llm = OpenAI(temperature=0)
streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)

question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_chain(streaming_llm, chain_type="stuff", prompt=QA_PROMPT)

qa = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
 The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.
chat_history = [(query, result["answer"])]
query = "Did he mention who she suceeded"
result = qa({"question": query, "chat_history": chat_history})
 Ketanji Brown Jackson succeeded Justice Stephen Breyer on the United States Supreme Court.

get_chat_history 函数

您还可以指定一个get_chat_history函数,该函数可用于格式化 chat_history 字符串。

def get_chat_history(inputs) -> str:
    res = []
    for human, ai in inputs:
        res.append(f"Human:{human}\nAI:{ai}")
    return "\n".join(res)
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), get_chat_history=get_chat_history)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
result['answer']
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."

图 QA

本笔记本介绍了如何对图形数据结构进行问答。

创建图表

在本节中,我们构建了一个示例图。目前,这最适合小段文本。

from langchain.indexes import GraphIndexCreator
from langchain.llms import OpenAI
from langchain.document_loaders import TextLoader
index_creator = GraphIndexCreator(llm=OpenAI(temperature=0))
with open("../../state_of_the_union.txt") as f:
    all_text = f.read()

我们将只使用一个小片段,因为目前提取知识三元组有点密集。

text = "\n".join(all_text.split("\n\n")[105:108])
text
'It won’t look like much, but if you stop and look closely, you’ll see a “Field of dreams,” the ground on which America’s future will be built. \nThis is where Intel, the American company that helped build Silicon Valley, is going to build its $20 billion semiconductor “mega site”. \nUp to eight state-of-the-art factories in one place. 10,000 new good-paying jobs. '
graph = index_creator.from_text(text)

我们可以检查创建的图形。

graph.get_triples()
[('Intel', '$20 billion semiconductor "mega site"', 'is going to build'),
 ('Intel', 'state-of-the-art factories', 'is building'),
 ('Intel', '10,000 new good-paying jobs', 'is creating'),
 ('Intel', 'Silicon Valley', 'is helping build'),
 ('Field of dreams',
  "America's future will be built",
  'is the ground on which')]

查询图表

我们现在可以使用图 QA 链来询问图的问题

from langchain.chains import GraphQAChain
chain = GraphQAChain.from_llm(OpenAI(temperature=0), graph=graph, verbose=True)
chain.run("what is Intel going to build?")
> Entering new GraphQAChain chain...
Entities Extracted:
 Intel
Full Context:
Intel is going to build $20 billion semiconductor "mega site"
Intel is building state-of-the-art factories
Intel is creating 10,000 new good-paying jobs
Intel is helping build Silicon Valley

> Finished chain.
' Intel is going to build a $20 billion semiconductor "mega site" with state-of-the-art factories, creating 10,000 new good-paying jobs and helping to build Silicon Valley.'

保存图形

我们还可以保存和加载图形。

graph.write_to_gml("graph.gml")
from langchain.indexes.graph import NetworkxEntityGraph
loaded_graph = NetworkxEntityGraph.from_gml("graph.gml")
loaded_graph.get_triples()
[('Intel', '$20 billion semiconductor "mega site"', 'is going to build'),
 ('Intel', 'state-of-the-art factories', 'is building'),
 ('Intel', '10,000 new good-paying jobs', 'is creating'),
 ('Intel', 'Silicon Valley', 'is helping build'),
 ('Field of dreams',
  "America's future will be built",
  'is the ground on which')]

用来源回答问题

本笔记本介绍了如何使用 LangChain 对文档列表中的来源进行问题回答。它涵盖四种不同的链条类型:stuff, map_reduce, refine, map-rerank。有关这些链类型的更深入解释,请参见此处。

准备数据

首先我们准备数据。对于这个例子,我们对矢量数据库进行相似性搜索,但可以以任何方式获取这些文档(本笔记本的重点是强调获取文档后要做什么)。

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores import Chroma
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate
with open("../../state_of_the_union.txt") as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": str(i)} for i in range(len(texts))])
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
query = "What did the president say about Justice Breyer"
docs = docsearch.similarity_search(query)
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

快速入门

如果你只是想尽快开始,这是推荐的方法:

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'output_text': ' The president thanked Justice Breyer for his service.\nSOURCES: 30-pl'}

如果您想更好地控制和了解正在发生的事情,请参阅以下信息。

连锁stuff_

本节显示使用stuffChain 对源进行问题回答的结果。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'output_text': ' The president thanked Justice Breyer for his service.\nSOURCES: 30-pl'}

自定义提示

您还可以将自己的提示与此链一起使用。在这个例子中,我们将用意大利语回复。

template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
Respond in Italian.

QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER IN ITALIAN:"""
PROMPT = PromptTemplate(template=template, input_variables=["summaries", "question"])

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff", prompt=PROMPT)
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'output_text': '\nNon so cosa abbia detto il presidente riguardo a Justice Breyer.\nSOURCES: 30, 31, 33'}

连锁map_reduce_

本节显示使用map_reduceChain 对源进行问题回答的结果。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_reduce")
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'output_text': ' The president thanked Justice Breyer for his service.\nSOURCES: 30-pl'}

中间步骤

map_reduce如果我们想检查它们,我们也可以返回链的中间步骤。这是通过return_intermediate_steps变量完成的。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_reduce", return_intermediate_steps=True)
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'intermediate_steps': [' "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service."',
  ' None',
  ' None',
  ' None'],
 'output_text': ' The president thanked Justice Breyer for his service.\nSOURCES: 30-pl'}

自定义提示

您还可以将自己的提示与此链一起使用。在这个例子中,我们将用意大利语回复。

question_prompt_template = """Use the following portion of a long document to see if any of the text is relevant to answer the question. 
Return any relevant text in Italian.
{context}
Question: {question}
Relevant text, if any, in Italian:"""
QUESTION_PROMPT = PromptTemplate(
    template=question_prompt_template, input_variables=["context", "question"]
)

combine_prompt_template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
Respond in Italian.

QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER IN ITALIAN:"""
COMBINE_PROMPT = PromptTemplate(
    template=combine_prompt_template, input_variables=["summaries", "question"]
)

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_reduce", return_intermediate_steps=True, question_prompt=QUESTION_PROMPT, combine_prompt=COMBINE_PROMPT)
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'intermediate_steps': ["\nStasera vorrei onorare qualcuno che ha dedicato la sua vita a servire questo paese: il giustizia Stephen Breyer - un veterano dell'esercito, uno studioso costituzionale e un giustizia in uscita della Corte Suprema degli Stati Uniti. Giustizia Breyer, grazie per il tuo servizio.",
  ' Non pertinente.',
  ' Non rilevante.',
  " Non c'è testo pertinente."],
 'output_text': ' Non conosco la risposta. SOURCES: 30, 31, 33, 20.'}

批量大小

使用map_reduce链时,要记住的一件事是您在映射步骤中使用的批量大小。如果这个值太高,可能会导致速率限制错误。您可以通过在所使用的 LLM 上设置批量大小来控制这一点。请注意,这仅适用于具有此参数的 LLM。下面是一个这样做的例子:

llm = OpenAI(batch_size=5, temperature=0)

连锁refine_

本节显示使用refineChain 对源进行问题回答的结果。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="refine")
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'output_text': "\n\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked him for his service and praised his career as a top litigator in private practice, a former federal public defender, and a family of public school educators and police officers. He noted Justice Breyer's reputation as a consensus builder and the broad range of support he has received from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also highlighted the importance of securing the border and fixing the immigration system in order to advance liberty and justice, and mentioned the new technology, joint patrols, dedicated immigration judges, and commitments to support partners in South and Central America that have been put in place. He also expressed his commitment to the LGBTQ+ community, noting the need for the bipartisan Equality Act and the importance of protecting transgender Americans from state laws targeting them. He also highlighted his commitment to bipartisanship, noting the 80 bipartisan bills he signed into law last year, and his plans to strengthen the Violence Against Women Act. Additionally, he announced that the Justice Department will name a chief prosecutor for pandemic fraud and his plan to lower the deficit by more than one trillion dollars in a"}

中间步骤

refine如果我们想检查它们,我们也可以返回链的中间步骤。这是通过return_intermediate_steps变量完成的。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="refine", return_intermediate_steps=True)
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'intermediate_steps': ['\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked Justice Breyer for his service.',
  '\n\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked Justice Breyer for his service, noting his background as a top litigator in private practice, a former federal public defender, and a family of public school educators and police officers. He praised Justice Breyer for being a consensus builder and for receiving a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also noted that in order to advance liberty and justice, it was necessary to secure the border and fix the immigration system, and that the government was taking steps to do both. \n\nSource: 31',
  '\n\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked Justice Breyer for his service, noting his background as a top litigator in private practice, a former federal public defender, and a family of public school educators and police officers. He praised Justice Breyer for being a consensus builder and for receiving a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also noted that in order to advance liberty and justice, it was necessary to secure the border and fix the immigration system, and that the government was taking steps to do both. He also mentioned the need to pass the bipartisan Equality Act to protect LGBTQ+ Americans, and to strengthen the Violence Against Women Act that he had written three decades ago. \n\nSource: 31, 33',
  '\n\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked Justice Breyer for his service, noting his background as a top litigator in private practice, a former federal public defender, and a family of public school educators and police officers. He praised Justice Breyer for being a consensus builder and for receiving a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also noted that in order to advance liberty and justice, it was necessary to secure the border and fix the immigration system, and that the government was taking steps to do both. He also mentioned the need to pass the bipartisan Equality Act to protect LGBTQ+ Americans, and to strengthen the Violence Against Women Act that he had written three decades ago. Additionally, he mentioned his plan to lower costs to give families a fair shot, lower the deficit, and go after criminals who stole billions in relief money meant for small businesses and millions of Americans. He also announced that the Justice Department will name a chief prosecutor for pandemic fraud. \n\nSource: 20, 31, 33'],
 'output_text': '\n\nThe president said that he was honoring Justice Breyer for his dedication to serving the country and that he was a retiring Justice of the United States Supreme Court. He also thanked Justice Breyer for his service, noting his background as a top litigator in private practice, a former federal public defender, and a family of public school educators and police officers. He praised Justice Breyer for being a consensus builder and for receiving a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also noted that in order to advance liberty and justice, it was necessary to secure the border and fix the immigration system, and that the government was taking steps to do both. He also mentioned the need to pass the bipartisan Equality Act to protect LGBTQ+ Americans, and to strengthen the Violence Against Women Act that he had written three decades ago. Additionally, he mentioned his plan to lower costs to give families a fair shot, lower the deficit, and go after criminals who stole billions in relief money meant for small businesses and millions of Americans. He also announced that the Justice Department will name a chief prosecutor for pandemic fraud. \n\nSource: 20, 31, 33'}

自定义提示

您还可以将自己的提示与此链一起使用。在这个例子中,我们将用意大利语回复。

refine_template = (
    "The original question is as follows: {question}\n"
    "We have provided an existing answer, including sources: {existing_answer}\n"
    "We have the opportunity to refine the existing answer"
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_str}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the question (in Italian)"
    "If you do update it, please update the sources as well. "
    "If the context isn't useful, return the original answer."
)
refine_prompt = PromptTemplate(
    input_variables=["question", "existing_answer", "context_str"],
    template=refine_template,
)


question_template = (
    "Context information is below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question in Italian: {question}\n"
)
question_prompt = PromptTemplate(
    input_variables=["context_str", "question"], template=question_template
)
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="refine", return_intermediate_steps=True, question_prompt=question_prompt, refine_prompt=refine_prompt)
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
{'intermediate_steps': ['\nIl presidente ha detto che Justice Breyer ha dedicato la sua vita al servizio di questo paese e ha onorato la sua carriera.',
  "\n\nIl presidente ha detto che Justice Breyer ha dedicato la sua vita al servizio di questo paese, ha onorato la sua carriera e ha contribuito a costruire un consenso. Ha ricevuto un ampio sostegno, dall'Ordine Fraterno della Polizia a ex giudici nominati da democratici e repubblicani. Inoltre, ha sottolineato l'importanza di avanzare la libertà e la giustizia attraverso la sicurezza delle frontiere e la risoluzione del sistema di immigrazione. Ha anche menzionato le nuove tecnologie come scanner all'avanguardia per rilevare meglio il traffico di droga, le pattuglie congiunte con Messico e Guatemala per catturare più trafficanti di esseri umani, l'istituzione di giudici di immigrazione dedicati per far sì che le famiglie che fuggono da per",
  "\n\nIl presidente ha detto che Justice Breyer ha dedicato la sua vita al servizio di questo paese, ha onorato la sua carriera e ha contribuito a costruire un consenso. Ha ricevuto un ampio sostegno, dall'Ordine Fraterno della Polizia a ex giudici nominati da democratici e repubblicani. Inoltre, ha sottolineato l'importanza di avanzare la libertà e la giustizia attraverso la sicurezza delle frontiere e la risoluzione del sistema di immigrazione. Ha anche menzionato le nuove tecnologie come scanner all'avanguardia per rilevare meglio il traffico di droga, le pattuglie congiunte con Messico e Guatemala per catturare più trafficanti di esseri umani, l'istituzione di giudici di immigrazione dedicati per far sì che le famiglie che fuggono da per",
  "\n\nIl presidente ha detto che Justice Breyer ha dedicato la sua vita al servizio di questo paese, ha onorato la sua carriera e ha contribuito a costruire un consenso. Ha ricevuto un ampio sostegno, dall'Ordine Fraterno della Polizia a ex giudici nominati da democratici e repubblicani. Inoltre, ha sottolineato l'importanza di avanzare la libertà e la giustizia attraverso la sicurezza delle frontiere e la risoluzione del sistema di immigrazione. Ha anche menzionato le nuove tecnologie come scanner all'avanguardia per rilevare meglio il traffico di droga, le pattuglie congiunte con Messico e Guatemala per catturare più trafficanti di esseri umani, l'istituzione di giudici di immigrazione dedicati per far sì che le famiglie che fuggono da per"],
 'output_text': "\n\nIl presidente ha detto che Justice Breyer ha dedicato la sua vita al servizio di questo paese, ha onorato la sua carriera e ha contribuito a costruire un consenso. Ha ricevuto un ampio sostegno, dall'Ordine Fraterno della Polizia a ex giudici nominati da democratici e repubblicani. Inoltre, ha sottolineato l'importanza di avanzare la libertà e la giustizia attraverso la sicurezza delle frontiere e la risoluzione del sistema di immigrazione. Ha anche menzionato le nuove tecnologie come scanner all'avanguardia per rilevare meglio il traffico di droga, le pattuglie congiunte con Messico e Guatemala per catturare più trafficanti di esseri umani, l'istituzione di giudici di immigrazione dedicati per far sì che le famiglie che fuggono da per"}

连锁map-rerank_

本节显示使用map-rerankChain 对源进行问题回答的结果。

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_rerank", metadata_keys=['source'], return_intermediate_steps=True)
query = "What did the president say about Justice Breyer"
result = chain({"input_documents": docs, "question": query}, return_only_outputs=True)
result["output_text"]
' The President thanked Justice Breyer for his service and honored him for dedicating his life to serve the country.'
result["intermediate_steps"]
[{'answer': ' The President thanked Justice Breyer for his service and honored him for dedicating his life to serve the country.',
  'score': '100'},
 {'answer': ' This document does not answer the question', 'score': '0'},
 {'answer': ' This document does not answer the question', 'score': '0'},
 {'answer': ' This document does not answer the question', 'score': '0'}]

自定义提示

您还可以将自己的提示与此链一起使用。在这个例子中,我们将用意大利语回复。

from langchain.output_parsers import RegexParser

output_parser = RegexParser(
    regex=r"(.*?)\nScore: (.*)",
    output_keys=["answer", "score"],
)

prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

In addition to giving an answer, also return a score of how fully it answered the user's question. This should be in the following format:

Question: [question here]
Helpful Answer In Italian: [answer here]
Score: [score between 0 and 100]

Begin!

Context:
---------
{context}
---------
Question: {question}
Helpful Answer In Italian:"""
PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"],
    output_parser=output_parser,
)
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_rerank", metadata_keys=['source'], return_intermediate_steps=True, prompt=PROMPT)
query = "What did the president say about Justice Breyer"
result = chain({"input_documents": docs, "question": query}, return_only_outputs=True)
result
{'source': 30,
 'intermediate_steps': [{'answer': ' Il presidente ha detto che Justice Breyer ha dedicato la sua vita a servire questo paese e ha onorato la sua carriera.',
   'score': '100'},
  {'answer': ' Il presidente non ha detto nulla sulla Giustizia Breyer.',
   'score': '100'},
  {'answer': ' Non so.', 'score': '0'},
  {'answer': ' Il presidente non ha detto nulla sulla giustizia Breyer.',
   'score': '100'}],
 'output_text': ' Il presidente ha detto che Justice Breyer ha dedicato la sua vita a servire questo paese e ha onorato la sua carriera.'}

矢量 DB 文本生成

本笔记本介绍了如何使用 LangChain 通过向量索引生成文本。如果我们想要生成能够从大量自定义文本中提取的文本,这将很有用,例如,生成对以前撰写的博客文章有理解的博客文章,或者可以参考产品文档的产品教程。

准备数据

首先,我们准备数据。对于此示例,我们获取一个文档站点,该站点包含托管在 Github 上的降价文件,并将它们拆分为足够小的文档。

from langchain.llms import OpenAI
from langchain.docstore.document import Document
import requests
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
import pathlib
import subprocess
import tempfile
def get_github_docs(repo_owner, repo_name):
    with tempfile.TemporaryDirectory() as d:
        subprocess.check_call(
            f"git clone --depth 1 https://github.com/{repo_owner}/{repo_name}.git .",
            cwd=d,
            shell=True,
        )
        git_sha = (
            subprocess.check_output("git rev-parse HEAD", shell=True, cwd=d)
            .decode("utf-8")
            .strip()
        )
        repo_path = pathlib.Path(d)
        markdown_files = list(repo_path.glob("*/*.md")) + list(
            repo_path.glob("*/*.mdx")
        )
        for markdown_file in markdown_files:
            with open(markdown_file, "r") as f:
                relative_path = markdown_file.relative_to(repo_path)
                github_url = f"https://github.com/{repo_owner}/{repo_name}/blob/{git_sha}/{relative_path}"
                yield Document(page_content=f.read(), metadata={"source": github_url})

sources = get_github_docs("yirenlu92", "deno-manual-forked")

source_chunks = []
splitter = CharacterTextSplitter(separator=" ", chunk_size=1024, chunk_overlap=0)
for source in sources:
    for chunk in splitter.split_text(source.page_content):
        source_chunks.append(Document(page_content=chunk, metadata=source.metadata))
Cloning into '.'...

设置向量数据库

现在我们有了分块的文档内容,让我们将所有这些信息放在一个向量索引中以便于检索。

search_index = Chroma.from_documents(source_chunks, OpenAIEmbeddings())

使用自定义提示设置 LLM 链

接下来,让我们设置一个简单的 LLM 链,但为其提供自定义提示以生成博客文章。请注意,自定义提示已参数化并接受两个输入:context,它将是从矢量搜索中获取的文档,以及topic,这是由用户提供的。

from langchain.chains import LLMChain
prompt_template = """Use the context below to write a 400 word blog post about the topic below:
    Context: {context}
    Topic: {topic}
    Blog post:"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "topic"]
)

llm = OpenAI(temperature=0)

chain = LLMChain(llm=llm, prompt=PROMPT)

生成文本

最后,我们编写一个函数来将我们的输入应用到链上。该函数接受一个输入参数topic。我们在向量索引中找到与那个对应的文档topic,并将它们用作我们简单的 LLM 链中的附加上下文。

def generate_blog_post(topic):
    docs = search_index.similarity_search(topic, k=4)
    inputs = [{"context": doc.page_content, "topic": topic} for doc in docs]
    print(chain.apply(inputs))
generate_blog_post("environment variables")
[{'text': '\n\nEnvironment variables are a great way to store and access sensitive information in your Deno applications. Deno offers built-in support for environment variables with `Deno.env`, and you can also use a `.env` file to store and access environment variables.\n\nUsing `Deno.env` is simple. It has getter and setter methods, so you can easily set and retrieve environment variables. For example, you can set the `FIREBASE_API_KEY` and `FIREBASE_AUTH_DOMAIN` environment variables like this:\n\n```ts\nDeno.env.set("FIREBASE_API_KEY", "examplekey123");\nDeno.env.set("FIREBASE_AUTH_DOMAIN", "firebasedomain.com");\n\nconsole.log(Deno.env.get("FIREBASE_API_KEY")); // examplekey123\nconsole.log(Deno.env.get("FIREBASE_AUTH_DOMAIN")); // firebasedomain.com\n```\n\nYou can also store environment variables in a `.env` file. This is a great'}, {'text': '\n\nEnvironment variables are a powerful tool for managing configuration settings in a program. They allow us to set values that can be used by the program, without having to hard-code them into the code. This makes it easier to change settings without having to modify the code.\n\nIn Deno, environment variables can be set in a few different ways. The most common way is to use the `VAR=value` syntax. This will set the environment variable `VAR` to the value `value`. This can be used to set any number of environment variables before running a command. For example, if we wanted to set the environment variable `VAR` to `hello` before running a Deno command, we could do so like this:\n\n```\nVAR=hello deno run main.ts\n```\n\nThis will set the environment variable `VAR` to `hello` before running the command. We can then access this variable in our code using the `Deno.env.get()` function. For example, if we ran the following command:\n\n```\nVAR=hello && deno eval "console.log(\'Deno: \' + Deno.env.get(\'VAR'}, {'text': '\n\nEnvironment variables are a powerful tool for developers, allowing them to store and access data without having to hard-code it into their applications. In Deno, you can access environment variables using the `Deno.env.get()` function.\n\nFor example, if you wanted to access the `HOME` environment variable, you could do so like this:\n\n```js\n// env.js\nDeno.env.get("HOME");\n```\n\nWhen running this code, you\'ll need to grant the Deno process access to environment variables. This can be done by passing the `--allow-env` flag to the `deno run` command. You can also specify which environment variables you want to grant access to, like this:\n\n```shell\n# Allow access to only the HOME env var\ndeno run --allow-env=HOME env.js\n```\n\nIt\'s important to note that environment variables are case insensitive on Windows, so Deno also matches them case insensitively (on Windows only).\n\nAnother thing to be aware of when using environment variables is subprocess permissions. Subprocesses are powerful and can access system resources regardless of the permissions you granted to the Den'}, {'text': '\n\nEnvironment variables are an important part of any programming language, and Deno is no exception. Deno is a secure JavaScript and TypeScript runtime built on the V8 JavaScript engine, and it recently added support for environment variables. This feature was added in Deno version 1.6.0, and it is now available for use in Deno applications.\n\nEnvironment variables are used to store information that can be used by programs. They are typically used to store configuration information, such as the location of a database or the name of a user. In Deno, environment variables are stored in the `Deno.env` object. This object is similar to the `process.env` object in Node.js, and it allows you to access and set environment variables.\n\nThe `Deno.env` object is a read-only object, meaning that you cannot directly modify the environment variables. Instead, you must use the `Deno.env.set()` function to set environment variables. This function takes two arguments: the name of the environment variable and the value to set it to. For example, if you wanted to set the `FOO` environment variable to `bar`, you would use the following code:\n\n```'}]

相关推荐

为何越来越多的编程语言使用JSON(为什么编程)

JSON是JavascriptObjectNotation的缩写,意思是Javascript对象表示法,是一种易于人类阅读和对编程友好的文本数据传递方法,是JavaScript语言规范定义的一个子...

何时在数据库中使用 JSON(数据库用json格式存储)

在本文中,您将了解何时应考虑将JSON数据类型添加到表中以及何时应避免使用它们。每天?分享?最新?软件?开发?,Devops,敏捷?,测试?以及?项目?管理?最新?,最热门?的?文章?,每天?花?...

MySQL 从零开始:05 数据类型(mysql数据类型有哪些,并举例)

前面的讲解中已经接触到了表的创建,表的创建是对字段的声明,比如:上述语句声明了字段的名称、类型、所占空间、默认值和是否可以为空等信息。其中的int、varchar、char和decimal都...

JSON对象花样进阶(json格式对象)

一、引言在现代Web开发中,JSON(JavaScriptObjectNotation)已经成为数据交换的标准格式。无论是从前端向后端发送数据,还是从后端接收数据,JSON都是不可或缺的一部分。...

深入理解 JSON 和 Form-data(json和formdata提交区别)

在讨论现代网络开发与API设计的语境下,理解客户端和服务器间如何有效且可靠地交换数据变得尤为关键。这里,特别值得关注的是两种主流数据格式:...

JSON 语法(json 语法 priority)

JSON语法是JavaScript语法的子集。JSON语法规则JSON语法是JavaScript对象表示法语法的子集。数据在名称/值对中数据由逗号分隔花括号保存对象方括号保存数组JS...

JSON语法详解(json的语法规则)

JSON语法规则JSON语法是JavaScript对象表示法语法的子集。数据在名称/值对中数据由逗号分隔大括号保存对象中括号保存数组注意:json的key是字符串,且必须是双引号,不能是单引号...

MySQL JSON数据类型操作(mysql的json)

概述mysql自5.7.8版本开始,就支持了json结构的数据存储和查询,这表明了mysql也在不断的学习和增加nosql数据库的有点。但mysql毕竟是关系型数据库,在处理json这种非结构化的数据...

JSON的数据模式(json数据格式示例)

像XML模式一样,JSON数据格式也有Schema,这是一个基于JSON格式的规范。JSON模式也以JSON格式编写。它用于验证JSON数据。JSON模式示例以下代码显示了基本的JSON模式。{"...

前端学习——JSON格式详解(后端json格式)

JSON(JavaScriptObjectNotation)是一种轻量级的数据交换格式。易于人阅读和编写。同时也易于机器解析和生成。它基于JavaScriptProgrammingLa...

什么是 JSON:详解 JSON 及其优势(什么叫json)

现在程序员还有谁不知道JSON吗?无论对于前端还是后端,JSON都是一种常见的数据格式。那么JSON到底是什么呢?JSON的定义...

PostgreSQL JSON 类型:处理结构化数据

PostgreSQL提供JSON类型,以存储结构化数据。JSON是一种开放的数据格式,可用于存储各种类型的值。什么是JSON类型?JSON类型表示JSON(JavaScriptO...

JavaScript:JSON、三种包装类(javascript 包)

JOSN:我们希望可以将一个对象在不同的语言中进行传递,以达到通信的目的,最佳方式就是将一个对象转换为字符串的形式JSON(JavaScriptObjectNotation)-JS的对象表示法...

Python数据分析 只要1分钟 教你玩转JSON 全程干货

Json简介:Json,全名JavaScriptObjectNotation,JSON(JavaScriptObjectNotation(记号、标记))是一种轻量级的数据交换格式。它基于J...

比较一下JSON与XML两种数据格式?(json和xml哪个好)

JSON(JavaScriptObjectNotation)和XML(eXtensibleMarkupLanguage)是在日常开发中比较常用的两种数据格式,它们主要的作用就是用来进行数据的传...

取消回复欢迎 发表评论:

请填写验证码