当前位置: 首页 > news >正文

23个营销专业术语深圳网站设计十年乐云seo

23个营销专业术语,深圳网站设计十年乐云seo,电话手表网站,简洁大气企业网站欣赏LangChain学习文档 【LangChain】检索器(Retrievers)【LangChain】检索器之MultiQueryRetriever【LangChain】检索器之上下文压缩 上下文压缩 LangChain学习文档 概要内容使用普通向量存储检索器使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an…

LangChain学习文档

  • 【LangChain】检索器(Retrievers)
  • 【LangChain】检索器之MultiQueryRetriever
  • 【LangChain】检索器之上下文压缩

上下文压缩

    • LangChain学习文档
  • 概要
  • 内容
  • 使用普通向量存储检索器
  • 使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an LLMChainExtractor)
  • 更多内置压缩机:过滤器(More built-in compressors: filters)
    • LLMChainFilter
    • EmbeddingsFilter
  • 将压缩器和文档转换器串在一起(Stringing compressors and document transformers together)
  • 总结

概要

检索的一项挑战是,通常我们不知道:当数据引入系统时,文档存储系统会面临哪些特定查询。

这意味着与查询最相关的信息可能被隐藏在包含大量不相关文本的文档中。

通过我们的应用程序传递完整的文件可能会导致更昂贵的llm通话和更差的响应。

上下文压缩旨在解决这个问题。

这个想法很简单:我们可以使用给定查询的上下文来压缩它们,以便只返回相关信息,而不是立即按原样返回检索到的文档。

这里的“压缩”既指压缩单个文档的内容,也指批量过滤文档。

要使用上下文压缩检索器,我们需要:

  • 基础检索器
  • 文档压缩器

上下文压缩检索器将查询传递给基础检索器,获取初始文档并将它们传递给文档压缩器。文档压缩器获取文档列表并通过减少文档内容或完全删除文档来缩短它。

在这里插入图片描述

内容

# 打印文档的辅助功能def pretty_print_docs(docs):print(f"\n{'-' * 100}\n".join([f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]))

使用普通向量存储检索器

让我们首先初始化一个简单的向量存储检索器并存储 2023 年国情咨文演讲(以块的形式)。我们可以看到,给定一个示例问题,我们的检索器返回一两个相关文档和一些不相关的文档。甚至相关文档中也有很多不相关的信息。

from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.vectorstores import FAISS
# 加载文档
documents = TextLoader('../../../state_of_the_union.txt').load()
# 拆分器
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
# 拆分文档
texts = text_splitter.split_documents(documents)
# 构建索引,并构建检索器
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
# 运行
docs = retriever.get_relevant_documents("What did the president say about Ketanji Brown Jackson")
# 美化打印
pretty_print_docs(docs)

结果:

    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.----------------------------------------------------------------------------------------------------Document 2:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling.  We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers.  We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.----------------------------------------------------------------------------------------------------Document 3:And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. And soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. So tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  First, beat the opioid epidemic.----------------------------------------------------------------------------------------------------Document 4:Tonight, I’m announcing a crackdown on these companies overcharging American businesses and consumers. And as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up.  That ends on my watch. Medicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. We’ll also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. Let’s pass the Paycheck Fairness Act and paid leave.  Raise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. Let’s increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls America’s best-kept secret: community colleges.

使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an LLMChainExtractor)

现在让我们用 ContextualCompressionRetriever 包装我们的基本检索器。我们将添加一个 LLMChainExtractor,它将迭代最初返回的文档,并从每个文档中仅提取与查询相关的内容。

from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
# 构建大模型
llm = OpenAI(temperature=0)
# 从大模型中构建LLMChainExtractor
compressor = LLMChainExtractor.from_llm(llm)
# 构建压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence."----------------------------------------------------------------------------------------------------Document 2:"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."

更多内置压缩机:过滤器(More built-in compressors: filters)

LLMChainFilter

LLMChainFilter 是稍微简单但更强大的压缩器,它使用 LLM Chain来决定过滤掉最初检索到的文档中的哪些文档以及返回哪些文档,而无需操作文档内容。

from langchain.retrievers.document_compressors import LLMChainFilter# 构建LLMChainFilter
_filter = LLMChainFilter.from_llm(llm)
# 构建上下文压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=_filter, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)
    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.

EmbeddingsFilter

对每个检索到的文档进行额外的 LLM 调用既昂贵又缓慢。 EmbeddingsFilter 通过嵌入文档和查询并仅返回那些与查询具有足够相似嵌入的文档来提供更便宜且更快的选项。

from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.document_compressors import EmbeddingsFilter
# 构建嵌入
embeddings = OpenAIEmbeddings()
# 构建EmbeddingsFilter
embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
# 构建上下文压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.----------------------------------------------------------------------------------------------------Document 2:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling.  We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers.  We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.----------------------------------------------------------------------------------------------------Document 3:And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. And soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. So tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  First, beat the opioid epidemic.

将压缩器和文档转换器串在一起(Stringing compressors and document transformers together)

使用 DocumentCompressorPipeline 我们还可以轻松地按顺序组合多个压缩器。除了压缩器之外,我们还可以将 BaseDocumentTransformers 添加到管道中,它不执行任何上下文压缩,而只是对一组文档执行一些转换。

例如,TextSplitters 可以用作文档转换器,将文档分割成更小的部分,而 EmbeddingsRedundantFilter 可以用于根据文档之间嵌入的相似性来过滤掉冗余文档。

下面我们创建一个压缩器管道,首先将文档分割成更小的块,然后删除冗余文档,然后根据与查询的相关性进行过滤。

from langchain.document_transformers import EmbeddingsRedundantFilter
from langchain.retrievers.document_compressors import DocumentCompressorPipeline
from langchain.text_splitter import CharacterTextSplitter
# 构建拆分器
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=". ")
# 构建EmbeddingsRedundantFilter
redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)
# 构建嵌入过滤器:EmbeddingsFilter
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
# 构建文档管道
pipeline_compressor = DocumentCompressorPipeline(transformers=[splitter, redundant_filter, relevant_filter]
)
# 构建上下文检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson----------------------------------------------------------------------------------------------------Document 2:As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year----------------------------------------------------------------------------------------------------Document 3:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder

总结

我们在进行文档搜索的时候,正相关的文档是少部分,大部分都是不相关的文档。
我们可以使用上下文压缩检索器,只返回正相关的那部分文档。

主要步骤:

  1. 构建一个普通检索器:retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
  2. 构建一个上下文压缩检索器:ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)

特别是第二步骤:构建上下文压缩器的第一个参数,有很多花样:
① LLMChainExtractor 提取,精炼
② LLMChainFilter 普通过滤
③ EmbeddingsFilter 嵌入过滤
④ DocumentCompressorPipeline 文档管道,可以将多个过滤器组合在一起。

参考地址:

https://python.langchain.com/docs/modules/data_connection/retrievers/how_to/contextual_compression/

http://www.khdw.cn/news/38582.html

相关文章:

  • 简述网站建设的seo是什么专业的课程
  • 网站架设软件seo教程seo入门讲解
  • 站长工具里查看的网站描述和关键词都不显示成都百度推广开户公司
  • go做的网站如何制作一个简单的网页
  • 成全视频免费观看在线看厨房电视剧下载windows优化大师兑换码
  • 湖南做网站 e磐石网络怎么制作网页页面
  • 平面网页设计教学关键词优化骗局
  • 做网站管理员需要哪些知识优化网站怎么做
  • 网站头部导航代码成都百度网站排名优化
  • 寿县网站建设湖南seo推广
  • 网站兼容性测试怎么做百度免费打开
  • 南宁青秀网站建设南昌seo排名优化
  • 闵行网站建设sem竞价推广托管代运营公司
  • 企业网站规范线上渠道推广有哪些方式
  • 2017企业网站建设方案百度手机助手下载安卓
  • 专门做生鲜的网站向日葵seo
  • 独立网站做外贸怎么样资阳市网站seo
  • 网站收录了但是搜索不到长沙网络推广外包费用
  • web网站开发需要什么网页优化seo广州
  • 在网站上做宣传属于广告费用吗百度经验首页
  • 韩国购物网站免费发布友链
  • 保险做的好的网站有哪些seo网络营销案例分析
  • 做外贸的国外平台有哪些上海企业seo
  • 如何做收费会员定制网站可以推广赚钱的软件
  • 手机网速关键词排名优化公司
  • 有什么做动图比较方便的网站外贸网站优化
  • 海外主机做黄色网站优化大师免费下载
  • 网站强制分享链接怎么做的电脑优化是什么意思
  • 网站后台中表格制作免费引流推广工具
  • 企业网站优化应该怎么做企业网站设计优化公司