From 68763bd25fa0234fe795b7427b50fb1c4a34fe79 Mon Sep 17 00:00:00 2001 From: Bagatur <22008038+baskaryan@users.noreply.github.com> Date: Thu, 27 Jul 2023 12:55:13 -0700 Subject: [PATCH] mv popular and additional chains to use cases (#8242) --- .../safety}/constitutional_chain.mdx | 0 .../docs/guides/safety/index.mdx | 6 + .../safety}/moderation.mdx | 0 .../docs/modules/chains/additional/index.mdx | 8 - .../chains/additional/multi_prompt_router.mdx | 7 - .../docs/modules/chains/popular/index.mdx | 8 - .../retrievers/integrations/_category_.yml | 1 - .../chains/popular => use_cases/apis}/api.mdx | 0 .../question_answering/how_to/_category_.yml | 1 + .../how_to}/analyze_document.mdx | 0 .../how_to}/chat_vector_db.mdx | 2 +- .../how_to}/multi_retrieval_qa_router.mdx | 2 +- .../how_to}/question_answering.mdx | 2 +- .../how_to}/vector_db_qa.mdx | 2 +- .../summarization}/summarize.mdx | 0 .../popular => use_cases/tabular}/sqlite.mdx | 0 docs/docs_skeleton/vercel.json | 192 +++++- docs/extras/guides/model_laboratory.ipynb | 4 +- .../integrations/callbacks/argilla.ipynb | 2 +- .../document_loaders/tomarkdown.ipynb | 2 +- .../integrations/providers/arangodb.mdx | 2 +- .../extras/integrations/providers/argilla.mdx | 2 +- .../integrations/providers/cassandra.mdx | 2 +- .../integrations/providers/databricks.ipynb | 2 +- .../integrations/providers/databricks.md | 4 +- .../extras/integrations/providers/momento.mdx | 2 +- .../integrations/providers/motherduck.mdx | 2 +- docs/extras/integrations/providers/openai.mdx | 2 +- docs/extras/integrations/providers/redis.mdx | 2 +- .../integrations/toolkits/openapi_nla.ipynb | 2 +- .../integrations/toolkits/spark_sql.ipynb | 2 +- .../integrations/toolkits/sql_database.ipynb | 2 +- .../chains/additional/extraction.ipynb | 566 ------------------ .../modules/chains/how_to/async_chain.ipynb | 2 +- .../openai_functions.ipynb | 6 +- .../custom_agent_with_plugin_retrieval.ipynb | 2 +- .../use_cases/{apis.mdx => apis/index.mdx} | 4 +- .../apis}/llm_requests.ipynb | 0 .../apis}/openai_openapi.yaml | 0 .../apis}/openapi.ipynb | 0 .../apis}/openapi_openai.ipynb | 0 docs/extras/use_cases/code/index.mdx | 2 +- .../code_writing}/cpal.ipynb | 0 docs/extras/use_cases/code_writing/index.mdx | 14 + .../code_writing}/llm_bash.ipynb | 0 .../code_writing}/llm_math.ipynb | 0 .../code_writing}/llm_symbolic_math.ipynb | 0 .../code_writing}/pal.ipynb | 0 .../{extraction.mdx => extraction/index.mdx} | 0 .../extraction/openai_extraction.ipynb | 566 ++++++++++++++++++ .../graph}/graph_arangodb_qa.ipynb | 0 .../graph}/graph_cypher_qa.ipynb | 0 .../graph}/graph_hugegraph_qa.ipynb | 0 .../graph}/graph_kuzu_qa.ipynb | 0 .../graph}/graph_nebula_qa.ipynb | 0 .../graph}/graph_qa.ipynb | 0 .../graph}/graph_sparql_qa.ipynb | 0 docs/extras/use_cases/graph/index.mdx | 7 + .../graph}/neptune_cypher_qa.ipynb | 0 .../additional => use_cases/graph}/tot.ipynb | 0 .../document-context-aware-QA.ipynb | 4 +- .../question_answering/how_to}/flare.ipynb | 7 +- .../question_answering/how_to}/hyde.ipynb | 4 +- .../{ => how_to}/local_retrieval_qa.ipynb | 4 +- .../how_to}/qa_citations.ipynb | 4 +- .../how_to}/vector_db_text_generation.ipynb | 2 +- .../use_cases/question_answering/index.mdx | 365 ++++------- .../integrations/_category_.yml | 1 + .../openai_functions_retrieval_qa.ipynb | 7 +- .../semantic-search-over-chat.ipynb | 12 +- docs/extras/use_cases/self_check/index.mdx | 8 + .../self_check}/llm_checker.ipynb | 0 .../llm_summarization_checker.ipynb | 0 .../index.mdx} | 2 +- 
.../tabular}/elasticsearch_database.ipynb | 0 .../{tabular.mdx => tabular/index.mdx} | 10 +- .../additional => use_cases}/tagging.ipynb | 0 .../chains/additional/multi_prompt_router.mdx | 107 ---- 78 files changed, 958 insertions(+), 1013 deletions(-) rename docs/docs_skeleton/docs/{modules/chains/additional => guides/safety}/constitutional_chain.mdx (100%) create mode 100644 docs/docs_skeleton/docs/guides/safety/index.mdx rename docs/docs_skeleton/docs/{modules/chains/additional => guides/safety}/moderation.mdx (100%) delete mode 100644 docs/docs_skeleton/docs/modules/chains/additional/index.mdx delete mode 100644 docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx delete mode 100644 docs/docs_skeleton/docs/modules/chains/popular/index.mdx delete mode 100644 docs/docs_skeleton/docs/modules/data_connection/retrievers/integrations/_category_.yml rename docs/docs_skeleton/docs/{modules/chains/popular => use_cases/apis}/api.mdx (100%) create mode 100644 docs/docs_skeleton/docs/use_cases/question_answering/how_to/_category_.yml rename docs/docs_skeleton/docs/{modules/chains/additional => use_cases/question_answering/how_to}/analyze_document.mdx (100%) rename docs/docs_skeleton/docs/{modules/chains/popular => use_cases/question_answering/how_to}/chat_vector_db.mdx (95%) rename docs/docs_skeleton/docs/{modules/chains/additional => use_cases/question_answering/how_to}/multi_retrieval_qa_router.mdx (90%) rename docs/docs_skeleton/docs/{modules/chains/additional => use_cases/question_answering/how_to}/question_answering.mdx (93%) rename docs/docs_skeleton/docs/{modules/chains/popular => use_cases/question_answering/how_to}/vector_db_qa.mdx (92%) rename docs/docs_skeleton/docs/{modules/chains/popular => use_cases/summarization}/summarize.mdx (100%) rename docs/docs_skeleton/docs/{modules/chains/popular => use_cases/tabular}/sqlite.mdx (100%) delete mode 100644 docs/extras/modules/chains/additional/extraction.ipynb rename docs/extras/modules/chains/{popular => how_to}/openai_functions.ipynb (97%) rename docs/extras/use_cases/{apis.mdx => apis/index.mdx} (86%) rename docs/extras/{modules/chains/additional => use_cases/apis}/llm_requests.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/apis}/openai_openapi.yaml (100%) rename docs/extras/{modules/chains/additional => use_cases/apis}/openapi.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/apis}/openapi_openai.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/code_writing}/cpal.ipynb (100%) create mode 100644 docs/extras/use_cases/code_writing/index.mdx rename docs/extras/{modules/chains/additional => use_cases/code_writing}/llm_bash.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/code_writing}/llm_math.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/code_writing}/llm_symbolic_math.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/code_writing}/pal.ipynb (100%) rename docs/extras/use_cases/{extraction.mdx => extraction/index.mdx} (100%) create mode 100644 docs/extras/use_cases/extraction/openai_extraction.ipynb rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_arangodb_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_cypher_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_hugegraph_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_kuzu_qa.ipynb (100%) rename 
docs/extras/{modules/chains/additional => use_cases/graph}/graph_nebula_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/graph_sparql_qa.ipynb (100%) create mode 100644 docs/extras/use_cases/graph/index.mdx rename docs/extras/{modules/chains/additional => use_cases/graph}/neptune_cypher_qa.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/graph}/tot.ipynb (100%) rename docs/extras/use_cases/question_answering/{ => how_to}/document-context-aware-QA.ipynb (99%) rename docs/extras/{modules/chains/additional => use_cases/question_answering/how_to}/flare.ipynb (99%) rename docs/extras/{modules/chains/additional => use_cases/question_answering/how_to}/hyde.ipynb (99%) rename docs/extras/use_cases/question_answering/{ => how_to}/local_retrieval_qa.ipynb (99%) rename docs/extras/{modules/chains/additional => use_cases/question_answering/how_to}/qa_citations.ipynb (98%) rename docs/extras/{modules/chains/additional => use_cases/question_answering/how_to}/vector_db_text_generation.ipynb (99%) create mode 100644 docs/extras/use_cases/question_answering/integrations/_category_.yml rename docs/extras/{modules/chains/additional => use_cases/question_answering/integrations}/openai_functions_retrieval_qa.ipynb (99%) rename docs/extras/use_cases/question_answering/{ => integrations}/semantic-search-over-chat.ipynb (95%) create mode 100644 docs/extras/use_cases/self_check/index.mdx rename docs/extras/{modules/chains/additional => use_cases/self_check}/llm_checker.ipynb (100%) rename docs/extras/{modules/chains/additional => use_cases/self_check}/llm_summarization_checker.ipynb (100%) rename docs/extras/use_cases/{summarization.mdx => summarization/index.mdx} (82%) rename docs/extras/{modules/chains/additional => use_cases/tabular}/elasticsearch_database.ipynb (100%) rename docs/extras/use_cases/{tabular.mdx => tabular/index.mdx} (80%) rename docs/extras/{modules/chains/additional => use_cases}/tagging.ipynb (100%) delete mode 100644 docs/snippets/modules/chains/additional/multi_prompt_router.mdx diff --git a/docs/docs_skeleton/docs/modules/chains/additional/constitutional_chain.mdx b/docs/docs_skeleton/docs/guides/safety/constitutional_chain.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/additional/constitutional_chain.mdx rename to docs/docs_skeleton/docs/guides/safety/constitutional_chain.mdx diff --git a/docs/docs_skeleton/docs/guides/safety/index.mdx b/docs/docs_skeleton/docs/guides/safety/index.mdx new file mode 100644 index 0000000000..a64b1ea041 --- /dev/null +++ b/docs/docs_skeleton/docs/guides/safety/index.mdx @@ -0,0 +1,6 @@ +# Preventing harmful outputs + +One of the key concerns with using LLMs is that they may generate harmful or unethical text. This is an area of active research in the field. Here we present some built-in chains inspired by this research, which are intended to make the outputs of LLMs safer. + +- [Moderation chain](/docs/guides/safety/moderation): Explicitly check if any output text is harmful and flag it. +- [Constitutional chain](/docs/guides/safety/constitutional_chain): Prompt the model with a set of principles which should guide its behavior.
diff --git a/docs/docs_skeleton/docs/modules/chains/additional/moderation.mdx b/docs/docs_skeleton/docs/guides/safety/moderation.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/additional/moderation.mdx rename to docs/docs_skeleton/docs/guides/safety/moderation.mdx diff --git a/docs/docs_skeleton/docs/modules/chains/additional/index.mdx b/docs/docs_skeleton/docs/modules/chains/additional/index.mdx deleted file mode 100644 index 3f7d4f56b8..0000000000 --- a/docs/docs_skeleton/docs/modules/chains/additional/index.mdx +++ /dev/null @@ -1,8 +0,0 @@ ---- -sidebar_position: 4 ---- -# Additional - -import DocCardList from "@theme/DocCardList"; - - diff --git a/docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx b/docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx deleted file mode 100644 index 060952df81..0000000000 --- a/docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx +++ /dev/null @@ -1,7 +0,0 @@ -# Dynamically selecting from multiple prompts - -This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt. - -import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx" - - diff --git a/docs/docs_skeleton/docs/modules/chains/popular/index.mdx b/docs/docs_skeleton/docs/modules/chains/popular/index.mdx deleted file mode 100644 index 8fd7a29c71..0000000000 --- a/docs/docs_skeleton/docs/modules/chains/popular/index.mdx +++ /dev/null @@ -1,8 +0,0 @@ ---- -sidebar_position: 3 ---- -# Popular - -import DocCardList from "@theme/DocCardList"; - - diff --git a/docs/docs_skeleton/docs/modules/data_connection/retrievers/integrations/_category_.yml b/docs/docs_skeleton/docs/modules/data_connection/retrievers/integrations/_category_.yml deleted file mode 100644 index 5131f3e6ed..0000000000 --- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/integrations/_category_.yml +++ /dev/null @@ -1 +0,0 @@ -label: 'Integrations' diff --git a/docs/docs_skeleton/docs/modules/chains/popular/api.mdx b/docs/docs_skeleton/docs/use_cases/apis/api.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/popular/api.mdx rename to docs/docs_skeleton/docs/use_cases/apis/api.mdx diff --git a/docs/docs_skeleton/docs/use_cases/question_answering/how_to/_category_.yml b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/_category_.yml new file mode 100644 index 0000000000..4ed055b08b --- /dev/null +++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/_category_.yml @@ -0,0 +1 @@ +label: 'How to' diff --git a/docs/docs_skeleton/docs/modules/chains/additional/analyze_document.mdx b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/analyze_document.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/additional/analyze_document.mdx rename to docs/docs_skeleton/docs/use_cases/question_answering/how_to/analyze_document.mdx diff --git a/docs/docs_skeleton/docs/modules/chains/popular/chat_vector_db.mdx b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/chat_vector_db.mdx similarity index 95% rename from docs/docs_skeleton/docs/modules/chains/popular/chat_vector_db.mdx rename to 
docs/docs_skeleton/docs/use_cases/question_answering/how_to/chat_vector_db.mdx index 5eb1840253..906d576c51 100644 --- a/docs/docs_skeleton/docs/modules/chains/popular/chat_vector_db.mdx +++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/chat_vector_db.mdx @@ -2,7 +2,7 @@ sidebar_position: 2 --- -# Conversational Retrieval QA +# Store and reference chat history The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component. It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response. diff --git a/docs/docs_skeleton/docs/modules/chains/additional/multi_retrieval_qa_router.mdx b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/multi_retrieval_qa_router.mdx similarity index 90% rename from docs/docs_skeleton/docs/modules/chains/additional/multi_retrieval_qa_router.mdx rename to docs/docs_skeleton/docs/use_cases/question_answering/how_to/multi_retrieval_qa_router.mdx index 0341e199ac..a8f6d19b71 100644 --- a/docs/docs_skeleton/docs/modules/chains/additional/multi_retrieval_qa_router.mdx +++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/multi_retrieval_qa_router.mdx @@ -1,4 +1,4 @@ -# Dynamically selecting from multiple retrievers +# Dynamically select from multiple retrievers This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which Retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it. diff --git a/docs/docs_skeleton/docs/modules/chains/additional/question_answering.mdx b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/question_answering.mdx similarity index 93% rename from docs/docs_skeleton/docs/modules/chains/additional/question_answering.mdx rename to docs/docs_skeleton/docs/use_cases/question_answering/how_to/question_answering.mdx index 56ed1ec9df..30d709f65c 100644 --- a/docs/docs_skeleton/docs/modules/chains/additional/question_answering.mdx +++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/question_answering.mdx @@ -1,4 +1,4 @@ -# Document QA +# QA over in-memory documents Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](/docs/modules/chains/document/). diff --git a/docs/docs_skeleton/docs/modules/chains/popular/vector_db_qa.mdx b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/vector_db_qa.mdx similarity index 92% rename from docs/docs_skeleton/docs/modules/chains/popular/vector_db_qa.mdx rename to docs/docs_skeleton/docs/use_cases/question_answering/how_to/vector_db_qa.mdx index 986169ad7f..57db52bf94 100644 --- a/docs/docs_skeleton/docs/modules/chains/popular/vector_db_qa.mdx +++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/vector_db_qa.mdx @@ -1,7 +1,7 @@ --- sidebar_position: 1 --- -# Retrieval QA +# QA using a Retriever This example showcases question answering over an index. 
diff --git a/docs/docs_skeleton/docs/modules/chains/popular/summarize.mdx b/docs/docs_skeleton/docs/use_cases/summarization/summarize.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/popular/summarize.mdx rename to docs/docs_skeleton/docs/use_cases/summarization/summarize.mdx diff --git a/docs/docs_skeleton/docs/modules/chains/popular/sqlite.mdx b/docs/docs_skeleton/docs/use_cases/tabular/sqlite.mdx similarity index 100% rename from docs/docs_skeleton/docs/modules/chains/popular/sqlite.mdx rename to docs/docs_skeleton/docs/use_cases/tabular/sqlite.mdx diff --git a/docs/docs_skeleton/vercel.json b/docs/docs_skeleton/vercel.json index beca9d71d4..fade6e103d 100644 --- a/docs/docs_skeleton/vercel.json +++ b/docs/docs_skeleton/vercel.json @@ -1610,59 +1610,59 @@ }, { "source": "/en/latest/modules/chains/examples/flare.html", - "destination": "/docs/modules/chains/additional/flare" + "destination": "/docs/use_cases/question_answering/how_to/flare" }, { "source": "/en/latest/modules/chains/examples/graph_cypher_qa.html", - "destination": "/docs/modules/chains/additional/graph_cypher_qa" + "destination": "/docs/use_cases/graph/graph_cypher_qa" }, { "source": "/en/latest/modules/chains/examples/graph_nebula_qa.html", - "destination": "/docs/modules/chains/additional/graph_nebula_qa" + "destination": "/docs/use_cases/graph/graph_nebula_qa" }, { "source": "/en/latest/modules/chains/index_examples/graph_qa.html", - "destination": "/docs/modules/chains/additional/graph_qa" + "destination": "/docs/use_cases/graph/graph_qa" }, { "source": "/en/latest/modules/chains/index_examples/hyde.html", - "destination": "/docs/modules/chains/additional/hyde" + "destination": "/docs/use_cases/question_answering/how_to/hyde" }, { "source": "/en/latest/modules/chains/examples/llm_bash.html", - "destination": "/docs/modules/chains/additional/llm_bash" + "destination": "/docs/use_cases/code_writing/llm_bash" }, { "source": "/en/latest/modules/chains/examples/llm_checker.html", - "destination": "/docs/modules/chains/additional/llm_checker" + "destination": "/docs/use_cases/self_check/llm_checker" }, { "source": "/en/latest/modules/chains/examples/llm_math.html", - "destination": "/docs/modules/chains/additional/llm_math" + "destination": "/docs/use_cases/code_writing/llm_math" }, { "source": "/en/latest/modules/chains/examples/llm_requests.html", - "destination": "/docs/modules/chains/additional/llm_requests" + "destination": "/docs/use_cases/apis/llm_requests" }, { "source": "/en/latest/modules/chains/examples/llm_summarization_checker.html", - "destination": "/docs/modules/chains/additional/llm_summarization_checker" + "destination": "/docs/use_cases/self_check/llm_summarization_checker" }, { "source": "/en/latest/modules/chains/examples/openapi.html", - "destination": "/docs/modules/chains/additional/openapi" + "destination": "/docs/use_cases/apis/openapi" }, { "source": "/en/latest/modules/chains/examples/pal.html", - "destination": "/docs/modules/chains/additional/pal" + "destination": "/docs/use_cases/code_writing/pal" }, { "source": "/en/latest/modules/chains/examples/tagging.html", - "destination": "/docs/modules/chains/additional/tagging" + "destination": "/docs/use_cases/tagging" }, { "source": "/en/latest/modules/chains/index_examples/vector_db_text_generation.html", - "destination": "/docs/modules/chains/additional/vector_db_text_generation" + "destination": "/docs/use_cases/question_answering/how_to/vector_db_text_generation" }, { "source": 
"/en/latest/modules/chains/generic/router.html", @@ -3771,6 +3771,170 @@ { "source": "/en/latest/:path*", "destination": "/docs/:path*" + }, + { + "source": "/docs/modules/chains/additional/constitutional_chain", + "destination": "/docs/guides/safety/constitutional_chain" + }, + { + "source": "/docs/modules/chains/additional/moderation", + "destination": "/docs/guides/safety/moderation" + }, + { + "source": "/docs/modules/chains/popular/api", + "destination": "/docs/use_cases/apis/api" + }, + { + "source": "/docs/modules/chains/additional/analyze_document", + "destination": "/docs/use_cases/question_answering/how_to/analyze_document" + }, + { + "source": "/docs/modules/chains/popular/chat_vector_db", + "destination": "/docs/use_cases/question_answering/how_to/chat_vector_db" + }, + { + "source": "/docs/modules/chains/additional/multi_retrieval_qa_router", + "destination": "/docs/use_cases/question_answering/how_to/multi_retrieval_qa_router" + }, + { + "source": "/docs/modules/chains/additional/question_answering", + "destination": "/docs/use_cases/question_answering/how_to/question_answering" + }, + { + "source": "/docs/modules/chains/popular/vector_db_qa", + "destination": "/docs/use_cases/question_answering/how_to/vector_db_qa" + }, + { + "source": "/docs/modules/chains/popular/summarize", + "destination": "/docs/use_cases/summarization/summarize" + }, + { + "source": "/docs/modules/chains/popular/sqlite", + "destination": "/docs/use_cases/tabular/sqlite" + }, + { + "source": "/docs/modules/chains/popular/openai_functions", + "destination": "/docs/modules/chains/how_to/openai_functions" + }, + { + "source": "/docs/modules/chains/additional/llm_requests", + "destination": "/docs/use_cases/apis/llm_requests" + }, + { + "source": "/docs/modules/chains/additional/openai_openapi", + "destination": "/docs/use_cases/apis/openai_openapi" + }, + { + "source": "/docs/modules/chains/additional/openapi", + "destination": "/docs/use_cases/apis/openapi" + }, + { + "source": "/docs/modules/chains/additional/openapi_openai", + "destination": "/docs/use_cases/apis/openapi_openai" + }, + { + "source": "/docs/modules/chains/additional/cpal", + "destination": "/docs/use_cases/code_writing/cpal" + }, + { + "source": "/docs/modules/chains/additional/llm_bash", + "destination": "/docs/use_cases/code_writing/llm_bash" + }, + { + "source": "/docs/modules/chains/additional/llm_math", + "destination": "/docs/use_cases/code_writing/llm_math" + }, + { + "source": "/docs/modules/chains/additional/llm_symbolic_math", + "destination": "/docs/use_cases/code_writing/llm_symbolic_math" + }, + { + "source": "/docs/modules/chains/additional/pal", + "destination": "/docs/use_cases/code_writing/pal" + }, + { + "source": "/docs/modules/chains/additional/graph_arangodb_qa", + "destination": "/docs/use_cases/graph/graph_arangodb_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_cypher_qa", + "destination": "/docs/use_cases/graph/graph_cypher_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_hugegraph_qa", + "destination": "/docs/use_cases/graph/graph_hugegraph_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_kuzu_qa", + "destination": "/docs/use_cases/graph/graph_kuzu_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_nebula_qa", + "destination": "/docs/use_cases/graph/graph_nebula_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_qa", + "destination": "/docs/use_cases/graph/graph_qa" + }, + { + "source": "/docs/modules/chains/additional/graph_sparql_qa", + 
"destination": "/docs/use_cases/graph/graph_sparql_qa" + }, + { + "source": "/docs/modules/chains/additional/neptune_cypher_qa", + "destination": "/docs/use_cases/graph/neptune_cypher_qa" + }, + { + "source": "/docs/modules/chains/additional/tot", + "destination": "/docs/use_cases/graph/tot" + }, + { + "source": "/docs/use_cases/question_answering//document-context-aware-QA", + "destination": "/docs/use_cases/question_answering/how_to/document-context-aware-QA" + }, + { + "source": "/docs/modules/chains/additional/flare", + "destination": "/docs/use_cases/question_answering/how_to/flare" + }, + { + "source": "/docs/modules/chains/additional/hyde", + "destination": "/docs/use_cases/question_answering/how_to/hyde" + }, + { + "source": "/docs/use_cases/question_answering//local_retrieval_qa", + "destination": "/docs/use_cases/question_answering/how_to/local_retrieval_qa" + }, + { + "source": "/docs/modules/chains/additional/qa_citations", + "destination": "/docs/use_cases/question_answering/how_to/qa_citations" + }, + { + "source": "/docs/modules/chains/additional/vector_db_text_generation", + "destination": "/docs/use_cases/question_answering/how_to/vector_db_text_generation" + }, + { + "source": "/docs/modules/chains/additional/openai_functions_retrieval_qa", + "destination": "/docs/use_cases/question_answering/integrations/openai_functions_retrieval_qa" + }, + { + "source": "/docs/use_cases/question_answering//semantic-search-over-chat", + "destination": "/docs/use_cases/question_answering/integrations/semantic-search-over-chat" + }, + { + "source": "/docs/modules/chains/additional/llm_checker", + "destination": "/docs/use_cases/self_check/llm_checker" + }, + { + "source": "/docs/modules/chains/additional/llm_summarization_checker", + "destination": "/docs/use_cases/self_check/llm_summarization_checker" + }, + { + "source": "/docs/modules/chains/additional/elasticsearch_database", + "destination": "/docs/use_cases/tabular/elasticsearch_database" + }, + { + "source": "/docs/modules/chains/additional/tagging", + "destination": "/docs/use_cases/tagging" } ] } \ No newline at end of file diff --git a/docs/extras/guides/model_laboratory.ipynb b/docs/extras/guides/model_laboratory.ipynb index 181b764894..24fd5f7760 100644 --- a/docs/extras/guides/model_laboratory.ipynb +++ b/docs/extras/guides/model_laboratory.ipynb @@ -5,7 +5,7 @@ "id": "920a3c1a", "metadata": {}, "source": [ - "# Model Comparison\n", + "# Model comparison\n", "\n", "Constructing your language model application will likely involved choosing between many different options of prompts, models, and even chains to use. When doing so, you will want to compare these different options on different inputs in an easy, flexible, and intuitive way. \n", "\n", @@ -254,7 +254,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.9" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/integrations/callbacks/argilla.ipynb b/docs/extras/integrations/callbacks/argilla.ipynb index c231a49772..7a78b3198c 100644 --- a/docs/extras/integrations/callbacks/argilla.ipynb +++ b/docs/extras/integrations/callbacks/argilla.ipynb @@ -14,7 +14,7 @@ "> using both human and machine feedback. 
We provide support for each step in the MLOps cycle, \n", "> from data labeling to model monitoring.\n", "\n", - "\n", + "\n", " \"Open\n", "" ] diff --git a/docs/extras/integrations/document_loaders/tomarkdown.ipynb b/docs/extras/integrations/document_loaders/tomarkdown.ipynb index 23415e0bf3..359c4c88ee 100644 --- a/docs/extras/integrations/document_loaders/tomarkdown.ipynb +++ b/docs/extras/integrations/document_loaders/tomarkdown.ipynb @@ -113,7 +113,7 @@ "\n", "The modules are (from least to most complex):\n", "\n", - "- [Models](https://python.langchain.com/en/latest/modules/models.html): Supported model types and integrations.\n", + "- [Models](https://python.langchain.com/docs/modules/model_io/models/): Supported model types and integrations.\n", "\n", "- [Prompts](https://python.langchain.com/en/latest/modules/prompts.html): Prompt management, optimization, and serialization.\n", "\n", diff --git a/docs/extras/integrations/providers/arangodb.mdx b/docs/extras/integrations/providers/arangodb.mdx index e2650f374f..5866dc9231 100644 --- a/docs/extras/integrations/providers/arangodb.mdx +++ b/docs/extras/integrations/providers/arangodb.mdx @@ -13,7 +13,7 @@ pip install python-arango Connect your ArangoDB Database with a Chat Model to get insights on your data. -See the notebook example [here](/docs/modules/chains/additional/graph_arangodb_qa.html). +See the notebook example [here](/docs/use_cases/graph/graph_arangodb_qa.html). ```python from arango import ArangoClient diff --git a/docs/extras/integrations/providers/argilla.mdx b/docs/extras/integrations/providers/argilla.mdx index a3653860c2..3c882a3294 100644 --- a/docs/extras/integrations/providers/argilla.mdx +++ b/docs/extras/integrations/providers/argilla.mdx @@ -22,7 +22,7 @@ If you don't you can refer to [Argilla - 🚀 Quickstart](https://docs.argilla.i ## Tracking -See a [usage example of `ArgillaCallbackHandler`](/docs/modules/callbacks/integrations/argilla.html). +See a [usage example of `ArgillaCallbackHandler`](/docs/integrations/callbacks/argilla.html). ```python from langchain.callbacks import ArgillaCallbackHandler diff --git a/docs/extras/integrations/providers/cassandra.mdx b/docs/extras/integrations/providers/cassandra.mdx index 404a44dd98..3ab57a83df 100644 --- a/docs/extras/integrations/providers/cassandra.mdx +++ b/docs/extras/integrations/providers/cassandra.mdx @@ -28,7 +28,7 @@ from langchain.memory import CassandraChatMessageHistory ## Memory -See a [usage example](/docs/modules/memory/integrations/cassandra_chat_message_history). +See a [usage example](/docs/integrations/memory/cassandra_chat_message_history). ```python from langchain.memory import CassandraChatMessageHistory diff --git a/docs/extras/integrations/providers/databricks.ipynb b/docs/extras/integrations/providers/databricks.ipynb index 21ffc08a25..4064b1c264 100644 --- a/docs/extras/integrations/providers/databricks.ipynb +++ b/docs/extras/integrations/providers/databricks.ipynb @@ -166,7 +166,7 @@ "source": [ "### SQL Database Agent example\n", "\n", - "This example demonstrates the use of the [SQL Database Agent](/docs/modules/agents/toolkits/sql_database.html) for answering questions over a Databricks database." + "This example demonstrates the use of the [SQL Database Agent](/docs/integrations/toolkits/sql_database.html) for answering questions over a Databricks database." 
] }, { diff --git a/docs/extras/integrations/providers/databricks.md b/docs/extras/integrations/providers/databricks.md index 8dd3bf3d4c..0b4fc630e5 100644 --- a/docs/extras/integrations/providers/databricks.md +++ b/docs/extras/integrations/providers/databricks.md @@ -32,11 +32,11 @@ See [MLflow AI Gateway](/docs/ecosystem/integrations/mlflow_ai_gateway). Databricks as an LLM provider ----------------------------- -The notebook [Wrap Databricks endpoints as LLMs](/docs/modules/model_io/models/llms/integrations/databricks.html) illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development. +The notebook [Wrap Databricks endpoints as LLMs](/docs/integrations/llms/databricks.html) illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development. Databricks endpoints support Dolly, but are also great for hosting models like MPT-7B or any other models from the Hugging Face ecosystem. Databricks endpoints can also be used with proprietary models like OpenAI to provide a governance layer for enterprises. Databricks Dolly ---------------- -Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook [Hugging Face Hub](/docs/modules/model_io/models/llms/integrations/huggingface_hub.html) for instructions to access it through the Hugging Face Hub integration with LangChain. +Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook [Hugging Face Hub](/docs/integrations/llms/huggingface_hub.html) for instructions to access it through the Hugging Face Hub integration with LangChain. diff --git a/docs/extras/integrations/providers/momento.mdx b/docs/extras/integrations/providers/momento.mdx index 5f7659b867..2317c80cd7 100644 --- a/docs/extras/integrations/providers/momento.mdx +++ b/docs/extras/integrations/providers/momento.mdx @@ -51,4 +51,4 @@ Momento can be used as a distributed memory store for LLMs. ### Chat Message History Memory -See [this notebook](/docs/modules/memory/integrations/momento_chat_message_history.html) for a walkthrough of how to use Momento as a memory store for chat message history. +See [this notebook](/docs/integrations/memory/momento_chat_message_history.html) for a walkthrough of how to use Momento as a memory store for chat message history. diff --git a/docs/extras/integrations/providers/motherduck.mdx b/docs/extras/integrations/providers/motherduck.mdx index b8256586a5..a388bd96fc 100644 --- a/docs/extras/integrations/providers/motherduck.mdx +++ b/docs/extras/integrations/providers/motherduck.mdx @@ -31,7 +31,7 @@ db = SQLDatabase.from_uri(conn_str) db_chain = SQLDatabaseChain.from_llm(OpenAI(temperature=0), db, verbose=True) ``` -From here, see the [SQL Chain](/docs/modules/chains/popular/sqlite.html) documentation on how to use. 
+From here, see the [SQL Chain](/docs/use_cases/tabular/sqlite.html) documentation on how to use. ## LLMCache diff --git a/docs/extras/integrations/providers/openai.mdx b/docs/extras/integrations/providers/openai.mdx index 82745c2dfc..63463fc478 100644 --- a/docs/extras/integrations/providers/openai.mdx +++ b/docs/extras/integrations/providers/openai.mdx @@ -58,7 +58,7 @@ For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_ ## Chain -See a [usage example](/docs/modules/chains/additional/moderation). +See a [usage example](/docs/guides/safety/moderation). ```python from langchain.chains import OpenAIModerationChain diff --git a/docs/extras/integrations/providers/redis.mdx b/docs/extras/integrations/providers/redis.mdx index b7350a847b..d1316e4d5b 100644 --- a/docs/extras/integrations/providers/redis.mdx +++ b/docs/extras/integrations/providers/redis.mdx @@ -106,4 +106,4 @@ Redis can be used to persist LLM conversations. For a more detailed walkthrough of the `VectorStoreRetrieverMemory` wrapper, see [this notebook](/docs/modules/memory/integrations/vectorstore_retriever_memory.html). #### Chat Message History Memory -For a detailed example of Redis to cache conversation message history, see [this notebook](/docs/modules/memory/integrations/redis_chat_message_history.html). +For a detailed example of Redis to cache conversation message history, see [this notebook](/docs/integrations/memory/redis_chat_message_history.html). diff --git a/docs/extras/integrations/toolkits/openapi_nla.ipynb b/docs/extras/integrations/toolkits/openapi_nla.ipynb index 03e13d92d1..c2f3b90e41 100644 --- a/docs/extras/integrations/toolkits/openapi_nla.ipynb +++ b/docs/extras/integrations/toolkits/openapi_nla.ipynb @@ -9,7 +9,7 @@ "\n", "Natural Language API Toolkits (NLAToolkits) permit LangChain Agents to efficiently plan and combine calls across endpoints. This notebook demonstrates a sample composition of the Speak, Klarna, and Spoonacluar APIs.\n", "\n", - "For a detailed walkthrough of the OpenAPI chains wrapped within the NLAToolkit, see the [OpenAPI Operation Chain](/docs/modules/chains/additional/openapi.html) notebook.\n", + "For a detailed walkthrough of the OpenAPI chains wrapped within the NLAToolkit, see the [OpenAPI Operation Chain](/docs/use_cases/apis/openapi.html) notebook.\n", "\n", "### First, import dependencies and load the LLM" ] diff --git a/docs/extras/integrations/toolkits/spark_sql.ipynb b/docs/extras/integrations/toolkits/spark_sql.ipynb index aad7af482c..c29f6841c9 100644 --- a/docs/extras/integrations/toolkits/spark_sql.ipynb +++ b/docs/extras/integrations/toolkits/spark_sql.ipynb @@ -6,7 +6,7 @@ "source": [ "# Spark SQL Agent\n", "\n", - "This notebook shows how to use agents to interact with a Spark SQL. Similar to [SQL Database Agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/sql_database.html), it is designed to address general inquiries about Spark SQL and facilitate error recovery.\n", + "This notebook shows how to use agents to interact with a Spark SQL. Similar to [SQL Database Agent](https://python.langchain.com/docs/integrations/toolkits/sql_database), it is designed to address general inquiries about Spark SQL and facilitate error recovery.\n", "\n", "**NOTE: Note that, as this agent is in active development, all answers might not be correct. Additionally, it is not guaranteed that the agent won't perform DML statements on your Spark cluster given certain questions. 
Be careful running it on sensitive data!**" ] diff --git a/docs/extras/integrations/toolkits/sql_database.ipynb b/docs/extras/integrations/toolkits/sql_database.ipynb index 7f65f48af0..4be39ea9fb 100644 --- a/docs/extras/integrations/toolkits/sql_database.ipynb +++ b/docs/extras/integrations/toolkits/sql_database.ipynb @@ -8,7 +8,7 @@ "source": [ "# SQL Database Agent\n", "\n", - "This notebook showcases an agent designed to interact with a sql databases. The agent builds off of [SQLDatabaseChain](https://python.langchain.com/docs/modules/chains/popular/sqlite) and is designed to answer more general questions about a database, as well as recover from errors.\n", + "This notebook showcases an agent designed to interact with a sql databases. The agent builds off of [SQLDatabaseChain](https://python.langchain.com/docs/use_cases/tabular/sqlite) and is designed to answer more general questions about a database, as well as recover from errors.\n", "\n", "Note that, as this agent is in active development, all answers might not be correct. Additionally, it is not guaranteed that the agent won't perform DML statements on your database given certain questions. Be careful running it on sensitive data!\n", "\n", diff --git a/docs/extras/modules/chains/additional/extraction.ipynb b/docs/extras/modules/chains/additional/extraction.ipynb deleted file mode 100644 index a57c12f9c9..0000000000 --- a/docs/extras/modules/chains/additional/extraction.ipynb +++ /dev/null @@ -1,566 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "6605e7f7", - "metadata": {}, - "source": [ - "# Extraction\n", - "\n", - "The extraction chain uses the OpenAI `functions` parameter to specify a schema to extract entities from a document. This helps us make sure that the model outputs exactly the schema of entities and properties that we want, with their appropriate types.\n", - "\n", - "The extraction chain is to be used when we want to extract several entities with their properties from the same passage (i.e. what people were mentioned in this passage?)" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "34f04daf", - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.4) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n", - " warnings.warn(\n" - ] - } - ], - "source": [ - "from langchain.chat_models import ChatOpenAI\n", - "from langchain.chains import create_extraction_chain, create_extraction_chain_pydantic\n", - "from langchain.prompts import ChatPromptTemplate" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "a2648974", - "metadata": {}, - "outputs": [], - "source": [ - "llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")" - ] - }, - { - "cell_type": "markdown", - "id": "5ef034ce", - "metadata": {}, - "source": [ - "## Extracting entities" - ] - }, - { - "cell_type": "markdown", - "id": "78ff9df9", - "metadata": {}, - "source": [ - "To extract entities, we need to create a schema where we specify all the properties we want to find and the type we expect them to have. We can also specify which of these properties are required and which are optional." 
- ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "4ac43eba", - "metadata": {}, - "outputs": [], - "source": [ - "schema = {\n", - " \"properties\": {\n", - " \"name\": {\"type\": \"string\"},\n", - " \"height\": {\"type\": \"integer\"},\n", - " \"hair_color\": {\"type\": \"string\"},\n", - " },\n", - " \"required\": [\"name\", \"height\"],\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "640bd005", - "metadata": {}, - "outputs": [], - "source": [ - "inp = \"\"\"\n", - "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n", - " \"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "64313214", - "metadata": {}, - "outputs": [], - "source": [ - "chain = create_extraction_chain(schema, llm)" - ] - }, - { - "cell_type": "markdown", - "id": "17c48adb", - "metadata": {}, - "source": [ - "As we can see, we extracted the required entities and their properties in the required format (it even calculated Claudia's height before returning!)" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "cc5436ed", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'},\n", - " {'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}]" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chain.run(inp)" - ] - }, - { - "cell_type": "markdown", - "id": "8d51fcdc", - "metadata": {}, - "source": [ - "## Several entity types" - ] - }, - { - "cell_type": "markdown", - "id": "5813affe", - "metadata": {}, - "source": [ - "Notice that we are using OpenAI functions under the hood and thus the model can only call one function per request (with one, unique schema)" - ] - }, - { - "cell_type": "markdown", - "id": "511b9838", - "metadata": {}, - "source": [ - "If we want to extract more than one entity type, we need to introduce a little hack - we will define our properties with an included entity type. \n", - "\n", - "Following we have an example where we also want to extract dog attributes from the passage. Notice the 'person_' and 'dog_' prefixes we use for each property; this tells the model which entity type the property refers to. In this way, the model can return properties from several entity types in one single call." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "cf243a26", - "metadata": {}, - "outputs": [], - "source": [ - "schema = {\n", - " \"properties\": {\n", - " \"person_name\": {\"type\": \"string\"},\n", - " \"person_height\": {\"type\": \"integer\"},\n", - " \"person_hair_color\": {\"type\": \"string\"},\n", - " \"dog_name\": {\"type\": \"string\"},\n", - " \"dog_breed\": {\"type\": \"string\"},\n", - " },\n", - " \"required\": [\"person_name\", \"person_height\"],\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "52841fb3", - "metadata": {}, - "outputs": [], - "source": [ - "inp = \"\"\"\n", - "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. 
Claudia is a brunette and Alex is blonde.\n", - "Alex's dog Frosty is a labrador and likes to play hide and seek.\n", - " \"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "93f904ab", - "metadata": {}, - "outputs": [], - "source": [ - "chain = create_extraction_chain(schema, llm)" - ] - }, - { - "cell_type": "markdown", - "id": "eb074f7b", - "metadata": {}, - "source": [ - "People attributes and dog attributes were correctly extracted from the text in the same call" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "db3e9e17", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'person_name': 'Alex',\n", - " 'person_height': 5,\n", - " 'person_hair_color': 'blonde',\n", - " 'dog_name': 'Frosty',\n", - " 'dog_breed': 'labrador'},\n", - " {'person_name': 'Claudia',\n", - " 'person_height': 6,\n", - " 'person_hair_color': 'brunette'}]" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chain.run(inp)" - ] - }, - { - "cell_type": "markdown", - "id": "0273e0e2", - "metadata": {}, - "source": [ - "## Unrelated entities" - ] - }, - { - "cell_type": "markdown", - "id": "c07b3480", - "metadata": {}, - "source": [ - "What if our entities are unrelated? In that case, the model will return the unrelated entities in different dictionaries, allowing us to successfully extract several unrelated entity types in the same call." - ] - }, - { - "cell_type": "markdown", - "id": "01d98af0", - "metadata": {}, - "source": [ - "Notice that we use `required: []`: we need to allow the model to return **only** person attributes or **only** dog attributes for a single entity (person or dog)" - ] - }, - { - "cell_type": "code", - "execution_count": 48, - "id": "e584c993", - "metadata": {}, - "outputs": [], - "source": [ - "schema = {\n", - " \"properties\": {\n", - " \"person_name\": {\"type\": \"string\"},\n", - " \"person_height\": {\"type\": \"integer\"},\n", - " \"person_hair_color\": {\"type\": \"string\"},\n", - " \"dog_name\": {\"type\": \"string\"},\n", - " \"dog_breed\": {\"type\": \"string\"},\n", - " },\n", - " \"required\": [],\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "id": "ad6b105f", - "metadata": {}, - "outputs": [], - "source": [ - "inp = \"\"\"\n", - "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. 
Claudia is a brunette and Alex is blonde.\n", - "\n", - "Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 50, - "id": "6bfe5a33", - "metadata": {}, - "outputs": [], - "source": [ - "chain = create_extraction_chain(schema, llm)" - ] - }, - { - "cell_type": "markdown", - "id": "24fe09af", - "metadata": {}, - "source": [ - "We have each entity in its own separate dictionary, with only the appropriate attributes being returned" - ] - }, - { - "cell_type": "code", - "execution_count": 51, - "id": "f6e1fd89", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n", - " {'person_name': 'Claudia',\n", - " 'person_height': 6,\n", - " 'person_hair_color': 'brunette'},\n", - " {'dog_name': 'Willow', 'dog_breed': 'German Shepherd'},\n", - " {'dog_name': 'Milo', 'dog_breed': 'border collie'}]" - ] - }, - "execution_count": 51, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chain.run(inp)" - ] - }, - { - "cell_type": "markdown", - "id": "0ac466d1", - "metadata": {}, - "source": [ - "## Extra info for an entity" - ] - }, - { - "cell_type": "markdown", - "id": "d240ffc1", - "metadata": {}, - "source": [ - "What if.. _we don't know what we want?_ More specifically, say we know a few properties we want to extract for a given entity but we also want to know if there's any extra information in the passage. Fortunately, we don't need to structure everything - we can have unstructured extraction as well. \n", - "\n", - "We can do this by introducing another hack, namely the *extra_info* attribute - let's see an example." - ] - }, - { - "cell_type": "code", - "execution_count": 68, - "id": "f19685f6", - "metadata": {}, - "outputs": [], - "source": [ - "schema = {\n", - " \"properties\": {\n", - " \"person_name\": {\"type\": \"string\"},\n", - " \"person_height\": {\"type\": \"integer\"},\n", - " \"person_hair_color\": {\"type\": \"string\"},\n", - " \"dog_name\": {\"type\": \"string\"},\n", - " \"dog_breed\": {\"type\": \"string\"},\n", - " \"dog_extra_info\": {\"type\": \"string\"},\n", - " },\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": 81, - "id": "200c3477", - "metadata": {}, - "outputs": [], - "source": [ - "inp = \"\"\"\n", - "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n", - "\n", - "Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 82, - "id": "ddad7dc6", - "metadata": {}, - "outputs": [], - "source": [ - "chain = create_extraction_chain(schema, llm)" - ] - }, - { - "cell_type": "markdown", - "id": "e5c0dbbc", - "metadata": {}, - "source": [ - "It is nice to know more about Willow and Milo!" 
- ] - }, - { - "cell_type": "code", - "execution_count": 83, - "id": "c22cfd30", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n", - " {'person_name': 'Claudia',\n", - " 'person_height': 6,\n", - " 'person_hair_color': 'brunette'},\n", - " {'dog_name': 'Willow',\n", - " 'dog_breed': 'German Shepherd',\n", - " 'dog_extra_information': 'likes to play with other dogs'},\n", - " {'dog_name': 'Milo',\n", - " 'dog_breed': 'border collie',\n", - " 'dog_extra_information': 'lives close by'}]" - ] - }, - "execution_count": 83, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chain.run(inp)" - ] - }, - { - "cell_type": "markdown", - "id": "698b4c4d", - "metadata": {}, - "source": [ - "## Pydantic example" - ] - }, - { - "cell_type": "markdown", - "id": "6504a6d9", - "metadata": {}, - "source": [ - "We can also use a Pydantic schema to choose the required properties and types and we will set as 'Optional' those that are not strictly required.\n", - "\n", - "By using the `create_extraction_chain_pydantic` function, we can send a Pydantic schema as input and the output will be an instantiated object that respects our desired schema. \n", - "\n", - "In this way, we can specify our schema in the same manner that we would a new class or function in Python - with purely Pythonic types." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "id": "6792866b", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import Optional, List\n", - "from pydantic import BaseModel, Field" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "36a63761", - "metadata": {}, - "outputs": [], - "source": [ - "class Properties(BaseModel):\n", - " person_name: str\n", - " person_height: int\n", - " person_hair_color: str\n", - " dog_breed: Optional[str]\n", - " dog_name: Optional[str]" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "8ffd1e57", - "metadata": {}, - "outputs": [], - "source": [ - "chain = create_extraction_chain_pydantic(pydantic_schema=Properties, llm=llm)" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "24baa954", - "metadata": { - "scrolled": false - }, - "outputs": [], - "source": [ - "inp = \"\"\"\n", - "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. 
Claudia is a brunette and Alex is blonde.\n", - "Alex's dog Frosty is a labrador and likes to play hide and seek.\n", - " \"\"\"" - ] - }, - { - "cell_type": "markdown", - "id": "84e0a241", - "metadata": {}, - "source": [ - "As we can see, we extracted the required entities and their properties in the required format:" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "f771df58", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[Properties(person_name='Alex', person_height=5, person_hair_color='blonde', dog_breed='labrador', dog_name='Frosty'),\n", - " Properties(person_name='Claudia', person_height=6, person_hair_color='brunette', dog_breed=None, dog_name=None)]" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chain.run(inp)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0df61283", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.1" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} \ No newline at end of file diff --git a/docs/extras/modules/chains/how_to/async_chain.ipynb b/docs/extras/modules/chains/how_to/async_chain.ipynb index f3e979a3f3..866a4b1c91 100644 --- a/docs/extras/modules/chains/how_to/async_chain.ipynb +++ b/docs/extras/modules/chains/how_to/async_chain.ipynb @@ -9,7 +9,7 @@ "\n", "LangChain provides async support for Chains by leveraging the [asyncio](https://docs.python.org/3/library/asyncio.html) library.\n", "\n", - "Async methods are currently supported in `LLMChain` (through `arun`, `apredict`, `acall`) and `LLMMathChain` (through `arun` and `acall`), `ChatVectorDBChain`, and [QA chains](/docs/modules/chains/additional/question_answering.html). Async support for other chains is on the roadmap." + "Async methods are currently supported in `LLMChain` (through `arun`, `apredict`, `acall`) and `LLMMathChain` (through `arun` and `acall`), `ChatVectorDBChain`, and [QA chains](/docs/use_cases/question_answering/how_to/question_answering.html). Async support for other chains is on the roadmap." 
] }, { diff --git a/docs/extras/modules/chains/popular/openai_functions.ipynb b/docs/extras/modules/chains/how_to/openai_functions.ipynb similarity index 97% rename from docs/extras/modules/chains/popular/openai_functions.ipynb rename to docs/extras/modules/chains/how_to/openai_functions.ipynb index da4ed68ced..62e9067beb 100644 --- a/docs/extras/modules/chains/popular/openai_functions.ipynb +++ b/docs/extras/modules/chains/how_to/openai_functions.ipynb @@ -494,9 +494,9 @@ "\n", "There are a number of more specific chains that use OpenAI functions.\n", "- [Extraction](/docs/modules/chains/additional/extraction): very similar to structured output chain, intended for information/entity extraction specifically.\n", - "- [Tagging](/docs/modules/chains/additional/tagging): tag inputs.\n", - "- [OpenAPI](/docs/modules/chains/additional/openapi_openai): take an OpenAPI spec and create + execute valid requests against the API, using OpenAI functions under the hood.\n", - "- [QA with citations](/docs/modules/chains/additional/qa_citations): use OpenAI functions ability to extract citations from text." + "- [Tagging](/docs/use_cases/tagging): tag inputs.\n", + "- [OpenAPI](/docs/use_cases/apis/openapi_openai): take an OpenAPI spec and create + execute valid requests against the API, using OpenAI functions under the hood.\n", + "- [QA with citations](/docs/use_cases/question_answering/how_to/qa_citations): use OpenAI functions ability to extract citations from text." ] } ], diff --git a/docs/extras/use_cases/agents/custom_agent_with_plugin_retrieval.ipynb b/docs/extras/use_cases/agents/custom_agent_with_plugin_retrieval.ipynb index 1a2a76886d..a10ebf7eb3 100644 --- a/docs/extras/use_cases/agents/custom_agent_with_plugin_retrieval.ipynb +++ b/docs/extras/use_cases/agents/custom_agent_with_plugin_retrieval.ipynb @@ -10,7 +10,7 @@ "This notebook combines two concepts in order to build a custom agent that can interact with AI Plugins:\n", "\n", "1. [Custom Agent with Tool Retrieval](/docs/modules/agents/how_to/custom_agent_with_tool_retrieval.html): This introduces the concept of retrieving many tools, which is useful when trying to work with arbitrarily many plugins.\n", - "2. [Natural Language API Chains](/docs/modules/chains/additional/openapi.html): This creates Natural Language wrappers around OpenAPI endpoints. This is useful because (1) plugins use OpenAPI endpoints under the hood, (2) wrapping them in an NLAChain allows the router agent to call it more easily.\n", + "2. [Natural Language API Chains](/docs/use_cases/apis/openapi.html): This creates Natural Language wrappers around OpenAPI endpoints. This is useful because (1) plugins use OpenAPI endpoints under the hood, (2) wrapping them in an NLAChain allows the router agent to call it more easily.\n", "\n", "The novel idea introduced in this notebook is the idea of using retrieval to select not the tools explicitly, but the set of OpenAPI specs to use. We can then generate tools from those OpenAPI specs. The use case for this is when trying to get agents to use plugins. It may be more efficient to choose plugins first, then the endpoints, rather than the endpoints directly. This is because the plugins may contain more useful information for selection." 
] diff --git a/docs/extras/use_cases/apis.mdx b/docs/extras/use_cases/apis/index.mdx similarity index 86% rename from docs/extras/use_cases/apis.mdx rename to docs/extras/use_cases/apis/index.mdx index 3500c16477..c5f3c12932 100644 --- a/docs/extras/use_cases/apis.mdx +++ b/docs/extras/use_cases/apis/index.mdx @@ -13,7 +13,7 @@ If you are just getting started, and you have relatively simple apis, you should Chains are a sequence of predetermined steps, so they are good to get started with as they give you more control and let you understand what is happening better. -- [API Chain](/docs/modules/chains/popular/api.html) +- [API Chain](/docs/use_cases/apis/api.html) ## Agents @@ -21,4 +21,4 @@ Agents are more complex, and involve multiple queries to the LLM to understand w The downside of agents are that you have less control. The upside is that they are more powerful, which allows you to use them on larger and more complex schemas. -- [OpenAPI Agent](/docs/modules/agents/toolkits/openapi.html) +- [OpenAPI Agent](/docs/integrations/toolkits/openapi.html) diff --git a/docs/extras/modules/chains/additional/llm_requests.ipynb b/docs/extras/use_cases/apis/llm_requests.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_requests.ipynb rename to docs/extras/use_cases/apis/llm_requests.ipynb diff --git a/docs/extras/modules/chains/additional/openai_openapi.yaml b/docs/extras/use_cases/apis/openai_openapi.yaml similarity index 100% rename from docs/extras/modules/chains/additional/openai_openapi.yaml rename to docs/extras/use_cases/apis/openai_openapi.yaml diff --git a/docs/extras/modules/chains/additional/openapi.ipynb b/docs/extras/use_cases/apis/openapi.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/openapi.ipynb rename to docs/extras/use_cases/apis/openapi.ipynb diff --git a/docs/extras/modules/chains/additional/openapi_openai.ipynb b/docs/extras/use_cases/apis/openapi_openai.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/openapi_openai.ipynb rename to docs/extras/use_cases/apis/openapi_openai.ipynb diff --git a/docs/extras/use_cases/code/index.mdx b/docs/extras/use_cases/code/index.mdx index 30da409f00..985025d852 100644 --- a/docs/extras/use_cases/code/index.mdx +++ b/docs/extras/use_cases/code/index.mdx @@ -2,7 +2,7 @@ sidebar_position: 6 --- -# Code Understanding +# Code understanding Overview diff --git a/docs/extras/modules/chains/additional/cpal.ipynb b/docs/extras/use_cases/code_writing/cpal.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/cpal.ipynb rename to docs/extras/use_cases/code_writing/cpal.ipynb diff --git a/docs/extras/use_cases/code_writing/index.mdx b/docs/extras/use_cases/code_writing/index.mdx new file mode 100644 index 0000000000..218b438515 --- /dev/null +++ b/docs/extras/use_cases/code_writing/index.mdx @@ -0,0 +1,14 @@ +# Code writing + +:::warning +All program-writing chains should be treated as *VERY* experimental and should not be used in any environment where sensitive/important data is stored, as there is arbitrary code execution involved in using these. +::: + +Much like humans, LLMs are great at writing out programs, but not always great at executing them. For example, they can write down complex mathematical equations far better than they can compute the results. 
In such cases, it is useful to combine an LLM with a program runtime, so that the LLM converts unstructured text to a program and then a simpler tool (like a calculator) actually executes the program. + +In other cases, only a program can be used to access the desired information (e.g., the contents of a directory on your computer). In such cases it is again useful to let an LLM generate the code and a separate tool to execute it. + +import DocCardList from "@theme/DocCardList"; + + + diff --git a/docs/extras/modules/chains/additional/llm_bash.ipynb b/docs/extras/use_cases/code_writing/llm_bash.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_bash.ipynb rename to docs/extras/use_cases/code_writing/llm_bash.ipynb diff --git a/docs/extras/modules/chains/additional/llm_math.ipynb b/docs/extras/use_cases/code_writing/llm_math.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_math.ipynb rename to docs/extras/use_cases/code_writing/llm_math.ipynb diff --git a/docs/extras/modules/chains/additional/llm_symbolic_math.ipynb b/docs/extras/use_cases/code_writing/llm_symbolic_math.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_symbolic_math.ipynb rename to docs/extras/use_cases/code_writing/llm_symbolic_math.ipynb diff --git a/docs/extras/modules/chains/additional/pal.ipynb b/docs/extras/use_cases/code_writing/pal.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/pal.ipynb rename to docs/extras/use_cases/code_writing/pal.ipynb diff --git a/docs/extras/use_cases/extraction.mdx b/docs/extras/use_cases/extraction/index.mdx similarity index 100% rename from docs/extras/use_cases/extraction.mdx rename to docs/extras/use_cases/extraction/index.mdx diff --git a/docs/extras/use_cases/extraction/openai_extraction.ipynb b/docs/extras/use_cases/extraction/openai_extraction.ipynb new file mode 100644 index 0000000000..2d39169dd6 --- /dev/null +++ b/docs/extras/use_cases/extraction/openai_extraction.ipynb @@ -0,0 +1,566 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "6605e7f7", + "metadata": {}, + "source": [ + "# Extraction with OpenAI Functions\n", + "\n", + "The extraction chain uses the OpenAI `functions` parameter to specify a schema to extract entities from a document. This helps us make sure that the model outputs exactly the schema of entities and properties that we want, with their appropriate types.\n", + "\n", + "The extraction chain is to be used when we want to extract several entities with their properties from the same passage (i.e. what people were mentioned in this passage?)" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "34f04daf", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.4) is available. 
It's recommended that you update to the latest version using `pip install -U deeplake`.\n", + " warnings.warn(\n" + ] + } + ], + "source": [ + "from langchain.chat_models import ChatOpenAI\n", + "from langchain.chains import create_extraction_chain, create_extraction_chain_pydantic\n", + "from langchain.prompts import ChatPromptTemplate" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "a2648974", + "metadata": {}, + "outputs": [], + "source": [ + "llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")" + ] + }, + { + "cell_type": "markdown", + "id": "5ef034ce", + "metadata": {}, + "source": [ + "## Extracting entities" + ] + }, + { + "cell_type": "markdown", + "id": "78ff9df9", + "metadata": {}, + "source": [ + "To extract entities, we need to create a schema where we specify all the properties we want to find and the type we expect them to have. We can also specify which of these properties are required and which are optional." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "4ac43eba", + "metadata": {}, + "outputs": [], + "source": [ + "schema = {\n", + " \"properties\": {\n", + " \"name\": {\"type\": \"string\"},\n", + " \"height\": {\"type\": \"integer\"},\n", + " \"hair_color\": {\"type\": \"string\"},\n", + " },\n", + " \"required\": [\"name\", \"height\"],\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "640bd005", + "metadata": {}, + "outputs": [], + "source": [ + "inp = \"\"\"\n", + "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n", + " \"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "64313214", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_extraction_chain(schema, llm)" + ] + }, + { + "cell_type": "markdown", + "id": "17c48adb", + "metadata": {}, + "source": [ + "As we can see, we extracted the required entities and their properties in the required format (it even calculated Claudia's height before returning!)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "cc5436ed", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'},\n", + " {'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}]" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "8d51fcdc", + "metadata": {}, + "source": [ + "## Several entity types" + ] + }, + { + "cell_type": "markdown", + "id": "5813affe", + "metadata": {}, + "source": [ + "Notice that we are using OpenAI functions under the hood and thus the model can only call one function per request (with one, unique schema)" + ] + }, + { + "cell_type": "markdown", + "id": "511b9838", + "metadata": {}, + "source": [ + "If we want to extract more than one entity type, we need to introduce a little hack - we will define our properties with an included entity type. \n", + "\n", + "Following we have an example where we also want to extract dog attributes from the passage. Notice the 'person_' and 'dog_' prefixes we use for each property; this tells the model which entity type the property refers to. In this way, the model can return properties from several entity types in one single call." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "cf243a26", + "metadata": {}, + "outputs": [], + "source": [ + "schema = {\n", + " \"properties\": {\n", + " \"person_name\": {\"type\": \"string\"},\n", + " \"person_height\": {\"type\": \"integer\"},\n", + " \"person_hair_color\": {\"type\": \"string\"},\n", + " \"dog_name\": {\"type\": \"string\"},\n", + " \"dog_breed\": {\"type\": \"string\"},\n", + " },\n", + " \"required\": [\"person_name\", \"person_height\"],\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "52841fb3", + "metadata": {}, + "outputs": [], + "source": [ + "inp = \"\"\"\n", + "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n", + "Alex's dog Frosty is a labrador and likes to play hide and seek.\n", + " \"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "93f904ab", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_extraction_chain(schema, llm)" + ] + }, + { + "cell_type": "markdown", + "id": "eb074f7b", + "metadata": {}, + "source": [ + "People attributes and dog attributes were correctly extracted from the text in the same call" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "db3e9e17", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'person_name': 'Alex',\n", + " 'person_height': 5,\n", + " 'person_hair_color': 'blonde',\n", + " 'dog_name': 'Frosty',\n", + " 'dog_breed': 'labrador'},\n", + " {'person_name': 'Claudia',\n", + " 'person_height': 6,\n", + " 'person_hair_color': 'brunette'}]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "0273e0e2", + "metadata": {}, + "source": [ + "## Unrelated entities" + ] + }, + { + "cell_type": "markdown", + "id": "c07b3480", + "metadata": {}, + "source": [ + "What if our entities are unrelated? In that case, the model will return the unrelated entities in different dictionaries, allowing us to successfully extract several unrelated entity types in the same call." + ] + }, + { + "cell_type": "markdown", + "id": "01d98af0", + "metadata": {}, + "source": [ + "Notice that we use `required: []`: we need to allow the model to return **only** person attributes or **only** dog attributes for a single entity (person or dog)" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "id": "e584c993", + "metadata": {}, + "outputs": [], + "source": [ + "schema = {\n", + " \"properties\": {\n", + " \"person_name\": {\"type\": \"string\"},\n", + " \"person_height\": {\"type\": \"integer\"},\n", + " \"person_hair_color\": {\"type\": \"string\"},\n", + " \"dog_name\": {\"type\": \"string\"},\n", + " \"dog_breed\": {\"type\": \"string\"},\n", + " },\n", + " \"required\": [],\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "id": "ad6b105f", + "metadata": {}, + "outputs": [], + "source": [ + "inp = \"\"\"\n", + "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. 
Claudia is a brunette and Alex is blonde.\n", + "\n", + "Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "id": "6bfe5a33", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_extraction_chain(schema, llm)" + ] + }, + { + "cell_type": "markdown", + "id": "24fe09af", + "metadata": {}, + "source": [ + "We have each entity in its own separate dictionary, with only the appropriate attributes being returned" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "id": "f6e1fd89", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n", + " {'person_name': 'Claudia',\n", + " 'person_height': 6,\n", + " 'person_hair_color': 'brunette'},\n", + " {'dog_name': 'Willow', 'dog_breed': 'German Shepherd'},\n", + " {'dog_name': 'Milo', 'dog_breed': 'border collie'}]" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "0ac466d1", + "metadata": {}, + "source": [ + "## Extra info for an entity" + ] + }, + { + "cell_type": "markdown", + "id": "d240ffc1", + "metadata": {}, + "source": [ + "What if.. _we don't know what we want?_ More specifically, say we know a few properties we want to extract for a given entity but we also want to know if there's any extra information in the passage. Fortunately, we don't need to structure everything - we can have unstructured extraction as well. \n", + "\n", + "We can do this by introducing another hack, namely the *extra_info* attribute - let's see an example." + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "id": "f19685f6", + "metadata": {}, + "outputs": [], + "source": [ + "schema = {\n", + " \"properties\": {\n", + " \"person_name\": {\"type\": \"string\"},\n", + " \"person_height\": {\"type\": \"integer\"},\n", + " \"person_hair_color\": {\"type\": \"string\"},\n", + " \"dog_name\": {\"type\": \"string\"},\n", + " \"dog_breed\": {\"type\": \"string\"},\n", + " \"dog_extra_info\": {\"type\": \"string\"},\n", + " },\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "id": "200c3477", + "metadata": {}, + "outputs": [], + "source": [ + "inp = \"\"\"\n", + "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n", + "\n", + "Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "id": "ddad7dc6", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_extraction_chain(schema, llm)" + ] + }, + { + "cell_type": "markdown", + "id": "e5c0dbbc", + "metadata": {}, + "source": [ + "It is nice to know more about Willow and Milo!" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 83, + "id": "c22cfd30", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n", + " {'person_name': 'Claudia',\n", + " 'person_height': 6,\n", + " 'person_hair_color': 'brunette'},\n", + " {'dog_name': 'Willow',\n", + " 'dog_breed': 'German Shepherd',\n", + " 'dog_extra_information': 'likes to play with other dogs'},\n", + " {'dog_name': 'Milo',\n", + " 'dog_breed': 'border collie',\n", + " 'dog_extra_information': 'lives close by'}]" + ] + }, + "execution_count": 83, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chain.run(inp)" + ] + }, + { + "cell_type": "markdown", + "id": "698b4c4d", + "metadata": {}, + "source": [ + "## Pydantic example" + ] + }, + { + "cell_type": "markdown", + "id": "6504a6d9", + "metadata": {}, + "source": [ + "We can also use a Pydantic schema to choose the required properties and types and we will set as 'Optional' those that are not strictly required.\n", + "\n", + "By using the `create_extraction_chain_pydantic` function, we can send a Pydantic schema as input and the output will be an instantiated object that respects our desired schema. \n", + "\n", + "In this way, we can specify our schema in the same manner that we would a new class or function in Python - with purely Pythonic types." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "6792866b", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import Optional, List\n", + "from pydantic import BaseModel, Field" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "36a63761", + "metadata": {}, + "outputs": [], + "source": [ + "class Properties(BaseModel):\n", + " person_name: str\n", + " person_height: int\n", + " person_hair_color: str\n", + " dog_breed: Optional[str]\n", + " dog_name: Optional[str]" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "8ffd1e57", + "metadata": {}, + "outputs": [], + "source": [ + "chain = create_extraction_chain_pydantic(pydantic_schema=Properties, llm=llm)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "24baa954", + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "inp = \"\"\"\n", + "Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. 
Claudia is a brunette and Alex is blonde.\n", + "Alex's dog Frosty is a labrador and likes to play hide and seek.\n", + " \"\"\"" + ] + }, + { + "cell_type": "markdown", + "id": "84e0a241", + "metadata": {}, + "source": [ + "As we can see, we extracted the required entities and their properties in the required format:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "f771df58", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Properties(person_name='Alex', person_height=5, person_hair_color='blonde', dog_breed='labrador', dog_name='Frosty'),\n", + " Properties(person_name='Claudia', person_height=6, person_hair_color='brunette', dog_breed=None, dog_name=None)]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chain.run(inp)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0df61283", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb b/docs/extras/use_cases/graph/graph_arangodb_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb rename to docs/extras/use_cases/graph/graph_arangodb_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_cypher_qa.ipynb b/docs/extras/use_cases/graph/graph_cypher_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_cypher_qa.ipynb rename to docs/extras/use_cases/graph/graph_cypher_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_hugegraph_qa.ipynb b/docs/extras/use_cases/graph/graph_hugegraph_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_hugegraph_qa.ipynb rename to docs/extras/use_cases/graph/graph_hugegraph_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_kuzu_qa.ipynb b/docs/extras/use_cases/graph/graph_kuzu_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_kuzu_qa.ipynb rename to docs/extras/use_cases/graph/graph_kuzu_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_nebula_qa.ipynb b/docs/extras/use_cases/graph/graph_nebula_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_nebula_qa.ipynb rename to docs/extras/use_cases/graph/graph_nebula_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_qa.ipynb b/docs/extras/use_cases/graph/graph_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_qa.ipynb rename to docs/extras/use_cases/graph/graph_qa.ipynb diff --git a/docs/extras/modules/chains/additional/graph_sparql_qa.ipynb b/docs/extras/use_cases/graph/graph_sparql_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/graph_sparql_qa.ipynb rename to docs/extras/use_cases/graph/graph_sparql_qa.ipynb diff --git a/docs/extras/use_cases/graph/index.mdx b/docs/extras/use_cases/graph/index.mdx new file mode 100644 index 0000000000..a9ae6d95a8 --- /dev/null +++ b/docs/extras/use_cases/graph/index.mdx @@ -0,0 
+1,7 @@ +# Analyzing graph data + +Graph databases give us a powerful way to represent and query real-world relationships. There are a number of chains that make it easy to use LLMs to interact with various graph DBs. + +import DocCardList from "@theme/DocCardList"; + + \ No newline at end of file diff --git a/docs/extras/modules/chains/additional/neptune_cypher_qa.ipynb b/docs/extras/use_cases/graph/neptune_cypher_qa.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/neptune_cypher_qa.ipynb rename to docs/extras/use_cases/graph/neptune_cypher_qa.ipynb diff --git a/docs/extras/modules/chains/additional/tot.ipynb b/docs/extras/use_cases/graph/tot.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/tot.ipynb rename to docs/extras/use_cases/graph/tot.ipynb diff --git a/docs/extras/use_cases/question_answering/document-context-aware-QA.ipynb b/docs/extras/use_cases/question_answering/how_to/document-context-aware-QA.ipynb similarity index 99% rename from docs/extras/use_cases/question_answering/document-context-aware-QA.ipynb rename to docs/extras/use_cases/question_answering/how_to/document-context-aware-QA.ipynb index 498639df37..ece1dc235a 100644 --- a/docs/extras/use_cases/question_answering/document-context-aware-QA.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/document-context-aware-QA.ipynb @@ -5,7 +5,7 @@ "id": "88d7cc8c", "metadata": {}, "source": [ - "# Context aware text splitting and QA / Chat\n", + "# Perform context-aware text splitting\n", "\n", "Text splitting for vector storage often uses sentences or other delimiters [to keep related text together](https://www.pinecone.io/learn/chunking-strategies/). \n", "\n", @@ -327,7 +327,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.16" + "version": "3.11.3" }, "vscode": { "interpreter": { diff --git a/docs/extras/modules/chains/additional/flare.ipynb b/docs/extras/use_cases/question_answering/how_to/flare.ipynb similarity index 99% rename from docs/extras/modules/chains/additional/flare.ipynb rename to docs/extras/use_cases/question_answering/how_to/flare.ipynb index 5687e06e2e..3c16bf6955 100644 --- a/docs/extras/modules/chains/additional/flare.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/flare.ipynb @@ -5,7 +5,7 @@ "id": "0f0b9afa", "metadata": {}, "source": [ - "# FLARE\n", + "# Retrieve as you generate with FLARE\n", "\n", "This notebook is an implementation of Forward-Looking Active REtrieval augmented generation (FLARE).\n", "\n", @@ -56,8 +56,7 @@ "source": [ "import os\n", "\n", - "os.environ[\"SERPER_API_KEY\"] = \"\"", - "os.environ[\"OPENAI_API_KEY\"] = \"\"" + "os.environ[\"SERPER_API_KEY\"] = \"\"\nos.environ[\"OPENAI_API_KEY\"] = \"\"" ] }, { @@ -490,7 +489,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/modules/chains/additional/hyde.ipynb b/docs/extras/use_cases/question_answering/how_to/hyde.ipynb similarity index 99% rename from docs/extras/modules/chains/additional/hyde.ipynb rename to docs/extras/use_cases/question_answering/how_to/hyde.ipynb index 257fc129ed..c640e61637 100644 --- a/docs/extras/modules/chains/additional/hyde.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/hyde.ipynb @@ -5,7 +5,7 @@ "id": "ccb74c9b", "metadata": {}, "source": [ - "# Hypothetical Document Embeddings\n", + "# Improve document indexing with HyDE\n", "This notebook 
goes over how to use Hypothetical Document Embeddings (HyDE), as described in [this paper](https://arxiv.org/abs/2212.10496). \n", "\n", "At a high level, HyDE is an embedding technique that takes queries, generates a hypothetical answer, and then embeds that generated document and uses that as the final example. \n", @@ -255,7 +255,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.3" }, "vscode": { "interpreter": { diff --git a/docs/extras/use_cases/question_answering/local_retrieval_qa.ipynb b/docs/extras/use_cases/question_answering/how_to/local_retrieval_qa.ipynb similarity index 99% rename from docs/extras/use_cases/question_answering/local_retrieval_qa.ipynb rename to docs/extras/use_cases/question_answering/how_to/local_retrieval_qa.ipynb index 668b58166f..9eea135a66 100644 --- a/docs/extras/use_cases/question_answering/local_retrieval_qa.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/local_retrieval_qa.ipynb @@ -5,7 +5,7 @@ "id": "3ea857b1", "metadata": {}, "source": [ - "# Running LLMs locally\n", + "# Use local LLMs\n", "\n", "The popularity of projects like [PrivateGPT](https://github.com/imartinez/privateGPT), [llama.cpp](https://github.com/ggerganov/llama.cpp), and [GPT4All](https://github.com/nomic-ai/gpt4all) underscore the importance of running LLMs locally.\n", "\n", @@ -736,7 +736,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.16" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/modules/chains/additional/qa_citations.ipynb b/docs/extras/use_cases/question_answering/how_to/qa_citations.ipynb similarity index 98% rename from docs/extras/modules/chains/additional/qa_citations.ipynb rename to docs/extras/use_cases/question_answering/how_to/qa_citations.ipynb index 5eaf9e5d2a..5c3ab831ca 100644 --- a/docs/extras/modules/chains/additional/qa_citations.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/qa_citations.ipynb @@ -5,7 +5,7 @@ "id": "9b5c258f", "metadata": {}, "source": [ - "# Question-Answering Citations\n", + "# Cite sources\n", "\n", "This notebook shows how to use OpenAI functions ability to extract citations from text." ] @@ -171,7 +171,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/modules/chains/additional/vector_db_text_generation.ipynb b/docs/extras/use_cases/question_answering/how_to/vector_db_text_generation.ipynb similarity index 99% rename from docs/extras/modules/chains/additional/vector_db_text_generation.ipynb rename to docs/extras/use_cases/question_answering/how_to/vector_db_text_generation.ipynb index 1ce3d52963..e183b5049c 100644 --- a/docs/extras/modules/chains/additional/vector_db_text_generation.ipynb +++ b/docs/extras/use_cases/question_answering/how_to/vector_db_text_generation.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Vector store-augmented text generation\n", + "# Retrieve from vector stores directly\n", "\n", "This notebook walks through how to use LangChain for text generation over a vector index. This is useful if we want to generate text that is able to draw from a large body of custom text, for example, generating blog posts that have an understanding of previous blog posts written, or product tutorials that can refer to product documentation." 
] diff --git a/docs/extras/use_cases/question_answering/index.mdx b/docs/extras/use_cases/question_answering/index.mdx index e3edb77120..d668fd70e5 100644 --- a/docs/extras/use_cases/question_answering/index.mdx +++ b/docs/extras/use_cases/question_answering/index.mdx @@ -2,160 +2,116 @@ sidebar_position: 0 --- -# QA and Chat over Documents +# QA over Documents -Chat and Question-Answering (QA) over `data` are popular LLM use-cases. +## Use case +Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this. -`data` can include many things, including: - -* `Unstructured data` (e.g., PDFs) -* `Structured data` (e.g., SQL) -* `Code` (e.g., Python) - -LangChain supports Chat and QA on various `data` types: - -* See [here](https://python.langchain.com/docs/use_cases/code/) and [here](https://twitter.com/cristobal_dev/status/1675745314592915456?s=20) for `Code` -* See [here](https://python.langchain.com/docs/use_cases/tabular) for `Structured data` - -Below we will review Chat and QA on `Unstructured data`. +In this walkthrough we'll go over how to build a question-answering over documents application using LLMs. Two very related use cases which we cover elsewhere are: +- [QA over structured data](/docs/use_cases/tabular) (e.g., SQL) +- [QA over code](/docs/use_cases/code) (e.g., Python) ![intro.png](/img/qa_intro.png) -`Unstructured data` can be loaded from many sources. - -Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders. - +## Overview +The pipeline for converting raw unstructured data into a QA chain looks like this: +1. `Loading`: First we need to load our data. Unstructured data can be loaded from many sources. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders. Each loader returns data as a LangChain [`Document`](https://docs.langchain.com/docs/components/schema/document). - -`Documents` are turned into a Chat or QA app following the general steps below: - -* `Splitting`: [Text splitters](https://python.langchain.com/docs/modules/data_connection/document_transformers/) break `Documents` into splits of specified size -* `Storage`: Storage (e.g., often a [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)) will house [and often embed](https://www.pinecone.io/learn/vector-embeddings/) the splits -* `Retrieval`: The app retrieves splits from storage (e.g., often [with similar embeddings](https://www.pinecone.io/learn/k-nearest-neighbor/) to the input question) -* `Output`: An [LLM](https://python.langchain.com/docs/modules/model_io/models/llms/) produces an answer using a prompt that includes the question and the retrieved splits +2. `Splitting`: [Text splitters](/docs/modules/data_connection/document_transformers/) break `Documents` into splits of specified size +3. `Storage`: Storage (e.g., often a [vectorstore](/docs/modules/data_connection/vectorstores/)) will house [and often embed](https://www.pinecone.io/learn/vector-embeddings/) the splits +4. `Retrieval`: The app retrieves splits from storage (e.g., often [with similar embeddings](https://www.pinecone.io/learn/k-nearest-neighbor/) to the input question) +5. `Generation`: An [LLM](/docs/modules/model_io/models/llms/) produces an answer using a prompt that includes the question and the retrieved data +6. 
`Conversation` (Extension): Hold a multi-turn conversation by adding [Memory](/docs/modules/memory/) to your QA chain. ![flow.jpeg](/img/qa_flow.jpeg) -## Quickstart - -The above pipeline can be wrapped with a `VectorstoreIndexCreator`. - -In particular: +## Quickstart +To give you a sneak preview, the above pipeline can all be wrapped in a single object: `VectorstoreIndexCreator`. Suppose we want a QA app over this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/). We can create this in a few lines of code: -* Specify a `Document` loader -* The `splitting`, `storage`, `retrieval`, and `output` generation stages are wrapped - -Let's load this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/) on agents as an example `Document`. - -We have a QA app in a few lines of code. - -Set environment variables and get packages: -```python -pip install openai -pip install chromadb +First set environment variables and install packages: +```bash +pip install openai chromadb export OPENAI_API_KEY="..." ``` -Run: +Then run: ```python from langchain.document_loaders import WebBaseLoader from langchain.indexes import VectorstoreIndexCreator -# Document loader + loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/") -# Index that wraps above steps index = VectorstoreIndexCreator().from_loaders([loader]) -# Question-answering -question = "What is Task Decomposition?" -index.query(question) ``` - - +And now ask your questions: +```python +index.query("What is Task Decomposition?") +``` ' Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done using LLM with simple prompting, task-specific instructions, or human inputs. Tree of Thoughts (Yao et al. 2023) is an example of a task decomposition technique that explores multiple reasoning possibilities at each step and generates multiple thoughts per step, creating a tree structure.' +Ok, but what's going on under the hood, and how could we customize this for our specific use case? For that, let's take a look at how we can construct this pipeline piece by piece. +## Step 1. Load -Of course, some users do not want this level of abstraction. - -Below, we will discuss each stage in more detail. - -## 1. Loading, Splitting, Storage - - - -### 1.1 Getting started - -Specify a `Document` loader. - +Specify a `DocumentLoader` to load in your unstructured data as `Documents`. A `Document` is a piece of text (the `page_content`) and associated metadata. ```python -# Document loader from langchain.document_loaders import WebBaseLoader + loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/") data = loader.load() ``` -Split the `Document` into chunks for embedding and vector storage. +### Go deeper +- Browse the > 120 data loader integrations [here](https://integrations.langchain.com/). +- See further documentation on loaders [here](/docs/modules/data_connection/document_loaders/). + +## Step 2. Split +Split the `Document` into chunks for embedding and vector storage. ```python -# Split from langchain.text_splitter import RecursiveCharacterTextSplitter + text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0) all_splits = text_splitter.split_documents(data) ``` -Embed and store the splits in a vector database ([Chroma](https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/chroma)). 
- - -```python -# Store -from langchain.vectorstores import Chroma -from langchain.embeddings import OpenAIEmbeddings -vectorstore = Chroma.from_documents(documents=all_splits,embedding=OpenAIEmbeddings()) -``` - -Here are the three pieces together: +### Go deeper -![lc.png](/img/qa_data_load.png) - -### 1.2 Going Deeper - -#### 1.2.1 Integrations - -`Document Loaders` - -* Browse the > 120 data loader integrations [here](https://integrations.langchain.com/). +- `DocumentSplitters` are just one type of the more generic `DocumentTransformers`, which can all be useful in this preprocessing step. +- See further documentation on transformers [here](/docs/modules/data_connection/document_transformers/). +- `Context-aware splitters` keep the location ("context") of each split in the original `Document`: + - [Markdown files](/docs/use_cases/question_answering/document-context-aware-QA) + - [Code (py or js)](/docs/modules/data_connection/document_loaders/integrations/source_code) + - [Documents](/docs/modules/data_connection/document_loaders/integrations/grobid) -* See further documentation on loaders [here](https://python.langchain.com/docs/modules/data_connection/document_loaders/). +## Step 3. Store -`Document Transformers` +To be able to look up our document splits, we first need to store them where we can later look them up. +The most common way to do this is to embed the contents of each document then store the embedding and document in a vector store, with the embedding being used to index the document. -* All can ingest loaded `Documents` and process them (e.g., split). - -* See further documentation on transformers [here](https://python.langchain.com/docs/modules/data_connection/document_transformers/). +```python +from langchain.embeddings import OpenAIEmbeddings +from langchain.vectorstores import Chroma -`Vectorstores` - -* Browse the > 35 vectorstores integrations [here](https://integrations.langchain.com/). - -* See further documentation on vectorstores [here](https://python.langchain.com/docs/modules/data_connection/vectorstores/). - -#### 1.2.2 Retaining metadata +vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings()) +``` -`Context-aware splitters` keep the location ("context") of each split in the original `Document`: +### Go deeper +- Browse the > 40 vectorstores integrations [here](https://integrations.langchain.com/). +- See further documentation on vectorstores [here](/docs/modules/data_connection/vectorstores/). +- Browse the > 30 text embedding integrations [here](https://integrations.langchain.com/). +- See further documentation on embedding models [here](/docs/modules/data_connection/text_embedding/). -* [Markdown files](https://python.langchain.com/docs/use_cases/question_answering/document-context-aware-QA) -* [Code (py or js)](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/source_code) -* [Documents](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/grobid) + Here are Steps 1-3: -## 2. Retrieval +![lc.png](/img/qa_data_load.png) -### 2.1 Getting started - -Retrieve [relevant splits](https://www.pinecone.io/learn/what-is-similarity-search/) for any question using `similarity_search`. +## Step 4. Retrieve +Retrieve relevant splits for any question using [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/). ```python question = "What are the approaches to Task Decomposition?" 
@@ -163,60 +119,39 @@ docs = vectorstore.similarity_search(question) len(docs) ``` - - - 4 +### Go deeper +Vectorstores are commonly used for retrieval, but they are not the only option. For example, SVMs (see thread [here](https://twitter.com/karpathy/status/1647025230546886658?s=20)) can also be used. -### 2.2 Going Deeper - -#### 2.2.1 Retrieval - -Vectorstores are commonly used for retrieval. - -But, they are not the only option. - -For example, SVMs (see thread [here](https://twitter.com/karpathy/status/1647025230546886658?s=20)) can also be used. - -LangChain [has many retrievers](https://python.langchain.com/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. - -All retrievers implement some common methods, such as `get_relevant_documents()`. - +LangChain [has many retrievers](/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. All retrievers implement a common method `get_relevant_documents()` (and its asynchronous variant `aget_relevant_documents()`). ```python from langchain.retrievers import SVMRetriever + svm_retriever = SVMRetriever.from_documents(all_splits,OpenAIEmbeddings()) docs_svm=svm_retriever.get_relevant_documents(question) len(docs_svm) ``` - - - 4 - - -#### 2.2.2 Advanced retrieval - -Improve on `similarity_search`: - -* `MultiQueryRetriever` [generates variants of the input question](https://python.langchain.com/docs/modules/data_connection/retrievers/how_to/MultiQueryRetriever) to improve retrieval. - -* `Max marginal relevance` selects for [relevance and diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) among the retrieved documents. - -* Documents can be filtered during retrieval using [`metadata` filters](https://python.langchain.com/docs/use_cases/question_answering/document-context-aware-QA). +Some common ways to improve on vector similarity search include: +- `MultiQueryRetriever` [generates variants of the input question](/docs/modules/data_connection/retrievers/how_to/MultiQueryRetriever) to improve retrieval. +- `Max marginal relevance` selects for [relevance and diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) among the retrieved documents. +- Documents can be filtered during retrieval using [`metadata` filters](/docs/use_cases/question_answering/document-context-aware-QA). ```python -# MultiQueryRetriever import logging + from langchain.chat_models import ChatOpenAI from langchain.retrievers.multi_query import MultiQueryRetriever + logging.basicConfig() logging.getLogger('langchain.retrievers.multi_query').setLevel(logging.INFO) + retriever_from_llm = MultiQueryRetriever.from_llm(retriever=vectorstore.as_retriever(), llm=ChatOpenAI(temperature=0)) unique_docs = retriever_from_llm.get_relevant_documents(query=question) @@ -226,79 +161,48 @@ len(unique_docs) INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be approached?', '2. What are the different methods for Task Decomposition?', '3. What are the various approaches to decomposing tasks?'] 5 +## Step 5. Generate - - -## 3. QA - -### 3.1 Getting started - -Distill the retrieved documents into an answer using an LLM (e.g., `gpt-3.5-turbo`) with `RetrievalQA` chain. - +Distill the retrieved documents into an answer using an LLM/Chat model (e.g., `gpt-3.5-turbo`) with `RetrievalQA` chain. 
```python -from langchain.chat_models import ChatOpenAI -llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) from langchain.chains import RetrievalQA -qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever()) -qa_chain({"query": question}) -``` - - - - - {'query': 'What are the approaches to Task Decomposition?', - 'result': 'The approaches to task decomposition include:\n\n1. Simple prompting: This approach involves using simple prompts or questions to guide the agent in breaking down a task into smaller subgoals. For example, the agent can be prompted with "Steps for XYZ" and asked to list the subgoals for achieving XYZ.\n\n2. Task-specific instructions: In this approach, task-specific instructions are provided to the agent to guide the decomposition process. For example, if the task is to write a novel, the agent can be instructed to "Write a story outline" as a subgoal.\n\n3. Human inputs: This approach involves incorporating human inputs in the task decomposition process. Humans can provide guidance, feedback, and suggestions to help the agent break down complex tasks into manageable subgoals.\n\nThese approaches aim to enable efficient handling of complex tasks by breaking them down into smaller, more manageable parts.'} - - - -### 3.2 Going Deeper - -#### 3.2.1 Integrations - -`LLMs` - -* Browse the > 55 LLM integrations [here](https://integrations.langchain.com/). - -* See further documentation on LLMs [here](https://python.langchain.com/docs/modules/model_io/models/). - -#### 3.2.2 Running LLMs locally - -The popularity of [PrivateGPT](https://github.com/imartinez/privateGPT) and [GPT4All](https://github.com/nomic-ai/gpt4all) underscore the importance of running LLMs locally. - -LangChain has integrations with many open source LLMs that can be run locally. - -Using `GPT4All` is as simple as [downloading the binary]((https://python.langchain.com/docs/integrations/llms/gpt4all)) and then: - +from langchain.chat_models import ChatOpenAI -```python -from langchain.llms import GPT4All -from langchain.chains import RetrievalQA -llm = GPT4All(model="/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin",max_tokens=2048) +llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever()) -``` - - -```python qa_chain({"query": question}) ``` + { + 'query': 'What are the approaches to Task Decomposition?', + 'result': 'The approaches to task decomposition include:\n\n1. Simple prompting: This approach involves using simple prompts or questions to guide the agent in breaking down a task into smaller subgoals. For example, the agent can be prompted with "Steps for XYZ" and asked to list the subgoals for achieving XYZ.\n\n2. Task-specific instructions: In this approach, task-specific instructions are provided to the agent to guide the decomposition process. For example, if the task is to write a novel, the agent can be instructed to "Write a story outline" as a subgoal.\n\n3. Human inputs: This approach involves incorporating human inputs in the task decomposition process. Humans can provide guidance, feedback, and suggestions to help the agent break down complex tasks into manageable subgoals.\n\nThese approaches aim to enable efficient handling of complex tasks by breaking them down into smaller, more manageable parts.' + } +Note, you can pass in an `LLM` or a `ChatModel` (like we did here) to the `RetrievalQA` chain. 
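For example, a completion-style LLM can be swapped in without any other changes. A minimal sketch, reusing the `vectorstore` and `question` defined earlier:

```python
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# Same chain as above, but backed by a (non-chat) LLM
llm = OpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": question})
```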
+### Go deeper - {'query': 'What are the approaches to Task Decomposition?', - 'result': ' There are three main approaches to task decomposition: (1) using language models like GPT-3 for simple prompting such as "Steps for XYZ.\\n1.", (2) using task-specific instructions, and (3) with human inputs.'} +#### Choosing LLMs +- Browse the > 55 LLM and chat model integrations [here](https://integrations.langchain.com/). +- See further documentation on LLMs and chat models [here](/docs/modules/model_io/models/). +- Use local LLMs: The popularity of [PrivateGPT](https://github.com/imartinez/privateGPT) and [GPT4All](https://github.com/nomic-ai/gpt4all) underscores the importance of running LLMs locally. +Using `GPT4All` is as simple as [downloading the binary](/docs/integrations/llms/gpt4all) and then: + from langchain.llms import GPT4All + from langchain.chains import RetrievalQA + llm = GPT4All(model="/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin",max_tokens=2048) + qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever()) -#### 3.2.2 Customizing the prompt +#### Customizing the prompt The prompt in `RetrievalQA` chain can be easily customized. - ```python -# Build prompt +from langchain.chains import RetrievalQA from langchain.prompts import PromptTemplate + template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum and keep the answer as concise as possible. @@ -306,33 +210,28 @@ Always say "thanks for asking!" at the end of the answer. {context} Question: {question} Helpful Answer:""" -QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"],template=template,) +QA_CHAIN_PROMPT = PromptTemplate.from_template(template) -# Run chain -from langchain.chains import RetrievalQA llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) -qa_chain = RetrievalQA.from_chain_type(llm, - retriever=vectorstore.as_retriever(), - chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}) - +qa_chain = RetrievalQA.from_chain_type( + llm, + retriever=vectorstore.as_retriever(), + chain_type_kwargs={"prompt": QA_CHAIN_PROMPT} +) result = qa_chain({"query": question}) result["result"] ``` - - - 'The approaches to Task Decomposition are (1) using simple prompting by LLM, (2) using task-specific instructions, and (3) with human inputs. Thanks for asking!' - -#### 3.2.3 Returning source documents +#### Return source documents The full set of retrieved documents used for answer distillation can be returned using `return_source_documents=True`. - ```python from langchain.chains import RetrievalQA + qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever(), return_source_documents=True) result = qa_chain({"query": question}) @@ -345,51 +244,46 @@ result['source_documents'][0] -#### 3.2.4 Citations +#### Return citations Answer citations can be returned using `RetrievalQAWithSourcesChain`. 
```python from langchain.chains import RetrievalQAWithSourcesChain + qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm,retriever=vectorstore.as_retriever()) + result = qa_chain({"question": question}) result ``` + { + 'question': 'What are the approaches to Task Decomposition?', + 'answer': 'The approaches to Task Decomposition include (1) using LLM with simple prompting, (2) using task-specific instructions, and (3) incorporating human inputs.\n', + 'sources': 'https://lilianweng.github.io/posts/2023-06-23-agent/' + } - - - {'question': 'What are the approaches to Task Decomposition?', - 'answer': 'The approaches to Task Decomposition include (1) using LLM with simple prompting, (2) using task-specific instructions, and (3) incorporating human inputs.\n', - 'sources': 'https://lilianweng.github.io/posts/2023-06-23-agent/'} - - - -#### 3.2.5 Customizing retrieved docs in the LLM prompt +#### Customizing retrieved document processing Retrieved documents can be fed to an LLM for answer distillation in a few different ways. -`stuff`, `refine`, `map-reduce`, and `map-rerank` chains for passing documents to an LLM prompt are well summarized [here](https://python.langchain.com/docs/modules/chains/document/). +`stuff`, `refine`, `map-reduce`, and `map-rerank` chains for passing documents to an LLM prompt are well summarized [here](/docs/modules/chains/document/). `stuff` is commonly used because it simply "stuffs" all retrieved documents into the prompt. -The [load_qa_chain](https://python.langchain.com/docs/modules/chains/additional/question_answering.html) is an easy way to pass documents to an LLM using these various approaches (e.g., see `chain_type`). +The [load_qa_chain](/docs/use_cases/question_answering/how_to/question_answering.html) is an easy way to pass documents to an LLM using these various approaches (e.g., see `chain_type`). ```python from langchain.chains.question_answering import load_qa_chain + chain = load_qa_chain(llm, chain_type="stuff") chain({"input_documents": unique_docs, "question": question},return_only_outputs=True) ``` - - - {'output_text': 'The approaches to task decomposition include (1) using simple prompting to break down tasks into subgoals, (2) providing task-specific instructions to guide the decomposition process, and (3) incorporating human inputs for task decomposition.'} - - We can also pass the `chain_type` to `RetrievalQA`. @@ -403,55 +297,46 @@ In summary, the user can choose the desired level of abstraction for QA: ![summary_chains.png](/img/summary_chains.png) -## 4. Chat - -### 4.1 Getting started - -To keep chat history, first specify a `Memory buffer` to track the conversation inputs / outputs. +## Step 6. Converse (Extension) +To hold a conversation, a chain needs to be able to refer to past interactions. Chain `Memory` allows us to do this. To keep chat history, we can specify a Memory buffer to track the conversation inputs / outputs. ```python from langchain.memory import ConversationBufferMemory + memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True) ``` The `ConversationalRetrievalChain` uses chat in the `Memory buffer`. 
- ```python from langchain.chains import ConversationalRetrievalChain -retriever=vectorstore.as_retriever() -chat = ConversationalRetrievalChain.from_llm(llm,retriever=retriever,memory=memory) -``` +retriever = vectorstore.as_retriever() +chat = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory) +``` ```python result = chat({"question": "What are some of the main ideas in self-reflection?"}) result['answer'] ``` - - - "Some of the main ideas in self-reflection include:\n1. Iterative improvement: Self-reflection allows autonomous agents to improve by refining past action decisions and correcting mistakes.\n2. Trial and error: Self-reflection is crucial in real-world tasks where trial and error are inevitable.\n3. Two-shot examples: Self-reflection is created by showing pairs of failed trajectories and ideal reflections for guiding future changes in the plan.\n4. Working memory: Reflections are added to the agent's working memory, up to three, to be used as context for querying.\n5. Performance evaluation: Self-reflection involves continuously reviewing and analyzing actions, self-criticizing behavior, and reflecting on past decisions and strategies to refine approaches.\n6. Efficiency: Self-reflection encourages being smart and efficient, aiming to complete tasks in the least number of steps." - - -The `Memory buffer` has context to resolve `"it"` ("self-reflection") in the below question. - +The Memory buffer has context to resolve `"it"` ("self-reflection") in the question below. ```python result = chat({"question": "How does the Reflexion paper handle it?"}) result['answer'] ``` - - - "The Reflexion paper handles self-reflection by showing two-shot examples to the Learning Language Model (LLM). Each example consists of a failed trajectory and an ideal reflection that guides future changes in the agent's plan. These reflections are then added to the agent's working memory, up to a maximum of three, to be used as context for querying the LLM. This allows the agent to iteratively improve its reasoning skills by refining past action decisions and correcting previous mistakes." +### Go deeper +The [documentation](/docs/use_cases/question_answering/how_to/chat_vector_db) on `ConversationalRetrievalChain` offers a few extensions, such as streaming and source documents. -### 4.2 Going deeper -The [documentation](https://python.langchain.com/docs/modules/chains/popular/chat_vector_db) on `ConversationalRetrievalChain` offers a few extensions, such as streaming and source documents. +## Further reading +- Check out the [How to](/docs/use_cases/question_answering/how_to/) section for all the variations of chains that can be used for QA over docs in different settings. +- Check out the [Integrations-specific](/docs/use_cases/question_answering/integrations/) section for chains that use specific integrations. 
diff --git a/docs/extras/use_cases/question_answering/integrations/_category_.yml b/docs/extras/use_cases/question_answering/integrations/_category_.yml new file mode 100644 index 0000000000..4a4b0b2f28 --- /dev/null +++ b/docs/extras/use_cases/question_answering/integrations/_category_.yml @@ -0,0 +1 @@ +label: 'Integration-specific' diff --git a/docs/extras/modules/chains/additional/openai_functions_retrieval_qa.ipynb b/docs/extras/use_cases/question_answering/integrations/openai_functions_retrieval_qa.ipynb similarity index 99% rename from docs/extras/modules/chains/additional/openai_functions_retrieval_qa.ipynb rename to docs/extras/use_cases/question_answering/integrations/openai_functions_retrieval_qa.ipynb index b3dcbdee5d..c64c3427f2 100644 --- a/docs/extras/modules/chains/additional/openai_functions_retrieval_qa.ipynb +++ b/docs/extras/use_cases/question_answering/integrations/openai_functions_retrieval_qa.ipynb @@ -5,7 +5,7 @@ "id": "71a43144", "metadata": {}, "source": [ - "# Retrieval QA using OpenAI functions\n", + "# Structure answers with OpenAI functions\n", "\n", "OpenAI functions allows for structuring of response output. This is often useful in question answering when you want to not only get the final answer but also supporting evidence, citations, etc.\n", "\n", @@ -337,7 +337,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "ac9e4626", "metadata": {}, @@ -431,7 +430,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -445,7 +444,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/use_cases/question_answering/semantic-search-over-chat.ipynb b/docs/extras/use_cases/question_answering/integrations/semantic-search-over-chat.ipynb similarity index 95% rename from docs/extras/use_cases/question_answering/semantic-search-over-chat.ipynb rename to docs/extras/use_cases/question_answering/integrations/semantic-search-over-chat.ipynb index c877a2a524..800866053b 100644 --- a/docs/extras/use_cases/question_answering/semantic-search-over-chat.ipynb +++ b/docs/extras/use_cases/question_answering/integrations/semantic-search-over-chat.ipynb @@ -1,18 +1,16 @@ { "cells": [ { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ - "# Question answering over a group chat messages using Activeloop's DeepLake\n", + "# QA using Activeloop's DeepLake\n", "In this tutorial, we are going to use Langchain + Activeloop's Deep Lake with GPT4 to semantically search and ask questions over a group chat.\n", "\n", "View a working demo [here](https://twitter.com/thisissukh_/status/1647223328363679745)" ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -29,7 +27,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -37,7 +34,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [] @@ -73,7 +69,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -83,7 +78,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -124,7 +118,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -155,7 +148,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -213,7 +205,7 @@ "name": "python", "nbconvert_exporter": "python", 
"pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.3" } }, "nbformat": 4, diff --git a/docs/extras/use_cases/self_check/index.mdx b/docs/extras/use_cases/self_check/index.mdx new file mode 100644 index 0000000000..a424ea4370 --- /dev/null +++ b/docs/extras/use_cases/self_check/index.mdx @@ -0,0 +1,8 @@ +# Self-checking + +One of the main issues with using LLMs is that they can often hallucinate and make false claims. One of the surprisingly effective ways to remediate this is to use the LLM itself to check its own answers. + +import DocCardList from "@theme/DocCardList"; + + + diff --git a/docs/extras/modules/chains/additional/llm_checker.ipynb b/docs/extras/use_cases/self_check/llm_checker.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_checker.ipynb rename to docs/extras/use_cases/self_check/llm_checker.ipynb diff --git a/docs/extras/modules/chains/additional/llm_summarization_checker.ipynb b/docs/extras/use_cases/self_check/llm_summarization_checker.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/llm_summarization_checker.ipynb rename to docs/extras/use_cases/self_check/llm_summarization_checker.ipynb diff --git a/docs/extras/use_cases/summarization.mdx b/docs/extras/use_cases/summarization/index.mdx similarity index 82% rename from docs/extras/use_cases/summarization.mdx rename to docs/extras/use_cases/summarization/index.mdx index 1e9a7e2f49..7f5e97c763 100644 --- a/docs/extras/use_cases/summarization.mdx +++ b/docs/extras/use_cases/summarization/index.mdx @@ -16,7 +16,7 @@ chain.run(docs) ``` The following resources exist: -- [Summarization notebook](/docs/modules/chains/popular/summarize.html): A notebook walking through how to accomplish this task. +- [Summarization notebook](/docs/use_cases/summarization/summarize.html): A notebook walking through how to accomplish this task. Additional related resources include: - [Modules for working with documents](/docs/modules/data_connection): Core components for working with documents. diff --git a/docs/extras/modules/chains/additional/elasticsearch_database.ipynb b/docs/extras/use_cases/tabular/elasticsearch_database.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/elasticsearch_database.ipynb rename to docs/extras/use_cases/tabular/elasticsearch_database.ipynb diff --git a/docs/extras/use_cases/tabular.mdx b/docs/extras/use_cases/tabular/index.mdx similarity index 80% rename from docs/extras/use_cases/tabular.mdx rename to docs/extras/use_cases/tabular/index.mdx index d642307632..497acdc71d 100644 --- a/docs/extras/use_cases/tabular.mdx +++ b/docs/extras/use_cases/tabular/index.mdx @@ -10,7 +10,7 @@ This page covers all resources available in LangChain for working with data in t ## Document loading If you have text data stored in a tabular format, you may want to load the data into a Document and then index it as you would other text/unstructured data. For this, you should use a document loader like the [CSVLoader](/docs/modules/data_connection/document_loaders/how_to/csv.html) -and then you should [create an index](/docs/modules/data_connection) over that data, and [query it that way](/docs/modules/chains/popular/vector_db_qa.html). +and then you should [create an index](/docs/modules/data_connection) over that data, and [query it that way](/docs/use_cases/question_answering/how_to/vector_db_qa.html). 
## Querying If you have more numeric tabular data, or have a large amount of data and don't want to index it, you should get started @@ -22,7 +22,7 @@ If you are just getting started, and you have relatively small/simple tabular da Chains are a sequence of predetermined steps, so they are good to get started with as they give you more control and let you understand what is happening better. -- [SQL Database Chain](/docs/modules/chains/popular/sqlite.html) +- [SQL Database Chain](/docs/use_cases/tabular/sqlite.html) ### Agents @@ -30,6 +30,6 @@ Agents are more complex, and involve multiple queries to the LLM to understand w The downside of agents are that you have less control. The upside is that they are more powerful, which allows you to use them on larger databases and more complex schemas. -- [SQL Agent](/docs/modules/agents/toolkits/sql_database.html) -- [Pandas Agent](/docs/modules/agents/toolkits/pandas.html) -- [CSV Agent](/docs/modules/agents/toolkits/csv.html) +- [SQL Agent](/docs/integrations/toolkits/sql_database.html) +- [Pandas Agent](/docs/integrations/toolkits/pandas.html) +- [CSV Agent](/docs/integrations/toolkits/csv.html) diff --git a/docs/extras/modules/chains/additional/tagging.ipynb b/docs/extras/use_cases/tagging.ipynb similarity index 100% rename from docs/extras/modules/chains/additional/tagging.ipynb rename to docs/extras/use_cases/tagging.ipynb diff --git a/docs/snippets/modules/chains/additional/multi_prompt_router.mdx b/docs/snippets/modules/chains/additional/multi_prompt_router.mdx deleted file mode 100644 index 526469814b..0000000000 --- a/docs/snippets/modules/chains/additional/multi_prompt_router.mdx +++ /dev/null @@ -1,107 +0,0 @@ -```python -from langchain.chains.router import MultiPromptChain -from langchain.llms import OpenAI -``` - - -```python -physics_template = """You are a very smart physics professor. \ -You are great at answering questions about physics in a concise and easy to understand manner. \ -When you don't know the answer to a question you admit that you don't know. - -Here is a question: -{input}""" - - -math_template = """You are a very good mathematician. You are great at answering math questions. \ -You are so good because you are able to break down hard problems into their component parts, \ -answer the component parts, and then put them together to answer the broader question. - -Here is a question: -{input}""" -``` - - -```python -prompt_infos = [ - { - "name": "physics", - "description": "Good for answering questions about physics", - "prompt_template": physics_template - }, - { - "name": "math", - "description": "Good for answering math questions", - "prompt_template": math_template - } -] -``` - - -```python -chain = MultiPromptChain.from_prompts(OpenAI(), prompt_infos, verbose=True) -``` - - -```python -print(chain.run("What is black body radiation?")) -``` - - - -``` - - - > Entering new MultiPromptChain chain... - physics: {'input': 'What is black body radiation?'} - > Finished chain. - - - Black body radiation is the emission of electromagnetic radiation from a body due to its temperature. It is a type of thermal radiation that is emitted from the surface of all objects that are at a temperature above absolute zero. It is a spectrum of radiation that is influenced by the temperature of the body and is independent of the composition of the emitting material. 
-``` - - - - -```python -print(chain.run("What is the first prime number greater than 40 such that one plus the prime number is divisible by 3")) -``` - - - -``` - - - > Entering new MultiPromptChain chain... - math: {'input': 'What is the first prime number greater than 40 such that one plus the prime number is divisible by 3'} - > Finished chain. - ? - - The first prime number greater than 40 such that one plus the prime number is divisible by 3 is 43. To solve this problem, we can break down the question into two parts: finding the first prime number greater than 40, and then finding a number that is divisible by 3. - - The first step is to find the first prime number greater than 40. A prime number is a number that is only divisible by 1 and itself. The next prime number after 40 is 41. - - The second step is to find a number that is divisible by 3. To do this, we can add 1 to 41, which gives us 42. Now, we can check if 42 is divisible by 3. 42 divided by 3 is 14, so 42 is divisible by 3. - - Therefore, the answer to the question is 43. -``` - - - - -```python -print(chain.run("What is the name of the type of cloud that rins")) -``` - - - -``` - - - > Entering new MultiPromptChain chain... - None: {'input': 'What is the name of the type of cloud that rains?'} - > Finished chain. - The type of cloud that typically produces rain is called a cumulonimbus cloud. This type of cloud is characterized by its large vertical extent and can produce thunderstorms and heavy precipitation. Is there anything else you'd like to know? -``` - -
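For the chain-based querying option listed in the tabular section above (the SQL Database Chain), a minimal sketch looks like the following; the Chinook SQLite file is only an illustrative sample, and depending on the installed release the chain may need to be imported from `langchain.chains` instead of `langchain_experimental.sql`:

```python
from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain  # older releases: from langchain.chains import SQLDatabaseChain

# Point the chain at an existing SQLite file (Chinook is a sample database).
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = OpenAI(temperature=0)

# The chain writes the SQL query, executes it, and phrases the result in natural language.
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run("How many employees are there?")
```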