nbdoc -> quarto (#14156)

Switches to a better-maintained tool (`quarto`) for building ipynb -> md
files.

Also drops us down to Python 3.8 because it is significantly faster in the
Vercel build step, and uses the default OpenSSL version instead of
upgrading it.
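
Most of the notebook hunks in this commit replace site-absolute image links (`/img/...`) with paths relative to each notebook (`../../static/img/...`, deeper for nested notebooks). A minimal sketch of that rewrite follows. It assumes images live in a `static/img` directory that sits beside the top-level `docs` folder; the function name `relativize_images` and the notebook paths are hypothetical and not part of this commit.

```python
import posixpath
import re

# Matches Markdown image links whose target is site-absolute, e.g. ![alt](/img/x.png)
IMG_LINK = re.compile(r"(!\[[^\]]*\]\()/img/([^)]+)\)")


def relativize_images(markdown: str, notebook_path: str, static_dir: str = "static") -> str:
    """Rewrite /img/... links so they are relative to the notebook's directory.

    Assumption: `notebook_path` and `static_dir` are both given relative to
    the same root directory (hypothetical layout, not confirmed by the commit).
    """
    notebook_dir = posixpath.dirname(notebook_path)
    # Relative path from the notebook's folder to the image folder.
    prefix = posixpath.relpath(posixpath.join(static_dir, "img"), start=notebook_dir)
    return IMG_LINK.sub(lambda m: f"{m.group(1)}{prefix}/{m.group(2)})", markdown)


print(relativize_images("![stuff_diagram](/img/stuff.jpg)", "docs/use_cases/example.ipynb"))
```

The number of `../` segments depends only on how deep the notebook sits, which is consistent with some hunks below using two levels and others using three or four.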
Erick Friis 2023-12-04 12:50:56 -08:00 committed by GitHub
parent eecfa3f9e5
commit f6d68d78f3
19 changed files with 84 additions and 110 deletions

View File

@ -12,14 +12,12 @@ mkdir -p ../_dist
rsync -ruv . ../_dist
cd ../_dist
poetry run python scripts/model_feat_table.py
poetry run nbdoc_build --srcdir docs --pause 0
mkdir docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
mkdir -p docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
poetry run python scripts/generate_api_reference_links.py
yarn install
yarn start
yarn
quarto preview docs

View File

@ -10,7 +10,7 @@
"title: Why use LCEL\n",
"---\n",
"\n",
"import { ColumnContainer, Column } from '@theme/Columns';"
"{ import { ColumnContainer, Column } from \"@theme/Columns\"; }"
]
},
{
@ -18,7 +18,8 @@
"id": "919a5ae2-ed21-4923-b98f-723c111bac67",
"metadata": {},
"source": [
":::tip We recommend reading the LCEL [Get started](/docs/expression_language/get_started) section first.\n",
":::tip \n",
"We recommend reading the LCEL [Get started](/docs/expression_language/get_started) section first.\n",
":::"
]
},
@ -62,11 +63,12 @@
"In the simplest case, we just want to pass in a topic string and get back a joke string:\n",
"\n",
"<ColumnContainer>\n",
"\n",
"<Column>\n",
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -76,6 +78,7 @@
"metadata": {},
"outputs": [],
"source": [
"\n",
"from typing import List\n",
"\n",
"import openai\n",
@ -111,7 +114,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -156,7 +159,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -201,7 +204,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -233,7 +236,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -265,7 +268,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -296,7 +299,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -337,7 +340,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>\n",
"<div style=\"zoom:80%\">\n",
"\n",
"```python\n",
"chain.ainvoke(\"ice cream\")\n",
@ -362,7 +365,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -398,7 +401,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -439,7 +442,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -481,7 +484,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -522,7 +525,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -607,7 +610,7 @@
"\n",
"#### With LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -677,7 +680,7 @@
"\n",
"We'll `print` intermediate steps for illustrative purposes\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -711,7 +714,7 @@
"#### LCEL\n",
"Every component has built-in integrations with LangSmith. If we set the following two environment variables, all chain traces are logged to LangSmith.\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -757,7 +760,7 @@
"#### Without LCEL\n",
"\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -804,7 +807,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -845,7 +848,7 @@
"\n",
"#### Without LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{
@ -1029,7 +1032,7 @@
"\n",
"#### LCEL\n",
"\n",
"<div style={{ zoom: \"80%\" }}>"
"<div style=\"zoom:80%\">"
]
},
{

View File

@ -12,7 +12,7 @@ Platforms with tracing capabilities like [LangSmith](/docs/langsmith/) and [Wand
For anyone building production-grade LLM applications, we highly recommend using a platform like this.
![LangSmith run](/img/run_details.png)
![LangSmith run](../../static/img/run_details.png)
## `set_debug` and `set_verbose`

View File

@ -32,7 +32,7 @@
"1. `Base model`: What is the base-model and how was it trained?\n",
"2. `Fine-tuning approach`: Was the base-model fine-tuned and, if so, what [set of instructions](https://cameronrwolfe.substack.com/p/beyond-llama-the-power-of-open-llms#%C2%A7alpaca-an-instruction-following-llama-model) was used?\n",
"\n",
"![Image description](/img/OSS_LLM_overview.png)\n",
"![Image description](../../static/img/OSS_LLM_overview.png)\n",
"\n",
"The relative performance of these models can be assessed using several leaderboards, including:\n",
"\n",
@ -55,7 +55,7 @@
"\n",
"In particular, see [this excellent post](https://finbarr.ca/how-is-llama-cpp-possible/) on the importance of quantization.\n",
"\n",
"![Image description](/img/llama-memory-weights.png)\n",
"![Image description](../../static/img/llama-memory-weights.png)\n",
"\n",
"With less precision, we radically decrease the memory needed to store the LLM in memory.\n",
"\n",
@ -63,7 +63,7 @@
"\n",
"A Mac M2 Max is 5-6x faster than a M1 for inference due to the larger GPU memory bandwidth.\n",
"\n",
"![Image description](/img/llama_t_put.png)\n",
"![Image description](../../static/img/llama_t_put.png)\n",
"\n",
"## Quickstart\n",
"\n",

View File

@ -9,7 +9,7 @@
"\n",
"The map reduce documents chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.\n",
"\n",
"![map_reduce_diagram](/img/map_reduce.jpg)"
"![map_reduce_diagram](../../../../static/img/map_reduce.jpg)"
]
},
{

View File

@ -9,7 +9,7 @@
"\n",
"The map re-rank documents chain runs an initial prompt on each document, that not only tries to complete a task but also gives a score for how certain it is in its answer. The highest scoring response is returned.\n",
"\n",
"![map_rerank_diagram](/img/map_rerank.jpg)"
"![map_rerank_diagram](../../../../static/img/map_rerank.jpg)"
]
},
{

View File

@ -24,7 +24,7 @@
"The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.\n",
"There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.\n",
"\n",
"![refine_diagram](/img/refine.jpg)\n"
"![refine_diagram](../../../../static/img/refine.jpg)\n"
]
},
{

View File

@ -20,7 +20,7 @@
"\n",
"This chain is well-suited for applications where documents are small and only a few are passed in for most calls.\n",
"\n",
"![stuff_diagram](/img/stuff.jpg)"
"![stuff_diagram](../../../../static/img/stuff.jpg)"
]
},
{

View File

@ -34,7 +34,7 @@
"* `Functions`: For example, [OpenAI functions](https://platform.openai.com/docs/guides/gpt/function-calling) is one popular means of doing this.\n",
"* `LLM-generated interface`: Use an LLM with access to API documentation to create an interface.\n",
"\n",
"![Image description](/img/api_use_case.png)"
"![Image description](../../static/img/api_use_case.png)"
]
},
{
@ -188,7 +188,7 @@
" }\n",
" ```\n",
" \n",
"![Image description](/img/api_function_call.png)\n",
"![Image description](../../static/img/api_function_call.png)\n",
" \n",
"* This `Dict` above split and the [API is called here](https://github.com/langchain-ai/langchain/blob/7fc07ba5df99b9fa8bef837b0fafa220bc5c932c/libs/langchain/langchain/chains/openai_functions/openapi.py#L215)."
]
@ -293,12 +293,12 @@
"\n",
"* The `api_request_chain` produces the API url from our question and the API documentation:\n",
"\n",
"![Image description](/img/api_chain.png)\n",
"![Image description](../../static/img/api_chain.png)\n",
"\n",
"* [Here](https://github.com/langchain-ai/langchain/blob/bbd22b9b761389a5e40fc45b0570e1830aabb707/libs/langchain/langchain/chains/api/base.py#L82) we make the API request with the API url.\n",
"* The `api_answer_chain` takes the response from the API and provides us with a natural language response:\n",
"\n",
"![Image description](/img/api_chain_response.png)"
"![Image description](../../static/img/api_chain_response.png)"
]
},
{

View File

@ -30,7 +30,7 @@
"id": "56615b45",
"metadata": {},
"source": [
"![Image description](/img/chat_use_case.png)"
"![Image description](../../static/img/chat_use_case.png)"
]
},
{
@ -546,7 +546,7 @@
"source": [
"We can see the chat history preserved in the prompt using the [LangSmith trace](https://smith.langchain.com/public/dce34c57-21ca-4283-9020-a8e0d78a59de/r).\n",
"\n",
"![Image description](/img/chat_use_case_2.png)"
"![Image description](../../static/img/chat_use_case_2.png)"
]
},
{

View File

@ -34,7 +34,7 @@
"id": "178dbc59",
"metadata": {},
"source": [
"![Image description](/img/extraction.png)"
"![Image description](../../static/img/extraction.png)"
]
},
{
@ -139,7 +139,7 @@
"\n",
"The [LangSmith trace](https://smith.langchain.com/public/72bc3205-7743-4ca6-929a-966a9d4c2a77/r) shows that we call the function `information_extraction` on the input string, `inp`.\n",
"\n",
"![Image description](/img/extraction_trace_function.png)\n",
"![Image description](../../static/img/extraction_trace_function.png)\n",
"\n",
"This `information_extraction` function is defined [here](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/openai_functions/extraction.py) and returns a dict.\n",
"\n",
@ -497,7 +497,7 @@
"source": [
"We can see from the [LangSmith trace](https://smith.langchain.com/public/8e3aa858-467e-46a5-aa49-5db65f0a2b9a/r) that we get the same output as above.\n",
"\n",
"![Image description](/img/extraction_trace_function_2.png)\n",
"![Image description](../../static/img/extraction_trace_function_2.png)\n",
"\n",
"We can see that we provide a two-shot prompt in order to instruct the LLM to output in our desired format.\n",
"\n",
@ -577,7 +577,7 @@
"\n",
"We can look at the [LangSmith trace](https://smith.langchain.com/public/69f11d41-41be-4319-93b0-6d0eda66e969/r) to see exactly what is going on under the hood.\n",
"\n",
"![Image description](/img/extraction_trace_joke.png)\n",
"![Image description](../../static/img/extraction_trace_joke.png)\n",
"\n",
"### Going deeper\n",
"\n",
@ -587,6 +587,12 @@
"* [JSONFormer](/docs/integrations/llms/jsonformer_experimental) offers another way for structured decoding of a subset of the JSON Schema.\n",
"* [Kor](https://eyurtsev.github.io/kor/) is another library for extraction where schema and examples can be provided to the LLM."
]
},
{
"cell_type": "markdown",
"id": "aab95ecf",
"metadata": {},
"source": []
}
],
"metadata": {

View File

@ -40,7 +40,7 @@
"2. `Query a SQL database` using chains for query creation and execution\n",
"3. `Interact with a SQL database` using agents for robust and flexible querying \n",
"\n",
"![sql_usecase.png](/img/sql_usecase.png)\n",
"![sql_usecase.png](../../../static/img/sql_usecase.png)\n",
"\n",
"## Quickstart\n",
"\n",
@ -240,7 +240,7 @@
"* Followed by three example rows in a `SELECT` statement\n",
"\n",
"`create_sql_query_chain` adopts this the best practice (see more in this [blog](https://blog.langchain.dev/llms-and-sql/)). \n",
"![sql_usecase.png](/img/create_sql_query_chain.png)\n",
"![sql_usecase.png](../../../static/img/create_sql_query_chain.png)\n",
"\n",
"**Improvements**\n",
"\n",
@ -397,7 +397,7 @@
"\n",
"* Then, it executes the query and passes the results to an LLM for synthesis.\n",
"\n",
"![sql_usecase.png](/img/sqldbchain_trace.png)\n",
"![sql_usecase.png](../../../static/img/sqldbchain_trace.png)\n",
"\n",
"**Improvements**\n",
"\n",
@ -661,7 +661,7 @@
"\n",
"* It finally executes the generated query using tool `sql_db_query`\n",
"\n",
"![sql_usecase.png](/img/SQLDatabaseToolkit.png)"
"![sql_usecase.png](../../../static/img/SQLDatabaseToolkit.png)"
]
},
{

View File

@ -24,7 +24,7 @@
"- Using LLMs for suggesting refactors or improvements\n",
"- Using LLMs for documenting the code\n",
"\n",
"![Image description](/img/code_understanding.png)\n",
"![Image description](../../../static/img/code_understanding.png)\n",
"\n",
"## Overview\n",
"\n",
@ -339,7 +339,7 @@
"* In particular, the code well structured and kept together in the retrieval output\n",
"* The retrieved code and chat history are passed to the LLM for answer distillation\n",
"\n",
"![Image description](/img/code_retrieval.png)"
"![Image description](../../../static/img/code_retrieval.png)"
]
},
{

View File

@ -58,13 +58,13 @@
"2. **Split**: [Text splitters](/docs/modules/data_connection/document_transformers/) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't in a model's finite context window.\n",
"3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](/docs/modules/data_connection/vectorstores/) and [Embeddings](/docs/modules/data_connection/text_embedding/) model.\n",
"\n",
"![index_diagram](/img/rag_indexing.png)\n",
"![index_diagram](../../../static/img/rag_indexing.png)\n",
"\n",
"#### Retrieval and generation\n",
"4. **Retrieve**: Given a user input, relevant splits are retrieved from storage using a [Retriever](/docs/modules/data_connection/retrievers/).\n",
"5. **Generate**: A [ChatModel](/docs/modules/model_io/chat_models) / [LLM](/docs/modules/model_io/llms/) produces an answer using a prompt that includes the question and the retrieved data\n",
"\n",
"![retrieval_diagram](/img/rag_retrieval_generation.png)"
"![retrieval_diagram](../../../static/img/rag_retrieval_generation.png)"
]
},
{

View File

@ -32,7 +32,7 @@
"id": "8e233997",
"metadata": {},
"source": [
"![Image description](/img/summarization_use_case_1.png)"
"![Image description](../../static/img/summarization_use_case_1.png)"
]
},
{
@ -56,7 +56,7 @@
"id": "08ec66bc",
"metadata": {},
"source": [
"![Image description](/img/summarization_use_case_2.png)"
"![Image description](../../static/img/summarization_use_case_2.png)"
]
},
{
@ -514,7 +514,7 @@
"* The blog post and associated [repo](https://github.com/mendableai/QA_clustering) also introduce clustering as a means of summarization.\n",
"* This opens up a third path beyond the `stuff` or `map-reduce` approaches that is worth considering.\n",
"\n",
"![Image description](/img/summarization_use_case_3.png)"
"![Image description](../../static/img/summarization_use_case_3.png)"
]
},
{

View File

@ -28,7 +28,7 @@
"- covered topics\n",
"- political tendency\n",
"\n",
"![Image description](/img/tagging.png)\n",
"![Image description](../../static/img/tagging.png)\n",
"\n",
"## Overview\n",
"\n",
@ -293,7 +293,7 @@
"* As with [extraction](/docs/use_cases/extraction), we call the `information_extraction` function [here](https://github.com/langchain-ai/langchain/blob/269f85b7b7ffd74b38cd422d4164fc033388c3d0/libs/langchain/langchain/chains/openai_functions/extraction.py#L20) on the input string.\n",
"* This OpenAI function extraction information based upon the provided schema.\n",
"\n",
"![Image description](/img/tagging_trace.png)"
"![Image description](../../static/img/tagging_trace.png)"
]
},
{

View File

@ -25,7 +25,7 @@
"* Users have [highlighted it](https://twitter.com/GregKamradt/status/1679913813297225729?s=20) as one of his top desired AI tools. \n",
"* OSS repos like [gpt-researcher](https://github.com/assafelovic/gpt-researcher) are growing in popularity. \n",
" \n",
"![Image description](/img/web_scraping.png)\n",
"![Image description](../../static/img/web_scraping.png)\n",
" \n",
"## Overview\n",
"\n",
@ -443,7 +443,7 @@
"source": [
"We can compare the headlines scraped to the page:\n",
"\n",
"![Image description](/img/wsj_page.png)\n",
"![Image description](../../static/img/wsj_page.png)\n",
"\n",
"Looking at the [LangSmith trace](https://smith.langchain.com/public/c3070198-5b13-419b-87bf-3821cdf34fa6/r), we can see what is going on under the hood:\n",
"\n",
@ -463,7 +463,7 @@
"\n",
"We can automate the process of [web research](https://blog.langchain.dev/automating-web-research/) using a retriever, such as the `WebResearchRetriever` ([docs](https://python.langchain.com/docs/modules/data_connection/retrievers/web_research)).\n",
"\n",
"![Image description](/img/web_research.png)\n",
"![Image description](../../static/img/web_research.png)\n",
"\n",
"Copy requirements [from here](https://github.com/langchain-ai/web-explorer/blob/main/requirements.txt):\n",
"\n",

View File

@ -1,57 +1,24 @@
#!/bin/bash
version_compare() {
local v1=(${1//./ })
local v2=(${2//./ })
for i in {0..2}; do
if (( ${v1[i]} < ${v2[i]} )); then
return 1
fi
done
return 0
}
yum -y update
yum install gcc bzip2-devel libffi-devel zlib-devel wget tar gzip -y
amazon-linux-extras install python3.8 -y
openssl_version=$(openssl version | awk '{print $2}')
required_openssl_version="1.1.1"
# install quarto
wget -q https://github.com/quarto-dev/quarto-cli/releases/download/v1.3.450/quarto-1.3.450-linux-amd64.tar.gz
tar -xzf quarto-1.3.450-linux-amd64.tar.gz
export PATH=$PATH:$(pwd)/quarto-1.3.450/bin/
python_version=$(python3 --version 2>&1 | awk '{print $2}')
required_python_version="3.10"
echo "OpenSSL Version"
echo $openssl_version
echo "Python Version"
echo $python_version
# If openssl version is less than 1.1.1 AND python version is less than 3.10
if ! version_compare $openssl_version $required_openssl_version && ! version_compare $python_version $required_python_version; then
### See: https://github.com/urllib3/urllib3/issues/2168
# Requests lib breaks for old SSL versions,
# which are defaults on Amazon Linux 2 (which Vercel uses for builds)
yum -y update
yum remove openssl-devel -y
yum install gcc bzip2-devel libffi-devel zlib-devel wget tar -y
yum install openssl11 -y
yum install openssl11-devel -y
wget https://www.python.org/ftp/python/3.11.4/Python-3.11.4.tgz
tar xzf Python-3.11.4.tgz
cd Python-3.11.4
./configure
make altinstall
echo "Python Version"
python3.11 --version
cd ..
fi
python3.11 -m venv .venv
python3.8 -m venv .venv
source .venv/bin/activate
python3.11 -m pip install --upgrade pip
python3.11 -m pip install -r vercel_requirements.txt
python3.11 scripts/model_feat_table.py
python3.8 -m pip install --upgrade pip
python3.8 -m pip install -r vercel_requirements.txt
python3.8 scripts/model_feat_table.py
mkdir docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
python3.11 scripts/copy_templates.py
python3.8 scripts/copy_templates.py
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
nbdoc_build --srcdir docs --pause 0
python3.11 scripts/generate_api_reference_links.py
wget -q https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
quarto render docs/

View File

@ -1,3 +1,3 @@
-e ../libs/langchain
-e ../libs/core
nbdoc
urllib3==1.26.18