Minor update to Nomic cookbook (#16886)
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
2024-02-01 19:28:58 +00:00
|
|
|
{
|
|
|
|
"cells": [
|
|
|
|
{
|
|
|
|
"attachments": {
|
|
|
|
"4015a2e2-3400-4539-bd93-0d987ec5a44e.png": {
|
|
|
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAABLoAAAPKCAYAAACTHTMvAAAMP2lDQ1BJQ0MgUHJvZmlsZQAASImVVwdYU8kWnluSkEBCCSAgJfQmCEgJICWEFkB6EWyEJEAoMQaCiB1dVHDtYgEbuiqi2AGxI3YWwd4XRRSUdbFgV96kgK77yvfO9829//3nzH/OnDu3DADqp7hicQ6qAUCuKF8SGxLAGJucwiB1AwTggAYIgMDl5YlZ0dERANrg+e/27ib0hnbNQab1z/7/app8QR4PACQa4jR+Hi8X4kMA4JU8sSQfAKKMN5+aL5Zh2IC2BCYI8UIZzlDgShlOU+B9cp/4WDbEzQCoqHG5kgwAaG2QZxTwMqAGrQ9iJxFfKAJAnQGxb27uZD7EqRDbQB8xxDJ9ZtoPOhl/00wb0uRyM4awYi5yUwkU5olzuNP+z3L8b8vNkQ7GsIJNLVMSGiubM6zb7ezJ4TKsBnGvKC0yCmItiD8I+XJ/iFFKpjQ0QeGPGvLy2LBmQBdiJz43MBxiQ4iDRTmREUo+LV0YzIEYrhC0UJjPiYdYD+KFgrygOKXPZsnkWGUstC5dwmYp+QtciTyuLNZDaXYCS6n/OlPAUepjtKLM+CSIKRBbFAgTIyGmQeyYlx0XrvQZXZTJjhz0kUhjZflbQBwrEIUEKPSxgnRJcKzSvzQ3b3C+2OZMISdSiQ/kZ8aHKuqDNfO48vzhXLA2gYiVMKgjyBsbMTgXviAwSDF3rFsgSohT6nwQ5wfEKsbiFHFOtNIfNxPkhMh4M4hd8wrilGPxxHy4IBX6eLo4PzpekSdelMUNi1bkgy8DEYANAgEDSGFLA5NBFhC29tb3witFTzDgAgnIAALgoGQGRyTJe0TwGAeKwJ8QCUDe0LgAea8AFED+6xCrODqAdHlvgXxENngKcS4IBznwWiofJRqKlgieQEb4j+hc2Hgw3xzYZP3/nh9kvzMsyEQoGelgRIb6oCcxiBhIDCUGE21xA9wX98Yj4NEfNheciXsOzuO7P+EpoZ3wmHCD0EG4M0lYLPkpyzGgA+oHK2uR9mMtcCuo6YYH4D5QHSrjurgBcMBdYRwW7gcju0GWrcxbVhXGT9p/m8EPd0PpR3Yio+RhZH+yzc8jaXY0tyEVWa1/rI8i17SherOHen6Oz/6h+nx4Dv/ZE1uIHcTOY6exi9gxrB4wsJNYA9aCHZfhodX1RL66BqPFyvPJhjrCf8QbvLOySuY51Tj1OH1R9OULCmXvaMCeLJ4mEWZk5jNY8IsgYHBEPMcRDBcnF1cAZN8XxevrTYz8u4Hotnzn5v0BgM/JgYGBo9+5sJMA7PeAj/+R75wNE346VAG4cIQnlRQoOFx2IMC3hDp80vSBMTAHNnA+LsAdeAN/EATCQBSIB8lgIsw+E65zCZgKZoC5oASUgWVgNVgPNoGtYCfYAw6AenAMnAbnwGXQBm6Ae3D1dIEXoA+8A58RBCEhVISO6CMmiCVij7ggTMQXCUIikFgkGUlFMhARIkVmIPOQMmQFsh7ZglQj+5EjyGnkItKO3EEeIT3Ia+QTiqFqqDZqhFqhI1EmykLD0Xh0ApqBTkGL0PnoEnQtWoXuRuvQ0+hl9Abagb5A+zGAqWK6mCnmgDExNhaFpWDpmASbhZVi5VgVVos1wvt8DevAerGPOBGn4wzcAa7gUDwB5+FT8Fn4Ynw9vhOvw5vxa/gjvA//RqASDAn2BC8ChzCWkEGYSighlBO2Ew4TzsJnqYvwjkgk6hKtiR7wWUwmZhGnExcTNxD3Ek8R24mdxH4SiaRPsif5kKJIXFI+qYS0jrSbdJJ0ldRF+qCiqmKi4qISrJKiIlIpVilX2aVyQuWqyjOVz2QNsiXZixxF5pOnkZeSt5EbyVfIXeTPFE2KNcWHEk/JosylrKXUUs5S7lPeqKqqmql6qsaoClXnqK5V3ad6QfWR6kc1LTU7NbbaeDWp2hK1HWqn1O6ovaFSqVZUf2oKNZ+6hFpNPUN9SP1Ao9McaRwanzabVkGro12lvVQnq1uqs9Qnqhepl6sfVL+i3qtB1rDSYGtwNWZpVGgc0bil0a9J13TWjNLM1VysuUvzoma3FknLSitIi681X2ur1hmtTjpGN6ez6Tz6PPo2+ll6lzZR21qbo52lXaa9R7tVu09HS8dVJ1GnUKdC57hOhy6ma6XL0c3RXap7QPem7qdhRsNYwwTDFg2rHXZ12Hu94Xr+egK9Ur29ejf0Pukz9IP0s/WX69frPzDADewMYgymGmw0OGvQO1x7uPdw3vDS4QeG3zVEDe0MYw2nG241bDHsNzI2CjESG60zOmPUa6xr7G+cZbzK+IRxjwndxNdEaLLK5KTJc4YOg8XIYaxlNDP6TA1NQ02lpltMW00/m1mbJZgVm+01e2BOMWeap5uvMm8y77MwsRhjMcOixuKuJdmSaZlpucbyvOV7K2urJKsFVvVW3dZ61hzrIusa6/s2VBs/myk2VTbXbYm2TNts2w22bXaonZtdpl2F3RV71N7dXmi/wb59BGGE5wjRiKoRtxzUHFgOBQ41Do8cdR0jHIsd6x1fjrQYmTJy+cjzI785uTnlOG1zuues5RzmXOzc6Pzaxc6F51Lhcn0UdVTwqNmjGka9crV3FbhudL3tRncb47bArcntq7uHu8S91r3Hw8Ij1aPS4xZTmxnNXMy84EnwDPCc7XnM86OXu1e+1wGvv7wdvLO9d3l3j7YeLRi9bXSnj5kP12eLT4cvwzfVd7Nvh5+pH9evyu+xv7k/33+7/zOWLSuLtZv1MsApQBJwOOA924s9k30qEAsMCSwNbA3SCkoIWh/0MNgsOCO4JrgvxC1kesipUEJoeOjy0FscIw6PU83pC/MImxnWHK4WHhe+PvxxhF2EJKJxDDombMzKMfcjLSNFkfVRIIoTtTLqQbR19JToozHEmOiYipinsc6xM2LPx9HjJsXtinsXHxC/NP5egk2CNKEpUT1xfGJ14vukwKQVSR1jR46dOfZyskGyMLkhhZSSmLI9pX9c0LjV47rGu40vGX9zgvWEwgkXJxpMzJl4fJL6JO6kg6mE1KTUXalfuFHcKm5/GietMq2Px+at4b3g+/NX8XsEPoIVgmfpPukr0rszfDJWZvRk+mWWZ/YK2cL1wldZoVmbst5nR2XvyB7IScrZm6uSm5p7RKQlyhY1TzaeXDi5XWwvLhF3TPGasnpKnyRcsj0PyZuQ15CvDX/kW6Q20l+kjwp8CyoKPkxNnHqwULNQVNgyzW7aomnPioKLfpuOT+dNb5phOmPujEczWTO3zEJmpc1qmm0+e/7srjkhc3bOpczNnvt7sVPxiuK385LmNc43mj9nfucvIb/UlNBKJCW3Fngv2LQQXyhc2Lpo1KJ1i76V8ksvlTmVlZd9WcxbfOlX51/X/jqwJH1J61L3pRuXEZeJlt1c7rd85wrNFUUrOleOWVm3irGqdNXb1ZNWXyx3Ld+0hrJGuqZjbcTahnUW65at+7I+c/2NioCKvZWGlYsq32/gb7i60X9j7SajTWWbPm0Wbr69JWRLXZVVVflW4taCrU+3JW47/xvzt+rtBtvLtn/dIdrRsTN2Z3O1R3X1LsNdS2vQGmlNz+7xu9v2BO5pqHWo3bJXd2/ZPrBPuu/5/tT9Nw+EH2g6yDxYe8jyUOVh+uHSOqRuWl1ffWZ9R0NyQ/uRsCNNjd6Nh486Ht1xzPRYxXGd40tPUE7MPzFwsuhk/ynxqd7TGac7myY13Tsz9sz15pjm1rPhZy+cCz535jzr/MkLPheOXfS6eOQS81L9ZffLdS1uLYd/d/v9cKt7a90VjysNbZ5tje2j209c9bt6+lrgtXPXOdcv34i80X4z4ebtW+Nvddzm3+6+k3Pn1d2Cu5/vzblPuF/6QONB+UPDh1V/2P6xt8O94/ijwEctj+Me3+vkdb54kvfkS9f8p9Sn5c9MnlV3u3Qf6wnuaXs+7nnXC/GLz70lf2r+WfnS5uWhv/z/aukb29f1SvJq4PXiN/pvdrx1fdvUH93/8F3uu8/vSz/of9j5kfnx/KekT88+T/1C+rL2q+3Xxm/h
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "d8da6094-30c7-43f3-a608-c91717b673db",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"# Nomic Embeddings\n",
|
|
|
|
"\n",
|
|
|
|
"Nomic has released a new embedding model with strong performance for long context retrieval (8k context window).\n",
|
|
|
|
"\n",
|
|
|
|
"The cookbook walks through the process of building and deploying (via LangServe) a RAG app using Nomic embeddings.\n",
|
|
|
|
"\n",
|
|
|
|
"![Screenshot 2024-02-01 at 9.14.15 AM.png](attachment:4015a2e2-3400-4539-bd93-0d987ec5a44e.png)\n",
|
|
|
|
"\n",
|
|
|
|
"## Signup\n",
|
|
|
|
"\n",
|
|
|
|
"Get your API token, then run:\n",
|
|
|
|
"```\n",
|
|
|
|
"! nomic login\n",
|
|
|
|
"```\n",
|
|
|
|
"\n",
|
|
|
|
"Then run with your generated API token \n",
|
|
|
|
"```\n",
|
|
|
|
"! nomic login < token > \n",
|
|
|
|
"```"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "f737ec15-e9ab-4629-b54c-24be69e8b60b",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"! nomic login"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "8ab7434a-2930-42b5-9164-dc2c03abe232",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"! nomic login token"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "a3501e2a-4686-4b95-8a1c-f19e035ea354",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"! pip install -U langchain-nomic langchain_community tiktoken langchain-openai chromadb langchain"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "85cecf42-0144-425b-86c8-219ff17c0195",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"# Optional: LangSmith API keys\n",
|
|
|
|
"import os\n",
|
|
|
|
"\n",
|
|
|
|
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
|
|
|
|
"os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://api.smith.langchain.com\"\n",
|
|
|
|
"os.environ[\"LANGCHAIN_API_KEY\"] = \"api_key\""
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "134475f2-f256-4c13-9712-c55783e6a4e2",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"## Document Loading\n",
|
|
|
|
"\n",
|
|
|
|
"Let's test 3 interesting blog posts."
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "01c4d270-171e-45c2-a1b6-e350faa74117",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"from langchain_community.document_loaders import WebBaseLoader\n",
|
|
|
|
"\n",
|
|
|
|
"urls = [\n",
|
|
|
|
" \"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n",
|
|
|
|
" \"https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/\",\n",
|
|
|
|
" \"https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/\",\n",
|
|
|
|
"]\n",
|
|
|
|
"\n",
|
|
|
|
"docs = [WebBaseLoader(url).load() for url in urls]\n",
|
|
|
|
"docs_list = [item for sublist in docs for item in sublist]"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "75ab7f74-873c-4d84-af5a-5cf19c61239d",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"## Splitting \n",
|
|
|
|
"\n",
|
|
|
|
"Long context retrieval "
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "f512e128-629e-4304-926f-94fe5c999527",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
2024-03-01 02:33:21 +00:00
|
|
|
"from langchain_text_splitters import CharacterTextSplitter\n",
|
Minor update to Nomic cookbook (#16886)
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
2024-02-01 19:28:58 +00:00
|
|
|
"\n",
|
|
|
|
"text_splitter = CharacterTextSplitter.from_tiktoken_encoder(\n",
|
|
|
|
" chunk_size=7500, chunk_overlap=100\n",
|
|
|
|
")\n",
|
|
|
|
"doc_splits = text_splitter.split_documents(docs_list)"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "d2a69cf0-e3ab-4c92-a1d0-10da45c08b3b",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"import tiktoken\n",
|
|
|
|
"\n",
|
|
|
|
"encoding = tiktoken.get_encoding(\"cl100k_base\")\n",
|
|
|
|
"encoding = tiktoken.encoding_for_model(\"gpt-3.5-turbo\")\n",
|
|
|
|
"for d in doc_splits:\n",
|
|
|
|
" print(\"The document is %s tokens\" % len(encoding.encode(d.page_content)))"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "c58d1e9b-e98e-4bd9-b52f-4dfc2a4e69f4",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"## Index \n",
|
|
|
|
"\n",
|
|
|
|
"Nomic embeddings [here](https://docs.nomic.ai/reference/endpoints/nomic-embed-text). "
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "76447866-bf8b-412b-93bc-d6ea8ec35952",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"import os\n",
|
|
|
|
"\n",
|
|
|
|
"from langchain_community.vectorstores import Chroma\n",
|
|
|
|
"from langchain_core.output_parsers import StrOutputParser\n",
|
|
|
|
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
|
|
|
|
"from langchain_nomic import NomicEmbeddings\n",
|
|
|
|
"from langchain_nomic.embeddings import NomicEmbeddings"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "15b3eab2-2689-49d4-8cb0-67ef2adcbc49",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"# Add to vectorDB\n",
|
|
|
|
"vectorstore = Chroma.from_documents(\n",
|
|
|
|
" documents=doc_splits,\n",
|
|
|
|
" collection_name=\"rag-chroma\",\n",
|
|
|
|
" embedding=NomicEmbeddings(model=\"nomic-embed-text-v1\"),\n",
|
|
|
|
")\n",
|
|
|
|
"retriever = vectorstore.as_retriever()"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "41131122-3591-4566-aac1-ed19d496820a",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"## RAG Chain\n",
|
|
|
|
"\n",
|
|
|
|
"We can use the Mistral `v0.2`, which is [fine-tuned for 32k context](https://x.com/dchaplot/status/1734198245067243629?s=20).\n",
|
|
|
|
"\n",
|
|
|
|
"We can [use Ollama](https://ollama.ai/library/mistral) -\n",
|
|
|
|
"```\n",
|
|
|
|
"ollama pull mistral:instruct\n",
|
|
|
|
"```\n",
|
|
|
|
"\n",
|
|
|
|
"We can also run [GPT-4 128k](https://openai.com/blog/new-models-and-developer-products-announced-at-devday). "
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "1397de64-5b4a-4001-adc5-570ff8d31ff6",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"from langchain_community.chat_models import ChatOllama\n",
|
|
|
|
"from langchain_core.prompts import ChatPromptTemplate\n",
|
|
|
|
"from langchain_openai import ChatOpenAI\n",
|
|
|
|
"\n",
|
|
|
|
"# Prompt\n",
|
|
|
|
"template = \"\"\"Answer the question based only on the following context:\n",
|
|
|
|
"{context}\n",
|
|
|
|
"\n",
|
|
|
|
"Question: {question}\n",
|
|
|
|
"\"\"\"\n",
|
|
|
|
"prompt = ChatPromptTemplate.from_template(template)\n",
|
|
|
|
"\n",
|
|
|
|
"# LLM API\n",
|
|
|
|
"model = ChatOpenAI(temperature=0, model=\"gpt-4-1106-preview\")\n",
|
|
|
|
"\n",
|
|
|
|
"# Local LLM\n",
|
|
|
|
"ollama_llm = \"mistral:instruct\"\n",
|
|
|
|
"model_local = ChatOllama(model=ollama_llm)\n",
|
|
|
|
"\n",
|
|
|
|
"# Chain\n",
|
|
|
|
"chain = (\n",
|
|
|
|
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
|
|
|
|
" | prompt\n",
|
|
|
|
" | model_local\n",
|
|
|
|
" | StrOutputParser()\n",
|
|
|
|
")"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "1548e00c-1ff6-4e88-aa13-69badf2088fb",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": [
|
|
|
|
"# Question\n",
|
|
|
|
"chain.invoke(\"What are the types of agent memory?\")"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "5ec5b4c3-757d-44df-92ea-dd5f08017dd6",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"**Mistral**\n",
|
|
|
|
"\n",
|
|
|
|
"Trace: 24k prompt tokens.\n",
|
|
|
|
"\n",
|
|
|
|
"* https://smith.langchain.com/public/3e04d475-ea08-4ee3-ae66-6416a93d8b08/r\n",
|
|
|
|
"\n",
|
|
|
|
"--- \n",
|
|
|
|
"\n",
|
|
|
|
"Some considerations are noted in the [needle in a haystack analysis](https://twitter.com/GregKamradt/status/1722386725635580292?lang=en):\n",
|
|
|
|
"\n",
|
|
|
|
"* LLMs may suffer with retrieval from large context depending on where the information is placed."
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"attachments": {
|
|
|
|
"0afd4ea4-7ba2-4bfb-8e6d-57300e7a651f.png": {
|
|
|
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAABFIAAALDCAYAAAA2UCTWAAAMP2lDQ1BJQ0MgUHJvZmlsZQAASImVVwdYU8kWnluSkEBCCSAgJfQmCEgJICWEFkB6EWyEJEAoMQaCiB1dVHDtYgEbuiqi2AGxI3YWwd4XRRSUdbFgV96kgK77yvfO9829//3nzH/OnDu3DADqp7hicQ6qAUCuKF8SGxLAGJucwiB1AwTggAYIgMDl5YlZ0dERANrg+e/27ib0hnbNQab1z/7/app8QR4PACQa4jR+Hi8X4kMA4JU8sSQfAKKMN5+aL5Zh2IC2BCYI8UIZzlDgShlOU+B9cp/4WDbEzQCoqHG5kgwAaG2QZxTwMqAGrQ9iJxFfKAJAnQGxb27uZD7EqRDbQB8xxDJ9ZtoPOhl/00wb0uRyM4awYi5yUwkU5olzuNP+z3L8b8vNkQ7GsIJNLVMSGiubM6zb7ezJ4TKsBnGvKC0yCmItiD8I+XJ/iFFKpjQ0QeGPGvLy2LBmQBdiJz43MBxiQ4iDRTmREUo+LV0YzIEYrhC0UJjPiYdYD+KFgrygOKXPZsnkWGUstC5dwmYp+QtciTyuLNZDaXYCS6n/OlPAUepjtKLM+CSIKRBbFAgTIyGmQeyYlx0XrvQZXZTJjhz0kUhjZflbQBwrEIUEKPSxgnRJcKzSvzQ3b3C+2OZMISdSiQ/kZ8aHKuqDNfO48vzhXLA2gYiVMKgjyBsbMTgXviAwSDF3rFsgSohT6nwQ5wfEKsbiFHFOtNIfNxPkhMh4M4hd8wrilGPxxHy4IBX6eLo4PzpekSdelMUNi1bkgy8DEYANAgEDSGFLA5NBFhC29tb3witFTzDgAgnIAALgoGQGRyTJe0TwGAeKwJ8QCUDe0LgAea8AFED+6xCrODqAdHlvgXxENngKcS4IBznwWiofJRqKlgieQEb4j+hc2Hgw3xzYZP3/nh9kvzMsyEQoGelgRIb6oCcxiBhIDCUGE21xA9wX98Yj4NEfNheciXsOzuO7P+EpoZ3wmHCD0EG4M0lYLPkpyzGgA+oHK2uR9mMtcCuo6YYH4D5QHSrjurgBcMBdYRwW7gcju0GWrcxbVhXGT9p/m8EPd0PpR3Yio+RhZH+yzc8jaXY0tyEVWa1/rI8i17SherOHen6Oz/6h+nx4Dv/ZE1uIHcTOY6exi9gxrB4wsJNYA9aCHZfhodX1RL66BqPFyvPJhjrCf8QbvLOySuY51Tj1OH1R9OULCmXvaMCeLJ4mEWZk5jNY8IsgYHBEPMcRDBcnF1cAZN8XxevrTYz8u4Hotnzn5v0BgM/JgYGBo9+5sJMA7PeAj/+R75wNE346VAG4cIQnlRQoOFx2IMC3hDp80vSBMTAHNnA+LsAdeAN/EATCQBSIB8lgIsw+E65zCZgKZoC5oASUgWVgNVgPNoGtYCfYAw6AenAMnAbnwGXQBm6Ae3D1dIEXoA+8A58RBCEhVISO6CMmiCVij7ggTMQXCUIikFgkGUlFMhARIkVmIPOQMmQFsh7ZglQj+5EjyGnkItKO3EEeIT3Ia+QTiqFqqDZqhFqhI1EmykLD0Xh0ApqBTkGL0PnoEnQtWoXuRuvQ0+hl9Abagb5A+zGAqWK6mCnmgDExNhaFpWDpmASbhZVi5VgVVos1wvt8DevAerGPOBGn4wzcAa7gUDwB5+FT8Fn4Ynw9vhOvw5vxa/gjvA//RqASDAn2BC8ChzCWkEGYSighlBO2Ew4TzsJnqYvwjkgk6hKtiR7wWUwmZhGnExcTNxD3Ek8R24mdxH4SiaRPsif5kKJIXFI+qYS0jrSbdJJ0ldRF+qCiqmKi4qISrJKiIlIpVilX2aVyQuWqyjOVz2QNsiXZixxF5pOnkZeSt5EbyVfIXeTPFE2KNcWHEk/JosylrKXUUs5S7lPeqKqqmql6qsaoClXnqK5V3ad6QfWR6kc1LTU7NbbaeDWp2hK1HWqn1O6ovaFSqVZUf2oKNZ+6hFpNPUN9SP1Ao9McaRwanzabVkGro12lvVQnq1uqs9Qnqhepl6sfVL+i3qtB1rDSYGtwNWZpVGgc0bil0a9J13TWjNLM1VysuUvzoma3FknLSitIi681X2ur1hmtTjpGN6ez6Tz6PPo2+ll6lzZR21qbo52lXaa9R7tVu09HS8dVJ1GnUKdC57hOhy6ma6XL0c3RXap7QPem7qdhRsNYwwTDFg2rHXZ12Hu94Xr+egK9Ur29ejf0Pukz9IP0s/WX69frPzDADewMYgymGmw0OGvQO1x7uPdw3vDS4QeG3zVEDe0MYw2nG241bDHsNzI2CjESG60zOmPUa6xr7G+cZbzK+IRxjwndxNdEaLLK5KTJc4YOg8XIYaxlNDP6TA1NQ02lpltMW00/m1mbJZgVm+01e2BOMWeap5uvMm8y77MwsRhjMcOixuKuJdmSaZlpucbyvOV7K2urJKsFVvVW3dZ61hzrIusa6/s2VBs/myk2VTbXbYm2TNts2w22bXaonZtdpl2F3RV71N7dXmi/wb59BGGE5wjRiKoRtxzUHFgOBQ41Do8cdR0jHIsd6x1fjrQYmTJy+cjzI785uTnlOG1zuues5RzmXOzc6Pzaxc6F51Lhcn0UdVTwqNmjGka9crV3FbhudL3tRncb47bArcntq7uHu8S91r3Hw8Ij1aPS4xZTmxnNXMy84EnwDPCc7XnM86OXu1e+1wGvv7wdvLO9d3l3j7YeLRi9bXSnj5kP12eLT4cvwzfVd7Nvh5+pH9evyu+xv7k/33+7/zOWLSuLtZv1MsApQBJwOOA924s9k30qEAsMCSwNbA3SCkoIWh/0MNgsOCO4JrgvxC1kesipUEJoeOjy0FscIw6PU83pC/MImxnWHK4WHhe+PvxxhF2EJKJxDDombMzKMfcjLSNFkfVRIIoTtTLqQbR19JToozHEmOiYipinsc6xM2LPx9HjJsXtinsXHxC/NP5egk2CNKEpUT1xfGJ14vukwKQVSR1jR46dOfZyskGyMLkhhZSSmLI9pX9c0LjV47rGu40vGX9zgvWEwgkXJxpMzJl4fJL6JO6kg6mE1KTUXalfuFHcKm5/GietMq2Px+at4b3g+/NX8XsEPoIVgmfpPukr0rszfDJWZvRk+mWWZ/YK2cL1wldZoVmbst5nR2XvyB7IScrZm6uSm5p7RKQlyhY1TzaeXDi5XWwvLhF3TPGasnpKnyRcsj0PyZuQ15CvDX/kW6Q20l+kjwp8CyoKPkxNnHqwULNQVNgyzW7aomnPioKLfpuOT+dNb5phOmPujEczWTO3zEJmpc1qmm0+e/7srjkhc3bOpczNnvt7sVPxiuK385LmNc43mj9nfucvIb/UlNBKJCW3Fngv2LQQXyhc2Lpo1KJ1i76V8ksvlTmVlZd9WcxbfOlX51/X/jqwJH1J61L3pRuXEZeJlt1c7rd85wrNFUUrOleOWVm3irGqdNXb1ZNWXyx3Ld+0hrJGuqZjbcTahnUW65at+7I+c/2NioCKvZWGlYsq32/gb7i60X9j7SajTWWbPm0Wbr69JWRLXZVVVflW4taCrU+3JW47/xvzt+rtBtvLtn/dIdrRsTN2Z3O1R3X1LsNdS2vQGmlNz+7xu9v2BO5pqHWo3bJXd2/ZPrBPuu/5/tT9Nw+EH2g6yDxYe8jyUOVh+uHSOqRuWl1ffWZ9R0NyQ/uRsCNNjd6Nh486Ht1xzPRYxXGd40tPUE7MPzFwsuhk/ynxqd7TGac7myY13Tsz9sz15pjm1rPhZy+cCz535jzr/MkLPheOXfS6eOQS81L9ZffLdS1uLYd/d/v9cKt7a90VjysNbZ5tje2j209c9bt6+lrgtXPXOdcv34i80X4z4ebtW+Nvddzm3+6+k3Pn1d2Cu5/vzblPuF/6QONB+UPDh1V/2P6xt8O94/ijwEctj+Me3+vkdb54kvfkS9f8p9Sn5c9MnlV3u3Qf6wnuaXs+7nnXC/GLz70lf2r+WfnS5uWhv/z/aukb29f1SvJq4PXiN/pvdrx1fdvUH93/8F3uu8/vSz/of9j5kfnx/KekT88+T/1C+rL2q+3Xxm/h
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"cell_type": "markdown",
|
|
|
|
"id": "de7e6f9e-0c69-47a7-be8a-0ae9233e036c",
|
|
|
|
"metadata": {},
|
|
|
|
"source": [
|
|
|
|
"## LangServe\n",
|
|
|
|
"\n",
|
|
|
|
"Create a LangServe app. \n",
|
|
|
|
"\n",
|
|
|
|
"![Screenshot 2024-02-01 at 10.36.05 AM.png](attachment:0afd4ea4-7ba2-4bfb-8e6d-57300e7a651f.png)\n",
|
|
|
|
"\n",
|
|
|
|
"```\n",
|
|
|
|
"$ conda create -n template-testing-env python=3.11\n",
|
|
|
|
"$ conda activate template-testing-env\n",
|
|
|
|
"$ pip install -U \"langchain-cli[serve]\" \"langserve[all]\"\n",
|
|
|
|
"$ langchain app new .\n",
|
|
|
|
"$ poetry add langchain-nomic langchain_community tiktoken langchain-openai chromadb langchain\n",
|
|
|
|
"$ poetry install\n",
|
|
|
|
"```\n",
|
|
|
|
"\n",
|
|
|
|
"---\n",
|
|
|
|
"\n",
|
|
|
|
"Add above logic to new file `chain.py`.\n",
|
|
|
|
"\n",
|
|
|
|
"---\n",
|
|
|
|
"\n",
|
|
|
|
"Add to `server.py` -\n",
|
|
|
|
"\n",
|
|
|
|
"```\n",
|
|
|
|
"from app.chain import chain as nomic_chain\n",
|
|
|
|
"add_routes(app, nomic_chain, path=\"/nomic-rag\")\n",
|
|
|
|
"```\n",
|
|
|
|
"\n",
|
|
|
|
"Run - \n",
|
|
|
|
"```\n",
|
|
|
|
"$ poetry run langchain serve\n",
|
|
|
|
"```"
|
|
|
|
]
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"cell_type": "code",
|
|
|
|
"execution_count": null,
|
|
|
|
"id": "0b4f8022-8aa2-4df4-be7c-635568ef8e24",
|
|
|
|
"metadata": {},
|
|
|
|
"outputs": [],
|
|
|
|
"source": []
|
|
|
|
}
|
|
|
|
],
|
|
|
|
"metadata": {
|
|
|
|
"kernelspec": {
|
|
|
|
"display_name": "Python 3 (ipykernel)",
|
|
|
|
"language": "python",
|
|
|
|
"name": "python3"
|
|
|
|
},
|
|
|
|
"language_info": {
|
|
|
|
"codemirror_mode": {
|
|
|
|
"name": "ipython",
|
|
|
|
"version": 3
|
|
|
|
},
|
|
|
|
"file_extension": ".py",
|
|
|
|
"mimetype": "text/x-python",
|
|
|
|
"name": "python",
|
|
|
|
"nbconvert_exporter": "python",
|
|
|
|
"pygments_lexer": "ipython3",
|
|
|
|
"version": "3.11.4"
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"nbformat": 4,
|
|
|
|
"nbformat_minor": 5
|
|
|
|
}
|