docs: turn on link check (#18924)

pull/18929/head
Bagatur 2 months ago committed by GitHub
parent 93ef8ead0b
commit 34284c25d4

@ -11,7 +11,7 @@
"\n",
"The [Hugging Face Model Hub](https://huggingface.co/models) hosts over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, in an online platform where people can easily collaborate and build ML together.\n",
"\n",
"These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class. For more information on the hosted pipelines, see the [HuggingFaceHub](./huggingface_hub) notebook."
"These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class."
]
},
{

@ -2,28 +2,26 @@
All functionality related to the [Hugging Face Platform](https://huggingface.co/).
## LLMs
## Chat models
### Hugging Face Hub
### Models from Hugging Face
>The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform
> with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source
> and publicly available, in an online platform where people can easily
> collaborate and build ML together. The Hub works as a central place where anyone
> can explore, experiment, collaborate, and build technology with Machine Learning.
We can use the `Hugging Face` LLM classes or directly use the `ChatHuggingFace` class.
To use, we should have the `huggingface_hub` python [package installed](https://huggingface.co/docs/huggingface_hub/installation).
We need to install several python packages.
```bash
pip install huggingface_hub
pip install transformers
```
See a [usage example](/docs/integrations/llms/huggingface_hub).
See a [usage example](/docs/integrations/chat/huggingface).
```python
from langchain_community.llms import HuggingFaceHub
from langchain_community.chat_models.huggingface import ChatHuggingFace
```
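For context, a minimal sketch of how the new `ChatHuggingFace` import is typically wired up (assuming a Hugging Face API token is configured in the environment; the `zephyr-7b-beta` repo id is only an illustrative choice):

```python
from langchain_community.chat_models.huggingface import ChatHuggingFace
from langchain_community.llms import HuggingFaceEndpoint

# Use a hosted inference endpoint as the underlying LLM,
# then hand it to the chat wrapper
llm = HuggingFaceEndpoint(repo_id="HuggingFaceH4/zephyr-7b-beta", task="text-generation")
chat = ChatHuggingFace(llm=llm)
print(chat.invoke("What is LangChain?"))
```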
## LLMs
### Hugging Face Local Pipelines
Hugging Face models can be run locally through the `HuggingFacePipeline` class.
@ -56,41 +54,6 @@ optimum-cli export openvino --model gpt2 ov_model
You can apply [weight-only quantization](https://github.com/huggingface/optimum-intel?tab=readme-ov-file#export) when exporting your model.
### Hugging Face TextGen Inference
>[Text Generation Inference](https://github.com/huggingface/text-generation-inference) is
> a Rust, Python and gRPC server for text generation inference. Used in production at
> [HuggingFace](https://huggingface.co/) to power LLMs api-inference widgets.
We need to install the `text_generation` python package.
```bash
pip install text_generation
```
See a [usage example](/docs/integrations/llms/huggingface_textgen_inference).
```python
from langchain_community.llms import HuggingFaceTextGenInference
```
## Chat models
### Models from Hugging Face
We can use the `Hugging Face` LLM classes or directly use the `ChatHuggingFace` class.
We need to install several python packages.
```bash
pip install huggingface_hub
pip install transformers
```
See a [usage example](/docs/integrations/chat/huggingface).
```python
from langchain_community.chat_models.huggingface import ChatHuggingFace
```
## Embedding Models

@ -15,11 +15,11 @@ pip install python-arango
Connect your `ArangoDB` Database with a chat model to get insights on your data.
See the notebook example [here](/docs/use_cases/graph/graph_arangodb_qa).
See the notebook example [here](/docs/use_cases/graph/integrations/graph_arangodb_qa).
```python
from arango import ArangoClient
from langchain_community.graphs import ArangoGraph
from langchain.chains import ArangoGraphQAChain
```
```
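A minimal sketch of how these imports fit together (assuming a local ArangoDB instance and an already-constructed `llm`; the host, credentials, and question are placeholders):

```python
from arango import ArangoClient
from langchain_community.graphs import ArangoGraph
from langchain.chains import ArangoGraphQAChain

# Connect to ArangoDB and wrap the database for LangChain
db = ArangoClient(hosts="http://localhost:8529").db(
    "_system", username="root", password="", verify=True
)
graph = ArangoGraph(db)

# Build a QA chain that translates natural-language questions into AQL
chain = ArangoGraphQAChain.from_llm(llm, graph=graph, verbose=True)
chain.run("Which characters live in the same city?")
```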

@ -35,7 +35,7 @@ from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
```
See a [usage example](/docs/use_cases/graph/graph_cypher_qa)
See a [usage example](/docs/use_cases/graph/integrations/graph_cypher_qa)
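A minimal sketch of the Cypher QA flow (assuming a running Neo4j instance and an `llm` already constructed; the connection details are placeholders):

```python
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain

# Wrap the Neo4j database, then build a chain that writes Cypher queries
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")
chain = GraphCypherQAChain.from_llm(llm, graph=graph, verbose=True)
chain.run("How many nodes are in the graph?")
```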
## Constructing a knowledge graph from text
@ -49,7 +49,7 @@ from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers.diffbot import DiffbotGraphTransformer
```
See a [usage example](/docs/use_cases/graph/diffbot_graphtransformer)
See a [usage example](/docs/use_cases/graph/integrations/diffbot_graphtransformer)
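A minimal sketch of the Diffbot transformer in use (assuming a Diffbot API key, a list of `raw_documents`, and a `graph` such as `Neo4jGraph`; all three are placeholders here):

```python
from langchain_experimental.graph_transformers.diffbot import DiffbotGraphTransformer

# Extract entities and relations from free text via the Diffbot NLP API
diffbot_nlp = DiffbotGraphTransformer(diffbot_api_key="DIFFBOT_API_KEY")
graph_documents = diffbot_nlp.convert_to_graph_documents(raw_documents)

# Persist the extracted graph into the graph store
graph.add_graph_documents(graph_documents)
```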
## Memory

@ -13,9 +13,9 @@ pip install rdflib==7.0.0
Connect your GraphDB Database with a chat model to get insights on your data.
See the notebook example [here](/docs/use_cases/graph/graph_ontotext_graphdb_qa).
See the notebook example [here](/docs/use_cases/graph/integrations/graph_ontotext_graphdb_qa).
```python
from langchain_community.graphs import OntotextGraphDBGraph
from langchain.chains import OntotextGraphDBQAChain
```
```
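A minimal sketch tying the two classes together (assuming a local GraphDB repository; the endpoint and ontology query are placeholders, and `llm` is assumed to exist):

```python
from langchain_community.graphs import OntotextGraphDBGraph
from langchain.chains import OntotextGraphDBQAChain

# Point the wrapper at a SPARQL endpoint and an ontology to ground the schema
graph = OntotextGraphDBGraph(
    query_endpoint="http://localhost:7200/repositories/my-repo",
    query_ontology="CONSTRUCT {?s ?p ?o} FROM <https://example.org/ontology/> WHERE {?s ?p ?o}",
)
chain = OntotextGraphDBQAChain.from_llm(llm, graph=graph, verbose=True)
chain.run("Which classes does the ontology define?")
```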

@ -5,7 +5,7 @@ It has cross-domain knowledge and language understanding ability by learning a l
It can understand and perform tasks based on natural dialogue.
## SparkLLM LLM Model
An example is available at [example](/docs/integrations/llm/sparkllm).
An example is available at [example](/docs/integrations/llms/sparkllm).
## SparkLLM Chat Model
An example is available at [example](/docs/integrations/chat/sparkllm).

@ -11,7 +11,7 @@
"\n",
"## Define tools\n",
"\n",
"We first need to create [the Passio NutritionAI tool](/docs/integrations/tools/passio_nutritionai)."
"We first need to create [the Passio NutritionAI tool](/docs/integrations/tools/passio_nutrition_ai)."
]
},
{
@ -19,7 +19,7 @@
"id": "c335d1bf",
"metadata": {},
"source": [
"### [Passio Nutrition AI](/docs/integrations/tools/passio_nutritionai-agent)\n",
"### [Passio Nutrition AI](/docs/integrations/tools/passio_nutrition_ai)\n",
"\n",
"We have a built-in tool in LangChain to easily use Passio NutritionAI to find food nutrition facts.\n",
"Note that this requires an API key - they have a free tier.\n",
@ -2098,7 +2098,7 @@
"source": [
"## Create the agent\n",
"\n",
"Now that we have defined the tools, we can create the agent. We will be using an OpenAI Functions agent - for more information on this type of agent, as well as other options, see [this guide](./agent_types)\n",
"Now that we have defined the tools, we can create the agent. We will be using an OpenAI Functions agent - for more information on this type of agent, as well as other options, see [this guide](/docs/modules/agents/agent_types/)\n",
"\n",
"First, we choose the LLM we want to be guiding the agent."
]
@ -2156,7 +2156,7 @@
"id": "f8014c9d",
"metadata": {},
"source": [
"Now, we can initalize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](./concepts)"
"Now, we can initalize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/modules/agents/concepts)"
]
},
{
@ -2176,7 +2176,7 @@
"id": "1a58c9f8",
"metadata": {},
"source": [
"Finally, we combine the agent (the brains) with the tools inside the AgentExecutor (which will repeatedly call the agent and execute tools). For more information about how to think about these components, see our [conceptual guide](./concepts)"
"Finally, we combine the agent (the brains) with the tools inside the AgentExecutor (which will repeatedly call the agent and execute tools). For more information about how to think about these components, see our [conceptual guide](/docs/modules/agents/concepts)"
]
},
{
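For reference, a sketch of the agent/executor split the notebook describes (assuming the notebook's `llm`, `tools`, and `prompt` are already defined):

```python
from langchain.agents import AgentExecutor, create_openai_functions_agent

# The agent only decides which tool to call next...
agent = create_openai_functions_agent(llm, tools, prompt)

# ...while the executor actually runs the tools in a loop until done
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "How many calories are in a banana?"})
```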

@ -19,13 +19,13 @@
"source": [
"Besides raw text data, you may wish to extract information from other file types such as PowerPoint presentations or PDFs.\n",
"\n",
"You can use LangChain [document loaders](/modules/data_connection/document_loaders/) to parse files into a text format that can be fed into LLMs.\n",
"You can use LangChain [document loaders](/docs/modules/data_connection/document_loaders/) to parse files into a text format that can be fed into LLMs.\n",
"\n",
"LangChain features a large number of [document loader integrations](/docs/integrations/document_loaders).\n",
"\n",
"## MIME type based parsing\n",
"\n",
"For basic parsing exmaples take a look [at document loaders](/modules/data_connection/document_loaders/).\n",
"For basic parsing exmaples take a look [at document loaders](/docs/modules/data_connection/document_loaders/).\n",
"\n",
"Here, we'll be looking at mime-type based parsing which is often useful for extraction based applications if you're writing server code that accepts user uploaded files.\n",
"\n",

@ -63,7 +63,7 @@
"## Other Resources\n",
"\n",
"* The [output parser](/docs/modules/model_io/output_parsers/) documentation includes various parser examples for specific types (e.g., lists, datetime, enum, etc).\n",
"* LangChain [document loaders](/modules/data_connection/document_loaders/) to load content from files. Please see list of [integrations](/docs/integrations/document_loaders).\n",
"* LangChain [document loaders](/docs/modules/data_connection/document_loaders/) to load content from files. Please see list of [integrations](/docs/integrations/document_loaders).\n",
"* The experimental [Anthropic function calling](https://python.langchain.com/docs/integrations/chat/anthropic_functions) support provides similar functionality to Anthropic chat models.\n",
"* [LlamaCPP](https://python.langchain.com/docs/integrations/llms/llamacpp#grammars) natively supports constrained decoding using custom grammars, making it easy to output structured content using local LLMs \n",
"* [JSONFormer](/docs/integrations/llms/jsonformer_experimental) offers another way for structured decoding of a subset of the JSON Schema.\n",

@ -183,7 +183,7 @@
"* 16k token OpenAI `gpt-3.5-turbo-1106` \n",
"* 100k token Anthropic [Claude-2](https://www.anthropic.com/index/claude-2)\n",
"\n",
"We can also supply `chain_type=\"map_reduce\"` or `chain_type=\"refine\"` (read more [here](/docs/modules/chains/document/refine))."
"We can also supply `chain_type=\"map_reduce\"` or `chain_type=\"refine\"`."
]
},
{
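A minimal sketch of switching chain types (assuming `llm` and a list of `docs` from the notebook; `map_reduce` summarizes chunks independently and then combines the partial summaries):

```python
from langchain.chains.summarize import load_summarize_chain

# Swap in "refine" to instead build the summary incrementally
chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)
```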

@ -20,7 +20,7 @@ const config = {
// For GitHub pages deployment, it is often '/<projectName>/'
baseUrl: "/",
onBrokenLinks: "warn",
onBrokenLinks: "throw",
onBrokenMarkdownLinks: "throw",
themes: ["@docusaurus/theme-mermaid"],

@ -28,7 +28,9 @@ sidebar_class_name: hidden
TEMPLATES_INDEX_DESTINATION = DOCS_TEMPLATES_DIR / "index.md"
with open(TEMPLATES_INDEX_DESTINATION, "r") as f:
content = f.read()
# replace relative links
content = re.sub("\]\(\.\.\/", "](/docs/templates/", content)
with open(TEMPLATES_INDEX_DESTINATION, "w") as f:
f.write(sidebar_hidden + content)

@ -0,0 +1,21 @@
import os
import re
import sys
from pathlib import Path
DOCS_DIR = Path(os.path.abspath(__file__)).parents[1]
def update_links(doc_path, docs_link):
with open(DOCS_DIR / doc_path, "r") as f:
content = f.read()
# replace relative links
content = re.sub("\]\(\.\/", f"]({docs_link}", content)
with open(DOCS_DIR / doc_path, "w") as f:
f.write(content)
if __name__ == "__main__":
update_links(sys.argv[1], sys.argv[2])

@ -10,15 +10,27 @@ tar -xzf quarto-1.3.450-linux-amd64.tar.gz
export PATH=$PATH:$(pwd)/quarto-1.3.450/bin/
# setup python env
python3.8 -m venv .venv
source .venv/bin/activate
python3.8 -m pip install --upgrade pip
python3.8 -m pip install -r vercel_requirements.txt
# autogenerate integrations tables
python3.8 scripts/model_feat_table.py
# copy in external files
mkdir docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
python3.8 scripts/copy_templates.py
cp ../cookbook/README.md src/pages/cookbook.mdx
wget -q https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
python3.8 scripts/resolve_local_links.py docs/langserve.md https://github.com/langchain-ai/langserve/tree/main/
wget -q https://raw.githubusercontent.com/langchain-ai/langgraph/main/README.md -O docs/langgraph.md
python3.8 scripts/resolve_local_links.py docs/langgraph.md https://github.com/langchain-ai/langgraph/tree/main/
# render
quarto render docs/

@ -40,7 +40,7 @@ Apart from having `pgvector` extension enabled, you will need to do some setup b
In order to run RAG over your PostgreSQL database, you will need to generate the embeddings for the specific columns you want.
This process is covered in the [RAG empowered SQL cookbook](cookbook/retrieval_in_sql.ipynb), but the overall approach consists of:
This process is covered in the [RAG empowered SQL cookbook](https://github.com/langchain-ai/langchain/blob/master/cookbook/retrieval_in_sql.ipynb), but the overall approach consists of:
1. Querying for unique values in the column
2. Generating embeddings for those values
3. Storing the embeddings in a separate column or in an auxiliary table
@ -102,4 +102,4 @@ We can access the template from code with:
from langserve.client import RemoteRunnable
runnable = RemoteRunnable("http://localhost:8000/sql-pgvector")
```
```
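Once the server is running, a hedged usage sketch (the exact input schema depends on the template; a plain question string is assumed here):

```python
from langserve.client import RemoteRunnable

runnable = RemoteRunnable("http://localhost:8000/sql-pgvector")
# Input schema depends on the deployed chain; a bare question is assumed
runnable.invoke("How many users signed up last month?")
```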
