langchain/docs
Shotaro Sano d647ff1a9a
docs: Fix execution results of `docs/docs/modules/data_connection/indexing.ipynb` (#19112)
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results that follow the statement `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.` are incorrect, as described below.

### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This also affects the results of
`vectorstore.similarity_search("dog", k=30)`, which show only the contents
of `kitty.txt`, as though the contents of `doggy.txt` had been
completely removed from the index:

```
[Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```
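The expected counts can be illustrated with a toy model of incremental cleanup. This is a minimal sketch in plain Python, not LangChain's `index` implementation: the `Doc` class, `doc_hash`, `incremental_index`, and the initial chunk contents are all hypothetical names and data chosen for illustration. It assumes that documents are deduplicated by a hash of content plus source, and that older documents sharing a source with the incoming batch are deleted.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    # Hypothetical stand-in for a document with a `source` metadata field.
    page_content: str
    source: str


def doc_hash(doc: Doc) -> str:
    """Hash content and source together so equal text from different sources stays distinct."""
    return hashlib.sha256(f"{doc.source}\x00{doc.page_content}".encode()).hexdigest()


def incremental_index(docs: list[Doc], store: dict[str, Doc]) -> dict[str, int]:
    """Toy model: add unseen docs, then delete stale docs that share a source with the batch."""
    incoming = {doc_hash(d): d for d in docs}
    sources = {d.source for d in docs}

    num_added = num_skipped = 0
    for key, doc in incoming.items():
        if key in store:
            num_skipped += 1  # unchanged content is left alone
        else:
            store[key] = doc
            num_added += 1

    # Stale entries: same source as the batch, but not part of the batch itself.
    stale = [k for k, d in store.items() if d.source in sources and k not in incoming]
    for k in stale:
        del store[k]

    return {"num_added": num_added, "num_skipped": num_skipped, "num_deleted": len(stale)}


store: dict[str, Doc] = {}

# Initial load (chunk contents here are illustrative, not the notebook's exact chunks).
first = incremental_index(
    [Doc("kitty kit", "kitty.txt"), Doc("doggy", "doggy.txt"), Doc("the doggy", "doggy.txt")],
    store,
)

# Re-index only the changed doggy.txt documents.
second = incremental_index(
    [Doc("woof woof", "doggy.txt"), Doc("woof woof woof", "doggy.txt")],
    store,
)

print(first)
print(second)
print(sorted(d.page_content for d in store.values()))
```

In this model the second call adds both new `doggy.txt` documents and deletes the two old ones, while `kitty.txt` content is untouched, which matches the shape of the expected result described above.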

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
7 months ago
| Name | Last commit | Updated |
| --- | --- | --- |
| api_reference | community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106) | 7 months ago |
| data | 👥 Update LangChain people data (#18473) | 7 months ago |
| docs | docs: Fix execution results of `docs/docs/modules/data_connection/indexing.ipynb` (#19112) | 7 months ago |
| scripts | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 7 months ago |
| src | docs[patch]: properly load/use env vars (#18942) | 7 months ago |
| static | docs: Add graph construction docs (#18904) | 7 months ago |
| .gitignore | docs[minor]: Swap gtag for supabase (#18937) | 7 months ago |
| .local_build.sh | docs: partner packages (#16960) | 8 months ago |
| .yarnrc.yml | docs[minor]: Add thumbs up/down to all docs pages (#18526) | 7 months ago |
| README.md | docs: developer docs (#14776) | 10 months ago |
| babel.config.js | Restructure docs (#11620) | 1 year ago |
| code-block-loader.js | Restructure docs (#11620) | 1 year ago |
| docusaurus.config.js | docs[patch]: properly load/use env vars (#18942) | 7 months ago |
| package.json | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 7 months ago |
| settings.ini | Restructure docs (#11620) | 1 year ago |
| sidebars.js | docs: `Toolkits` menu (#16217) | 8 months ago |
| vercel.json | docs: `providers` update 4 (#18540) | 7 months ago |
| vercel_build.sh | docs: fix vercel build script (#19090) | 7 months ago |
| vercel_requirements.txt | infra: docs build install community editable (#14739) | 10 months ago |
| yarn.lock | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 7 months ago |

README.md

LangChain Documentation

For more information on contributing to our documentation, see the Documentation Contributing Guide.