Commit Graph

3269 Commits (b43a9d58084373d873cd72a7997ba63eb3ac8488)

Author SHA1 Message Date
aditya thomas b43a9d5808
docs: adding voyageai to the list of partner packages (#19376)
**Description:** Adding VoyageAI to the list of partners
**Issue:** A standalone langchain-voyageai package has been added
**Dependencies:** None
3 months ago
Zeeland 2549df00cd
docs: fix error bilibili url (#19375)
Thank you for contributing to LangChain!

bilibili-api-python use https://github.com/Nemo2011/bilibili-api repo.
Change to the correct address.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
3 months ago
aditya thomas 375ab7bf59
docs: update module imports for fireworks documentation (#19377)
**Description:** Update module imports for Fireworks documentation
**Issue:** Module imports not present or in incorrect location
**Dependencies:** None
3 months ago
aditya thomas 0cc0467267
docs: update import paths and move to lcel for llama.cpp examples (#19391)
**Description:** Update import paths and move to lcel for llama.cpp
examples
**Issue:** Update import paths to reflect package refactoring and move
chains to LCEL in examples
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
3 months ago
fengjial 3b52ee05d1
community[patch]: fix bugs in baiduvectordb as vectorstore (#19380)
fix small bugs in vectorstore/baiduvectordb
3 months ago
Cailin Wang 5402aef32e
docs: Add `partition` parameter to DashVector (#19385)
**Description**: Add `partition` parameter to DashVector
dashvector.ipynb
**Related PR**: https://github.com/langchain-ai/langchain/pull/19023
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
3 months ago
aditya thomas 16ef88a87d
docs: moving FireworksEmbeddings documentation to docs folder (#19398)
**Description:** Moving FireworksEmbeddings documentation to the
location docs/integration/text_embedding/ from langchain_fireworks/docs/
**Issue:** FireworksEmbeddings documentation was not in the correct
location
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
3 months ago
Ray Bell 7d36ee38b7
docs: point to titantic dataset on web (#19455)
Updated `pd.read_csv("titantic.csv")` to
`pd.read_csv("https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv")`
i.e. it will read it
https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv
and allow anyone to run the code.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
3 months ago
Ray Bell f959fad56e
docs: use invoke instead of run (#19457)
Updated the deprecated run with invoke

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
3 months ago
老阿張 9dfce56b31
docs: Fix typo in infino.ipynb (#18640)
Description: "conquerer should be conqueror "? 🤔
Issue: Typo
Dependencies: Nope
Twitter handle: laoazhang
3 months ago
aditya thomas e46419c851
docs: contribute / integrations code examples update (#19319)
**Description:** Update to make the code examples consistent with the
actual use
**Issue:** Code examples were different from actual use in the LangChain
code
**Dependencies:** Changes on top of
https://github.com/langchain-ai/langchain/pull/19294

Note: If these changes are acceptable, please merge them after
https://github.com/langchain-ai/langchain/pull/19294.
3 months ago
Brace Sproul 40f846e65d
docs[minor]: Add chat model selection tabs component (#19296)
<img width="1728" alt="image"
src="https://github.com/langchain-ai/langchain/assets/46789226/45e70a92-c2ee-48c8-9964-100eed22687b">
3 months ago
Nithish Raghunandanan 7ad0a3f2a7
community: add Couchbase Vector Store (#18994)
- **Description:** Added support for Couchbase Vector Search to
LangChain.
- **Dependencies:** couchbase>=4.1.12
- **Twitter handle:** @nithishr

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
4 months ago
Chris Papademetrious 305d74c67a
core: implement a batch_size parameter for CacheBackedEmbeddings (#18070)
**Description:**

Currently, `CacheBackedEmbeddings` computes vectors for *all* uncached
documents before updating the store. This pull request updates the
embedding computation loop to compute embeddings in batches, updating
the store after each batch.

I noticed this when I tried `CacheBackedEmbeddings` on our 30k document
set and the cache directory hadn't appeared on disk after 30 minutes.

The motivation is to minimize compute/data loss when problems occur:

* If there is a transient embedding failure (e.g. a network outage at
the embedding endpoint triggers an exception), at least the completed
vectors are written to the store instead of being discarded.
* If there is an issue with the store (e.g. no write permissions), the
condition is detected early without computing (and discarding!) all the
vectors.

**Issue:**
Implements enhancement #18026.

**Testing:**
I was unable to run unit tests; details in [this
post](https://github.com/langchain-ai/langchain/discussions/15019#discussioncomment-8576684).

---------

Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Christophe Bornet 30e4a35d7a
community: Use langchain-astradb for AstraDB caches (#18419)
- [x] Needs https://github.com/langchain-ai/langchain-datastax/pull/4
- [x] Needs a new release of langchain-astradb
4 months ago
Brace Sproul 17c62e0f3a
ci[minor]: Bump LC scripts package, add retry option (#19285)
The `retryFailed` option will retry all failed links, once at a time
with the goal of not triggering bot protection

`microsoft.com` is now hard coded into the whitelist
4 months ago
Erick Friis 7eb376d5fc
docs: integration deprecation docs (#19283) 4 months ago
HatsuneMK00 4761c09e94
docs: update slack toolkit ipynb in integration (#19219)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- **PR message**:
- **Description:** Update the slack toolkit doc to use an agent that
support multiple inputs. Using ReAct agent will cause a ValidationError
when invoking the slack tools. This is because the agent return a string
like `'{"channel": "C05LDF54S21", "message": "Hello, world!"}'` but the
ReAct agent does not support multiple inputs.
- **Issue:** This is related to this
[Discussion#18083](https://github.com/langchain-ai/langchain/discussions/18083)
    - **Dependencies:** No dependencies required

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
Vittorio Rigamonti 9b2f9ee952
community: VectorStore Infinispan, adding autoconfiguration (#18967)
**Description**:
this PR enable VectorStore autoconfiguration for Infinispan: if
metadatas are only of basic types, protobuf
config will be automatically generated for the user.
4 months ago
Anthony Shaw bb0dd8f82f
docs: Embellish article on splitting by tokens with more examples and missing details (#18997)
**Description**

This PR adds some missing details from the "Split by tokens" page in the
documentation. Specifically:

- The `.from_tiktoken_encoder()` class methods for both the
`CharacterTextSplitter` and `RecursiveCharacterTextSplitter` default to
the old `gpt-2` encoding. I've added a comment to suggest specifying
`model_name` or `encoding`
- The docs didn't mention that the `from_tiktoken_encoder()` class
method passes additional kwargs down to the constructor of the splitter.
I only discovered this by reading the source code
- Added an example of using the `.from_tiktoken_encoder()` class method
with `RecursiveCharacterTextSplitter` which is the recommended approach
for most scenarios above `CharacterTextSplitter`
- Added a warning that `TokenTextSplitter` can split characters which
have multiple tokens (e.g. 猫 has 3 cl100k_base tokens) between multiple
chunks which creates malformed Unicode strings and should not be used in
these situations.

Side note: I think the default argument of `gpt2` for
`.from_tiktoken_encoder()` should be updated?

**Twitter handle** anthonypjshaw

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Simon Stone 58c7687174
langchain: preserve document metadata in `FlashrankRerank` (#19148)
**Description:** Preserves document metadata in `FlashrankRerank`
    - **Issue:** #19142
    - **Dependencies:** None
    - **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
4 months ago
Simon Stone dc4ce82ddd
docs: fix import path for `FlashrankRerank` example notebook (#19146)
**Description:** Fixes the import paths for the `FlashrankRerank`
example notebook.
 **Issue:** #19139 
 **Dependencies:** None
 **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Saurav Kumar bde199d128
Updating format of pip install (#19198)
Thank you for contributing to LangChain!

- [x] **PR title**: "Updating format of pip install in two files of
docs/cookbook"
- pip install is not reflecting properly in some of the files in
cookbook
- Example:
[docs/expression_language/cookbook/sql_db](https://python.langchain.com/docs/expression_language/cookbook/sql_db)


- [x] **PR message**: Updating format of pip install in two files of
docs/cookbook
    - **Description:** a description of the change
    - **Issue:** #19197 

- Note - let's do squash merge for the PR

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
HowardChan ae3c7f702c
docs:Make url as a markdown link (#19212)
**Description**: same as the title

Co-authored-by: ChenZhengHao <chenzhenghao@mail.teletraan.io>
4 months ago
Estephania Calvo Carvajal 94e58dd827
docs:Fix links to LangSmith docs on Evaluation page (#19210) (#19216)
- **Description:** Same as the title
- **Issue:** #19210
4 months ago
Kenzie Mihardja 21f75991d4
deprecate community docugami loader (#19230)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: deprecate DocugamiLoader"

- [x] **PR message**: Deprecate the langchain_community and use the
docugami_langchain DocugamiLoader

---------

Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
4 months ago
Anubhav Madhav 9235dade90
docs: provided hyperlinks to text and fixed grammar (#19092)
1) Provided links to text in the prompt (Refer Page Link 1, Page Link 2
and Page Link 3)
2) Fixed Grammar in Considerations of Model I/O Concepts documentation
page - Update concepts.mdx (Page Link 4)

*Issues are on the following pages:*
Page Link 1:
https://python.langchain.com/docs/modules/model_io/concepts#prompttemplate
Page Link 2:
https://python.langchain.com/docs/modules/model_io/concepts#messageprompttemplate
Page Link 3:
https://python.langchain.com/docs/modules/model_io/concepts#chatprompttemplate
Page Link 4:
https://python.langchain.com/docs/modules/model_io/concepts#considerations


**Fix 1**:
Description: Fixed Grammar in Considerations of Model I/O Documentation
Page
Issue: "to work well with the model are you using" # "to work well with
the model you are using"
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)

**Fix 2**:
Description: Provided links to text in the prompt (Refer Page Link 1,
Page Link 2 and Page Link 3)
Issue: links not provided # links have been provided to the text
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)
baskaryan, efriis, eyurtsev, hwchase17.


*For Fix 1*
Refer to the first word 'This" word in the image attached with this PR.
PFA
<img width="839" alt="Screenshot 2024-03-15 at 3 04 17 AM"
src="https://github.com/langchain-ai/langchain/assets/42323737/94e8db16-249f-48c3-a1d1-dee8d36067fa">


If no one reviews your PR within a few days, please @-mention one of

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
inpyeong 7c092f479f
docs: Update why.ipynb (#19173)
I think that cell type for pip command may be 'code'.
Please check, thank you :)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Vitalii Korsakov d96e0b2de7
docs: Remove duplicated line in Get Started section (#19182)
Line `from langchain_openai import ChatOpenAI` is put twice in Get
Started / Serving with LangServe section.
Imports on lines 559 and 566 are identical

Co-authored-by: Vitalii <vitalii@localhost>
4 months ago
Rodrigo Nogueira e64cf1aba4
community: Add model argument for maritalk models and better error handling (#19187) 4 months ago
samanhappy ff94f86ce1
docs: fix link to interface TextSplitter (#19177) 4 months ago
aditya thomas 05008c4f94
docs: update stale links in Together AI documentation (#19011)
**Description:** Update stales link in Together AI documentation
**Issue:** Some links pointed to legacy webpages on the Together AI
website
**Dependencies:** None
**Lint and test**: `make format`, `make lint` were run
4 months ago
wulixuan f79d0cb9fb
docs: update docs for yuan2 in LLMs and Chat models integration. (#19028)
update yuan2.0 notebook in LLMs and Chat models.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
4 months ago
Taraka Nithin Vankala eec023766e
docs: Corrected error (#19030)
- [ ] **PR title**: "docs: correction in
"https://github.com/langchain-ai/langchain/blob/master/docs/docs/get_started/quickstart.mdx",
line 289".
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: 
    - Corrected the spelling mistake
    - #18981
4 months ago
Christophe Bornet f2a7dda4bd
community[patch]: Use langchain-astradb for AstraDB doc loader (#19071)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Leonid Ganeline a49ac55964
docs: `providers` update 8 (#19053)
Added missed providers. Added missed integrations. Fixed format.
4 months ago
Holt Skinner cee03630d9
community[patch]: Add Blended Search Support to `GoogleVertexAISearchRetriever` (#19082)
https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#multi-data-stores

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
William W Wang 0a784074d1
docs: Update llm_caching.ipynb (#19085) 4 months ago
William W Wang 6327be9048
docsUpdate azure_cosmos_db.ipynb (#19087)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Anubhav Madhav 553a520ab6
docs: Fixed Grammar in Considerations of Model I/O Concepts (#19091)
Fixed Grammar in Considerations of Model I/O Concepts documentation page
- Update concepts.mdx

Page Link:
https://python.langchain.com/docs/modules/model_io/concepts#considerations

- **Description:** Fixed Grammar in Considerations of Model I/O
Documentation Page
- **Issue:** "to work well with the model are you using" # "to work well
with the model you are using"
- **Dependencies:** None
- **Twitter handle:** @Anubhav_Madhav
(https://twitter.com/Anubhav_Madhav)


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Shotaro Sano d647ff1a9a
docs: Fix execution results of `docs/docs/modules/data_connection/indexing.ipynb` (#19112)
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.

### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This unexpected behavior also impacts the results of
`vectorstore.similarity_search("dog", k=30)`, showing only the contents
of `kitty.txt`. It appears as though the contents of `doggy.txt` have
been completely removed from the index:

```
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
4 months ago
Guangdong Liu cced3eb9bc
community[patch]: Fix sparkllm embeddings api bug. (#19122)
- **Description:** Fix sparkllm embeddings api bug.
@baskaryan PTAL
4 months ago
samanhappy b9c62fb905
docs: fix API link for BaseLoader (#19128)
The link to the BaseLoader API requires an update as it has been moved
into the `langchain_core` package.
4 months ago
Kostas Botsas 527676a753
docs: Fix source column xata.ipynb (#19137)
Docs fix: replace column name search with source.

The Xata integration expects metadata column named "source".

The docs suggest the name "search", which if used, yields the following
error:

```
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/xata.py", line 95, in _add_vectors
    raise Exception(f"Error adding vectors to Xata: {r.status_code} {r}")
Exception: Error adding vectors to Xata: 400 {'errors': [{'status': 400, 'message': 'invalid record: column [source]: column not found'}]}
```
4 months ago
fengjial c922ea36cb
community[minor]: Add Baidu VectorDB as vector store (#17997)
Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>
4 months ago
aditya thomas 190887c5cd
docs: update the list of providers (#19012)
**Description:** Update the list of LangChain providers
**Issue:** Make the list of LangChain providers current
**Dependencies:** None
4 months ago
Erick Friis bbe164ad28
docs: voyageai as provider (#19154) 4 months ago
Erick Friis 781aee0068
community, langchain, infra: revert store extended test deps outside of poetry (#19153)
Reverts langchain-ai/langchain#18995

Because it makes installing dependencies in python 3.11 extended testing
take 80 minutes
4 months ago
Leonid Kuligin e3ff107e4f
docs: updated google integration related imports in the documentation (#19131)
updated imports in the documentation for google vertex
4 months ago
Erick Friis 9e569d85a4
community, langchain, infra: store extended test deps outside of poetry (#18995)
poetry can't reliably handle resolving the number of optional "extended
test" dependencies we have. If we instead just rely on pip to install
extended test deps in CI, this isn't an issue.
4 months ago