Commit Graph

6585 Commits

Author SHA1 Message Date
Nuno Campos
71076cceaf
Move json and xml parsers to core (#15026)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-21 12:36:56 -08:00
Nuno Campos
d5533b7081
Add option to make messages placeholder optional (#15031)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-21 12:36:37 -08:00
Bagatur
40f42b8947
community[patch]: Release 0.0.6 (#15023) 2023-12-21 14:37:44 -05:00
Bagatur
7eb1100925
core[patch]: Release 0.1.3 (#15022) 2023-12-21 14:35:15 -05:00
Nuno Campos
63e512b680
Implement streaming for all list output parsers (#14981)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-21 11:30:35 -08:00
Nuno Campos
b471166df7
Implement streaming for xml output parser (#14984)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-21 11:30:18 -08:00
Erick Friis
94bc3967a1
infra: api docs build order (#15018) 2023-12-21 11:05:02 -08:00
Jacob Lee
1b01ee0e3c
community[minor]: add hf chat wrapper (#14736)
Builds on #14040 with community refactor merged and notebook updated.

Note that with this refactor, models will be imported from
`langchain_community.chat_models.huggingface` rather than the main
`langchain` repo.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Signed-off-by: Yuchen Liang <yuchenl3@andrew.cmu.edu>
Co-authored-by: Andrew Reed <andrew.reed.r@gmail.com>
Co-authored-by: Andrew Reed <areed1242@gmail.com>
Co-authored-by: A-Roucher <aymeric.roucher@gmail.com>
Co-authored-by: Aymeric Roucher <69208727+A-Roucher@users.noreply.github.com>
2023-12-21 12:28:30 -05:00
Leonid Kuligin
b99274c9d8
community[patch]: changed default for VertexAIEmbeddings (#14614)
Replace this entire comment with:
- **Description:** @kurtisvg has raised a point that it's a good idea to
have a fixed version for embeddings (since otherwise a user might run a
query with one version vs a vectorstore where another version was used).
In order to avoid breaking changes, I'd suggest to give users a warning,
and make a `model_name` a required argument in 1.5 months.
2023-12-21 12:15:19 -05:00
Yannick Müller
138bc49759
docs: fixed wrong link in documentation (#14999)
See #14998
2023-12-21 12:06:43 -05:00
Karim Lalani
228ddabc3b
community: fix for surrealdb client 0.3.2 update + store and retrieve metadata (#14997)
Surrealdb client changes from 0.3.1 to 0.3.2 broke the surrealdb vectore
integration.
This PR updates the code to work with the updated client. The change is
backwards compatible with previous versions of surrealdb client.
Also expanded the vector store implementation to store and retrieve
metadata that's included with the document object.
2023-12-21 12:04:57 -05:00
Ikko Eltociear Ashimine
c7be59c122
docs: Update templates README.md (#15013)
Mulitple -> Multiple
2023-12-21 12:04:05 -05:00
Lance Martin
535db72607
Update Ollama multi-modal multi-vector template README.md (#14995) 2023-12-20 20:07:38 -08:00
Lance Martin
94586ec242
Update Ollama multi-modal template README.md (#14994) 2023-12-20 20:07:27 -08:00
Lance Martin
1db7450bc2
Update Gemini template README.md (#14993) 2023-12-20 20:07:20 -08:00
Lance Martin
8996d1a65d
Update multi-modal multi-vector template README.md (#14992) 2023-12-20 20:07:12 -08:00
Lance Martin
448b4d3522
Update multi-modal template README.md (#14991)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-20 20:06:52 -08:00
JaguarDB
ca0a75e1fc
community[patch]: JaguarHttpClient conditional import (#14985)
- **Description:** Fixed jaguar.py to import JaguarHttpClient with try
and catch
- **Issue:** the issue # Unable to use the JaguarHttpClient at run time
  - **Dependencies:** It requires "pip install -U jaguardb-http-client" 
  - **Twitter handle:** workbot

---------

Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 19:11:57 -08:00
Michael Landis
1c934fff0e
community[patch]: support momento vector index filter expressions (#14978)
**Description**

For the Momento Vector Index (MVI) vector store implementation, pass
through `filter_expression` kwarg to the MVI client, if specified. This
change will enable the MVI self query implementation in a future PR.

Also fixes some integration tests.
2023-12-20 19:11:43 -08:00
Yacine
300c1cbf92
community[patch]: Fix typo in class Docstring (#14982)
- **Description:** Fix typo in class Docstring to replace
AZURE_OPENAI_API_ENDPOINT by AZURE_OPENAI_ENDPOINT
  - **Issue:** the issue #14901 
  - **Dependencies:** NA
  - **Twitter handle:**

Co-authored-by: Yacine Bouakkaz <Yacine.Bouakkaz@evokegroup.com>
2023-12-20 19:03:45 -08:00
Lance Martin
320c3ae4c8
templates: Add Ollama multi-modal templates (#14868)
Templates for [local multi-modal
LLMs](https://llava-vl.github.io/llava-interactive/) using -
* Image summaries
* Multi-modal embeddings

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-20 15:28:53 -08:00
chyroc
57d1eb733f
core[patch]: update langchain-core runtime library name (#14884)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-20 14:35:48 -08:00
Quy Tang
42822484ef
core(minor): Implement stream and astream for RunnableBranch (#14805)
* This PR adds `stream` implementations to Runnable Branch.
* Runnable Branch still does not support `transform` so it'll break streaming if it happens in middle or end of sequence, but will work if happens at beginning of sequence.
* Fixes use the async callback manager for async methods
* Handle BaseException rather than Exception, so more errors could be logged as errors when they are encountered


---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-12-20 15:37:56 -05:00
Leonid Ganeline
65a9193db2
docs: alibaba cloud (#14772)
The [provider
page](https://python.langchain.com/docs/integrations/providers/alibabacloud_opensearch)
holds the vector store information. The [Chat
example](https://python.langchain.com/docs/integrations/chat/pai_eas_chat_endpoint)
was incorrectly sorted in the navbar because of the wrong file name.
- Recreated a provide page
- Added missed links and descriptions
- Compound information about vector store from two pages into one
- Fixed file name
2023-12-20 12:32:33 -08:00
Bagatur
99f839d6f3
infra: pr template update (#14963) 2023-12-20 11:53:38 -08:00
MING KANG
ed5e0cfe57
community: add OCI Endpoint (#14250)
- **Description:** 
- [OCI Data
Science](https://docs.oracle.com/en-us/iaas/data-science/using/home.htm)
is a fully managed and serverless platform for data science teams to
build, train, and manage machine learning models in the Oracle Cloud
Infrastructure. This PR add integration for using LangChain with an LLM
hosted on a [OCI Data Science Model
Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
To authenticate,
[oracle-ads](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/authentication.html)
has been used to automatically load credentials for invoking endpoint.
- **Issue:** None
- **Dependencies:** `oracle-ads`
- **Tag maintainer:** @baskaryan
- **Twitter handle:** None

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-20 11:52:20 -08:00
Erick Friis
75ba22793f
community: Vectara summarization (#14970)
Description: Adding Summarization to Vectara, to reflect it provides not
only vector-store type functionality but also can return a summary.
Also added:
MMR capability (in the Vectara platform side)

Updated templates

Updated documentation and IPYNB examples

Tag maintainer: @baskaryan
Twitter handle: @ofermend

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
2023-12-20 11:51:33 -08:00
Erick Friis
cf6951a0c9
docs: links (#14940) 2023-12-20 11:51:18 -08:00
Liang Zhang
6479aab74f
community[patch]: Add param "task" to Databricks LLM to work around serialization of transform_output_fn (#14933)
**What is the reproduce code?**

```python
from langchain.chains import LLMChain, load_chain
from langchain.llms import Databricks
from langchain.prompts import PromptTemplate

def transform_output(response):
    # Extract the answer from the responses.
    return str(response["candidates"][0]["text"])

def transform_input(**request):
    full_prompt = f"""{request["prompt"]}
    Be Concise.
    """
    request["prompt"] = full_prompt
    return request

chat_model = Databricks(
    endpoint_name="llama2-13B-chat-Brambles",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
    verbose=True,
)
print(f"Test chat model: {chat_model('What is Apache Spark')}") # This works

llm_chain = LLMChain(llm=chat_model, prompt=PromptTemplate.from_template("{chat_input}"))
llm_chain("colorful socks") # this works
llm_chain.save("databricks_llm_chain.yaml") # transform_input_fn and transform_output_fn are not serialized into the model yaml file
loaded_chain = load_chain("databricks_llm_chain.yaml") # The Databricks LLM is recreated with transform_input_fn=None, transform_output_fn=None.
loaded_chain("colorful socks") # Thus this errors. The transform_output_fn is needed to produce the correct output
```


Error:
```
 File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6c34afab-3473-421d-877f-1ef18930ef4d/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
text
  str type expected (type=type_error.str)
 request payload: {'query': 'What is a databricks notebook?'}'}
```

**What does the error mean?**

When the LLM generates an answer, represented by a Generation data
object. The Generation data object takes a str field called text, e.g.
Generation(text=”blah”). However, the Databricks LLM tried to put a
non-str to text, e.g. Generation(text={“candidates”:[{“text”: “blah”}]})
Thus, pydantic errors.

**Why the output format becomes incorrect after saving and loading the
Databricks LLM?**

Databrick LLM does not support serializing transform_input_fn and
transform_output_fn, so they are not serialized into the model yaml
file. When the Databricks LLM is loaded, it is recreated with
transform_input_fn=None, transform_output_fn=None. Without
transform_output_fn, the output text is not unwrapped, thus errors.

Missing transform_output_fn causes this error.
Missing transform_input_fn causes the additional prompt “Be Concise.” to
be lost after saving and loading.
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 12:50:23 -05:00
Bagatur
1ea6d83188
langchain[patch]: Release 0.0.352 (#14961) 2023-12-20 10:27:03 -05:00
Bagatur
b03845e069
community[patch]: Release 0.0.5 (#14960) 2023-12-20 10:25:15 -05:00
Bagatur
a841f62791
core[patch]: 0.1.2 (#14959) 2023-12-20 10:13:54 -05:00
Anush
60c70effe9
community[minor]: Qdrant sparse vector retriever (#14814)
## Description

This PR intends to add support for Qdrant's new [sparse vector
retrieval](https://qdrant.tech/articles/sparse-vectors/) by introducing
a new retriever class, `QdrantSparseVectorRetriever`.

Necessary usage docs and integration tests have been added for the
retriever.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 02:22:19 -05:00
mogith-pn
c53fab63a3
community[patch]: Fixed duplicate input id issue in clarifai vectorstore (#14914)
- **Description:** 
This PR fixes the issue faces with duplicate input id in Clarifai
vectorstore class when ingesting documents into the vectorstore more
than the batch size.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 02:21:36 -05:00
Sypherd
5642132c0c
community[patch]: Add safe lookup to OpenAI response adapter (#14765)
## Description
Similar to https://github.com/langchain-ai/langchain/issues/5861, I've
experienced `KeyError`s resulting from unsafe lookups in the
`convert_dict_to_message` function in [this
file](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/adapters/openai.py).
While that issue focused on `KeyError 'content'`, I've opened another
issue (#14764) about how the problem still exists in the same function
but with `KeyError 'role'`. The fix for #5861 only added a safe lookup
to the specific line that was giving them trouble.. This PR fixes the
unsafe lookup in the rest of the function but the problem still exists
across the repo.

## Issues
* #14764
* #5861 

## Dependencies
* None

## Checklist
[x] make format
[x] make lint
[ ] make test - Results in `make: *** No rule to make target 'test'.
Stop.`

## Maintainers
* @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 01:17:23 -05:00
AlpinDale
b0588774f1
community[minor]: Add Aphrodite Engine support (#14759)
This PR adds support for PygmalionAI's [Aphrodite
Engine](https://github.com/PygmalionAI/aphrodite-engine), based on
vLLM's attention mechanism. At the moment, this PR does not include
support for the API servers, but they will be added in a later PR.

The only dependency as of now is `aphrodite-engine==0.4.2`. We pin the
version to prevent breakage due to changes in the aphrodite-engine
library.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-20 01:16:57 -05:00
Dmitry Tyumentsev
d21f44b484
community[minor]: Add YandexGPT embeddings (#14767)
- **Description:** Introducing an ability to work with the
[YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) embeddings
models.
---------

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
2023-12-20 01:11:07 -05:00
Nicolas Suzor
529144649e
community[patch]: add png support for vertexai._parse_chat_history_gemini() (#14788)
- **Description:** Modify community chat model vertexai to handle png
and other image types encoded in base64
  - **Dependencies:** added `import re` but no new dependencies.

This addresses a problem where the vertexai method
_parse_chat_history_gemini() was only recognizing image uris in jpeg
format. I made a simple change to cover other extension types.
2023-12-20 00:58:39 -05:00
Dr. Christoph Mittendorf
f348ad4ba8
docs: typo LLaMA2_sql_chat.ipynb (#14798)
"language" (right) vs "langugae" (wrong)
2023-12-20 00:54:06 -05:00
Liu Jun
b0c48dc983
community[patch]: make ak and sk optional in qianfan endpoint (#14835)
- **Description:** The Qianfan SDK offers multiple authentication
methods, but in the `QianfanEndpoint` of Langchain, it currently only
supports authentication through AK and SK. In order to accommodate users
who wish to use alternative authentication methods, this pull request
makes AK and SK optional. This change should not impact existing users,
while allowing users to configure other authentication methods as per
the Qianfan SDK documentation.
  - **Issue:** /
  - **Dependencies:** No
  - **Tag maintainer:** No
  - **Twitter handle:**
2023-12-20 00:49:33 -05:00
Archan Ghosh
65678b3816
community[patch]: Update arxiv.py with Entry ID as a return value (#14915)
Added Entry ID as a return value inside get_summaries_as_docs

- **Description:** Added the Entry ID as a return, so it's easier to
track the IDs of the papers that are being returned.


With the addition return of the entry ID in functions like
ArxivRetriever, it will be easier to reference the ID of the paper
itself.
2023-12-20 00:30:24 -05:00
thehunmonkgroup
dc20766513
docs: readme for langchain-mistralai (#14917)
- **Description:** Add README doc for MistralAI partner package.
  - **Tag maintainer:** @baskaryan
2023-12-20 00:22:43 -05:00
Elena Mata Yandiola
b66659fc28
docs: Clarification google_cloud_storage_directory.ipynb (#14922)
- Description: Just a minor add to the documentation to clarify how to
load all files from a folder. I assumed and try to do it specifying it
in the bucket (BUCKET/FOLDER), instead of using the prefix.
2023-12-20 00:21:42 -05:00
Ari Roffe
8bcadfd446
docs: nit embedding_distance.ipynb (#14929)
**Description:** Fix the docs about embedding distance evaluations
guide.
2023-12-20 00:13:17 -05:00
Yacine
20eacd4b5e
docs: update notebook documentation for custom tool (#14942)
- **Description:** Documentation update. The custom tool notebook
documentation is updated to revome the warning caused by directly
instantiating of the LLMMathChain with an llm which is is deprecated.
The from_llm class method is used instead. LLM output results gets
updated as well.
  - **Issue:** no applicable
  - **Dependencies:** No dependencies
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ybouakkaz

Co-authored-by: Yacine Bouakkaz <Yacine.Bouakkaz@evokegroup.com>
2023-12-20 00:08:58 -05:00
Bagatur
345acb26ac
community[patch]: Matching engine, return doc id (#14930) 2023-12-20 00:03:11 -05:00
Erick Friis
8a3360edf6
anthropic: beta messages integration (#14928) 2023-12-19 18:55:19 -08:00
Erick Friis
795cf2ddda
together: package and embedding model (#14936) 2023-12-19 18:48:32 -08:00
Erick Friis
c21379438c
docs: remove unused contributor steps (#14938) 2023-12-19 18:41:50 -08:00
William FH
758bcd4671
Add langsmith and benchmark repo links (#14931)
Think we could link to these in more places
2023-12-19 17:44:31 -08:00