Commit Graph

3233 Commits

Author SHA1 Message Date
William FH
e2a99bd169
Different error strings (#8010) 2023-07-20 09:58:25 -07:00
Bagatur
ec4f93b629
bump 238 (#8012) 2023-07-20 09:21:15 -07:00
vrushankportkey
5f10d2ea1d
Add Portkey LLMOps integration (#7877)
Integrating Portkey, which adds production features like caching,
tracing, tagging, retries, etc. to langchain apps.

  - Dependencies: None
  - Twitter handle: https://twitter.com/portkeyai
  - test_portkey.py added for tests
  - example notebook added in new utilities folder in modules
  
 Also fixed a bug with OpenAIEmbeddings where headers weren't passing.

cc @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 09:08:44 -07:00
Boris Nieuwenhuis
095937ad52
Add google place ID to google places tool response (#7789)
- Description: this change will add the google place ID of the found
location to the response of the GooglePlacesTool
  - Issue: Not applicable
  - Dependencies: no dependencies
  - Tag maintainer: @hinthornw
  - Twitter handle: Not applicable
2023-07-20 09:04:31 -07:00
Bagatur
7c24a6b9d1
Bagatur/apify (#8008)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com>
Co-authored-by: Jan Čurn <jan.curn@gmail.com>
2023-07-20 08:36:01 -07:00
Aiden Le
1d7414a371
Feature: Add openai_api_model attribute to Doctran models (#7868)
- Description: Added the ability to define the open AI model.
- Issue: Currently the Doctran instance uses gpt-4 by default, this does
not work if the user has no access to gpt -4.
  - rlancemartin, @eyurtsev, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:27:56 -07:00
Dwai Banerjee
d8c40253c3
Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927)
BedrockEmbeddings does not have endpoint_url so that switching to custom
endpoint is not possible. I have access to Bedrock custom endpoint and
cannot use BedrockEmbeddings

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:25:59 -07:00
Bagatur
ea028b66ab
undo vectstore memory bug (#8007) 2023-07-20 07:25:23 -07:00
Mohammad Mohtashim
453d4c3a99
VectorStoreRetrieverMemory exclude additional input keys feature (#7941)
- Description: Added a parameter in VectorStoreRetrieverMemory which
filters the input given by the key when constructing the buffering the
document for Vector. This feature is helpful if you have certain inputs
apart from the VectorMemory's own memory_key that needs to be ignored
e.g when using combined memory, we might need to filter the memory_key
of the other memory, Please see the issue.
  - Issue: #7695
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:23:27 -07:00
Constantin Musca
d593833e4d
Add Golden Query Tool (#7930)
**Description:** Golden Query is a wrapper on top of the [Golden Query
API](https://docs.golden.com/reference/query-api) which enables
programmatic access to query results on entities across Golden's
Knowledge Base. For more information about Golden API, please see the
[Golden API Getting
Started](https://docs.golden.com/reference/getting-started) page.
**Issue:** None
**Dependencies:** requests(already present in project)
**Tag maintainer:** @hinthornw

Signed-off-by: Constantin Musca <constantin.musca@gmail.com>
2023-07-20 07:03:20 -07:00
eahova
aea97efe8b
Adding code to allow pandas to show all columns instead of truncating… (#7901)
- Description: Adding code to set pandas dataframe to display all the
columns. Otherwise, some data get truncated (it puts a "..." in the
middle and just shows the first 4 and last 4 columns) and the LLM
doesn't realize it isn't getting the full data. Default value is 8, so
this helps Dataframes larger than that.
  - Issue: none
  - Dependencies: none
  - Tag maintainer: @hinthornw 
  - Twitter handle: none
2023-07-20 07:02:01 -07:00
Santiago Delgado
c416dbe8e0
Amadeus Flight and Travel Search Tool (#7890)
## Background
With the addition on email and calendar tools, LangChain is continuing
to complete its functionality to automate business processes.

## Challenge
One of the pieces of business functionality that LangChain currently
doesn't have is the ability to search for flights and travel in order to
book business travel.

## Changes
This PR implements an integration with the
[Amadeus](https://developers.amadeus.com/) travel search API for
LangChain, enabling seamless search for flights with a single
authentication process.

## Who can review?
@hinthornw

## Appendix
@tsolakoua and @minjikarin, I utilized your
[amadeus-python](https://github.com/amadeus4dev/amadeus-python) library
extensively. Given the rising popularity of LangChain and similar AI
frameworks, the convergence of libraries like amadeus-python and tools
like this one is likely. So, I wanted to keep you updated on our
progress.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:59:29 -07:00
Hanit
ea149dbd89
Allowing outside parameters for Qdrant. (#7910)
@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:58:54 -07:00
Sheik Irfan Basha
d6493590da
Add Verbose support (#7982) (#7984)
- Description: Add verbose support for the extraction_chain
- Issue: Fixes #7982 
- Dependencies: NA
- Twitter handle: sheikirfanbasha
@hwchase17 and @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:52:13 -07:00
Junlin Zhou
812a1643db
chore(hf-text-gen): extract default params for reusing (#7929)
This PR extract common code (default generation params) for
`HuggingFaceTextGenInference`.

Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>
2023-07-20 06:49:12 -07:00
Yun Kim
54e02e4392
Add datadog-langchain integration doc (#7955)
## Description
Added a doc about the [Datadog APM integration for
LangChain](https://github.com/DataDog/dd-trace-py/pull/6137).
Note that the integration is on `ddtrace`'s end and so no code is
introduced/required by this integration into the langchain library. For
that reason I've refrained from adding an example notebook (although
I've added setup instructions for enabling the integration in the doc)
as no code is technically required to enable the integration.

Tagging @baskaryan as reviewer on this PR, thank you very much!

## Dependencies
Datadog APM users will need to have `ddtrace` installed, but the
integration is on `ddtrace` end and so does not introduce any external
dependencies to the LangChain project.


Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:44:58 -07:00
Wian Stipp
0ffb7fc10c
One Line Fix: missing text output with huggingface TGI LLM (#7972)
Small bug fix. The async _call method was missing a line to return the
generated text.

@baskaryan
2023-07-20 06:44:29 -07:00
Jithin James
493cbc9410
docs: fix a couple of small indentation errors in the strings (#7951)
Fixed a few indentations I came across in the docs @baskaryan
2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe
73901ef132
Added windows specific instructions to Llama.cpp documentation. (#8000)
- Description: Added windows specific instructions on llama.cpp in the
notebook file
  - Issue: #6356 
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-20 06:31:25 -07:00
Leonid Ganeline
24b26a922a
docstrings for embeddings (#7973)
Added/updated docstrings for the `embeddings`

@baskaryan
2023-07-20 06:26:44 -07:00
Leonid Ganeline
0613ed5b95
docstrings for LLMs (#7976)
docstrings for the `llms/`:
- added missed docstrings
- update existing docstrings to consistent format (no `Wrappers`!)
@baskaryan
2023-07-20 06:26:16 -07:00
Jeff Huber
5694e7b8cf
Update chroma notebook (#7978)
Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
2023-07-20 06:25:31 -07:00
Harutaka Kawamura
4a5894db47
Fix incorrect field name in MLflow AI Gateway config example (#7983) 2023-07-20 06:24:59 -07:00
Kacper Łukawski
19e8472521
Add async Qdrant to async_agent.ipynb (#7993)
I added Qdrant to the async API docs. This is the only vector store that
supports full async API.

@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:23:15 -07:00
Nuno Campos
8edb1db9dc
Fix key errors in weaviate hybrid retriever init (#7988) 2023-07-20 06:22:18 -07:00
Harrison Chase
df84e1bb64
pass callbacks along baby ai (#7908) 2023-07-19 22:40:33 -07:00
William FH
a4c5914c9a
Bump LS Version (#7970) 2023-07-19 17:12:16 -07:00
Bagatur
5d021c0962
nb fix (#7962) 2023-07-19 15:27:43 -07:00
Julien Salinas
3adab5e5be
Integrate NLP Cloud embeddings endpoint (#7931)
Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-19 15:27:34 -07:00
Bagatur
854a2be0ca
Add debugging guide (#7956) 2023-07-19 14:15:11 -07:00
Brendan Collins
9aef79c2e3
Add Geopandas.GeoDataFrame Document Loader (#3817)
Work in Progress.
WIP
Not ready...

Adds Document Loader support for
[Geopandas.GeoDataFrames](https://geopandas.org/)

Example:
- [x] stub out `GeoDataFrameLoader` class
- [x] stub out integration tests
- [ ] Experiment with different geometry text representations
- [ ] Verify CRS is successfully added in metadata
- [ ] Test effectiveness of searches on geometries
- [ ] Test with different geometry types (point, line, polygon with
multi-variants).
- [ ] Add documentation

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
2023-07-19 12:14:41 -07:00
Lance Martin
dfc533aa74
Add llama-v2 to local document QA (#7952) 2023-07-19 11:15:47 -07:00
Bagatur
d9b5bcd691
bump (#7948) 2023-07-19 10:23:21 -07:00
Bagatur
f97535b33e
fix (#7947) 2023-07-19 10:23:10 -07:00
Adilkhan Sarsen
7bb843477f
Removed kwargs from add_texts (#7595)
Removing **kwargs argument from add_texts method in DeepLake vectorstore
as it confuses users and doesn't fail when user is typing incorrect
parameters.

Also added small test to ensure the change is applies correctly.

Guys could pls take a look: @rlancemartin, @eyurtsev, this is a small
PR.

Thx so much!
2023-07-19 09:23:49 -07:00
Bagatur
4d8b48bdb3
bump 236 (#7938) 2023-07-19 07:51:40 -07:00
Harutaka Kawamura
f6839a8682
Add integration for MLflow AI Gateway (#7113)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->


- Adds integration for MLflow AI Gateway (this will be shipped in MLflow
2.5 this week).


Manual testing:

```sh
# Move to mlflow repo
cd /path/to/mlflow

# install langchain
pip install git+https://github.com/harupy/langchain.git@gateway-integration

# launch gateway service
mlflow gateway start --config-path examples/gateway/openai/config.yaml

# Then, run the examples in this PR
```
2023-07-19 07:40:55 -07:00
David Preti
6792a3557d
Update openai.py compatibility with azure 2023-07-01-preview (#7937)
Fixed missing "content" field in azure. 
Added a check for "content" in _dict (missing for azure
api=2023-07-01-preview)
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-19 07:31:18 -07:00
王斌(Bin Wang)
b65102bdb2
fix: pgvector search_type of similarity_score_threshold not working (#7771)
- Description: VectorStoreRetriever->similarity_score_threshold with
search_type of "similarity_score_threshold" not working with the
following two minor issues,
- Issue: 1. In line 237 of `vectorstores/base.py`, "score_threshold" is
passed to `_similarity_search_with_relevance_scores` as in the kwargs,
while score_threshold is not a valid argument of this method. As a fix,
before calling `_similarity_search_with_relevance_scores`,
score_threshold is popped from kwargs. 2. In line 596 to 607 of
`vectorstores/pgvector.py`, it's checking the distance_strategy against
the string in Enum. However, self.distance_strategy will get the
property of distance_strategy from line 316, where the callable function
is passed. To solve this issue, self.distance_strategy is changed to
self._distance_strategy to avoid calling the property method.,
  - Dependencies: No,
  - Tag maintainer: @rlancemartin, @eyurtsev,
  - Twitter handle: No

---------

Co-authored-by: Bin Wang <bin@arcanum.ai>
2023-07-19 07:20:52 -07:00
William FH
9d7e57f5c0
Docs Nit (#7918) 2023-07-18 21:47:28 -07:00
Wilson Leao Neto
8bb33f2296
Exposes Kendra result item DocumentAttributes in the document metadata (#7781)
- Description: exposes the ResultItem DocumentAttributes as document
metadata with key 'document_attributes' and refactors
AmazonKendraRetriever by providing a ResultItem base class in order to
avoid duplicate code;
- Tag maintainer: @3coins @hupe1980 @dev2049 @baskaryan
- Twitter handle: wilsonleao

### Why?
Some use cases depend on specific document attributes returned by the
retriever in order to improve the quality of the overall completion and
adjust what will be displayed to the user. For the sake of consistency,
we need to expose the DocumentAttributes as document metadata so we are
sure that we are using the values returned by the kendra request issued
by langchain.

I would appreciate your review @3coins @hupe1980 @dev2049. Thank you in
advance!

### References
- [Amazon Kendra
DocumentAttribute](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttribute.html)
- [Amazon Kendra
DocumentAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttributeValue.html)

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
2023-07-18 18:46:38 -07:00
Wilson Leao Neto
efa67ed0ef
fix #7782: check title and excerpt separately for page_content (#7783)
- Description: check title and excerpt separately for page_content so
that if title is empty but excerpt is present, the page_content will
only contain the excerpt
  - Issue: #7782 
  - Tag maintainer: @3coins @baskaryan 
  - Twitter handle: wilsonleao
2023-07-18 18:46:23 -07:00
Leonid Ganeline
d92926cbc2
docstrings chains (#7892)
Added/updated docstrings.
2023-07-18 18:25:42 -07:00
Leonid Ganeline
4a810756f8
docstrings chains (#7892)
Added/updated docstrings.

@baskaryan
2023-07-18 18:25:27 -07:00
Jarek Kazmierczak
f2ef3ff54a
Google Cloud Enterprise Search retriever (#7857)
Added a retriever that encapsulated Google Cloud Enterprise Search.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 18:24:08 -07:00
Alonso Silva Allende
1152f4d48b
Allow chat models that do not return token usage (#7907)
- Description: It allows to use chat models that do not return token
usage
- Issue: [#7900](https://github.com/hwchase17/langchain/issues/7900)
- Dependencies: None
- Tag maintainer: @agola11 @hwchase17 
- Twitter handle: @alonsosilva

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-07-18 18:12:09 -07:00
Zizhong Zhang
bdf0c2267f
docs(custom_chain) fix typo (#7898)
Fix typo in the document of custom_chain
2023-07-18 18:03:19 -07:00
Jeff Huber
2139d0197e
upgrade chroma to 0.4.0 (#7749)
** This should land Monday the 17th ** 

Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to
build, more durable, faster, smaller, and more extensible. This comes
with a few changes:

1. A simplified and improved client setup. Instead of having to remember
weird settings, users can just do `EphemeralClient`, `PersistentClient`
or `HttpClient` (the underlying direct `Client` implementation is also
still accessible)

2. We migrated data stores away from `duckdb` and `clickhouse`. This
changes the api for the `PersistentClient` that used to reference
`chroma_db_impl="duckdb+parquet"`. Now we simply set
`is_persistent=true`. `is_persistent` is set for you to `true` if you
use `PersistentClient`.

3. Because we migrated away from `duckdb` and `clickhouse` - this also
means that users need to migrate their data into the new layout and
schema. Chroma is committed to providing extension notification and
tooling around any schema and data migrations (for example - this PR!).

After upgrading to `0.4.0` - if users try to access their data that was
stored in the previous regime, the system will throw an `Exception` and
instruct them how to use the migration assistant to migrate their data.
The migration assitant is a pip installable CLI: `pip install
chroma_migrate`. And is runnable by calling `chroma_migrate`

-- TODO ADD here is a short video demonstrating how it works. 

Please reference the readme at
[chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate)
to see a full write-up of our philosophy on migrations as well as more
details about this particular migration.

Please direct any users facing issues upgrading to our Discord channel
called
[#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883).
We have also created a [email
listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers
directly in the future about breaking changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:20:54 -07:00
Gergely Papp
10246375a5
Gpapp/chromadb (#7891)
- Description: version check to make sure chromadb >=0.4.0 does not
throw an error, and uses the default sqlite persistence engine when the
directory is set,
  - Issue: the issue #7887 

For attention of
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:03:42 -07:00
Lance Martin
41c841ec85
Add Llama-v2 to Llama.cpp notebook (#7913) 2023-07-18 15:13:27 -07:00