Commit Graph

4319 Commits (fed137a8a9a1387278d02b4533016a11ae631873)
 

Author SHA1 Message Date
Jon Bennion fed137a8a9
adding new chain for logical fallacy removal from model output in chain (#9887)
Description: new chain for logical fallacy removal from model output in
chain and docs
Issue: n/a see above
Dependencies: none
Tag maintainer: @hinthornw in past from my end but not sure who that
would be for maintenance of chains
Twitter handle: no twitter feel free to call out my git user if shout
out j-space-b

Note: created documentation in docs/extras

---------

Co-authored-by: Jon Bennion <jb@Jons-MacBook-Pro.local>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Harrison Chase 794ff2dae8
Harrison/hf lru (#10154)
Co-authored-by: Pascal Bro <git@pascalbrokmeier.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Stanko Kuveljic 4765c09703
Pinecone upsert parallelization (#9859)
Issue: closes #9855

* consolidates `from_texts` and `add_texts` functions for pinecone
upsert
* adds two types of batching (one for embeddings and one for index
upsert)
* adds thread pool size when instantiating pinecone index
12 months ago
Lance Martin 16a27ab244
Add prompt hub for various use-cases (#9879)
Use prompt hub in our use-case docs and guides.
12 months ago
Lorenzo 00a7c31ffd
Fix: Nested Dicts Handling of Document Metadata (#9880)
## Description
When the `MultiQueryRetriever` is used to get the list of documents
relevant according to a query, inside a vector store, and at least one
of these contain metadata with nested dictionaries, a `TypeError:
unhashable type: 'dict'` exception is thrown.
This is caused by the `unique_union` function which, to guarantee the
uniqueness of the returned documents, tries, unsuccessfully, to hash the
nested dictionaries and use them as a part of key.
```python
unique_documents_dict = {
    (doc.page_content, tuple(sorted(doc.metadata.items()))): doc
    for doc in documents
}
```

## Issue
#9872 (MultiQueryRetriever (get_relevant_documents) raises TypeError:
unhashable type: 'dict' with dic metadata)

## Solution
A possible solution is to dump the metadata dict to a string and use it
as a part of hashed key.
```python
unique_documents_dict = {
    (doc.page_content, json.dumps(doc.metadata, sort_keys=True)): doc
    for doc in documents
}
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Leonid Ganeline a52fe9528e
docs: fixed title in `Bittensor` example (#9893)
Fixed title in the `Bittensor` example. The old title brakes the sorted
order of items in the navbar.
Added some formatting.
12 months ago
Davide Menini b8baead70c
fix (Html2TextTransformer): allow configuration of html2text (#9914)
Hi, this PR enables configuring the html2text package, instead of being
bound to use the hardcoded values. While simply passing `ignore_links`
and `ignore_images` to the `transform_documents` method was possible, I
preferred passing them to the `__init__` method for 2 reasons:

1. It is more efficient in case of subsequent calls to
`transform_documents`.
2. It allows to move the "complexity" to the instantiation, keeping the
actual execution simple and general enough. IMO the transformers should
all follow this pattern, allowing something like this:
```python
# Instantiate transformers
transformers = [
    TransformerA(foo='bar'),
    TransformerB(bar='foo'),
    # others
]

# During execution, call them sequentially
documents = ...
for tr in transformers:
    documents = tr.transform_documents(documents)
```

Thanks for the reviews!

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
12 months ago
seamusp abd8681341
docs: chains & memory fixes (#9895)
Various improvements to the Chains & Memory sections of the
documentation including formatting, spelling, and grammar fixes to
improve readability.
12 months ago
Frédéric Lepied 4dc47bd3ac
time_weighted_retriever: use a timestamp if needed (#9906)
If last_accessed_at metadata is a float use it as a timestamp. This
allows to support vector stores that do not store datetime objects like
ChromaDb.

Fixes: https://github.com/langchain-ai/langchain/issues/3685

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
12 months ago
Josh White bc8cceebf7
Extend DynamoDBChatMessageHistory to support composite keys (#9896)
- Description: Adds two optional parameters to the
DynamoDBChatMessageHistory class to enable users to pass in a name for
their PrimaryKey, or a Key object itself to enable the use of composite
keys, a common DynamoDB paradigm.
  
[AWS DynamoDB Key
docs](https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/)
  
  - Issue: N/A
  - Dependencies: N/A
  - Twitter handle: N/A

---------

Co-authored-by: Josh White <josh@ctrlstack.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Programmers Emperor 872d829201
Update __init__.py (#9955)
Add SQLDatabaseSequentialChain Class to __init__.py so it can be
accessed and used

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- Description: SQLDatabaseSequentialChain is not found when importing
Langchain_experimental package, when I open __init__.py
Langchain_expermental.sql, I found that SQLDatabaseSequentialChain is
imported and add to __all__ list
- Issue: SQLDatabaseSequentialChain is not found in
Langchain_experimental package
  - Dependencies: None,
  - Tag maintainer: None,
  - Twitter handle: None,

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
12 months ago
Lucas Rodrigues Pereira 5c7afe8aae
Fix json parsing error of MULTI_PROMPT_ROUTER_TEMPLATE (#9944)
The output at times lacks the closing markdown code block. The prompt is
changed to explicitly request the closing backticks.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->
12 months ago
Lance Martin 387813bfb2
Sort by most recent chatIDs (#9946)
When we `lazy_load` iMessage chats, return chats w/ most recent msg
first (matches what is visualized in app).
12 months ago
German Martin cf5a50469f
TextGen is missing async methods. (#9986)
Adding _acall and _astream method that were missing. Preventing
streaming during async executions.

 @rlancemartin.
12 months ago
Blake (Yung Cher Ho) f4bed8a04c
Takeoff baseurl support (#10091)
## Description
This PR introduces a minor change to the TitanTakeoff integration. 
Instead of specifying a port on localhost, this PR will allow users to
specify a baseURL instead. This will allow users to use the integration
if they have TitanTakeoff deployed externally (not on localhost). This
removes the hardcoded reference to localhost "http://localhost:{port}".

### Info about Titan Takeoff
Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.

Read more about Titan Takeoff here:
-
[Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)

### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.

Thanks for your help and please let me know if you have any questions.
cc: @hwchase17 @baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Pu Cao 05664a6f20
docs(text_splitter): update document of character splitter with tiktoken (#10001)
The current document has not mentioned that splits larger than chunk
size would happen. I update the related document and explain why it
happens and how to solve it.

related issue #1349 #3838 #2140
12 months ago
Eddie Cohen 565c021730
Add ne comparator (#10006)
Description: Adds the not comparator and operator to pinecone, chroma
and deeplake.
Issue: Not a registered issue but when using a selfqueryretriever with
pinecone I got this error + stacktrace when I entered a query that asked
to not include specific data:
 
>  raised following `error:`
> Received unrecognized function ne. Valid functions are [<Operator.AND:
'and'>, <Operator.OR: 'or'>, <Operator.NOT: 'not'>, <Comparator.EQ:
'eq'>, <Comparator.GT: 'gt'>, <Comparator.GTE: 'gte'>, <Comparator.LT:
'lt'>, <Comparator.LTE: 'lte'>]

I noticed that chroma and deeplake also support not equals/not filtering
so I added it there as well



[pinecone](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language)
[chroma](https://docs.trychroma.com/usage-guide#filtering-by-metadata)

[deeplake](https://docs.activeloop.ai/enterprise-features/compute-engine/querying-datasets/query-syntax#and-or-not)
12 months ago
Leonid Ganeline 2221194450
`Yahoo Finance News` tool (#10014)
Added:
- the `Yahoo Finance News` tool
- Ut-s
- An example
12 months ago
Ismail Pelaseyed 5c3e9c9083
Add example of running Q&A over structured data using the `Airbyte` loaders and `pandas` (#10069)
- Description: Added example of running Q&A over structured data using
the `Airbyte` loaders and `pandas`
  - Dependencies: any dependencies required for this change,
  - Tag maintainer: @hwchase17 
  - Twitter handle: @pelaseyed
12 months ago
Lars von Wedel 6d82503eb1
Add parser and loader for Azure document intelligence service. (#10136)
Hi,

this PR contains loader / parser for Azure Document intelligence which
is a ML-based service to ingest arbitrary PDFs / images, even if
scanned. The loader generates Documents by pages of the original
document. This is my first contribution to LangChain.

Unfortunately I could not find the correct place for test cases. Happy
to add one if you can point me to the location, but as this is a
cloud-based service, a test would require network access and credentials
- so might be of limited help.

Dependencies: The needed dependency was already part of pyproject.toml,
no change.
Twitter: feel free to mention @LarsAC on the announcement
12 months ago
Harrison Chase 4abe85be57
Harrison/string inplace (#10153)
Co-authored-by: Wrick Talukdar <wrick.talukdar@gmail.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Lucky-Lance <77819606+Lucky-Lance@users.noreply.github.com>
Co-authored-by: 陆徐东 <luxudong@MacBook-Pro.local>
12 months ago
Harrison Chase f5af756397
fake messages list model (#10152)
create a fake chat model that you can configure with list of messages
12 months ago
Harrison Chase 9e6cc7b236
make hub push public by default (#10138) 12 months ago
Nino Risteski 0c0a7d19eb
Update openai_multi_functions_agent.ipynb (#10144)
typo fix
12 months ago
Nino Risteski f968b86652
Update apis.ipynb (#10145)
few typo fixes
12 months ago
Guy Korland 765ef3b486
Add FalkorDB to imports (#10151) 12 months ago
Nino Risteski 746c6ff9c3
Update index.mdx (#10142)
fixed typos
12 months ago
Nino Risteski fdebd3e02f
Update chat_vector_db.mdx (#10141)
typo fix
12 months ago
Bagatur 0e4c5dd176
bump 13 (#10130) 12 months ago
Bagatur 42582adb66
bump 280 (#10117) 12 months ago
Bagatur 9e196cb470
rm sqlite3 import (#10115) 12 months ago
Arpan Pokharel f8bca156d4
Add where filter in weaviate similarity search with score (#9978)
- Description: Add where filter in weaviate similarity search with score
  - Issue: #9853 
  - Dependencies: -
  - Tag maintainer: -
  - Twitter handle: -
12 months ago
Leonid Kuligin 30239b3025
added support for inference from Model Garden (#9367)
#8850

---------

Co-authored-by: Leonid Kuligin <kuligin@google.com>
12 months ago
Leonid Ganeline 54a8df87b9
📖 docs: fixed `integration/llms` navbar (#9277)
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
12 months ago
Bagatur b485c3048b
rm base64 images from docs (#10110)
Causing problems indexing docs and notebook images don't render after markdown conversion anyways
12 months ago
William FH f2fc4173c3
Update redirects meta tags (#10109) 12 months ago
Leonid Ganeline 37e435bd00
docs: `youtube_search` tool example update (#9958)
Added a link to source package; updated title, description.
12 months ago
Leonid Ganeline 3b8ee74e38
docs: `google-drive-tool` example fix (#10000)
This notebook was mistakenly placed in the `toolkits` folder and appears
within `Agents & Toolkits` menu. But it should be in `Tools`.
Moved example into `tools/`; updated title to consistent format.
12 months ago
seamusp afd96b2460
docs: agents & callbacks fixes (#10066)
Various improvements to the Agents & Callbacks sections of the
documentation including formatting, spelling, and grammar fixes to
improve readability.
12 months ago
Benjamin Matson 58d7d86e51
feat: add bedrock chat model (#8017)
Replace this comment with:
  - Description: Add Bedrock implementation of Anthropic Claude for Chat
  - Tag maintainer: @hwchase17, @baskaryan
  - Twitter handle: @bwmatson

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Massimiliano Pronesti a7c9bd30d4
feat(llms): add missing params to huggingface text-generation (#9724)
This small PR aims at supporting the following missing parameters in the
`HuggingfaceTextGen` LLM:
- `return_full_text` - sometimes useful for completion tasks
- `do_sample` - quite handy to control the randomness of the model.
- `watermark`

@hwchase17 @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
KyrianC 491089754d
EdenAI LLM update. Add models name option (#8963)
This PR follows the **Eden AI (LLM + embeddings) integration**. #8633 

We added an optional parameter to choose different AI models for
providers (like 'text-bison' for provider 'google', 'text-davinci-003'
for provider 'openai', etc.).

Usage:

```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "model": "text-bison",  # new
        "temperature": 0.2,
        "max_tokens": 250,
    },
)

```

You can also change the provider + model after initialization
```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "temperature": 0.2,
        "max_tokens": 250,
    },
)

prompt = """
hi 
"""

llm(prompt, providers='openai', model='text-davinci-003')  # change provider & model
```

The jupyter notebook as been updated with an example well.


Ping: @hwchase17, @baskaryan

---------

Co-authored-by: RedhaWassim <rwasssim@gmail.com>
Co-authored-by: sam <melaine.samy@gmail.com>
12 months ago
maks-operlejn-ds b5a74fb973
Temporarily remove language selection (#10097)
Adapting Microsoft Presidio to other languages requires a bit more work,
so for now it will be good idea to remove the language option to choose,
so as not to cause errors and confusion.
https://microsoft.github.io/presidio/analyzer/languages/

I will handle different languages after the weekend 😄
12 months ago
Bagatur 71c418725f
index rename delete_mode -> cleanup (#10103) 12 months ago
Nuno Campos 427f696fb0
Nc/runnables seqmap tags (#9753) 12 months ago
Bagatur b927277809
Bagatur/eden type 2 (#10102) 12 months ago
Bagatur d4380339c1
eden tool nb nit (#10101) 12 months ago
Harrison Chase d7bf7dc412
add repr for not serializable (#10071)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
12 months ago
Bagatur 355ff09cce
bump 279 (#10098) 12 months ago
Pihplipe Oegr 3dafbd852e
Add sqlite-vss as a vector database (#10047)
This adds sqlite-vss as an option for a vector database. Contains the
code and a few tests. Tests are passing and the library sqlite-vss is
added as optional as explained in the contributing guidelines. I
adjusted the code for lint/black/ and mypy. It looks that everything is
currently passing.

Adding sqlite-vss was mentioned in this issue:
https://github.com/langchain-ai/langchain/issues/1019.
Also mentioned here in the sqlite-vss repo for the curious:
https://github.com/asg017/sqlite-vss/issues/66

Maintainer tag: @baskaryan

---------

Co-authored-by: Philippe Oger <philippe.oger@adevinta.com>
12 months ago