Commit Graph

1368 Commits (docker)

Author SHA1 Message Date
Davis Chase 440b8761f4
Redis kwargs fix (#4936)
cc @tylerhutcherson
1 year ago
elBarkey a8ded21b69
FIX: GPTCache cache_obj creation loop (#4827)
_get_gptcache method keep creating new gptcache instance, here's the fix

# Fix GPTCache cache_obj creation loop

Fixes #4830 

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Alexey Nominas c9e2a01875
Update GPT4ALL integration (#4567)
# Update GPT4ALL integration

GPT4ALL have completely changed their bindings. They use a bit odd
implementation that doesn't fit well into base.py and it will probably
be changed again, so it's a temporary solution.

Fixes #3839, #4628
1 year ago
Leonid Ganeline e2d7677526
docs: compound ecosystem and integrations (#4870)
# Docs: compound ecosystem and integrations

**Problem statement:** We have a big overlap between the
References/Integrations and Ecosystem/LongChain Ecosystem pages. It
confuses users. It creates a situation when new integration is added
only on one of these pages, which creates even more confusion.
- removed References/Integrations page (but move all its information
into the individual integration pages - in the next PR).
- renamed Ecosystem/LongChain Ecosystem into Integrations/Integrations.
I like the Ecosystem term. It is more generic and semantically richer
than the Integration term. But it mentally overloads users. The
`integration` term is more concrete.
UPDATE: after discussion, the Ecosystem is the term.
Ecosystem/Integrations is the page (in place of Ecosystem/LongChain
Ecosystem).

As a result, a user gets a single place to start with the individual
integration.
1 year ago
Harrison Chase d5a0704544
dont error on sql import (#4647)
this makes it so we dont throw errors when importing langchain when
sqlalchemy==1.3.1

we dont really want to support 1.3.1 (seems like unneccessary maintance
cost) BUT we would like it to not terribly error should someone decide
to run on it
1 year ago
Harrison Chase c9a362e482
add alias for model (#4553)
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Richard He 7642f2159c
Add human message as input variable to chat agent prompt creation (#4542)
# Add human message as input variable to chat agent prompt creation

This PR adds human message and system message input to
`CHAT_ZERO_SHOT_REACT_DESCRIPTION` agent, similar to [conversational
chat
agent](7bcf238a1a/langchain/agents/conversational_chat/base.py (L64-L71)).

I met this issue trying to use `create_prompt` function when using the
[BabyAGI agent with tools
notebook](https://python.langchain.com/en/latest/use_cases/autonomous_agents/baby_agi_with_agent.html),
since BabyAGI uses “task” instead of “input” input variable. For normal
zero shot react agent this is fine because I can manually change the
suffix to “{input}/n/n{agent_scratchpad}” just like the notebook, but I
cannot do this with conversational chat agent, therefore blocking me to
use BabyAGI with chat zero shot agent.

I tested this in my own project
[Chrome-GPT](https://github.com/richardyc/Chrome-GPT) and this fix
worked.

## Request for review
Agents / Tools / Toolkits
- @vowelparrot
1 year ago
Yuekai Zhang 1ed4228822
Fix bilibili (#4860)
# Fix bilibili api import error

bilibili-api package is depracated and there is no sync module.

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes #2673 #2724 

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@vowelparrot  @liaokongVFX 

<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->
1 year ago
Eugene Yurtsev e46202829f
feat #4479: TextLoader auto detect encoding and improved exceptions (#4927)
# TextLoader auto detect encoding and enhanced exception handling

- Add an option to enable encoding detection on `TextLoader`. 
- The detection is done using `chardet`
- The loading is done by trying all detected encodings by order of
confidence or raise an exception otherwise.

### New Dependencies:
- `chardet`

Fixes #4479 

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @eyurtsev

---------

Co-authored-by: blob42 <spike@w530>
1 year ago
张城铭 8c28ad6dac
API update: Engines -> Models (#4915)
# API update: Engines -> Models

see: https://community.openai.com/t/api-update-engines-models/18597

Co-authored-by: assert <zhangchengming@kkguan.com>
1 year ago
Eugene Yurtsev c06a47a691
Load specific file types from Google Drive (issue #4878) (#4926)
# Load specific file types from Google Drive (issue #4878)
Add the possibility to define what file types you want to load from
Google Drive.
 
```
 loader = GoogleDriveLoader(
    folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
    file_types=["document", "pdf"]
    recursive=False
)
```

Fixes ##4878

## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
DataLoaders
- @eyurtsev

Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589

---------

Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com>
1 year ago
Harrison Chase b8d48939a2
Harrison/unified objectives (#4905)
Co-authored-by: Matthias Samwald <samwald@gmx.at>
1 year ago
Harrison Chase 9165267f8a
Harrison/improved retry tool (#4842) 1 year ago
Harrison Chase ba023d53ca
Harrison/faiss norm (#4903)
Co-authored-by: Jiaxin Shan <seedjeffwan@gmail.com>
1 year ago
Harrison Chase 9e2227ba11
Harrison/serper api bug (#4902)
Co-authored-by: Jerry Luan <xmaswillyou@gmail.com>
1 year ago
Davis Chase 8966f61ca5
Zep memory (#4898)
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
1 year ago
Davis Chase e28bdf4453
Cadlabs/python tool sanitization (#4754)
Co-authored-by: BenSchZA <BenSchZA@users.noreply.github.com>
1 year ago
Eugene Yurtsev 0dc304ca80
Add html parsers (#4874)
# Add bs4 html parser

* Some minor refactors
* Extract the bs4 html parsing code from the bs html loader
* Move some tests from integration tests to unit tests
1 year ago
Eugene Yurtsev 8e41143bf5
Add a generic document loader (#4875)
# Add generic document loader

* This PR adds a generic document loader which can assemble a loader
from a blob loader and a parser
* Adds a registry for parsers
* Populate registry with a default mimetype based parser


## Expected changes

- Parsing involves loading content via IO so can be sped up via:
  * Threading in sync
  * Async  
- The actual parsing logic may be computatinoally involved: may need to
figure out to add multi-processing support
- May want to add suffix based parser since suffixes are easier to
specify in comparison to mime types

## Before submitting

No notebooks yet, we first need to get a few of the basic parsers up
(prior to advertising the interface)
1 year ago
Davis Chase df0c33a005
Faiss no avx2 (#4895)
Co-authored-by: Ali Mirlou <alimirlou@gmail.com>
1 year ago
Emil Ahlbäck 5c9205d5f4
ConversationalChatAgent: Allow customizing `TEMPLATE_TOOL_RESPONSE` (#2361)
It's currently not possible to change the `TEMPLATE_TOOL_RESPONSE`
prompt for ConversationalChatAgent, this PR changes that.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Zander Chase 1ff7c958b0
Bold Crumbs (#4876) 1 year ago
Taqi Jaffri ef8b5f64bc
Tiny code review and docs fix for Docugami DataLoader (#4877)
# Docs and code review fixes for Docugami DataLoader

1. I noticed a couple of hyperlinks that are not loading in the
langchain docs (I guess need explicit anchor tags). Added those.
2. In code review @eyurtsev had a
[suggestion](https://github.com/hwchase17/langchain/pull/4727#discussion_r1194069347)
to allow string paths. Turns out just updating the type works (I tested
locally with string paths).

# Pre-submission checks
I ran `make lint` and `make tests` successfully.

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
1 year ago
Leonid Ganeline b96ab4b763
docs `retriever` improvements (#4430)
# Docs: improvements in the `retrievers/examples/` notebooks

Its primary purpose is to make the Jupyter notebook examples
**consistent** and more suitable for first-time viewers.
- add links to the integration source (if applicable) with a short
description of this source;
- removed `_retriever` suffix from the file names (where it existed) for
consistency;
- removed ` retriever` from the notebook title (where it existed) for
consistency;
- added code to install necessary Python package(s);
- added code to set up the necessary API Key.
- very small fixes in notebooks from other folders (for consistency):
  - docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb
  - docs/modules/indexes/vectorstores/examples/pinecone.ipynb
  - docs/modules/models/llms/integrations/cohere.ipynb
- fixed misspelling in langchain/retrievers/time_weighted_retriever.py
comment (sorry, about this change in a .py file )

## Who can review
@dev2049
1 year ago
Yong Fu 3e12f0957a
Remove unused variables in Milvus vectorstore (#4868)
# Remove unused variables in Milvus vectorstore
This PR simply removes a variable unused in Milvus. The variable looks
like a copy-paste from other functions in Milvus but it is really
unnecessary.
1 year ago
Ryan Culligan 6a9cdc43f5
Fix TypeError in Vectorstore Redis class methods (#4857)
# Fix TypeError in Vectorstore Redis class methods

This change resolves a TypeError that was raised when invoking the
`from_texts_return_keys` method from the `from_texts` method in the
`Redis` class. The error was due to the `cls` argument being passed
explicitly, which led to it being provided twice since it's also
implicitly passed in class methods. No relevant tests were added as the
issue appeared to be better suited for linters to catch proactively.

Changes:
- Removed `cls=cls` from the call to `from_texts_return_keys` in the
`from_texts` method.

Related to:
https://github.com/hwchase17/langchain/pull/4653
1 year ago
Eugene Yurtsev 2d20a1196e
Hugging Face Loader: Add lazy load (#4799)
# Add lazy load to HF datasets loader

Unfortunately, there are no tests as far as i can tell. Verified code manually.
1 year ago
Zander Chase 8dcad0f272
Add Support for Flexible Input Format for LLM and Chat Model Runs (#4805)
Previously, the client expected a strict 'prompt' or 'messages' format
and wouldn't permit running a chat model or llm on prompts or messages
(respectively).

Since many datasets may want to specify custom key: string , relax this
requirement.
Also, add support for running a chat model on raw prompts and LLM on
chat messages through their respective fallbacks.
1 year ago
Zander Chase a47c62fcba
Add dev option (#4828)
enable running
```
langchain plus start --dev
```

To use the RC iamges instead
1 year ago
Harrison Chase 720ac49f42
2markdown loader (#4796)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Ankush Gola aa73a888fa
Some notebook and client fixes (add retries, clean up docs, etc) (#4820)
# Your PR Title (What it does)

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->
1 year ago
Davis Chase 0a591da6db
Add weaviate by_text (#4824)
Thanks @ZouhairElhadi! Made small change

Closes #4742

---------

Co-authored-by: Zouhair Elhadi <zouhair11elhadi@gmail.com>
Co-authored-by: ZouhairElhadi <87149442+ZouhairElhadi@users.noreply.github.com>
1 year ago
Zander Chase d1b6839d97
Retry session and tenant (#4822) 1 year ago
Nguyen Trung Duc (john) 49e4aaf673
Fix subclassing OpenAIEmbeddings (#4500)
# Fix subclassing OpenAIEmbeddings

Fixes #4498 

## Before submitting

- Problem: Due to annotated type `Tuple[()]`.
- Fix: Change the annotated type to "Iterable[str]". Even though
tiktoken use
[Collection[str]](095924e02c/tiktoken/core.py (L80))
type annotation, but pydantic doesn't support Collection type, and
[Iterable](https://docs.pydantic.dev/latest/usage/types/#typing-iterables)
is the closest to Collection.
1 year ago
Harrison Chase 08df80bed6
console callback verbose (#4696)
add verbose callback

Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
1 year ago
Django bcffc704c1
fix: agenerate miss run_manager args in llm.py (#4566)
# fix: agenerate miss run_manager args in llm.py

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)
fix: agenerate miss run_manager args in llm.py


<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->
1 year ago
Steve Kim e90654f39b
Added cleaning up the downloaded PDF files (#4601)
ArxivAPIWrapper searches and downloads PDFs to get related information.
But I found that it doesn't delete the downloaded file. The reason why
this is a problem is that a lot of PDF files remain on the server. For
example, one size is about 28M.
So, I added a delete line because it's too big to maintain on the
server.

# Clean up downloaded PDF files
- Changes: Added new line to delete downloaded file
- Background: To get the information on arXiv's paper, ArxivAPIWrapper
class downloads a PDF.
It's a natural approach, but the wrapper retains a lot of PDF files on
the server.
- Problem: One size of PDFs is about 28M. It's too big to maintain on a
small server like AWS.
- Dependency: import os

Thank you.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Quinn 6fbd5e837f
Update planner_prompt.py, change usery to user (#4623)
# Fix misspell in planner_prompt.py

before

```
Usery query: I want to buy a couch
```

after

```
User query: I want to buy a couch
```
1 year ago
Tony Zhang 432421ffa5
[Fix][GenerativeAgent] Get the memory importance score from regex matched group (#4636)
# Get the memory importance score from regex matched group

In `GenerativeAgentMemory`, the `_score_memory_importance()` will make a
prompt to get a rating score. The prompt is:
```
        prompt = PromptTemplate.from_template(
            "On the scale of 1 to 10, where 1 is purely mundane"
            + " (e.g., brushing teeth, making bed) and 10 is"
            + " extremely poignant (e.g., a break up, college"
            + " acceptance), rate the likely poignancy of the"
            + " following piece of memory. Respond with a single integer."
            + "\nMemory: {memory_content}"
            + "\nRating: "
        )
```
For some LLM, it will respond with, for example, `Rating: 8`. Thus we
might want to get the score from the matched regex group.
1 year ago
rajib e28f4a5f39
changed cohere.py to update the default model of embedding (#4709)
# The cohere embedding model do not use large, small. It is deprecated.
Changed the modules default model

Fixes #4694


Co-authored-by: rajib76 <rajib76@yahoo.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
charosen 75fe9d3555
Add from_file method to message prompt template (#4713)
**Feature**: This PR adds `from_template_file` class method to
BaseStringMessagePromptTemplate. This is useful to help user to create
message prompt templates directly from template files, including
`ChatMessagePromptTemplate`, `HumanMessagePromptTemplate`,
`AIMessagePromptTemplate` & `SystemMessagePromptTemplate`.

**Tests**: Unit tests have been added in this PR.

Co-authored-by: charosen <charosen@bupt.cn>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Chandan Routray e8d46bdd9b
Replaced `SQLDatabaseChain` deprecated direct initialisation with `from_llm` method (#4778)
# Removed usage of deprecated methods

Replaced `SQLDatabaseChain` deprecated direct initialisation with
`from_llm` method

## Who can review?

@hwchase17
@agola11

---------

Co-authored-by: imeckr <chandanroutray2012@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Chandan Routray 11341fcecb
Fixed query checker for SQLDatabaseChain (#4780)
# Fixed query checker for SQLDatabaseChain

When `SQLDatabaseChain`'s llm attribute was deprecated, the query
checker stopped working if `SQLDatabaseChain` is initialised via
`from_llm` method. With this fix, `SQLDatabaseChain`'s query checker
would use the same `llm` as used in the `llm_chain`


## Who can review?
@hwchase17 - project lead

Co-authored-by: imeckr <chandanroutray2012@gmail.com>
1 year ago
Yeong0228 08876ad066
Fix SelfQueryRetriever, passing new query to vector store (#4774)
# Fix SelfQueryRetriever, passing new query to vector store
1 year ago
yujiosaka 6561efebb7
Accept uuids kwargs for weaviate (#4800)
# Accept uuids kwargs for weaviate

Fixes #4791
1 year ago
Adam Quigley e78c9be312
Add Confluence Loader unit tests (#3333)
Adds some basic unit tests for the ConfluenceLoader that can be extended
later. Ports this [PR from
llama-hub](https://github.com/emptycrown/llama-hub/pull/208) and adapts
it to `langchain`.

@Jflick58 and @zywilliamli adding you here as potential reviewers

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Magnus Friberg d126276693
Specify which data to return from chromadb (#4393)
# Improve the Chroma get() method by adding the optional "include"
parameter.

The Chroma get() method excludes embeddings by default. You can
customize the response by specifying the "include" parameter to
selectively retrieve the desired data from the collection.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Raduan Al-Shedivat 00c6ec8a2d
fix(document_loaders/telegram): fix pandas calls + add tests (#4806)
# Fix Telegram API loader + add tests.
I was testing this integration and it was broken with next error:
```python
message_threads = loader._get_message_threads(df)
KeyError: False
```
Also, this particular loader didn't have any tests / related group in
poetry, so I added those as well.

@hwchase17 / @eyurtsev please take a look on this fix PR.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Zander Chase 206c87d525
Change server start name (#4811)
to `langchain plus start/stop`
1 year ago
了空 f7e3d97b19
Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619)
- Remove unnecessary spaces from document object’s page_content of
BiliBiliLoader
- Fix BiliBiliLoader document and test file
1 year ago