Commit Graph

4581 Commits (61cecf8b1b5d8708f4a8f418c4be759c33c8b59f)
 

Author SHA1 Message Date
shibuiwilliam d9bc46186d
Add missing test for retrievers self_query (#8783)
# What
- Add missing test for retrievers self_query
- Add missing import validation

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: Add missing test for retrievers self_query
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ
  
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Snehil Kumar 1bd4890506
Update links on QA Use Case docs (#8784)
- Description: 2 links were not working on Question Answering Use Cases
documentation page. Hence, changed them to nearest useful links,
  - Issue: NA,
  - Dependencies: NA,
  - Tag maintainer: @baskaryan,
  - Twitter handle: NA

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Wilson Leao Neto b0d0338f21
feat: expose Kendra result item id and document id as document metadata (#8796)
- Description: we expose Kendra result item id and document id as
document metadata.
  - Tag maintainer: @3coins @baskaryan 
  - Twitter handle: wilsonleao

**Why**
The result item id and document id might be used to keep track of the
retrieved resources.
1 year ago
Bal Narendra Sapa a22d502248
added the embeddings part (#8805)
Description: forgot to add the embeddings part in the documentation.
sorry 😅

@baskaryan
1 year ago
Bagatur 9b86235a56
bump 253 (#8798) 1 year ago
Bagatur 9fc9018951
Bagatur/nuclia (#8404)
Co-authored-by: Eric BREHAULT <ebrehault@gmail.com>
1 year ago
Francisco Ingham ef5bc1fef1
Refactor for extraction docs (#8465)
Refactor for the extraction use case documentation

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
William FH 1d68470bac
Same Project for Eval Runs (#8781) 1 year ago
William FH c8f3615aa6
Support evaluating runnables and arbitrary functions (#8698)
Added a couple of "integration tests" for these that I ran.

Main design point of feedback: at this point, would it just be better to
have separate arguments for each type? Little confusing what is or isn't
supported and what is the intended usage at this point since I try to
wrap the function as runnable or pack or unpack chains/llms.

```
run_on_dataset(
...
llm_or_chain_factory = None,
llm = None,
chain = NOne,
runnable=None,
function=None
):
# raise error if none set
```

Downside with runnables and arbitrary function support is that you get
much less helpful validation and error messages, but I don't think we
should block you from this, at least.
1 year ago
liguoqinjim d00a247da7
fix:get bilibili subtitles (#8165)
- Description: fix the Loader 'BiliBiliLoader'
  - Issue: the API response was changed

![image](https://github.com/langchain-ai/langchain/assets/2113954/91216793-82f8-4c82-a018-d49f36f5f6aa)
The previously used API no longer returns the "subtitle_url" property.

![image](https://github.com/langchain-ai/langchain/assets/2113954/a8ec2a7a-f40d-4c2a-b7d0-0ccdf2b327cc)
We should use another API to get `subtitle_url` property. 
The `subtitle_url` returned by this API does not include the http schema
and needs to be added.

  - Dependencies: Nope
  - Tag maintainer: @rlancemartin
1 year ago
Bagatur 21771a6f1c
rm sklearn links (#8773) 1 year ago
Joshua Carroll e5fed7d535
Extend the StreamlitChatMessageHistory docs with a fuller example and… (#8774)
Add more details to the [notebook for
StreamlitChatMessageHistory](https://python.langchain.com/docs/integrations/memory/streamlit_chat_message_history),
including a link to a [running example
app](https://langchain-st-memory.streamlit.app/).

Original PR: https://github.com/langchain-ai/langchain/pull/8497
1 year ago
Eugene Yurtsev 19dfe166c9
Update documentation for prompts (#8381)
* Documentation to favor creation without declaring input_variables
* Cut out obvious examples, but add more description in a few places

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
1 year ago
Dayou Liu 91a0817e39
docs: llamacpp minor fixes (#8738)
- Description: minor updates on llama cpp doc
1 year ago
Bagatur f437311eef
Bagatur/runnable with fallbacks (#8543) 1 year ago
Eugene Yurtsev 003e1ca9a0
Update api references (#8646)
Update API reference documentation. This PR will pick up a number of missing classes, it also applies selective formatting based on the class / object type.
1 year ago
Piyush Jain 8374367de2
Amazon Textract as document loader (#8661)
Description: Adding support for [Amazon
Textract](https://aws.amazon.com/textract/) as a PDF document loader

---------

Co-authored-by: schadem <45048633+schadem@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline 82ef1f587d
fix makefile help (#8723)
Fixed the `makefile` help. It was not up-to-date.
 @baskaryan
1 year ago
Neil Murphy b0d0399d34
(issue #5163) Append reminder to nest multi-prompt router prompt output in JSON markdown code block, resolving JSON parsing error. (#8709)
Resolves occasional JSON parsing error when some predictions are passed
through a `MultiPromptChain`.

Makes [this
modification](https://github.com/langchain-ai/langchain/issues/5163#issuecomment-1652220401)
to `multi_prompt_prompt.py`, which is much cleaner than appending an
entire example object, which is another community-reported solution.

@hwchase17, @baskaryan

cc: @SimasJan
1 year ago
Snehil Kumar a6ee646ef3
Update get_started.mdx (#8744)
- Description: Added a missing word and rearranged a sentence in the
documentation of Self Query Retrievers.,
  - Issue: NA,
  - Dependencies: NA,
  - Tag maintainer: @baskaryan,
  - Twitter handle: NA

Thanks for your time.
1 year ago
Bal Narendra Sapa bd61757423
add documentation for serializer function (#8769)
Description: Added necessary documentation for serializer functions

@baskaryan
1 year ago
rjanardhan3 affaaea87b
Updates fireworks (#8765)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: Updates to Fireworks Documentation, 
  - Issue: N/A,
  - Dependencies: N/A,
  - Tag maintainer: @rlancemartin,

---------

Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>
1 year ago
Bagatur 8c35fcb571
update rss doc (#8761) 1 year ago
Bagatur e45be8b3f6
bump 252 (#8759) 1 year ago
Bagatur 0d5a90f30a
Revert "add filter to sklearn vector store functions (#8113)" (#8760) 1 year ago
Ben Auffarth 6b007e2829
update repo username to langchain-ai (#8747)
Time for this minor update? @hwchase17
1 year ago
Lance Martin be638ad77d
Chatbots use case (#8554)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 115a77142a
support for arbitrary kwargs for llamacpp (#8727)
llamacpp params (per their own code) are unstable, so instead of
adding/deleting them constantly adding a model_kwargs parameter that
allows for arbitrary additional kwargs

cc @jsjolund and @zacps re #8599 and #8704
1 year ago
Alec Flett f0b0c72d98
add `load()` deserializer function that bypasses need for json serialization (#7626)
There is already a `loads()` function which takes a JSON string and
loads it using the Reviver

But in the callbacks system, there is a `serialized` object that is
passed in and that object is already a deserialized JSON-compatible
object. This allows you to call `load(serialized)` and bypass
intermediate JSON encoding.

I found one other place in the code that benefited from this
short-circuiting (string_run_evaluator.py) so I fixed that too.

Tagging @baskaryan for general/utility stuff.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Ruiqi Guo 6aee589eec
Add ScaNN support in vectorstore. (#8251)
Description: Add ScaNN vectorstore to langchain.
ScaNN is a Open Source, high performance vector similarity library
optimized for AVX2-enabled CPUs.
https://github.com/google-research/google-research/tree/master/scann

- Dependencies: scann

Python notebook to illustrate the usage:
docs/extras/integrations/vectorstores/scann.ipynb
Integration test:
libs/langchain/tests/integration_tests/vectorstores/test_scann.py

@rlancemartin, @eyurtsev for review.

Thanks!
1 year ago
Moonsik Kang 5b7ff215e8
Fix load map reduce documents chain (#7915)
This PR updates _load_reduce_documents_chain to handle
`reduce_documents_chain` and `combine_documents_chain` config

Please review @hwchase17, @baskaryan

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
shibuiwilliam 0f0ccfe7f6
add filter to sklearn vector store functions (#8113)
# What
- This is to add filter option to sklearn vectore store functions

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: Add filter to sklearn vectore store functions.
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
shibuiwilliam 2759e2d857
add save and load tfidf vectorizer and docs for TFIDFRetriever (#8112)
This is to add save_local and load_local to tfidf_vectorizer and docs in
tfidf_retriever to make the vectorizer reusable.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
- Description: add save_local and load_local to tfidf_vectorizer and
docs in tfidf_retriever
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
aerickson-clt 0f68054401
Issue #8089 Improve painless script scoring with params.query_value. (#8086)
This is a minor improvement that replaces the full query_vector with the
reference string `params.query_value` used in the painless scripting
docs. I have tested it manually and it works on an example. This makes
the query about half the size and much easier to read.


https://opensearch.org/docs/latest/search-plugins/knn/painless-functions/#get-started-with-k-nns-painless-scripting-functions

@babbldev 
#8089

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
linpan 0ead8ea708
typo: ignored to ignore (#8740)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
aerickson-clt c7ea6e9ff8
Issue 8081 Fix query results size bug. Other bug: pass vector_field param. (#8085)
@baskaryan
#8081 

Likely the reason why the issue occurred is that OpenSearch's default k
is 10, so it needs to be specified.

Here's a similar question about its cousin ElasticSearch

https://discuss.elastic.co/t/elasticsearch-returns-only-10-records-but-the-hit-is-507/136605

I tested this manually and also fixed the same issue in
`_default_painless_scripting_query`. In addition,
`_default_painless_scripting_query` was not passing the `vector_field`
name to a sub call, so I fixed that too.


![image](https://github.com/hwchase17/langchain/assets/32244272/cfb7aad1-f701-49d9-9beb-a723aa276817)

I also tested this in the aws opensearch developer tools.


![image](https://github.com/hwchase17/langchain/assets/32244272/24544682-1578-4bbb-9eb5-980463c5b41b)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Sidchat95 812419d946
Removing score threshold parameter of faiss _similarity_search_with_r… (#8093)
Removing score threshold parameter of faiss
_similarity_search_with_relevance_scores as the thresholding part is
implemented in similarity_search_with_relevance_scores method which
calls this method.

As this method is supposed to be a private method of faiss.py this will
never receive the score threshold parameter as it is popped in the super
method similarity_search_with_relevance_scores.

@baskaryan @hwchase17
1 year ago
Mathias Panzenböck 873a80e496
Reduce generation of temporary objects (#7950)
Just a tiny change to use `list.append(...)` and `list.extend(...)`
instead of `list += [...]` so that no unnecessary temporary lists are
created.

Since its a tiny miscellaneous thing I guess @baskaryan is the
maintainer to tag?

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Lance Martin d1b95db874
Retriever that can re-phase user inputs (#8026)
Simple retriever that applies an LLM between the user input and the
query pass the to retriever.

It can be used to pre-process the user input in any way.

The default prompt:

```
DEFAULT_QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant tasked with taking a natural languge query from a user
    and converting it into a query for a vectorstore. In this process, you strip out
    information that is not relevant for the retrieval task. Here is the user query: {question} """
)
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 6c3573e7f6
Harrison/aleph alpha (#8735)
Co-authored-by: PiotrMazurek <piotr.mazurek@aleph-alpha.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Wilson Leao Neto 179a39954d
Provides access to a Document page_content formatter in the AmazonKendraRetriever (#8034)
- Description: 
- Provides a new attribute in the AmazonKendraRetriever which processes
a ResultItem and returns a string that will be used as page_content;
- The excerpt metadata should not be changed, it will be kept as was
retrieved. But it is cleaned when composing the page_content;
    - Refactors the AmazonKendraRetriever to improve code reusability;
- Issue: #7787 
- Tag maintainer: @3coins @baskaryan
- Twitter handle: wilsonleao

**Why?**

Some use cases need to adjust the page_content by dynamically combining
the ResultItem attributes depending on the context of the item.
1 year ago
Ilya 6f0bccfeb5
Add regex control over separators in character text splitter (#7933)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
#7854

Added the ability to use the `separator` ase a regex or a simple
character.
Fixed a bug where `start_index` was incorrectly counting from -1.

Who can review?
@eyurtsev
@hwchase17 
@mmz-001
1 year ago
Vasileios Mansolas e68a1d73d0
Fix Issue #6650: Enable Azure Active Directory token-based auth access for AzureChatOpenAI (#8622)
When using AzureChatOpenAI the openai_api_type defaults to "azure". The
utils' get_from_dict_or_env() function triggered by the root validator
does not look for user provided values from environment variables
OPENAI_API_TYPE, so other values like "azure_ad" are replaced with
"azure". This does not allow the use of token-based auth.

By removing the "default" value, this allows environment variables to be
pulled at runtime for the openai_api_type and thus enables the other
api_types which are expected to work.

This fixes #6650

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Ofer Mendelevitch 29f51055e8
Updates to Vectara documentation (#8699)
- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Alec Flett 5d765408ce
propagate callbacks through load_summarize_chain (#7565)
This lets you pass callbacks when you create the summarize chain:

```
summarize = load_summarize_chain(llm, chain_type="map_reduce", callbacks=[my_callbacks])
summary = summarize(documents)
```
See #5572 for a similar surgical fix.

tagging @hwchase17 for callbacks work

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Alec Flett 404d103c41
propagate RetrievalQA chain callbacks through its own LLMChain and StuffDocumentsChain (#7853)
This is another case, similar to #5572 and #7565 where the callbacks are
getting dropped during construction of the chains.

tagging @hwchase17 and @agola11 for callbacks propagation

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Bal Narendra Sapa 47eea32f6a
add serializer methods (#7914)
Description: I have added two methods serializer and deserializer
methods. There was method called save local but it saves the to the
local disk. I wanted the vectorstore in the format using which i can
push it to the sql database's blob field. I have used this while i was
working on something

@rlancemartin, @eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Ryan Sloan b786335dd1
fix RecursiveUrlLoader (#8582)
Description: the recursive url loader does not fully crawl for all urls
under base url
Maintainer: @baskaryan
1 year ago
William FH f81e613086
Fix Async Retry Event Handling (#8659)
It fails currently because the event loop is already running.

The `retry` decorator alraedy infers an `AsyncRetrying` handler for
coroutines (see [tenacity
line](aa6f8f0a24/tenacity/__init__.py (L535)))
However before_sleep always gets called synchronously (see [tenacity
line](aa6f8f0a24/tenacity/__init__.py (L338))).


Instead, check for a running loop and use that it exists. Of course,
it's running an async method synchronously which is not _nice_. Given
how important LLMs are, it may make sense to have a task list or
something but I'd want to chat with @nfcampos on where that would live.

This PR also fixes the unit tests to check the handler is called and to
make sure the async test is run (it looks like it's just been being
skipped). It would have failed prior to the proposed fixes but passes
now.
1 year ago
ruze 8ef7e14a85
RSS Feed / OPML loader (#8694)
Replace this comment with:
- Description: added a document loader for a list of RSS feeds or OPML.
It iterates through the list and uses NewsURLLoader to load each
article.
  - Issue: N/A
  - Dependencies: feedparser, listparser
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @ruze

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago