Commit Graph

2432 Commits

Author SHA1 Message Date
Harrison Chase
ca1afa7213
add test for structured tools (#5989) 2023-06-10 14:37:26 -07:00
constDave
5f356b9993
Fixed typo missing "use" (#5991)
<!--
Fixed a simple typo on
https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/vectorstore.html
where the word "use" was missing.

#### Who can review?

Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-10 14:31:58 -07:00
Kaarthik Andavar
d6f5d0c6b1
Fix: SnowflakeLoader returning empty documents (#5967)
**Fix SnowflakeLoader's Behavior of Returning Empty Documents**

**Description:**

This PR addresses the issue where the SnowflakeLoader was consistently
returning empty documents. After investigation, it was found that the
query method within the SnowflakeLoader was not properly fetching and
processing the data.

**Changes:**

1. Modified the query method in SnowflakeLoader to handle data fetch and
processing more accurately.
2. Enhanced error handling within the SnowflakeLoader to catch and log
potential issues that may arise during data loading.

**Impact:**

This fix will ensure the SnowflakeLoader reliably returns the expected
documents instead of empty ones, improving the efficiency and
reliability of data processing tasks in the LangChain project.

Before Fix:

`[
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={}),
    Document(page_content='', metadata={})
]`

After Fix:

`[Document(page_content='CUSTOMER_ID: 1\nFIRST_NAME: John\nLAST_NAME:
Doe\nEMAIL: john.doe@example.com\nPHONE: 555-123-4567\nADDRESS: 123 Elm
St, San Francisco, CA 94102', metadata={}),
Document(page_content='CUSTOMER_ID: 2\nFIRST_NAME: Jane\nLAST_NAME:
Doe\nEMAIL: jane.doe@example.com\nPHONE: 555-987-6543\nADDRESS: 456 Oak
St, San Francisco, CA 94103', metadata={}),
Document(page_content='CUSTOMER_ID: 3\nFIRST_NAME: Michael\nLAST_NAME:
Smith\nEMAIL: michael.smith@example.com\nPHONE: 555-234-5678\nADDRESS:
789 Pine St, San Francisco, CA 94104', metadata={}),
Document(page_content='CUSTOMER_ID: 4\nFIRST_NAME: Emily\nLAST_NAME:
Johnson\nEMAIL: emily.johnson@example.com\nPHONE: 555-345-6789\nADDRESS:
321 Maple St, San Francisco, CA 94105', metadata={}),
Document(page_content='CUSTOMER_ID: 5\nFIRST_NAME: David\nLAST_NAME:
Williams\nEMAIL: david.williams@example.com\nPHONE:
555-456-7890\nADDRESS: 654 Birch St, San Francisco, CA 94106',
metadata={}), Document(page_content='CUSTOMER_ID: 6\nFIRST_NAME:
Emma\nLAST_NAME: Jones\nEMAIL: emma.jones@example.com\nPHONE:
555-567-8901\nADDRESS: 987 Cedar St, San Francisco, CA 94107',
metadata={}), Document(page_content='CUSTOMER_ID: 7\nFIRST_NAME:
Oliver\nLAST_NAME: Brown\nEMAIL: oliver.brown@example.com\nPHONE:
555-678-9012\nADDRESS: 147 Cherry St, San Francisco, CA 94108',
metadata={}), Document(page_content='CUSTOMER_ID: 8\nFIRST_NAME:
Sophia\nLAST_NAME: Davis\nEMAIL: sophia.davis@example.com\nPHONE:
555-789-0123\nADDRESS: 369 Walnut St, San Francisco, CA 94109',
metadata={}), Document(page_content='CUSTOMER_ID: 9\nFIRST_NAME:
James\nLAST_NAME: Taylor\nEMAIL: james.taylor@example.com\nPHONE:
555-890-1234\nADDRESS: 258 Hawthorn St, San Francisco, CA 94110',
metadata={}), Document(page_content='CUSTOMER_ID: 10\nFIRST_NAME:
Isabella\nLAST_NAME: Wilson\nEMAIL: isabella.wilson@example.com\nPHONE:
555-901-2345\nADDRESS: 963 Aspen St, San Francisco, CA 94111',
metadata={})]
`

**Tests:**

All unit and integration tests have been run and passed successfully.
Additional tests were added to validate the new behavior of the
SnowflakeLoader.

**Checklist:**

- [x] Code changes are covered by tests
- [x] Code passes `make format` and `make lint`
- [x] This PR does not introduce any breaking changes

Please review and let me know if any changes are required.
2023-06-10 13:03:50 -07:00
Harrison Chase
62ec10a7f5
bump version to 196 (#5988) 2023-06-10 09:06:35 -07:00
German Martin
736a1819aa
LOTR: Lord of the Retrievers. A retriever that merge several retrievers together applying document_formatters to them. (#5798)
"One Retriever to merge them all, One Retriever to expose them, One
Retriever to bring them all and in and process them with Document
formatters."

Hi @dev2049! Here bothering people again!

I'm using this simple idea to deal with merging the output of several
retrievers into one.
I'm aware of DocumentCompressorPipeline and
ContextualCompressionRetriever but I don't think they allow us to do
something like this. Also I was getting in trouble to get the pipeline
working too. Please correct me if i'm wrong.

This allow to do some sort of "retrieval" preprocessing and then using
the retrieval with the curated results anywhere you could use a
retriever.
My use case is to generate diff indexes with diff embeddings and sources
for a more colorful results then filtering them with one or many
document formatters.

I saw some people looking for something like this, here:
https://github.com/hwchase17/langchain/issues/3991
and something similar here:
https://github.com/hwchase17/langchain/issues/5555

This is just a proposal I know I'm missing tests , etc. If you think
this is a worth it idea I can work on tests and anything you want to
change.
Let me know!

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-10 08:41:02 -07:00
Lance Martin
f3e7ac0a2c
Add load() to snowflake loader (#5956)
Quick fix for recently added [snowflake data
loader](https://github.com/hwchase17/langchain/pull/5825/files).
2023-06-09 11:27:29 -07:00
Harrison Chase
3678cba0be
bump ver to 195 (#5949) 2023-06-09 09:17:08 -07:00
Harrison Chase
7af186fddf
fixes to docs (#5919) 2023-06-09 09:15:53 -07:00
Kacper Łukawski
7cc200766e
Expose full params in Qdrant (#5947)
# Expose full params in Qdrant

There were many questions regarding supporting some additional
parameters in Qdrant integration. Qdrant supports many vector search
optimizations that were impossible to use directly in Qdrant before.
That includes:

1. Possibility to manipulate collection params while using
`Qdrant.from_texts`. The PR allows setting things such as quantization,
HNWS config, optimizers config, etc. That makes it consistent with raw
`QdrantClient`.
2. Extended options while searching. It includes HNSW options, exact
search, score threshold filtering, and read consistency in distributed
mode.

After merging that PR, #4858 might also be closed.

## Who can review?

VectorStores / Retrievers / Memory

@dev2049 @hwchase17
2023-06-09 08:56:32 -07:00
Rubén Martínez
db7ef635c0
Add support for the endpoint URL in DynamoDBChatMesasgeHistory (#5836)
This PR adds the possibility of specifying the endpoint URL to AWS in
the DynamoDBChatMessageHistory, so that it is possible to target not
only the AWS cloud services, but also a local installation.

Specifying the endpoint URL, which is normally not done when addressing
the cloud services, is very helpful when targeting a local instance
(like [Localstack](https://localstack.cloud/)) when running local tests.

Fixes #5835

#### Who can review?

Tag maintainers/contributors who might be interested: @dev2049

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-08 23:21:11 -07:00
Lior
0eb1bc1a02
Fix the issue where the parameters passed to VertexAI ignored #5889 (#5891)
Fixes #5889 and fixes the name of the argument in init_vertexai
@hwchase17
@agola11

Co-authored-by: Lior Durahly <lior.durahly@superwise.ai>
2023-06-08 23:15:22 -07:00
Fei Wang
63fcf41bea
Fix openai proxy error (#5914)
Fixes proxy error.
Since openai does not parse proxy parameters and uses openai.proxy
directly, the proxy method needs to be modified.


7610c5adfa/openai/api_requestor.py (LL90)

#### Who can review?
  @hwchase17 - project lead

  Models
  - @hwchase17
  - @agola11

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-08 23:15:06 -07:00
felpigeon
2791a753bf
Add start index to metadata in TextSplitter (#5912)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

#### Add start index to metadata in TextSplitter

- Modified method `create_documents` to track start position of each
chunk
- The `start_index` is included in the metadata if the `add_start_index`
parameter in the class constructor is set to `True`

This enables referencing back to the original document, particularly
useful when a specific chunk is retrieved.

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:
@eyurtsev @agola11
<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-08 23:09:32 -07:00
Philip Kiely - Baseten
a09a0e3511
Baseten integration (#5862)
This PR adds a Baseten integration. I've done my best to follow the
contributor's guidelines and add docs, an example notebook, and an
integration test modeled after similar integrations' test.

Please let me know if there is anything I can do to improve the PR. When
it is merged, please tag https://twitter.com/basetenco and
https://twitter.com/philip_kiely as contributors (the note on the PR
template said to include Twitter accounts)
2023-06-08 23:05:57 -07:00
Tamara Lazarevic
0ce8745928
Fix typo (#5894) 2023-06-08 23:05:22 -07:00
Andrew Grangaard
d8ae925425
arxiv: Correct name of search client attribute to 'arxiv_search' from incorrect 'arxiv_client' (#5917)
+ this private attribute is referenced as `arxiv_search` in internal
usage and is set when verifying the environment

twitter: @spazm 


#### Who can review?

Any of @hwchase17, @leo-gan, or @bongsang might be interested in
reviewing.

+ Mismatch between `arxiv_client` attribute vs `arxiv_search` in
validation and usage is present in the initial commit by @hwchase17.
+ @leo-gan has made most of the edits.
+ @bongsang implemented pdf download.
2023-06-08 22:49:11 -07:00
sergiolrinditex
fe8bbc2da7
Create snowflake Loader (#5825)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-08 22:03:00 -07:00
Zander Chase
77c286cf02
Use LCP Client in Tracer (#5908)
Move the LCP calls to the client.
2023-06-08 21:15:14 -07:00
Frank Hübner
3ec6400d70
Feature/add AWS Kendra Index Retriever (#5856)
adding a new retriever for AWS Kendra

@dev2049 please take a look!
2023-06-08 15:44:09 -07:00
Piyush Jain
a6ebffb695
Fixes model arguments for amazon models (#5896)
Fixes #5713 
#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17
@agola11
@aarora79
@rsgrewal-aws
2023-06-08 14:16:01 -07:00
小铭
767fa91eae
Fix the shortcut conflict for document page search (#5874)
Fix the document page to open both search and Mendable when pressing
Ctrl+K.
I have changed the shortcut for Mendable to Ctrl+J.



<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?
  @hwchase17
Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-08 14:15:19 -07:00
Zander Chase
5f74db4500
Update run eval imports in init (#5858) 2023-06-08 10:44:36 -07:00
warjiang
511c12dd39
fix: update qa_chain doc for "chai_type" (#5877)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->
`load_qa_with_sources_chain` method already support four type of chain,
including `map_rerank`. update document to prevent any misunderstandings
😀.

![image](https://github.com/hwchase17/langchain/assets/6478745/325260b2-6121-4900-aef9-001febff811a)

<!-- Remove if not applicable -->

Fixes # (issue)
No, just update document.

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?
@hwchase17 
Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-08 07:32:51 -07:00
Harrison Chase
893d20f735
bump version to 194 (#5866) 2023-06-07 22:47:48 -07:00
Harrison Chase
35cfd25db3
Harrison/nebula graph (#5865)
Co-authored-by: Wey Gu <weyl.gu@gmail.com>
Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>
2023-06-07 21:56:43 -07:00
Harrison Chase
658f8bdee7
Harrison/fauna loader (#5864)
Co-authored-by: Shadid12 <Shadid12@users.noreply.github.com>
2023-06-07 21:32:23 -07:00
Liang Zhang
5518f24ec3
Implement saving and loading of RetrievalQA chain (#5818)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes #3983
Mimicing what we do for saving and loading VectorDBQA chain, I added the
logic for RetrievalQA chain.
Also added a unit test. I did not find how we test other chains for
their saving and loading functionality, so I just added a file with one
test case. Let me know if there are recommended ways to test it.

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:
@dev2049
<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-07 21:07:13 -07:00
Liang Zhang
b93638ef1e
Refactor and update databricks integration page (#5575)
# Your PR Title (What it does)

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes # (issue)

## Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-07 20:45:47 -07:00
volodymyr-memsql
a1549901ce
Added SingleStoreDB Vector Store (#5619)
- Added `SingleStoreDB` vector store, which is a wrapper over the
SingleStore DB database, that can be used as a vector storage and has an
efficient similarity search.
- Added integration tests for the vector store
- Added jupyter notebook with the example

@dev2049

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-07 20:45:33 -07:00
jjzhuo
78aa59c68b
Fix serialization issue with W&B (#5693)
The chain input_documents are not displaying properly in W&B, due to
serialization issue:

<img width="1164" alt="Screenshot 2023-06-04 at 11 58 26 AM"
src="https://github.com/hwchase17/langchain/assets/134809928/f31f14f6-0935-4cca-9913-6760cd40eadf">

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-07 20:44:59 -07:00
Alec Flett
ec0dd6e34a
propagate callbacks to ConversationalRetrievalChain (#5572)
# Allow callbacks to monitor ConversationalRetrievalChain

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

I ran into an issue where load_qa_chain was not passing the callbacks
down to the child LLM chains, and so made sure that callbacks are
propagated. There are probably more improvements to do here but this
seemed like a good place to stop.

Note that I saw a lot of references to callbacks_manager, which seems to
be deprecated. I left that code alone for now.



## Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@agola11
<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-07 20:25:21 -07:00
Jeff Vestal
3294774148
Add knn and query search field options to ElasticKnnSearch (#5641)
in the `ElasticKnnSearch` class added 2 arguments that were not exposed
properly

`knn_search` added:
- `vector_query_field: Optional[str] = 'vector'`
-- vector_query_field: Field name to use in knn search if not default
'vector'

`knn_hybrid_search` added:
- `vector_query_field: Optional[str] = 'vector'`
-- vector_query_field: Field name to use in knn search if not default
'vector'
- `query_field: Optional[str] = 'text'`
-- query_field: Field name to use in search if not default 'text'



Fixes # https://github.com/hwchase17/langchain/issues/5633


cc: @dev2049 @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-07 20:19:14 -07:00
Mark Marryatt
cef79ca579
Fix exporting GCP Vertex Matching Engine from vectorstores (#5793)
The Vertex Matching Engine docs include [the
line](b177a29d3f/docs/modules/indexes/vectorstores/examples/matchingengine.ipynb (L32))
`from langchain.vectorstores import MatchingEngine` which doesn't work
as it wasn't added to the vectorestores module exports.



  - @dev2049
2023-06-07 19:45:33 -07:00
Dave Ingram
106364a45c
Update to Getting Started docs page for Memory (#5855)
Simply fixing a small typo in the memory page. 

Also removed an extra code block at the end of the file.

Along the way, the current outputs seem to have changed in a few places
so left that for posterity, and updated the number of runs which seems
harmless, though I can clean that up if preferred.
2023-06-07 19:45:21 -07:00
bnassivet
9355e3f5f5
qdrant vector store - search with relevancy scores (#5781)
Implementation of similarity_search_with_relevance_scores for quadrant
vector store.
As implemented the method is also compatible with other capacities such
as filtering.

Integration tests updated.


#### Who can review?

Tag maintainers/contributors who might be interested:

  VectorStores / Retrievers / Memory
  - @dev2049
2023-06-07 19:26:40 -07:00
Ning Ren
f15763518a
docs: add Shale Protocol integration guide (#5814)
This PR adds documentation for Shale Protocol's integration with
LangChain.

[Shale Protocol](https://shaleprotocol.com) provides forever-free
production-ready inference APIs to the open-source community. We have
global data centers and plan to support all major open LLMs (estimated
~1,000 by 2025).

The team consists of software and ML engineers, AI researchers,
designers, and operators across North America and Asia. Combined
together, the team has 50+ years experience in machine learning, cloud
infrastructure, software engineering and product development. Team
members have worked at places like Google and Microsoft.

#### Who can review?

Tag maintainers/contributors who might be interested:

  - @hwchase17
  - @agola11

---------

Co-authored-by: Karen Sheng <46656667+karensheng@users.noreply.github.com>
2023-06-07 19:25:59 -07:00
Duarte OC
137da7e4b6
Update microsoft loader example with docx2txt dependency (#5832)
@eyurtsev
2023-06-07 19:21:48 -07:00
Aidan Holland
9f4b720a63
Add additional VertexAI Params (#5837)
## Changes

- Added the `stop` param to the `_VertexAICommon` class so it can be set
at llm initialization

## Example Usage

```python
VertexAI(
    # ...
    temperature=0.15,
    max_output_tokens=128,
    top_p=1,
    top_k=40,
    stop=["\n```"],
)
```

## Possible Reviewers

- @hwchase17 
- @agola11
2023-06-07 19:20:37 -07:00
Eduard van Valkenburg
76fcd96dae
Add logging in PBI tool (#5841)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

Add some logging into the powerbi tool so that you can see the queries
being sent to PBI and attempts to correct them.

<!-- Remove if not applicable -->

Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested: @vowelparrot 

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-07 19:19:21 -07:00
Matt Robinson
11fec7d4d1
feat: Add UnstructuredCSVLoader for CSV files (#5844)
### Summary

Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using
`UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if
you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML
representation of the table will be available in the metadata.

#### Who can review?

@hwchase17
 @eyurtsev
2023-06-07 19:18:01 -07:00
Soos3D
0b4a51930c
Add how to use a custom scraping function with the sitemap loader. (#5847)
Hi! I just added an example of how to use a custom scraping function
with the sitemap loader. I recently used this feature and had to dig in
the source code to find it. I thought it might be useful to other devs
to have an example in the Jupyter Notebook directly.

I only added the example to the documentation page. 

@eyurtsev I was not able to run the lint. Please let me know if I have
to do anything else.

I know this is a very small contribution, but I hope it will be
valuable. My Twitter handle is @web3Dav3.

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
2023-06-07 19:16:51 -07:00
Yessen Kanapin
c66755b661
Add DeepInfra embeddings integration with tests and examples, better exception handling for Deep Infra LLM (#5854)
#### Who can review?

Tag maintainers/contributors who might be interested:
  @hwchase17 - project lead
  - @agola11

---------

Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>
2023-06-07 19:14:30 -07:00
ugfly1210
4d8cda1c3b
FIX: backslash escaped (#5815)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

LatexTextSplitter needs to use "\n\\\chapter" when separators are
escaped, such as "\n\\\chapter", otherwise it will report an error:
(re.error: bad escape \c at position 1 (line 2, column 1))


Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use

re.error: bad escape \c at position 1 (line 2, column 1)

See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

@hwchase17  @dev2049 

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

Co-authored-by: Pang <ugfly@qq.com>
2023-06-07 16:01:07 -07:00
Zander Chase
3af36943e8
Rm extraneous args to the trace group helper (#5801)
These are being ignored
2023-06-07 13:09:29 -07:00
whysage
8ef7274ee6
feat: issue-5712 add sleep tool (#5715)
Fixes # 5712 added sleep tool
2023-06-07 09:39:02 -07:00
Zander Chase
d9fcc45d05
Add in the async methods and link the run id (#5810) 2023-06-07 08:27:44 -07:00
Harrison Chase
ce7c11625f
bump version to 193 (#5838) 2023-06-07 07:38:57 -07:00
warjiang
5a207cce8f
fix: fullfill openai params when embedding (#5821)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes #5822 
I upgrade my langchain lib by execute `pip install -U langchain`, and
the verion is 0.0.192。But i found that openai.api_base not working. I
use azure openai service as openai backend, the openai.api_base is very
import for me. I hava compared tag/0.0.192 and tag/0.0.191, and figure
out that:

![image](https://github.com/hwchase17/langchain/assets/6478745/e183fdb2-8224-45c9-b3b4-26d62823999a)
openai params is moved inside `_invocation_params` function,and used in
some openai invoke:

![image](https://github.com/hwchase17/langchain/assets/6478745/5a55a048-5fa9-4bf4-aaef-3902226bec5e)

![image](https://github.com/hwchase17/langchain/assets/6478745/85b8cebc-eeb8-4538-a525-814719c8f8df)
but still some case not covered like:

![image](https://github.com/hwchase17/langchain/assets/6478745/e0297620-f2b2-4f4f-98bd-d0ed19022dac)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17 

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049

 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-07 07:32:57 -07:00
Harrison Chase
b3ae6bcd3f
bump ver to 192 (#5812) 2023-06-06 22:23:11 -07:00
Harrison Chase
5468528748
rm docs mongo (#5811) 2023-06-06 22:22:44 -07:00