Commit Graph

569 Commits

Author SHA1 Message Date
Guy Korland
39a5d02225
Cleanup of ruff warnings use isinstance() instead of type() (#9655)
Minor cosmetic PR: cleans up `ruff` warnings by using `isinstance()`
instead of `type()`.
2023-08-23 07:14:31 -07:00
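As a rough illustration of the cleanup in 39a5d02225 (the class and function names below are hypothetical and not taken from the commit), an exact `type()` comparison ignores subclasses, while `isinstance()` accepts them:

```python
class Base:
    """Hypothetical base class used only for this illustration."""


class Child(Base):
    """Hypothetical subclass of Base."""


def is_base_strict(obj: object) -> bool:
    # Exact type comparison: returns False for subclasses of Base.
    # This is the style of check that ruff warns about.
    return type(obj) == Base  # noqa: E721


def is_base(obj: object) -> bool:
    # isinstance() also accepts instances of subclasses.
    return isinstance(obj, Base)


child = Child()
strict_result = is_base_strict(child)
idiomatic_result = is_base(child)
```

The two checks differ only for subclass instances, which is usually the case the caller cares about.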
Joseph McElroy
2a06e7b216
ElasticsearchStore: improve error logging for adding documents (#9648)
It is not obvious what the error is when documents cannot be indexed.
This PR adds the ability to log the first error's reason, to help the
user diagnose the issue.

Also added some more documentation for when you want to use the
vectorstore with an embedding model deployed in elasticsearch.

Credit: @elastic and @phoey1
2023-08-23 07:04:09 -07:00
Julien Salinas
f1072cc31f
Merge branch 'master' into master 2023-08-23 14:42:40 +02:00
Jun Liu
b379c5f9c8
Fixed the error on ConfluenceLoader when content_format=VIEW and keep_markdown_format=True (#9633)
- Description: a description of the change

When I set `content_format=ContentFormat.VIEW` and
`keep_markdown_format=True` on ConfluenceLoader, it shows the following
error:
```
langchain/document_loaders/confluence.py", line 459, in process_page
    page["body"]["storage"]["value"], heading_style="ATX"
KeyError: 'storage'
```
The reason is that the content format was set to `view` but the loader
was still trying to get the content from `page["body"]["storage"]["value"]`.

Also added the other content formats that are supported by the Atlassian
API:

https://stackoverflow.com/questions/34353955/confluence-rest-api-expanding-page-body-when-retrieving-page-by-title/34363386#34363386

  - Issue: the issue # it fixes (if applicable),

Not applicable.

  - Dependencies: any dependencies required for this change,

Added optional dependency `markdownify` if anyone wants to extract in
markdown format.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-22 21:00:15 -07:00
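A minimal sketch of the idea behind b379c5f9c8, assuming a simplified two-member enum and a hypothetical `get_body` helper (neither is the loader's actual code): the key used to read the page body is derived from the requested content format instead of being hard-coded to `storage`.

```python
from enum import Enum


class ContentFormat(Enum):
    # Simplified stand-in for the loader's content formats (assumption).
    STORAGE = "storage"
    VIEW = "view"


def get_body(page: dict, content_format: ContentFormat) -> str:
    # Look up the body under the key that matches the requested format,
    # rather than always reading page["body"]["storage"].
    return page["body"][content_format.value]["value"]


# A page fetched with the `view` format only carries a "view" body.
page = {"body": {"view": {"value": "<p>Hello</p>"}}}
body = get_body(page, ContentFormat.VIEW)
```

Reading the same page with `ContentFormat.STORAGE` would still raise `KeyError: 'storage'`, which matches the traceback quoted in the commit message.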
Gabriel Fu
b2d9970fc1
Allow specifying dtype in langchain.llms.VLLM (#9635)
- Description: add `dtype` argument for VLLM 
  - Issue: #9593 
  - Dependencies: none
  - Tag maintainer: @hwchase17, @baskaryan
2023-08-22 20:21:56 -07:00
anifort
900c1f3e8d
Add support for structured data sources with google enterprise search (#9037)
- Description: Added the capability to handle structured data from
Google Enterprise Search,
- Issue: Retriever failed when the underlying search engine was
integrated with structured data,
  - Dependencies: google-api-core
  - Tag maintainer: @jarokaz
  - Twitter handle: anifort

---------

Co-authored-by: Christos Aniftos <aniftos@google.com>
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-08-22 23:18:10 -04:00
Harrison Chase
02545a54b3
python repl improvement for csv agent (#9618) 2023-08-22 17:06:18 -07:00
Erick Friis
fc64e6349e
Hub stub updates (#9577)
Updates the hub stubs to not fail when no API key is found, in order to
support singleton tenants and default values from SDK 0.1.6.

Also adds the ability to define is_public and description for backup
repo creation on push.
2023-08-22 16:05:41 -07:00
Kim Minjong
ca8232a3c1
Update BaseChatModel.astream to respect generation_info (#9430)
Currently, generation_info is not respected because only the messages
in chunks are reflected. Change it to add generations so that generation
chunks are merged properly.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-08-22 15:18:24 -07:00
Bagatur
81163e3c0c
parent retriever nit (#9570)
if ids are nullable seems like they should have default val None.
mirrors VectorStore interface as well. cc @mcantillon21 @jacoblee93
2023-08-22 14:58:16 -04:00
Myeongseop Kim
f1e602996a
import tqdm.auto instead of tqdm for OpenAIEmbeddings (#9584)
- Description: current code does not work very well on jupyter notebook,
so I changed the code so that it imports `tqdm.auto` instead.
  - Issue: #9582 
  - Dependencies: N/A
  - Tag maintainer: @hwchase17, @baskaryan
  - Twitter handle: N/A

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-08-22 14:54:07 -04:00
Predrag Gruevski
d564ec944c
poetry lock the experimental package. (#9478) 2023-08-22 14:09:35 -04:00
Predrag Gruevski
65e893b9cd
poetry lock on langchain. (#9476) 2023-08-22 14:09:23 -04:00
Predrag Gruevski
3c7cc4d440
Test experimental package with langchain on master branch. (#9621)
It's possible that langchain-experimental works fine with the latest
*published* langchain, but is broken with the langchain on `master`.
Unfortunately, you can see this is currently the case — this is why this
PR also includes a minor fix for the `langchain` package itself.

We want to catch situations like that *before* releasing a new
langchain, hence this test.
2023-08-22 13:35:21 -04:00
Eugene Yurtsev
3408810748
Add batch util (#9620)
Add `batch` utility to langchain
2023-08-22 12:31:18 -04:00
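The commit message for 3408810748 does not show the utility itself. A batching helper of this kind might look roughly like the sketch below; the name `batch_sketch` and its signature are assumptions made for illustration, not the function added by the PR.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_sketch(size: int, iterable: Iterable[T]) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        # Pull up to `size` items from the shared iterator.
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


batches = list(batch_sketch(2, [1, 2, 3, 4, 5]))
```

The last batch may be shorter than `size`; the helper never pads it.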
Bagatur
2b663089b5
bump 271 (#9615) 2023-08-22 08:10:22 -07:00
klae01
b868ef23bc
Add AINetwork blockchain toolkit integration (#9527)
# Description
This PR introduces a new toolkit for interacting with the AINetwork
blockchain. The toolkit provides a set of tools for performing various
operations on the AINetwork blockchain, such as transferring AIN,
reading and writing values to the blockchain database, managing apps,
setting rules and owners.

# Dependencies
[ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2

# Misc
The example notebook
(langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the
PR

---------

Co-authored-by: kriii <kriii@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-22 08:03:33 -07:00
Bagatur
e99ef12cb1
Bagatur/litellm model name (#9613)
Co-authored-by: ishaan-jaff <ishaanjaffer0324@gmail.com>
2023-08-22 07:44:00 -07:00
Harrison Chase
1720e99397
add variables for field names (#9563) 2023-08-22 07:43:21 -07:00
Anthony Mahanna
dfb9ff1079
bugfix: ArangoDB Empty Schema Case (#9574)
- Introduces a conditional in `ArangoGraph.generate_schema()` to exclude
empty ArangoDB Collections from the schema
- Add empty collection test case

Issue: N/A
Dependencies: None
2023-08-22 07:41:06 -07:00
Philippe PRADOS
d4c49b16e4
Fix ChatMessageHistory (#9594)
The initialization of the array of ChatMessageHistory is buggy.
The list is shared with all instances.
2023-08-22 07:36:36 -07:00
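The pitfall described in d4c49b16e4 can be reproduced in plain Python. The classes below are hypothetical stand-ins, not the real `ChatMessageHistory`: a list defined once at class level is shared by every instance, while a per-instance default factory gives each object its own list.

```python
from dataclasses import dataclass, field
from typing import List


class SharedHistory:
    # Class-level list: created once and shared by all instances.
    messages: List[str] = []

    def add(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class IsolatedHistory:
    # default_factory creates a fresh list for every instance.
    messages: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)


a, b = SharedHistory(), SharedHistory()
a.add("hi")
leaked = b.messages  # also contains "hi"

c, d = IsolatedHistory(), IsolatedHistory()
c.add("hi")
isolated = d.messages  # stays empty
```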
toddkim95
fba29f203a
Add support for polars (#9610)
### Description
Polars is a DataFrame interface on top of an OLAP Query Engine
implemented in Rust.
Polars is faster to read than pandas, so I'm looking forward to seeing
it added to the document loader.

### Dependencies
polars (https://pola-rs.github.io/polars-book/user-guide/)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-22 07:36:24 -07:00
Aashish Saini
3c4f32c8b8
Replacing Exception type from ValueError to ImportError (#9588)
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.

@eyurtsev , @baskaryan 

Thanks
2023-08-22 07:34:05 -07:00
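The pattern described in 3c4f32c8b8 can be sketched as follows. The helper name `import_optional` and the message wording are assumptions for illustration only; the commit applies the same idea inline at each import site.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency or raise a descriptive ImportError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise as ImportError (not ValueError) so callers can tell
        # that the problem is a missing module.
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {pip_name}`."
        ) from exc


json_module = import_optional("json", "json")
```

Callers that catch `ImportError` to fall back to another backend keep working, which would not be the case if a `ValueError` were raised.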
Julien Salinas
033b874701 Remove some deprecated text generation parameters. 2023-08-22 09:26:37 +02:00
Bagatur
4e7e6bfe0a revert 2023-08-21 18:01:49 -07:00
Bagatur
a9bf409a09 param 2023-08-21 17:37:07 -07:00
Bagatur
fa478638a9 Merge branch 'master' into bagatur/locals_in_config 2023-08-21 17:31:39 -07:00
Bagatur
182b059bf4 param 2023-08-21 17:31:38 -07:00
Bagatur
04f2d69b83
improve confluence doc loader param validation (#9568) 2023-08-21 15:02:36 -07:00
Zizhong Zhang
00eff8c4a7
feat: Add PromptGuard integration (#9481)
Add PromptGuard integration
-------
There are two approaches to integrate PromptGuard with a LangChain
application.

1. PromptGuardLLMWrapper
2. functions that can be used in LangChain expression.

-----
- Dependencies
`promptguard` python package, which is a runtime requirement if you want
to try out the demo.

- @baskaryan @hwchase17 Thanks for the ideas and suggestions along the
development process.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 14:59:36 -07:00
Sathindu
652c542b2f
fix: Imports for the ConfluenceLoader:process_page (#9432)
### Description
When loading documents with the `load` function of `ConfluenceLoader`,
if both `include_comments=True` and `keep_markdown_format=True` are set,
we get the error `NameError: free variable 'BeautifulSoup'
referenced before assignment in enclosing scope`.
    
    loader = ConfluenceLoader(url="URI", token="TOKEN")
    documents = loader.load(
        space_key="SPACE", 
        include_comments=True, 
        keep_markdown_format=True, 
    )

This happens because the previous imports only considered the
`keep_markdown_format` parameter; however, including the comments also
uses `BeautifulSoup`.

Now it's fixed to handle all four scenarios considering both
`include_comments` and `keep_markdown_format`.

### Twitter
`@SathinduGA`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 13:44:52 -07:00
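The error in 652c542b2f can be reproduced with standard-library stand-ins. In the sketch below, `html.unescape` plays the role of `BeautifulSoup`, and the branch conditions are an assumption about how the original imports were arranged: a name imported on only one branch is later used inside a nested function.

```python
def load_buggy(include_comments: bool, keep_markdown_format: bool) -> str:
    if not keep_markdown_format:
        # Imported on one branch only, mirroring the reported bug.
        from html import unescape

    def render_comment(text: str) -> str:
        # Closure over `unescape`, which may never have been bound.
        return unescape(text)

    return render_comment("a &amp; b") if include_comments else ""


def load_fixed(include_comments: bool, keep_markdown_format: bool) -> str:
    if include_comments or not keep_markdown_format:
        # Import whenever any later code path needs the name.
        from html import unescape

    def render_comment(text: str) -> str:
        return unescape(text)

    return render_comment("a &amp; b") if include_comments else ""


fixed_result = load_fixed(include_comments=True, keep_markdown_format=True)
```

Calling `load_buggy(True, True)` raises a `NameError` about a free variable in the enclosing scope, while the fixed variant handles all four combinations of the two flags.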
Mike Salvatore
7c0b1b8171
Add session to ConfluenceLoader.__init__() (#9437)
- Description: Allows the user of `ConfluenceLoader` to pass a
`requests.Session` object in lieu of an authentication mechanism
- Issue: None
- Dependencies: None
- Tag maintainer: @hwchase17
2023-08-21 13:18:35 -07:00
Kim Minjong
3d1095218c
Update ChatOpenAI._astream to respect finish_reason (#9431)
Currently, ChatOpenAI._astream does not reflect finish_reason to
generation_info. Change it to reflect that.
2023-08-21 12:56:42 -07:00
Matthew Zeiler
949b2cf177
Improvements to the Clarifai integration (#9290)
- Improved docs
- Improved performance in multiple ways through batching, threading,
etc.
 - fixed error message 
 - Added support for metadata filtering during similarity search.

@baskaryan PTAL
2023-08-21 12:53:36 -07:00
ricki-epsilla
66a47d9a61
add Epsilla vectorstore (#9239)
[Epsilla](https://github.com/epsilla-cloud/vectordb) vectordb is an
open-source vector database that leverages the advanced academic
parallel graph traversal techniques for vector indexing.
This PR adds basic integration with
[pyepsilla](https://github.com/epsilla-cloud/epsilla-python-client)(Epsilla
vectordb python client) as a vectorstore.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 12:51:15 -07:00
Bagatur
dda5b1e370
Bagatur/doc loader confluence (#9524)
Co-authored-by: chanjetsdp <chanjetsdp@chanjet.com>
2023-08-21 12:40:44 -07:00
Predrag Gruevski
de1f63505b
Add py.typed file to langchain-experimental. (#9557)
The package is linted with mypy, so its type hints are correct and
should be exposed publicly. Without this file, the type hints remain
private and cannot be used by downstream users of the package.
2023-08-21 15:37:16 -04:00
Raynor Chavez
973866c894
fix: Updated marqo integration for marqo version 1.0.0+ (#9521)
- Description: Updated marqo integration to use tensor_fields instead of
non_tensor_fields. Upgraded marqo version to 1.2.4
  - Dependencies: marqo 1.2.4

---------

Co-authored-by: Raynor Kirkson E. Chavez <raynor.chavez@192.168.254.171>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 10:43:15 -07:00
Bagatur
c7a5bb6031
bump 270 (#9549) 2023-08-21 10:18:46 -07:00
Nuno Campos
28e1ee4891
Nc/small fixes 21aug (#9542)
2023-08-21 18:01:20 +01:00
Bagatur
d11841d760
bump 269 (#9487) 2023-08-21 08:34:16 -07:00
axiangcoding
05aa02005b
feat(llms): support ERNIE Embedding-V1 (#9370)
- Description: support [ERNIE
Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu),
which is part of ERNIE ecology
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 07:52:25 -07:00
José Ferraz Neto
f116e10d53
Add SharePoint Loader (#4284)
- Added a loader (`SharePointLoader`) that can pull documents (`pdf`,
`docx`, `doc`) from the [SharePoint Document
Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872).
- Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that
use [O365](https://github.com/O365/python-o365) Package
- Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-21 07:49:07 -07:00
Utku Ege Tuluk
bb4f7936f9
feat(llms): add streaming support to textgen (#9295)
- Description: Added streaming support to the textgen component in the
llms module.
  - Dependencies: websocket-client = "^1.6.1"
2023-08-21 07:39:14 -07:00
Eugene Yurtsev
02c5c13a6e
Fast linters go first (#9501)
Proposal to reverse the order of linters based on the principle of
running the fast ones first.
2023-08-21 00:20:54 -07:00
Ofer Mendelevitch
a758496236
Fixed issue with metadata in query (#9500)
- Description: Changed metadata retrieval so that it combines Vectara
doc level and part level metadata
  - Tag maintainer: @rlancemartin
  - Twitter handle: @ofermend
2023-08-20 16:00:14 -07:00
Eugene Yurtsev
e51bccdb28
Add strict flag to the JSON parser (#9471)
This updates the default configuration since I think it's almost always
what we want to happen. But we should evaluate whether there are any issues.
2023-08-19 22:02:12 -04:00
Taqi Jaffri
5cd244e9b7 CR feedback 2023-08-19 13:48:15 -07:00
Predrag Gruevski
be9bc62f8b
Fix bash test regex for Linux under WSL2. (#9475)
It fails with `Permission denied` and not `not found`. Both seem
reasonable.
2023-08-19 09:27:14 -04:00
Lorenzo
5b3dbf12a5
Uniform valid suffixes and clarify exceptions (#9463)
**Description**:
- Unified the current valid suffixes (file formats) for loading agents
from hubs and files (to better handle future additions);
 - Clarified exception messages (also in unit test).
2023-08-18 21:35:53 -07:00
Brendan Collins
9f545825b7
Added Geometry Validation, Geometry Metadata, and WKT instead of Python str() to GeoDataFrame Loader (#9466)
@rlancemartin The current implementation within `Geopandas.GeoDataFrame`
loader uses the python builtin `str()` function on the input geometries.
While this looks very close to WKT (Well known text), Python's str
function doesn't guarantee that.

In the interest of interop., I've changed to use of the `wkt` property
on the Shapely geometries for generating the text representation of the
geometries.

Also, included here:
- validation of the input `page_content_column` as being a GeoSeries.
- geometry `crs` (Coordinate Reference System) / bounds
(xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is
critical... having the bounds is just helpful!

I think there is a larger question of "Should the geometry live in the
`page_content`, or should the record be better summarized and tuck the
geom into metadata?" ...something for another day and another PR.
2023-08-18 21:35:39 -07:00
Kacper Łukawski
616e728ef9
Enhance qdrant vs using async embed documents (#9462)
This is an extension of #8104. I updated some of the signatures so all
the tests pass.

@danhnn I couldn't commit to your PR, so I created a new one. Thanks for
your contribution!

@baskaryan Could you please merge it?

---------

Co-authored-by: Danh Nguyen <dnncntt@gmail.com>
2023-08-18 18:59:48 -07:00
Matt Robinson
83d2a871eb
fix: apply unstructured preprocess functions (#9473)
### Summary

Fixes a bug from #7850 where post processing functions in Unstructured
loaders were not applied. Adds an assertion to the test to verify the
post processing function was applied and also updates the explanation in
the example notebook.
2023-08-18 18:54:28 -07:00
William FH
292ae8468e
Let you specify run id in trace as chain group (#9484)
I think we'll deprecate this soon anyway but still nice to be able to
fetch the run id
2023-08-18 17:21:53 -07:00
Predrag Gruevski
df8e35fd81
Remove incorrect ABC from two Elasticsearch classes. (#9470)
Neither is an ABC because their own example code instantiates them directly.
2023-08-18 15:01:02 -04:00
Predrag Gruevski
82f28ca9ef
ChatPromptTemplate is not an ABC, it's instantiated directly. (#9468)
Its own `__add__` method constructs `ChatPromptTemplate` objects
directly, it cannot be abstract.

Found while debugging something else with @nfcampos.
2023-08-18 14:37:10 -04:00
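To illustrate the reasoning in 82f28ca9ef with hypothetical classes (not the real `ChatPromptTemplate`): a class that inherits from `ABC` but declares no abstract methods can still be instantiated, so the `ABC` base is only a misleading marker, whereas a class with an unimplemented abstract method cannot be constructed at all.

```python
from abc import ABC, abstractmethod
from typing import List


class MarkedAbstract(ABC):
    """Inherits from ABC but declares no abstract methods."""

    def __init__(self, parts: List[str]) -> None:
        self.parts = list(parts)

    def __add__(self, other: "MarkedAbstract") -> "MarkedAbstract":
        # Constructs the class directly, so it has to be concrete.
        return MarkedAbstract(self.parts + other.parts)


class TrulyAbstract(ABC):
    @abstractmethod
    def format(self) -> str:
        """Subclasses must implement this."""


combined = MarkedAbstract(["a"]) + MarkedAbstract(["b"])

try:
    TrulyAbstract()
    instantiable = True
except TypeError:
    # Python refuses to instantiate a class with abstract methods.
    instantiable = False
```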
vamseeyarla
82fb56b79c
Issue 9401 - SequentialChain runs the same callbacks over and over in async mode (#9452)
Issue: https://github.com/langchain-ai/langchain/issues/9401

In the Async mode, SequentialChain implementation seems to run the same
callbacks over and over since it is re-using the same callbacks object.

Langchain version: 0.0.264, master

The implementation of this async route differs from the sync route. The
sync approach follows the right pattern of generating a new callbacks
object instead of re-using the old one, thus avoiding the cascading
run of callbacks at each step.

Async mode:
```
        _run_manager = run_manager or AsyncCallbackManagerForChainRun.get_noop_manager()
        callbacks = _run_manager.get_child()
        ...
        for i, chain in enumerate(self.chains):
            _input = await chain.arun(_input, callbacks=callbacks)
            ...
```

Regular mode:
```
        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        for i, chain in enumerate(self.chains):
            _input = chain.run(_input, callbacks=_run_manager.get_child(f"step_{i+1}"))
            ...
```

Notice how we are reusing the callbacks object in the Async code which
will have a cascading effect as we run through the chain. It runs the
same callbacks over and over resulting in issues.

Solution:
Defined the async function in the same pattern as the regular one and
added tests.
---------

Co-authored-by: vamsee_yarlagadda <vamsee.y@airbnb.com>
2023-08-18 11:26:12 -07:00
William FH
c29fbede59
Wfh/rm num repetitions (#9425)
Makes it hard to do test run comparison views and we'd probably want to
just run multiple runs right now
2023-08-18 10:08:39 -07:00
Predrag Gruevski
eee0d1d0dd
Update repository links in the package metadata. (#9454) 2023-08-18 12:55:43 -04:00
Bagatur
50b8f4dcc7
bump 268 (#9455) 2023-08-18 08:46:39 -07:00
Nuno Campos
354c42afd2 Lint 2023-08-18 15:30:30 +01:00
Nuno Campos
4452314aab Merge branch 'master' into bagatur/locals_in_config 2023-08-18 15:23:05 +01:00
Nuno Campos
d5eb228874
Add kwargs to all other optional runnable methods (#9439)
2023-08-18 15:04:26 +01:00
Leonid Ganeline
a3dd4dcadf
📖 docstrings retrievers consistency (#9422)
📜 
- updated the top-level descriptions to a consistent format;
- changed the format of several 100% internal functions from "name" to
"_name". So, these functions are not shown in the Top-level API
Reference page (with lists of classes/functions)
2023-08-18 09:20:39 -04:00
Nuno Campos
9417961b17
Add lock on tee peer cleanup (#9446)
2023-08-18 14:20:09 +01:00
Nuno Campos
d3f10d2f4f Update test 2023-08-18 11:36:16 +01:00
Nuno Campos
6ae58da668 Assign defaults in batch calls 2023-08-18 10:53:10 +01:00
Nuno Campos
ddcb4ff5fb Li t 2023-08-18 10:30:42 +01:00
Nuno Campos
1baedc4e18 Move patch_config 2023-08-18 10:28:39 +01:00
Nuno Campos
46f3850794 Lint 2023-08-18 10:25:41 +01:00
Nuno Campos
24a197f96a Merge branch 'master' into bagatur/locals_in_config 2023-08-18 10:12:10 +01:00
Nuno Campos
8ddaaf3d41 Move config helpers 2023-08-18 10:10:35 +01:00
Nuno Campos
a5e7dcec61 Lint 2023-08-18 10:03:28 +01:00
Nuno Campos
c1b1666ec8 Ensure config defaults apply even when a config is passed in 2023-08-18 10:02:29 +01:00
Nuno Campos
7fe474d198 Update snapshots 2023-08-18 10:02:11 +01:00
Jacob Lee
0689628489
Adds streaming for runnable maps (#9283)
@nfcampos @baskaryan

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-08-18 07:46:23 +01:00
Bagatur
ab21af71be wip 2023-08-17 17:28:02 -07:00
Bagatur
6f69b19ff5 wip tests 2023-08-17 16:45:52 -07:00
Bagatur
9e906c39ba nit 2023-08-17 16:22:22 -07:00
Bagatur
6b0a849f59 fix 2023-08-17 16:22:12 -07:00
Bagatur
c447e9a854 cr 2023-08-17 15:29:00 -07:00
Bagatur
bd80cad6db add 2023-08-17 13:52:19 -07:00
Bagatur
8c1a528c71 cr 2023-08-17 13:52:09 -07:00
Bagatur
25cbcd9374 merge 2023-08-17 13:03:28 -07:00
Aashish Saini
ce78877a87
Replaced instances of raising ValueError with raising ImportError. (#9388)
Refactored code to ensure consistent handling of ImportError. Replaced
instances of raising ValueError with raising ImportError.

The choice of raising a ValueError here is somewhat unconventional and
might lead to confusion for anyone reading the code. Typically, when
dealing with import-related errors, the recommended approach is to raise
an ImportError with a descriptive message explaining the issue. This
provides a clearer indication that the problem is related to importing
the required module.

@hwchase17 , @baskaryan , @eyurtsev 

Thanks
Aashish

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-17 12:24:08 -07:00
Bagatur
8c986221e4
make openapi_schema_pydantic opt (#9408) 2023-08-17 11:49:23 -07:00
Eugene Yurtsev
77b359edf5
More missing type annotations (#9406)
This PR fills in more missing type annotations on pydantic models. 

It's OK if it missed some annotations, we just don't want it to get
annotations wrong at this stage.

I'll do a few more passes over the same files!
2023-08-17 12:19:50 -04:00
Bagatur
a69d1b84f4
bump 267 (#9403) 2023-08-17 08:47:13 -07:00
Nuno Campos
c0d67420e5
Use a submodule for pydantic v1 compat (#9371)
2023-08-17 16:35:49 +01:00
Bagatur
995ef8a7fc
unpin pydantic (#9356) 2023-08-17 01:55:46 -07:00
Tong Gao
3c8e9a9641
Fix typos in eval_chain.py (#9365)
Fixed two minor typos.
2023-08-17 01:53:46 -07:00
Eugene Yurtsev
2673b3a314
Create pydantic v1 namespace in langchain (#9254)
Create pydantic v1 namespace in langchain experimental
2023-08-16 21:19:31 -07:00
Eugene Yurtsev
4c2de2a7f2
Adding missing types in some pydantic models (#9355)
* Adding missing types in some pydantic models -- this change is
required for making the code work with pydantic v2.
2023-08-16 20:10:34 -07:00
Harrison Chase
1c089cadd7
fix import v2 (#9346) 2023-08-16 17:33:01 -07:00
qqjettkgjzhxmwj
84a97d55e1
Fix typo in llm_router.py (#9322)
Fix typo
2023-08-16 15:56:44 -07:00
Joe Reuter
09aa1eac03
Airbyte loaders: Fix last_state getter (#9314)
This PR fixes the Airbyte loaders when doing incremental syncs. The
notebooks are calling out to access `loader.last_state` to get the
current state of incremental syncs, but this didn't work due to a
refactoring of how the loaders are structured internally in the original
PR.

This PR fixes the issue by adding a `last_state` property that forwards
the state correctly from the CDK adapter.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-16 15:56:33 -07:00
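The forwarding property described in 09aa1eac03 might look roughly like this. `FakeCdkAdapter`, `SketchLoader`, and the shape of the state dictionary are all hypothetical stand-ins, since the commit message does not show the real classes.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional


class FakeCdkAdapter:
    """Hypothetical adapter that tracks incremental sync state."""

    def __init__(self) -> None:
        self._state: Optional[Dict[str, Any]] = None

    def read(self, records: Iterable[str]) -> Iterator[str]:
        for index, record in enumerate(records):
            # Remember how far the sync has progressed.
            self._state = {"cursor": index}
            yield record

    def get_state(self) -> Optional[Dict[str, Any]]:
        return self._state


class SketchLoader:
    """Hypothetical loader that wraps the adapter."""

    def __init__(self, adapter: FakeCdkAdapter) -> None:
        self._adapter = adapter

    def load(self, records: Iterable[str]) -> List[str]:
        return list(self._adapter.read(records))

    @property
    def last_state(self) -> Optional[Dict[str, Any]]:
        # Forward the state from the wrapped adapter instead of
        # keeping a separate copy on the loader.
        return self._adapter.get_state()


loader = SketchLoader(FakeCdkAdapter())
state_before = loader.last_state
loaded = loader.load(["a", "b"])
state_after = loader.last_state
```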
Jakub Kuciński
8bebc9206f
Add improved sources splitting in BaseQAWithSourcesChain (#8716)
## Type:
Improvement

---

## Description:
Running QAWithSourcesChain sometimes raises ValueError as mentioned in
issue #7184:
```
ValueError: too many values to unpack (expected 2)
Traceback:

    response = qa({"question": pregunta}, return_only_outputs=True)
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 166, in __call__
    raise e
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 160, in __call__
    self._call(inputs, run_manager=run_manager)
File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\qa_with_sources\base.py", line 132, in _call
    answer, sources = re.split(r"SOURCES:\s", answer)
```
This is due to the LLM generating a subsequent question, answer and
sources, so the completion has a form similar to the one below:
```
<final_answer>
SOURCES: <sources>
QUESTION: <new_or_repeated_question>
FINAL ANSWER: <new_or_repeated_final_answer>
SOURCES: <new_or_repeated_sources>
```
It leads the following line
```
 re.split(r"SOURCES:\s", answer)
```
to return more than 2 elements and result in ValueError. The simple fix
is to split also with "QUESTION:\s" and take the first two elements:
```
answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2]
```

Sometimes the LLM might also generate other text, such as alternative
answers in the form:
```
<final_answer_1>
SOURCES: <sources>

<final_answer_2>
SOURCES: <sources>

<final_answer_3>
SOURCES: <sources>
```
In such cases it is best to split the previously obtained sources on
newlines and keep the first line:
```
sources = re.split(r"\n", sources.lstrip())[0]
```



---

## Issue:
Resolves #7184

---

## Maintainer:
@baskaryan
2023-08-16 13:30:15 -07:00
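The two splitting steps quoted in 8bebc9206f can be combined into a small runnable sketch. The function name `split_sources` and the sample completion are made up for illustration; the two regular expressions are the ones quoted in the commit message.

```python
import re
from typing import Tuple


def split_sources(completion: str) -> Tuple[str, str]:
    """Split an LLM completion into (answer, sources) (hypothetical helper)."""
    if re.search(r"SOURCES:\s", completion):
        # Split on both markers and keep only the first answer/sources pair.
        answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", completion)[:2]
        # Keep only the first line of the sources section.
        sources = re.split(r"\n", sources.lstrip())[0]
    else:
        answer, sources = completion, ""
    return answer, sources


raw = (
    "The answer.\n"
    "SOURCES: doc1, doc2\n"
    "QUESTION: again?\n"
    "FINAL ANSWER: again.\n"
    "SOURCES: doc3"
)
answer, sources = split_sources(raw)
```

With the original single-pattern split, the same input yields three elements and the two-name unpacking fails with the `ValueError` shown in the traceback.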
Bagatur
a3c79b1909
Add tiktoken integration dep (#9332) 2023-08-16 12:09:22 -07:00
Bagatur
ba5fbaba70
bump 266 (#9296) 2023-08-16 01:13:19 -07:00
axiangcoding
63601551b1
fix(llms): improve the ernie chat model (#9289)
- Description: improve the ernie chat model.
   - fix missing kwargs to payload
   - new test cases
   - add some debug level log
   - improve description
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan
2023-08-16 00:48:42 -07:00