Commit Graph

4395 Commits (890ed775a38e4f1d0b9648fc9e50bc2bc07c573b)
 

Author SHA1 Message Date
Nuno Campos 5426712311 Adjust merge logic 1 year ago
Nuno Campos f95bd0bcd9 Fix issue 1 year ago
Nuno Campos f69155b4f7 Add run_id, run_name to RunnableConfig 1 year ago
Nuno Campos a3c69cf41d Add .with_config() method to Runnables which allows binding any config values to a Runnable 1 year ago
jmhayes3 324c86acd5
fix typo in web_research.py (#10076)
fix spelling
1 year ago
Davide Menini 3f8f3de28e
fix (parsers/json): do not escape double quotes if already escaped (#9916)
This PR fixes an issues I found when upgrading to a more recent version
of Langchain. I was using 0.0.142 before, and this issue popped up
already when the `_custom_parser` was added to `output_parsers/json`.

Anyway, the issue is that the parser tries to escape quotes when they
are double-escaped (e.g. `\\"`), leading to OutputParserException.
This is particularly undesired in my app, because I have an Agent that
uses a single input Tool, which expects as input a JSON string with the
structure:
```python
{
    "foo": string,
    "bar": string
}
```
The LLM (GPT3.5) response is (almost) always something like
`"action_input": "{\\"foo\\": \\"bar\\", \\"bar\\": \\"foo\\"}"` and
since the upgrade this is not correctly parsed.

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
1 year ago
Harrison Chase ad9e242a7a
add snippet for max concurrency (#9892) 1 year ago
Harrison Chase 566ce06f4a
add async support for tools (#10058) 1 year ago
Stefano Lottini c710c7303f
fix wrong import line in cassandra doc page for vector store (#10041)
This fixes the exampe import line in the general "cassandra" doc page
mdx file. (it was erroneously a copy of the chat message history import
statement found below).
1 year ago
Jon Bennion cc6a20d3e6
updated prompt name in documentation for sequential chain (#10048)
Description: updated the prompt name in a sequential chain example so
that it is not overwritten by the same prompt name in the next chain
(this is a sequential chain example)
Issue: n/a
Dependencies: none
Tag maintainer: not known
Twitter handle: not on twitter, feel free to use my git username for
anything
1 year ago
Jiří Moravčík 86646ec555
feat: Add `ApifyWrapper` class (#10067)
If you look at documentation
https://python.langchain.com/docs/integrations/tools/apify (or the
actual file
https://github.com/langchain-ai/langchain/blob/master/docs/extras/integrations/tools/apify.ipynb
), there's a class `ApifyWrapper` mentioned. It seems it got lost in
some refactoring, i.e. it does not exist in the codebase ATM.

I just propose to add it back.
It would fix issues e.g.
https://github.com/langchain-ai/langchain/issues/8307 or
https://github.com/langchain-ai/langchain/issues/8201

To add, Apify is a wanted integration, e.g. see
https://twitter.com/hwchase17/status/1695490295914545626 or
https://twitter.com/hwchase17/status/1695470765343461756

Lastly, I offer taking ownership of the Apify-related parts of the
codebase, so you can tag me if anything is needed.
1 year ago
Robert Perrotta 02e51f4217
update_forward_refs for Run (#9969)
Adds a call to Pydantic's `update_forward_refs` for the `Run` class (in
addition to the `ChainRun` and `ToolRun` classes, for which that method
is already called). Without it, the self-reference of child classes
(type `List[Run]`) is problematic. For example:

```python
from langchain.callbacks import StdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from wandb.integration.langchain import WandbTracer

llm = OpenAI()
prompt = PromptTemplate.from_template("1 + {number} = ")

chain = LLMChain(llm=llm, prompt=prompt, callbacks=[StdOutCallbackHandler(), WandbTracer()])
print(chain.run(number=2))

```

results in the following output before the change

```
WARNING:root:Error in on_chain_start callback: field "child_runs" not yet prepared so type is still a ForwardRef, you might need to call Run.update_forward_refs().

> Entering new LLMChain chain...
Prompt after formatting:
1 + 2 = 
WARNING:root:Error in on_chain_end callback: No chain Run found to be traced

> Finished chain.

3
```

but afterwards the callback error messages are gone.
1 year ago
Eugene Yurtsev 74fcfed4e2
lint for pydantic imports (#9937)
Catch pydantic imports
1 year ago
Zizhong Zhang 641b71e2cd
refactor: rename to OpaquePrompts (#10013)
Renamed to OpaquePrompts

cc @baskaryan Thanks in advance!
1 year ago
Bagatur 8d66b00c73
Data anonymizer notebook nit (#10062) 1 year ago
Bagatur 19400ba253
bump 278 (#10052) 1 year ago
Bagatur 29270e0378
fix #3117 (#9957)
fix #3117
1 year ago
Bagatur 5b913003e0 bump 1 year ago
Bagatur 4b15328767
Add indexing support for postgresql (#9933)
Add support to postgresql for the SQL Manager Record

This code was tested locally. I'm looking at how to add testing with
postgres in a separate PR.
1 year ago
Bagatur e60e1cdf23
fixed openai_functions api_response format args err (#9968)
root cause: args may not have a key (params) resulting in an error
1 year ago
Bagatur 3efab8d3df
implement vectorstores by tencent vectordb (#9989)
Hi there!
I'm excited to open this PR to add support for using 'Tencent Cloud
VectorDB' as a vector store.

Tencent Cloud VectorDB is a fully-managed, self-developed,
enterprise-level distributed database service designed for storing,
retrieving, and analyzing multi-dimensional vector data. The database
supports multiple index types and similarity calculation methods, with a
single index supporting vector scales up to 1 billion and capable of
handling millions of QPS with millisecond-level query latency. Tencent
Cloud VectorDB not only provides external knowledge bases for large
models to improve their accuracy, but also has wide applications in AI
fields such as recommendation systems, NLP services, computer vision,
and intelligent customer service.

The PR includes:
 Implementation of Vectorstore.

I have read your [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md).
And I have passed the tests below

 make format
 make lint
 make coverage
 make test
1 year ago
Bagatur d43a36c32a
Bagatur/dereference tool schema (#10007)
fix for #9375
1 year ago
Bagatur 6b5a970949
refactor(document_loaders): abstract page evaluation logic in PlaywrightURLLoader (#9995)
This PR brings structural updates to `PlaywrightURLLoader`, aiming at
making the code more readable and extensible through the abstraction of
page evaluation logic. These changes also align this implementation with
a similar structure used in LangChain.js.

The key enhancements include:

1. Introduction of 'PlaywrightEvaluator', an abstract base class for all
evaluators.
2. Creation of 'UnstructuredHtmlEvaluator', a concrete class
implementing 'PlaywrightEvaluator', which uses `unstructured` library
for processing page's HTML content.
3. Extension of 'PlaywrightURLLoader' constructor to optionally accept
an evaluator of the type 'PlaywrightEvaluator'. It defaults to
'UnstructuredHtmlEvaluator' if no evaluator is provided.
4. Refactoring of 'load' and 'aload' methods to use the 'evaluate' and
'evaluate_async' methods of the provided 'PageEvaluator' for page
content handling.

This update brings flexibility to 'PlaywrightURLLoader' as it can now
utilize different evaluators for page processing depending on the
requirement. The abstraction also improves code maintainability and
readability.

Twitter: @ywkim
1 year ago
Bagatur b1644bc9ad cr 1 year ago
Hunsmore 13fef1e5d3
add bloomz_7b, llama-2-7b, llama-2-13b, llama-2-70b to ErnieBotChat (#10024)
- Description: Add bloomz_7b, llama-2-7b, llama-2-13b, llama-2-70b to
ErnieBotChat, which only supported ERNIE-Bot-turbo and ERNIE-Bot.
  - Issue: #10022,
  - Dependencies: no extra dependencies

---------

Co-authored-by: hetianfeng <hetianfeng@meituan.com>
1 year ago
Cameron Vetter e37d51cab6
fix scoring profile example (#10016)
- Description: A change in the documentation example for Azure Cognitive
Vector Search with Scoring Profile so the example works as written
  - Issue: #10015 
  - Dependencies: None
  - Tag maintainer: @baskaryan @ruoccofabrizio
  - Twitter handle: @poshporcupine
1 year ago
skspark 52a3e8a261
Add integration TCs on bing search (#8068) (#10021)
## Description
Added integration TCs on bing search utility

## Issue
#8068 

## Dependencies
None
1 year ago
Hyeokjun seo e2e05ad89e
Fix Typo : `openai_api_key` -> `serpapi_api_key` (#10020)
Fixed typo in the comments Notebook. (which says `openai_api_key` for
SerpAPI)
1 year ago
Tomaz Bratanic f2e8399cc8
Fix link in Neo4j provider page (#10023) 1 year ago
William FH 5341b04d68
Update error message (#9970)
in evals
1 year ago
William FH b82ad19ed2
Check memory address (#9971)
Don't want to dup the collector but can have multiple
1 year ago
Bagatur e805f8e263 add tests 1 year ago
Bagatur 1f5c579ef4 add 1 year ago
Bagatur 240cc289e6 wip 1 year ago
Bagatur 7fa82900cb
guides docs nits (#10005) 1 year ago
Bagatur 2f03e71e67
rename local llm guide (#10004) 1 year ago
Bagatur 781f274d19
make privacy guide section (#10003) 1 year ago
maks-operlejn-ds a8f804a618
Add data anonymizer (#9863)
### Description

The feature for anonymizing data has been implemented. In order to
protect private data, such as when querying external APIs (OpenAI), it
is worth pseudonymizing sensitive data to maintain full privacy.

Anonynization consists of two steps:

1. **Identification:** Identify all data fields that contain personally
identifiable information (PII).
2. **Replacement**: Replace all PIIs with pseudo values or codes that do
not reveal any personal information about the individual but can be used
for reference. We're not using regular encryption, because the language
model won't be able to understand the meaning or context of the
encrypted data.

We use *Microsoft Presidio* together with *Faker* framework for
anonymization purposes because of the wide range of functionalities they
provide. The full implementation is available in `PresidioAnonymizer`.

### Future works

- **deanonymization** - add the ability to reverse anonymization. For
example, the workflow could look like this: `anonymize -> LLMChain ->
deanonymize`. By doing this, we will retain anonymity in requests to,
for example, OpenAI, and then be able restore the original data.
- **instance anonymization** - at this point, each occurrence of PII is
treated as a separate entity and separately anonymized. Therefore, two
occurrences of the name John Doe in the text will be changed to two
different names. It is therefore worth introducing support for full
instance detection, so that repeated occurrences are treated as a single
object.

### Twitter handle
@deepsense_ai / @MaksOpp

---------

Co-authored-by: MaksOpp <maks.operlejn@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 98cce7dcd3
update moderation docs (#10002) 1 year ago
Bagatur b3e3a31240
bump 277 (#9997) 1 year ago
Bagatur 9828701de1
mv base cache to schema (#9953)
if you remove all other imports from langchain.init it exposes a
circular dep
1 year ago
Christophe Bornet 9870bfb9cd
Add bucket and object key to metadata in S3 loader (#9317)
- Description: this PR adds `s3_object_key` and `s3_bucket` to the doc
metadata when loading an S3 file. This is particularly useful when using
`S3DirectoryLoader` to remove the files from the dir once they have been
processed (getting the object keys from the metadata `source` field
seems brittle)
  - Dependencies: N/A
  - Tag maintainer: ?
  - Twitter handle: _cbornet

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Eugene Yurtsev 6da158388b Merge branch 'master' into ywkim/master 1 year ago
Guy Korland 24c0b01c38
Extend the FalkorDB QA demo (#9992)
- Description: Extend the FalkorDB QA demo
  - Tag maintainer: @baskaryan
1 year ago
Eugene Yurtsev 588237ef30
Make document serializable, create utility to create a docstore (#9674)
This PR makes the following changes:

1. Documents become serializable using langhchain serialization
2. Make a utility to create a docstore kw store

Will help to address issue here:
https://github.com/langchain-ai/langchain/issues/9345
1 year ago
Eugene Yurtsev e8f29be350 x 1 year ago
Buckler89 a28e888b36
fix call _get_keys for custom_evaluator (#9763)
In the function _load_run_evaluators the function _get_keys was not
called if only custom_evaluators parameter is used


- Description: In the function _load_run_evaluators the function
_get_keys was not called if only custom_evaluators parameter is used,
  - Issue: no issue created for this yet,
  - Dependencies: None,
  - Tag maintainer: @vowelparrot,
  - Twitter handle: Buckler89

---------

Co-authored-by: ddroghini <d.droghini@mflgroup.com>
1 year ago
Eugene Yurtsev cafce9ed23 x 1 year ago
wlleiiwang 8c4e29240c implement vectorstores by tencent vectordb 1 year ago
Bagatur 2d2b097fab
mv chat history (#9725) 1 year ago