Commit Graph

2344 Commits (eb903e211c315596f0bb8028b6d17c24593d6f47)

Author SHA1 Message Date
Manuel Soria dde1992fdd
Adding custom tools to SQL Agent (#10198)
Changes in:
- `create_sql_agent` function so that user can easily add custom tools
as complement for the toolkit.
- updating **sql use case** notebook to showcase 2 examples of extra
tools.

Motivation for these changes is having the possibility of including
domain expert knowledge to the agent, which improves accuracy and
reduces time/tokens.
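
A minimal sketch of the idea, not the actual `create_sql_agent` code: the toolkit's tools are combined with caller-supplied extras before the agent is built. `Tool`, `combine_tools`, and the example tool names are hypothetical stand-ins, not LangChain classes.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Tool:
    """Hypothetical stand-in for an agent tool (not the LangChain class)."""
    name: str
    description: str
    func: Callable[[str], str]


def combine_tools(toolkit_tools: Sequence[Tool], extra_tools: Sequence[Tool] = ()) -> List[Tool]:
    """Return toolkit tools followed by the extras, rejecting duplicate names."""
    combined = list(toolkit_tools) + list(extra_tools)
    names = [tool.name for tool in combined]
    if len(names) != len(set(names)):
        raise ValueError("tool names must be unique")
    return combined


# One assumed toolkit tool plus a domain-knowledge tool supplied by the user.
query_tool = Tool("sql_db_query", "Run a SQL query.", lambda q: f"ran: {q}")
glossary_tool = Tool(
    "domain_glossary",
    "Explain business terms used in column names.",
    lambda term: {"arr": "annual recurring revenue"}.get(term.lower(), "unknown term"),
)
tools = combine_tools([query_tool], extra_tools=[glossary_tool])
```

The extra tool carries the domain knowledge, so the agent can look a term up instead of guessing it from the schema.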

---------

Co-authored-by: Manuel Soria <manuel.soria@greyscaleai.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
olgavrou 62cf108700 add random policy and notebook 1 year ago
Lance Martin 8998060d85
Update docs w/ prompt hub (#10197)
Small updates to docs
1 year ago
Bagatur a94dc6ee44
model garden nit (#10194) 1 year ago
Aashish Saini 8bba69ffd0
Fixed some grammatical typos in doc files (#10191)
Fixed some grammatical typos in doc files
CC: @baskaryan, @eyurtsev, @rlancemartin.

---------

Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com>
Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com>
Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: Lakshya <lakshyagupta87@yahoo.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>
1 year ago
mateusz.wosinski 6b9529e11a Update notebook 1 year ago
mateusz.wosinski 800fe4a73f Integration with eleven labs 1 year ago
IlyaKIS1 de3322609e
Implemented Milvus translator for self-querying (#10162)
- Implemented the MilvusTranslator for self-querying using Milvus vector
store
- Made unit tests to test its functionality
- Documented the Milvus self-querying
1 year ago
Aashish Saini 7403faa063
Fixed typo in get_started.mdx (#10163)
Fix typo: 'Whats up' -> 'What's up'

Thanks
CC: @baskaryan, @eyurtsev, @rlancemartin.

---------

Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
1 year ago
Aashish Saini f6f0b0f975
Fixed typo in bittensor.mdx (#10160)
Fixed typo in bittensor.mdx

---------

Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
1 year ago
Christophe Bornet 803d0d9656
Add the possibility to configure boto3 in the S3 loaders (#9304)
- Description: this PR adds the possibility to configure boto3 in the S3
loaders. Any named argument you add will be used to create the Boto3
session. This is useful when the AWS credentials can't be passed as env
variables or can't be read from the credentials file.
  - Issue: N/A
  - Dependencies: N/A
  - Tag maintainer: ?
  - Twitter handle: cbornet_
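
A rough sketch of the pattern described above, assuming the loader simply stores extra named arguments and forwards them when the session is created. `S3FileLoaderSketch` and `recording_factory` are hypothetical; with boto3 installed, the factory would typically be `boto3.Session`.

```python
from typing import Any, Callable, Dict


class S3FileLoaderSketch:
    """Hypothetical loader that forwards extra named arguments to a session factory."""

    def __init__(self, bucket: str, key: str, **session_kwargs: Any) -> None:
        self.bucket = bucket
        self.key = key
        self.session_kwargs: Dict[str, Any] = session_kwargs

    def make_session(self, session_factory: Callable[..., Any]) -> Any:
        # Every extra named argument is passed through unchanged.
        return session_factory(**self.session_kwargs)


def recording_factory(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in factory that returns the arguments it received."""
    return dict(kwargs)


loader = S3FileLoaderSketch(
    "my-bucket",
    "reports/2023.pdf",
    region_name="eu-west-1",
    aws_access_key_id="EXAMPLE_KEY_ID",
    aws_secret_access_key="EXAMPLE_SECRET",
)
session_args = loader.make_session(recording_factory)
```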

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline 03174c91d0
docs: `MLflow API` and examples (#9547)
Added docs and links to the API and examples provided by MLflow itself
1 year ago
Xiaoyu Xee 9bcfd58580
Add dashvector self query retriever (#9684)
## Description
Add `Dashvector` retriever and self-query retriever

## How to use
```python
from langchain.vectorstores.dashvector import DashVector

vectorstore = DashVector.from_documents(docs, embeddings)
retriever = SelfQueryRetriever.from_llm(
    llm, vectorstore, document_content_description, metadata_field_info, verbose=True
)
```

---------

Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Leonid Ganeline 056e59672b
docs: `DeepLake` example (#9663)
Updated the `Deep Lake` example. Added a link to an example provided by
Activeloop.
1 year ago
seamusp 43c4c6dfcc
docs: misc modelIO fixes (#9734)
Various improvements to the Model I/O section of the documentation

- Changed "Chat Model" to "chat model" in a few spots for internal
consistency
- Minor spelling & grammar fixes to improve readability & comprehension
1 year ago
Nino Risteski 433c4a721e
typo in local llms fixed (#9755)
Hi, 

I noticed a typo in the local_llms.ipynb file and fixed it. The word
challenge is without 'a' in the original file.
@baskaryan , @eyurtsev

Thanks.

Co-authored-by: Fliprise <fliprise@Fliprises-MacBook-Pro.local>
1 year ago
seamusp 16945c9922
docs: misc retrievers fixes (#9791)
Various miscellaneous fixes to most pages in the 'Retrievers' section of
the documentation:
- "VectorStore" and "vectorstore" changed to "vector store" for
consistency
- Various spelling, grammar, and formatting improvements for readability

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Terry Tan 8bc452a466
Enhance Google search tool SerpApi response (#10157)
Enhance the SerpApi response, which has the potential to produce more relevant output.

Query: What is the weather in Pomfret?

**Before:**

> I should look up the current weather conditions.
...
Final Answer: The current weather in Pomfret is 73°F with 1% chance of
precipitation and winds at 10 mph.

**After:**

> I should look up the current weather conditions.
...
Final Answer: The current weather in Pomfret is 62°F, 1% precipitation,
61% humidity, and 4 mph wind.

---

Query: Top team in english premier league?

**Before:**

> I need to find out which team is currently at the top of the English
Premier League
...
Final Answer: Liverpool FC is currently at the top of the English
Premier League.

**After:**

> I need to find out which team is currently at the top of the English
Premier League
...
Final Answer: Man City is currently at the top of the English Premier
League.

---

Query: Any upcoming events in Paris?

**Before:**

> I should look for events in Paris
Action: Search
...
Final Answer: Upcoming events in Paris this month include Whit Sunday &
Whit Monday (French National Holiday), Makeup in Paris, Paris Jazz
Festival, Fete de la Musique, and Salon International de la Maison de.

**After:**

> I should look for events in Paris
Action: Search
...
Final Answer: Upcoming events in Paris include Elektric Park 2023, The
Aces, and BEING AS AN OCEAN.
1 year ago
Aashish Saini fe0e191fb3
Made some Grammatical error fixes (#10156)
Made some Grammatical error fixes.
CC: @baskaryan, @eyurtsev, @rlancemartin.

---------

Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
1 year ago
Geonwoo Kim e34dde3d15
docs: Fix `CustomLLM` and `Question_answering` docs (#9782)
### Description
- Update `CustomLLM._call`: Corrected the _call method in CustomLLM to
include **kwargs, ensuring consistency with parent class.
- Update `Question_answering`: To fix `Page not found` error
- https://python.langchain.com/docs/use_cases/code ->
https://python.langchain.com/docs/use_cases/code_understanding
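
To illustrate why the `**kwargs` fix matters, here is a small sketch under the assumption that the parent class forwards extra keyword arguments to `_call`. `BaseLLMSketch` is a hypothetical stand-in for the real base class, and the truncating `CustomLLM` is only an illustration.

```python
from typing import Any, List, Optional


class BaseLLMSketch:
    """Hypothetical parent class that forwards extra keyword arguments to _call."""

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        raise NotImplementedError

    def __call__(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        return self._call(prompt, stop=stop, **kwargs)


class CustomLLM(BaseLLMSketch):
    def __init__(self, n: int) -> None:
        self.n = n

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Accepting **kwargs keeps the override compatible with the parent signature.
        return prompt[: self.n]


llm = CustomLLM(n=5)
echo = llm("hello world", temperature=0.1)
```

Without `**kwargs` in the override, the call with `temperature=0.1` would raise a `TypeError` in this sketch.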

### Issue
N/A

### Dependencies
N/A

### Tag maintainer
N/A

### Twitter handle
N/A
1 year ago
Aashish Saini 94efede93c
Fixed Typos and grammatical issues in document files (#9789)
Fixed typos and grammatical issues in document files.

@baskaryan , @eyurtsev

---------

Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
1 year ago
Philippe PRADOS f59e5d48ed
Google drive integration (lite) (#9999)
My other
[pull-request](https://github.com/langchain-ai/langchain/pull/5135) is
too big to be acceptable.
I propose another 'lite' version.

This version only updates the notebook to propose an integration with the external
project
[`langchain-googledrive`](https://github.com/pprados/langchain-googledrive).

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Viktor Zhemchuzhnikov 507e46844e
Extend SQLChatMessageHistory (#9849)
### Description

There is a really nice class for saving chat messages into a database -
SQLChatMessageHistory.
It leverages SqlAlchemy to be compatible with any supported database (in
contrast with PostgresChatMessageHistory, which is basically the same
but is limited to Postgres).

However, the class is not really customizable in terms of what you can
store. I can imagine a lot of use cases, when one will need to save a
message date, along with some additional metadata.

To solve this, I propose to extract the converting logic from
BaseMessage to SQLAlchemy model (and vice versa) into a separate class -
message converter. So instead of rewriting the whole
SQLChatMessageHistory class, a user will only need to write a custom
model and a simple mapping class, and pass its instance as a parameter.

I also noticed that there is no documentation on this class, so I added
that too, with an example of custom message converter.
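
The converter idea can be sketched roughly as follows. This is an in-memory illustration, not the actual class: `Message`, `MessageRow`, `MessageConverter`, and `ChatHistorySketch` are hypothetical stand-ins for BaseMessage, a custom SQLAlchemy model, the converter interface, and the history class.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for BaseMessage."""
    role: str
    content: str


@dataclass
class MessageRow:
    """Hypothetical stand-in for a custom model with an extra date column."""
    session_id: str
    role: str
    content: str
    created_at: datetime


class MessageConverter(ABC):
    @abstractmethod
    def to_row(self, message: Message, session_id: str) -> MessageRow: ...

    @abstractmethod
    def from_row(self, row: MessageRow) -> Message: ...


class TimestampedConverter(MessageConverter):
    def to_row(self, message: Message, session_id: str) -> MessageRow:
        return MessageRow(session_id, message.role, message.content, datetime.now(timezone.utc))

    def from_row(self, row: MessageRow) -> Message:
        return Message(row.role, row.content)


class ChatHistorySketch:
    """In-memory history that delegates all mapping to the converter."""

    def __init__(self, session_id: str, converter: MessageConverter) -> None:
        self.session_id = session_id
        self.converter = converter
        self.rows: List[MessageRow] = []

    def add_message(self, message: Message) -> None:
        self.rows.append(self.converter.to_row(message, self.session_id))

    @property
    def messages(self) -> List[Message]:
        return [self.converter.from_row(row) for row in self.rows]


history = ChatHistorySketch("session-1", TimestampedConverter())
history.add_message(Message("human", "Hello"))
```

Only the converter and the row type change when extra fields are needed; the history class itself stays untouched.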

### Issue

N/A

### Dependencies

N/A

### Tag maintainer

Not yet

### Twitter handle

N/A
1 year ago
Jon Bennion fed137a8a9
adding new chain for logical fallacy removal from model output in chain (#9887)
Description: new chain for logical fallacy removal from model output in
chain and docs
Issue: n/a see above
Dependencies: none
Tag maintainer: @hinthornw in past from my end but not sure who that
would be for maintenance of chains
Twitter handle: no twitter feel free to call out my git user if shout
out j-space-b

Note: created documentation in docs/extras

---------

Co-authored-by: Jon Bennion <jb@Jons-MacBook-Pro.local>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Lance Martin 16a27ab244
Add prompt hub for various use-cases (#9879)
Use prompt hub in our use-case docs and guides.
1 year ago
Leonid Ganeline a52fe9528e
docs: fixed title in `Bittensor` example (#9893)
Fixed title in the `Bittensor` example. The old title breaks the sorted
order of items in the navbar.
Added some formatting.
1 year ago
seamusp abd8681341
docs: chains & memory fixes (#9895)
Various improvements to the Chains & Memory sections of the
documentation including formatting, spelling, and grammar fixes to
improve readability.
1 year ago
Josh White bc8cceebf7
Extend DynamoDBChatMessageHistory to support composite keys (#9896)
- Description: Adds two optional parameters to the
DynamoDBChatMessageHistory class to enable users to pass in a name for
their PrimaryKey, or a Key object itself to enable the use of composite
keys, a common DynamoDB paradigm.
  
[AWS DynamoDB Key
docs](https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/)
  
  - Issue: N/A
  - Dependencies: N/A
  - Twitter handle: N/A
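
A small sketch of how the two optional parameters could interact, assuming an explicit key takes precedence over the primary key name. `resolve_key`, the default name `SessionId`, and the attribute names are assumptions for illustration.

```python
from typing import Dict, Optional


def resolve_key(
    session_id: str,
    primary_key_name: str = "SessionId",
    key: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return the key used to read and write one session's item.

    An explicit `key` (for example partition key plus sort key) wins;
    otherwise a single-attribute key is built from `primary_key_name`.
    """
    if key is not None:
        return dict(key)
    return {primary_key_name: session_id}


simple = resolve_key("abc123")
renamed = resolve_key("abc123", primary_key_name="ChatId")
composite = resolve_key("abc123", key={"UserId": "u-42", "SessionId": "abc123"})
```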

---------

Co-authored-by: Josh White <josh@ctrlstack.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Blake (Yung Cher Ho) f4bed8a04c
Takeoff baseurl support (#10091)
## Description
This PR introduces a minor change to the TitanTakeoff integration. 
Instead of specifying a port on localhost, this PR will allow users to
specify a baseURL instead. This will allow users to use the integration
if they have TitanTakeoff deployed externally (not on localhost). This
removes the hardcoded reference to localhost "http://localhost:{port}".
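
As a sketch of the change, assuming the endpoint is built by joining a configurable base URL with a path: `generate_endpoint`, the `generate` path, and port 8000 are hypothetical choices for illustration.

```python
from urllib.parse import urljoin


def generate_endpoint(base_url: str = "http://localhost:8000", path: str = "generate") -> str:
    """Build the request URL from a configurable base URL (path name is assumed)."""
    return urljoin(base_url.rstrip("/") + "/", path)


local_url = generate_endpoint()
remote_url = generate_endpoint("https://takeoff.example.com:9000/")
```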

### Info about Titan Takeoff
Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.

Read more about Titan Takeoff here:
-
[Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)

### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.

Thanks for your help and please let me know if you have any questions.
cc: @hwchase17 @baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Pu Cao 05664a6f20
docs(text_splitter): update document of character splitter with tiktoken (#10001)
The current documentation does not mention that splits larger than the
chunk size can occur. I updated the documentation to explain why this
happens and how to solve it.
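
A simplified illustration of the behaviour, not the library's implementation: a single-separator splitter never cuts a piece that contains no separator, so that piece can exceed the chunk size, while falling back to finer separators keeps every piece within the limit. Whitespace tokens stand in for tiktoken tokens, and the merging of small pieces back into chunks is omitted.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Stand-in length function: whitespace tokens instead of tiktoken tokens."""
    return len(text.split())


def split_on_separator(text: str, separator: str) -> List[str]:
    """Single-separator split: a piece with no separator inside is never cut further."""
    return [piece for piece in text.split(separator) if piece]


def split_recursively(
    text: str,
    separators: List[str],
    chunk_size: int,
    length: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Fall back to finer separators until every piece fits within chunk_size."""
    if length(text) <= chunk_size or not separators:
        return [text]
    first, rest = separators[0], separators[1:]
    pieces: List[str] = []
    for piece in split_on_separator(text, first):
        pieces.extend(split_recursively(piece, rest, chunk_size, length))
    return pieces


text = "one two three four five six\n\nseven eight"
coarse = split_on_separator(text, "\n\n")
fine = split_recursively(text, ["\n\n", " "], chunk_size=4)
```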

related issue #1349 #3838 #2140
1 year ago
Leonid Ganeline 2221194450
`Yahoo Finance News` tool (#10014)
Added:
- the `Yahoo Finance News` tool
- unit tests
- An example
1 year ago
Ismail Pelaseyed 5c3e9c9083
Add example of running Q&A over structured data using the `Airbyte` loaders and `pandas` (#10069)
- Description: Added example of running Q&A over structured data using
the `Airbyte` loaders and `pandas`
  - Dependencies: any dependencies required for this change,
  - Tag maintainer: @hwchase17 
  - Twitter handle: @pelaseyed
1 year ago
Lars von Wedel 6d82503eb1
Add parser and loader for Azure document intelligence service. (#10136)
Hi,

this PR contains loader / parser for Azure Document intelligence which
is a ML-based service to ingest arbitrary PDFs / images, even if
scanned. The loader generates Documents by pages of the original
document. This is my first contribution to LangChain.

Unfortunately I could not find the correct place for test cases. Happy
to add one if you can point me to the location, but as this is a
cloud-based service, a test would require network access and credentials
- so might be of limited help.

Dependencies: The needed dependency was already part of pyproject.toml,
no change.
Twitter: feel free to mention @LarsAC on the announcement
1 year ago
Harrison Chase 4abe85be57
Harrison/string inplace (#10153)
Co-authored-by: Wrick Talukdar <wrick.talukdar@gmail.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Lucky-Lance <77819606+Lucky-Lance@users.noreply.github.com>
Co-authored-by: 陆徐东 <luxudong@MacBook-Pro.local>
1 year ago
Nino Risteski 0c0a7d19eb
Update openai_multi_functions_agent.ipynb (#10144)
typo fix
1 year ago
Nino Risteski f968b86652
Update apis.ipynb (#10145)
few typo fixes
1 year ago
Guy Korland 765ef3b486
Add FalkorDB to imports (#10151) 1 year ago
Nino Risteski 746c6ff9c3
Update index.mdx (#10142)
fixed typos
1 year ago
Nino Risteski fdebd3e02f
Update chat_vector_db.mdx (#10141)
typo fix
1 year ago
Leonid Kuligin 30239b3025
added support for inference from Model Garden (#9367)
#8850

---------

Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Leonid Ganeline 54a8df87b9
📖 docs: fixed `integration/llms` navbar (#9277)
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
1 year ago
Bagatur b485c3048b
rm base64 images from docs (#10110)
Causing problems indexing docs and notebook images don't render after markdown conversion anyways
1 year ago
William FH f2fc4173c3
Update redirects meta tags (#10109) 1 year ago
Leonid Ganeline 37e435bd00
docs: `youtube_search` tool example update (#9958)
Added a link to source package; updated title, description.
1 year ago
Leonid Ganeline 3b8ee74e38
docs: `google-drive-tool` example fix (#10000)
This notebook was mistakenly placed in the `toolkits` folder and appears
within `Agents & Toolkits` menu. But it should be in `Tools`.
Moved example into `tools/`; updated title to consistent format.
1 year ago
seamusp afd96b2460
docs: agents & callbacks fixes (#10066)
Various improvements to the Agents & Callbacks sections of the
documentation including formatting, spelling, and grammar fixes to
improve readability.
1 year ago
Benjamin Matson 58d7d86e51
feat: add bedrock chat model (#8017)
  - Description: Add Bedrock implementation of Anthropic Claude for Chat
  - Tag maintainer: @hwchase17, @baskaryan
  - Twitter handle: @bwmatson

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
KyrianC 491089754d
EdenAI LLM update. Add models name option (#8963)
This PR follows the **Eden AI (LLM + embeddings) integration**. #8633 

We added an optional parameter to choose different AI models for
providers (like 'text-bison' for provider 'google', 'text-davinci-003'
for provider 'openai', etc.).

Usage:

```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "model": "text-bison",  # new
        "temperature": 0.2,
        "max_tokens": 250,
    },
)

```

You can also change the provider + model after initialization
```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "temperature": 0.2,
        "max_tokens": 250,
    },
)

prompt = """
hi 
"""

llm(prompt, providers='openai', model='text-davinci-003')  # change provider & model
```

The Jupyter notebook has been updated with an example as well.


Ping: @hwchase17, @baskaryan

---------

Co-authored-by: RedhaWassim <rwasssim@gmail.com>
Co-authored-by: sam <melaine.samy@gmail.com>
1 year ago
Bagatur 71c418725f
index rename delete_mode -> cleanup (#10103) 1 year ago
Bagatur b927277809
Bagatur/eden type 2 (#10102) 1 year ago
Bagatur d4380339c1
eden tool nb nit (#10101) 1 year ago
KyrianC c7a5504789
Add EdenAI Tools (#9764)
This PR follows the Eden AI (LLM + embeddings) integration. #8633

We added different Tools to empower agents with new capabilities :

- text: explicit content detection

- image: explicit content detection

- image: object detection

- OCR: invoice parsing

- OCR: ID parsing

- audio: speech to text

- audio: text to speech

 
We plan to add more in the future (like translation, language detection,
+ others).


Usage:

```python
llm=EdenAI(feature="text",provider="openai", params={"temperature" : 0.2,"max_tokens" : 250})

tools = [
    EdenAiTextModerationTool(providers=["openai"],language="en"),
    EdenAiObjectDetectionTool(providers=["google","api4ai"]),
    EdenAiTextToSpeechTool(providers=["amazon"],language="en",voice="MALE"),
    EdenAiExplicitImageTool(providers=["amazon","google"]),
    EdenAiSpeechToTextTool(providers=["amazon"]),
    EdenAiParsingIDTool(providers=["amazon","klippa"],language="en"),
    EdenAiParsingInvoiceTool(providers=["amazon","google"],language="en"),
]

agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    return_intermediate_steps=True,
)

result = agent_chain(""" i have this text : 'i want to slap you' 
                   first : i want to know if this text contains explicit content or not .
                   second : if it does contain explicit content i want to know what is the explicit content in this text, 
                   third : i want to make the text into speech .
                   if there is URL in the observations , you will always put it in the output (final answer) .
                   """)
```

output: 
>  Entering new AgentExecutor chain...
> I need to extract the information from the ID and then convert it to
text and then to speech
> Action: edenai_identity_parsing
> Action Input:
"https://www.citizencard.com/images/citizencard-uk-id-card-2023.jpg"
> Observation: last_name : 
>   value : ANGELA
> given_names : 
>   value : GREENE
> birth_place : 
> birth_date : 
>   value : 2000-11-09
> issuance_date : 
> expire_date : 
> document_id : 
> issuing_state : 
> address : 
> age : 
> country : 
> document_type : 
>   value : DRIVER LICENSE FRONT
> gender : 
> image_id : 
> image_signature : 
> mrz : 
> nationality : 
> Thought: I now need to convert the information to text and then to
speech
> Action: edenai_text_to_speech
> Action Input: "Welcome Angela Greene!"
> Observation:
https://d14uq1pz7dzsdq.cloudfront.net/0c494819-0bbc-4433-bfa4-6e99bd9747ea_.mp3?Expires=1693316851&Signature=YcMoVQgPuIMEOuSpFuvhkFM8JoBMSoGMcZb7MVWdqw7JEf5~67q9dEI90o5todE5mYXB5zSYoib6rGrmfBl4Rn5~yqDwZ~Tmc24K75zpQZIEyt5~ZSnHuXy4IFWGmlIVuGYVGMGKxTGNeCRNUXDhT6TXGZlr4mwa79Ei1YT7KcNyc1dsTrYB96LphnsqOERx4X9J9XriSwxn70X8oUPFfQmLcitr-syDhiwd9Wdpg6J5yHAJjf657u7Z1lFTBMoXGBuw1VYmyno-3TAiPeUcVlQXPueJ-ymZXmwaITmGOfH7HipZngZBziofRAFdhMYbIjYhegu5jS7TxHwRuox32A__&Key-Pair-Id=K1F55BTI9AHGIK
> Thought: I now know the final answer
> Final Answer:
https://d14uq1pz7dzsdq.cloudfront.net/0c494819-0bbc-4433-bfa4-6e99bd9747ea_.mp3?Expires=1693316851&Signature=YcMoVQgPuIMEOuSpFuvhkFM8JoBMSoGMcZb7MVWdqw7JEf5~67q9dEI90o5todE5mYXB5zSYoib6rGrmfBl4Rn5~yqDwZ~Tmc24K75zpQZIEyt5~ZSnHuXy4IFWGmlIVuGYVGMGKxTGNeCRNUXDhT6TXGZlr4mwa79Ei1YT7KcNyc1dsTrYB96LphnsqOERx4X9J9XriSwxn70X8oUPFfQmLcitr-syDhiwd9Wdpg6J5y
> 
>  Finished chain.

Other examples are available in the jupyter notebook.


This PR is made in parallel with  EdenAI LLM update #8963 
I apologize for the messy PR. While implementing the Tools, we
realized there were a few problems we needed to fix in the LLM as well.

Ping: @hwchase17, @baskaryan

---------

Co-authored-by: RedhaWassim <rwasssim@gmail.com>
1 year ago
Bagatur 5f1c67b47c
Mv LCEL docs up a level (#10073) 1 year ago
Harrison Chase ad9e242a7a
add snippet for max concurrency (#9892) 1 year ago
Stefano Lottini c710c7303f
fix wrong import line in cassandra doc page for vector store (#10041)
This fixes the example import line in the general "cassandra" doc page
mdx file. (it was erroneously a copy of the chat message history import
statement found below).
1 year ago
Jon Bennion cc6a20d3e6
updated prompt name in documentation for sequential chain (#10048)
Description: updated the prompt name in a sequential chain example so
that it is not overwritten by the same prompt name in the next chain
(this is a sequential chain example)
Issue: n/a
Dependencies: none
Tag maintainer: not known
Twitter handle: not on twitter, feel free to use my git username for
anything
1 year ago
Zizhong Zhang 641b71e2cd
refactor: rename to OpaquePrompts (#10013)
Renamed to OpaquePrompts

cc @baskaryan Thanks in advance!
1 year ago
Bagatur 8d66b00c73
Data anonymizer notebook nit (#10062) 1 year ago
Bagatur 3efab8d3df
implement vectorstores by tencent vectordb (#9989)
Hi there!
I'm excited to open this PR to add support for using 'Tencent Cloud
VectorDB' as a vector store.

Tencent Cloud VectorDB is a fully-managed, self-developed,
enterprise-level distributed database service designed for storing,
retrieving, and analyzing multi-dimensional vector data. The database
supports multiple index types and similarity calculation methods, with a
single index supporting vector scales up to 1 billion and capable of
handling millions of QPS with millisecond-level query latency. Tencent
Cloud VectorDB not only provides external knowledge bases for large
models to improve their accuracy, but also has wide applications in AI
fields such as recommendation systems, NLP services, computer vision,
and intelligent customer service.

The PR includes:
 Implementation of Vectorstore.

I have read your [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md).
And I have passed the tests below

 make format
 make lint
 make coverage
 make test
1 year ago
Bagatur b1644bc9ad cr 1 year ago
Cameron Vetter e37d51cab6
fix scoring profile example (#10016)
- Description: A change in the documentation example for Azure Cognitive
Vector Search with Scoring Profile so the example works as written
  - Issue: #10015 
  - Dependencies: None
  - Tag maintainer: @baskaryan @ruoccofabrizio
  - Twitter handle: @poshporcupine
1 year ago
Hyeokjun seo e2e05ad89e
Fix Typo : `openai_api_key` -> `serpapi_api_key` (#10020)
Fixed typo in the comments Notebook. (which says `openai_api_key` for
SerpAPI)
1 year ago
Tomaz Bratanic f2e8399cc8
Fix link in Neo4j provider page (#10023) 1 year ago
Bagatur 7fa82900cb
guides docs nits (#10005) 1 year ago
Bagatur 2f03e71e67
rename local llm guide (#10004) 1 year ago
Bagatur 781f274d19
make privacy guide section (#10003) 1 year ago
maks-operlejn-ds a8f804a618
Add data anonymizer (#9863)
### Description

The feature for anonymizing data has been implemented. In order to
protect private data, such as when querying external APIs (OpenAI), it
is worth pseudonymizing sensitive data to maintain full privacy.

Anonymization consists of two steps:

1. **Identification:** Identify all data fields that contain personally
identifiable information (PII).
2. **Replacement**: Replace all PIIs with pseudo values or codes that do
not reveal any personal information about the individual but can be used
for reference. We're not using regular encryption, because the language
model won't be able to understand the meaning or context of the
encrypted data.

We use *Microsoft Presidio* together with *Faker* framework for
anonymization purposes because of the wide range of functionalities they
provide. The full implementation is available in `PresidioAnonymizer`.
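
The two steps can be sketched with the standard library alone. This is a toy stand-in, not `PresidioAnonymizer`: the regular expressions replace Presidio's detection, fixed fake values replace Faker, and `anonymize` is a hypothetical helper.

```python
import re
from itertools import count
from typing import Iterator

# Naive detectors (assumptions); real PII detection is far more involved.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d -]{7,}\d")


def anonymize(text: str) -> str:
    """Identify emails and phone numbers, then replace each with a fake value."""
    ids: Iterator[int] = count(1)
    text = EMAIL.sub(lambda match: f"user{next(ids)}@example.com", text)
    text = PHONE.sub(lambda match: "555-0100", text)
    return text


masked = anonymize("Mail jane.doe@corp.com or call +48 123 456 789.")
```

The replacements stay readable as an email address and a phone number, which is the reason given above for preferring pseudo values over encryption.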

### Future works

- **deanonymization** - add the ability to reverse anonymization. For
example, the workflow could look like this: `anonymize -> LLMChain ->
deanonymize`. By doing this, we will retain anonymity in requests to,
for example, OpenAI, and then be able to restore the original data.
- **instance anonymization** - at this point, each occurrence of PII is
treated as a separate entity and separately anonymized. Therefore, two
occurrences of the name John Doe in the text will be changed to two
different names. It is therefore worth introducing support for full
instance detection, so that repeated occurrences are treated as a single
object.
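
Both future items could look roughly like this, assuming a stored mapping from original values to placeholders. The name detector, the `PERSON_n` placeholder format, and both function names are hypothetical; this is not part of the PR.

```python
import re
from typing import Dict, Tuple

# Naive "First Last" detector (assumption); it will misfire on other capitalized pairs.
NAME = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def anonymize_consistently(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct name with one stable placeholder and return the mapping."""
    mapping: Dict[str, str] = {}

    def replace(match: "re.Match[str]") -> str:
        original = match.group(0)
        if original not in mapping:
            mapping[original] = f"PERSON_{len(mapping) + 1}"
        return mapping[original]

    return NAME.sub(replace, text), mapping


def deanonymize(text: str, mapping: Dict[str, str]) -> str:
    """Reverse the substitution using the stored mapping."""
    for original, placeholder in mapping.items():
        text = text.replace(placeholder, original)
    return text


source_text = "John Doe met Anna Smith. Then later, John Doe called."
masked_text, name_mapping = anonymize_consistently(source_text)
restored = deanonymize(masked_text, name_mapping)
```

Because repeated occurrences share one placeholder, the same mapping serves both instance anonymization and the `anonymize -> LLMChain -> deanonymize` round trip.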

### Twitter handle
@deepsense_ai / @MaksOpp

---------

Co-authored-by: MaksOpp <maks.operlejn@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 98cce7dcd3
update moderation docs (#10002) 1 year ago
Christophe Bornet 9870bfb9cd
Add bucket and object key to metadata in S3 loader (#9317)
- Description: this PR adds `s3_object_key` and `s3_bucket` to the doc
metadata when loading an S3 file. This is particularly useful when using
`S3DirectoryLoader` to remove the files from the dir once they have been
processed (getting the object keys from the metadata `source` field
seems brittle)
  - Dependencies: N/A
  - Tag maintainer: ?
  - Twitter handle: _cbornet

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Guy Korland 24c0b01c38
Extend the FalkorDB QA demo (#9992)
- Description: Extend the FalkorDB QA demo
  - Tag maintainer: @baskaryan
1 year ago
wlleiiwang 8c4e29240c implement vectorstores by tencent vectordb 1 year ago
Leonid Ganeline d03d6f6fd9
Merge branch 'master' into docs-tools-menu 1 year ago
Bagatur 8fb0a9594c
Add LLMonitor Callback Handler Integration - open-source observability & analytics (#9870)
Adds support for [llmonitor](https://llmonitor.com) callbacks.

It enables:
- Requests tracking / logging / analytics
- Error debugging
- Cost analytics
- User tracking

Let me know if anything needs to be changed for merge.

Thank you!
1 year ago
leo-gan 8c1678a8c7 Updated titles, descriptions. 1 year ago
Bagatur 7bba1d911b
Fix typo in code_understanding.ipynb (#9899)
seperate -> separate
1 year ago
Bagatur 2e65434568
docs: Fix the syntax error, replace "dotenv.load_env()" with "dotenv.… (#9900)
Description: The documentation incorrectly mentions "dotenv.load_env()", but
it should actually be "dotenv.load_dotenv()". You can see the screenshot
below for reference:

python-dotenv: 1.0.0


![image](https://github.com/langchain-ai/langchain/assets/2959046/94dc4b51-cc2f-412d-92e9-16b8ff0d513e)
1 year ago
Bagatur b416f5c0c8
fix a link name format to the dependents document (#9928) 1 year ago
Bagatur 8f199239b8
docs: `llms/google vertex AI` example update (#9960)
Updated title, description, added sections.
1 year ago
Bagatur 2a03a0087d
docs: `memory` menu (#9947)
The [Memory](https://python.langchain.com/docs/modules/memory/) menu is
clogged with unnecessary wording.
I've made it more concise by simplifying titles of the example
notebooks.
As a result, the menu is shorter and easier to comprehend.
1 year ago
Bagatur f7cc125cac
docs: `memory types` menu (#9949)
The [Memory
Types](https://python.langchain.com/docs/modules/memory/types/) menu is
clogged with unnecessary wording.
I've made it more concise by simplifying titles of the example
notebooks.
As a result, the menu is shorter and easier to comprehend.
1 year ago
Bagatur 16eb935469
Fix for similarity_search_with_score (#9903)
- Description: the implementation for similarity_search_with_score did
not actually include a score or logic to filter. Now fixed.
- Tag maintainer: @rlancemartin
- Twitter handle: @ofermend
1 year ago
Fredrik Gullberg f69d236a4a
docs: Fix spelling mistakes in apis.ipynb (#9911)
- Description: Fix spelling mistakes in apis.ipynb
- Issue: [#9910](https://github.com/langchain-ai/langchain/issues/9910)

Co-authored-by: Fredrik Gullberg <fredrik.gullberg@klarna.com>
1 year ago
Nate Nethercott 0024824a6e
docs: Fix spelling mistakes in retrievers/get_started.mdx (#9920)
Description: Fix spelling mistakes in retrievers/get_started.mdx
1 year ago
leo-gan 210de0c66b Updated title, description, added sections 1 year ago
Cameron Hutchison bcc3463ff4
docs: Azure AD Authentication for Azure OpenAI (#9951)
# Description
This PR adds additional documentation on how to use Azure Active
Directory to authenticate to an OpenAI service within Azure. This method
of authentication allows organizations with more complex security
requirements to use Azure OpenAI.

# Issue
N/A

# Dependencies
N/A

# Twitter
https://twitter.com/CamAHutchison
1 year ago
Guy Korland 7cbe872af8
Add support for Falkordb (ex-RedisGraph) (#9821)
  - Description: Add support for Falkordb (ex-RedisGraph)
  - Tag maintainer: @hwchase17
  - Twitter handle: @g_korland
1 year ago
Bagatur ede45f535e
fix intro docs (#9950) 1 year ago
Leonid Ganeline 393816e7bd
Merge branch 'master' into docs-memory-type-menu 1 year ago
Corvus Lee 0fb95ebe66
Docs: enrich SageMaker endpoint embeddings with docstrings and examples (#9924)
Description: added comments to address the relationship between
input/output transformations and the customised inference.py script.
1 year ago
leo-gan 7c7ae34eeb updated .mdx titles and text. 1 year ago
leo-gan d578efba35 updated notebook titles and text. 1 year ago
Leonid Ganeline 4b6e41a939
Merge branch 'master' into docs-memory-menu 1 year ago
Tomaz Bratanic 6092422e10
Add neo4j provider page (#9941) 1 year ago
leo-gan c906041aa8 updated notebook titles and text. 1 year ago
Tomaz Bratanic db13fba7ea
Add neo4j vector support (#9770)
Neo4j has added vector index integration just recently. To allow both
ingestion and integrating it as vector RAG applications, I wrapped it as
a vector store as the implementation is completely different from
`GraphCypherQAChain`. Here, we are not generating any Cypher statements
at query time, we are simply doing the vector similarity search using
the new vector index as if we were dealing with a vector database.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Tudor Golubenco 171b0b183b
Pre-release Xata version no longer required (#9915)
Tiny PR: Since we've released version 1.0.0 of the python SDK, we no
longer need to specify the pre-release version when pip installing.
1 year ago
Mike Nitsenko c80e406e95
Cube semantic loader: allow cubes processing (#9927)
We've started to receive feedback (after launch) that using only views
is confusing.
We're considering this as a good practice, as a view serves as a
"facade" for your data - however, we decided to let users decide this on
their own.

Solves the questions from:
- https://github.com/cube-js/cube/issues/7028
- https://github.com/langchain-ai/langchain/pull/9690
1 year ago
LiaoKong 8f8455b24d fix a link name format to the dependents document 1 year ago
Ofer Mendelevitch 8b8d2a6535 fixed similarity_search_with_score to really use a score
updated unit test with a test for score threshold
Updated demo notebook
1 year ago
Ikko Eltociear Ashimine 766bbd6c6b
Fix typo in code_understanding.ipynb
seperate -> separate
1 year ago
tongtie 82a3c2a557 docs: Fix the syntax error, replace "dotenv.load_env()" with "dotenv.load_dotenv()". 1 year ago
Mazhar (Taha) Mumbaiwala e80834d783
docs: Fix spelling mistakes in Etherscan.ipynb (#9845) 1 year ago
Philippe PRADOS 7fdb7439e0
Update google drive notebooks (#9851)
Update google drive doc loader and retriever notebooks. Show how to use with langchain-googledrive package.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Xiaobing Mi 5d47833ae1
Fix typo in web_scraping.ipynb (#9835) 1 year ago
Leonid Ganeline b1bffea9c7
docs: fix for title of `llm_caching` nb (#9891)
Fixed title for the `extras/integrations/llms/llm_caching.ipynb`.
Existing title breaks the sorted order of items in the navbar.
Updated some formatting.
1 year ago
Leonid Ganeline e01b00aa54
docs: `ainetwork` update (#9871)
* Added links to the AI Network
* Made title consistent to other tool kits
* Added `integrations/providers/` integration card page
* **No changes** in the example code!
1 year ago
Leonid Ganeline cf122b6269
docs: `Infino` example fix (#9888)
- Fixed a broken link in the `integrations/providers/infino.mdx`
- Fixed a title in the `integrations/callbacks/infino.ipynb` example
- Updated text format in this example.
1 year ago
Piyush Jain fe1b9ee6b8
Updated notebook for comprehend moderation (#9875)
### Description
Updated the notebook for comprehend moderation.

cc @baskaryan
1 year ago
William FH b14d74dd4d
iMessage loader (#9832)
Add an iMessage chat loader
1 year ago
Lance Martin 8393ba9dab
Add instructions for GGUF (#9874)
llama.cpp migrated to GGUF model format, and new releases (e.g.,
[here](https://huggingface.co/TheBloke)) now use GGUF.
1 year ago
hughcrt 3a4d4c940c Change video width 1 year ago
hughcrt 97741d41c5 Add LLMonitorCallbackHandler 1 year ago
eryk-dsai 7f5713b80a
feat: grammar-based sampling in llama-cpp (#9712)
## Description 

The following PR enables the [grammar-based
sampling](https://github.com/ggerganov/llama.cpp/tree/master/grammars)
in llama-cpp LLM.

In short, loading a file with a formal grammar definition will constrain
model outputs. For instance, one can force the model to generate valid
JSON or generate only python lists.

In the follow-up PR we will add:
* docs with some description why it is cool and how it works
* maybe some code sample for some task such as in llama repo

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase c1badc1fa2
add gmail loader (#9810) 1 year ago
Vikas Sheoran 63921e327d
docs: Fix a spelling mistake in adding_memory.ipynb (#9794)
# Description 
This pull request fixes a small spelling mistake found while reading
docs.
1 year ago
Rosário P. Fernandes aab01b55db
typo: funtions --> functions (#9784)
Minor typo in the extractions use-case
1 year ago
Sam Partee a28eea5767
Redis metadata filtering and specification, index customization (#8612)
### Description

The previous Redis implementation did not allow for the user to specify
the index configuration (i.e. changing the underlying algorithm) or add
additional metadata to use for querying (i.e. hybrid or "filtered"
search).

This PR introduces the ability to specify custom index attributes and
metadata attributes as well as use that metadata in filtered queries.
Overall, more structure was introduced to the Redis implementation that
should allow for easier maintainability moving forward.

# New Features

The following features are now available with the Redis integration into
Langchain

## Index schema generation

The schema for the index will now be automatically generated if not
specified by the user. For example, the data above has multiple
metadata categories. Take the following example:

```python

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.redis import Redis

embeddings = OpenAIEmbeddings()


rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users"
)
```

Loading the data in through this and the other ``from_documents`` and
``from_texts`` methods will now generate index schema in Redis like the
following.

View the index schema with the ``redisvl`` tool ([link](redisvl.com)).

```bash
$ rvl index info -i users
```


Index Information:

| Index Name | Storage Type | Prefixes | Index Options | Indexing |
|--------------|----------------|---------------|-----------------|------------|
| users | HASH | ['doc:users'] | [] | 0 |

Index Fields:

| Name | Attribute | Type | Field Option | Option Value |
|----------------|----------------|---------|----------------|----------------|
| user | user | TEXT | WEIGHT | 1 |
| job | job | TEXT | WEIGHT | 1 |
| credit_score | credit_score | TEXT | WEIGHT | 1 |
| content | content | TEXT | WEIGHT | 1 |
| age | age | NUMERIC | | |
| content_vector | content_vector | VECTOR | | |


### Custom Metadata specification

The metadata schema generation has the following rules
1. All text fields are indexed as text fields.
2. All numeric fields are indexed as numeric fields.
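
As a rough illustration of the two rules above, the sketch below groups metadata keys by the Python type of their values. The helper name `infer_index_schema` and the sample record are hypothetical and are not part of the Redis integration; the real generation logic may differ (for example in how it treats booleans or nested values).

```python
from numbers import Number


def infer_index_schema(metadata: dict) -> dict:
    """Hypothetical helper: group metadata keys by inferred field type."""
    schema = {"text": [], "numeric": []}
    for name, value in metadata.items():
        # bool is a subclass of int in Python; treating it as text here is
        # an assumption of this sketch, not documented behaviour.
        if isinstance(value, Number) and not isinstance(value, bool):
            schema["numeric"].append({"name": name})
        else:
            schema["text"].append({"name": name})
    return schema


# Made-up record shaped like the example metadata used in this description.
sample = {"user": "derrick", "job": "doctor", "credit_score": "low", "age": 14}
generated = infer_index_schema(sample)
print(generated)
```

Under these assumptions `credit_score` lands in the text group, which is why an explicit override is needed to index it as a tag.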

If you would like a text field to be indexed as a tag field, you can specify
overrides like the following for the example data:

```python

# this can also be a path to a yaml file
index_schema = {
    "text": [{"name": "user"}, {"name": "job"}],
    "tag": [{"name": "credit_score"}],
    "numeric": [{"name": "age"}],
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users2",
    index_schema=index_schema,  # pass the overrides defined above
)
```
This will change the index specification to 

Index Information:

| Index Name | Storage Type | Prefixes | Index Options | Indexing |
|--------------|----------------|----------------|-----------------|------------|
| users2 | HASH | ['doc:users2'] | [] | 0 |

Index Fields:

| Name | Attribute | Type | Field Option | Option Value |
|----------------|----------------|---------|----------------|----------------|
| user | user | TEXT | WEIGHT | 1 |
| job | job | TEXT | WEIGHT | 1 |
| content | content | TEXT | WEIGHT | 1 |
| credit_score | credit_score | TAG | SEPARATOR | , |
| age | age | NUMERIC | | |
| content_vector | content_vector | VECTOR | | |


and throw a warning to the user (log output) that the generated schema
does not match the specified schema.

```text
index_schema does not match generated schema from metadata.
index_schema: {'text': [{'name': 'user'}, {'name': 'job'}], 'tag': [{'name': 'credit_score'}], 'numeric': [{'name': 'age'}]}
generated_schema: {'text': [{'name': 'user'}, {'name': 'job'}, {'name': 'credit_score'}], 'numeric': [{'name': 'age'}]}
```

As long as the mismatch is intentional, this is fine.

The schema can be defined as a yaml file or a dictionary

```yaml

text:
  - name: user
  - name: job
tag:
  - name: credit_score
numeric:
  - name: age

```

and you pass in a path like

```python
from pathlib import Path

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users3",
    index_schema=Path("sample1.yml").resolve()
)
```

Which will create the same schema as defined in the dictionary example


Index Information:

| Index Name | Storage Type | Prefixes | Index Options | Indexing |
|--------------|----------------|----------------|-----------------|------------|
| users3 | HASH | ['doc:users3'] | [] | 0 |

Index Fields:

| Name | Attribute | Type | Field Option | Option Value |
|----------------|----------------|---------|----------------|----------------|
| user | user | TEXT | WEIGHT | 1 |
| job | job | TEXT | WEIGHT | 1 |
| content | content | TEXT | WEIGHT | 1 |
| credit_score | credit_score | TAG | SEPARATOR | , |
| age | age | NUMERIC | | |
| content_vector | content_vector | VECTOR | | |



### Custom Vector Indexing Schema

Users with large use cases may want to change how the vector index
created by LangChain is configured.

To utilize all the features of Redis for vector database use cases like
this, you can now do the following to pass in index attribute modifiers
like changing the indexing algorithm to HNSW.

```python
vector_schema = {
    "algorithm": "HNSW"
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users3",
    vector_schema=vector_schema
)

```

A more complex example may look like

```python
vector_schema = {
    "algorithm": "HNSW",
    "ef_construction": 200,
    "ef_runtime": 20
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users3",
    vector_schema=vector_schema
)
```

All names correspond to the arguments you would set if using Redis-py or
RedisVL. (put in doc link later)


### Better Querying

Both vector queries and Range (limit) queries are now available and
metadata is returned by default. The outputs are shown.

```python
>>> query = "foo"
>>> results = rds.similarity_search(query, k=1)
>>> print(results)
[Document(page_content='foo', metadata={'user': 'derrick', 'job': 'doctor', 'credit_score': 'low', 'age': '14', 'id': 'doc:users:657a47d7db8b447e88598b83da879b9d', 'score': '7.15255737305e-07'})]

>>> results = rds.similarity_search_with_score(query, k=1, return_metadata=False)
>>> print(results) # no metadata, but with scores
[(Document(page_content='foo', metadata={}), 7.15255737305e-07)]

>>> results = rds.similarity_search_limit_score(query, k=6, score_threshold=0.0001)
>>> print(len(results)) # range query (only above threshold even if k is higher)
4
```

### Custom metadata filtering

A big advantage of Redis in this space is being able to do filtering on
data stored alongside the vector itself. With the example above, the
following is now possible in langchain. The equivalence operators are
overridden to describe a new expression language that mimics that of
[redisvl](redisvl.com). This allows for arbitrarily long sequences of
filters that resemble SQL commands that can be used directly with vector
queries and range queries.

There are two interfaces by which to do so and both are shown. 

```python

>>> from langchain.vectorstores.redis import RedisFilter, RedisNum, RedisText

>>> age_filter = RedisFilter.num("age") > 18
>>> age_filter = RedisNum("age") > 18 # equivalent
>>> results = rds.similarity_search(query, filter=age_filter)
>>> print(len(results))
3

>>> job_filter = RedisFilter.text("job") == "engineer" 
>>> job_filter = RedisText("job") == "engineer" # equivalent
>>> results = rds.similarity_search(query, filter=job_filter)
>>> print(len(results))
2

# fuzzy match text search
>>> job_filter = RedisFilter.text("job") % "eng*"
>>> results = rds.similarity_search(query, filter=job_filter)
>>> print(len(results))
2


# combined filters (AND)
>>> combined = age_filter & job_filter
>>> results = rds.similarity_search(query, filter=combined)
>>> print(len(results))
1

# combined filters (OR)
>>> combined = age_filter | job_filter
>>> results = rds.similarity_search(query, filter=combined)
>>> print(len(results))
4
```

All the above filter results can be checked against the data above.
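
To make the operator-overloading idea concrete, here is a minimal, self-contained sketch of how comparison and bitwise operators can be overridden to build a query string. The classes `FilterExpr`, `NumField` and `TextField` are hypothetical stand-ins, not the `RedisFilter`, `RedisNum` or `RedisText` classes from this PR, and the emitted strings only assume standard RediSearch query syntax.

```python
class FilterExpr:
    """Hypothetical filter expression wrapping a RediSearch-style query string."""

    def __init__(self, query: str) -> None:
        self.query = query

    def __and__(self, other: "FilterExpr") -> "FilterExpr":
        # Space-separated clauses are intersected by the query engine.
        return FilterExpr(f"({self.query} {other.query})")

    def __or__(self, other: "FilterExpr") -> "FilterExpr":
        # "|" is the union operator in the query syntax.
        return FilterExpr(f"({self.query} | {other.query})")

    def __str__(self) -> str:
        return self.query


class NumField:
    """Hypothetical numeric field; `>` returns an expression, not a bool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __gt__(self, value: float) -> FilterExpr:
        # "(" marks an exclusive lower bound in a numeric range.
        return FilterExpr(f"@{self.name}:[({value} +inf]")


class TextField:
    """Hypothetical text field; `==` returns an expression, not a bool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value: object) -> FilterExpr:  # type: ignore[override]
        return FilterExpr(f'@{self.name}:("{value}")')


age_filter = NumField("age") > 18
job_filter = TextField("job") == "engineer"
print(age_filter & job_filter)  # combined with AND
print(age_filter | job_filter)  # combined with OR
```

Because each operator returns another expression object, filters compose to arbitrary depth before being rendered once into a query string.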


### Other

  - Issue: #3967 
  - Dependencies: No added dependencies
  - Tag maintainer: @hwchase17 @baskaryan @rlancemartin 
  - Twitter handle: @sampartee

---------

Co-authored-by: Naresh Rangan <naresh.rangan0@walmart.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Anish Shah fa0b8f3368
fix broken wandb link in debugging page (#9771)
- Description: Fix broken hyperlink in debugging page
1 year ago
Monami Sharma 12a373810c
Fixing broken links to Moderation and Constitutional chain (#9768)
- Description: Fixing broken links for Moderation and Constitutional
chain
  - Issue: N/A
  - Twitter handle: MonamiSharma
1 year ago
nikhilkjha d57d08fd01
Initial commit for comprehend moderator (#9665)
This PR implements a custom chain that wraps Amazon Comprehend API
calls. The custom chain is aimed to be used with LLM chains to provide
moderation capability that lets you detect and redact PII, Toxic and
Intent content in the LLM prompt, or the LLM response. The
implementation accepts a configuration object to control what checks
will be performed on an LLM prompt and can be used in a variety of setups
using the LangChain expression language to not only detect the
configured info in chains, but also other constructs such as a
retriever.
The included sample notebook goes over the different configuration
options and how to use it with other chains.

###  Usage sample
```python
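import boto3
from langchain.chains import LLMChain
from langchain.llms.fake import FakeListLLM
from langchain.prompts import PromptTemplate
from langchain_experimental.comprehend_moderation import (
    AmazonComprehendModerationChain,
    BaseModerationActions,
    BaseModerationFilters,
)

# Assumed setup: a Boto3 Comprehend client; adjust the region to your account.
comprehend_client = boto3.client("comprehend", region_name="us-east-1")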

moderation_config = { 
        "filters":[ 
                BaseModerationFilters.PII, 
                BaseModerationFilters.TOXICITY,
                BaseModerationFilters.INTENT
        ],
        "pii":{ 
                "action": BaseModerationActions.ALLOW, 
                "threshold":0.5, 
                "labels":["SSN"],
                "mask_character": "X"
        },
        "toxicity":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5
        },
        "intent":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5
        }
}

comp_moderation_with_config = AmazonComprehendModerationChain(
    moderation_config=moderation_config, #specify the configuration
    client=comprehend_client,            #optionally pass the Boto3 Client
    verbose=True
)

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = ( 
    prompt 
    | comp_moderation_with_config 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comp_moderation_with_config 
)

response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})

print(response['output'])


```
### Output
```
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii validation...
Found PII content..stopping..
The prompt contains PII entities and cannot be processed
```

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lance Martin 4339d21cf1
Code LLaMA in code understanding use case (#9779)
Update Code Understanding use case doc w/ Code-llama.
1 year ago
Lance Martin 2ab04a4e32
Update agent docs, move to use-case sub-directory (#9344)
Re-structure and add new agent page
1 year ago
Lance Martin 985873c497
Update RAG use case (move to ntbk) (#9340) 1 year ago
Harrison Chase 709a67d9bf
multivector notebook (#9740) 1 year ago
Fabrizio Ruocco cacaf487c3
Azure Cognitive Search - update sdk b8, mod user agent, search with scores (#9191)
Description: Update Azure Cognitive Search SDK to version b8 (breaking
change)
Customizable User Agent.
Implemented Similarity search with scores 

@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Margaret Qian 30151c99c7
Update Mosaic endpoint input/output api (#7391)
As noted in prior PRs (https://github.com/hwchase17/langchain/pull/6060,
https://github.com/hwchase17/langchain/pull/7348), the input/output
format has changed a few times as we've stabilized our inference API.
This PR updates the API to the latest stable version as indicated in our
docs: https://docs.mosaicml.com/en/latest/inference.html

The input format looks like this:

`{"inputs": [<prompt>]}`

The output format looks like this:

`{"outputs": [<output_text>]}`

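
As a small illustration of the two formats above, the sketch below builds a request body and reads the first entry of a response body. The helper names `build_request` and `parse_response` are hypothetical, and no network call or endpoint URL is assumed; only the JSON shapes come from this description.

```python
import json


def build_request(prompt: str) -> str:
    """Hypothetical helper: wrap a single prompt in the documented input format."""
    return json.dumps({"inputs": [prompt]})


def parse_response(body: str) -> str:
    """Hypothetical helper: return the first generated text from a response body."""
    data = json.loads(body)
    outputs = data["outputs"]
    if not outputs:
        raise ValueError("response contained no outputs")
    return outputs[0]


request_body = build_request("Say hello")
# A made-up response body in the documented output format.
reply = parse_response('{"outputs": ["Hello!"]}')
print(request_body, reply)
```
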
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase ade482c17e
add twitter chat loader doc (#9737) 1 year ago
Leonid Kuligin 87da56fb1e
Added a pdf parser based on DocAI (#9579)
#9578

---------

Co-authored-by: Leonid Kuligin <kuligin@google.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Tudor Golubenco dc30edf51c
Xata as a chat message memory store (#9719)
This adds Xata as a memory store also to the python version of
LangChain, similar to the [one for
LangChain.js](https://github.com/hwchase17/langchainjs/pull/2217).

I have added a Jupyter Notebook with a simple and a more complex example
using an agent.

To run the integration test, you need to execute something like:

```
XATA_API_KEY='xau_...' XATA_DB_URL="https://demo-uni3q8.eu-west-1.xata.sh/db/langchain"  poetry run pytest tests/integration_tests/memory/test_xata.py
```

Where `langchain` is the database you create in Xata.
1 year ago
William FH dff00ea91e
Chat Loaders (#9708)
Still working out interface/notebooks + need discord data dump to test
out things other than copy+paste

Update:
- Going to remove the 'user_id' arg in the loaders themselves and just
standardize on putting the "sender" arg in the extra kwargs. Then can
provide a utility function to map these to ai and human messages
- Going to move the discord one into just a notebook since I don't have
a good dump to test on and copy+paste maybe isn't the greatest thing to
support in v0
- Need to do more testing on slack since it seems the dump only includes
channels and NOT 1 on 1 convos

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Bagatur 22b6549a34
sort api classes (#9710) 1 year ago
Tomaz Bratanic dacf96895a
Add the option to use separate LLMs for GraphCypherQA chain (#9689)
The graph chains differ from the RetrievalQA chains in that they use two
LLMChains instead of one. As a result, you may sometimes want to use one
LLM to generate the database query and a different LLM to generate
the final answer.

This feature would make it more convenient to use different LLMs in the
same chain.
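
A minimal sketch of the two-step flow, assuming plain callables in place of real models: one callable writes the query, a second one phrases the answer. The function `graph_qa`, its parameters and the fake models are hypothetical and do not reflect the `GraphCypherQAChain` signature.

```python
from typing import Callable, List

LLM = Callable[[str], str]


def graph_qa(
    question: str,
    query_llm: LLM,
    answer_llm: LLM,
    run_query: Callable[[str], List[dict]],
) -> str:
    """Hypothetical two-stage flow: one model writes the query, another answers."""
    cypher = query_llm(f"Write a Cypher query for: {question}")
    rows = run_query(cypher)
    return answer_llm(f"Question: {question}\nContext: {rows}\nAnswer:")


# Fake components so the sketch runs without a database or a real model.
def fake_query_llm(prompt: str) -> str:
    return "MATCH (m:Movie) RETURN count(m) AS n"


def fake_answer_llm(prompt: str) -> str:
    return "There are 3 movies." if "'n': 3" in prompt else "I don't know."


def fake_run_query(cypher: str) -> List[dict]:
    return [{"n": 3}]


answer = graph_qa("How many movies are there?", fake_query_llm, fake_answer_llm, fake_run_query)
print(answer)
```

Keeping the two models as separate arguments lets a stronger model handle query generation while a cheaper one phrases the final answer, or the reverse.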

I have also renamed the Graph DB QA Chain to Neo4j DB QA Chain in the
documentation only, as it is used only for Neo4j. The naming was
ambiguous as it was the first graph QA chain added, and I wasn't sure how
you want to spin it.
1 year ago
Lance Martin c37be7f5fb
Add Code LLaMA to code QA use case (#9713)
Use [Ollama integration](https://ollama.ai/blog/run-code-llama-locally).
1 year ago
Leonid Ganeline cf792891f1
📖 docs: compact api reference (#8651)
Updated design of the "API Reference" text
Here is an example of the current format:

![image](https://github.com/langchain-ai/langchain/assets/2256422/8727f2ba-1b69-497f-aa07-07f939b6da3b)

It changed to
`langchain.retrievers.ElasticSearchBM25Retriever` format. The same
format as it is in the API Reference Toc.

It also resembles code: 
`from langchain.retrievers import ElasticSearchBM25Retriever` (namespace
THEN class_name)

Current format is
`ElasticSearchBM25Retriever from langchain.retrievers` (class_name THEN
namespace)

This change is in line with other formats and improves readability.

 @baskaryan
1 year ago
Patrick Loeber 6bedfdf25a
Fix docs for AssemblyAIAudioTranscriptLoader (shorter import path) (#9687)
Uses the shorter import path

`from langchain.document_loaders import` instead of the full path
`from langchain.document_loaders.assemblyai`

Applies those changes to the docs and the unit test.

See #9667 that adds this new loader.
1 year ago
了空 7cf5c582d2
Added a link to the dependencies document (#9703) 1 year ago
Harrison Chase 9963b32e59
Harrison/multi vector (#9700) 1 year ago
Leonid Ganeline b048236c1a
📖 docs: `integrations/agent_toolkits` (#9333)
Note: There are no changes in the file names!

- The group name on the main navbar changed: `Agent toolkits` -> `Agents
& Toolkits`. Examples here are the mix of the Agent and Toolkit examples
because Agents and Toolkits in examples are always used together.
- Titles changed: removed "Agent" and "Toolkit" suffixes. The reason is
the same.
- Formatting: mostly cleaning the header structure, so it could be
better on the right-side navbar.

Main navbar is looking much cleaner now.
1 year ago
Patrick Loeber 5990651070
Add new document_loader: AssemblyAIAudioTranscriptLoader (#9667)
This PR adds a new document loader `AssemblyAIAudioTranscriptLoader`
that transcribes audio files with the [AssemblyAI
API](https://www.assemblyai.com) and loads the transcribed text into
documents.

- Add new document_loader with class `AssemblyAIAudioTranscriptLoader`
- Add optional dependency `assemblyai`
- Add unit tests (using a Mock client)
- Add docs notebook

This is the equivalent to the JS integration already available in
LangChain.js. See the [LangChain JS docs AssemblyAI
page](https://js.langchain.com/docs/modules/data_connection/document_loaders/integrations/web_loaders/assemblyai_audio_transcription).

At its simplest, you can use the loader to get a transcript back from an
audio file like this:

```python
from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader

loader = AssemblyAIAudioTranscriptLoader(file_path="./testfile.mp3")
docs = loader.load()
```

To use it, it needs the `assemblyai` python package installed, and the
environment variable `ASSEMBLYAI_API_KEY` set with your API key.
Alternatively, the API key can also be passed as an argument.

Twitter handles to shout out if so kindly 🙇
[@AssemblyAI](https://twitter.com/AssemblyAI) and
[@patloeber](https://twitter.com/patloeber)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
seamusp 25f2c82ae8
docs:misc fixes (#9671)
Improve internal consistency in LangChain documentation
- Change occurrences of eg and eg. to e.g.
- Fix headers containing unnecessary capital letters.
- Change instances of "few shot" to "few-shot".
- Add periods to end of sentences where missing.
- Minor spelling and grammar fixes.
1 year ago
Eugene Yurtsev b88dfcb42a
Add indexing support (#9614)
This PR introduces a persistence layer to help with indexing workflows
into vectorstores.

The indexing code helps users to:

1. Avoid writing duplicated content into the vectorstore
2. Avoid over-writing content if it's unchanged

Importantly, this keeps on working even if the content being written is
derived via a set of transformations from some source content (e.g.,
indexing children documents that were derived from parent documents by
chunking).

The two main components are:

1. Persistence layer that keeps track of which keys were updated and
   when. Keeping track of the timestamp of updates allows cleaning up old
   content safely, and with minimal complexity.
2. HashedDocument, which is used to hash the contents (including
   metadata) of the documents. We rely on the hashes for identifying
   duplicates.


The indexing code works with **ANY** document loader. To add
transformations to the documents, users for now can add a custom
document loader that composes an existing loader together with document
transformers.
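
The hash-and-skip idea can be sketched with the standard library alone. Everything below is a hypothetical illustration: `hash_document`, `InMemoryRecordManager` and `index_documents` are made-up names, the "vectorstore" is a plain list, and the real persistence layer in this PR is not reproduced here.

```python
import hashlib
import json
import time
from typing import Dict, Iterable, List, Tuple

Doc = Tuple[str, dict]  # (content, metadata)


def hash_document(content: str, metadata: dict) -> str:
    """Hash content together with metadata so either change yields a new key."""
    payload = json.dumps({"content": content, "metadata": metadata}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryRecordManager:
    """Hypothetical persistence layer: remembers which keys were written and when."""

    def __init__(self) -> None:
        self.updated_at: Dict[str, float] = {}

    def exists(self, key: str) -> bool:
        return key in self.updated_at

    def update(self, key: str) -> None:
        self.updated_at[key] = time.time()


def index_documents(docs: Iterable[Doc], records: InMemoryRecordManager, store: List[tuple]) -> int:
    """Write only documents whose hash has not been seen; return how many were written."""
    written = 0
    for content, metadata in docs:
        key = hash_document(content, metadata)
        if records.exists(key):
            continue  # duplicate or unchanged content: skip the write
        store.append((key, content, metadata))
        records.update(key)
        written += 1
    return written


records, store = InMemoryRecordManager(), []
docs = [("a", {"source": "x"}), ("a", {"source": "x"}), ("b", {"source": "x"})]
first_run = index_documents(docs, records, store)
second_run = index_documents(docs, records, store)
print(first_run, second_run)
```

The stored timestamps are what would let a cleanup step find entries that were not touched by the latest run; that step is left out of this sketch.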

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lakshay Kansal a8c916955f
Updates to Nomic Atlas and GPT4All documentation (#9414)
Description: Updates for Nomic AI Atlas and GPT4All integrations
documentation.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Keras Conv3d cbaea8d63b
tair fix distance_type error, and add hybrid search (#9531)
- fix: distance_type error, 
- feature: Tair add hybrid search

---------

Co-authored-by: thw <hanwen.thw@alibaba-inc.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Jacob Lee 278ef0bdcf
Adds ChatOllama (#9628)
@rlancemartin

---------

Co-authored-by: Adilkhan Sarsen <54854336+adolkhan@users.noreply.github.com>
Co-authored-by: Kim Minjong <make.dirty.code@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 80dd162e0d
mv embedding cache docs (#9664) 1 year ago
Bagatur a40c12bb88
Update the nlpcloud connector after some changes on the NLP Cloud API (#9586)
- Description: remove some text generation deprecated parameters and
update the embeddings doc,
- Tag maintainer: @rlancemartin
1 year ago
Bagatur d8e2dd4c89 mv 1 year ago
Bagatur e2e582f1f6
Fixed source key name for docugami loader (#8598)
The Docugami loader was not returning the source metadata key. This was
triggering this exception when used with retrievers, per
https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/schema/prompt_template.py#L193C1-L195C41

The fix is simple and just updates the metadata key name for the
document each chunk is sourced from, from "name" to "source" as
expected.

I tested by running the python notebook that has an end to end scenario
in it.

Tagging DataLoader maintainers @rlancemartin @eyurtsev
1 year ago
Zizhong Zhang 8a03836160
docs: fix PromptGuard docs (#9659)
Fix PromptGuard docs. Noticed several trivial issues on the docs when
integrating the new class.
cc @baskaryan
1 year ago
Yong woo Song f0ae10a20e
Fix typo in tigris (#9637)
The link has a **typo** in [tigirs
docs](https://python.langchain.com/docs/integrations/providers/tigris),
so I couldn't access it. So, I have corrected it.
Thanks! ☺️
1 year ago
Junlin Zhou 5b9bdcac1b
docs: fix link url (#9643)
This pull request corrects the URL links in the Async API documentation
to align with the updated project layout. The links had not been updated
despite the changes in layout.
1 year ago
Aashish Saini eb92da84a1
Fixings grammatical errors in Doc Files (#9647)
Fixing some typos and grammatical errors in doc files.

@eyurtsev , @baskaryan 

Thanks

---------

Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
Co-authored-by: Ishita Chauhan <136303787+IshitaChauhanShortHillsAI@users.noreply.github.com>
1 year ago
Joseph McElroy 2a06e7b216
ElasticsearchStore: improve error logging for adding documents (#9648)
It is not obvious what the error is when you cannot index. This PR adds the
ability to log the first error's reason, to help the user diagnose the
issue.

Also added some more documentation for when you want to use the
vectorstore with an embedding model deployed in elasticsearch.

Credit: @elastic and @phoey1
1 year ago
Julien Salinas f1072cc31f
Merge branch 'master' into master 1 year ago
Leonid Ganeline e1f4f9ac3e
docs: `integrations/providers` (#9631)
Added missed pages for `integrations/providers` from `vectorstores`.
Updated several `vectorstores` notebooks.
1 year ago
anifort 900c1f3e8d
Add support for structured data sources with google enterprise search (#9037)
- Description: Added the capability to handle structured data from
google enterprise search,
- Issue: Retriever failed when the underlying search engine was integrated
with structured data,
  - Dependencies: google-api-core
  - Tag maintainer: @jarokaz
  - Twitter handle: anifort

---------

Co-authored-by: Christos Aniftos <aniftos@google.com>
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Jacob Lee 632a83c48e
Update ChatOpenAI docs with fine-tuning example (#9632) 1 year ago
Adilkhan Sarsen f29312eb84
Fixing deeplake.mdx file as it uses outdated links (#9602)
deeplake.mdx was using old links and was not working properly. This PR
fixes the issue.
1 year ago
klae01 b868ef23bc
Add AINetwork blockchain toolkit integration (#9527)
# Description
This PR introduces a new toolkit for interacting with the AINetwork
blockchain. The toolkit provides a set of tools for performing various
operations on the AINetwork blockchain, such as transferring AIN,
reading and writing values to the blockchain database, managing apps,
setting rules and owners.

# Dependencies
[ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2

# Misc
The example notebook
(langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the
PR

---------

Co-authored-by: kriii <kriii@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Vanessa Arndorfer 1ea2f9adf4
Document AzureML Deployment Example (#9571)
Description: Link an example of deploying a Langchain app to an AzureML
online endpoint to the deployments documentation page.

Co-authored-by: Vanessa Arndorfer <vaarndor@microsoft.com>
1 year ago
toddkim95 fba29f203a
Add support for Polars (#9610)
### Description
Polars is a DataFrame interface on top of an OLAP Query Engine
implemented in Rust.
Polars is faster to read than pandas, so I'm looking forward to seeing
it added to the document loader.

### Dependencies
polars (https://pola-rs.github.io/polars-book/user-guide/)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Julien Salinas 4d0b7bb8e1 Remove Dolphin and GPT-J from the embeddings docs.
These models are no longer offered.
1 year ago
Jeremy Suriel 0fa4516ce4
Fix typo (#9565)
Corrected a minor documentation typo here:
https://python.langchain.com/docs/modules/model_io/models/llms/#generate-batch-calls-richer-outputs
1 year ago
Jacob Lee 0fea987dd2
Add missing param to parent document retriever notebook (#9569) 1 year ago
Zizhong Zhang 00eff8c4a7
feat: Add PromptGuard integration (#9481)
Add PromptGuard integration
-------
There are two approaches to integrate PromptGuard with a LangChain
application.

1. PromptGuardLLMWrapper
2. functions that can be used in LangChain expression.

-----
- Dependencies
`promptguard` python package, which is a runtime requirement if you'd
try out the demo.

- @baskaryan @hwchase17 Thanks for the ideas and suggestions along the
development process.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Oleksandr Ichenskyi 8bc1a3dca8
docs: Add memgraph notebook (#9448)
- Description: added graph_memgraph_qa.ipynb which shows how to use LLMs
to provide a natural language interface to a Memgraph database using
[MemgraphGraph](https://github.com/langchain-ai/langchain/pull/8591)
class.
- Dependencies: given that the notebook utilizes the MemgraphGraph
class, it relies on both this class and several Python packages that are
installed in the notebook using pip (langchain, openai, neo4j,
gqlalchemy). The notebook is dependent on having a functional Memgraph
instance running, as it requires this instance to establish a
connection.
1 year ago
Bagatur d09cdb4880
update data connection -> retrieval (#9561) 1 year ago
Matthew Zeiler 949b2cf177
Improvements to the Clarifai integration (#9290)
- Improved docs
- Improved performance in multiple ways through batching, threading,
etc.
 - fixed error message 
 - Added support for metadata filtering during similarity search.

@baskaryan PTAL
1 year ago
ricki-epsilla 66a47d9a61
add Epsilla vectorstore (#9239)
[Epsilla](https://github.com/epsilla-cloud/vectordb) vectordb is an
open-source vector database that leverages the advanced academic
parallel graph traversal techniques for vector indexing.
This PR adds basic integration with
[pyepsilla](https://github.com/epsilla-cloud/epsilla-python-client)(Epsilla
vectordb python client) as a vectorstore.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 4999e8af7e
pin pydantic api ref build (#9556) 1 year ago
axiangcoding 05aa02005b
feat(llms): support ERNIE Embedding-V1 (#9370)
- Description: support [ERNIE
Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu),
which is part of ERNIE ecology
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
José Ferraz Neto f116e10d53
Add SharePoint Loader (#4284)
- Added a loader (`SharePointLoader`) that can pull documents (`pdf`,
`docx`, `doc`) from the [SharePoint Document
Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872).
- Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that
use [O365](https://github.com/O365/python-o365) Package
- Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Utku Ege Tuluk bb4f7936f9
feat(llms): add streaming support to textgen (#9295)
- Description: Added streaming support to the textgen component in the
llms module.
  - Dependencies: websocket-client = "^1.6.1"
1 year ago
Harrison Chase 9930ddc555
beef up retrieval docs (#9518) 1 year ago
Leonid Ganeline fdbeb52756
`Qwen` model example (#9516)
Added an example for the `Qwen-7B` model on `HuggingFaceHub` 🤗
1 year ago
Martin Schade 0c8a88b3fa
AmazonTextractPDFLoader documentation updates (#9415)
Description: Updating documentation to add AmazonTextractPDFLoader
according to
[comment](https://github.com/langchain-ai/langchain/pull/8661#issuecomment-1666572992)
from [baskaryan](https://github.com/baskaryan)

Adding one notebook and instructions to the
modules/data_connection/document_loaders/pdf.mdx

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Asif Ahmad 08feed3332
Changed the NIBittensorLLM API URL to the correct one (#9419)
Changed https://api.neuralinterent.ai/ to https://api.neuralinternet.ai/
which is the valid URL for the API of NIBittensorLLM.
1 year ago
EpixMan 103094286e
Fixing class calling error in the documentation of connecting_to_a_feature_store.ipynb (#9508) 1 year ago
IlyaKIS1 fd8fe209cb
Added In-Depth Langchain Agent Execution Guide (#9507)
Made a Notion document explaining how LangChain executes agents, method
by method, in the codebase.
It can be helpful for developers who have just started working with the
LangChain codebase.
1 year ago
Rosário P. Fernandes 09a92bb9bf
chatbots use case - fix broken Colab URL (#9491)
The current Colab URL returns a 404, since there is no `chatbots`
directory under `use_cases`.

1 year ago
bsenst a956b69720
fix typo in huggingface_hub.ipynb (#9499) 1 year ago
Bagatur d87cfd33e8
Update pydantic compatibility guide (#9496) 1 year ago
Taqi Jaffri 069c0a041f comment update for poetry install 1 year ago
Taqi Jaffri 5cd244e9b7 CR feedback 1 year ago
Ikko Eltociear Ashimine 0808949e54
Fix typo in apis.ipynb (#9490)
funtions -> functions
1 year ago
RajneeshSinghShorthillsAI 129d056085
fixed spelling mistake and added missing bracket in parent_document_r… (#9380)
…etriever.ipynb


Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Matt Robinson 83d2a871eb
fix: apply unstructured preprocess functions (#9473)
### Summary

Fixes a bug from #7850 where post-processing functions in Unstructured
loaders were not applied. Adds an assertion to the test to verify the
post-processing function was applied and also updates the explanation in the
example notebook.
1 year ago
NavanitDubeyShorthillsAI b58d492e05
Update pydantic_compatibility.md (#9382)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
bsenst 083726ecda
fix small typo (#9464) 1 year ago
Leonid Ganeline 99e5eaa9b1
`InternLM` example (#9465)
Added `InternLM` model example to the Hugging Face Hub notebook
1 year ago
William FH d4f790fd40
Fix imports in notebook (#9458) 1 year ago
AmitSinghShorthillsAI 2b06792c81
Fixing spelling mistakes in fallbacks.ipynb (#9376)
Fix spelling errors in the text: 'Therefore' and 'Retrying'

I want to stress that your feedback is invaluable to us and is genuinely
cherished.
With gratitude,
@baskaryan  @hwchase17
1 year ago
PuneetDhimanShorthillsAI 61e4a06447
Corrected Sentence in router.ipynb (#9377)
Added missing question marks in the lines in the router.ipynb

@baskaryan @hwchase17
1 year ago
呂安 ead04487fd
doc: make install from source clearer (#9433)
Description: running just `pip install -e .` will not install anything; we
have to change to the right directory before running `pip install -e .`
1 year ago
Leonid Ganeline edcb03943e
👀 docs: updated `dependents` (#9426)
Updated statistics (the previous statistics were taken 1+ month ago).
A lot of new dependents and more stars.
1 year ago
Holmodi 89a8121eaa
Fix a dead loop bug caused by assigning two variables with opposite values. (#9447)
- Description: Fix a dead loop bug caused by assigning two variables
with opposite values.
1 year ago
Lance Martin 589927e9e1
Update figure in OSS model guide (#9399) 1 year ago
Bagatur 5d60ced7b3
pydantic compatibility guide fix (#9418) 1 year ago
Bagatur 0c4683ebcc
Revert "Update compatibility guide for pydantic (#9396)" (#9417) 1 year ago
Eugene Yurtsev b11c233304
Update compatibility guide for pydantic (#9396)
Use langchain.pydantic_v1 instead of pydantic_v1
1 year ago
Leonid Kuligin 019aa04b06
fixed a pal chain reference (#9387)
#9386

Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Sanskar Tanwar c194828be0
Fixed Typo in Fallbacks.ipynb (#9373)
Removed extra "the" in the sentence about the chicken crossing the road
in fallbacks.ipynb. The sentence now reads correctly: "Why did the
chicken cross the road?" This resolves the grammatical error and
improves the overall quality of the content.

@baskaryan , @hinthornw , @hwchase17
1 year ago
AashutoshPathakShorthillsAI c71afb46d1
Corrected Sentence in .ipynb File (#9372)
Fixed grammatical errors in the sentence by repositioning the word "are"
for improved clarity and readability.

 @baskaryan @hwchase17 @hinthornw
1 year ago
Akshay Tripathi de8dfde7f7
Corrected Grammatical errors in tutorials.mdx (#9358)
I want to extend my heartfelt gratitude to the creator for masterfully
crafting this remarkable application. 🙌 I am truly impressed by the
meticulous attention to grammar and spelling in the documentation, which
undoubtedly contributes to a polished and seamless reader experience.

As always, your feedback holds immense value and is greatly appreciated.

@baskaryan , @hwchase17
1 year ago
Md Nazish Arman e842131425
Fixed Grammatical errors in tutorials.mdx (#9359)
I want to convey my deep appreciation to the creator for their expert
craftsmanship in developing this exceptional application. 👏 The
remarkable dedication to upholding impeccable grammar and spelling in
the documentation significantly enhances the polished and seamless
experience for readers.

I want to stress that your feedback is invaluable to us and is genuinely
cherished.

With gratitude,
@baskaryan, @hwchase17
1 year ago
AnujMauryaShorthillsAI 6dedd94ba4
Update "Langchain" to "LangChain" in the tutorials.mdx file (#9361)
In this commit, I have made a modification to the term "Langchain" to
correctly reflect the project's name as "LangChain". This change ensures
consistency and accuracy throughout the codebase and documentation.

@baskaryan , @hwchase17
1 year ago
Adarsh Shrivastav c5e23293f8
Corrected Typo in MultiPromptChain Example in router.ipynb (#9362)
Refined the example in router.ipynb by addressing a minor typographical
error. The typo "rins" has been corrected to "rains" in the code snippet
that demonstrates the usage of the MultiPromptChain. This change ensures
accuracy and consistency in the provided code example.

This improvement enhances the readability and correctness of the
notebook, making it easier for users to understand and follow the
demonstration. The commit aims to maintain the quality and accuracy of
the content within the repository.

Thank you for your attention to detail, and please review the change at
your convenience.

@baskaryan , @hwchase17
1 year ago
AbhishekYadavShorthillsAI 90d7c55343
Fix Typo in "community.md" (#9360)
Corrected a typographical error in the "community.md" file by removing
an extra word from the sentence.

@baskaryan , @hwchase17
1 year ago
Angel Luis 2e8733cf54
Fix typo in huggingface_textgen_inference.ipynb (#9313)
Replaced incorrect `stream` parameter by `streaming` on Integrations
docs.
1 year ago
Lance Martin b04e472acf
Open source LLM guide (#9266)
Guide for using open source LLMs locally.
1 year ago
Eugene Yurtsev 090411842e
Fix API reference docs (#9321)
Do not document members nested within any private component
1 year ago
Eugene Yurtsev 0f9f213833
Pydantic Compatibility (#9327)
Pydantic Compatibility Guidelines for migration plan + debugging
1 year ago
Chandler May 15f1af8ed6
Fix variable case in code snippet in docs (#9311)
- Description: Fix a minor variable naming inconsistency in a code
snippet in the docs
  - Issue: N/A
  - Dependencies: none
  - Tag maintainer: N/A
  - Twitter handle: N/A
1 year ago
Michael Bianco 23928a3311
docs: remove multiple code blocks from comma-separated docs (#9323) 1 year ago
Navanit Dubey 3e6cea46e2
Guide import readable json (#9291) 1 year ago
axiangcoding 63601551b1
fix(llms): improve the ernie chat model (#9289)
- Description: improve the ernie chat model.
   - fix missing kwargs to payload
   - new test cases
   - add some debug level log
   - improve description
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan
1 year ago
Daniel Chalef 1d55141c50
zep/new ZepVectorStore (#9159)
- new ZepVectorStore class
- ZepVectorStore unit tests
- ZepVectorStore demo notebook
- update zep-python to ~1.0.2

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur b9ca5cc5ea
update guide import (#9279) 1 year ago
Bagatur afba2be3dc
update openai functions docs (#9278) 1 year ago
Bagatur 9abf60acb6
Bagatur/vectara regression (#9276)
Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
1 year ago
Xiaoyu Xee b30f449dae
Add dashvector vectorstore (#9163)
## Description
Add `Dashvector` vectorstore for langchain

- [dashvector quick
start](https://help.aliyun.com/document_detail/2510223.html)
- [dashvector package description](https://pypi.org/project/dashvector/)

## How to use
```python
from langchain.vectorstores.dashvector import DashVector

dashvector = DashVector.from_documents(docs, embeddings)
```

---------

Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur bfbb97b74c
Bagatur/deeplake docs fixes (#9275)
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
1 year ago
Kunj-2206 1b3942ba74
Added BittensorLLM (#9250)
Description: Adding NIBittensorLLM via Validator Endpoint to langchain
llms
Tag maintainer: @Kunj-2206

Maintainer responsibilities:
    Models / Prompts: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Toshish Jawale 852722ea45
Improvements in Nebula LLM (#9226)
- Description: Added improvements in Nebula LLM to perform auto-retry;
more generation parameters supported. Conversation is no longer required
to be passed in the LLM object. Examples are updated.
  - Issue: N/A
  - Dependencies: N/A
  - Tag maintainer: @baskaryan 
  - Twitter handle: symbldotai

---------

Co-authored-by: toshishjawale <toshish@symbl.ai>
1 year ago
Bagatur 1aae77f26f
fix context nb (#9267) 1 year ago
Alex Gamble cf17c58b47
Update documentation for the Context integration with new URL and features (#9259)
Update documentation and URLs for the Langchain Context integration.

We've moved from getcontext.ai to context.ai \o/

Thanks in advance for the review!
1 year ago
Joseph McElroy 5e9687a196
Elasticsearch self-query retriever (#9248)
Now with the ElasticsearchStore VectorStore merged, I've added support for
the self-query retriever.

I've also added a notebook to demonstrate the capability, along with
unit tests.

**Credit**
@elastic and @phoey1 on twitter.
1 year ago
Anthony Mahanna 0a04e63811
docs: Update ArangoDB Links (#9251)
ready for review 

- mdx link update
- colab link update
1 year ago
Hech 4b505060bd
fix: max_marginal_relevance_search and docs in Dingo (#9244) 1 year ago
axiangcoding 664ff28cba
feat(llms): support ernie chat (#9114)
Description: support ernie (文心一言) chat model
Related issue: #7990
Dependencies: None
Tag maintainer: @baskaryan
1 year ago
fanyou-wbd 5e43768f61
docs: update LlamaCpp max_tokens args (#9238)
This PR updates documentations only, `max_length` should be `max_tokens`
according to latest LlamaCpp API doc:
https://api.python.langchain.com/en/latest/llms/langchain.llms.llamacpp.LlamaCpp.html
1 year ago
Bagatur a8aa1aba1c
nit (#9243) 1 year ago
Bagatur 68d8f73698
consolidate redirects (#9242) 1 year ago
Joshua Sundance Bailey ef0664728e
ArcGISLoader update (#9240)
Small bug fixes and added metadata based on user feedback. This PR is
from the author of https://github.com/langchain-ai/langchain/pull/8873 .
1 year ago
Joseph McElroy eac4ddb4bb
Elasticsearch Store Improvements (#8636)
Todo:
- [x] Connection options (cloud, localhost url, es_connection) support
- [x] Logging support
- [x] Customisable field support
- [x] Distance Similarity support 
- [x] Metadata support
  - [x] Metadata Filter support 
- [x] Retrieval Strategies
  - [x] Approx
  - [x] Approx with Hybrid
  - [x] Exact
  - [x] Custom 
  - [x] ELSER (excluding hybrid as we are working on RRF support)
- [x] integration tests 
- [x] Documentation

👋 this is a contribution to improve the Elasticsearch integration with
LangChain. It's loosely based on the changes that are in master, but with
some notable changes:

## Package name & design improvements
The import name is now `ElasticsearchStore`, to aid discoverability of
the VectorStore.

```py
## Before
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch

## Now
from langchain.vectorstores.elasticsearch import ElasticsearchStore
```

## Retrieval Strategy support
Before we had a number of classes, depending on the strategy you wanted.
`ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute
force.

With `ElasticsearchStore` we have retrieval strategies:

### Approx Example
The default strategy. The vast majority of developers who use
Elasticsearch will be inferring the embeddings outside of
Elasticsearch. Uses the KNN functionality of `_search`.

```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
)
output = docsearch.similarity_search("foo", k=1)
```

### Approx, with hybrid
For developers who want to search using both the embedding and the text
BM25 match. It's simple to enable.

```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
    strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True),
)
output = docsearch.similarity_search("foo", k=1)
```

### Approx, with `query_model_id`
Developers who want to infer within Elasticsearch, using the model
loaded in the ml node.

This relies on the developer to set up the pipeline and index if they
wish to embed the text in Elasticsearch. There is an example of this in the test.

```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
    strategy=ElasticsearchStore.ApproxRetrievalStrategy(
        query_model_id="sentence-transformers__all-minilm-l6-v2"
    ),
)
output = docsearch.similarity_search("foo", k=1)
```

### I want to provide my own custom Elasticsearch Query
You might want to have more control over the query, to perform
multi-phase retrieval such as LTR, linearly boosting on document
parameters like recently updated or geo-distance. You can do this with
`custom_query_fn`

```py
def my_custom_query(query_body: dict, query: str) -> dict:
    # Ignore the generated body and run a fixed text match instead.
    return {"query": {"match": {"text": {"query": "bar"}}}}

texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
)
docsearch.similarity_search("foo", k=1, custom_query=my_custom_query)
```
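
A custom query function does not have to discard the generated body. The sketch below is a hypothetical variant that keeps the generated query and appends a term filter to it. It assumes the body contains a `knn` section with a list-valued `filter` key, and the metadata field name `metadata.lang.keyword` is made up for illustration; neither is confirmed by this PR.

```python
import copy


def add_language_filter(query_body: dict, query: str) -> dict:
    """Return a copy of the generated body with an extra term filter on the knn clause."""
    new_body = copy.deepcopy(query_body)  # never mutate the caller's dict
    knn = new_body.get("knn")
    if isinstance(knn, dict):
        # Assumption: "filter" is a list of Elasticsearch filter clauses.
        knn.setdefault("filter", []).append(
            {"term": {"metadata.lang.keyword": "en"}}  # hypothetical field
        )
    return new_body


# Hypothetical generated body, for demonstration only.
generated = {"knn": {"field": "vector", "k": 1, "filter": []}}
filtered = add_language_filter(generated, "foo")
print(filtered)
```

Working on a deep copy keeps the function free of side effects, so the same generated body can be reused across calls.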

### Exact Example
For developers who have a small dataset in Elasticsearch and don't want the
cost of indexing the dims, accepting a higher cost at query time instead.
Uses `script_score`.

```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
    strategy=ElasticsearchStore.ExactRetrievalStrategy(),
)
output = docsearch.similarity_search("foo", k=1)
```

### ELSER Example
Elastic provides its own sparse vector model called ELSER. With these
changes, it's really easy to use. The vector store creates a pipeline and
index that are set up for ELSER. All the developer needs to do is configure,
ingest and query via LangChain tooling.

```py
texts = ["foo", "bar", "baz"]
docsearch = ElasticsearchStore.from_texts(
    texts,
    FakeEmbeddings(),
    es_url="http://localhost:9200",
    index_name="sample-index",
    strategy=ElasticsearchStore.SparseVectorStrategy(),
)
output = docsearch.similarity_search("foo", k=1)
```

## Architecture
In the future, we can introduce new strategies without breaking backwards
compatibility (bwc) as we evolve the index / query strategy.

## Credit
On release, could you credit @elastic and @phoey1 please? Thank you!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 71d5b7c9bf
Harrison/fallbacks (#9233)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lance Martin 41279a3ae1
Move self-check use case to "more" section (#9137)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lance Martin 22858d99b5
Move code-writing use case to "more" section (#9134)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 249d7d06a2
adapter doc nit (#9234) 1 year ago
Lance Martin 969e1683de
Move graph use case to "more" section (#8997)
Clean `use_cases` by moving the `GraphDB` to `integrations`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lance Martin d0a0d560ad
Minor formatting on Web Research Use Case (#9221) 1 year ago
Lance Martin 17ae2998e7
Update Ollama docs (#9220)
Based on discussion w/ team.
1 year ago
Krish Dholakia 49f1d8477c
Adding ChatLiteLLM model (#9020)
Description: Adding a langchain integration for the LiteLLM library 
Tag maintainer: @hwchase17, @baskaryan
Twitter handle: @krrish_dh / @Berri_AI

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Emmanuel Gautier f11e5442d6
docs: update LlamaCpp input args (#9173)
This PR only updates the LlamaCpp args documentation. The input arg has
been flattened.
1 year ago
Massimiliano Pronesti d95eeaedbe
feat(llms): support vLLM's OpenAI-compatible server (#9179)
This PR aims at supporting [vLLM's OpenAI-compatible server
feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server),
i.e. allowing vLLM's LLMs to be called as if they were OpenAI's.

I've also updated the related notebook, providing an example usage. At
the moment, vLLM only supports the `Completion` API.
1 year ago
Michael Goin 621da3c164
Adds DeepSparse as an LLM (#9184)
Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM
backend. DeepSparse supports running various open-source sparsified
models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for
performance gains on CPUs.

Twitter handles: @mgoin_ @neuralmagic


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 0fa69d8988
Bagatur/zep python 1.0 (#9186)
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
1 year ago
Harrison Chase 8d69dacdf3
multiple retrieval in parallel (#9174) 1 year ago
Eugene Yurtsev aca8cb5fba
API Reference: Do not document private modules (#9042)
This PR prevents documentation of private modules in the API reference
1 year ago
UmerHA 8aab39e3ce
Added SmartGPT workflow (issue #4463) (#4816)
# Added SmartGPT workflow by providing SmartLLM wrapper around LLMs
Edit:
As @hwchase17 suggested, this should be a chain, not an LLM. I have
adapted the PR.

It is used like this:
```
from langchain.prompts import PromptTemplate
from langchain.chains import SmartLLMChain
from langchain.chat_models import ChatOpenAI

hard_question = "I have a 12 liter jug and a 6 liter jug. I want to measure 6 liters. How do I do it?"

llm = ChatOpenAI(model_name="gpt-4")
prompt = PromptTemplate.from_template(hard_question)
chain = SmartLLMChain(llm=llm, prompt=prompt, verbose=True)

chain.run({})
```


Original text: 
Added SmartLLM wrapper around LLMs to allow for SmartGPT workflow (as in
https://youtu.be/wVzuvf9D9BU). SmartLLM can be used wherever LLM can be
used. E.g:

```
smart_llm = SmartLLM(llm=OpenAI())
smart_llm("What would be a good company name for a company that makes colorful socks?")
```
or
```
smart_llm = SmartLLM(llm=OpenAI())
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=smart_llm, prompt=prompt)
chain.run("colorful socks")
```

SmartGPT consists of 3 steps:

1. Ideate - generate n possible solutions ("ideas") to user prompt
2. Critique - find flaws in every idea & select best one
3. Resolve - improve upon best idea & return it
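
The three steps above can be sketched as plain function calls. This is a minimal illustration and not the `SmartLLMChain` implementation: `smart_answer`, the prompt wording, and the `fake_llm` stand-in are all hypothetical, and the "model" here is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List


def smart_answer(llm: Callable[[str], str], question: str, n_ideas: int = 3) -> str:
    """Run an ideate -> critique -> resolve loop over a prompt-to-text callable."""
    # 1. Ideate: ask for n independent candidate solutions.
    ideas: List[str] = [
        llm(f"Question: {question}\nPropose solution #{i + 1}:")
        for i in range(n_ideas)
    ]
    numbered = "\n".join(f"{i + 1}. {idea}" for i, idea in enumerate(ideas))

    # 2. Critique: find flaws in every idea and select the best one.
    critique = llm(
        f"Question: {question}\nIdeas:\n{numbered}\n"
        "List the flaws of each idea and name the best one."
    )

    # 3. Resolve: improve upon the best idea and return it.
    return llm(
        f"Question: {question}\nIdeas:\n{numbered}\nCritique:\n{critique}\n"
        "Improve the best idea and give the final answer."
    )


# Stand-in model that records prompts instead of calling a real LLM.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response {len(calls)}"


final = smart_answer(fake_llm, "How do I measure 6 liters?", n_ideas=3)
print(final)
```

With `n_ideas` candidates the loop makes `n_ideas + 2` model calls, which is the main cost trade-off of this workflow compared with a single call.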

Fixes #4463

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @agola11

Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 45741bcc1b
Bagatur/vectara nit (#9140)
Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
1 year ago
Dominick DEV 9b64932e55
Add LangChain utility for real-time crypto exchange prices (#4501)
This commit adds the LangChain utility which allows for the real-time
retrieval of cryptocurrency exchange prices. With LangChain, users can
easily access up-to-date pricing information by running the command
".run(from_currency, to_currency)". This new feature provides a
convenient way to stay informed on the latest exchange rates and make
informed decisions when trading crypto.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Joshua Sundance Bailey eaa505fb09
Create ArcGISLoader & example notebook (#8873)
- Description: Adds the ArcGISLoader class to
`langchain.document_loaders`
  - Allows users to load data from ArcGIS Online, Portal, and similar
  - Users can authenticate with `arcgis.gis.GIS` or retrieve public data
anonymously
  - Uses the `arcgis.features.FeatureLayer` class to retrieve the data
  - Defines the most relevant keyword arguments and accepts `**kwargs`
- Dependencies: Using this class requires `arcgis` and, optionally,
`bs4.BeautifulSoup`.

Tagging maintainers:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Aashish Saini 0aabded97f
Updating interactive walkthrough link in index.md to resolve 404 error (#9063)
Updated interactive walkthrough link in index.md to resolve 404 error.
Also, expressing deep gratitude to LangChain library developers for
their exceptional efforts 🥇 .

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Hai The Dude e4418d1b7e
Added new use case docs for Web Scraping, Chromium loader, BS4 transformer (#8732)
- Description: Added a new use case category called "Web Scraping", and
a tutorial to scrape websites using OpenAI Functions Extraction chain to
the docs.
  - Tag maintainer:@baskaryan @hwchase17 ,
- Twitter handle: https://www.linkedin.com/in/haiphunghiem/ (I'm on
LinkedIn mostly)

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
niklub 16af5f8690
Add LabelStudio integration (#8880)
This PR introduces [Label Studio](https://labelstud.io/) integration
with LangChain via `LabelStudioCallbackHandler`:

- sending data to the Label Studio instance
- labeling dataset for supervised LLM finetuning
- rating model responses
- tracking and displaying chat history
- support for custom data labeling workflow

### Example

```
chat_llm = ChatOpenAI(callbacks=[LabelStudioCallbackHandler(mode="chat")])
chat_llm([
    SystemMessage(content="Always use emojis in your responses."),
    HumanMessage(content="Hey AI, how's your day going?"),
    AIMessage(content="🤖 I don't have feelings, but I'm running smoothly! How can I help you today?"),
    HumanMessage(content="I'm feeling a bit down. Any advice?"),
    AIMessage(content="🤗 I'm sorry to hear that. Remember, it's okay to seek help or talk to someone if you need to. 💬"),
    HumanMessage(content="Can you tell me a joke to lighten the mood?"),
    AIMessage(content="Of course! 🎭 Why did the scarecrow win an award? Because he was outstanding in his field! 🌾"),
    HumanMessage(content="Haha, that was a good one! Thanks for cheering me up."),
    AIMessage(content="Always here to help! 😊 If you need anything else, just let me know."),
    HumanMessage(content="Will do! By the way, can you recommend a good movie?"),
])
```

<img width="906" alt="image"
src="https://github.com/langchain-ai/langchain/assets/6087484/0a1cf559-0bd3-4250-ad96-6e71dbb1d2f3">


### Dependencies
- [label-studio](https://pypi.org/project/label-studio/)
- [label-studio-sdk](https://pypi.org/project/label-studio-sdk/)

https://twitter.com/labelstudiohq

---------

Co-authored-by: nik <nik@heartex.net>
1 year ago
Bagatur 8cb2594562
Bagatur/dingo (#9079)
Co-authored-by: gary <1625721671@qq.com>
1 year ago
Manuel Soria 31cfc00845
Code understanding use case (#8801)
Code understanding docs

---------

Co-authored-by: Manuel Soria <manuel.soria@greyscaleai.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
Alvaro Bartolome f7ae183f40
`ArgillaCallbackHandler` to properly use default values for `api_url` and `api_key` (#9113)
As of the recent PR at #9043, after some testing we've realised that the
default values were not being used for `api_key` and `api_url`. Besides
that, the default for `api_key` was set to `argilla.apikey`, but since
the default values are intended for people using the Argilla Quickstart
(easy to run and setup), the defaults should be instead `owner.apikey`
if using Argilla 1.11.0 or higher, or `admin.apikey` if using a lower
version of Argilla.
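
The version rule described above can be sketched as a small helper. This is an illustration only, not the handler's actual code: the function name `default_api_key` is hypothetical, and it assumes a plain numeric `major.minor[.patch]` version string.

```python
def default_api_key(argilla_version: str) -> str:
    """Pick the Quickstart default API key for a given Argilla version string."""
    # Assumption: the version looks like "1.11.0" (numeric major and minor parts).
    major, minor = (int(part) for part in argilla_version.split(".")[:2])
    # Argilla 1.11.0 or higher uses the owner key; lower versions use the admin key.
    return "owner.apikey" if (major, minor) >= (1, 11) else "admin.apikey"


print(default_api_key("1.11.0"))
print(default_api_key("1.10.3"))
```

Comparing `(major, minor)` tuples avoids the string-comparison pitfall where `"1.9"` would sort after `"1.11"`.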

Additionally, we've removed the f-string replacements from the
docstrings.

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>
1 year ago
Bagatur 0e5d09d0da
dalle nb fix (#9125) 1 year ago
Francisco Ingham 9249d305af
tagging docs refactor (#8722)
refactor of tagging use case according to new format

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
Aayush Shah a429145420
Minor grammatical error (#9102)
Have corrected a grammatical error in:
https://python.langchain.com/docs/modules/model_io/models/llms/ document
😄
1 year ago
Ashutosh Sanzgiri 991b448dfc
minor edits (#9093)
Description:

Minor edit to PR#845

Thanks!
1 year ago
Chenyu Zhao c0acbdca1b
Update Fireworks model names (#9085) 1 year ago
Charles Lanahan a2588d6c57
Update openai embeddings notebook with correct embedding model in section 2 (#5831)
The second section looks like a copy/paste of the first section and
doesn't include the specific embedding model mentioned in the example, so
I added it for clarity.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Josh Phillips 5fc07fa524
change id column type to uuid to match function (#7456)
The table creation commands in these examples do not match what the
recently updated functions in the same examples expect.
This change updates the id column type in the table creation command.
Issue number for my report of the doc problem: #7446
@rlancemartin and @eyurtsev I believe this is your area
Twitter: @j1philli

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bidhan Roy 02430e25b6
BagelDB (bageldb.ai), VectorStore integration. (#8971)
- **Description**: [BagelDB](bageldb.ai) is a collaborative vector
database. This integrates the bageldb PyPI package with LangChain, with
related tests and code.

  - **Issue**: Not applicable.
  - **Dependencies**: `betabageldb` PyPi package.
  - **Tag maintainer**: @rlancemartin, @eyurtsev, @baskaryan
  - **Twitter handle**: bageldb_ai (https://twitter.com/BagelDB_ai)
  
We ran `make format`, `make lint` and `make test` locally.

Followed the contribution guideline thoroughly
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

---------

Co-authored-by: Towhid1 <nurulaktertowhid@gmail.com>
1 year ago
Harrison Chase bb6fbf4c71
openai adapters (#8988)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Piyush Jain 8eea46ed0e
Bedrock embeddings async methods (#9024)
## Description
This PR adds the `aembed_query` and `aembed_documents` async methods for
improving the embeddings generation for large documents. The
implementation uses asyncio tasks and gather to achieve concurrency as
there is no bedrock async API in boto3.

### Maintainers
@agola11 
@aarora79  

### Open questions
To avoid throttling from the Bedrock API, should there be an option to
limit the concurrency of the calls?
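
One way to answer the open question above is to wrap the blocking call in a semaphore-guarded task. The following is a minimal sketch, not the code from this PR: `embed_documents_concurrently`, `fake_embed`, and `max_concurrency` are hypothetical names, and `fake_embed` stands in for a synchronous Bedrock call made through boto3.

```python
import asyncio
from typing import Callable, List


async def embed_documents_concurrently(
    texts: List[str],
    embed_one: Callable[[str], List[float]],
    max_concurrency: int = 4,
) -> List[List[float]]:
    """Run a blocking embed function per text, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    loop = asyncio.get_running_loop()

    async def embed(text: str) -> List[float]:
        async with semaphore:
            # The blocking call runs in the default thread pool executor.
            return await loop.run_in_executor(None, embed_one, text)

    # gather returns results in the same order as the input texts.
    return await asyncio.gather(*(embed(t) for t in texts))


def fake_embed(text: str) -> List[float]:
    # Hypothetical stand-in for a synchronous embedding request.
    return [float(len(text)), float(text.count(" "))]


vectors = asyncio.run(
    embed_documents_concurrently(["a b", "hello"], fake_embed, max_concurrency=2)
)
```

The semaphore caps the number of in-flight requests, which is one plausible way to reduce throttling while keeping the gather-based structure.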
1 year ago
Nicolas e3fb11bc10
docs: (Mendable Search) Fixes stuck when tabbing out issue (#9074)
This fixes Mendable not completing when tabbing out and fixes the
duplicate message issue as well.
1 year ago
Bagatur 1edead28b8
Add docs community page (#8992)
Co-authored-by: briannawolfson <brianna.wolfson@gmail.com>
1 year ago
Eugene Yurtsev a5a4c53280
RedisStore: Update init and Documentation updates (#9044)
* Update Redis Store to support init from parameters
* Update notebook to show how to use redis store, and some fixes in
documentation
1 year ago
Bagatur f3f5853e9f
update api ref examples (#9065)
manually update for now
1 year ago
Blake (Yung Cher Ho) 8d351bfc20
Takeoff integration (#9045)
## Description:
This PR adds the Titan Takeoff Server to the available LLMs in
LangChain.

Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.

Read more about Titan Takeoff here:
- [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)

#### Testing
As Titan Takeoff runs locally on port 8000 by default, no network access
is needed. Responses are mocked for testing.

- [x] Make Lint
- [x] Make Format
- [x] Make Test

#### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.

Thanks for your help and please let me know if you have any questions.

cc: @hwchase17 @baskaryan
1 year ago
Aashish Saini 8a320e55a0
Corrected grammatical errors and spelling mistakes in the index.mdx file. (#9026)
Thank you to the creator for crafting this remarkable application. 🙌
This PR improves grammar and spelling in the documentation for a more
polished reader experience.

Your feedback is valuable as always 

@baskaryan , @hwchase17 , @eyurtsev
1 year ago
Eugene Yurtsev 5e05ba2140
Add embeddings cache (#8976)
This PR adds the ability to temporarily cache or persistently store
embeddings. 

A notebook has been included showing how to set up the cache and how to
use it with a vectorstore.
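
The idea of an embeddings cache can be sketched as follows. This is an illustrative toy, not the implementation added in this PR: `CachedEmbedder` and `toy_embedding` are hypothetical names, and the in-memory dict could be swapped for a persistent key-value store to keep embeddings across runs.

```python
import hashlib
from typing import Callable, Dict, List


class CachedEmbedder:
    """Hypothetical wrapper that stores embeddings keyed by a hash of the text."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}  # temporary, in-memory cache
        self.misses = 0  # counts calls that reached the underlying model

    def embed(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def toy_embedding(text: str) -> List[float]:
    # Stand-in for a real embedding model call.
    return [float(len(text))]


embedder = CachedEmbedder(toy_embedding)
embedder.embed("hello")
embedder.embed("hello")  # served from the cache, no second model call
```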
1 year ago
Lance Martin 2380492c8e
API use case (#8546)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Luca Foppiano dfb93dd2b5
Improved grobid documentation (#9025)
- Description: Improves the Grobid loader documentation: fixes typos and
suggests using the Docker image instead of installing Grobid locally
(the documentation was also limited to Mac, while Docker allows running
on any platform)
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @whitenoise
1 year ago
Hiroshige Umino 2c7297d243
Fix a broken code block display (#9034)
- Description: Fix a broken code block in this page:
https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/
- Issue: N/A
- Dependencies: None
- Tag maintainer: @baskaryan
- Twitter handle: yaotti
1 year ago
Piyush Jain 3b51817706
Updating port and ssl use in sample notebook (#8995)
## Description
This PR updates the sample notebook to use the default port (8182) and
the ssl for the Neptune database connection.
1 year ago
Michael Shen c2f46b2cdb
Fixed wrong paper reference (#8970)
The ReAct reference pointed to the MRKL paper. Corrected it so that it
points to the actual ReAct paper (#8964).
1 year ago
Jerzy Czopek 539672a7fd
Feature/fix azureopenai model mappings (#8621)
This pull request aims to ensure that the `OpenAICallbackHandler` can
properly calculate the total cost for Azure OpenAI chat models. The
following changes have resolved this issue:

- The `model_name` has been added to the ChatResult llm_output. Without
this, the default values of `gpt-35-turbo` were applied. This was
causing the total cost for Azure OpenAI's GPT-4 to be significantly
inaccurate.
- A new parameter `model_version` has been added to `AzureChatOpenAI`.
Azure does not include the model version in the response. With the
addition of `model_name`, this is not a significant issue for GPT-4
models, but it's an issue for GPT-3.5-Turbo. Version 0301 (default) of
GPT-3.5-Turbo on Azure has a flat rate of 0.002 per 1k tokens for both
prompt and completion. However, version 0613 introduced a split in
pricing for prompt and completion tokens.
- The `OpenAICallbackHandler` implementation has been updated with the
proper model names, versions, and cost per 1k tokens.
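
The effect of the flat versus split pricing described above can be sketched as follows. This is not the `OpenAICallbackHandler` code: `gpt35_turbo_cost` is a hypothetical helper, the 0301 flat rate is taken from the text above, and the 0613 prompt/completion rates are illustrative assumptions only.

```python
# Prices in USD per 1k tokens. The 0301 flat rate comes from the description
# above; the 0613 split rates are assumed values for illustration.
PRICES_PER_1K = {
    "0301": {"prompt": 0.002, "completion": 0.002},
    "0613": {"prompt": 0.0015, "completion": 0.002},
}


def gpt35_turbo_cost(version: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the total cost in USD for one call, given the model version."""
    rates = PRICES_PER_1K[version]
    return (
        prompt_tokens * rates["prompt"] + completion_tokens * rates["completion"]
    ) / 1000


flat_cost = gpt35_turbo_cost("0301", prompt_tokens=1000, completion_tokens=500)
split_cost = gpt35_turbo_cost("0613", prompt_tokens=1000, completion_tokens=500)
```

Without the model version, a handler cannot tell which of the two rate tables applies, which is why the `model_version` parameter matters for GPT-3.5-Turbo.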

Unit tests have been added to ensure the functionality works as
expected; the Azure ChatOpenAI notebook has been updated with examples.

Maintainers: @hwchase17, @baskaryan

Twitter handle: @jjczopek

---------

Co-authored-by: Jerzy Czopek <jerzy.czopek@avanade.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 7de6a1b78e
parent document retriever (#8941) 1 year ago
Taqi Jaffri 5919c0f4a2 notebook cleanup 1 year ago
Taqi Jaffri bcdf3be530 Merge branch 'master' into tjaffri/docugami_loader_source 1 year ago
arjunbansal a2681f950d
add instructions on integrating Log10 (#8938)
- Description: Instruction for integration with Log10: an [open
source](https://github.com/log10-io/log10) proxiless LLM data management
and application development platform that lets you log, debug and tag
your Langchain calls
  - Tag maintainer: @baskaryan
  - Twitter handle: @log10io @coffeephoenix

Several examples showing the integration included
[here](https://github.com/log10-io/log10/tree/main/examples/logging) and
in the PR
1 year ago
Aarav Borthakur 3f64b8a761
Integrate Rockset as a chat history store (#8940)
Description: Adds Rockset as a chat history store
Dependencies: no changes
Tag maintainer: @hwchase17

This PR passes linting and testing. 

I added a test for the integration and an example notebook showing its
use.
1 year ago
Bagatur 0a1be1d501
document lcel fallbacks (#8942) 1 year ago
Molly Cantillon 99b5a7226c
Weaviate: adding auth example + fixing spelling in ReadME (#8939)
Added basic auth example to Weaviate notebook @baskaryan
1 year ago
Joe Reuter 8f0cd91d57
Airbyte based loaders (#8586)
This PR adds 8 new loaders:
* `AirbyteCDKLoader` This reader can wrap and run all python-based
Airbyte source connectors.
* Separate loaders for the most commonly used APIs:
  * `AirbyteGongLoader`
  * `AirbyteHubspotLoader`
  * `AirbyteSalesforceLoader`
  * `AirbyteShopifyLoader`
  * `AirbyteStripeLoader`
  * `AirbyteTypeformLoader`
  * `AirbyteZendeskSupportLoader`

## Documentation and getting started
I added the basic shape of the config to the notebooks. This increases
the maintenance effort a bit, but I think it's worth it to make sure
people can get started quickly with these important connectors. This is
also why I linked the spec and the documentation page in the readme as
these two contain all the information to configure a source correctly
(e.g. it won't suggest using oauth if that's avoidable even if the
connector supports it).

## Document generation
The "documents" produced by these loaders won't have a text part
(instead, all the record fields are put into the metadata). If a text is
required by the use case, the caller needs to do custom transformation
suitable for their use case.

## Incremental sync
All loaders support incremental syncs if the underlying streams support
it. By storing the `last_state` from the reader instance away and
passing it in when loading, it will only load updated records.
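
The incremental sync pattern described above can be illustrated with a toy loader. This is a sketch of the idea only: `ToyIncrementalLoader`, its `state` argument, and the `updated_at` cursor are hypothetical and do not reproduce the Airbyte loaders' actual API.

```python
from typing import Any, Dict, List, Optional


class ToyIncrementalLoader:
    """Hypothetical loader that mimics the `last_state` pattern."""

    def __init__(
        self,
        records: List[Dict[str, Any]],
        state: Optional[Dict[str, int]] = None,
    ) -> None:
        self._records = records
        # The cursor marks the highest `updated_at` value already loaded.
        self._cursor = (state or {}).get("cursor", 0)
        self.last_state: Dict[str, int] = {"cursor": self._cursor}

    def load(self) -> List[Dict[str, Any]]:
        fresh = [r for r in self._records if r["updated_at"] > self._cursor]
        if fresh:
            self._cursor = max(r["updated_at"] for r in fresh)
        self.last_state = {"cursor": self._cursor}
        return fresh


records = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
first = ToyIncrementalLoader(records)
first_batch = first.load()       # loads both records
saved_state = first.last_state   # store this away between runs

records.append({"id": 3, "updated_at": 30})
second = ToyIncrementalLoader(records, state=saved_state)
second_batch = second.load()     # loads only the record added since saved_state
```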

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 7543a3d70e
Harrison/image (#845)
Co-authored-by: Ashutosh Sanzgiri <sanzgiri@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline 33a2f58fbf
`tensorflow_datasets` document loader (#8721)
This PR adds the `tensorflow_datasets` document loader
1 year ago
Leonid Ganeline 2d078c7767
`PubMed` document loader (#8893)
- added `PubMed Document Loader` artifacts, unit tests, and examples
- fixed `PubMed utility` and its unit tests

@hwchase17
1 year ago
Jeremy W c5c0735fc4
Remove Evaluation from Modules page (#8926)
Remove Evaluation link (which gives 404 now) from Modules page, since it
lives under Guides page now
1 year ago
Seif 6327eecdaf
Fix typo in Vectara docs (#8925)
Fixed a typo in the Vectara docs description.
1 year ago
Chris Pappalardo beab637f04
added filter kwarg to VectorStoreIndexWrapper query and query_with_so… (#8844)
- Description: added filter to query methods in VectorStoreIndexWrapper
for filtering by metadata (i.e. search_kwargs)
- Tag maintainer: @rlancemartin, @eyurtsev

Updated the doc snippet on this topic as well. It took me a long while
to figure out how to filter the vectorstore by filename, so this might
help someone else out.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Apurv Agarwal 4a63533216
addition to docs at 'Store and reference chat history' (#8910)
- Description: I have added an example showing how to pass a custom
template to ConversationalRetrievalChain. Instead of
CONDENSE_QUESTION_PROMPT, any prompt can be passed in the argument
condense_question_prompt. See Use cases -> QA over Documents -> How
to -> Store and reference chat history,
  - Issue: #8864,
  - Dependencies: NA,
  - Tag maintainer: @hinthornw,
  - Twitter handle:

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
David vonThenen bf4a112aa6
Fixes to the Nebula LLM Integration (#8918)
This addresses some issues with introducing the Nebula LLM to LangChain
in this PR:
https://github.com/langchain-ai/langchain/pull/8876

This fixes the following:
- Removes `SYMBLAI` from variable names
- Fixes bug with `Bearer` for the API KEY


Thanks again in advance for your help!
cc: @hwchase17, @baskaryan

---------

Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
1 year ago
Jacob Lee d1e305028f
Automatically set docs appearance to system default (#8924)
@baskaryan
1 year ago
Josh Hart 6116cbf0de
Fix imports in awslambda docs (#8916)
Minor doc fix to awslambda tool notebook. 

Add missing import for initialize_agent to awslambda agent example

Co-authored-by: Josh Hart <josharj@amazon.com>
1 year ago
Maurits de Groot 61c2d918c6
Fixed inaccurate import in integrations:providers:bedrock documentation (#8915)
Description:
Fixed inaccurate import in integrations:providers:bedrock documentation

In the current version of the bedrock documentation, page
https://python.langchain.com/docs/integrations/providers/bedrock
states that the import is `from langchain import Bedrock`.

This has been changed to `from langchain.llms.bedrock import Bedrock`, as
stated in https://python.langchain.com/docs/integrations/llms/bedrock

Issue:
Not applicable

Dependencies
No dependencies required

Tag maintainer
@baskaryan

Twitter handle:
Not applicable
1 year ago
Manuel Soria e74a605379
SQL use case docs (#8513) 1 year ago
Jacob Lee fa30a57034
Adds Ollama as an LLM (#8829)
Adds Ollama as an LLM. Ollama can run various open source models locally
e.g. Llama 2 and Vicuna, automatically configuring and GPU-optimizing
them.

@rlancemartin @hwchase17

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
Ash Vardanian 1f9124ceaa
Add: USearch Vector Store (#8835)
## Description

I am excited to propose an integration with USearch, a lightweight
vector-search engine available for both Python and JavaScript, among
other languages.

## Dependencies

It introduces a new PyPi dependency - `usearch`. I am unsure if it must
be added to the Poetry file, as this would make the PR too clunky.
Please let me know.

## Profiles

- Maintainers: @ashvardanian @davvard
- Twitter handles: @ashvardanian @unum_cloud

---------

Co-authored-by: Davit Vardanyan <78792753+davvard@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Kuligin b52a3785c9
Allow to specify a custom loader for GcsFileLoader (#8868)
Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Jeffrey Wang ff44fe4e16
Change default Metaphor search example to use prompt optimizer (#8890)
- fix install command
- change example notebook to use Metaphor autoprompt by default

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
1 year ago
Jeffrey Wang ce3666c28b
Fix metaphor install command in guide (#8888) 1 year ago
Harrison Chase bbd22b9b76
update metaphor docs (#8886) 1 year ago
Carson cc908d49a3
Fixes typo in documentation (#8882)
Fixes a simple typo in the google search engine tool documentation
@baskaryan
1 year ago
Joshua Sundance Bailey 7fc07ba5df
Create ChatAnyscale (#8770)
- Description: Adds the ChatAnyscale class with llama-2 7b, llama-2 13b,
and llama-2 70b on [Anyscale
Endpoints](https://app.endpoints.anyscale.com/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)
- Inspired by https://github.com/langchain-ai/langchain/pull/8434
(@kylehh and @baskaryan )
1 year ago
David vonThenen 40079d4936
Introduce Nebula LLM to LangChain (#8876)
## Description

This PR adds Nebula to the available LLMs in LangChain.

Nebula is an LLM focused on conversation understanding and enables users
to extract conversation insights from video, audio, text, and chat-based
conversations. These conversations can occur between any mix of human or
AI participants.

Examples of some questions you could ask Nebula from a given
conversation are:
- What could be the customer’s pain points based on the conversation?
- What sales opportunities can be identified from this conversation?
- What best practices can be derived from this conversation for future
customer interactions?

You can read more about Nebula here:

https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/

#### Integration Test 

An integration test is added, but it requires network access. Since
Nebula is fully managed like OpenAI, network access is required to
exercise the integration test.

#### Linting

- [x] make lint
- [x] make test (TODO: there seems to be a failure in another
non-related test??? Need to check on this.)
- [x] make format

### Dependencies

No new dependencies were introduced.

### Twitter handle

[@symbldotai](https://twitter.com/symbldotai)
[@dvonthenen](https://twitter.com/dvonthenen)


If you have any questions, please let me know.

cc: @hwchase17, @baskaryan

---------

Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Lance Martin 84c1ad7eaa
Fix colab link for extraction ntbk (#8878)
1 year ago
Nuno Campos 9892e95d03
Add flush=True to stream examples (#8862) 1 year ago
manmax31 40096c73cd
Add BGE embeddings support (#8848)
- Description: [BGE-large](https://huggingface.co/BAAI/bge-large-en)
embeddings from BAAI are at the top of [MTEB
leaderboard](https://huggingface.co/spaces/mteb/leaderboard). Hence
adding support for it.
- Tag maintainer: @baskaryan
- Twitter handle: @ManabChetia3

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Tudor Golubenco aeaef8f3a3
Add support for Xata as a vector store (#8822)
This adds support for [Xata](https://xata.io) (data platform based on
Postgres) as a vector store. We have recently added [Xata to
Langchain.js](https://github.com/hwchase17/langchainjs/pull/2125) and
would love to have the equivalent in the Python project as well.

The PR includes integration tests and a Jupyter notebook as docs. Please
let me know if anything else would be needed or helpful.

I have added the xata python SDK as an optional dependency.

## To run the integration tests

You will need to create a DB in xata (see the docs), then run something
like:

```
OPENAI_API_KEY=sk-... XATA_API_KEY=xau_... XATA_DB_URL='https://....xata.sh/db/langchain'  poetry run pytest tests/integration_tests/vectorstores/test_xata.py
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Philip Krauss <35487337+philkra@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 472f00ada7
add moderation example (#8718) 1 year ago
Massimiliano Pronesti a616e19975
feat(llms): add support for vLLM (#8806)
Hello langchain maintainers, 
this PR aims at integrating
[vllm](https://vllm.readthedocs.io/en/latest/#) into langchain. This PR
closes #8729.

This feature clearly depends on `vllm`, but I've seen other models
supported here depend on packages that are not included in the
pyproject.toml (e.g. `gpt4all`, `text-generation`) so I thought it was
the case for this as well.

@hwchase17, @baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Karthik Raja A 5a9765b1b5
MultiOn client toolkit update 2.0 (#8750)
- Updated to use newer better function interaction
 - Previous version had only one callback
 - @hinthornw @hwchase17  Can you look into this
 -  Shout out to @MultiON_AI @DivGarg9 on twitter

---------

Co-authored-by: Naman Garg <ngarg3@binghamton.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 0adc282d70
Harrison/as retriever docstring (#8840)
Co-authored-by: Bytestorm <31070777+Bytestorm5@users.noreply.github.com>
1 year ago
Zend bd4865b6fe
Async Recursive URL loader (#8502)
Description: This PR improves recursive_url_loader by adding a limit on
the crawl depth and customizable extractors (which turn the raw webpage
into the text of the Document object), so that users can plug in other
tools to extract webpage content. This PR also includes documentation
and tests for the new loader.
The old PR, #7756, was closed due to a project structure change.
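
The two features named in the description, a depth limit and a pluggable extractor, can be sketched together. This is a synchronous toy over a hypothetical in-memory site, not the async loader from this PR; `crawl`, `strip_tags`, and the convention that the root sits at depth 0 are all assumptions.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical in-memory "site": URL -> (raw page, outgoing links).
Site = Dict[str, Tuple[str, List[str]]]


def crawl(
    site: Site, root: str, max_depth: int, extractor: Callable[[str], str]
) -> Dict[str, str]:
    """Visit pages reachable from root, up to max_depth links away."""
    visited: Set[str] = set()
    docs: Dict[str, str] = {}

    def visit(url: str, depth: int) -> None:
        if depth > max_depth or url in visited or url not in site:
            return
        visited.add(url)
        raw, links = site[url]
        docs[url] = extractor(raw)  # the extractor turns raw markup into text
        for link in links:
            visit(link, depth + 1)

    visit(root, 0)
    return docs


def strip_tags(raw: str) -> str:
    # Minimal extractor: drop anything that looks like an HTML tag.
    return re.sub(r"<[^>]+>", "", raw)


site: Site = {
    "/": ("<h1>Home</h1>", ["/a"]),
    "/a": ("<p>A</p>", ["/b", "/"]),
    "/b": ("<p>B</p>", []),
}
shallow = crawl(site, "/", max_depth=1, extractor=strip_tags)
```

Passing a different callable as `extractor` changes how page text is produced without touching the traversal logic.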

Because socket requests are not allowed, the old unit test was removed.
Issue: N/A
Dependencies: asyncio, aiohttp
Tag maintainer: @rlancemartin
Twitter handle: @ Zend_Nihility

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
fqassemi 485d716c21
Feature faiss delete (#8135)
- Description: The docstore had two main methods, add and search;
however, working with a docstore sometimes requires deleting an entry.
This PR adds a simple delete method that removes items from the
docstore, and adds the same delete method to the FAISS vectorstore.
  - Issue: NA
  - Dependencies: NA
  - Tag maintainer:  @rlancemartin, @eyurtsev
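
The docstore delete method described for this change can be pictured with a toy in-memory store. This is a sketch under assumptions, not the code from this PR: `ToyDocstore` and its method signatures are hypothetical.

```python
from typing import Dict, List, Optional


class ToyDocstore:
    """Hypothetical in-memory docstore with add, search and delete."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def add(self, texts: Dict[str, str]) -> None:
        overlapping = set(texts) & set(self._docs)
        if overlapping:
            raise ValueError(f"Tried to add ids that already exist: {overlapping}")
        self._docs.update(texts)

    def search(self, doc_id: str) -> Optional[str]:
        return self._docs.get(doc_id)

    def delete(self, ids: List[str]) -> None:
        # Validate first so that a bad id leaves the store unchanged.
        missing = set(ids) - set(self._docs)
        if missing:
            raise ValueError(f"Tried to delete ids that do not exist: {missing}")
        for doc_id in ids:
            del self._docs[doc_id]


store = ToyDocstore()
store.add({"a": "alpha", "b": "beta"})
store.delete(["a"])
```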

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Nicolas b57fa1a39c
docs: Improvements on Mendable Search (#8808)
- Balancing prioritization between keyword / AI search
- Show snippets of highlighted keywords when searching 
- Improved keyword search
- Fixed bugs and issues

Shoutout to @calebpeffer for implementing and gathering feedback on it 

cc: @dev2049 @rlancemartin @hwchase17
1 year ago
Ikko Eltociear Ashimine 6b93670410
Fix typo in long_context_reorder.ipynb (#8811)
begining -> beginning

1 year ago
Harrison Chase 2bb1d256f3
add example of memory and returning retrieved docs (#8830) 1 year ago
Kshitij Wadhwa 5f1aab5487
Fix docs for Rockset (#8807)
* remove error output for notebook
* add comment about vector length for ingest transformation
* change OPENAI_KEY -> OPENAI_API_KEY

cc @baskaryan
1 year ago
Bagatur d7b613a293
Bagatur/revert revert nuclia (#8833) 1 year ago
Bagatur 2f309a4ce6
Revert "Bagatur/nuclia (#8404)" (#8832) 1 year ago
Snehil Kumar 1bd4890506
Update links on QA Use Case docs (#8784)
- Description: 2 links were not working on Question Answering Use Cases
documentation page. Hence, changed them to nearest useful links,
  - Issue: NA,
  - Dependencies: NA,
  - Tag maintainer: @baskaryan,
  - Twitter handle: NA

1 year ago
Bal Narendra Sapa a22d502248
added the embeddings part (#8805)
Description: forgot to add the embeddings part in the documentation.
sorry 😅

@baskaryan
1 year ago
Bagatur 9fc9018951
Bagatur/nuclia (#8404)
Co-authored-by: Eric BREHAULT <ebrehault@gmail.com>
1 year ago
Francisco Ingham ef5bc1fef1
Refactor for extraction docs (#8465)
Refactor for the extraction use case documentation

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
1 year ago
Bagatur 21771a6f1c
rm sklearn links (#8773) 1 year ago
Joshua Carroll e5fed7d535
Extend the StreamlitChatMessageHistory docs with a fuller example and… (#8774)
Add more details to the [notebook for
StreamlitChatMessageHistory](https://python.langchain.com/docs/integrations/memory/streamlit_chat_message_history),
including a link to a [running example
app](https://langchain-st-memory.streamlit.app/).

Original PR: https://github.com/langchain-ai/langchain/pull/8497
1 year ago
Eugene Yurtsev 19dfe166c9
Update documentation for prompts (#8381)
* Documentation to favor creation without declaring input_variables
* Cut out obvious examples, but add more description in a few places

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
1 year ago
Dayou Liu 91a0817e39
docs: llamacpp minor fixes (#8738)
- Description: minor updates on llama cpp doc
1 year ago
Eugene Yurtsev 003e1ca9a0
Update api references (#8646)
Update API reference documentation. This PR will pick up a number of missing classes, it also applies selective formatting based on the class / object type.
1 year ago
Snehil Kumar a6ee646ef3
Update get_started.mdx (#8744)
- Description: Added a missing word and rearranged a sentence in the
documentation of Self Query Retrievers.,
  - Issue: NA,
  - Dependencies: NA,
  - Tag maintainer: @baskaryan,
  - Twitter handle: NA

Thanks for your time.
1 year ago
Bal Narendra Sapa bd61757423
add documentation for serializer function (#8769)
Description: Added necessary documentation for serializer functions

@baskaryan
1 year ago
rjanardhan3 affaaea87b
Updates fireworks (#8765)
  - Description: Updates to Fireworks Documentation, 
  - Issue: N/A,
  - Dependencies: N/A,
  - Tag maintainer: @rlancemartin,

---------

Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>
1 year ago
Bagatur 8c35fcb571
update rss doc (#8761) 1 year ago
Bagatur 0d5a90f30a
Revert "add filter to sklearn vector store functions (#8113)" (#8760) 1 year ago
Lance Martin be638ad77d
Chatbots use case (#8554)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Ruiqi Guo 6aee589eec
Add ScaNN support in vectorstore. (#8251)
Description: Add ScaNN vectorstore to langchain.
ScaNN is an open-source, high-performance vector similarity library
optimized for AVX2-enabled CPUs.
https://github.com/google-research/google-research/tree/master/scann

- Dependencies: scann

Python notebook to illustrate the usage:
docs/extras/integrations/vectorstores/scann.ipynb
Integration test:
libs/langchain/tests/integration_tests/vectorstores/test_scann.py

@rlancemartin, @eyurtsev for review.

Thanks!
1 year ago
shibuiwilliam 0f0ccfe7f6
add filter to sklearn vector store functions (#8113)
# What
- This adds a filter option to the sklearn vector store functions

  - Description: Add filter to sklearn vector store functions.
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
shibuiwilliam 2759e2d857
add save and load tfidf vectorizer and docs for TFIDFRetriever (#8112)
This is to add save_local and load_local to tfidf_vectorizer and docs in
tfidf_retriever to make the vectorizer reusable.

- Description: add save_local and load_local to tfidf_vectorizer and
docs in tfidf_retriever
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Lance Martin d1b95db874
Retriever that can re-phrase user inputs (#8026)
Simple retriever that applies an LLM between the user input and the
query passed to the retriever.

It can be used to pre-process the user input in any way.

The default prompt:

```
from langchain.prompts import PromptTemplate

DEFAULT_QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant tasked with taking a natural language query from a user
    and converting it into a query for a vectorstore. In this process, you strip out
    information that is not relevant for the retrieval task. Here is the user query: {question} """
)
```
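
The flow described above, where an LLM sits between the user input and the retriever, can be sketched without LangChain. In this minimal sketch, `rephrase_then_retrieve`, `fake_llm`, and `keyword_retriever` are hypothetical stand-ins and not the classes added in this PR:

```python
from typing import Callable, List

# Mirrors the wording of the default prompt quoted above.
QUERY_TEMPLATE = (
    "You are an assistant tasked with taking a natural language query from a user "
    "and converting it into a query for a vectorstore. In this process, you strip out "
    "information that is not relevant for the retrieval task. "
    "Here is the user query: {question}"
)


def rephrase_then_retrieve(
    question: str,
    llm: Callable[[str], str],
    retriever: Callable[[str], List[str]],
) -> List[str]:
    """Ask the LLM to rewrite the question, then query the retriever with the result."""
    prompt = QUERY_TEMPLATE.format(question=question)
    rewritten = llm(prompt).strip()
    return retriever(rewritten)


# Hypothetical stand-ins so the sketch runs without a model or a vector store.
def fake_llm(prompt: str) -> str:
    # Pretend the model kept only the part after "tell me about".
    user_query = prompt.rsplit("Here is the user query: ", 1)[1]
    return user_query.split("tell me about ", 1)[-1]


DOCS = ["vector stores index embeddings", "agents call tools"]


def keyword_retriever(query: str) -> List[str]:
    # Return every document that shares at least one word with the query.
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.split())]


results = rephrase_then_retrieve(
    "Hi! I'm new here, tell me about embeddings", fake_llm, keyword_retriever
)
```

Passing the LLM and the retriever as plain callables keeps the pre-processing step independent of any particular model or store.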

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 6c3573e7f6
Harrison/aleph alpha (#8735)
Co-authored-by: PiotrMazurek <piotr.mazurek@aleph-alpha.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Ilya 6f0bccfeb5
Add regex control over separators in character text splitter (#7933)
#7854

Added the ability to use the `separator` as a regex or a simple
character.
Fixed a bug where `start_index` was incorrectly counting from -1.
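
The difference between a literal and a regex separator can be illustrated with the standard library. This is a minimal sketch and not the splitter's actual implementation; `split_text` and its `is_separator_regex` parameter are names chosen for this illustration:

```python
import re
from typing import List


def split_text(text: str, separator: str, is_separator_regex: bool = False) -> List[str]:
    """Split `text` on `separator`, treating it as a regex only when asked to."""
    # Escape the separator so characters like "." or "|" match literally.
    pattern = separator if is_separator_regex else re.escape(separator)
    # Drop empty pieces produced by adjacent or leading/trailing separators.
    return [piece for piece in re.split(pattern, text) if piece]


literal = split_text("a.b.c", ".")
as_regex = split_text("a1b22c", r"\d+", is_separator_regex=True)
```

Escaping by default means a separator such as `.` behaves as a plain character unless the caller explicitly opts in to regex matching.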

Who can review?
@eyurtsev
@hwchase17 
@mmz-001
1 year ago
Ofer Mendelevitch 29f51055e8
Updates to Vectara documentation (#8699)
- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
ruze 8ef7e14a85
RSS Feed / OPML loader (#8694)
- Description: added a document loader for a list of RSS feeds or OPML.
It iterates through the list and uses NewsURLLoader to load each
article.
  - Issue: N/A
  - Dependencies: feedparser, listparser
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @ruze

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur b2b71b0d35
Bagatur/eden llm (#8670)
Co-authored-by: RedhaWassim <rwasssim@gmail.com>
Co-authored-by: KyrianC <ckyrian@protonmail.com>
Co-authored-by: sam <melaine.samy@gmail.com>
1 year ago
axa99 1f54ec899b
updated interface jupyter notebook explanations (#8689)
Updated the documentation in the interface.ipynb to clearly show the
_input_ and _output_ types for various components @baskaryan
1 year ago
Lance Martin 37aade19da
Minor formatting and additional figure for summarization use case (#8663) 1 year ago
Harrison Chase 43dffe39fb
Harrison/conversational retrieval agent (#8639)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
ruze 71f98db2fe
Newspaper (#8647)
- Description: Added newspaper3k based news article loader. Provide a
list of urls.
  - Issue: N/A
  - Dependencies: newspaper3k,
  - Tag maintainer: @rlancemartin , @eyurtsev 
  - Twitter handle: @ruze

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
millerick 5018af8839
docs: fix some grammar (#8654)
### Description
Fixes a grammar issue I noticed when reading through the documentation.

### Maintainers
@baskaryan

Co-authored-by: mmillerick <mmillerick@blend.com>
1 year ago
Lance Martin 59194c2214
Add summarization use-case (#8376)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Will Thompson ee1d13678e
🐛 Docs Fixes [2 one-liners, examples broken] (#8519)
## Description: 
   
1)Map reduce example in docs is missing an important import statement.
Figured other people would benefit from being able to copy 🍝 the code.

2)RefineDocumentsChain example also broken.

## Issue: 

None

## Dependencies:

None. One liner.

## Tag maintainer:

@baskaryan

## Twitter handle: 

I mean, it's a one line fix lol. But @will_thompson_k is my twitter
handle.
1 year ago
Leonid Ganeline 1335f2b9f8
`MLflow` examples (#8642)
Updated `MLflow` examples with links to the examples from MLflow

 @baskaryan
1 year ago
Comendeiro 5c516945d0
Add local support for audio models (PR #7329) (#7591)
- Description: run the poetry dependencies
  - Issue: #7329 
  - Tag maintainer: @rlancemartin

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
rjanardhan3 68113348cc
Fireworks integration (#8322)
Description - Integrates Fireworks within Langchain LLMs to allow users
to use Fireworks models with Langchain, mainly for summarization.

Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin

---------

Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>
1 year ago
Taqi Jaffri 4806504ebc Fixed one last key name 1 year ago
Joshua Carroll 6705928b9d
Add StreamlitChatMessageHistory (#8497)
Add a StreamlitChatMessageHistory class that stores chat messages in
[Streamlit's Session
State](https://docs.streamlit.io/library/api-reference/session-state).

Note: The integration test uses a currently-experimental Streamlit
testing framework to simulate the execution of a Streamlit app. Marking
this PR as draft until I confirm with the Streamlit team that we're
comfortable supporting it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Matt Robinson 8961c720b8
docs: update `unstructured` install instructions (#8596)
### Summary

Updates the `unstructured` install instructions. For
`unstructured>=0.9.0`, dependencies are broken out by document type and
the base `unstructured` package includes fewer dependencies. `pip
install "unstructured[local-inference]"` has been replaced by `pip
install "unstructured[all-docs]"`, though the `local-inference` extra is
still supported for the time being.

### Reviewers

- @rlancemartin
- @eyurtsev
- @hwchase17
1 year ago
Bagatur 73072d3db8
mv (#8595) 1 year ago
Tesfagabir Meharizghi a7000ee89e
Callback handler for Amazon SageMaker Experiments (#8587)
## Description

This PR implements a callback handler for SageMaker Experiments which is
similar to that of mlflow.
* When creating the callback handler, it takes the experiment's run
object as an argument. All the callback outputs are then logged to the
run object.
* The output of each callback action (e.g., `on_llm_start`) is saved to
an S3 bucket as a JSON file.
* Optionally, you can also log additional information such as the LLM
hyper-parameters to the same run object.
* Once the callback object is no longer needed, call the
`flush_tracker()` method. This makes sure that any intermediate files
are deleted.
* A separate notebook example is provided to show how the callback is
used.
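
The lifecycle listed above (write each callback's output as a JSON file, then delete intermediate files on flush) can be sketched with the standard library alone. Everything here is hypothetical: `JsonFileCallbackLogger` writes to a local temporary directory rather than S3, and it is not the handler added by this PR.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class JsonFileCallbackLogger:
    """Hypothetical logger: one JSON file per callback action, removed on flush."""

    def __init__(self) -> None:
        self.temp_dir = Path(tempfile.mkdtemp())
        self.logged: List[str] = []  # stands in for what would be attached to the run

    def _log(self, action: str, payload: Dict[str, Any]) -> Path:
        # Number the files so repeated actions do not overwrite each other.
        path = self.temp_dir / f"{action}_{len(self.logged)}.json"
        path.write_text(json.dumps(payload))
        self.logged.append(path.name)
        return path

    def on_llm_start(self, prompts: List[str]) -> Path:
        return self._log("on_llm_start", {"prompts": prompts})

    def on_llm_end(self, output: str) -> Path:
        return self._log("on_llm_end", {"output": output})

    def flush_tracker(self) -> None:
        """Delete intermediate files once the logger is no longer needed."""
        shutil.rmtree(self.temp_dir, ignore_errors=True)


logger = JsonFileCallbackLogger()
start_file = logger.on_llm_start(["Tell me a joke"])
logger.on_llm_end("Why did the chicken cross the road?")
logger.flush_tracker()
```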

@3coins  @agola11

---------

Co-authored-by: Tesfagabir Meharizghi <mehariz@amazon.com>
1 year ago
Taqi Jaffri 96843f3bd4 Fixed source key name for docugami loader 1 year ago
mpb159753 7df2dfc4c2
Add Support for Loading Documents from Huawei OBS (#8573)
Description:
This PR adds support for loading documents from Huawei OBS (Object
Storage Service) in Langchain. OBS is a cloud-based object storage
service provided by Huawei Cloud. With this enhancement, Langchain users
can now easily access and load documents stored in Huawei OBS directly
into the system.

Key Changes:
- Added a new document loader module specifically for Huawei OBS
integration.
- Implemented the necessary logic to authenticate and connect to Huawei
OBS using access credentials.
- Enabled the loading of individual documents from a specified bucket
and object key in Huawei OBS.
- Provided the option to specify custom authentication information or
obtain security tokens from Huawei Cloud ECS for easy access.

How to Test:
1. Ensure the required package "esdk-obs-python" is installed.
2. Configure the endpoint, access key, secret key, and bucket details
for Huawei OBS in the Langchain settings.
3. Load documents from Huawei OBS using the updated document loader
module.
4. Verify that documents are successfully retrieved and loaded into
Langchain for further processing.

Please review this PR and let us know if any further improvements are
needed. Your feedback is highly appreciated!

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 66226d1d4d
add example for memory (#8552) 1 year ago
Shantanu Nair 53f3793504
Fast load conversationsummarymemory from existing summary (#7533)
- Description: Adds an optional buffer arg to the memory's
from_messages() method. If provided the existing memory will be loaded
instead of regenerating a summary from the loaded messages.
 
Why? If we have past messages to load from, it is likely we also have an
existing summary. This is particularly helpful in cases where the chat
is ephemeral and/or is backed by serverless where the chat history is
not stored but where the updated chat history is passed back and forth
between a backend/frontend.

Eg: Take a stateless qa backend implementation that loads messages on
every request and generates a response — without this addition, each
time the messages are loaded via from_messages, the summaries are
recomputed even though they may have just been computed during the
previous response. With this, the previously computed summary can be
passed in, which avoids:
  1) spending extra $$$ on tokens, and
  2) increased response time from regenerating a previously generated
summary.
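
The saving described above can be sketched as follows. This is a minimal, hypothetical model of the idea rather than LangChain's API: `SummaryMemory`, its `from_messages` signature, and `counting_summarizer` are illustrative names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SummaryMemory:
    """Hypothetical memory that holds chat messages plus a running summary."""

    messages: List[str]
    buffer: str = ""

    @classmethod
    def from_messages(
        cls,
        messages: List[str],
        summarize: Callable[[List[str]], str],
        buffer: Optional[str] = None,
    ) -> "SummaryMemory":
        # Reuse the caller's summary when given; otherwise pay for a new one.
        summary = buffer if buffer is not None else summarize(messages)
        return cls(messages=list(messages), buffer=summary)


call_log: List[int] = []


def counting_summarizer(messages: List[str]) -> str:
    """Stand-in for an LLM call; records each time a summary is regenerated."""
    call_log.append(len(messages))
    return f"{len(messages)} messages so far"


history = ["Hi", "Hello! How can I help?"]
first = SummaryMemory.from_messages(history, counting_summarizer)
# A later stateless request passes the previous summary back in.
second = SummaryMemory.from_messages(history, counting_summarizer, buffer=first.buffer)
```

Only the first call invokes the summarizer; the second reuses the summary it was handed.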

Tag maintainer: @hwchase17
Twitter handle: https://twitter.com/ShantanuNair

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
DJ Atha ec40ead980
Fixed bug 7445 where a duplicate result_id is added to the vectorstore. (#7573)
- Description: updated BabyAGI examples to append the iteration to the
result id to fix error storing data to vectorstore.
  - Issue: 7445
  - Dependencies: no
  - Tag maintainer: @eyurtsev

This fix worked for me locally. Happy to take some feedback and iterate
on a better solution. I was considering appending a uuid instead but
didn't want to overcomplicate the example.
1 year ago
Kenny 1e8fca5518
Add ConcurrentLoader (#7512)
Works just like the GenericLoader but concurrently for those who choose
to optimize their workflow.
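
To illustrate the idea of loading concurrently, here is a minimal standard-library sketch. It is not the `ConcurrentLoader` implementation; `load_file` and `load_concurrently` are hypothetical helpers, and a thread pool is just one way to get concurrency for I/O-bound reads.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def load_file(path: Path) -> str:
    """Read one file; stands in for parsing a single document."""
    return path.read_text()


def load_concurrently(paths: List[Path], num_workers: int = 4) -> List[str]:
    """Load all paths with a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map yields results in the order of `paths`, not completion order.
        return list(pool.map(load_file, paths))


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        p = Path(tmp) / f"doc_{i}.txt"
        p.write_text(f"document {i}")
        paths.append(p)
    documents = load_concurrently(paths)
```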

@rlancemartin @eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Danny Davenport 8d2344db43
updates some spelling mistakes (#8537)
Just updating some spelling / grammar issues in the documentation. No
code changes.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Leonid Kuligin b4a126ae71
Updated docs on Vertex AI going GA (#8531)
#8074

Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Bharat Raghunathan c19a0b9c10
doc(prompts): Follow up on broken Prompt Sublink pages (#8530)
- Description: Follow up of #8478  
  - Issue: #8477
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: [@BharatR123](twitter.com/BharatR123)

The links were still broken after #8478, and sadly the issue was not
caught by either the Vercel app build or `make docs_linkcheck`.
1 year ago
Harrison Chase bca0749a11
conversational retrieval chain in lcel (#8532) 1 year ago
Jeff Huber 07d6d1ca38
fix error in chroma docker instructions (#8533)
This makes the Chroma instructions for Docker work! 


https://python.langchain.com/docs/integrations/vectorstores/chroma#basic-example-using-the-docker-container
1 year ago
Matthew DeGuzman 844eca98d5
Add LLaMa Formatter and AzureML Chat Endpoint (#8382)
## Description

Microsoft and Meta recently [announced their
collaboration](https://blogs.microsoft.com/blog/2023/07/18/microsoft-and-meta-expand-their-ai-partnership-with-llama-2-on-azure-and-windows/)
on LLaMa2. This PR extends the current LLM wrapper and introduces a new
Chat Model wrapper for AzureML to support LLaMa2.

## Dependencies

No dependencies added :)

## Twitter Handles

[@matthew_d13](https://twitter.com/matthew_d13)
[@prakhar_in](https://twitter.com/prakhar_in)

maintainers - @hwchase17, @baskaryan
1 year ago
Anthony Mahanna 1ab773c742
docs: Update ArangoDB Colab URL (#8547)
1-commit PR to update the Google Colab URL of the ArangoDB Graph QA
Chain notebook
1 year ago
Harrison Chase 5e3b968078
router runnable (#8496)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Anubhav Bindlish 913a156cff
Minor improvements to rockset vectorstore (#8416)
This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 893f3014af add xml agent notebook 1 year ago
Harrison Chase 6556a8fcfd
add initial anthropic agent (#8468)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Muhammed Al-Dulaimi 9975ba4124
Fix ChromaDB integration -> docker container instructions (#8447)
## Description
This PR handles modifying the Chroma DB integration's documentation.
It modifies the **Docker container** example to fix the instructions
mentioned in the documentation.
In the current documentation, the below `client.reset()` line causes a
runtime error:

```py
...
client = chromadb.HttpClient(settings=Settings(allow_reset=True))
client.reset()  # resets the database
collection = client.create_collection("my_collection")
...
```

`Exception: {"error":"ValueError('Resetting is not allowed by this
configuration')"}`

This is because the Chroma DB server also needs the `allow_reset`
flag set to `true`.
This is fixed by adding the `ALLOW_RESET=TRUE` environment variable to
the `docker-compose` file for the Docker container before spinning it
up.

## Issue
This fixes the runtime error that occurs when running the docker
container example code

## Tag Maintainer
@rlancemartin, @eyurtsev
1 year ago
Nicolas Raoul 7f9c6c3baa
Fixed typo: papaer -> paper (#8500) 1 year ago
Piyush Jain b2f8a5bae9
Fixed exports for NeptuneOpenCypherQAChain (#8439)
## Description
The imports for `NeptuneOpenCypherQAChain` are failing. This PR adds the
chain class to the `__init__.py` file to fix this issue.

## Maintainers
@dev2049 
@krlawrence
1 year ago
Bharat Raghunathan 04ebdbe98f
doc(prompts): Add redirects in Prompt subcategories pages (#8478)
- Description: Fixes broken links in some Prompts subcategories in
documentation (Example Selectors, Prompt Templates)
  - Issue: #8477 (Fixes #8477)
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: [@BharatR123](https://twitter.com/BharatR123)
1 year ago
Ludwig Hubert 08f5e6b801
Fix documentation for from_documents signature (#8482)
Docs for from_documents() were outdated as seen in
https://github.com/langchain-ai/langchain/issues/8457 .

fixes #8457 

1 year ago
Muneeb Ahmad 4923cf029a
Added Proper Documentation for `faiss-gpu` Installation (#8492)
### Description
In the LangChain documentation and comments, I noticed that `pip
install faiss` was mentioned instead of `pip install faiss-gpu`, even
though `pip install faiss` results in an error. I've updated the
documentation and `faiss.ipynb`. This change makes it easier for end
users to install `faiss-gpu`.

### Issue: 
Documentation / Comments Related.

### Dependencies:
No dependencies were changed; only the files with the wrong
reference were updated.

### Tag maintainer:
 @rlancemartin, @eyurtsev (Thank You for your contributions 😄 )
1 year ago
Harrison Chase 8f14ddefdf
add anthropic functions wrapper (#8475)
a cheeky wrapper around claude that adds in function calling support
(kind of, hence it going in experimental)
1 year ago
Harrison Chase 490ad93b3c
fix links generation (#8471) 1 year ago
Harrison Chase ae4638aa35
improve notebooks (#8461) 1 year ago
Harrison Chase 412fa4e1db
add guide notebook (#8258)

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
William FH b7c0eb9ecb
Wfh/ref links (#8454) 1 year ago
William FH 7d79178827
Wfh/update guide imports (#8452) 1 year ago
Harrison Chase 17953ab61f
add notebook for sql query (#8442) 1 year ago
Zack Proser 3892cefac6
Minor fixes to enhance notebook usability: (#8389)
- Install langchain
- Set Pinecone API key and environment as env vars
- Create Pinecone index if it doesn't already exist
---
- Description: Fix a couple minor issues I came across when running this
notebook,
  - Dependencies: none,
  - Tag maintainer: @rlancemartin @eyurtsev,
  - Twitter handle: @zackproser (certainly not necessary!)
1 year ago
Amélie 8ee56b9a5b
Feature: Add support for meilisearch vectorstore (#7649)
**Description:**

Add support for Meilisearch vector store.
Resolve #7603 

- No external dependencies added
- A notebook has been added

@rlancemartin

https://twitter.com/meilisearch

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bharat Raghunathan 62b8b459c6
doc(prompts): Add redirect to fix broken link on Prompts Page (#8408)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 2311d57df4
mv dropbox (#8438) 1 year ago
Bagatur 2db2987b1b
add experimental ref (#8435) 1 year ago
HeTaoPKU d5884017a9
Add Minimax llm model to langchain (#7645)
- Description: Minimax is a great AI startup from China. They recently
released their latest model and chat API, and the API is widely used
in China. As a result, I'd like to add the Minimax llm model to
Langchain.
- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: the <tao.he@hulu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 1b0bfa54cf cr 1 year ago
Jiayi Ni 1efb9bae5f
FEAT: Integrate Xinference LLMs and Embeddings (#8171)
- [Xorbits
Inference(Xinference)](https://github.com/xorbitsai/inference) is a
powerful and versatile library designed to serve language, speech
recognition, and multimodal models. Xinference supports a variety of
GGML-compatible models including chatglm, whisper, and vicuna, and
utilizes heterogeneous hardware and a distributed architecture for
seamless cross-device and cross-server model deployment.
- This PR integrates Xinference models and Xinference embeddings into
LangChain.
- Dependencies: To install the dependencies for this integration, run
    
    `pip install "xinference[all]"`
    
- Example Usage:

To start a local instance of Xinference, run `xinference`.

To deploy Xinference in a distributed cluster, first start an Xinference
supervisor using `xinference-supervisor`:

`xinference-supervisor -H "${supervisor_host}"`

Then, start the Xinference workers using `xinference-worker` on each
server you want to run them on.

`xinference-worker -e "http://${supervisor_host}:9997"`

To use Xinference with LangChain, you also need to launch a model. You
can use the command line interface (CLI) to do so. For example:
`xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a
model named vicuna-v1.3 with `model_format="ggmlv3"` and
`quantization="q4_0"`. A model UID is returned for you to use.

Now you can use Xinference with LangChain:

```python
from langchain.llms import Xinference

llm = Xinference(
    server_url="http://0.0.0.0:9997",  # suppose the supervisor_host is "0.0.0.0"
    model_uid=model_uid,  # model UID returned from launching a model
)

llm(
    prompt="Q: where can we visit in the capital of France? A:",
    generate_config={"max_tokens": 1024},
)
```

You can also use the RESTful client to launch a model:
```python
from xinference.client import RESTfulClient

client = RESTfulClient("http://0.0.0.0:9997")

model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0")
```

The following code block demonstrates how to use Xinference embeddings
with LangChain:
```python
from langchain.embeddings import XinferenceEmbeddings

xinference = XinferenceEmbeddings(
    server_url="http://0.0.0.0:9997",
    model_uid = model_uid
)
```

```python
query_result = xinference.embed_query("This is a test query")
```

```python
doc_result = xinference.embed_documents(["text A", "text B"])
```

Xinference is still under rapid development. Feel free to [join our
Slack
community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA)
to get the latest updates!

- Request for review: @hwchase17, @baskaryan
- Twitter handle: https://twitter.com/Xorbitsio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Gordon Clark e66759cc9d
Github add "Create PR" tool + Docs update (#8235)
Added a new tool to the Github toolkit called **Create Pull Request.**
Now we can make our own langchain contributor in langchain 😁

In order to have somewhere to pull from, I also added a new env var,
"GITHUB_BASE_BRANCH." This will allow the existing env var,
"GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't
have to always commit on the main/master). For example, if you want the
bot to work in a branch called `bot_dev` and your repo base is `main`,
you would set up the vars like:
```
GITHUB_BASE_BRANCH = "main"
GITHUB_BRANCH = "bot_dev"
``` 
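
As a rough illustration of how the two variables relate, the sketch below reads them from an environment mapping and derives the head and base of a pull request. `pull_request_branches` is a hypothetical helper, and the fallback to `main` is an assumption, not the toolkit's behavior.

```python
import os
from typing import Dict, Mapping


def pull_request_branches(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the branch the bot commits to (head) and the branch it targets (base)."""
    base = env.get("GITHUB_BASE_BRANCH", "main")  # assumed default
    head = env.get("GITHUB_BRANCH", base)  # work on base if no bot branch is set
    return {"head": head, "base": base}


branches = pull_request_branches(
    {"GITHUB_BASE_BRANCH": "main", "GITHUB_BRANCH": "bot_dev"}
)
```

With the values from the example above, the bot would commit to `bot_dev` and open its pull request against `main`.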

Maintainer responsibilities:
  - Agents / Tools / Toolkits: @hinthornw
1 year ago
William FH ecd4aae818
Few Shot Chat Prompt (#8038)
Proposal for a few shot chat message example selector

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Karan V a003a0baf6
fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356)
In this PR:

- Removed restricted model loading logic for Petals-Bloom
- Removed petals imports (DistributedBloomForCausalLM,
BloomTokenizerFast)
- Instead imported more generalized versions of loader
(AutoDistributedModelForCausalLM, AutoTokenizer)
- Updated the Petals example notebook to allow for a successful
installation of Petals in Apple Silicon Macs

- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Harrison Chase 25b8cc7e3d
Harrison/update memory docs (#8384)
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Taozhi Wang 594f195e54
Add embeddings for AwaEmbedding (#8353)
- Description: Adds AwaEmbeddings class for embeddings, which provides
users with a convenient way to do fine-tuning, as well as the potential
need for multimodality

  - Tag maintainer: @baskaryan

Create `Awa.ipynb`: an example notebook for AwaEmbeddings class
Modify `embeddings/__init__.py`: Import the class
Create `embeddings/awa.py`: The embedding class
Create `embeddings/test_awa.py`: The test file.

---------

Co-authored-by: taozhiwang <taozhiwa@gmail.com>
1 year ago
bheroder dc3ca44e05
Add an example for azure ml managed feature store (#8324)
We are adding an example of how one can connect to azure ml managed
feature store and use such a prompt template in an LLM chain. @baskaryan
1 year ago
evelynmitchell 539574670c
Update tot.ipynb (#8387)
Spelling error fix

1 year ago
Sachin Varghese 01217b2247
Update sql database agent example (#8354)
This PR fixes a minor documentation issue on the SQL database toolkit
example notebook.
1 year ago
Bagatur 55beab326c
cleanup warnings (#8379) 1 year ago
William FH 41524304bf
Update local script for docs build (#8377) 1 year ago
Bagatur 68763bd25f
mv popular and additional chains to use cases (#8242) 1 year ago
William FH 94a693e2ee
Link to use cases from tutorials (#8371) 1 year ago
Rubén Barragán ef6332ead6
Support loading files from Dropbox (#8271)
## Description
This commit introduces the `DropboxLoader` class, a new document loader
that allows loading files from Dropbox into the application. The loader
relies on a Dropbox app, which requires creating an app on Dropbox,
obtaining the necessary scope permissions, and generating an access
token. Additionally, the dropbox Python package is required.

The `DropboxLoader` class is designed to be used as a document loader
for processing various file types, including text files, PDFs, and
Dropbox Paper files.

## Dependencies
`pip install dropbox` and `pip install unstructured` for PDF reading.

## Tag maintainer
@rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some
feedback here 🙏 .

## Social Networks
https://github.com/rubenbarragan
https://www.linkedin.com/in/rgbarragan/
https://twitter.com/RubenBarraganP

---------

Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>
1 year ago
Ikko Eltociear Ashimine 934ea80780
Fix typo in Etherscan.ipynb (#8340)
specifc  -> specific
1 year ago
Vadim Gubergrits e7e5cb9d08
Tree of Thought introducing a new ToTChain. (#5167)
# [WIP] Tree of Thought introducing a new ToTChain.

This PR adds a new chain called ToTChain that implements the ["Large
Language Model Guided
Tree-of-Thought"](https://arxiv.org/pdf/2305.08291.pdf) paper.

There's a notebook example `docs/modules/chains/examples/tot.ipynb` that
shows how to use it.


Implements #4975


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @vowelparrot

---------

Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
William FH 412e29d436
Fix notebook that 'cannot convert' via nbdoc_build (#8333) 1 year ago
William FH 9eb7e6e27f
Delete Old Evals Examples (#8252)
Still retain:
- Comparison Examples
- Data + QA walkthrough
- QA (but really minimize it)
1 year ago
Fabrizio Ruocco ddc353a768
Azure Cognitive Search: Custom index and scoring profile support (#6843)
Description: Adding custom index and scoring profile support in Azure
Cognitive Search
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Leonid Ganeline ed24de8467
removed namespace title (#8208)
This change compacts the left-side Navbar (ToC) of the [API
Reference](https://api.python.langchain.com/en/latest/api_reference.html).
Currently, almost every namespace item is split into two lines. For example
`langchain.chat_models: Chat Models`
We remove the `Chat Models` title and leave only `langchain.chat_models`.
This effectively compacts the navbar and increases the main page's
usability. On my screen, it reduces the number of lines in the ToC from 28 to 18,
which is huge.

Removing the namespace "title" (like `Chat Models`) does not remove any
information because the title is composed directly from the namespace.
API Reference users are developers. Usability for them is very
important. We see less text => we find faster.
1 year ago
Kacper Łukawski c5988c1d4b
Implement async support for Cohere (#8237)
This PR introduces async API support for Cohere, both LLM and
embeddings. It requires updating `cohere` package to `^4`.

Tagging @hwchase17, @baskaryan, @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur ceab0a7c1f
update api ref style (#8318) 1 year ago
William FH 01a9b06400
Add api cross ref linking (#8275)
Example of how it would show up in our python docs:


![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15)


Examples added to the reference docs:

https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma


![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)
1 year ago
Riche Akparuorji f3d2fdd54c
Fix for code snippet in documentation (#8290)
- Description: I fixed an issue in the code snippet related to the
variable name and the evaluation of its length. The original code used
the variable "docs," but the correct variable name is "docs_svm" after
using the SVMRetriever.
- maintainer: @baskaryan
- Twitter handle: @iamreechi_

Co-authored-by: iamreechi <richieakparuorji>
1 year ago
Bagatur f27176930a
fix geopandas link (#8305) 1 year ago
Timon Palm 70604e590f
DuckDuckGoSearch News Tool (#8292)
Description: 
I wanted to use the DuckDuckGoSearch tool in an agent to let it get the
latest news for a topic. DuckDuckGoSearch already has a function
implemented for retrieving news articles, but there wasn't a tool to use
it. I simply adapted the SearchResult class with an extra argument
"backend". You can set it to "news" to only get news articles.

Furthermore, I added an example to the DuckDuckGo Notebook on how to
further customize the results by using the DuckDuckGoSearchAPIWrapper.

Dependencies: no new dependencies
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Aarav Borthakur 8ce661d5a1
Docs: Fix Rockset links (#8214)
Fix broken Rockset links.

Right now links at
https://python.langchain.com/docs/integrations/providers/rockset are
broken.
1 year ago
Jon Bennion ad38eb2d50
correction to reference to code (#8301)
- Description: fixes typo referencing code

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Naveen Tatikonda 9cbefcc56c
[ OpenSearch ] : Add AOSS Support to OpenSearch (#8256)
### Description

This PR includes the following changes:

- Adds AOSS (Amazon OpenSearch Service Serverless) support to
OpenSearch. Please refer to the documentation on how to use it.
- While creating an index, AOSS only supports Approximate Search with
`nmslib` and `faiss` engines. During Search, only Approximate Search and
Script Scoring (on doc values) are supported.
- This PR also adds support to `efficient_filter` which can be used with
`faiss` and `lucene` engines.
- The `lucene_filter` is deprecated. Instead please use the
`efficient_filter` for the lucene engine.


Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Lance Martin 7a00f17033
Web research retriever (#8102)
Given a user question, this will -
* Use LLM to generate a set of queries.
* Query for each.
* The URLs from search results are stored in self.urls.
* A check is performed for any new URLs that haven't been processed yet
(not in self.url_database).
* Only these new URLs are loaded, transformed, and added to the
vectorstore.
* The vectorstore is queried for relevant documents based on the
questions generated by the LLM.
* Only unique documents are returned as the final result.

This code will avoid reprocessing of URLs across multiple runs of
similar queries, which should improve the performance of the retriever.
It also keeps track of all URLs that have been processed, which could be
useful for debugging or understanding the retriever's behavior.
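
The URL bookkeeping described above can be sketched roughly as follows.
This is a minimal illustration, not the retriever's actual code: the
`UrlCache` class, the `search`/`load` callables, and the dict standing in
for the vectorstore are all hypothetical; only the `url_database` name
comes from the description.

```python
from typing import Callable, Dict, List, Set


class UrlCache:
    """Hypothetical helper: load each URL at most once across runs."""

    def __init__(
        self,
        search: Callable[[str], List[str]],
        load: Callable[[str], str],
    ) -> None:
        self.search = search  # stand-in for the search API call
        self.load = load  # stand-in for loading + transforming a page
        self.url_database: Set[str] = set()  # every URL processed so far
        self.store: Dict[str, str] = {}  # stand-in for the vectorstore

    def ingest(self, queries: List[str]) -> List[str]:
        """Run every query and load only URLs not seen before."""
        new_urls: List[str] = []
        for query in queries:
            for url in self.search(query):
                if url not in self.url_database:
                    self.url_database.add(url)
                    new_urls.append(url)
        for url in new_urls:
            self.store[url] = self.load(url)
        return new_urls


# Usage with fake search/load functions (no network access needed).
load_calls: List[str] = []


def fake_search(query: str) -> List[str]:
    return ["https://example.com/a", "https://example.com/" + query]


def fake_load(url: str) -> str:
    load_calls.append(url)
    return "contents of " + url


cache = UrlCache(fake_search, fake_load)
first = cache.ingest(["x", "y"])  # loads a, x, y
second = cache.ingest(["x", "z"])  # only z is new
```

On the second call only the unseen URL is loaded, which is the
reprocessing saving the description refers to.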

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Byron Saltysiak 68a906bb31
added lxml to the pip install example since it is required (#8260)
- Description: The trello dataloader example didn't work without an
additional dependency installed - lxml
  - Issue: na
1 year ago
Emory Petermann 7734a2b5ab
update golden-query notebook and fix typo in golden docs (#8253)
updating the documentation to be consistent for Golden query tool and
have a better introduction to the tool
1 year ago
William FH dd87275dde
Add LLMChain example of memory with chat models (#8250) 1 year ago
William FH 1f40d3e094
Update Broken Links (#8247) 1 year ago
William FH 30c2d3cd06
Update references (#8243) 1 year ago
William FH 0a16b3d84b
Update Integrations links (#8206) 1 year ago
Dayuan Jiang 125ae6d9de
add Hybrid retriever that not require any external service (#8108)
- Until now, hybrid search was limited to modules requiring external
services, such as Weaviate/Pinecone Hybrid Search. However, I have
developed a hybrid retriever that can merge a list of retrievers using
the [Reciprocal Rank
Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf)
algorithm. This new approach, similar to Weaviate hybrid search, does
not require the initialization of any external service.
  - Dependencies: No
  - Twitter handle: dayuanjian21687
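
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal
sketch of the scoring rule from the linked paper. It is an illustration
only, not the code added in this PR: the function name, the document IDs,
and the choice of `k = 60` (the constant used in the paper) are
assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Merge several ranked lists into one using RRF scores."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1; each list contributes 1 / (k + rank).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that rank well in both lists (`doc_b`, `doc_a`) end up ahead of
documents that appear in only one, and no external service is involved.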

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Dario Ruben 04e45f9cde
Fixed grammar in LLM models documentation (#8210)
Description: I fixed a typo in the documentation related to LLMs
(https://python.langchain.com/docs/modules/model_io/models/llms/)
1 year ago
Taqi Jaffri 8f158b72fc
Added stop sequence support to replicate (#8107)
Stop sequences are useful if you are doing long-running completions and
need to early-out rather than running for the full max_length... not
only does this save inference cost on Replicate, it is also much faster
if you are going to truncate the output later anyway.

Other LLMs support stop sequences natively (e.g. OpenAI) but I didn't
see this for Replicate so adding this via their prediction cancel
method.
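
The early-out idea can be sketched as below. This is a simplified
illustration, not the integration's code: it reads from a plain Python
iterator of text chunks, and the comment marks where a real client would
cancel the running prediction. The function and variable names are
hypothetical.

```python
from typing import Iterable, List, Optional


def generate_until_stop(
    chunks: Iterable[str], stop: Optional[List[str]] = None
) -> str:
    """Accumulate streamed text and cut it at the first stop sequence."""
    text = ""
    for chunk in chunks:
        text += chunk
        # Positions of any stop sequences present in the text so far.
        hits = [text.find(s) for s in (stop or []) if s in text]
        if hits:
            # A real client would cancel the running prediction here.
            return text[: min(hits)]
    return text


consumed: List[str] = []


def fake_stream() -> Iterable[str]:
    # Stand-in for a streaming completion; records what was actually read.
    for piece in ["Hello", " wor", "ld\nHum", "an: hi", " never read"]:
        consumed.append(piece)
        yield piece


result = generate_until_stop(fake_stream(), stop=["\nHuman:"])
```

The stop sequence is detected even though it is split across two chunks,
and the final chunk is never consumed.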

Housekeeping: I ran `make format` and `make lint`, no issues reported in
the files I touched.

I did update the replicate integration test and ran `poetry run pytest
tests/integration_tests/llms/test_replicate.py` successfully.

Finally, I am @tjaffri https://twitter.com/tjaffri for feature
announcement tweets... or if you could please tag @docugami
https://twitter.com/docugami we would really appreciate that :-)

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
1 year ago
glaze f7ad14acfa
Add etherscan document loader (#7943)
@rlancemartin 
The modification includes:
* etherscanLoader
* test_etherscan
* document ipynb

I have run the test, lint, format, and spell checks. I did encounter a
linting error in the ipynb file, and I am not sure how to address it:
```
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined  [name-defined]
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined  [name-defined]
Found 2 errors in 1 file (checked 1 source file)
```
- Description: The Etherscan loader uses the Etherscan API to load
transaction histories for specific accounts on Ethereum Mainnet.
- No dependency is introduced by this PR.
- Twitter handle: glazecl

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 483f6c2fe3
mv eval docs (#8209) 1 year ago
Liu Ming 24f889f2bc
Change with_history option to False for ChatGLM by default (#8076)
The ChatGLM LLM integration by default accumulates conversation history
(with_history=True) and sends it to the ChatGLM backend API, which is not
expected in most cases. This PR sets with_history=False by default; users
should explicitly set llm.with_history=True to turn this feature on.
Related PRs: #8048 #7774

---------

Co-authored-by: mlot <limpo2000@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Anthony Mahanna 76102971c0
ArangoDB/AQL support for Graph QA Chain (#7880)
**Description**: Serves as an introduction to LangChain's support for
[ArangoDB](https://github.com/arangodb/arangodb), similar to
https://github.com/hwchase17/langchain/pull/7165 and
https://github.com/hwchase17/langchain/pull/4881

**Issue**: No issue has been created for this feature

**Dependencies**: `python-arango` has been added as an optional
dependency via the `CONTRIBUTING.md` guidelines
 
**Twitter handle**: [at]arangodb

- Integration test has been added
- Notebook has been added:
[graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb)

[![Open In
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb)

```
docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb
```

```
pip install git+https://github.com/amahanna/langchain.git
```

```python
from arango import ArangoClient

from langchain.chat_models import ChatOpenAI
from langchain.graphs import ArangoGraph
from langchain.chains import ArangoGraphQAChain

# The host URL needs an explicit scheme for the HTTP client.
db = ArangoClient(hosts="http://localhost:8529").db(name="_system", username="root", password="", verify=True)

graph = ArangoGraph(db)

chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph)

chain.run("Is Ned Stark alive?")
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Adilkhan Sarsen 3e7d2a1b64
SelfQuery support for deeplake (#7888)
Added SelfQuery support for Deeplake
1 year ago
Juan José Torres 1cc7d4c9eb
Update SageMaker Endpoint Embeddings docs to be up to date with current requirements (#8103)
- **Description:** Simple change of the class that ContentHandler
inherits from. To create an object of type SagemakerEndpointEmbeddings,
the property content_handler must now be of type EmbeddingsContentHandler
rather than ContentHandlerBase.
  - **Twitter handle:** @Juanjo_Torres11

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 1a7d8667c8
Bagatur/gateway chat (#8198)
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: dbczumar <corey.zumar@databricks.com>
1 year ago
Ettore Di Giacinto ae28568e2a
Add embeddings for LocalAI (#8134)
Description:

This PR adds embeddings for LocalAI (
https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in
replacement. As LocalAI can re-use OpenAI clients, it mostly follows
the lines of the OpenAI embeddings. However, when embedding documents, it
sends plain strings instead of tokens, because sending tokens is
best-effort depending on the model being used in LocalAI. Sending tokens
is also tricky because token IDs can mismatch with the model, so it's safer
to just send strings in this case.

Partly related to: https://github.com/hwchase17/langchain/issues/5256

Dependencies: No new dependencies

Twitter: @mudler_it
---------

Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago