Commit Graph

714 Commits (main)
 

Author SHA1 Message Date
Ankush Gola 7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
1 year ago
Harrison Chase 45b5640fe5
fix sql (#1141) 1 year ago
Sam Hogan 85c1449a96
Fix typo in HyDE docs (#1142) 1 year ago
kekayan 9111f4ca8a
fix chatvectordbchain to use pinecone namespace (#1139)
In the similarity search, the pinecone namespace is not used, which
makes the bot return _I don't know_ where the embeddings are stored in
the pinecone namespace. Now we can query by passing the namespace
optionally.
```result = qa({"question": query, "chat_history": chat_history, "namespace":"01gshyhjcfgkq1q5wxjtm17gjh"})```
1 year ago
Harrison Chase fb3c73d194
add srt loader (#1140) 1 year ago
Francisco Ingham 3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
1 year ago
Harrison Chase 483821ea3b
fix docs (#1133) 1 year ago
Harrison Chase ee3590cb61
instruct embeddings docs (#1131) 1 year ago
Noah Gundotra 8c5fbab72d
[Integration Tests] Cast fake embeddings to ALL float values (#1102)
Pydantic validation breaks tests for example (`test_qdrant.py`) because
fake embeddings contain an integer.

This PR casts the embeddings array to all floats.

Now the `qdrant` test passes, `poetry run pytest
tests/integration_tests/vectorstores/test_qdrant.py`
1 year ago
Harrison Chase d5f3dfa1e1
Harrison/hn loader (#1130)
Co-authored-by: William X <william.y.xuan@gmail.com>
1 year ago
Tom Bocklisch 47c3221fda
Max marginal relecance search fails if there are not enough docs (#1117)
Implementation fails if there are not enough documents. Added the same
check as used for similarity search.

Current implementation raises
```  
File ".venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 160, in max_marginal_relevance_search
    _id = self.index_to_docstore_id[i]
KeyError: -1
```
1 year ago
Harrison Chase 511d41114f
return source documents for chat vector db chain (#1128) 1 year ago
Jon Luo c39ef70aa4
fix for database compatibility when getting table DDL (#1129)
#1081 introduced a method to get DDL (table definitions) in a manner
specific to sqlite3, thus breaking compatibility with other non-sqlite3
databases. This uses the sqlite3 command if the detected dialect is
sqlite, and otherwise uses the standard SQL `SHOW CREATE TABLE`. This
should fix #1103.
1 year ago
yakigac 1ed708391e
Fix a bug that shows "KeyError 'items'" (#1118)
Fix KeyError 'items' when no result found.

## Problem

When no result found for a query, google search crashed with `KeyError
'items'`.

## Solution

I added a check for an empty response before accessing the 'items' key.
It will handle the case correctly.

## Other

my twitter: yakigac
(I don't mind even if you don't mention me for this PR. But just because
last time my real name was shout out :) )
1 year ago
Matt Robinson 2bee8d4941
feat: add support for `.ppt` files in `UnstructuredPowerPointLoader` (#1124)
###  Summary

Adds support for older `.ppt` file in the PowerPoint loader. 

### Testing

The following should work on `unstructured==0.4.11` using the example
docs from the `unstructured` repo.

```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.pptx"
loader = UnstructuredPowerPointLoader(filename)
loader.load()

filename = "../unstructured/example-docs/fake-power-point.ppt"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```

Now downgrade `unstructured` to version `0.4.10`. The following should
work:

```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.pptx"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```

and the following should give you a `ValueError` and invite you to
upgrade `unstructured`.


```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.ppt"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```
1 year ago
Matt Robinson b956070f08
docs: add an unstructured section to the ecosystem page (#1125)
### Summary

Adds an Unstructured section to the ecosystem page.
1 year ago
Hasegawa Yuya 383c67c1b2
Fix Issue #1100 (#1101)
https://github.com/hwchase17/langchain/issues/1100
When faiss data and doc.index are created in past versions, error occurs
that say there was no attribute. So I put hasattr in the check as a
simple solution.

However, increasing the number of such checks is not good for
conservatism, so I think there is a better solution.


Also, the code for the batch process was left out, so I put it back in.
1 year ago
Harrison Chase 3f50feb280
fix telegram imports (#1110) 1 year ago
trigaten 6fafcd0a70
Strange behavior with LLM import requirements (#1104)
This import works fine:
```python
from langchain import Anthropic
```
This import does not:
```python
from langchain import AI21
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'AI21' from 'langchain' (/opt/anaconda3/envs/fed_nlp/lib/python3.9/site-packages/langchain/__init__.py)
```

I think there is a slight documentation inconsistency here:
https://langchain.readthedocs.io/en/latest/reference/modules/llms.html

This PR starts to solve that. Should all the import examples be
`from langchain.llms import X` instead of `from langchain import X`?
1 year ago
Kacper Łukawski ab1a3cccac
Hotfix: Qdrant content retrieval (revert: #1088) (#1093)
The #1088 introduced a bug in Qdrant integration. That PR reverts those
changes and provides class attributes to ensure consistent payload keys.
In addition to that, an exception will be thrown if any of texts is None
(that could have been an issue reported in #1087)
1 year ago
Harrison Chase 6322b6f657
bump version 0.0.88 (#1090) 1 year ago
Francisco Ingham 3462130e2d
Modify number of types of chains (#1089)
Changed number of types of chains to make it consistent with the rest of
the docs
1 year ago
Rishabh Raizada 5d11e5da40
Update qdrant.py (#1088)
Fixes #1087
1 year ago
Harrison Chase 7745505482
chat qa with sources (#1084) 1 year ago
Harrison Chase badeeb37b0
fix stuff count (#1083) 1 year ago
Harrison Chase 971458c5de
docs for batch size (#1082) 1 year ago
Harrison Chase 5e10e19bfe
Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Harrison Chase c60954d0f8
Harrison/telegram loader (#1080)
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
1 year ago
Dennis Antela Martinez a1c296bc3c
docs: increase width (#1049)
This addresses #948.

I set the documentation max width to 2560px, but can be adjusted - see
screenshot below.

<img width="1741" alt="Screenshot 2023-02-14 at 13 05 57"
src="https://user-images.githubusercontent.com/23406704/218749076-ea51e90a-a220-4558-b4fe-5a95b39ebf15.png">
1 year ago
Harrison Chase c96ac3e591
Harrison/semantic subset (#1079)
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
1 year ago
Harrison Chase 19c2797bed
add anthropic example (#1041)
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
1 year ago
blob42 3ecdea8be4
SearxNG meta search api helper (#854)
This is a work in progress PR to track my progres.

## TODO:

- [x]  Get results using the specifed searx host
- [x]  Prioritize returning an  `answer`  or results otherwise
    - [ ] expose the field `infobox` when available
    - [ ] expose `score` of result to help agent's decision
- [ ] expose the `suggestions` field to agents so they could try new
queries if no results are found with the orignial query ?

- [ ] Dynamic tool description for agents ?
- Searx offers many engines and a search syntax that agents can take
advantage of. It would be nice to generate a dynamic Tool description so
that it can be used many times as a tool but for different purposes.

- [x]  Limit number of results
- [ ]   Implement paging
- [x]  Miror the usage of the Google Search tool
- [x] easy selection of search engines
- [x]  Documentation
    - [ ] update HowTo guide notebook on Search Tools
- [ ] Handle async 
- [ ]  Tests

###  Add examples / documentation on possible uses with
 - [ ]  getting factual answers with `!wiki` option and `infoboxes`
 - [ ]  getting `suggestions`
 - [ ]  getting `corrections`

---------

Co-authored-by: blob42 <spike@w530>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Hasegawa Yuya e08961ab25
Fixed openai embeddings to be safe by batching them based on token size calculation. (#991)
I modified the logic of the batch calculation for embedding according to
this cookbook

https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
1 year ago
seanaedmiston f0a258555b
Support similarity search by vector (in FAISS) (#961)
Alternate implementation to PR #960 Again - only FAISS is implemented.
If accepted can add this to other vectorstores or leave as
NotImplemented? Suggestions welcome...
1 year ago
Jonathan Pedoeem 05ad399abe
Update PromptLayerOpenAI LLM to include support for ASYNC API (#1066)
This PR updates `PromptLayerOpenAI` to now support requests using the
[Async
API](https://langchain.readthedocs.io/en/latest/modules/llms/async_llm.html)
It also updates the documentation on Async API to let users know that
PromptLayerOpenAI also supports this.

`PromptLayerOpenAI` now redefines `_agenerate` a similar was to how it
redefines `_generate`
1 year ago
Harrison Chase 98186ef180
Harrison/evernote nb (#1078)
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
1 year ago
rogerserper e46cd3b7db
Google Search API integration with serper.dev (wrapper, tests, docs, … (#909)
Adds Google Search integration with [Serper](https://serper.dev) a
low-cost alternative to SerpAPI (10x cheaper + generous free tier).
Includes documentation, tests and examples. Hopefully I am not missing
anything.

Developers can sign up for a free account at
[serper.dev](https://serper.dev) and obtain an api key.

## Usage

```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool

import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""

llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run
    )
]

self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```

### Output
```
Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain

> Finished chain.

'El Palmar, Spain'
```
1 year ago
Harrison Chase 52753066ef
Harrison/handle stop tokens ai21 (#1077)
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
1 year ago
Akshay d8ed286200
Update and rename everynote.py to evernote.py (#1060)
Updating this base file as well as the .ipynb file of the example on the
website:

https://github.com/hwchase17/langchain/compare/master...akshayvkt:langchain:patch-1

https://langchain.readthedocs.io/en/latest/modules/document_loaders/examples/everynote.html
1 year ago
Jeff Huber 34cba2da32
Fix typo in integration with Chroma (#1070)
We introduced a breaking change but missed this call. This PR fixes
`langchain` to work with upstream `chroma`.
1 year ago
Jonathan Pedoeem 05df480376
Update `PromptLayerOpenAI` LLM usage instructions in documentation (#1053)
This PR updates the usage instructions for PromptLayerOpenAI in
Langchain's documentation. The updated instructions provide more detail
and conform better to the style of other LLM integration documentation
pages.

No code changes were made in this PR, only improvements to the
documentation. This update will make it easier for users to understand
how to use `PromptLayerOpenAI`
1 year ago
Matt Robinson 3ea1e5af1e
feat: added element metadata to unstructured loader (#1068)
### Summary

Adds tracked metadata from `unstructured` elements to the document
metadata when `UnstructuredFileLoader` is used in `"elements"` mode.
Tracked metadata is available in `unstructured>=0.4.9`, but the code is
written for backward compatibility with older `unstructured` versions.

### Testing

Before running, make sure to upgrade to `unstructured==0.4.9`. In the
code snippet below, you should see `page_number`, `filename`, and
`category` in the metadata for each document. `doc[0]` should have
`page_number: 1` and `doc[-1]` should have `page_number: 2`. The example
document is `layout-parser-paper-fast.pdf` from the [`unstructured`
sample
docs](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs).

```python
from langchain.document_loaders import UnstructuredFileLoader
loader = UnstructuredFileLoader(file_path=f"layout-parser-paper-fast.pdf", mode="elements")
docs = loader.load()
```
1 year ago
Harrison Chase bac676c8e7
bump version (#1057) 1 year ago
Ankush Gola d8ac274fc2
add to async chain notebook (#1056) 1 year ago
Ankush Gola caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
1 year ago
Harrison Chase f05f025e41
bump version to 0086 (#1050) 1 year ago
Sasmitha Manathunga c67c5383fd
docs: fix typo in notebook (#1046) 1 year ago
Harrison Chase 88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
1 year ago
Harrison Chase ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Harrison Chase 8c45f06d58
Harrison/standarize prompt loading (#1036)
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
1 year ago