Commit Graph

1094 Commits

Author SHA1 Message Date
Abhinav Upadhyay
9707eda83c
Fix docstring of FAISS constructor (#1611) 2023-03-12 09:31:40 -07:00
Kayvane Shakerifar
7e550df6d4
feat: add lookup index to csv loader to make retrieving the original … (#1612)
feat: add lookup index to csv loader to make retrieving the original csv
information easier using theDocument properties
2023-03-12 09:29:27 -07:00
Harrison Chase
c9b5a30b37
move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Harrison Chase
cb04ba0136
Add support for intermediate steps to SQLDatabaseSequentialChain (#1583) (#1601)
for https://github.com/hwchase17/langchain/issues/1582

I simply added the `return_intermediate_steps` and changed the
`output_keys` function.

I added 2 simple tests, 1 for SQLDatabaseSequentialChain without the
intermediate steps and 1 with

Co-authored-by: brad-nemetski <115185478+brad-nemetski@users.noreply.github.com>
2023-03-11 15:44:41 -08:00
Harrison Chase
5903a93f3d
add convinence method to call chat model as an llm (#1604) 2023-03-11 15:04:57 -08:00
Harrison Chase
15de3e8137
Harrison/docs footer (#1600)
Co-authored-by: Albert Avetisian <albert.avetisian@gmail.com>
2023-03-11 09:18:35 -08:00
Harrison Chase
f95d551f7a
Harrison/shallow metadata (#1599)
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-03-11 09:18:25 -08:00
Harrison Chase
c6bfa00178
bump version to 107 (#1590) 2023-03-10 15:39:30 -08:00
Tim Asp
01a57198b8
[bugfix] Fix persisted chromadb vectorstore (#1444)
If a `persist_directory` param was set, chromadb would throw a warning
that ""No embedding_function provided, using default embedding function:
SentenceTransformerEmbeddingFunction". and would error with a `Illegal
instruction: 4` error.

This is on a MBP M1 13.2.1, python 3.9.

I'm not entirely sure why that error happened, but when using
`get_or_create_collection` instead of `list_collection` on our end, the
error and warning goes away and chroma works as expected.

Added bonus this is cleaner and likely more efficient.
`list_collections` builds a new `Collection` instance for each collect,
then `Chroma` would just use the `name` field to tell if the collection
existed.
2023-03-10 15:14:35 -08:00
Harrison Chase
8dba30f31e
Harrison/kwargs loaders (#1588)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-10 15:05:06 -08:00
Harrison Chase
9f78717b3c
Harrison/callbacks (#1587) 2023-03-10 12:53:09 -08:00
Harrison Chase
90846dcc28
fix chat agent (#1586) 2023-03-10 12:40:37 -08:00
Claus Thomasen
6ed16e13b1
Readded similarity_search_by_vector (#1568)
I am redoing this PR, as I made a mistake by merging the latest changes
into my fork's branch, sorry. This added a bunch of commits to my
previous PR.

This fixes #1451.
2023-03-10 12:40:14 -08:00
Harrison Chase
c1dc784a3d
buffer memory old version (#1581)
bring back an older version of memory since people seem to be using it
more widely
2023-03-10 11:27:15 -08:00
fabi.s
5b0e747f9a
Fix description of UnstructuredURLLoader & UnstructuredHTMLLoader (#1570) 2023-03-10 07:08:58 -08:00
Zach Schillaci
624c72c266
Add wikipedia tool doc (#1579) 2023-03-10 07:07:27 -08:00
Ryan Dao
a950287206
Strip trailing whitespaces in agent's stop sequences (#1566)
Fixes #1489
2023-03-09 16:36:15 -08:00
Tim Asp
30383abb12
Add CSVLoader document loader (#1573)
Simple CSV document loader which wraps `csv` reader, and preps the file
with a single `Document` per row.

The column header is prepended to each value for context which is useful
for context with embedding and semantic search
2023-03-09 16:35:18 -08:00
Zach Schillaci
cdb97f3dfb
Add Wikipedia search utility and tool (#1561)
The Python `wikipedia` package gives easy access for searching and
fetching pages from Wikipedia, see https://pypi.org/project/wikipedia/.
It can serve as an additional search and retrieval tool, like the
existing Google and SerpAPI helpers, for both chains and agents.
2023-03-09 16:34:39 -08:00
Felix Altenberger
b44c8bd969
Add optional base_url arg to GitbookLoader (#1552)
First of all, big kudos on what you guys are doing, langchain is
enabling some really amazing usecases and I'm having lot's of fun
playing around with it. It's really cool how many data sources it
supports out of the box.

However, I noticed some limitations of the current `GitbookLoader` which
this PR adresses:

The main change is that I added an optional `base_url` arg to
`GitbookLoader`. This enables use cases where one wants to crawl docs
from a start page other than the index page, e.g., the following call
would scrape all pages that are reachable via nav bar links from
"https://docs.zenml.io/v/0.35.0":

```python
GitbookLoader(
    web_page="https://docs.zenml.io/v/0.35.0", 
    load_all_paths=True,
    base_url="https://docs.zenml.io",
)
```

Previously, this would fail because relative links would be of the form
`/v/0.35.0/...` and the full link URLs would become
`docs.zenml.io/v/0.35.0/v/0.35.0/...`.

I also fixed another issue of the `GitbookLoader` where the link URLs
were constructed incorrectly as `website//relative_url` if the provided
`web_page` had a trailing slash.
2023-03-09 16:32:40 -08:00
Andriy Mulyar
c9189d354a
AtlasDB vector store documentation updates. (#1572)
- Updated errors in the AtlasDB vector store documentation
- Removed extraneous output logs in example notebook.
2023-03-09 16:31:14 -08:00
622578a022
docs: fix typo in searx tool (#1569)
Co-authored-by: blob42 <spike@w530>
2023-03-09 15:58:33 -08:00
Matt Robinson
7018806a92
feat: document loader for markdown files (#1558)
### Summary

Adds a document loader for handling markdown files. This document loader
requires `unstructured>=0.4.16`.

### Testing

```python
from langchain.document_loaders import UnstructuredMarkdownLoader

loader = UnstructuredMarkdownLoader("README.md")
loader.load()
```
2023-03-09 10:55:07 -08:00
Harrison Chase
bd335ffd64
bump version to 106 (#1562) 2023-03-09 10:20:54 -08:00
Harrison Chase
a094c49153
add chat agent (#1509) 2023-03-09 09:12:08 -08:00
Brenton Wheeler
99fe023496
docs: fix typo in modules/indexes/chain_examples/question_answering (#1551)
docs: fix typo in modules/indexes/chain_examples/question_answering


![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)
2023-03-09 09:11:43 -08:00
Harrison Chase
3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
2023-03-08 21:24:27 -08:00
Harrison Chase
c844d1fd46
Harrison/chunk size (#1549)
Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>
2023-03-08 21:24:18 -08:00
Harrison Chase
9405af6919
Harrison/hf inf error (#1543)
Co-authored-by: Konstantin Hebenstreit <57603012+KonstantinHebenstreit@users.noreply.github.com>
2023-03-08 20:53:46 -08:00
Harrison Chase
357d808484
Harrison/remote paths pdf (#1544)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-08 20:53:37 -08:00
Harrison Chase
cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
2023-03-08 20:53:27 -08:00
Harrison Chase
b053f831cd
Harrison/contributing (#1542)
Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>
2023-03-08 20:53:16 -08:00
Harrison Chase
523ad8d2e2
Harrison/chat history formatter1 (#1538)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
2023-03-08 20:46:37 -08:00
Graham Neubig
31303d0b11
Added other evaluation metrics for data-augmented QA (#1521)
This PR adds additional evaluation metrics for data-augmented QA,
resulting in a report like this at the end of the notebook:

![Screen Shot 2023-03-08 at 8 53 23
AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png)

The score calculation is based on the
[Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based
toolkit (like OpenAI) that has minimal dependencies, so it should be
easy for people to run if they choose.

The code could further be simplified by actually adding a chain that
calls Critique directly, but that probably should be saved for another
PR if necessary. Any comments or change requests are welcome!
2023-03-08 20:41:03 -08:00
gidler
494c9d341a
[DOCS] Assorted wording, punctuation, and consistency revisions (#1443)
Contributing some small fixes I noticed while reading through the
documentation.

Thank you for a creating and maintaining this project!
2023-03-08 20:16:09 -08:00
Harrison Chase
519f0187b6
Harrison/gdrive pdf (#1433)
Co-authored-by: LM <93918064+LuisMalhadas@users.noreply.github.com>
Co-authored-by: Luis Malhadas <luis@sia.so>
2023-03-08 20:15:36 -08:00
Florian Leuerer
64c6435545
Added client_settings support for chromadb vecstore (#1528)
# Problem

The ChromaDB vecstore only supported local connection. There was no way
to use a chromadb server.

# Fix
Added `client_settings` as Chroma attribute. 

# Usage

```
from chromadb.config import Settings
from langchain.vectorstores import Chroma

chroma_settings = Settings(chroma_api_impl="rest",
                            chroma_server_host="localhost",
                            chroma_server_http_port="80")

docsearch = Chroma.from_documents(chunks, embeddings, metadatas=metadatas, client_settings=chroma_settings, collection_name=COLLECTION_NAME)
```
2023-03-08 17:42:09 -08:00
Harrison Chase
7eba828e1b
Harrison/update regex (#1534)
Co-authored-by: Luis <57528712+LuisLechugaRuiz@users.noreply.github.com>
2023-03-08 17:41:17 -08:00
Harrison Chase
2a7215bc3b
Harrison/prompt issues (#1537) 2023-03-08 16:56:10 -08:00
Alpri Else
784d24a1d5
Support S3 Object keys with / in S3FileLoader (#1517)
Resolves https://github.com/hwchase17/langchain/issues/1510

### Problem
When loading S3 Objects with `/` in the object key (eg.
`folder/some-document.txt`) using `S3FileLoader`, the objects are
downloaded into a temporary directory and saved as a file.

This errors out when the parent directory does not exist within the
temporary directory.

See
https://github.com/hwchase17/langchain/issues/1510#issuecomment-1459583696
on how to reproduce this bug

### What this pr does
Creates parent directories based on object key. 

This also works with deeply nested keys:
`folder/subfolder/some-document.txt`
2023-03-08 16:17:26 -08:00
Harrison Chase
aba58e9e2e
Harrison/bumpver104 (#1525) 2023-03-08 09:46:02 -08:00
Harrison Chase
c4a557bdd4
add concept of prompt collection (#1507) 2023-03-08 08:31:29 -08:00
Ivan
97e3666e0d
changed requests.run to requests.get (#1485)
This pull request proposes an update to the Lightweight wrapper
library's documentation. The current documentation provides an example
of how to use the library's requests.run method, as follows:
requests.run("https://www.google.com"). However, this example does not
work for the 0.0.102 version of the library.

Testing:

The changes have been tested locally to ensure they are working as
intended.

Thank you for considering this pull request.
2023-03-07 21:10:23 -08:00
Harrison Chase
7ade419a0e
allow passing of messages into prompt template (#1505) 2023-03-07 21:10:12 -08:00
Harrison Chase
a4a2d79087
Harrison/rtd loader (#1513)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
2023-03-07 21:09:54 -08:00
Harrison Chase
8f21605d71
add return source docs (#1515) 2023-03-07 21:09:36 -08:00
Harrison Chase
064741db58
Harrison/fix text splitter (#1511)
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
2023-03-07 15:42:28 -08:00
Tom Dyson
e3354404ad
Fix link to Pinecone notebook (#1492) 2023-03-07 15:24:03 -08:00
Harrison Chase
3610ef2830
add fake embeddings class (#1503) 2023-03-07 15:23:46 -08:00
Ankush Gola
27104d4921
fix ChatOpenAI.agenerate (#1504) 2023-03-07 15:22:05 -08:00