Commit Graph

239 Commits

Author SHA1 Message Date
Bilel MEDIMEGH
a2280f321f
Docs: Fix typo in memory/key_concepts.md (#1671)
dialouge -> dialogue
2023-03-14 18:12:01 -07:00
Xin Qiu
4e13cef05a
feat: add redisearch vectorstore (#1307)
# Description

Add `RediSearch` vectorstore for LangChain

RediSearch: [RediSearch quick
start](https://redis.io/docs/stack/search/quick_start/)

# How to use

```
from langchain.vectorstores.redisearch import RediSearch

rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379")
```
2023-03-14 18:06:03 -07:00
Harrison Chase
2d098e8869
Harrison/agent eval (#1620)
Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>
2023-03-14 12:37:48 -07:00
Harrison Chase
7cf46b3fee
Harrison/convo agent (#1642) 2023-03-14 09:42:24 -07:00
Jon Luo
0a1b1806e9
sql: do not hard code the LIMIT clause in the table_info section (#1563)
Seeing a lot of issues in Discord in which the LLM is not using the
correct LIMIT clause for different SQL dialects. ie, it's using `LIMIT`
for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc.
I think this could be due to us specifying the LIMIT statement in the
example rows portion of `table_info`. So the LLM is seeing the `LIMIT`
statement used in the prompt.
Since we can't specify each dialect's method here, I think it's fine to
just replace the `SELECT... LIMIT 3;` statement with `3 rows from
table_name table:`, and wrap everything in a block comment directly
following the `CREATE` statement. The Rajkumar et al paper wrapped the
example rows and `SELECT` statement in a block comment as well anyway.
Thoughts @fpingham?
2023-03-13 23:08:27 -07:00
Tim Asp
b3234bf3b0
cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615)
`OnlinePDFLoader` and `PagedPDFSplitter` lived separate from the rest of
the pdf loaders.

Because they're all similar, I propose moving all to `pdy.py` and the
same docs/examples page.

Additionally, `PagedPDFSplitter` naming doesn't match the pattern the
rest of the loaders follow, so I renamed to `PyPDFLoader` and had it
inherit from `BasePDFLoader` so it can now load from remote file
sources.
2023-03-13 23:06:50 -07:00
Harrison Chase
d53ff270e0
bump version to 109 (#1646) 2023-03-13 15:52:35 -07:00
Harrison Chase
df6c33d4b3
Harrison/new output parser (#1617) 2023-03-13 15:08:39 -07:00
Ikko Eltociear Ashimine
6e98ab01e1
Fix typo in vectorstore.ipynb (#1614)
Initalize -> Initialize
2023-03-12 14:12:47 -07:00
yakigac
acd86d33bc
Add read only shared memory (#1491)
Provide shared memory capability for the Agent.
Inspired by #1293 .

## Problem

If both Agent and Tools (i.e., LLMChain) use the same memory, both of
them will save the context. It can be annoying in some cases.


## Solution

Create a memory wrapper that ignores the save and clear, thereby
preventing updates from Agent or Tools.
2023-03-12 09:34:36 -07:00
Harrison Chase
c9b5a30b37
move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Harrison Chase
90846dcc28
fix chat agent (#1586) 2023-03-10 12:40:37 -08:00
Zach Schillaci
624c72c266
Add wikipedia tool doc (#1579) 2023-03-10 07:07:27 -08:00
Tim Asp
30383abb12
Add CSVLoader document loader (#1573)
Simple CSV document loader which wraps `csv` reader, and preps the file
with a single `Document` per row.

The column header is prepended to each value for context which is useful
for context with embedding and semantic search
2023-03-09 16:35:18 -08:00
Andriy Mulyar
c9189d354a
AtlasDB vector store documentation updates. (#1572)
- Updated errors in the AtlasDB vector store documentation
- Removed extraneous output logs in example notebook.
2023-03-09 16:31:14 -08:00
Matt Robinson
7018806a92
feat: document loader for markdown files (#1558)
### Summary

Adds a document loader for handling markdown files. This document loader
requires `unstructured>=0.4.16`.

### Testing

```python
from langchain.document_loaders import UnstructuredMarkdownLoader

loader = UnstructuredMarkdownLoader("README.md")
loader.load()
```
2023-03-09 10:55:07 -08:00
Harrison Chase
bd335ffd64
bump version to 106 (#1562) 2023-03-09 10:20:54 -08:00
Harrison Chase
a094c49153
add chat agent (#1509) 2023-03-09 09:12:08 -08:00
Brenton Wheeler
99fe023496
docs: fix typo in modules/indexes/chain_examples/question_answering (#1551)
docs: fix typo in modules/indexes/chain_examples/question_answering


![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)
2023-03-09 09:11:43 -08:00
Harrison Chase
3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
2023-03-08 21:24:27 -08:00
Harrison Chase
cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
2023-03-08 20:53:27 -08:00
Harrison Chase
523ad8d2e2
Harrison/chat history formatter1 (#1538)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
2023-03-08 20:46:37 -08:00
gidler
494c9d341a
[DOCS] Assorted wording, punctuation, and consistency revisions (#1443)
Contributing some small fixes I noticed while reading through the
documentation.

Thank you for a creating and maintaining this project!
2023-03-08 20:16:09 -08:00
Harrison Chase
c4a557bdd4
add concept of prompt collection (#1507) 2023-03-08 08:31:29 -08:00
Ivan
97e3666e0d
changed requests.run to requests.get (#1485)
This pull request proposes an update to the Lightweight wrapper
library's documentation. The current documentation provides an example
of how to use the library's requests.run method, as follows:
requests.run("https://www.google.com"). However, this example does not
work for the 0.0.102 version of the library.

Testing:

The changes have been tested locally to ensure they are working as
intended.

Thank you for considering this pull request.
2023-03-07 21:10:23 -08:00
Harrison Chase
3610ef2830
add fake embeddings class (#1503) 2023-03-07 15:23:46 -08:00
Harrison Chase
4f41e20f09
memory docs (#1501) 2023-03-07 11:02:46 -08:00
Harrison Chase
f276bfad8e
Harrison/chat memory (#1495) 2023-03-07 09:02:40 -08:00
Harrison Chase
7bec461782
Harrison/memory refactor (#1478)
moves memory to own module, factors out common stuff
2023-03-07 07:59:37 -08:00
Harrison Chase
0e21463f07
(rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-06 08:34:24 -08:00
Harrison Chase
63a5614d23
Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-04 08:15:52 -08:00
Harrison Chase
a1b9dfc099
Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
2023-03-04 08:10:15 -08:00
Tim Asp
23231d65a9
Add PyMuPDF PDF loader (#1426)
Different PDF libraries have different strengths and weaknesses. PyMuPDF
does a good job at extracting the most amount of content from the doc,
regardless of the source quality, extremely fast (especially compared to
Unstructured).

https://pymupdf.readthedocs.io/en/latest/index.html
2023-03-03 20:59:28 -08:00
3d54b05863
searx: add install instructions, update doc and notebooks (#1420)
- Added instructions on setting up self hosted searx
- Add notebook example with agent
- Use `localhost:8888` as example url to stay consistent since public
instances are not really usable.

Co-authored-by: blob42 <spike@w530>
2023-03-03 20:57:50 -08:00
Tim Asp
bca0935d90
[docs] fix minor import error (#1425) 2023-03-03 16:10:07 -08:00
Jason Gill
1989e7d4c2
Update examples to prevent confusing missing _type warning (#1391)
The YAML and JSON examples of prompt serialization now give a strange
`No '_type' key found, defaulting to 'prompt'` message when you try to
run them yourself or copy the format of the files. The reason for this
harmless warning is that the _type key was not in the config files,
which means they are parsed as a standard prompt.

This could be confusing to new users (like it was confusing to me after
upgrading from 0.0.85 to 0.0.86+ for my few_shot prompts that needed a
_type added to the example_prompt config), so this update includes the
_type key just for clarity.

Obviously this is not critical as the warning is harmless, but it could
be confusing to track down or be interpreted as an error by a new user,
so this update should resolve that.
2023-03-02 07:39:57 -08:00
Harrison Chase
dda5259f68
bump version to 0.0.99 (#1390) 2023-03-02 07:25:59 -08:00
Kacper Łukawski
9ac442624c
Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
2023-03-02 07:05:14 -08:00
Ankush Gola
fe30be6fba
add async and streaming support to OpenAIChat (#1378)
title says it all
2023-03-01 21:55:43 -08:00
Lakshya Agarwal
cfed0497ac
Minor grammatical fixes (#1325)
Fixed typos and links in a few places across documents
2023-03-01 21:18:09 -08:00
Harrison Chase
1cd8996074
Harrison/summarizer chain (#1356)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-01 20:59:07 -08:00
Harrison Chase
4b5e850361
chatgpt wrapper (#1367) 2023-03-01 11:47:01 -08:00
Harrison Chase
4d4b43cf5a
fix doc names (#1354) 2023-03-01 09:40:31 -08:00
Harrison Chase
fe7dbecfe6
pandas and csv agents (#1353) 2023-02-28 22:19:11 -08:00
Harrison Chase
02ec72df87
improve docs (#1351) 2023-02-28 21:37:18 -08:00
Jon Luo
92ab27e4b8
sql doc formatting (#1350)
My bad, missed a few tabs between the two PRs
2023-02-28 19:54:46 -08:00
Ankush Gola
82baecc892
Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds 

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a sql database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent` which returns an agent for interacting
with json blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-02-28 19:44:39 -08:00
Jon Luo
35f1e8f569
separate columns by tabs instead of single space in sql sample rows (#1348)
Use tabs to separate columns instead of a single space - confusing when
there are spaces in a cell
2023-02-28 18:59:53 -08:00
Jon Luo
5bf8772f26
add option to use user-defined SQL table info (#1347)
Currently, table information is gathered through SQLAlchemy as complete
table DDL and a user-selected number of sample rows from each table.
This PR adds the option to use user-defined table information instead of
automatically collecting it. This will use the provided table
information and fall back to the automatic gathering for tables that the
user didn't provide information for.

Off the top of my head, there are a few cases where this can be quite
useful:
- The first n rows of a table are uninformative, or very similar to one
another. In this case, hand-crafting example rows for a table such that
they provide the good, diverse information can be very helpful. Another
approach we can think about later is getting a random sample of n rows
instead of the first n rows, but there are some performance
considerations that need to be taken there. Even so, hand-crafting the
sample rows is useful and can guarantee the model sees informative data.
- The user doesn't want every column to be available to the model. This
is not an elegant way to fulfill this specific need since the user would
have to provide the table definition instead of a simple list of columns
to include or ignore, but it does work for this purpose.
- For the developers, this makes it a lot easier to compare/benchmark
the performance of different prompting structures for providing table
information in the prompt.

These are cases I've run into myself (particularly cases 1 and 3) and
I've found these changes useful. Personally, I keep custom table info
for a few tables in a yaml file for versioning and easy loading.

Definitely open to other opinions/approaches though!
2023-02-28 18:58:04 -08:00
Harrison Chase
786852e9e6
partial variables (#1308) 2023-02-28 08:40:35 -08:00