Commit Graph

4708 Commits

Author SHA1 Message Date
Brian Thorne
9ee2713272
Bugfix - allow custom input variables in chat zero shot agent's prompt (#1624)
I was trying out the `chat-zero-shot-react-description` agent for
[qabot](dbbd31bb27/qabot/agents/data_query_chain.py (L35-L52))
but langchain 0.0.108 doesn't correctly use custom 'input_variables` in
the prompt template.
2023-03-13 23:07:35 -07:00
Tim Asp
b3234bf3b0
cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615)
`OnlinePDFLoader` and `PagedPDFSplitter` lived separate from the rest of
the pdf loaders.

Because they're all similar, I propose moving all to `pdy.py` and the
same docs/examples page.

Additionally, `PagedPDFSplitter` naming doesn't match the pattern the
rest of the loaders follow, so I renamed to `PyPDFLoader` and had it
inherit from `BasePDFLoader` so it can now load from remote file
sources.
2023-03-13 23:06:50 -07:00
Luis
562d9891ea
Add regex dict: (#1616)
This class enables us to send a dictionary containing an output key and
the expected format, which in turn allows us to retrieve the result of
the matching formats and extract specific information from it.

To exclude irrelevant information from our return dictionary, we can
prompt the LLM to use a specific command that notifies us when it
doesn't know the answer. We refer to this variable as the
"no_update_value".

Regarding the updated regular expression pattern
(r"{}:\s?([^.'\n']*).?"), it enables us to retrieve a format as 'Output
Key':'value'.

We have improved the regex by adding an optional space between ':' and
'value' with "s?", and by excluding points and line jumps from the
matches using "[^.'\n']*".
2023-03-13 23:05:39 -07:00
Harrison Chase
56aff797c0
docs req (#1647) 2023-03-13 16:03:32 -07:00
Harrison Chase
d53ff270e0
bump version to 109 (#1646) 2023-03-13 15:52:35 -07:00
Harrison Chase
df6c33d4b3
Harrison/new output parser (#1617) 2023-03-13 15:08:39 -07:00
Dennis Aumiller
039d05c808
Update types in cohere.py (#1635)
Adjust argument type and clarification on parameter limits for
attributes `frequency_penalty` and `presence_penalty`.
2023-03-13 09:08:32 -07:00
Harrison Chase
aed9f9febe
Harrison/return intermediate (#1633)
Co-authored-by: Mario Kostelac <mario@intercom.io>
2023-03-13 07:54:29 -07:00
Harrison Chase
72b461e257
improve chat error (#1632) 2023-03-13 07:43:44 -07:00
Peng Qu
cb646082ba
remove an extra whitespace (#1625) 2023-03-13 07:27:21 -07:00
Eugene Yurtsev
bd4a2a670b
Add copy button to sphinx notebooks (#1622)
This adds a copy button at the top right corner of all notebook cells in
sphinx
notebooks.
2023-03-12 21:15:07 -07:00
Ikko Eltociear Ashimine
6e98ab01e1
Fix typo in vectorstore.ipynb (#1614)
Initalize -> Initialize
2023-03-12 14:12:47 -07:00
Harrison Chase
c0ad5d13b8
bump to version 108 (#1613) 2023-03-12 09:50:45 -07:00
yakigac
acd86d33bc
Add read only shared memory (#1491)
Provide shared memory capability for the Agent.
Inspired by #1293 .

## Problem

If both Agent and Tools (i.e., LLMChain) use the same memory, both of
them will save the context. It can be annoying in some cases.


## Solution

Create a memory wrapper that ignores the save and clear, thereby
preventing updates from Agent or Tools.
2023-03-12 09:34:36 -07:00
Abhinav Upadhyay
9707eda83c
Fix docstring of FAISS constructor (#1611) 2023-03-12 09:31:40 -07:00
Kayvane Shakerifar
7e550df6d4
feat: add lookup index to csv loader to make retrieving the original … (#1612)
feat: add lookup index to csv loader to make retrieving the original csv
information easier using theDocument properties
2023-03-12 09:29:27 -07:00
Harrison Chase
c9b5a30b37
move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Harrison Chase
cb04ba0136
Add support for intermediate steps to SQLDatabaseSequentialChain (#1583) (#1601)
for https://github.com/hwchase17/langchain/issues/1582

I simply added the `return_intermediate_steps` and changed the
`output_keys` function.

I added 2 simple tests, 1 for SQLDatabaseSequentialChain without the
intermediate steps and 1 with

Co-authored-by: brad-nemetski <115185478+brad-nemetski@users.noreply.github.com>
2023-03-11 15:44:41 -08:00
Harrison Chase
5903a93f3d
add convinence method to call chat model as an llm (#1604) 2023-03-11 15:04:57 -08:00
Harrison Chase
15de3e8137
Harrison/docs footer (#1600)
Co-authored-by: Albert Avetisian <albert.avetisian@gmail.com>
2023-03-11 09:18:35 -08:00
Harrison Chase
f95d551f7a
Harrison/shallow metadata (#1599)
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-03-11 09:18:25 -08:00
Harrison Chase
c6bfa00178
bump version to 107 (#1590) 2023-03-10 15:39:30 -08:00
Tim Asp
01a57198b8
[bugfix] Fix persisted chromadb vectorstore (#1444)
If a `persist_directory` param was set, chromadb would throw a warning
that ""No embedding_function provided, using default embedding function:
SentenceTransformerEmbeddingFunction". and would error with a `Illegal
instruction: 4` error.

This is on a MBP M1 13.2.1, python 3.9.

I'm not entirely sure why that error happened, but when using
`get_or_create_collection` instead of `list_collection` on our end, the
error and warning goes away and chroma works as expected.

Added bonus this is cleaner and likely more efficient.
`list_collections` builds a new `Collection` instance for each collect,
then `Chroma` would just use the `name` field to tell if the collection
existed.
2023-03-10 15:14:35 -08:00
Harrison Chase
8dba30f31e
Harrison/kwargs loaders (#1588)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-10 15:05:06 -08:00
Harrison Chase
9f78717b3c
Harrison/callbacks (#1587) 2023-03-10 12:53:09 -08:00
Harrison Chase
90846dcc28
fix chat agent (#1586) 2023-03-10 12:40:37 -08:00
Claus Thomasen
6ed16e13b1
Readded similarity_search_by_vector (#1568)
I am redoing this PR, as I made a mistake by merging the latest changes
into my fork's branch, sorry. This added a bunch of commits to my
previous PR.

This fixes #1451.
2023-03-10 12:40:14 -08:00
Harrison Chase
c1dc784a3d
buffer memory old version (#1581)
bring back an older version of memory since people seem to be using it
more widely
2023-03-10 11:27:15 -08:00
fabi.s
5b0e747f9a
Fix description of UnstructuredURLLoader & UnstructuredHTMLLoader (#1570) 2023-03-10 07:08:58 -08:00
Zach Schillaci
624c72c266
Add wikipedia tool doc (#1579) 2023-03-10 07:07:27 -08:00
Ryan Dao
a950287206
Strip trailing whitespaces in agent's stop sequences (#1566)
Fixes #1489
2023-03-09 16:36:15 -08:00
Tim Asp
30383abb12
Add CSVLoader document loader (#1573)
Simple CSV document loader which wraps `csv` reader, and preps the file
with a single `Document` per row.

The column header is prepended to each value for context which is useful
for context with embedding and semantic search
2023-03-09 16:35:18 -08:00
Zach Schillaci
cdb97f3dfb
Add Wikipedia search utility and tool (#1561)
The Python `wikipedia` package gives easy access for searching and
fetching pages from Wikipedia, see https://pypi.org/project/wikipedia/.
It can serve as an additional search and retrieval tool, like the
existing Google and SerpAPI helpers, for both chains and agents.
2023-03-09 16:34:39 -08:00
Felix Altenberger
b44c8bd969
Add optional base_url arg to GitbookLoader (#1552)
First of all, big kudos on what you guys are doing, langchain is
enabling some really amazing usecases and I'm having lot's of fun
playing around with it. It's really cool how many data sources it
supports out of the box.

However, I noticed some limitations of the current `GitbookLoader` which
this PR adresses:

The main change is that I added an optional `base_url` arg to
`GitbookLoader`. This enables use cases where one wants to crawl docs
from a start page other than the index page, e.g., the following call
would scrape all pages that are reachable via nav bar links from
"https://docs.zenml.io/v/0.35.0":

```python
GitbookLoader(
    web_page="https://docs.zenml.io/v/0.35.0", 
    load_all_paths=True,
    base_url="https://docs.zenml.io",
)
```

Previously, this would fail because relative links would be of the form
`/v/0.35.0/...` and the full link URLs would become
`docs.zenml.io/v/0.35.0/v/0.35.0/...`.

I also fixed another issue of the `GitbookLoader` where the link URLs
were constructed incorrectly as `website//relative_url` if the provided
`web_page` had a trailing slash.
2023-03-09 16:32:40 -08:00
Andriy Mulyar
c9189d354a
AtlasDB vector store documentation updates. (#1572)
- Updated errors in the AtlasDB vector store documentation
- Removed extraneous output logs in example notebook.
2023-03-09 16:31:14 -08:00
622578a022
docs: fix typo in searx tool (#1569)
Co-authored-by: blob42 <spike@w530>
2023-03-09 15:58:33 -08:00
Matt Robinson
7018806a92
feat: document loader for markdown files (#1558)
### Summary

Adds a document loader for handling markdown files. This document loader
requires `unstructured>=0.4.16`.

### Testing

```python
from langchain.document_loaders import UnstructuredMarkdownLoader

loader = UnstructuredMarkdownLoader("README.md")
loader.load()
```
2023-03-09 10:55:07 -08:00
Harrison Chase
bd335ffd64
bump version to 106 (#1562) 2023-03-09 10:20:54 -08:00
Harrison Chase
a094c49153
add chat agent (#1509) 2023-03-09 09:12:08 -08:00
Brenton Wheeler
99fe023496
docs: fix typo in modules/indexes/chain_examples/question_answering (#1551)
docs: fix typo in modules/indexes/chain_examples/question_answering


![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)
2023-03-09 09:11:43 -08:00
Harrison Chase
3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
2023-03-08 21:24:27 -08:00
Harrison Chase
c844d1fd46
Harrison/chunk size (#1549)
Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>
2023-03-08 21:24:18 -08:00
Harrison Chase
9405af6919
Harrison/hf inf error (#1543)
Co-authored-by: Konstantin Hebenstreit <57603012+KonstantinHebenstreit@users.noreply.github.com>
2023-03-08 20:53:46 -08:00
Harrison Chase
357d808484
Harrison/remote paths pdf (#1544)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-08 20:53:37 -08:00
Harrison Chase
cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
2023-03-08 20:53:27 -08:00
Harrison Chase
b053f831cd
Harrison/contributing (#1542)
Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>
2023-03-08 20:53:16 -08:00
Harrison Chase
523ad8d2e2
Harrison/chat history formatter1 (#1538)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
2023-03-08 20:46:37 -08:00
Graham Neubig
31303d0b11
Added other evaluation metrics for data-augmented QA (#1521)
This PR adds additional evaluation metrics for data-augmented QA,
resulting in a report like this at the end of the notebook:

![Screen Shot 2023-03-08 at 8 53 23
AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png)

The score calculation is based on the
[Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based
toolkit (like OpenAI) that has minimal dependencies, so it should be
easy for people to run if they choose.

The code could further be simplified by actually adding a chain that
calls Critique directly, but that probably should be saved for another
PR if necessary. Any comments or change requests are welcome!
2023-03-08 20:41:03 -08:00
gidler
494c9d341a
[DOCS] Assorted wording, punctuation, and consistency revisions (#1443)
Contributing some small fixes I noticed while reading through the
documentation.

Thank you for a creating and maintaining this project!
2023-03-08 20:16:09 -08:00
Harrison Chase
519f0187b6
Harrison/gdrive pdf (#1433)
Co-authored-by: LM <93918064+LuisMalhadas@users.noreply.github.com>
Co-authored-by: Luis Malhadas <luis@sia.so>
2023-03-08 20:15:36 -08:00