Commit Graph

610 Commits

Author SHA1 Message Date
Chirag Bhatia
f174aa7712
Fix broken Cerebrium link in documentation (#3554)
The current hyperlink has a typo. This PR contains the corrected
hyperlink to Cerebrium docs
2023-04-26 08:11:58 -07:00
Zander Chase
d6d697a41b
Sentence Transformers Aliasing (#3541)
The sentence transformers was a dup of the HF one. 

This is a breaking change (model_name vs. model) for anyone using
`SentenceTransformerEmbeddings(model="some/nondefault/model")`, but
since it was landed only this week it seems better to do this now rather
than doing a wrapper.
2023-04-25 23:29:20 -07:00
Eric Peter
603ea75bcd
Fix docs error for google drive loader (#3574) 2023-04-25 22:52:59 -07:00
Harrison Chase
f4829025fe
add feast nb (#3565) 2023-04-25 17:46:06 -07:00
Harrison Chase
52d95ec47d
anthropic docs: deprecated LLM, add chat model (#3549) 2023-04-25 16:11:14 -07:00
apurvsibal
af7906f100
Update Alchemy Key URL (#3559)
Update Alchemy Key URL in Blockchain Document Loader. I want to say
thank you for the incredible work the LangChain library creators have
done.

I am amazed at how seamlessly the Loader integrates with Ethereum
Mainnet, Ethereum Testnet, Polygon Mainnet, and Polygon Testnet, and I
am excited to see how this technology can be extended in the future.

@hwchase17 - Please let me know if I can improve or if I have missed any
community guidelines in making the edit? Thank you again for your hard
work and dedication to the open source community.
2023-04-25 16:08:42 -07:00
Tiago De Gaspari
4d53cefbe9
Fix agents' notebooks outputs (#3517)
Fix agents' notebooks to make the answer reflect what is being asked by
the user.
2023-04-25 16:06:47 -07:00
engkheng
5680fb6894
Fix typo in Prompts Templates Getting Started page (#3514)
`from_templates` -> `from_template`
2023-04-25 16:05:13 -07:00
Zander Chase
b49ee372f1
Change Chain Docs (#3537)
Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>
2023-04-25 10:51:09 -07:00
leo-gan
6b28cbe058
improved arxiv (#3495)
Improved `arxiv/tool.py` by adding more specific information to the
`description`. It would help with selecting `arxiv` tool between other
tools.
Improved `arxiv.ipynb` with more useful descriptions.
2023-04-25 08:09:17 -07:00
Harrison Chase
0fc0aa62f2
Harrison/blockchain docloader (#3491)
Co-authored-by: Jon Saginaw <saginawj@users.noreply.github.com>
2023-04-25 08:07:06 -07:00
Harrison Chase
bee59b4689
Updated missing refactor in docs "return_map_steps" (#2956) (#3469)
Minor rename in the documentation that was overlooked when refactoring.

---------

Co-authored-by: Ehmad Zubair <ehmad@cogentlabs.co>
2023-04-24 22:28:47 -07:00
Harrison Chase
707741de58
Harrison/prediction guard (#3490)
Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>
2023-04-24 22:27:22 -07:00
Maxwell Mullin
696f840426
GuessedAtParserWarning from RTD document loader documentation example (#3397)
Addresses #3396 by adding 

`features='html.parser'` in example
2023-04-24 21:54:39 -07:00
engkheng
06f6c49e61
Improve llm_chain.ipynb and getting_started.ipynb for chains docs (#3380)
My attempt at improving the `Chain`'s `Getting Started` docs and
`LLMChain` docs. Might need some proof-reading as English is not my
first language.

In LLM examples, I replaced the example use case when a simpler one
(shorter LLM output) to reduce cognitive load.
2023-04-24 21:49:55 -07:00
jrhe
980cc41709
Adds progress bar using tqdm to directory_loader (#3349)
Approach copied from `WebBaseLoader`. Assumes the user doesn't have
`tqdm` installed.
2023-04-24 21:42:42 -07:00
engkheng
7c2c73af5f
Update Getting Started page of Prompt Templates (#3298)
Updated `Getting Started` page of `Prompt Templates` to showcase more
features provided by the class. Might need some proof reading because
apparently English is not my first language.
2023-04-24 21:10:22 -07:00
Zander Chase
416f3bdf11
Vwp/alpaca streaming (#3468)
Co-authored-by: Luke Stanley <306671+lukestanley@users.noreply.github.com>
2023-04-24 16:27:51 -07:00
Harrison Chase
675d86aa11
show how to use memory in convo chain (#3463) 2023-04-24 13:29:51 -07:00
Eduard van Valkenburg
46c9636012
small constructor change and updated notebook (#3426)
small change in the pydantic definitions, same api. 

updated notebook with right constructure and added few shot example
2023-04-24 10:42:38 -07:00
Davit Buniatyan
2c0023393b
Deep Lake mini upgrades (#3375)
Improvements
* set default num_workers for ingestion to 0
* upgraded notebooks for avoiding dataset creation ambiguity
* added `force_delete_dataset_by_path`
* bumped deeplake to 3.3.0
* creds arg passing to deeplake object that would allow custom S3

Notes
* please double check if poetry is not messed up (thanks!)

Asks
* Would be great to create a shared slack channel for quick questions

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-04-23 21:23:54 -07:00
Haste171
93d53e417a
Update unstructured_file.ipynb (#3377)
Fix typo in docs
2023-04-23 21:22:38 -07:00
Zander Chase
20f530e9c5
Add Sentence Transformers Embeddings (#3409)
Add embeddings based on the sentence transformers library.
Add a notebook and integration tests.

Co-authored-by: khimaros <me@khimaros.com>
2023-04-23 18:25:20 -07:00
Harrison Chase
e5ffbee5eb
Harrison/hf document loader (#3394)
Co-authored-by: Azam Iftikhar <azamiftikhar1000@gmail.com>
2023-04-23 10:17:43 -07:00
Harrison Chase
a6664be79c
Harrison/myscale (#3352)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
2023-04-22 09:17:38 -07:00
Honkware
a5ad1c270f
Add ChatGPT Data Loader (#3336)
This pull request adds a ChatGPT document loader to the document loaders
module in `langchain/document_loaders/chatgpt.py`. Additionally, it
includes an example Jupyter notebook in
`docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
which uses fake sample data based on the original structure of the
`conversations.json` file.

The following files were added/modified:
- `langchain/document_loaders/__init__.py`
- `langchain/document_loaders/chatgpt.py`
- `docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
-
`docs/modules/indexes/document_loaders/examples/example_data/fake_conversations.json`

This pull request was made in response to the recent release of ChatGPT
data exports by email:
https://help.openai.com/en/articles/7260999-how-do-i-export-my-chatgpt-history
2023-04-22 09:06:24 -07:00
Zander Chase
61d40ba042
Fix Sagemaker Batch Endpoints (#3249)
Add different typing for @evandiewald 's heplful PR

---------

Co-authored-by: Evan Diewald <evandiewald@gmail.com>
2023-04-22 08:49:51 -07:00
Richy Wang
88a8f59aa7
Add a full PostgresSQL syntax database 'AnalyticDB' as vector store. (#3135)
Hi there!
I'm excited to open this PR to add support for using a fully Postgres
syntax compatible database 'AnalyticDB' as a vector.
As AnalyticDB has been proved can be used with AutoGPT,
ChatGPT-Retrieve-Plugin, and LLama-Index, I think it is also good for
you.
AnalyticDB is a distributed Alibaba Cloud-Native vector database. It
works better when data comes to large scale. The PR includes:

- [x]  A new memory: AnalyticDBVector
- [x]  A suite of integration tests verifies the AnalyticDB integration

I have read your [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md).
And I have passed the tests below
- [x]  make format
- [x]  make lint
- [x]  make coverage
- [x]  make test
2023-04-22 08:25:41 -07:00
Harrison Chase
cc6fe18152
Harrison/power bi (#3205)
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
2023-04-22 08:24:48 -07:00
Daniel Chalef
61e09229c8
args_schema type hint on subclassing (#3323)
per https://github.com/hwchase17/langchain/issues/3297

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-21 15:51:13 -07:00
Paul Garner
aa9d5707e0
Add PythonLoader which auto-detects encoding of Python files (#3311)
This PR contributes a `PythonLoader`, which inherits from
`TextLoader` but detects and sets the encoding automatically.
2023-04-21 10:47:57 -07:00
Daniel Chalef
1ecbeec24e
Fix example match_documents fn table name, grammar (#3294)
ref
https://github.com/hwchase17/langchain/pull/3100#issuecomment-1517086472

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-21 10:21:23 -07:00
Harrison Chase
87544d2378
gradio tools (#3255) 2023-04-20 22:09:15 -07:00
Davis Chase
46542dc774
Contextual compression retriever (#2915)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-20 17:01:14 -07:00
Harrison Chase
2dbb5261b5 wikibase agent 2023-04-20 15:37:56 -07:00
Harrison Chase
8f22949dc4 update nnotebook title 2023-04-20 11:53:23 -07:00
Harrison Chase
b7f2061736
Harrison/google places (#3207)
Co-authored-by: Cao Hoang <65607230+cnhhoang850@users.noreply.github.com>
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-20 07:57:07 -07:00
Harrison Chase
d2520a5f1e
Harrison/ddg (#3206)
Co-authored-by: itai <itai.marks@gmail.com>
Co-authored-by: Itai Marks <itaim@users.noreply.github.com>
Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com>
Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>
Co-authored-by: Adilzhan Ismailov <13088690+aismlv@users.noreply.github.com>
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
Co-authored-by: Justin Flick <jflick@homesite.com>
2023-04-19 21:32:26 -07:00
Harrison Chase
36c10f8a52
nits (#3203) 2023-04-19 21:14:46 -07:00
Daniel Chalef
27cdf8d675
supabase vectorstore - first cut (#3100)
First cut of a supabase vectorstore loosely patterned on the langchainjs
equivalent. Doesn't support async operations which is a limitation of
the supabase python client.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-19 21:06:44 -07:00
Harrison Chase
96809b5794
Harrison/discord loader (#3200)
Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>
2023-04-19 21:04:12 -07:00
Zander Chase
c757c3cde4
Add HuggingFace Examples (#3187)
Add a Pipeline example and add other models in th ehub notebook

To close issue
[#3077](https://github.com/hwchase17/langchain/issues/3099)
2023-04-19 17:08:10 -07:00
Donald "Max" Ziff
6adf2d1c39
first draft (#2690)
There is a long way to go on this!

---------

Co-authored-by: Max Ziff <max.ziff@concur.com>
2023-04-19 17:06:55 -07:00
Harrison Chase
68cd37175e
Harrison/arxiv tool (#3186)
Co-authored-by: leo-gan <leo.gan.57@gmail.com>
2023-04-19 16:53:34 -07:00
Pranabendra Prasad Chandra
7b1f0656b8
Fix typo in ElasticSearch sample notebook (#3171)
Added missing parenthesis in example notebook
[elasticsearch.ipynb](https://github.com/hwchase17/langchain/blob/master/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb)
2023-04-19 16:06:31 -07:00
Happydog
5e66d05928
Fix: typo in custom_mrkl_agents.ipynb document (#3159)
I have noticed a typo error in the `custom_mrkl_agents.ipynb` document
while trying the example from the documentation page. As a result, I
have opened a pull request (PR) to address this minor issue, even though
it may seem insignificant 😂.
2023-04-19 14:57:33 -07:00
Quentin Pleplé
126d7f11dd
Fix notebook example (#3142)
The following calls were throwing an exception:


575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb (L192)


575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb (L239)

Exception:

```
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[14], line 1
----> 1 chain_sota = RetrievalQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", retriever=vectorstore_sota, input_key="question")

File ~/github/langchain/venv/lib/python3.9/site-packages/langchain/chains/retrieval_qa/base.py:89, in BaseRetrievalQA.from_chain_type(cls, llm, chain_type, chain_type_kwargs, **kwargs)
     85 _chain_type_kwargs = chain_type_kwargs or {}
     86 combine_documents_chain = load_qa_chain(
     87     llm, chain_type=chain_type, **_chain_type_kwargs
     88 )
---> 89 return cls(combine_documents_chain=combine_documents_chain, **kwargs)

File ~/github/langchain/venv/lib/python3.9/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for RetrievalQA
retriever
  instance of BaseRetriever expected (type=type_error.arbitrary_type; expected_arbitrary_type=BaseRetriever)
```

The vectorstores had to be converted to retrievers:
`vectorstore_sota.as_retriever()` and `vectorstore_pg.as_retriever()`.

The PR also:
- adds the file `paul_graham_essay.txt` referenced by this notebook
- adds to gitignore *.pkl and *.bin files that are generated by this
notebook

Interestingly enough, the performance of the prediction greatly
increased (new version of langchain or ne version of OpenAI models since
the last run of the notebook): from 19/33 correct to 28/33 correct!
2023-04-19 08:55:06 -07:00
Jakub Kukul
599e17cea8
Working example for Anthropic (#3151)
would be great if the provided example worked out of the box 😄
2023-04-19 08:52:33 -07:00
Zander Chase
8a050ba4bf
Notebook Nit (#3125)
The required arg is `question` not `query`
2023-04-18 22:43:52 -07:00
Zander Chase
90ef705ced
Update Tool Input (#3103)
- Remove dynamic model creation in the `args()` property. _Only infer
for the decorator (and add an argument to NOT infer if someone wishes to
only pass as a string)_
- Update the validation example to make it less likely to be
misinterpreted as a "safe" way to run a repl


There is one example of "Multi-argument tools" in the custom_tools.ipynb
from yesterday, but we could add more. The output parsing for the base
MRKL agent hasn't been adapted to handle structured args at this point
in time

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-18 18:18:33 -07:00
Harrison Chase
aad0a498ac
Harrison/output error (#3094)
Co-authored-by: yummydum <sumita@nowcast.co.jp>
2023-04-18 08:59:56 -07:00
Harrison Chase
1c1b77bbfe
Harrison/discord (#3092)
Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>
2023-04-18 08:19:23 -07:00
James O'Dwyer
0257829776
Bump Metal to use index_id (#3089)
## Use `index_id` over `app_id`
We made a major update to index + retrieve based on Metal Indexes
(instead of apps). With this change, we accept an index instead of an
app in each of our respective core apis. [More details
here](https://docs.getmetal.io/api-reference/core/indexing).
2023-04-18 07:28:13 -07:00
Hamza Kyamanywa
064a1db2b2
[Documentation] Show how to initiate pinecone from an existing index (#3070)
## What is this PR for:
* This PR adds a commented line of code in the documentation that shows
how someone can use the Pinecone client with an already existing
Pinecone index
* The documentation currently only shows how to create a pinecone index
from langchain documents but not how to load one that already exists
2023-04-18 07:27:46 -07:00
Harrison Chase
894c272a56 tool validation logic 2023-04-17 21:59:32 -07:00
Harrison Chase
1920536d99
Harrison/obsidian (#3060)
Co-authored-by: Ben Hofferber <hofferber.ben@gmail.com>
2023-04-17 21:57:32 -07:00
Zander Chase
93c0514105
Add Twitter Tweet Loader (#3050)
Reformatted version of #3022

---------

Co-authored-by: LiaoKong <568250549@qq.com>
2023-04-17 21:44:54 -07:00
Harrison Chase
db968284f8
tools refactor (#2961)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 21:35:29 -07:00
Harrison Chase
b140d366e3
Harrison/jira (#3055)
Co-authored-by: William Li <32046231+zywilliamli@users.noreply.github.com>
Co-authored-by: William Li <twelvehertz@Williams-MacBook-Air.local>
2023-04-17 21:14:40 -07:00
leo-gan
c33883a40e
fixed the Cohere example title (#3053)
- fixed the Cohere example title (bug in #3041, sorry for it)
- fixed the runhouse.ipynb file name inconsistency
2023-04-17 21:02:52 -07:00
Harrison Chase
5107fac656
Harrison/rec gd (#3054)
Co-authored-by: Benjamin Scholtz <BenSchZA@users.noreply.github.com>
2023-04-17 21:02:35 -07:00
Harrison Chase
db7106cb79
Harrison/image caption loader (#3051)
Co-authored-by: Sean Saito <saitosean@ymail.com>
2023-04-17 20:49:10 -07:00
leo-gan
5420a0e404
updated langchain/docs/modules/models/llms/integrations/ notebooks (#3041)
- Updated `langchain/docs/modules/models/llms/integrations/` notebooks:
added links to the original sites, the install information, etc.
- Added the `nlpcloud` notebook.
- Removed "Example" from Titles of some notebooks, so all notebook
titles are consistent.
2023-04-17 20:25:32 -07:00
Azam Iftikhar
471ef84835
Examples fixed (#3042)
### https://github.com/hwchase17/langchain/issues/2997

Replaced `conversation.memory.store` to
`conversation.memory.entity_store.store`
As conversation.memory.store doesn't exist  and re-ran  the whole file.
2023-04-17 20:25:01 -07:00
Harrison Chase
afd3e70ae5
Harrison/confluent loader (#2994)
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
2023-04-17 20:23:45 -07:00
Harrison Chase
f1d15b4a75 update nb 2023-04-16 22:09:31 -07:00
vowelparrot
99c0382209
Generative Characters (#2859)
Add a time-weighted memory retriever and a notebook that approximates a
Generative Agent from https://arxiv.org/pdf/2304.03442.pdf


The "daily plan" components are removed for now since they are less
useful without a virtual world, but the memory is an interesting
component to build off.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-16 21:41:00 -07:00
Jan Backes
a9310a3e8b
Add Annoy as VectorStore (#2939)
Adds Annoy (https://github.com/spotify/annoy) as vector Store. 

RESOLVES hwchase17/langchain#2842

discord ref:
https://discord.com/channels/1038097195422978059/1051632794427723827/1096089994168377354

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-16 13:44:04 -07:00
Harrison Chase
e12e00df12
use output parsers in agents (#2987) 2023-04-16 13:15:21 -07:00
Mauricio Scheffer
7302787a7b
Fix docs for parse_with_prompt (#2986) 2023-04-16 12:57:04 -07:00
Azam Iftikhar
1e655d5ffd
Fixed Regular expression (#2933)
###  https://github.com/hwchase17/langchain/issues/2898
Instead of `"Action" and "Action Input"` keywords, we are getting
`"Action 1" and "Action 1 Input" or "Action Input 1" ` from
**gpt-3.5-turbo**

 Updated the Regular expression to handle all these cases
 
Attaching the screenshot of the result from the updated Regular
expression.
 
<img width="1036" alt="Screenshot 2023-04-16 at 1 39 00 AM"
src="https://user-images.githubusercontent.com/55012400/232251184-23ca6cc2-7229-411a-b6e1-53b2f5ec18a5.png">
2023-04-16 09:16:50 -07:00
Harrison Chase
88d3ce12b8
Harrison/diffbot (#2984)
Co-authored-by: Manuel Saelices <msaelices@gmail.com>
2023-04-16 09:11:24 -07:00
Chetanya Rastogi
aead062a70
Add an example tutorial for using PDFMinerPDFasHTMLLoader (#2960)
Last week I added the `PDFMinerPDFasHTMLLoader`. I am adding some
example code in the notebook to serve as a tutorial for how that loader
can be used to create snippets of a pdf that are structured within
sections. All the other loaders only provide the `Document` objects
segmented by pages but that's pretty loose given the amount of other
metadata that can be extracted.

With the new loader, one can leverage font-size of the text to decide
when a new sections starts and can segment the text more semantically as
shown in the tutorial notebook. The cell shows that we are able to find
the content of entire section under **Related Work** for the example pdf
which is spread across 2 pages and hence is stored as two separate
documents by other loaders
2023-04-16 08:34:39 -07:00
Harrison Chase
274b25c010
SVM retriever (#2947) (#2949)
Add SVM retriever class, based on
https://github.com/karpathy/randomfun/blob/master/knn_vs_svm.ipynb.

Testing still WIP, but the logic is correct (I have a local
implementation outside of Langchain working).

---------

Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>
Co-authored-by: rlm <31treehaus@31s-MacBook-Pro.local>
2023-04-15 12:49:59 -07:00
Davit Buniatyan
b3a5b51728
[minor] Deep Lake auth improvements in docs, kwargs pass, faster tests (#2927)
Minor cosmetic changes 
- Activeloop environment cred authentication in notebooks with
`getpass.getpass` (instead of CLI which not always works)
- much faster tests with Deep Lake pytest mode on 
- Deep Lake kwargs pass

Notes
- I put pytest environment creds inside `vectorstores/conftest.py`, but
feel free to suggest a better location. For context, if I put in
`test_deeplake.py`, `ruff` doesn't let me to set them before import
deeplake

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-04-15 10:49:16 -07:00
Nahin Khan
ad3973a3b8
Fix typo (#2942) 2023-04-15 08:53:25 -07:00
Harrison Chase
cf2789d86d
delete antropic chat notebook (#2945) 2023-04-15 08:48:51 -07:00
Hai Nguyen Mau
0aa828b1dc
typo fix (#2937)
missing w in link
2023-04-15 08:31:43 -07:00
Ankush Gola
ec59e9d886
Fix ChatAnthropic stop_sequences error (#2919) (#2920)
Note to self: Always run integration tests, even on "that last minute
change you thought would be safe" :)

---------

Co-authored-by: Mike Lambert <mike.lambert@anthropic.com>
2023-04-14 17:22:01 -07:00
Akash NP
13a0ed064b
add encoding to avoid UnicodeDecodeError (#2908)
**About**
Specify encoding to avoid UnicodeDecodeError when reading .txt for users
who are following the tutorial.

**Reference**
```
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1205: character maps to <undefined>
```

**Environment**
OS: Win 11
Python: 3.8
2023-04-14 16:36:03 -07:00
Kwuang Tang
a508afa91c
Add file filter param to Git loader (#2904)
Allows users to specify what files should be loaded instead of
indiscriminately loading the entire repo.

extends #2851 

NOTE: for reviewers, `hide whitespace` option recommended since I
changed the indentation of an if-block to use `continue` instead so it
looks less like a Christmas tree :)
2023-04-14 10:45:54 -07:00
Harrison Chase
8fef69296d
nits (#2873) 2023-04-14 07:55:12 -07:00
Harrison Chase
0a38bbc750
updates to vectorstore memory (#2875) 2023-04-14 07:54:57 -07:00
Ikko Eltociear Ashimine
203c0eb2ae
docs: update getting_started.ipynb (#2883)
HuggingFace -> Hugging Face
2023-04-14 07:40:26 -07:00
Harrison Chase
07d7096de6
Harrison/playwright (#2871)
Co-authored-by: Manuel Saelices <msaelices@gmail.com>
2023-04-13 22:15:03 -07:00
ecneladis
74abeb8c53
Update output in Git notebook (#2868)
Supplemental to https://github.com/hwchase17/langchain/pull/2851.
Updates one notebook cell that I forgot to commit before.
2023-04-13 21:56:17 -07:00
ecneladis
016738e676
Add GitLoader (#2851) 2023-04-13 21:39:20 -07:00
vowelparrot
bf0887c486
Add Slack Directory Loader (#2841)
Fixes linting issue from #2835 

Adds a loader for Slack Exports which can be a very valuable source of
knowledge to use for internal QA bots and other use cases.

```py
# Export data from your Slack Workspace first.
from langchain.document_loaders import SLackDirectoryLoader

SLACK_WORKSPACE_URL = "https://awesome.slack.com"

loader = ("Slack_Exports", SLACK_WORKSPACE_URL)
docs = loader.load()
```
2023-04-13 21:31:59 -07:00
Jon Luo
f3180f05f9
Update sql chain notebook to clarify use of SQLAlchemy for connections (#2850)
Have seen questions about whether or not the `SQLDatabaseChain` supports
more than just sqlite, which was unclear in the docs, so tried to
clarify that and how to connect to other dialects.
2023-04-13 11:46:59 -07:00
Tim Asp
70ffe470aa
Add easy print method to openai callback (#2848)
Found myself constantly copying the snippet outputting all the callback
tracking details. so adding a simple way to output the full context
2023-04-13 11:28:42 -07:00
vowelparrot
82d1d5f24e
Fix grammar in Vector Memory Docs (#2847) 2023-04-13 11:00:09 -07:00
Tim Asp
53dc157145
[Docs] minor fixes to loaders links and rst warnings (#2846)
The doc loaders index was picking up a bunch of subheadings because I
mistakenly made the MD titles H1s. Fixed that.

also the easy minor warnings from docs_build
2023-04-13 10:54:40 -07:00
Harrison Chase
1609950597
Harrison/retriever memory (#2804)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-13 10:03:43 -07:00
Rounak Datta
7688bf9182
WhatsApp document loader - update regex (#2776)
I was testing out the WhatsApp Document loader, and noticed that
sometimes the date is of the following format (notice the additional
underscore):
```
3/24/23, 1:54_PM - +91 99999 99999 joined using this group's invite link
3/24/23, 6:29_PM - +91 99999 99999: When are we starting then?
```

Wierdly, the underscore is visible in Vim, but not on editors like
VSCode. I presume it is some unusual character/line terminator.
Nevertheless, I think handling this edge case will make the document
loader more robust.
2023-04-13 09:48:32 -07:00
vowelparrot
2db9b7a45d
Revert "Add Slack Directory Loader (#2835)" (#2839)
This reverts commit a6f767ae7a.

To fix the linting error.
2023-04-13 09:42:54 -07:00
Azam Iftikhar
2a89dc8c1c
Fixing factually incorrect example (#2810)
### https://github.com/hwchase17/langchain/issues/2802
It appears that Google's Flan model may not perform as well as other
models, I used a simple example to get factually correct answer.
2023-04-13 08:42:39 -07:00
vowelparrot
a6f767ae7a
Add Slack Directory Loader (#2835)
Adds a loader for Slack Exports which can be a very valuable source of
    knowledge to use for internal QA bots and other use cases.

    ```py
    # Export data from your Slack Workspace first.
    from langchain.document_loaders import SLackDirectoryLoader

    SLACK_WORKSPACE_URL = "https://awesome.slack.com"

    loader = ("Slack_Exports", SLACK_WORKSPACE_URL)
    docs = loader.load()
```

---------

Co-authored-by: Mikhail Dubov <mikhail@chattermill.io>
2023-04-13 08:39:07 -07:00
Harrison Chase
9a96691803 cr 2023-04-13 08:23:33 -07:00
Harrison Chase
b2bc5ef56a
agent refactor (#2801) 2023-04-12 21:21:41 -07:00
Harrison Chase
e49f1e628c
Harrison/gpt cache (#2744)
Co-authored-by: SimFG <bang.fu@zilliz.com>
2023-04-12 14:16:58 -07:00
Harrison Chase
a2d729e537 cr 2023-04-12 13:44:21 -07:00
Harrison Chase
7adbc4fbb4
agent memory (#2792) 2023-04-12 12:51:15 -07:00
wangml999
fa0c9390c2
Update custom_agent.ipynb (#2767)
Fixed an issue the agent is not taking the user's question as input.
2023-04-12 09:13:46 -07:00
Nuhman Pk
789cc314c5
Typo (#2747) 2023-04-12 09:06:30 -07:00
Nuhman Pk
b5bbe601fb
Update chatgpt_plugins.ipynb (#2745)
Changed deprecated requests to requests_all in plugins example
2023-04-11 22:45:31 -07:00
Harrison Chase
b38a6ea7df
Harrison/apply llm flag (#2743)
Co-authored-by: Nick Gibb <gibbnick@gmail.com>
Co-authored-by: Nick Gibb <nick.gibb@bluedot.global>
2023-04-11 22:02:37 -07:00
Harrison Chase
507cee5ee5
Harrison/pinecone hybrid update (#2742)
Co-authored-by: acatav <39461369+acatav@users.noreply.github.com>
Co-authored-by: Amnon Catav <catav.amnon1@gmail.com>
2023-04-11 21:32:17 -07:00
vowelparrot
709f26b69e
Added bilibili loader (#2673) (#2724)
I've added a bilibili loader, bilibili is a very active video site in
China and I think we need this loader.

Example:
```python
from langchain.document_loaders.bilibili import BiliBiliLoader

loader = BiliBiliLoader(
       ["https://www.bilibili.com/video/BV1xt411o7Xu/",
       "https://www.bilibili.com/video/av330407025/"]
)
docs = loader.load()
```

Co-authored-by: 了空 <568250549@qq.com>
2023-04-11 10:40:32 -07:00
David Wu
d42deff402
fixed typo (#2720)
changed "to" to "too" in the memory notebook
2023-04-11 09:53:38 -07:00
David Wu
263ce40844
added a missing word (typo) (#2719)
Changed from "You may often to" to "You may often have to" to fix the
sentence.
2023-04-11 09:09:28 -07:00
Harrison Chase
e0a13e9355
Harrison/postgres (#2691)
Co-authored-by: Ankit Jain <ankneo@users.noreply.github.com>
2023-04-10 21:15:42 -07:00
Naveen Tatikonda
4364d3316e
Add custom vector fields and text fields for OpenSearch (#2652)
**Description**
Add custom vector field name and text field name while indexing and
querying for OpenSearch

**Issues**
https://github.com/hwchase17/langchain/issues/2500

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-04-10 21:02:02 -07:00
Nikita Zavgorodnii
1c979e320d
docs: update tokenizer notice in llms/getting_started (#2641)
A tiny update in docs which is spotted here:
https://github.com/hwchase17/langchain/issues/2439
2023-04-10 20:55:45 -07:00
Harrison Chase
ad3c5dd186
Harrison/databerry (#2688)
Co-authored-by: Georges Petrov <georgesm.petrov@gmail.com>
2023-04-10 18:49:47 -07:00
Tommertom
bd9f095ed2
Doc - Update google_search.ipynb - more explicit reference to places where to create API keys (#2670)
Took me a bit to find the proper places to get the API keys. The link
earlier provided to setup search is still good, but why not provide
direct link to the Google cloud tools that give you ability to create
keys?
2023-04-10 12:36:52 -07:00
Ankush Gola
8d3b059332
Add docs for callbacks (#2643)
Basically copy what's in the ts docs:
https://js.langchain.com/docs/production/callbacks


Discovered a bug wrt not awaiting callbacks in `LLMMathChain` so fixed
that
2023-04-10 10:23:11 -07:00
Harrison Chase
e63f9a846b
Harrison/docs agents (#2647) 2023-04-09 22:34:34 -07:00
Ankush Gola
b82cbd1be0
Use run and arun in place of combine_docs and acombine_docs (#2635)
`combine_docs` does not go through the standard chain call path which
means that chain callbacks won't be triggered, meaning QA chains won't
be traced properly, this fixes that.

Also fix several errors in the chat_vector_db notebook
2023-04-09 18:47:59 -07:00
Chetanya Rastogi
50c511d75f
Add new loader to load pdf as html content (#2607)
Adds a new pdf loader using the existing dependency on PDFMiner. 

The new loader can be helpful for chunking texts semantically into
sections as the output html content can be parsed via `BeautifulSoup` to
get more structured and rich information about font size, page numbers,
pdf headers/footers, etc. which may not be available otherwise with
other pdf loaders
2023-04-09 17:57:25 -07:00
Ankush Gola
61f7bd7a3a
fix question answering nb (#2637)
Was throwing exception bc `VectorIndexWrapper` did not have
`similarity_search` -- changed to just use retriever
2023-04-09 17:56:49 -07:00
William FH
10ff1fda8e
Add Streaming for GPT4All (#2642)
- Adds  support for callback handlers in GPT4All models
- Updates notebook and docs
2023-04-09 17:54:26 -07:00
William FH
e56673c7f9
BabyAGI Notebook Example (#2559)
Create a notebook implementing
[BabyAGI](https://github.com/yoheinakajima/babyagi/tree/main) by [Yohei
Nakajima](https://twitter.com/yoheinakajima) as LLM Chains.
2023-04-09 13:54:23 -07:00
Harrison Chase
7aba18ea77
Harrison/docs cleanup (#2633) 2023-04-09 12:55:22 -07:00
Nick Gibb
63175eb696
Fix typo in docs (#2601)
Minor typo in the docs ("reccomended" -> "recommended")

Co-authored-by: Nick Gibb <nick.gibb@bluedot.global>
2023-04-09 12:52:35 -07:00
Davit Buniatyan
aaac7071a3
Deep Lake retriever example analyzing Twitter the-algorithm source code (#2602)
Improvements to Deep Lake Vector Store
- much faster view loading of embeddings after filters with
`fetch_chunks=True`
- 2x faster ingestion
- use np.float32 for embeddings to save 2x storage, LZ4 compression for
text and metadata storage (saves up to 4x storage for text data)
- user defined functions as filters

Docs
- Added retriever full example for analyzing twitter the-algorithm
source code with GPT4
- Added a use case for code analysis (please let us know your thoughts
how we can improve it)

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-04-09 12:29:47 -07:00
William FH
5c0c5fafb2
Multi-Hop / Multi-Spec LLM Chain (#2549)
Add a notebook showing how to make a chain that composes multiple
OpenAPI Endpoint operations to accomplish tasks.
2023-04-09 12:29:16 -07:00
ecneladis
9a49f5763d
Add missing comma in async_agent.ipynb (#2614) 2023-04-09 12:28:28 -07:00
Girish Sharma
9aed565f13
Fix missing import in AzureOpenAI embeddings example (#2625)
## Why this PR?

Fixes #2624
There's a missing import statement in AzureOpenAI embeddings example.

## What's new in this PR?

- Import `OpenAIEmbeddings` before creating it's object.

## How it's tested?
- By running notebook and creating embedding object.

Signed-off-by: letmerecall <girishsharma001@gmail.com>
2023-04-09 12:25:31 -07:00
Harrison Chase
b9e5b27a99
Harrison/motorhead (#2599)
Co-authored-by: James O'Dwyer <100361543+softboyjimbo@users.noreply.github.com>
2023-04-08 13:27:20 -07:00
Roy Xue
f5afb60116
doc: change comment with correct name (#2580)
In this comment, it should be **ConversationalRetrievalChain** instead
of **ChatVectorDBChain**
2023-04-08 08:31:33 -07:00
akmhmgc
544cc7f395
Modified doc (#2568)
# description
Remove unnecessary codes and made the output easier to check in docs :)
2023-04-07 22:01:53 -07:00
joaoareis
b4d6a425a2
Fix typo in ChatGPT plugins (#2553)
This PR adds a `,` that was missing in the ChatGPT plugins examples.
2023-04-07 11:17:15 -07:00
Ikko Eltociear Ashimine
fc1d48814c
fix typo in summary_buffer.ipynb (#2547)
ouput -> output
2023-04-07 11:16:53 -07:00
Harrison Chase
a32c85951e
agent docs (#2551) 2023-04-07 10:01:23 -07:00
Harrison Chase
247a88f2f9
Harrison/move eval (#2533) 2023-04-07 07:53:13 -07:00
akmhmgc
481de8df7f
Modify docs (#2539)
# description
Modified doc according to recently added `AgentType`.
2023-04-07 07:21:38 -07:00
Harrison Chase
a31c9511e8
Harrison/redis improvements (#2528)
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
2023-04-06 23:21:22 -07:00
Hamza Kyamanywa
ec489599fd
Correct typo in documentation for word 'therefore' (#2529)
This PR corrects a typo in the langchain
[documentation.](https://python.langchain.com/en/latest/modules/indexes.html#:~:text=We%20therefor%20have%20a%20concept)
It corrects the word `therefor` to `therefore`
2023-04-06 23:20:30 -07:00
Harrison Chase
3d0449bb45
agent tool retrieval (#2530) 2023-04-06 23:20:10 -07:00
William FH
632c65d64b
Add to notebook to assist in ground truth question generation (#2523)
At the bottom of the notebook, continue to show how to generate example
test cases with the assistance of an LLM
2023-04-06 23:08:55 -07:00
Harrison Chase
5c64b86ba3
Harrison/weaviate retriever (#2524)
Co-authored-by: Erika Cardenas <110841617+erika-cardenas@users.noreply.github.com>
2023-04-06 22:27:37 -07:00
William FH
629fda3957
Use JSON rather than JSON5 (#2520)
Evaluation so far has shown that agents do a reasonable job of emitting
`json` blocks as arguments when cued (instead of typescript), and `json`
permits the `strict=False` flag to permit control characters, which are
likely to appear in the response in particular.

This PR makes this change to the request and response synthesizer
chains, and fixes the temperature to the OpenAI agent in the eval
notebook. It also adds a `raise_error = False` flag in the notebook to
facilitate debugging
2023-04-06 21:14:12 -07:00
William FH
f8e4048cd8
Add an Example Evaluation Notebook for the API Chain (#2516)
Taking the Klarna API as an example, uses evaluation chain's to judge
the quality of the request and response synthesizers based on a small
set of curated queries.

Also updates intermediate steps for chain to emit a dict so each step
can be keyed for lookup


![image](https://user-images.githubusercontent.com/13333726/230505771-5cdb4de4-6fe7-4f54-b944-f29d438fa42c.png)
2023-04-06 15:58:41 -07:00
Harrison Chase
7149d33c71
max time limit for agent (#2513) 2023-04-06 14:38:34 -07:00
William FH
f240651bd8
Add Request body (#2507)
This still doesn't handle the following

- non-JSON media types
- anyOf, allOf, oneOf's

And doesn't emit the typescript definitions for referred types yet, but
that can be saved for a separate PR.

Also, we could have better support for Swagger 2.0 specs and OpenAPI
3.0.3 (can use the same lib for the latter) recommend offline conversion
for now.
2023-04-06 13:02:42 -07:00
Timon Ruban
f0926bad9f
Fix docstring in indexes/getting-started (#2452)
Fixed a letter. That's all.
2023-04-06 12:48:08 -07:00
Davit Buniatyan
b4914888a7
Deep Lake upgrade to include attribute search, distance metrics, returning scores and MMR (#2455)
### Features include

- Metadata based embedding search
- Choice of distance metric function (`L2` for Euclidean, `L1` for
Nuclear, `max` L-infinity distance, `cos` for cosine similarity, 'dot'
for dot product. Defaults to `L2`
- Returning scores
- Max Marginal Relevance Search
- Deleting samples from the dataset

### Notes
- Added numerous tests, let me know if you would like to shorten them or
make smarter

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-04-06 12:47:33 -07:00
Sam Weaver
2ffb90b161
Extend opensearch to better support existing instances (#2500) (#2509)
Closes #2500.
2023-04-06 12:45:56 -07:00
Matt Royer
ad87584c35
Fix 'embeddings is not defined' (#2468)
Nothing major. The docs just give an error when you try to use
`embeddings` instead of `llama`.
2023-04-06 12:45:45 -07:00
Jimmy Comfort
1dfb6a2a44
Update gpt4all example with model param (#2499)
I am pretty sure that the documentation here should point to `model`
instead of `model_path` based on the documentation here:


https://github.com/hwchase17/langchain/blob/master/langchain/llms/gpt4all.py#L26
2023-04-06 12:38:26 -07:00