Harrison Chase
0998577dfe
Harrison/unstructured structured ( #1004 )
2023-02-12 07:36:11 -08:00
Harrison Chase
bbb06ca4cf
pdfminer ( #1003 )
2023-02-12 07:29:26 -08:00
Francisco Ingham
0b6aa6a024
Added initial capital letter to bullet points that had it missing ( #1000 )
...
Co-authored-by: Francisco Ingham <>
2023-02-11 20:31:34 -08:00
Harrison Chase
10e7297306
Harrison/fake llm ( #990 )
...
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-11 15:12:35 -08:00
Harrison Chase
e51fad1488
Harrison/0083 ( #996 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-11 08:29:28 -08:00
Shahriar Tajbakhsh
b7747017d7
Import of declarative_base when SQLAlchemy <1.4 ( #883 )
...
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. However, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
only works with SQLAlchemy >=1.4. This PR makes sure LangChain can
run in environments with SQLAlchemy <1.4.
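A common way to support both version ranges is a fallback import. The sketch below illustrates that general pattern and is not the exact change made in this PR. `import_with_fallback` is a hypothetical helper, demonstrated on standard-library modules so it runs without SQLAlchemy installed.

```python
import importlib


def import_with_fallback(attr_name, module_paths):
    """Return `attr_name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            # Module missing, or present but lacking the name: try the next candidate.
            continue
    raise ImportError(f"{attr_name!r} not found in any of {module_paths}")


# With SQLAlchemy installed, the same idea would read (assumed usage):
#   declarative_base = import_with_fallback(
#       "declarative_base",
#       ["sqlalchemy.orm", "sqlalchemy.ext.declarative"],
#   )
# Standard-library demonstration: `math` exists but has no `OrderedDict`,
# so the helper falls through to `collections`.
OrderedDict = import_with_fallback("OrderedDict", ["math", "collections"])
```

The helper catches `AttributeError` as well as `ImportError` because the older package layout still has the first module, just without the name being looked up.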
2023-02-10 18:33:47 -08:00
Harrison Chase
2e96704d59
Harrison/airbyte ( #989 )
...
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2023-02-10 18:08:00 -08:00
Charles Frye
e9799d6821
improves huggingface_hub example ( #988 )
...
The provided example uses the default `max_length` of `20` tokens, which
leads to the example generation getting cut off. 20 tokens is way too
short to show CoT reasoning, so I boosted it to `64`.
Without knowing HF's API well, it can be hard to figure out just where
those `model_kwargs` come from, and `max_length` is a super critical
one.
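A minimal sketch of where `max_length` is set, assuming the `HuggingFaceHub` wrapper from the langchain release of this period. The `repo_id` value and the `build_llm` helper are illustrative assumptions, not taken from the commit.

```python
# Generation settings forwarded to the Hugging Face Hub inference call.
# Per the commit message, the default max_length is 20 tokens; 64 leaves room
# for a chain-of-thought style answer.
model_kwargs = {"max_length": 64}


def build_llm(repo_id="google/flan-t5-xl"):
    """Build the LLM lazily so this sketch can be imported without langchain."""
    # Assumed import path for langchain at the time of this commit;
    # the wrapper also expects HUGGINGFACEHUB_API_TOKEN in the environment.
    from langchain import HuggingFaceHub

    return HuggingFaceHub(repo_id=repo_id, model_kwargs=model_kwargs)
```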
2023-02-10 17:56:15 -08:00
zanderchase
c2d1d903fa
Zander/online pdf loader ( #984 )
2023-02-10 15:42:30 -08:00
Harrison Chase
055a53c27f
add texts example ( #985 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2023-02-10 12:32:44 -08:00
Harrison Chase
231da14771
bump version to 0082 ( #980 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2023-02-10 11:38:24 -08:00
jeff
6ab432d62e
docs: update spelling typos ( #982 )
...
Wonder why "with" is spelled "wiht" so many times by humans
2023-02-10 11:37:59 -08:00
Matt Robinson
07a407d89a
feat: adds UnstructuredURLLoader for loading data from urls ( #979 )
...
### Summary
Adds an `UnstructuredURLLoader` that supports loading data from a list of
URLs.
### Testing
```python
from langchain.document_loaders import UnstructuredURLLoader

# Example pages to fetch; each URL is loaded as a separate document.
urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023",
]
loader = UnstructuredURLLoader(urls=urls)
raw_documents = loader.load()
```
2023-02-10 10:18:38 -08:00
Harrison Chase
c64f98e2bb
Harrison/format agent instructions ( #973 )
...
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
2023-02-10 10:07:26 -08:00
Harrison Chase
5469d898a9
Harrison/everynote ( #974 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 08:02:35 -08:00
Harrison Chase
3d639d1539
update lint ( #975 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 08:01:13 -08:00
Harrison Chase
91c6cea227
Harrison/batch embeds ( #972 )
...
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 06:59:50 -08:00
Harrison Chase
ba54d36787
Harrison/tiktoken spec ( #964 )
...
Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-09 23:30:18 -08:00
Harrison Chase
5f8082bdd7
Harrison/deps ( #963 )
...
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-09 23:19:19 -08:00
Kevin Huo
512c523368
remove sample_row_in_table_info and simplify set operations in SQLDB ( #932 )
...
- Address TODO: deprecate for sample_row_in_table_info
- Simplify set operations by casting to sets up front, so that multiple
set casts and `.difference()` calls are no longer needed
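The set simplification can be sketched as follows. This illustrates the idea and is not the actual SQLDB code; `usable_tables` and its parameters are hypothetical names.

```python
def usable_tables(all_tables, include_tables=None, ignore_tables=None):
    """Resolve which tables to expose, converting each input to a set only once."""
    all_set = set(all_tables)
    include_set = set(include_tables or [])
    ignore_set = set(ignore_tables or [])

    if include_set and ignore_set:
        raise ValueError("Cannot specify both include_tables and ignore_tables")

    # Set operators replace repeated set(...) casts and .difference() calls.
    missing = (include_set | ignore_set) - all_set
    if missing:
        raise ValueError(f"Tables not found in database: {sorted(missing)}")

    if include_set:
        return include_set
    return all_set - ignore_set


remaining = usable_tables(["users", "orders", "logs"], ignore_tables=["logs"])
```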
2023-02-09 23:15:41 -08:00
Harrison Chase
e323d0cfb1
bump version 0081 ( #956 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-09 08:29:11 -08:00
Harrison Chase
01fa2d8117
Harrison/youtube fixes ( #955 )
...
Co-authored-by: Ji <jizhang.work@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-09 08:12:22 -08:00
zanderchase
8e126bc9bd
adding webpage loading logic ( #942 )
2023-02-09 07:52:50 -08:00
Harrison Chase
c71027e725
add docs for steamship deployment ( #949 )
...
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-08 16:01:19 -08:00
Usama Navid
e85c53ce68
Update readthedocs.py ( #943 )
...
Sometimes the docs may be empty. For example, `text =
soup.find_all("main", {"id": "main-content"})` returned an empty list. To
handle these edge cases, the clean function needs to check whether the
result is empty.
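A minimal sketch of such a guard. `clean_main_content` is a hypothetical name, and plain strings stand in for the elements returned by `soup.find_all` so the example runs without BeautifulSoup.

```python
def clean_main_content(elements):
    """Return the cleaned text of the first match, or "" when nothing matched."""
    if not elements:  # empty list: the page had no matching <main> element
        return ""
    text = elements[0]
    # Drop blank lines left over from HTML extraction.
    return "\n".join(line for line in text.splitlines() if line.strip())


empty_result = clean_main_content([])
cleaned = clean_main_content(["Title\n\n\nBody text"])
```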
2023-02-08 16:01:07 -08:00
Harrison Chase
3e1901e1aa
gutenberg books ( #946 )
...
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-08 12:00:47 -08:00
jeff
6a4f602156
docs: fix spelling typo ( #934 )
2023-02-08 11:13:35 -08:00
Ikko Eltociear Ashimine
6023d5be09
Update huggingface_hub.ipynb ( #944 )
...
HuggingFace -> Hugging Face
2023-02-08 11:05:28 -08:00
Harrison Chase
a306baacd1
bump version to 0080 ( #941 )
2023-02-08 07:41:25 -08:00
Harrison Chase
44ecec3896
Harrison/add roam loader ( #939 )
2023-02-08 00:35:33 -08:00
Ankush Gola
bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent ( #841 )
...
Supporting asyncio in langchain primitives lets users run them
concurrently and integrates more seamlessly with
asyncio-supported frameworks (FastAPI, etc.)
Summary of changes:
**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`
**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now
**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpAPI `arun`
Create demo notebook.
Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
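The concurrency benefit can be sketched with a stand-in class. `FakeAsyncLLM` is hypothetical and only mimics the sync/async method pairing described above; it is not langchain's interface, whose `agenerate` has a different signature.

```python
import asyncio
import time


class FakeAsyncLLM:
    """Hypothetical stand-in exposing a sync and an async entry point."""

    def __init__(self, latency=0.1):
        self.latency = latency

    def generate(self, prompt):
        time.sleep(self.latency)  # blocks the caller for the whole wait
        return f"echo: {prompt}"

    async def agenerate(self, prompt):
        await asyncio.sleep(self.latency)  # yields to the event loop while waiting
        return f"echo: {prompt}"


async def generate_concurrently(llm, prompts):
    # gather starts all coroutines together and returns results in input order.
    return await asyncio.gather(*(llm.agenerate(p) for p in prompts))


llm = FakeAsyncLLM(latency=0.1)
results = asyncio.run(generate_concurrently(llm, ["a", "b", "c"]))
```

Keeping both methods on one class, as here, is the first of the two patterns raised in the open question; the alternative would split them into separate sync and async classes.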
2023-02-07 21:21:57 -08:00
Vincent Elster
afc7f1b892
Fix typos ( #929 )
...
accomplisehd -> accomplished
2023-02-07 14:39:45 -08:00
Harrison Chase
d43250bfa5
Harrison/ver0079 ( #927 )
2023-02-07 07:59:35 -08:00
Harrison Chase
bc53c928fc
Harrison/athropic ( #921 )
...
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
2023-02-06 22:29:25 -08:00
Harrison Chase
637c0d6508
Harrison/obsidian ( #920 )
2023-02-06 22:21:16 -08:00
Harrison Chase
1e56879d38
Harrison/save faiss ( #916 )
...
Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>
2023-02-06 21:44:50 -08:00
Ankush Gola
6bd1529cb7
add GoogleDriveLoader ( #914 )
...
only deal with docs files for now
2023-02-06 21:44:35 -08:00
Harrison Chase
2584663e44
remove unused buffer ( #919 )
2023-02-06 20:31:30 -08:00
Harrison Chase
cc20b9425e
add reqs ( #918 )
2023-02-06 20:30:03 -08:00
Harrison Chase
cea380174f
fix docs custom prompt template ( #917 )
2023-02-06 20:29:48 -08:00
Harrison Chase
87fad8fc00
analyze document ( #731 )
...
add analyze document chain, which does text splitting and then analysis
2023-02-06 20:02:19 -08:00
Harrison Chase
e2b834e427
Harrison/prompt template prefix ( #888 )
...
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
2023-02-06 19:09:28 -08:00
Harrison Chase
f95cedc443
Harrison/sql rows ( #915 )
...
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-06 18:56:18 -08:00
Harrison Chase
ba5a2f06b9
Harrison/inference endpoint ( #861 )
...
Co-authored-by: Eno Reyes <enoreyes@gmail.com>
2023-02-06 18:14:25 -08:00
Harrison Chase
2ec25ddd4c
add unstructured examples ( #913 )
2023-02-06 18:13:46 -08:00
Kevin Huo
31b054f69d
Add pinecone integration test ( #911 )
...
Basic integration test for pinecone
2023-02-06 18:13:35 -08:00
Harrison Chase
93a091cfb8
Optionally return shell output on incorrect command ( #894 ) ( #899 )
...
This allows the LLM to correct its previous command by looking at the
error message output to the shell.
Additionally, this uses subprocess.run because that is now recommended
over subprocess.check_output:
https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module
Co-authored-by: Amos Ng <me@amos.ng>
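A minimal sketch of the behavior described, using `subprocess.run`. The function name and the `return_err_output` flag are assumptions made for illustration and may not match the names used in the PR.

```python
import subprocess
import sys


def run_command(args, return_err_output=False):
    """Run a command and return its output; on failure optionally return the error text."""
    try:
        completed = subprocess.run(
            args,
            check=True,                # raise CalledProcessError on non-zero exit
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
        )
    except subprocess.CalledProcessError as error:
        if return_err_output:
            # Hand the command's own error text back so the caller can react to it.
            return error.stdout.decode()
        return str(error)
    return completed.stdout.decode()


ok = run_command([sys.executable, "-c", "print('hi')"])
bad = run_command(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"],
    return_err_output=True,
)
```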
2023-02-06 12:46:16 -08:00
James Briggs
3aa53b44dd
added i_end in batch extraction ( #907 )
...
Fix for issue #906
Switches `[i : i + batch_size]` to `[i : i_end]` in Pinecone
`from_texts` method
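The idea of reusing a clamped `i_end` can be sketched as below. This is an illustration, not the Pinecone `from_texts` code; `iter_batches` and the generated ids are hypothetical.

```python
import uuid


def iter_batches(texts, batch_size=32):
    """Yield (ids, batch) pairs whose lengths always agree, including the last batch."""
    for i in range(0, len(texts), batch_size):
        i_end = min(i + batch_size, len(texts))  # clamp to the end of the input
        batch = texts[i:i_end]
        # The same bounds drive both the slice and the id count.
        ids = [str(uuid.uuid4()) for _ in range(i, i_end)]
        yield ids, batch


batches = list(iter_batches(list("abcde"), batch_size=2))
```

Computing `i_end` once and using it everywhere keeps every per-batch sequence the same length, even when the final batch is short.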
2023-02-06 12:45:56 -08:00
Harrison Chase
82c080c6e6
bump version to 0078 ( #908 )
2023-02-06 00:32:44 -08:00
Harrison Chase
71e662e88d
update docs ( #905 )
2023-02-06 00:26:20 -08:00