Commit Graph

625 Commits (28781a6213207bdf0794905b27222035c0d51cbd)
 

Author SHA1 Message Date
Jeff Huber 34cba2da32
Fix typo in integration with Chroma (#1070)
We introduced a breaking change but missed this call. This PR fixes
`langchain` to work with upstream `chroma`.
2 years ago
Jonathan Pedoeem 05df480376
Update `PromptLayerOpenAI` LLM usage instructions in documentation (#1053)
This PR updates the usage instructions for PromptLayerOpenAI in
Langchain's documentation. The updated instructions provide more detail
and conform better to the style of other LLM integration documentation
pages.

No code changes were made in this PR, only improvements to the
documentation. This update will make it easier for users to understand
how to use `PromptLayerOpenAI`
2 years ago
Matt Robinson 3ea1e5af1e
feat: added element metadata to unstructured loader (#1068)
### Summary

Adds tracked metadata from `unstructured` elements to the document
metadata when `UnstructuredFileLoader` is used in `"elements"` mode.
Tracked metadata is available in `unstructured>=0.4.9`, but the code is
written for backward compatibility with older `unstructured` versions.

### Testing

Before running, make sure to upgrade to `unstructured==0.4.9`. In the
code snippet below, you should see `page_number`, `filename`, and
`category` in the metadata for each document. `doc[0]` should have
`page_number: 1` and `doc[-1]` should have `page_number: 2`. The example
document is `layout-parser-paper-fast.pdf` from the [`unstructured`
sample
docs](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs).

```python
from langchain.document_loaders import UnstructuredFileLoader
loader = UnstructuredFileLoader(file_path=f"layout-parser-paper-fast.pdf", mode="elements")
docs = loader.load()
```
2 years ago
Harrison Chase bac676c8e7
bump version (#1057) 2 years ago
Ankush Gola d8ac274fc2
add to async chain notebook (#1056) 2 years ago
Ankush Gola caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
2 years ago
Harrison Chase f05f025e41
bump version to 0086 (#1050) 2 years ago
Sasmitha Manathunga c67c5383fd
docs: fix typo in notebook (#1046) 2 years ago
Harrison Chase 88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
2 years ago
Harrison Chase ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2 years ago
Harrison Chase 8c45f06d58
Harrison/standarize prompt loading (#1036)
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
2 years ago
Enrico Shippole f30dcc6359
Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981)
Add GooseAI, CerebriumAI, Petals, ForefrontAI
2 years ago
Anton Troynikov d43d430d86
Chroma persistence (#1028)
This PR adds persistence to the Chroma vector store.

Users can supply a `persist_directory` with any of the `Chroma` creation
methods. If supplied, the store will be automatically persisted at that
directory.

If a user creates a new `Chroma` instance with the same persistence
directory, it will get loaded up automatically. If they use `from_texts`
or `from_documents` in this way, the documents will be loaded into the
existing store.

There is the chance of some funky behavior if the user passes a
different embedding function from the one used to create the collection
- we will make this easier in future updates. For now, we log a warning.
2 years ago
Harrison Chase 012a6dfb16
Harrison/makefile (#1033)
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
2 years ago
Harrison Chase 6a31a59400
add links (#1027) 2 years ago
Oliver Klingefjord 20889205e8
Added retry for openai.error.ServiceUnavailableError (#1022)
Imho retries should be performed for ServiceUnavailableError (which
tends to happen to me quite often).
2 years ago
Harrison Chase fc2502cd81
bump version to 0085 (#1017) 2 years ago
Harrison Chase 0f0e69adce
agent refactors (#997) 2 years ago
Harrison Chase 7fb33fca47
chroma docs (#1012) 2 years ago
Harrison Chase 0c553d2064
Harrion/kg (#1016)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2 years ago
Anton Troynikov 78abd277ff
Chroma in LangChain (#1010)
Chroma is a simple to use, open-source, zero-config, zero setup
vectorstore.

Simply `pip install chromadb`, and you're good to go. 

Out-of-the-box Chroma is suitable for most LangChain workloads, but is
highly flexible. I tested to 1M embs on my M1 mac, with out issues and
reasonably fast query times.

Look out for future releases as we integrate more Chroma features with
LangChain!
2 years ago
cragwolfe 05d8969c79
Unstructured example notebook: add a pdf, related deps (#1011)
Updates the Unstructured example notebook with a PDF example. Includes
additional dependencies for PDF processing (and images, etc).
2 years ago
Dhruv Anand 03e5794978
typo fix on chat vector db docs (#1007)
simple typo fix: because --> between
2 years ago
Harrison Chase 6d44a2285c
bump version to 0084 (#1005) 2 years ago
Harrison Chase 0998577dfe
Harrison/unstructured structured (#1004) 2 years ago
Harrison Chase bbb06ca4cf
pdfminer (#1003) 2 years ago
Francisco Ingham 0b6aa6a024
Added initial capital letter to bullet points that had it missing (#1000)
Co-authored-by: Francisco Ingham <>
2 years ago
Harrison Chase 10e7297306
Harrison/fake llm (#990)
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase e51fad1488
Harrison/0083 (#996)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Shahriar Tajbakhsh b7747017d7
Import of `declarative_base` when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
2 years ago
Harrison Chase 2e96704d59
Harrison/airbyte (#989)
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2 years ago
Charles Frye e9799d6821
improves huggingface_hub example (#988)
The provided example uses the default `max_length` of `20` tokens, which
leads to the example generation getting cut off. 20 tokens is way too
short to show CoT reasoning, so I boosted it to `64`.

Without knowing HF's API well, it can be hard to figure out just where
those `model_kwargs` come from, and `max_length` is a super critical
one.
2 years ago
zanderchase c2d1d903fa
Zander/online pdf loader (#984) 2 years ago
Harrison Chase 055a53c27f
add texts example (#985)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2 years ago
Harrison Chase 231da14771
bump version to 0082 (#980)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2 years ago
jeff 6ab432d62e
docs: update spelling typos (#982)
Wonder why "with" is spelled "wiht" so many times by human
2 years ago
Matt Robinson 07a407d89a
feat: adds `UnstructuredURLLoader` for loading data from urls (#979)
### Summary

Adds a `UnstructuredURLLoader` that supports loading data from a list of
URLs.


### Testing

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023"
]
loader = UnstructuredURLLoader(urls=urls)
raw_documents = loader.load()
```
2 years ago
Harrison Chase c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
2 years ago
Harrison Chase 5469d898a9
Harrison/everynote (#974)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase 3d639d1539
update lint (#975)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase 91c6cea227
Harrison/batch embeds (#972)
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase ba54d36787
Harrison/tiktoken spec (#964)
Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase 5f8082bdd7
Harrison/deps (#963)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Kevin Huo 512c523368
remove sample_row_in_table_info and simplify set operations in SQLDB (#932)
-Address TODO: deprecate for sample_row_in_table_info
-Simplify set operations by casting to sets to not need multiple set
casts + .difference() calls
2 years ago
Harrison Chase e323d0cfb1
bump version 0081 (#956)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Harrison Chase 01fa2d8117
Harrison/youtube fixes (#955)
Co-authored-by: Ji <jizhang.work@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
zanderchase 8e126bc9bd
adding webpage loading logic (#942) 2 years ago
Harrison Chase c71027e725
add docs for steamship deployment (#949)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Usama Navid e85c53ce68
Update readthedocs.py (#943)
Sometimes, the docs may be empty. For example for the text =
soup.find_all("main", {"id": "main-content"}) was an empty list. To
cater to these edge cases, the clean function needs to be checked if it
is empty or not.
2 years ago
Harrison Chase 3e1901e1aa
gutenberg books (#946)
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago