Commit Graph

1181 Commits (fix-readthedocs)
 

Author SHA1 Message Date
blob42 14d0e0ee41 fix: ReadTheDocs loader main content filter 1 year ago
Nuno Campos 6f39e88a2c
Add AsyncIteratorCallbackHandler (#2329) 1 year ago
Harrison Chase 6e4e7d2637
bump version to 135 (#2600) 1 year ago
rkeshwani 5e57496225
#2595 ChromaDB: Add ability to adjust metadata for indexes upon creating co… (#2597)
Referencing #2595
Added optional default parameter to adjust index metadata upon
collection creation per chroma code

ce0bc89777/chromadb/api/local.py (L74)

This lets users adjust the distance calculation functions.
1 year ago
Harrison Chase b9e5b27a99
Harrison/motorhead (#2599)
Co-authored-by: James O'Dwyer <100361543+softboyjimbo@users.noreply.github.com>
1 year ago
Johnny Lim 79a44c8225
Remove unnecessary question mark in link in README (#2589)
This PR removes an unnecessary question mark in link in the `README.md`
file.
1 year ago
Harrison Chase 2f49c96532
Harrison/redis (#2588)
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
1 year ago
Yuchu Luo 40469eef7f
fix temperature parameter not used in chat models (#2558) 1 year ago
Will Henchy 125afb51d7
Add shared Google Drive folder support (#2562)
closes #1634

Adds support for loading files from a shared Google Drive folder to
`GoogleDriveLoader`. Shared drives are commonly used by businesses on
their Google Workspace accounts (this is my particular use case).
1 year ago
Alex Rad 7bf5b0ccd3
RWKV: do not propagate model_state between calls (#2565)
RWKV is an RNN with a hidden state that is part of its inference.
However, the model state should not be carried across uses and it's a
bug to do so.

This resets the state for multiple invocations
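
A minimal sketch of the idea, using a toy stand-in rather than the actual
RWKV wrapper. The class `ToyRecurrentModel` and its methods are
hypothetical; only the `model_state` name comes from the PR title. The
hidden state is cleared at the start of every call, so a second invocation
cannot see what the first one left behind.

```python
class ToyRecurrentModel:
    """Toy stand-in for a recurrent model; not the real RWKV wrapper."""

    def __init__(self):
        self.model_state = None  # hidden state carried between tokens

    def _step(self, token):
        # Fold the token into the hidden state (here: a running sum).
        self.model_state = (self.model_state or 0) + token
        return self.model_state

    def generate(self, tokens):
        # Reset the state so nothing leaks in from an earlier call.
        self.model_state = None
        output = None
        for token in tokens:
            output = self._step(token)
        return output


model = ToyRecurrentModel()
first = model.generate([1, 2, 3])
second = model.generate([1, 2, 3])  # same input, same result
```

Without the reset in `generate`, `second` would continue from the state
left by the first call and the two results would differ.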
1 year ago
Venky 7a4e1b72a8
Fix docs links (#2572)
Fix broken links in documentation.
1 year ago
Roy Xue f5afb60116
doc: change comment with correct name (#2580)
In this comment, it should be **ConversationalRetrievalChain** instead
of **ChatVectorDBChain**
1 year ago
Shishin Mo f7f118e021
use openai_organization as argument (#2566)
Added support for passing the openai_organization as an argument, as it
was only supported by the environment variable but openai_api_key was
supported by both environment variables and arguments.

`ChatOpenAI(temperature=0, model_name="gpt-4", openai_api_key="sk-****",
openai_organization="org-****")`
1 year ago
akmhmgc 544cc7f395
Modified doc (#2568)
# description
Removed unnecessary code and made the output easier to check in the docs :)
1 year ago
sergerdn cd9336469e
fix: missed deps integrations tests (#2560)
Almost all integration tests have failed, but we haven't encountered any
import errors yet. Some tests failed due to lazy import issues. It
doesn't seem like a problem to resolve some of these errors in the next
PR.
I have a headache from resolving conflicts with `deeplake` and `boto3`,
so I will temporarily comment out `boto3`.


fix https://github.com/hwchase17/langchain/issues/2426
1 year ago
Kacper Łukawski d8967e28d0
Upgrade Qdrant to 1.1.2 (#2554)
This is a minor upgrade for Qdrant. We made a small bugfix in the local
mode, so it might also be good to upgrade Qdrant for LangChain users.
1 year ago
joaoareis b4d6a425a2
Fix typo in ChatGPT plugins (#2553)
This PR adds a `,` that was missing in the ChatGPT plugins examples.
1 year ago
Ikko Eltociear Ashimine fc1d48814c
fix typo in summary_buffer.ipynb (#2547)
ouput -> output
1 year ago
Duncan Brown 9b78bb7393
Fix a typo in the SQL agent prompt prefix (#2552)
Fix the grammar in this sentence, and remove the redundant "few"

"only ask for a the few relevant columns" -> "only ask for the relevant
columns"
1 year ago
Harrison Chase a32c85951e
agent docs (#2551) 1 year ago
Harrison Chase 95e780d6f9
bump version 134 (#2544) 1 year ago
Harrison Chase 247a88f2f9
Harrison/move eval (#2533) 1 year ago
sergerdn 6dc86ad48f
feat: add pytest-vcr for recording HTTP interactions in integration tests (#2445)
Using `pytest-vcr` in integration tests has several benefits. Firstly,
it removes the need to mock external services, as VCR records and
replays HTTP interactions on the fly. Secondly, it simplifies the
integration test setup by eliminating the need to set up and tear down
external services in some cases. Finally, it allows for more reliable
and deterministic integration tests by ensuring that HTTP interactions
are always replayed with the same response.
Overall, `pytest-vcr` is a valuable tool for simplifying integration
test setup and improving their reliability

This commit adds the `pytest-vcr` package as a dependency for
integration tests in the `pyproject.toml` file. It also introduces two
new fixtures in `tests/integration_tests/conftest.py` files for managing
cassette directories and VCR configurations.

In addition, the
`tests/integration_tests/vectorstores/test_elasticsearch.py` file has
been updated to use the `@pytest.mark.vcr` decorator for recording and
replaying HTTP interactions.

Finally, this commit removes the `documents` fixture from the
`test_elasticsearch.py` file and replaces it with a new fixture defined
in `tests/integration_tests/vectorstores/conftest.py` that yields a list
of documents to use in any other tests.

This also includes my second attempt to fix issue:
https://github.com/hwchase17/langchain/issues/2386

Maybe related https://github.com/hwchase17/langchain/issues/2484
1 year ago
tmyjoe c9f93f5f74
fix: token counting for chat openai. (#2543)
I noticed that the value of `get_num_tokens_from_messages` in `ChatOpenAI`
is always one less than the response from OpenAI's API. Upon checking
the official documentation, I found that it had been updated, so I made
the necessary corrections.
Now I get the same value as OpenAI's API.


d972e7482e (diff-2d4485035b3a3469802dbad11d7b4f834df0ea0e2790f418976b303bc82c1874L474)
1 year ago
SangamSwadiK 8cded3fdad
fix typo (#2532)
1) Any breaking changes?
None

2) What does this do?
Fixes a typo in QA eval

cc @hwchase17
1 year ago
Ankush Gola dca21078ad
Run tools concurrently in `_atake_next_step` (#2537)
small refactor to allow this
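
A rough sketch of what running tools concurrently can look like. The
commit message does not say which primitive `_atake_next_step` uses, so
`asyncio.gather` is an assumption here, and `fake_tool` and
`run_tools_concurrently` are hypothetical names for illustration.

```python
import asyncio


async def fake_tool(name, delay):
    # Hypothetical tool: sleeps to simulate I/O, then returns an observation.
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_tools_concurrently(calls):
    # gather() schedules all coroutines at once and returns the results
    # in the order the calls were given, not the order they finished.
    return await asyncio.gather(
        *(fake_tool(name, delay) for name, delay in calls)
    )


observations = asyncio.run(
    run_tools_concurrently([("search", 0.02), ("calculator", 0.01)])
)
```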
1 year ago
Ankush Gola 6dbd29e440
add async vector operations in VectorStore base class (#2535)
not currently implemented by any subclasses
1 year ago
akmhmgc 481de8df7f
Modify docs (#2539)
# description
Modified doc according to recently added `AgentType`.
1 year ago
Harrison Chase a31c9511e8
Harrison/redis improvements (#2528)
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
1 year ago
Hamza Kyamanywa ec489599fd
Correct typo in documentation for word 'therefore' (#2529)
This PR corrects a typo in the langchain
[documentation.](https://python.langchain.com/en/latest/modules/indexes.html#:~:text=We%20therefor%20have%20a%20concept)
It corrects the word `therefor` to `therefore`
1 year ago
Harrison Chase 3d0449bb45
agent tool retrieval (#2530) 1 year ago
William FH 632c65d64b
Add to notebook to assist in ground truth question generation (#2523)
At the bottom of the notebook, continue to show how to generate example
test cases with the assistance of an LLM
1 year ago
Harrison Chase 15cdfa9e7f
Harrison/table index (#2526)
Co-authored-by: Alvaro Sevilla <alvaro@chainalysis.com>
1 year ago
Harrison Chase 704b0feb38
Harrison/allow org none (#2527) 1 year ago
Alex Iribarren aecd1c8ee3
Gitbook enhancements (#2279)
The gitbook importer had some issues while trying to ingest a particular
site; these commits allow it to work as expected. The last commit
(06017ff) opens the door to extending this class for other
documentation formats (which will come in a future PR).
1 year ago
Harrison Chase 58a93f88da
Harrison/entity store (#2525)
Co-authored-by: Alex Iribarren <alex.iribarren@gmail.com>
1 year ago
Vashisht Madhavan aa439ac2ff
Adding an in-context QA evaluation chain + chain of thought reasoning chain for improved accuracy (#2444)
Right now, eval chains require an answer for every question. It's
cumbersome to collect this ground truth, so this PR gets around the issue
in two ways:

* Adding a context param in `ContextQAEvalChain` and simply evaluating
if the question is answered accurately from context
* Adding chain of thought explanation prompting to improve the accuracy
of this w/o GT.

This also gets to feature parity with openai/evals which has the same
contextual eval w/o GT.

TODO in follow-up:
* Better prompt inheritance. No need for a separate prompt for CoT
reasoning. How can we merge them together?

---------

Co-authored-by: Vashisht Madhavan <vashishtmadhavan@Vashs-MacBook-Pro.local>
1 year ago
AeroXi e131156805
set default embedding max token size (#2330)
#991 has already implemented this convenient feature to prevent
exceeding max token limit in embedding model.

> By default, this function is deactivated so as not to change the
previous behavior. If you specify something like 8191 here, it will work
as desired.

According to the author, this is not set by default.
The default model in OpenAIEmbeddings has a max token size of 8191
tokens, and no other OpenAI model has a larger token limit.
So I believe it is better to set this as the default value; otherwise
users may encounter this error and find it hard to solve.
1 year ago
Fabian Venturini Cabau 0316900d2f
feat: implements similarity_search_by_vector on Weaviate (#2522)
This PR implements `similarity_search_by_vector` in the Weaviate
vectorstore.
1 year ago
Harrison Chase 5c64b86ba3
Harrison/weaviate retriever (#2524)
Co-authored-by: Erika Cardenas <110841617+erika-cardenas@users.noreply.github.com>
1 year ago
Tiago De Gaspari c2f21a519f
Add support to set up openai organizations (#2514)
Add support for defining the organization of OpenAI, similarly to what
is done in the reference code below:

```
import os
import openai
openai.organization = os.getenv("OPENAI_ORGANIZATION")
openai.api_key = os.getenv("OPENAI_API_KEY")
```
1 year ago
William FH 629fda3957
Use JSON rather than JSON5 (#2520)
Evaluation so far has shown that agents do a reasonable job of emitting
`json` blocks as arguments when cued (instead of typescript), and `json`
permits the `strict=False` flag to permit control characters, which are
likely to appear in the response in particular.
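
A small illustration of the `strict=False` behaviour mentioned above, using
only the standard-library `json` module. The sample payload is made up; it
contains a raw newline inside a JSON string, which strict parsing rejects.

```python
import json

# The Python escape \n puts a literal newline (a control character)
# inside the JSON string value.
raw = '{"query": "line one\nline two"}'

try:
    json.loads(raw)  # strict=True is the default
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# With strict=False, control characters are allowed inside strings.
lenient = json.loads(raw, strict=False)
```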

This PR makes this change to the request and response synthesizer
chains, and fixes the temperature to the OpenAI agent in the eval
notebook. It also adds a `raise_error = False` flag in the notebook to
facilitate debugging
1 year ago
William FH f8e4048cd8
Add an Example Evaluation Notebook for the API Chain (#2516)
Taking the Klarna API as an example, this uses evaluation chains to judge
the quality of the request and response synthesizers based on a small
set of curated queries.

Also updates intermediate steps for chain to emit a dict so each step
can be keyed for lookup


![image](https://user-images.githubusercontent.com/13333726/230505771-5cdb4de4-6fe7-4f54-b944-f29d438fa42c.png)
1 year ago
Alex Rad bd780a8223
Add support for rwkv (#2422)
This adds support for running RWKV with pytorch. 

https://github.com/hwchase17/langchain/issues/2398

This does not yet support rwkv.cpp
1 year ago
Harrison Chase 7149d33c71
max time limit for agent (#2513) 1 year ago
William FH f240651bd8
Add Request body (#2507)
This still doesn't handle the following:

- non-JSON media types
- anyOf, allOf, oneOf's

And doesn't emit the typescript definitions for referred types yet, but
that can be saved for a separate PR.

Also, we could have better support for Swagger 2.0 specs and OpenAPI
3.0.3 (the same lib can be used for the latter); offline conversion is
recommended for now.
1 year ago
Zach Jones 13d1df2140
Feature: AgentExecutor execution time limit (#2399)
`AgentExecutor` already has support for limiting the number of
iterations. But the amount of time taken for each iteration can vary
quite a bit, so it is difficult to place limits on the execution time.
This PR adds a new field `max_execution_time` to the `AgentExecutor`
model. When called asynchronously, the agent loop is wrapped in an
`asyncio.timeout()` context which triggers the early stopping response
if the time limit is reached. When called synchronously, the agent loop
checks for both the max_iteration limit and the time limit after each
iteration.

When used asynchronously `max_execution_time` gives really tight control
over the max time for an execution chain. When used synchronously, the
chain can unfortunately exceed max_execution_time, but it still gives
more control than trying to estimate the number of max_iterations needed
to cap the execution time.
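
The difference between the two modes can be sketched as follows. This is
not the `AgentExecutor` code: `run_sync`, `run_async`, and the `step`
callables are hypothetical, and the async version uses `asyncio.wait_for`
instead of the `asyncio.timeout()` context named above, since
`asyncio.timeout()` is only available from Python 3.11.

```python
import asyncio
import time


def run_sync(step, max_iterations, max_execution_time):
    """Limits are only checked between iterations, so a slow step can overrun."""
    start = time.monotonic()
    iterations = 0
    while iterations < max_iterations:
        if time.monotonic() - start >= max_execution_time:
            return "stopped: time limit", iterations
        if step():  # step() returns True once the agent is finished
            return "finished", iterations + 1
        iterations += 1
    return "stopped: iteration limit", iterations


async def run_async(astep, max_iterations, max_execution_time):
    """The whole loop is cancelled as soon as the time limit is reached."""

    async def loop():
        for i in range(max_iterations):
            if await astep():
                return "finished", i + 1
        return "stopped: iteration limit", max_iterations

    try:
        return await asyncio.wait_for(loop(), timeout=max_execution_time)
    except asyncio.TimeoutError:
        return "stopped: time limit", None
```

In the synchronous sketch, a single step that takes longer than
`max_execution_time` still runs to completion before the limit is noticed,
which is the overrun described above.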

---------

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
1 year ago
qued 5b34931948
docs: update unstructured detectron install instructions (#2498)
Updated recommended `detectron2` version to install for use with
`unstructured`.

Should now match version in [Unstructured
README](https://github.com/Unstructured-IO/unstructured/blob/main/README.md#eight_pointed_black_star-quick-start).
1 year ago
Timon Ruban f0926bad9f
Fix docstring in indexes/getting-started (#2452)
Fixed a letter. That's all.
1 year ago
Davit Buniatyan b4914888a7
Deep Lake upgrade to include attribute search, distance metrics, returning scores and MMR (#2455)
### Features include

- Metadata based embedding search
- Choice of distance metric function (`L2` for Euclidean, `L1` for
Nuclear, `max` for L-infinity distance, `cos` for cosine similarity, `dot`
for dot product). Defaults to `L2`
- Returning scores
- Max Marginal Relevance Search
- Deleting samples from the dataset
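
For readers unfamiliar with the metric names, here is a plain-Python
sketch of what each one computes for two vectors. It is an illustration
only, not Deep Lake's implementation; the `METRICS` table and function
names are hypothetical, and `L1` is assumed here to mean the sum of
absolute differences.

```python
import math


def l2(a, b):
    # Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1(a, b):
    # Sum of absolute differences (assumed meaning of `L1`)
    return sum(abs(x - y) for x, y in zip(a, b))


def linf(a, b):
    # L-infinity distance: the largest single-coordinate difference
    return max(abs(x - y) for x, y in zip(a, b))


def dot(a, b):
    # Dot product (a similarity, larger means closer)
    return sum(x * y for x, y in zip(a, b))


def cos(a, b):
    # Cosine similarity: dot product of the normalized vectors
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical lookup table keyed by the metric names listed above.
METRICS = {"L2": l2, "L1": l1, "max": linf, "cos": cos, "dot": dot}
```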

### Notes
- Added numerous tests, let me know if you would like me to shorten them
or make them smarter

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
1 year ago