Commit Graph

651 Commits (searx-query-suffixy)
 

Author SHA1 Message Date
blob42 889aada1f0 searx: add `query_suffix` parameter
- allows building tools that dynamically inject an extra searx suffix into
  the query
 example: search.run("python library", query_suffix="site:github.com")
 resulting query: "python library site:github.com"
1 year ago
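The behaviour this commit describes can be sketched as plain string concatenation. The helper name `build_query` is hypothetical and is not taken from the searx wrapper; only the `query_suffix` parameter name and the example strings come from the commit message.

```python
def build_query(query: str, query_suffix: str = "") -> str:
    """Append an optional suffix to a search query (hypothetical helper)."""
    # Drop empty parts so that a missing suffix leaves the query unchanged.
    parts = [part.strip() for part in (query, query_suffix) if part and part.strip()]
    return " ".join(parts)


print(build_query("python library", query_suffix="site:github.com"))
# prints: python library site:github.com
```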
blob42 c0c3f4bbbe searx: remove duplicate param 1 year ago
Harrison Chase 4e43b0efe9
bump version 0092 (#1204) 1 year ago
Matt Robinson 3d5f56a8a1
docs: add quotes to `unstructured[local-inference]` install instructions (#1208)
### Summary

Corrects the install instruction for local inference to `pip install
"unstructured[local-inference]"`
1 year ago
Harrison Chase 047231840d
add docs for chroma persistence (#1202) 1 year ago
Harrison Chase 5bdb8dd6fe
Harrison/unstructured io (#1200) 1 year ago
Harrison Chase d90a287d8f
Harrison/updating docs (#1196) 1 year ago
Harrison Chase b7708bbec6
rfc: callback changes (#1165)
conceptually, no reason a tool should know what an "agent action" is

unless any objections, can change in all callback handlers
1 year ago
Harrison Chase fb83cd4ff4
catch networkx error (#1201) 1 year ago
Harrison Chase 44c8d8a9ac
move serpapi wrapper (#1199)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Konstantin Hebenstreit af94f1dd97
HuggingFaceEndpoint: Correct Example for ImportError (#1176)
When I try to import the Class HuggingFaceEndpoint I get an Import
Error: cannot import name 'HuggingFaceEndpoint' from 'langchain'.
(langchain version 0.0.88)
These two imports work fine: from langchain import HuggingFacePipeline
and from langchain import HuggingFaceHub.

So I corrected the import statement in the example. There is probably a
better solution to this, but this fixes the Error for me.
1 year ago
Harrison Chase 0c84ce1082
Harrison/add documents (#1197)
Co-authored-by: OmriNach <32659330+OmriNach@users.noreply.github.com>
1 year ago
Francisco Ingham 0b6a650cb4
added ability to override default verbose and memory when load chain … (#1153)
It is useful to be able to specify `verbose` or `memory` while still
keeping the chain's overall structure.

---------

Co-authored-by: Francisco Ingham <>
1 year ago
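One way to picture the override described above is a shallow merge of caller-supplied values over a loaded configuration. This is only a sketch under that assumption: `apply_overrides` and the dictionary-shaped config are hypothetical and do not reflect how the chain loader is actually implemented.

```python
from typing import Any, Dict

# Only the two settings named in the commit message may be overridden here.
ALLOWED_OVERRIDES = {"verbose", "memory"}


def apply_overrides(config: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a copy of config with `verbose` and/or `memory` replaced."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"Unsupported overrides: {sorted(unknown)}")
    merged = dict(config)      # leave the loaded config untouched
    merged.update(overrides)   # caller-supplied values win
    return merged


loaded = {"llm": "some-llm", "verbose": False}
print(apply_overrides(loaded, verbose=True))
```

The rest of the configuration is carried over unchanged, which matches the stated goal of keeping the chain's overall structure.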
Anton Troynikov d2ef5d6167
Default Chroma collection name (#1198)
For persistence, it's convenient to have a default collection name which
gets used everywhere.
1 year ago
Dennis Antela Martinez 23243ae69c
add gitbook document loader (#1180)
Added a GitBook document loader. It lets you either (1) fetch text from
any single GitBook page or (2) fetch all relative paths and return
their respective content in Documents.

I've modified the `scrape` method in the `WebBaseLoader` to accept
custom web paths if given, but happy to remove it and move that logic
into the `GitbookLoader` itself.
1 year ago
William FH 13ba0177d0
Add a StdIn "Interaction" Tool (#1193)
Lets a chain prompt the user for more input as a part of its execution.
1 year ago
Naveen Tatikonda 0118706fd6
Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper that supports the OpenSearch vector database.
Using the opensearch-py client, the embeddings of the given text are
ingested into an OpenSearch cluster through the Bulk API. The
`similarity_search` on the index can use any of the 3 popular search
methods of the OpenSearch k-NN plugin:

- `Approximate k-NN Search` uses approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Like Script
Scoring, it also supports brute force, exact k-NN search.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
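As a rough illustration of what "brute force, exact k-NN" means in the description above, the sketch below scores every stored vector against the query and keeps the closest `k`. It is plain Python, not OpenSearch or opensearch-py code; the function name and the use of squared Euclidean distance are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def exact_knn(
    query: Sequence[float], vectors: List[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, squared L2 distance) for the k vectors nearest to query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Every vector is compared, which is what makes the search exact.
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((idx, dist))
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print([idx for idx, _ in exact_knn([0.9, 0.9], vectors, k=2)])
```

The cost grows linearly with the number of stored vectors, which is the trade-off approximate methods are meant to avoid.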
Andrew White c5015d77e2
Allow k to be higher than doc size in max_marginal_relevance_search (#1187)
Fixes issue #1186. For some reason, #1117 didn't seem to fix it.
1 year ago
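A minimal sketch of the idea behind this fix: clamp `k` to the number of available candidates before running max marginal relevance selection. This is a pure-Python illustration, not the library's implementation; `mmr_select`, `lambda_mult`, and the cosine helper are names assumed for the example.

```python
from typing import List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Pick up to k candidate indices, balancing relevance and diversity."""
    k = min(k, len(candidates))  # never ask for more documents than exist
    selected: List[int] = []
    while len(selected) < k:
        best_idx, best_score = -1, float("-inf")
        for i, cand in enumerate(candidates):
            if i in selected:
                continue
            relevance = _cosine(query, cand)
            # Similarity to the closest already-selected candidate.
            redundancy = max(
                (_cosine(cand, candidates[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected


# k is larger than the number of candidates, yet the call still succeeds.
print(mmr_select([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], k=10))
```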
Zach Schillaci 159c560c95
Refactor some loops into list comprehensions (#1185) 1 year ago
Harrison Chase 926c121b98
Harrison/text splitter docs (#1188) 1 year ago
Harrison Chase 91446a5e9b
clean up text splitting docs (#1184) 1 year ago
Harrison Chase a5a14405ad
bump version to 0091 (#1181) 1 year ago
Harrison Chase 5a954efdd7
update gallery with slack bot (#1177) 1 year ago
Harrison Chase 4766b20223
clean up loaders (#1178) 1 year ago
blob42 9962bda70b
searx_search: docs updates (#1175)
- fix notebook formatting, remove empty cells and add scrolling for long
text

---------

Co-authored-by: blob42 <spike@w530>
1 year ago
Harrison Chase 4f3fbd7267
improve docs for indexes (#1146) 1 year ago
Harrison Chase 28781a6213
Harrison/markdown splitter (#1169)
Co-authored-by: Michael Chen <flamingdescent@gmail.com>
Co-authored-by: Michael Chen <michaelchen@stripe.com>
1 year ago
Harrison Chase 37dd34bea5
fix path (#1168) 1 year ago
Nan Wang e8f224fd3a
docs: add missing links to toc (#1163)
add missing links to toc

---------

Signed-off-by: Nan Wang <nan.wang@jina.ai>
1 year ago
Nick afe884fb96
AI21 documentation incorrectly titled Cohere (#1167) 1 year ago
Ji ed37fbaeff
for ChatVectorDBChain, add top_k_docs_for_context to control how many chunks of context are retrieved (#1155)
Given that we allow users to define the chunk size, it would be useful
to let them define how many chunks of context are retrieved.
1 year ago
Harrison Chase 955c89fccb
pass in prompts to vectordbqa (#1158) 1 year ago
Harrison Chase 65cc81c479
directory loader improvements (#1162) 1 year ago
Harrison Chase 05a05bcb04
bump version to 0.0.90 (#1157) 1 year ago
Harrison Chase 9d6d8f85da
Harrison/self hosted runhouse (#1154)
Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com>
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com>
Co-authored-by: jeff <tangj1122@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Charles Frye <cfrye59@gmail.com>
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com>
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
Co-authored-by: Jeff Huber <jeffchuber@gmail.com>
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com>
Co-authored-by: seanaedmiston <seane999@gmail.com>
Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com>
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>
1 year ago
CG80499 af8f5c1a49
Added constitutional chain. (#1147)
- Added self-critique constitutional chain based on this
[paper](https://www.anthropic.com/constitutional.pdf).
1 year ago
Harrison Chase a83ba44efa
Harrison/ver0089 (#1144) 1 year ago
Ankush Gola 7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
1 year ago
Harrison Chase 45b5640fe5
fix sql (#1141) 1 year ago
Sam Hogan 85c1449a96
Fix typo in HyDE docs (#1142) 1 year ago
kekayan 9111f4ca8a
fix chatvectordbchain to use pinecone namespace (#1139)
In the similarity search, the pinecone namespace is not used, which
makes the bot return _I don't know_ when the embeddings are stored in
a pinecone namespace. Now we can query by optionally passing the
namespace.
```
result = qa({"question": query, "chat_history": chat_history, "namespace": "01gshyhjcfgkq1q5wxjtm17gjh"})
```
1 year ago
Harrison Chase fb3c73d194
add srt loader (#1140) 1 year ago
Francisco Ingham 3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
1 year ago
Harrison Chase 483821ea3b
fix docs (#1133) 1 year ago
Harrison Chase ee3590cb61
instruct embeddings docs (#1131) 1 year ago
Noah Gundotra 8c5fbab72d
[Integration Tests] Cast fake embeddings to ALL float values (#1102)
Pydantic validation breaks tests (for example, `test_qdrant.py`) because
the fake embeddings contain an integer.

This PR casts the embeddings array to all floats.

Now the `qdrant` test passes, `poetry run pytest
tests/integration_tests/vectorstores/test_qdrant.py`
1 year ago
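The cast described above can be illustrated without pydantic or the real test fixtures. In this sketch the function name and the vector layout are hypothetical; the only point is that every element is an explicit `float` rather than a mix of floats and an integer.

```python
from typing import List


def fake_embedding(index: int, size: int = 10) -> List[float]:
    """Build a deterministic fake vector whose elements are all floats."""
    # float(index) avoids a stray int, which strict float validation may reject.
    return [1.0] * (size - 1) + [float(index)]


vector = fake_embedding(3, size=4)
print(vector, all(isinstance(x, float) for x in vector))
```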
Harrison Chase d5f3dfa1e1
Harrison/hn loader (#1130)
Co-authored-by: William X <william.y.xuan@gmail.com>
1 year ago
Tom Bocklisch 47c3221fda
Max marginal relevance search fails if there are not enough docs (#1117)
Implementation fails if there are not enough documents. Added the same
check as used for similarity search.

Current implementation raises
```  
File ".venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 160, in max_marginal_relevance_search
    _id = self.index_to_docstore_id[i]
KeyError: -1
```
1 year ago
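The `KeyError: -1` in the traceback is consistent with FAISS padding its results with `-1` when fewer neighbours exist than were requested. A guard along the following lines would skip those placeholders. This is a sketch of the kind of check the commit refers to, not the code that was merged; `lookup_docstore_ids` is a hypothetical helper, while `index_to_docstore_id` is the name shown in the traceback.

```python
from typing import Dict, Iterable, List


def lookup_docstore_ids(
    indices: Iterable[int], index_to_docstore_id: Dict[int, str]
) -> List[str]:
    """Map search result indices to docstore ids, ignoring -1 placeholders."""
    ids = []
    for i in indices:
        if i == -1:
            # No neighbour was found for this slot, so there is nothing to look up.
            continue
        ids.append(index_to_docstore_id[i])
    return ids


print(lookup_docstore_ids([0, 1, -1, -1], {0: "doc-a", 1: "doc-b"}))
```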
Harrison Chase 511d41114f
return source documents for chat vector db chain (#1128) 1 year ago
Jon Luo c39ef70aa4
fix for database compatibility when getting table DDL (#1129)
#1081 introduced a method to get DDL (table definitions) in a manner
specific to sqlite3, thus breaking compatibility with other non-sqlite3
databases. This uses the sqlite3 command if the detected dialect is
sqlite, and otherwise uses the standard SQL `SHOW CREATE TABLE`. This
should fix #1103.
1 year ago
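The dialect switch described above might look roughly like the sketch below. The commit message does not say which sqlite3 command is used, so the `sqlite_master` lookup is an assumption, as is the helper name `ddl_query`; only `SHOW CREATE TABLE` is taken from the message.

```python
import sqlite3


def ddl_query(dialect: str, table: str) -> str:
    """Pick a statement that returns a table's definition (illustrative only)."""
    # `table` is assumed to be a trusted identifier; it is not escaped here.
    if dialect == "sqlite":
        # SQLite keeps each table's CREATE statement in sqlite_master.
        return f"SELECT sql FROM sqlite_master WHERE type='table' AND name='{table}'"
    # Fallback named in the commit message for other dialects.
    return f"SHOW CREATE TABLE {table}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
ddl = conn.execute(ddl_query("sqlite", "users")).fetchone()[0]
print(ddl)
print(ddl_query("mysql", "users"))
```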