Commit Graph

175 Commits (fix-searx)

Author SHA1 Message Date
Harrison Chase 3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
1 year ago
Harrison Chase c844d1fd46
Harrison/chunk size (#1549)
Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>
1 year ago
Harrison Chase 357d808484
Harrison/remote paths pdf (#1544)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Harrison Chase cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
1 year ago
Harrison Chase 7ade419a0e
allow passing of messages into prompt template (#1505) 1 year ago
Harrison Chase 064741db58
Harrison/fix text splitter (#1511)
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
1 year ago
Ankush Gola 27104d4921
fix `ChatOpenAI.agenerate` (#1504) 1 year ago
Harrison Chase 7bec461782
Harrison/memory refactor (#1478)
moves memory to own module, factors out common stuff
1 year ago
Harrison Chase 0e21463f07
(rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
1 year ago
Harrison Chase 63a5614d23
Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Harrison Chase a1b9dfc099
Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
1 year ago
Tim Asp 23231d65a9
Add PyMuPDF PDF loader (#1426)
Different PDF libraries have different strengths and weaknesses. PyMuPDF
does a good job at extracting the most amount of content from the doc,
regardless of the source quality, extremely fast (especially compared to
Unstructured).

https://pymupdf.readthedocs.io/en/latest/index.html
1 year ago
Nuno Campos 499e76b199
Allow the regular openai class to be used for ChatGPT models (#1393)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Kacper Łukawski 9ac442624c
Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
1 year ago
Ankush Gola fe30be6fba
add async and streaming support to `OpenAIChat` (#1378)
title says it all
1 year ago
Harrison Chase 1cd8996074
Harrison/summarizer chain (#1356)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Ankush Gola 82baecc892
Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds 

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a sql database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent` which returns an agent for interacting
with json blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 786852e9e6
partial variables (#1308) 1 year ago
Tim Asp 72ef69d1ba
Add new iFixit document loader (#1333)
iFixit is a wikipedia-like site that has a huge amount of open content
on how to fix things, questions/answers for common troubleshooting and
"things" related content that is more technical in nature. All content
is licensed under CC-BY-SA-NC 3.0

Adding docs from iFixit as context for user questions like "I dropped my
phone in water, what do I do?" or "My macbook pro is making a whining
noise, what's wrong with it?" can yield significantly better responses
than context free response from LLMs.
1 year ago
Harrison Chase 166cda2cc6
Harrison/deeplake (#1316)
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
1 year ago
Harrison Chase aaad6cc954
Harrison/atlas db (#1315)
Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>
1 year ago
Enrico Shippole 9becdeaadf
Add Writer, Banana, Modal, StochasticAI (#1270)
Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI

Added rigid json format for Banana and Modal
1 year ago
Dennis Antela Martinez 53c67e04d4
add aleph alpha llm (#1207)
Integrate Aleph Alpha's client into Langchain to provide access to the
luminous models - more info on latest benchmarks here:
https://www.aleph-alpha.com/luminous-performance-benchmarks
1 year ago
Harrison Chase b7708bbec6
rfc: callback changes (#1165)
conceptually, no reason a tool should know what an "agent action" is

unless any objections, can change in all callback handlers
1 year ago
Harrison Chase 44c8d8a9ac
move serpapi wrapper (#1199)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Naveen Tatikonda 0118706fd6
Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:

- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Andrew White c5015d77e2
Allow k to be higher than doc size in max_marginal_relevance_search (#1187)
Fixes issue #1186. For some reason, #1117 didn't seem to fix it.
1 year ago
Harrison Chase 9d6d8f85da
Harrison/self hosted runhouse (#1154)
Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com>
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com>
Co-authored-by: jeff <tangj1122@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Charles Frye <cfrye59@gmail.com>
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com>
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
Co-authored-by: Jeff Huber <jeffchuber@gmail.com>
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com>
Co-authored-by: seanaedmiston <seane999@gmail.com>
Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com>
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>
1 year ago
CG80499 af8f5c1a49
Added constitutional chain. (#1147)
- Added self-critique constitutional chain based on this
[paper](https://www.anthropic.com/constitutional.pdf).
1 year ago
Ankush Gola 7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
1 year ago
Francisco Ingham 3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
1 year ago
Noah Gundotra 8c5fbab72d
[Integration Tests] Cast fake embeddings to ALL float values (#1102)
Pydantic validation breaks tests for example (`test_qdrant.py`) because
fake embeddings contain an integer.

This PR casts the embeddings array to all floats.

Now the `qdrant` test passes, `poetry run pytest
tests/integration_tests/vectorstores/test_qdrant.py`
1 year ago
yakigac 1ed708391e
Fix a bug that shows "KeyError 'items'" (#1118)
Fix KeyError 'items' when no result found.

## Problem

When no result found for a query, google search crashed with `KeyError
'items'`.

## Solution

I added a check for an empty response before accessing the 'items' key.
It will handle the case correctly.

## Other

my twitter: yakigac
(I don't mind even if you don't mention me for this PR. But just because
last time my real name was shout out :) )
1 year ago
Harrison Chase 5e10e19bfe
Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Hasegawa Yuya e08961ab25
Fixed openai embeddings to be safe by batching them based on token size calculation. (#991)
I modified the logic of the batch calculation for embedding according to
this cookbook

https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
1 year ago
seanaedmiston f0a258555b
Support similarity search by vector (in FAISS) (#961)
Alternate implementation to PR #960 Again - only FAISS is implemented.
If accepted can add this to other vectorstores or leave as
NotImplemented? Suggestions welcome...
1 year ago
rogerserper e46cd3b7db
Google Search API integration with serper.dev (wrapper, tests, docs, … (#909)
Adds Google Search integration with [Serper](https://serper.dev) a
low-cost alternative to SerpAPI (10x cheaper + generous free tier).
Includes documentation, tests and examples. Hopefully I am not missing
anything.

Developers can sign up for a free account at
[serper.dev](https://serper.dev) and obtain an api key.

## Usage

```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool

import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""

llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run
    )
]

self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```

### Output
```
Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain

> Finished chain.

'El Palmar, Spain'
```
1 year ago
Ankush Gola caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
1 year ago
Harrison Chase 88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
1 year ago
Harrison Chase ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Enrico Shippole f30dcc6359
Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981)
Add GooseAI, CerebriumAI, Petals, ForefrontAI
1 year ago
Anton Troynikov d43d430d86
Chroma persistence (#1028)
This PR adds persistence to the Chroma vector store.

Users can supply a `persist_directory` with any of the `Chroma` creation
methods. If supplied, the store will be automatically persisted at that
directory.

If a user creates a new `Chroma` instance with the same persistence
directory, it will get loaded up automatically. If they use `from_texts`
or `from_documents` in this way, the documents will be loaded into the
existing store.

There is the chance of some funky behavior if the user passes a
different embedding function from the one used to create the collection
- we will make this easier in future updates. For now, we log a warning.
1 year ago
Anton Troynikov 78abd277ff
Chroma in LangChain (#1010)
Chroma is a simple to use, open-source, zero-config, zero setup
vectorstore.

Simply `pip install chromadb`, and you're good to go. 

Out-of-the-box Chroma is suitable for most LangChain workloads, but is
highly flexible. I tested to 1M embs on my M1 mac, with out issues and
reasonably fast query times.

Look out for future releases as we integrate more Chroma features with
LangChain!
1 year ago
Shahriar Tajbakhsh b7747017d7
Import of `declarative_base` when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
1 year ago
Harrison Chase c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
1 year ago
Harrison Chase 91c6cea227
Harrison/batch embeds (#972)
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
1 year ago
Ankush Gola bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`

Create demo notebook.

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
1 year ago
Harrison Chase bc53c928fc
Harrison/athropic (#921)
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
1 year ago
Harrison Chase 1e56879d38
Harrison/save faiss (#916)
Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>
1 year ago
Harrison Chase e2b834e427
Harrison/prompt template prefix (#888)
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
1 year ago