langchain

Commit Graph

Author	SHA1	Message	Date
Tim Asp	d22651d82a	Add new iFixit document loader (#1333 ) iFixit is a Wikipedia-like site that has a huge amount of open content on how to fix things, questions/answers for common troubleshooting, and "things"-related content that is more technical in nature. All content is licensed under CC-BY-SA-NC 3.0. Adding docs from iFixit as context for user questions like "I dropped my phone in water, what do I do?" or "My macbook pro is making a whining noise, what's wrong with it?" can yield significantly better responses than context-free responses from LLMs.	1 year ago
Harrison Chase	166cda2cc6	Harrison/deeplake (#1316 ) Co-authored-by: Davit Buniatyan <d@activeloop.ai>	1 year ago
Harrison Chase	aaad6cc954	Harrison/atlas db (#1315 ) Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>	1 year ago
Enrico Shippole	9becdeaadf	Add Writer, Banana, Modal, StochasticAI (#1270 ) Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI Added rigid json format for Banana and Modal	1 year ago
Dennis Antela Martinez	53c67e04d4	add aleph alpha llm (#1207 ) Integrate Aleph Alpha's client into Langchain to provide access to the luminous models - more info on latest benchmarks here: https://www.aleph-alpha.com/luminous-performance-benchmarks	1 year ago
Harrison Chase	44c8d8a9ac	move serpapi wrapper (#1199 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	1 year ago
Naveen Tatikonda	0118706fd6	Add Support for OpenSearch Vector database (#1191 ) ### Description This PR adds a wrapper which adds support for the OpenSearch vector database. Using opensearch-py client we are ingesting the embeddings of given text into opensearch cluster using Bulk API. We can perform the `similarity_search` on the index using the 3 popular searching methods of OpenSearch k-NN plugin: - `Approximate k-NN Search` use approximate nearest neighbor (ANN) algorithms from the [nmslib](https://github.com/nmslib/nmslib), [faiss](https://github.com/facebookresearch/faiss), and [Lucene](https://lucene.apache.org/) libraries to power k-NN search. - `Script Scoring` extends OpenSearch’s script scoring functionality to execute a brute force, exact k-NN search. - `Painless Scripting` adds the distance functions as painless extensions that can be used in more complex combinations. Also, supports brute force, exact k-NN search like Script Scoring. ### Issues Resolved https://github.com/hwchase17/langchain/issues/1054 --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	1 year ago
Andrew White	c5015d77e2	Allow k to be higher than doc size in max_marginal_relevance_search (#1187 ) Fixes issue #1186. For some reason, #1117 didn't seem to fix it.	1 year ago
Harrison Chase	9d6d8f85da	Harrison/self hosted runhouse (#1154 ) Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com> Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com> Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com> Co-authored-by: jeff <tangj1122@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local> Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Charles Frye <cfrye59@gmail.com> Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com> Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530> Co-authored-by: Enrico Shippole <henryshippole@gmail.com> Co-authored-by: Ibis Prevedello <ibiscp@gmail.com> Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com> Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com> Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> Co-authored-by: Jeff Huber <jeffchuber@gmail.com> Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com> Co-authored-by: Andrew Huang <jhuang16888@gmail.com> Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com> Co-authored-by: seanaedmiston <seane999@gmail.com> Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu> Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com> Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr> Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>	1 year ago
Noah Gundotra	8c5fbab72d	[Integration Tests] Cast fake embeddings to ALL float values (#1102 ) Pydantic validation breaks tests for example (`test_qdrant.py`) because fake embeddings contain an integer. This PR casts the embeddings array to all floats. Now the `qdrant` test passes, `poetry run pytest tests/integration_tests/vectorstores/test_qdrant.py`	1 year ago
yakigac	1ed708391e	Fix a bug that shows "KeyError 'items'" (#1118 ) Fix KeyError 'items' when no result found. ## Problem When no result found for a query, google search crashed with `KeyError 'items'`. ## Solution I added a check for an empty response before accessing the 'items' key. It will handle the case correctly. ## Other my twitter: yakigac (I don't mind even if you don't mention me for this PR. But just because last time my real name was shout out :) )	1 year ago
Hasegawa Yuya	e08961ab25	Fixed openai embeddings to be safe by batching them based on token size calculation. (#991 ) I modified the logic of the batch calculation for embedding according to this cookbook https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb	1 year ago
seanaedmiston	f0a258555b	Support similarity search by vector (in FAISS) (#961 ) Alternate implementation to PR #960 Again - only FAISS is implemented. If accepted can add this to other vectorstores or leave as NotImplemented? Suggestions welcome...	1 year ago
rogerserper	e46cd3b7db	Google Search API integration with serper.dev (wrapper, tests, docs, … (#909 ) Adds Google Search integration with [Serper](https://serper.dev) a low-cost alternative to SerpAPI (10x cheaper + generous free tier). Includes documentation, tests and examples. Hopefully I am not missing anything. Developers can sign up for a free account at [serper.dev](https://serper.dev) and obtain an api key. ## Usage ```python from langchain.utilities import GoogleSerperAPIWrapper from langchain.llms.openai import OpenAI from langchain.agents import initialize_agent, Tool import os os.environ["SERPER_API_KEY"] = "" os.environ['OPENAI_API_KEY'] = "" llm = OpenAI(temperature=0) search = GoogleSerperAPIWrapper() tools = [ Tool( name="Intermediate Answer", func=search.run ) ] self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True) self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?") ``` ### Output ``` Entering new AgentExecutor chain... Yes. Follow up: Who is the reigning men's U.S. Open champion? Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion. Follow up: Where is Carlos Alcaraz from? Intermediate answer: El Palmar, Spain So the final answer is: El Palmar, Spain > Finished chain. 'El Palmar, Spain' ```	1 year ago
Ankush Gola	caa8e4742e	Enable streaming for OpenAI LLM (#986 ) * Support a callback `on_llm_new_token` that users can implement when `OpenAI.streaming` is set to `True`	1 year ago
Harrison Chase	88bebb4caa	Harrison/llm integrations (#1039 ) Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com>	1 year ago
Enrico Shippole	f30dcc6359	Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981 ) Add GooseAI, CerebriumAI, Petals, ForefrontAI	1 year ago
Anton Troynikov	d43d430d86	Chroma persistence (#1028 ) This PR adds persistence to the Chroma vector store. Users can supply a `persist_directory` with any of the `Chroma` creation methods. If supplied, the store will be automatically persisted at that directory. If a user creates a new `Chroma` instance with the same persistence directory, it will get loaded up automatically. If they use `from_texts` or `from_documents` in this way, the documents will be loaded into the existing store. There is the chance of some funky behavior if the user passes a different embedding function from the one used to create the collection - we will make this easier in future updates. For now, we log a warning.	1 year ago
Anton Troynikov	78abd277ff	Chroma in LangChain (#1010 ) Chroma is a simple-to-use, open-source, zero-config, zero-setup vectorstore. Simply `pip install chromadb`, and you're good to go. Out of the box, Chroma is suitable for most LangChain workloads, but is highly flexible. I tested to 1M embs on my M1 mac, without issues and with reasonably fast query times. Look out for future releases as we integrate more Chroma features with LangChain!	1 year ago
Harrison Chase	c64f98e2bb	Harrison/format agent instructions (#973 ) Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>	1 year ago
Harrison Chase	91c6cea227	Harrison/batch embeds (#972 ) Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	1 year ago
Ankush Gola	bc7e56e8df	Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841 ) Supporting asyncio in langchain primitives allows users to run them concurrently and creates more seamless integration with asyncio-supported frameworks (FastAPI, etc.) Summary of changes: LLM * Add `agenerate` and `_agenerate` * Implement in OpenAI by leveraging `client.Completions.acreate` Chain * Add `arun`, `acall`, `_acall` * Implement them in `LLMChain` and `LLMMathChain` for now Agent * Refactor and leverage async chain and llm methods * Add ability for `Tools` to contain async coroutine * Implement async SerpAPI `arun` Create demo notebook. Open questions: * Should all the async stuff go in separate classes? I've seen both patterns (keeping the same class and having async and sync methods vs. having class separation)	1 year ago
Harrison Chase	bc53c928fc	Harrison/athropic (#921 ) Co-authored-by: Mike Lambert <mlambert@gmail.com> Co-authored-by: mrbean <sam@you.com> Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>	1 year ago
Harrison Chase	1e56879d38	Harrison/save faiss (#916 ) Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>	1 year ago
Harrison Chase	ba5a2f06b9	Harrison/inference endpoint (#861 ) Co-authored-by: Eno Reyes <enoreyes@gmail.com>	1 year ago
Kevin Huo	31b054f69d	Add pinecone integration test (#911 ) Basic integration test for pinecone	1 year ago
Harrison Chase	3f48eed5bd	Harrison/milvus (#856 ) Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com> Signed-off-by: Frank Liu <frank.liu@zilliz.com> Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com> Co-authored-by: Frank Liu <frank@frankzliu.com>	1 year ago
kahkeng	4a8f5cdf4b	Add alternative token-based text splitter (#816 ) This does not involve a separator, and will naively chunk input text at the appropriate boundaries in token space. This is helpful if we have strict token length limits, so that we need to strictly follow the specified chunk size, and we can't use aggressive separators like spaces to guarantee the absence of long strings. CharacterTextSplitter will let these strings through without splitting them, which could cause overflow errors downstream. Splitting at arbitrary token boundaries is not ideal but is hopefully mitigated by having a decent overlap quantity. Also, this results in chunks which have the exact number of tokens desired, instead of sometimes overcounting if we concatenate shorter strings. Potentially also helps with #528.	1 year ago
Harrison Chase	23d5f64bda	Harrison/ngram example (#846 ) Co-authored-by: Sean Spriggens <ssprigge@syr.edu>	1 year ago
Harrison Chase	d564308e0f	rfc: instruct embeddings (#811 ) Co-authored-by: seanaedmiston <seane999@gmail.com>	1 year ago
Harrison Chase	7b4882a2f4	Harrison/tf embeddings (#817 ) Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>	1 year ago
dham	e04b063ff4	add faiss local saving/loading (#676 ) - This uses the faiss built-in `write_index` and `load_index` to save and load faiss indexes locally - Also fixes #674 - The save/load functions also use the faiss library, so I refactored the dependency into a function	1 year ago
Harrison Chase	0b204d8c21	Harrison/quadrant (#665 ) Co-authored-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>	1 year ago
Harrison Chase	4d4cff0530	Harrison/cohere experimental (#638 ) Co-authored-by: inyourhead <44607279+xettrisomeman@users.noreply.github.com>	1 year ago
Harrison Chase	ffc7e04d44	Harrison/wolfram alpha (#579 ) Co-authored-by: Nicolas <nicolascamara29@gmail.com>	1 year ago
Harrison Chase	0072686aab	Harrison/new search engine (#477 ) Co-authored-by: Nicolas <nicolascamara29@gmail.com>	1 year ago
Harrison Chase	f8b605293f	Harrison/improve memory (#432 ) add AI prefix add new type of memory Co-authored-by: Jason <chisanch@usc.edu>	2 years ago
Harrison Chase	cf98f219f9	Harrison/tools exp (#372 )	2 years ago
Harrison Chase	3474f39e21	Harrison/improve cache (#368 ) make it so everything goes through generate, which removes the need for two types of caches	2 years ago
Harrison Chase	a7084ad6e4	Harrison/version 0040 (#366 )	2 years ago
mrbean	50257fce59	Support Streaming Tokens from OpenAI (#364 ) https://github.com/hwchase17/langchain/issues/363 @hwchase17 how much does this make you want to cry?	2 years ago
mrbean	fe6695b9e7	Add HuggingFacePipeline LLM (#353 ) https://github.com/hwchase17/langchain/issues/354 Add support for running your own HF pipeline locally. This would allow you to get a lot more dynamic with what HF features and models you support since you wouldn't be beholden to what is hosted in HF hub. You could also do stuff with HF Optimum to quantize your models and stuff to get pretty fast inference even running on a laptop.	2 years ago
Harrison Chase	9bb7195085	Harrison/llm saving (#331 ) Co-authored-by: Akash Samant <70665700+asamant21@users.noreply.github.com>	2 years ago
Harrison Chase	3ca2c8d6c5	allow passing of stop params into openai (#232 )	2 years ago
Harrison Chase	ca2394028f	move search to not be a chain (#226 )	2 years ago
Andrew Gleave	ea67c049f0	Support SQL statements that return no results (#222 ) Adds support for statements such as insert, update etc which do not return any rows. `engine.execute` is deprecated and so execution has been updated to use `connection.exec_driver_sql` as-per: https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Engine.execute	2 years ago
Harrison Chase	1b9b8efbc9	pal chain (#207 ) from https://arxiv.org/pdf/2211.10435.pdf	2 years ago
Harrison Chase	b94244eb12	nits (#210 ) use json.dump move test to integration tests (since it requires huggingface_hub)	2 years ago
Bagatur	b90e25f786	Add HuggingFace Hub Embeddings (#125 ) Add support for calling HuggingFace embedding models using the HuggingFaceHub Inference API. New class mirrors the existing HuggingFaceHub LLM implementation. Currently only supports 'sentence-transformers' models. Closes #86	2 years ago
Harrison Chase	ae9c6257fe	Harrison/arbitrary params (#186 )	2 years ago
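
The token-based splitter described in commit 4a8f5cdf4b (#816) chunks text at fixed positions in token space, with an overlap between neighbouring chunks. The sketch below only illustrates that idea and is not the code from the commit: `split_by_tokens`, `toy_encode`, and `toy_decode` are hypothetical names, and the one-token-per-character tokenizer is an assumption that stands in for a real tokenizer so the example runs without dependencies.

```python
from typing import Callable, List, Sequence


def split_by_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[Sequence[int]], str],
    chunk_size: int = 8,
    chunk_overlap: int = 2,
) -> List[str]:
    """Split text into chunks of at most chunk_size tokens (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    tokens = encode(text)
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        # Cut purely by token position; no separator is consulted.
        chunks.append(decode(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Toy tokenizer (assumption): one token per character.
def toy_encode(text: str) -> List[int]:
    return [ord(ch) for ch in text]


def toy_decode(tokens: Sequence[int]) -> str:
    return "".join(chr(t) for t in tokens)


chunks = split_by_tokens(
    "abcdefghijklmnopqrst", toy_encode, toy_decode, chunk_size=8, chunk_overlap=2
)
print(chunks)  # ['abcdefgh', 'ghijklmn', 'mnopqrst']
```

Because the window moves by `chunk_size - chunk_overlap` tokens, no chunk exceeds the limit even when the input contains no spaces, which is the case the commit message says `CharacterTextSplitter` lets through.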


73 Commits (main)