langchain

Commit Graph

Author	SHA1	Message	Date
Harrison Chase	5ba1c7b601	ruff ruff (#1203 )	1 year ago
Harrison Chase	429af93cab	fix imports (#1288 )	1 year ago
Matt Robinson	3830d900bf	feat: document loader for MS Word documents (#1282 ) ### Summary Adds a document loader for MS Word Documents. Works with both `.docx` and `.doc` files as long as the user has installed `unstructured>=0.4.11`. ### Testing The following workflow tests the loader for both `.doc` and `.docx` files using example docs from the `unstructured` repo. #### `.docx` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.docx" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` #### `.doc` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.doc" loader = UnstructuredWordDocumentLoader(filename) loader.load() ```	1 year ago
Harrison Chase	cc10f1fe9c	cleanup (#1274 )	1 year ago
Harrison Chase	cf8cb58b3a	Harrison/cohere params (#1278 ) Co-authored-by: Stefano Faraggi <40745694+stepp1@users.noreply.github.com>	1 year ago
Harrison Chase	42ddf44f86	Harrison/logprobs (#1279 ) Co-authored-by: Prateek Shah <97124740+prateekspanning@users.noreply.github.com>	1 year ago
Harrison Chase	b592b4d3f6	Harrison/fb loader (#1277 ) Co-authored-by: Vairo Di Pasquale <vairo.dp@gmail.com>	1 year ago
Harrison Chase	91584382e3	Harrison/errors (#1276 ) Co-authored-by: Kevin Huo <5000881+kwhuo68@users.noreply.github.com>	1 year ago
Klein Tahiraj	24f02aa39a	adding .ipynb loader and documentation Fixes #1248 (#1252 ) `NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object. Parameters: * `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False). * `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10). * `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False). * `traceback` (bool): whether to include full traceback (default is False).	1 year ago
Harrison Chase	6cd37e308f	Harrison/source docs (#1275 ) Co-authored-by: Tushar Dhadiwal <tushardhadiwal@users.noreply.github.com>	1 year ago
Enrico Shippole	4dbea7e9e4	Add Writer, Banana, Modal, StochasticAI (#1270 ) Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI Added rigid json format for Banana and Modal	1 year ago
blob42	04bdd22b9f	searx: add `query_suffix` parameter (#1259 ) - allows building tools that dynamically inject an extra searx suffix into the query. example: `search.run("python library", query_suffix="site:github.com")` resulting query: `python library site:github.com` Co-authored-by: blob42 <spike@w530>	1 year ago
Harrison Chase	df37dd1be4	fix bug with length function (#1257 )	1 year ago
Iskren Ivov Chernev	2ab4def6d3	Add DeepInfra LLM support (#1232 ) DeepInfra is an Inference-as-a-Service provider. Add a simple wrapper using HTTPS requests.	1 year ago
Satoru Sakamoto	b248037053	fix to specific language transcript (#1231 ) Currently youtube loader only seems to support English audio. Changed to load videos in the specified language.	1 year ago
Harrison Chase	fcfb409dd3	add ifttt tool (#1244 )	1 year ago
Jon Luo	aec2bb84a8	Don't instruct LLM to use the LIMIT clause, which is incompatible with SQL Server (#1242 ) The current prompt specifically instructs the LLM to use the `LIMIT` clause. This will cause issues with MS SQL Server, which uses `SELECT TOP` instead of `LIMIT`. The generated SQL will use `LIMIT`; the instruction to "always limit... using the LIMIT clause" seems to override the "create a syntactically correct mssql query to run" portion. Reported here: https://github.com/hwchase17/langchain/issues/1103#issuecomment-1441144224 I don't have access to a SQL Server instance to test, but removing that part of the prompt in OpenAI Playground results in the correct `SELECT TOP` syntax, whereas keeping it in results in the `LIMIT` clause, even when instructing it to generate syntactically correct mssql. It's also still correctly using `LIMIT` in my MariaDB database. I think in this case we can assume that the model will select the appropriate method based on the dialect specified. In general, it would be nice to be able to test a suite of SQL dialects for things like dialect-specific syntax and other issues we've run into in the past, but I'm not quite sure how to best approach that yet.	1 year ago
Dennis Antela Martinez	985f36eb3f	add aleph alpha llm (#1207 ) Integrate Aleph Alpha's client into Langchain to provide access to the luminous models - more info on latest benchmarks here: https://www.aleph-alpha.com/luminous-performance-benchmarks	1 year ago
Klein Tahiraj	ebb9e4087c	Fixing typo in loading.py (#1235 ) Just fixing a typo I found in loading.py	1 year ago
Jon Luo	8cad8c34cb	fix sqlite internal tables breaking table_info (#1224 ) With the current method used to get the SQL table info, sqlite internal schema tables are being included and are not being handled correctly by sqlalchemy because the columns have no types. This is easy to see with the Chinook database: ```python db = SQLDatabase.from_uri("sqlite:///Chinook.db") print(db.table_info) ``` ```python ... sqlalchemy.exc.CompileError: (in table 'sqlite_sequence', column 'name'): Can't generate DDL for NullType(); did you forget to specify a type on this Column? ``` SQLAlchemy 2.0 [ignores these by default](`63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L856-L880)`): `63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L2096-L2123)`	1 year ago
djacobs7	043ce02906	Fix typo in constitutional_ai base.py (#1216 ) Found a typo in the documentation code for the constitutional_ai module	1 year ago
blob42	ffeb00c82b	searx: remove duplicate param (#1219 ) Co-authored-by: blob42 <spike@w530>	1 year ago
Harrison Chase	f3c92172f2	Harrison/unstructured io (#1200 )	1 year ago
Harrison Chase	df84f69b6c	Harrison/updating docs (#1196 )	1 year ago
Harrison Chase	5f3550437f	rfc: callback changes (#1165 ) conceptually, no reason a tool should know what an "agent action" is unless any objections, can change in all callback handlers	1 year ago
Harrison Chase	0008c431fc	catch networkx error (#1201 )	1 year ago
Harrison Chase	529df2a2dd	move serpapi wrapper (#1199 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	1 year ago
Konstantin Hebenstreit	bfe6038c3b	HuggingFaceEndpoint: Correct Example for ImportError (#1176 ) When I try to import the Class HuggingFaceEndpoint I get an Import Error: cannot import name 'HuggingFaceEndpoint' from 'langchain'. (langchain version 0.0.88) These two imports work fine: from langchain import HuggingFacePipeline and from langchain import HuggingFaceHub. So I corrected the import statement in the example. There is probably a better solution to this, but this fixes the Error for me.	1 year ago
Harrison Chase	8cc3fd424f	Harrison/add documents (#1197 ) Co-authored-by: OmriNach <32659330+OmriNach@users.noreply.github.com>	1 year ago
Francisco Ingham	f5c83eaef4	added ability to override default verbose and memory when load chain … (#1153 ) It is useful to be able to specify `verbose` or `memory` while still keeping the chain's overall structure. --------- Co-authored-by: Francisco Ingham <>	1 year ago
Anton Troynikov	28ffe63136	Default Chroma collection name (#1198 ) For persistence, it's convenient to have a default collection name which gets used everywhere.	1 year ago
Dennis Antela Martinez	1053c94f17	add gitbook document loader (#1180 ) Added a GitBook document loader. It lets you both, (1) fetch text from any single GitBook page, or (2) fetch all relative paths and return their respective content in Documents. I've modified the `scrape` method in the `WebBaseLoader` to accept custom web paths if given, but happy to remove it and move that logic into the `GitbookLoader` itself.	1 year ago
William FH	fb3c992749	Add a StdIn "Interaction" Tool (#1193 ) Lets a chain prompt the user for more input as a part of its execution.	1 year ago
Naveen Tatikonda	8e2152c1d6	Add Support for OpenSearch Vector database (#1191 ) ### Description This PR adds a wrapper which adds support for the OpenSearch vector database. Using opensearch-py client we are ingesting the embeddings of given text into opensearch cluster using Bulk API. We can perform the `similarity_search` on the index using the 3 popular searching methods of OpenSearch k-NN plugin: - `Approximate k-NN Search` use approximate nearest neighbor (ANN) algorithms from the [nmslib](https://github.com/nmslib/nmslib), [faiss](https://github.com/facebookresearch/faiss), and [Lucene](https://lucene.apache.org/) libraries to power k-NN search. - `Script Scoring` extends OpenSearch’s script scoring functionality to execute a brute force, exact k-NN search. - `Painless Scripting` adds the distance functions as painless extensions that can be used in more complex combinations. Also, supports brute force, exact k-NN search like Script Scoring. ### Issues Resolved https://github.com/hwchase17/langchain/issues/1054 --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	1 year ago
Andrew White	f0f9f276bd	Allow k to be higher than doc size in max_marginal_relevance_search (#1187 ) Fixes issue #1186. For some reason, #1117 didn't seem to fix it.	1 year ago
Zach Schillaci	06d1af114a	Refactor some loops into list comprehensions (#1185 )	1 year ago
Harrison Chase	c42f19d681	clean up loaders (#1178 )	1 year ago
blob42	9962bda70b	searx_search: docs updates (#1175 ) - fix notebook formatting, remove empty cells and add scrolling for long text --------- Co-authored-by: blob42 <spike@w530>	1 year ago
Harrison Chase	28781a6213	Harrison/markdown splitter (#1169 ) Co-authored-by: Michael Chen <flamingdescent@gmail.com> Co-authored-by: Michael Chen <michaelchen@stripe.com>	1 year ago
Harrison Chase	37dd34bea5	fix path (#1168 )	1 year ago
Ji	ed37fbaeff	for ChatVectorDBChain, add top_k_docs_for_context to control how many chunks of context are retrieved (#1155 ) Since users can define the chunk size, it would be useful for them to also define how many chunks of context are retrieved.	1 year ago
Harrison Chase	955c89fccb	pass in prompts to vectordbqa (#1158 )	1 year ago
Harrison Chase	65cc81c479	directory loader improvements (#1162 )	1 year ago
Harrison Chase	9d6d8f85da	Harrison/self hosted runhouse (#1154 ) Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com> Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com> Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com> Co-authored-by: jeff <tangj1122@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local> Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Charles Frye <cfrye59@gmail.com> Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com> Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530> Co-authored-by: Enrico Shippole <henryshippole@gmail.com> Co-authored-by: Ibis Prevedello <ibiscp@gmail.com> Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com> Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com> Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> Co-authored-by: Jeff Huber <jeffchuber@gmail.com> Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com> Co-authored-by: Andrew Huang <jhuang16888@gmail.com> Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com> Co-authored-by: seanaedmiston <seane999@gmail.com> Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu> Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com> Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr> Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>	1 year ago
CG80499	af8f5c1a49	Added constitutional chain. (#1147 ) - Added self-critique constitutional chain based on this [paper](https://www.anthropic.com/constitutional.pdf).	1 year ago
Harrison Chase	a83ba44efa	Harrison/ver0089 (#1144 )	1 year ago
Ankush Gola	7b5e160d28	Make Tools own model, add ToolKit Concept (#1095 ) Follow-up of @hinthornw's PR: - Migrate the Tool abstraction to a separate file (`BaseTool`). - `Tool` implementation of `BaseTool` takes in function and coroutine to more easily maintain backwards compatibility - Add a Toolkit abstraction that can own the generation of tools around a shared concept or state --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net> Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>	1 year ago
Harrison Chase	45b5640fe5	fix sql (#1141 )	1 year ago
kekayan	9111f4ca8a	fix chatvectordbchain to use pinecone namespace (#1139 ) In the similarity search, the pinecone namespace is not used, which makes the bot return _I don't know_ where the embeddings are stored in the pinecone namespace. Now we can query by passing the namespace optionally. ```result = qa({"question": query, "chat_history": chat_history, "namespace":"01gshyhjcfgkq1q5wxjtm17gjh"})```	1 year ago
Harrison Chase	fb3c73d194	add srt loader (#1140 )	1 year ago


461 Commits (5ba1c7b6018837a2cbf61ae8e7f45c8f1e3e8f72)