langchain

Author	SHA1	Message	Date
玄猫	43a7a89e93	opt: document_loader notiondb to extract url (#4222 )	2023-05-06 09:34:33 -07:00
Leonid Ganeline	9544b30821	added `Wikipedia` document loader (#4141 ) - Added the `Wikipedia` document loader. It is based on the existing `utilities/WikipediaAPIWrapper` - Added the respective unit tests and an example notebook - Sorted the list of classes in __init__	2023-05-06 09:32:45 -07:00
Eugene Yurtsev	423f497168	Add BlobParser abstraction (#3979 ) This PR adds the BlobParser abstraction. It follows the proposal described here: https://github.com/hwchase17/langchain/pull/2833#issuecomment-1509097756	2023-05-05 21:43:38 -04:00
Davis Chase	5ca13cc1f0	Dev2049/pypdfium2 (#4209 ) thanks @jerrytigerxu for the addition! --------- Co-authored-by: Jere Xu <jtxu2008@gmail.com> Co-authored-by: jerrytigerxu <jere.tiger.xu@gmailc.om>	2023-05-05 17:55:31 -07:00
Leonid Ganeline	59204a5033	docs: `document_loaders` improvements (#4200 ) - made notebooks consistent: titles, service/format descriptions. - corrected short names to full names, for example, `Word` -> `Microsoft Word` - added missed descriptions - renamed notebook files to make ToC correctly sorted	2023-05-05 17:44:54 -07:00
Harrison Chase	eeb7c96e0c	bump version to 160 (#4205 )	2023-05-05 17:02:39 -07:00
Davis Chase	f1fc4dfebc	Dev2049/obsidian patch (#4204 ) thanks @shkarlsson for the fix! (just updated formatting) --------- Co-authored-by: shkarlsson <sven.henrik.karlsson@gmail.com>	2023-05-05 16:49:19 -07:00
George	2324f19c85	Update qdrant interface (#3971 ) Hello 1) Passing `embedding_function` as a callable seems to be outdated and the common interface is to pass `Embeddings` instance 2) At the moment `Qdrant.add_texts` is designed to be used with `embeddings.embed_query`, which is 1) slow 2) causes ambiguity due to 1. It should be used with `embeddings.embed_documents` This PR solves both problems and also provides some new tests	2023-05-05 16:46:40 -07:00
Harrison Chase	76ed41f48a	update docs (#4194 )	2023-05-05 16:45:26 -07:00
Zander Chase	1017e5cee2	Add LCP Client (#4198 ) Adding a client to fetch datasets, examples, and runs from a LCP instance and run objects over them.	2023-05-05 16:28:56 -07:00
Zander Chase	a30f42da4e	Update V2 Tracer (#4193 ) - Update the RunCreate object to work with recent changes - Add optional Example ID to the tracer - Adjust default persist_session behavior to attempt to load the session if it exists - Raise more useful HTTP errors for logging - Add unit testing - Fix the default ID to be a UUID for v2 tracer sessions Broken out from the big draft here: https://github.com/hwchase17/langchain/pull/4061	2023-05-05 14:55:01 -07:00
Mike Wang	c3044b1bf0	[test] Add integration_test for PandasAgent (#4056 ) - confirm creation - confirm functionality with a simple dimension check. The test now is calling OpenAI API directly, but learning from @vowelparrot that we’re caching the requests, so that it’s not that expensive. I also found we’re calling OpenAI api in other integration tests. Please lmk if there is any concern of real external API calls. I can alternatively make a fake LLM for this test. Thanks	2023-05-05 14:49:02 -07:00
Aivin V. Solatorio	6567b73e1a	JSON loader (#4067 ) This implements a loader of text passages in JSON format. The `jq` syntax is used to define a schema for accessing the relevant contents from the JSON file. This requires dependency on the `jq` package: https://pypi.org/project/jq/. --------- Signed-off-by: Aivin V. Solatorio <avsolatorio@gmail.com>	2023-05-05 14:48:13 -07:00
PawelFaron	bb6d97c18c	Fixed the example code (#4117 ) Fixed the issue mentioned here: https://github.com/hwchase17/langchain/issues/3799#issuecomment-1534785861 Co-authored-by: Pawel Faron <ext-pawel.faron@vaisala.com>	2023-05-05 14:22:10 -07:00
Anurag	19e28d8784	feat: Allow users to pass additional arguments to the WebDriver (#4121 ) This commit adds support for passing additional arguments to the `SeleniumURLLoader ` when creating Chrome or Firefox web drivers. Previously, only a few arguments such as `headless` could be passed in. With this change, users can pass any additional arguments they need as a list of strings using the `arguments` parameter. The `arguments` parameter allows users to configure the driver with any options that are available for that particular browser. For example, users can now pass custom `user_agent` strings or `proxy` settings using this parameter. This change also includes updated documentation and type hints to reflect the new `arguments` parameter and its usage. fixes #4120	2023-05-05 13:24:42 -07:00
hp0404	2a3c5f8353	Update WhatsAppChatLoader regex to handle multiple date-time formats (#4186 ) This PR updates the `message_line_regex` used by `WhatsAppChatLoader` to support different date-time formats used in WhatsApp chat exports; resolves #4153. The new regex handles the following input formats: ```terminal [05.05.23, 15:48:11] James: Hi here [11/8/21, 9:41:32 AM] User name: Message 123 1/23/23, 3:19 AM - User 2: Bye! 1/23/23, 3:22_AM - User 1: And let me know if anything changes ``` Tests have been added to verify that the loader works correctly with all formats.	2023-05-05 13:13:05 -07:00
Nicolas	a57259ec83	docs: Mendable Fixes and Improvements (#4184 ) Overall fixes and improvements.	2023-05-05 13:04:24 -07:00
Harrison Chase	7dcc698ebf	bump version to 159 (#4183 )	2023-05-05 09:31:08 -07:00
Harrison Chase	26534457f5	simplify csv args (#4182 )	2023-05-05 09:22:08 -07:00
Eduard van Valkenburg	3095546851	PowerBI fix for table names with spaces (#4170 ) small fix to make sure a table name with spaces is passed correctly to the API for the schema lookup.	2023-05-05 09:15:47 -07:00
obbiondo	b1e2e29222	fix: remove expand parameter from ConfluenceLoader by label (#4181 ) expand is not an allowed parameter for the method confluence.get_all_pages_by_label, since it doesn't return the body of the text but just metadata of documents Co-authored-by: Andrea Biondo <a.biondo@reply.it>	2023-05-05 09:15:21 -07:00
Zander Chase	84cfa76e00	Update Cohere Reranker (#4180 ) The forward ref annotations don't get updated if we only import with type checking --------- Co-authored-by: Abhinav Verma <abhinav_win12@yahoo.co.in>	2023-05-05 09:11:37 -07:00
Davis Chase	d84bb02881	Add Chroma self query (#4149 ) Add internal query language -> chroma metadata filter translator	2023-05-05 08:43:08 -07:00
Vinoo Ganesh	905a2114d7	Fix: Typo in Docs (#4179 ) Fixing small typo in docs	2023-05-05 08:35:49 -07:00
Ankush Gola	8de1b4c4c2	Revert "fix: #4128 missing run_manager parameter" (#4159 ) Reverts hwchase17/langchain#4130	2023-05-05 00:52:16 -07:00
Chakib Ben Ziane	878d0c8155	fix: #4128 missing run_manager parameter (#4130 ) `run_manager` was not being passed downstream. Not sure if this was a deliberate choice but it seems like it broke many agent callbacks like `agent_action` and `agent_finish`. This fix needs a proper review. Co-authored-by: blob42 <spike@w530>	2023-05-04 23:59:55 -07:00
Zander Chase	6032a051e9	Add Tenant ID to V2 Tracer (#4135 ) Update the V2 tracer to - use UUIDs instead of int's - load a tenant ID and use that when saving sessions	2023-05-04 21:35:20 -07:00
Zander Chase	fea639c1fc	Vwp/sqlalchemy (#4145 ) Bump threshold to 1.4 from 1.3. Change import to be compatible Resolves #4142 and #4129 --------- Co-authored-by: ndaugreal <ndaugreal@gmail.com> Co-authored-by: Jeremy Lopez <lopez86@users.noreply.github.com>	2023-05-04 20:46:38 -07:00
Zander Chase	2f087d63af	Fix Python RePL Tool (#4137 ) Filter out kwargs from inferred schema when determining if a tool is single input. Add a couple unit tests. Move tool unit tests to the tools dir	2023-05-04 20:31:16 -07:00
Zander Chase	cc068f1b77	Add Issue Templates (#4021 ) Add issue templates for - bug reports - feature suggestions - documentation and a link to the discord for general discussion. Open to other suggestions here. Could also add another "Other" template with just a raw text box if we think this is too restrictive <img width="1464" alt="image" src="https://user-images.githubusercontent.com/130414180/236115358-e603bcbe-282c-40c7-82eb-905eb93ccec0.png">	2023-05-04 16:33:52 -07:00
Zander Chase	ac0a9d02bd	Visual Studio Code/Github Codespaces Dev Containers (#4035 ) (#4122 ) Having dev containers makes it easier, faster, and more secure to set up the dev environment for the repository. The pull request consists of: - .devcontainer folder with: - devcontainer.json : (minimal necessary vscode extensions and settings) - docker-compose.yaml : (could be modified to run necessary services as per need. Ex vectordbs, databases) - Dockerfile:(non root with dev tools) - Changes to README - added the Open in Github Codespaces Badge - added the Open in dev container Badge Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>	2023-05-04 11:37:00 -07:00
Harrison Chase	d86ed15d88	bump version to 158 (#4091 )	2023-05-04 09:14:47 -07:00
OlajideOgun	624554a43a	DeepLake: Pass in rest of args to self._search_helper (#4080 ) As of right now when trying to use functions like `max_marginal_relevance_search()` or `max_marginal_relevance_search_by_vector()` the rest of the kwargs are not propagated to `self._search_helper()`. For example a user cannot explicitly state the distance_metric they want to use when calling `max_marginal_relevance_search`	2023-05-04 02:14:22 -07:00
Eduard van Valkenburg	6d84541ff9	fix base url (#4095 ) Noticed a mistake in the base url and group vs non-group urls	2023-05-04 02:08:21 -07:00
Harrison Chase	a9c2450330	Harrison/toml loader (#4090 ) Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>	2023-05-03 23:14:39 -07:00
Harrison Chase	d4cf1eb60a	Add firestore memory (#3792 ) (#3941 ) If you have any other suggestions or feedback, please let me know. --------- Co-authored-by: yakigac <10434946+yakigac@users.noreply.github.com>	2023-05-03 22:55:47 -07:00
Harrison Chase	fba6921b50	Harrison/one drive loader (#4081 ) Co-authored-by: José Ferraz Neto <netoferraz@gmail.com>	2023-05-03 22:55:34 -07:00
golergka	bd277b5327	feat: prune summary buffer (#4004 ) If the library user has to decrease the `max_token_limit`, he would probably want to prune the summary buffer even though he hasn't added any new messages. Personally, I need it because I want to serialise the memory buffer object and save it to a database, and when I load it, I may have re-configured my code to have a shorter memory to save on tokens.	2023-05-03 22:45:48 -07:00
AndreLCanada	bf726f9d8a	Update python_repl docs (#4012 ) In the example for creating a Python REPL tool under the Agent module, the ".run" was omitted in the example. I believe this is required when defining a Tool.	2023-05-03 22:45:32 -07:00
Mike Wang	67db495fcf	[agent] Add Spark Agent (#4020 ) - added support for spark through pyspark library. - added jupyter notebook as example.	2023-05-03 22:45:23 -07:00
Gengliang Wang	8af25867cb	Simplify HumanMessages in the quick start guide (#4026 ) In the section `Get Message Completions from a Chat Model` of the quick start guide, the HumanMessage doesn't need to include `Translate this sentence from English to French.` when there is a system message. Simplifying the HumanMessages in these examples can further demonstrate the power of the LLM.	2023-05-03 22:45:03 -07:00
Harrison Chase	087a4bd2b8	improve agent documentation (#4062 )	2023-05-03 22:44:01 -07:00
rogerserper	b1446bea5f	google-serper: async + full json results + support for Google Images, Places and News (#4078 ) * implemented arun, results, and aresults. Reuses aiosession if available. * helper tools GoogleSerperRun and GoogleSerperResults * support for Google Images, Places and News (examples given) and filtering based on time (e.g. past hour) * updated docs	2023-05-03 22:35:48 -07:00
mbchang	cdea47491d	refactor: refactor dialogue examples (DialogueAgent, DialogueSimulator) (#4074 ) refactor dialogue examples to have same DialogueAgent and DialogueSimulator definitions	2023-05-03 22:32:26 -07:00
Jan Philipp Harries	657f5f259f	Added option to reduce verbosity of Deeplake integration (#4038 ) The deeplake integration was/is very verbose (see e.g. [the documentation example](https://python.langchain.com/en/latest/use_cases/code/code-analysis-deeplake.html)) when loading or creating a deeplake dataset, with only limited options to dial down verbosity. Additionally, the warning that a "Deep Lake Dataset already exists" was confusing, as there is as far as I can tell no other way to load a dataset. This small PR changes that and introduces an explicit `verbose` argument which is also passed to the deeplake library. There should be minimal changes to the default output (the loading line is printed instead of warned to make it consistent with `ds.summary()`, which also prints).	2023-05-03 22:16:27 -07:00
Davis Chase	7f8727bbcd	Router chains (#4019 ) Unpolished router examples to help flesh out abstractions and use cases ![Screenshot 2023-05-02 at 7 02 58 PM](https://user-images.githubusercontent.com/130488702/235820394-389e5584-db0b-415e-a260-2824b5555167.png) --------- Co-authored-by: Shreya Rajpal <shreya.rajpal@gmail.com>	2023-05-03 22:02:55 -07:00
Pulkit Mehta	bbbca10704	issue#4082 base_language had wrong code comment that it was using gpt… (#4084 ) …3 to tokenize text instead of gpt-2 Co-authored-by: Pulkit <pulkit.mehta@catylex.com>	2023-05-03 21:58:29 -07:00
Leonid Ganeline	6caba8e759	docs: added a link to the `Google Scholar` articles (#4007 ) Google Scholar outputs a nice list of scientific and research articles that use LangChain. I added a link to the Google Scholar page to the `gallery` doc page	2023-05-03 21:54:44 -07:00
obbiondo	d18e788ee3	bugfix: return whole document when loading with ConfluenceLoader.load by label (#3980 ) Method confluence.get_all_pages_by_label, returns only metadata about documents with a certain label (such as pageId, titles, ...). To return all documents with a certain label we need to extract all page ids given a certain label and get pages content by these ids. --------- Co-authored-by: Andrea Biondo <a.biondo@reply.it>	2023-05-03 21:52:05 -07:00
Harrison Chase	5f30cc8713	Harrison/knn retriever (#4083 ) Co-authored-by: Yuichi Tateno (secon) <hotchpotch@users.noreply.github.com>	2023-05-03 21:21:58 -07:00
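
Note on commit 2a3c5f8353 (WhatsAppChatLoader regex): the four input formats quoted in that commit message can be matched by a single pattern along the following lines. This is only an illustrative sketch and not the `message_line_regex` from the PR; the name `MESSAGE_LINE`, the helper `parse_line`, and the group names are hypothetical, and the underscore before `AM` is matched literally because that is how the sample appears in the log above.

```python
import re

# Hypothetical pattern, written for this note only (not taken from langchain).
MESSAGE_LINE = re.compile(
    r"""
    \[?                                        # optional opening bracket
    (?P<date>\d{1,2}[./]\d{1,2}[./]\d{2,4})    # 05.05.23 or 11/8/21
    ,\s
    (?P<time>\d{1,2}:\d{2}(?::\d{2})?          # 15:48:11 or 3:19
        (?:[\s_](?:AM|PM))?)                   # optional AM/PM marker
    \]?                                        # optional closing bracket
    [\s-]*                                     # " " or " - " separator
    (?P<sender>[^:]+?)                         # sender, up to the first colon
    :\s
    (?P<text>.+)                               # message body
    """,
    re.VERBOSE,
)


def parse_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = MESSAGE_LINE.match(line)
    return match.groupdict() if match else None


# The sample lines quoted in the commit message.
samples = [
    "[05.05.23, 15:48:11] James: Hi here",
    "[11/8/21, 9:41:32 AM] User name: Message 123",
    "1/23/23, 3:19 AM - User 2: Bye!",
    "1/23/23, 3:22_AM - User 1: And let me know if anything changes",
]

for sample in samples:
    print(parse_line(sample))
```

Named groups keep the date, time, sender, and text separate, so a caller can decide how to normalise the differing date formats afterwards.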


1933 Commits