langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Roma	2b4e9a3efa	Add unit test for _merge_splits function (#3513 ) This commit adds a new unit test for the _merge_splits function in the text splitter. The new test verifies that the function merges text into chunks of the correct size and overlap, using a specified separator. The test passes on the current implementation of the function.	2023-04-25 10:02:59 -07:00
Sami Liedes	61da2bb742	Pandas agent: Pass forward callback manager (#3518 ) The Pandas agent fails to pass callback_manager forward, making it impossible to use custom callbacks with it. Fix that. Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>	2023-04-25 09:58:56 -07:00
mbchang	a08e9a3109	Docs: fix naming typo (#3532 )	2023-04-25 09:58:25 -07:00
Harrison Chase	dc2188b36d	bump version to 149 (#3530 )	2023-04-25 08:43:59 -07:00
mbchang	831ca61481	docs: two_player_dnd docs (#3528 )	2023-04-25 08:24:53 -07:00
yakigac	f338d6251c	Add a test for cosmos db memory (#3525 ) Test for #3434 @eavanvalkenburg Initially, I was unaware and had submitted a pull request #3450 for the same purpose, but I have now repurposed the one I used for that. And it worked.	2023-04-25 08:10:02 -07:00
leo-gan	6b28cbe058	improved arxiv (#3495 ) Improved `arxiv/tool.py` by adding more specific information to the `description`. It would help with selecting `arxiv` tool between other tools. Improved `arxiv.ipynb` with more useful descriptions.	2023-04-25 08:09:17 -07:00
mbchang	29f321046e	doc: add two player D&D game (#3476 ) In this notebook, we show how we can use concepts from [CAMEL](https://www.camel-ai.org/) to simulate a role-playing game with a protagonist and a dungeon master. To simulate this game, we create a `TwoAgentSimulator` class that coordinates the dialogue between the two agents.	2023-04-25 08:07:18 -07:00
Harrison Chase	0fc0aa62f2	Harrison/blockchain docloader (#3491 ) Co-authored-by: Jon Saginaw <saginawj@users.noreply.github.com>	2023-04-25 08:07:06 -07:00
Harrison Chase	bee59b4689	Updated missing refactor in docs "return_map_steps" (#2956 ) (#3469 ) Minor rename in the documentation that was overlooked when refactoring. --------- Co-authored-by: Ehmad Zubair <ehmad@cogentlabs.co>	2023-04-24 22:28:47 -07:00
Harrison Chase	707741de58	Harrison/prediction guard (#3490 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	2023-04-24 22:27:22 -07:00
Harrison Chase	7257f9e015	Harrison/tfidf parameters (#3481 ) Co-authored-by: pao <go5kuramubon@gmail.com> Co-authored-by: KyoHattori <kyo.hattori@abejainc.com>	2023-04-24 22:19:58 -07:00
Harrison Chase	eda69b13f3	openai embeddings (#3488 )	2023-04-24 22:19:47 -07:00
Harrison Chase	d3ce47414d	Harrison/chroma update (#3489 ) Co-authored-by: vyeevani <30946190+vyeevani@users.noreply.github.com> Co-authored-by: Vineeth Yeevani <vineeth.yeevani@gmail.com>	2023-04-24 22:19:36 -07:00
Sami Liedes	c8b70e1c6a	langchain-server: Do not expose postgresql port to host (#3431 ) Apart from being unnecessary, postgresql is run on its default port, which means that the langchain-server will fail to start if there is already a postgresql server running on the host. This is obviously less than ideal. (Yeah, I don't understand why "expose" is the syntax that does not expose the ports to the host...) Tested by running langchain-server and trying out debugging on a host that already has postgresql bound to the port 5432. Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>	2023-04-24 22:19:23 -07:00
Harrison Chase	7084d69ea7	Harrison/verbose conv ret (#3492 ) Co-authored-by: makretch <max.kretchmer@gmail.com>	2023-04-24 22:16:07 -07:00
Harrison Chase	36a039d017	Harrison/prompt prefix (#3496 ) Co-authored-by: Ian <ArGregoryIan@gmail.com>	2023-04-24 22:15:44 -07:00
Harrison Chase	408a0183cd	Harrison/weaviate (#3494 ) Co-authored-by: Nick Rubell <nick@rubell.com>	2023-04-24 22:15:32 -07:00
Eduard van Valkenburg	ba7a5ac9d7	Azure CosmosDB memory (#3434 ) Still needs docs, otherwise works.	2023-04-24 22:15:12 -07:00
Lucas Vieira	e6c1c32aff	Support GCS Objects with `/` in GCS Loaders (#3356 ) So, this is basically fixing the same things as #1517 but for GCS. ### Problem When loading GCS Objects with `/` in the object key (eg. folder/some-document.txt) using `GCSFileLoader`, the objects are downloaded into a temporary directory and saved as a file. This errors out when the parent directory does not exist within the temporary directory. ### What this pr does Creates parent directories based on object key. This also works with deeply nested keys: folder/subfolder/some-document.txt	2023-04-24 22:05:44 -07:00
Mindaugas Sharskus	a4d85f7fd5	[Fix #3365 ]: Changed regex to cover new line before action serious (#3367 ) Fix for: [Changed regex to cover new line before action serious.](https://github.com/hwchase17/langchain/issues/3365) --- This PR fixes the issue where `ValueError: Could not parse LLM output:` was thrown on seems to be valid input. Changed regex to cover new lines before action serious (after the keywords "Action:" and "Action Input:"). regex101: https://regex101.com/r/CXl1kB/1 --------- Co-authored-by: msarskus <msarskus@cisco.com>	2023-04-24 22:05:31 -07:00
Maxwell Mullin	696f840426	GuessedAtParserWarning from RTD document loader documentation example (#3397 ) Addresses #3396 by adding `features='html.parser'` in example	2023-04-24 21:54:39 -07:00
engkheng	06f6c49e61	Improve `llm_chain.ipynb` and `getting_started.ipynb` for chains docs (#3380 ) My attempt at improving the `Chain`'s `Getting Started` docs and `LLMChain` docs. Might need some proof-reading as English is not my first language. In LLM examples, I replaced the example use case with a simpler one (shorter LLM output) to reduce cognitive load.	2023-04-24 21:49:55 -07:00
Zander Chase	b89c258bc5	Add retry logic for ChromaDB (#3372 ) Rewrite of #3368 Mainly an issue for when people are just getting started, but still nice to not throw an error if the number of docs is < k. Add a little decorator utility to block mutually exclusive keyword arguments	2023-04-24 21:48:29 -07:00
tkarper	6b49be9951	Add Databutton to list of Deployment options (#3364 )	2023-04-24 21:45:38 -07:00
jrhe	980cc41709	Adds progress bar using tqdm to directory_loader (#3349 ) Approach copied from `WebBaseLoader`. Assumes the user doesn't have `tqdm` installed.	2023-04-24 21:42:42 -07:00
killpanda	344e3508b1	bug_fixes: use md5 instead of uuid id generation (#3442 ) At present, the method of generating `point` in qdrant is to use random `uuid`. The problem with this approach is that even documents with the same content will be inserted repeatedly instead of updated. Using `md5` as the `ID` of `point` to insert text can achieve true `update or insert`. Co-authored-by: mayue <mayue05@qiyi.com>	2023-04-24 21:39:51 -07:00
Jon Luo	b765805964	Support SQLAlchemy 2.0 (#3310 ) With https://github.com/executablebooks/jupyter-cache/pull/93 merged and `MyST-NB` updated, we can now support SQLAlchemy 2. Closes #1766	2023-04-24 21:10:56 -07:00
engkheng	7c2c73af5f	Update `Getting Started` page of `Prompt Templates` (#3298 ) Updated `Getting Started` page of `Prompt Templates` to showcase more features provided by the class. Might need some proof reading because apparently English is not my first language.	2023-04-24 21:10:22 -07:00
Hasan Patel	a14d1c02f8	Updated Readme.md (#3477 ) Corrected some minor grammar issues, changed infra to infrastructure for more clarity. Improved readability	2023-04-24 20:11:29 -07:00
Davis Chase	b2564a6391	fix #3884 (#3475 ) fixes mar bug #3384	2023-04-24 19:54:15 -07:00
Prakhar Agarwal	53b14de636	pass list of strings to embed method in tf_hub (#3284 ) This fixes the below mentioned issue. Instead of simply passing the text to `tensorflow_hub`, we convert it to a list and then pass it. https://github.com/hwchase17/langchain/issues/3282 Co-authored-by: Prakhar Agarwal <i.prakhar-agarwal@devrev.ai>	2023-04-24 19:51:53 -07:00
Beau Horenberger	2b9f1cea4e	add LoRA loading for the LlamaCpp LLM (#3363 ) First PR, let me know if this needs anything like unit tests, reformatting, etc. Seemed pretty straightforward to implement. Only hitch was that mmap needs to be disabled when loading LoRAs or else you segfault.	2023-04-24 18:31:14 -07:00
Ehsan M. Kermani	5d0674fb46	Use a consistent poetry version everywhere (#3250 ) Fixes the discrepancy of poetry version in Dockerfile and the GAs	2023-04-24 18:19:51 -07:00
Felipe Lopes	8c56e92566	feat: add private weaviate api_key support on from_texts (#3139 ) This PR adds support for providing a Weaviate API Key to the VectorStore methods `from_documents` and `from_texts`. With this addition, users can authenticate to Weaviate and make requests to private Weaviate servers when using these methods. ## Motivation Currently, LangChain's VectorStore methods do not provide a way to authenticate to Weaviate. This limits the functionality of the library and makes it more difficult for users to take advantage of Weaviate's features. This PR addresses this issue by adding support for providing a Weaviate API Key as extra parameter used in the `from_texts` method. ## Contributing Guidelines I have read the [contributing guidelines](`72b7d76d79/.github/CONTRIBUTING.md`) and the PR code passes the following tests: - [x] make format - [x] make lint - [x] make coverage - [x] make test	2023-04-24 17:55:34 -07:00
Zzz233	239dc10852	ES similarity_search_with_score() and metadata filter (#3046 ) Add similarity_search_with_score() to ElasticVectorSearch, add metadata filter to both similarity_search() and similarity_search_with_score()	2023-04-24 17:20:08 -07:00
Zander Chase	416f3bdf11	Vwp/alpaca streaming (#3468 ) Co-authored-by: Luke Stanley <306671+lukestanley@users.noreply.github.com>	2023-04-24 16:27:51 -07:00
Cao Hoang	26035dfa59	remove default usage of openai model in SQLDatabaseToolkit (#2884 ) #2866 This toolkit used openai LLM as the default, which could incur unwanted cost.	2023-04-24 16:27:38 -07:00
Harrison Chase	675d86aa11	show how to use memory in convo chain (#3463 )	2023-04-24 13:29:51 -07:00
leo-gan	d5086d4760	added integration links to the ecosystem.rst (#3453 ) Now it is hard to search for the integration points between data_loaders, retrievers, tools, etc. I've placed links to all groups of providers and integrations on the `ecosystem` page. So, it is easy to navigate between all integrations from a single location.	2023-04-24 12:17:44 -07:00
Davis Chase	2cbd41145c	Bugfix: Not all combine docs chains takes kwargs `prompt` (#3462 ) Generalize ConversationalRetrievalChain.from_llm kwargs --------- Co-authored-by: shubham.suneja <shubham.suneja>	2023-04-24 12:13:06 -07:00
cs0lar	3033c6b964	fixes #1214 (#3003 ) ### Background Continuing to implement all the interface methods defined by the `VectorStore` class. This PR pertains to implementation of the `max_marginal_relevance_search_by_vector` method. ### Changes - a `max_marginal_relevance_search_by_vector` method implementation has been added in `weaviate.py` - tests have been added for the new method - vcr cassettes have been added for the weaviate tests ### Test Plan Added tests for the `max_marginal_relevance_search_by_vector` implementation ### Change Safety - [x] I have added tests to cover my changes	2023-04-24 11:50:55 -07:00
Harrison Chase	434d8c4c0e	Merge branch 'master' of github.com:hwchase17/langchain	2023-04-24 11:30:14 -07:00
Harrison Chase	bdb5f2f9fb	update notebook	2023-04-24 11:30:06 -07:00
Zander Chase	d06d47bc92	LM Requests Wrapper (#3457 ) Co-authored-by: jnmarti <88381891+jnmarti@users.noreply.github.com>	2023-04-24 11:12:47 -07:00
Harrison Chase	b64c86a25f	bump version to 148 (#3458 )	2023-04-24 11:08:32 -07:00
mbchang	82845e3821	add meta-prompt to autonomous agents use cases (#3254 ) An implementation of [meta-prompt](https://noahgoodman.substack.com/p/meta-prompt-a-simple-self-improving), where the agent modifies its own instructions across episodes with a user. ![figure](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F468217b9-96d9-47c0-a08b-dbf6b21b9f49_492x384.png)	2023-04-24 10:48:38 -07:00
yunfeilu92	77235bbe43	propogate kwargs to cls in OpenSearchVectorSearch (#3416 ) kwargs should be passed into cls so that the opensearch client can be properly initialized in __init__(). Otherwise logic like the example below will not work, as auth will not be passed into __init__ ```python docsearch = OpenSearchVectorSearch.from_documents(docs, embeddings, opensearch_url="http://localhost:9200") query = "What did the president say about Ketanji Brown Jackson" docs = docsearch.similarity_search(query) ``` Co-authored-by: EC2 Default User <ec2-user@ip-172-31-28-97.ec2.internal>	2023-04-24 10:43:41 -07:00
Eduard van Valkenburg	46c9636012	small constructor change and updated notebook (#3426 ) small change in the pydantic definitions, same api. updated notebook with the right constructor and added a few-shot example	2023-04-24 10:42:38 -07:00
Zander Chase	49122a96e7	Structured Tool Bugfixes (#3324 ) - Proactively raise error if a tool subclasses BaseTool, defines its own schema, but fails to add the type-hints - fix the auto-inferred schema of the decorator to strip the unneeded virtual kwargs from the schema dict Helps avoid silent instances of #3297	2023-04-24 09:58:29 -07:00


1684 Commits