langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Ankush Gola	d3ec00b566	Callbacks Refactor [base] (#3256 ) Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-04-30 11:14:09 -07:00
Zander Chase	18ec22fe56	Remove multi-input tool section (#3810 ) Moving to new notebook. Will re-intro w/ new agent	2023-04-29 15:29:08 -07:00
mbchang	adcad98bee	fix: fix filepath error in agent simulations docs (#3795 )	2023-04-29 11:21:27 -07:00
Harrison Chase	20aad0bed1	stripe docs	2023-04-29 08:16:37 -07:00
Harrison Chase	378f0889eb	bump version to 153 (#3774 )	2023-04-29 07:31:35 -07:00
Sheldon	399065e858	update zilliz example (#3578 ) The Zilliz example could not connect to Zilliz Cloud; this fixes it. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-04-28 22:10:13 -07:00
Harrison Chase	bd7e0a534c	Harrison/csv loader (#3771 ) Co-authored-by: mrT23 <tal.r@codium.ai>	2023-04-28 21:54:24 -07:00
Harrison Chase	c494ca3ad2	Harrison/doc2txt (#3772 ) Co-authored-by: rishni ratnam <rishniratnam@gmail.com>	2023-04-28 21:54:16 -07:00
Mike Wang	ce4fea983b	[simple] added test case and improve self class return type annotation (#3773 ) a simple follow up of https://github.com/hwchase17/langchain/pull/3748 - added test case - improve annotation when function return type is class itself.	2023-04-28 21:54:07 -07:00
Harrison Chase	0c0f14407c	Harrison/tair (#3770 ) Co-authored-by: Seth Huang <848849+seth-hg@users.noreply.github.com>	2023-04-28 21:25:33 -07:00
Aurélien SCHILTZ	502ba6a0be	Fix type annotation for SQLDatabaseToolkit.llm (#3581 ) Currently `langchain.agents.agent_toolkits.SQLDatabaseToolkit` has a field `llm` with type `BaseLLM`. This breaks initialization for some LLMs. For example, trying to use it with GPT4: ``` from langchain.sql_database import SQLDatabase from langchain.chat_models import ChatOpenAI from langchain.agents.agent_toolkits import SQLDatabaseToolkit db = SQLDatabase.from_uri("some_db_uri") llm = ChatOpenAI(model_name="gpt-4") toolkit = SQLDatabaseToolkit(db=db, llm=llm) # pydantic.error_wrappers.ValidationError: 1 validation error for SQLDatabaseToolkit # llm # Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error) ``` Seems like much of the rest of the codebase has switched from BaseLLM to BaseLanguageModel. This PR makes the change for SQLDatabaseToolkit as well	2023-04-28 21:19:01 -07:00
uyhcire	0a7a2b99b5	Fix Chroma integration failing when there are less than 4 items in the collection (#3674 ) The code was failing to decrement the `n_results` kwarg passed to `query(...)`	2023-04-28 21:18:19 -07:00
Rafal Wojdyla	57e028549a	Expose kwargs in `LLMChainExtractor.from_llm` (#3748 ) Re: https://github.com/hwchase17/langchain/issues/3747	2023-04-28 21:18:05 -07:00
Mike Wang	512c24fc9c	[annotation improvement] Make AgentType->Class Conversion More Scalable (#3749 ) In the current solution, AgentType and AGENT_TO_CLASS are placed in two separate files and both manually maintained. This might cause inconsistency when we update either of them. — latest — based on the discussion with hwchase17, we don’t know how to further use the newly introduced AgentTypeConfig type, so it doesn’t make sense yet to add it. Instead, it’s better to move the dictionary to another file to keep the loading.py file clear. The consistency is a good point. Instead of asserting the consistency during linting, we added a unittest for consistency check. I think it works as auto unittest is triggered every time with clear failure notice. (well, force push is possible, but we all know what we are doing, so let’s show trust. :>) ~~This PR includes~~ - ~~Introduced AgentTypeConfig as the source of truth of all AgentType related meta data.~~ - ~~Each AgentTypeConfig is a annotated class type which can be used for annotation in other places.~~ - ~~Each AgentTypeConfig can be easily extended when we have more meta data needs.~~ - ~~Strong assertion to ensure AgentType and AGENT_TO_CLASS are always consistent.~~ - ~~Made AGENT_TO_CLASS automatically generated.~~ ~~Test Plan:~~ - ~~since this change is focusing on annotation, lint is the major test focus.~~ - ~~lint, format and test passed on local.~~	2023-04-28 21:17:28 -07:00
Harrison Chase	b7ae9f715d	Langchain with reddit (#3661 ) (#3768 ) I have added a Reddit document loader that fetches the text of posts from subreddits or Reddit users, using the `praw` Python package. I have also added an example notebook, reddit.ipynb, to guide users through using this loader. The code follows a format similar to the Twitter document loader. I have run code formatting and linting, and also checked the code myself against different scenarios. This is my first contribution to an open source project and I am really excited about this. If you want to suggest some improvements in my code, I will be happy to do it. :) Co-authored-by: Taaha Bajwa <taaha.s.bajwa@gmail.com>	2023-04-28 20:59:56 -07:00
Kohei Kumazaki	fa4c35e9e5	Fix encoding issue in WebBaseLoader (#3602 ) Character encoding mismatches occurred when charset information was not included in the response header (in my case, a Japanese web page). I solved this by changing the encoding setting to apparent_encoding.	2023-04-28 20:56:33 -07:00
Harrison Chase	be7a8e0824	Harrison/redis cache (#3766 ) Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>	2023-04-28 20:47:18 -07:00
Mike Wang	b588446bf9	[simple][test] Added test case for schema.py (#3692 ) - added unittest for schema.py covering utility functions and token counting. - fixed a nit. based on huggingface doc, the tokenizer model is gpt-2. [link](https://huggingface.co/transformers/v4.8.2/_modules/transformers/models/gpt2/tokenization_gpt2_fast.html) - make lint && make format, passed on local	2023-04-28 20:42:24 -07:00
Harrison Chase	15b92d361d	Harrison/confluence stuff (#3765 ) Co-authored-by: Jelmer Borst <japborst@gmail.com>	2023-04-28 20:19:44 -07:00
SimFG	5998b53596	Use the GPTCache api interface (#3693 ) Use the GPTCache api interface to reduce the possibility of compatibility issues	2023-04-28 20:18:51 -07:00
engkheng	f37a932b24	Improve chat prompt template docs (#3719 ) Add a few more explanations and examples.	2023-04-28 20:16:22 -07:00
Robert Perrotta	22770f5202	Make StuffDocumentsChain doc separator configurable (#3718 ) This PR makes the `"\n\n"` string with which `StuffDocumentsChain` joins formatted documents a property so it can be configured. The new `document_separator` property defaults to `"\n\n"` so the change is backwards compatible.	2023-04-28 20:14:07 -07:00
Akhil Vempali	64ba24292d	fix: 🐛 SQLAlchemy import error (#3716 ) During the import of langchain, SQLAlchemy was throwing an error `ImportError: cannot import name 'Mapped' from 'sqlalchemy.orm'`. This is because the Mapped name was introduced in v1.4	2023-04-28 20:13:32 -07:00
Jon Saginaw	f8d69e4e52	Enhancement: Blockchain Document Loader with better Metadata support (#3710 ) This PR includes some minor alignment updates, including: - metadata object extended to support contractAddress, blockchainType, and tokenId - notebook doc better aligned to standard langchain format - startToken changed from int to str to support multiple hex value types on the Alchemy API The updated metadata will look like the below. It's possible for a single contractAddress to exist across multiple blockchains (e.g. Ethereum, Polygon, etc.) so it's important to include the blockchainType. ``` metadata = {"source": self.contract_address, "blockchain": self.blockchainType, "tokenId": tokenId} ```	2023-04-28 20:13:05 -07:00
Davis Chase	220a7076ac	Add Mathpix pdf loader (#3727 ) Inspo https://twitter.com/danielgross/status/1651695062307274754?s=46&t=1zHLap5WG4I_kQPPjfW9fA Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-04-28 20:11:22 -07:00
Rafal Wojdyla	37ed6f2177	Handle length safe embedding only if needed (#3723 ) Re: https://github.com/hwchase17/langchain/issues/3722 Copy pasting context from the issue: `1bf1c37c0c/langchain/embeddings/openai.py (L210-L211)` Means that the length safe embedding method is "always" used, initial implementation https://github.com/hwchase17/langchain/pull/991 has the `embedding_ctx_length` set to -1 (meaning you had to opt-in for the length safe method), https://github.com/hwchase17/langchain/pull/2330 changed that to max length of OpenAI embeddings v2, meaning the length safe method is used at all times. How about changing that if branch to use length safe method only when needed, meaning when the text is longer than the max context length?	2023-04-28 20:10:04 -07:00
Harrison Chase	40f6e60e68	Harrison/stripe (#3762 ) Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>	2023-04-28 20:03:21 -07:00
Jelmer Borst	8cf2ff0be0	Confluence: Add page status filter for spaces (#3732 ) At the moment all content in Confluence is retrieved by default, including archived content. Often, this is undesired as the content is not relevant anymore. Note: fetching pages by label does not support excluding archived content, which may lead to unexpected results.	2023-04-28 19:56:53 -07:00
Harrison Chase	7a129ac043	Harrison/pypdf loader (#3764 ) Co-authored-by: Felipe Meres <felipe@felipemeres.com>	2023-04-28 19:56:21 -07:00
mbchang	4eefea0fe8	new example: single agent, simulated environment (openai gym) (#3758 ) For many applications of LLM agents, the environment is real (internet, database, REPL, etc). However, we can also define agents to interact in simulated environments like text-based games. This is an example of how to create a simple agent-environment interaction loop with [Gymnasium](https://github.com/Farama-Foundation/Gymnasium) (formerly [OpenAI Gym](https://github.com/openai/gym)).	2023-04-28 19:52:05 -07:00
0xDTE	6ce34bb4fe	Fixing broken document links (#3756 ) simple document url fixes. nothing fancy.	2023-04-28 19:51:23 -07:00
Rafal Wojdyla	160bfae93f	Add `DocstoreFn` - lookup doc via arbitrary function (#3760 ) This partially addresses https://github.com/hwchase17/langchain/issues/1524, but it's also useful for some of our use cases. `DocstoreFn` allows looking up a document via a function that accepts the `search` string, without the need to implement a custom `Docstore`. This could be useful when: * you don't want to implement a `Docstore` just to provide a custom `search` * it's expensive to construct an `InMemoryDocstore`/dict * you retrieve documents from remote sources * you just want to reuse existing objects	2023-04-28 19:50:32 -07:00
Harrison Chase	c55ba43093	Harrison/vespa (#3761 ) Co-authored-by: Lester Solbakken <lesters@users.noreply.github.com>	2023-04-28 19:48:43 -07:00
mbchang	ee20b3e0d0	bug fix: initialize the arxivAPIWrapper object (#3733 )	2023-04-28 19:35:01 -07:00
leo-gan	e510732ad2	docs: improved `vectorstore` notebooks (#3724 ) - Added links to the vectorstore providers - Added installation code (it is not clear that we have to go to the `LangChain Ecosystem` page to get installation instructions.)	2023-04-28 19:26:50 -07:00
BioErrorLog	ad4eae7ef0	Fix linting on the Quickstart Guide sample codes (#3701 ) When copying and pasting the sample code from the Quickstart Guide, lint errors ("missing whitespace around operator") occur.	2023-04-28 17:29:05 -07:00
Zander Chase	a46f1d830e	Synchronous Browser (#3745 ) Split out sync methods in playwright	2023-04-28 17:09:00 -07:00
Zander Chase	6c2b16e465	Add SceneXplain Tool (#3752 )	2023-04-28 17:01:54 -07:00
erwanlc	72c5c15f7f	Fix: Updated links for in depth explanation of chain types in the Question Answering notebooks (#3714 ) In the notebook question_answering.ipynb ([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/question_answering.ipynb)), and the notebook qa_with_sources.ipynb ([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/qa_with_sources.ipynb)), the first paragraph contains a dead link: > This notebook walks through how to use LangChain for question answering over a list of documents. It covers four different types of chains: stuff, map_reduce, refine, map_rerank. For a more in depth explanation of what these chain types are, see [here](`32793f94fd/docs/modules/chains/combine_docs.md`). The file combine_docs.md no longer exists, so the link returns 404 - Page not found. I updated the links so they redirect to https://docs.langchain.com/docs/components/chains/index_related_chains as in the summarize notebook ([link](https://github.com/hwchase17/langchain/blob/master/docs/modules/chains/index_examples/summarize.ipynb)) in the same folder.	2023-04-28 15:06:46 -07:00
Alan Cha	e3b7a20454	Fix typo (#3728 )	2023-04-28 13:01:09 -07:00
Zander Chase	5042bd40d3	Add Shell Tool (#3335 ) Create an official bash shell tool to replace the dynamically generated one	2023-04-28 11:10:43 -07:00
Zander Chase	334c162f16	Add Other File Utilities (#3209 ) Add other file utilities, including - List Directory - Search for file - Move - Copy - Remove file. Bundle them as a toolkit. Add a notebook that connects to the Chat Agent, which somewhat supports multi-arg input tools. Update the original read/write file tools to return the original dir paths and better handle unsupported file paths. Add unit tests.	2023-04-28 10:53:37 -07:00
Zander Chase	491c27f861	PlayWright Web Browser Toolkit (#3262 ) Adds a PlayWright web browser toolkit with the following tools: - NavigateTool (navigate_browser) - navigate to a URL - NavigateBackTool (previous_page) - navigate back to the previous page - ClickTool (click_element) - click on an element (specified by selector) - ExtractTextTool (extract_text) - use beautiful soup to extract text from the current web page - ExtractHyperlinksTool (extract_hyperlinks) - use beautiful soup to extract hyperlinks from the current web page - GetElementsTool (get_elements) - select elements by CSS selector - CurrentPageTool (current_page) - get the current page URL	2023-04-28 10:42:44 -07:00
Zander Chase	da7b51455c	Dynamic tool -> single purpose (#3697 ) I think the logic of https://github.com/hwchase17/langchain/pull/3684#pullrequestreview-1405358565 is too confusing. I prefer this alternative because: - All `Tool()` implementations by default will be treated the same as before. No breaking changes. - Less reliance on pydantic magic - The decorator (which only is typed as returning a callable) can infer schema and generate a structured tool - Either way, the recommended way to create a custom tool is through inheriting from the base tool	2023-04-28 09:38:41 -07:00
Zach Schillaci	1bf1c37c0c	Update VectorDBQA to RetrievalQA in tools (#3698 ) Because `VectorDBQA` and `VectorDBQAWithSourcesChain` are deprecated	2023-04-28 07:39:59 -07:00
Harrison Chase	32793f94fd	bump version to 152 (#3695 )	2023-04-28 00:21:53 -07:00
mbchang	1da3ee1386	Multiagent authoritarian (#3686 ) This notebook showcases how to implement a multi-agent simulation where a privileged agent decides who speaks. This is the polar opposite of the selection scheme in [multi-agent decentralized speaker selection](https://python.langchain.com/en/latest/use_cases/agent_simulations/multiagent_bidding.html). We show an example of this approach in the context of a fictitious simulation of a news network. This example will showcase how we can implement agents that - think before speaking - terminate the conversation	2023-04-27 23:33:29 -07:00
Zander Chase	4654c58f72	Add validation on agent instantiation for multi-input tools (#3681 ) Tradeoffs here: - No lint-time checking for compatibility - Differs from JS package - The signature inference, etc. in the base tool isn't simple - The `args_schema` is optional Pros: - Forwards compatibility retained - Doesn't break backwards compatibility - User doesn't have to think about which class to subclass (single base tool or dynamic `Tool` interface regardless of input) - No need to change the load_tools, etc. interfaces Co-authored-by: Hasan Patel <mangafield@gmail.com>	2023-04-27 15:36:11 -07:00
Davis Chase	212aadd4af	Nit: list to sequence (#3678 )	2023-04-27 14:41:59 -07:00
Davis Chase	b807a114e4	Add query parsing unit tests (#3672 )	2023-04-27 13:42:12 -07:00
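
Note on 37ed6f2177 (Handle length safe embedding only if needed): the change proposed there can be pictured roughly as below. This is a minimal sketch and not the code from the commit or from `langchain/embeddings/openai.py`. The names `embed_text`, `split_into_chunks`, and `toy_embed` are hypothetical, whitespace splitting stands in for a real tokenizer, and the length-weighted average is only one plausible way to combine chunk embeddings.

```python
from typing import Callable, List

Vector = List[float]


def split_into_chunks(tokens: List[str], max_len: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most max_len tokens."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def embed_text(
    text: str,
    embed_fn: Callable[[str], Vector],
    max_ctx_tokens: int,
) -> Vector:
    """Embed text, using the chunked ("length safe") path only when needed.

    embed_fn is a stand-in for a real embedding call (hypothetical).
    """
    tokens = text.split()  # assumption: whitespace tokens, not a real tokenizer

    # Short input: one direct call, no chunking overhead.
    if len(tokens) <= max_ctx_tokens:
        return embed_fn(text)

    # Long input: embed each chunk, then average weighted by chunk length.
    chunks = split_into_chunks(tokens, max_ctx_tokens)
    vectors = [embed_fn(" ".join(chunk)) for chunk in chunks]
    weights = [len(chunk) for chunk in chunks]
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(vec[i] * w for vec, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def toy_embed(text: str) -> Vector:
    """Toy embedding for demonstration only: [token count, 1.0]."""
    return [float(len(text.split())), 1.0]


if __name__ == "__main__":
    print(embed_text("a b c", toy_embed, max_ctx_tokens=4))        # direct path
    print(embed_text("a b c d e f", toy_embed, max_ctx_tokens=4))  # chunked path
```

The point of the branch is that inputs within the context limit cost exactly one embedding call, while longer inputs fall back to chunking.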

1843 Commits