langchain

Commit Graph

Author	SHA1	Message	Date
Naveen Tatikonda	bb6c459f7a	OpenSearch: Add Support for Lucene Filter (#3201 ) ### Description Add Support for Lucene Filter. When you specify a Lucene filter for a k-NN search, the Lucene algorithm decides whether to perform an exact k-NN search with pre-filtering or an approximate search with modified post-filtering. This filter is supported only for approximate search with the indexes that are created using `lucene` engine. OpenSearch Documentation - https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/#lucene-k-nn-filter-implementation Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	1 year ago
Davis Chase	36720cb57f	Hf emb device (#3266 ) Make it possible to control the HuggingFaceEmbeddings and HuggingFaceInstructEmbeddings client model kwargs. Additionally, the cache folder was added for HuggingFaceInstructEmbedding as the client inherits from SentenceTransformer (client of HuggingFaceEmbeddings). It can be useful, especially to control the client device, as it will be defaulted to GPU by sentence_transformers if there is any. --------- Co-authored-by: Yoann Poupart <66315201+Xmaster6y@users.noreply.github.com>	1 year ago
Zach Jones	d7942a9f19	Fix type annotation for `QueryCheckerTool.llm` (#3237 ) Currently `langchain.tools.sql_database.tool.QueryCheckerTool` has a field `llm` with type `BaseLLM`. This breaks initialization for some LLMs. For example, trying to use it with GPT4: ```python from langchain.sql_database import SQLDatabase from langchain.chat_models import ChatOpenAI from langchain.tools.sql_database.tool import QueryCheckerTool db = SQLDatabase.from_uri("some_db_uri") llm = ChatOpenAI(model_name="gpt-4") tool = QueryCheckerTool(db=db, llm=llm) # pydantic.error_wrappers.ValidationError: 1 validation error for QueryCheckerTool # llm # Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error) ``` Seems like much of the rest of the codebase has switched from `BaseLLM` to `BaseLanguageModel`. This PR makes the change for QueryCheckerTool as well Co-authored-by: Zachary Jones <zjones@zetaglobal.com>	1 year ago
Davis Chase	46542dc774	Contextual compression retriever (#2915 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Matt Robinson	3943759a90	feat: add loader for rich text files (#3227 ) ### Summary Adds a loader for rich text files. Requires `unstructured>=0.5.12`. ### Testing The following test uses the example RTF file from the [`unstructured` repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs). ```python from langchain.document_loaders import UnstructuredRTFLoader loader = UnstructuredRTFLoader("fake-doc.rtf", mode="elements") docs = loader.load() docs[0].page_content ```	1 year ago
Harrison Chase	5ef2d1e2a1	add to docs	1 year ago
Harrison Chase	4aedbeaffb	Merge branch 'master' of github.com:hwchase17/langchain	1 year ago
Harrison Chase	2dbb5261b5	wikibase agent	1 year ago
Albert Castellana	0684aa081a	Ecosystem/Yeager.ai (#3239 ) Added yeagerai.md to ecosystem	1 year ago
Boris Feld	0e797a3ff9	Fixing issue link for Comet callback (#3212 ) Sorry I fixed that link once but there was still a typo inside, this time it should be good.	1 year ago
Daniel Chalef	ae528fd06e	fix error msg ref to beautifulsoup4 (#3242 ) Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Tom Dyson	7d3e6389f2	Add DuckDB prompt (#3233 ) Adds a prompt template for the DuckDB SQL dialect.	1 year ago
Zander Chase	daee0b2b97	Patch Chat History Formatting (#3236 ) While we work on solidifying the memory interfaces, handle common chat history formats. This may break linting on anyone who has been passing in `get_chat_history` . Somewhat handles #3077 Alternative to #3078 that updates the typing	1 year ago
Harrison Chase	8f22949dc4	update notebook title	1 year ago
leo-gan	130e4b9fcb	fixed a link to the youtube page (#3232 ) A link to the `YouTube` page was missing on the `index` page.	1 year ago
Peter Stolz	d54b977d4e	Fix docstring of RetrievalQA (#3231 ) Structure changed and RetrievalQA now expects BaseRetriever, not VectorStore	1 year ago
Harrison Chase	b7dea80cba	bump version to 145 (#3229 )	1 year ago
Harrison Chase	b7f2061736	Harrison/google places (#3207 ) Co-authored-by: Cao Hoang <65607230+cnhhoang850@users.noreply.github.com> Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	1 year ago
Gabriel Altay	34fb56b633	fix copy/pasta typos wikipedia->arxiv (#3222 ) just updates a few module level docstrings from Wikipedia -> Arxiv	1 year ago
Harrison Chase	d2520a5f1e	Harrison/ddg (#3206 ) Co-authored-by: itai <itai.marks@gmail.com> Co-authored-by: Itai Marks <itaim@users.noreply.github.com> Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com> Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com> Co-authored-by: Adilzhan Ismailov <13088690+aismlv@users.noreply.github.com> Co-authored-by: Justin Flick <Justinjayflick@gmail.com> Co-authored-by: Justin Flick <jflick@homesite.com>	1 year ago
Harrison Chase	36c10f8a52	nits (#3203 )	1 year ago
Daniel Chalef	27cdf8d675	supabase vectorstore - first cut (#3100 ) First cut of a supabase vectorstore loosely patterned on the langchainjs equivalent. Doesn't support async operations which is a limitation of the supabase python client. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Harrison Chase	9a0356d276	Harrison/file chat history (#3198 ) Co-authored-by: Young Lee <joybro201@gmail.com>	1 year ago
Kazon Wilson	a66cab8b71	Add new line to refine prompt tmpl (#3197 ) Adding a new line to fix issue #3117	1 year ago
Harrison Chase	96809b5794	Harrison/discord loader (#3200 ) Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>	1 year ago
Justin Flick	8faef1a91a	Confluence DL retry/backoff (#3168 ) Implemented a retry/backoff logic in response to #2473 --------- Co-authored-by: Justin Flick <jflick@homesite.com>	1 year ago
Adilzhan Ismailov	c03a65c6dc	Fix from_embeddings method examples (#3174 ) Fix examples for `from_embeddings` method for annoy and faiss vectorstores	1 year ago
Harrison Chase	f19b3890c9	Harrison/site map tqdm (#3184 ) Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com> Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>	1 year ago
Harrison Chase	e55db5841a	Harrison/svm speedup (#3195 ) Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>	1 year ago
obbiondo	d6b2f2b9bd	add ConfluenceLoader to document_loaders init (#3143 ) Fix ConfluenceLoader import Co-authored-by: Andrea Biondo <a.biondo@reply.it>	1 year ago
Zander Chase	c757c3cde4	Add HuggingFace Examples (#3187 ) Add a Pipeline example and add other models in the hub notebook. To close issue [#3077](https://github.com/hwchase17/langchain/issues/3099)	1 year ago
Donald "Max" Ziff	6adf2d1c39	first draft (#2690 ) There is a long way to go on this! --------- Co-authored-by: Max Ziff <max.ziff@concur.com>	1 year ago
Harrison Chase	9181cd9b22	Harrison/playwright selector (#3185 ) Co-authored-by: zhyuri <4649294+zhyuri@users.noreply.github.com>	1 year ago
Harrison Chase	68cd37175e	Harrison/arxiv tool (#3186 ) Co-authored-by: leo-gan <leo.gan.57@gmail.com>	1 year ago
Tunay Okumus	6e48107734	fix: separate model and deployment for OpenAIEmbeddings (#3076 ) Separated the deployment from model to support Azure OpenAI Embeddings properly. Also removed the deprecated document_model_name and query_model_name attributes.	1 year ago
Zander Chase	4adfd790f0	Update File Management Tools to Include Root Directory (#3112 ) - Permit the specification of a `root_dir` to the read/write file tools to specify a working directory - Add validation for attempts to read/write outside the directory (e.g., through `../../` or symlinks or `/abs/path`'s that don't lie in the correct path) - Add some tests for all One question is whether we should make a default root directory for these? tradeoffs either way	1 year ago
John-David Wuarin	a63bfb6c9f	fix: kwargs.pop("redis_url") KeyError: 'redis_url' (#3121 ) This occurred when redis_url was not passed as a parameter even though a REDIS_URL env variable was present. This occurred for all methods that eventually called any of: (from_texts, drop_index, from_existing_index) - i.e. virtually all methods in the class. This fixes it	1 year ago
engkheng	dbbc340f25	Validate `input_variables` when using `jinja2` templates (#3140 ) `langchain.prompts.PromptTemplate` and `langchain.prompts.FewShotPromptTemplate` do not validate `input_variables` when initialized as `jinja2` template. ```python # Using langchain v0.0.144 template = """"\ Your variable: {{ foo }} {% if bar %} You just set bar boolean variable to true {% endif %} """ # Missing variable, should raise ValueError prompt_template = PromptTemplate(template=template, input_variables=["bar"], template_format="jinja2", validate_template=True) # Extra variable, should raise ValueError prompt_template = PromptTemplate(template=template, input_variables=["bar", "foo", "extra", "thing"], template_format="jinja2", validate_template=True) ```	1 year ago
Matt Robinson	3e0c44bae8	enhancement: support headers for non-html urls (#3166 ) ### Summary Updates the `UnstructuredURLLoader` to support passing in headers for non HTML content types. While this update maintains backward compatibility with older versions of `unstructured`, we strongly recommend upgrading to `unstructured>=0.5.13` if you are using the `UnstructuredURLLoader`. ### Testing #### With headers ```python from langchain.document_loaders import UnstructuredURLLoader urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"] loader = UnstructuredURLLoader(urls=urls, headers={"Accept": "application/json"}, strategy="fast") docs = loader.load() print(docs[0].page_content[:1000]) ``` #### Without headers ```python from langchain.document_loaders import UnstructuredURLLoader urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"] loader = UnstructuredURLLoader(urls=urls, strategy="fast") docs = loader.load() print(docs[0].page_content[:1000]) ``` --------- Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	1 year ago
Pranabendra Prasad Chandra	7b1f0656b8	Fix typo in ElasticSearch sample notebook (#3171 ) Added missing parenthesis in example notebook [elasticsearch.ipynb](https://github.com/hwchase17/langchain/blob/master/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb)	1 year ago
Davis Chase	10e4b32ecb	Add document transformer abstraction (#3182 ) Add DocumentTransformer abstraction so that in #2915 we don't have to wrap TextSplitter and RedundantEmbeddingFilter (neither of which uses the query) in the contextual doc compression abstractions. with this change, doc filter (doc extractor, whatever we call it) would look something like ```python class BaseDocumentFilter(BaseDocumentTransformer[_RetrievedDocument], ABC): @abstractmethod def filter(self, documents: List[_RetrievedDocument], query: str) -> List[_RetrievedDocument]: ... def transform_documents(self, documents: List[_RetrievedDocument], query: Optional[str] = None, **kwargs: Any) -> List[_RetrievedDocument]: if query is None: raise ValueError("Must pass in non-null query to DocumentFilter") return self.filter(documents, query) ```	1 year ago
Zander Chase	74342ab209	Update the marathon notebook (#3183 ) There were some steps that didn't make sense. Update now. This time it produced a nice markdown formatted table too	1 year ago
leo-gan	a78f55b851	Additional resources - `YouTube` (#3180 ) Added links to the YouTube tutorials and videos in the `youtube.md`. Added link to the ^ in `index.rst`.	1 year ago
det-sys	26c8cd1ea2	Update gallery.rst (#3176 ) Add https://anysummary.app to the gallery	1 year ago
Happydog	5e66d05928	Fix: typo in custom_mrkl_agents.ipynb document (#3159 ) I have noticed a typo error in the `custom_mrkl_agents.ipynb` document while trying the example from the documentation page. As a result, I have opened a pull request (PR) to address this minor issue, even though it may seem insignificant 😂.	1 year ago
Harrison Chase	99b1983461	add example	1 year ago
Zander Chase	89c63cf8a6	Add Marathon Notebook (#3163 ) Add an example using autogpt to get the boston marathon winning times Add a web browser + summarization tool in the notebook	1 year ago
Dariel Dato-on	0b542661b4	Prevent `kwargs` from being overwritten (#3158 ) Fixes #3157. Prevents `kwargs` from being overwritten by `_to_args_and_kwargs()` and sending the wrong `kwargs` in line 109.	1 year ago
Quentin Pleplé	126d7f11dd	Fix notebook example (#3142 ) The following calls were throwing an exception: `575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb`?short_path=4b3386c#L192 `575b717d10/docs/use_cases/evaluation/agent_vectordb_sota_pg.ipynb`?short_path=4b3386c#L239 Exception: ``` --------------------------------------------------------------------------- ValidationError Traceback (most recent call last) Cell In[14], line 1 ----> 1 chain_sota = RetrievalQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", retriever=vectorstore_sota, input_key="question") File ~/github/langchain/venv/lib/python3.9/site-packages/langchain/chains/retrieval_qa/base.py:89, in BaseRetrievalQA.from_chain_type(cls, llm, chain_type, chain_type_kwargs, **kwargs) 85 _chain_type_kwargs = chain_type_kwargs or {} 86 combine_documents_chain = load_qa_chain( 87 llm, chain_type=chain_type, **_chain_type_kwargs 88 ) ---> 89 return cls(combine_documents_chain=combine_documents_chain, **kwargs) File ~/github/langchain/venv/lib/python3.9/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__() ValidationError: 1 validation error for RetrievalQA retriever instance of BaseRetriever expected (type=type_error.arbitrary_type; expected_arbitrary_type=BaseRetriever) ``` The vectorstores had to be converted to retrievers: `vectorstore_sota.as_retriever()` and `vectorstore_pg.as_retriever()`. The PR also: - adds the file `paul_graham_essay.txt` referenced by this notebook - adds to gitignore .pkl and *.bin files that are generated by this notebook Interestingly enough, the performance of the prediction greatly increased (new version of langchain or new version of OpenAI models since the last run of the notebook): from 19/33 correct to 28/33 correct!	1 year ago
Jakub Kukul	599e17cea8	Working example for Anthropic (#3151 ) would be great if the provided example worked out of the box 😄	1 year ago
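
Commit 4adfd790f0 above describes validating that file tool paths stay inside a `root_dir`, rejecting `../../` traversal, symlinks, and absolute paths that point elsewhere. The following is only a rough sketch of that kind of check using the standard library. The helper name `resolve_in_root` is hypothetical and this is not the code from the commit.

```python
import tempfile
from pathlib import Path


def resolve_in_root(root_dir: str, file_path: str) -> Path:
    """Resolve file_path against root_dir; raise ValueError if it escapes root_dir.

    Hypothetical helper for illustration, not langchain's implementation.
    """
    root = Path(root_dir).resolve()
    # Joining root with an absolute path discards root entirely, so an
    # absolute file_path is handled by the same containment check below.
    # resolve() collapses ".." segments and follows symlinks.
    candidate = (root / file_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"Path {file_path!r} is outside the allowed directory {root}")
    return candidate


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    print(resolve_in_root(tmp, "notes/todo.txt"))
    try:
        resolve_in_root(tmp, "../../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

Resolving before comparing is the important design choice here: comparing the raw strings would miss both `..` segments and symlinks that point outside the root.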


1945 Commits (49ce5ce1ca70657e34b63c2f239222e9557be115)