langchain

Commit Graph

Author	SHA1	Message	Date
Paul Garner	aa9d5707e0	Add PythonLoader which auto-detects encoding of Python files (#3311 ) This PR contributes a `PythonLoader`, which inherits from `TextLoader` but detects and sets the encoding automatically.	1 year ago
Daniel Chalef	1ecbeec24e	Fix example match_documents fn table name, grammar (#3294 ) ref https://github.com/hwchase17/langchain/pull/3100#issuecomment-1517086472 Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Davis Chase	2fd24d31a4	Cleanup integration test dir (#3308 )	1 year ago
leo-gan	3bc703b0d6	added links to the important YouTube videos (#3244 ) Added links to the important YouTube videos	1 year ago
Sertaç Özercan	1e91266a8a	fix: handle youtube TranscriptsDisabled (#3276 ) handles error when youtube video has transcripts disabled ``` youtube_transcript_api._errors.TranscriptsDisabled: Could not retrieve a transcript for the video https://www.youtube.com/watch?v=<URL> This is most likely caused by: Subtitles are disabled for this video If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem! ``` Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	1 year ago
Alexandre Pesant	04e1d6c699	Do not print openai settings (#3280 ) There's no reason to print these settings like that, it just pollutes the logs :)	1 year ago
Zander Chase	a71a2c0eb2	Handle null action in AutoGPT Agent (#3274 ) Handle the case where the command is `null`	1 year ago
Harrison Chase	bf78200f55	bump version 146 (#3272 )	1 year ago
Harrison Chase	87544d2378	gradio tools (#3255 )	1 year ago
Naveen Tatikonda	bb6c459f7a	OpenSearch: Add Support for Lucene Filter (#3201 ) ### Description Add Support for Lucene Filter. When you specify a Lucene filter for a k-NN search, the Lucene algorithm decides whether to perform an exact k-NN search with pre-filtering or an approximate search with modified post-filtering. This filter is supported only for approximate search with the indexes that are created using `lucene` engine. OpenSearch Documentation - https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/#lucene-k-nn-filter-implementation Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	1 year ago
Davis Chase	36720cb57f	Hf emb device (#3266 ) Make it possible to control the HuggingFaceEmbeddings and HuggingFaceInstructEmbeddings client model kwargs. Additionally, the cache folder was added for HuggingFaceInstructEmbedding as the client inherits from SentenceTransformer (client of HuggingFaceEmbeddings). It can be useful, especially to control the client device, as it will be defaulted to GPU by sentence_transformers if there is any. --------- Co-authored-by: Yoann Poupart <66315201+Xmaster6y@users.noreply.github.com>	1 year ago
Zach Jones	d7942a9f19	Fix type annotation for `QueryCheckerTool.llm` (#3237 ) Currently `langchain.tools.sql_database.tool.QueryCheckerTool` has a field `llm` with type `BaseLLM`. This breaks initialization for some LLMs. For example, trying to use it with GPT4: ```python from langchain.sql_database import SQLDatabase from langchain.chat_models import ChatOpenAI from langchain.tools.sql_database.tool import QueryCheckerTool db = SQLDatabase.from_uri("some_db_uri") llm = ChatOpenAI(model_name="gpt-4") tool = QueryCheckerTool(db=db, llm=llm) # pydantic.error_wrappers.ValidationError: 1 validation error for QueryCheckerTool # llm # Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error) ``` Seems like much of the rest of the codebase has switched from `BaseLLM` to `BaseLanguageModel`. This PR makes the change for QueryCheckerTool as well Co-authored-by: Zachary Jones <zjones@zetaglobal.com>	1 year ago
Davis Chase	46542dc774	Contextual compression retriever (#2915 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Matt Robinson	3943759a90	feat: add loader for rich text files (#3227 ) ### Summary Adds a loader for rich text files. Requires `unstructured>=0.5.12`. ### Testing The following test uses the example RTF file from the [`unstructured` repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs). ```python from langchain.document_loaders import UnstructuredRTFLoader loader = UnstructuredRTFLoader("fake-doc.rtf", mode="elements") docs = loader.load() docs[0].page_content ```	1 year ago
Harrison Chase	5ef2d1e2a1	add to docs	1 year ago
Harrison Chase	4aedbeaffb	Merge branch 'master' of github.com:hwchase17/langchain	1 year ago
Harrison Chase	2dbb5261b5	wikibase agent	1 year ago
Albert Castellana	0684aa081a	Ecosystem/Yeager.ai (#3239 ) Added yeagerai.md to ecosystem	1 year ago
Boris Feld	0e797a3ff9	Fixing issue link for Comet callback (#3212 ) Sorry I fixed that link once but there was still a typo inside, this time it should be good.	1 year ago
Daniel Chalef	ae528fd06e	fix error msg ref to beautifulsoup4 (#3242 ) Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Tom Dyson	7d3e6389f2	Add DuckDB prompt (#3233 ) Adds a prompt template for the DuckDB SQL dialect.	1 year ago
Zander Chase	daee0b2b97	Patch Chat History Formatting (#3236 ) While we work on solidifying the memory interfaces, handle common chat history formats. This may break linting on anyone who has been passing in `get_chat_history` . Somewhat handles #3077 Alternative to #3078 that updates the typing	1 year ago
Harrison Chase	8f22949dc4	update nnotebook title	1 year ago
leo-gan	130e4b9fcb	fixed a link to the youtube page (#3232 ) A link to the `YouTube` page was missing on the `index` page.	1 year ago
Peter Stolz	d54b977d4e	Fix docstring of RetrievalQA (#3231 ) Structure changed an RetrievalQA now expects BaseRetriever not VectorStore	1 year ago
Harrison Chase	b7dea80cba	bump version to 145 (#3229 )	1 year ago
Harrison Chase	b7f2061736	Harrison/google places (#3207 ) Co-authored-by: Cao Hoang <65607230+cnhhoang850@users.noreply.github.com> Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	1 year ago
Gabriel Altay	34fb56b633	fix copy/pasta typos wikipedia->arxiv (#3222 ) just updates a few module level docstrings from Wikipedia -> Arxiv	1 year ago
Harrison Chase	d2520a5f1e	Harrison/ddg (#3206 ) Co-authored-by: itai <itai.marks@gmail.com> Co-authored-by: Itai Marks <itaim@users.noreply.github.com> Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com> Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com> Co-authored-by: Adilzhan Ismailov <13088690+aismlv@users.noreply.github.com> Co-authored-by: Justin Flick <Justinjayflick@gmail.com> Co-authored-by: Justin Flick <jflick@homesite.com>	1 year ago
Harrison Chase	36c10f8a52	nits (#3203 )	1 year ago
Daniel Chalef	27cdf8d675	supabase vectorstore - first cut (#3100 ) First cut of a supabase vectorstore loosely patterned on the langchainjs equivalent. Doesn't support async operations which is a limitation of the supabase python client. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Harrison Chase	9a0356d276	Harrison/file chat history (#3198 ) Co-authored-by: Young Lee <joybro201@gmail.com>	1 year ago
Kazon Wilson	a66cab8b71	Add new line to refine prompt tmpl (#3197 ) Adding a new line to fix issue #3117	1 year ago
Harrison Chase	96809b5794	Harrison/discord loader (#3200 ) Co-authored-by: Rajtilak Bhattacharjee <rajtilak.blog@gmail.com>	1 year ago
Justin Flick	8faef1a91a	Confluence DL retry/backoff (#3168 ) Implemented a retry/backoff logic in response to #2473 --------- Co-authored-by: Justin Flick <jflick@homesite.com>	1 year ago
Adilzhan Ismailov	c03a65c6dc	Fix from_embeddings method examples (#3174 ) Fix examples for `from_embeddings` method for annoy and faiss vectorstores	1 year ago
Harrison Chase	f19b3890c9	Harrison/site map tqdm (#3184 ) Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com> Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>	1 year ago
Harrison Chase	e55db5841a	Harrison/svm speedup (#3195 ) Co-authored-by: Lance Martin <122662504+PineappleExpress808@users.noreply.github.com>	1 year ago
obbiondo	d6b2f2b9bd	add ConfluenceLoader to document_loaders init (#3143 ) Fix ConfluenceLoader import Co-authored-by: Andrea Biondo <a.biondo@reply.it>	1 year ago
Zander Chase	c757c3cde4	Add HuggingFace Examples (#3187 ) Add a Pipeline example and add other models in th ehub notebook To close issue [#3077](https://github.com/hwchase17/langchain/issues/3099)	1 year ago
Donald "Max" Ziff	6adf2d1c39	first draft (#2690 ) There is a long way to go on this! --------- Co-authored-by: Max Ziff <max.ziff@concur.com>	1 year ago
Harrison Chase	9181cd9b22	Harrison/playwright selector (#3185 ) Co-authored-by: zhyuri <4649294+zhyuri@users.noreply.github.com>	1 year ago
Harrison Chase	68cd37175e	Harrison/arxiv tool (#3186 ) Co-authored-by: leo-gan <leo.gan.57@gmail.com>	1 year ago
Tunay Okumus	6e48107734	fix: separate model and deployment for OpenAIEmbeddings (#3076 ) Separated the deployment from model to support Azure OpenAI Embeddings properly. Also removed the deprecated document_model_name and query_model_name attributes.	1 year ago
Zander Chase	4adfd790f0	Update File Management Tools to Include Root Directory (#3112 ) - Permit the specification of a `root_dir` to the read/write file tools to specify a working directory - Add validation for attempts to read/write outside the directory (e.g., through `../../` or symlinks or `/abs/path`'s that don't lie in the correct path) - Add some tests for all One question is whether we should make a default root directory for these? tradeoffs either way	1 year ago
John-David Wuarin	a63bfb6c9f	fix: kwargs.pop("redis_url") KeyError: 'redis_url' (#3121 ) This occurred when redis_url was not passed as a parameter even though a REDIS_URL env variable was present. This occurred for all methods that eventually called any of: (from_texts, drop_index, from_existing_index) - i.e. virtually all methods in the class. This fixes it	1 year ago
engkheng	dbbc340f25	Validate `input_variables` when using `jinja2` templates (#3140 ) `langchain.prompts.PromptTemplate` and `langchain.prompts.FewShotPromptTemplate` do not validate `input_variables` when initialized as `jinja2` template. ```python # Using langchain v0.0.144 template = """"\ Your variable: {{ foo }} {% if bar %} You just set bar boolean variable to true {% endif %} """ # Missing variable, should raise ValueError prompt_template = PromptTemplate(template=template, input_variables=["bar"], template_format="jinja2", validate_template=True) # Extra variable, should raise ValueError prompt_template = PromptTemplate(template=template, input_variables=["bar", "foo", "extra", "thing"], template_format="jinja2", validate_template=True) ```	1 year ago
Matt Robinson	3e0c44bae8	enhancement: support headers for non-html urls (#3166 ) ### Summary Updates the `UnstructuredURLLoader` to support passing in headers for non HTML content types. While this update maintains backward compatibility with older versions of `unstructured`, we strongly recommended upgrading to `unstructured>=0.5.13` if you are using the `UnstructuredURLLoader`. ### Testing #### With headers ```python from langchain.document_loaders import UnstructuredURLLoader urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"] loader = UnstructuredURLLoader(urls=urls, headers={"Accept": "application/json"}, strategy="fast") docs = loader.load() print(docs[0].page_content[:1000]) ``` #### Without headers ```python from langchain.document_loaders import UnstructuredURLLoader urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"] loader = UnstructuredURLLoader(urls=urls, strategy="fast") docs = loader.load() print(docs[0].page_content[:1000]) ``` --------- Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	1 year ago
Pranabendra Prasad Chandra	7b1f0656b8	Fix typo in ElasticSearch sample notebook (#3171 ) Added missing parenthesis in example notebook [elasticsearch.ipynb](https://github.com/hwchase17/langchain/blob/master/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb)	1 year ago
Davis Chase	10e4b32ecb	Add document transformer abstraction (#3182 ) Add DocumentTransformer abstraction so that in #2915 we don't have to wrap TextSplitter and RedundantEmbeddingFilter (neither of which uses the query) in the contextual doc compression abstractions. with this change, doc filter (doc extractor, whatever we call it) would look something like ```python class BaseDocumentFilter(BaseDocumentTransformer[_RetrievedDocument], ABC): @abstractmethod def filter(self, documents: List[_RetrievedDocument], query: str) -> List[_RetrievedDocument]: ... def transform_documents(self, documents: List[_RetrievedDocument], query: Optional[str] = None, **kwargs: Any) -> List[_RetrievedDocument]: if query is None: raise ValueError("Must pass in non-null query to DocumentFilter") return self.filter(documents, query) ```	1 year ago

... 3 4 5 6 7 ...

1654 Commits (52e4fba897b24361112fceebf87b3d7b81c4a18c) All Branches Search

1654 Commits (52e4fba897b24361112fceebf87b3d7b81c4a18c)

All Branches