langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Maciej Dzieżyc	bcd308c368	Fix Open in Colab link for ClearML docs 2 (#11491 ) Description: Fixed the Open in Colab link for ClearML docs Issue: https://github.com/allegroai/clearml/issues/1125 Twitter handle: DziezycMaciej	2023-10-06 12:01:47 -07:00
Bagatur	88ab69c288	mv docs extras (#11399 )	2023-10-06 10:09:41 -07:00
Bagatur	1bf8ef1a4f	rm brave (#11482 )	2023-10-06 07:44:19 -07:00
Theron Tau	35297ca0d3	Add feature for extracting images from pdf and recognizing text from images. (#10653 ) Description It is for #10423 that it will be a useful feature if we can extract images from pdf and recognize text on them. I have implemented it with `PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`, `PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`. [RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize text on extracted images. It is time-consuming for ocr so a boolen parameter `extract_images` is set to control whether to extract and recognize. I have tested the time usage for each parser on my own laptop thinkbook 14+ with AMD R7-6800H by unit test and the result is: \| extract_images \| PyPDFParser \| PDFMinerParser \| PyMuPDFParser \| PyPDFium2Parser \| PDFPlumberParser \| \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| \| False \| 0.27s \| 0.39s \| 0.06s \| 0.08s \| 1.01s \| \| True \| 17.01s \| 20.67s \| 20.32s \| 19,75s \| 20.55s \| Issue #10423 Dependencies rapidocr_onnxruntime in [RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:51:59 -07:00
Bagatur	a3a2ce623e	Revise vowpal_wabbit notebook	2023-10-05 18:18:19 -07:00
Bagatur	8fafa1af91	merge	2023-10-05 18:09:35 -07:00
rodrigo-clickup	5944c1851b	Add ClickUp Toolkit (#10662 ) - Description: Adds a toolkit to interact with the [ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/) - Dependencies: None - Tag maintainer: @rodrigo-georgian, @rodrigo-clickup, @aiswaryasankarwork - Twitter handle: - Aiswarya (https://twitter.com/Aiswarya_Sankar, https://www.linkedin.com/in/sankaraiswarya/) - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/) --------- Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local> Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 16:33:05 -07:00
Beck Bekmyradov	f9df55f7d2	Fix a Typo in Documentation (#11453 ) - Description: This commit corrects a minor typo in the documentation. It changes "frum" to "from" in the sentence: "The results from search are passed back to the LLM for synthesis into an answer" in the file `docs/extras/use_cases/more/agents/agents.ipynb`. This typo fix enhances the clarity and accuracy of the documentation. - Tag maintainer: @baskaryan	2023-10-05 15:34:06 -07:00
Bagatur	f5ce286932	fix api docs build (#11445 )	2023-10-05 15:33:11 -07:00
mrbean	9903a70379	Add youdotcom retriever (#11304 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:48:11 -07:00
Syed Ather Rizvi	bfd48925e5	Feature/csharp text splitter doc (#10571 ) - Description: Just docs related to csharp code splitter - Issue: It's related to a request made by @baskaryan in a comment on my previous PR #10350 - Dependencies: None - Twitter handle: @ather19 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 12:22:54 -07:00
maks-operlejn-ds	2aae1102b0	Instance anonymization (#10501 ) ### Description Add instance anonymization - if `John Doe` will appear twice in the text, it will be treated as the same entity. The difference between `PresidioAnonymizer` and `PresidioReversibleAnonymizer` is that only the second one has a built-in memory, so it will remember anonymization mapping for multiple texts: ``` >>> anonymizer = PresidioAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Brett Russell. Hi Brett Russell!' ``` ``` >>> anonymizer = PresidioReversibleAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' ``` ### Twitter handle @deepsense_ai / @MaksOpp ### Tag maintainer @baskaryan @hwchase17 @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:23:02 -07:00
Holt Skinner	9f73fec057	fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513 ) - Description: Google Cloud Enterprise Search was renamed to Vertex AI Search - https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available - This PR updates the documentation and Retriever class to use the new terminology. - Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to `GoogleVertexAISearchRetriever` - Updated documentation to specify that `extractive_segments` requires the new [Enterprise edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features) to be enabled. - Fixed spelling errors in documentation. - Change parameter for Retriever from `search_engine_id` to `data_store_id` - When this retriever was originally implemented, there was no distinction between a data store and search engine, but now these have been split. - Fixed an issue blocking some users where the api_endpoint can't be set	2023-10-05 10:47:47 -07:00
Mateusz Wosinski	656480feb6	Add language detection example (#10540 ) ### Description Adds language detection examples based on [langdetect](https://github.com/Mimino666/langdetect/tree/master/langdetect) and [fasttext](https://github.com/facebookresearch/fastText/) libraries. These frameworks can be especially useful together with components that require selection of the language (e.g. data-anonymizer) ### Twitter handle @deepsense_ai, @matt_wosinski	2023-10-05 10:39:08 -07:00
billytrend-cohere	2ff91a46c0	Add cohere /chat integration (#11389 ) Add cohere /chat integration and an iPython notebook to demonstrate the addition. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 09:20:47 -07:00
ElliotKetchup	53d4f1554a	Update aws.mdx (#11431 )	2023-10-05 09:07:16 -07:00
Lance Martin	211a74941a	Update QA doc w/ Runnables (#11401 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-05 08:07:38 -07:00
Nuno Campos	1e59c44d36	Nc/5oct/runnable release (#11428 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-05 14:27:50 +01:00
William FH	940b9ae30a	Normalize Option in Scoring Chain (#11412 )	2023-10-04 15:59:28 -07:00
bholagabbar	b9fad28f5e	Fix typing imports in extraction usecase (#11402 ) The person class here: https://python.langchain.com/docs/use_cases/extraction#pydantic-1 has attributes `dog_breed` and `dog_name` that use `Optional` from typing, but it hasn't been imported. Fixed the import here	2023-10-04 13:55:02 -07:00
Leonid Ganeline	22165cb2fc	merge pages into `google` and `AWS` pages (#11312 ) There are several pages in `integrations/providers/more` that belongs to Google and AWS `integrations/providers`. - moved content of these pages into the Google and AWS `integrations/providers` pages - removed these individual pages	2023-10-04 13:44:23 -07:00
Bagatur	91941d1f19	mv LCEL up in docs (#11395 )	2023-10-04 15:34:06 -04:00
Lester Solbakken	a30f98f534	Add Vespa vector store (#11329 ) Addition of Vespa vector store integration including notebook showing its use. Maintainer: @lesters Twitter handle: LesterSolbakken	2023-10-04 14:59:11 -04:00
Tomaz Bratanic	71290315cf	Add optional Cypher validation tool (#11078 ) LLMs have trouble with consistently getting the relationship direction accurately. That's why I organized a competition how to best and most simple to fix it based on the existing schema as a post-processing step. https://github.com/tomasonjo/cypher-direction-competition I am adding the winner's code in this PR: https://github.com/sakusaku-rich/cypher-direction-competition	2023-10-04 12:54:37 -04:00
Anatolii Kmetiuk	34a64101cc	Add explanations to GoogleDriveLoader how to avoid errors (#11335 ) - Description: add a paragraph to the GoogleDriveLoader doc on how to bypass errors on authentication. For some reason, specifying credential path via `credentials_path` constructor parameter when creating `GoogleDriveLoader` makes it so that the oAuth screen is never showing up when first using GoogleDriveLoader. Instead, the `RefreshError: ('invalid_grant: Bad Request', {'error': 'invalid_grant', 'error_description': 'Bad Request'})` error happens. Setting it via `os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ...` solves the problem. Also, `token_path` constructor parameter is mandatory, otherwise another error happens when trying to `load()` for the first time. These errors are tricky and time-consuming to figure out, so I believe it's good to mention them in the docs. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-04 11:12:54 -04:00
MattiaSangermano	cdf5259ca9	Fixed import typo (#11278 ) Fixed small import typo in react_docstore documentation --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-04 10:18:10 -04:00
mziru	9e3c1d4463	add HTMLHeaderTextSplitter (#11039 ) Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the `HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text at the element level and adds metadata for each header "relevant" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline. Dependency: lxml python package Maintainer: @hwchase17 Twitter handle: @MartinZirulnik --------- Co-authored-by: PresidioVantage <github@presidiovantage.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-04 09:24:25 -04:00
Isaac Chung	1165767df2	Clarifai integration doc improvements (#11251 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: Doc corrections and resolve notebook rendering issue on GH - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: `@isaacchung1217` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-03 21:47:57 -04:00
Oleg Sinavski	1ca62b232b	Docs: improve similarity search examples (#11298 ) Description: Examples in the "Select by similarity" section were not really highlighting capabilities of similarity search. E.g. "# Input is a measurement, so should select the tall/short example" was still outputting the "mood" example. I tweaked the inputs a bit and fixed the examples (checking that those are indeed what the search outputs). Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-03 21:47:08 -04:00
LeeJongBeom	92683262f4	Fix documents for RetrievalQAWithSourcesChain (#11292 ) - Description: Fix typo about `RetrievalQAWithSourceChain` -> `RetrievalQAWithSourcesChain` <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-03 17:36:16 -07:00
Fynn Flügge	0a4baca291	chore: add kotlin code splitter (#11364 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: Adds Kotlin language to `TextSplitter` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-03 18:35:36 -04:00
Ofer Mendelevitch	b93a08079e	Updates to Vectara Implementation (#11366 ) Replace this entire comment with: - Description: updates to documentation and API headers - Tag maintainer: @baskarya - Twitter handle: @ofermend	2023-10-03 18:34:39 -04:00
Vicente Reyes	f3e13e7e5a	Use term keyword according to the official python doc glossary (#11338 ) - Description: use term keyword according to the official python doc glossary, see https://docs.python.org/3/glossary.html - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @hwchase17 - Twitter handle: vreyespue	2023-10-03 12:56:08 -07:00
Leonid Ganeline	39316314fa	`fallback` definition (#10504 ) I've added a definition to `fallback` and fixed couple misspells. It was not really clear what is the "fallback".	2023-10-03 12:38:59 -07:00
Mohammad Mohtashim	3bddd708f7	Add memory to sql chain (#8597 ) continuation of PR #8550 @hwchase17 please see and merge. And also close the PR #8550. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-03 12:04:39 -07:00
Ikko Eltociear Ashimine	49b34e2293	Fix typo in agent_structured.ipynb (#11340 ) therefor -> therefore <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-03 09:00:38 -07:00
Lance Martin	b3c83fdd33	Add prompt hub support for Mistral w/ Ollama (#11315 ) Add Mistral example with prompt support	2023-10-03 08:17:46 -07:00
Bagatur	89436de7a7	update sec doc (#11336 )	2023-10-03 10:22:53 -04:00
Aashish Saini	8a507154ca	Update clarifai.mdx (#11318 ) @baskaryan , Small typo fix	2023-10-02 22:16:00 -07:00
Jacob Lee	933655b4ac	Adds Tavily Search API retriever (#11314 ) @baskaryan @efriis	2023-10-02 17:12:17 -07:00
CG80499	943e4f30d8	Add scoring chain (#11123 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-02 15:15:31 -07:00
João Carabetta	29b9a890d4	Fix line break in docs imports (#11270 ) It is just a straightforward docs fix.	2023-10-02 15:37:16 -04:00
Oleg Sinavski	0b08a17e31	Fix closing bracket in length-based selector snippet (#11294 ) Description: Fix a forgotten closing bracket in the length-based selector snippet Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-02 15:36:58 -04:00
Nuno Campos	1cbe7f5450	Small changes to runnable docs (#11293 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-02 16:27:11 +01:00
zhengkai	3d859075d4	Remove extra spaces (#11283 ) ### Description When I was reading the document, I found that some examples had extra spaces and violated "Unexpected spaces around keyword / parameter equals (E251)" in pep8. I removed these extra spaces. ### Tag maintainer @eyurtsev ### Twitter handle [billvsme](https://twitter.com/billvsme)	2023-10-02 10:02:30 -04:00
James Odeyale	61cd83bf96	Update quickstart.mdx to add backtick after `ChatMessages` (#11241 ) While going through the documentation I found this small issue and wanted to contribute! <!-- Thank you for contributing to LangChain! -->	2023-10-02 10:02:03 -04:00
Kazuki Maeda	a363ab5292	rename repo namespace to langchain-ai (#11259 ) ### Description renamed several repository links from `hwchase17` to `langchain-ai`. ### Why I discovered that the README file in the devcontainer contains an old repository name, so I took the opportunity to rename the old repository name in all files within the repository, excluding those that do not require changes. ### Dependencies none ### Tag maintainer @baskaryan ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	2023-10-01 15:30:58 -04:00
Leonid Ganeline	5e5039dbd2	docs: updated `YouTube` and `tutorial` video links (#10897 ) updated `YouTube` and `tutorial` videos with new links. Removed couple of duplicates. Reordered several links by view counters Some formatting: emphasized the names of products	2023-09-30 16:37:28 -07:00
Leonid Ganeline	cb84f612c9	docs: `document_transformers` consistency (#10467 ) - Updated `document_transformers` examples: titles, descriptions, links - Added `integrations/providers` for missed document_transformers	2023-09-30 16:36:23 -07:00
Leonid Ganeline	240190db3f	docs: `integrations/memory` consistency (#10255 ) - updated titles and descriptions of the `integrations/memory` notebooks into consistent and laconic format; - removed `docs/extras/integrations/memory/motorhead_memory_managed.ipynb` file as a duplicate of the `docs/extras/integrations/memory/motorhead_memory.ipynb`; - added `integrations/providers` Integration Cards for `dynamodb`, `motorhead`. - updated `integrations/providers/redis.mdx` with links - renamed several notebooks; updated `vercel.json` to reroute new names.	2023-09-30 16:35:55 -07:00

1 2 3 4 5 ...

2176 Commits