langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Patrick Loeber	5990651070	Add new document_loader: AssemblyAIAudioTranscriptLoader (#9667 ) This PR adds a new document loader `AssemblyAIAudioTranscriptLoader` that transcribes audio files with the [AssemblyAI API](https://www.assemblyai.com) and loads the transcribed text into documents. - Add new document_loader with class `AssemblyAIAudioTranscriptLoader` - Add optional dependency `assemblyai` - Add unit tests (using a Mock client) - Add docs notebook This is the equivalent of the JS integration already available in LangChain.js. See the [LangChain JS docs AssemblyAI page](https://js.langchain.com/docs/modules/data_connection/document_loaders/integrations/web_loaders/assemblyai_audio_transcription). At its simplest, you can use the loader to get a transcript back from an audio file like this: ```python from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader loader = AssemblyAIAudioTranscriptLoader(file_path="./testfile.mp3") docs = loader.load() ``` To use it, you need the `assemblyai` Python package installed and the environment variable `ASSEMBLYAI_API_KEY` set to your API key. Alternatively, the API key can be passed as an argument. Twitter handles to shout out if so kindly 🙇 [@AssemblyAI](https://twitter.com/AssemblyAI) and [@patloeber](https://twitter.com/patloeber) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-23 22:51:19 -07:00
seamusp	25f2c82ae8	docs:misc fixes (#9671 ) Improve internal consistency in LangChain documentation - Change occurrences of eg and eg. to e.g. - Fix headers containing unnecessary capital letters. - Change instances of "few shot" to "few-shot". - Add periods to end of sentences where missing. - Minor spelling and grammar fixes.	2023-08-23 22:36:54 -07:00
Nuno Campos	6283f3b63c	Resolve circular imports in runnables (#9675 ) These are about to cause circular imports.	2023-08-24 06:05:51 +01:00
Eugene Yurtsev	9e1dbd4b49	x	2023-08-23 22:51:49 -04:00
Eugene Yurtsev	b88dfcb42a	Add indexing support (#9614 ) This PR introduces a persistence layer to help with indexing workflows into vectorstores. The indexing code helps users to: 1. Avoid writing duplicated content into the vectorstore 2. Avoid over-writing content if it's unchanged Importantly, this keeps on working even if the content being written is derived via a set of transformations from some source content (e.g., indexing children documents that were derived from parent documents by chunking). The two main components are: 1. Persistence layer that keeps track of which keys were updated and when. Keeping track of the timestamp of updates allows old content to be cleaned up safely and with minimal complexity. 2. HashedDocument, which is used to hash the contents (including metadata) of the documents. We rely on the hashes for identifying duplicates. The indexing code works with ANY document loader. To add transformations to the documents, users for now can add a custom document loader that composes an existing loader together with document transformers. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 21:41:38 -04:00
刘方瑞	c215481531	Update default index type and metric type for MyScale vector store (#9353 ) We update the default index type from `IVFFLAT` to `MSTG`, a new vector type developed by MyScale.	2023-08-23 18:26:29 -07:00
Joshua Sundance Bailey	a9c86774da	Anthropic: Allow the use of kwargs consistent with ChatOpenAI. (#9515 ) - Description: ~~Creates a new root_validator in `_AnthropicCommon` that allows the use of `model_name` and `max_tokens` keyword arguments.~~ Adds pydantic field aliases to support `model_name` and `max_tokens` as keyword arguments. Ultimately, this makes `ChatAnthropic` more consistent with `ChatOpenAI`, making the two classes more interchangeable for the developer. - Issue: https://github.com/langchain-ai/langchain/issues/9510 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 18:23:21 -07:00
Lakshay Kansal	a8c916955f	Updates to Nomic Atlas and GPT4All documentation (#9414 ) Description: Updates for Nomic AI Atlas and GPT4All integrations documentation. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 17:49:44 -07:00
Bagatur	342087bdfa	fix integration test imports (#9669 )	2023-08-23 16:47:01 -07:00
Keras Conv3d	cbaea8d63b	tair fix distance_type error, and add hybrid search (#9531 ) - fix: distance_type error, - feature: Tair add hybrid search --------- Co-authored-by: thw <hanwen.thw@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 16:38:31 -07:00
Eugene Yurtsev	cd81e8a8f2	Add exclude to GenericLoader.from_file_system (#9539 ) support exclude param in GenericLoader.from_filesystem --------- Co-authored-by: Kyle Pancamo <50267605+KylePancamo@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 16:09:10 -07:00
Jacob Lee	278ef0bdcf	Adds ChatOllama (#9628 ) @rlancemartin --------- Co-authored-by: Adilkhan Sarsen <54854336+adolkhan@users.noreply.github.com> Co-authored-by: Kim Minjong <make.dirty.code@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 13:02:26 -07:00
Nuno Campos	fa05e18278	Nc/runnable lambda recurse (#9390 )	2023-08-23 20:07:08 +01:00
Nuno Campos	20ce283fa7	Format	2023-08-23 20:03:35 +01:00
Nuno Campos	6424b3cde0	Add another test	2023-08-23 20:02:35 +01:00
William FH	da18e177f1	Update libs/langchain/langchain/schema/runnable/base.py Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-23 20:00:16 +01:00
Nuno Campos	c326751085	Lint	2023-08-23 20:00:16 +01:00
Nuno Campos	6d19709b65	RunnableLambda, if func returns a Runnable, run it	2023-08-23 20:00:16 +01:00
Nuno Campos	677da6a0fd	Add support for async funcs in RunnableSequence	2023-08-23 19:54:48 +01:00
Nuno Campos	64a958c85d	Runnables: Add .map() method (#9445 )	2023-08-23 19:54:12 +01:00
Nuno Campos	1751fe114d	Add one more test	2023-08-23 19:52:13 +01:00
Nuno Campos	882b97cfd2	Lint	2023-08-23 19:50:20 +01:00
Nuno Campos	3ddabe8b2c	Code review	2023-08-23 19:48:33 +01:00
Nuno Campos	fdcd50aab4	Extend test	2023-08-23 19:48:33 +01:00
Nuno Campos	9777c2801d	Update method and docstring	2023-08-23 19:48:33 +01:00
Nuno Campos	93bbf67afc	WIP Add test Add test Lint	2023-08-23 19:48:33 +01:00
Nuno Campos	c184be5511	Use a shared executor for all parallel calls	2023-08-23 19:48:33 +01:00
Nuno Campos	dacd5dcba8	Runnables: Use a shared executor for all parallel calls (sync) (#9443 ) Async equivalent coming in future PR	2023-08-23 19:47:35 +01:00
Bagatur	80dd162e0d	mv embedding cache docs (#9664 )	2023-08-23 11:46:04 -07:00
Nuno Campos	db4b256a28	Add error for batch of 0	2023-08-23 19:39:46 +01:00
Nuno Campos	3458489936	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	e420bf22b6	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	cc83f54694	L:int	2023-08-23 19:39:46 +01:00
Nuno Campos	d414d47c78	Use a shared executor for all parallel calls	2023-08-23 19:39:46 +01:00
Bagatur	a40c12bb88	Update the nlpcloud connector after some changes on the NLP Cloud API (#9586 ) - Description: remove some text generation deprecated parameters and update the embeddings doc, - Tag maintainer: @rlancemartin	2023-08-23 11:35:08 -07:00
Bagatur	d8e2dd4c89	mv	2023-08-23 11:30:44 -07:00
Bagatur	e2e582f1f6	Fixed source key name for docugami loader (#8598 ) The Docugami loader was not returning the source metadata key. This was triggering this exception when used with retrievers, per https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/schema/prompt_template.py#L193C1-L195C41 The fix is simple and just updates the metadata key name for the document each chunk is sourced from, from "name" to "source" as expected. I tested by running the python notebook that has an end to end scenario in it. Tagging DataLoader maintainers @rlancemartin @eyurtsev	2023-08-23 11:24:55 -07:00
karynzv	5508baf1eb	Add CrateDB prompt (#9657 ) Adds a prompt template for the CrateDB SQL dialect.	2023-08-23 13:33:37 -04:00
Bagatur	0154958243	Runnable locals (#9662 ) Add Runnables that manipulate state local to a RunnableSequence	2023-08-23 10:30:03 -07:00
Bagatur	a8e8a31b41	Merge branch 'master' into bagatur/locals_in_config	2023-08-23 10:26:11 -07:00
Bagatur	ef87affd4d	Revert "Locals in config" (#9661 ) Reverts langchain-ai/langchain#9007	2023-08-23 10:24:59 -07:00
Bagatur	1c64db575c	Runnable locals(#9007 ) Adds Runnables that can manipulate variables local to a RunnableSequence run --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-08-23 10:24:27 -07:00
Bagatur	ef2500584c	fmt	2023-08-23 10:15:45 -07:00
Zizhong Zhang	8a03836160	docs: fix PromptGuard docs (#9659 ) Fix PromptGuard docs. Noticed several trivial issues on the docs when integrating the new class. cc @baskaryan	2023-08-23 10:04:53 -07:00
Yong woo Song	f0ae10a20e	Fix typo in tigris (#9637 ) The link in the [tigris docs](https://python.langchain.com/docs/integrations/providers/tigris) has a typo, so I couldn't access it. I have corrected it. Thanks! ☺️	2023-08-23 07:15:18 -07:00
Guy Korland	39a5d02225	Cleanup of ruff warnings use isinstance() instead of type() (#9655 ) Minor cosmetic PR just cleanup of `ruff` warnings use `isinstance()` instead of `type()`	2023-08-23 07:14:31 -07:00
Junlin Zhou	5b9bdcac1b	docs: fix link url (#9643 ) This pull request corrects the URL links in the Async API documentation to align with the updated project layout. The links had not been updated despite the changes in layout.	2023-08-23 07:05:02 -07:00
Aashish Saini	eb92da84a1	Fixings grammatical errors in Doc Files (#9647 ) Fixing some typos and grammatical errors in doc files. @eyurtsev , @baskaryan Thanks --------- Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: Ishita Chauhan <136303787+IshitaChauhanShortHillsAI@users.noreply.github.com>	2023-08-23 07:04:29 -07:00
Joseph McElroy	2a06e7b216	ElasticsearchStore: improve error logging for adding documents (#9648 ) It is not obvious what the error is when you cannot index. This PR adds the ability to log the first error's reason, to help the user diagnose the issue. Also added some more documentation for when you want to use the vectorstore with an embedding model deployed in Elasticsearch. Credit: @elastic and @phoey1	2023-08-23 07:04:09 -07:00
Julien Salinas	f1072cc31f	Merge branch 'master' into master	2023-08-23 14:42:40 +02:00
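
The "Add indexing support" commit (b88dfcb42a) describes two ideas: hashing each document's content together with its metadata to detect duplicates, and recording when each key was last written so stale content can be cleaned up. The sketch below illustrates those two ideas using only the Python standard library. It is not the LangChain implementation: `hash_document`, `InMemoryRecordStore`, `index`, and the plain `dict` standing in for a vector store are all hypothetical names chosen for illustration.

```python
import hashlib
import json
import time


def hash_document(page_content, metadata):
    """Hash content and metadata together, so a change to either yields a new key."""
    payload = json.dumps(
        {"page_content": page_content, "metadata": metadata}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryRecordStore:
    """Hypothetical persistence layer: maps each key to the time it was last written."""

    def __init__(self):
        self.updated_at = {}

    def exists(self, key):
        return key in self.updated_at

    def touch(self, key, timestamp):
        self.updated_at[key] = timestamp

    def keys_older_than(self, timestamp):
        return [k for k, t in self.updated_at.items() if t < timestamp]

    def delete(self, keys):
        for key in keys:
            self.updated_at.pop(key, None)


def index(docs, record_store, vector_store, now=None):
    """Write only unseen documents, then drop entries not seen in this run.

    `docs` is an iterable of (page_content, metadata) pairs and `vector_store`
    is a plain dict used as a stand-in for a real vector store (assumption).
    """
    run_started = time.time() if now is None else now
    added = skipped = 0
    for page_content, metadata in docs:
        key = hash_document(page_content, metadata)
        if record_store.exists(key):
            skipped += 1  # unchanged content: do not write it again
        else:
            vector_store[key] = page_content
            added += 1
        record_store.touch(key, run_started)  # mark the key as seen in this run

    # Anything last touched before this run is no longer in the source.
    stale = record_store.keys_older_than(run_started)
    for key in stale:
        vector_store.pop(key, None)
    record_store.delete(stale)
    return {"added": added, "skipped": skipped, "deleted": len(stale)}
```

Calling `index` twice with the same documents writes nothing the second time, and calling it with a smaller set removes the entries that have disappeared from the source. Deleting everything not seen in the current run is only one possible cleanup policy, chosen here to keep the sketch short.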


4406 Commits