langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Erick Friis	4153837502	google-genai[patch]: release 0.0.7 (#17193 )	2024-02-07 17:15:09 -08:00
Erick Friis	927ab77d6e	google-genai[patch]: no error for FunctionMessage (#17215 ) Both should eventually match this: https://github.com/langchain-ai/langchain/blob/master/libs/partners/google-vertexai/langchain_google_vertexai/chat_models.py#L179 But seems undocumented / can't find types in genai package	2024-02-07 17:14:50 -08:00
Erick Friis	2ecf318218	google-genai[patch]: match function call interface (#17213 ) should match vertex	2024-02-07 17:07:31 -08:00
Erick Friis	e17173c403	google-vertexai[patch]: function calling integration test (#17209 )	2024-02-07 15:49:56 -08:00
Erick Friis	52be84a603	google-vertexai[patch]: serializable citation metadata, release 0.0.4 (#17145 ) was breaking in langserve before	2024-02-07 15:47:32 -08:00
Nuno Campos	19ff81e74f	Fix stream events/log with some kinds of non addable output (#17205 )	2024-02-07 15:46:13 -08:00
Bagatur	6f1403b9b6	community[patch]: Release 0.0.19 (#17207 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 15:37:01 -08:00
Erick Friis	a13dc47a08	cli[patch]: copyright 2024 default (#17204 )	2024-02-07 14:52:37 -08:00
Bagatur	00757567ba	core[patch]: Release 0.1.21 (#17202 )	2024-02-07 14:20:20 -08:00
Bagatur	af74301ab9	core[patch], community[patch]: link extraction continue on failure (#17200 )	2024-02-07 14:15:30 -08:00
Henry	2281f00198	langchain: Standardize `output_parser.py` across all agent types for custom `FORMAT_INSTRUCTIONS` (#17168 ) - Description: This PR standardizes the `output_parser.py` file across all agent types to ensure a uniform parsing mechanism is implemented. It introduces a cohesive structure and common interface for output parsing, facilitating easier modifications and extensions by users. The standardized approach enhances maintainability and scalability of the codebase by providing a consistent pattern for output parsing, which can be easily understood and utilized across different agent types. This PR builds upon the foundation set by a previously merged PR, which focused exclusively on standardizing the `output_parser.py` for the `conversational_agent` ([PR #16945](https://github.com/langchain-ai/langchain/pull/16945)). With this new update, I extend the standardization efforts to encompass `output_parser.py` files across all agent types. This enhancement not only unifies the parsing mechanism across the board but also introduces the flexibility for users to incorporate custom `FORMAT_INSTRUCTIONS`. - Issue: https://github.com/langchain-ai/langchain/issues/10721 https://github.com/langchain-ai/langchain/issues/4044 - Dependencies: No new dependencies required for this change - Twitter handle: With my github user is enough. Thanks I hope you accept my PR.	2024-02-07 13:46:17 -08:00
Bagatur	78409634fe	core[patch]: Release 0.1.20 (#17194 )	2024-02-07 12:28:05 -08:00
Nuno Campos	65798289a4	core[minor]: Use batched tracing in sdk (#16305 ) Remove threadpool executor usage in langchain tracer, this is now handled by sdk	2024-02-07 12:10:58 -08:00
chyroc	f87b38a559	google-genai[minor]: support functions call (#15146 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 12:09:30 -08:00
Tomaz Bratanic	302989a2b1	allow optional newline in the action responses of JSON Agent parser (#17186 ) Based on my experiments, the newline isn't always there, so we can make the regex slightly more robust by allowing an optional newline after the backticks	2024-02-07 10:26:14 -08:00
William FH	9fa07076da	Add trace_as_chain_group metadata (#17187 )	2024-02-07 09:42:44 -08:00
Erick Friis	3e58df43c2	mistralai[patch]: release 0.0.4 (#17139 )	2024-02-06 16:05:20 -08:00
Erick Friis	22b6a03a28	infra: read min versions (#17135 )	2024-02-06 16:05:11 -08:00
Erick Friis	f881a3330c	mistralai[patch]: 16k token batching logic embed (#17136 )	2024-02-06 15:59:08 -08:00
Bagatur	226f376d59	community[patch]: Release 0.0.18 (#17129 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-06 13:40:00 -08:00
Erick Friis	980e30c361	nvidia-ai-endpoints[patch]: release 0.0.2 (#17125 )	2024-02-06 12:48:25 -08:00
Erick Friis	15bd1154a7	pinecone[patch]: integration test new namespace (#17121 )	2024-02-06 11:56:00 -08:00
Mikhail Khludnev	14ff1438e6	nvidia-trt[patch]: propagate InferenceClientException to the caller. (#16936 ) - Description: 1. Propagate InferenceClientException to the caller. 2. Stop the grpc receiver thread on exception. Before the change I got: ``` for token in result_queue: > result_str += token E TypeError: can only concatenate str (not "InferenceServerException") to str ../../langchain_nvidia_trt/llms.py:207: TypeError ``` and the stream thread kept running. After the change, the request thread stops correctly and the caller gets the root-cause exception: ``` E tritonclient.utils.InferenceServerException: [request id: 4529729] expected number of inputs between 2 and 3 but got 10 inputs for model 'vllm_model' ../../langchain_nvidia_trt/llms.py:205: InferenceServerException ``` - Twitter handle: [t.me/mkhl_spb](https://t.me/mkhl_spb) I'm not sure about test coverage: should I set up deep mocks, or is there some kind of triton stub via testcontainers?	2024-02-06 11:47:07 -08:00
Junyoung Park	1ed73f1992	community[minor]: Add SelfQueryRetriever support to PGVector (#16991 ) - Description: Add SelfQueryRetriever support to PGVector - Issue: - - Dependencies: - - Twitter handle: - --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-06 10:50:50 -08:00
Bagatur	cd945e3a5b	core[patch]: Release 0.1.19 (#17117 )	2024-02-06 09:54:22 -08:00
Frank	ef082c77b1	community[minor]: add github file loader to load any github file content b… (#15305 ) ### Description Support loading any GitHub file content based on file extension. Why not use the [git loader](https://python.langchain.com/docs/integrations/document_loaders/git#load-existing-repository-from-disk)? The git loader clones the whole repo even if you are only interested in some of its files, which is too heavy. This GithubFileLoader downloads only the files you are interested in. ### Twitter handle my twitter: @shufanhaotop --------- Co-authored-by: Hao Fan <h_fan@apple.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-06 09:42:33 -08:00
Henry	eaeb8a5f71	langchain[patch]: `output_parser.py` in conversation_chat is customizable (#16945 ) Description: With this modification, users can customize the `FORMAT_INSTRUCTIONS` template and so write their own prompts. As described in [this](https://github.com/langchain-ai/langchain/issues/10721) issue, `FORMAT_INSTRUCTIONS` is not customizable for the output parser unless you create your own `ConvoOutputParser` class. To avoid this, the parser now has a `format_instructions` variable that users can easily customize after initializing the agent. For example: ``` agent = initialize_agent( agent = AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION, tools = tools, llm = llm_agent, verbose = True, max_iterations = 3, early_stopping_method = 'generate', memory = b_w_memory, handle_parsing_errors = True, agent_kwargs={ 'system_message':PREFIX, 'human_message':SUFFIX, 'template_tool_response':TEMPLATE_TOOL_RESPONSE, } ) agent.agent.output_parser.format_instructions = "MY CUSTOM FORMAT INSTRUCTIONS" print(agent.agent.output_parser.get_format_instructions()) MY CUSTOM FORMAT INSTRUCTIONS ``` Other parameters like `system_message`, `human_message`, or `template_tool_response` are already customizable, and with this PR the last parameter, `FORMAT_INSTRUCTIONS` in `langchain.agents.conversational_chat.prompt`, can be modified as well. Issue: https://github.com/langchain-ai/langchain/issues/10721 Dependencies: No new dependencies required for this change Twitter handle: With my github user is enough. Thanks I hope you accept my PR. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-06 09:41:53 -08:00
Ryan Kraus	f027696b5f	community: Added new Utility runnables for NVIDIA Riva. (#15966 ) Please tag this issue with `nvidia_genai` - Description: Added new Runnables for integrating NVIDIA Riva into LCEL chains for Automatic Speech Recognition (ASR) and Text To Speech (TTS). - Issue: N/A - Dependencies: To use these runnables, the NVIDIA Riva client libraries are required. If they are not installed, an error will be raised explaining how to install them. The Runnables can be safely imported without the riva client libraries. - Twitter handle: N/A All of the Riva Runnables are inside a single folder in the Utilities module. In this folder are four files: - common.py - Contains all code that is common to both TTS and ASR - stream.py - Contains a class representing an audio stream that allows the end user to put data into the stream like a queue. - asr.py - Contains the RivaASR runnable - tts.py - Contains the RivaTTS runnable The following Python function is an example of creating a chain that makes use of both of these Runnables: ```python def create( config: Configuration, audio_encoding: RivaAudioEncoding, sample_rate: int, audio_channels: int = 1, ) -> Runnable[ASRInputType, TTSOutputType]: """Create a new instance of the chain.""" _LOGGER.info("Instantiating the chain.") # create the riva asr client riva_asr = RivaASR( url=str(config.riva_asr.service.url), ssl_cert=config.riva_asr.service.ssl_cert, encoding=audio_encoding, audio_channel_count=audio_channels, sample_rate_hertz=sample_rate, profanity_filter=config.riva_asr.profanity_filter, enable_automatic_punctuation=config.riva_asr.enable_automatic_punctuation, language_code=config.riva_asr.language_code, ) # create the prompt template prompt = PromptTemplate.from_template("{user_input}") # model = ChatOpenAI() model = ChatNVIDIA(model="mixtral_8x7b") # type: ignore # create the riva tts client riva_tts = RivaTTS( url=str(config.riva_asr.service.url), ssl_cert=config.riva_asr.service.ssl_cert, output_directory=config.riva_tts.output_directory, language_code=config.riva_tts.language_code, voice_name=config.riva_tts.voice_name, ) # construct and return the chain return {"user_input": riva_asr} | prompt | model | riva_tts # type: ignore ``` The following code is an example of creating a new audio stream for Riva: ```python input_stream = AudioStream(maxsize=1000) # Send bytes into the stream for chunk in audio_chunks: await input_stream.aput(chunk) input_stream.close() ``` The following code is an example of how to execute the chain with RivaASR and RivaTTS: ```python output_stream = asyncio.Queue() while not input_stream.complete: async for chunk in chain.astream(input_stream): await output_stream.put(chunk) ``` Everything should be async safe and thread safe. Audio data can be put into the input stream while the chain is running without interruptions. --------- Co-authored-by: Hayden Wolff <hwolff@nvidia.com> Co-authored-by: Hayden Wolff <hwolff@Haydens-Laptop.local> Co-authored-by: Hayden Wolff <haydenwolff99@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-05 19:50:50 -08:00
François Paupier	929f071513	community[patch]: Fix error in `LlamaCpp` community LLM with Configurable Fields, 'grammar' custom type not available (#16995 ) - Description: Ensure the `LlamaGrammar` custom type is always available when instantiating a `LlamaCpp` LLM - Issue: #16994 - Dependencies: None - Twitter handle: @fpaupier --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-05 17:56:58 -08:00
Leonid Ganeline	563f325034	experimental[patch]: fixed import in `experimental` (#17078 )	2024-02-05 17:47:13 -08:00
Eugene Yurtsev	fbab8baac5	core[patch]: Add astream events config test (#17055 ) Verify that astream events propagates config correctly --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-05 17:24:58 -08:00
Scott Nath	10bd901139	infra: add integration_tests and coverage to MAKEFILE (#17053 ) - Description: update community MAKE file - adds `integration_tests` - adds `coverage` - Issue: moving out of https://github.com/langchain-ai/langchain/pull/17014 - Dependencies: n/a - Twitter handle: @scottnath - Mastodon handle: scottnath@mastodon.social --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-05 16:39:55 -08:00
Giulio Zani	9f0b63dba0	experimental[patch]: Fixes issue #17060 (#17062 ) As described in issue #17060, in the case in which text has only one sentence the following function fails. Checking for that and adding a return case fixed the issue. ```python def split_text(self, text: str) -> List[str]: """Split text into multiple components.""" # Splitting the essay on '.', '?', and '!' single_sentences_list = re.split(r"(?<=[.?!])\s+", text) sentences = [ {"sentence": x, "index": i} for i, x in enumerate(single_sentences_list) ] sentences = combine_sentences(sentences) embeddings = self.embeddings.embed_documents( [x["combined_sentence"] for x in sentences] ) for i, sentence in enumerate(sentences): sentence["combined_sentence_embedding"] = embeddings[i] distances, sentences = calculate_cosine_distances(sentences) start_index = 0 # Create a list to hold the grouped sentences chunks = [] breakpoint_percentile_threshold = 95 breakpoint_distance_threshold = np.percentile( distances, breakpoint_percentile_threshold ) # If you want more chunks, lower the percentile cutoff indices_above_thresh = [ i for i, x in enumerate(distances) if x > breakpoint_distance_threshold ] # The indices of those breakpoints on your list # Iterate through the breakpoints to slice the sentences for index in indices_above_thresh: # The end index is the current breakpoint end_index = index # Slice the sentence_dicts from the current start index to the end index group = sentences[start_index : end_index + 1] combined_text = " ".join([d["sentence"] for d in group]) chunks.append(combined_text) # Update the start index for the next group start_index = index + 1 # The last group, if any sentences remain if start_index < len(sentences): combined_text = " ".join([d["sentence"] for d in sentences[start_index:]]) chunks.append(combined_text) return chunks ``` Co-authored-by: Giulio Zani <salamanderxing@Giulios-MBP.homenet.telecomitalia.it>	2024-02-05 16:18:57 -08:00
Jimmy Moore	912210ac19	core[patch]: fix _sql_record_manager mypy for #17048 (#17073 ) - Description: Add type annotations for the relevant session and query objects to resolve mypy errors when `# type: ignore` comments are removed. - Issue: #17048 - Dependencies: None - Twitter handle: [clesiemo3](https://twitter.com/clesiemo3) I attempted to solve the `UpsertionRecord` ignore, but from my understanding it would require adding a deprecated plugin or moving completely to sqlalchemy 2.0+. I'm assuming this is not desired at this point in time.	2024-02-05 16:18:40 -08:00
William FH	3d5e988c55	Add prompt metadata + tags (#17054 )	2024-02-05 16:17:31 -08:00
Bagatur	6e2ed9671f	infra: fix breebs test lint (#17075 )	2024-02-05 16:09:48 -08:00
T Cramer	cf01fc3790	docs: update parse_partial_json source info (#17036 ) - Description: Update source-link following recent license update at open-interpreter project - Issue: N/A - Dependencies: None	2024-02-05 15:54:34 -08:00
Alex Boury	334b6ebdf3	community[minor]: Breebs docs retriever (#16578 ) - Description: Implementation of breeb retriever with integration tests -> libs/community/tests/integration_tests/retrievers/test_breebs.py and documentation (notebook) -> docs/docs/integrations/retrievers/breebs.ipynb. - Dependencies: None	2024-02-05 15:51:08 -08:00
Serena Ruan	9b279ac127	community[patch]: MLflow callback update (#16687 ) Signed-off-by: Serena Ruan <serena.rxy@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-05 15:46:46 -08:00
Mohammad Mohtashim	3c4b24b69a	community[patch]: Fix the _call of HuggingFaceHub (#16891 ) Fixed the following identified issue: #16849 @baskaryan	2024-02-05 15:34:42 -08:00
Tyler Titsworth	304f3f5fc1	community[patch]: Add Progress bar to HuggingFaceEmbeddings (#16758 ) - Description: Adds a parameter to HuggingFaceEmbeddings called `show_progress` that, when set, enables a `tqdm` progress bar. Does not function if `multi_process = True`. - Issue: n/a - Dependencies: n/a	2024-02-05 14:33:34 -08:00
Supreet Takkar	ae33979813	community[patch]: Allow adding ARNs as model_id to support Amazon Bedrock custom models (#16800 ) - Description: Adds an additional class variable to `BedrockBase` called `provider` that allows sending a model provider such as amazon, cohere, ai21, etc. Up until now, the model provider has been extracted from the `model_id` using the first part before the `.`, such as `amazon` for `amazon.titan-text-express-v1` (see [supported list of Bedrock model IDs here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html)). But for custom Bedrock models, where the ARN of the provisioned throughput must be supplied, the `model_id` looks like `arn:aws:bedrock:...`, so the provider cannot be extracted from it. A model `provider` is required by the LangChain Bedrock class to perform model-based processing. Passing this `provider` argument allows the same processing to be performed for custom models of a specific base model type. The alternative considered here was the use of `provider.arn:aws:bedrock:...`, which then requires the ARN to be extracted and passed separately when invoking the model. The proposed solution here is simpler and does not cause issues for current models already using the Bedrock class. - Issue: N/A - Dependencies: N/A --------- Co-authored-by: Piyush Jain <piyushjain@duck.com>	2024-02-05 14:28:03 -08:00
T Cramer	e022bfaa7d	langchain: add partial parsing support to JsonOutputToolsParser (#17035 ) - Description: Add partial parsing support to JsonOutputToolsParser - Issue: [16736](https://github.com/langchain-ai/langchain/issues/16736) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-05 14:18:30 -08:00
calvinweb	dcf973c22c	Langchain: `json_chat` don't need stop sequences (#16335 ) This is a PR about #16334. The stop sequences aren't meaningful in `json_chat` because it depends on JSON to work, not completions. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-05 14:18:16 -08:00
Bagatur	66e45e8ab7	community[patch]: chat model mypy fixes (#17061 ) Related to #17048	2024-02-05 13:42:59 -08:00
Bagatur	d93de71d08	community[patch]: chat message history mypy fixes (#17059 ) Related to #17048	2024-02-05 13:13:25 -08:00
Bagatur	af5ae24af2	community[patch]: callbacks mypy fixes (#17058 ) Related to #17048	2024-02-05 12:37:27 -08:00
Vadim Kudlay	75b6fa1134	nvidia-ai-endpoints[patch]: Support User-Agent metadata and minor fixes. (#16942 ) - Description: Several meta/usability updates, including User-Agent. - Issue: - User-Agent metadata for tracking connector engagement. @milesial please check and advise. - Better error messages. Tries harder to find a request ID. @milesial requested. - Client-side image resizing for multimodal models. Hope to upgrade to Assets API solution in around a month. - `client.payload_fn` allows you to modify payload before network request. Use-case shown in doc notebook for kosmos_2. - `client.last_inputs` put back in to allow for advanced support/debugging. - Dependencies: - Attempts to pull in PIL for image resizing. If not installed, prints out "please install" message, warns it might fail, and then tries without resizing. We are waiting on a more permanent solution. For LC viz: @hinthornw For NV viz: @fciannella @milesial @vinaybagade --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-05 12:24:53 -08:00
Nuno Campos	ae56fd020a	Fix condition on custom root type in runnable history (#17017 )	2024-02-05 12:15:11 -08:00
Nuno Campos	f0ffebb944	Shield callback methods from cancellation: Fix interrupted runs marked as pending forever (#17010 )	2024-02-05 12:09:47 -08:00
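
Commit 302989a2b1 above makes the newline after the opening backticks optional in the JSON agent parser. The idea can be sketched as follows. This is a minimal illustration, not the pattern used in LangChain: `ACTION_BLOCK`, `parse_action`, and the sample payload are hypothetical names and data chosen for this example.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical pattern (not copied from LangChain): an optional "json" tag,
# then an OPTIONAL newline, the payload, and an optional newline before the
# closing fence.
ACTION_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?[ \t]*\n?(.*?)\n?[ \t]*" + re.escape(FENCE),
    re.DOTALL,
)


def parse_action(text: str) -> dict:
    """Return the JSON object found inside the first fenced block of `text`."""
    match = ACTION_BLOCK.search(text)
    if match is None:
        raise ValueError("no fenced action block found")
    return json.loads(match.group(1))


payload = '{"action": "search", "action_input": "langchain"}'
with_newline = parse_action(f"{FENCE}json\n{payload}\n{FENCE}")
without_newline = parse_action(f"{FENCE}{payload}{FENCE}")
```

Both calls yield the same dictionary, so a model response that omits the newline after the fence no longer fails to parse under this sketch.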

2789 Commits
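
Commit ae33979813 in the list above describes deriving the Bedrock provider from the part of `model_id` before the first `.`, with an explicit `provider` needed when `model_id` is an ARN. A rough sketch of that rule follows. `resolve_provider` is a hypothetical helper written for this illustration, not the `BedrockBase` implementation, and the ARN is a made-up placeholder.

```python
from typing import Optional


def resolve_provider(model_id: str, provider: Optional[str] = None) -> str:
    """Pick the model provider, preferring an explicitly passed value."""
    if provider is not None:
        return provider
    if model_id.startswith("arn:"):
        # An ARN carries no "<provider>." prefix, so nothing can be inferred.
        raise ValueError("model_id is an ARN; pass `provider` explicitly")
    # Standard model IDs look like "<provider>.<model-name>".
    return model_id.split(".")[0]


standard = resolve_provider("amazon.titan-text-express-v1")
custom = resolve_provider(
    "arn:aws:bedrock:us-east-1:123456789012:provisioned-model/example",  # placeholder ARN
    provider="cohere",
)
```

Raising on an ARN without a provider is a design choice of this sketch; it surfaces the missing argument early instead of silently treating `arn:aws:bedrock:...` as a provider name.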