**Updated ChatHuggingFace doc string as per issue #22296**:
"langchain_huggingface: updated docstring for ChatHuggingFace in
langchain_huggingface to match that of the description (in the appendix)
provided in issue #22296. "
**Issue:** This PR is in response to issue #22296, and more specifically
ChatHuggingFace model. In particular, this PR updates the docstring for
langchain/libs/partners/hugging_face/langchain_huggingface/chat_models/huggingface.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, and Response metadata. I used the template from the
Anthropic implementation and referenced the Appendix of the original
issue post. I also noted that: langchain_community hugging face llms do
not work with langchain_huggingface's ChatHuggingFace model (at least
for me); the .stream(messages) functionality of ChatHuggingFace only
returned a block of response.
---------
Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
## Description
The `path` param is used to specify the local persistence directory,
which isn't required if using Qdrant server.
This is a breaking but necessary change.
The response.get("model", self.model_name) checks if the model key
exists in the response dictionary. If it does, it uses that value;
otherwise, it uses self.model_name.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
langchain-together depends on langchain-openai ^0.1.8
langchain-openai 0.1.8 has langchain-core >= 0.2.2
Here we bump langchain-core to 0.2.2, just to pass minimum dependency
version tests.
Thank you for contributing to LangChain!
**Description:** Adds Langchain support for Nomic Embed Vision
**Twitter handle:** nomic_ai,zach_nussbaum
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- [ ] **Packages affected**:
- community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7
- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of https://github.com/langchain-ai/langchain/issues/18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
- **Issue:** #19905
- **Dependencies:** No
- **Twitter handle:** https://x.com/GetzJoydeep
<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">
- [ ] **Considerations**:
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Added support for using HuggingFacePipeline in
ChatHuggingFace (previously it was only usable with API endpoints,
probably by oversight).
- **Issue:** #19997
- **Dependencies:** none
- **Twitter handle:** none
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
ChatOpenAI supports a kwarg `stream_options` which can take values
`{"include_usage": True}` and `{"include_usage": False}`.
Setting include_usage to True adds a message chunk to the end of the
stream with usage_metadata populated. In this case the final chunk no
longer includes `"finish_reason"` in the `response_metadata`. This is
the current default and is not yet released. Because this could be
disruptive to workflows, here we remove this default. The default will
now be consistent with OpenAI's API (see parameter
[here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)).
Examples:
```python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
for chunk in llm.stream("hi"):
print(chunk)
```
```
content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
```
```python
for chunk in llm.stream("hi", stream_options={"include_usage": True}):
print(chunk)
```
```
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```
```python
llm = ChatOpenAI().bind(stream_options={"include_usage": True})
for chunk in llm.stream("hi"):
print(chunk)
```
```
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```