2024-02-26 05:57:26 +00:00
|
|
|
"""Test ChatAnthropic chat model."""
|
|
|
|
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
import json
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
from typing import List, Optional
|
2024-02-26 05:57:26 +00:00
|
|
|
|
2024-05-16 17:56:29 +00:00
|
|
|
import pytest
|
2024-02-26 05:57:26 +00:00
|
|
|
from langchain_core.callbacks import CallbackManager
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
from langchain_core.messages import (
|
|
|
|
AIMessage,
|
|
|
|
AIMessageChunk,
|
|
|
|
BaseMessage,
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
BaseMessageChunk,
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
HumanMessage,
|
2024-04-17 19:37:04 +00:00
|
|
|
SystemMessage,
|
|
|
|
ToolMessage,
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
)
|
2024-02-26 05:57:26 +00:00
|
|
|
from langchain_core.outputs import ChatGeneration, LLMResult
|
2023-12-20 02:55:19 +00:00
|
|
|
from langchain_core.prompts import ChatPromptTemplate
|
2024-05-16 17:56:29 +00:00
|
|
|
from langchain_core.pydantic_v1 import BaseModel, Field
|
2024-04-17 19:37:04 +00:00
|
|
|
from langchain_core.tools import tool
|
2023-12-20 02:55:19 +00:00
|
|
|
|
2024-02-26 05:57:26 +00:00
|
|
|
from langchain_anthropic import ChatAnthropic, ChatAnthropicMessages
|
|
|
|
from tests.unit_tests._utils import FakeCallbackHandler
|
2023-12-20 02:55:19 +00:00
|
|
|
|
2024-03-04 15:03:51 +00:00
|
|
|
MODEL_NAME = "claude-3-sonnet-20240229"
|
|
|
|
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
def test_stream() -> None:
|
Fix: fix partners name typo in tests (#15066)
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
2023-12-22 19:48:39 +00:00
|
|
|
"""Test streaming tokens from Anthropic."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
full: Optional[BaseMessageChunk] = None
|
|
|
|
chunks_with_input_token_counts = 0
|
|
|
|
chunks_with_output_token_counts = 0
|
2023-12-20 02:55:19 +00:00
|
|
|
for token in llm.stream("I'm Pickle Rick"):
|
|
|
|
assert isinstance(token.content, str)
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
full = token if full is None else full + token
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
if token.usage_metadata is not None:
|
|
|
|
if token.usage_metadata.get("input_tokens"):
|
|
|
|
chunks_with_input_token_counts += 1
|
|
|
|
elif token.usage_metadata.get("output_tokens"):
|
|
|
|
chunks_with_output_token_counts += 1
|
|
|
|
if chunks_with_input_token_counts != 1 or chunks_with_output_token_counts != 1:
|
|
|
|
raise AssertionError(
|
|
|
|
"Expected exactly one chunk with input or output token counts. "
|
|
|
|
"AIMessageChunk aggregation adds counts. Check that "
|
|
|
|
"this is behaving properly."
|
|
|
|
)
|
|
|
|
# check token usage is populated
|
|
|
|
assert isinstance(full, AIMessageChunk)
|
|
|
|
assert full.usage_metadata is not None
|
|
|
|
assert full.usage_metadata["input_tokens"] > 0
|
|
|
|
assert full.usage_metadata["output_tokens"] > 0
|
|
|
|
assert full.usage_metadata["total_tokens"] > 0
|
|
|
|
assert (
|
|
|
|
full.usage_metadata["input_tokens"] + full.usage_metadata["output_tokens"]
|
|
|
|
== full.usage_metadata["total_tokens"]
|
|
|
|
)
|
2024-07-02 19:16:20 +00:00
|
|
|
assert "stop_reason" in full.response_metadata
|
|
|
|
assert "stop_sequence" in full.response_metadata
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
|
|
|
|
async def test_astream() -> None:
|
Fix: fix partners name typo in tests (#15066)
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
2023-12-22 19:48:39 +00:00
|
|
|
"""Test streaming tokens from Anthropic."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
full: Optional[BaseMessageChunk] = None
|
|
|
|
chunks_with_input_token_counts = 0
|
|
|
|
chunks_with_output_token_counts = 0
|
2023-12-20 02:55:19 +00:00
|
|
|
async for token in llm.astream("I'm Pickle Rick"):
|
|
|
|
assert isinstance(token.content, str)
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
full = token if full is None else full + token
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
if token.usage_metadata is not None:
|
|
|
|
if token.usage_metadata.get("input_tokens"):
|
|
|
|
chunks_with_input_token_counts += 1
|
|
|
|
elif token.usage_metadata.get("output_tokens"):
|
|
|
|
chunks_with_output_token_counts += 1
|
|
|
|
if chunks_with_input_token_counts != 1 or chunks_with_output_token_counts != 1:
|
|
|
|
raise AssertionError(
|
|
|
|
"Expected exactly one chunk with input or output token counts. "
|
|
|
|
"AIMessageChunk aggregation adds counts. Check that "
|
|
|
|
"this is behaving properly."
|
|
|
|
)
|
|
|
|
# check token usage is populated
|
|
|
|
assert isinstance(full, AIMessageChunk)
|
|
|
|
assert full.usage_metadata is not None
|
|
|
|
assert full.usage_metadata["input_tokens"] > 0
|
|
|
|
assert full.usage_metadata["output_tokens"] > 0
|
|
|
|
assert full.usage_metadata["total_tokens"] > 0
|
|
|
|
assert (
|
|
|
|
full.usage_metadata["input_tokens"] + full.usage_metadata["output_tokens"]
|
|
|
|
== full.usage_metadata["total_tokens"]
|
|
|
|
)
|
2024-07-02 19:16:20 +00:00
|
|
|
assert "stop_reason" in full.response_metadata
|
|
|
|
assert "stop_sequence" in full.response_metadata
|
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.
There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
for text in stream.text_stream:
snapshot = stream.current_message_snapshot
print(f"{count}: {snapshot.usage} -- {text}")
count = count + 1
final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) -- How
4: Usage(input_tokens=8, output_tokens=1) -- can
5: Usage(input_tokens=8, output_tokens=1) -- I
6: Usage(input_tokens=8, output_tokens=1) -- assist
7: Usage(input_tokens=8, output_tokens=1) -- you
8: Usage(input_tokens=8, output_tokens=1) -- today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.
2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
print(f"{count}: {event}")
count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```
Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.
This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.
Usage:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
model="claude-3-haiku-20240307",
temperature=0
)
full = None
for chunk in model.stream("hi"):
full = chunk if full is None else full + chunk
print(chunk)
print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}
Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
|
|
|
|
|
|
|
# test usage metadata can be excluded
|
|
|
|
model = ChatAnthropic(model_name=MODEL_NAME, stream_usage=False) # type: ignore[call-arg]
|
|
|
|
async for token in model.astream("hi"):
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
assert token.usage_metadata is None
|
|
|
|
# check we override with kwarg
|
|
|
|
model = ChatAnthropic(model_name=MODEL_NAME) # type: ignore[call-arg]
|
|
|
|
assert model.stream_usage
|
|
|
|
async for token in model.astream("hi", stream_usage=False):
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
assert token.usage_metadata is None
|
|
|
|
|
|
|
|
# Check expected raw API output
|
|
|
|
async_client = model._async_client
|
|
|
|
params: dict = {
|
|
|
|
"model": "claude-3-haiku-20240307",
|
|
|
|
"max_tokens": 1024,
|
|
|
|
"messages": [{"role": "user", "content": "hi"}],
|
|
|
|
"temperature": 0.0,
|
|
|
|
}
|
|
|
|
stream = await async_client.messages.create(**params, stream=True)
|
|
|
|
async for event in stream:
|
|
|
|
if event.type == "message_start":
|
|
|
|
assert event.message.usage.input_tokens > 1
|
|
|
|
# Note: this single output token included in message start event
|
|
|
|
# does not appear to contribute to overall output token counts. It
|
|
|
|
# is excluded from the total token count.
|
|
|
|
assert event.message.usage.output_tokens == 1
|
|
|
|
elif event.type == "message_delta":
|
|
|
|
assert event.usage.output_tokens > 1
|
|
|
|
else:
|
|
|
|
pass
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
|
|
|
|
async def test_abatch() -> None:
|
|
|
|
"""Test streaming tokens from ChatAnthropicMessages."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
result = await llm.abatch(["I'm Pickle Rick", "I'm not Pickle Rick"])
|
|
|
|
for token in result:
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
async def test_abatch_tags() -> None:
|
|
|
|
"""Test batch tokens from ChatAnthropicMessages."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
result = await llm.abatch(
|
|
|
|
["I'm Pickle Rick", "I'm not Pickle Rick"], config={"tags": ["foo"]}
|
|
|
|
)
|
|
|
|
for token in result:
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
|
|
|
|
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
async def test_async_tool_use() -> None:
|
|
|
|
llm = ChatAnthropic( # type: ignore[call-arg]
|
|
|
|
model=MODEL_NAME,
|
|
|
|
)
|
|
|
|
|
|
|
|
llm_with_tools = llm.bind_tools(
|
|
|
|
[
|
|
|
|
{
|
|
|
|
"name": "get_weather",
|
|
|
|
"description": "Get weather report for a city",
|
|
|
|
"input_schema": {
|
|
|
|
"type": "object",
|
|
|
|
"properties": {"location": {"type": "string"}},
|
|
|
|
},
|
|
|
|
}
|
|
|
|
]
|
|
|
|
)
|
|
|
|
response = await llm_with_tools.ainvoke("what's the weather in san francisco, ca")
|
|
|
|
assert isinstance(response, AIMessage)
|
|
|
|
assert isinstance(response.content, list)
|
|
|
|
assert isinstance(response.tool_calls, list)
|
|
|
|
assert len(response.tool_calls) == 1
|
|
|
|
tool_call = response.tool_calls[0]
|
|
|
|
assert tool_call["name"] == "get_weather"
|
|
|
|
assert isinstance(tool_call["args"], dict)
|
|
|
|
assert "location" in tool_call["args"]
|
|
|
|
|
|
|
|
# Test streaming
|
|
|
|
first = True
|
|
|
|
chunks = [] # type: ignore
|
|
|
|
async for chunk in llm_with_tools.astream(
|
|
|
|
"what's the weather in san francisco, ca"
|
|
|
|
):
|
|
|
|
chunks = chunks + [chunk]
|
|
|
|
if first:
|
|
|
|
gathered = chunk
|
|
|
|
first = False
|
|
|
|
else:
|
|
|
|
gathered = gathered + chunk # type: ignore
|
|
|
|
assert len(chunks) > 1
|
|
|
|
assert isinstance(gathered, AIMessageChunk)
|
|
|
|
assert isinstance(gathered.tool_call_chunks, list)
|
|
|
|
assert len(gathered.tool_call_chunks) == 1
|
|
|
|
tool_call_chunk = gathered.tool_call_chunks[0]
|
|
|
|
assert tool_call_chunk["name"] == "get_weather"
|
|
|
|
assert isinstance(tool_call_chunk["args"], str)
|
|
|
|
assert "location" in json.loads(tool_call_chunk["args"])
|
|
|
|
|
|
|
|
|
2023-12-20 02:55:19 +00:00
|
|
|
def test_batch() -> None:
|
|
|
|
"""Test batch tokens from ChatAnthropicMessages."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
result = llm.batch(["I'm Pickle Rick", "I'm not Pickle Rick"])
|
|
|
|
for token in result:
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
async def test_ainvoke() -> None:
|
|
|
|
"""Test invoke tokens from ChatAnthropicMessages."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
result = await llm.ainvoke("I'm Pickle Rick", config={"tags": ["foo"]})
|
|
|
|
assert isinstance(result.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
def test_invoke() -> None:
|
|
|
|
"""Test invoke tokens from ChatAnthropicMessages."""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
result = llm.invoke("I'm Pickle Rick", config=dict(tags=["foo"]))
|
|
|
|
assert isinstance(result.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
def test_system_invoke() -> None:
|
|
|
|
"""Test invoke tokens with a system message"""
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages(model_name=MODEL_NAME) # type: ignore[call-arg, call-arg]
|
2023-12-20 02:55:19 +00:00
|
|
|
|
|
|
|
prompt = ChatPromptTemplate.from_messages(
|
|
|
|
[
|
|
|
|
(
|
|
|
|
"system",
|
|
|
|
"You are an expert cartographer. If asked, you are a cartographer. "
|
|
|
|
"STAY IN CHARACTER",
|
|
|
|
),
|
|
|
|
("human", "Are you a mathematician?"),
|
|
|
|
]
|
|
|
|
)
|
|
|
|
|
|
|
|
chain = prompt | llm
|
|
|
|
|
|
|
|
result = chain.invoke({})
|
|
|
|
assert isinstance(result.content, str)
|
2024-02-26 05:57:26 +00:00
|
|
|
|
|
|
|
|
|
|
|
def test_anthropic_call() -> None:
|
|
|
|
"""Test valid call to anthropic."""
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic(model="test") # type: ignore[call-arg]
|
2024-02-26 05:57:26 +00:00
|
|
|
message = HumanMessage(content="Hello")
|
2024-04-24 23:39:23 +00:00
|
|
|
response = chat.invoke([message])
|
2024-02-26 05:57:26 +00:00
|
|
|
assert isinstance(response, AIMessage)
|
|
|
|
assert isinstance(response.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
def test_anthropic_generate() -> None:
|
|
|
|
"""Test generate method of anthropic."""
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic(model="test") # type: ignore[call-arg]
|
2024-02-26 05:57:26 +00:00
|
|
|
chat_messages: List[List[BaseMessage]] = [
|
|
|
|
[HumanMessage(content="How many toes do dogs have?")]
|
|
|
|
]
|
|
|
|
messages_copy = [messages.copy() for messages in chat_messages]
|
|
|
|
result: LLMResult = chat.generate(chat_messages)
|
|
|
|
assert isinstance(result, LLMResult)
|
|
|
|
for response in result.generations[0]:
|
|
|
|
assert isinstance(response, ChatGeneration)
|
|
|
|
assert isinstance(response.text, str)
|
|
|
|
assert response.text == response.message.content
|
|
|
|
assert chat_messages == messages_copy
|
|
|
|
|
|
|
|
|
|
|
|
def test_anthropic_streaming() -> None:
|
|
|
|
"""Test streaming tokens from anthropic."""
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic(model="test") # type: ignore[call-arg]
|
2024-02-26 05:57:26 +00:00
|
|
|
message = HumanMessage(content="Hello")
|
|
|
|
response = chat.stream([message])
|
|
|
|
for token in response:
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
|
|
|
|
|
|
|
|
def test_anthropic_streaming_callback() -> None:
|
|
|
|
"""Test that streaming correctly invokes on_llm_new_token callback."""
|
|
|
|
callback_handler = FakeCallbackHandler()
|
|
|
|
callback_manager = CallbackManager([callback_handler])
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic( # type: ignore[call-arg]
|
2024-02-26 05:57:26 +00:00
|
|
|
model="test",
|
|
|
|
callback_manager=callback_manager,
|
|
|
|
verbose=True,
|
|
|
|
)
|
|
|
|
message = HumanMessage(content="Write me a sentence with 10 words.")
|
|
|
|
for token in chat.stream([message]):
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
assert callback_handler.llm_streams > 1
|
|
|
|
|
|
|
|
|
|
|
|
async def test_anthropic_async_streaming_callback() -> None:
|
|
|
|
"""Test that streaming correctly invokes on_llm_new_token callback."""
|
|
|
|
callback_handler = FakeCallbackHandler()
|
|
|
|
callback_manager = CallbackManager([callback_handler])
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic( # type: ignore[call-arg]
|
2024-02-26 05:57:26 +00:00
|
|
|
model="test",
|
|
|
|
callback_manager=callback_manager,
|
|
|
|
verbose=True,
|
|
|
|
)
|
|
|
|
chat_messages: List[BaseMessage] = [
|
|
|
|
HumanMessage(content="How many toes do dogs have?")
|
|
|
|
]
|
|
|
|
async for token in chat.astream(chat_messages):
|
|
|
|
assert isinstance(token, AIMessageChunk)
|
|
|
|
assert isinstance(token.content, str)
|
|
|
|
assert callback_handler.llm_streams > 1
|
2024-03-05 01:50:13 +00:00
|
|
|
|
|
|
|
|
|
|
|
def test_anthropic_multimodal() -> None:
|
|
|
|
"""Test that multimodal inputs are handled correctly."""
|
2024-06-03 15:21:55 +00:00
|
|
|
chat = ChatAnthropic(model=MODEL_NAME) # type: ignore[call-arg]
|
2024-03-05 01:50:13 +00:00
|
|
|
messages = [
|
|
|
|
HumanMessage(
|
|
|
|
content=[
|
|
|
|
{
|
|
|
|
"type": "image_url",
|
|
|
|
"image_url": {
|
|
|
|
# langchain logo
|
|
|
|
"url": "
|
|
|
|
},
|
|
|
|
},
|
|
|
|
{"type": "text", "text": "What is this a logo for?"},
|
|
|
|
]
|
|
|
|
)
|
|
|
|
]
|
|
|
|
response = chat.invoke(messages)
|
|
|
|
assert isinstance(response, AIMessage)
|
|
|
|
assert isinstance(response.content, str)
|
2024-03-08 21:32:57 +00:00
|
|
|
|
|
|
|
|
|
|
|
def test_streaming() -> None:
|
|
|
|
"""Test streaming tokens from Anthropic."""
|
|
|
|
callback_handler = FakeCallbackHandler()
|
|
|
|
callback_manager = CallbackManager([callback_handler])
|
|
|
|
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages( # type: ignore[call-arg, call-arg]
|
2024-03-08 21:32:57 +00:00
|
|
|
model_name=MODEL_NAME, streaming=True, callback_manager=callback_manager
|
|
|
|
)
|
|
|
|
|
|
|
|
response = llm.generate([[HumanMessage(content="I'm Pickle Rick")]])
|
|
|
|
assert callback_handler.llm_streams > 0
|
|
|
|
assert isinstance(response, LLMResult)
|
|
|
|
|
|
|
|
|
|
|
|
async def test_astreaming() -> None:
|
|
|
|
"""Test streaming tokens from Anthropic."""
|
|
|
|
callback_handler = FakeCallbackHandler()
|
|
|
|
callback_manager = CallbackManager([callback_handler])
|
|
|
|
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropicMessages( # type: ignore[call-arg, call-arg]
|
2024-03-08 21:32:57 +00:00
|
|
|
model_name=MODEL_NAME, streaming=True, callback_manager=callback_manager
|
|
|
|
)
|
|
|
|
|
|
|
|
response = await llm.agenerate([[HumanMessage(content="I'm Pickle Rick")]])
|
|
|
|
assert callback_handler.llm_streams > 0
|
|
|
|
assert isinstance(response, LLMResult)
|
2024-04-04 20:22:48 +00:00
|
|
|
|
|
|
|
|
|
|
|
def test_tool_use() -> None:
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropic( # type: ignore[call-arg]
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
model=MODEL_NAME,
|
2024-04-04 20:22:48 +00:00
|
|
|
)
|
|
|
|
|
|
|
|
llm_with_tools = llm.bind_tools(
|
|
|
|
[
|
|
|
|
{
|
|
|
|
"name": "get_weather",
|
|
|
|
"description": "Get weather report for a city",
|
|
|
|
"input_schema": {
|
|
|
|
"type": "object",
|
|
|
|
"properties": {"location": {"type": "string"}},
|
|
|
|
},
|
|
|
|
}
|
|
|
|
]
|
|
|
|
)
|
|
|
|
response = llm_with_tools.invoke("what's the weather in san francisco, ca")
|
|
|
|
assert isinstance(response, AIMessage)
|
|
|
|
assert isinstance(response.content, list)
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
assert isinstance(response.tool_calls, list)
|
|
|
|
assert len(response.tool_calls) == 1
|
|
|
|
tool_call = response.tool_calls[0]
|
|
|
|
assert tool_call["name"] == "get_weather"
|
|
|
|
assert isinstance(tool_call["args"], dict)
|
|
|
|
assert "location" in tool_call["args"]
|
|
|
|
|
|
|
|
# Test streaming
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
input = "how are you? what's the weather in san francisco, ca"
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
first = True
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
chunks = [] # type: ignore
|
|
|
|
for chunk in llm_with_tools.stream(input):
|
|
|
|
chunks = chunks + [chunk]
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
if first:
|
|
|
|
gathered = chunk
|
|
|
|
first = False
|
|
|
|
else:
|
|
|
|
gathered = gathered + chunk # type: ignore
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
assert len(chunks) > 1
|
|
|
|
assert isinstance(gathered.content, list)
|
|
|
|
assert len(gathered.content) == 2
|
|
|
|
tool_use_block = None
|
|
|
|
for content_block in gathered.content:
|
|
|
|
assert isinstance(content_block, dict)
|
|
|
|
if content_block["type"] == "tool_use":
|
|
|
|
tool_use_block = content_block
|
|
|
|
break
|
|
|
|
assert tool_use_block is not None
|
|
|
|
assert tool_use_block["name"] == "get_weather"
|
|
|
|
assert "location" in json.loads(tool_use_block["partial_json"])
|
core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
name: str
args: Dict[str, Any]
id: Optional[str]
class InvalidToolCall(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
error: Optional[str]
class ToolCallChunk(TypedDict):
name: Optional[str]
args: Optional[str]
id: Optional[str]
index: Optional[int]
class AIMessage(BaseMessage):
...
tool_calls: List[ToolCall] = []
invalid_tool_calls: List[InvalidToolCall] = []
...
class AIMessageChunk(AIMessage, BaseMessageChunk):
...
tool_call_chunks: Optional[List[ToolCallChunk]] = None
...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 23:41:42 +00:00
|
|
|
assert isinstance(gathered, AIMessageChunk)
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
assert isinstance(gathered.tool_calls, list)
|
|
|
|
assert len(gathered.tool_calls) == 1
|
|
|
|
tool_call = gathered.tool_calls[0]
|
|
|
|
assert tool_call["name"] == "get_weather"
|
|
|
|
assert isinstance(tool_call["args"], dict)
|
|
|
|
assert "location" in tool_call["args"]
|
|
|
|
assert tool_call["id"] is not None
|
|
|
|
|
|
|
|
# Test passing response back to model
|
|
|
|
stream = llm_with_tools.stream(
|
|
|
|
[
|
|
|
|
input,
|
|
|
|
gathered,
|
|
|
|
ToolMessage(content="sunny and warm", tool_call_id=tool_call["id"]),
|
|
|
|
]
|
|
|
|
)
|
|
|
|
chunks = [] # type: ignore
|
|
|
|
first = True
|
|
|
|
for chunk in stream:
|
|
|
|
chunks = chunks + [chunk]
|
|
|
|
if first:
|
|
|
|
gathered = chunk
|
|
|
|
first = False
|
|
|
|
else:
|
|
|
|
gathered = gathered + chunk # type: ignore
|
|
|
|
assert len(chunks) > 1
|
2024-04-04 20:22:48 +00:00
|
|
|
|
|
|
|
|
2024-04-17 19:37:04 +00:00
|
|
|
def test_anthropic_with_empty_text_block() -> None:
|
|
|
|
"""Anthropic SDK can return an empty text block."""
|
|
|
|
|
|
|
|
@tool
|
|
|
|
def type_letter(letter: str) -> str:
|
|
|
|
"""Type the given letter."""
|
|
|
|
return "OK"
|
|
|
|
|
2024-06-03 15:21:55 +00:00
|
|
|
model = ChatAnthropic(model="claude-3-opus-20240229", temperature=0).bind_tools( # type: ignore[call-arg]
|
2024-04-17 19:37:04 +00:00
|
|
|
[type_letter]
|
|
|
|
)
|
|
|
|
|
|
|
|
messages = [
|
|
|
|
SystemMessage(
|
|
|
|
content="Repeat the given string using the provided tools. Do not write "
|
|
|
|
"anything else or provide any explanations. For example, "
|
|
|
|
"if the string is 'abc', you must print the "
|
|
|
|
"letters 'a', 'b', and 'c' one at a time and in that order. "
|
|
|
|
),
|
|
|
|
HumanMessage(content="dog"),
|
|
|
|
AIMessage(
|
|
|
|
content=[
|
|
|
|
{"text": "", "type": "text"},
|
|
|
|
{
|
|
|
|
"id": "toolu_01V6d6W32QGGSmQm4BT98EKk",
|
|
|
|
"input": {"letter": "d"},
|
|
|
|
"name": "type_letter",
|
|
|
|
"type": "tool_use",
|
|
|
|
},
|
|
|
|
],
|
|
|
|
tool_calls=[
|
|
|
|
{
|
|
|
|
"name": "type_letter",
|
|
|
|
"args": {"letter": "d"},
|
|
|
|
"id": "toolu_01V6d6W32QGGSmQm4BT98EKk",
|
|
|
|
},
|
|
|
|
],
|
|
|
|
),
|
|
|
|
ToolMessage(content="OK", tool_call_id="toolu_01V6d6W32QGGSmQm4BT98EKk"),
|
|
|
|
]
|
|
|
|
|
|
|
|
model.invoke(messages)
|
|
|
|
|
|
|
|
|
2024-04-04 20:22:48 +00:00
|
|
|
def test_with_structured_output() -> None:
|
2024-06-03 15:21:55 +00:00
|
|
|
llm = ChatAnthropic( # type: ignore[call-arg]
|
2024-04-04 20:22:48 +00:00
|
|
|
model="claude-3-opus-20240229",
|
|
|
|
)
|
|
|
|
|
|
|
|
structured_llm = llm.with_structured_output(
|
|
|
|
{
|
|
|
|
"name": "get_weather",
|
|
|
|
"description": "Get weather report for a city",
|
|
|
|
"input_schema": {
|
|
|
|
"type": "object",
|
|
|
|
"properties": {"location": {"type": "string"}},
|
|
|
|
},
|
|
|
|
}
|
|
|
|
)
|
|
|
|
response = structured_llm.invoke("what's the weather in san francisco, ca")
|
|
|
|
assert isinstance(response, dict)
|
|
|
|
assert response["location"]
|
2024-05-16 17:56:29 +00:00
|
|
|
|
|
|
|
|
|
|
|
class GetWeather(BaseModel):
|
|
|
|
"""Get the current weather in a given location"""
|
|
|
|
|
|
|
|
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
|
|
|
|
|
|
|
|
|
|
|
|
@pytest.mark.parametrize("tool_choice", ["GetWeather", "auto", "any"])
|
|
|
|
def test_anthropic_bind_tools_tool_choice(tool_choice: str) -> None:
|
2024-06-03 15:21:55 +00:00
|
|
|
chat_model = ChatAnthropic( # type: ignore[call-arg]
|
anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.
One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.
We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?
If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?
CC @ccurme @efriis @baskaryan
2024-06-14 16:14:43 +00:00
|
|
|
model=MODEL_NAME,
|
2024-05-16 17:56:29 +00:00
|
|
|
)
|
|
|
|
chat_model_with_tools = chat_model.bind_tools([GetWeather], tool_choice=tool_choice)
|
|
|
|
response = chat_model_with_tools.invoke("what's the weather in ny and la")
|
|
|
|
assert isinstance(response, AIMessage)
|