langchain/docs/snippets/modules/model_io/models/llms/how_to/streaming_llm.mdx

Currently, we support streaming for a broad range of LLM implementations, including but not limited to `OpenAI`, `ChatOpenAI`, `ChatAnthropic`, `Hugging Face Text Generation Inference`, and `Replicate`. This feature has been expanded to accommodate most of the models. To utilize streaming, use a [`CallbackHandler`](https://github.com/langchain-ai/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. In this example, we are using `StreamingStdOutCallbackHandler`.
```python
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling water.")
```

<CodeOutputBlock lang="python">

```
    Verse 1
    I'm sippin' on sparkling water,
    It's so refreshing and light,
    It's the perfect way to quench my thirst
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

    Verse 2
    I'm sippin' on sparkling water,
    It's so bubbly and bright,
    It's the perfect way to cool me down
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

    Verse 3
    I'm sippin' on sparkling water,
    It's so light and so clear,
    It's the perfect way to keep me cool
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.
```

</CodeOutputBlock>

We still have access to the end `LLMResult` if using `generate`. However, `token_usage` is not currently supported for streaming.


```python
llm.generate(["Tell me a joke."])
```

<CodeOutputBlock lang="python">

```
    Q: What did the fish say when it hit the wall?
    A: Dam!


    LLMResult(generations=[[Generation(text='\n\nQ: What did the fish say when it hit the wall?\nA: Dam!', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {}, 'model_name': 'text-davinci-003'})
```

</CodeOutputBlock>