Fast load conversationsummarymemory from existing summary (#7533)

- Description: Adds an optional buffer arg to the memory's from_messages() method. If provided, the existing summary is loaded instead of regenerating a summary from the loaded messages.

Why? If we have past messages to load from, it is likely we also have an existing summary. This is particularly helpful when the chat is ephemeral and/or backed by a serverless deployment, where the chat history is not stored but the updated history is passed back and forth between backend and frontend.

For example, take a stateless QA backend that loads messages on every request and generates a response. Without this addition, each time the messages are loaded via from_messages, the summary is recomputed even though it may have just been computed during the previous response. With this, the previously computed summary can be passed in, which avoids 1) spending extra $$$ on tokens and 2) the added response time of regenerating a summary that already exists.

Tag maintainer: @hwchase17
Twitter handle: https://twitter.com/ShantanuNair

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
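The stateless-backend scenario described above can be sketched without any LLM calls. This is only an illustration of the idea, not LangChain's implementation: `SummaryMemorySketch`, `fake_summarize`, and the call counter are hypothetical names made up for this example, and the real `from_messages()` signature may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SummaryMemorySketch:
    """Hypothetical stand-in for a summary-backed memory (not LangChain's class)."""

    summarize: Callable[[str, List[str]], str]
    messages: List[str] = field(default_factory=list)
    buffer: str = ""

    @classmethod
    def from_messages(
        cls,
        summarize: Callable[[str, List[str]], str],
        messages: List[str],
        buffer: Optional[str] = None,
    ) -> "SummaryMemorySketch":
        memory = cls(summarize=summarize, messages=list(messages))
        if buffer is not None:
            # Fast path: reuse the summary the caller already holds.
            memory.buffer = buffer
        else:
            # Slow path: compute a summary from the loaded messages.
            memory.buffer = summarize("", memory.messages)
        return memory


# Counts how often the (assumed expensive) summarizer is invoked.
calls = {"count": 0}


def fake_summarize(previous_summary: str, new_messages: List[str]) -> str:
    """Placeholder for an LLM call; it just joins the messages."""
    calls["count"] += 1
    return (previous_summary + " " + " | ".join(new_messages)).strip()


history = ["Human: hi", "AI: hi there!"]

# First request: no summary exists yet, so one summarizer call is made.
first = SummaryMemorySketch.from_messages(fake_summarize, history)
calls_after_first = calls["count"]

# Next request: the client sends the earlier summary back, so no call is made.
second = SummaryMemorySketch.from_messages(
    fake_summarize, history, buffer=first.buffer
)
calls_after_second = calls["count"]
```

In this sketch the second load performs no additional summarizer call, which is the token and latency saving the description refers to.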
2024-11-06 03:20:49 +00:00 · 2023-08-01 06:44:11 +05:30 · 2023-08-01 06:44:11 +05:30 · 53f3793504
commit 53f3793504
parent ec40ead980
1 changed file with 17 additions and 2 deletions
--- a/docs/snippets/modules/memory/types/summary.mdx
+++ b/docs/snippets/modules/memory/types/summary.mdx
@@ -60,7 +60,7 @@ memory.predict_new_summary(messages, previous_summary)

 </CodeOutputBlock>

-## Initializing with messages
+## Initializing with messages/existing summary

 If you have messages outside this class, you can easily initialize the class with ChatMessageHistory. During loading, a summary will be calculated.

@@ -73,7 +73,11 @@ history.add_ai_message("hi there!")


 ```python
-memory = ConversationSummaryMemory.from_messages(llm=OpenAI(temperature=0), chat_memory=history, return_messages=True)
+memory = ConversationSummaryMemory.from_messages(
+    llm=OpenAI(temperature=0),
+    chat_memory=history,
+    return_messages=True
+)
 ```


@@ -89,6 +93,17 @@ memory.buffer

 </CodeOutputBlock>

+Optionally you can speed up initialization using a previously generated summary, and avoid regenerating the summary by just initializing directly.
+
+```python
+memory = ConversationSummaryMemory(
+    llm=OpenAI(temperature=0),
+    buffer="The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.",
+    chat_memory=history,
+    return_messages=True
+)
+```
+
 ## Using in a chain
 Let's walk through an example of using this in a chain, again setting `verbose=True` so we can see the prompt.