Infino integration for simplified logs, metrics & search across LLM data & token usage (#6218)
### Integration of Infino with LangChain for Enhanced Observability

This PR integrates [Infino](https://github.com/infinohq/infino), an open-source observability platform written in Rust for storing metrics and logs at scale, with LangChain, giving users a streamlined and efficient way to track and record LangChain experiments. By incorporating Infino into LangChain, users can gain valuable insights and easily analyze the behavior of their language models.

#### Please refer to the following files related to integration:

- `InfinoCallbackHandler`: A [callback handler](https://github.com/naman-modi/langchain/blob/feature/infino-integration/langchain/callbacks/infino_callback.py) specifically designed for storing chain responses within Infino.
- Example `infino.ipynb` file: A comprehensive notebook named [infino.ipynb](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/modules/callbacks/integrations/infino.ipynb) has been included to guide users on effectively leveraging Infino for tracking LangChain requests.
- [Integration Doc](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/ecosystem/integrations/infino.mdx) for the Infino integration.

By integrating Infino, LangChain users gain access to powerful visualization and debugging capabilities. Infino enables easy tracking of inputs, outputs, token usage, and execution time of LLMs. This comprehensive observability ensures a deeper understanding of individual executions and facilitates effective debugging.

Co-authors: @vinaykakade @savannahar68

---------

Co-authored-by: Vinay Kakade <vinaykakade@gmail.com>
This commit is contained in:
parent
e0f468f6c1
commit
37a89918e0
35 docs/extras/ecosystem/integrations/infino.mdx Normal file
@@ -0,0 +1,35 @@
# Infino

>[Infino](https://github.com/infinohq/infino) is an open-source observability platform that stores both metrics and application logs together.

Key features of Infino include:

- Metrics Tracking: Capture the time taken by the LLM to handle each request, errors, the number of tokens, and a cost indication for the particular LLM (a sample record is sketched just after this list).
- Data Tracking: Log and store the prompt, request, and response data for each LangChain interaction.
- Graph Visualization: Generate basic graphs over time, depicting metrics such as request duration, error occurrences, token count, and cost.
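For illustration, each record the callback handler ships to Infino is a small dictionary; the shape below mirrors the payload built in `infino_callback.py` later in this PR, with hypothetical values for the metric and labels:

```python
# Hypothetical example of a single time-series record, mirroring the payload
# constructed by InfinoCallbackHandler._send_to_infino in this PR.
payload = {
    "date": 1687000000,  # UNIX timestamp when the observation was taken
    "latency": 1.42,     # the tracked metric: request duration in seconds
    "labels": {
        "model_id": "my-model",  # optional label set on the handler
        "model_version": "0.1",  # optional label set on the handler
    },
}
```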
## Installation and Setup

First, you'll need to install the `infinopy` Python package:

```bash
pip install infinopy
```
If you already have an Infino server running, you're good to go; if you don't, follow these steps to start it:

- Make sure you have Docker installed.
- Run the following in your terminal:

```bash
docker run --rm --detach --name infino-example -p 3000:3000 infinohq/infino:latest
```
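To confirm the server is reachable before wiring it into LangChain, you can ping it from Python. This is a minimal sketch, assuming the default port mapping above and that infinopy's client (which the notebook in this PR also uses) exposes a `ping()` health check:

```python
from infinopy import InfinoClient

# Connects to the locally running Infino server (http://localhost:3000 by default).
client = InfinoClient()

# A 200 response means the server is up and ready to receive data.
response = client.ping()
assert response.status_code == 200
```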
## Using Infino

See a [usage example of `InfinoCallbackHandler`](/docs/modules/callbacks/integrations/infino.html).

```python
from langchain.callbacks import InfinoCallbackHandler
```
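As a minimal sketch of how the handler is attached (assuming an OpenAI API key is configured; the `model_id` and `model_version` labels here are arbitrary examples):

```python
from langchain.callbacks import InfinoCallbackHandler
from langchain.llms import OpenAI

# Labels attached to every metric and log record sent to Infino.
handler = InfinoCallbackHandler(model_id="test-model", model_version="0.1")

llm = OpenAI(temperature=0.1)

# Passing the handler via `callbacks` tracks the prompt, response,
# latency, token usage, and error flag for this call in Infino.
llm.generate(["Tell me a joke."], callbacks=[handler])
```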
419 docs/extras/modules/callbacks/integrations/infino.ipynb Normal file
File diff suppressed because one or more lines are too long
langchain/callbacks/__init__.py
@@ -7,6 +7,7 @@ from langchain.callbacks.clearml_callback import ClearMLCallbackHandler
from langchain.callbacks.comet_ml_callback import CometCallbackHandler
from langchain.callbacks.file import FileCallbackHandler
from langchain.callbacks.human import HumanApprovalCallbackHandler
from langchain.callbacks.infino_callback import InfinoCallbackHandler
from langchain.callbacks.manager import (
    get_openai_callback,
    tracing_enabled,
@@ -36,6 +37,7 @@ __all__ = [
    "FileCallbackHandler",
    "FinalStreamingStdOutCallbackHandler",
    "HumanApprovalCallbackHandler",
    "InfinoCallbackHandler",
    "MlflowCallbackHandler",
    "OpenAICallbackHandler",
    "StdOutCallbackHandler",
172 langchain/callbacks/infino_callback.py Normal file
@@ -0,0 +1,172 @@
import time
from typing import Any, Dict, List, Optional, Union

from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import AgentAction, AgentFinish, LLMResult


def import_infino() -> Any:
    """Import the infinopy client, raising an informative error if it is missing."""
    try:
        from infinopy import InfinoClient
    except ImportError:
        raise ImportError(
            "To use the Infino callbacks manager you need to have the"
            " `infinopy` python package installed."
            " Please install it with `pip install infinopy`"
        )
    return InfinoClient()


class InfinoCallbackHandler(BaseCallbackHandler):
    """Callback Handler that logs to Infino."""

    def __init__(
        self,
        model_id: Optional[str] = None,
        model_version: Optional[str] = None,
        verbose: bool = False,
    ) -> None:
        # Set Infino client
        self.client = import_infino()
        self.model_id = model_id
        self.model_version = model_version
        self.verbose = verbose

    def _send_to_infino(
        self,
        key: str,
        value: Any,
        is_ts: bool = True,
    ) -> None:
        """Send the key-value to Infino.

        Parameters:
        key (str): the key to send to Infino.
        value (Any): the value to send to Infino.
        is_ts (bool): if True, the value is part of a time series, else it
        is sent as a log message.
        """
        payload = {
            "date": int(time.time()),
            key: value,
            "labels": {
                "model_id": self.model_id,
                "model_version": self.model_version,
            },
        }
        if self.verbose:
            print(f"Tracking {key} with Infino: {payload}")

        # Append to Infino time series only if is_ts is True, otherwise
        # append to Infino log.
        if is_ts:
            self.client.append_ts(payload)
        else:
            self.client.append_log(payload)

    def on_llm_start(
        self,
        serialized: Dict[str, Any],
        prompts: List[str],
        **kwargs: Any,
    ) -> None:
        """Log the prompts to Infino, and set start time and error flag."""
        for prompt in prompts:
            self._send_to_infino("prompt", prompt, is_ts=False)

        # Set the error flag to indicate no error (this will get overridden
        # in on_llm_error if an error occurs).
        self.error = 0

        # Set the start time (so that we can calculate the request
        # duration in on_llm_end).
        self.start_time = time.time()

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Do nothing when a new token is generated."""
        pass

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        """Log the latency, error, token usage, and response to Infino."""
        # Calculate and track the request latency.
        self.end_time = time.time()
        duration = self.end_time - self.start_time
        self._send_to_infino("latency", duration)

        # Track success or error flag.
        self._send_to_infino("error", self.error)

        # Track token usage.
        if (response.llm_output is not None) and isinstance(response.llm_output, dict):
            token_usage = response.llm_output.get("token_usage")
            if token_usage is not None:
                prompt_tokens = token_usage["prompt_tokens"]
                total_tokens = token_usage["total_tokens"]
                completion_tokens = token_usage["completion_tokens"]
                self._send_to_infino("prompt_tokens", prompt_tokens)
                self._send_to_infino("total_tokens", total_tokens)
                self._send_to_infino("completion_tokens", completion_tokens)

        # Track prompt response.
        for generations in response.generations:
            for generation in generations:
                self._send_to_infino("prompt_response", generation.text, is_ts=False)

    def on_llm_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Set the error flag."""
        self.error = 1

    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> None:
        """Do nothing when LLM chain starts."""
        pass

    def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
        """Do nothing when LLM chain ends."""
        pass

    def on_chain_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing when LLM chain outputs an error."""
        pass

    def on_tool_start(
        self,
        serialized: Dict[str, Any],
        input_str: str,
        **kwargs: Any,
    ) -> None:
        """Do nothing when tool starts."""
        pass

    def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
        """Do nothing when agent takes a specific action."""
        pass

    def on_tool_end(
        self,
        output: str,
        observation_prefix: Optional[str] = None,
        llm_prefix: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        """Do nothing when tool ends."""
        pass

    def on_tool_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing when tool outputs an error."""
        pass

    def on_text(self, text: str, **kwargs: Any) -> None:
        """Do nothing."""
        pass

    def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> None:
        """Do nothing."""
        pass
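Once the handler has recorded a few runs, the same infinopy client can read the data back. A sketch, assuming the `search_ts` (time-series search) and `search_log` endpoints that the accompanying `infino.ipynb` notebook relies on:

```python
import time

from infinopy import InfinoClient

client = InfinoClient()

# Fetch the "latency" time series recorded by the callback handler,
# from epoch 0 up to the current time.
response = client.search_ts("__name__", "latency", 0, int(time.time()))
print(response.text)

# Full-text search over the prompts and responses stored as logs.
response = client.search_log("joke", 0, int(time.time()))
print(response.text)
```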