Compare commits

38 Commits

Author SHA1 Message Date
dependabot[bot] 2d1a1b99cc
build(deps): bump transformers from 4.36.2 to 4.38.0 in /application
Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.2 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.36.2...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2 weeks ago
Alex 784206b39b chore: Update Dockerfile to use Ubuntu mantic as base image and upgrade gunicorn to version 22.0.0 2 weeks ago
Alex 7c8264e221
Merge pull request #929 from TomasMatarazzo/issue-button-to-clean-chat-history
Issue button to clean chat history
3 weeks ago
TomasMatarazzo db7195aa30 Update Navigation.tsx 3 weeks ago
TomasMatarazzo eb7bbc1612 TS2741 3 weeks ago
TomasMatarazzo ee3792181d testing 3 weeks ago
TomasMatarazzo 9804965a20 style in button and user in back route delete all conv 3 weeks ago
TomasMatarazzo b84842df3d Fixing types 4 weeks ago
TomasMatarazzo fc170d3033 Update package.json 4 weeks ago
TomasMatarazzo 8fa4ec7ad8 delete console.log 4 weeks ago
TomasMatarazzo 480825ddd7 now is working in settings 4 weeks ago
TomasMatarazzo 260e328cc1 first change 4 weeks ago
Alex 8873428b4b
Merge pull request #926 from siiddhantt/feature
Feature: Logging token usage info to MongoDB
4 weeks ago
Alex ab43c20b8f delete test output 4 weeks ago
TomasMatarazzo 88d9d4f4a3 Update DeleteConvModal.tsx 1 month ago
TomasMatarazzo d4840f85c0 change text in modal 1 month ago
TomasMatarazzo 6f9ddeaed0 Button to clean chat history 1 month ago
Siddhant Rai af5e73c8cb fix: user_api_key capturing 1 month ago
Siddhant Rai 333b6e60e1 fix: anthropic llm positional arguments 1 month ago
Siddhant Rai 1b61337b75 fix: skip logging to db during tests 1 month ago
Siddhant Rai 77991896b4 fix: api_key capturing + pytest errors 1 month ago
Siddhant Rai 60a670ce29 fix: changes to llm classes according to base 1 month ago
Siddhant Rai c1c69ed22b fix: pytest issues 1 month ago
Siddhant Rai d71c74c6fb Merge branch 'feature' of https://github.com/siiddhantt/DocsGPT into feature 1 month ago
Siddhant Rai 590aa8b43f update: apply decorator to abstract classes 1 month ago
Siddhant Rai 607e0166f6
Merge branch 'arc53:main' into feature 1 month ago
Alex 130c83ee92
Merge pull request #911 from arc53/dependabot/pip/application/pymongo-4.6.3
Bump pymongo from 4.6.1 to 4.6.3 in /application
1 month ago
Alex fd5e418abf
Merge pull request #919 from arc53/dependabot/npm_and_yarn/docs/multi-4407677fd1
build(deps): bump tar and npm in /docs
1 month ago
Siddhant Rai 262d160314 Merge with branch main 1 month ago
Siddhant Rai 9146827590 fix: removed unused import 1 month ago
Siddhant Rai 062b108259
Merge branch 'arc53:main' into feature 1 month ago
Siddhant Rai ba796b6be1 feat: logging token usage to database 1 month ago
Alex 3d763235e1
Merge pull request #925 from ManishMadan2882/main
Untraced types in react widget
1 month ago
Manish Madan c30c6d9f10
Merge branch 'arc53:main' into main 1 month ago
ManishMadan2882 311716ed18 refactored fs, fix: untracked dir 1 month ago
Alex 19bb1b4aa4
Create SECURITY.md 1 month ago
dependabot[bot] 340dcfb70d
build(deps): bump tar and npm in /docs
Removes [tar](https://github.com/isaacs/node-tar). It's no longer used after updating ancestor dependency [npm](https://github.com/npm/cli). These dependencies need to be updated together.


Removes `tar`

Updates `npm` from 10.5.0 to 10.5.1
- [Release notes](https://github.com/npm/cli/releases)
- [Changelog](https://github.com/npm/cli/blob/latest/CHANGELOG.md)
- [Commits](https://github.com/npm/cli/compare/v10.5.0...v10.5.1)

---
updated-dependencies:
- dependency-name: tar
  dependency-type: indirect
- dependency-name: npm
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
1 month ago
dependabot[bot] 83fa850142
Bump pymongo from 4.6.1 to 4.6.3 in /application
Bumps [pymongo](https://github.com/mongodb/mongo-python-driver) from 4.6.1 to 4.6.3.
- [Release notes](https://github.com/mongodb/mongo-python-driver/releases)
- [Changelog](https://github.com/mongodb/mongo-python-driver/blob/master/doc/changelog.rst)
- [Commits](https://github.com/mongodb/mongo-python-driver/compare/4.6.1...4.6.3)

---
updated-dependencies:
- dependency-name: pymongo
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
1 month ago

SECURITY.md
@ -0,0 +1,14 @@
# Security Policy

## Supported Versions

Currently, we ship security patches by committing fixes and bumping the version published on GitHub.

## Reporting a Vulnerability

Found a vulnerability? Please email us:
security@arc53.com

application/Dockerfile
@ -1,31 +1,70 @@
FROM python:3.11-slim-bullseye as builder
# Tiktoken requires Rust toolchain, so build it in a separate stage
RUN apt-get update && apt-get install -y gcc curl
RUN apt-get install -y wget unzip
RUN wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip
RUN unzip mpnet-base-v2.zip -d model
RUN rm mpnet-base-v2.zip
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y && apt-get install --reinstall libc6-dev -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --upgrade pip && pip install tiktoken==0.5.2
# Builder Stage
FROM ubuntu:mantic as builder
# Install necessary packages
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc curl wget unzip libc6-dev python3.11 python3-pip python3-venv && \
ln -s /usr/bin/python3.11 /usr/bin/python && \
ln -sf /usr/bin/pip3 /usr/bin/pip
# Download and unzip the model
RUN wget https://d3dg1063dc54p9.cloudfront.net/models/embeddings/mpnet-base-v2.zip && \
unzip mpnet-base-v2.zip -d model && \
rm mpnet-base-v2.zip
# Install Rust
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
# Clean up to reduce container size
RUN apt-get remove --purge -y wget unzip && apt-get autoremove -y && rm -rf /var/lib/apt/lists/*
# Copy requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt
# Setup Python virtual environment
RUN python3 -m venv /venv
ENV PATH="/venv/bin:$PATH"
# Install Python packages
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir tiktoken && \
pip install --no-cache-dir -r requirements.txt
FROM python:3.11-slim-bullseye
# Final Stage
FROM ubuntu:mantic as final
# Copy pre-built packages and binaries from builder stage
COPY --from=builder /usr/local/ /usr/local/
# Install Python
RUN apt-get update && apt-get install -y --no-install-recommends python3.11 python3-pip && \
ln -s /usr/bin/python3.11 /usr/bin/python && \
rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Create a non-root user: `appuser` (Feel free to choose a name)
RUN groupadd -r appuser && \
useradd -r -g appuser -d /app -s /sbin/nologin -c "Docker image user" appuser
# Copy the virtual environment and model from the builder stage
COPY --from=builder /venv /venv
COPY --from=builder /model /app/model
# Copy your application code
COPY . /app/application
ENV FLASK_APP=app.py
ENV FLASK_DEBUG=true
# Change the ownership of the /app directory to the appuser
RUN chown -R appuser:appuser /app
# Set environment variables
ENV FLASK_APP=app.py \
FLASK_DEBUG=true \
PATH="/venv/bin:$PATH"
# Expose the port the app runs on
EXPOSE 7091
CMD ["gunicorn", "-w", "2", "--timeout", "120", "--bind", "0.0.0.0:7091", "application.wsgi:app"]
# Switch to non-root user
USER appuser
# Start Gunicorn
CMD ["gunicorn", "-w", "2", "--timeout", "120", "--bind", "0.0.0.0:7091", "application.wsgi:app"]

application/api/answer/routes.py
@ -10,14 +10,12 @@ from pymongo import MongoClient
from bson.objectid import ObjectId
from application.core.settings import settings
from application.llm.llm_creator import LLMCreator
from application.retriever.retriever_creator import RetrieverCreator
from application.error import bad_request
logger = logging.getLogger(__name__)
mongo = MongoClient(settings.MONGO_URI)
@ -26,20 +24,22 @@ conversations_collection = db["conversations"]
vectors_collection = db["vectors"]
prompts_collection = db["prompts"]
api_key_collection = db["api_keys"]
answer = Blueprint('answer', __name__)
answer = Blueprint("answer", __name__)
gpt_model = ""
# to have some kind of default behaviour
if settings.LLM_NAME == "openai":
gpt_model = 'gpt-3.5-turbo'
gpt_model = "gpt-3.5-turbo"
elif settings.LLM_NAME == "anthropic":
gpt_model = 'claude-2'
gpt_model = "claude-2"
if settings.MODEL_NAME: # in case there is particular model name configured
gpt_model = settings.MODEL_NAME
# load the prompts
current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
current_dir = os.path.dirname(
os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
)
with open(os.path.join(current_dir, "prompts", "chat_combine_default.txt"), "r") as f:
chat_combine_template = f.read()
@ -50,7 +50,7 @@ with open(os.path.join(current_dir, "prompts", "chat_combine_creative.txt"), "r"
chat_combine_creative = f.read()
with open(os.path.join(current_dir, "prompts", "chat_combine_strict.txt"), "r") as f:
chat_combine_strict = f.read()
chat_combine_strict = f.read()
api_key_set = settings.API_KEY is not None
embeddings_key_set = settings.EMBEDDINGS_KEY is not None
@ -61,8 +61,6 @@ async def async_generate(chain, question, chat_history):
return result
def run_async_chain(chain, question, chat_history):
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
@ -74,17 +72,18 @@ def run_async_chain(chain, question, chat_history):
result["answer"] = answer
return result
def get_data_from_api_key(api_key):
data = api_key_collection.find_one({"key": api_key})
if data is None:
return bad_request(401, "Invalid API key")
return data
def get_vectorstore(data):
if "active_docs" in data:
if data["active_docs"].split("/")[0] == "default":
vectorstore = ""
vectorstore = ""
elif data["active_docs"].split("/")[0] == "local":
vectorstore = "indexes/" + data["active_docs"]
else:
@ -98,52 +97,82 @@ def get_vectorstore(data):
def is_azure_configured():
return settings.OPENAI_API_BASE and settings.OPENAI_API_VERSION and settings.AZURE_DEPLOYMENT_NAME
return (
settings.OPENAI_API_BASE
and settings.OPENAI_API_VERSION
and settings.AZURE_DEPLOYMENT_NAME
)
def save_conversation(conversation_id, question, response, source_log_docs, llm):
if conversation_id is not None and conversation_id != "None":
conversations_collection.update_one(
{"_id": ObjectId(conversation_id)},
{"$push": {"queries": {"prompt": question, "response": response, "sources": source_log_docs}}},
{
"$push": {
"queries": {
"prompt": question,
"response": response,
"sources": source_log_docs,
}
}
},
)
else:
# create new conversation
# generate summary
messages_summary = [{"role": "assistant", "content": "Summarise following conversation in no more than 3 "
"words, respond ONLY with the summary, use the same "
"language as the system \n\nUser: " + question + "\n\n" +
"AI: " +
response},
{"role": "user", "content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the "
"system"}]
completion = llm.gen(model=gpt_model,
messages=messages_summary, max_tokens=30)
messages_summary = [
{
"role": "assistant",
"content": "Summarise following conversation in no more than 3 "
"words, respond ONLY with the summary, use the same "
"language as the system \n\nUser: "
+ question
+ "\n\n"
+ "AI: "
+ response,
},
{
"role": "user",
"content": "Summarise following conversation in no more than 3 words, "
"respond ONLY with the summary, use the same language as the "
"system",
},
]
completion = llm.gen(model=gpt_model, messages=messages_summary, max_tokens=30)
conversation_id = conversations_collection.insert_one(
{"user": "local",
"date": datetime.datetime.utcnow(),
"name": completion,
"queries": [{"prompt": question, "response": response, "sources": source_log_docs}]}
{
"user": "local",
"date": datetime.datetime.utcnow(),
"name": completion,
"queries": [
{
"prompt": question,
"response": response,
"sources": source_log_docs,
}
],
}
).inserted_id
return conversation_id
def get_prompt(prompt_id):
if prompt_id == 'default':
if prompt_id == "default":
prompt = chat_combine_template
elif prompt_id == 'creative':
elif prompt_id == "creative":
prompt = chat_combine_creative
elif prompt_id == 'strict':
elif prompt_id == "strict":
prompt = chat_combine_strict
else:
prompt = prompts_collection.find_one({"_id": ObjectId(prompt_id)})["content"]
return prompt
def complete_stream(question, retriever, conversation_id):
def complete_stream(question, retriever, conversation_id, user_api_key):
response_full = ""
source_log_docs = []
answer = retriever.gen()
@ -155,9 +184,12 @@ def complete_stream(question, retriever, conversation_id):
elif "source" in line:
source_log_docs.append(line["source"])
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=settings.API_KEY)
conversation_id = save_conversation(conversation_id, question, response_full, source_log_docs, llm)
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=user_api_key
)
conversation_id = save_conversation(
conversation_id, question, response_full, source_log_docs, llm
)
# send data.type = "end" to indicate that the stream has ended as json
data = json.dumps({"type": "id", "id": str(conversation_id)})
@ -180,17 +212,17 @@ def stream():
conversation_id = None
else:
conversation_id = data["conversation_id"]
if 'prompt_id' in data:
if "prompt_id" in data:
prompt_id = data["prompt_id"]
else:
prompt_id = 'default'
if 'selectedDocs' in data and data['selectedDocs'] is None:
prompt_id = "default"
if "selectedDocs" in data and data["selectedDocs"] is None:
chunks = 0
elif 'chunks' in data:
elif "chunks" in data:
chunks = int(data["chunks"])
else:
chunks = 2
prompt = get_prompt(prompt_id)
# check if active_docs is set
@ -198,23 +230,42 @@ def stream():
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
source = {"active_docs": data_key["source"]}
user_api_key = data["api_key"]
elif "active_docs" in data:
source = {"active_docs": data["active_docs"]}
user_api_key = None
else:
source = {}
user_api_key = None
if source["active_docs"].split("/")[0] == "default" or source["active_docs"].split("/")[0] == "local":
if (
source["active_docs"].split("/")[0] == "default"
or source["active_docs"].split("/")[0] == "local"
):
retriever_name = "classic"
else:
retriever_name = source['active_docs']
retriever = RetrieverCreator.create_retriever(retriever_name, question=question,
source=source, chat_history=history, prompt=prompt, chunks=chunks, gpt_model=gpt_model
)
retriever_name = source["active_docs"]
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=history,
prompt=prompt,
chunks=chunks,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
return Response(
complete_stream(question=question, retriever=retriever,
conversation_id=conversation_id), mimetype="text/event-stream")
complete_stream(
question=question,
retriever=retriever,
conversation_id=conversation_id,
user_api_key=user_api_key,
),
mimetype="text/event-stream",
)
@answer.route("/api/answer", methods=["POST"])
@ -230,15 +281,15 @@ def api_answer():
else:
conversation_id = data["conversation_id"]
print("-" * 5)
if 'prompt_id' in data:
if "prompt_id" in data:
prompt_id = data["prompt_id"]
else:
prompt_id = 'default'
if 'chunks' in data:
prompt_id = "default"
if "chunks" in data:
chunks = int(data["chunks"])
else:
chunks = 2
prompt = get_prompt(prompt_id)
# use try and except to check for exception
@ -247,17 +298,29 @@ def api_answer():
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
source = {"active_docs": data_key["source"]}
user_api_key = data["api_key"]
else:
source = {data}
user_api_key = None
if source["active_docs"].split("/")[0] == "default" or source["active_docs"].split("/")[0] == "local":
if (
source["active_docs"].split("/")[0] == "default"
or source["active_docs"].split("/")[0] == "local"
):
retriever_name = "classic"
else:
retriever_name = source['active_docs']
retriever = RetrieverCreator.create_retriever(retriever_name, question=question,
source=source, chat_history=history, prompt=prompt, chunks=chunks, gpt_model=gpt_model
)
retriever_name = source["active_docs"]
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=history,
prompt=prompt,
chunks=chunks,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
source_log_docs = []
response_full = ""
for line in retriever.gen():
@ -265,12 +328,15 @@ def api_answer():
source_log_docs.append(line["source"])
elif "answer" in line:
response_full += line["answer"]
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=settings.API_KEY)
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=user_api_key
)
result = {"answer": response_full, "sources": source_log_docs}
result["conversation_id"] = save_conversation(conversation_id, question, response_full, source_log_docs, llm)
result["conversation_id"] = save_conversation(
conversation_id, question, response_full, source_log_docs, llm
)
return result
except Exception as e:
@ -289,23 +355,35 @@ def api_search():
if "api_key" in data:
data_key = get_data_from_api_key(data["api_key"])
source = {"active_docs": data_key["source"]}
user_api_key = data["api_key"]
elif "active_docs" in data:
source = {"active_docs": data["active_docs"]}
user_api_key = None
else:
source = {}
if 'chunks' in data:
user_api_key = None
if "chunks" in data:
chunks = int(data["chunks"])
else:
chunks = 2
if source["active_docs"].split("/")[0] == "default" or source["active_docs"].split("/")[0] == "local":
if (
source["active_docs"].split("/")[0] == "default"
or source["active_docs"].split("/")[0] == "local"
):
retriever_name = "classic"
else:
retriever_name = source['active_docs']
retriever = RetrieverCreator.create_retriever(retriever_name, question=question,
source=source, chat_history=[], prompt="default", chunks=chunks, gpt_model=gpt_model
)
retriever_name = source["active_docs"]
retriever = RetrieverCreator.create_retriever(
retriever_name,
question=question,
source=source,
chat_history=[],
prompt="default",
chunks=chunks,
gpt_model=gpt_model,
user_api_key=user_api_key,
)
docs = retriever.search()
return docs
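
The `user_api_key` plumbing starts at the request body: when the caller sends `api_key`, it is resolved via `get_data_from_api_key()`, forwarded to the retriever and LLM, and eventually logged by the usage decorators. A minimal sketch of such a call (host, port, and key value are assumptions; field shapes are inferred from this diff):

```python
import requests

payload = {
    "question": "What is DocsGPT?",
    "history": [],            # shape inferred from the diff; may differ
    "conversation_id": None,
    "api_key": "abc123",      # hypothetical user API key
}
# assumes a local instance on the port exposed by the Dockerfile
r = requests.post("http://localhost:7091/api/answer", json=payload)
print(r.json()["answer"])
```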

application/api/user/routes.py
@ -37,6 +37,12 @@ def delete_conversation():
return {"status": "ok"}
@user.route("/api/delete_all_conversations", methods=["POST"])
def delete_all_conversations():
user_id = "local"
conversations_collection.delete_many({"user": user_id})
return {"status": "ok"}
@user.route("/api/get_conversations", methods=["get"])
def get_conversations():
# provides a list of conversations
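
A quick way to exercise the new route (a sketch; assumes a local instance and, as in the diff, a user hard-coded to "local"):

```python
import requests

# POST with no body: the handler always targets user_id = "local"
r = requests.post("http://localhost:7091/api/delete_all_conversations")
assert r.json() == {"status": "ok"}
```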

application/llm/anthropic.py
@ -1,21 +1,29 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class AnthropicLLM(BaseLLM):
def __init__(self, api_key=None):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT
self.api_key = api_key or settings.ANTHROPIC_API_KEY # If not provided, use a default from settings
super().__init__(*args, **kwargs)
self.api_key = (
api_key or settings.ANTHROPIC_API_KEY
) # If not provided, use a default from settings
self.user_api_key = user_api_key
self.anthropic = Anthropic(api_key=self.api_key)
self.HUMAN_PROMPT = HUMAN_PROMPT
self.AI_PROMPT = AI_PROMPT
def gen(self, model, messages, max_tokens=300, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(
self, baseself, model, messages, stream=False, max_tokens=300, **kwargs
):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Context \n {context} \n ### Question \n {user_question}"
if stream:
return self.gen_stream(model, prompt, max_tokens, **kwargs)
return self.gen_stream(model, prompt, stream, max_tokens, **kwargs)
completion = self.anthropic.completions.create(
model=model,
@ -25,9 +33,11 @@ class AnthropicLLM(BaseLLM):
)
return completion.completion
def gen_stream(self, model, messages, max_tokens=300, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen_stream(
self, baseself, model, messages, stream=True, max_tokens=300, **kwargs
):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Context \n {context} \n ### Question \n {user_question}"
stream_response = self.anthropic.completions.create(
model=model,
@ -37,4 +47,4 @@ class AnthropicLLM(BaseLLM):
)
for completion in stream_response:
yield completion.completion
yield completion.completion

application/llm/base.py
@ -1,14 +1,28 @@
from abc import ABC, abstractmethod
from application.usage import gen_token_usage, stream_token_usage
class BaseLLM(ABC):
def __init__(self):
pass
self.token_usage = {"prompt_tokens": 0, "generated_tokens": 0}
def _apply_decorator(self, method, decorator, *args, **kwargs):
return decorator(method, *args, **kwargs)
@abstractmethod
def gen(self, *args, **kwargs):
def _raw_gen(self, model, messages, stream, *args, **kwargs):
pass
def gen(self, model, messages, stream=False, *args, **kwargs):
return self._apply_decorator(self._raw_gen, gen_token_usage)(
self, model=model, messages=messages, stream=stream, *args, **kwargs
)
@abstractmethod
def gen_stream(self, *args, **kwargs):
def _raw_gen_stream(self, model, messages, stream, *args, **kwargs):
pass
def gen_stream(self, model, messages, stream=True, *args, **kwargs):
return self._apply_decorator(self._raw_gen_stream, stream_token_usage)(
self, model=model, messages=messages, stream=stream, *args, **kwargs
)
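
A minimal hypothetical subclass makes the new contract concrete: a provider implements only `_raw_gen`/`_raw_gen_stream`, and the inherited `gen()`/`gen_stream()` apply `gen_token_usage`/`stream_token_usage` around them, so every call is metered without provider-side code. `EchoLLM` is illustrative only, not part of this diff:

```python
from application.llm.base import BaseLLM


class EchoLLM(BaseLLM):
    def __init__(self, user_api_key=None, *args, **kwargs):
        super().__init__(*args, **kwargs)  # initializes self.token_usage
        self.user_api_key = user_api_key   # read by update_token_usage()

    def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
        # toy "generation": echo the last message back
        return messages[-1]["content"]

    def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
        yield from messages[-1]["content"].split()


llm = EchoLLM()
# Dispatches through gen_token_usage(_raw_gen): prompt and generated tokens
# are counted, then written to MongoDB (skipped under pytest; otherwise
# requires a reachable MONGO_URI).
print(llm.gen(model="echo", messages=[{"role": "user", "content": "hi there"}]))
```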

application/llm/docsgpt_provider.py
@ -2,48 +2,43 @@ from application.llm.base import BaseLLM
import json
import requests
class DocsGPTAPILLM(BaseLLM):
def __init__(self, *args, **kwargs):
self.endpoint = "https://llm.docsgpt.co.uk"
class DocsGPTAPILLM(BaseLLM):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
self.endpoint = "https://llm.docsgpt.co.uk"
def gen(self, model, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, *args, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
response = requests.post(
f"{self.endpoint}/answer",
json={
"prompt": prompt,
"max_new_tokens": 30
}
f"{self.endpoint}/answer", json={"prompt": prompt, "max_new_tokens": 30}
)
response_clean = response.json()['a'].replace("###", "")
response_clean = response.json()["a"].replace("###", "")
return response_clean
def gen_stream(self, model, messages, stream=True, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen_stream(self, baseself, model, messages, stream=True, *args, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# send prompt to endpoint /stream
response = requests.post(
f"{self.endpoint}/stream",
json={
"prompt": prompt,
"max_new_tokens": 256
},
stream=True
json={"prompt": prompt, "max_new_tokens": 256},
stream=True,
)
for line in response.iter_lines():
if line:
#data = json.loads(line)
data_str = line.decode('utf-8')
# data = json.loads(line)
data_str = line.decode("utf-8")
if data_str.startswith("data: "):
data = json.loads(data_str[6:])
yield data['a']
yield data["a"]

application/llm/huggingface.py
@ -1,44 +1,68 @@
from application.llm.base import BaseLLM
class HuggingFaceLLM(BaseLLM):
def __init__(self, api_key, llm_name='Arc53/DocsGPT-7B',q=False):
def __init__(
self,
api_key=None,
user_api_key=None,
llm_name="Arc53/DocsGPT-7B",
q=False,
*args,
**kwargs,
):
global hf
from langchain.llms import HuggingFacePipeline
if q:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
pipeline,
BitsAndBytesConfig,
)
tokenizer = AutoTokenizer.from_pretrained(llm_name)
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(llm_name,quantization_config=bnb_config)
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
llm_name, quantization_config=bnb_config
)
else:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
tokenizer = AutoTokenizer.from_pretrained(llm_name)
model = AutoModelForCausalLM.from_pretrained(llm_name)
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
pipe = pipeline(
"text-generation", model=model,
tokenizer=tokenizer, max_new_tokens=2000,
device_map="auto", eos_token_id=tokenizer.eos_token_id
"text-generation",
model=model,
tokenizer=tokenizer,
max_new_tokens=2000,
device_map="auto",
eos_token_id=tokenizer.eos_token_id,
)
hf = HuggingFacePipeline(pipeline=pipe)
def gen(self, model, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = hf(prompt)
return result.content
def gen_stream(self, model, messages, stream=True, **kwargs):
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
raise NotImplementedError("HuggingFaceLLM Streaming is not implemented yet.")

application/llm/llama_cpp.py
@ -1,32 +1,45 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class LlamaCpp(BaseLLM):
def __init__(self, api_key, llm_name=settings.MODEL_PATH, **kwargs):
def __init__(
self,
api_key=None,
user_api_key=None,
llm_name=settings.MODEL_PATH,
*args,
**kwargs,
):
global llama
try:
from llama_cpp import Llama
except ImportError:
raise ImportError("Please install llama_cpp using pip install llama-cpp-python")
raise ImportError(
"Please install llama_cpp using pip install llama-cpp-python"
)
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
llama = Llama(model_path=llm_name, n_ctx=2048)
def gen(self, model, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = llama(prompt, max_tokens=150, echo=False)
# import sys
# print(result['choices'][0]['text'].split('### Answer \n')[-1], file=sys.stderr)
return result['choices'][0]['text'].split('### Answer \n')[-1]
def gen_stream(self, model, messages, stream=True, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
return result["choices"][0]["text"].split("### Answer \n")[-1]
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
result = llama(prompt, max_tokens=150, echo=False, stream=stream)
@ -35,5 +48,5 @@ class LlamaCpp(BaseLLM):
# print(list(result), file=sys.stderr)
for item in result:
for choice in item['choices']:
yield choice['text']
for choice in item["choices"]:
yield choice["text"]

application/llm/llm_creator.py
@ -7,22 +7,21 @@ from application.llm.docsgpt_provider import DocsGPTAPILLM
from application.llm.premai import PremAILLM
class LLMCreator:
llms = {
'openai': OpenAILLM,
'azure_openai': AzureOpenAILLM,
'sagemaker': SagemakerAPILLM,
'huggingface': HuggingFaceLLM,
'llama.cpp': LlamaCpp,
'anthropic': AnthropicLLM,
'docsgpt': DocsGPTAPILLM,
'premai': PremAILLM,
"openai": OpenAILLM,
"azure_openai": AzureOpenAILLM,
"sagemaker": SagemakerAPILLM,
"huggingface": HuggingFaceLLM,
"llama.cpp": LlamaCpp,
"anthropic": AnthropicLLM,
"docsgpt": DocsGPTAPILLM,
"premai": PremAILLM,
}
@classmethod
def create_llm(cls, type, *args, **kwargs):
def create_llm(cls, type, api_key, user_api_key, *args, **kwargs):
llm_class = cls.llms.get(type.lower())
if not llm_class:
raise ValueError(f"No LLM class found for type {type}")
return llm_class(*args, **kwargs)
return llm_class(api_key, user_api_key, *args, **kwargs)
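
The factory's new signature makes both keys explicit at every call site; a sketch of the updated call, mirroring how the routes above use it:

```python
from application.core.settings import settings
from application.llm.llm_creator import LLMCreator

llm = LLMCreator.create_llm(
    settings.LLM_NAME,         # e.g. "openai"; looked up case-insensitively
    api_key=settings.API_KEY,  # provider credential
    user_api_key=None,         # per-user key, used only for usage accounting
)
```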

application/llm/openai.py
@ -1,36 +1,53 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class OpenAILLM(BaseLLM):
def __init__(self, api_key):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
global openai
from openai import OpenAI
super().__init__(*args, **kwargs)
self.client = OpenAI(
api_key=api_key,
)
api_key=api_key,
)
self.api_key = api_key
self.user_api_key = user_api_key
def _get_openai(self):
# Import openai when needed
import openai
return openai
def gen(self, model, messages, stream=False, engine=settings.AZURE_DEPLOYMENT_NAME, **kwargs):
response = self.client.chat.completions.create(model=model,
messages=messages,
stream=stream,
**kwargs)
def _raw_gen(
self,
baseself,
model,
messages,
stream=False,
engine=settings.AZURE_DEPLOYMENT_NAME,
**kwargs
):
response = self.client.chat.completions.create(
model=model, messages=messages, stream=stream, **kwargs
)
return response.choices[0].message.content
def gen_stream(self, model, messages, stream=True, engine=settings.AZURE_DEPLOYMENT_NAME, **kwargs):
response = self.client.chat.completions.create(model=model,
messages=messages,
stream=stream,
**kwargs)
def _raw_gen_stream(
self,
baseself,
model,
messages,
stream=True,
engine=settings.AZURE_DEPLOYMENT_NAME,
**kwargs
):
response = self.client.chat.completions.create(
model=model, messages=messages, stream=stream, **kwargs
)
for line in response:
# import sys
@ -41,14 +58,17 @@ class OpenAILLM(BaseLLM):
class AzureOpenAILLM(OpenAILLM):
def __init__(self, openai_api_key, openai_api_base, openai_api_version, deployment_name):
def __init__(
self, openai_api_key, openai_api_base, openai_api_version, deployment_name
):
super().__init__(openai_api_key)
self.api_base = settings.OPENAI_API_BASE,
self.api_version = settings.OPENAI_API_VERSION,
self.deployment_name = settings.AZURE_DEPLOYMENT_NAME,
self.api_base = (settings.OPENAI_API_BASE,)
self.api_version = (settings.OPENAI_API_VERSION,)
self.deployment_name = (settings.AZURE_DEPLOYMENT_NAME,)
from openai import AzureOpenAI
self.client = AzureOpenAI(
api_key=openai_api_key,
api_key=openai_api_key,
api_version=settings.OPENAI_API_VERSION,
api_base=settings.OPENAI_API_BASE,
deployment_name=settings.AZURE_DEPLOYMENT_NAME,

application/llm/premai.py
@ -1,32 +1,37 @@
from application.llm.base import BaseLLM
from application.core.settings import settings
class PremAILLM(BaseLLM):
def __init__(self, api_key):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
from premai import Prem
self.client = Prem(
api_key=api_key
)
super().__init__(*args, **kwargs)
self.client = Prem(api_key=api_key)
self.api_key = api_key
self.user_api_key = user_api_key
self.project_id = settings.PREMAI_PROJECT_ID
def gen(self, model, messages, stream=False, **kwargs):
response = self.client.chat.completions.create(model=model,
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
response = self.client.chat.completions.create(
model=model,
project_id=self.project_id,
messages=messages,
stream=stream,
**kwargs)
**kwargs
)
return response.choices[0].message["content"]
def gen_stream(self, model, messages, stream=True, **kwargs):
response = self.client.chat.completions.create(model=model,
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
response = self.client.chat.completions.create(
model=model,
project_id=self.project_id,
messages=messages,
stream=stream,
**kwargs)
**kwargs
)
for line in response:
if line.choices[0].delta["content"] is not None:

application/llm/sagemaker.py
@ -4,11 +4,10 @@ import json
import io
class LineIterator:
"""
A helper class for parsing the byte stream input.
A helper class for parsing the byte stream input.
The output of the model will be in the following format:
```
b'{"outputs": [" a"]}\n'
@ -16,21 +15,21 @@ class LineIterator:
b'{"outputs": [" problem"]}\n'
...
```
While usually each PayloadPart event from the event stream will contain a byte array
While usually each PayloadPart event from the event stream will contain a byte array
with a full json, this is not guaranteed and some of the json objects may be split across
PayloadPart events. For example:
```
{'PayloadPart': {'Bytes': b'{"outputs": '}}
{'PayloadPart': {'Bytes': b'[" problem"]}\n'}}
```
This class accounts for this by concatenating bytes written via the 'write' function
and then exposing a method which will return lines (ending with a '\n' character) within
the buffer via the 'scan_lines' function. It maintains the position of the last read
position to ensure that previous bytes are not exposed again.
the buffer via the 'scan_lines' function. It maintains the position of the last read
position to ensure that previous bytes are not exposed again.
"""
def __init__(self, stream):
self.byte_iterator = iter(stream)
self.buffer = io.BytesIO()
@ -43,7 +42,7 @@ class LineIterator:
while True:
self.buffer.seek(self.read_pos)
line = self.buffer.readline()
if line and line[-1] == ord('\n'):
if line and line[-1] == ord("\n"):
self.read_pos += len(line)
return line[:-1]
try:
@ -52,33 +51,35 @@ class LineIterator:
if self.read_pos < self.buffer.getbuffer().nbytes:
continue
raise
if 'PayloadPart' not in chunk:
print('Unknown event type:' + chunk)
if "PayloadPart" not in chunk:
print("Unknown event type:" + chunk)
continue
self.buffer.seek(0, io.SEEK_END)
self.buffer.write(chunk['PayloadPart']['Bytes'])
self.buffer.write(chunk["PayloadPart"]["Bytes"])
class SagemakerAPILLM(BaseLLM):
def __init__(self, *args, **kwargs):
def __init__(self, api_key=None, user_api_key=None, *args, **kwargs):
import boto3
runtime = boto3.client(
'runtime.sagemaker',
aws_access_key_id='xxx',
aws_secret_access_key='xxx',
region_name='us-west-2'
"runtime.sagemaker",
aws_access_key_id="xxx",
aws_secret_access_key="xxx",
region_name="us-west-2",
)
self.endpoint = settings.SAGEMAKER_ENDPOINT
super().__init__(*args, **kwargs)
self.api_key = api_key
self.user_api_key = user_api_key
self.endpoint = settings.SAGEMAKER_ENDPOINT
self.runtime = runtime
def gen(self, model, messages, stream=False, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
def _raw_gen(self, baseself, model, messages, stream=False, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# Construct payload for endpoint
payload = {
@ -89,25 +90,25 @@ class SagemakerAPILLM(BaseLLM):
"temperature": 0.1,
"max_new_tokens": 30,
"repetition_penalty": 1.03,
"stop": ["</s>", "###"]
}
"stop": ["</s>", "###"],
},
}
body_bytes = json.dumps(payload).encode('utf-8')
body_bytes = json.dumps(payload).encode("utf-8")
# Invoke the endpoint
response = self.runtime.invoke_endpoint(EndpointName=self.endpoint,
ContentType='application/json',
Body=body_bytes)
result = json.loads(response['Body'].read().decode())
response = self.runtime.invoke_endpoint(
EndpointName=self.endpoint, ContentType="application/json", Body=body_bytes
)
result = json.loads(response["Body"].read().decode())
import sys
print(result[0]['generated_text'], file=sys.stderr)
return result[0]['generated_text'][len(prompt):]
def gen_stream(self, model, messages, stream=True, **kwargs):
context = messages[0]['content']
user_question = messages[-1]['content']
print(result[0]["generated_text"], file=sys.stderr)
return result[0]["generated_text"][len(prompt) :]
def _raw_gen_stream(self, baseself, model, messages, stream=True, **kwargs):
context = messages[0]["content"]
user_question = messages[-1]["content"]
prompt = f"### Instruction \n {user_question} \n ### Context \n {context} \n ### Answer \n"
# Construct payload for endpoint
payload = {
@ -118,22 +119,22 @@ class SagemakerAPILLM(BaseLLM):
"temperature": 0.1,
"max_new_tokens": 512,
"repetition_penalty": 1.03,
"stop": ["</s>", "###"]
}
"stop": ["</s>", "###"],
},
}
body_bytes = json.dumps(payload).encode('utf-8')
body_bytes = json.dumps(payload).encode("utf-8")
# Invoke the endpoint
response = self.runtime.invoke_endpoint_with_response_stream(EndpointName=self.endpoint,
ContentType='application/json',
Body=body_bytes)
#result = json.loads(response['Body'].read().decode())
event_stream = response['Body']
start_json = b'{'
response = self.runtime.invoke_endpoint_with_response_stream(
EndpointName=self.endpoint, ContentType="application/json", Body=body_bytes
)
# result = json.loads(response['Body'].read().decode())
event_stream = response["Body"]
start_json = b"{"
for line in LineIterator(event_stream):
if line != b'' and start_json in line:
#print(line)
data = json.loads(line[line.find(start_json):].decode('utf-8'))
if data['token']['text'] not in ["</s>", "###"]:
print(data['token']['text'],end='')
yield data['token']['text']
if line != b"" and start_json in line:
# print(line)
data = json.loads(line[line.find(start_json) :].decode("utf-8"))
if data["token"]["text"] not in ["</s>", "###"]:
print(data["token"]["text"], end="")
yield data["token"]["text"]

application/requirements.txt
@ -10,7 +10,7 @@ escodegen==1.0.11
esprima==4.0.1
faiss-cpu==1.7.4
Flask==3.0.1
gunicorn==21.2.0
gunicorn==22.0.0
html2text==2020.1.16
javalang==0.13.0
langchain==0.1.4
@ -19,7 +19,7 @@ nltk==3.8.1
openapi3_parser==1.1.16
pandas==2.2.0
pydantic_settings==2.1.0
pymongo==4.6.1
pymongo==4.6.3
PyPDF2==3.0.1
python-dotenv==1.0.1
qdrant-client==1.8.2
@ -27,9 +27,9 @@ redis==5.0.1
Requests==2.31.0
retry==0.9.2
sentence-transformers
tiktoken==0.5.2
torch==2.1.2
tiktoken
torch
tqdm==4.66.1
transformers==4.36.2
transformers==4.38.0
unstructured==0.12.2
Werkzeug==3.0.1

application/retriever/brave_search.py
@ -6,43 +6,54 @@ from application.utils import count_tokens
from langchain_community.tools import BraveSearch
class BraveRetSearch(BaseRetriever):
def __init__(self, question, source, chat_history, prompt, chunks=2, gpt_model='docsgpt'):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.source = source
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.user_api_key = user_api_key
def _get_data(self):
if self.chunks == 0:
docs = []
else:
search = BraveSearch.from_api_key(api_key=settings.BRAVE_SEARCH_API_KEY,
search_kwargs={"count": int(self.chunks)})
search = BraveSearch.from_api_key(
api_key=settings.BRAVE_SEARCH_API_KEY,
search_kwargs={"count": int(self.chunks)},
)
results = search.run(self.question)
results = json.loads(results)
docs = []
for i in results:
try:
title = i['title']
link = i['link']
snippet = i['snippet']
title = i["title"]
link = i["link"]
snippet = i["snippet"]
docs.append({"text": snippet, "title": title, "link": link})
except IndexError:
pass
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
@ -56,20 +67,29 @@ class BraveRetSearch(BaseRetriever):
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(i["response"])
if tokens_current_history + tokens_batch < settings.TOKENS_MAX_HISTORY:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(
i["response"]
)
if (
tokens_current_history + tokens_batch
< settings.TOKENS_MAX_HISTORY
):
tokens_current_history += tokens_batch
messages_combine.append({"role": "user", "content": i["prompt"]})
messages_combine.append({"role": "system", "content": i["response"]})
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=settings.API_KEY)
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model,
messages=messages_combine)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()
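
The history windowing inside `gen()` is the same rule all three retrievers in this diff use; isolated for clarity (a sketch using the same `count_tokens` helper, iterating newest-first as the original's `reverse()` does):

```python
from application.core.settings import settings
from application.utils import count_tokens


def window_history(chat_history):
    """Keep recent turns while they fit under TOKENS_MAX_HISTORY."""
    messages, used = [], 0
    for turn in reversed(chat_history):  # newest first
        if "prompt" not in turn or "response" not in turn:
            continue
        cost = count_tokens(turn["prompt"]) + count_tokens(turn["response"])
        if used + cost < settings.TOKENS_MAX_HISTORY:
            used += cost
            messages.append({"role": "user", "content": turn["prompt"]})
            messages.append({"role": "system", "content": turn["response"]})
    return messages
```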

application/retriever/classic_rag.py
@ -7,21 +7,30 @@ from application.llm.llm_creator import LLMCreator
from application.utils import count_tokens
class ClassicRAG(BaseRetriever):
def __init__(self, question, source, chat_history, prompt, chunks=2, gpt_model='docsgpt'):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.vectorstore = self._get_vectorstore(source=source)
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.user_api_key = user_api_key
def _get_vectorstore(self, source):
if "active_docs" in source:
if source["active_docs"].split("/")[0] == "default":
vectorstore = ""
vectorstore = ""
elif source["active_docs"].split("/")[0] == "local":
vectorstore = "indexes/" + source["active_docs"]
else:
@ -33,32 +42,33 @@ class ClassicRAG(BaseRetriever):
vectorstore = os.path.join("application", vectorstore)
return vectorstore
def _get_data(self):
if self.chunks == 0:
docs = []
else:
docsearch = VectorCreator.create_vectorstore(
settings.VECTOR_STORE,
self.vectorstore,
settings.EMBEDDINGS_KEY
settings.VECTOR_STORE, self.vectorstore, settings.EMBEDDINGS_KEY
)
docs_temp = docsearch.search(self.question, k=self.chunks)
docs = [
{
"title": i.metadata['title'].split('/')[-1] if i.metadata else i.page_content,
"text": i.page_content
}
"title": (
i.metadata["title"].split("/")[-1]
if i.metadata
else i.page_content
),
"text": i.page_content,
}
for i in docs_temp
]
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
@ -72,20 +82,29 @@ class ClassicRAG(BaseRetriever):
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(i["response"])
if tokens_current_history + tokens_batch < settings.TOKENS_MAX_HISTORY:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(
i["response"]
)
if (
tokens_current_history + tokens_batch
< settings.TOKENS_MAX_HISTORY
):
tokens_current_history += tokens_batch
messages_combine.append({"role": "user", "content": i["prompt"]})
messages_combine.append({"role": "system", "content": i["response"]})
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=settings.API_KEY)
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model,
messages=messages_combine)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()

application/retriever/duckduck_search.py
@ -6,16 +6,25 @@ from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
class DuckDuckSearch(BaseRetriever):
def __init__(self, question, source, chat_history, prompt, chunks=2, gpt_model='docsgpt'):
def __init__(
self,
question,
source,
chat_history,
prompt,
chunks=2,
gpt_model="docsgpt",
user_api_key=None,
):
self.question = question
self.source = source
self.chat_history = chat_history
self.prompt = prompt
self.chunks = chunks
self.gpt_model = gpt_model
self.user_api_key = user_api_key
def _parse_lang_string(self, input_string):
result = []
@ -30,12 +39,12 @@ class DuckDuckSearch(BaseRetriever):
current_item = ""
elif inside_brackets:
current_item += char
if inside_brackets:
result.append(current_item)
return result
def _get_data(self):
if self.chunks == 0:
docs = []
@ -44,7 +53,7 @@ class DuckDuckSearch(BaseRetriever):
search = DuckDuckGoSearchResults(api_wrapper=wrapper)
results = search.run(self.question)
results = self._parse_lang_string(results)
docs = []
for i in results:
try:
@ -56,12 +65,12 @@ class DuckDuckSearch(BaseRetriever):
pass
if settings.LLM_NAME == "llama.cpp":
docs = [docs[0]]
return docs
def gen(self):
docs = self._get_data()
# join all page_content together with a newline
docs_together = "\n".join([doc["text"] for doc in docs])
p_chat_combine = self.prompt.replace("{summaries}", docs_together)
@ -75,20 +84,29 @@ class DuckDuckSearch(BaseRetriever):
self.chat_history.reverse()
for i in self.chat_history:
if "prompt" in i and "response" in i:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(i["response"])
if tokens_current_history + tokens_batch < settings.TOKENS_MAX_HISTORY:
tokens_batch = count_tokens(i["prompt"]) + count_tokens(
i["response"]
)
if (
tokens_current_history + tokens_batch
< settings.TOKENS_MAX_HISTORY
):
tokens_current_history += tokens_batch
messages_combine.append({"role": "user", "content": i["prompt"]})
messages_combine.append({"role": "system", "content": i["response"]})
messages_combine.append(
{"role": "user", "content": i["prompt"]}
)
messages_combine.append(
{"role": "system", "content": i["response"]}
)
messages_combine.append({"role": "user", "content": self.question})
llm = LLMCreator.create_llm(settings.LLM_NAME, api_key=settings.API_KEY)
llm = LLMCreator.create_llm(
settings.LLM_NAME, api_key=settings.API_KEY, user_api_key=self.user_api_key
)
completion = llm.gen_stream(model=self.gpt_model,
messages=messages_combine)
completion = llm.gen_stream(model=self.gpt_model, messages=messages_combine)
for line in completion:
yield {"answer": str(line)}
def search(self):
return self._get_data()

application/usage.py
@ -0,0 +1,49 @@
import sys
from pymongo import MongoClient
from datetime import datetime
from application.core.settings import settings
from application.utils import count_tokens
mongo = MongoClient(settings.MONGO_URI)
db = mongo["docsgpt"]
usage_collection = db["token_usage"]
def update_token_usage(user_api_key, token_usage):
if "pytest" in sys.modules:
return
usage_data = {
"api_key": user_api_key,
"prompt_tokens": token_usage["prompt_tokens"],
"generated_tokens": token_usage["generated_tokens"],
"timestamp": datetime.now(),
}
usage_collection.insert_one(usage_data)
def gen_token_usage(func):
def wrapper(self, model, messages, stream, **kwargs):
for message in messages:
self.token_usage["prompt_tokens"] += count_tokens(message["content"])
result = func(self, model, messages, stream, **kwargs)
self.token_usage["generated_tokens"] += count_tokens(result)
update_token_usage(self.user_api_key, self.token_usage)
return result
return wrapper
def stream_token_usage(func):
def wrapper(self, model, messages, stream, **kwargs):
for message in messages:
self.token_usage["prompt_tokens"] += count_tokens(message["content"])
batch = []
result = func(self, model, messages, stream, **kwargs)
for r in result:
batch.append(r)
yield r
for line in batch:
self.token_usage["generated_tokens"] += count_tokens(line)
update_token_usage(self.user_api_key, self.token_usage)
return wrapper
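
What `update_token_usage()` writes can be read back directly; a sketch assuming the same `MONGO_URI` as the application and a hypothetical key:

```python
from pymongo import MongoClient

from application.core.settings import settings

usage = MongoClient(settings.MONGO_URI)["docsgpt"]["token_usage"]
# one document per generation call, keyed by the user's API key
for doc in usage.find({"api_key": "abc123"}).limit(5):  # hypothetical key
    print(doc["timestamp"], doc["prompt_tokens"], doc["generated_tokens"])
```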

docs/package-lock.json
@ -8143,9 +8143,9 @@
"integrity": "sha512-gkXMxRzUH+PB0ax9dUN0yYF0S25BqeAYqhgMaLUFmpXLEk7Fcu8f4emJuOAY0V8kjDICxROIKsTAKsV/v355xw=="
},
"node_modules/npm": {
"version": "10.5.0",
"resolved": "https://registry.npmjs.org/npm/-/npm-10.5.0.tgz",
"integrity": "sha512-Ejxwvfh9YnWVU2yA5FzoYLTW52vxHCz+MHrOFg9Cc8IFgF/6f5AGPAvb5WTay5DIUP1NIfN3VBZ0cLlGO0Ys+A==",
"version": "10.5.1",
"resolved": "https://registry.npmjs.org/npm/-/npm-10.5.1.tgz",
"integrity": "sha512-RozZuGuWbbhDM2sRhOSLIRb3DLyof6TREi0TW5b3xUEBropDhDqEHv0iAjA1zsIwXKgfIkR8GvQMd4oeKKg9eQ==",
"bundleDependencies": [
"@isaacs/string-locale-compare",
"@npmcli/arborist",
@ -8154,6 +8154,7 @@
"@npmcli/map-workspaces",
"@npmcli/package-json",
"@npmcli/promise-spawn",
"@npmcli/redact",
"@npmcli/run-script",
"@sigstore/tuf",
"abbrev",
@ -8226,23 +8227,24 @@
"@npmcli/map-workspaces": "^3.0.4",
"@npmcli/package-json": "^5.0.0",
"@npmcli/promise-spawn": "^7.0.1",
"@npmcli/redact": "^1.1.0",
"@npmcli/run-script": "^7.0.4",
"@sigstore/tuf": "^2.3.1",
"@sigstore/tuf": "^2.3.2",
"abbrev": "^2.0.0",
"archy": "~1.0.0",
"cacache": "^18.0.2",
"chalk": "^5.3.0",
"ci-info": "^4.0.0",
"cli-columns": "^4.0.0",
"cli-table3": "^0.6.3",
"cli-table3": "^0.6.4",
"columnify": "^1.6.0",
"fastest-levenshtein": "^1.0.16",
"fs-minipass": "^3.0.3",
"glob": "^10.3.10",
"glob": "^10.3.12",
"graceful-fs": "^4.2.11",
"hosted-git-info": "^7.0.1",
"ini": "^4.1.1",
"init-package-json": "^6.0.0",
"ini": "^4.1.2",
"init-package-json": "^6.0.2",
"is-cidr": "^5.0.3",
"json-parse-even-better-errors": "^3.0.1",
"libnpmaccess": "^8.0.1",
@ -8257,11 +8259,11 @@
"libnpmteam": "^6.0.0",
"libnpmversion": "^5.0.1",
"make-fetch-happen": "^13.0.0",
"minimatch": "^9.0.3",
"minimatch": "^9.0.4",
"minipass": "^7.0.4",
"minipass-pipeline": "^1.2.4",
"ms": "^2.1.2",
"node-gyp": "^10.0.1",
"node-gyp": "^10.1.0",
"nopt": "^7.2.0",
"normalize-package-data": "^6.0.0",
"npm-audit-report": "^5.0.0",
@ -8269,7 +8271,7 @@
"npm-package-arg": "^11.0.1",
"npm-pick-manifest": "^9.0.0",
"npm-profile": "^9.0.0",
"npm-registry-fetch": "^16.1.0",
"npm-registry-fetch": "^16.2.0",
"npm-user-validate": "^2.0.0",
"npmlog": "^7.0.1",
"p-map": "^4.0.0",
@ -8277,12 +8279,12 @@
"parse-conflict-json": "^3.0.1",
"proc-log": "^3.0.0",
"qrcode-terminal": "^0.12.0",
"read": "^2.1.0",
"read": "^3.0.1",
"semver": "^7.6.0",
"spdx-expression-parse": "^3.0.1",
"ssri": "^10.0.5",
"supports-color": "^9.4.0",
"tar": "^6.2.0",
"tar": "^6.2.1",
"text-table": "~0.2.0",
"tiny-relative-date": "^1.3.0",
"treeverse": "^3.0.0",
@ -8339,8 +8341,6 @@
},
"node_modules/npm/node_modules/@isaacs/cliui": {
"version": "8.0.2",
"resolved": "https://registry.npmjs.org/@isaacs/cliui/-/cliui-8.0.2.tgz",
"integrity": "sha512-O8jcjabXaleOG9DQ0+ARXWZBTfnP4WNAqzuiJK7ll44AmxGKv/J2M4TPjxjY3znBCfvBXFzucm1twdyFybFqEA==",
"inBundle": true,
"license": "ISC",
"dependencies": {
@ -8357,8 +8357,6 @@
},
"node_modules/npm/node_modules/@isaacs/cliui/node_modules/ansi-regex": {
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.0.1.tgz",
"integrity": "sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA==",
"inBundle": true,
"license": "MIT",
"engines": {
@ -8370,8 +8368,6 @@
},
"node_modules/npm/node_modules/@isaacs/cliui/node_modules/emoji-regex": {
"version": "9.2.2",
"resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-9.2.2.tgz",
"integrity": "sha512-L18DaJsXSUk2+42pv8mLs5jJT2hqFkFE4j21wOmgbUqsZ2hL72NsUU785g9RXgo3s0ZNgVl42TiHp3ZtOv/Vyg==",
"inBundle": true,
"license": "MIT"
},
@ -8393,8 +8389,6 @@
},
"node_modules/npm/node_modules/@isaacs/cliui/node_modules/strip-ansi": {
"version": "7.1.0",
"resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.0.tgz",
"integrity": "sha512-iq6eVVI64nQQTRYq2KtEg2d2uU7LElhTJwsH4YzIHZshxlgZms/wIc4VoDQTlG/IvVIrBKG06CrZnp0qv7hkcQ==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -8428,7 +8422,7 @@
}
},
"node_modules/npm/node_modules/@npmcli/arborist": {
"version": "7.4.0",
"version": "7.4.1",
"inBundle": true,
"license": "ISC",
"dependencies": {
@ -8448,12 +8442,12 @@
"hosted-git-info": "^7.0.1",
"json-parse-even-better-errors": "^3.0.0",
"json-stringify-nice": "^1.1.4",
"minimatch": "^9.0.0",
"minimatch": "^9.0.4",
"nopt": "^7.0.0",
"npm-install-checks": "^6.2.0",
"npm-package-arg": "^11.0.1",
"npm-pick-manifest": "^9.0.0",
"npm-registry-fetch": "^16.0.0",
"npm-registry-fetch": "^16.2.0",
"npmlog": "^7.0.1",
"pacote": "^17.0.4",
"parse-conflict-json": "^3.0.0",
@ -8474,13 +8468,13 @@
}
},
"node_modules/npm/node_modules/@npmcli/config": {
"version": "8.2.0",
"version": "8.2.1",
"inBundle": true,
"license": "ISC",
"dependencies": {
"@npmcli/map-workspaces": "^3.0.2",
"ci-info": "^4.0.0",
"ini": "^4.1.0",
"ini": "^4.1.2",
"nopt": "^7.0.0",
"proc-log": "^3.0.0",
"read-package-json-fast": "^3.0.2",
@ -8643,6 +8637,14 @@
"node": "^14.17.0 || ^16.13.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/@npmcli/redact": {
"version": "1.1.0",
"inBundle": true,
"license": "ISC",
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/@npmcli/run-script": {
"version": "7.0.4",
"inBundle": true,
@ -8660,8 +8662,6 @@
},
"node_modules/npm/node_modules/@pkgjs/parseargs": {
"version": "0.11.0",
"resolved": "https://registry.npmjs.org/@pkgjs/parseargs/-/parseargs-0.11.0.tgz",
"integrity": "sha512-+1VkjdD0QBLPodGrJUeqarH8VAIvQODIbwh9XpP5Syisf7YoQgsJKPNFoqqLQlu+VQ/tVSshMR6loPMn8U+dPg==",
"inBundle": true,
"license": "MIT",
"optional": true,
@ -8711,7 +8711,7 @@
}
},
"node_modules/npm/node_modules/@sigstore/tuf": {
"version": "2.3.1",
"version": "2.3.2",
"inBundle": true,
"license": "Apache-2.0",
"dependencies": {
@ -8764,7 +8764,7 @@
}
},
"node_modules/npm/node_modules/agent-base": {
"version": "7.1.0",
"version": "7.1.1",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -8788,8 +8788,6 @@
},
"node_modules/npm/node_modules/ansi-regex": {
"version": "5.0.1",
"resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-5.0.1.tgz",
"integrity": "sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==",
"inBundle": true,
"license": "MIT",
"engines": {
@ -8798,8 +8796,6 @@
},
"node_modules/npm/node_modules/ansi-styles": {
"version": "6.2.1",
"resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-6.2.1.tgz",
"integrity": "sha512-bN798gFfQX+viw3R7yrGWRqnrN2oRkEkUjjl4JNn4E8GxxbjtG3FbrEIIY3l8/hrwUwIeCZvi4QuOTP4MErVug==",
"inBundle": true,
"license": "MIT",
"engines": {
@ -8829,8 +8825,6 @@
},
"node_modules/npm/node_modules/balanced-match": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
"integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==",
"inBundle": true,
"license": "MIT"
},
@ -8849,17 +8843,18 @@
}
},
"node_modules/npm/node_modules/binary-extensions": {
"version": "2.2.0",
"version": "2.3.0",
"inBundle": true,
"license": "MIT",
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/npm/node_modules/brace-expansion": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.1.tgz",
"integrity": "sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -8961,7 +8956,7 @@
}
},
"node_modules/npm/node_modules/cli-table3": {
"version": "0.6.3",
"version": "0.6.4",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -8992,8 +8987,6 @@
},
"node_modules/npm/node_modules/color-convert": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz",
"integrity": "sha512-RRECPsj7iu/xb5oKYcsFHSppFNnsj/52OVTRKb4zP5onXwVF3zVmmToNcOfGC+CRDpfK/U584fMg38ZHCaElKQ==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -9005,8 +8998,6 @@
},
"node_modules/npm/node_modules/color-name": {
"version": "1.1.4",
"resolved": "https://registry.npmjs.org/color-name/-/color-name-1.1.4.tgz",
"integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==",
"inBundle": true,
"license": "MIT"
},
@ -9042,8 +9033,6 @@
},
"node_modules/npm/node_modules/cross-spawn": {
"version": "7.0.3",
"resolved": "https://registry.npmjs.org/cross-spawn/-/cross-spawn-7.0.3.tgz",
"integrity": "sha512-iRDPJKUPVEND7dHPO8rkbOnPpyDygcDFtWjpeWNCgy8WP2rXcxXL8TskReQl6OrB2G7+UJrags1q15Fudc7G6w==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@ -9071,8 +9060,6 @@
},
"node_modules/npm/node_modules/cssesc": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/cssesc/-/cssesc-3.0.0.tgz",
"integrity": "sha512-/Tb/JcjK111nNScGob5MNtsntNM1aCNUDipB/TkwZFhyDrrE47SOx/18wF2bbjgc3ZzCSKW1T5nt5EbFoAz/Vg==",
"inBundle": true,
"license": "MIT",
"bin": {
@ -9124,15 +9111,11 @@
},
"node_modules/npm/node_modules/eastasianwidth": {
"version": "0.2.0",
"resolved": "https://registry.npmjs.org/eastasianwidth/-/eastasianwidth-0.2.0.tgz",
"integrity": "sha512-I88TYZWc9XiYHRQ4/3c5rjjfgkjhLyW2luGIheGERbNQ6OY7yTybanSpDXZa8y7VUP9YmDcYa+eyq4ca7iLqWA==",
"inBundle": true,
"license": "MIT"
},
"node_modules/npm/node_modules/emoji-regex": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-8.0.0.tgz",
"integrity": "sha512-MSjYzcWNOA0ewAHpz0MxpYFvwg6yjy1NG3xteoqz644VCo/RPgnr1/GGt+ic3iJTzQ8Eu3TdM14SawnVUmGE6A==",
"inBundle": true,
"license": "MIT"
},
@@ -9173,8 +9156,6 @@
},
"node_modules/npm/node_modules/foreground-child": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/foreground-child/-/foreground-child-3.1.1.tgz",
"integrity": "sha512-TMKDUnIte6bfb5nWv7V/caI169OHgvwjb7V4WkeUvbQQdjr5rWKqHFiKWb/fcOwB+CzBT+qbWjvj+DVwRskpIg==",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -9201,8 +9182,6 @@
},
"node_modules/npm/node_modules/function-bind": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
"integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
"inBundle": true,
"license": "MIT",
"funding": {
@@ -9228,17 +9207,15 @@
}
},
"node_modules/npm/node_modules/glob": {
"version": "10.3.10",
"resolved": "https://registry.npmjs.org/glob/-/glob-10.3.10.tgz",
"integrity": "sha512-fa46+tv1Ak0UPK1TOy/pZrIybNNt4HCv7SDzwyfiOZkvZLEbjsZkJBPtDHVshZjbecAoAGSC20MjLDG/qr679g==",
"version": "10.3.12",
"inBundle": true,
"license": "ISC",
"dependencies": {
"foreground-child": "^3.1.0",
"jackspeak": "^2.3.5",
"jackspeak": "^2.3.6",
"minimatch": "^9.0.1",
"minipass": "^5.0.0 || ^6.0.2 || ^7.0.0",
"path-scurry": "^1.10.1"
"minipass": "^7.0.4",
"path-scurry": "^1.10.2"
},
"bin": {
"glob": "dist/esm/bin.mjs"
@@ -9252,8 +9229,6 @@
},
"node_modules/npm/node_modules/graceful-fs": {
"version": "4.2.11",
"resolved": "https://registry.npmjs.org/graceful-fs/-/graceful-fs-4.2.11.tgz",
"integrity": "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==",
"inBundle": true,
"license": "ISC"
},
@@ -9353,7 +9328,7 @@
}
},
"node_modules/npm/node_modules/ini": {
"version": "4.1.1",
"version": "4.1.2",
"inBundle": true,
"license": "ISC",
"engines": {
@@ -9361,14 +9336,14 @@
}
},
"node_modules/npm/node_modules/init-package-json": {
"version": "6.0.0",
"version": "6.0.2",
"inBundle": true,
"license": "ISC",
"dependencies": {
"@npmcli/package-json": "^5.0.0",
"npm-package-arg": "^11.0.0",
"promzard": "^1.0.0",
"read": "^2.0.0",
"read-package-json": "^7.0.0",
"read": "^3.0.1",
"semver": "^7.3.5",
"validate-npm-package-license": "^3.0.4",
"validate-npm-package-name": "^5.0.0"
@@ -9429,8 +9404,6 @@
},
"node_modules/npm/node_modules/is-fullwidth-code-point": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/is-fullwidth-code-point/-/is-fullwidth-code-point-3.0.0.tgz",
"integrity": "sha512-zymm5+u+sCsSWyD9qNaejV3DFvhCKclKdizYaJUuHA83RLjb7nSuGnddCHGv0hk+KY7BMAlsWeK4Ueg6EV6XQg==",
"inBundle": true,
"license": "MIT",
"engines": {
@@ -9444,15 +9417,11 @@
},
"node_modules/npm/node_modules/isexe": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/isexe/-/isexe-2.0.0.tgz",
"integrity": "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==",
"inBundle": true,
"license": "ISC"
},
"node_modules/npm/node_modules/jackspeak": {
"version": "2.3.6",
"resolved": "https://registry.npmjs.org/jackspeak/-/jackspeak-2.3.6.tgz",
"integrity": "sha512-N3yCS/NegsOBokc8GAdM8UcmfsKiSS8cipheD/nivzr700H+nsMOxJjQnvwOcRYVuFkdH0wGUvW2WbXGmrZGbQ==",
"inBundle": true,
"license": "BlueOak-1.0.0",
"dependencies": {
@@ -9508,38 +9477,38 @@
"license": "MIT"
},
"node_modules/npm/node_modules/libnpmaccess": {
"version": "8.0.2",
"version": "8.0.3",
"inBundle": true,
"license": "ISC",
"dependencies": {
"npm-package-arg": "^11.0.1",
"npm-registry-fetch": "^16.0.0"
"npm-registry-fetch": "^16.2.0"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/libnpmdiff": {
"version": "6.0.7",
"version": "6.0.8",
"inBundle": true,
"license": "ISC",
"dependencies": {
"@npmcli/arborist": "^7.2.1",
"@npmcli/disparity-colors": "^3.0.0",
"@npmcli/installed-package-contents": "^2.0.2",
"binary-extensions": "^2.2.0",
"binary-extensions": "^2.3.0",
"diff": "^5.1.0",
"minimatch": "^9.0.0",
"minimatch": "^9.0.4",
"npm-package-arg": "^11.0.1",
"pacote": "^17.0.4",
"tar": "^6.2.0"
"tar": "^6.2.1"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/libnpmexec": {
"version": "7.0.8",
"version": "7.0.9",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -9550,7 +9519,7 @@
"npmlog": "^7.0.1",
"pacote": "^17.0.4",
"proc-log": "^3.0.0",
"read": "^2.0.0",
"read": "^3.0.1",
"read-package-json-fast": "^3.0.2",
"semver": "^7.3.7",
"walk-up-path": "^3.0.1"
@@ -9560,7 +9529,7 @@
}
},
"node_modules/npm/node_modules/libnpmfund": {
"version": "5.0.5",
"version": "5.0.6",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -9571,31 +9540,31 @@
}
},
"node_modules/npm/node_modules/libnpmhook": {
"version": "10.0.1",
"version": "10.0.2",
"inBundle": true,
"license": "ISC",
"dependencies": {
"aproba": "^2.0.0",
"npm-registry-fetch": "^16.0.0"
"npm-registry-fetch": "^16.2.0"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/libnpmorg": {
"version": "6.0.2",
"version": "6.0.3",
"inBundle": true,
"license": "ISC",
"dependencies": {
"aproba": "^2.0.0",
"npm-registry-fetch": "^16.0.0"
"npm-registry-fetch": "^16.2.0"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/libnpmpack": {
"version": "6.0.7",
"version": "6.0.8",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -9609,14 +9578,14 @@
}
},
"node_modules/npm/node_modules/libnpmpublish": {
"version": "9.0.4",
"version": "9.0.5",
"inBundle": true,
"license": "ISC",
"dependencies": {
"ci-info": "^4.0.0",
"normalize-package-data": "^6.0.0",
"npm-package-arg": "^11.0.1",
"npm-registry-fetch": "^16.0.0",
"npm-registry-fetch": "^16.2.0",
"proc-log": "^3.0.0",
"semver": "^7.3.7",
"sigstore": "^2.2.0",
@@ -9627,23 +9596,23 @@
}
},
"node_modules/npm/node_modules/libnpmsearch": {
"version": "7.0.1",
"version": "7.0.2",
"inBundle": true,
"license": "ISC",
"dependencies": {
"npm-registry-fetch": "^16.0.0"
"npm-registry-fetch": "^16.2.0"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
}
},
"node_modules/npm/node_modules/libnpmteam": {
"version": "6.0.1",
"version": "6.0.2",
"inBundle": true,
"license": "ISC",
"dependencies": {
"aproba": "^2.0.0",
"npm-registry-fetch": "^16.0.0"
"npm-registry-fetch": "^16.2.0"
},
"engines": {
"node": "^16.14.0 || >=18.0.0"
@@ -9694,9 +9663,7 @@
}
},
"node_modules/npm/node_modules/minimatch": {
"version": "9.0.3",
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-9.0.3.tgz",
"integrity": "sha512-RHiac9mvaRw0x3AYRgDC1CxAP7HTcNrrECeA8YYJeWnpo+2Q5CegtZjaotWTWxDG3UeGA1coE05iH1mPjT/2mg==",
"version": "9.0.4",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -9711,8 +9678,6 @@
},
"node_modules/npm/node_modules/minipass": {
"version": "7.0.4",
"resolved": "https://registry.npmjs.org/minipass/-/minipass-7.0.4.tgz",
"integrity": "sha512-jYofLM5Dam9279rdkWzqHozUo4ybjdZmCsDHePy5V/PbBcVMiSZR97gmAy45aqi8CK1lG2ECd356FU86avfwUQ==",
"inBundle": true,
"license": "ISC",
"engines": {
@@ -9888,7 +9853,7 @@
}
},
"node_modules/npm/node_modules/node-gyp": {
"version": "10.0.1",
"version": "10.1.0",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10028,10 +9993,11 @@
}
},
"node_modules/npm/node_modules/npm-registry-fetch": {
"version": "16.1.0",
"version": "16.2.0",
"inBundle": true,
"license": "ISC",
"dependencies": {
"@npmcli/redact": "^1.1.0",
"make-fetch-happen": "^13.0.0",
"minipass": "^7.0.2",
"minipass-fetch": "^3.0.0",
@@ -10126,8 +10092,6 @@
},
"node_modules/npm/node_modules/path-key": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/path-key/-/path-key-3.1.1.tgz",
"integrity": "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q==",
"inBundle": true,
"license": "MIT",
"engines": {
@@ -10135,13 +10099,11 @@
}
},
"node_modules/npm/node_modules/path-scurry": {
"version": "1.10.1",
"resolved": "https://registry.npmjs.org/path-scurry/-/path-scurry-1.10.1.tgz",
"integrity": "sha512-MkhCqzzBEpPvxxQ71Md0b1Kk51W01lrYvlMzSUaIzNsODdd7mqhiimSZlr+VegAz5Z6Vzt9Xg2ttE//XBhH3EQ==",
"version": "1.10.2",
"inBundle": true,
"license": "BlueOak-1.0.0",
"dependencies": {
"lru-cache": "^9.1.1 || ^10.0.0",
"lru-cache": "^10.2.0",
"minipass": "^5.0.0 || ^6.0.2 || ^7.0.0"
},
"engines": {
@@ -10205,11 +10167,11 @@
}
},
"node_modules/npm/node_modules/promzard": {
"version": "1.0.0",
"version": "1.0.1",
"inBundle": true,
"license": "ISC",
"dependencies": {
"read": "^2.0.0"
"read": "^3.0.1"
},
"engines": {
"node": "^14.17.0 || ^16.13.0 || >=18.0.0"
@@ -10223,11 +10185,11 @@
}
},
"node_modules/npm/node_modules/read": {
"version": "2.1.0",
"version": "3.0.1",
"inBundle": true,
"license": "ISC",
"dependencies": {
"mute-stream": "~1.0.0"
"mute-stream": "^1.0.0"
},
"engines": {
"node": "^14.17.0 || ^16.13.0 || >=18.0.0"
@@ -10277,8 +10239,6 @@
},
"node_modules/npm/node_modules/safer-buffer": {
"version": "2.1.2",
"resolved": "https://registry.npmjs.org/safer-buffer/-/safer-buffer-2.1.2.tgz",
"integrity": "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==",
"inBundle": true,
"license": "MIT",
"optional": true
@@ -10315,8 +10275,6 @@
},
"node_modules/npm/node_modules/shebang-command": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/shebang-command/-/shebang-command-2.0.0.tgz",
"integrity": "sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10328,8 +10286,6 @@
},
"node_modules/npm/node_modules/shebang-regex": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/shebang-regex/-/shebang-regex-3.0.0.tgz",
"integrity": "sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A==",
"inBundle": true,
"license": "MIT",
"engines": {
@@ -10338,8 +10294,6 @@
},
"node_modules/npm/node_modules/signal-exit": {
"version": "4.1.0",
"resolved": "https://registry.npmjs.org/signal-exit/-/signal-exit-4.1.0.tgz",
"integrity": "sha512-bzyZ1e88w9O1iNJbKnOlvYTrWPDl46O1bG0D3XInv+9tkPrxrN8jUUTiFlDkkmKWgn1M6CfIA13SuGqOa9Korw==",
"inBundle": true,
"license": "ISC",
"engines": {
@@ -10441,8 +10395,6 @@
},
"node_modules/npm/node_modules/string-width": {
"version": "4.2.3",
"resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
"integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10457,8 +10409,6 @@
"node_modules/npm/node_modules/string-width-cjs": {
"name": "string-width",
"version": "4.2.3",
"resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
"integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10472,8 +10422,6 @@
},
"node_modules/npm/node_modules/strip-ansi": {
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-6.0.1.tgz",
"integrity": "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10486,8 +10434,6 @@
"node_modules/npm/node_modules/strip-ansi-cjs": {
"name": "strip-ansi",
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-6.0.1.tgz",
"integrity": "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10509,7 +10455,7 @@
}
},
"node_modules/npm/node_modules/tar": {
"version": "6.2.0",
"version": "6.2.1",
"inBundle": true,
"license": "ISC",
"dependencies": {
@@ -10609,8 +10555,6 @@
},
"node_modules/npm/node_modules/util-deprecate": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/util-deprecate/-/util-deprecate-1.0.2.tgz",
"integrity": "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw==",
"inBundle": true,
"license": "MIT"
},
@@ -10679,8 +10623,6 @@
},
"node_modules/npm/node_modules/wrap-ansi": {
"version": "8.1.0",
"resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-8.1.0.tgz",
"integrity": "sha512-si7QWI6zUMq56bESFvagtmzMdGOtoxfR+Sez11Mobfc7tm+VkUckk9bW2UeffTGVUbOksxmSw0AA2gs8g71NCQ==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10698,8 +10640,6 @@
"node_modules/npm/node_modules/wrap-ansi-cjs": {
"name": "wrap-ansi",
"version": "7.0.0",
"resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-7.0.0.tgz",
"integrity": "sha512-YVGIj2kamLSTxw6NsZjoBxfSwsn0ycdesmc4p+Q21c5zPuZ1pl+NfxVdxPtdHvmNVOQ6XSYG4AUtyt/Fi7D16Q==",
"inBundle": true,
"license": "MIT",
"dependencies": {
@@ -10730,8 +10670,6 @@
},
"node_modules/npm/node_modules/wrap-ansi/node_modules/ansi-regex": {
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.0.1.tgz",
"integrity": "sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA==",
"inBundle": true,
"license": "MIT",
"engines": {
@@ -10743,8 +10681,6 @@
},
"node_modules/npm/node_modules/wrap-ansi/node_modules/emoji-regex": {
"version": "9.2.2",
"resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-9.2.2.tgz",
"integrity": "sha512-L18DaJsXSUk2+42pv8mLs5jJT2hqFkFE4j21wOmgbUqsZ2hL72NsUU785g9RXgo3s0ZNgVl42TiHp3ZtOv/Vyg==",
"inBundle": true,
"license": "MIT"
},
@@ -10766,8 +10702,6 @@
},
"node_modules/npm/node_modules/wrap-ansi/node_modules/strip-ansi": {
"version": "7.1.0",
"resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.0.tgz",
"integrity": "sha512-iq6eVVI64nQQTRYq2KtEg2d2uU7LElhTJwsH4YzIHZshxlgZms/wIc4VoDQTlG/IvVIrBKG06CrZnp0qv7hkcQ==",
"inBundle": true,
"license": "MIT",
"dependencies": {

@@ -1,12 +1,12 @@
{
"name": "docsgpt",
"version": "0.3.6",
"version": "0.3.7",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "docsgpt",
"version": "0.3.6",
"version": "0.3.7",
"license": "Apache-2.0",
"dependencies": {
"@babel/plugin-transform-flow-strip-types": "^7.23.3",

@@ -19,7 +19,7 @@
},
"scripts": {
"build": "parcel build src/index.ts",
"dev": "parcel",
"dev": "parcel src/index.html -p 3000",
"test": "jest",
"lint": "eslint",
"check": "tsc --noEmit",

@@ -1,8 +1,7 @@
"use client";
import { Fragment, useEffect, useRef, useState } from 'react'
import { PaperPlaneIcon, RocketIcon, ExclamationTriangleIcon, Cross2Icon } from '@radix-ui/react-icons';
import { MESSAGE_TYPE } from '../models/types';
import { Query, Status } from '../models/types';
import { MESSAGE_TYPE, Query, Status } from '../types/index';
import MessageIcon from '../assets/message.svg'
import { fetchAnswerStreaming } from '../requests/streamingApi';
import styled, { keyframes, createGlobalStyle } from 'styled-components';

@@ -0,0 +1,13 @@
export type MESSAGE_TYPE = 'QUESTION' | 'ANSWER' | 'ERROR';
export type Status = 'idle' | 'loading' | 'failed';
export type FEEDBACK = 'LIKE' | 'DISLIKE';
export interface Query {
prompt: string;
response?: string;
feedback?: FEEDBACK;
error?: string;
sources?: { title: string; text: string }[];
conversationId?: string | null;
title?: string | null;
}
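A minimal usage sketch of the new shared types, assuming they are imported from the widget's types/index module introduced above; the handler names and strings are illustrative, not part of this change:

import { MESSAGE_TYPE, Query, Status } from './types/index';

// Illustrative flow: a question starts as a bare prompt, the streaming
// handler accumulates tokens into `response`, and the final message type
// settles the status.
let status: Status = 'idle';
const query: Query = { prompt: 'What does DocsGPT do?' };

function onToken(token: string): void {
  status = 'loading';
  query.response = (query.response ?? '') + token;
}

function onComplete(type: MESSAGE_TYPE): void {
  if (type === 'ERROR') {
    query.error = 'stream failed';
    status = 'failed';
  } else {
    status = 'idle';
  }
}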

@@ -58,4 +58,5 @@
"vite": "^5.0.13",
"vite-plugin-svgr": "^4.2.0"
}
}

@@ -8,7 +8,9 @@ interface ModalProps {
modalState: string;
isError: boolean;
errorMessage?: string;
textDelete?: boolean;
}
const Modal = (props: ModalProps) => {
return (
<div
@@ -23,7 +25,7 @@ const Modal = (props: ModalProps) => {
onClick={() => props.handleSubmit()}
className="ml-auto h-10 w-20 rounded-3xl bg-violet-800 text-white transition-all hover:bg-violet-700"
>
Save
{props.textDelete ? 'Delete' : 'Save'}
</button>
{props.isCancellable && (
<button

@@ -20,6 +20,8 @@ import Add from './assets/add.svg';
import UploadIcon from './assets/upload.svg';
import { ActiveState } from './models/misc';
import APIKeyModal from './preferences/APIKeyModal';
import DeleteConvModal from './preferences/DeleteConvModal';
import {
selectApiKeyStatus,
selectSelectedDocs,
@@ -29,6 +31,8 @@ import {
selectConversations,
setConversations,
selectConversationId,
selectModalStateDeleteConv,
setModalStateDeleteConv,
} from './preferences/preferenceSlice';
import {
setConversation,
@@ -66,7 +70,9 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
const docs = useSelector(selectSourceDocs);
const selectedDocs = useSelector(selectSelectedDocs);
const conversations = useSelector(selectConversations);
const modalStateDeleteConv = useSelector(selectModalStateDeleteConv);
const conversationId = useSelector(selectConversationId);
const { isMobile } = useMediaQuery();
const [isDarkTheme] = useDarkTheme();
const [isDocsListOpen, setIsDocsListOpen] = useState(false);
@@ -92,6 +98,7 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
fetchConversations();
}
}, [conversations, dispatch]);
async function fetchConversations() {
return await getConversations()
.then((fetchedConversations) => {
@@ -102,6 +109,16 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
});
}
const handleDeleteAllConversations = () => {
fetch(`${apiHost}/api/delete_all_conversations`, {
method: 'POST',
})
.then(() => {
fetchConversations();
})
.catch((error) => console.error(error));
};
const handleDeleteConversation = (id: string) => {
fetch(`${apiHost}/api/delete_conversation?id=${id}`, {
method: 'POST',
@@ -260,7 +277,9 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
<div className="mb-auto h-[56vh] overflow-y-auto overflow-x-hidden dark:text-white">
{conversations && (
<div>
<p className="ml-6 mt-3 text-sm font-semibold">Chats</p>
<div className=" my-auto mx-4 mt-2 flex h-6 items-center justify-between gap-4 rounded-3xl">
<p className="my-auto ml-6 text-sm font-semibold">Chats</p>
</div>
<div className="conversations-container">
{conversations?.map((conversation) => (
<ConversationTile
@@ -312,7 +331,6 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
</p>
</NavLink>
</div>
<div className="flex flex-col gap-2 border-b-[1.5px] py-2 dark:border-b-purple-taupe">
<NavLink
to="/about"
@@ -370,6 +388,7 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
/>
</button>
</div>
<SelectDocsModal
modalState={selectedDocsModalState}
setModalState={setSelectedDocsModalState}
@@ -380,6 +399,11 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
setModalState={setApiKeyModalState}
isCancellable={isApiKeySet}
/>
<DeleteConvModal
modalState={modalStateDeleteConv}
setModalState={setModalStateDeleteConv}
handleDeleteAllConv={handleDeleteAllConversations}
/>
<Upload
modalState={uploadModalState}
setModalState={setUploadModalState}
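The handlers above call DocsGPT's conversation endpoints directly from Navigation.tsx. A hedged sketch of the same calls pulled into a standalone helper, assuming apiHost resolves as elsewhere in the frontend (VITE_API_HOST with the hosted API as fallback); this module does not exist in the PR and is shown only to make the API surface explicit:

// Hypothetical conversationsApi.ts; paths and methods are taken from the diff above.
const apiHost = import.meta.env.VITE_API_HOST || 'https://docsapi.arc53.com';

export async function deleteConversation(id: string): Promise<void> {
  // Removes a single conversation by id.
  await fetch(`${apiHost}/api/delete_conversation?id=${id}`, { method: 'POST' });
}

export async function deleteAllConversations(): Promise<void> {
  // Removes every conversation for the current user; wired to the new modal.
  await fetch(`${apiHost}/api/delete_all_conversations`, { method: 'POST' });
}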

@@ -0,0 +1,63 @@
import { useRef } from 'react';
import { ActiveState } from '../models/misc';
import { useMediaQuery, useOutsideAlerter } from './../hooks';
import Modal from '../Modal';
import { useDispatch } from 'react-redux';
import { Action } from '@reduxjs/toolkit';
export default function DeleteConvModal({
modalState,
setModalState,
handleDeleteAllConv,
}: {
modalState: ActiveState;
setModalState: (val: ActiveState) => Action;
handleDeleteAllConv: () => void;
}) {
const dispatch = useDispatch();
const modalRef = useRef(null);
const { isMobile } = useMediaQuery();
useOutsideAlerter(
modalRef,
() => {
if (isMobile && modalState === 'ACTIVE') {
dispatch(setModalState('INACTIVE'));
}
},
[modalState],
);
function handleSubmit() {
handleDeleteAllConv();
dispatch(setModalState('INACTIVE'));
}
function handleCancel() {
dispatch(setModalState('INACTIVE'));
}
return (
<Modal
handleCancel={handleCancel}
isError={false}
modalState={modalState}
isCancellable={true}
handleSubmit={handleSubmit}
textDelete={true}
render={() => {
return (
<article
ref={modalRef}
className="mx-auto mt-24 flex w-[90vw] max-w-lg flex-col gap-4 rounded-t-lg bg-white p-6 shadow-lg"
>
<p className="text-xl text-jet">
Are you sure you want to delete all the conversations?
</p>
<p className="text-md leading-6 text-gray-500"></p>
</article>
);
}}
/>
);
}
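Note the setModalState: (val: ActiveState) => Action prop above: the parent passes the Redux action creator itself and the modal dispatches it internally, so any screen that shares the preference slice can open the same modal. A hedged wiring sketch using only names from this diff (the host component is illustrative):

import { useDispatch, useSelector } from 'react-redux';
import DeleteConvModal from './preferences/DeleteConvModal';
import {
  selectModalStateDeleteConv,
  setModalStateDeleteConv,
} from './preferences/preferenceSlice';

export default function DeleteAllExample() {
  const dispatch = useDispatch();
  const modalState = useSelector(selectModalStateDeleteConv);
  return (
    <>
      {/* Flipping the shared slice state to 'ACTIVE' opens the modal. */}
      <button onClick={() => dispatch(setModalStateDeleteConv('ACTIVE'))}>
        Delete all conversations
      </button>
      <DeleteConvModal
        modalState={modalState}
        setModalState={setModalStateDeleteConv}
        handleDeleteAllConv={() => {
          /* call the delete_all_conversations endpoint here */
        }}
      />
    </>
  );
}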

@@ -1,10 +1,12 @@
import {
PayloadAction,
createListenerMiddleware,
createSlice,
isAnyOf,
} from '@reduxjs/toolkit';
import { Doc, setLocalApiKey, setLocalRecentDocs } from './preferenceApi';
import { RootState } from '../store';
import { ActiveState } from '../models/misc';
interface Preference {
apiKey: string;
@@ -13,6 +15,7 @@ interface Preference {
chunks: string;
sourceDocs: Doc[] | null;
conversations: { name: string; id: string }[] | null;
modalState: ActiveState;
}
const initialState: Preference = {
@@ -32,6 +35,7 @@ const initialState: Preference = {
} as Doc,
sourceDocs: null,
conversations: null,
modalState: 'INACTIVE',
};
export const prefSlice = createSlice({
@@ -56,6 +60,9 @@ export const prefSlice = createSlice({
setChunks: (state, action) => {
state.chunks = action.payload;
},
setModalStateDeleteConv: (state, action: PayloadAction<ActiveState>) => {
state.modalState = action.payload;
},
},
});
@@ -66,6 +73,7 @@ export const {
setConversations,
setPrompt,
setChunks,
setModalStateDeleteConv,
} = prefSlice.actions;
export default prefSlice.reducer;
@@ -114,6 +122,8 @@ export const selectSelectedDocsStatus = (state: RootState) =>
!!state.preference.selectedDocs;
export const selectSourceDocs = (state: RootState) =>
state.preference.sourceDocs;
export const selectModalStateDeleteConv = (state: RootState) =>
state.preference.modalState;
export const selectSelectedDocs = (state: RootState) =>
state.preference.selectedDocs;
export const selectConversations = (state: RootState) =>

@@ -8,6 +8,7 @@ import {
setPrompt,
setChunks,
selectChunks,
setModalStateDeleteConv,
} from '../preferences/preferenceSlice';
const apiHost = import.meta.env.VITE_API_HOST || 'https://docsapi.arc53.com';
@@ -43,6 +44,7 @@ const General: React.FC = () => {
};
fetchPrompts();
}, []);
return (
<div className="mt-[59px]">
<div className="mb-4">
@@ -93,6 +95,19 @@
apiHost={apiHost}
/>
</div>
<div className="w-55 w-56">
<p className="font-bold text-jet dark:text-bright-gray">
Delete all conversations
</p>
<button
className="mt-2 flex w-full cursor-pointer items-center justify-between rounded-3xl border-2 border-solid border-purple-30 bg-white px-5 py-3 text-purple-30 hover:bg-purple-30 hover:text-white dark:border-chinese-silver dark:bg-transparent"
onClick={() => dispatch(setModalStateDeleteConv('ACTIVE'))}
>
<span className="overflow-hidden text-ellipsis dark:text-bright-gray">
Delete
</span>
</button>
</div>
</div>
);
};

@@ -34,6 +34,7 @@ const store = configureStore({
model: '1.0',
},
],
modalState: 'INACTIVE',
},
},
reducer: {
