From c4341463e84fa032259ba0620c027de00263d7a8 Mon Sep 17 00:00:00 2001
From: eryk-dsai <142571618+eryk-dsai@users.noreply.github.com>
Date: Tue, 17 Oct 2023 05:28:32 +0200
Subject: [PATCH] Include information on the tools for creating gbnf grammar
 files in the llama-cpp notebook (#11764)

Hi,

I recently experimented with grammar-based sampling and discovered two
methods for speeding up the creation of gbnf grammar files:
1. [Online grammar generator
app](https://grammar.intrinsiclabs.ai/), introduced
[here](https://github.com/ggerganov/llama.cpp/discussions/2494)
2. [Script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py)
for converting a JSON schema to a gbnf grammar

I believe it is a good idea to link to both of them from the
`llama-cpp` notebook.
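
As a rough sketch of the second method, the snippet below writes a small
hand-written JSON schema to a file that the script could then read. The
schema, the file name, and the `write_schema` helper are hypothetical and
only for illustration; the invocation in the last comment is an assumption
about the script's command line, so check its `--help` output first.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema for illustration only; it is not part of this patch.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}


def write_schema(directory: str) -> Path:
    """Serialize the schema to a JSON file for the converter script to read."""
    out = Path(directory) / "person.schema.json"
    out.write_text(json.dumps(schema, indent=2))
    return out


schema_path = write_schema(tempfile.mkdtemp())

# Assumed invocation (not verified against the script in this patch):
#   python json-schema-to-grammar.py <schema_path> > person.gbnf
```

The resulting `person.gbnf` would then be passed to `LlamaCpp` in the same
way as the sample grammar files.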

***

The Codespell check fails, but only because of an unrelated script.

Co-authored-by: Bagatur <baskaryan@gmail.com>
---
 docs/docs/integrations/llms/llamacpp.ipynb | 30 ++++++++++++++++------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/docs/docs/integrations/llms/llamacpp.ipynb b/docs/docs/integrations/llms/llamacpp.ipynb
index 45df2ac8a8..cf9fa21bb5 100644
--- a/docs/docs/integrations/llms/llamacpp.ipynb
+++ b/docs/docs/integrations/llms/llamacpp.ipynb
@@ -189,7 +189,8 @@
    "outputs": [],
    "source": [
     "from langchain.llms import LlamaCpp\n",
-    "from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain\n",
+    "from langchain.prompts import PromptTemplate\n",
+    "from langchain.chains import LLMChain\n",
     "from langchain.callbacks.manager import CallbackManager\n",
     "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
    ]
@@ -532,12 +533,20 @@
    "source": [
     "### Grammars\n",
     "\n",
+    "We can use [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs and sample tokens based on the rules defined in them.\n",
     "\n",
-    "We can specify [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs.\n",
+    "To demonstrate this concept, we've included [sample grammar files](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/llms/grammars) that are used in the examples below.\n",
     "\n",
-    "This will sample tokens according to the grammar.\n",
-    "  \n",
-    "For example, supply the path to the specifed `json.gbnf` file in order to produce JSON."
+    "Creating gbnf grammar files can be time-consuming, but if output schemas matter for your use case, two tools can help:\n",
+    "- [Online grammar generator app](https://grammar.intrinsiclabs.ai/), which converts TypeScript interface definitions to a gbnf file.\n",
+    "- [Python script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py) for converting a JSON schema to a gbnf file. For example, you can create a `pydantic` object, generate its JSON schema using the `.schema_json()` method, and then use this script to convert it to a gbnf file."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the first example, supply the path to the specified `json.gbnf` file in order to produce JSON:"
    ]
   },
   {
@@ -612,7 +621,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We can also supply `list.gbnf` to return a list."
+    "We can also supply `list.gbnf` to return a list:"
    ]
   },
   {
@@ -667,7 +676,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3.10.12 ('langchain_venv': venv)",
    "language": "python",
    "name": "python3"
   },
@@ -681,7 +690,12 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.16"
+   "version": "3.10.12"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "d1d3a3c58a58885896c5459933a599607cdbb9917d7e1ad7516c8786c51f2dd2"
+   }
   }
  },
  "nbformat": 4,