Include information on the tools for creating gbnf grammar files in the llama-cpp notebook (#11764)

Hi, I recently experimented with grammar-based sampling and discovered two methods for speeding up the creation of gbnf grammar files:

1. [Online grammar generator app](https://github.com/ggerganov/llama.cpp/discussions/2494), introduced [here](https://github.com/ggerganov/llama.cpp/discussions/2494)
2. [Script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py) for converting a JSON schema to a gbnf grammar

I believe it is a good idea to point to both of them in the `llama-cpp` notebook.

***

The Codespell check fails, but the failure is due to an unrelated script.

Co-authored-by: Bagatur <baskaryan@gmail.com>
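As a rough illustration of the second workflow, here is a minimal sketch that writes a JSON schema to disk and builds the command line for the conversion script. The schema, the file name, and the helpers `write_schema` and `conversion_command` are hypothetical and not part of this commit. The sketch also assumes the script accepts the schema path as a positional argument and prints the grammar to stdout, which should be checked against the script's `--help`. With `pydantic`, the schema dict could instead come from parsing the output of the model's `.schema_json()` method.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema for illustration. With pydantic, an equivalent dict
# could be obtained via json.loads(SomeModel.schema_json()).
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def write_schema(schema: dict, path: Path) -> Path:
    """Serialize the schema as JSON so the conversion script can read it."""
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return path


def conversion_command(schema_path: Path, script: str = "json-schema-to-grammar.py") -> list:
    """Build the argument list for the script.

    Assumption: the script takes the schema file as a positional argument
    and writes the gbnf grammar to stdout.
    """
    return ["python", script, str(schema_path)]


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp())
    schema_file = write_schema(PERSON_SCHEMA, out_dir / "person.schema.json")
    # Print the command rather than running it, since the script's location
    # depends on where llama.cpp is checked out.
    print(" ".join(conversion_command(schema_file)))
```

Redirecting the script's stdout to a `.gbnf` file would then give a grammar that can be supplied to the model in the same way as the bundled `json.gbnf`.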
2024-11-06 03:20:49 +00:00 · 2023-10-17 05:28:32 +02:00
commit c4341463e8
parent c15701eebf
1 changed file with 22 additions and 8 deletions
--- a/docs/docs/integrations/llms/llamacpp.ipynb
+++ b/docs/docs/integrations/llms/llamacpp.ipynb
@@ -189,7 +189,8 @@
   "outputs": [],
   "source": [
    "from langchain.llms import LlamaCpp\n",
-    "from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain\n",
+    "from langchain.prompts import PromptTemplate\n",
    "from langchain.chains import LLMChain\n",
    "from langchain.callbacks.manager import CallbackManager\n",
    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
   ]
@@ -532,12 +533,20 @@
   "source": [
    "### Grammars\n",
    "\n",
    "We can use [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs and sample tokens based on the rules defined in them.\n",
    "\n",
-    "We can specify [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs.\n",
+    "To demonstrate this concept, we've included [sample grammar files](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/llms/grammars), that will be used in the examples below.\n",
    "\n",
-    "This will sample tokens according to the grammar.\n",
+    "Creating gbnf grammar files can be time-consuming, but if you have a use-case where output schemas are important, there are two tools that can help:\n",
-    "  \n",
+    "- [Online grammar generator app](https://grammar.intrinsiclabs.ai/) that converts TypeScript interface definitions to gbnf file.\n",
-    "For example, supply the path to the specifed `json.gbnf` file in order to produce JSON."
+    "- [Python script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py) for converting json schema to gbnf file. You can for example create `pydantic` object, generate its JSON schema using `.schema_json()` method, and then use this script to convert it to gbnf file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the first example, supply the path to the specifed `json.gbnf` file in order to produce JSON:"
   ]
  },
  {
@@ -612,7 +621,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can also supply `list.gbnf` to return a list."
+    "We can also supply `list.gbnf` to return a list:"
   ]
  },
  {
@@ -667,7 +676,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3.10.12 ('langchain_venv': venv)",
   "language": "python",
   "name": "python3"
  },
@@ -681,7 +690,12 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.9.16"
+   "version": "3.10.12"
  },
  "vscode": {
   "interpreter": {
    "hash": "d1d3a3c58a58885896c5459933a599607cdbb9917d7e1ad7516c8786c51f2dd2"
   }
  }
 },
 "nbformat": 4,