Include information on the tools for creating gbnf grammar files in the llama-cpp notebook (#11764)

Hi,

I recently experimented with grammar-based sampling and discovered two
methods for speeding up the creation of gbnf grammar files:
1. [Online grammar generator
app](https://github.com/ggerganov/llama.cpp/discussions/2494) introduced
[here](https://github.com/ggerganov/llama.cpp/discussions/2494)
2.
[Script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py)
for parsing json schema to gbnf grammar

I believe it is a good idea to include the information that leads to
them in the `llama-cpp` notebook.

***

Codespell check fails but due to the unrelated script

Co-authored-by: Bagatur <baskaryan@gmail.com>
pull/11909/head
eryk-dsai 10 months ago committed by GitHub
parent c15701eebf
commit c4341463e8
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -189,7 +189,8 @@
"outputs": [],
"source": [
"from langchain.llms import LlamaCpp\n",
"from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain\n",
"from langchain.callbacks.manager import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
]
@ -532,12 +533,20 @@
"source": [
"### Grammars\n",
"\n",
"We can use [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs and sample tokens based on the rules defined in them.\n",
"\n",
"We can specify [grammars](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) to constrain model outputs.\n",
"To demonstrate this concept, we've included [sample grammar files](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/llms/grammars), that will be used in the examples below.\n",
"\n",
"This will sample tokens according to the grammar.\n",
" \n",
"For example, supply the path to the specifed `json.gbnf` file in order to produce JSON."
"Creating gbnf grammar files can be time-consuming, but if you have a use-case where output schemas are important, there are two tools that can help:\n",
"- [Online grammar generator app](https://grammar.intrinsiclabs.ai/) that converts TypeScript interface definitions to gbnf file.\n",
"- [Python script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py) for converting json schema to gbnf file. You can for example create `pydantic` object, generate its JSON schema using `.schema_json()` method, and then use this script to convert it to gbnf file."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the first example, supply the path to the specifed `json.gbnf` file in order to produce JSON:"
]
},
{
@ -612,7 +621,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also supply `list.gbnf` to return a list."
"We can also supply `list.gbnf` to return a list:"
]
},
{
@ -667,7 +676,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3.10.12 ('langchain_venv': venv)",
"language": "python",
"name": "python3"
},
@ -681,7 +690,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.12"
},
"vscode": {
"interpreter": {
"hash": "d1d3a3c58a58885896c5459933a599607cdbb9917d7e1ad7516c8786c51f2dd2"
}
}
},
"nbformat": 4,

Loading…
Cancel
Save