@@ -6,7 +6,7 @@
 "metadata": {},
 "source": [
 "# tiktoken (OpenAI) Length Function\n",
-"You can also use tiktoken, a open source tokenizer package from OpenAI to estimate tokens used. Will probably be more accurate for their models.\n",
+"You can also use tiktoken, an open source tokenizer package from OpenAI to estimate tokens used. Will probably be more accurate for their models.\n",
 "\n",
 "1. How the text is split: by character passed in\n",
 "2. How the chunk size is measured: by `tiktoken` tokenizer"