Update huggingface_length_function.ipynb (#2203)

HuggingFace -> Hugging Face
Ikko Eltociear Ashimine 2023-03-31 12:43:58 +09:00 committed by GitHub
parent 2d3918c152
commit a4a1ee6b5d
GPG Key ID: 4AEE18F83AFDEB23

@@ -5,8 +5,8 @@
 "id": "13dc0983",
 "metadata": {},
 "source": [
-"# HuggingFace Length Function\n",
-"Most LLMs are constrained by the number of tokens that you can pass in, which is not the same as the number of characters. In order to get a more accurate estimate, we can use HuggingFace tokenizers to count the text length.\n",
+"# Hugging Face Length Function\n",
+"Most LLMs are constrained by the number of tokens that you can pass in, which is not the same as the number of characters. In order to get a more accurate estimate, we can use Hugging Face tokenizers to count the text length.\n",
 "\n",
 "1. How the text is split: by character passed in\n",
 "2. How the chunk size is measured: by Hugging Face tokenizer"