Update `n_gpu_layers`'s description (#16685)

The `n_gpu_layers` parameter in `llama.cpp` accepts `-1`, which offloads
all model layers to the GPU, so the documentation has been updated to
mention it.

Ref:
35918873b4/llama_cpp/server/settings.py (L29C22-L29C117)
35918873b4/llama_cpp/llama.py (L125)
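
For context, a minimal sketch (not part of the commit) of what `-1` means when calling llama-cpp-python directly; the model path below is a placeholder, not a file from the repo:

from llama_cpp import Llama

# Placeholder model path for illustration only.
llm = Llama(
    model_path="/path/to/model.gguf",
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU
    n_batch=512,
)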
Bob Lin, 8 months ago (committed by GitHub)
parent 0600998f38
commit 0866a984fe

@@ -415,7 +415,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"n_gpu_layers = 40 # Change this value based on your model and your GPU VRAM pool.\n",
+"n_gpu_layers = -1 # The number of layers to put on the GPU. The rest will be on the CPU. If you don't know how many layers there are, you can use -1 to move all to GPU.\n",
 "n_batch = 512 # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.\n",
 "\n",
 "# Make sure the model path is correct for your system!\n",
@@ -501,7 +501,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"n_gpu_layers = 1 # Change this value based on your model and your GPU VRAM pool.\n",
+"n_gpu_layers = 1 # The number of layers to put on the GPU. The rest will be on the CPU. If you don't know how many layers there are, you can use -1 to move all to GPU.\n",
 "n_batch = 512 # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.\n",
 "# Make sure the model path is correct for your system!\n",
 "llm = LlamaCpp(\n",
@@ -559,7 +559,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"n_gpu_layers = 1 # Metal set to 1 is enough.\n",
+"n_gpu_layers = 1 # The number of layers to put on the GPU. The rest will be on the CPU. If you don't know how many layers there are, you can use -1 to move all to GPU.\n",
 "n_batch = 512 # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.\n",
 "# Make sure the model path is correct for your system!\n",
 "llm = LlamaCpp(\n",
