# GPT4All Node.js API
Native Node.js LLM bindings for all.

```sh
yarn add gpt4all@latest

npm install gpt4all@latest

pnpm install gpt4all@latest
```

## Contents

* See [API Reference](#api-reference)
* See [Examples](#api-example)
* See [Developing](#develop)
* GPT4All Node.js bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use.

## API Example

### Chat Completion

```js
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('mistral-7b-openorca.gguf2.Q4_0.gguf', {
    verbose: true,
    device: 'gpu',
})

const completion1 = await createCompletion(model, 'What is 1 + 1?', { verbose: true })
console.log(completion1.message)

const completion2 = await createCompletion(model, 'And if we add two?', { verbose: true })
console.log(completion2.message)

// Free the native model when done.
model.dispose()
```


### Embedding

```js
import { loadModel, createEmbedding } from '../src/gpt4all.js'

const embedder = await loadModel("all-MiniLM-L6-v2-f16.gguf", { verbose: true, type: 'embedding' })

console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way"));
```

### Chat Sessions

```js
import { loadModel, createCompletion } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", {
    verbose: true,
    device: "gpu",
});

const chat = await model.createChatSession();

await createCompletion(
    chat,
    "Why are bananas rather blue than bread at night sometimes?",
    {
        verbose: true,
    }
);
await createCompletion(chat, "Are you sure?", { verbose: true });
```

### Streaming responses

```js
import gpt from "../src/gpt4all.js";

const model = await gpt.loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    device: "gpu",
});

process.stdout.write("### Stream:");
const stream = gpt.createCompletionStream(model, "How are you?");
stream.tokens.on("data", (data) => {
    process.stdout.write(data);
});
// Wait until the stream finishes; we cannot continue until it is done.
await stream.result;
process.stdout.write("\n");

process.stdout.write("### Stream with pipe:");
const stream2 = gpt.createCompletionStream(
    model,
    "Please say something nice about node streams."
);
stream2.tokens.pipe(process.stdout);
await stream2.result;
process.stdout.write("\n");

console.log("done");
model.dispose();
```

### Async Generators

```js
import gpt from "../src/gpt4all.js";

const model = await gpt.loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    device: "gpu",
});

process.stdout.write("### Generator:");
const gen = gpt.createCompletionGenerator(
    model,
    "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)"
);
for await (const chunk of gen) {
    process.stdout.write(chunk);
}

process.stdout.write("\n");
model.dispose();
```

## Develop

### Build Instructions

* `binding.gyp` is the compile configuration.
* Tested on Ubuntu; everything seems to work fine.
* Tested on Windows; everything works fine.
* Sparse testing on macOS.
* MinGW also works to build the gpt4all-backend. **HOWEVER**, this package works only with MSVC-built DLLs.

### Requirements

* git
* [node.js >= 18.0.0](https://nodejs.org/en)
* [yarn](https://yarnpkg.com/)
* [node-gyp](https://github.com/nodejs/node-gyp)
  * all of its requirements
* (unix) gcc version 12
* (win) msvc version 143
  * Can be obtained with Visual Studio 2022 build tools
* python 3
* On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
* macOS users do not need Vulkan, as GPT4All will use Metal instead.

### Build (from source)

```sh
git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all-bindings/typescript
```

* The shell commands below assume the current working directory is `typescript`.

* To build and rebuild:

```sh
yarn
```

* The llama.cpp git submodule for gpt4all may be absent. If so, make sure to run the following in the llama.cpp parent directory:

```sh
git submodule update --init --depth 1 --recursive
```

* To build the backend:

```sh
yarn build:backend
```

This builds the platform-dependent dynamic libraries, which will be located in `runtimes/(platform)/native`. The only current way to use them is to put them in the current working directory of your application, that is, **WHEREVER YOU RUN YOUR NODE APPLICATION**.

* llama-xxxx.dll is required.
* Depending on the model you are using, you'll need to select the proper model loader. For example, if you are running a Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary) file.

### Test

```sh
yarn test
```

### Source Overview

#### src/

* Extra functions to aid developer experience
* Typings for the native Node addon
* The JavaScript interface

#### test/

* Simple unit tests for some exported functions
* More advanced AI testing is not handled

#### spec/

* The average look and feel of the API
* Should work assuming a model and libraries are installed locally in the working directory

#### index.cc

* The bridge between Node.js and C. Where the bindings live.

#### prompt.cc

* Handles prompting and inference of models in a thread-safe, asynchronous way.

### Known Issues

* Why your model may be spewing bull 💩:
  * The downloaded model is broken (just reinstall or download from the official site).
* Your model is hanging after a call to generate tokens:
  * Is `nPast` set too high? This may cause the model to hang (observed 03/16/2024 on Linux Mint and Ubuntu 22.04).
* Your GPU usage is still high after node.js exits:
  * Make sure to call `model.dispose()`!!!

### Roadmap

This package has been stabilizing over time, but breaking changes may still happen until the API settles. Here's the todo list:

* \[ ] Purely offline. Per the GUI, which can be run completely offline, the bindings should be as well.
* \[ ] NPM bundle size reduction via an optionalDependencies strategy (need help)
  * Should include prebuilds to avoid painful node-gyp errors
* \[x] createChatSession (the Python equivalent to create\_chat\_session)
* \[x] generateTokens, the new name for createTokenStream. As of 3.2.0, this is released but not 100% tested. Check spec/generator.mjs!
* \[x] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete
* \[x] Prompt models via a threadsafe function in order to have proper non-blocking behavior in Node.js
  * \[x] generateTokens is the new name for this^
* \[x] Proper unit testing (integrate with circle ci)
* \[x] Publish to npm under alpha tag `gpt4all@alpha`
* \[x] Have more people test on other platforms (mac tester needed)
* \[x] Switch to the new pluggable backend

### API Reference

<!-- Generated by documentation.js. Update this documentation by updating the source code. -->

##### Table of Contents

* [type](#type)
* [TokenCallback](#tokencallback)
* [ChatSessionOptions](#chatsessionoptions)
  * [systemPrompt](#systemprompt)
  * [messages](#messages)
* [initialize](#initialize)
  * [Parameters](#parameters)
* [generate](#generate)
  * [Parameters](#parameters-1)
* [InferenceModel](#inferencemodel)
  * [createChatSession](#createchatsession)
    * [Parameters](#parameters-2)
  * [generate](#generate-1)
    * [Parameters](#parameters-3)
  * [dispose](#dispose)
* [EmbeddingModel](#embeddingmodel)
  * [dispose](#dispose-1)
* [InferenceResult](#inferenceresult)
* [LLModel](#llmodel)
  * [constructor](#constructor)
    * [Parameters](#parameters-4)
  * [type](#type-1)
  * [name](#name)
  * [stateSize](#statesize)
  * [threadCount](#threadcount)
  * [setThreadCount](#setthreadcount)
    * [Parameters](#parameters-5)
  * [infer](#infer)
    * [Parameters](#parameters-6)
  * [embed](#embed)
    * [Parameters](#parameters-7)
  * [isModelLoaded](#ismodelloaded)
  * [setLibraryPath](#setlibrarypath)
    * [Parameters](#parameters-8)
  * [getLibraryPath](#getlibrarypath)
  * [initGpuByString](#initgpubystring)
    * [Parameters](#parameters-9)
  * [hasGpuDevice](#hasgpudevice)
  * [listGpu](#listgpu)
    * [Parameters](#parameters-10)
  * [dispose](#dispose-2)
* [GpuDevice](#gpudevice)
  * [type](#type-2)
* [LoadModelOptions](#loadmodeloptions)
  * [modelPath](#modelpath)
  * [librariesPath](#librariespath)
  * [modelConfigFile](#modelconfigfile)
  * [allowDownload](#allowdownload)
  * [verbose](#verbose)
  * [device](#device)
  * [nCtx](#nctx)
  * [ngl](#ngl)
* [loadModel](#loadmodel)
  * [Parameters](#parameters-11)
* [InferenceProvider](#inferenceprovider)
* [createCompletion](#createcompletion)
  * [Parameters](#parameters-12)
* [createCompletionStream](#createcompletionstream)
  * [Parameters](#parameters-13)
* [createCompletionGenerator](#createcompletiongenerator)
  * [Parameters](#parameters-14)
* [createEmbedding](#createembedding)
  * [Parameters](#parameters-15)
* [CompletionOptions](#completionoptions)
  * [verbose](#verbose-1)
  * [onToken](#ontoken)
* [Message](#message)
  * [role](#role)
  * [content](#content)
* [prompt\_tokens](#prompt_tokens)
* [completion\_tokens](#completion_tokens)
* [total\_tokens](#total_tokens)
* [n\_past\_tokens](#n_past_tokens)
* [CompletionReturn](#completionreturn)
  * [model](#model)
  * [usage](#usage)
  * [message](#message-1)
* [CompletionStreamReturn](#completionstreamreturn)
* [LLModelPromptContext](#llmodelpromptcontext)
  * [logitsSize](#logitssize)
  * [tokensSize](#tokenssize)
  * [nPast](#npast)
  * [nPredict](#npredict)
  * [promptTemplate](#prompttemplate)
  * [nCtx](#nctx-1)
  * [topK](#topk)
  * [topP](#topp)
  * [minP](#minp)
  * [temperature](#temperature)
  * [nBatch](#nbatch)
  * [repeatPenalty](#repeatpenalty)
  * [repeatLastN](#repeatlastn)
  * [contextErase](#contexterase)
* [DEFAULT\_DIRECTORY](#default_directory)
* [DEFAULT\_LIBRARIES\_DIRECTORY](#default_libraries_directory)
* [DEFAULT\_MODEL\_CONFIG](#default_model_config)
* [DEFAULT\_PROMPT\_CONTEXT](#default_prompt_context)
* [DEFAULT\_MODEL\_LIST\_URL](#default_model_list_url)
* [downloadModel](#downloadmodel)
  * [Parameters](#parameters-16)
  * [Examples](#examples)
* [DownloadModelOptions](#downloadmodeloptions)
  * [modelPath](#modelpath-1)
  * [verbose](#verbose-2)
  * [url](#url)
  * [md5sum](#md5sum)
* [DownloadController](#downloadcontroller)
  * [cancel](#cancel)
  * [promise](#promise)

#### type

Model architecture. This argument currently does not have any functionality and is only used as a descriptive identifier for the user.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### TokenCallback

Callback for controlling token generation. Return false to stop token generation.

Type: function (tokenId: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), token: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String), total: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)): [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

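
For example, a token callback can stream tokens to stdout and cut generation short. A minimal sketch, passed via the `onToken` field of [CompletionOptions](#completionoptions); the 32-token cutoff and the prompt are arbitrary choices:

```js
import { loadModel, createCompletion } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

// Stop generation after 32 tokens; returning false from the callback halts the model.
let emitted = 0;
await createCompletion(model, "Write a haiku about node streams.", {
    onToken: (tokenId, token, total) => {
        process.stdout.write(token);
        emitted += 1;
        return emitted < 32;
    },
});
process.stdout.write("\n");
model.dispose();
```
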

#### ChatSessionOptions

**Extends Partial\<LLModelPromptContext>**

Options for the chat session.

##### systemPrompt

System prompt to ingest on initialization.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### messages

Messages to ingest on initialization.

Type: [Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[Message](#message)>

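
A minimal sketch of seeding a session with both fields; the system prompt and prior turns are illustrative values:

```js
import { loadModel } from "gpt4all";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf");

// Seed the session with a system prompt and two prior messages.
const chat = await model.createChatSession({
    systemPrompt: "You are a helpful assistant.",
    messages: [
        { role: "user", content: "What is 1 + 1?" },
        { role: "assistant", content: "It's 2." },
    ],
});
```
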

#### initialize

Ingests the system prompt and initial messages.
Sets this chat session as the active chat session of the model.

##### Parameters

* `options` **[ChatSessionOptions](#chatsessionoptions)** The options for the chat session.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<void>**


#### generate

Prompts the model in chat-session context.

##### Parameters

* `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input.
* `options` **[CompletionOptions](#completionoptions)?** Prompt context and other options.
* `callback` **[TokenCallback](#tokencallback)?** Token generation callback.

<!---->

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the chat session is not the active chat session of the model.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[CompletionReturn](#completionreturn)>** The model's response to the prompt.


#### InferenceModel

InferenceModel represents an LLM which can make chat predictions, similar to GPT transformers.

##### createChatSession

Create a chat session with the model.

###### Parameters

* `options` **[ChatSessionOptions](#chatsessionoptions)?** The options for the chat session.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ChatSession>** The chat session.

##### generate

Prompts the model with a given input and optional parameters.

###### Parameters

* `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input.
* `options` **[CompletionOptions](#completionoptions)?** Prompt context and other options.
* `callback` **[TokenCallback](#tokencallback)?** Token generation callback.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[CompletionReturn](#completionreturn)>** The model's response to the prompt.

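
Per the signature above, the model can also be prompted directly without a helper; a brief sketch with arbitrary values:

```js
import { loadModel } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

// nPredict caps the response length; see LLModelPromptContext below.
const res = await model.generate("What is 1 + 1?", { nPredict: 16 });
console.log(res.message);
```
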

##### dispose

Deletes and cleans up the native model.

Returns **void**

#### EmbeddingModel

EmbeddingModel represents an LLM which can create embeddings, which are float arrays.

##### dispose

Deletes and cleans up the native model.

Returns **void**

#### InferenceResult

Shape of LLModel's inference result.


#### LLModel

LLModel class representing a language model.
This is a base class that provides common functionality for different types of language models.

##### constructor

Initialize a new LLModel.

###### Parameters

* `path` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** Absolute path to the model file.

<!---->

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model file does not exist.

##### type

The model type: undefined or user supplied.

Returns **([string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String) | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))**


##### name

The name of the model.

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

##### stateSize

Get the size of the internal state of the model.
NOTE: This state data is specific to the type of model you have created.

Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** the size in bytes of the internal state of the model

##### threadCount

Get the number of threads used for model inference.
The default is the number of physical cores your computer has.

Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The number of threads used for model inference.


##### setThreadCount

Set the number of threads used for model inference.

###### Parameters

* `newNumber` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The new number of threads.

Returns **void**

##### infer

Prompt the model with a given input and optional parameters.
This is the raw output from the model; use the exported completion functions for a processed result.

###### Parameters

* `prompt` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input.
* `promptContext` **Partial<[LLModelPromptContext](#llmodelpromptcontext)>** Optional parameters for the prompt context.
* `callback` **[TokenCallback](#tokencallback)?** Optional callback to control token generation.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[InferenceResult](#inferenceresult)>** The result of the model prompt.

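
A rough sketch of a raw call, going only by the constructor and `infer` signatures documented here; treat it as an assumption-laden illustration, and prefer `loadModel` plus `createCompletion` in applications:

```js
import { LLModel } from "gpt4all";

// Hypothetical absolute model path; the constructor throws if the file does not exist.
const llmodel = new LLModel("/abs/path/to/mistral-7b-openorca.gguf2.Q4_0.gguf");

// Raw, low-level inference; the prompt context here is a minimal assumption.
const result = await llmodel.infer("What is 1 + 1?", { nPredict: 32 });
console.log(result);
```
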

##### embed

Embed text with the model. This is the raw embedding call; use the exported `createEmbedding` function for typical usage.

###### Parameters

* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The text to embed.

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding of the text.

##### isModelLoaded

Whether the model is loaded or not.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)**


##### setLibraryPath

Where to search for the pluggable backend libraries.

###### Parameters

* `s` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

Returns **void**

##### getLibraryPath

Where to get the pluggable backend libraries.

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

##### initGpuByString

Initiate a GPU by a string identifier.

###### Parameters

* `memory_required` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** Must fit in a size\_t, or the call will throw.
* `device_name` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** 'amd' | 'nvidia' | 'intel' | 'gpu' | gpu name. See LoadModelOptions.device for more information.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)**


##### hasGpuDevice

From the C documentation:

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)** True if a GPU device is successfully initialized, false otherwise.

##### listGpu

GPUs that are usable for this LLModel.

###### Parameters

* `nCtx` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** Maximum size of the context window.

<!---->

* Throws **any** if hasGpuDevice returns false (unverified).

Returns **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[GpuDevice](#gpudevice)>**

##### dispose

Deletes and cleans up the native model.

Returns **void**

#### GpuDevice

An object that contains GPU data on this machine.

##### type

Same as VkPhysicalDeviceType.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### LoadModelOptions

Options that configure a model's behavior.

##### modelPath

Where to look for model files.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### librariesPath

Where to look for the backend libraries.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### modelConfigFile

The path to the model configuration file, useful for offline usage or custom model configurations.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### allowDownload

Whether to allow downloading the model if it is not present at the specified path.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### verbose

Enable verbose logging.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### device

The processing unit on which the model will run. It can be set to:

* "cpu": Model will run on the central processing unit.
* "gpu": Model will run on the best available graphics processing unit, irrespective of its vendor.
* "amd", "nvidia", "intel": Model will run on the best available GPU from the specified vendor.
* "gpu name": Model will run on the GPU that matches the name if it's available.

Note: If a GPU device lacks sufficient RAM to accommodate the model, an error will be thrown, and the GPT4All instance will be rendered invalid. It's advised to ensure the device has enough memory before initiating the model.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### nCtx

The maximum context window size of this model.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### ngl

The number of GPU layers needed.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

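
A sketch combining several of these options; the model name, path, and numeric values are illustrative, not recommendations:

```js
import { loadModel } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    modelPath: "./models",   // where model files live (and download to)
    allowDownload: true,     // fetch the model if it is missing
    device: "gpu",           // or "cpu", "amd", "nvidia", "intel", or a GPU name
    nCtx: 2048,              // context window size
    ngl: 32,                 // number of GPU layers
    verbose: true,
});
```
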

#### loadModel

Loads a machine learning model with the specified name. This is the de facto way to create a model.
By default this will download a model from the official GPT4ALL website if a model is not present at the given path.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The name of the model to load.
* `options` **([LoadModelOptions](#loadmodeloptions) | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** (Optional) Additional options for loading the model.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<([InferenceModel](#inferencemodel) | [EmbeddingModel](#embeddingmodel))>** A promise that resolves to an instance of the loaded LLModel.

#### InferenceProvider

Interface for inference, implemented by InferenceModel and ChatSession.


#### createCompletion

The Node.js equivalent to the Python binding's chat\_completion.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionReturn](#completionreturn)** The completion result.

#### createCompletionStream

Streaming variant of createCompletion; returns a stream of tokens and a promise that resolves to the completion result.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionStreamReturn](#completionstreamreturn)** An object holding the token stream and the completion result promise.

#### createCompletionGenerator

Creates an async generator of tokens.

##### Parameters

* `provider` **[InferenceProvider](#inferenceprovider)** The inference model object or chat session.
* `message` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The user input message.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **AsyncGenerator<[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)>** The stream of generated tokens.


#### createEmbedding

The Node.js moral equivalent to the Python binding's Embed4All().embed().

##### Parameters

* `model` **[EmbeddingModel](#embeddingmodel)** The language model object.
* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The text to embed.

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding result.

#### CompletionOptions

**Extends Partial\<LLModelPromptContext>**

The options for creating the completion.

##### verbose

Indicates if verbose logging is enabled.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### onToken

Callback for controlling token generation. Return false to stop processing.

Type: [TokenCallback](#tokencallback)


#### Message

A message in the conversation.

##### role

The role of the message.

Type: (`"system"` | `"assistant"` | `"user"`)

##### content

The message content.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### prompt\_tokens

The number of tokens used in the prompt. Currently not available and always 0.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### completion\_tokens

The number of tokens used in the completion.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### total\_tokens

The total number of tokens used. Currently not available and always 0.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### n\_past\_tokens

Number of tokens used in the conversation.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### CompletionReturn

The result of a completion.

##### model

The model used for the completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### usage

Token usage report.

Type: {prompt\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), completion\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), total\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), n\_past\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)}

##### message

The generated completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

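
Reading the fields of a completion result, as a quick illustration:

```js
import { loadModel, createCompletion } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");
const completion = await createCompletion(model, "What is 1 + 1?");

console.log(completion.model);               // name of the model used
console.log(completion.usage.n_past_tokens); // tokens in the conversation so far
console.log(completion.message);             // the generated text
```
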

#### CompletionStreamReturn

The result of a streamed completion, containing a stream of tokens and a promise that resolves to the completion result.

#### LLModelPromptContext

Model inference arguments for generating completions.

##### logitsSize

The size of the raw logits vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### tokensSize

The size of the raw tokens vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPast

The number of tokens in the past conversation.
This controls how far back the model looks when generating completions.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPredict

The maximum number of tokens to predict.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### promptTemplate

Template for user / assistant message pairs.
%1 is required and will be replaced by the user input.
%2 is optional and will be replaced by the assistant response.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

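
A sketch of a ChatML-style template; the exact tags depend on the model you use, so treat this format as an assumption:

```js
import { loadModel, createCompletion } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

const completion = await createCompletion(model, "What is 1 + 1?", {
    // %1 is replaced by the user input, %2 by the assistant response.
    promptTemplate: "<|im_start|>user\n%1<|im_end|>\n<|im_start|>assistant\n%2<|im_end|>\n",
});
console.log(completion.message);
```
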

##### nCtx

The context window size. **Deprecated:** this has no effect; use loadModel's nCtx option instead.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topK

The top-k logits to sample from.
Top-K sampling selects the next token only from the top K most likely tokens predicted by the model. It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit the diversity of the output. A higher value for top-K (e.g., 100) will consider more tokens and lead to more diverse text, while a lower value (e.g., 10) will focus on the most probable tokens and generate more conservative text. 30 - 60 is a good range for most tasks.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topP

The nucleus sampling probability threshold.
Top-P limits the selection of the next token to a subset of tokens with a cumulative probability above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity and quality by considering both token probabilities and the number of tokens available for sampling. When using a higher value for top-P (e.g., 0.95), the generated text becomes more diverse. On the other hand, a lower value (e.g., 0.1) produces more focused and conservative text.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### minP

The minimum probability of a token to be considered.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### temperature

The temperature to adjust the model's output distribution.
Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures (e.g., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (e.g., 0.5) make the output more focused, predictable, and conservative. When the temperature is set to 0, the output becomes completely deterministic, always selecting the most probable next token and producing identical results each time. A safe range would be around 0.6 - 0.85, but you are free to search for what value fits best for you.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nBatch

The number of predictions to generate in parallel.
By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However, this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing. To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatPenalty

The penalty factor for repeated tokens.
Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt. A token that has already appeared five times is penalized more heavily than a token that has appeared only one time. A value of 1 means that there is no penalty, and values larger than 1 discourage repeated tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatLastN

The number of last tokens to penalize.
The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition. A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only consider recent tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### contextErase

The percentage of context to erase if the context window is exceeded.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

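
Since CompletionOptions extends Partial\<LLModelPromptContext>, these knobs can be passed straight into a completion call. A sketch with arbitrary but sensible values; tune per task:

```js
import { loadModel, createCompletion } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

const completion = await createCompletion(model, "Tell me a short story.", {
    temperature: 0.7,
    topK: 40,
    topP: 0.9,
    minP: 0.05,
    nPredict: 256,
    repeatPenalty: 1.18,
});
console.log(completion.message);
```
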

#### DEFAULT\_DIRECTORY

From the Python API:
models will be stored in `(homedir)/.cache/gpt4all/`

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_LIBRARIES\_DIRECTORY

From the Python API:
The default path for dynamic libraries to be stored.
You may separate paths by a semicolon to search in multiple areas.
This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

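
Both constants are exported, so you can inspect where models and libraries are resolved from:

```js
import { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from "gpt4all";

console.log(DEFAULT_DIRECTORY);           // e.g. (homedir)/.cache/gpt4all
console.log(DEFAULT_LIBRARIES_DIRECTORY); // semicolon-separated search paths
```
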

#### DEFAULT\_MODEL\_CONFIG

Default model configuration.

Type: ModelConfig

#### DEFAULT\_PROMPT\_CONTEXT

Default prompt context.

Type: [LLModelPromptContext](#llmodelpromptcontext)

#### DEFAULT\_MODEL\_LIST\_URL

Default model list url.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### downloadModel

Initiates the download of a model file.
By default this downloads without waiting; use the returned controller to alter this behavior.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The model to be downloaded.
* `options` **[DownloadModelOptions](#downloadmodeloptions)** to pass into the downloader. Default is { location: (cwd), verbose: false }.

##### Examples

```javascript
const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
download.promise.then(() => console.log('Downloaded!'))
```

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model already exists in the specified location.
* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model cannot be found at the specified url.

Returns **[DownloadController](#downloadcontroller)** An object that allows controlling the download process.


#### DownloadModelOptions

Options for the model download process.

##### modelPath

Location to download the model to. Default is `process.cwd()`, the current working directory.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### verbose

Debug mode; logs how long the download took in seconds.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### url

Remote download url. Defaults to `https://gpt4all.io/models/gguf/<modelName>`

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### md5sum

MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum.
If the sums do not match, an error will be thrown and the file will be deleted.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DownloadController

Model download controller.

##### cancel

Cancels the download request when called.

Type: function (): void

##### promise

A promise resolving to the downloaded model's config once the download is done.

Type: [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ModelConfig>

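
A sketch of awaiting and cancelling a download; the model name, path, and one-minute timeout are illustrative only:

```js
import { downloadModel } from "gpt4all";

const controller = downloadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    modelPath: "./models",
    verbose: true,
});

// Cancel if the download takes longer than a minute (arbitrary cutoff).
const timeout = setTimeout(() => controller.cancel(), 60_000);

try {
    const config = await controller.promise;
    console.log("Downloaded:", config);
} finally {
    clearTimeout(timeout);
}
```
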