Monitoring

Use OpenLIT to monitor your LLM application and GPUs in real time with OpenTelemetry. OpenLIT collects data on user interactions and application performance, along with GPU performance metrics, which can help you improve the functionality and reliability of your GPT4All-based LLM application.

How it works

OpenLIT adds automatic OpenTelemetry (OTel) instrumentation to the GPT4All SDK. It covers the generate and embedding functions and tracks LLM usage by capturing their inputs and outputs. This lets you monitor and evaluate the performance and behavior of your LLM application across environments. OpenLIT also provides OTel auto-instrumentation for GPU metrics such as utilization, temperature, power usage, and memory usage.
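
To make the idea of instrumentation concrete, here is a minimal sketch of wrapping a function so that each call records its inputs, output, and duration. It uses only the standard library and is not OpenLIT's actual implementation: `traced`, `RECORDED_SPANS`, and `fake_generate` are hypothetical names invented for this illustration, and the in-memory list stands in for a real OpenTelemetry exporter.

```python
import functools
import time

# Hypothetical in-memory store standing in for an OpenTelemetry exporter.
RECORDED_SPANS = []


def traced(operation_name):
    """Wrap a function so each call records its inputs, output, and duration."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            RECORDED_SPANS.append({
                "operation": operation_name,
                "args": args,
                "kwargs": kwargs,
                "output": result,
                "duration_s": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator


# Stand-in for a model's generate call (not the real GPT4All API).
@traced("generate")
def fake_generate(prompt, temp=0):
    return f"echo: {prompt}"


fake_generate("hello", temp=0)
print(RECORDED_SPANS[0]["operation"], RECORDED_SPANS[0]["output"])
```

With OpenLIT, this wrapping is applied for you when you call `openlit.init()`, so your application code does not need decorators like the one above.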

You can view and analyze the generated traces and metrics in the OpenLIT UI, or export them to widely used observability tools such as Grafana and DataDog for further analysis and visualization.

Getting Started

Here’s a straightforward guide to help you set up and start monitoring your application:

1. Install the OpenLIT SDK

Open your terminal and run:

pip install openlit

2. Set Up Monitoring for Your Application

In your application, initialize OpenLIT as shown below:

from gpt4all import GPT4All
import openlit

openlit.init()  # Initialize OpenLIT monitoring

model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')

# Start a chat session and send queries
with model.chat_session():
    response1 = model.generate(prompt='hello', temp=0)
    response2 = model.generate(prompt='write me a short poem', temp=0)
    response3 = model.generate(prompt='thank you', temp=0)

    print(model.current_chat_session)

This setup instruments your GPT4All model interactions, capturing data about each request and response.

3. (Optional) Enable GPU Monitoring

If your application runs on NVIDIA GPUs, you can enable GPU stats collection by passing collect_gpu_stats=True to openlit.init(). This collects GPU metrics such as utilization, temperature, power usage, and memory usage. The metrics are recorded as OpenTelemetry gauges.

from gpt4all import GPT4All
import openlit

openlit.init(collect_gpu_stats=True)  # Initialize OpenLIT monitoring with GPU stats collection

model = GPT4All(model_name='orca-mini-3b-gguf2-q4_0.gguf')

# Start a chat session and send queries
with model.chat_session():
    response1 = model.generate(prompt='hello', temp=0)
    response2 = model.generate(prompt='write me a short poem', temp=0)
    response3 = model.generate(prompt='thank you', temp=0)

    print(model.current_chat_session)
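
If the same code runs on machines with and without NVIDIA GPUs, you may want to enable GPU stats only where they apply. The sketch below is one possible approach and not part of OpenLIT: it assumes that finding the `nvidia-smi` tool on the PATH is a good enough sign that an NVIDIA driver is installed, and the helper names `nvidia_gpu_available` and `build_init_kwargs` are hypothetical.

```python
import shutil


def nvidia_gpu_available():
    """Heuristic (an assumption): nvidia-smi on PATH implies an NVIDIA driver."""
    return shutil.which("nvidia-smi") is not None


def build_init_kwargs():
    """Build keyword arguments for openlit.init() based on the heuristic."""
    kwargs = {}
    if nvidia_gpu_available():
        kwargs["collect_gpu_stats"] = True
    return kwargs


init_kwargs = build_init_kwargs()
print(init_kwargs)
# In the application you would then call: openlit.init(**init_kwargs)
```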

Visualize

Once you've set up data collection with OpenLIT, you can visualize and analyze this information to better understand your application's performance:

Using OpenLIT UI: Connect to OpenLIT's UI to start exploring performance metrics. Visit the OpenLIT Quickstart Guide for step-by-step details.
Integrate with existing Observability Tools: If you use tools like Grafana or DataDog, you can integrate the data collected by OpenLIT. For instructions on setting up these connections, check the OpenLIT Connections Guide.
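
As a rough illustration of pointing telemetry at an existing backend, the sketch below sets the standard OpenTelemetry environment variables `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` before the application starts. It assumes that your OpenLIT setup honors these variables; the helper `configure_otlp`, the endpoint URL, and the header name and value are placeholders, so check the OpenLIT Connections Guide for the settings your backend needs.

```python
import os


def configure_otlp(endpoint, headers=None):
    """Set the standard OTLP exporter environment variables."""
    os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = endpoint
    if headers:
        # The OTLP headers variable is a comma-separated list of key=value pairs.
        os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = ",".join(
            f"{key}={value}" for key, value in headers.items()
        )


# Placeholder endpoint and header; replace with your collector's values.
configure_otlp("http://127.0.0.1:4318", {"api-key": "placeholder"})
print(os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"])
# Call openlit.init() after this point so the exporter can pick up the settings.
```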
