Models Management
Sanctum lets you run open-source large language models (LLMs) on your computer using either the CPU or GPU.
Sanctum makes it easy to discover and download models via its built-in manager, integrated with Hugging Face. Visit Models > Featured for top picks, or Models > Explore to browse all GGUF models.
You'll see different versions of models, each with varying resource requirements. Sanctum highlights compatible models with a green checkmark and shows details like memory needs, disk space, and popularity to help you choose.
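Wondering where those memory numbers come from? A rough rule of thumb is parameter count × bits per weight ÷ 8, plus some overhead for context buffers. The sketch below is purely illustrative: the `estimate_gguf_ram` helper and the per-format bit-widths are assumptions for demonstration, not values Sanctum uses internally.

```python
# Back-of-the-envelope RAM estimate for a quantized GGUF model.
# The bit-widths are rough approximations and overhead varies by model.

QUANT_BITS = {       # approximate effective bits per weight (assumed values)
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0":   8.5,
    "F16":   16.0,
}

def estimate_gguf_ram(params_billions: float, quant: str, overhead_gb: float = 1.0) -> float:
    """Approximate RAM needed, in GB: weights plus a flat overhead allowance."""
    weights_gb = params_billions * QUANT_BITS[quant] / 8
    return weights_gb + overhead_gb  # overhead covers the context cache and buffers

# A 7B model at Q4_K_M: 7 * 4.8 / 8 + 1 ≈ 5.2 GB
print(f"{estimate_gguf_ram(7, 'Q4_K_M'):.1f} GB")
```

If the estimate lands close to your machine's total RAM, consider a smaller model or a stronger quantization; the green checkmark already performs this compatibility check for you.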
Need to change where models are stored? Go to Settings > Storage and select "Change Folder" to update the directory. This makes it easy to organize your models or move them to a different drive.
To manage your downloaded models, head to Models > My Models. You can remove individual models by clicking the trash bin icon. Alternatively, go to Settings > Storage to remove all models at once if you need a clean slate.
GGUF (GPT-Generated Unified Format) is a file format optimized for running large language models on standard CPUs, making AI accessible without specialized hardware. Key features:
CPU Optimization: Runs models smoothly on desktop CPUs, with optional GPU support.
Reduced Resource Usage: Quantization shrinks model weights, lowering memory and disk requirements.
Portability: Minimal dependencies allow use across systems.
This makes transformer models available locally, without relying on the cloud.
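Sanctum loads models for you, but if you're curious what running a GGUF file looks like in code, here is a minimal sketch using the open-source llama-cpp-python library. This is a separate tool, not Sanctum's internals, and the model path is a placeholder.

```python
# Minimal GGUF inference sketch (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b.Q4_K_M.gguf",  # placeholder: any downloaded .gguf file
    n_ctx=2048,  # context window in tokens
)

# Run a single completion entirely on the CPU.
output = llm("Q: What is GGUF? A:", max_tokens=64, stop=["Q:"])
print(output["choices"][0]["text"])
```

The same file runs unchanged on Windows, macOS, or Linux, which is the portability point above.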
In regular mode, you can configure the following settings:
Model Preset: Select from predefined configurations optimized for different models.
Enable GPU: Boost performance for computationally intensive tasks.
GPU Layers: Control how many of the model's layers are processed on the GPU, letting you balance performance against resource usage (see the sketch after this list).
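To make the layer-offloading idea concrete, here is a hedged sketch using llama-cpp-python's n_gpu_layers parameter. Sanctum's GPU Layers setting exposes the same trade-off, though the numbers below are examples, not recommendations.

```python
# Offloading part of a model to the GPU with llama-cpp-python.
# Requires a build with GPU support (e.g., CUDA or Metal).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=16,  # e.g., 16 of a 7B model's 32 layers run on the GPU
)
# n_gpu_layers=-1 offloads every layer (fastest, most VRAM);
# n_gpu_layers=0 keeps everything on the CPU (slowest, no VRAM needed).
```

More offloaded layers generally means faster generation but higher VRAM use, which is exactly the balance this setting controls.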
For more advanced customization, turn on Dev Mode and check the Advanced Model Settings for further instructions.
Context Length: Set the maximum number of tokens (words, characters, or parts of words) the model can consider from the conversation history when generating a response. Keep in mind that longer context lengths may slow down responses.
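To put context length in perspective, a common approximate rule of thumb for English text is about 1.3 tokens per word; the ratio varies by language and content, so treat the calculation below as illustrative only.

```python
# Rough feel for how much conversation fits in a context window.
# The 1.3 tokens-per-word ratio is an assumed rule of thumb for English text.
context_tokens = 4096
approx_words = context_tokens / 1.3
print(f"A {context_tokens}-token context holds roughly {approx_words:.0f} words "
      "of history plus the new response.")
```

Raising the context length also grows the model's attention cache, which is why longer settings use more memory and respond more slowly.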