Configuring Generative AI
Configuration
A Generative AI provider can be configured in the global config, which will make the Generative AI features available for use. There are currently 4 native providers available to integrate with Frigate. Other providers that support the OpenAI standard API can also be used. See the OpenAI-Compatible section below.
To use Generative AI, you must define a single provider at the global level of your Frigate configuration. If the provider you choose requires an API key, you may either directly paste it in your configuration, or store it in an environment variable prefixed with FRIGATE_.
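As a minimal sketch of a global provider configuration, here is an example assuming the Gemini provider and an environment variable named `FRIGATE_GEMINI_API_KEY` (the provider, model, and variable name are illustrative; substitute your own):

```yaml
genai:
  enabled: true
  provider: gemini
  # The {FRIGATE_...} placeholder is replaced with the matching
  # environment variable, keeping the key out of the config file.
  api_key: "{FRIGATE_GEMINI_API_KEY}"
  model: gemini-1.5-flash
```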
Local Providers
Local providers run on your own hardware and keep all data processing private. These require a GPU or dedicated hardware for best performance.
Running Generative AI models on CPU is not recommended, as high inference times make using Generative AI impractical.
Recommended Local Models
You must use a vision-capable model with Frigate. The following models are recommended for local deployment:
| Model | Notes |
|---|---|
| qwen3-vl | Strong visual and situational understanding, with a strong ability to identify smaller objects and interactions with objects. |
| qwen3.5 | Strong situational understanding, but lacks the DeepStack approach used in qwen3-vl, leading to worse performance at identifying objects in people's hands and other small details. |
| Intern3.5VL | Relatively fast with good vision comprehension. |
| gemma3 | Slower model with good vision and temporal understanding. |
| qwen2.5-vl | Fast but capable model with good vision comprehension. |
Each model is available in multiple parameter sizes (3b, 4b, 8b, etc.). Larger sizes handle complex tasks and situational understanding better, but require more memory and computational resources. It is recommended to try multiple models and experiment to see which performs best.
You should have at least 8 GB of RAM available (or VRAM if running on GPU) to run the 7B models, 16 GB to run the 13B models, and 24 GB to run the 33B models.
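As a concrete local example, a sketch of an Ollama-backed configuration using one of the models above (the `base_url` and model tag are assumptions; point them at your own Ollama host and the tag you actually pulled):

```yaml
genai:
  enabled: true
  provider: ollama
  # Default Ollama port; change if your instance listens elsewhere.
  base_url: http://localhost:11434
  # A 7B vision model fits the ~8 GB RAM/VRAM guideline above.
  model: qwen2.5vl:7b
```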
Model Types: Instruct vs Thinking
Most vision-language models are available as instruct models, which are fine-tuned to follow instructions and respond concisely to prompts. However, some models (such as certain Qwen-VL or minigpt variants) offer both instruct and thinking versions.
- Instruct models are always recommended for use with Frigate. These models generate direct, relevant, actionable descriptions that best fit Frigate's object and event summary use case.
- Reasoning / Thinking models are fine-tuned for more free-form, open-ended, and speculative outputs, which are typically not concise and may not provide the practical summaries Frigate expects. For this reason, Frigate does not recommend or support using thinking models.
Some models are labeled as hybrid (capable of both thinking and instruct tasks). In these cases, it is recommended to disable reasoning / thinking; how to do so is generally model-specific (see your model's documentation).
Recommendation:
Always select the -instruct variant (or whichever variant is documented as instruction-tuned) of any model you use in your Frigate configuration. If in doubt, refer to your model provider's documentation or model library for guidance on the correct variant to use.
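To illustrate, a hedged sketch of selecting an explicitly instruct-tagged model through an OpenAI-compatible local server (the `base_url` and model ID assume a self-hosted endpoint such as vLLM serving `Qwen/Qwen2.5-VL-7B-Instruct`; adjust both to your setup):

```yaml
genai:
  enabled: true
  provider: openai
  # OpenAI-compatible endpoint exposed by a local inference server.
  base_url: http://localhost:8000/v1
  # The -Instruct suffix selects the instruction-tuned variant.
  model: Qwen/Qwen2.5-VL-7B-Instruct
```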