Models
Every tracked request is classified into a model type that determines which metrics are extracted, which pricing formula is applied, and how the request is rendered in the dashboard. The provider's handler automatically detects the model type from the endpoint and response shape, so no additional configuration is needed.
Supported Providers
- OpenAI
- Anthropic
- Google AI
- Groq
- xAI
- OpenRouter
- Ollama
- Replicate
- Cohere
- Mistral
To support a provider not listed here, see Custom Providers.
Model Types
Text Completions
Text generation is the most widely supported model type, covering chat completions, message generation, and content generation across all major providers.
| Metric | Column | Description |
|---|---|---|
| Prompt tokens | prompt_tokens | Number of input tokens sent to the model |
| Completion tokens | completion_tokens | Number of output tokens generated by the model |
| Cached tokens | Deducted from prompt cost | Tokens served from the provider's prompt cache at a discounted rate |
| Reasoning tokens | reasoning_tokens | Tokens used for internal reasoning (o-series, thinking models) |
| Finish reason | finish_reason | Why generation stopped — stop, length, tool_calls, end_turn, etc. |
Cached Token Pricing
Providers such as OpenAI, Anthropic, Google, and xAI offer prompt caching, where previously seen input tokens are served at a reduced rate. Spectra detects cached tokens automatically and applies the discounted price:
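A minimal sketch of this computation (the prices are illustrative per-million-token rates, not real provider pricing):

```python
# Sketch of cached-token pricing: cached tokens are billed at the
# discounted rate, the remainder at the regular input rate.
def prompt_cost(prompt_tokens: int, cached_tokens: int,
                input_price: float, cached_price: float) -> float:
    """Prices are expressed per million tokens."""
    regular_prompt_tokens = prompt_tokens - cached_tokens
    return (regular_prompt_tokens * input_price
            + cached_tokens * cached_price) / 1_000_000

# e.g. 10,000 prompt tokens, 6,000 of them cached,
# at $2.50/M input and $1.25/M cached:
cost = prompt_cost(10_000, 6_000, 2.50, 1.25)  # → 0.0175
```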
regular_prompt_tokens = prompt_tokens - cached_tokens
prompt_cost = (regular_prompt_tokens × input_price + cached_tokens × cached_price) / 1,000,000
Embeddings
Embeddings convert text into dense vector representations used for semantic search, clustering, and similarity comparisons. Unlike text completions, embedding models produce no output tokens — the cost is based entirely on input token count.
| Metric | Column | Description |
|---|---|---|
| Prompt tokens | prompt_tokens | Number of input tokens in the text being embedded |
| Completion tokens | completion_tokens | Always 0 for embeddings |
When embedding multiple texts in a single request (batch embeddings), Spectra tracks the total token count across all inputs.
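In code, batch embedding cost reduces to a sum over input token counts (a sketch; the price is an illustrative per-million-token rate):

```python
# Sketch of embedding pricing: no output tokens, so cost is input-only.
def embedding_cost(token_counts_per_input: list[int],
                   input_price_per_million: float) -> float:
    """For batch requests, the total token count across all inputs is billed."""
    prompt_tokens = sum(token_counts_per_input)
    return prompt_tokens * input_price_per_million / 1_000_000

# Three texts embedded in one batch request:
cost = embedding_cost([120, 340, 90], 0.02)
```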
Image Generation
Image generation APIs typically return temporary URLs that expire after a set period. When media persistence is enabled, Spectra downloads and stores generated images before the URLs expire.
| Metric | Column | Description |
|---|---|---|
| Image count | image_count | Number of images generated in the request |
Image models are priced per image. The cost depends on the model, image dimensions, and quality settings.
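Per-image pricing can be sketched as a lookup keyed on the generation settings (the table below is hypothetical; real per-image prices vary by provider and model):

```python
# Hypothetical per-image price table keyed by (size, quality).
IMAGE_PRICES = {
    ("1024x1024", "standard"): 0.040,
    ("1024x1024", "hd"): 0.080,
    ("1792x1024", "hd"): 0.120,
}

def image_cost(image_count: int, size: str, quality: str) -> float:
    """Per-image pricing: the unit price depends on dimensions and quality."""
    return image_count * IMAGE_PRICES[(size, quality)]

cost = image_cost(2, "1024x1024", "hd")
```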
Video Generation
Video generation is typically asynchronous — you submit a creation request and then poll for the result. Spectra's video handlers implement the SkipsResponse interface, persisting only the final completed response with full metrics.
| Metric | Column | Description |
|---|---|---|
| Video count | video_count | Number of videos generated |
| Duration | duration_seconds | Total duration of the generated video in seconds |
Video models are priced per video generated. Media persistence works the same way as for images.
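The asynchronous flow described above can be sketched as a poll loop (hypothetical helper and response shape; only the final completed response would carry the metrics):

```python
import time

def poll_until_complete(fetch_status, interval_seconds: float = 2.0,
                        timeout_seconds: float = 600.0) -> dict:
    """Poll a video generation job until it completes.

    fetch_status() is assumed to return a dict such as
    {"status": "processing"} while the job runs, or
    {"status": "completed", "video_count": 1, "duration_seconds": 8.0}
    when finished. Intermediate polls are skipped; only the final
    completed response is persisted with full metrics.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        result = fetch_status()
        if result.get("status") == "completed":
            return result
        time.sleep(interval_seconds)
    raise TimeoutError("video generation did not complete in time")
```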
Text-to-Speech
TTS endpoints return binary audio data (MP3, Opus, etc.) rather than JSON. The response payload is not stored in the database — only the request parameters and extracted metrics are recorded.
| Metric | Column | Description |
|---|---|---|
| Input characters | input_characters | Number of characters in the input text |
TTS models are priced per character or per million characters, depending on the provider.
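Character-based pricing is a single multiplication (a sketch; the rate is an illustrative per-million-character price):

```python
# Sketch of TTS pricing: cost scales with input text length,
# since the binary audio response itself is not stored or billed by token.
def tts_cost(input_characters: int, price_per_million_characters: float) -> float:
    return input_characters * price_per_million_characters / 1_000_000

# A 5,000-character input at $15.00 per million characters:
cost = tts_cost(5_000, 15.0)
```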
Speech-to-Text
STT requests use multipart form data rather than JSON — the audio file is sent as a file attachment. Spectra handles this automatically.
| Metric | Column | Description |
|---|---|---|
| Duration | duration_seconds | Length of the input audio in seconds |
| Prompt tokens | prompt_tokens | Token count from the transcription (when available) |
| Completion tokens | completion_tokens | Token count from the output (when available) |
STT models are priced per minute of input audio.
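Since the tracked metric is `duration_seconds` but pricing is per minute, the cost computation needs a unit conversion (a sketch with an illustrative per-minute rate):

```python
# Sketch of STT pricing: convert the tracked duration_seconds metric
# to minutes before applying the per-minute rate.
def stt_cost(duration_seconds: float, price_per_minute: float) -> float:
    return (duration_seconds / 60) * price_per_minute

# A 90-second recording at $0.006 per minute:
cost = stt_cost(90, 0.006)
```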
Not Yet Supported
| Category | Example Models | Reason |
|---|---|---|
| Realtime (WebSocket) | gpt-4o-realtime-preview, gpt-4o-mini-realtime-preview | Realtime models use persistent WebSocket connections rather than HTTP request/response cycles |