# deepseek/deepseek-v4-flash

> Designed for responsiveness and cost efficiency, DeepSeek V4 Flash is a 284B-parameter (13B activated) Mixture-of-Experts model from DeepSeek. It uses hybrid attention to streamline long-context processing while maintaining strong reasoning and coding quality under high-throughput workloads. Built-in support for `high` and `xhigh` (maximum) reasoning efforts provides scalable reasoning depth, making the model well suited to demanding integrations such as agent workflows, coding assistants, and real-time chat systems.

## Overview

- **Endpoint**: `https://api.shortapi.ai/v1/chat/completions`
- **Model ID**: `deepseek/deepseek-v4-flash`
- **Category**: llm
- **Kind**: text-generation

## Pricing

Input: $0.0028/M tokens (cache hit) · $0.14/M tokens (cache miss) · Output: $0.28/M tokens

For more details, please check our pricing page.

## API Information

This model can be used via our HTTP API or, more conveniently, via our client libraries. See the input and output schemas below, as well as the usage examples.
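To budget requests against the rates above, a minimal cost estimator can be derived directly from the pricing table and the `usage` block of a response. This is a sketch with the rates hard-coded from this page, not an official SDK helper:

```python
# Sketch: estimate the dollar cost of one request from its `usage` block,
# using the per-million-token rates quoted in the Pricing section above.

RATE_INPUT_CACHE_HIT = 0.0028 / 1_000_000   # $/token, cached prompt tokens
RATE_INPUT_CACHE_MISS = 0.14 / 1_000_000    # $/token, uncached prompt tokens
RATE_OUTPUT = 0.28 / 1_000_000              # $/token, completion tokens

def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in USD.

    `cached_tokens` corresponds to `usage.prompt_tokens_details.cached_tokens`
    in the response; the remaining prompt tokens are billed as cache misses.
    """
    missed = prompt_tokens - cached_tokens
    return (cached_tokens * RATE_INPUT_CACHE_HIT
            + missed * RATE_INPUT_CACHE_MISS
            + completion_tokens * RATE_OUTPUT)

# Example: 10,000 prompt tokens (8,000 of them cached) and 2,000 output tokens.
print(f"${estimate_cost(10_000, 8_000, 2_000):.6f}")
```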
### Input Schema

The API accepts the following input parameters (standard OpenAI format):

- **`model`** (`string`, _required_): Model ID
- **`messages`** (`array`, _required_): List of conversation messages
  - **`role`** (`string`, _required_): Message role
    - Options: `"system"`, `"user"`, `"assistant"`, `"tool"`, `"developer"`
  - **`content`** (`string | array`, _required_): Message content
    - **`type`** (`string`, _optional_):
      - Options: `"text"`, `"image_url"`, `"input_audio"`, `"file"`, `"video_url"`
    - **`text`** (`string`, _optional_):
    - **`image_url`** (`object`, _optional_):
      - **`url`** (`string`, _required_): Image URL or base64
      - **`detail`** (`string`, _optional_):
        - Options: `"auto"`, `"low"`, `"high"`
    - **`input_audio`** (`object`, _optional_):
      - **`data`** (`string`, _optional_): Base64-encoded audio data
      - **`format`** (`string`, _optional_):
        - Options: `"wav"`, `"mp3"`
    - **`file`** (`object`, _optional_):
      - **`file_id`** (`string`, _required_): File ID
      - **`filename`** (`string`, _optional_):
      - **`file_data`** (`string`, _optional_):
    - **`video_url`** (`object`, _optional_):
      - **`url`** (`string`, _optional_):
  - **`name`** (`string`, _optional_): Sender's name
  - **`tool_calls`** (`array`, _optional_):
    - **`id`** (`string`, _required_):
    - **`type`** (`string`, _required_):
    - **`function`** (`object`, _optional_):
      - **`name`** (`string`, _optional_):
      - **`arguments`** (`string`, _optional_):
  - **`tool_call_id`** (`string`, _optional_): Tool invocation ID (used for messages with the `tool` role)
  - **`reasoning_content`** (`string`, _optional_): Reasoning content
- **`temperature`** (`number`, _optional_): Sampling temperature
  - Default: `1`
  - Range: `0` to `2`
- **`top_p`** (`number`, _optional_): Nucleus sampling parameter
  - Default: `1`
  - Range: `0` to `1`
- **`stream`** (`boolean`, _optional_): Whether to stream the response
  - Default: `false`
- **`stream_options`** (`object`, _optional_):
  - **`include_usage`** (`boolean`, _optional_):
- **`stop`** (`string | array`, _optional_): Stop sequence(s)
- **`max_tokens`** (`integer`, _optional_): Maximum number of generated tokens
- **`max_completion_tokens`** (`integer`, _optional_): Maximum number of completion tokens
- **`presence_penalty`** (`number`, _optional_):
  - Default: `0`
  - Range: `-2` to `2`
- **`frequency_penalty`** (`number`, _optional_):
  - Default: `0`
  - Range: `-2` to `2`
- **`logit_bias`** (`object`, _optional_):
- **`user`** (`string`, _optional_):
- **`tools`** (`array`, _optional_):
  - **`type`** (`string`, _required_):
  - **`function`** (`object`, _required_): The function definition
    - **`name`** (`string`, _required_):
    - **`description`** (`string`, _optional_):
    - **`parameters`** (`object`, _optional_): Parameter definitions in JSON Schema format
- **`tool_choice`** (`string | object`, _optional_):
  - **`type`** (`string`, _optional_):
  - **`function`** (`object`, _optional_): The function definition
    - **`name`** (`string`, _optional_):
- **`response_format`** (`object`, _optional_):
  - **`type`** (`string`, _optional_):
    - Options: `"text"`, `"json_object"`, `"json_schema"`
  - **`schema`** (`object`, _optional_): JSON Schema definition
- **`seed`** (`integer`, _optional_):
- **`reasoning_effort`** (`string`, _optional_): Reasoning effort (for models that support reasoning)
  - Options: `"low"`, `"medium"`, `"high"`
- **`modalities`** (`array`, _optional_):
- **`audio`** (`object`, _optional_):
  - **`format`** (`string`, _optional_):
  - **`voice`** (`string`, _optional_):

### Output Schema

The API returns a standard OpenAI-style JSON response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "deepseek/deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "text_tokens": 10,
      "audio_tokens": 0,
      "image_tokens": 0
    },
    "completion_tokens_details": {
      "text_tokens": 25,
      "audio_tokens": 0,
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": "fp_44709d6fcb"
}
```

## Use Example

### Bash (cURL)

```bash
curl --request POST \
  --url https://api.shortapi.ai/v1/chat/completions \
  --header "Authorization: Bearer $SHORTAPI_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "deepseek/deepseek-v4-flash",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello, how are you?" }
    ],
    "temperature": 0.7,
    "max_tokens": 384000
  }'
```

### JavaScript (Fetch API)

```javascript
const response = await fetch("https://api.shortapi.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${SHORTAPI_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "deepseek/deepseek-v4-flash",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello, how are you?" }
    ],
    temperature: 0.7,
    max_tokens: 384000
  })
});
const data = await response.json();
```

### Python (Requests)

```python
import requests

url = "https://api.shortapi.ai/v1/chat/completions"

payload = {
    "model": "deepseek/deepseek-v4-flash",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
    "temperature": 0.7,
    "max_tokens": 384000,
}
headers = {
    "Authorization": f"Bearer {SHORTAPI_KEY}",
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)
data = response.json()
```
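### Python (Streaming)

Setting `stream: true` switches the endpoint to a streamed response. Assuming the standard OpenAI server-sent-events framing (`data: <json>` lines, terminated by `data: [DONE]`), a sketch of consuming the stream with `requests` might look like:

```python
import json
import requests

def parse_sse_line(line: str):
    """Extract the delta text from one SSE line, or None if there is none.

    Assumes the standard OpenAI streaming framing: each event arrives as a
    `data: <json>` line, and the stream ends with `data: [DONE]`.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    delta = json.loads(payload)["choices"][0].get("delta", {})
    return delta.get("content")

def stream_chat(api_key: str, user_message: str):
    """Yield content chunks as the model produces them."""
    response = requests.post(
        "https://api.shortapi.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "deepseek/deepseek-v4-flash",
            "messages": [{"role": "user", "content": user_message}],
            "stream": True,
        },
        stream=True,  # tell requests not to buffer the whole body
    )
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        chunk = parse_sse_line(line or "")
        if chunk is not None:
            yield chunk

# Usage (requires a valid key):
# for chunk in stream_chat(SHORTAPI_KEY, "Hello!"):
#     print(chunk, end="", flush=True)
```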
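### Python (Tool Calling)

The `tools` and `tool_calls` parameters follow the standard OpenAI function-calling loop: the model returns `tool_calls`, you execute each one locally, then send the result back as a `tool`-role message that echoes the call's `id` as `tool_call_id`. A sketch of the local dispatch half of that loop (the `get_weather` tool is hypothetical):

```python
import json

# Hypothetical local tool; any function you expose to the model works the same way.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Tool schema passed in the `tools` request parameter
# (argument types described in JSON Schema format).
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_tool_call(tool_call: dict) -> dict:
    """Execute one `tool_calls` entry and build the `tool`-role reply message.

    `function.arguments` arrives as a JSON *string*, so it must be decoded
    before dispatch; the reply echoes the call's `id` as `tool_call_id`.
    """
    fn = TOOLS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": str(fn(**args)),
    }
```

In a full loop you would append the returned message to `messages` and call the endpoint again so the model can produce its final answer from the tool result.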
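### Python (JSON Output)

The `response_format` parameter can force syntactically valid JSON output. A sketch of the request body and of decoding the reply, assuming the standard behavior of `"json_object"` mode (the prompt itself should also ask for JSON); sending the payload works exactly like the `requests` example above:

```python
import json

# Request body forcing a JSON object reply via `response_format`.
payload = {
    "model": "deepseek/deepseek-v4-flash",
    "messages": [
        {
            "role": "system",
            "content": 'Reply with a JSON object of the form {"sentiment": "positive" | "negative"}.',
        },
        {"role": "user", "content": "I love this product!"},
    ],
    "response_format": {"type": "json_object"},
}

def parse_json_reply(response_body: dict) -> dict:
    """Decode the assistant's JSON content from a completed response."""
    return json.loads(response_body["choices"][0]["message"]["content"])
```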