System Docs

Page endpoints and external endpoints are protocol adapters. They do not call each other over HTTP; they enter the same generation, billing, scheduling, and storage path. Default deployments enable self-use mode: public registration is closed and the first startup creates a local super admin with a random password.

Request Routing Diagram

For ordinary image/responses requests, a user's custom API currently takes the highest priority. When it wins, GPT2IMAGE does not charge account credits or external API key quota. Agent entries and entries that explicitly require Codex/Responses ignore the user custom API. External endpoints do not call internal /api/images/* routes.

Entry
Page text-to-image (image_generation): POST /api/images/generate
Page image edit (image_edit): POST /api/images/edit
Page image chat (chat): POST /api/images/chat
Page Agent image run (agent): POST /api/images/chat
External image API (image_generation): POST /v1/images/generations
External edit API (image_edit): POST /v1/images/edits
External Responses API (responses): POST /v1/responses
External Agent image API (agent): POST /v1/agents/images
Unified Handler
  1. Validate session or external API key
  2. Convert page forms or OpenAI-compatible requests into unified run parameters
  3. Calculate credits and moderation cost
  4. Call runImageGenerationForUser for the shared generation path
Group Selection
  1. External API key bound group first
  2. Then the user's selected image backend group
  3. Then the enabled default group
  4. Group checks plan access, enabled state, and content safety setting
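
The selection order above can be sketched as a small function. This is a minimal illustration, not the actual implementation: the Group class, its fields, and the numeric plan rank are hypothetical, and the content safety check is left out. The document does not say whether a failed check falls through to the next candidate, so the sketch simply reports the failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    name: str
    enabled: bool = True
    min_plan_rank: int = 0  # hypothetical: lowest plan rank allowed to use this group


def select_group(key_group: Optional[Group],
                 user_group: Optional[Group],
                 default_group: Optional[Group],
                 plan_rank: int) -> Group:
    """Pick the first configured group in priority order, then run the checks."""
    for candidate in (key_group, user_group, default_group):
        if candidate is not None:
            chosen = candidate
            break
    else:
        raise LookupError("no backend group is configured")
    # Assumption: a failed check is an error rather than a fall-through.
    if not chosen.enabled:
        raise PermissionError(f"group {chosen.name} is disabled")
    if plan_rank < chosen.min_plan_rank:
        raise PermissionError(f"plan cannot access group {chosen.name}")
    return chosen
```

For example, select_group(None, Group("user"), Group("default"), plan_rank=1) returns the user's group because no key-bound group is present.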
Backend Target
User Custom API

If the user configured an OpenAI-compatible API, ordinary image/responses requests use it first. When it wins, useCredits=false, so GPT2IMAGE account balance and API key quota are not charged.
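
A rough sketch of this precedence rule, assuming a hypothetical Route type and a hypothetical responses_only flag for entries that explicitly require Codex/Responses. Only the useCredits behaviour and the list of request types come from this page; everything else is illustrative.

```python
from dataclasses import dataclass


@dataclass
class Route:
    target: str        # "user_custom_api" or "platform_pool" (hypothetical labels)
    use_credits: bool  # mirrors the useCredits flag described above


def choose_route(request_type: str, has_custom_api: bool,
                 responses_only: bool = False) -> Route:
    """Apply the documented rule: custom API first, except for Agent
    and explicitly Codex/Responses-only entries."""
    ignores_custom_api = request_type == "agent" or responses_only
    if has_custom_api and not ignores_custom_api:
        return Route("user_custom_api", use_credits=False)
    return Route("platform_pool", use_credits=True)
```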

Web Account Pool

Uses the ChatGPT Web path for page generation, edit, and image chat.

Codex/Responses Pool

Uses Responses semantics for responses and can convert image generation/edit into responses requests.

External API Backend

Admin-configured OpenAI-compatible Base URL/API Key; calls images or responses endpoints by request type.

How The Endpoints Relate

The relationship is not external API -> page API. It is multiple adapters -> one shared service layer.

Three page endpoints
/api/images/generate, /api/images/edit, /api/images/chat
Browser-session entrypoints that adapt page forms, reference images, and internal stream events.
Agent mode
/api/images/chat + agentMode=true
Enables a Codex-style tool loop and automatic image iteration inside the page Chat endpoint.
Four external image endpoints
/v1/images/generations, /v1/images/edits, /v1/responses, /v1/agents/images
/api/v1/* is an alias to the same handlers; these adapt API keys and OpenAI-compatible request/response formats.
Shared core
runImageGenerationForUser
Credits, moderation, queueing, backend pool selection, error marking, cooldowns, refunds, and storage live here.
Backend execution
generateImage / editImage / generateChatImage
The selected member is converted to a ChatGPT Web, Codex/Responses, or external API request.
Page Agent Mode

Agent is a Codex-style automatic run mode. The page version reuses /api/images/chat and shows task cards; /v1/agents/images exposes the same run style as JSON/SSE for external clients.

  • Enabled only when Codex/Responses capability is available; the Web branch does not run Agent tools.
  • Default tools include image_generation, web_search, and continue_generation. The backend does not force tool_choice so the model can combine search, image generation, and continuation.
  • Each round shows Agent task cards such as web search, tool compatibility adjustment, image generation, streaming preview, and continue/stop decisions.
  • Uploaded text/code files can be read as request context; server filesystem paths mentioned in a prompt are not read.
  • Max rounds are configurable. With force rounds enabled, Agent runs the selected number of rounds; otherwise the model decides whether to continue through continue_generation.
  • Draft images from multiple rounds are stored as iteration variants, with the last image selected as the default final output.
  • Billing has a base Agent round charge plus actual image output credits. The default is 3 credits per Agent round, controlled by the Plan Capability Matrix.
  • External /v1/responses is not Agent. It adapts the OpenAI Responses protocol and does not automatically enable the Agent tool loop.
  • generate_image_batch-style concurrent batch tooling is not wired in yet to avoid breaking Responses native state and linear iteration.
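
The round-control rule in the list above (forced rounds versus model-driven continuation) can be illustrated as follows. This is a sketch under stated assumptions: wants_more is a hypothetical callback standing in for the model's continue_generation decision, and the 1 to 8 bound is taken from the external agent_max_rounds field documented later on this page.

```python
from typing import Callable


def run_agent_rounds(max_rounds: int, force_rounds: bool,
                     wants_more: Callable[[int], bool]) -> int:
    """Return how many rounds would run for the given settings."""
    if not 1 <= max_rounds <= 8:
        raise ValueError("max_rounds must be between 1 and 8")
    rounds = 0
    while rounds < max_rounds:
        rounds += 1  # one round: optional search, image generation, preview
        # With force rounds enabled the model's stop decision is ignored.
        if not force_rounds and not wants_more(rounds):
            break
    return rounds
```

With force_rounds=True and max_rounds=3 the loop always runs three rounds; with force_rounds=False it stops as soon as the callback declines to continue.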
External API Reference

This section documents the currently supported OpenAI-compatible surface. Fields marked Extension are GPT2IMAGE extensions or compatibility additions, not standard OpenAI fields.

Base URL
https://your-domain.example

Common Rules

  • All external endpoints require Authorization: Bearer <GPT2IMAGE API key>.
  • Image generation and image edits require Starter or higher; Responses requires Pro or higher; Agent image runs require Ultra by default and can be changed with externalApi.agent in the Plan Capability Matrix.
  • /api/v1/* and /v1/* use the same handlers; they are path aliases.
  • response_format controls URL vs base64; output_format controls the image file format. They are different fields.
  • Error responses use an OpenAI-style error object. GPT2IMAGE may also return generation_id, generationId, and credits_consumed for debugging and reconciliation.
  • A backend group bound to the external API key wins first. Otherwise the user's default group is used, then the enabled platform default group.
  • Backend group billing multipliers are applied to pre-charge, settlement, refunds, and usage records. When a mixed parent group dispatches to a child group member, the parent and child multipliers are multiplied.
  • External API keys can have independent credit limits. GET /v1/credits returns key quota, used credits, and account balance.
  • If the user has enabled a custom upstream API, ordinary /v1/images/generations, /v1/images/edits, and /v1/responses still use that custom API first. When it wins, credits_consumed is 0 and GPT2IMAGE does not charge account credits or API key quota.
  • /v1/agents/images and page features that require Codex/Responses capability ignore user custom API and are billed through the platform or external backend pool.
  • Image endpoint force_web / forceWeb only applies after routing enters a platform mixed backend group; it does not override user custom API.
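
As an illustration of the multiplier rule in the list above, the following sketch multiplies the parent and child multipliers when a mixed parent group dispatches to a child group member. The function names and the two-decimal rounding are assumptions made for the example, not documented behaviour.

```python
from decimal import Decimal
from typing import Optional


def effective_multiplier(group: Decimal, parent: Optional[Decimal] = None) -> Decimal:
    """Child multiplier alone, or parent x child for a mixed parent group."""
    return group if parent is None else parent * group


def billed_credits(base: Decimal, group: Decimal,
                   parent: Optional[Decimal] = None) -> Decimal:
    # Assumption: credits are rounded to two decimal places.
    return (base * effective_multiplier(group, parent)).quantize(Decimal("0.01"))
```

For instance, a base cost of 2.00 credits with a parent multiplier of 2 and a child multiplier of 1.5 yields 6.00 credits under these assumptions.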
GET /v1/models (no request body)

List models

Compatible with OpenAI List models. Lists image models and Responses models visible to the current API key's user.

Request Example

curl https://your-domain.example/v1/models \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY"

Response Example

{
  "object": "list",
  "data": [
    {
      "id": "gpt-image-2",
      "object": "model",
      "created": 0,
      "owned_by": "gpt2image"
    }
  ]
}

Request Fields

Authorization
Required header
Bearer <GPT2IMAGE API key>.

Response And Streaming

object
Always list.
data[].id
Model ID. Includes exposed image models and Responses models available to the current plan.
data[].object / created / owned_by
Compatible with the OpenAI model object shape.

Implementation Notes

  • Only model listing is implemented; /v1/models/{model} is not implemented.
  • Returned models are filtered by plan. Ultra users can see additional Responses models.
GET /v1/credits (no request body)

Get credits

Returns the current Bearer API key's credit limit, used credits, remaining credits, and owning account balance.

Request Example

curl https://your-domain.example/v1/credits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY"

Response Example

{
  "object": "credit_balance",
  "account": {
    "balance": 15702.45,
    "total_earned": 20000,
    "total_spent": 4297.55,
    "status": "active"
  },
  "api_key": {
    "credit_limit": 1000,
    "credits_used": 12.7,
    "credits_remaining": 987.3,
    "unlimited": false
  }
}

Request Fields

Authorization
Required header
Bearer <GPT2IMAGE API key>.

Response And Streaming

account.balance
Current available credits on the owning account.
api_key.credit_limit
Total limit for this API key; null means unlimited.
api_key.credits_used / credits_remaining
Used and remaining quota for this key. credits_remaining is null when unlimited.

Implementation Notes

  • The API key quota only limits this key. Calls through the GPT2IMAGE-billed platform path still require enough account credits.
  • When a user custom upstream API wins, GPT2IMAGE does not charge account credits or key quota.
  • Failed-generation refunds, moderation settlement, and actual-size corrections also update key usage.
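
A client can combine the two limits from this response before submitting work. The sketch below assumes the payload shape shown in the response example; the helper name spendable_credits is hypothetical, and taking the minimum of key quota and account balance is one reading of the first note above.

```python
from typing import Any, Dict


def spendable_credits(payload: Dict[str, Any]) -> float:
    """Credits a call through the GPT2IMAGE-billed path could draw on."""
    balance = payload["account"]["balance"]
    remaining = payload["api_key"]["credits_remaining"]
    if remaining is None:  # null means the key itself is unlimited
        return balance
    return min(balance, remaining)


example = {
    "object": "credit_balance",
    "account": {"balance": 15702.45, "status": "active"},
    "api_key": {"credit_limit": 1000, "credits_used": 12.7,
                "credits_remaining": 987.3, "unlimited": False},
}
print(spendable_credits(example))
```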
POST /v1/images/generations (application/json)

Create image

Compatible with OpenAI Images generation. Requests become image_generation jobs in the shared generation path.

Request Example

# 1. Official Images-style request. b64_json is the default.
curl https://your-domain.example/v1/images/generations \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024",
    "quality": "medium",
    "moderation": "auto"
  }'

# 2. Return a URL and disable GPT2IMAGE prompt optimization.
curl https://your-domain.example/v1/images/generations \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "A cyberpunk city at night after rain, neon reflections",
    "n": 2,
    "size": "1024x1024",
    "quality": "high",
    "moderation": "low",
    "response_format": "url",
    "output_format": "webp",
    "output_compression": 85,
    "prompt_optimization": false
  }'

# 3. Codex/Responses backend-only parameters. Plain Images API backends may ignore them.
curl https://your-domain.example/v1/images/generations \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "Create a 16:9 product campaign poster",
    "size": "1536x864",
    "response_format": "url",
    "output_format": "jpeg",
    "output_compression": 90,
    "gptModel": "gpt-5.4",
    "thinking": "high",
    "promptOptimization": false
  }'

# 4. Force Web account scheduling for mixed groups within the configured pixel range. Non-mixed groups ignore force_web.
curl https://your-domain.example/v1/images/generations \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A 1:1 avatar poster",
    "size": "1024x1024",
    "response_format": "url",
    "force_web": true
  }'

# 5. Streaming response. Accept: text/event-stream also enables streaming.
curl -N https://your-domain.example/v1/images/generations \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A transparent glass futuristic coffee cup",
    "size": "1024x1024",
    "response_format": "url",
    "stream": true
  }'

Response Example

{
  "created": 1713833628,
  "data": [
    {
      "url": "https://your-domain.example/api/storage/generations/...",
      "revised_prompt": "..."
    }
  ],
  "generation_id": "gen_...",
  "generationId": "gen_...",
  "credits_consumed": 1.31,
  "usage": null
}

# SSE when stream=true
event: image_generation.partial_image
data: {"type":"image_generation.partial_image","index":0,"partial_image_index":0,"url":"https://your-domain.example/api/storage/generations/..."}

event: image_generation.completed
data: {"type":"image_generation.completed","index":0,"generation_id":"...","generationId":"...","model":"gpt-image-2","size":"1024x1024","credits_consumed":1.31,"url":"https://your-domain.example/api/storage/generations/...","data":[{"url":"https://your-domain.example/api/storage/generations/...","revised_prompt":"..."}]}

Request Fields

prompt
Required
Image prompt, up to 4000 characters.
model
Optional
Image model. GPT2IMAGE accepts gpt-image-* style image models here. Use /v1/responses for Responses chat models.
n
Optional
Number of images, 1 to 10.
size
Optional
Target size. GPT2IMAGE validates the size and rejects invalid values.
quality
Optional
auto, low, medium, or high.
moderation
Optional
auto or low.
response_format
Optional
url or b64_json. Defaults to b64_json. url returns a GPT2IMAGE storage URL.
output_format
Optional
png, jpeg, or webp. Controls the actual output image format; upstream support may vary.
output_compression
Optional
0 to 100, only meaningful for jpeg/webp. Higher values mean higher quality.
stream
Optional
true returns text/event-stream.
promptOptimization / prompt_optimization
Extension
Optional
Controls whether GPT2IMAGE may further optimize prompt. If prompt is already the final optimized prompt, pass false.
gptModel / gpt_model
Extension
Optional
When routed to Codex/Responses accounts, this is the top-level Responses GPT model. Plain Images API backends may ignore it.
thinking
Extension
Optional
minimal, none, low, medium, high, or xhigh. Only applies to Codex/Responses backends; Web or plain Images API backends may ignore it.
force_web / forceWeb
Extension
Optional
Only supported by image endpoints. Ignored when a user custom upstream API takes priority; after routing enters the platform pool, mixed backend groups schedule Web accounts only when the requested total pixels are between IMAGE_FORCE_WEB_MIN_PIXELS and IMAGE_FORCE_WEB_MAX_PIXELS. The default range is 0.66MP-2MP; non-mixed or out-of-range requests ignore this field. Web backends cannot strictly guarantee resolution or 4K output.
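
The force_web conditions can be checked with a few lines of code. This is a sketch, not the server's logic: it assumes 1MP means 1,000,000 pixels, so the default range becomes 660,000 to 2,000,000 pixels, and the function and parameter names are hypothetical.

```python
# Assumed numeric reading of the documented 0.66MP-2MP default range.
FORCE_WEB_MIN_PIXELS = 660_000
FORCE_WEB_MAX_PIXELS = 2_000_000


def force_web_applies(size: str, force_web: bool, mixed_group: bool,
                      custom_api_wins: bool = False) -> bool:
    """True only when every documented condition for force_web holds."""
    if not force_web or not mixed_group or custom_api_wins:
        return False
    width, height = (int(part) for part in size.lower().split("x"))
    return FORCE_WEB_MIN_PIXELS <= width * height <= FORCE_WEB_MAX_PIXELS
```

Under these assumptions a 1024x1024 request (1,048,576 pixels) qualifies, while 3840x2160 (8,294,400 pixels) falls outside the range and the field is ignored.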

Response And Streaming

created
Unix timestamp in seconds.
data[].b64_json / data[].url
Base64 or URL according to response_format.
data[].revised_prompt
Returned when the upstream provides a revised prompt.
generation_id / generationId / credits_consumed
Extension
GPT2IMAGE extension. Non-stream success responses return the generation record ID and GPT2IMAGE-billed credits at the top level; batch requests return generation_ids / generationIds and the total credits_consumed. credits_consumed is 0 when a user custom upstream API wins.
SSE image_generation.partial_image
Only returned with stream=true or Accept: text/event-stream. Represents one partial image.
SSE image_generation.completed
Only returned in streaming mode. Indicates one image is complete; event data includes generation_id, credits_consumed, model, size, and the final image.
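
For clients that read the stream by hand, a minimal parser for the event/data pairs shown in the response example might look like the following. It is a simplified sketch: it handles only event: and data: fields with JSON payloads and takes an iterable of lines, so it works with any HTTP client that can yield decoded lines.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, decoded_json) for each blank-line-terminated event."""
    event: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line ends the current event
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    if data:  # stream closed without a trailing blank line
        yield event or "message", json.loads("\n".join(data))


sample = [
    "event: image_generation.partial_image",
    'data: {"type":"image_generation.partial_image","index":0}',
    "",
    "event: image_generation.completed",
    'data: {"type":"image_generation.completed","credits_consumed":1.31}',
    "",
]
for name, payload in parse_sse(sample):
    print(name, payload.get("credits_consumed"))
```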

Implementation Notes

  • This endpoint does not call page /api/images/generate; it directly enters the shared service layer.
  • When routed to a Responses account, the image request is converted into a Responses image_generation tool request.
  • A request with n (or count) greater than 1 is still one HTTP request. A 10-image request creates 10 generation records and bills 10 outputs. GPT2IMAGE runs batch items with bounded parallelism based on the plan image-generation concurrency; items beyond that concurrency wait inside the same batch.
  • Concurrency and queueing: the runtime uses one in-process image queue. Tasks are sorted by plan queue priority, then FIFO within the same priority, and are started only when both the global concurrency and per-user image-generation concurrency allow it. Global concurrency is configurable in Admin System Settings > Models > Global image generation concurrency; IMAGE_GENERATION_GLOBAL_CONCURRENCY is only the fallback default. Batch requests add a request-local bounded runner, so only the allowed number of batch items are started and the rest wait inside that batch instead of flooding the shared queue.
  • Waiting in a queue does not create a generation record or charge image credits. If the shared queue wait exceeds IMAGE_GENERATION_QUEUE_TIMEOUT_MS, the API returns a 429-style error. The 20-minute runtime timeout starts only after an individual image task begins execution, and timeout settlement follows the failed-generation credit rules.
  • Web backends cannot strictly control output dimensions or output format. GPT2IMAGE labels stored files by the detected image header and MIME.
  • If the actual generated dimensions differ from the requested size, GPT2IMAGE records and bills using the detected actual size.
  • The official Images API may return usage. GPT2IMAGE usually returns usage: null, but GPT2IMAGE-billed credits are returned through top-level credits_consumed, error payloads, or streaming completion events. When a user custom upstream API wins, GPT2IMAGE does not charge credits.
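
The queue ordering described in the notes above (plan priority first, FIFO within a priority, gated by global and per-user concurrency) can be modelled with a heap. This is an illustrative model only: the class and method names are hypothetical, a lower number is assumed to mean higher priority, and a single per-user limit stands in for the plan-specific concurrency.

```python
import heapq
from collections import Counter
from itertools import count
from typing import List


class ImageQueue:
    def __init__(self, global_limit: int, per_user_limit: int) -> None:
        self.global_limit = global_limit
        self.per_user_limit = per_user_limit
        self._heap: list = []
        self._seq = count()        # arrival order gives FIFO within a priority
        self._running: Counter = Counter()

    def submit(self, user: str, priority: int, task_id: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), user, task_id))

    def start_ready(self) -> List[str]:
        """Start tasks in order while both concurrency limits allow it."""
        started, blocked = [], []
        while self._heap and sum(self._running.values()) < self.global_limit:
            item = heapq.heappop(self._heap)
            _, _, user, task_id = item
            if self._running[user] < self.per_user_limit:
                self._running[user] += 1
                started.append(task_id)
            else:
                blocked.append(item)  # user is at their limit; keep waiting
        for item in blocked:
            heapq.heappush(self._heap, item)
        return started

    def finish(self, user: str) -> None:
        self._running[user] -= 1
```

A task whose user is already at the per-user limit stays queued without blocking tasks from other users behind it.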
POST /v1/images/edits (multipart/form-data or application/json)

Create image edit

Compatible with OpenAI Images edit. multipart uploads files; JSON can reference public image URLs.

Request Example

# 1. multipart upload reference image.
curl https://your-domain.example/v1/images/edits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -F model="gpt-image-2" \
  -F prompt="Turn the reference image into a cinematic poster" \
  -F n="1" \
  -F size="1024x1024" \
  -F quality="high" \
  -F moderation="auto" \
  -F response_format="url" \
  -F output_format="jpeg" \
  -F output_compression="90" \
  -F 'image[]=@/path/to/reference.png'

# 2. multipart multiple references + mask + Codex/Responses fields.
curl https://your-domain.example/v1/images/edits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -F model="gpt-image-2" \
  -F prompt="Only redraw the masked area and keep the face unchanged" \
  -F size="1536x1024" \
  -F quality="medium" \
  -F response_format="b64_json" \
  -F promptOptimization="false" \
  -F gpt_model="gpt-5.4" \
  -F thinking="medium" \
  -F 'image[]=@/path/to/person.png' \
  -F 'image_2=@/path/to/style.png' \
  -F mask="@/path/to/mask.png"

# 3. JSON image URLs. Prefer images; image_url/image_urls are shortcuts.
curl https://your-domain.example/v1/images/edits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "Turn the reference into a clean ecommerce hero image",
    "images": [
      "https://example.com/reference.png",
      { "image_url": "https://example.com/detail.webp" }
    ],
    "image_url": "https://example.com/single-reference.png",
    "image_urls": ["https://example.com/extra.jpg"],
    "mask_url": "https://example.com/mask.png",
    "mask_image_url": "https://example.com/mask-alt.png",
    "n": 1,
    "size": "1024x1024",
    "quality": "auto",
    "moderation": "low",
    "response_format": "url",
    "output_format": "webp",
    "output_compression": 80,
    "prompt_optimization": false,
    "gptModel": "gpt-5.4-mini",
    "thinking": "low"
  }'

# 4. Force Web account scheduling for mixed groups within the configured pixel range. Non-mixed groups ignore force_web.
curl https://your-domain.example/v1/images/edits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "Keep the person and make it look like a cinematic still",
    "images": ["https://example.com/reference.png"],
    "size": "1024x1024",
    "response_format": "url",
    "force_web": true
  }'

# 5. Streaming image edit.
curl -N https://your-domain.example/v1/images/edits \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Accept: text/event-stream" \
  -F model="gpt-image-2" \
  -F prompt="Keep the composition and convert it to watercolor illustration" \
  -F size="1024x1024" \
  -F response_format="url" \
  -F stream="true" \
  -F 'image=@/path/to/reference.png'

Response Example

{
  "created": 1713833628,
  "data": [
    {
      "url": "https://your-domain.example/api/storage/generations/...",
      "revised_prompt": "..."
    }
  ],
  "generation_id": "gen_...",
  "generationId": "gen_...",
  "credits_consumed": 1.31,
  "usage": null
}

# SSE when stream=true
event: image_edit.partial_image
data: {"type":"image_edit.partial_image","index":0,"partial_image_index":0,"url":"https://your-domain.example/api/storage/generations/..."}

event: image_edit.completed
data: {"type":"image_edit.completed","index":0,"generation_id":"...","generationId":"...","model":"gpt-image-2","size":"1024x1024","credits_consumed":1.31,"url":"https://your-domain.example/api/storage/generations/...","data":[{"url":"https://your-domain.example/api/storage/generations/...","revised_prompt":"..."}]}

Request Fields

prompt
Required
Edit prompt, up to 4000 characters.
image / image[] / image_*
Required for multipart
Reference image files, up to 16 images.
images
Optional for JSON
Image reference array. GPT2IMAGE accepts string URLs or { image_url/url }. file_id is not supported.
mask
Optional
PNG mask file; JSON can provide a mask URL reference.
model
Optional
Image model; must be a gpt-image-* style image model.
n
Optional
Number of outputs, 1 to 10.
size
Optional
Target size.
quality
Optional
auto, low, medium, or high.
moderation
Optional
auto or low.
response_format
Optional
url or b64_json. Defaults to b64_json.
output_format
Optional
png, jpeg, or webp. Controls the actual output image format; upstream support may vary.
output_compression
Optional
0 to 100, only meaningful for jpeg/webp. Higher values mean higher quality.
stream
Optional
true returns text/event-stream.
image_url / image_urls
Extension
Optional JSON or form field
Compatibility shortcut fields. Prefer images; when both are provided, GPT2IMAGE merges them into one reference list and deduplicates by URL.
mask_url / mask_image_url
Extension
Optional JSON or form field
Convenience fields for a mask image URL.
promptOptimization / prompt_optimization
Extension
Optional
Controls whether GPT2IMAGE may further optimize prompt. If prompt is already the final optimized prompt, pass false.
gptModel / gpt_model
Extension
Optional
Same as Create image.
thinking
Extension
Optional
minimal, none, low, medium, high, or xhigh. Only applies to Codex/Responses backends; Web or plain Images API backends may ignore it.
force_web / forceWeb
Extension
Optional
Only supported by image endpoints. Ignored when a user custom upstream API takes priority; after routing enters the platform pool, mixed backend groups schedule Web accounts only when the requested total pixels are between IMAGE_FORCE_WEB_MIN_PIXELS and IMAGE_FORCE_WEB_MAX_PIXELS. The default range is 0.66MP-2MP; non-mixed or out-of-range requests ignore this field. Web backends cannot strictly guarantee resolution or 4K output.
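
The merge-and-deduplicate behaviour described for images, image_url, and image_urls can be sketched as below. The helper name and the resulting order (images first, then image_url, then image_urls) are assumptions; the page only states that the fields are merged into one list and deduplicated by URL.

```python
from typing import Any, Dict, List


def merge_image_refs(body: Dict[str, Any]) -> List[str]:
    """Collect reference URLs from all three JSON fields, keeping first occurrences."""
    candidates: List[Any] = list(body.get("images") or [])
    if body.get("image_url"):
        candidates.append(body["image_url"])
    candidates.extend(body.get("image_urls") or [])

    urls: List[str] = []
    for item in candidates:
        # Entries are either plain URL strings or { image_url/url } objects.
        url = item if isinstance(item, str) else item.get("image_url") or item.get("url")
        if url and url not in urls:
            urls.append(url)
    return urls


request_body = {
    "images": ["https://example.com/reference.png",
               {"image_url": "https://example.com/detail.webp"}],
    "image_url": "https://example.com/reference.png",  # duplicate, dropped
    "image_urls": ["https://example.com/extra.jpg"],
}
print(merge_image_refs(request_body))
```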

Response And Streaming

created / data[]
Same as /v1/images/generations.
generation_id / generationId / credits_consumed
Extension
GPT2IMAGE extension. Non-stream success responses return the generation record ID and GPT2IMAGE-billed credits at the top level; batch requests return generation_ids / generationIds and the total credits_consumed. credits_consumed is 0 when a user custom upstream API wins.
SSE image_edit.partial_image
Only returned with stream=true or Accept: text/event-stream. Represents one partial edited image.
SSE image_edit.completed
Only returned in streaming mode. Indicates one edited image is complete; event data includes generation_id, credits_consumed, model, size, and the final image.

Implementation Notes

  • URL images are downloaded server-side and checked for public reachability, type, and size.
  • Private networks, localhost, metadata/internal hosts, and URLs with credentials are rejected.
  • Official JSON file_id image references are not implemented. Use public image_url or multipart uploads.
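
Callers can pre-check URLs against the rejection rules above before sending a request. The following sketch uses only the Python standard library and is not the server's validator: the blocked host names are illustrative assumptions, and a complete check would also resolve DNS and re-test the resolved addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Illustrative host names only; the server's actual block list is not documented here.
BLOCKED_HOSTS = {"localhost", "metadata.google.internal"}


def is_acceptable_image_url(url: str) -> bool:
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if parts.username or parts.password:  # URLs with credentials are rejected
        return False
    host = parts.hostname.lower()
    if host in BLOCKED_HOSTS or host.endswith((".localhost", ".internal")):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; resolution is out of scope for this sketch
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified)
```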
POST /v1/agents/images (application/json or multipart/form-data)

Create Agent image run

GPT2IMAGE extension that exposes the page Agent run style to external API clients. It uses Codex/Responses scheduling, web search, tool loop continuation, attachment context, and multi-round image iteration.

Request Example

# 1. JSON Agent image run. Ultra is required by default; admins can change externalApi.agent.
curl https://your-domain.example/v1/agents/images \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "image_model": "gpt-image-2",
    "prompt": "Search public information about Zhejiang Shuangyuan Technology and iterate an enterprise poster",
    "size": "1536x1024",
    "quality": "high",
    "thinking": "medium",
    "agent_max_rounds": 3,
    "agent_force_max_rounds": false,
    "response_format": "url"
  }'

# 2. With reference image URLs. images / image_url / image_urls are merged and deduplicated.
curl https://your-domain.example/v1/agents/images \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "image_model": "gpt-image-2",
    "prompt": "Analyze this product photo and create an ecommerce poster",
    "images": ["https://example.com/product.png"],
    "size": "1024x1024",
    "agent_max_rounds": 2
  }'

# 3. multipart reference image plus PDF/text attachments.
curl https://your-domain.example/v1/agents/images \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -F model="gpt-5.4" \
  -F image_model="gpt-image-2" \
  -F prompt="Read the attachment and create a trade-show poster" \
  -F size="1536x1024" \
  -F response_format="url" \
  -F agent_max_rounds="3" \
  -F 'image[]=@/path/to/reference.png' \
  -F 'file=@/path/to/company-profile.pdf'

# 4. Streaming Agent events.
curl -N https://your-domain.example/v1/agents/images \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "image_model": "gpt-image-2",
    "prompt": "Search first, then iterate a technology-blue enterprise poster",
    "size": "1536x1024",
    "stream": true,
    "agent_max_rounds": 2,
    "agent_force_max_rounds": true
  }'

Response Example

{
  "object": "agent.image_run",
  "created": 1713833628,
  "generation_id": "gen_...",
  "generationId": "gen_...",
  "model": "gpt-5.4",
  "size": "1536x1024",
  "response_text": "Research and poster generation completed.",
  "agent_round_count": 2,
  "credits_consumed": 8.42,
  "data": [
    {
      "url": "https://your-domain.example/api/storage/generations/...",
      "revised_prompt": "...",
      "output_role": "agent_draft"
    },
    {
      "url": "https://your-domain.example/api/storage/generations/...",
      "revised_prompt": "...",
      "output_role": "final"
    }
  ],
  "agent_events": [],
  "usage": null
}

# SSE when stream=true
event: agent.event
data: {"type":"agent.event","event":{"kind":"web_search","status":"completed","title":"Web search completed","detail":"Zhejiang Shuangyuan Technology official site"}}

event: agent.partial_image
data: {"type":"agent.partial_image","partial_image_index":0,"url":"https://your-domain.example/api/storage/generations/..."}

event: agent.completed
data: {"type":"agent.completed","generation_id":"...","generationId":"...","agent_round_count":2,"credits_consumed":8.42,"data":[{"url":"https://your-domain.example/api/storage/generations/...","output_role":"final"}]}

Request Fields

prompt
Required
Current Agent task, up to 4000 characters.
model / gptModel / gpt_model
Optional
Top-level GPT/Responses model. If model is gpt-image-*, GPT2IMAGE treats it as image_model for compatibility.
image_model / imageModel
Optional
Image model used by the image_generation tool, usually gpt-image-*.
images / image_url / image_urls
Optional for JSON
Public reference image URLs. The server downloads and validates public reachability, type, and size.
image / image[] / image_*
Optional for multipart
Reference image files. Images plus attachments are limited by maxChatImages.
file / file[] / attachment
Optional for multipart
Text, code, CSV, JSON, Markdown, XML, YAML, log, or PDF attachments. Text files become context; PDFs become Responses file inputs.
history
Optional
Previous conversation array such as [{ role, text, imageUrls, variants }] for continuing an external Agent conversation.
agent_max_rounds
Extension
Optional
1 to 8. Caps automatic Agent iteration rounds.
agent_force_max_rounds
Extension
Optional
When true, runs exactly agent_max_rounds. When false, the model may stop through continue_generation.
n / count
Optional
The Agent API runs one task at a time; when supplied this must be 1. Use concurrent requests for multiple tasks.
size / quality / moderation / output_format / output_compression
Optional
Same as image endpoints; used as runtime image_generation parameters inside Agent.
thinking
Extension
Optional
minimal, none, low, medium, high, or xhigh.
response_format
Optional
url or b64_json. Agent defaults to url to avoid oversized multi-round responses.
stream
Optional
Set to true, or send Accept: text/event-stream, to receive SSE. Streaming also requires externalApi.streaming.

Response And Streaming

object / generation_id / model / size
Agent run object, generation record, model, and size.
data[]
Images produced by this Agent run. output_role may be agent_draft or final; the final item is the default deliverable.
agent_events[]
Structured task events such as web search, image generation, and continue/stop decisions.
credits_consumed / agent_round_count
Extension
GPT2IMAGE-billed credits and Agent rounds. Agent always requires Codex/Responses capability and does not use user custom API. Billing = Agent base round credits + final image output credits + moderation credits, with backend group multipliers applied.
SSE agent.event / agent.partial_image / agent.completed
Streaming task events, streaming previews, and final completion.
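
The Agent billing formula above can be written out as a short estimate. This is a sketch under assumptions: the base charge is taken as the per-round rate times the round count (3 credits per round by default, as stated in the Page Agent Mode section), the backend group multiplier is applied to the whole subtotal, and the result is rounded to two decimals. The input values in the example are made up.

```python
def agent_run_credits(rounds: int, image_credits: float,
                      moderation_credits: float = 0.0,
                      round_credits: float = 3.0,
                      multiplier: float = 1.0) -> float:
    """Estimate: (rounds x round rate + image output + moderation) x multiplier."""
    subtotal = rounds * round_credits + image_credits + moderation_credits
    # Assumption: the multiplier covers the whole subtotal and credits use 2 decimals.
    return round(subtotal * multiplier, 2)


# Hypothetical run: 2 rounds and 1.31 credits of final image output.
print(agent_run_credits(2, 1.31))
```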

Implementation Notes

  • This endpoint is a GPT2IMAGE extension, not an official OpenAI endpoint. /api/v1/agents/images is an alias.
  • Ultra is required by default; admins can change externalApi.agent in the Plan Capability Matrix.
  • It forces requiresResponsesBackend and never schedules Web accounts; it can use Codex/Responses accounts or external API backends that support /responses.
  • It does not call page /api/images/chat; it shares the runImageGenerationForUser service layer with page Agent.
  • generate_image_batch concurrent tooling is intentionally not exposed yet to preserve linear iteration and Responses native state.
POST /v1/responses (application/json)

Create response

A GPT2IMAGE image-generation adapter based on the OpenAI Responses API. It routes as responses and selects Codex/Responses groups or external /responses API backends.

Request Example

# 1. Minimal Responses image request. Requires Pro or higher.
curl https://your-domain.example/v1/responses \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": "Generate a 1:1 futuristic product render",
    "size": "1024x1024",
    "quality": "high",
    "moderation": "auto"
  }'

# 2. Explicit image_generation tool with image model.
curl https://your-domain.example/v1/responses \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": "Generate a landscape technology product key visual",
    "tools": [{ "type": "image_generation", "model": "gpt-image-2" }],
    "size": "1536x864",
    "quality": "medium",
    "reasoning": { "effort": "low" },
    "store": true
  }'

# 3. Responses input with a reference image.
curl https://your-domain.example/v1/responses \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Use this image as reference and make a winter poster" },
          { "type": "input_image", "image_url": "https://example.com/reference.png" }
        ]
      }
    ],
    "tools": [{ "type": "image_generation", "model": "gpt-image-2" }],
    "size": "1024x1024",
    "output_format": "webp",
    "output_compression": 85,
    "moderation": "low"
  }'

# 4. Continue a previous response and stream the result.
curl -N https://your-domain.example/v1/responses \
  -H "Authorization: Bearer $GPT2IMAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "previous_response_id": "resp_previous_id",
    "input": "Add a moon based on the previous image",
    "tools": [{ "type": "image_generation", "model": "gpt-image-2" }],
    "size": "1024x1024",
    "reasoning": { "effort": "minimal" },
    "stream": true
  }'

Response Example

{
  "id": "resp_...",
  "object": "response",
  "created_at": 1713833628,
  "status": "completed",
  "model": "gpt-5.4",
  "output": [
    {
      "id": "ig_...",
      "type": "image_generation_call",
      "status": "completed",
      "result": "..."
    }
  ],
  "usage": null,
  "metadata": {
    "generation_id": "...",
    "credits_consumed": 1.31,
    "size": "1024x1024"
  }
}

# SSE when stream=true
event: response.output_item.done
data: {"type":"response.output_item.done","item":{"id":"ig_...","type":"image_generation_call","status":"completed","result":"..."}}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_...","object":"response","created_at":1713833628,"status":"completed","model":"gpt-5.4","output":[{"id":"ig_...","type":"image_generation_call","status":"completed","result":"..."}],"usage":null,"metadata":{"generation_id":"...","credits_consumed":1.31,"size":"1024x1024"}}}
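
A non-streaming response shaped like the example above can be unpacked as follows. This is a minimal sketch, assuming the result field holds standard base64 data (the Response And Streaming notes describe it as b64_json). `extract_images` is a hypothetical helper name, not part of GPT2IMAGE.

```python
import base64


def extract_images(response):
    """Return decoded image bytes for each completed image_generation_call."""
    images = []
    for item in response.get("output", []):
        if (
            item.get("type") == "image_generation_call"
            and item.get("status") == "completed"
        ):
            # Assumption: result is a base64 string without a data-URL prefix.
            images.append(base64.b64decode(item["result"]))
    return images
```

Items of type message are skipped here. A client that also wants upstream text would read those separately.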

Request Fields

model
Optional
Top-level Responses model. Availability is determined by /v1/models and the current plan.
input
Required
A string or message array. Message content supports strings, input_text/output_text, and input_image.image_url.
previous_response_id
Optional
Continues a previous response. GPT2IMAGE loads stored webConversation/fallbackHistory continuation state.
tools
Optional
If provided, must include { type: "image_generation" }. If omitted, GPT2IMAGE adds image_generation automatically. Put the image model in the image_generation tool's model field.
tool_choice
Optional
Accepted for compatibility. Do not force it in chat or multi-tool runs unless needed, because it can prevent the model from using web search, code interpreter, or image generation together.
stream
Optional
true returns Responses-style SSE events.
store
Optional
Accepted for compatibility. GPT2IMAGE stores continuation state internally and does not guarantee official store semantics.
reasoning.effort
Optional
Supports none, minimal, low, medium, high, and xhigh. Actual support depends on the selected backend.
size
Extension
Optional
Convenience field used as the run-time image size when the image_generation tool does not provide one.
quality
Extension
Optional
Convenience field used as the run-time image quality.
moderation
Extension
Optional
Convenience field used as the run-time image moderation setting.
output_format
Extension
Optional
Convenience field used as the run-time output_format when the image_generation tool does not provide one. You may also put it directly in the image_generation tool.
output_compression
Extension
Optional
Convenience field used as the run-time output_compression when the image_generation tool does not provide one.
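
The fallback described for the extension fields can be sketched as a small merge step. This is a hedged illustration, not the actual handler: `resolve_image_params` is a hypothetical name. The text states the tool-first rule only for size, output_format, and output_compression, so applying the same rule to quality and moderation is an assumption.

```python
FALLBACK_FIELDS = (
    "size",
    "quality",
    "moderation",
    "output_format",
    "output_compression",
)


def resolve_image_params(request):
    """Merge top-level convenience fields with the image_generation tool."""
    tool = next(
        (t for t in request.get("tools", []) if t.get("type") == "image_generation"),
        {},
    )
    resolved = {}
    for field in FALLBACK_FIELDS:
        if field in tool:
            # A value on the tool itself takes precedence.
            resolved[field] = tool[field]
        elif field in request:
            # Otherwise fall back to the top-level convenience field.
            resolved[field] = request[field]
    return resolved
```

For a request that sets size at the top level and also on the tool, the tool's value is kept, while a top-level output_format still applies because the tool does not provide one.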

Response And Streaming

id / object / created_at / status / model / output
Compatible with the basic Responses response object.
output[].type = image_generation_call
The image is returned in the result field as base64-encoded data (b64_json).
output[].type = message
Upstream text, when present, is returned as output_text.
metadata.generation_id / credits_consumed / size
Extension
GPT2IMAGE generation record, billed credits, and size metadata. credits_consumed is 0 when a user custom upstream API wins.
SSE response.output_item.done / response.completed
Streaming output item and completion events.
SSE response.output_text.delta / response.reasoning_summary_text.delta
Text and reasoning summary delta events.
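
A client can read the streaming events listed above with a small Server-Sent Events parser. This is a sketch under simplifying assumptions: the whole stream is already in memory as text, events are separated by blank lines with LF line endings, and every data payload is JSON. `parse_sse` and `completed_response` are hypothetical helper names.

```python
import json


def parse_sse(text):
    """Yield (event_name, data_dict) pairs from a Server-Sent Events string."""
    for block in text.strip().split("\n\n"):
        event, data_lines = None, []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            # Multiple data lines in one event are joined with newlines.
            yield event, json.loads("\n".join(data_lines))


def completed_response(text):
    """Return the response object from response.completed, or None."""
    for event, data in parse_sse(text):
        if event == "response.completed":
            return data.get("response")
    return None
```

A production client would parse incrementally as bytes arrive and handle the delta events as well. This version only shows the event framing.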

Implementation Notes

  • This endpoint requires the Pro plan or higher.
  • This is not Chat Completions. /v1/chat/completions is still unsupported.
  • input_image accepts image_url values, including data URLs. file_id and file inputs are not currently used as references.
  • If tools is provided without image_generation, GPT2IMAGE returns an error to avoid text-only responses.
  • Page Chat mode uses normal multimodal chat/image semantics. Agent mode provides image_generation, web_search, and the linear continuation tool continue_generation by default without forcing tool_choice.
  • Page Chat/Agent can read uploaded local text/code files as request context. Prompted server filesystem paths are not read.
  • Page Chat/Agent base round credits are configured per plan in the admin Plan Capability Matrix. Defaults are 1 credit per Chat round and 3 credits per Agent round; completed images are additionally billed by detected output size and output count.
  • Agent feeds the previous round's text, tool outputs, and generated draft images into the next round so the model can decide whether to refine again. The cap is IMAGE_AGENT_MAX_ROUNDS, default 3. Concurrent batch tools such as generate_image_batch are not wired into the runtime yet because they first need a native Responses state design.
  • Multiple Agent image_generation_call outputs are shown as automatic iteration variants, with the last image selected by default.
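
The tools rule in the notes above (add image_generation when tools is omitted, reject an explicit list that lacks it) can be sketched as a normalization step. This is an illustration only: `normalize_tools` and `MissingImageToolError` are hypothetical names, and treating an empty list as "provided" rather than "omitted" is an assumption the text does not settle.

```python
class MissingImageToolError(ValueError):
    """Raised when tools is provided without an image_generation entry."""


def normalize_tools(tools):
    """Return the tool list a /v1/responses request would run with."""
    if tools is None:
        # Omitted: the image_generation tool is added automatically.
        return [{"type": "image_generation"}]
    if not any(t.get("type") == "image_generation" for t in tools):
        # Provided without image_generation: reject to avoid text-only output.
        raise MissingImageToolError("tools must include image_generation")
    return list(tools)
```

Rejecting early keeps a request that could only produce text from reaching a backend at all.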
Entry To Backend Mapping

Page Requests

Create page generation
/api/images/generate
image_generation
Can route to user custom API, Web account, Codex/Responses account, or external API backend.
Create page edit
/api/images/edit
image_edit
Reference images enter the internal endpoint first, then route through the selected backend group.
Create page image chat
/api/images/chat
chat
Uses chat routing; can select Web accounts, Codex/Responses accounts, or external API backends that support /responses.
Create page Agent run
/api/images/chat
agent
Same internal endpoint, but uses Codex/Responses capability; it provides image_generation, web_search, continue_generation, and visible task cards.

External API Requests

OpenAI images generation
/v1/images/generations
image_generation
Validates the API key and plan, then enters the same generation path; b64_json is the default response format, and url can be requested explicitly.
OpenAI images edit
/v1/images/edits
image_edit
Multipart images are converted into unified image inputs before backend routing.
OpenAI Responses
/v1/responses
responses
Adds the image_generation tool when tools are omitted; explicit tools must include image_generation. User custom API still wins when available; otherwise responses routing selects Codex/Responses groups or external /responses API backends.
GPT2IMAGE Agent image run
/v1/agents/images
agent
GPT2IMAGE extension. Requires externalApi.agent, routes to Codex/Responses only, and can stream Agent task events plus multi-round image outputs.
OpenAI models
/v1/models
-
Only lists models visible to the current plan/API key and does not trigger backend pool routing.
GPT2IMAGE credits
/v1/credits
-
Returns the current API key quota, usage, remaining quota, and the owning account credit balance without backend routing.
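
The mapping above can be condensed into a lookup from entry to routing type. This is a sketch for orientation, not the real router: `routing_type` and the `agent_mode` flag are hypothetical, because the text does not say how the page distinguishes a Chat request from an Agent request on the shared /api/images/chat endpoint.

```python
ROUTING_TYPES = {
    ("POST", "/api/images/generate"): "image_generation",
    ("POST", "/api/images/edit"): "image_edit",
    ("POST", "/v1/images/generations"): "image_generation",
    ("POST", "/v1/images/edits"): "image_edit",
    ("POST", "/v1/responses"): "responses",
    ("POST", "/v1/agents/images"): "agent",
}

# Entries that answer directly without backend pool routing.
NON_ROUTED_PATHS = {"/v1/models", "/v1/credits"}


def routing_type(method, path, agent_mode=False):
    """Return the routing type for an entry, or None if it is not routed."""
    if path in NON_ROUTED_PATHS:
        return None
    if path == "/api/images/chat":
        # Same internal endpoint; the mode decides chat versus agent routing.
        return "agent" if agent_mode else "chat"
    # Unknown entries raise KeyError in this sketch.
    return ROUTING_TYPES[(method.upper(), path)]
```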
Web Accounts

Uses ChatGPT Web image generation. It can reuse Web account quota, but it is not a strictly parameterized Images/Responses API.

  • Resolution is not strictly controllable; size is only a hint/record value and output may differ.
  • 4K output is not guaranteed; high-resolution output depends on current ChatGPT Web capability and account state.
  • The main GPT conversation model and Web thinking level can be controlled; the image model is not mapped to a separate Web image model.
  • When prompt optimization is off, GPT2IMAGE sends the original prompt and forces Web thinking to instant to reduce platform-side rewriting.
  • External /v1/responses is adapted into the shared chat generation path, but its scheduling type remains responses; it only selects Codex/Responses groups or external Responses API backends, not Web account pools.
  • For external /v1/responses, an empty model uses the backend default; explicit models must be listed by /v1/models or GPT2IMAGE rejects them.
  • GPT2IMAGE cannot guarantee that ChatGPT Web upstream never interprets, expands, or revises the prompt text.
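
The model rule for external /v1/responses stated in the list above (empty model uses the backend default, explicit models must be listed by /v1/models) can be sketched as follows. `resolve_model` and its parameters are hypothetical names chosen for illustration.

```python
def resolve_model(requested, visible_models, backend_default):
    """Pick the top-level model for an external /v1/responses request."""
    if not requested:
        # Empty or missing model: use the backend default.
        return backend_default
    if requested not in visible_models:
        # Explicit models must appear in the /v1/models listing.
        raise ValueError(f"model {requested!r} is not listed by /v1/models")
    return requested
```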
Codex / Responses Accounts

Uses Responses semantics and is the most parameterized system-account backend.

  • GPT model is sent as the top-level Responses model.
  • Image model is sent as the image_generation tool model.
  • size, quality, moderation, reference images, and mask are assembled into the Responses tool request.
  • Page Chat mode uses normal multimodal chat/image semantics. Page Agent mode provides image_generation, web_search, and continue_generation by default without forcing tool_choice, and can continue across linear automatic rounds so the model can search, read uploaded text-file context, generate drafts, and refine like Codex.
  • Uploaded local text/code files in Chat/Agent are read as request context. Server filesystem paths written in prompts are not read.
  • Supports external /v1/responses and can also handle converted /v1/images/generations and /v1/images/edits requests.
  • When prompt optimization is off, GPT2IMAGE instructs the model not to modify the prompt; this is best effort and upstream may still deviate.
  • Page Chat/Agent base round credits are configured per plan in the admin Plan Capability Matrix. Defaults are 1 credit per Chat round and 3 credits per Agent round; completed image outputs are billed additionally by detected size and count.
  • This backend is not ChatGPT Web, so Web-only capability and quota semantics do not apply.
  • On rate limits, quota errors, or invalid credentials, the scheduler cools down/marks the account and tries another one.
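
The failover behavior in the last bullet can be sketched as a loop over accounts. This shows the general technique, not the GPT2IMAGE scheduler: `AccountError`, its `kind` values, the account dictionary keys, and the 60-second cooldown are all assumptions, as is the choice to disable an account on invalid credentials rather than cool it down.

```python
import time


class AccountError(Exception):
    """Hypothetical upstream failure with a kind such as 'rate_limit'."""

    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind


def run_with_failover(accounts, call, cooldown_seconds=60.0, now=time.monotonic):
    """Try each available account in order and return the first success."""
    last_error = None
    for account in accounts:
        if account.get("disabled") or account.get("cooldown_until", 0.0) > now():
            continue  # skip accounts that are marked or still cooling down
        try:
            return call(account)
        except AccountError as exc:
            last_error = exc
            if exc.kind == "invalid_credentials":
                # Mark the account so it is not retried.
                account["disabled"] = True
            else:
                # Rate limit or quota error: cool the account down.
                account["cooldown_until"] = now() + cooldown_seconds
    raise RuntimeError("no account available") from last_error
```

Passing `now` as a parameter keeps the cooldown logic testable without waiting on a real clock.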
External API Backends

Uses an admin-configured OpenAI-compatible Base URL/API Key. Final capability depends on that service.

  • Images generation/edit call the external Images API.
  • Responses requests call the external /responses endpoint.
  • Model, size, quality, streaming events, and usage fields depend on the external API implementation.
  • Does not consume GPT2IMAGE Web or Codex account pool quota.
  • If the external service rewrites prompts or limits resolution, GPT2IMAGE cannot override it.
Prompt Optimization And Thinking
Prompt optimization on
Optimized prompt may be used; Web thinking follows the selected value.
Prompt optimization off
Original prompt is sent; Web is forced to instant to minimize changes.
Codex/Responses
When prompt optimization is off, GPT2IMAGE instructs the model not to modify the prompt, but final behavior still depends on the upstream model/tool.
External API
The platform passes through where possible; the external service decides final behavior.
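
For the Web path, the two prompt-optimization rows above reduce to a small decision. This is a minimal sketch: `web_request_settings` and its parameters are hypothetical. Falling back to the original prompt when no optimized prompt exists is an assumption based on the wording "may be used".

```python
def web_request_settings(original_prompt, optimized_prompt, optimization_on, selected_thinking):
    """Return (prompt, thinking) for a ChatGPT Web request."""
    if optimization_on:
        # The optimized prompt may be used; thinking follows the selection.
        return (optimized_prompt or original_prompt, selected_thinking)
    # Optimization off: send the original prompt and force instant thinking.
    return (original_prompt, "instant")
```

For Codex/Responses and external API backends the outcome is only best effort, so this sketch deliberately covers the Web rule alone.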
Roadmap
  • Sub2API non-database interface: current sync uses SUB2API_POSTGRES_URL to connect to Sub2API PostgreSQL. Future work should evaluate the Sub2API admin key / HTTP API path for account lookup, group filtering, status reads, error cleanup, and sync jobs; keep direct DB access only as a fallback when the API lacks required fields.
  • PSD generation API: prepare support for PSD/layered outputs by defining the upstream contract, MIME/extension handling, storage and preview behavior, credit billing, external API response fields, capability matrix switch, and page download entry.
  • Agent batch image tool: evaluate a generate_image_batch-style tool where the model plans multiple independent images and the backend executes them with bounded parallelism; design the interaction with Responses previous_response_id before enabling it.
  • Image reference UX: improve the atomic @图1 (image 1) and @第N轮图M (round N, image M) tokens, remap references after images are reordered, and surface missing-reference warnings.
  • Agent branching: when editing or regenerating an older round, fork a new branch instead of overwriting later records.