Interface ChatVisionInput

ChatVisionInput represents the full set of potential input options for Vision chat.

interface ChatVisionInput {
    frequencyPenalty?: number;
    image: Base64Encoder;
    inputExtension?: InputExtension;
    logitBias?: {
        [token: string]: number;
    };
    maxCompletionTokens?: number;
    maxTokens?: number;
    model: string;
    outputExtension?: OutputExtension;
    parallelToolCalls?: boolean;
    presencePenalty?: number;
    question: string;
    reasoningEffort?: string;
    role: Roles;
    stop?: string | string[];
    temperature?: number;
    toolChoice?: string | ToolChoiceFunction;
    tools?: Tool[];
    topK?: number;
    topP?: number;
}
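
A request is a plain object conforming to this interface. The sketch below is illustrative only: the import path is a placeholder, and it assumes the SDK exports a Roles enum with a User member and at least one Base64Encoder implementation (called ImageNetwork here) that loads an image from a URL; check the SDK's actual exports for the real names.

// Illustrative sketch only: the import path is a placeholder, and ImageNetwork
// is an assumed Base64Encoder implementation; substitute the SDK's real exports.
import { ChatVisionInput, ImageNetwork, Roles } from "your-sdk-package";

const input: ChatVisionInput = {
    model: "llava-1.5-7b-hf",                                  // assumed vision-capable model name
    role: Roles.User,                                          // assumed enum member
    question: "What landmark is shown in this image?",
    image: new ImageNetwork("https://example.com/photo.jpg"),  // assumed Base64Encoder helper
    maxCompletionTokens: 300,                                  // cap the length of the reply
    temperature: 0.2,                                          // lower values are more deterministic
    topP: 0.9,
};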

Properties

frequencyPenalty?: number

frequencyPenalty represents a value between -2.0 and 2.0 that penalizes new tokens based on how frequently they already appear in the text so far; positive values discourage verbatim repetition.

image: Base64Encoder

image represents an object that knows how to retrieve an image.

inputExtension?: InputExtension

inputExtension represents a set of optional pre-processing integrations.

logitBias?: {
    [token: string]: number;
}

logitBias modifies the likelihood of specified tokens appearing in a response.

Type declaration

  • [token: string]: number
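
For orientation, each key in the map is a token ID (rendered as a string) from the target model's tokenizer and each value is a bias applied to that token, commonly in a range of roughly -100 to 100, where the extremes effectively ban or force the token. The IDs below are placeholders.

// Placeholder token IDs; real keys come from the target model's tokenizer.
const logitBias: { [token: string]: number } = {
    "50256": -100, // strongly discourage this token (placeholder ID)
    "1024": 5,     // mildly encourage this token (placeholder ID)
};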

maxCompletionTokens?: number

maxCompletionTokens represents the maximum number of tokens in the generated completion.

maxTokens?: number

maxTokens represents the max number of tokens to return.

Deprecated

Use maxCompletionTokens instead.

model: string

model represents the model to use.

outputExtension?: OutputExtension

outputExtension represents a set of optional post-processing integrations.

parallelToolCalls?: boolean

parallelToolCalls represents whether to enable parallel function calling during tool use.

presencePenalty?: number

presencePenalty represents a value between -2.0 and 2.0 that penalizes new tokens based on whether they already appear in the text so far; positive values encourage the model to move on to new topics.

question: string

question represents the question about the image.

reasoningEffort?: string

reasoningEffort constrains effort on reasoning for reasoning models.

role: Roles

role represents the role of the sender (user or assistant).

stop?: string | string[]

stop represents one or more sequences where the API will stop generating tokens.
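
Both forms of the union are accepted; the sequences below are arbitrary examples.

// Generation halts as soon as any stop sequence would be produced.
const stopSingle: string | string[] = "\n\n";
const stopMany: string | string[] = ["Observation:", "###"];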

temperature?: number

temperature represents the amount of randomness in the model's output; lower values make responses more focused and deterministic.

toolChoice?: string | ToolChoiceFunction

toolChoice represents whether to enable tool calling and which tool to use.

tools?: Tool[]

tools represents an array of tools the model may call.
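
For orientation only, the sketch below pairs tools with toolChoice and assumes Tool and ToolChoiceFunction follow the widely used function-calling shape (a function entry with a name, description, and JSON-schema parameters); the SDK's own type definitions are authoritative and may differ.

// Hedged sketch: field names assume the common function-calling schema and
// may not match the SDK's Tool / ToolChoiceFunction types exactly.
const tools = [
    {
        type: "function",
        function: {
            name: "lookup_landmark", // hypothetical tool name
            description: "Look up details about a landmark identified in the image.",
            parameters: {
                type: "object",
                properties: { name: { type: "string" } },
                required: ["name"],
            },
        },
    },
];

// toolChoice accepts either a string (for example "auto") or a
// ToolChoiceFunction object naming a specific tool; the string form is shown.
const toolChoice = "auto";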

topK?: number

topK represents the number of highest-probability tokens to sample from, controlling the variability of the generated text.

topP?: number

topP represents the nucleus sampling threshold: tokens are drawn from the smallest set whose cumulative probability exceeds topP, controlling the diversity of the generated text.
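
Because temperature, topK, and topP interact, they are usually tuned together rather than in isolation. The values below are illustrative only; the snippet reuses the ChatVisionInput import from the earlier sketch.

// Illustrative values: lower temperature and topP make output more
// deterministic, while topK caps how many candidate tokens are considered.
const sampling: Pick<ChatVisionInput, "temperature" | "topK" | "topP"> = {
    temperature: 0.2,
    topK: 40,
    topP: 0.9,
};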