image represents an object that knows how to retrieve an image.
inputExtension represents a set of optional pre-processing integrations.
maxTokens represents the maximum number of tokens to return.
model represents the model to use.
outputExtension represents a set of optional post-processing integrations.
question represents the question about the image.
role represents the role of the sender (user or assistant).
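temperature represents the randomness of the model's output; higher values give more varied responses.
topK represents the variability of the generated text: sampling is limited to the K most likely next tokens.
topP represents the diversity of the generated text: sampling is limited to the smallest set of tokens whose cumulative probability reaches P.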
ChatVisionInput represents the full potential input options for Vision chat.
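
As a rough illustration of how these fields fit together, the sketch below models the input as a Python dataclass. It is not the library's own type: the class layout, the `to_payload` helper, the snake_case attribute names, and the choice to pass the image as a plain string are all assumptions made for the example. The extension fields are left out because their shape is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatVisionInput:
    """Hypothetical container mirroring the fields described above."""

    model: str
    role: str        # "user" or "assistant"
    question: str
    image: str       # assumption: image already resolved to a string (URL or base64)
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None
    top_k: Optional[int] = None
    top_p: Optional[float] = None

    def to_payload(self) -> dict:
        """Build a dict keyed by the documented field names, skipping unset options."""
        if self.role not in ("user", "assistant"):
            raise ValueError(f"role must be 'user' or 'assistant', got {self.role!r}")

        payload = {
            "model": self.model,
            "role": self.role,
            "question": self.question,
            "image": self.image,
        }
        optional = {
            "maxTokens": self.max_tokens,
            "temperature": self.temperature,
            "topK": self.top_k,
            "topP": self.top_p,
        }
        # Only include options the caller actually set.
        payload.update({k: v for k, v in optional.items() if v is not None})
        return payload


if __name__ == "__main__":
    # "example-vision-model" and the image URL are placeholders, not real values.
    chat_input = ChatVisionInput(
        model="example-vision-model",
        role="user",
        question="What is in this picture?",
        image="https://example.com/cat.png",
        max_tokens=256,
        temperature=0.2,
    )
    print(chat_input.to_payload())
```

Leaving unset options out of the payload, rather than sending explicit nulls, lets whatever consumes the input apply its own defaults; whether the real API behaves this way is not stated in the field descriptions.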