end represents the end time of the segment in seconds.
id represents the id of the segment.
speaker (optional) represents the speaker of the segment.
start represents the start time of the segment in seconds.
text represents the text for the segment.
AudioSegment represents a segment in audio transcription.
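
The fields above can be pictured as a small record type. The following is a minimal sketch in Python, not the library's actual definition: the field types (an integer id, floating-point seconds, string text and speaker) are assumptions, since the reference lists only names and meanings. The duration method and the format_segment helper are hypothetical, added purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioSegment:
    """A segment in an audio transcription (field types are assumed)."""

    id: int                        # id of the segment (integer type is an assumption)
    start: float                   # start time of the segment, in seconds
    end: float                     # end time of the segment, in seconds
    text: str                      # text for the segment
    speaker: Optional[str] = None  # optional speaker of the segment

    def duration(self) -> float:
        """Hypothetical helper: length of the segment in seconds."""
        return self.end - self.start


def format_segment(seg: AudioSegment) -> str:
    """Hypothetical helper: render one segment as a transcript line."""
    # The speaker is optional, so fall back to a placeholder label.
    label = seg.speaker if seg.speaker is not None else "unknown"
    return f"[{seg.start:.2f}s - {seg.end:.2f}s] {label}: {seg.text}"


if __name__ == "__main__":
    seg = AudioSegment(id=0, start=1.5, end=4.0, text="Hello there.")
    print(format_segment(seg))  # [1.50s - 4.00s] unknown: Hello there.
    print(seg.duration())       # 2.5
```

Because speaker is optional, code that consumes a segment should handle the missing case explicitly, as the fallback label does here.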