Track Overview
The SEGMENT Track is the short-video track of the ORena SAVE FOCUS Challenge. It evaluates whether a submitted algorithm can answer clinically relevant questions from a laparoscopic video segment of up to 5 minutes, using only the video segment and the provided question.
This page provides the technical track specification. Dataset composition, taxonomy details, resources, and general challenge background are described in the corresponding Overview and Data tabs.
SEGMENT Track
Description
The SEGMENT Track focuses on short-term surgical video understanding. Each task instance consists of a laparoscopic video segment and a natural-language question about foreign objects, actions, events, or object-related context visible within the segment. The algorithm must return a short text answer.
The track targets all capabilities listed in the taxonomy overview on the Data tab, provided that the answer can be inferred from the provided video segment.
Algorithm Docker Input
The algorithm input consists of two parts: the video segment and the question. The question comprises the question text together with its metadata.
| Input | Description |
| --- | --- |
| Video segment | Laparoscopic video segment of up to 5 minutes. |
| Question | Natural-language VQA question including the relevant metadata, such as procedure name, time point or segment context, expected output, and list of foreign objects. |
The exact file structure and schema will follow the official submission template repository.
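To make the input description concrete, the following is a minimal sketch of how a participant might represent one task instance in memory. The JSON layout, the field names (`question_text`, `procedure_name`, `segment_context`, `expected_output`, `foreign_objects`), and the helper functions are assumptions for illustration only; the official submission template repository defines the actual schema.

```python
import json
from dataclasses import dataclass, field
from typing import List

# From the track description: segments are up to 5 minutes long.
MAX_SEGMENT_SECONDS = 5 * 60


@dataclass
class SegmentQuestion:
    """Hypothetical view of one question; all field names are assumptions."""
    question_text: str
    procedure_name: str = ""
    segment_context: str = ""
    expected_output: str = ""
    foreign_objects: List[str] = field(default_factory=list)


def parse_question(raw: str) -> SegmentQuestion:
    """Parse a JSON string into a SegmentQuestion (assumed layout)."""
    data = json.loads(raw)
    text = str(data.get("question_text", "")).strip()
    if not text:
        raise ValueError("question text is missing or empty")
    return SegmentQuestion(
        question_text=text,
        procedure_name=data.get("procedure_name", ""),
        segment_context=data.get("segment_context", ""),
        expected_output=data.get("expected_output", ""),
        foreign_objects=list(data.get("foreign_objects", [])),
    )


def check_duration(seconds: float) -> None:
    """Reject durations outside the (0, 5 min] range stated for this track."""
    if not 0 < seconds <= MAX_SEGMENT_SECONDS:
        raise ValueError(f"segment duration {seconds} s is out of range")
```

Keeping the metadata in typed fields, rather than passing a raw dictionary around, makes it easier to adapt the code once the official schema is published: only `parse_question` should need to change.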
Algorithm Docker Output
| Output | Description |
| --- | --- |
| Answer | Short text answer to the provided question. |
The exact output format and validation rules will follow the official submission template repository.
Runtime Environment
| Resource | Specification |
| --- | --- |
| AWS Hardware | NVIDIA H100 GPU, 80 GB VRAM |
| Time Limit | 15 seconds per question |
| Execution | Docker container; no internet access during inference |
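Because each question has a fixed time limit, it can help to track the remaining budget explicitly and return the best answer found so far when time runs short. The sketch below shows one cooperative way to do this. The `Deadline` class, the `answer_with_budget` function, and the 2-second safety margin are hypothetical, and whether the official limit also covers model loading or I/O is not stated here.

```python
import time
from typing import Callable, Iterable

TIME_LIMIT_SECONDS = 15.0    # per-question limit from the runtime environment
SAFETY_MARGIN_SECONDS = 2.0  # assumption: headroom for I/O and serialization


class Deadline:
    """Tracks how much of a time budget is left, using a monotonic clock."""

    def __init__(self, budget: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._end = clock() + budget

    def remaining(self) -> float:
        return max(0.0, self._end - self._clock())

    def expired(self) -> bool:
        return self.remaining() <= 0.0


def answer_with_budget(
    steps: Iterable[Callable[[str], str]],
    fallback: str,
    deadline: Deadline,
) -> str:
    """Run refinement steps in order and keep the latest answer.

    The deadline is checked only between steps, so a step that is already
    running is never interrupted; individual steps must be kept short.
    """
    answer = fallback
    for step in steps:
        if deadline.expired():
            break
        answer = step(answer)
    return answer


if __name__ == "__main__":
    deadline = Deadline(TIME_LIMIT_SECONDS - SAFETY_MARGIN_SECONDS)
    # Hypothetical pipeline: a cheap first guess followed by a refinement.
    steps = [lambda prev: "draft answer", lambda prev: prev + " (refined)"]
    print(answer_with_budget(steps, "unknown", deadline))
```

This approach is cooperative rather than preemptive: it guarantees that no new step starts after the budget is exhausted, but it does not stop a step that overruns.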
Evaluation Scope
SEGMENT submissions are evaluated on short-video question answering. Questions are restricted to information that can be inferred from the provided video segment and question metadata. The track evaluates local temporal reasoning, short-term object tracking, event recognition, aggregation within the segment, and visually grounded reasoning over the segment context.
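Since questions may require aggregating information across the whole segment, one simple way to cover it under a fixed compute budget is uniform frame sampling. The sketch below is an illustration and not a method prescribed by the challenge: the function name `sample_timestamps` and the choice of bin midpoints are assumptions.

```python
from typing import List


def sample_timestamps(duration_seconds: float, num_frames: int) -> List[float]:
    """Return the midpoints of `num_frames` equal-width bins over the segment.

    Using bin midpoints spreads the samples evenly and avoids always picking
    the very first and very last frame of the segment.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    bin_width = duration_seconds / num_frames
    return [(i + 0.5) * bin_width for i in range(num_frames)]


if __name__ == "__main__":
    # A full-length 5-minute segment sampled at 4 points.
    print(sample_timestamps(300.0, 4))
```

Uniform sampling can miss brief events, so questions about short-lived actions may call for denser sampling around candidate time points, within the per-question time limit.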
Official Track Document
For the full formal specification, please consult the official SEGMENT Track document:
👉 SEGMENT Track PDF