Track Overview

The SEGMENT Track is the short-video track of the ORena SAVE FOCUS Challenge. It evaluates whether a submitted algorithm can answer clinically relevant questions from a laparoscopic video segment of up to 5 minutes, using only the video segment and the provided question.

This page provides the technical track specification. Dataset composition, taxonomy details, resources, and general challenge background are described in the corresponding Overview and Data tabs.


SEGMENT Track

Description

The SEGMENT Track focuses on short-term surgical video understanding. Each task instance consists of a laparoscopic video segment and a natural-language question about foreign objects, actions, events, or object-related context visible within the segment. The algorithm must return a short text answer.

The track targets all capabilities listed in the taxonomy overview in the Data tab, as long as the answer can be inferred from the provided video segment.


Algorithm Docker Input

The algorithm input consists of the video segment and the question. The question includes the metadata and the question text itself.

Video segment: Laparoscopic video segment of up to 5 minutes.
Question: Natural-language VQA question including the relevant metadata, such as procedure name, time point or segment context, expected output, and list of foreign objects.

The exact file structure and schema will follow the official submission template repository.


Algorithm Docker Output

Answer: Short text answer to the provided question.

The exact output format and validation rules will follow the official submission template repository.
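
As a rough illustration of the input/output contract described above, the following Python sketch models one task instance and a placeholder answer function. Everything named here is an assumption made for illustration: the `Question` class, its field names, `normalize_answer`, and `answer_question` are hypothetical and do not come from the official submission template repository, which defines the actual file structure and schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    """Hypothetical container for one VQA question and its metadata."""
    text: str                      # the question text itself
    procedure: str = ""            # procedure name (assumed field name)
    segment_context: str = ""      # time point or segment context (assumed)
    expected_output: str = ""      # expected output description (assumed)
    foreign_objects: List[str] = field(default_factory=list)


def normalize_answer(raw: str) -> str:
    """Collapse whitespace so the answer is a single short line of text."""
    return " ".join(raw.split())


def answer_question(video_path: str, question: Question) -> str:
    """Placeholder for the model call: returns a short text answer.

    A real submission would decode the video at `video_path` and run a
    model on it together with `question`. This stub only returns a fixed
    string so that the sketch runs without any model or video file.
    """
    return normalize_answer("unknown")
```

The stub keeps the contract visible (one video segment and one question in, one short text answer out) while leaving the model and the file handling open until the template is consulted.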


Runtime Environment

AWS Hardware: NVIDIA H100 GPU, 80 GB VRAM
Time Limit: 15 seconds per question
Execution: Docker container, no internet access during inference
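
One way to stay within the 15-second limit is to process frames against a deadline and answer from whatever has been accumulated when time runs out. The Python sketch below is a minimal illustration under stated assumptions: `answer_within_budget`, the `update` and `finalize` callbacks, and the 2-second safety margin are hypothetical and not part of the official template. Whether the limit also covers model loading or video decoding is not stated on this page.

```python
import time
from typing import Any, Callable, Iterable

TIME_LIMIT_S = 15.0     # per-question limit stated in the track specification
SAFETY_MARGIN_S = 2.0   # assumption: time reserved for producing the answer


def answer_within_budget(
    frames: Iterable[Any],
    update: Callable[[Any, Any], Any],
    finalize: Callable[[Any], str],
    budget_s: float = TIME_LIMIT_S - SAFETY_MARGIN_S,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Fold `update` over frames until the budget is spent, then answer.

    `update(state, frame)` returns the new running state (initially None);
    `finalize(state)` turns the state into a short text answer. Both are
    hypothetical hooks standing in for a real model.
    """
    deadline = clock() + budget_s
    state = None
    for frame in frames:
        # Check the clock before each frame so no new work starts late.
        if clock() >= deadline:
            break
        state = update(state, frame)
    return finalize(state)
```

The deadline is checked only between frames, so a single slow `update` call can still overrun; the safety margin is meant to absorb that, and its size would need tuning against the actual model.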

Evaluation Scope

SEGMENT submissions are evaluated on short-video question answering. Questions are restricted to information that can be inferred from the provided video segment and question metadata. The track evaluates local temporal reasoning, short-term object tracking, event recognition, aggregation within the segment, and visually grounded reasoning over the segment context.


Official Track Document

For the full formal specification, please consult the official SEGMENT Track document:

👉 SEGMENT Track PDF