VisionAI Data Format

The VisionAI data format schema for multi-sensor data annotation is organized as a dictionary and is fully described by a JSON schema file that follows the guidelines of the ASAM OpenLABEL format. To ensure compliance with the format's specifications, an annotation file must be validated against the OpenLABEL JSON schema with the "openlabel" keyword replaced by "visionai".
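
The keyword substitution can be sketched as follows. This is a hypothetical helper, not an official tool, and it assumes the OpenLABEL schema follows the usual JSON Schema layout, with the root object declared under "properties" and listed in "required":

```python
import json

def openlabel_schema_to_visionai(schema: dict) -> dict:
    """Rename the top-level "openlabel" keyword to "visionai" in a JSON schema.

    Assumes standard JSON Schema conventions: the root object is declared
    under "properties" and referenced in "required" (an assumption, since
    the exact OpenLABEL schema layout may differ between versions).
    """
    schema = json.loads(json.dumps(schema))  # deep copy, leave the input intact
    props = schema.get("properties", {})
    if "openlabel" in props:
        props["visionai"] = props.pop("openlabel")
    schema["required"] = [
        "visionai" if key == "openlabel" else key
        for key in schema.get("required", [])
    ]
    return schema
```

The annotation file can then be checked against the adapted schema with any JSON Schema validator (for example, the third-party jsonschema package).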

Schema

The following JSON example outlines the top-level components of the VisionAI format. Further details about each element will be provided in separate sections.

Example

The structure of the VisionAI JSON file (visionai.json) format is outlined below. Additional information on the file structure can be found in the Data Structure Details section.

{
    "visionai": {
        "coordinate_systems": { ... },
        "streams": { ... },
        "contexts": { ... },
        "objects": { ... },
        "frames": { ... },
        "frame_intervals": { ... },
        "tags": { ... },
        "metadata": { ... }
    }
}

visionai {}

| Name | Definition | Required |
| --- | --- | --- |
| visionai | The name of this particular data format. | true |
| coordinate_systems | A numerical system that specifies the location of points and geometric elements within a given space. In the VisionAI format, coordinate systems are declared by name and linked through parent-child relationships to establish a hierarchy. | see Project Sensor Settings |
| streams | A source of the data sequence, typically a sensor. The VisionAI format combines multi-sensor information (streams) to describe annotations for the corresponding streams. Stream keys contain information such as intrinsic calibration parameters for cameras. | true |
| contexts | Lists all contextual information present in the annotation, such as scene properties, weather conditions, or location. In the VisionAI format, "contexts" is mainly used for classification and tagging information. | false |
| objects | Physical entities within a scene, such as people, cars, or lane markings. Object keys contain information such as the object's name, type, annotation location, and frame intervals. UUIDs are used as keys. | false |
| frames | Containers for dynamic, time-based information. Each frame is keyed by an integer frame number. | true |
| frame_intervals | An array defining the frame intervals for which the JSON data contains information. | true |
| tags | Tags provide information about a given data file and may be specified in the "tags" entry of the JSON file. | false |
| metadata | The version string for this schema. | true |
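
As a lightweight pre-check before full JSON Schema validation, the required top-level keys listed above can be verified directly. This is a minimal sketch; the key names are taken from this document, and "coordinate_systems" and "streams" are left out because their requirement depends on the project sensor settings:

```python
# Top-level keys that are always required per the table above.
REQUIRED_KEYS = {"streams", "frames", "frame_intervals", "metadata"}

def missing_required_keys(annotation: dict) -> set:
    """Return the required top-level keys absent from a visionai annotation."""
    visionai = annotation.get("visionai")
    if not isinstance(visionai, dict):
        return {"visionai"}
    return REQUIRED_KEYS - visionai.keys()
```

An empty return value means the annotation passes this structural pre-check; full validation against the JSON schema is still required.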

Project Sensor Settings

The inclusion of "coordinate_systems" and "streams" in the VisionAI format is contingent on the sensor settings of each project. Please consult the following combinations for more information:

| Project Sensor Settings | coordinate_systems (Required) |
| --- | --- |
| 1 camera | false |
| n cameras | false |
| 1 lidar | false |
| n lidars | true |
| m cameras + n lidars | true |
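
The table above can be expressed as a small helper (a sketch; the function name is hypothetical):

```python
def coordinate_systems_required(num_cameras: int, num_lidars: int) -> bool:
    """Per the table above, "coordinate_systems" is required with multiple
    lidars or with any camera-plus-lidar combination, optional otherwise."""
    if num_lidars > 1:
        return True
    if num_cameras >= 1 and num_lidars >= 1:
        return True
    return False
```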


Data Structure

The data folder structure for using VisionAI data format is primarily organized into sequences, containing all sensor data arranged in frame order.

The sensor names are associated with the stream_name in the visionai.json file.

The sequence and frame numbers, as well as their folder and file names, are zero-padded to 12 digits and start at 000000000000, representing the sequence IDs and frame IDs, respectively.
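
The 12-digit IDs can be generated with zero-padded formatting, for example:

```python
def format_id(index: int) -> str:
    """Format a sequence or frame index as a 12-digit, zero-padded ID."""
    if not 0 <= index < 10**12:
        raise ValueError("index out of range for a 12-digit ID")
    return f"{index:012d}"
```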

Each sequence contains annotation information, including both ground truth and other annotations. Ground truth information is mandatory and must be labeled as "groundtruth" in the folder name, while other annotation folders ($NAME) must correspond to the information provided in the visionai.json file.

Example

A sample folder structure with two sequences, two sensors ("camera1" and "lidar1"), and three frames per sequence is shown below:

/data/000000000000/data/camera1/000000000000.png
                               /000000000001.png
                               /000000000002.png
                        /lidar1/000000000000.pcd
                               /000000000001.pcd
                               /000000000002.pcd
                  /annotations/groundtruth/visionai.json  # visionai ground truth annotations, required in annotated data
                  /annotations/$NAME/visionai.json        # visionai annotations (optional)
     /000000000001/data/camera1/000000000000.png
                               /000000000001.png
                               /000000000002.png
                        /lidar1/000000000000.pcd
                               /000000000001.pcd
                               /000000000002.pcd
                  /annotations/groundtruth/visionai.json  # visionai ground truth annotations, required in annotated data
                  /annotations/$NAME/visionai.json        # visionai annotations (optional)
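
The layout above can be sketched as path construction. These helper names are hypothetical, and the sensor file extension is passed in explicitly since it depends on the sensor type:

```python
from pathlib import Path

def frame_path(root, sequence: int, sensor: str, frame: int, ext: str) -> Path:
    """Build the path of one sensor file, following the layout shown above."""
    return Path(root) / f"{sequence:012d}" / "data" / sensor / f"{frame:012d}.{ext}"

def groundtruth_path(root, sequence: int) -> Path:
    """Build the path of the mandatory ground-truth annotation file."""
    return Path(root) / f"{sequence:012d}" / "annotations" / "groundtruth" / "visionai.json"
```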

Naming rules for folder names such as sensor names and annotation names ($NAME)

The name may only contain lowercase letters, numbers, and hyphens (-), and must begin and end with a letter or a number. Each hyphen (-) must be preceded and followed by a non-hyphen character. The name must also be between 3 and 40 characters long.
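
These rules can be checked with a short validator (a sketch implementing exactly the constraints listed above):

```python
import re

# Lowercase letters and digits, with single hyphens only between them;
# the name must start and end with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9](?:-?[a-z0-9])*$")

def is_valid_name(name: str) -> bool:
    """Check a sensor or annotation folder name against the rules above."""
    return 3 <= len(name) <= 40 and _NAME_RE.fullmatch(name) is not None
```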

PCD (Point Cloud Data) file format

The PCD (Point Cloud Data) file format allows only the "DATA binary_compressed" option. This means that the point cloud data in the PCD file is stored in binary compressed format. Other data storage options, such as "ASCII" or "binary", are not permitted in this format.
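
A file's DATA field can be checked by scanning its ASCII header (a minimal sketch; it assumes the header lines precede the binary payload, as in the PCD specification):

```python
def pcd_is_binary_compressed(path) -> bool:
    """Return True if the PCD header declares "DATA binary_compressed"."""
    with open(path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line.startswith("DATA"):
                return line == "DATA binary_compressed"
    return False  # no DATA line found: not a valid PCD header
```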


Use Case

The following use cases are described in more detail on their respective pages.

bbox

Describes a bbox dataset with one camera sensor.

bbox + cuboid (3D)

Describes a dataset with one camera sensor (bbox annotation) and one lidar sensor (cuboid annotation) in the iso8855-1 coordinate system.

semantic segmentation

Describes a semantic segmentation dataset with one camera sensor.

tagging

Describes a dataset with tagging information.
