
streams



A stream serves to identify the origin of a data sequence, commonly referring to a sensor.

The VisionAI schema brings multi-sensor (stream) information together in a shared space, so that annotations can be associated with the stream they originate from. A stream key can carry detailed sensor information, such as a camera's intrinsic calibration parameters.

Hint

When your data includes only one sensor (e.g., camera), it is sufficient to provide the description of your stream (sensor name and type) without the need for additional calibration information.

For example:

"streams": { "camera1": { "type": "camera" } }

Example

"streams": {
    "camera1": {
        "type": "camera",
        "uri": "./some_path/some_video.mp4",
        "description": "Frontal camera",
        "stream_properties": { 
            "intrinsics_pinhole": {
                "camera_matrix_3x4": [ 1000.0,    0.0, 500.0, 0.0,
                                            0.0, 1000.0, 500.0, 0.0,
                                            0.0,    0.0,   1.0, 0.0],
                "height_px": 480,
                "width_px": 640
            }
        }
    }
}
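The required fields described in the table below can be checked with a small validation sketch. The helper `validate_streams` and its messages are illustrative only, not part of any official DataVerse SDK:

```python
# Minimal validation sketch for a VisionAI "streams" object.
# validate_streams is a hypothetical helper, not an official API.

ALLOWED_TYPES = {"camera", "lidar"}

def validate_streams(streams: dict) -> list:
    """Return a list of problems found; an empty list means the block looks valid."""
    errors = []
    if not streams:
        errors.append("streams must contain at least one entry")
    for name, stream in streams.items():
        if stream.get("type") not in ALLOWED_TYPES:
            errors.append(f"{name}: 'type' must be one of {sorted(ALLOWED_TYPES)}")
        # uri, description, and stream_properties are optional
    return errors

print(validate_streams({"camera1": {"type": "camera"}}))  # → []
```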

streams {}

| name | description | type | required |
| --- | --- | --- | --- |
| ${STREAM_NAME} | The name of this stream. | object | true |
| type | The type of this stream. Must be one of "camera", "lidar". | string | true |
| uri | The URI of this stream's file. Setting uri here means the sequence has only this single stream file (static information). | string | false |
| description | The description of the stream. | string | false |
| stream_properties | Must be one of the following object types: ➤ intrinsics_pinhole — details follow in the Intrinsics table below. | object | by project setting |

Intrinsics {}

The intrinsic parameters of a camera pertain to how the device captures images. These parameters include factors like focal length, aperture, field-of-view, resolution, and others that influence the intrinsic matrix of a camera model.

The intrinsic matrix denotes a transformation matrix that converts points from the camera's coordinate system into the pixel coordinate system.

Intrinsic information in the VisionAI format serves solely as a means of recording relevant data and is not utilized in any system-based conversion processes.
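To make the matrix concrete, the sketch below projects a point from camera coordinates into pixel coordinates using the camera_matrix_3x4 values from the streams example above. This is purely illustrative; as noted, DataVerse records these values but does not perform this conversion itself:

```python
# Illustration of what camera_matrix_3x4 encodes: projecting a point in
# camera coordinates to pixel coordinates (pinhole model, no distortion).
# Matrix values are taken from the streams example above.

K = [
    [1000.0,    0.0, 500.0, 0.0],
    [   0.0, 1000.0, 500.0, 0.0],
    [   0.0,    0.0,   1.0, 0.0],
]

def project(point_xyz):
    x, y, z = point_xyz
    hom = [x, y, z, 1.0]                              # homogeneous coordinates
    u, v, w = (sum(k * p for k, p in zip(row, hom)) for row in K)
    return (u / w, v / w)                             # perspective divide

print(project((1.0, 0.5, 2.0)))  # → (1000.0, 750.0)
```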

intrinsics_pinhole

| name | type | required |
| --- | --- | --- |
| camera_matrix_3x4 | array of 12 floats | true |
| distortion_coeffs_1xN | array of numbers | false |
| height_px | int | true |
| width_px | int | true |

Project Sensor Settings

Whether "stream_properties" must be included in "streams" is contingent on the sensor settings of each project. Please consult the following combinations for more information:

| Project Sensor Settings | stream_properties (required) |
| --- | --- |
| 1* camera | false |
| n* cameras | false |
| 1* lidar | false |
| n* lidars | false |
| m* camera + n* lidar | true |
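The table reduces to a single rule: calibration becomes mandatory only when a project mixes cameras and lidars, presumably because their data must then be related in a shared space. A sketch of that rule (the function name is ours, not part of DataVerse):

```python
def stream_properties_required(num_cameras: int, num_lidars: int) -> bool:
    """Per the combination table above: stream_properties is mandatory
    only when a project mixes camera and lidar sensors."""
    return num_cameras > 0 and num_lidars > 0

print(stream_properties_required(1, 0))  # 1 camera → False
print(stream_properties_required(2, 1))  # cameras + lidar → True
```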


Use Case

bbox

To describe a bbox dataset with one camera sensor:

  • sensor: camera (#camera1)

Example Code
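The "Example Code" block is collapsed in the original page and is not reproduced here. Following the Hint earlier, a minimal streams fragment for a single-camera bbox project might look like:

```json
"streams": {
    "camera1": { "type": "camera" }
}
```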

bbox + cuboid (3d)

To describe a dataset with one camera sensor (bbox annotation) and one lidar sensor (cuboid annotation) in the iso8855-1 coordinate system:

  • sensor: camera (#camera1), lidar (#lidar1)

Example Code
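The "Example Code" block is collapsed in the original page. Since this combination mixes a camera and a lidar, stream_properties is required per the table above; the fragment below reuses the placeholder intrinsics from the earlier example and may differ from the actual product sample:

```json
"streams": {
    "camera1": {
        "type": "camera",
        "description": "Frontal camera",
        "stream_properties": {
            "intrinsics_pinhole": {
                "camera_matrix_3x4": [ 1000.0,    0.0, 500.0, 0.0,
                                          0.0, 1000.0, 500.0, 0.0,
                                          0.0,    0.0,   1.0, 0.0],
                "height_px": 480,
                "width_px": 640
            }
        }
    },
    "lidar1": { "type": "lidar" }
}
```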
