Import Your Dataset

Dataverse streamlines your data management processes, making it easier to organize and visualize your data.

Dataverse accepts various data formats, and you can also choose an open dataset as your data source.

Quick Start

Starting with a bunch of images or videos? Try folder upload. And if you're starting from scratch, pop in prepared open datasets and get rolling.

Dataset List

On the Dataset page, you can view a list of all uploaded datasets and their status.

Click "Detail" to view the details of a dataset, including whether the upload succeeded and any error message that was returned.

Hint

The system supports resumable uploads: an interrupted upload can continue from the point where it stopped.

Clicking "View in Data Visualization" will display all files in this folder.

Import Dataset

The Linker platform supports several upload methods, so you can choose the one that suits your data.

Click on "Import," select the data format you want to upload and enter the data upload settings.

  • Upload from Local: upload directly from a local folder.

  • Raw Data: data without annotations.

  • Annotated Data: data with annotations.

  • Existing: import a dataset from another project.

  • Open Dataset: import a preselected dataset.

Each option may require different data content and upload settings.

On the Dataset page, click "Import," choose the target based on the uploaded data content, and enter the data upload settings.


Data Upload from Local

Browse and select images or videos directly from your local folder for easy importation into the platform.

  1. Image upload

  2. Video upload

Only single-camera projects are suitable for fast folder upload.
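Before a folder upload, it can help to check what the folder actually contains. The sketch below groups file names into images, videos, and everything else. The extension lists are assumptions made for illustration; this page does not state which file types the platform accepts.

```python
from pathlib import PurePath

# Assumed extension lists, for illustration only; the formats the platform
# actually accepts are not listed on this page.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi"}


def group_media_files(file_names):
    """Group file names into images, videos, and other files by extension."""
    groups = {"images": [], "videos": [], "other": []}
    for name in file_names:
        extension = PurePath(name).suffix.lower()
        if extension in IMAGE_EXTENSIONS:
            groups["images"].append(name)
        elif extension in VIDEO_EXTENSIONS:
            groups["videos"].append(name)
        else:
            groups["other"].append(name)
    return groups


if __name__ == "__main__":
    print(group_media_files(["cam0/0001.JPG", "cam0/clip.mp4", "notes.txt"]))
```

Anything that lands in "other" is worth reviewing before you start the upload.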


Annotated Data Upload Format

The platform provides the following data upload formats:

After selecting the upload format, indicate whether the data is sequence data and whether you are uploading results for specific sensors only.

After you click "Import," the system stores the files in a database according to these settings, which makes later filtering and searching easier.

Uploading through AWS

Connect to your AWS S3 bucket, and the files will be transferred to the platform.

  • Bucket URL: Link to the AWS S3 bucket.

  • Data Folder: Location of the file folder that needs to be imported.

  • Access Key ID & Secret Access Key (Optional): Use your Access Key ID and Secret Access Key for secure AWS S3 operations. These keys authenticate and authorize actions such as retrieving or listing objects. Specific permissions (e.g., GetObject, ListObjects, ListObjectsV2, CopyObject) are required for these operations.

Amazon S3 virtual-hosted–style URLs use the following format:

https://bucket-name.s3.region-code.amazonaws.com/
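As an illustration of this format, the following sketch builds a virtual-hosted-style URL from a bucket name and region code, and splits such a URL back into its parts. The function names are hypothetical helpers, not part of Dataverse or any AWS library, and the parser only handles the standard amazonaws.com form shown above.

```python
from urllib.parse import urlparse


def build_s3_bucket_url(bucket_name: str, region_code: str) -> str:
    """Return the virtual-hosted-style URL for a bucket in a region."""
    return f"https://{bucket_name}.s3.{region_code}.amazonaws.com/"


def parse_s3_bucket_url(url: str) -> tuple[str, str]:
    """Split a virtual-hosted-style URL into (bucket_name, region_code)."""
    host = urlparse(url).hostname or ""
    suffix = ".amazonaws.com"
    if not host.endswith(suffix):
        raise ValueError(f"Not an amazonaws.com URL: {url!r}")
    prefix = host[: -len(suffix)]  # e.g. "bucket-name.s3.region-code"
    # Bucket names may contain dots, so split on the last ".s3." marker.
    bucket_name, marker, region_code = prefix.rpartition(".s3.")
    if not marker or not bucket_name or not region_code:
        raise ValueError(f"Not a virtual-hosted-style S3 URL: {url!r}")
    return bucket_name, region_code


if __name__ == "__main__":
    url = build_s3_bucket_url("my-bucket", "us-west-2")
    print(url)
    print(parse_s3_bucket_url(url))
```

Checking a URL this way before pasting it into the Bucket URL field can catch a missing region code or a path-style link.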

Hint: The Access Key ID and Secret Access Key fields are optional for public AWS S3 buckets. If your S3 bucket is set to public, you can leave these fields blank.
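For a private bucket, the key pair needs an IAM policy that covers the operations listed above. In IAM, ListObjects and ListObjectsV2 are both authorized by the s3:ListBucket action on the bucket, while GetObject (and reading the source object of a CopyObject) is authorized by s3:GetObject on the objects. The sketch below generates a minimal read-only policy along those lines. The helper name and the choice to scope object access to the data folder are assumptions, and the platform may need more than this page states, so treat it as a starting point.

```python
import json


def build_read_only_policy(bucket_name: str, data_folder: str = "") -> dict:
    """Build a read-only IAM policy for one bucket (illustrative sketch)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    prefix = data_folder.strip("/")
    # Limit object reads to the data folder when one is given (an assumption).
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Covers the ListObjects and ListObjectsV2 operations.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Covers GetObject and reading the source of a CopyObject.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [object_arn],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_read_only_policy("my-bucket", "datasets/run1"), indent=2))
```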

Uploading through Azure

Connect to your Azure blob storage, and the file will be transferred to the platform.

  • Blob Storage URL: Azure storage link (Blob storage account URL)

  • Container Name: Blob storage container name

  • Data Folder: Location of the data folder stored in the container that needs to be imported.

  • SAS Token: SAS token that provides read permissions to the Azure blob storage account.

When uploading, provide a SAS token generated for the Azure Blob Storage account, not one generated for an individual container.
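One way to tell the two kinds of token apart is to look at their query parameters: an account SAS carries the ss (services) and srt (resource types) fields, whereas a container-level service SAS carries sr (resource). The sketch below inspects a token on that basis and also checks for the read permission. The function name and the sample tokens are hypothetical, and the check is a heuristic, not a validation of the signature.

```python
from urllib.parse import parse_qs


def describe_sas_token(sas_token: str) -> dict:
    """Summarize a SAS token: account-level or not, readable or not, expiry."""
    params = parse_qs(sas_token.lstrip("?"))

    def first(key: str) -> str:
        return params.get(key, [""])[0]

    has_account_fields = "ss" in params and "srt" in params
    has_service_field = "sr" in params
    return {
        "account_level": has_account_fields and not has_service_field,
        "can_read": "r" in first("sp"),  # "sp" lists the granted permissions
        "expires": first("se") or None,  # "se" is the expiry time
    }


if __name__ == "__main__":
    # Placeholder tokens for illustration; the signatures are not real.
    account_token = "sv=2022-11-02&ss=b&srt=sco&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
    container_token = "sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
    print(describe_sas_token(account_token))
    print(describe_sas_token(container_token))
```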


Import Open Dataset

Import pre-selected open datasets directly into your workspace for efficient model training and data analysis.


Adding a Dataset Using the Python SDK

The Dataverse-SDK for Python lets you interact with the Dataverse platform from Python code. Currently, the library supports:

  • Get Project by project-id

  • Create Dataset from your AWS/Azure storage or local

  • Get Dataset by dataset-id

First, prepare the target project and its ontology (see Create Project).

Import Dataset Example

Use create_dataset to import a dataset from cloud storage:

# Assumes `project` was retrieved earlier and that DataSource, DatasetType,
# and AnnotationFormat have been imported from the Dataverse SDK.
dataset_data = {
    "name": "Dataset 1",
    "data_source": DataSource.Azure,  # or DataSource.AWS for an S3 bucket
    "storage_url": "storage/url",
    "container_name": "azure container name",
    "data_folder": "datafolder/to/vai_anno",
    "sensors": project.sensors,
    "type": DatasetType.ANNOTATED_DATA,
    "annotation_format": AnnotationFormat.VISION_AI,
    "annotations": ["groundtruth"],
    "sequential": False,
    "render_pcd": False,
    "generate_metadata": False,
    "auto_tagging": ["timeofday"],
    "sas_token": "azure sas token",  # Azure storage only
    "access_key_id": "aws s3 access key id",  # private S3 bucket only; omit for a public bucket or an Azure source
    "secret_access_key": "aws s3 secret access key",  # private S3 bucket only; omit for a public bucket or an Azure source
}
dataset = project.create_dataset(**dataset_data)
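Because the credential fields depend on the data source, it can be convenient to assemble them separately and merge them into dataset_data. The helper below is a hypothetical sketch, not part of the Dataverse-SDK: it returns only the keys that apply, following the comments in the example above. It assumes that an Azure import always needs a SAS token and that AWS keys come as a pair.

```python
def storage_credentials(provider, sas_token=None,
                        access_key_id=None, secret_access_key=None):
    """Return only the credential fields that apply to the storage provider."""
    provider = provider.strip().lower()
    if provider == "azure":
        if not sas_token:
            raise ValueError("An Azure import needs a SAS token")
        return {"sas_token": sas_token}
    if provider == "aws":
        # Keys are needed for private buckets only, and only as a pair.
        if bool(access_key_id) != bool(secret_access_key):
            raise ValueError("Provide both AWS keys or neither")
        if access_key_id:
            return {
                "access_key_id": access_key_id,
                "secret_access_key": secret_access_key,
            }
        return {}  # public bucket: no credentials required
    raise ValueError(f"Unknown provider: {provider!r}")


if __name__ == "__main__":
    print(storage_credentials("aws"))
    print(storage_credentials("azure", sas_token="azure sas token"))
```

The result can be merged with dataset_data.update(storage_credentials(...)) before calling create_dataset.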

For details, refer to the Dataverse-SDK repository on GitHub.


Import from Existing Data

You can import a dataset from an existing dataset in another project. This requires an ontology mapping to ensure data compatibility.


Advanced Image Processing

Point Cloud File 2D Preview Image

When this option is enabled, the system batch-generates 2D preview images for point cloud data, which makes later browsing in the visualization views easier.

IQA for Image Analysis Results

When the automatic image quality assessment generation function is turned on, the system will batch-generate data for each image, including brightness, contrast, and other information. For details, please refer to the documentation.

See: Image Quality Assessment (IQA)
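To give a sense of what per-image brightness and contrast values can look like, the sketch below uses two common definitions: mean grayscale intensity for brightness and the standard deviation of intensity (RMS contrast) for contrast. These definitions are assumptions chosen for illustration; this page does not say how Dataverse computes its IQA values.

```python
from statistics import fmean, pstdev


def brightness_and_contrast(gray_pixels):
    """Return (mean intensity, RMS contrast) for grayscale pixel values."""
    values = list(gray_pixels)
    if not values:
        raise ValueError("At least one pixel value is required")
    brightness = fmean(values)  # average intensity
    contrast = pstdev(values)   # population standard deviation of intensity
    return brightness, contrast


if __name__ == "__main__":
    # A tiny 2x2 checkerboard of 8-bit values, flattened into one list.
    print(brightness_and_contrast([0, 255, 255, 0]))
```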

Auto-Tagging

Dataverse offers an advanced auto-tagging feature to simplify data management, automatically generating image tags (weather, scene, time of day) for users.

See: Auto-Tagging
