Import Your Dataset
Dataverse streamlines your data management processes, making it easier to organize and visualize your data.
Dataverse accepts various data formats, and you can also choose an open dataset as your data source.
Starting with a bunch of images or videos? Try folder upload. And if you're starting from scratch, pop in prepared open datasets and get rolling.
On the Dataset page, you can view a list of all uploaded datasets and their status.
Click "Detail" to view the details of this dataset, including whether the upload was successful or if an error message is displayed.
Hint
The system supports resumable uploads: an interrupted upload can continue from the point where it stopped.
Clicking "View in Data Visualization" will display all files in this folder.
The Linker platform supports several upload methods, so you can choose the one that suits your data.
Click on "Import," select the data format you want to upload and enter the data upload settings.
Upload from Local: upload directly from a local folder.
Raw Data: data without annotations
Annotated Data: data with annotations
Existing: import a dataset from another project.
Open Dataset: import a preselected open dataset.
SDK: use the SDK to upload from cloud storage or from a local folder.
Each import option may require different information to complete the upload.
On the Dataset page, click "Import," choose the target based on the uploaded data content, and enter the data upload settings.
Browse and select images or videos directly from your local folder for easy importation into the platform.
Image upload
Video upload
Only single-camera projects are suitable for fast folder upload.
The platform provides the following data upload formats:
VisionAI data format - object detection
VLM data format - VQA
COCO format: https://cocodataset.org/#format-data
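As a rough illustration of the COCO format linked above, the sketch below builds a minimal object-detection annotation in Python. The top-level keys (`images`, `annotations`, `categories`) and the `[x, y, width, height]` bounding-box layout follow the public COCO format; the file name, IDs, and category are made-up placeholders, and the `check_references` helper is hypothetical. The platform may expect additional fields or a particular folder layout.

```python
import json

# Minimal COCO-style object-detection annotation (all values are placeholders).
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "car", "supercategory": "vehicle"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # must match an entry in "images"
            "category_id": 1,    # must match an entry in "categories"
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height] in pixels
            "area": 200.0 * 150.0,
            "iscrowd": 0,
        },
    ],
}


def check_references(data):
    """Return True if every annotation points at a known image and category."""
    image_ids = {img["id"] for img in data["images"]}
    category_ids = {cat["id"] for cat in data["categories"]}
    return all(
        ann["image_id"] in image_ids and ann["category_id"] in category_ids
        for ann in data["annotations"]
    )


# Serialize to the JSON text that would be saved next to the images.
coco_json = json.dumps(coco, indent=2)
```

Running a reference check like this before uploading annotated data can catch dangling image or category IDs early.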
After selecting the upload format, indicate whether the data is sequence data and whether the upload covers only the results of specific sensors.
After clicking "Import," the system will save the file as a database according to the settings, making it easier to filter and search in the future.
Connect to your AWS S3 bucket, and the file will be transferred to the platform.
Bucket URL: AWS S3 bucket link (Bucket URL)
Data Folder: Location of the file folder that needs to be imported.
Access Key ID & Secret Access Key (Optional): Use your Access Key ID and Secret Access Key for secure AWS S3 operations. These keys authenticate and authorize actions such as retrieving or listing objects. The keys must have permission for the operations involved (e.g., GetObject, ListObjects, ListObjectsV2, CopyObject).
Amazon S3 virtual-hosted–style URLs use the following format:
https://bucket-name.s3.region-code.amazonaws.com/
Hint: The Access Key ID and Secret Access Key fields are optional for public AWS S3 buckets. If your S3 bucket is set to public, you can leave these fields blank.
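The virtual-hosted-style format above can be assembled or checked with a few lines of Python before you paste it into the Bucket URL field. This is only a sketch: `build_bucket_url` and `parse_bucket_url` are hypothetical helper names, the bucket and region in the usage lines are placeholders, and the parser handles only the exact `bucket-name.s3.region-code.amazonaws.com` pattern shown above.

```python
from urllib.parse import urlparse

S3_HOST_SUFFIX = ".amazonaws.com"


def build_bucket_url(bucket_name, region_code):
    """Build a virtual-hosted-style S3 bucket URL."""
    return f"https://{bucket_name}.s3.{region_code}.amazonaws.com/"


def parse_bucket_url(url):
    """Split a virtual-hosted-style URL into (bucket_name, region_code).

    Raises ValueError if the host does not look like
    bucket-name.s3.region-code.amazonaws.com.
    """
    host = urlparse(url).hostname or ""
    if not host.endswith(S3_HOST_SUFFIX):
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    prefix = host[: -len(S3_HOST_SUFFIX)]
    # Bucket names may contain dots, so split on the last ".s3." marker.
    bucket, marker, region = prefix.rpartition(".s3.")
    if not marker or not bucket or not region:
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    return bucket, region


# Placeholder bucket and region, for illustration only.
url = build_bucket_url("my-bucket", "us-west-2")
bucket, region = parse_bucket_url(url)
```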
Connect to your Azure blob storage, and the file will be transferred to the platform.
Blob Storage URL: Azure storage link (Blob storage account URL)
Container Name: Blob storage container name
Data Folder: Location of the data folder stored in the container that needs to be imported.
SAS Token: SAS token that provides read permissions to the Azure blob storage account.
Please provide the SAS token for your Azure Blob Storage account, not the container, when uploading.
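Because the token must be an account-level SAS with read permission, it can help to inspect the token's query parameters before uploading. The sketch below assumes the usual Azure conventions, in which an account SAS carries the `ss` (signed services) and `srt` (signed resource types) parameters and `sp` lists the granted permissions, with `r` meaning read. The helper name `describe_sas_token` and the sample tokens are hypothetical placeholders, and the check reads only the parameter names without validating the signature or expiry.

```python
from urllib.parse import parse_qs


def describe_sas_token(sas_token):
    """Report whether a SAS token looks account-level and grants read access.

    Only the query parameters are inspected; nothing is sent to Azure.
    """
    params = parse_qs(sas_token.lstrip("?"))
    permissions = params.get("sp", [""])[0]
    return {
        # Account SAS tokens carry both signed services and resource types.
        "is_account_sas": "ss" in params and "srt" in params,
        # "r" in the signed permissions means read access is granted.
        "can_read": "r" in permissions,
    }


# Placeholder tokens: the signatures are not real.
account_token = "sv=2022-11-02&ss=b&srt=co&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
container_token = "sv=2022-11-02&sr=c&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"

account_info = describe_sas_token(account_token)
container_info = describe_sas_token(container_token)
```

In this sketch, a token generated for a single container is reported as not being an account SAS, which is the case this page warns against.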
Import pre-selected open datasets directly into your workspace for efficient model training and data analysis.
COCO dataset: https://cocodataset.org/#home
KITTI dataset: https://www.cvlibs.net/datasets/kitti-360/index.php
LVIS dataset: https://www.lvisdataset.org/
BDD dataset: https://bdd-data.berkeley.edu/
Dataverse-SDK for Python lets you interact with the Dataverse platform from Python. Currently, the library supports:
Get Project by project-id
Create Dataset from your AWS/Azure storage or a local folder
Get Dataset by dataset-id
First, prepare the target project and its ontology (see Create Project).
Import Dataset Example
Use create_dataset to import a dataset from cloud storage. Please refer to the SDK's GitHub repository for setting details.
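As a sketch of how the cloud-storage fields described on this page could be collected before calling the SDK, the helper below gathers them into one dictionary and enforces the rules stated above (AWS keys are optional but go together; Azure needs a container name and a SAS token). It does not import the SDK. The function name `build_cloud_import_settings` and every dictionary key are assumptions made for illustration, not the SDK's confirmed parameter names, so check the GitHub repository for what create_dataset really accepts.

```python
def build_cloud_import_settings(source, storage_url, data_folder, *,
                                container_name=None, sas_token=None,
                                access_key_id=None, secret_access_key=None):
    """Collect the cloud import fields from this page into one dict.

    All key names are illustrative assumptions, not the SDK's real parameters.
    """
    if source not in ("aws", "azure"):
        raise ValueError("source must be 'aws' or 'azure'")
    settings = {
        "source": source,
        "storage_url": storage_url,
        "data_folder": data_folder,
    }
    if source == "azure":
        # Azure imports need the container name and an account SAS token.
        if not container_name or not sas_token:
            raise ValueError("Azure imports need a container name and a SAS token")
        settings.update(container_name=container_name, sas_token=sas_token)
    else:
        # AWS keys may be omitted for public buckets, but only as a pair.
        if bool(access_key_id) != bool(secret_access_key):
            raise ValueError("provide both AWS keys or neither")
        if access_key_id:
            settings.update(access_key_id=access_key_id,
                            secret_access_key=secret_access_key)
    return settings


# Public S3 bucket: no keys needed (placeholder URL and folder).
public_s3 = build_cloud_import_settings(
    "aws", "https://my-bucket.s3.us-west-2.amazonaws.com/", "images/batch-01"
)
# Assumed call shape, not verified against the SDK:
# dataset = project.create_dataset(**public_s3)
```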
To use the SDK, you need to provide your Service ID (service-id).
When you select SDK in Import Dataset, your Service ID is displayed so that you can copy it.
The system offers the option to import a dataset from an existing dataset of another project. At this point, an ontology mapping is required to ensure data compatibility.
When 2D preview generation is enabled, the system batch-generates 2D preview images for point cloud data, which makes the data easier to browse later in the visualization view.
When the automatic image quality assessment generation function is turned on, the system will batch-generate data for each image, including brightness, contrast, and other information. For details, please refer to the documentation.
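To give a sense of what per-image brightness and contrast values can look like, the sketch below uses one common definition: brightness as the mean grayscale intensity and contrast as the standard deviation of intensity (RMS contrast). This page does not say how the platform computes its metrics, so treat the formulas and the helper name `brightness_and_contrast` as assumptions rather than the platform's method.

```python
from statistics import fmean, pstdev


def brightness_and_contrast(gray_pixels):
    """Return (brightness, contrast) for a grayscale image.

    gray_pixels is a list of rows, each a list of intensities in 0..255.
    Brightness is the mean intensity; contrast is the population standard
    deviation of intensity (RMS contrast).
    """
    values = [pixel for row in gray_pixels for pixel in row]
    if not values:
        raise ValueError("image has no pixels")
    return fmean(values), pstdev(values)


# A tiny 2x2 placeholder image: half black, half white.
checker = [
    [0, 255],
    [0, 255],
]
brightness, contrast = brightness_and_contrast(checker)
```

A uniformly gray image would score the same brightness as this example but a contrast of zero, which is why the two values are reported separately.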
Auto-Tagging
Dataverse offers an advanced auto-tagging feature to simplify data management, automatically generating image tags (weather, scene, time of day) for users.