VLM Playground (Beta)

The VLM Playground provides an interactive platform for users to experiment with Vision-Language Models (VLMs) and configure templates for various analysis tasks. It allows users to test inputs, observe outputs, and save reusable templates for streamlined workflows.

How to Use VLM Playground

  • Upload Image

    Select an image to analyze with the VLM.

  • Instruction

    Tell the VLM what to do. The instruction guides the model's behavior and response. Configure the input prompt and explicitly define each parameter key along with its expected output type.

Instruction Suggestion

The system automatically detects output types such as string, number, vec, and boolean.

Users are encouraged to explicitly define output expectations in the prompt and provide detailed descriptions for each parameter. This ensures stable and structured data formats, enabling consistent results when templates are reused.

Example: For a question like "What is the weather?", specify single-choice options (e.g., Sunny, Cloudy, Rainy) and a format type (vec).
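
The exact wording is up to the user; the sketch below shows one way the weather example above could spell out the parameter key, the single-choice options, and the expected format type. The key name, phrasing, and output shape are illustrative assumptions, not a required syntax.

```python
# Illustrative sketch only: the parameter key "weather" and the exact phrasing
# are examples, not a syntax required by the VLM Playground.
instruction = (
    "What is the weather in this image? "
    "Answer with the parameter 'weather' as a vec containing exactly one of: "
    "Sunny, Cloudy, Rainy."
)

# The kind of structured result such an instruction is intended to produce.
expected_output = {
    "weather": ["Sunny"],  # vec holding the single selected option
}
```

Stating the allowed options and the format type directly in the prompt keeps the detected output type unambiguous each time the template is reused.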

Why Define Outputs Clearly?

  • Explicitly defining expected output types in the prompt ensures:

    • Consistency: Guarantees that future tasks using the template produce stable and structured data formats.

    • Reliability: Reduces the chances of unexpected results or variations in output.

    • Data Analysis: Stable, structured data is easier to analyze and integrate into larger workflows.

  • Adjust Settings

    Fine-tune model settings such as temperature, which controls how creative the response is, or token limit, which caps the response length. These settings appear in the combined example after this list.

  • Submit to Run Analysis

    Submit the input to the VLM and observe the model's output in real time.

  • Save as Template

    Once satisfied with the setup, click Save Template to store the configuration for future use.
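
Taken together, the steps above amount to a single reusable configuration. The sketch below is a hypothetical illustration of what such a template covers (instruction, parameter keys and types, temperature, token limit); the field names and values are assumptions, not the Playground's actual storage schema.

```python
# Hypothetical template configuration mirroring the steps above.
# Field names and values are illustrative, not the Playground's actual schema.
template_config = {
    "name": "weather-check",
    "instruction": (
        "What is the weather in this image? "
        "Answer with the parameter 'weather' as a vec containing exactly one of: "
        "Sunny, Cloudy, Rainy."
    ),
    "parameters": {
        "weather": "vec",  # expected output type, stated explicitly in the prompt
    },
    "settings": {
        "temperature": 0.2,  # lower values favor consistent, less creative output
        "token_limit": 256,  # caps the length of the response
    },
}
```

Saving this as a template means the same instruction, parameter definitions, and settings are applied identically on every future run.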


VLM Templates List

Navigate to the VLM Templates List tab to view, edit, or reuse existing templates.

  • View all saved templates in the Templates List section.

  • Templates are organized by name, parameter keys, and creation date for easy access and management.

Templates for Tasks:

These templates can be directly applied to VLM tasks to automate analysis processes such as:

  • Detecting scene details (e.g., weather, object types, abnormal events).

  • Providing structured outputs for population density, traffic analysis, or emergency detection.

  • Supporting event-based triggers, ensuring detailed and consistent context for detected events.
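
As an illustration, a scene-analysis template that declares its parameters with the four supported output types could return a result along the lines of the sketch below for each image. The parameter keys and values are hypothetical; the point is that every run returns the same keys with the same types.

```python
# Hypothetical per-image result from a scene-analysis template.
# Keys and values are illustrative; the stable structure and types are the point.
scene_result = {
    "weather": "Cloudy",              # string: overall scene condition
    "person_count": 14,               # number: basis for population density
    "vehicle_types": ["car", "bus"],  # vec: detected object types
    "abnormal_event": False,          # boolean: can drive event-based triggers
}
```

Because the structure is stable, downstream steps such as traffic statistics or event-based triggers can consume the result without extra parsing logic.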

The VLM Playground allows users to experiment with Vision-Language Models by testing prompts, configuring parameters, and saving templates for reuse. By explicitly defining structured outputs, users can ensure consistent and reliable data formats for analysis. These templates can be seamlessly applied to tasks, enabling automated and standardized VLM processing for real-world applications.
