Overview

What Kite solves

The robotics stack is fragmented — simulation, training, data generation, deployment, and hardware integration live in separate tools. Kite merges them into one IDE so teams spend time on policies, not plumbing.

Without Kite, engineers wire together MuJoCo or Isaac for simulation, write custom scripts for synthetic data, provision cloud GPUs by hand, maintain URDFs across repos, and re-implement glue code for every new robot. A new team typically takes weeks to reach its first meaningful training run.

With Kite, you open the IDE, pick a robot, describe the environment, and run a policy. Everything underneath — sim backends, controllers, sensors, datasets, and compute — is already wired and validated. Sim-to-real compatibility is baked into the project layout rather than bolted on afterwards.

Unified workspace

Agents, models, data, simulation, and monitoring in one surface.

Sim-to-real on day one

Compatibility validated at project creation, not at deploy time.

Cloud compute

GPU and CPU provisioned per run. No rig to buy or maintain.

Zero setup

URDFs, packages, libraries, and drivers handled upfront.

Tip

New team? Start from a preset project (quadruped, arm, or humanoid). Everything — URDF, sim, datasets, and compute — is already configured so you can run a training job in minutes.

Platform

Pre-supported robots & URDF upload

Kite ships with curated URDFs for common research platforms and accepts your own robot definition with a drag-and-drop upload.

Unitree G1 (humanoid), Unitree Go2 (quadruped), Boston Dynamics Spot, and the SO-100 arm come preloaded with validated URDFs, collision meshes, actuator models, and reference controllers. Switch between them from the IDE's robot picker without modifying project code.

Drop a custom URDF (single file or zipped package with meshes) into the project. Kite parses joints, links, transmissions, and sensors; validates the kinematic chain; surfaces physical inconsistencies; and rebuilds the simulation scene automatically.
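To make the upload checks concrete, here is a minimal sketch of what validating a URDF can look like, using only Python's standard library. It is not Kite's validator: the function name `validate_urdf`, the specific checks, and the example robot are assumptions chosen for illustration, and it covers only masses, joint limits, and parent/child references.

```python
import xml.etree.ElementTree as ET


def validate_urdf(urdf_text: str) -> list[str]:
    """Return human-readable problems found in a URDF string (hypothetical checks)."""
    root = ET.fromstring(urdf_text)
    problems = []
    link_names = {link.get("name") for link in root.findall("link")}

    # A link that declares inertial data must have a strictly positive mass.
    for link in root.findall("link"):
        mass = link.find("inertial/mass")
        if mass is not None and float(mass.get("value", "0")) <= 0.0:
            problems.append(f"link '{link.get('name')}': non-positive mass")

    for joint in root.findall("joint"):
        name = joint.get("name")
        # Every joint must connect two links that exist in the file.
        for tag in ("parent", "child"):
            ref = joint.find(tag)
            if ref is None or ref.get("link") not in link_names:
                problems.append(f"joint '{name}': unknown {tag} link")
        # Revolute and prismatic joints need a well-ordered <limit>.
        if joint.get("type") in ("revolute", "prismatic"):
            limit = joint.find("limit")
            if limit is None:
                problems.append(f"joint '{name}': missing <limit>")
            elif float(limit.get("lower", "0")) > float(limit.get("upper", "0")):
                problems.append(f"joint '{name}': lower limit exceeds upper limit")
    return problems


# Illustrative robot with two deliberate mistakes.
EXAMPLE_URDF = """
<robot name="demo">
  <link name="base"><inertial><mass value="1.0"/></inertial></link>
  <link name="arm"><inertial><mass value="0.0"/></inertial></link>
  <joint name="shoulder" type="revolute">
    <parent link="base"/><child link="arm"/>
    <limit lower="1.0" upper="-1.0" effort="10" velocity="1"/>
  </joint>
</robot>
"""

problems = validate_urdf(EXAMPLE_URDF)
for problem in problems:
    print(problem)
```

Running the same kind of check locally before uploading can shorten the feedback loop, although the checks Kite actually applies may differ.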

Built-in platforms

Unitree G1, Unitree Go2, Boston Dynamics Spot, and SO-100.

URDF validation

Inertia, joint limits, and mesh integrity checked on upload.

Dependency-free

Packages and libraries resolved automatically.

Sim-to-real parity

One URDF drives sim, visualization, and real hardware.

Robotics agent

A coding agent tuned for robotics — aware of URDFs, controllers, sensors, and the Kite SDK. It writes policy scaffolding, debugs training runs, and suggests fixes grounded in the current project state.

The agent reads your project files, URDF, sim config, and run logs. It proposes code edits, generates reward functions, fixes observation and action spaces, and explains why a training run is diverging — with references to the exact file and line.

A fast coding agent ships on the Free plan for single-file assistance. The Pro plan unlocks a deeper agent with multi-file refactors, longer context, and access to training-run telemetry as a first-class tool.

Project-grounded

Suggestions reference real joints, links, and sensors.

Multi-file edits

Cross-file refactors and project-wide search.

Live telemetry

Streams simulation and training logs during a run.

Models & Data

World generation — World Labs

Generate photorealistic 3D scenes from text or reference imagery using the World Labs integration, then drop them directly into your training environment.

Describe an environment (for example, a warehouse with mixed-height shelving or a cluttered domestic kitchen) and Kite produces a Gaussian-splat plus mesh representation with collision geometry. Seeds are preserved, so a training run can be reproduced from the prompt and its seed.

Vary lighting, clutter density, surface materials, and layout per episode to avoid overfitting. Each variation is deterministic given its seed, which keeps evaluation runs reproducible.
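Seeded variation can be illustrated with a short sketch. It assumes each episode's parameters are drawn from a private random generator keyed by the seed. The `SceneVariation` fields, their value ranges, and `sample_variation` are hypothetical and are not part of the Kite SDK or the World Labs API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneVariation:
    """Hypothetical per-episode scene parameters."""
    light_intensity: float
    clutter_objects: int
    floor_material: str


def sample_variation(seed: int) -> SceneVariation:
    # A private generator keeps the draw independent of global random state,
    # so the same seed always yields the same variation.
    rng = random.Random(seed)
    return SceneVariation(
        light_intensity=rng.uniform(0.3, 1.5),
        clutter_objects=rng.randint(0, 12),
        floor_material=rng.choice(["concrete", "wood", "tile"]),
    )


# Re-sampling with a logged seed reproduces the episode's scene exactly.
first = sample_variation(seed=7)
replay = sample_variation(seed=7)
print(first == replay)
```

Logging only the seed alongside each evaluation result is then enough to rebuild the scene that produced it.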

Text-to-3D

Prompt or reference image → navigable 3D scene.

Mesh + splat

Both representations available, usable as sim geometry.

Seeded variation

Deterministic randomization for reproducible eval.

Preset worlds

Warehouse, kitchen, and lab bench seeded per project.

Objects with SAM 3D

Segment any object from an image and turn it into a physics-ready 3D asset using the SAM 3D pipeline integrated into the IDE.

Upload an image, click the object you want, and Kite runs SAM 3D to produce a clean mesh with estimated geometry and textures. The result is tagged with a material model and ready to drop into a world.

Generated assets come with inertia estimates, collision primitives, and contact-friendly meshes so they behave predictably in MuJoCo. Batch generation is supported for data-augmentation workflows.
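As a rough illustration of an inertia estimate paired with a collision primitive, the sketch below fits an axis-aligned box to a mesh's vertices and applies the standard solid-box inertia formula. It assumes uniform density and a box-shaped approximation. The function names are hypothetical, and the method Kite actually uses is not described here.

```python
def aabb_size(vertices):
    """Edge lengths (x, y, z) of the axis-aligned bounding box of the vertices."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def box_inertia(mass, size):
    """Principal moments (Ixx, Iyy, Izz) of a uniform solid box about its centre.

    Uses I_xx = m/12 * (y^2 + z^2), and cyclic permutations for the other axes.
    """
    sx, sy, sz = size
    k = mass / 12.0
    return (k * (sy**2 + sz**2), k * (sx**2 + sz**2), k * (sx**2 + sy**2))


# Two opposite corners are enough to define the bounding box of this toy mesh.
vertices = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
size = aabb_size(vertices)
inertia = box_inertia(mass=12.0, size=size)
print(size, inertia)
```

A box is the crudest useful primitive. Convex decompositions or mesh-based integrals give tighter estimates at higher cost.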

Click-to-lift

SAM 2 for segmentation, SAM 3D for 3D reconstruction.

Sim-ready meshes

Collision primitives auto-paired with each asset.

Material hints

Friction and material estimates attached on import.

Batch mode

Generate object libraries for randomized training.

Vision-Language-Action models

Frontier policies from Physical Intelligence — the π-series — are ready to train from the IDE. Kite handles tokenizers, adapters, and cloud training orchestration.

The current registry includes Physical Intelligence π0 and π0-FAST, with additional members of the π-series onboarded as they are released.

Point a training job at a Kite dataset, pick a base checkpoint, and launch. Kite manages image, state, and action normalization, PEFT adapters, and evaluation replay against the sim you trained on.
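For readers unfamiliar with state and action normalization, the following sketch shows the general idea: per-dimension statistics are computed once, stored, and reused to map values into and out of a standardized range. The helper names are hypothetical, and the sketch assumes simple mean and standard-deviation scaling, which may not match the scheme a given base checkpoint uses.

```python
import statistics


def fit_normalizer(rows):
    """Compute per-dimension mean and population std from a list of vectors."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant dimension has zero std; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def normalize(row, means, stds):
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def denormalize(row, means, stds):
    return [x * s + m for x, m, s in zip(row, means, stds)]


# Toy action vectors: the second dimension never changes.
actions = [[0.0, 10.0], [2.0, 10.0]]
means, stds = fit_normalizer(actions)
scaled = normalize([2.0, 10.0], means, stds)
restored = denormalize(scaled, means, stds)
print(means, stds, scaled, restored)
```

When fine-tuning, the statistics would normally be loaded from the base checkpoint rather than re-fitted, so that inputs are scaled the way the model saw them during pre-training.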

Policy registry

One-click fine-tune from the registry UI.

PEFT adapters

LoRA-style training fits on a single GPU.

Sim evaluation

Automatic policy rollout after each checkpoint.

Dataset adapters

HuggingFace, LeRobot, and custom formats supported.

Note

Fine-tunes inherit the policy's original image and action normalization — you don't need to re-implement preprocessing to match a base checkpoint.

Fleet policy evaluation

Run a trained policy across many randomized cloud simulations, measure success and failure rates, and get per-episode video proof — no real robot, no babysitting.

An evaluation run rolls one policy out across N randomized sim episodes in parallel on GPU MuJoCo (MJX). Each episode is scored into a labelled outcome: success, or a failure mode (object dropped, timeout, not placed, not reached). The dashboard aggregates the fleet into a success rate, a failure-mode histogram, and a tile grid whose tiles turn to pass or fail as outcomes land, each with a playable rollout video.
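The aggregation step can be sketched in a few lines. The outcome labels below mirror the ones listed above, but their exact spelling, the `summarize` function, and the shape of the returned summary are assumptions for illustration rather than Kite's API.

```python
from collections import Counter


def summarize(outcomes):
    """Aggregate per-episode outcome labels into a rate and a failure histogram."""
    counts = Counter(outcomes)
    total = len(outcomes)
    failure_modes = {label: n for label, n in counts.items() if label != "success"}
    success_rate = counts["success"] / total if total else 0.0
    return {
        "episodes": total,
        "success_rate": success_rate,
        "failure_modes": failure_modes,
    }


# Four finished episodes with hypothetical labels.
summary = summarize(["success", "timeout", "success", "object_dropped"])
print(summary)
```

Because the summary depends only on the list of labels, it can be recomputed incrementally as episodes finish.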

Choose the scene from one of your projects, or from a built-in benchmark-environment catalog (e.g. SO-Arm Pick & Place, SO-Arm Reach) so you can evaluate a policy with no project setup at all. ACT, π-series, SmolVLA, and reinforcement-learning policies are all supported.

Success / failure rates

Labelled outcomes aggregated into a rate plus a failure-mode histogram.

Video proof

A per-episode MP4 (and Rerun .rrd) for every rollout.

Benchmark catalog

Built-in environments to test a policy against, decoupled from projects.

Massively parallel

N environments batched into one GPU job via MJX.

Tip

Runs reconcile server-side, so a fleet closes out with its success rate even if you close the tab mid-run.

Dataset Augmentation Studio

Paste a HuggingFace LeRobot dataset, describe a visual change — lighting, surface colour, texture, added or removed objects — and Kite generates an expanded dataset with new videos and matching joint trajectories, ready to push back to your own HuggingFace account for the next training round.

Augmentation is a standalone, in-memory flow: import a dataset, preview a few episodes, approve, and the pipeline generates visually varied episodes with Google's Gemini models while preserving each episode's motion. The joint/action trajectory is carried over from the source, so the new episodes stay physically consistent. An optional Claude vision pass is available for object-changing edits.

Nothing is stored on Kite's servers — you review the generated episodes and push the result to HuggingFace under your account using a write token you connect in Settings → Integrations. The pipeline is provider-pluggable (mock for offline dev, per-frame image editing, Veo, and Gemini Omni via the Vertex Agent Platform).
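The "change the pixels, keep the motion" idea, together with a pluggable provider, can be sketched as follows. Everything here is hypothetical: the `Episode` type, the `mock_provider` function (a stand-in in the spirit of the offline mock mentioned above), and the tiny grayscale frames do not reflect the LeRobot format or Kite's pipeline.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple

Frame = Tuple[Tuple[int, ...], ...]  # stand-in image: rows of grayscale pixels
Action = Tuple[float, ...]           # one joint/action vector per timestep


@dataclass(frozen=True)
class Episode:
    frames: Tuple[Frame, ...]
    actions: Tuple[Action, ...]


def mock_provider(frame: Frame, prompt: str) -> Frame:
    """Offline stand-in for a generative edit: brighten every pixel, capped at 255."""
    return tuple(tuple(min(255, px + 40) for px in row) for row in frame)


def augment(episode: Episode, prompt: str,
            edit_frame: Callable[[Frame, str], Frame]) -> Episode:
    """Return a new episode with edited frames and the original actions."""
    new_frames = tuple(edit_frame(frame, prompt) for frame in episode.frames)
    if len(new_frames) != len(episode.actions):
        raise ValueError("frame count must match action count")
    # Only the frames are swapped; the trajectory is carried over unchanged.
    return replace(episode, frames=new_frames)


source = Episode(
    frames=(((10, 20), (30, 240)), ((11, 21), (31, 241))),
    actions=((0.0, 0.1), (0.2, 0.3)),
)
augmented = augment(source, "brighter lighting", mock_provider)
print(augmented.frames[0], augmented.actions == source.actions)
```

Passing the provider as a plain callable is one simple way to swap a mock for a real image or video model without touching the rest of the flow.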

Prompt-driven variation

Lighting, surface, texture, and object changes from plain text.

Motion preserved

Joint trajectories carried from the source episodes.

LeRobot in/out

Reads and writes valid LeRobot v3.0 datasets.

Push to your Hub

Results go to your own HuggingFace account, not Kite's.

Tip

Use augmentation to cheaply grow training variety without re-teleoperating: keep the same motions, change the scene, and fine-tune on the expanded set.

Physics & simulation engine

MuJoCo is the primary physics backend, extended by Kimodo — Kite's in-house kinematics and motion layer — so motion capture, retargeting, and contact-rich tasks work out of the box.

MuJoCo provides articulated dynamics, soft contacts, sensors, and deterministic rollout. Kite's adapter exposes the state as structured observations and wires your robot's URDF into the scene graph automatically.

Kimodo handles mocap retargeting, joint-space motion blending, and quaternion-safe transform composition. It bridges captured or synthesized motions into the simulator without the usual coordinate-system breakage.
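Kimodo's internals are not documented here, so the sketch below only illustrates one common reading of "quaternion-safe" composition: multiply rotations with the Hamilton product, then renormalize and fix the sign so numerical drift and the q / -q ambiguity do not accumulate. The (w, x, y, z) ordering and all function names are assumptions.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_normalize(q):
    """Scale to unit length and pick the representative with w >= 0."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero quaternion")
    q = tuple(c / norm for c in q)
    # q and -q encode the same rotation; a fixed sign keeps blending consistent.
    return q if q[0] >= 0.0 else tuple(-c for c in q)


def compose(a, b):
    """Rotation b followed by rotation a, returned as a unit quaternion."""
    return quat_normalize(quat_multiply(a, b))


# Two 45-degree rotations about z compose into one 90-degree rotation about z.
half = math.pi / 8  # a quaternion stores half the rotation angle
q45 = (math.cos(half), 0.0, 0.0, math.sin(half))
q90 = compose(q45, q45)
print(q90)
```

Renormalizing after every composition is cheap and prevents long transform chains from drifting away from unit length.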

MuJoCo backend

Articulated bodies, soft contacts, and sensors.

Kimodo motion

Mocap retargeting with quaternion-safe transforms.

Sensor config

RGB, depth, IMU, and encoders as project state.

Scene builder

Composes robots, worlds, and objects — no manual XML.

Synthetic data generation

Kimodo is Kite's synthetic-data engine — a motion generation and retargeting layer that produces physically consistent episodes across any robot in the project. Pair it with domain randomization and you get training corpora at scale, without recording a single frame of real data.

Kimodo sits between your task definition and the simulator. It generates joint-space motions from high-level task primitives, retargets them onto the target robot's kinematic topology, and preserves quaternion-safe transforms across coordinate frames — so the same task runs on a quadruped, an arm, or a humanoid without re-authoring per embodiment.

Each episode produces synchronized observations, proprioception, and actions in the Kite dataset format. Lighting, textures, friction, mass, sensor noise, and camera placement are randomized per episode; seeds are logged so any sample is reproducible. Datasets export directly to HuggingFace, LeRobot, and raw parquet for downstream training.

Kimodo motion engine

Generates joint-space motions from task primitives — the core of Kite's synthetic data.

Cross-embodiment retargeting

The same task runs on quadrupeds, arms, and humanoids without re-authoring.

Quaternion-safe transforms

Coordinate frames stay consistent across sim, mocap, and export.

Randomization + export

Per-episode domain randomization, with HuggingFace, LeRobot, and parquet outputs.

Compute

GPU integrations for cloud workloads

Run training and data-generation jobs on cloud compute without managing infrastructure. CPU processing is included in the Basic plan; on-demand GPU compute is included in the Pro plan.

Training jobs run as Google Vertex AI custom jobs or on Cloud Run with GPUs, using the same container image you run locally. Kite mints short-lived tokens per job and streams logs back into the IDE.

Basic-plan workspaces get CPU processing — enough to run simulations, generate datasets, and train small policies. Upgrade to Pro to unlock GPU compute on demand: provisioned per-run with burst capacity during training spikes, then released. No rig to buy, cool, or maintain.

CPU — Basic plan

CPU processing included at the Basic tier.

GPU — Pro plan

On-demand cloud GPUs included with Pro.

Vertex AI & Cloud Run

Same container image used locally and in the cloud.

Everything included in Kite.

What Kite solves

Pre-supported robots & URDF upload

Robotics agent

World generation — World Labs

Objects with SAM 3D

Vision-Language-Action models

Fleet policy evaluation

Dataset Augmentation Studio

Physics & simulation engine

Synthetic data generation

GPU integrations for cloud workloads
