Paper | Code

Generating interactive multimedia content with sound, such as video games or animations, is a challenging task. In this work, I propose an evaluation metric for assessing such content and a multi-agent system for generating it.

AVR-Eval is an evaluation metric for multimedia content (video games, animations, etc.). It uses an omni-modal model (one that processes text, video, and audio) to compare the Audio-Visual Recordings (AVR) of two pieces of content. Unlike metrics such as FVD or JEDi, it does not require a dataset. Unlike WebDev Arena, it does not require human evaluators.
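The pairwise comparison described above can be sketched as follows. This is a minimal illustration, not the implementation from the paper: the `Recording` container, the `judge` callable (standing in for the omni-modal model), and the choice to query it in both presentation orders to offset position bias are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Recording:
    """Hypothetical container for one Audio-Visual Recording (AVR)."""
    video_path: str
    audio_path: str


# Hypothetical stand-in for the omni-modal model: it receives two recordings
# and answers "first" or "second" depending on which one it prefers.
Judge = Callable[[Recording, Recording], str]


def compare(a: Recording, b: Recording, judge: Judge) -> str:
    """Return "a", "b", or "tie" for a pair of recordings.

    Assumption: the judge is asked twice with the order swapped, so a judge
    that always favours one position produces a tie instead of a false win.
    """
    forward = judge(a, b)   # a is shown first
    backward = judge(b, a)  # b is shown first
    a_wins = (forward == "first") + (backward == "second")
    b_wins = (forward == "second") + (backward == "first")
    if a_wins > b_wins:
        return "a"
    if b_wins > a_wins:
        return "b"
    return "tie"


def win_rate(pairs: Iterable[Tuple[Recording, Recording]], judge: Judge) -> float:
    """Fraction of decided comparisons won by the first element of each pair."""
    results = [compare(a, b, judge) for a, b in pairs]
    decided = [r for r in results if r != "tie"]
    if not decided:
        return 0.5  # no decided comparison: treat as even
    return sum(r == "a" for r in decided) / len(decided)
```

Because the judge only needs two recordings, a comparison like this requires neither a reference dataset nor human raters, which is the property the metric is designed around.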

AVR-Agent is a multi-agent framework that combines coding models and omni-modal models to generate video games, using AVR feedback and a bank of artist-made multimedia assets (images, sounds, music, 3D models).
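One way to picture the feedback loop is the sketch below. It is only an illustration under stated assumptions: the three callables (`generate` for the coding model, `record` for capturing an AVR, `critique` for the omni-modal model) and the exact order of operations are hypothetical and are not taken from the released code. The default of 10 steps mirrors the examples shown on this page.

```python
from typing import Callable, List, Optional


def avr_agent_loop(
    prompt: str,
    assets: List[str],
    generate: Callable[[str, List[str], Optional[str], Optional[str]], str],
    record: Callable[[str], str],
    critique: Callable[[str, str], str],
    steps: int = 10,
) -> str:
    """Iteratively refine game code using AVR feedback (hypothetical sketch).

    generate(prompt, assets, previous_code, feedback) -> new code
    record(code) -> path to an audio-visual recording of the running game
    critique(prompt, recording) -> textual feedback from the omni-modal model
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")

    code: Optional[str] = None
    feedback: Optional[str] = None
    for _ in range(steps):
        # The coding model writes or revises the game, seeing the asset bank
        # and the feedback from the previous iteration (None on the first).
        code = generate(prompt, assets, code, feedback)
        # The game is run and captured as an audio-visual recording.
        recording = record(code)
        # The omni-modal model reviews the recording and suggests fixes.
        feedback = critique(prompt, recording)
    assert code is not None
    return code
```

In this sketch, the "without assets or AVR feedback" setting would correspond to passing an empty asset list and a `critique` that returns no useful feedback, while the one-shot setting would correspond to `steps=1`.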

Examples of generated games are shown below. See the paper for more details. The code for AVR-Eval and AVR-Agent is available here.

AVR-Agent



Example games made with Kimi-K2

Beat 'em up game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Incremental game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Platformer game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Example games made with Qwen3-Coder

Beat 'em up game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Incremental game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Bowling game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Solitaire game

One-shot with assets

AVR-Agent (10 steps) with assets and AVR feedback

AVR-Agent (10 steps) without assets or AVR feedback



Evaluation metric: AVR-Eval

AVR-Agent



Multi-agent framework: AVR-Agent

AVR-Agent