MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
NeurIPS | Jun 17, 2022 | Datasets & Benchmarks Best Paper
Autonomous agents have made great strides in specialist domains like Atari
games and Go. However, they typically learn tabula rasa in isolated
environments with limited and manually conceived objectives, thus failing to
generalize across a wide spectrum of tasks and capabilities. Inspired by how
humans continually learn and adapt in the open world, we advocate a trinity of
ingredients for building generalist agents: 1) an environment that supports a
multitude of tasks and goals, 2) a large-scale database of multimodal
knowledge, and 3) a flexible and scalable agent architecture. We introduce
MineDojo, a new framework built on the popular Minecraft game that features a
simulation suite with thousands of diverse open-ended tasks and an
internet-scale knowledge base with Minecraft videos, tutorials, wiki pages, and
forum discussions. Using MineDojo's data, we propose a novel agent learning
algorithm that leverages large pre-trained video-language models as a learned
reward function. Our agent is able to solve a variety of open-ended tasks
specified in free-form language without any manually designed dense shaping
reward. We open-source the simulation suite, knowledge bases, algorithm
implementation, and pretrained models (https://minedojo.org) to promote
research towards the goal of generally capable embodied agents.
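
The idea of using a pre-trained video-language model as a learned reward function can be illustrated with a minimal sketch. It assumes that the model embeds a short clip of recent frames and the free-form task prompt into a shared vector space, and that the reward is the cosine similarity between the two embeddings. The abstract does not state the exact reward form, so this is an assumption. The function names and the toy three-dimensional vectors below are hypothetical placeholders standing in for real model outputs, not part of the MineDojo API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def language_reward(video_embedding: Sequence[float],
                    text_embedding: Sequence[float]) -> float:
    """Hypothetical reward: how well the recent clip matches the task prompt.

    Assumes both embeddings come from the same video-language model, so that
    similarity in this space reflects agreement between behavior and prompt.
    """
    return cosine_similarity(video_embedding, text_embedding)


# Toy embeddings (placeholders for real model outputs).
prompt_embedding = [1.0, 0.0, 0.0]   # e.g. the embedded task prompt
on_task_clip = [0.9, 0.1, 0.0]       # clip that resembles the prompt
off_task_clip = [0.0, 1.0, 0.0]      # clip unrelated to the prompt

on_task_reward = language_reward(on_task_clip, prompt_embedding)
off_task_reward = language_reward(off_task_clip, prompt_embedding)
print(on_task_reward > off_task_reward)
```

Under these assumptions, a reinforcement learning agent would receive a higher reward when its recent behavior looks more like what the prompt describes, which is how such a signal could stand in for a hand-designed dense shaping reward.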