Using natural language and program abstractions to instill human inductive biases in machines
NeurIPS · May 23, 2022 · Outstanding Paper
Strong inductive biases give humans the ability to quickly learn to perform a
variety of tasks. Although meta-learning is a method to endow neural networks
with useful inductive biases, agents trained by meta-learning may sometimes
acquire very different strategies from humans. We show that co-training these
agents on predicting representations from natural language task descriptions
and programs induced to generate such tasks guides them toward more human-like
inductive biases. Human-generated language descriptions and program induction
models that add new learned primitives both contain abstract concepts that can
compress description length. Co-training on these representations results in
more human-like behavior in downstream meta-reinforcement learning agents than
less abstract controls (synthetic language descriptions, program induction
without learned primitives), suggesting that the abstraction supported by these
representations is key.
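
To make the co-training idea concrete, here is a minimal sketch (not the paper's actual architecture or losses) of a meta-learning agent with an auxiliary head that is trained to predict a task representation, such as an embedding of a natural language description or an induced program, alongside its policy output. All dimensions, the behavior-cloning stand-in for the RL objective, and the 0.5 auxiliary weight are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; not taken from the paper.
OBS_DIM, ACT_DIM, HID_DIM, REPR_DIM = 16, 4, 64, 32

class CoTrainedAgent(nn.Module):
    """Agent with a shared encoder, a policy head, and an auxiliary head
    that predicts a representation of the current task (e.g. an embedding
    of a language description or an induced program)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(OBS_DIM, HID_DIM), nn.ReLU())
        self.policy_head = nn.Linear(HID_DIM, ACT_DIM)   # action logits
        self.repr_head = nn.Linear(HID_DIM, REPR_DIM)    # auxiliary task-representation prediction

    def forward(self, obs):
        h = self.encoder(obs)
        return self.policy_head(h), self.repr_head(h)

agent = CoTrainedAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)

# Dummy batch: observations, action targets, and a target task representation
# (a stand-in for a pretrained embedding of the task's language description
# or induced program). All values are random placeholders.
obs = torch.randn(8, OBS_DIM)
target_actions = torch.randint(0, ACT_DIM, (8,))
target_repr = torch.randn(8, REPR_DIM)

logits, pred_repr = agent(obs)
task_loss = nn.functional.cross_entropy(logits, target_actions)  # stand-in for the meta-RL objective
aux_loss = nn.functional.mse_loss(pred_repr, target_repr)        # representation co-training term
loss = task_loss + 0.5 * aux_loss                                 # arbitrary auxiliary weight

opt.zero_grad()
loss.backward()
opt.step()
```

The auxiliary term only shapes the shared encoder during training; at evaluation time the agent acts from the policy head alone, so any human-like bias it acquires has to live in the learned representations.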