Can Wikipedia Help Offline Reinforcement Learning? (Paper Explained)

#wikipedia #reinforcementlearning #languagemodels Transformers have come to overtake many domain-targeted custom models in a wide variety of fields, such as Natural Language Processing, Computer Vision, Generative Modelling, and recently also Reinforcement Learning. This paper looks at the Decision Transformer and shows that, surprisingly, pre-training the model on a language-modelling task significantly boosts its performance on Offline Reinforcement Learning. The resulting model achieves higher scores, can get away with less parameters, and exhibits superior scaling properties. This raises many questions about the fundamental connection between the domains of language and RL. OUTLINE: 0:00 - Intro 1:35 - Paper Overview 7:35 - Offline Reinforcement Learning as Sequence Modelling 12:00 - Input Embedding Alignment & other additions 16:50 - Main experimental results 20:45 - Analysis of the attention patterns across models 32:25 - More experimental results (scaling properties, ablations, etc.) 37:30 - Final th

4 views