– Europe/Lisbon
Online
Pedro A. Santos, Instituto Superior Técnico and INESC-ID
In this presentation, I will introduce some traditional Reinforcement Learning problems and algorithms, and analyze how some of their difficulties can be avoided, and convergence results obtained, using a two-time-scale variation of the usual stochastic approximation approach.
This variation was inspired by the practical success of Deep Q-Learning, with which DeepMind's research team attained superhuman performance at some classic Atari games in 2015. Practical successes in Machine Learning like this one often lack a corresponding explanatory theory. The work that will be presented aims to help close that gap.
Joint work with Diogo Carvalho and Francisco Melo from INESC-ID.