Ensuring Monotonic Policy Improvement in Entropy-regularized Value-based Reinforcement Learning

Published in arXiv eprint, 2020