Least-squares policy iteration
Reinforcement learning as classification: Leveraging modern classifiers