See Reinforcement Learning
Given a current state, chosing the next favorable action, Modelled via Neural Networks, yields a policy network