We consider a very general case of optimal control of a dynamic, potentially stochastic, and partially observable system for which a model is not necessarily available. We analyze the disadvantages of classical control-theory approaches and present a new, modified numerical reinforcement learning rule. Control theory is a long-established field that studies the behavior of dynamic systems and how to influence it. Among its best-known examples are LQG (Linear Quadratic Gaussian) and PID (Proportional Integral Derivative) controllers. Most existing approaches presuppose (analytical) knowledge of the dynamic system, whereas one requirement of the problem considered here is to dispense with a priori models. We focus on a modified reinforcement learning approach to adaptive control policies as a promising direction for the control of complex dynamical systems under uncertainty.