USING MULTILAYER NEURAL NETWORKS FOR SOLVING SYSTEMS OF DIFFERENTIAL EQUATIONS

The article studies methods for the numerical solution of systems of differential equations using neural networks. To achieve this goal, the following interdependent tasks were solved: an overview was given of the industries that need to solve systems of differential equations, and a method of solving systems of differential equations using neural networks was implemented. It is shown that different types of systems of differential equations can be solved by a single method, which requires only the formulation of a loss function for optimization; this function is created directly from the differential equations and does not require solving the equations for the highest derivative. The solution of a system of differential equations obtained with a multilayer neural network is a set of functions given in analytical form, which can be differentiated or integrated analytically. In the course of this work, an improved form of construction of a trial solution of systems of differential equations was found, which satisfies the initial conditions by construction but has less impact on the solution error far from the initial conditions compared to the basic form of such a solution. A way was also found to modify the calculation of the loss function for cases when the solution process stops at a local minimum, which is caused by the strong dependence of subsequent values of the functions on the accuracy of finding the previous values. Among the results, it can be noted that the solution of a system of differential equations using artificial neural networks may be more accurate than classical numerical methods for solving differential equations, but usually takes much longer to achieve similar results on small problems. The main advantage of using neural networks to solve systems of differential equations is that the solution is in analytical form and can be found not only for individual values of the parameters of the equations, but also for all values of the parameters in a limited range.

Keywords: systems of differential equations, artificial neural networks, multilayer neural network, numerical methods, gradient descent method, solution error function.

Вісник Національного технічного університету «ХПІ». Серія: Системний аналіз, управління та інформаційні технології, № 2 (6)'2021

Introduction. Differential equations and their systems are widely used in mathematical modeling to describe a variety of real processes: physical, environmental, biological, and others. Solving some partial differential equations in cases that allow separation of variables also reduces to problems for ordinary differential equations. These are, as a rule, boundary value problems (problems of natural oscillations of elastic beams and plates, determination of the spectrum of eigenvalues of particle energy in spherically symmetric fields, etc.). In addition, higher-order differential equations lead to the solution of systems of differential equations. It is known that solutions of differential equations and their systems can be found analytically or numerically. Finding analytical solutions is a very time-consuming process, and in most cases impossible. Therefore, at present, traditional numerical methods are widely used to solve differential equations and their systems, among which the most well-known are Runge-Kutta methods, finite-difference methods, and predictor-corrector methods [1, 2].
The general problem of classical numerical methods is the need to choose their parameters to ensure a compromise between computational cost and the accuracy of the result. Therefore, this work proposes to use artificial multilayer neural networks, where, in contrast to classical methods, the solution is presented in analytical form, from which derivatives can be taken repeatedly [3, 4]. The solution is stored as the parameters of the neural network, which requires much less memory than storing a solution as a discrete array in traditional numerical methods [5, 6]. The method is also universal and can therefore be used to solve different types of differential equations and their systems, both ordinary and partial [7, 8].
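As an illustration of this analytical-form property (not code from the article), even a single tanh neuron shows how a network output and its exact derivative are both closed-form expressions:

```python
import numpy as np

# Illustrative parameters (not from the article): a one-neuron "network"
# y(x) = v * tanh(w*x + b), whose derivative exists in closed form:
# dy/dx = v * w * (1 - tanh(w*x + b)**2).
w, b, v = 1.5, 0.2, 0.7

def y(x):
    """Closed-form network output."""
    return v * np.tanh(w * x + b)

def dy_dx(x):
    """Closed-form analytic derivative of the network output."""
    return v * w * (1.0 - np.tanh(w * x + b) ** 2)

# Sanity check: the analytic derivative matches a central finite difference.
x0, h = 0.3, 1e-6
fd = (y(x0 + h) - y(x0 - h)) / (2 * h)
assert abs(fd - dy_dx(x0)) < 1e-8
```

In practice an automatic-differentiation framework such as TensorFlow plays the role of `dy_dx` for networks of any depth, which is what makes loss functions built from derivatives of the network tractable.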
The main advantage of using neural networks to solve differential equations' systems is that the solution is in analytical form and can be found not only for individual values of parameters of equations, but also for all values of parameters in a limited range of values.
A review of the literature showed the relevance of the problem and the feasibility of creating software. Therefore, the aim of this article is to solve systems of differential equations using a multilayer neural network.
The article's objective is to study the methods of numerical solution of systems of ordinary differential equations and to develop software for their solution using multilayer neural networks.
To solve this problem, the solution is presented in the form [3, 5]: y(x) = N(x, p), where N(x, p) is the neural network function with parameters p and input values x.
In this case, the initial conditions are not satisfied by construction and are therefore learned gradually during the training of the neural network.
The construction of the solution of a system of differential equations can be written in a form that satisfies the initial conditions from the beginning: y(x) = A(x) + Z(x) · N(x, p), where A(x) is a function that satisfies the initial conditions in advance; Z(x) is a function constructed so that it equals zero at the coordinates of the initial conditions; N(x, p) is the output of a feedforward neural network with input x and weights p.
The task of building the function A(x) reduces to constructing a function that takes given values at given points and can take any value at all other points. To find such a function, a Lagrange interpolation polynomial can be used, which looks like: A(x) = Σ_i y_i · l_i(x), where the basis polynomials l_i(x) are determined by the formula: l_i(x) = Π_{j ≠ i} (x − x_j) / (x_i − x_j). To reduce the influence of this form on the error of the approximation of the solution, the expression for Z(x) is written in a modified form. A multilayer feedforward neural network is chosen as the structure of the neural network for solving systems of differential equations. The number of layers and the number of neurons in each layer are chosen based on the structure of the problem and the complexity of the form of the solution. These parameters are chosen experimentally, because it is impossible to know in advance the optimal parameters of the neural network structure for each task.
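The construction above can be sketched as follows. This is a minimal illustration assuming the trial form y(x) = A(x) + Z(x)·N(x, p), with A(x) a Lagrange interpolant of the initial conditions and Z(x) a simple product vanishing at the initial-condition points; the names `trial`, `lagrange_basis`, and the product form of Z are illustrative choices, not taken from the article:

```python
import numpy as np

def lagrange_basis(x, nodes, i):
    """Basis polynomial l_i(x): equals 1 at nodes[i], 0 at every other node."""
    terms = [(x - xj) / (nodes[i] - xj) for j, xj in enumerate(nodes) if j != i]
    return np.prod(terms, axis=0)

def A(x, nodes, values):
    """Lagrange interpolation polynomial taking `values` at `nodes`."""
    return sum(v * lagrange_basis(x, nodes, i) for i, v in enumerate(values))

def Z(x, nodes):
    """Vanishes at every initial-condition coordinate (assumed product form)."""
    return np.prod([x - xj for xj in nodes], axis=0)

def trial(x, nodes, values, net):
    """Trial solution: satisfies the initial conditions for ANY network output."""
    return A(x, nodes, values) + Z(x, nodes) * net(x)
```

Because Z(x) is zero at every initial-condition point, `trial` reproduces the prescribed values there regardless of the (untrained) network term, which is exactly the "satisfied by construction" property described above.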
The description of a multilayer neural network. An artificial neural network is a structure that consists of a large number of processing elements, each of which has local memory and can interact with other elements [3, 4, 6, 9, 12]. This interaction takes place through communication channels in order to transmit data that can be interpreted in any way. The processing elements independently process the local data arriving to them through input channels. Changing the parameters of the algorithms of such processing depends only on the characteristics of the data. If we consider an artificial neural network as an environment for information processing, then it can be defined by specifying the elements of this environment and the rules of their interaction. Multilayer artificial neural networks can be considered as a serial connection of single-layer feedforward artificial neural networks. The structure of weights in these networks is organized in such a way that more complex classes are processed on layers of high-level neurons by combining and intersecting simple classes, which are formed at lower levels of the network. There is strong evidence that two-layer artificial neural networks are able to recognize any class of convex shape, provided that a sufficient number of hidden-layer neurons can be used and the weights are adjusted accordingly [8, 9]. Feedforward artificial neural networks with several hidden layers are potentially capable of recognizing classes of arbitrary shape. Therefore, setting up a problem on feedforward artificial neural networks includes determining the minimum possible number of neurons in the hidden layer and choosing an effective method of adjusting the weights. To date, neither of these problems is trivial.
To explain the basic principles of building supervised learning methods, we will consider a two-layer artificial neural network. The zero layer of this network performs the auxiliary function of signal branching and does not contain neurons. For this reason, its operation does not modify the input vector. The last layer of an artificial neural network is called the output layer. All layers located between the zero and output layers are hidden layers with a nonlinear activation function of the neurons. In this example, we will consider one hidden layer with m neurons that use the hyperbolic tangent as an activation function.
It consists of neurons that simultaneously receive the input vector of signals x = (x_1, …, x_j, …, x_n). To reproduce the elements of this vector, special devices are used, which are shown to the left of the neurons. These devices do not perform information processing, so they are not considered a layer of the neural network. According to the model of a formal neuron, each of its input signals is multiplied by a weighting factor w_ij, where j is the current vector element number and i is the current neuron number. All weights of a single-layer neural network form a matrix of weights W. Then the vector of arguments is defined as the product net = W · x, and the vector of output signals is the vector of values of the activation functions: y = f(net). The name of the networks indicates that they have a dedicated direction of propagation of signals, which move from the input through one or more hidden layers to the output layer. It is easy to see that a multilayer neural network can be obtained by cascading single-layer networks with matrices of weights W_1, W_2, …, W_L, where L is the number of layers of the neural network. If the multilayer neural network is linear, then with linear activation functions it can be reduced to an equivalent single-layer network with a matrix of weights W = W_1 · W_2 · … · W_L. This means that the formation of such structures makes sense only if nonlinear activation functions are used in the neurons.
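A minimal numeric sketch of this cascade (with illustrative names and random weights, not the article's configuration) shows both the nonlinear two-layer pass and the collapse of a purely linear stack into a single weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # hidden layer: 3 inputs -> 4 neurons
W2 = rng.standard_normal((2, 4))   # output layer: 4 neurons -> 2 outputs
x = rng.standard_normal(3)         # input vector

# Nonlinear two-layer network: hyperbolic tangent in the hidden layer.
y_nonlinear = W2 @ np.tanh(W1 @ x)

# Purely linear two-layer network equals a single layer with W = W2 @ W1,
# which is why depth only pays off with nonlinear activations.
y_linear_deep = W2 @ (W1 @ x)
y_linear_single = (W2 @ W1) @ x
assert np.allclose(y_linear_deep, y_linear_single)
```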
The gradient descent method for artificial neural networks.
The idea of the gradient descent method is to sequentially change the parameters of the artificial neural network in a direction that reduces the objective function [5]. Since the function is differentiable with respect to each of the parameters, it is possible to calculate the gradient vector. Moving in the direction of the negative gradient for each of the parameters, we find a local minimum of the objective function. The change in a parameter w is expressed by the formula: w := w − η · ∂E/∂w, where η is the learning rate and E is the objective function. This algorithm is called a batch-type algorithm, because to determine the magnitude of the step of changing the parameter, it is necessary to process the entire training sample.
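The update rule can be sketched on a toy one-parameter objective (an illustration of the rule itself, not the article's loss):

```python
# Gradient descent on the toy objective E(w) = (w - 3)^2,
# whose minimum is at w = 3. Each step applies w <- w - eta * dE/dw.

def grad_E(w):
    return 2.0 * (w - 3.0)  # analytic gradient of (w - 3)^2

w, eta = 0.0, 0.1
for _ in range(100):
    w -= eta * grad_E(w)
# w converges toward 3 geometrically: (w_k - 3) shrinks by factor 0.8 per step.
```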
The parameter is found as in (7), the step of changing the weights of the output layer is given by (8), and the step of changing the weights of the hidden layer follows analogously. In fact, the correct calculation of values at points closer to the initial conditions is much more important than the calculation at points further away. To make the optimization take into account the influence of the values of the functions at the previous points on the values of the functions at the following points, the calculation of the loss function was modified to give the greatest weight to points closer to the initial conditions, while keeping the sum of the loss function unchanged.
where L(x) is the loss function; x is the argument of the required function; x_i are the points at which the optimization is performed; N is the number of points at which the optimization is performed.
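The exact weighting formula is not reproduced in the text; one plausible sketch, assuming exponentially decaying weights normalized so that they sum to N (which preserves the overall scale of the loss, as the modification above requires), is:

```python
import numpy as np

def weighted_loss(residuals, xs, x0=0.0, decay=1.0):
    """Assumed weighting scheme (illustrative, not the article's formula):
    residuals at points nearer the initial condition x0 get larger weights,
    normalized so the weights sum to N (len(xs))."""
    residuals = np.asarray(residuals, dtype=float)
    xs = np.asarray(xs, dtype=float)
    raw = np.exp(-decay * np.abs(xs - x0))   # larger near x0 (assumed form)
    weights = raw * len(xs) / raw.sum()      # normalization: sum(weights) == N
    return np.mean(weights * residuals**2)
```

With `decay=0.0` all weights equal 1 and the function reduces to the ordinary mean squared residual, so the modification only redistributes emphasis among the collocation points.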
The main results of the work. The Python and R programming languages were used to perform this work; TensorFlow was chosen as the machine learning library with support for neural network training, and PyCharm was used as the integrated development environment.
The final result of solving the problem is shown in fig. 10-11. The obtained optimization result in the form that satisfies the initial conditions by construction: 10000/10000 - 18 s - loss: 3.3917 · 10^-7 - rmse: The solution has a root mean square error of 1.3370 · 10^-4 in comparison with the solution of the implicit Runge-Kutta method of the 4th order with step 10^-3.

Conclusions. The system (12) is difficult to solve with neural networks and could not be solved without additional changes to the loss function, regardless of the form of the solution. The applied modification can be used in other cases, when the solution of differential equations by optimization methods converges to a local minimum. When solving the system (11), the accuracy of the reproduction of the initial conditions had a significant effect on the whole solution. The error of the solution in the basic form was 4.59 times higher than the error of the solution in the form that satisfies the initial conditions by construction. Based on the results, we can say that the choice of the form of the solution and the construction of the loss function depends on the system of differential equations and the needs of the problem to be solved. Some differential equations require special forms of construction of the loss function to be solved.

Fig. 10. The final solution of the differential equations system (12)
Fig. 11. The final solution error function and the redistributed solution error function of the differential equations system (12)