ADAPTIVE GRADIENT METHOD FOR THE OPTIMAL CONTROL OF AN ANTICANCER THERAPY MODEL MÉTODO DO GRADIENTE ADAPTATIVO PARA O CONTROLE ÓTIMO DE UM MODELO COM TERAPIA ANTICÂNCER

Cancer is a set of diseases whose mechanisms of emergence and growth are not completely known. Mathematical and computational models aim to contribute to a better understanding of these mechanisms in tumor dynamics. They can also be used to analyze the impact of anticancer therapeutic protocols. In this sense, we investigate a mathematical model of tumor growth dynamics from the optimal control point of view. We apply the Switching-TimeVariation method to solve the control problem and modify its implementation in order to reduce computational time. Our results show that the execution time of the adaptive method is significantly shorter, and optimal controls for the observed scenario are of the bang-bang type with administration by maximum tolerated dose. The analysis also suggests that the application of another drug capable of acting on resistant tumor cells is required.


INTRODUCTION
The apparently disordered cell growth, which often invades different tissues and organs, configures a set of diseases called cancer (WHO, 2020;INCA, 2020). A typical modality of disease control is chemotherapy. Chemotherapy can be administered in a neoadjuvant way, before surgery to remove tumors, or in an adjuvant way, after surgery to eradicate any remaining malignant cells. In both cases, the objective is to control the tumor cell population through an optimal administration protocol of one or more drugs. It is in this context that mathematical and computational modeling can contribute, investigating in silico protocols that maximize the chance of eradicating the disease.
In this work, we aim to study a mathematical model of tumor growth dynamics (VENDITE et al., 1988), represented by a system of two ordinary differential equations that describe the temporal evolution of drug-resistant and -sensitive tumor cell populations. The control term due to the drug acts only on the sensitive cell population. This model is analyzed from the optimal control point of view, as described in Costa, Boldrini, and Bassanezi (1992). In order to solve the control problem, we apply the Switching-Time-Variation method (MOHLER, 1973;DUDA, 1995). This method provides successive approximations to bang-bang optimal control problems, based on a modified gradient scheme. However, this requires the specification of a parameter that substantially reflects the convergence and computational efficiency of the method. Thus, to avoid inappropriate choices, we develop an adaptive version, in which this parameter is automatically adjusted, and we apply this new version to solve the same control problem. Our results match the Goldie-Coldman model, which describes that mutation rates from sensitive to resistant cells are relatively high. Therefore, treatment should be started as soon as possible to prevent the emergence of more resistant cells (BENZEKRY et al., 2015).

Mathematical Model
We assume spatial homogeneity and that the tumor is heterogeneous, composed of drug-resistant and -sensitive cancer cells. We also assume that drug resistance is acquired spontaneously by mutation. The total population of tumor cells at a given time t is denoted by y(t), resulting from the sum of the number of sensitive and resistant cells. The population of resistant tumor cells is denoted by x(t). The evolution over time of these populations under Revista Mundi, Engenharia e Gestão, Paranaguá, PR, v. 6, n. 3, p. 357-01, 357-10, 2021. DOI: 10.21575/25254782rmetg2021vol6n31657 chemotherapy has been modeled in Vendite (1988) and is described mathematically by: where u(t) indicates the drug concentration in the microenvironment that acts on the sensitive cell population y − x. This model assumes a logistic (resource-limited) growth of the tumor cell population, and the term αr(1 − ky)(y − x) describes the number of sensitive cells that become drug-resistant cells per unit of time. The meaning of the parameters is shown in Table  1. Their values were taken from Vendite (1988) and are associated with an in vitro experiment with CCRF-CEM cells, a cell lineage of leukemia.

Optimal Control Problem
The optimal control problem associated with (1) has been formulated and analyzed in Costa, Boldrini, and Bassanezi (1992). Defining the optimal drug concentration by u * (t), with 0 ≤ u * (t) ≤ u max , administered until time t * f , 0 ≤ t * f < ∞, the optimal strategy is such that it minimizes the functional (COSTA; BOLDRINI; BASSANEZI, 1992): The aim is, therefore, to reduce the total tumor size at the end of therapy, y(t f ), using the lowest drug dose to minimize therapy toxicity. The parameter c ≥ 0 balances the relative importance between these two effects.

Switching-Time-Variation Method
The analytical solution of an optimal control problem, such as that presented in the previous section, cannot always be determined, and numerical techniques are usually required. A common procedure is to initially assume a solution and then solve the problem iteratively so that its conditions are solved incrementally in each iteration (MOHLER, 1973). In this work, we implement an iterative computational method that provides a successive approximation of a solution to the problem (2). This method, called Switching-Time-Variation (STV), is a computational technique to obtain optimal bang-bang controls through a modified gradient scheme. It aims to generate a sequence of switching functions and calculate the cost functional gradient in relation to the switching times. From this gradient, the times are corrected in each iteration so that the control sequence converges to a non-singular optimal control (MOHLER, 1973).
Assuming that the optimal control is a bang-bang process and N is the number of switchings, we define the vector τ where H(·) is the Heaviside function, Algorithm 1 presents a pseudocode of the STV method considering problem (2). The method requires the choice of an initial number of switchings N , a step size κ ij and tolerances ε δτ , ε τ , ε Φ , ε J , which are problem-dependent. Schättler and Ledzewicz (2015) suggest using a sufficiently large number of switchings so that it is not necessary to increase this number during the method execution. They also suggest adopting sufficiently small tolerance values to ensure the solution convergence. Finally, Mohler (1973) points out that, if κ ij is too small, the solution convergence can be slow; on the other hand, if κ ij is too large, the conditions τ N ij < τ N ij+1 and J i−1 > J i can be violated.

Adaptive Switching-Time-Variation Method
In solving the optimal control problem using the STV method, we identify that the parameter κ ij has a critical role in the convergence and computational efficiency of the method. To avoid inappropriate choices, we propose an automatic construction for this parameter, capable of adaptively adjusting it to avoid both too small and too big steps. To this end, at each iteration of the STV outer loop, we define a vector κ i whose elements are given by: where κ < 1 is a factor that weights the maximum allowable step size. The values of the correction δτ N ij and the new switching vector τ N i+1 are also modified so that, if the condition τ N ij < τ N ij+1 for some j = 1, . . . , N − 1 is not satisfied, κ ij is multiplied by the factor κ reducing the step size. When all switching times are correctly ordered, we build the new switching vector Construct a vector κ i containing step sizes repeat 4: Starting from x(0) = x 0 and y(0) = y 0 , integrate (1) progressively over time;

RESULTS AND DISCUSSIONS
In the following experiments, we adopt u max = 0.483 day −1 and observe the behavior of the model solution until time t f = 100 days, using c = 100 to characterize high toxicity caused by the drug. Using a number of N = 99 switchings, a step size of κ ij = 10 −4 for all j = 1, . . . , N , and tolerances equal to ε δτ = 10 −2 , ε τ = 10 −3 , ε Φ = 10 −5 and ε J = 10 −5 , we solve the optimal control problem (2) by applying the STV method. These parameter values were defined according to the suggestions previously discussed. We also define a maximum number of iterations of the outer and inner loops by 10 6 and 10 3 , respectively, so that the iterative process ends even if the stopping criteria are not satisfied. Figure 1 illustrates the optimal control u * (t) and the trajectory of states x(t) and y(t) obtained for the performed simulation.
The behavior of the solutions x(t) and y(t) resulted in a decrease in the total population Revista Mundi, Engenharia e Gestão, Paranaguá, PR, v. 6, n. 3, p. 357-01, 357-10, 2021. DOI: 10.21575/25254782rmetg2021vol6n31657 Figure 1: Optimal control u * (t) at t * f = 33.371 days and trajectory of states x(t) and y(t) in logarithmic scale, over time t (in days), considering the maximum tolerated dose u max = 0.483 day −1 .
(a) Optimal control u * (t) (b) Trajectory of states x(t) and y(t) Source: The authors.
of tumor cells at the beginning of therapy due to the drug action on the sensitive cell population. It also resulted in a simultaneous increase in the resistant cell population due to its growth rate and mutation rate from sensitive cells. At the end of therapy, the total population of tumor cells becomes predominantly drug-resistant, requiring, for example, the application of another drug capable of acting on the remaining tumor cell population.
In this experiment, the value obtained for the cost functional was J(u * , t * f ) = 1.809 × 10 5 . The number of outer loop iterations was equal to the predefined maximum number (= 10 6 ), meaning that the stopping criteria were not satisfied. However, we consider the resulting controls as optimal since starting from a given iteration when N = 1 switching, the remaining switching time t f oscillated between two relatively close values. Thus, the stopping criteria were not satisfied since the adopted tolerances ε Φ and ε J were excessively stringent.
In order to analyze the solution convergence of Algorithm 2 with respect to Algorithm 1, we develop two sets of computational simulations. In the first set of simulations, we define a number of N = 99 switchings and tolerances equal to ε τ = 10 −3 and ε J = 10 −5 . We also define both the maximum numbers of iterations of the outer and inner loops as 10 3 . Finally, we set different values for the tolerances ε Φ (10 −3 , 10 −4 , 10 −5 ) and ε δτ (10 −1 , 10 −2 , 10 −3 , 10 −4 ), as well as for the step size-associated factor κ (0.5, 0.1). The results showed that the variation of the tolerance ε Φ had no impact on the switching time t * f ; there was only a small variation in the number of outer loop iterations. However, we must keep the tolerance ε δτ and factor κ Revista Mundi, Engenharia e Gestão, Paranaguá, PR, v. 6, n. 3, p. 357-01, 357-10, 2021. DOI: 10.21575/25254782rmetg2021vol6n31657 sufficiently small; otherwise, we observe that the method can converge to another switching vector, failing to reach the desired solution to the optimal control problem.
In order to analyze the impact of other parameters on the solution convergence of the optimal control problem (2), we define again a number of N = 99 switchings and adopt κ = 0, 1, ε δτ = 10 −4 , and ε Φ = 10 −4 . We define the same maximum number of iterations of the outer and inner loops (= 10 3 ). Finally, we vary the values of the tolerances ε J (10 −3 , 10 −4 , 10 −5 ) and ε τ (10 −3 , 10 −4 , 10 −5 ). In this second set of simulations, the tolerance variation ε J did not impact the switching time t * f , and the number of outer loop iterations also remained the same. However, we cannot adopt significantly small values for the tolerance ε τ , which could lead to an increase in the number of outer loop iterations, and the solution may not converge.
Given the obtained results, we use a number of N = 99 switchings, a step size-associated factor of κ = 0.1, and tolerances equal to ε δτ = 10 −4 , ε τ = 10 −3 , ε Φ = 10 −4 , and ε J = 10 −5 to solve the optimal control problem (2) by applying the adaptive STV method. We also keep the maximum numbers of iterations of the outer and inner loops equal to 10 3 . Figure 2 illustrates the optimal control u * (t) and trajectory of states x(t) and y(t) obtained for the performed simulation. Source: The authors.
The resulting optimal control u * (t) has only N = 1 switching at time t * f = 33.418 days. Thus, the behavior of the solutions represented in Figure 2 is qualitatively and quantitatively Revista Mundi, Engenharia e Gestão, Paranaguá, PR, v. 6, n. 3, p. 357-01, 357-10, 2021. DOI: 10.21575/25254782rmetg2021vol6n31657 8 consistent with the behavior obtained in Figure 1. The value obtained for the cost functional J(u * , t * f ) is also equivalent to 1.809 × 10 5 , and the number of outer loop iterations is 174, and thus significantly lower compared to the simulation obtained using Algorithm 1.

CONCLUSION
In this work, numerical simulations involving the scenario depicted in Vendite (1988) of a chemotherapy treatment were analyzed to investigate the solution of the optimal control problem (2). To solve the control problem, we used the STV method (MOHLER, 1973;DUDA, 1995) and modified its implementation, aiming to reduce the computational execution time. The proposed adaptation resulted in the acceleration of the convergence of the solution for the considered test problem, with significant reduction in the final computational cost. From the observed results, optimal controls for the scenario are of the bang-bang type with, at most, N = 1 switching from u = u max to u = 0. Concerning the behavior of tumor cell populations, the best treatment protocol is to eliminate sensitive cells as soon as possible since resistant cells cannot be eliminated. In this way, tumor growth becomes limited to the resistant population growth rate because the rapid elimination of sensitive cells prevents the increase in the population of resistant cells due to mutation (SCHÄTTLER; LEDZEWICZ, 2015). Scenarios involving multiple drug therapies and their implications are still under investigation.