Fast derivatives of likelihood functionals for ODE based models using adjoint-state method

1 Citation

Melicher, V., Haber, T. & Vanroose, W. Fast derivatives of likelihood functionals for ODE based models using adjoint-state method. Comput Stat 32, 1621–1643 (2017).

2 Summary

In this paper, the adjoint-state method (ASM) for computing the gradient and the Hessian of likelihood functionals for time-series data modelled by ordinary differential equations (ODEs) is derived and analyzed. The discrete data and the continuous model are interfaced at the level of the likelihood functional, using the concept of point-wise distributions.

This alternative approach is compared to the sensitivity-equation (SE) approach and to finite differences (FD).
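The adjoint idea can be illustrated on the paper's linear, diagonal benchmark model dx_i/dt = p_i x_i with a unit-variance Gaussian misfit. A minimal sketch, with illustrative numbers of my own choosing (not the paper's settings), and SciPy standing in for CVODES:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Linear diagonal benchmark: dx_i/dt = p_i x_i, so x_i(t) = x0_i * exp(p_i t).
rng = np.random.default_rng(0)
n = 3                                      # number of states == parameters
p_true = rng.uniform(-1.1, -0.1, n)
x0 = np.ones(n)
t_obs = np.linspace(1.0, 10.0, 6)          # hypothetical measurement times
y = x0 * np.exp(np.outer(t_obs, p_true))   # noise-free synthetic data

def forward(t, p):
    # Closed-form forward solution x_i(t) = x0_i * exp(p_i * t)
    return x0 * np.exp(p * t)

def grad_adjoint(p):
    """Gradient of J(p) = 0.5 * sum_k ||x(t_k; p) - y_k||^2 via the adjoint.

    Backward adjoint ODE: dlam/dt = -p * lam, with the jump
    lam += (x(t_k) - y_k) at each measurement time t_k, and
    dJ/dp_i = integral_0^T lam_i(t) * x_i(t) dt.
    """
    def rhs(t, z):                         # z = (lam, running integral G)
        lam = z[:n]
        return np.concatenate([-p * lam, lam * forward(t, p)])

    z = np.zeros(2 * n)
    for k in range(len(t_obs) - 1, -1, -1):
        z[:n] += forward(t_obs[k], p) - y[k]      # jump at t_k
        t_lo = t_obs[k - 1] if k > 0 else 0.0
        z = solve_ivp(rhs, (t_obs[k], t_lo), z,
                      rtol=1e-10, atol=1e-12).y[:, -1]
    return -z[n:]                          # backward run accumulates -integral

# Sanity check against the closed-form gradient sum_k r_ki * x0_i t_k e^{p_i t_k}
p = p_true + 0.05
r = forward(t_obs[:, None], p) - y
g_exact = np.sum(r * x0 * t_obs[:, None] * np.exp(np.outer(t_obs, p)), axis=0)
print(np.max(np.abs(grad_adjoint(p) - g_exact)))
```

Note that one backward adjoint solve yields the full gradient, regardless of the number of parameters; this is the source of the speed-ups reported below.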

3 Study outcomes

3.1 Outcome O1

When using ASM to compute the gradient of linear (and diagonal) ODE models, the speed-up grows linearly with the number of states, which here equals the number of parameters. Hence, the higher the dimensionality of the problem, the more beneficial ASM is.
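For context, the SE baseline against which ASM is measured augments the forward ODE with one sensitivity state per (state, parameter) pair, which is what drives its cost up with dimensionality. A minimal sketch with illustrative values of my own choosing (for the diagonal model only the diagonal sensitivities survive):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sensitivity-equation (SE) gradient for dx_i/dt = p_i x_i. For this diagonal
# model only the diagonal sensitivities s_i = dx_i/dp_i are nonzero; a general
# n-state, m-parameter model needs n*m extra forward states.
n = 3
p = np.array([-0.3, -0.6, -0.9])           # illustrative parameters
x0 = np.ones(n)
t_obs = np.linspace(1.0, 10.0, 6)
y = 0.9 * x0 * np.exp(np.outer(t_obs, p))  # hypothetical data

def rhs(t, z):
    x, s = z[:n], z[n:]
    # ds/dt = (df/dx) s + df/dp; here df/dx = diag(p) and df/dp_i = x_i
    return np.concatenate([p * x, p * s + x])

sol = solve_ivp(rhs, (0.0, t_obs[-1]), np.concatenate([x0, np.zeros(n)]),
                t_eval=t_obs, rtol=1e-10, atol=1e-12)
x_k, s_k = sol.y[:n].T, sol.y[n:].T
grad = np.sum((x_k - y) * s_k, axis=0)     # dJ/dp_i = sum_k r_ki * s_i(t_k)

# Sanity check against the closed form s_i(t) = x0_i * t * exp(p_i t)
g_exact = np.sum((x_k - y) * x0 * t_obs[:, None] * np.exp(np.outer(t_obs, p)),
                 axis=0)
print(np.max(np.abs(grad - g_exact)))
```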

Outcome O1 is presented as Figure 1 in the original publication.

3.2 Outcome O2

When using ASM to compute the gradient of linear (and diagonal) ODE models, the acceleration of ASM over SE declines exponentially with the number of data points. Hence, the more observations there are, the less beneficial ASM is. For the maximal number of data points investigated here, both procedures are equally fast.

Outcome O2 is presented as Figure 2 in the original publication.

3.3 Outcome O3

When using ASM to compute the Hessian of linear (and diagonal) ODE models, the speed-up grows linearly with the number of states, which here equals the number of parameters. Hence, the higher the dimensionality of the problem, the more beneficial the adjoints are. The FD-based adjoint variant is slightly faster than the SE-based one.

The accuracy of adjoint FD is not as high as that of adjoint SE, but it is usually sufficient.
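My reading of "adjoint FD" is Hessian information obtained by finite differences of adjoint-computed gradients. In that spirit, a Hessian-vector product can be sketched generically (the helper name and the quadratic toy objective below are mine, not the paper's):

```python
import numpy as np

# Hessian-vector product by central differences of a gradient routine.
# If grad(p) comes from an adjoint solve, each product costs two gradient
# evaluations, independent of the number of parameters.
def hessvec(grad, p, v, h=1e-5):
    return (grad(p + h * v) - grad(p - h * v)) / (2.0 * h)

# Toy check: J(p) = 0.5 * p @ A @ p has grad(p) = A @ p and Hessian A,
# so hessvec should reproduce A @ v (here exactly, since grad is linear).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad = lambda p: A @ p
v = np.array([1.0, -1.0])
print(hessvec(grad, np.zeros(2), v))       # approximates A @ v
```

The step size h trades truncation error against round-off, which is consistent with adjoint FD being slightly less accurate than adjoint SE.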

Outcome O3 is presented as Figure 3 in the original publication.

3.4 Outcome O4

When using ASM to compute the gradient of one specific nonlinear ODE model, the acceleration factor of ASM over SE declines exponentially with the number of data points. At roughly n = 6 data points, the two methods are equally fast; for smaller n, ASM is preferable, and for larger n, SE is preferable.

Outcome O4 is presented as Figure 4 in the original publication.

3.5 Outcome O5

When using ASM to compute the Hessian of one specific nonlinear ODE model, the acceleration factor of ASM over both SE and FD declines exponentially with the number of data points. Adjoint FD is faster than adjoint SE and is therefore preferable. At roughly n = 7 data points, adjoint FD and plain FD are equally fast; for smaller n, adjoint FD is preferable, and for larger n, FD is preferable.

Outcome O5 is presented as Figure 5 in the original publication.

3.6 Further outcomes

1. The SE implementation is so efficient that, together with its superior accuracy, it renders the finite-difference approximation practically obsolete.
2. The efficiency of ASM depends on the number of measurement times, which is not the case for the SE approach.
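For reference, the finite-difference baseline these outcomes refer to can be sketched generically (the helper and the toy objective below are illustrative, not the paper's code). A central-difference gradient needs 2m objective evaluations, i.e. 2m full forward ODE solves for m parameters, which together with its limited accuracy explains outcome 1:

```python
import numpy as np

# Central finite-difference gradient: 2m objective evaluations,
# i.e. 2m full forward solves for m parameters.
def fd_grad(J, p, h=1e-6):
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (J(p + e) - J(p - e)) / (2.0 * h)
    return g

# Toy misfit for the scalar model dx/dt = p x with x(0) = 1
t = np.linspace(1.0, 10.0, 6)
y = np.exp(-0.5 * t)                       # hypothetical data
J = lambda q: 0.5 * np.sum((np.exp(q[0] * t) - y) ** 2)

g_fd = fd_grad(J, np.array([-0.4]))
g_exact = np.sum((np.exp(-0.4 * t) - y) * t * np.exp(-0.4 * t))
print(abs(g_fd[0] - g_exact))
```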

4 Study design and evidence level

4.1 General aspects

• They use the CVODES solver from the SUNDIALS suite.
• As a modelling toolbox, they use DiffMEM, which originates from mixed-effects modelling. It is a C library with R and Python interfaces.
• The accuracies (tolerances) of the ODE solver are reported and claimed to be tight enough to have no effect on the results.
• The benchmarks cover the special case of a linear and diagonal system; how the results translate to non-diagonal linear systems remains open.

4.2 Design for Outcome O1

• ODE models with a linear and diagonal RHS (so the number of states equals the number of parameters) are randomly generated with parameters −1.1 < pi < −0.1.
• The dimensionality of the problem is varied between 2 and 122, over 13 different values.
• Synthetic data: 11 equidistant measurements between 0 and 100.
• 100 repetitions to estimate variance of the performance.
• Gaussian noise with variance 1% of maximum prediction.
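The data-generation recipe above can be sketched as follows (my reading of the bullets; the seed, the chosen dimension, and the initial condition x0 = 1 are my own assumptions):

```python
import numpy as np

# Synthetic-data setup for O1 as I read it: random diagonal decay rates,
# 11 equidistant measurement times on [0, 100], and Gaussian noise whose
# variance equals 1% of the maximum prediction. x0 = 1 is an assumption.
rng = np.random.default_rng(42)
n = 50                                   # one of the 13 dimensions in 2..122
p = rng.uniform(-1.1, -0.1, n)           # random parameters p_i
t = np.linspace(0.0, 100.0, 11)          # 11 equidistant measurement times
x = np.exp(np.outer(t, p))               # predictions, shape (11, n)
sigma = np.sqrt(0.01 * x.max())          # noise std from 1%-of-max variance
y = x + rng.normal(0.0, sigma, x.shape)  # noisy observations
```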

4.3 Design for Outcome O2

• ODE models with a linear and diagonal RHS (so the number of states equals the number of parameters) are randomly generated with parameters −1.1 < pi < −0.1.
• Dimensionality of problem is fixed to 50 dimensions.
• Synthetic data: the number of observation times is varied between 2 and 122 in 12 steps, with equidistant measurements between 0 and 100.
• 100 repetitions to estimate variance of the performance.
• Gaussian noise with variance 1% of maximum prediction.

4.4 Design for Outcome O3

Essentially the same as the design for Outcome O1.

4.5 Design for Outcome O4

• Latent dynamic HIV model from Lavielle et al. (2011). It consists of 6 states, 11 parameters and 2 observables.
• Parameters are randomly perturbed within a 5% deviation.
• Other procedures are identical to O2, i.e. the number of observations is varied from 2 to 122.

4.6 Design for Outcome O5

• Uses the model from the design for Outcome O4 and the study setup from the design for Outcome O2.