Theory
This page provides an overview of the theory behind BrainTrak. It is best read in conjunction with the theory paper (link), which is the paper that the equation numbers on this page refer to.
The basic idea behind curve fitting is to minimize the difference between the experimental data and the model fit. This difference is quantified by Eq. (27) as a sum of squared differences. The chisq statistic is zero if the fit is identical to the data, and it grows as the fit becomes increasingly different from the data. We want to choose the parameters that minimize chisq, and thus minimize the difference between the data and the fit.
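As a concrete sketch, chisq for an unweighted sum of squared differences could be computed as follows (the names `data`, `fit`, and `chisq` are illustrative, and Eq. (27) in the paper may include additional weighting):

```python
import numpy as np

def chisq(data, fit):
    # Sum of squared differences between experimental data and model fit;
    # zero for a perfect fit, larger as the fit departs from the data
    return np.sum((np.asarray(data) - np.asarray(fit)) ** 2)
```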
Traditional optimization methods work with chisq directly and seek to minimize its value. BrainTrak uses a somewhat different approach. Instead of using chisq directly, we consider the 'likelihood' of the parameters by using exp(-chisq/2) as per Eq. (30). This measure is maximized when chisq is zero (a perfect fit) and approaches zero as the fit worsens. Critically, unlike chisq itself, the likelihood is bounded between 0 and 1. The likelihood is defined for all possible model parameters.
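Continuing the sketch above, the likelihood of Eq. (30) follows directly from chisq:

```python
import numpy as np

def likelihood(data, fit):
    # exp(-chisq/2) as per Eq. (30): equals 1 for a perfect fit
    # and decays towards 0 as chisq grows
    return np.exp(-chisq(data, fit) / 2)
```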
We can then interpret the likelihood as a probability distribution. However, the normalization is unknown, because the integral of Eq. (30) over all parameters will almost certainly not be 1. In addition, we do not know the analytic form of L(x); we can only evaluate it via chisq for individual sets of parameters. Both of these problems can be solved by using the Metropolis-Hastings algorithm, a form of Markov Chain Monte Carlo (MCMC) sampling. For our purposes, the role of MCMC sampling is to draw samples from an unknown probability distribution, and that distribution does not need to be normalized. The MCMC algorithm thus generates more samples where the probability is high and fewer samples where it is low, so the distribution of the sampled points matches the unknown distribution. In our case, the unknown distribution is Eq. (30). The MCMC algorithm returns many thousands of parameter combinations: many parameter sets where the fit is good, and fewer where the fit is poor. The distribution in Eq. (30) can therefore be estimated from the distribution of the sampled parameters. Recall that this distribution corresponds to the goodness of fit, and thus gives us information about the quality of the fit and the uncertainty in the fitted parameters.
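To make the algorithm concrete, here is a minimal random-walk Metropolis-Hastings sketch; BrainTrak's actual proposal distribution and tuning differ, and all names here are illustrative. Because only the ratio of likelihoods enters the acceptance test, the unknown normalization cancels:

```python
import numpy as np

def metropolis_hastings(log_likelihood, x0, n_samples, step=0.1, seed=0):
    # Draws samples from an unnormalized distribution; here
    # log_likelihood(x) plays the role of -chisq(x)/2 from Eq. (30)
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    ll = log_likelihood(x)
    samples = []
    for _ in range(n_samples):
        x_new = x + step * rng.standard_normal(x.shape)  # symmetric proposal
        ll_new = log_likelihood(x_new)
        # Accept with probability min(1, L(x_new)/L(x));
        # the normalization constant cancels in this ratio
        if np.log(rng.random()) < ll_new - ll:
            x, ll = x_new, ll_new
        samples.append(x.copy())
    return np.array(samples)
```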
The distribution in Eq. (30) is a joint distribution over all of the model parameters. That is, it reflects all of the correlations and interdependencies among the parameters. We can construct marginal distributions for individual parameters by integrating the joint distribution over all of the 'nuisance' parameters (that is, all of the parameters we are not interested in). In the figure below, the samples are the black dots, one contour of the joint distribution is shown in green, and the marginal distributions for each of the individual parameters are shown in red and blue. BrainTrak returns the samples themselves, which means that the joint distribution, or any of the possible marginal distributions, can be computed, as sketched below.
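With samples in hand, marginalization reduces to ignoring the columns you are not interested in. A toy sketch using the `metropolis_hastings` function above, where a correlated Gaussian merely stands in for Eq. (30):

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy two-parameter joint distribution standing in for exp(-chisq/2)
cov_inv = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))
log_l = lambda x: -0.5 * x @ cov_inv @ x

samples = metropolis_hastings(log_l, x0=[0.0, 0.0], n_samples=50000)

# Marginal for the first parameter: histogram its column, which
# effectively integrates the joint distribution over the other parameter
plt.hist(samples[:, 0], bins=100, density=True)
plt.xlabel('parameter 1')
plt.ylabel('marginal density')
plt.show()
```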
ML vs MAP