-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v1.1] Add support for errors-in-variables #19
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
{ | ||
"name": "ml-levenberg-marquardt", | ||
"version": "1.0.3", | ||
"version": "1.1.0", | ||
"description": "Curve fitting method in javascript", | ||
"main": "./lib/index.js", | ||
"module": "./src/index.js", | ||
|
@@ -12,7 +12,7 @@ | |
"scripts": { | ||
"eslint": "eslint src", | ||
"eslint-fix": "npm run eslint -- --fix", | ||
"prepare": "rollup -c", | ||
"preparea": "rollup -c", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was this intended? I indeed have no idea about what it should do, but it seems a lot like a typo to me 😀 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Haha, it was a temporary rename because There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So yes, typo. You can push up a fix. |
||
"test": "npm run test-only && npm run eslint", | ||
"test-coverage": "npm run test-only -- --coverage", | ||
"test-only": "jest", | ||
|
@@ -49,6 +49,10 @@ | |
"rollup": "^0.66.6" | ||
}, | ||
"dependencies": { | ||
"ml-matrix": "^5.2.0" | ||
"ml-matrix": "^5.2.0", | ||
"norm-dist": "^2.0.1" | ||
}, | ||
"jest": { | ||
"testURL": "http://localhost" | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,10 @@ | ||
import errorCalculation from '../errorCalculation'; | ||
import * as errCalc from '../errorCalculation'; | ||
|
||
function sinFunction([a, b]) { | ||
return (t) => a * Math.sin(b * t); | ||
} | ||
|
||
describe('errorCalculation test', () => { | ||
describe('sum of residuals', () => { | ||
it('Simple case', () => { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we simplify this test by making the const sampleFunction = ([k]) => (x => k); // resulting function returns constant
const n = 3;
const xs = new Array(n).fill(0).map((zero, i) => i); // [0, 1, ..., n-1] but doesn't matter because f(x; [k]) -> k
const data = {
x: xs,
y: xs.map(sampleFunction([0])) // all zeros
};
expect(errCalc.sumOfResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfResiduals(data, [2], sampleFunction)).toBeCloseTo(2*n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfSquaredResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [2], sampleFunction)).toBeCloseTo(Math.pow(2*n, 2), 1); There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, a simpler function could be used. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I realize now that all your change really does here is update the name. The simplification of the test function is outside the scope of this PR. I've attempted to address this in #26, so after that merges this PR can be rebased to resolve this. |
||
const len = 20; | ||
let data = { | ||
|
@@ -17,7 +17,32 @@ describe('errorCalculation test', () => { | |
data.y[i] = sampleFunction(i); | ||
} | ||
|
||
expect(errorCalculation(data, [2, 2], sinFunction)).toBeCloseTo(0, 3); | ||
expect(errorCalculation(data, [4, 4], sinFunction)).toBeCloseTo(48.7, 1); | ||
expect(errCalc.sumOfResiduals(data, [2, 2], sinFunction)).toBeCloseTo(0, 5); | ||
expect(errCalc.sumOfResiduals(data, [4, 4], sinFunction)).toBeCloseTo(48.7, 1); | ||
|
||
expect(errCalc.sumOfSquaredResiduals(data, [2, 2], sinFunction)).toBeCloseTo(0, 5); | ||
expect(errCalc.sumOfSquaredResiduals(data, [4, 4], sinFunction)).toBeCloseTo(165.6, 1); | ||
}); | ||
}); | ||
|
||
describe('error propagation estimator', () => { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @m93a Could you please help me understand these test values? When I look at them it is not at all obvious to me what the correct values should be. (Partly because I am not sure how There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The This means that if we slice the interval of all variable's possible values to eg. 10 equiprobable intervals, there's the same probability that the measured value will be in the first interval, as there is probability that it will be in the second, third, etc. interval. For practical purposes, I don't really count with all of the possible values, I only use values between x̅ − 2.25σ and x̅ + 2.25σ. (There's 97.6 % probability that the measured value will be between those values). This process is not pseudo-random, but completely deterministic – the intervals are computed using the CDF of normal distribution. Then, the functional values on the boundaries of the intervals are computed. This way the function can be approximated as a linear function on each interval. When a variable “passes through” a linear function, its standard error gets “stretched” (ie. multiplied) by the slope of the function. If a variable has the same probability of passing through multiple linear functions, its standard error (on average) gets stretched by the average of the slopes of those linear functions. Of course the error distribution of the functional value most probably won't be a Gauss curve, even if the input variable was normally distributed in the first place. But as our fitting algorithm can't deal with those either (ie. it's not robust), it's not a big deal. We can simply pretend that all of the resulting distributions are close enough to the normal distribution (which is mostly true as long as the errors are small). (An uncanny fact as a bonus: if the error is large enough and the function is very unsymmetrical, the mean of the functional value won't be equal to the functional value of the mean. For now let's just pretend those cases never occur.) Now about the test values. You're right that the values are not exactly obvious. The results should match the standard deviation of normally distributed data that passed through the function. One could use Monte Carlo to get the right value. // Pseudo-code to validate the results of errorPropagation
const mean = 42;
const sigma = 0.1;
let data = Array(BILLIONS_AND_BILLIONS)
data = data.map(() => normal.random(mean, sigma));
data = data.map( x => fn(x) );
const propagatedSigma = stdev(data);
expect(errCalc.errorPropagation(fn, mean, sigma))
.toBeCloseTo(propagatedSigma, DEPENDS_ON_NUMBER_OF_INTERVALS); Wow, this was really time-consuming, I'm taking a break for today. I've got other things to do 😉 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the detailed explanation! I plan to read through it a few times and attempt to make the test cases more expressive once I understand it. |
||
it('Simple case', () => { | ||
const q = Math.PI / 2; | ||
|
||
errCalc.initializeErrorPropagation(10); | ||
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.001)).toBeCloseTo(0.016, 5); | ||
expect(errCalc.errorPropagation(sinFunction([1, 1]), q, 0.010)).toBeCloseTo(0.000, 5); | ||
|
||
errCalc.initializeErrorPropagation(50); | ||
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 1); | ||
|
||
errCalc.initializeErrorPropagation(100); | ||
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 2); | ||
|
||
errCalc.initializeErrorPropagation(1000); | ||
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 4); | ||
|
||
errCalc.initializeErrorPropagation(10000); | ||
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 6); | ||
}); | ||
}); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,151 @@ | ||
import nd from 'norm-dist'; | ||
|
||
/** | ||
* @private | ||
* @param {number} x | ||
* @return {number} | ||
*/ | ||
const sq = (x) => x * x; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FYI, this isn't really any faster than There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I doubt that the code in the perf actually does anything. Since you're effectively throwing away the result right after the computation, the interpreter probably optimizes that part out and just iterates You're probably right about this function not adding very much to performance. If you don't like it, just remove it, but it seems quite readable to me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess what I'm trying to say is that if you're doing it for performance reasons then it's probably better to remove it. If you think it is more readable I can see why that might be a good reason. However, I think it's hard to argue that a custom function that is identical to a built-in one is easier to read (because one has to look it up to see what it does rather than just recognize it), though in this case it matters little either way since the function is so simple. You are probably right about jsPerf...sometimes it's too easy to fool oneself. Here's an updated example adapted to reduce the likelihood of this kind of mistake. It still shows less than 1% difference between the two methods of interest. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're right, the performance differences are negligible. I would choose |
||
|
||
/** | ||
* @private | ||
* @param {...number} n | ||
* @return {boolean} | ||
*/ | ||
const isNaN = (...n) => n.some((x) => Number.isNaN(x)); | ||
|
||
/** | ||
* Compute equiprobable ranges in normal distribution | ||
* @private | ||
* @param {number} iterations - How many evaluations (per point per step) of the fitted function to use to approximate the error propagation through it | ||
*/ | ||
export function initializeErrorPropagation(iterations) { | ||
if (equiprobableStops.length === iterations) return; | ||
|
||
const min = nd.cdf(-2.25); | ||
const max = 1 - min; | ||
const step = (max - min) / (iterations - 1); | ||
|
||
for (var i = 0, x = min; i < iterations; i++, x += step) { | ||
equiprobableStops[i] = nd.icdf(x); | ||
} | ||
} | ||
|
||
/** @type {number[]} */ | ||
const equiprobableStops = []; | ||
|
||
/** | ||
* Estimate error propagation through a function | ||
* @private | ||
* @param {(x: number) => number} fn - The function to approximate, outside of the function domain return `NaN` | ||
* @param {number} x - The error propagation will be approximate in the neighbourhood of this point | ||
* @param {number} xSigma - The standard deviation of `x` | ||
* @return {number} The estimated standard deviation of `fn(x)` | ||
*/ | ||
export function errorPropagation(fn, x, xSigma) { | ||
const stopCount = equiprobableStops.length; | ||
|
||
var slope = 0; | ||
var N = 0; | ||
|
||
var xLast = x + xSigma * equiprobableStops[0]; | ||
var yLast = fn(xLast); | ||
|
||
for (var stop = 1; stop < stopCount; stop++) { | ||
var xNew = x + xSigma * equiprobableStops[stop]; | ||
var yNew = fn(xNew); | ||
|
||
if (!isNaN(xNew, yNew, xLast, yLast)) { | ||
slope += (yNew - yLast) / (xNew - xLast); | ||
N++; | ||
} | ||
|
||
xLast = xNew; | ||
yLast = yNew; | ||
} | ||
|
||
const avgSlope = slope / N; | ||
|
||
return Math.abs(avgSlope * xSigma); | ||
} | ||
|
||
/** | ||
* Calculate current error | ||
* Approximate errors in point location and calculate point weights | ||
* @ignore | ||
* @param {{x:Array<number>, y:Array<number>, xError:Array<number>|void, yError:Array<number>|void}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ] | ||
* @param {Array<number>} params - Array of parameter values | ||
* @param {(n: number[]) => (x: number) => number} paramFunction - Takes the parameters and returns a function with the independent variable as a parameter | ||
* @return {Array<number>} Array of point weights | ||
*/ | ||
export function pointWeights(data, params, paramFunction) { | ||
const m = data.x.length; | ||
|
||
/** @type {Array<number>} */ | ||
const errs = new Array(m); | ||
|
||
if (!data.xError) { | ||
if (!data.yError) { | ||
return errs.fill(1); | ||
} else { | ||
errs.splice(0, m, ...data.yError); | ||
} | ||
} else { | ||
const fn = paramFunction(params); | ||
var point; | ||
|
||
for (point = 0; point < m; point++) { | ||
errs[point] = errorPropagation(fn, data.x[point], data.xError[point]); | ||
} | ||
|
||
if (data.yError) { | ||
for (point = 0; point < m; point++) { | ||
errs[point] = Math.sqrt(sq(errs[point]) + sq(data.yError[point])); | ||
} | ||
} | ||
} | ||
|
||
// Point weight is the reciprocal of its error | ||
for (point = 0; point < m; point++) { | ||
errs[point] = 1 / errs[point]; | ||
} | ||
|
||
return errs; | ||
} | ||
|
||
/** | ||
* Calculate the current sum of residuals | ||
* @ignore | ||
* @param {{x:Array<number>, y:Array<number>}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ] | ||
* @param {Array<number>} parameters - Array of current parameter values | ||
* @param {function} parameterizedFunction - The parameters and returns a function with the independent variable as a parameter | ||
* @param {(...n: number[]) => (x: number) => number} paramFunction - The parameters and returns a function with the independent variable as a parameter | ||
* @return {number} | ||
*/ | ||
export default function errorCalculation( | ||
data, | ||
parameters, | ||
parameterizedFunction | ||
) { | ||
export function sumOfResiduals(data, parameters, paramFunction) { | ||
var error = 0; | ||
const func = parameterizedFunction(parameters); | ||
const func = paramFunction(parameters); | ||
|
||
for (var i = 0; i < data.x.length; i++) { | ||
error += Math.abs(data.y[i] - func(data.x[i])); | ||
} | ||
|
||
return error; | ||
} | ||
|
||
/** | ||
* Calculate the current sum of squares of residuals | ||
* @ignore | ||
* @param {{x:Array<number>, y:Array<number>}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ] | ||
* @param {Array<number>} parameters - Array of current parameter values | ||
* @param {(...n: number[]) => (x: number) => number} paramFunction - The parameters and returns a function with the independent variable as a parameter | ||
* @return {number} | ||
*/ | ||
export function sumOfSquaredResiduals(data, parameters, paramFunction) { | ||
var error = 0; | ||
const func = paramFunction(parameters); | ||
|
||
for (var i = 0; i < data.x.length; i++) { | ||
error += sq(data.y[i] - func(data.x[i])); | ||
} | ||
|
||
return error; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's customary to separate release bump commits from other work (some package maintainers get really peeved about this). @targos Are there strong opinions about this for this project?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we use a tool that bumps the version number automatically (based on the commit messages) when we publish the release