Cyclic/Circular block bootstrapping - refactor #522

nikeethr · 2024-06-12T00:59:06Z

Update - 29/01/2025

This issue has morphed slightly into a refactoring follow-up of #418, now that xbootstrap has been ported into scores and an alternate implementation is no longer needed as an option.

However, the internal functions still need to conform to the styles and patterns used in other scores metrics. In particular, I believe a lot of the "nested" logic can be replaced with a more readable and extensible design (not trivial, but there are elegant ways to do this).

See: #522 (comment) for further comments

~~I would like the following data processing tool to be considered for addition to the scores repository~~

I want to add an emerging version of block bootstrapping.

A common implementation used by scientific users comes from the xbootstrap package. An initial review of the code (albeit by myself) found that while the functionality seems to be correct, it has some aspects that I'm unsure about or that are hard to verify, which are explained in #418. Shaping it to conform to the coding standards in scores can be tricky due to the original implementation's reliance on multiple coding paradigms.

Furthermore, any updates, bugs, or API incompatibilities related to the original implementation will have to be tracked and ported across.

I think it may be good to have an "in-house"/redesigned version. This version would be more in line with our code design paradigms, and would hopefully improve maintainability and extensibility.

Please provide a reference data processing tool

See:

PR to port original tool: 114 add circular block bootstrapping #418 (note: currently houses both xbootstrap and a sample emerging implementation. The latter will be ported out to a fork, and will be the implementation for this issue.)

original tool: https://github.com/dougiesquire/xbootstrap/blob/main/xbootstrap/core.py

reference: Wilks, Daniel S. Statistical Methods in the Atmospheric Sciences. Vol. 100. Academic Press, 2011.

Note

The concept of block bootstrapping isn't in itself that hard. It basically involves sampling a block of data per iteration, rather than a single point, and reshaping the sampled blocks to fit the original dataset, potentially stacked over several iterations. This is done to address correlations between samples that affect various statistical estimators, e.g. MAE.
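To make the idea concrete, here is a minimal 1-D sketch using NumPy. It is not the xbootstrap or scores implementation; the function name `circular_block_bootstrap_1d` and its parameters are hypothetical, chosen only for illustration. It assumes blocks may start anywhere and wrap around the end of the series.

```python
import numpy as np


def circular_block_bootstrap_1d(values, block_size, rng):
    """Hypothetical helper: resample a 1-D array in blocks with wrap-around."""
    values = np.asarray(values)
    n = values.shape[0]
    # Number of blocks needed to cover the original length (ceiling division).
    num_blocks = -(-n // block_size)
    # One random start index per block, anywhere in the series.
    starts = rng.integers(0, n, size=num_blocks)
    # Indices of every element in every block; the modulo wraps past the end.
    indices = (starts[:, None] + np.arange(block_size)[None, :]) % n
    # Flatten the blocks and trim the overshoot so the result fits the input.
    return values[indices.ravel()[:n]]


rng = np.random.default_rng(seed=0)
data = np.arange(10)
# Stack several bootstrap iterations along a new leading axis.
samples = np.stack(
    [circular_block_bootstrap_1d(data, block_size=3, rng=rng) for _ in range(5)]
)
```

Each row of `samples` has the same length as `data`, and an estimator (e.g. a mean absolute error) can then be computed per row to get a spread of values.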

This is readily available in (in-built?) R packages. The trickiness comes from having to deal efficiently with an arbitrary number of axes (nd-arrays), and with block sizes that don't fill the nd-axes tightly. The solution is actually fairly straightforward with recursive algorithms/functional programming, but things can become verbose in an iterative implementation.
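As a rough illustration of how the nd case can avoid deeply nested loops, the sketch below builds cyclic block indices independently per axis and combines them with `np.ix_`. This is a simplifying assumption (blocks are the Cartesian product of per-axis blocks), not necessarily how xbootstrap or scores samples blocks, and the names `cyclic_block_indices` and `cyclic_block_bootstrap_nd` are hypothetical.

```python
import numpy as np


def cyclic_block_indices(axis_len, block_size, rng):
    """Hypothetical helper: cyclic block indices covering one axis."""
    # Ceiling division handles block sizes that do not divide the axis evenly.
    num_blocks = -(-axis_len // block_size)
    starts = rng.integers(0, axis_len, size=num_blocks)
    idx = (starts[:, None] + np.arange(block_size)[None, :]) % axis_len
    # Trim the final, partially used block so the axis length is preserved.
    return idx.ravel()[:axis_len]


def cyclic_block_bootstrap_nd(arr, block_sizes, rng):
    """Hypothetical helper: resample every axis of an nd-array in blocks."""
    arr = np.asarray(arr)
    if len(block_sizes) != arr.ndim:
        raise ValueError("need exactly one block size per axis")
    # One comprehension over axes replaces a loop nested once per dimension.
    per_axis = [
        cyclic_block_indices(n, b, rng) for n, b in zip(arr.shape, block_sizes)
    ]
    # np.ix_ builds an open mesh so the per-axis indices combine as an outer product.
    return arr[np.ix_(*per_axis)]


rng = np.random.default_rng(seed=0)
field = np.arange(4 * 5).reshape(4, 5)
# Block sizes 3 and 2 do not divide the axis lengths 4 and 5 evenly.
resampled = cyclic_block_bootstrap_nd(field, block_sizes=(3, 2), rng=rng)
```

The output keeps the shape of the input regardless of whether the block sizes fill each axis tightly, because the overshoot is trimmed per axis.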

~~There are several tricks that can be used to overcome this, so I would suggest an initial implementation be in the **https://github.com/nci/scores/labels/emerging** space~~

nikeethr · 2024-06-12T01:01:43Z

From a pedantic view, "cyclic" permutation makes more sense than "circular", both mathematically and in an implementation context. Indices can be cyclic, not circular, which tends to be more of a geometric term. Admittedly, though, "circular" is what some reference papers use (in the context of atmospheric science).

I think it's dependent on the target user, but I prefer cyclic.

I can see it being used in a general context as well e.g. in machine learning.

nikeethr · 2025-01-29T11:52:32Z

Update 29/01/2025

I've changed the focus of this issue to instead address refactoring of the implementation in 114 add circular block bootstrapping #418, which is a port of xbootstrap, so that it conforms more closely to other scores metrics in terms of programming patterns.
The implementation in (stale/re-evaluate) Add cyclic block bootstrap - alternate implementation #523 is probably stale and can no longer be used directly, though we can probably cherry-pick some patterns and use them in a refactor.
In hindsight, to counter Cyclic/Circular block bootstrapping - refactor #522 (comment), the term "circular" is probably okay since "circular" buffers are a thing: essentially buffers with cyclic indexing, where it is pictorially easier to visualise what's going on, as if the elements are on a circle. (This does break down for multi-dimensional cyclic indexing, where "circular" is no longer a meaningful representation, but let's not worry about that.)
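For readers unfamiliar with the circular-buffer analogy, a small sketch of cyclic indexing on a 1-D buffer: NumPy's `np.take` with `mode="wrap"` treats out-of-range indices as wrapping around. The buffer contents here are made up for illustration.

```python
import numpy as np

buffer = np.array([10, 11, 12, 13, 14])
# A block of length 4 starting at index 3 runs past the end of the buffer.
block = np.arange(3, 3 + 4)
# mode="wrap" wraps indices 5 and 6 back around to 0 and 1.
wrapped = np.take(buffer, block, mode="wrap")
# wrapped is [13, 14, 10, 11]
```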

nikeethr self-assigned this Jun 12, 2024

nikeethr added the emerging label Jun 12, 2024

nikeethr added this to the Wishlist milestone Jun 12, 2024

nikeethr mentioned this issue Jun 12, 2024

Add circular block bootstrapping #114

Closed

nikeethr added a commit that referenced this issue Jun 12, 2024

[revert] remove emerging implementation, moved to #522

566078d

This was referenced Jun 12, 2024

(stale/re-evaluate) Add cyclic block bootstrap - alternate implementation #523

Draft

114 add circular block bootstrapping #418

Merged

nicholasloveday pushed a commit that referenced this issue Jan 29, 2025

[revert] remove emerging implementation, moved to #522

f00921e

nicholasloveday pushed a commit that referenced this issue Jan 29, 2025

[revert] remove emerging implementation, moved to #522

d87e4db

nikeethr changed the title ~~Cyclic block bootstrapping alternate version~~ Cyclic/Circular block bootstrapping - refactor Jan 29, 2025

nikeethr added refactoring and removed emerging labels Jan 29, 2025

tennlee pushed a commit that referenced this issue Feb 12, 2025

[revert] remove emerging implementation, moved to #522

83fb1ee
