
Checkpoint of Euler spiral based stroke expansion #496

Merged · 17 commits into main · Mar 13, 2024

Conversation

raphlinus (Contributor) commented:

This branch contains an implementation of stroke expansion based on Euler spirals. At this point, the CPU shader implementation (in cpu_shader/flatten.rs) should be reasonably solid, with particular care taken for numerical robustness, but more validation is needed.

In particular, some of the "tricky stroke" cases have a semicircle taken out near the endpoints. This is the expected rendering from slightly perturbing the collinear cubic Bézier, then drawing just the parallel curve and cap. However, it is not strongly correct rendering in the sense of the Nehab 2020 paper "Converting stroked primitives to filled primitives".

The next major step is translation of that logic to WGSL so strokes can be fully evaluated on GPU.

Progress toward #303

This just imports Arman's patch and tries to update it for other changes.
Apply Euler spiral numerical robustness to CPU shader.

There are a number of minor tweaks, and some important changes:

* Cubic segments are computed with points and derivatives, not subdivision
* Those derivatives are adjusted if they are near-zero
* The ESPC integral is robust to small k1 and dist

Note that dealing with small dist is essential for doing ES flattening for fills as well as strokes.
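
As a hedged illustration of the near-zero-derivative adjustment described in the list above, here is a minimal Rust sketch; the function name, epsilon value, and chord-direction fallback are illustrative assumptions, not the actual Vello code.

```rust
/// Hedged sketch of the "adjust near-zero derivatives" idea above; the epsilon
/// and the fallback-to-chord strategy are illustrative assumptions.
fn robust_deriv(qx: f64, qy: f64, chord_x: f64, chord_y: f64) -> (f64, f64) {
    const EPS: f64 = 1e-12;
    let q2 = qx * qx + qy * qy;
    if q2 >= EPS {
        return (qx, qy);
    }
    // Derivative is (near-)zero, e.g. a control point coincides with an
    // endpoint: fall back to a tiny vector along the chord so downstream
    // tangent/normal math stays well defined.
    let c2 = chord_x * chord_x + chord_y * chord_y;
    if c2 >= EPS {
        let scale = EPS.sqrt() / c2.sqrt();
        (chord_x * scale, chord_y * scale)
    } else {
        // Degenerate chord as well; pick an arbitrary fixed direction.
        (EPS.sqrt(), 0.0)
    }
}
```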
This pipes through the start/end points (always using them as endpoints) and always uses Euler flattening.

It seems to basically work for fills, but has lots of artifacts still for strokes.
Drop zero-length lines. This doesn't get rid of all the code, but it shouldn't be needed for correctness.

In addition, zero-length lines are dropped. We need to come back to this; they do have some semantics for strokes.
Reuse endpoints of Euler segments. In theory, this should ensure watertight meshes, but we're definitely not seeing it.
Calculation of end normals for stroke offset was mistaken. At this point, it's starting to look fairly good. Some of the tricky strokes need investigation, and handling of zero-length lines is still a bit dodgy.
Euler flattening seems to be almost working as long as subdivision is disabled, at least for fills. This commit has that disabled.
Simple typo in subdivision logic (and re-enable subdivision, which was disabled in the previous commit).

Sign wrong in inverse integral.

Typo in calculation of start tangent (used p3 twice instead of p2, p3).

Note: also enables GPU-side stroking
Make handling of small tangent consistent. Remove cubic flattening. Make subdivision limiting logic of CPU shader consistent with GPU version. Get rid of unused is_line predicate on CPU side.
(and I also added this to my local pre-push git hook)
raphlinus marked this pull request as ready for review on March 12, 2024, 20:01.
armansito (Collaborator) left a comment:

Great progress and exciting milestone! I left a number of minor suggestions, largely around documentation. This will get better as we work through the writeup soon but I think some things could be explained briefly in the code now.

(Resolved review threads: shader/flatten.wgsl ×2 (outdated), src/cpu_shader/euler.rs.)
```rust
pub fn from_points_derivs(p0: Vec2, p1: Vec2, q0: Vec2, q1: Vec2, dt: f32) -> Self {
    let chord = p1 - p0;
    // Robustness note: we must protect this function from being called when the
    // chord length is (near-)zero.
```
armansito (Collaborator):

Maybe the CPU shaders should assert that the chord length is not less than some epsilon (I'm not sure if Rust panics on a floating-point division by zero).

raphlinus (Contributor, author):

It does not. I'll add an assert and some comments; that should help us debug the case.

And fun! When I added the assertion it got triggered. I'll add the robustness logic for this now.
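
For reference, a small standalone illustration of the behavior discussed above: Rust follows IEEE 754 for floats, so division by zero yields ±inf or NaN rather than panicking, which is why an explicit assertion is useful. The epsilon value below is an assumption.

```rust
fn main() {
    // IEEE 754 semantics: floating-point division by zero does not panic.
    let inf = 1.0_f32 / 0.0; // +inf
    let nan = 0.0_f32 / 0.0; // NaN
    assert!(inf.is_infinite() && nan.is_nan());

    // The kind of guard discussed above (epsilon value is an assumption).
    let (p0, p1) = ((0.0_f32, 0.0_f32), (1.0_f32, 0.0_f32));
    let chord_len = ((p1.0 - p0.0).powi(2) + (p1.1 - p0.1).powi(2)).sqrt();
    debug_assert!(chord_len >= 1e-12, "(near-)zero chord passed to curve fit");
}
```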

```rust
// Rationale: this happens when fitting a cusp or near-cusp with
// a near 180 degree u-turn. The actual ES is bounded in that case.
// Further subdivision won't reduce the angles if actually a cusp.
return 2.0;
```
armansito (Collaborator):

As we discussed briefly, I would elaborate on the choice of 2.0 here: what the expected downstream effect is, and why that is acceptable.

armansito (Collaborator):

Maybe add a short sentence about the effect of 2.0 vs another number like 1.0, e.g. "Returning an error value of 2.0 terminates the subdivisions because so and so".
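
Purely as an illustration of the point being discussed (none of this is Vello's actual code, and the square-root error scaling below is an assumption): a bounded sentinel error such as 2.0 keeps the derived subdivision count finite, whereas an unbounded or NaN estimate at a cusp could request arbitrarily many, or nonsensical, subdivisions.

```rust
// Illustrative only; the error/subdivision scaling here is assumed.
fn subdivision_count(est_err: f64, tol: f64, max_subdiv: u32) -> u32 {
    let n = (est_err / tol).sqrt().ceil();
    if n.is_finite() {
        // A bounded error estimate (e.g. the 2.0 sentinel) lands here and
        // yields a small, finite count.
        (n as u32).clamp(1, max_subdiv)
    } else {
        // An inf/NaN estimate would otherwise be meaningless; clamp hard.
        max_subdiv
    }
}
```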

```rust
    MAX_LINES
} else {
    MAX_LINES.min((std::f32::consts::TAU / theta).ceil() as u32)
};
```
armansito (Collaborator):

This formulation is different from what's in the WGSL (and the removed lines below):

```wgsl
let MIN_THETA = 0.0001;
let tol = 0.1;
let radius = max(tol, length(p0 - transform_apply(transform, center)));
let theta = max(MIN_THETA, 2. * acos(1. - tol / radius));

// Always output at least one line so that we always draw the chord.
let n_lines = max(1u, u32(ceil(angle / theta)));
```

I believe the WGSL version is correct (apart from the tolerance and MIN_THETA values, which need tuning). I would keep the CPU and WGSL versions consistent.

raphlinus (Contributor, author):

I hadn't intended to change this. I'm pretty sure that what happened is that #415 didn't make it into the euler-spiral-strokes branch. I've restored it, and good catch.
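
For reference, a hedged Rust mirror of the WGSL arc-subdivision formula quoted above (the transform applied to the center is omitted, the tolerance and MIN_THETA values are the untuned placeholders from the snippet, and the function name is illustrative, not Vello's API):

```rust
// Illustrative Rust mirror of the WGSL arc flattening quoted above; constants
// and names are assumptions, not the actual Vello CPU shader.
fn arc_line_count(p0: [f32; 2], center: [f32; 2], angle: f32) -> u32 {
    const MIN_THETA: f32 = 0.0001;
    const TOL: f32 = 0.1;
    let dx = p0[0] - center[0];
    let dy = p0[1] - center[1];
    let radius = TOL.max((dx * dx + dy * dy).sqrt());
    // Largest angle a single chord can subtend while keeping its sagitta
    // (flattening error) within TOL: theta = 2 * acos(1 - TOL / radius).
    let theta = MIN_THETA.max(2.0 * (1.0 - TOL / radius).acos());
    // Always output at least one line so that the chord is drawn.
    ((angle / theta).ceil() as u32).max(1)
}
```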

```rust
keep_params[i] = params;
val += params.val;
qp0 = qp2;
// Drop zero length lines.
```
armansito (Collaborator):

Maybe mention here why the exact matches are valid (e.g. any invariants about prior detection of zero-length lines).

```rust
if t0 == 1.0 {
    break;
}
loop {
```
armansito (Collaborator):

Should there be a short explainer here about the derivative threshold, the iterative search to handle cusps, and the 3 cases below?

raphlinus (Contributor, author), Mar 13, 2024:

That feels more like paper than code, but I'll add a sentence each. I'm generally doing more writeup in the Rust version than the WGSL; this is one of the costs of not having a good enough shader language to also handle CPU.

armansito (Collaborator):

I agree. I think the paper will be the in-depth reference for all of this BUT I would like the code to eventually contain explainers and relevant citations as much as possible. I think the comments can be improved but I'm OK with the current state of things.

```rust
val: f32,
a0: f32,
a2: f32,
// TODO: remove this
```
armansito (Collaborator):

Remove uses of log!? Or should we leave them in as a no-op?

P.S.: It would be nice to have a debug macro that's optionally enabled for this purpose in general.

raphlinus (Contributor, author):

I'm OK leaving them in for the time being; I found it useful while developing to be able to turn logging on. I agree that a more sophisticated log story would be nice.
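
As a hedged sketch of the kind of optionally enabled debug macro mentioned above (the `debug_log!` name and the const-flag gating are assumptions, not an agreed design):

```rust
// Sketch of an optionally enabled debug log macro; name and gating are
// illustrative assumptions.
const DEBUG_LOGGING: bool = false;

macro_rules! debug_log {
    ($($arg:tt)*) => {
        if DEBUG_LOGGING {
            println!($($arg)*);
        }
    };
}

fn main() {
    // Branches on a const, so it is effectively free when the flag is off,
    // while the format arguments still get type-checked.
    debug_log!("espc params: k1 = {}, dist = {}", 0.25, 1e-3);
}
```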

```diff
@@ -116,7 +116,7 @@ impl Scene {
     const SHAPE_TOLERANCE: f64 = 0.01;
     const STROKE_TOLERANCE: f64 = SHAPE_TOLERANCE;
 
-    const GPU_STROKES: bool = false; // Set this to `true` to enable GPU-side stroking
+    const GPU_STROKES: bool = true; // Set this to `true` to enable GPU-side stroking
```
armansito (Collaborator):

🎉

Mostly doc improvements, but also robustness improvements for short chords.
raphlinus (Contributor, author):

Ok, I think I've addressed the review comments, but I may have missed one.

Remaining issues:

  • investigate why longpathdash is taking more memory
  • revisit tolerance tuning parameters

These two items may not be completely separate :)

I have validated that the tolerance value for both arcs and Bézier segments is calibrated correctly. Thus, it makes sense to set both to a value of 0.25, so that caps are consistent. This comes quite close to matching the amount of subdivision in current main, and the line count is also consistent between CPU and GPU stroke expansion.
Oops, I made the change only on the CPU side. This brings GPU into agreement.
armansito (Collaborator) left a comment:

I looked over the follow-up changes. LGTM % minor suggestions.

```rust
/// TODO: investigate needed accuracy. We might be able to get away
/// with 8th order.
/// This is a 10th order polynomial. The evaluation method is explained in
/// Raph's thesis in section 8.1.2.
```
armansito (Collaborator):

👍
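
For readers unfamiliar with the technique, a generic Horner-style polynomial evaluation is sketched below; the coefficients are placeholders, and the actual 10th-order approximation and its derivation are in the thesis section cited above.

```rust
// Generic Horner evaluation; coefficients are placeholders, not the actual
// approximation referenced in the doc comment above.
fn poly_eval(x: f64, coeffs: &[f64]) -> f64 {
    // coeffs[0] is the highest-order coefficient.
    coeffs.iter().fold(0.0, |acc, &c| acc * x + c)
}

fn main() {
    // Evaluates 2x^2 + 3x + 1 at x = 0.5.
    assert_eq!(poly_eval(0.5, &[2.0, 3.0, 1.0]), 3.0);
}
```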


```rust
    return 2.0;
}
let e0 = (2. / 3.) / (1.0 + cth0);
let e1 = (2. / 3.) / (1.0 + cth1);
```
armansito (Collaborator):

Consider adding assertions here that cth0 and cth1 won't cause divisions by zero.
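
A hedged sketch of the suggested guards; the epsilon and the wrapper function are illustrative, and the divide-by-zero case is cth0 or cth1 reaching -1, which matches the collinear double-cusp case mentioned in the follow-up commit below.

```rust
// Sketch of the suggested assertions; epsilon and the function name are
// illustrative assumptions, not Vello's code.
fn checked_cubic_coeffs(cth0: f32, cth1: f32) -> (f32, f32) {
    const EPS: f32 = 1e-6;
    debug_assert!(1.0 + cth0 > EPS, "cth0 too close to -1; would divide by zero");
    debug_assert!(1.0 + cth1 > EPS, "cth1 too close to -1; would divide by zero");
    let e0 = (2. / 3.) / (1.0 + cth0);
    let e1 = (2. / 3.) / (1.0 + cth1);
    (e0, e1)
}
```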


A bit more comment, but also a more robust protection against a divide by zero, which happens in the collinear double-cusp case, as exercised by tricky-strokes.
raphlinus added this pull request to the merge queue on Mar 13, 2024.
Merged via the queue into main with commit fe3273a on Mar 13, 2024; 9 checks passed.
raphlinus deleted the euler2 branch on March 14, 2024.