Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSS: parse/skip at-rules, support @media, @supports #454

Merged
merged 10 commits into from
Sep 9, 2021

Conversation

poire-z
Copy link
Contributor

@poire-z poire-z commented Sep 6, 2021

#### (Upstream) Minimal support for zip64 postponing this one for another PR

This is enough to open zip archives larger than 4GB.
From buggins/coolreader#310 - not sure we're using it as it depends on a #define - a bit of refacto to bits of code we use, hope it's fine.

XML parsing: strip leading \n in PRE and TEXTAREA

See #345 (comment) for details.
"For historical reasons, in the HTML syntax, a leading newline character immediately following the pre element start tag is stripped."
Bump DOM version as this can shift highlight xpointers indexes by 1. (It may still cause a shift for very old opened books that will get migrated from that very old DOM version to this one... but well, who highlights in PRE ?!! Or I could do this inconditionnally:
https://github.com/koreader/koreader/blob/ab4e27908b292ab7172f06bb2b1e203558780d96/frontend/apps/reader/modules/readerrolling.lua#L1415-L1420

Text: allow wrap on any space in "white-space: pre"

Note: there might still be something fishy with multiple spaces at boundaries between pre and non-pre text nodes, possibly overlapping.
Before, a long sequence of space was unbreakable, so the possible break was found by hyphenating the previous word (and the next break is because there's no other solution):
image
After:
image
The fix looks simple and sane - but well, with our text rendering code, we never know...
I think we have many other issues with white-space:pre (like what we do this, we don't do it in gerRenderedWidths()), but well, didn't have the courage for more...

initTableRendMethods(): avoid expensive processing

Not sure why we went with initNodeRendMethodRecursive(), it feels we just want to set it to this single node, whose inner content has already been inited, and shouldn't need them changed.
Fixes the 2nd issue noticed at #370 (comment):

More serious issue is with tables:
The table in 13.9 of https://diveintopython3.net/serializing.html would take a lot of time to render [...]
Add another layer ot tr > th > td, and it takes minutes...
It gets stuck into nested initTableRendMethods() initNodeRendMethodRecursive() recurseElementsDeepFirst(), so we probably have a real issue here.

Tables: fix handling of whitespace between nodes

Properly handle whitespaces between table element nodes according to the specs.
This fix was mostly not needed with normal tables as the XML parsers would remove these whitespace text, and the table rendering algorithm would ignore them.
But it might be needed when completing incomplete tables (where these whitespace may have been kept by the parser and should, or not, be included with what is wrapped) or when table elements are white-space:pre (which is rare).
We probably won't trigger this in real-life cases, but I had the bad idea to wrap all my "complete incomplete tables" test case in white-space:pre, and we did render mostly everything unlike Firefox.
Got to answer a few of my questions in the comments... A bit risky though (had to fight with "styles have changed in such a way..." on re-rendering), hope it will be ok.

ldomDocument: store screen size

Have LVDocView store the screen size in ldomDocument just so we'll be able to check CSS media queries like @media (min-device-width:500px) or (device-aspect-ratio:4/3)

CSS: more robust skipping of invalid content

Add a generic parsing helper skip_to_next() that properly observes the rules for matching pairs of (), [], {}, "", and '', and correctly handle escapes, as per specs.
Use it for next_property() and next_token().
Will be also of use when parsing CSS @-rules.

CSS: parse/skip at-rules, support @media, @supports

Properly parse CSS @-rules, skipping most of them (by handling their content as nested stylesheets or nested declarations) as we don't support much except @media and @supports.
Implement a AtRuleLogicalConditionParser and dedicated subclasses to parse logical combinations in @media and @supports.
Implement handling of @supports items by tweaking LVCssDeclaration::parse() and parse_content_property() so that the same code can be used for parsing and just checking if we would parse it fine.
References:
https://developer.mozilla.org/en-US/docs/Web/CSS/At-rule
https://developer.mozilla.org/en-US/docs/Web/CSS/Media_Queries/Using_media_queries
https://www.w3.org/TR/css3-mediaqueries/#media0
https://drafts.csswg.org/css-conditional/#at-supports

Some previous discussion in koreader/koreader#3179
Should allow closing koreader/koreader#7199.

CSS: handle media condition with @import and <link/>

Add support for:

  <link href="s.css" type="text/css" rel="stylesheet"
      media="screen and (orientation:portrait)"/>

and

  @import "s.css" screen and (orientation:portrait);

Tweak stylesheets handling in ldomDocumentWriters:
Parse and handle media= in <link href="foo" media="bar">, and translate them into @import url("foo") bar; in the <stylesheet> child element's text we put them in.
Rework a bit ldomDocumentFragmentWriter to make this parsing a bit clearer.

Handle media conditions in LVProcessStyleSheetImport().

No change to LVLoadStylesheetFile(), used by frontends to get the user-agent stylesheet, which still handle only a unique first @import, without media condition support.

Should allow closing koreader/koreader#3179.

CSS: fix parsing of "font-family:inherit!important"

When without any space.

LDOMNameIdMap::deserialize(): fix when >1024 items

Our max nb of elements and attributes IDs starts with 1024, but can grow if we meet more. Be sure we grow them too as needed when loading from cache.
Should avoid the issue noticed at koreader/koreader#7343 (comment).


(I apologize to the 3 users that decided to pick the nicks @media @supports @import. They are in my commit messages, so for any people that will sync their crengine fork with this repo, they may get a mail - but I guess that's not new to them and that by now, they have disabled such github mails :)


This change is Reviewable

@poire-z
Copy link
Contributor Author

poire-z commented Sep 6, 2021

@virxkane : I picked your commit without really studying it.
clang-tidy in our CI complains about:
crengine/src/lvstream.cpp:2770:17: warning: Potential leak of memory pointed to by 'item' [clang-analyzer-cplusplus.NewDeleteLeaks] and stuff related to your added NextOffset.
Details in https://github.com/koreader/crengine/pull/454/checks?check_run_id=3527231130

Hope I didn't do an error when picking :/ Can you have a look at this clang output and see if this is an issue (if not, we should find a way to silence it).

@virxkane
Copy link
Contributor

virxkane commented Sep 6, 2021

I picked your commit without really studying it.
clang-tidy in our CI complains about:
crengine/src/lvstream.cpp:2770:17: warning: Potential leak of memory pointed to by 'item' [clang-analyzer-cplusplus.NewDeleteLeaks] and stuff related to your added NextOffset.
Details in https://github.com/koreader/crengine/pull/454/checks?check_run_id=3527231130

Should be fixed in https://github.com/virxkane/coolreader/commit/11d48c157e426acf8e950af674dba0dbe6037536

@virxkane
Copy link
Contributor

virxkane commented Sep 6, 2021

not sure we're using it as it depends on a #define - a bit of refacto to bits of code we use, hope it's fine.

zip64 is only relevant if 'large file support' is enabled, so we check LVLONG_FILE_SUPPORT==1.

@poire-z
Copy link
Contributor Author

poire-z commented Sep 6, 2021

Thanks! clang-tidy no longer complains about it !
(Still 2 other minor complains I can't understand, but ok.)

zip64 is only relevant if 'large file support' is enabled, so we check LVLONG_FILE_SUPPORT==1.

Yes, our build stuff vs. crsetup.h is a bit hard to grasp, but I think we don't enable it.

@poire-z
Copy link
Contributor Author

poire-z commented Sep 7, 2021

Added a commit to fix the issue noticed at koreader/koreader#7343 (comment).

while (*str && *str != ' ' ) {
// skip next char until we find a space or '!', that would start
// a new token or '!important" itself if stuck to previous token
while (*str && *str != ' ' && *str != '!') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, seems like something I might've noticed sooner. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine in the general parsing, line-height:1!important has been working without issue.

The issue was just with font-family which has some dedicated parsing, for splitting the multiple font names.
(And I was indeed lazy as you, typing font-family:inherit!important without spaces in my book style tweaks editor :)

Copy link
Member

@Frenzie Frenzie Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you could call it lazy, but from a reading perspective, not a writing one.

I think it's easier to read (as long as it doesn't get too long):

a {line-height:1!important; font-weight:bold}

than

a {
 line-height: 1 !important;
 font-weight: bold;
}

The reason I usually write the latter is a combination of how diffs work and that most other people seem to prefer it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, I'm not really sure about !important with or without a space. I normally don't use it, except in user styles.

Copy link
Member

@Frenzie Frenzie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, no immediate comments

"For historical reasons, in the HTML syntax, a leading
newline character immediately following the pre element
start tag is stripped."
Bump DOM version as this can shift highlight xpointers
indexes by 1.
Note: there might still be something fishy with multiple
spaces at boundaries between pre and non-pre text nodes,
possibly overlapping.
Not sure why we went with initNodeRendMethodRecursive(),
it feels we just want to set it to this single node,
whose inner content has already been inited, and
shouldn't need them changed.
Properly handle whitespaces between table element nodes
according to the specs.
This fix was mostly not needed with normal tables as the
XML parsers would remove these whitespace text, and the
table rendering algorithm would ignore them.
But it might be needed when completing incomplete tables
(where these whitespace may have been kept by the parser
and should, or not, be included with what is wrapped) or
when table elements are white-space:pre (which is rare).
Have LVDocView store the screen size in ldomDocument just
so we'll be able to check CSS media queries like
@media (min-device-width:500px) or (device-aspect-ratio:4/3)
Add a generic parsing helper skip_to_next() that properly
observes the rules for matching pairs of (), [], {}, "",
and '', and correctly handle escapes, as per specs.
Use it for next_property() and next_token().
Will be also of use when parsing CSS @-rules.
Properly parse CSS @-rules, skipping most of them (by
handling their content as nested stylesheets or nested
declarations) as we don't support much except @media
and @supports.
Implement a AtRuleLogicalConditionParser and dedicated
subclasses to parse logical combinations in @media and
@supports.
Implement handling of @supports items by tweaking
LVCssDeclaration::parse() and parse_content_property()
so that the same code can be used for parsing and just
checking if we would parse it fine.
Add support for:
  <link href="s.css" type="text/css" rel="stylesheet"
      media="screen and (orientation:portrait)"/>
and
  @import "s.css" screen and (orientation:portrait);

Tweak stylesheets handling in ldomDocumentWriters:
Parse and handle media= in '<link href="foo" media="bar">',
and translate them into '@import url("foo") bar;' in the
<stylesheet> child element's text we put them in.
Rework a bit ldomDocumentFragmentWriter to make this
parsing a bit clearer.

Handle media conditions in LVProcessStyleSheetImport().

No change to LVLoadStylesheetFile(), used by frontends
to get the user-agent stylesheet, which still handle only
a unique first @import, without media condition support.
Our max nb of elements and attributes IDs starts with
1024, but can grow if we meet more. Be sure we grow
them too as needed when loading from cache.
@poire-z
Copy link
Contributor Author

poire-z commented Sep 9, 2021

Postponing the zip64 stuff to an other PR, with more stuff from buggins/coolreader#312.

@ptrm
Copy link

ptrm commented Sep 10, 2021

That is great news! Thanks!

For the sake of css tweaks I would like to also get specific answer: @media (orientation: portrait) { body { font-size: 9pt; } } is still not supported, right? I tried this with no results, but want to double check;) I'm thinking of conditional stuff (like bigger margin on the button-facing screen edge on Forma or Libra) that could live in my tweaks instead of modified epubs :)

@poire-z
Copy link
Contributor Author

poire-z commented Sep 10, 2021

This in a user style tweak works for me:

@media (orientation: portrait) { body { color: blue; } }

Note that it's the orientation of the "page" (where the BODY is rendered, excluding KOReader's own margins set via the bottom menu), and not of the "device screen".
So, in two-column mode in landscape, you may still trigger orientation: portrait

(Beware your 9pt will make your font a fixed size, and the only way to change it then is thru Zoom/DPI)

@ptrm
Copy link

ptrm commented Sep 10, 2021

(Beware your 9pt will make your font a fixed size, and the only way to change it then is thru Zoom/DPI)

Sure, that was just an example :)

So, in two-column mode in landscape, you may still trigger orientation: portrait

True. Same goes for width queries (match the column width), which has its upsides tbh :)

It must have been a typo with orientations, after checking max- and min-width, orientation is working :) Crazy thing is that min-color is working, too :D

I'm very glad it works inside css files too, anyway, will keep fiddling with it and report if need be :))))

@poire-z
Copy link
Contributor Author

poire-z commented Sep 10, 2021

Crazy thing is that min-color is working, too :D

That is a crazy lie :) :

"color", // We don't know the buffer type when loading a document,
"min-color", // so we can't be accurate: we'll pretend for now we have 16M colors
"max-color",
"monochrome", // For now, we'll say we're not monochrome, even on eInk
"grid", // false, we're not a tty terminal with a fixed font
"scripting", // "none", we don't support scripting
"update", // "none", "Once it has been rendered, the layout can no longer be updated"
"overflow-inline", // "none", no horizontal scrolling
"overflow-block", // "paged", which is our main purpose (even if we have a scroll mode)
// Not implemented:
// "color-index",
// "min-color-index",
// "max-color-index",

@ptrm
Copy link

ptrm commented Sep 10, 2021

That is a crazy lie :) :

Wait, I swear on my PW4 (min-color: 12) was enabling the tweak inside the query, while (min-color: 16) wasn't :o

EDIT:

int colors = 8;

Ohhh, it was 8 not 12 in min-color that worked, and it makes sense if color value is about bits per channel, which it is per spec. I mistook it for color-index :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants