Weird <!--[--> syntax causing rendering issues on html_render_diff #193

Open
Mr0grog opened this issue Jan 24, 2025 · 1 comment
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

Mr0grog (Member) commented Jan 24, 2025

Versions of https://science.nasa.gov/climate-change/adaptation-mitigation/ from December 2024 (and probably much earlier) are causing some weird rendering issues in html_render_diff, where we are seeing text like <>>. Screenshot:

Screenshot of funky characters in diff

(See this live in Scanner)

In the source (archived copy), there is some IE conditional-comment-like syntax that I think is causing the problem:

<!--[--><div class="absolute z-top"><!--[--><!--[-->

That’s not actual conditional comment syntax, though. Maybe an error was causing this output? Or it’s something I haven’t seen before.
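For contrast, here is a minimal stdlib-only sketch (not taken from the project) of how a standards-following parser reads the snippet above: each `<!--[-->` is an ordinary comment whose text is just `[`, whereas a real IE conditional comment starts with `[if `. The class and function names are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)


def collect_comments(markup):
    """Return the text of each comment in `markup`, in document order."""
    collector = CommentCollector()
    collector.feed(markup)
    collector.close()
    return collector.comments


def looks_like_conditional_comment(comment_text):
    # Downlevel-hidden IE conditional comments begin with "[if ...]>".
    return comment_text.lstrip().lower().startswith('[if ')


# The snippet quoted from the archived page source.
snippet = '<!--[--><div class="absolute z-top"><!--[--><!--[-->'
comments = collect_comments(snippet)
print(comments)
print([looks_like_conditional_comment(c) for c in comments])
```

If the comments themselves are well-formed, as this suggests, the breakage is more likely in how comment nodes are handled downstream than in the page's markup.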

Then we get this mess out the other end of the diff:

&lt;<cyfunction 0x76bbfbe6a4d0="" at="" comment="">&gt;<!--<cyfunction Comment at 0x76bbfbe6a4d0-->&gt;&lt;<cyfunction 0x76bbfbe6a4d0="" at="" comment="">&gt;<!--<cyfunction Comment at 0x76bbfbe6a4d0-->&gt;<div><div class="absolute z-top">

It seems like lxml or html-parser is choking on this in a pretty painful way.
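One possible explanation for the `<cyfunction Comment at 0x…>` text, offered as a guess rather than a diagnosis: in lxml, comment nodes do not have a string `.tag`; the attribute is the `etree.Comment` factory function itself, so any code that interpolates `.tag` into markup would write out that function's repr. The sketch below shows a guard for that case. `safe_tag_name` and `demo` are hypothetical names, not functions from this project, and `demo` assumes lxml is installed.

```python
def safe_tag_name(element):
    """Return the element's tag name, or None for non-element nodes.

    In lxml, special nodes such as comments and processing instructions
    expose a callable (for example the `etree.Comment` factory) as `.tag`
    instead of a string.
    """
    tag = element.tag
    return tag if isinstance(tag, str) else None


def demo():
    # lxml is imported here so the helper above stays usable without it.
    from lxml import html

    fragment = html.fromstring('<div><!--[--><p>text</p></div>')
    for node in fragment.iter():
        # Naive formatting: for the comment node this interpolates the
        # Comment function itself rather than a tag name.
        naive = f'<{node.tag}>'
        print(repr(naive), '->', safe_tag_name(node))
```

Calling `demo()` prints the naive string next to the guarded tag name for each node, which makes it easy to see whether the comment node is the one producing the stray text.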

Mr0grog added the bug and help wanted labels Jan 24, 2025

Mr0grog (Member, Author) commented Jan 30, 2025

Possibly related: an EDGI analyst noted this problematic diff today, which features these messy comments on the “from” side and shows the whole page being added on the “to” side: https://monitoring.envirodatagov.org/page/cd80203e-d5cf-4453-8042-fe971141f431/83615878-7465-4d83-ba1d-ebb3f9c626d5..126227d1-4be0-4730-83ad-63179fa376a9

The “from” side stuff is definitely the same as this issue. Not sure if it’s also causing the “to” side problems. On the “to” side, the concrete cause of the out-of-control highlighting is a problem re-assembling the markup: the first closing </ins> tag is missing or maybe getting written out as &lt;. I’m just not sure whether the cause of that is these comments or something else.
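A small diagnostic along these lines might help narrow that down. This is a stdlib-only sketch with hypothetical names, not part of the diff code: it counts `<ins>` tags that are never closed and the number of escaped `&lt;` entities in a piece of rendered diff markup, so a missing `</ins>` that was written out as text shows up in both counts.

```python
from html.parser import HTMLParser


class InsBalanceChecker(HTMLParser):
    """Track <ins> nesting and escaped '<' entities in rendered markup."""

    def __init__(self):
        # Keep character references unconverted so that "&lt;" is reported
        # through handle_entityref instead of being folded into text.
        super().__init__(convert_charrefs=False)
        self.open_ins = 0
        self.escaped_lt = 0

    def handle_starttag(self, tag, attrs):
        if tag == 'ins':
            self.open_ins += 1

    def handle_endtag(self, tag):
        if tag == 'ins':
            self.open_ins -= 1

    def handle_entityref(self, name):
        if name == 'lt':
            self.escaped_lt += 1


def check_ins_balance(markup):
    """Return (unclosed <ins> count, number of '&lt;' entities)."""
    checker = InsBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.open_ins, checker.escaped_lt


balanced = check_ins_balance('<ins>a</ins>')
broken = check_ins_balance('<ins>a</ins><ins>b&lt;/ins>')
print(balanced, broken)
```

A nonzero first value paired with a nonzero second value would be consistent with the closing tag being escaped rather than dropped, though legitimate `&lt;` entities in the page text would also be counted.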
