Confused about convert_special and sax_html parsing #358

Closed
davetron5000 opened this issue Aug 14, 2024 · 2 comments


@davetron5000

Hi, sorry if this is not the best place for questions or support, but I guess it's possible my issue is a bug.

I'm trying to use Ox to parse HTML5, and I'm finding that it is escaping < and & in attributes, text, and CDATA. I understand this is correct behavior for XML, so I set convert_special, but it doesn't have the effect I'm looking for:

<html>
<head>
<style>
  <![CDATA[
    .foo {
      content: ">";
    }
  ]]>
</style>
</head>
<body>
<h1>Hello</h1>
</body>
</html>

When I parse this with a handler class passed to Ox.sax_html, text() and cdata() are both given escaped strings, so if I try to recreate that <style> block, it shows content: "&gt;";.

So, question is - is this correct behavior and, if so, can it be controlled and/or disabled?

@ohler55
Owner

ohler55 commented Aug 14, 2024

Looking at the code, attributes and text use the :convert_special option but CDATA does not. Can you provide the code (handler) that received the &gt; string?

@davetron5000
Author

OK, in putting together a minimal example, I realized the escaping is not happening in sax_parse. I was also creating a document, and that step was escaping the values, which seems reasonable and consistent with the docs. Sorry for the bother, but thanks for being responsive!
