Question/Feature request: dealing with tags that are not self/closing (sgml) #31

Open
guraltsev opened this issue Nov 30, 2020 · 1 comment

This is more of a question than a bug report. Please tell me if I am using the wrong tool for the job. In SGML, some tags are not allowed to be self-closing. For example, all modern browsers mishandle

<script src="somescript.js"/>

even if the document is of type HTML5. The script must be written as

<script src="somescript.js"></script>

For context:

I use esxml in several filter functions that I execute when exporting Org files to HTML. I do not want to define a new exporter, just to filter some things. I use libxml to parse the output of the Org exporter at different steps, modify it using dom, and then re-output it using esxml. I actually found it very surprising that esxml is not built into Emacs proper.

One example is that I define an option (variable) that decides whether local CSS and JS scripts should be inlined in the HTML.

I added a hook function to org-export-filter-final-output-functions (see [Advanced Export Configuration](https://orgmode.org/manual/Advanced-Export-Configuration.html)) that parses the output HTML, changes the contents of the script tag if inlining is required, and finally serializes everything back to HTML. However, if I am not inlining the scripts, the round trip HTML -> Emacs XML -> HTML changes <script>[...]</script> pairs into <script/>, making the HTML output incorrect. I temporarily solved the issue by adding a whitespace string inside empty <script></script> tags.
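To make the workaround concrete, here is a minimal sketch of such a filter. The names `my/pad-empty-script-tags` and `my/org-html-roundtrip-filter` are hypothetical, the sketch assumes an Emacs built with libxml2 support, and it only handles `script`; it is not necessarily how the original filter was written.

```elisp
(require 'dom)
(require 'ox)
(require 'esxml)

(defun my/pad-empty-script-tags (dom)
  "Give every childless <script> node in DOM an empty string child.
Hypothetical helper.  Modifies DOM in place and returns it."
  (dolist (node (dom-by-tag dom 'script))
    (unless (dom-children node)
      ;; A string child makes the serializer emit an explicit end tag.
      (dom-append-child node "")))
  dom)

(defun my/org-html-roundtrip-filter (output backend _info)
  "Hypothetical final-output filter for HTML-derived back-ends.
Parse OUTPUT, pad empty script nodes, and serialize the result."
  (if (not (org-export-derived-backend-p backend 'html))
      output
    (with-temp-buffer
      (insert output)
      (esxml-to-xml
       (my/pad-empty-script-tags
        (libxml-parse-html-region (point-min) (point-max)))))))

(add-to-list 'org-export-filter-final-output-functions
             #'my/org-html-roundtrip-filter)
```

Because the padding happens on the parsed tree just before serialization, any other DOM edits (such as inlining scripts) can run first without caring about how empty elements are printed.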

Do you have any comments about this? Now that there is pcase, would it be appropriate to implement such behavior? Technically this is not XML behavior, but de facto I see no other library for programmatically editing HTML documents in Emacs.

tali713 (Owner) commented Nov 30, 2020

Okay, so this is also a problem for textarea. When I was originally designing esxml, I decided that the simplest way to address this was to have (tag attr-list) transform to <tag attributes/>, while (tag attr-list "") transforms to <tag attributes></tag>, the empty string being equivalent to an empty body. So your solution is indeed the intended solution.
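As a quick illustration of that rule, the two forms can be compared directly with `esxml-to-xml`. This is only a sketch that assumes esxml is installed and on the load path; the result comments restate the behavior described in this comment.

```elisp
(require 'esxml)

;; No body at all: serialized as a self-closing tag.
(esxml-to-xml '(script ((src . "somescript.js"))))
;; => "<script src=\"somescript.js\"/>"

;; Empty string body: serialized with an explicit end tag.
(esxml-to-xml '(script ((src . "somescript.js")) ""))
;; => "<script src=\"somescript.js\"></script>"
```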

However, let's say this was not good enough for you, and you really wanted to have a personal sublanguage with any special cases you like,

esxml/esxml.el

Lines 194 to 213 in 2656460

(defun sxml-to-esxml (sxml)
  "Translates sxml to esxml so the common standard can be used.
See: http://okmij.org/ftp/Scheme/SXML.html."
  (pcase sxml
    (`(,tag (@ . ,attrs) . ,body)
     `(,tag ,(mapcar (lambda (attr)
                       (cons (first attr)
                             (or (second attr)
                                 (prin1-to-string (first attr)))))
                     attrs)
            ,@(mapcar 'sxml-to-esxml body)))
    (`(,tag . ,body)
     `(,tag nil
            ,@(mapcar 'sxml-to-esxml body)))
    ((and sxml (pred stringp)) sxml)))

(defun sxml-to-xml (sxml)
  "Translates sxml to xml, via esxml, hey it's only a constant
factor. :)"
  (esxml-to-xml (sxml-to-esxml sxml)))
shows how.
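Following the same pattern, a personal HTML-flavored translator might look roughly like the sketch below. Everything named `my/...` is hypothetical and not part of esxml, and the list of void tags is an assumption based on the HTML standard's void elements. Unlike an in-place DOM edit, this version builds a new tree and leaves its input untouched.

```elisp
(require 'esxml)

(defvar my/html-void-tags
  '(area base br col embed hr img input link meta source track wbr)
  "Hypothetical list of tags that are allowed to stay self-closing.")

(defun my/html-to-esxml (node)
  "Return a copy of NODE where empty non-void elements get a \"\" body.
Hypothetical helper written in the style of `sxml-to-esxml'."
  (pcase node
    ;; Text nodes pass through unchanged.
    ((pred stringp) node)
    ;; Leave esxml's special forms alone.
    (`(raw-string ,_) node)
    (`(comment . ,_) node)
    (`(,tag ,attrs . ,body)
     (cond
      ;; Non-empty element: recurse into the children.
      (body `(,tag ,attrs ,@(mapcar #'my/html-to-esxml body)))
      ;; Void element: a self-closing tag is fine.
      ((memq tag my/html-void-tags) node)
      ;; Anything else gets an empty body, hence an explicit end tag.
      (t `(,tag ,attrs ""))))))

(defun my/html-to-xml (node)
  "Serialize NODE with `esxml-to-xml' after applying the rule above."
  (esxml-to-xml (my/html-to-esxml node)))
```

For example, `(my/html-to-esxml '(div nil (script ((src . "a.js"))) (br nil)))` returns `(div nil (script ((src . "a.js")) "") (br nil))`, so only the script node is padded.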
