ST4: 4065+ HTMLSheet Updates #107
Closed
Changes from 10 commits
15 commits
8327457 implement: sublime HTMLSheet (TerminalFi)
e7be2fc implement: ImageParser.py (TerminalFi)
e91bdca remove: 2markdown.py (TerminalFi)
65020fe implement: invalid_image (TerminalFi)
e6b91cd fix: on_pre_close (TerminalFi)
a0cdcba fix: strip html comments (TerminalFi)
24da85d fix: passing html and stating it was MD (TerminalFi)
25456e5 implement: render checkbox with unicode (TerminalFi)
654ab4d fix: BS4 kept adding "closing <br>" (TerminalFi)
a0d51da implement: multiple syntax support via `syntax` setting (TerminalFi)
c81b061 implement: MarkdownLivePreviewBaseCommand for shared functions (TerminalFi)
f9345b5 fix: delay logic (TerminalFi)
0ac89d6 format: ImageParser (TerminalFi)
bbab51b remove: unneeded markdown frontmatter (TerminalFi)
23b507f implement: render_checkbox font size (TerminalFi)
@@ -0,0 +1,131 @@
import concurrent.futures
import os.path
import re
import urllib.error
import urllib.request
from base64 import b64encode
from functools import partial

import bs4

__all__ = ("imageparser",)


RE_BAD_ENTITIES = re.compile(r"(&(?!amp;|lt;|gt;|nbsp;)(?:\w+;|#\d+;))")

# FIXME: how do I choose how many workers I want?
# - ThreadPoolExecutor keeps its worker threads alive and reuses them for
#   later tasks, so the pool is not rebuilt for every image.
# - (we could implement something of our own)
executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)


def _remove_entities(text):
    """Replace unsupported HTML entities with the characters they stand for."""

    # html.unescape replaces HTMLParser().unescape, which was removed
    # in Python 3.9
    from html import unescape

    text = text.replace("<br/>", "<br>").replace("<hr/>", "<hr />")

    def repl(m):
        """Replace entities except `amp`, `lt`, `gt`, and `nbsp`."""
        return unescape(m.group(1))

    return RE_BAD_ENTITIES.sub(repl, text)


def imageparser(html, basepath, re_render, resources):
    soup = bs4.BeautifulSoup(html, "html.parser")
    for img_element in soup.find_all("img"):
        src = img_element.get("src")
        if not src:
            # an <img> without a src has nothing to inline
            continue

        # already in base64, or something of the like
        # FIXME: what other types are possible? Are they handled by ST?
        # - If not, could we convert it into base64? is it worth the effort?
        if src.startswith("data:image/"):
            continue
        if src.startswith("http://") or src.startswith("https://"):
            path = src
        elif src.startswith("file://"):
            path = src[len("file://") :]
        else:
            if basepath is None:
                basepath = "."
            path = os.path.realpath(os.path.expanduser(os.path.join(basepath, src)))

        base64 = get_base64_image(path, re_render, resources)

        img_element["src"] = base64

    return _remove_entities(soup.prettify(formatter="html"))


images_cache = {}
images_loading = []


def get_base64_image(path, re_render, resources):
    """Gets the base64 data URI for the image (local and remote images).

    re_render is a callback which is called when we finish loading an
    image from the internet to trigger an update of the preview
    (the image will then be loaded from the cache).

    Returns the data URI string, or a placeholder from resources while a
    remote image is still loading or when the image cannot be read.
    """

    def callback(path, resources, future):
        # The callback runs in a thread of this process, typically the
        # worker thread that finished the future, not necessarily the
        # thread that called add_done_callback:
        # > Added callables are called in the order that they were added
        # > and are always called in a thread belonging to the process
        # > that added them
        # > --- Python docs
        # Updating images_cache from there is still safe in CPython,
        # where a single dict assignment is atomic.
        try:
            images_cache[path] = future.result()
        except Exception as e:
            # HTTP errors, unreachable hosts, timeouts and non-image
            # responses all end up here, so the image never stays stuck
            # in the loading state
            images_cache[path] = resources["base64_404_image"]
            print("Error loading {!r}: {!r}".format(path, e))

        images_loading.remove(path)

        # we render, which means this function will be called again,
        # but this time, we will read from the cache
        re_render()

    if path in images_cache:
        return images_cache[path]

    if path.startswith("http://") or path.startswith("https://"):
        # FIXME: submitting a load of loaders, we should only have one
        if path not in images_loading:
            # mark the path as loading before submitting, so the callback
            # always finds it in the list even if the load finishes at once
            images_loading.append(path)
            executor.submit(load_image, path).add_done_callback(
                partial(callback, path, resources)
            )
        return resources["base64_loading_image"]

    if not os.path.isfile(path):
        return resources["base64_invalid_image"]

    with open(path, "rb") as fhandle:
        image_content = fhandle.read()

    image = "{}{}".format(
        "data:image/png;base64,", b64encode(image_content).decode("utf-8")
    )
    images_cache[path] = image
    return images_cache[path]


def load_image(url):
    with urllib.request.urlopen(url, timeout=60) as conn:
        image_content = conn.read()

        content_type = conn.info().get_content_type()
        if "image" not in content_type:
            raise ValueError(
                "{!r} doesn't point to an image, but to a {!r}".format(
                    url, content_type
                )
            )
        return "{}{}".format(
            "data:image/png;base64,", b64encode(image_content).decode("utf-8")
        )
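
Both `get_base64_image` and `load_image` label every payload as `data:image/png`, whatever the real format, and the FIXME in `imageparser` asks which other types are possible. A minimal sketch of one way to pick the MIME type from the file extension follows. The helper `to_data_uri` and its `default` parameter are hypothetical and not part of this PR, and whether Sublime Text renders other image types from such a URI is not established by the diff.

```python
import mimetypes
from base64 import b64encode


def to_data_uri(image_content, path, default="image/png"):
    """Build a data URI, guessing the MIME type from the file extension.

    Hypothetical helper for illustration only; not part of the PR.
    """
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        # Unknown or non-image extension: fall back to the hardcoded type
        mime = default
    return "data:{};base64,{}".format(
        mime, b64encode(image_content).decode("ascii")
    )


jpeg_uri = to_data_uri(b"abc", "photo.jpg")
fallback_uri = to_data_uri(b"abc", "no_extension")
```

For remote images, the `Content-Type` header that `load_image` already reads could serve the same purpose instead of the extension.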
Looks like you want to use asyncio instead of a ThreadPoolExecutor (which is IMO completely the wrong tool for this job).
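
One possible reading of this suggestion, as a rough sketch rather than the reviewer's actual proposal: fetch all images concurrently with `asyncio.gather` and map failures to a placeholder. `fetch_bytes`, `load_images`, and the `placeholder` value are hypothetical names; `fetch_bytes` stands in for a real non-blocking HTTP client and touches no network. The sketch also assumes an event loop can be run from the plugin, which the diff does not show.

```python
import asyncio
from base64 import b64encode


async def fetch_bytes(url):
    """Hypothetical stand-in for a non-blocking HTTP fetch (no network used)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if "missing" in url:
        raise ValueError("cannot load {!r}".format(url))
    return url.encode("utf-8")


async def load_images(urls, placeholder="PLACEHOLDER"):
    """Fetch all URLs concurrently and map each one to a data URI."""
    results = await asyncio.gather(
        *(fetch_bytes(url) for url in urls), return_exceptions=True
    )
    cache = {}
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            # Failed loads get a placeholder instead of raising
            cache[url] = placeholder
        else:
            cache[url] = "data:image/png;base64," + b64encode(result).decode("ascii")
    return cache


urls = ["https://example.invalid/a.png", "https://example.invalid/missing.png"]
cache = asyncio.run(load_images(urls))
```

With `return_exceptions=True`, one failed download does not cancel the others, which mirrors how the PR's callback swaps in an error image per URL.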
This is overkill for this sort of plugin.
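
Related to this exchange, the FIXME in the diff asks whether the thread pool reuses its threads. A small standalone check, not taken from the PR, suggests that it does: the tasks below outnumber the workers, yet at most `max_workers` distinct threads run them. The task function `work` is a hypothetical placeholder.

```python
import concurrent.futures
import threading


def work(_):
    """Placeholder task: report which worker thread ran it."""
    return threading.current_thread().name


# 20 tasks on a pool limited to 2 workers
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    names = list(pool.map(work, range(20)))

distinct_threads = set(names)
print(len(names), "tasks ran on", len(distinct_threads), "thread(s)")
```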