First, thanks for the great library!

I'm running into a case where unwrap_html includes an html_top field whose value is pretty much the entire HTML. This happens when I give it a forwarded message that has no text above the ---------- Forwarded message ---------- marker. By comparison, unwrap doesn't include text_top if you give it the plain-text version of the same email. Below are two tests you can add to the repo to see one pass and one fail.
A test example that would pass
Adding a similar test to UnwrapTestCase.test_gmail_forward, but without the "Hello" message at the beginning:
class UnwrapTestCase(TestCase):
    ...
    def test_gmail_forward_no_message(self):
        # No more Hello message at the beginning, unlike test_gmail_forward
        self.assertEqual(unwrap("""---------- Forwarded message ----------
From: Someone <[email protected]>
Date: Fri, Apr 26, 2013 at 8:13 PM
Subject: Weekend Spanish classes
To: [email protected]

Spanish Classes
Learn Spanish
"""), {
            'type': 'forward',
            'from': 'Someone <[email protected]>',
            'date': 'Fri, Apr 26, 2013 at 8:13 PM',
            'subject': 'Weekend Spanish classes',
            'to': '[email protected]',
            'text': 'Spanish Classes\nLearn Spanish',
        })
This passes just fine, since unwrap doesn't include text_top in the output when there's no message at the top.
A test example that would fail
Adding a similar test to HTMLUnwrapTestCase.test_gmail_forward, but without the text at the top again:
class HTMLUnwrapTestCase(TestCase):
    ...
    def test_gmail_forward_no_message(self):
        # diff to the html from test_gmail_forward ; missing:
        # < test
        # < <div><br></div>
        # < <div>blah</div>
        html = '''<html>
 <head></head>
 <body>
 <div dir="ltr">
 <div><br><div class="gmail_quote">---------- Forwarded message ----------<br>
 From: <b class="gmail_sendername">Foo Bar</b> <span dir="ltr"><<a href="mailto:[email protected]">[email protected]</a>></span><br>
 Date: Thu, Mar 24, 2016 at 5:17 PM<br>
 Subject: The Subject<br>
 To: John Doe <<a href="mailto:[email protected]">[email protected]</a>><br><br><br>
 <div dir="ltr">Some text<div><br></div><div><br></div></div></div><br>
 </div>
 </div>
 </body>
</html>'''
        self.assertEqual(unwrap_html(html), {
            'type': 'forward',
            'subject': 'The Subject',
            'date': 'Thu, Mar 24, 2016 at 5:17 PM',
            'from': 'Foo Bar <[email protected]>',
            'to': 'John Doe <[email protected]>',
            'html': '<html><head></head>\n <body><div dir="ltr"><div><div class="gmail_quote"><div dir="ltr">Some text</div></div></div></div></body></html>',
        })
This fails, because unwrap_html also includes an html_top field in the output. In this case the value of html_top is basically the whole email, so it isn't useful:
u'<html><head></head>\n <body><div dir="ltr"><div><div class="gmail_quote">---------- Forwarded message ----------<br>\n From: <b class="gmail_sendername">Foo Bar</b> <span dir="ltr"><<a href="mailto:[email protected]">[email protected]</a>></span><br>\n Date: Thu, Mar 24, 2016 at 5:17 PM<br>\n Subject: The Subject<br>\n To: John Doe <<a href="mailto:[email protected]">[email protected]</a>><br><br><br>\n <div dir="ltr">Some text<div><br></div><div><br></div></div></div><br>\n </div>\n </div>\n </body>\n</html>
Do you think it should behave the way I'm suggesting, i.e. not include html_top here? If so, I can look into it and provide a PR.
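To make the suggestion concrete, here is a rough sketch of the check I have in mind. It is not how the library works internally; has_text_above_marker and drop_empty_html_top are hypothetical helpers I made up for illustration, and the marker string is simply the Gmail one from the tests above. The idea is to collect the visible text of the HTML and only keep html_top when there is something non-blank before the forward marker.

```python
from html.parser import HTMLParser

# Assumption: the Gmail-style marker used in the tests above.
FORWARD_MARKER = "---------- Forwarded message ----------"


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def has_text_above_marker(html, marker=FORWARD_MARKER):
    """Return True if non-whitespace text appears before the marker.

    Hypothetical helper: if the marker is absent, the whole text is
    treated as being "above" it.
    """
    collector = _TextCollector()
    collector.feed(html)
    collector.close()  # flush any buffered text
    text = "".join(collector.chunks)
    before, _, _ = text.partition(marker)
    return bool(before.strip())


def drop_empty_html_top(result, original_html):
    """Return a copy of result without 'html_top' when nothing precedes the marker.

    Hypothetical post-processing step applied to an unwrap_html-style dict.
    """
    if "html_top" in result and not has_text_above_marker(original_html):
        return {k: v for k, v in result.items() if k != "html_top"}
    return result
```

This is only a text-level check (it would also count text inside script or style tags), so a real fix would probably belong in the HTML tree handling itself.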
Unrelated to this: I realized I haven't tried getting an html_bottom out of it as well. That might not make sense for forwards, only for replies, right? How would you tell whether trailing content is the bottom or part of the forwarded email itself?