URIs are explicitly defined to be a sequence of characters, and not a sequence of octets, as per the RFC.
Thus ByteString seems like a dangerous type to use for this purpose, as it represents a sequence of octets and not a sequence of characters.
Using Text would also be more compatible with IRIs, as according to their RFC they are also a sequence of characters, and those characters do not all fit within ASCII.
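To make the character/octet distinction concrete, here is a small sketch using the `text` and `bytestring` packages. The names `iriPath`, `charCount`, and `octetCount` are made up for illustration and are not part of this library. It assumes UTF-8 as the encoding, which is only one possible choice.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical IRI path containing one non-ASCII character.
-- "\233" is U+00E9 (e with acute accent), so the value is "/cafe" with an accented e.
iriPath :: T.Text
iriPath = "/caf\233"

-- Number of characters (Unicode code points) in the Text value.
charCount :: Int
charCount = T.length iriPath

-- Number of octets once the same value is encoded as UTF-8 (an assumed encoding).
octetCount :: Int
octetCount = BS.length (TE.encodeUtf8 iriPath)
```

Under UTF-8 the accented character takes two octets, so the two counts differ: five characters, six octets. A `ByteString` on its own does not record which encoding produced those octets.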
tysonzero changed the title from "What is the reason for choosing bytestring over text" to "Consider Text instead of ByteString" on Aug 28, 2019
I don't think this library clashes with the spec. The interpretation of the bytestrings is left to the caller. That actually seems like the right thing to do:
This specification does not mandate any particular character encoding for mapping between URI characters and the octets used to store or transmit those characters. When a URI appears in a protocol element, the character encoding is defined by that protocol; without such a definition, a URI is assumed to be in the same character encoding as the surrounding text.
The characters in the ABNF grammar are ASCII and as such we don't need to know the encoding to parse:
The ABNF notation defines its terminal values to be non-negative integers (codepoints) based on the US-ASCII coded character set [ASCII]. Because a URI is a sequence of characters, we must invert that relation in order to understand the URI syntax.
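As a rough illustration of that point, the sketch below splits a URI at its first `:` by looking only at the ASCII delimiter byte, and leaves decoding of the remainder to the caller. `splitScheme` and `decodeRest` are hypothetical helpers written for this comment, not this library's API, and the approach assumes the surrounding encoding is ASCII-compatible (UTF-8 or Latin-1, for example); it would not hold for something like UTF-16.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B8
import Data.Text (Text)
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: split at the first ':' byte (0x3A).
-- Only the ASCII delimiter is inspected; the other bytes are passed through untouched.
-- This is a simplification and does not validate the scheme against the full grammar.
splitScheme :: ByteString -> Maybe (ByteString, ByteString)
splitScheme uri =
  case B8.break (== ':') uri of
    (scheme, rest)
      | not (B8.null scheme)
      , Just (_, afterColon) <- B8.uncons rest -> Just (scheme, afterColon)
    _ -> Nothing

-- Hypothetical helper: the caller picks the encoding, here UTF-8 as an assumption.
-- Returns Nothing if the bytes are not valid UTF-8.
decodeRest :: ByteString -> Maybe Text
decodeRest = either (const Nothing) Just . TE.decodeUtf8'
```

For example, `splitScheme "http://example.com/a"` yields `Just ("http", "//example.com/a")`, and only the caller's later call to `decodeRest` commits to a character encoding.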