This is an excellent implementation with great documentation.
One thought: building grammar rules inside a Clojure source file can be a fun challenge, but it becomes tedious and, more importantly, hard to read during later maintenance whenever characters need escaping. For example, in the definition below, even the comment inside the grammar requires escaping, not just the quotation marks and backslashes. It might be good for the README to recommend a bit more explicitly, as a rule of thumb, switching to a resource file as soon as anything needs to be escaped.
(def wikiextractor-parser
  "a parser for the output of wikiextractor (https://github.com/attardi/wikiextractor)"
  (parser
    "S = Entry*
     Entry = <Header> ContentAsText <Trailer> <OptionalPadding*>
     Header = '<doc' (' ' HeaderProp)* '>'
     HeaderProp = #'[^=]*' '=' '\"' #'[^\"]*' '\"' (* e.g. id=\"4030\" *)
     ContentAsText = Anychar*
     Anychar = #'(?sm).'
     Trailer = '</doc>'
     OptionalPadding = #'\\s'"))
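For comparison, here is a rough sketch of the same grammar moved to a resource file. The file name `wikiextractor.bnf`, the `resources/` directory, and the `example.wikiextractor` namespace are made up for illustration. The sketch assumes the resource directory is on the classpath and that `insta/parser` accepts the URL returned by `clojure.java.io/resource`, which is how I read the README.

```clojure
;; Contents of resources/wikiextractor.bnf (hypothetical file name).
;; Nothing needs escaping here, not even the comment:
;;
;;   S = Entry*
;;   Entry = <Header> ContentAsText <Trailer> <OptionalPadding*>
;;   Header = '<doc' (' ' HeaderProp)* '>'
;;   HeaderProp = #'[^=]*' '=' '"' #'[^"]*' '"' (* e.g. id="4030" *)
;;   ContentAsText = Anychar*
;;   Anychar = #'(?sm).'
;;   Trailer = '</doc>'
;;   OptionalPadding = #'\s'

(ns example.wikiextractor ; hypothetical namespace
  (:require [clojure.java.io :as io]
            [instaparse.core :as insta]))

(def wikiextractor-parser
  "a parser for the output of wikiextractor (https://github.com/attardi/wikiextractor)"
  ;; Assumption: the grammar file is on the classpath, so io/resource can find it.
  (insta/parser (io/resource "wikiextractor.bnf")))
```

The grammar in the file reads exactly as it would be written on paper, and the source file shrinks to a single call.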