Text processing can use some TLC #3217
Replies: 2 comments
-
Also, something relevant to wide characters: has normalizing strings, either before working on them or while enumerating their runes, ever been explored here? If it hasn't (some of that stuff is pretty new), it may be worth a look. Normalization takes care of the issues that otherwise arise because some characters can be represented by more than one sequence of code points, so we wouldn't have to handle those cases explicitly ourselves.
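To illustrate what I mean, here is a minimal sketch using only the standard `string.Normalize` and `string.EnumerateRunes` APIs. The helper names `CountRunes` and `EqualsNormalized` are made up for this example and are not existing library methods, and NFC is just one plausible choice of normalization form.

```csharp
using System;
using System.Text;

// Hypothetical helper (not a library API): count the runes in a string.
static int CountRunes(string s)
{
    int count = 0;
    foreach (Rune rune in s.EnumerateRunes())
    {
        count++;
    }
    return count;
}

// Hypothetical helper: ordinal comparison after normalizing both sides to NFC.
static bool EqualsNormalized(string a, string b) =>
    string.Equals(
        a.Normalize(NormalizationForm.FormC),
        b.Normalize(NormalizationForm.FormC),
        StringComparison.Ordinal);

// The same visible character, encoded two ways.
string precomposed = "\u00E9";  // U+00E9, a single code point
string decomposed = "e\u0301";  // 'e' followed by U+0301 COMBINING ACUTE ACCENT

// Without normalization the two strings differ in both content and rune count.
Console.WriteLine(precomposed == decomposed);
Console.WriteLine($"{CountRunes(precomposed)} vs {CountRunes(decomposed)}");

// After normalizing to NFC they compare equal and enumerate the same runes.
Console.WriteLine(EqualsNormalized(precomposed, decomposed));
Console.WriteLine(CountRunes(decomposed.Normalize(NormalizationForm.FormC)));
```

Whether to normalize once at the boundary (when text enters the library) or lazily during enumeration is an open design question. The sketch only shows that code downstream of the normalization no longer has to special-case the decomposed form.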
-
Also, yes, this is closely related to #3214, but that one is kind of a master issue, in lieu of a project. This one is targeted at a more specific area.
-
A lot of very hot code paths in the library live in the StringExtensions, RuneExtensions, and TextFormatter classes.
That code, which does great stuff and which you guys deserve plenty of kudos for, needs to be cleaned up before the RTM of v2.
The biggest things:
I'm really tempted to do at least some minor work on some very low-hanging fruit there while waiting on the formatting stuff. For example, there is a string concatenation inside a hot loop that, all by itself, can waste a shocking amount of memory very quickly, even for fairly small inputs. And that method is called by a ton of others, sometimes even doing the work more than once.
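To show the kind of pattern I'm talking about, here is a generic sketch, not the actual library method. The names `JoinByConcat`, `JoinByBuilder`, and `MeasureAllocatedBytes` are invented for illustration. Because .NET strings are immutable, each `+=` in the loop allocates a brand-new string holding everything accumulated so far, so total allocation grows roughly quadratically with the number of pieces, while a `StringBuilder` appends into a growable buffer.

```csharp
using System;
using System.Text;

// Hypothetical example of the problematic pattern: every += allocates a new,
// longer string and copies all previously accumulated characters into it.
static string JoinByConcat(string[] parts)
{
    string result = string.Empty;
    foreach (string part in parts)
    {
        result += part;
    }
    return result;
}

// Same output, but appended into a StringBuilder's internal buffer.
static string JoinByBuilder(string[] parts)
{
    var builder = new StringBuilder();
    foreach (string part in parts)
    {
        builder.Append(part);
    }
    return builder.ToString();
}

// Rough measurement helper: bytes allocated on this thread while running the action.
static long MeasureAllocatedBytes(Func<string> action)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    action();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

// 2,000 pieces of 10 characters each: a fairly small input (20,000 chars total).
string[] parts = new string[2000];
Array.Fill(parts, "0123456789");

Console.WriteLine($"concat:  {MeasureAllocatedBytes(() => JoinByConcat(parts))} bytes");
Console.WriteLine($"builder: {MeasureAllocatedBytes(() => JoinByBuilder(parts))} bytes");
```

The exact numbers depend on the runtime, so I haven't listed any, but the gap should be large even at this size, and it compounds when the method sits underneath many callers.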
Wanted to get some feedback on those classes before I convert this to an issue.