Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let Location::str() handle \r correctly #127

Merged
merged 12 commits into from
Aug 22, 2024

Conversation

fhackett
Copy link
Contributor

@fhackett fhackett commented Jul 19, 2024

This PR is split off from #126, because really it's an orthogonal fix.

The observed problem was that, on Windows, error messages had a strange \r in reported source locations that was inconsistent with both other platforms, and other parts of the same output string (you would see \n once then \r\n further along the same string).

The trouble was traced back to source.h, which was separating lines based on only the \n character, and assuming that line breaks were only 1 char wide. This PR rewrites the 3 line/columns handling functions that did this, so that they will accept \r\n, \n, and \r as line delimiters while reporting correct line lengths (and so forth) on all platforms.

The main change of substance is that the lines field of SourceDef changes from a vector of size_t to a vector of pair<size_t, size_t>, because it is no longer possible to derive where a line ends from where another line begins (we might have to add/subtract either 1 or 2 chars). Rather than try to detect some file-wide line break width (and possibly have problems due to corrupted files mixing line ending types), this rewrite just looks at which line ending variant each line had and records span information to match.

Additionally, this PR adds exhaustive generative testing of this changed feature.
The test code is commented to explain logically how it builds all possible permutations of line endings and line contents up to a certain length (albeit using just a as contents since this code is not sensitive to string contents).
As configured in the CMakeLists.txt, it runs 729 test cases up to length 6 (6 chars, 6 line endings, and everything in between).

This fine-grained testing should help keep this code robust if it ever needs to change again - in fact it already helped catch errors when implementing some review recommendations from the other PR this was split from.

include/trieste/source.h Outdated Show resolved Hide resolved
include/trieste/source.h Outdated Show resolved Hide resolved
@fhackett fhackett changed the title Let SourceDef handle \r\n and co correctly Let Location::str() handle \r correctly Jul 30, 2024
@fhackett
Copy link
Contributor Author

Based on meeting feedback, this new version hardens Location::str() against carriage returns instead.

Because I was exhaustively testing and found a lot of edge cases, I rewrote most of Location::str() in order more fundamentally satisfy the testing requirements.
That said, I am relatively sure that, if you don't use \r, the old and new implementation are exhaustively equivalent in their behavior. It can be double checked by reverting my source.h changes and running the tester with --skip-carriage-returns.

Let me know if there are any questions/clarifications/requests.


std::string escape_string(const std::string str)
{
std::string result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this should be a string stream.

@matajoh matajoh merged commit 467157a into microsoft:main Aug 22, 2024
21 checks passed
@fhackett fhackett deleted the fhackett-r-location branch August 22, 2024 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants