Add ByteBuffer methods getUTF8ValidatedString and readUTF8ValidatedString #2973

Open: wants to merge 6 commits into base: main

adam-fowler
Contributor

Add methods to ByteBuffer to read validated UTF8 strings

Motivation:

The current readString and getString methods of ByteBuffer do not verify that the string being read is valid UTF8. The Swift 6 standard library comes with a new initialiser, String(validating:as:). This PR adds alternative methods to ByteBuffer that use this initialiser instead of String(decoding:as:).

Modifications:

Added ByteBuffer.getUTF8ValidatedString(at:length:)
Added ByteBuffer.readUTF8ValidatedString(length:)

Result:

You can read strings from a ByteBuffer and be certain they are valid UTF8.
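
To illustrate the difference the motivation describes, here is a minimal sketch using only the standard library (no ByteBuffer involved). The byte values are an assumption chosen for the example; String(validating:as:) needs a Swift 6 toolchain.

```swift
// Bytes for "hi" followed by 0xFF, a byte that never appears in valid UTF-8.
let bytes: [UInt8] = [0x68, 0x69, 0xFF]

// String(decoding:as:) never fails: invalid sequences are replaced with U+FFFD.
let repaired = String(decoding: bytes, as: UTF8.self)

// String(validating:as:) returns nil when the input is not valid UTF-8.
let validated = String(validating: bytes, as: UTF8.self)

print(repaired == "hi\u{FFFD}")
print(validated == nil)
```

The first initialiser silently repairs the input, so a caller cannot tell that the original bytes were malformed; the second surfaces the problem as nil.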

Contributor

Lukasa left a comment

Thanks for this! I think the API design needs a bit of a tweak.

I also wonder if we can implement an older-Swift fallback using https://developer.apple.com/documentation/swift/string/init(validatingutf8:)-208fn.

}
return self.withUnsafeReadableBytes { pointer in
    assert(range.lowerBound >= 0 && (range.upperBound - range.lowerBound) <= pointer.count)
    return String(validating: UnsafeRawBufferPointer(fastRebase: pointer[range]), as: Unicode.UTF8.self)
Contributor

This is probably not the spelling we want. Right now it's not possible to tell if this API call fails because there aren't enough readable bytes or because the string is not valid UTF8. We need to distinguish the two cases.

Contributor Author

I could throw an error for invalid strings?

Contributor

I think that might be appropriate, yeah.

Contributor Author

OK, it now throws an error on invalid UTF8.
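
The distinction discussed in this thread can be sketched as follows. This is not the PR's code: TinyBuffer and InvalidUTF8Error are hypothetical names invented for the example, standing in for ByteBuffer and whatever error type the PR uses. The assumption is that nil means "not enough readable bytes" and a thrown error means "bytes were present but not valid UTF8".

```swift
// Hypothetical error type; the PR's actual error may be named differently.
struct InvalidUTF8Error: Error {}

// Hypothetical stand-in for ByteBuffer, reduced to a byte array and a reader index.
struct TinyBuffer {
    private(set) var bytes: [UInt8]
    private(set) var readerIndex = 0

    init(bytes: [UInt8]) { self.bytes = bytes }

    var readableBytes: Int { bytes.count - readerIndex }

    // Returns nil if fewer than `length` bytes are readable,
    // throws if the bytes are readable but are not valid UTF-8.
    mutating func readUTF8ValidatedString(length: Int) throws -> String? {
        guard length >= 0, length <= readableBytes else { return nil }
        let slice = bytes[readerIndex ..< readerIndex + length]
        guard let string = String(validating: slice, as: UTF8.self) else {
            throw InvalidUTF8Error()
        }
        // Only advance the reader index after a successful read.
        readerIndex += length
        return string
    }
}
```

With this shape a caller can handle the two failure modes separately: an `if let` for the short-read case and a `catch` for malformed input.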

@adam-fowler
Contributor Author

> Thanks for this! I think the API design needs a bit of a tweak.
>
> I also wonder if we can implement an older-Swift fallback using https://developer.apple.com/documentation/swift/string/init(validatingutf8:)-208fn.

The problem with using the above function is that it requires the string to be null-terminated. I guess I could copy it out to a separate buffer and add the null termination.
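
The copy-and-terminate idea could look roughly like the sketch below. validatedString(fromUTF8:) is a hypothetical helper, not part of the PR, and it takes a plain byte array rather than a ByteBuffer. One loud assumption: because a C string ends at the first zero byte, this sketch refuses input containing an embedded NUL, even though U+0000 is valid UTF8. String(validatingUTF8:) may produce a deprecation warning on newer toolchains.

```swift
// Hypothetical fallback helper for toolchains without String(validating:as:).
func validatedString(fromUTF8 bytes: [UInt8]) -> String? {
    // Limitation of this sketch: an embedded NUL would truncate the C string,
    // so such input is rejected here even though it can be valid UTF-8.
    guard !bytes.contains(0) else { return nil }

    // Copy the bytes into a separate buffer and append the null terminator.
    var cString = bytes.map { CChar(bitPattern: $0) }
    cString.append(0)

    return cString.withUnsafeBufferPointer { buffer in
        // baseAddress is non-nil because the array holds at least the terminator.
        String(validatingUTF8: buffer.baseAddress!)
    }
}
```

The extra allocation and copy are the cost of this approach compared with validating the bytes in place.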

@Lukasa
Contributor

Lukasa commented Nov 14, 2024

@adam-fowler
Contributor Author

Oh, we could do this one instead: https://developer.apple.com/documentation/swift/string/init(utf8string:)-8qmaq

Which also requires a null-terminated string
