-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure Path _str takes UTF-8 String fast paths #72
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea - I think this is likely the right choice, but we should try to get some evidence to confirm that this will improve overall performance.
Could you try running the performance tests and report back how the numbers look before and after this change?
Which tests would you recommend? I looked at PathPerfTests but those all pass native Swift Strings into the constructor, so I wouldn't expect any major changes there. Would you like me to add a new test? |
Adding a new perf test for this will be great! |
Client strings typically arrive via IPC and get constructed by SWBuild so I'm not sure if we really suffer from NSString bridging; did you profile a real build and see time spent there? By the way, there's some progress in Path towards migrating to System.FilePath as the underlying representation, I'm curious what performance impact that would have. |
Yes, for the Apple people here please take a look at FB11698739 (Path._isNormalized should be called less often, made more efficient, or cached) |
They can be efficiently iterated but require more context than the index to "utf8 char at index" function provides. This absolutely could be improved but with a deeper analysis of how to do this resiliently without causing a large |
I've added a performance test that operates on NSPathStore2 (at least on macOS). This change makes it run about 9x faster. I'm not entirely sure what the point of measure is given it discards its timing information but I just copied whatever the other tests were doing (but running my own performance analysis). |
Just noting that this problem is why we use ByteString in a bunch of places. WRT the PR, the cost to I had a chat with @Catfish-Man about it and I came away thinking that this is a real problem but putting Why do we have NSStrings or non-contiguous/non-UTF8 strings? I think one reason could be that we use Foundation/NS APIs to read or decode data, and we should replace those with native APIs. @Catfish-Man also mentioned that random innocent looking String initializers also will give you non-contiguous/non-UTF8 strings, e.g. String(contentsOf:) does, but String(contentsOf:encoding:) does not IFF the encoding is ASCII or UTF8. @saagarjha what do you think about that approach? Instead of converting, assert and find them at the source, then replace with native contiguous UTF8-producing APIs? |
Btw I also did some work in |
I’d be fine with that as long as it does get tracked down and fixed. I believe many of these are coming from Xcode itself, which I can’t actually make changes in, so someone else would have to make the fixes there. |
@saagarjha the good thing is that SWBBuildService normally runs out of process from Xcode, and Xcode sends requests over IPC, so we don't actually take any NSString objects from Xcode and it's within our control how Strings are constructed and represented. I bet we have our own code paths (maybe something like |
Property list and JSON decoding is one area where I definitely expect we're getting NSStrings back. |
@saagarjha what do you think about that? If you have time and want to pursue this, for example you could change your PR to have |
Sure, that sounds reasonable. I'll take a look. |
These are the ones I could find. I chose to keep the assert because it should help prevent new clients from accidentally reintroducing this issue. |
Awesome! Cleaner than I expected. |
@swift-ci test |
Hmm, looks like I must have missed something. Let me take a look… |
Hmm, I can't reproduce the test failure. Is there a way to find out which test is failing? |
@shahmishal @owenv, can we get result bundles or crash logs out of swift ci? This console log can't identify which test is hitting the assertion https://ci.swift.org/job/pr-swift-build-macos/110/ |
Don't think we have a way to get crash logs from swift-ci, that's probably the only way to see which test triggered the assertion. |
BTW, my random guess would be that |
I can't reproduce either. @saagarjha you can land without the assert. |
Path relies on fast iteration of its backing _str for most of its operations. To avoid the performance hit of repeatedly calling -characterAtIndex: when constructed with an unfortunate string, update clients to avoid doing this.
Ok, I've removed the assert. |
@swift-ci test |
@@ -737,7 +737,11 @@ class LocalFS: FSProxy, @unchecked Sendable { | |||
|
|||
// FIXME: This enumerator has unclear error handling. Does nextObject() return nil on error? That's wrong. | |||
return try enumerator.compactMap { next in | |||
let nextPath = path.join(next as? String) | |||
// FileManager.DirectoryEnumerator produces Strings that are backed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it any different if we use the URL API instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saagarjha I'm good to land as is and track this as a future issue for someone to explore, but want to run by you to see if you agree or want to look at this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to you. I did some basic tests and it seems like this API might actually do the conversions for us but I need to investigate it further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if it's up to me :) then I'm not in a rush to merge, let me know what you find here
@swift-ci test |
@mirza-garibovic |
Some clients (cough cough Xcode) like to create Paths out of bridged NSStrings, which cannot be efficiently iterated. As most of our operations require doing this frequently, we might as well convert on construction to avoid the -characterAtIndex: slow path.