Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support raw identifiers (backtick-delimited identifiers containing non-identifier characters). #76636

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

allevato
Copy link
Member

@allevato allevato commented Sep 22, 2024

Implementation of SE-0451.

Comment on lines +114 to +118
/// Returns true if this identifier contains non-identifier characters and
/// must always be escaped with backticks, even in contexts were other
/// escaped identifiers could omit backticks (like keywords as argument
/// labels).
bool mustAlwaysBeEscaped() const;
Copy link
Member

@rintaro rintaro Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some test cases for code completion checking mustAlwaysBeEscaped() items are actually escaped in code completion suggestions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Those did need to be fixed.

@allevato
Copy link
Member Author

This PR is ready except for the issue mentioned in the description of specialized C++ templates getting escaped by backticks (and now since the backticks are added to the mangling, it changes their mangled names as well). After the holidays I'll take another look at how to handle these.

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

The issues with imported C++ template specializations have been fixed.

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@grynspan
Copy link
Contributor

grynspan commented Jan 8, 2025

Is there a named feature I can test for with hasFeature()? We may not be able to adopt until 6.3 without it because we risk breaking on older 6.2 toolchains that don't support raw identifiers.

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

Is there a named feature I can test for with hasFeature()? We may not be able to adopt until 6.3 without it because we risk breaking on older 6.2 toolchains that don't support raw identifiers.

Looks like LANGUAGE_FEATURE(RawIdentifiers, ...) works for this, without being an upcoming or experimental feature (it's neither). Thanks for the suggestion!

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

1 similar comment
@allevato
Copy link
Member Author

allevato commented Jan 9, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 9, 2025

swiftlang/swift-syntax#2857

@swift-ci please test macOS platform

1 similar comment
@allevato
Copy link
Member Author

swiftlang/swift-syntax#2857

@swift-ci please test macOS platform

@allevato allevato force-pushed the rich-identifiers branch 3 times, most recently from d31b241 to 82eb01b Compare January 12, 2025 15:10
@allevato
Copy link
Member Author

@DougGregor @jckarter Can you review please, or forward along to someone who can? Thanks!

Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parser and Index related changes looks good to me.
Deferring to others for other areas.

if (Name.mustAlwaysBeEscaped()) {
// Names that are raw identifiers must always be escaped regardless of
// their position.
shouldEscapeKeywords = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calls Lexer:: identifierMustAlwaysBeEscaped(StringRef) twice, in Name.mustAlwaysBeEscaped() and Builder.escapeKeyword() which is a waste. Can we avoid that? I'm fine with just manually wrapping it with back-ticks here, or you could factor out the escaping logic from Builder.escapeKeyword.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've factored out a escapeWithBackticks method in the builder so it can be called directly.

Identifier Name) {
if (Name.mustAlwaysBeEscaped()) {
SmallString<16> buffer;
Builder.addBaseName(Builder.escapeKeyword(Name.str(), true, buffer));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use swift-ide-test in test/IDE for testing these? Our primary place for testing code completion results is test/IDE/complete_*.swift.
test/SourceKit/CodeComplete is for SourceKit specific functionalities, like response format, ordering or such.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the tests to test/IDE and updated them to use %batch-code-completion.

allevato and others added 8 commits February 23, 2025 20:10
Raw identifiers are backtick-delimited identifiers that can contain any
non-identifier character other than the backtick itself, CR, LF, or other
non-printable ASCII code units, and which are also not composed entirely
of operator characters.
This requires no additional work beyond just parsing the identifier because Clang module maps already support such names in quotes.
The original module names themselves must still be valid unescaped identifiers; most of the serialization logic in the compiler depends on the name of a module matching its name on the file system, and it would be very complex to turn escaped identifiers into file-safe names.
… JSON file.

For build systems that already generate these files, it makes sense to include the aliases so that the map file serves as a comprehensive index of how the module inputs are referenced.
`#fileID` never accounted for the possibility that someone one have
a module alias _itself_, so it always generated the module's real
(physical) name. This _technically_ changes the behavior of `#fileID`
for self-aliased modules, but since nobody would have ever had a reason
to do that before raw identifiers, it's unlikely that this change would
affect anyone in practice.
If a decl is exported to Objective-C (explicitly or implicitly), it
must be given an explicit name that is a valid Objective-C identifier.
Imported C++ template specializations receive identifiers that contain
their type signature; e.g., `X<Y, Z>`. Since this means the identifier
contains non-identifier characters, the new behavior was trying to
escape them with backticks in ASTPrinter, ASTMangler, and the runtime
metadata. This pulls that back to preserve the current behavior for
specifically those types.
This lets clients test `#if hasFeature(RawIdentifiers)` to
determine compiler support.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants