Regex for Asian languages does not work, e.g. (?<!\p{Han}) or (?!\p{Lo}) #373

Open
iG8R opened this issue Apr 15, 2024 · 5 comments

@iG8R

iG8R commented Apr 15, 2024

Sometimes I need to replace Asian characters only when they are surrounded by Western ones.
To do this, I'm trying to use lookbehind and lookahead constructs, e.g. (?<!\p{Han}) or (?!\p{Lo}), but they don't work in FoxReplace at all, although everything is fine when I check them, for example, on https://regex101.com.

@Woundorf
Owner

This is because I don't use any of the Unicode flags when creating the RegExp object, and they are needed to support these \p{...} character classes (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Unicode_character_class_escape). Maybe they should always be used, or be enabled with an option (like case sensitivity), but I didn't know about these flags until relatively recently.

Could you please create another issue asking for unicode support, with a link to this one as example?

In the meantime, as a workaround, you could replace with a function that tests whether the found text actually matches the correct regexp, returning the replaced string if it does and the unmodified string otherwise.

@iG8R
Author

iG8R commented Apr 16, 2024

Thanks a lot for your attention and advice. I've already tried it, but using a function is too cumbersome in this case.

@iG8R
Author

iG8R commented Apr 16, 2024

Maybe there is also a flag that makes it possible to use case-changing escapes in the replacement string, like the following from https://stackoverflow.com/a/33351224/6773436:

  1. Capitalize words

    Find: (\s)([a-z]) (\s also matches new lines, i.e. "venuS" => "VenuS")
    Replace: $1\u$2

  2. Uncapitalize words

    Find: (\s)([A-Z])
    Replace: $1\l$2

  3. Remove camel case (e.g. cAmelCAse => camelcAse => camelcase)

    Find: ([a-z])([A-Z])
    Replace: $1\l$2

  4. Lowercase letters within words (e.g. LowerCASe => Lowercase)

    Find: (\w)([A-Z]+)
    Replace: $1\L$2
    Alternate Replace: \L$0

  5. Uppercase letters within words (e.g. upperCASe => uPPERCASE)

    Find: (\w)([A-Z]+)
    Replace: $1\U$2

  6. Uppercase previous (e.g. upperCase => UPPERCase)

    Find: (\w+)([A-Z])
    Replace: \U$1$2

  7. Lowercase previous (e.g. LOWERCase => lowerCase)

    Find: (\w+)([A-Z])
    Replace: \L$1$2

  8. Uppercase the rest (e.g. upperCase => upperCASE)

    Find: ([A-Z])(\w+)
    Replace: $1\U$2

  9. Lowercase the rest (e.g. lOWERCASE => lOwercase)

    Find: ([A-Z])(\w+)
    Replace: $1\L$2

  10. Shift-right-uppercase (e.g. Case => cAse => caSe => casE)

    Find: ([a-z\s])([A-Z])(\w)
    Replace: $1\l$2\u$3

  11. Shift-left-uppercase (e.g. CasE => CaSe => CAse => Case)

    Find: (\w)([A-Z])([a-z\s])
    Replace: \u$1\l$2$3

@Woundorf
Owner

This is not possible in JavaScript without using a custom function. The only special replacement patterns that JavaScript recognizes (such as $&, $1 and $<name>) are listed here.
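As a sketch of what such a custom function could look like, the two examples below reproduce items 1 and 4 from the list above with a replace callback in plain JavaScript. The function names are made up for illustration, and this is not tied to how FoxReplace invokes functions.

```javascript
// Item 1, "Capitalize words": Find (\s)([a-z]), Replace $1\u$2.
// The \u escape (uppercase the next character) becomes toUpperCase() on group 2.
function capitalizeWords(text) {
  return text.replace(/(\s)([a-z])/g, (match, space, letter) => space + letter.toUpperCase());
}

// Item 4, "Lowercase letters within words": Find (\w)([A-Z]+), Replace $1\L$2.
// The \L escape (lowercase what follows) becomes toLowerCase() on group 2.
function lowercaseWithinWords(text) {
  return text.replace(/(\w)([A-Z]+)/g, (match, first, rest) => first + rest.toLowerCase());
}

console.log(capitalizeWords("hello big world"));
console.log(lowercaseWithinWords("LowerCASe"));
```

The other items in the list can be handled the same way by applying toUpperCase() or toLowerCase() to the relevant captured group.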

The examples listed in the Stack Overflow answer are for Sublime Text, which according to another comment relies on Boost, which, following the links, seems to support the same features as Perl.

@iG8R
Author

iG8R commented Apr 19, 2024

Thanks a lot for the clarification. That's a pity.
