Expanded Python syntax file #3609

Open: wants to merge 2 commits into base: master

EllKyGr commented Jan 14, 2025

Greetings,
I recently discovered Micro and it's an amazing editor. I managed to customize it to my needs, and while doing so I ran into limitations with the color schemes. I couldn't edit them to my heart's content, so I tried the syntax file approach instead, and now I want to share the result. After some trial and error I added some basic cases that were missing when coding in Python:

  • Static typing
  • Dot operator followed by a function call, and so on
  • Built-in and custom function calls
  • Variable assignments
  • Placeholders in loops
  • Import, class and error handling
  • Relocated/modified previous statements
  • f-strings

New statements (and modified ones) come with a comment explaining the newly highlighted cases. Sadly, I wasn't able to address all of them the way I wanted. For instance, the identifier.var rule at line 32 won't work in a dictionary because of the colon (it is overruled by the type statement at line 46, which was relocated in order to take precedence). It also only works properly if a collection, say a list, contains only variables and numbers (both floats and integers); if there's a string, the variable won't be highlighted as intended and will use the default color stated in the .micro color scheme file.
Finally, I wasn't able to use the color-link group for URLs. Although it is a fairly generic regexp, the editor just won't highlight it. I can only assume it needs a set of specific rules to work properly, but to be honest I'm not sure how to write them.
Even if every possible scenario could be targeted with the current groups and subgroups, by using better regexps and/or relocating some rules to take precedence, I would suggest expanding the groups overall, for instance with identifier.var.attr for calling and defining functions.

As part of this small endeavor I also reworked a couple of the color schemes to comply with this new syntax file, which is meant to work specifically with Python files (unless files in other languages share keywords and such with Python).
These "new" color schemes (three at the moment: two are expanded from yours, the third is an adapted version) are divided into syntax-related highlighting and editor appearance, with some comments to improve readability. I left some entries unchanged, mainly because I couldn't find the case they're supposed to highlight.
I believe a deeper explanation of every group and subgroup would make it much easier to expand or create both syntax files and color schemes.

EllKyGr changed the title from "Micro c sch" to "Expanded Python syntax file" on Jan 14, 2025
Andriamanitra (Contributor) left a comment

I don't think having color schemes that are specific to one syntax file is a good idea. You should probably stick to using the same widely used groups as the other syntax files, and leave color schemes alone (or extend specific existing colorschemes instead of adding new ones).

You don't need to add a comment # New one after every change, git keeps track of the changes already.

Overall I'm not sure if this syntax is an improvement, I think it introduces more problems than it solves to be honest.

Comment on lines +17 to +40
# Basic variable assign
- identifier.var: "((^|\\s+|\\()[\\w]+(\\s|,)?)+=" # New one
# Variable assignment to another variable
- identifier.var: "=\\s?[\\w]+" # New one
# Variables followed by ':' (for typing), '.' operand (call a function i.e. dict.items()) and as placeholders inside 'for' loop
- identifier.var: "[\\w]+((\\.|\\s)\\b|: )" # New one
# Variables stated after specific keywords
- identifier.var: "(return|while|for|if|and|or|in)\\s[a-zA-Z_0-9]+\\s?" # New one
# Variables followed or preceded by comparative operands.
- identifier.var: "[a-zA-Z0-9_]+\\s(<|>|<=|>=|!=|==)\\s([a-zA-Z0-9_]+)?:?$" # New one
# Variables followed/preceded by normal or compound assignment
- identifier.var: "(\\+|\\-|\\/|\\*)\\s\\w+" # New one
- identifier.var: "(^|\\s+|\\()\\w+\\s(\\+|\\-|\\/|\\*)?=?" # New one
# Variables inside any type of brackets. NOTE: variables with anything but integers or float, won't work.
# Dictionaries highlight variables as 'types' (because of the ':') instead of a regular variable
- identifier.var: "(\\(|\\[|{)([\\w]+|[\\d_]+(\\.[\\d_])*)((,|:)\\s([\\w]+|[\\d_]+(\\.[\\d_])*)?){0,},?(\\)|\\]|})\\s?(:|=)?" # New one
# Iterator slicing and key assignment
- identifier.var: "\\b[\\w]+(\\[[\\w:.\\(\\)\\s/*-+]+\\])" # New one
# Custom function call
- identifier: "[\\w]+\\(" # New one
# Functions called with dot operand
- identifier: "\\.[a-z]+\\(?" # New one
# Import, class and error statements
- identifier.class: "(import|from|as|class|except)\\s[\\w]+((,\\s[\\w]+)+|:)?" # New one

Andriamanitra (Contributor):

I don't really understand why you want to combine highlighting identifiers with other things like keywords and operators. What purpose does it serve?

EllKyGr (Author):

Because there are not enough groups and subgroups to distinguish one from another. Even if I only addressed the basic x = 5, if you then use x in a while loop, for instance, the variable itself won't be highlighted as you would expect from pretty much any text editor with these capabilities.
Also, those keywords and operators won't be highlighted, only what follows them, specifically variables (neither numbers nor strings are affected), but I had to add them to the regex for it to actually work.
Repeating identifier.var several times doesn't look particularly pretty, but it is clear to me that the editor could handle more specific subgroups to avoid these inconveniences.
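
To see which characters such a combined rule actually claims, here is a minimal sketch that runs one of the patterns from the diff through Python's re module. It is only an approximation: Micro is written in Go and compiles its rules with Go's regexp package, and the helper name matched_spans is made up for this illustration. How Micro resolves overlapping matches from other rules is not modeled here.

```python
import re

# Rule "Variable assignment to another variable" copied from the diff,
# with the YAML string escaping removed.
ASSIGN_FROM_VAR = re.compile(r"=\s?[\w]+")


def matched_spans(pattern, line):
    """Return every substring of `line` that the pattern matches (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(line)]


# The match includes the operator, not only the identifier that follows it.
print(matched_spans(ASSIGN_FROM_VAR, "x = y"))
# \w also accepts digits, so a numeric literal is matched as well.
print(matched_spans(ASSIGN_FROM_VAR, "total = 5"))
# The second '=' of a comparison is enough to trigger the rule.
print(matched_spans(ASSIGN_FROM_VAR, "if x == limit:"))
```

Taken on its own, the pattern matches the operator and numeric literals too, so whether they end up colored as identifier.var depends on which other rules cover the same text.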

@@ -45,13 +73,20 @@ rules:
skip: "\\\\."
rules:
- constant.specialChar: "\\\\."
# f-string format
- constant.specialChar: "\\{[\\w\\d/*-+.\\s\\(\\)]*?\\}" # New one

Andriamanitra (Contributor):

This only works for a tiny subset of f-string syntax (you may want to read the comments in #3605). Also, I don't think it makes sense to highlight the format groups in regular strings.
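
The limited coverage can be checked with a short sketch that tries the f-string pattern from the diff against a few common replacement fields. It uses Python's re module as a stand-in for the Go regexp engine Micro uses, so treat the results as indicative only; the sample strings are invented for this illustration.

```python
import re

# f-string rule copied from the diff, with the YAML string escaping removed.
FSTRING_FIELD = re.compile(r"\{[\w\d/*-+.\s\(\)]*?\}")

# Replacement fields that are all valid inside a Python f-string.
samples = [
    "{name}",        # plain name
    "{a + b}",       # simple arithmetic
    "{value:>10}",   # format spec
    "{value!r}",     # conversion flag
    "{row['id']}",   # subscript with a string key
    "{a - b}",       # subtraction
]

# Record, for each sample, whether the pattern finds a field in it.
results = {text: FSTRING_FIELD.search(text) is not None for text in samples}

for text, matched in results.items():
    print(f"{text!r:16} matched={matched}")
```

Only the first two samples match. The subtraction case fails because `*-+` inside the character class is read as a range from `*` to `+`, so a literal `-` is not part of the class. The pattern also has no way to tell whether the surrounding string literal carries an `f` prefix.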

EllKyGr (Author):

I'll take a look at it

Comment on lines +88 to +89
# NOTE: Requires unique symbols %&#... Currently not working presumably missing specific rules
- constant.string.url: "(http(s)?:\\/\\/w{3}\\.)?[\\w]+(\\.[a-z]{,3})+((\\/[\\w]+){1,})?" # New one

Andriamanitra (Contributor):

I personally would not want to have URLs syntax highlighted as they are not part of Python syntax.

EllKyGr (Author):

Maybe, though I never implied that was the case. Rather, since there is apparently a color group for URLs, I assumed some people might actually want to see them highlighted. Otherwise, what's the point of the subgroup?

Comment on lines +45 to +46
# From `typing` module and other modules
- type: "(:|->)\\s[a-zA-Z]+(\\[[a-zA-Z]+(,\\s[a-zA-Z]+)?\\])?\\s?:?" # New one

Andriamanitra (Contributor):

Attempting to syntax highlight type annotations specifically is IMO ill-advised, because Python does not have a clear separation of types and runtime variables. The types may be arbitrarily complicated expressions like dict[tuple[str, int], Callable[["MyType", int], float] | None].
The : also conflicts with a bunch of unrelated syntax like if x: y.
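
Both points can be illustrated with a small sketch that applies the type rule from the diff to a few lines of Python. It relies on Python's re module rather than the Go regexp engine Micro uses, and the helper name type_matches is hypothetical, so it shows what the pattern matches in isolation, not what Micro finally renders.

```python
import re

# Type rule copied from the diff, with the YAML string escaping removed.
TYPE_RULE = re.compile(r"(:|->)\s[a-zA-Z]+(\[[a-zA-Z]+(,\s[a-zA-Z]+)?\])?\s?:?")


def type_matches(line):
    """Return the full text of every match of the type rule in `line` (hypothetical helper)."""
    return [m.group(0) for m in TYPE_RULE.finditer(line)]


# Simple annotations are matched as intended.
print(type_matches("def f(a: int) -> str:"))
print(type_matches("x: dict[str, int]"))

# A nested annotation is only matched up to the first type name.
print(type_matches("a: dict[tuple[str, int], float]"))

# Non-annotation syntax that happens to contain ': name' also matches.
print(type_matches("if x: y"))
print(type_matches('d = {"k": v}'))
```

The last two calls return matches even though neither line contains a type annotation, and the nested example stops after `dict`.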

EllKyGr (Author):

> Attempting to syntax highlight type annotations specifically is IMO ill-advised, because Python does not have a clear separation of types and runtime variables.

Can you elaborate on how highlighting types affects Python's runtime variables?

> The types may be arbitrarily complicated expressions like dict[tuple[str, int], Callable[["MyType", int], float] | None].

The attached screenshot shows how types are handled with both the extended syntax file and darcula_py.micro. As far as I can see it works as intended, but I suppose there could be even more complicated type annotations.

[Screenshot: Types]

> The : also conflicts with a bunch of unrelated syntax like if x: y.

Fair enough, although that falls under the umbrella of good practices, project rules and the use of PEP-compliant formatters. Case in point: the autofmt plugin uses yapf for Python, and if you write that statement it will immediately be rewritten in the "correct" format:

if x:
    y

which won't cause the highlighting conflict.

EllKyGr (Author) commented Jan 15, 2025

> I don't think having color schemes that are specific to one syntax file is a good idea.

I'm aware of that. In the end this is a proposal addressing a current limitation: simple and general grouping might be "good enough", but that doesn't mean everybody shares that view, especially since the editor is presented as a customizable, smoother experience compared to older terminal text editors. For instance, the default colorschemes look effectively the same with 256 or 16 colors, with plain colors repeating over and over; that makes sense for the latter but not for the former, so why not use it to its full potential, at the very least as an option? Let the user decide instead. The screenshots attached here show the base darcula and the extended version to illustrate that. Again, it's not about one over the other but about being able to choose.

[Screenshot: darcula]

[Screenshot: darcula_py]

> You should probably stick to using the same widely used groups as the other syntax files...

Same issue as the previous point. I don't know if it is too much to ask or whether there are valid reasons not to implement it, but I see a missed opportunity. Yes, at present it works fine, but someone might want to see a clear difference between specific statements, words and so on. I believe most languages use the term variable in effectively the same way, yet some will call a function a method, use attributes rather than properties, and so on. It would make sense to have a master syntax file first, with these widely used groups, and then a specific syntax file for each language addressing its quirks.

> ...and leave color schemes alone (or extend specific existing colorschemes instead of adding new ones).

Both darcula_py and monokai_py do exactly that. The main reason for the _py suffix was to let reviewers visually compare the original and extended versions, and to show how they work with the extended syntax file. The third one follows the same logic, and adding more colorschemes to the editor does not look like an actual issue to me, unless you want to keep the editor lightweight. I've seen plugins that only add colorschemes, so I guess that's the way to go for this, all things considered.

> Overall I'm not sure if this syntax is an improvement, I think it introduces more problems than it solves to be honest.

I've been testing it for the last two weeks or so with no issues of any kind (performance, crashes or other runtime problems), so I'm unsure what kind of problems it might introduce. I suppose users will then have to tweak the editor to suit their needs or just leave it as it is.

Andriamanitra (Contributor) commented Jan 15, 2025

> > ...and leave color schemes alone (or extend specific existing colorschemes instead of adding new ones).

> Both darcula_py and monokai_py do exactly that. The main reason for the _py suffix was to let reviewers visually compare the original and extended versions, and to show how they work with the extended syntax file.

I see. For reviewing it would be better to edit darcula and monokai directly instead of introducing a new file; that way git (and GitHub) shows the diff between the files.

> The third one follows the same logic, and adding more colorschemes to the editor does not look like an actual issue to me, unless you want to keep the editor lightweight. I've seen plugins that only add colorschemes, so I guess that's the way to go for this, all things considered.

I don't mind additional colorschemes (within reason) but having ones that are slight variants of each other is just annoying when browsing them. If we add _py variants, should there then also be _rb, _rs, _cpp, _hs, and so on?

> > Overall I'm not sure if this syntax is an improvement, I think it introduces more problems than it solves to be honest.

> I've been testing it for the last two weeks or so with no issues of any kind (performance, crashes or other runtime problems), so I'm unsure what kind of problems it might introduce. I suppose users will then have to tweak the editor to suit their needs or just leave it as it is.

I mean weird glitches like adding/removing spaces changing the color of a variable name even though it's still parsed exactly the same by the actual interpreter. Some variables get colored only partially and others don't get colored at all. Sometimes the exact same expression gets colored differently in a different context. In every source file I look at there's something wonky going on.

Here's a screenshot with monokai_py looking at one of my source files:
[Screenshot: monokai_py showing a bunch of inconsistencies]
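
The whitespace sensitivity can be reproduced outside the editor with a minimal sketch using one of the operator rules from the diff. Python's re module stands in for the Go regexp engine Micro uses, and the helper name is_claimed is invented for this illustration, so this only shows when the pattern fires, not the final coloring.

```python
import re

# Rule "Variables followed/preceded by normal or compound assignment"
# copied from the diff, with the YAML string escaping removed.
AFTER_OPERATOR = re.compile(r"(\+|\-|\/|\*)\s\w+")


def is_claimed(line):
    """Return True if the rule matches anywhere in `line` (hypothetical helper)."""
    return AFTER_OPERATOR.search(line) is not None


# The three expressions are equivalent to the Python interpreter,
# but the rule requires exactly one whitespace character after the operator.
for expr in ("a + b", "a+b", "a +b"):
    print(f"{expr!r:8} claimed={is_claimed(expr)}")
```

Only the first spelling is matched, which is consistent with a variable changing color when spaces are added or removed.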

EllKyGr (Author) commented Jan 16, 2025

> I see. For reviewing it would be better to edit darcula and monokai directly instead of introducing a new file; that way git (and GitHub) shows the diff between the files.

Fair enough. I know you can see the differences in both Git and GitHub; I just added the extended one in case somebody wanted to compare them visually, i.e. by using set colorscheme inside Micro to swap between them.

> I don't mind additional colorschemes (within reason) but having ones that are slight variants of each other is just annoying when browsing them. If we add _py variants, should there then also be _rb, _rs, _cpp, _hs, and so on?

Same as the previous point, again it was just for visual comparison: I would have changed it with no problem if approved.

> Here's a screenshot with monokai_py looking at one of my source files

I see what you mean now. My biggest mistake was definitely assuming that every piece of code would be relatively simple and that the expressions presented here would suffice.
Anyway, I can make the appropriate modifications to rename the colorscheme files if those are OK. Even though the syntax file is not, I would still insist on adding more subgroups or a master syntax file for universal terms. Other than that, I suppose there's no way to cover all potential cases (like the ones in your file) without cluttering the syntax file even more.
