-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: better support for arm-none-eabi-objdump output #4
base: master
Are you sure you want to change the base?
Conversation
amaanq
commented
Nov 19, 2023
•
edited
Loading
edited
- ... is repeated filler zero bytes
- comments can start with ;
- arm uses 2 or 4 bytes for "machine code bytes"
Also added support for raw data (see the added test). That was a bit painful to get working but I effectively only parse raw data if there's a dot within the first 4 characters, it doesn't have "0x" in it (could be a .word or .byte directive, etc), and ends with a newline |
Hey @amaanq thank you for your edits. I wondered at the start of making this parser what'd be the best approach for arm (if we'd need a separate parser or just extend this parser to handle both conventions. I'm glad this one was able to suit both. Not to distract from the work you've already done but I wanted to point out that I made a separate repository for just disassembly, over at https://github.com/ColinKennedy/tree-sitter-disassembly. I wonder if it'd be less work to treat disassembly as an injected language in tree-sitter-objdump and then move the disassembly/parse logic over to https://github.com/ColinKennedy/tree-sitter-disassembly, in one place. Or if we would be better off to keep the two repositories separate. What do you think? Either way, whether we keep it separate or inject+consolidate, I can make sure both repositories handle the most conventions within reason. For the new cases you've mentioned in the PR description, would you mind adding tests for these? The |
Ah I see, that's nice. I was planning to have asm injected into instruction nodes, I don't think it's more or less work either way to split diassembly parsing into another repo via injections or just keep it all here, the parser size is still very very small. I added a test case for |
20000018: 00000000 20000200 20000268 200002d0 ....... h.. ... | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these "..."s meant to be part of the byte dump or do they represent something different? ....... h.. ...
is 15 characters so the "..."s seemed to be something different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something different, it's like filler zeroes
Hi @amaanq sorry for the radio silence on this one. I made a separate parser at https://github.com/ColinKennedy/tree-sitter-disassembly and PRed it to nvim-treesitter at nvim-treesitter/nvim-treesitter#5755. Depending on how that goes, it may be better to reduce tree-sitter-objdump to just inject to tree-sitter-disassembly. Either way, injection or not, I'm happy to merge this PR in either case. The only thing giving me pause right now is the failing https://github.com/ColinKennedy/tree-sitter-objdump/actions/runs/6957883193/job/18931655399?pr=4. But looking at the log, it seems to be an issue with the |