avclass package, class, restructure #35
Reduced the input arguments to --input and --type. The tool now handles multiple files or directories, which makes it easier to add additional inputs, such as MetaDefender. I left the old arguments in place for backward compatibility, but we may just want to remove them.
Apologies for this falling in a vacuum. We did check your merge request when first issued, but as you mentioned there were many changes combined in the merge request and it was not easy to merge except blindly.
Others of the proposed changes will hopefully be integrated soon in some form. For example, we plan to modify the options to provide the input and the type, but so far we have focused on changes that did not require changes to existing (AVClass2) options beyond the tool name change. We also plan to refactor labeler.py to make it easier to use as a library. In any case, many thanks for your contribution. We have added you to the list of significant contributors!
Now there are two command line options: -f to provide files and -d to provide directories. The format of each file is automatically determined by the tool. Input-related options removed: -vt, -vt3, -lb, -vtdir, -lbdir, -gz. We have also changed the default behavior of the tool: it now runs in compatibility mode by default and outputs the families, which we believe is what most users want. There is a new -t option to output the full list of tags. We have also removed the -p option, since the new -t option includes the full path of the tags by default, which provides more information.
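For readers following along, here is a minimal sketch of how the -f, -d, and -t options described above could be wired up with argparse. It is not the code committed to the repository: the function names `build_parser` and `collect_inputs` and the `dest` names are hypothetical, and the automatic format detection is left out.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the -f, -d, and -t options."""
    parser = argparse.ArgumentParser(
        description="Sketch of the simplified input options")
    # -f may be repeated to pass several report files.
    parser.add_argument("-f", dest="files", action="append", default=[],
                        help="file with AV reports (may be repeated)")
    # -d may be repeated; every regular file in the directory is an input.
    parser.add_argument("-d", dest="dirs", action="append", default=[],
                        help="directory with report files (may be repeated)")
    # -t switches from the default family output to the full tag list.
    parser.add_argument("-t", dest="tags", action="store_true",
                        help="output all tags instead of only the family")
    return parser


def collect_inputs(files, dirs):
    """Return the explicit files followed by the regular files in each directory."""
    paths = list(files)
    for directory in dirs:
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            if os.path.isfile(full):  # skip subdirectories
                paths.append(full)
    return paths
```

With this shape, adding a new input format would not need a new command line option, only a new branch in whatever routine detects the format of each collected file.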
Awesome - I'll have to look it over, in particular to see whether you made any changes to common.py. I just implemented AVClass as a service in AssemblyLine: https://github.com/CybercentreCanada/assemblyline-service-avclass Let me know if you have any suggestions on the implementation. One thing I did expand on was the translations and taxonomy. I provided an option to extend your dataset with Malpedia families / alt_names. I also added a function (I think it's called is_hex) because I was getting a number of false positives on short hex labels.
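To illustrate the kind of check meant by the hex filter, here is a rough sketch. It is not the function from the AssemblyLine service: the name `is_hex` follows the comment above, but the body, the rule that a token must contain at least one decimal digit (so that ordinary words such as "face" are kept), and the helper `filter_tokens` are all assumptions.

```python
import string

HEX_DIGITS = frozenset(string.hexdigits)


def is_hex(token):
    """Return True if token looks like a hex fragment rather than a name.

    Assumption: a token counts as hex only if every character is a hex
    digit and at least one character is a decimal digit.
    """
    if not token:
        return False
    has_digit = any(c.isdigit() for c in token)
    return has_digit and all(c in HEX_DIGITS for c in token)


def filter_tokens(tokens):
    """Drop hex-looking tokens before they are considered as family names."""
    return [t for t in tokens if not is_hex(t)]
```

For example, `filter_tokens(["zeus", "a1b2", "face", "3f"])` keeps only "zeus" and "face" under these assumptions.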
I have committed a refactoring of labeler.py to introduce a FileLabeler class. I have introduced a -o option to place the output into a file instead of stdout. I have slightly changed the output format so that -pup and -vtt are included in the now default compatibility mode, but not with the -t (tags) option. I could add them with the tags as well if that seems useful. Changes to common.py are limited to moving the get_sample_info* methods from AvLabels into FileLabeler. The idea is to leave AvLabels to focus on the labels and FileLabeler on files with many reports. I plan to look into providing a few more options in AvLabels so that there is more flexibility when using the package as a library. Regarding the integration into AssemblyLine, is your goal in adding Malpedia to provide additional aliases for a given tag? Or something else?
Part of it was to add possible aliases and give a common name to a family. But it could also expand what is detected as a family by adding additional "FAM:..." entries. If a family is detected, it may also be possible to link attribution, references, and a history of the malware.
I just committed support for OPSWAT MetaDefender as provided in this fork. Regarding Malpedia, we added support for such integration when producing the first MISP taxonomy, but the intern working on this left before we could commit the cleaned script that automatically updates the MISP taxonomy to keep it in sync with the latest AVClass taxonomy and rules. We can try to resurrect that effort if that seems worth it for MISP users.
This is a major change, but I figured I should create the PR for review or in case you want to copy something. This fork requires Python 3.6+ and removes the original avclass 1.x (tag it for posterity), so there is only one AVClass and it is at version 2.x (compatibility mode, -c, still works). Some of the input parameters have changed to make it easier to add and detect inputs. It also pulls in some PRs from other authors that I thought were meaningful.