Test files for version 9 #78

Open
sadikovi opened this issue Jan 31, 2018 · 8 comments

@sadikovi
Owner

NetFlow version 9 sample file.

nfcapd.201801311702.gz

@sadikovi changed the title from "Test files" to "Test files for version 9" on Jan 31, 2018
@natedogs911

Thank you!

2 questions:

a. If you have test data that is not labeled with the flow version (e.g. v5 or v7), is there a way to determine the version using an easily available tool such as your library or nfdump?

b. Do you have test data for v5 or v7 available?

Thanks again

@sadikovi
Owner Author

Hi,

Yes, each file encodes the version in its header, including version 9. The package checks the magic bytes and the version number to make sure that files are read consistently in Spark; it does not rely on the file name.
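
To illustrate the kind of check described here, below is a minimal Python sketch (not the package's actual code) that inspects the first four bytes of a file. The magic values `0xCF 0x10`, the byte-order flags `1`/`2`, and the position of the stream version byte are assumptions about the flow-tools style header and should be verified against the library source. The function name `read_metadata` is made up for this example. The NetFlow export version itself is stored later in the header and is not decoded here.

```python
import struct

# Assumed values for a flow-tools style header; verify against the library source.
MAGIC1 = 0xCF
MAGIC2 = 0x10
BYTE_ORDERS = {1: "little", 2: "big"}


def read_metadata(first_bytes):
    """Return (byte_order, stream_version) from the first four bytes of a file."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes, got %d" % len(first_bytes))
    # Four unsigned bytes: magic1, magic2, byte-order flag, stream version.
    magic1, magic2, order, stream_version = struct.unpack("4B", first_bytes[:4])
    if (magic1, magic2) != (MAGIC1, MAGIC2):
        raise ValueError("bad magic: 0x%02X 0x%02X" % (magic1, magic2))
    if order not in BYTE_ORDERS:
        raise ValueError("unknown byte order flag: %d" % order)
    return BYTE_ORDERS[order], stream_version
```

To try it on a real file, read the first four bytes with `open(path, "rb").read(4)`. An attachment that was gzipped or zipped for upload has to be decompressed first.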

I do have test files that are used in unit tests (see https://github.com/sadikovi/spark-netflow/tree/master/src/test/resources/correct). They are generated files. For manual quality testing I have a real-world dataset locally.

@sadikovi
Owner Author

You can use my library to check the version. Unfortunately, it only recognises version 5 and version 7; any other version will throw an exception (I think). See the example (https://github.com/sadikovi/spark-netflow#using-netflowlib-library-separately) for more information.

@natedogs911

thank you, the samples will be helpful.
I was just looking at getHeader() to see if I can identify why the test files I have are throwing "bad magic".

@sadikovi
Owner Author

All I can say is that the file is most likely not version 5 or version 7. If you are convinced that your files are version 5 or version 7, you can check by removing the magic check in the library and trying again. The magic numbers are for Cisco NetFlow. If your files were generated using something else then, I assume, the magic will be different.

I think we might need to remove the magic check, or add a list of magic numbers supported by the package.
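
As a rough idea of what such a list could look like, here is a hypothetical Python sketch of a signature table with a more helpful error than a bare "bad magic". The table name `KNOWN_MAGIC`, the function `identify`, and every byte value in it are assumptions for illustration, not what the package does today.

```python
# Hypothetical table of recognised file signatures; every entry is an assumption.
KNOWN_MAGIC = {
    b"\xcf\x10": "flow-tools style stream",
    b"\x0c\xa5": "nfdump capture (little-endian writer)",
    b"\xa5\x0c": "nfdump capture (big-endian writer)",
}


def identify(first_bytes):
    """Name the format from the first two bytes, or raise listing what is supported."""
    prefix = bytes(first_bytes[:2])
    try:
        return KNOWN_MAGIC[prefix]
    except KeyError:
        supported = ", ".join(sorted(KNOWN_MAGIC.values()))
        raise ValueError(
            "unrecognised magic %s; supported formats: %s" % (prefix.hex(), supported)
        ) from None
```

A lookup like this would keep the consistency check, while the error message tells the user which formats are accepted.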

You can attach your file and I can have a look.

@natedogs911

I just zipped one of the smallest files.
These are synthetically generated files and I suspect the header is the problem. I'll try looking at the header shortly as well.
nfcapd.201601280215.zip

@sadikovi
Owner Author

I get a similar byte layout to the file I included in the issue, so I guess it is version 9. The package currently does not support version 9.

@sadikovi
Owner Author

sadikovi commented Feb 2, 2018

It looks like those files are nfdump-specific, not Cisco NetFlow. I successfully parsed them by following the structs in https://github.com/pmorch/nfdump/blob/621674bc751437741ca367b7c7b170fca6106764/bin/nffile.h.

Code needs to be written specifically to handle those types of files.
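
For anyone who wants to try the same thing, here is a minimal Python sketch of decoding the fixed-size file header only. It assumes the layout of `file_header_t` for layout version 1 in that header file is `uint16 magic` (`0xA50C`), `uint16 version`, `uint32 flags`, `uint32 NumBlocks`, `char ident[128]`. Please check those fields and sizes against nffile.h before relying on them. The function name `parse_nfdump_header` is made up for this example, and the data blocks that follow the header are not handled.

```python
import struct

# Assumed layout of file_header_t (layout version 1) from nffile.h:
# uint16 magic, uint16 version, uint32 flags, uint32 NumBlocks, char ident[128].
NFDUMP_MAGIC = 0xA50C
IDENT_LEN = 128
HEADER_SIZE = 2 + 2 + 4 + 4 + IDENT_LEN


def parse_nfdump_header(buf):
    """Decode the file header, trying both byte orders until the magic matches."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("header needs %d bytes, got %d" % (HEADER_SIZE, len(buf)))
    for name, prefix in (("little", "<"), ("big", ">")):
        fmt = "%sHHII%ds" % (prefix, IDENT_LEN)
        magic, version, flags, num_blocks, ident = struct.unpack(fmt, buf[:HEADER_SIZE])
        if magic == NFDUMP_MAGIC:
            return {
                "byte_order": name,
                "version": version,
                "flags": flags,
                "num_blocks": num_blocks,
                # ident is a NUL-padded C string
                "ident": ident.split(b"\x00", 1)[0].decode("ascii", "replace"),
            }
    raise ValueError("not an nfdump file: bad magic")
```

Pass it the first `HEADER_SIZE` bytes of the uncompressed file, for example `parse_nfdump_header(open(path, "rb").read(HEADER_SIZE))`.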
