-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(map): Replace loop with NumPy post-processing #369
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @mikkelkp . This is a big help.
I can understand why you set it up like this so that you can do an end-to-end test locally with the existing recipe but I think it would be really helpful if this numpy capability were exposed with a --flag option on the CLI commands. This way, you don't have to update all of the tests and we would be able to merge this PR knowing that the users of the LB Versioner component won't break their comfort mapping recipes as a result of this change.
You'll see an example of how I would recommend formatting the flag in my comments and I think we can have code to sense whether the input files are numpy arrays with the following:
with open(input_file, encoding='utf-8') as inf:
first_char = inf.read(1)
second_char = inf.read(1)
is_text = True if first_char.isdigit() or second_char.isdigit() else False
Then, you use that is_text variable to decide whether to load the file with the numpy methods or with the existing methods to read CSV. Because we only read the first two characters, it should not add to the calculation time. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
No description provided.