Curious on results #7

Closed
Eibwen opened this issue Jul 5, 2018 · 6 comments

@Eibwen

Eibwen commented Jul 5, 2018

Hello,
Not sure if there is a better way to ask this, but I am curious if using this tool has produced good results for you, or if you've given up on the idea or found different methods to achieve the results?

I recently started at a new company and am trying to analyse their repo history to get a sense of what's going on. I also previously built a tool that tracked down files that only one author, or very few authors, had touched (we had an employee or two who would commit dead files and just leave them there, cluttering the working directory).

For the latter, I transformed the commit data into a different data structure that was file-based rather than commit-based. That idea might make your issue #2 quicker to process: store a filename history chain somewhere externally.
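
A minimal sketch of the file-based structure described above, in Python. The record layout, the sample commits, and the function names `build_file_index` and `single_author_files` are hypothetical and only illustrate the idea of inverting commit-based data into a per-file history; this is not the code of either tool.

```python
from collections import defaultdict

# Hypothetical commit records: (commit_id, author, [paths touched]).
commits = [
    ("c1", "alice", ["src/app.py", "README.md"]),
    ("c2", "bob", ["src/app.py"]),
    ("c3", "alice", ["tools/old_script.py"]),
]


def build_file_index(commits):
    """Invert commit-based records into path -> list of (commit_id, author)."""
    index = defaultdict(list)
    for commit_id, author, paths in commits:
        for path in paths:
            index[path].append((commit_id, author))
    return dict(index)


def single_author_files(index):
    """Return, sorted, the paths whose whole history comes from one author."""
    return sorted(
        path
        for path, touches in index.items()
        if len({author for _, author in touches}) == 1
    )


file_index = build_file_index(commits)
print(single_author_files(file_index))
```

Once the data is keyed by path, questions such as "who touched this file" or "how often did it change" become a single lookup instead of a walk over every commit.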

@rubberduck203
Owner

I’ve gotten very good results with this on the handful of occasions I’ve had a reason to use it!!

At one client, I was able to identify a class that had a ton of churn. I was then able to cross-reference that with static analysis and code coverage to show them that high line count, high cyclomatic complexity, and poor test coverage correlated with a high bug and “touch” rate. Using that info, we chose to invest our time in testing and refactoring that monster, where the money was best spent. I love this approach, I just haven’t had a recent need for it.

As for #2, Git and LibGit2Sharp provide the file rename information in the TreeEntryChanges object. I’ve just not taken the time to properly process that info in this project yet.
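
As a rough illustration of what processing rename information could look like, here is a Python sketch that folds per-path change counts into the latest known name of each file. It assumes the rename data has already been extracted as chronological `(old_path, new_path)` pairs; the function names and the sample data are hypothetical and this is not how the project handles renames.

```python
def canonical_names(renames):
    """Map every historical path to its latest known name.

    renames: (old_path, new_path) pairs in chronological order.
    """
    latest = {}
    for old, new in renames:
        # Anything that used to resolve to `old` now resolves to `new`.
        for path, target in list(latest.items()):
            if target == old:
                latest[path] = new
        latest[old] = new
    return latest


def merge_counts(counts, renames):
    """Sum change counts of all historical names under the latest name."""
    latest = canonical_names(renames)
    merged = {}
    for path, n in counts.items():
        key = latest.get(path, path)
        merged[key] = merged.get(key, 0) + n
    return merged


# Hypothetical data: one file renamed twice, one never renamed.
renames = [("a.cs", "b.cs"), ("b.cs", "c.cs")]
counts = {"a.cs": 3, "b.cs": 2, "c.cs": 5, "d.cs": 1}
print(merge_counts(counts, renames))
```

The sketch ignores the case where an old name is later reused for an unrelated file, which a real implementation would need to handle.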

Thanks for your interest. It’s nice to know other people are interested in my work on this.

@Eibwen
Author

Eibwen commented Jul 5, 2018

That does sound like a great approach. I'll have to keep that idea in my back pocket and/or suggest it to friends.

After pointing your program at the repo I'm getting more familiar with: first, I'm definitely impressed at the speed v2 processed it at. My impression has been that LibGit2Sharp is pretty slow, so it's good to see that it can be improved. Some further thoughts:

Have the usage examples suggest piping out to a file, and/or a parameter to specify an output report file.

Also maybe have a parameter that can say "Exclude files with less than 20 changes", and/or "Only print the top 200 items". The reason being that the repo I ran it on produced 14000 lines of output, which well exceeded the buffer on my command prompt.

I would imagine that with your data structures you could also report something like "Modified 3 times, last time was 3 years ago". I guess that is somewhat related to #4. "Stale files" would often be a bad sign too (although more often than not they are probably just stable, well-built features), and such a report could help identify code to refactor out into a NuGet library when one needs to break up a monolithic repository.
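
A minimal Python sketch of the "stale files" report suggested above. It assumes a per-file list of commit timestamps is already available; the function name `stale_report`, the one-year threshold, and the sample paths and dates are hypothetical.

```python
from datetime import datetime, timezone


def stale_report(history, now, min_age_days=365):
    """Describe files whose most recent change is at least min_age_days old.

    history: path -> list of timezone-aware commit datetimes.
    """
    lines = []
    for path, dates in sorted(history.items()):
        age_days = (now - max(dates)).days
        if age_days >= min_age_days:
            years = age_days // 365  # rough figure, ignores leap days
            lines.append(
                f"{path}: modified {len(dates)} times, "
                f"last change about {years} year(s) ago"
            )
    return lines


utc = timezone.utc
# Hypothetical history for two files.
history = {
    "legacy/report.cs": [
        datetime(2014, 1, 10, tzinfo=utc),
        datetime(2014, 6, 1, tzinfo=utc),
        datetime(2015, 7, 1, tzinfo=utc),
    ],
    "src/core.cs": [datetime(2018, 6, 30, tzinfo=utc)],
}
for line in stale_report(history, now=datetime(2018, 7, 5, tzinfo=utc)):
    print(line)
```

Passing `now` explicitly keeps the report reproducible; a real tool would likely default it to the current time.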

@rubberduck203
Owner

> Have the usage examples suggest piping out to a file, and/or a parameter to specify an output report file.
>
> Also maybe have a parameter that can say "Exclude files with less than 20 changes", and/or "Only print the top 200 items". The reason being that the repo I ran it on produced 14000 lines of output, which well exceeded the buffer on my command prompt.

I would actually recommend using standard tools like sort, uniq, head, and tail for that purpose. I was trying to stick to the *nix philosophy of “Do one thing, do it well, and conform to a standard IO (text).” The date filter was only added because it was easier to handle inside the tool than via a pipe.

You’re absolutely right that there should be some usage examples of how to do some of these things though. Thanks for the feedback. I’ll try to find some time to do that.
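
For environments without sort and head at hand, the filtering discussed above (a minimum change count and a top-N cut) can be sketched in a few lines of Python. The tool's actual output format is not shown in this thread, so the sketch assumes hypothetical lines of the form `path<TAB>count`; the function name `top_churn` and the sample data are likewise made up for illustration.

```python
def top_churn(lines, min_changes=20, limit=200):
    """Keep rows with at least min_changes changes, highest counts first.

    Assumes each line looks like 'path<TAB>count' (a hypothetical format).
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        path, _, count = line.rpartition("\t")
        rows.append((path, int(count)))
    rows = [row for row in rows if row[1] >= min_changes]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Hypothetical sample standing in for the tool's piped output.
sample = ["src/a.cs\t42\n", "src/b.cs\t7\n", "src/c.cs\t130\n", "\n"]
for path, count in top_churn(sample, min_changes=20, limit=2):
    print(f"{path}\t{count}")
```

Reading from `sys.stdin` instead of `sample` would let the same function sit at the end of a pipe on any platform with Python installed.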

@rubberduck203
Owner

@Eibwen I just released a new version that makes it easier to pipe the output into other programs for more thorough analysis, and I updated the README. Are you one of the folks who downloaded the Windows version? I could use a hand with creating the equivalent DOS commands for the docs.

@Eibwen
Author

Eibwen commented Jul 6, 2018

Yes, I did. What were you thinking? cmd.exe is fairly limited (I just looked up how to do head and the answer was to just invoke PowerShell commands, hah), and with PowerShell I always have to google a fair amount, but I would be happy to help. (I normally use the bash shell for Windows that is included in Git packages for anything semi-advanced on the command line.)

Really, in my experience I'd say most Windows developers/users would be comfortable enough with SublimeText or other text editors that just manipulating the results in there would be quicker and easier than obscure PowerShell command strings.

@rubberduck203
Owner

Okay. Since you & I seem to be basically the only users, I’m going to just kick this can down the road until someone asks about it. Hopefully by then I’ll feel good about recommending the Windows Subsystem for Linux or have another Windows environment to work out the equivalent PowerShell.

Thanks so much for the feedback.

I’m going to close this issue, but feel free to snag my email from the Git log or ping me here if there’s anything else I can help with.
