Curious on results #7

Eibwen · 2018-07-05T13:51:59Z

Hello,
Not sure if there is a better way to ask this, but I am curious if using this tool has produced good results for you, or if you've given up on the idea or found different methods to achieve the results?

I recently started at a new company and am trying to analyse their repo history to know what's going on and all. I also previously built a tool which would track down files that very few or only one author touched (we had an employee or two that would commit dead files and just leave them there, cluttering the working directory).

For the latter, I transformed the commit data into a different data structure which was file-based rather than commit-based. That idea might make your issue #2 quicker to process: externally store a filename history chain somewhere.
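A minimal sketch of that commit-based to file-based pivot, for illustration only. It assumes each commit is already available as a record with an author and a list of touched paths; the names `build_file_index`, `single_author_files`, and the record keys are hypothetical and not taken from either tool.

```python
from collections import defaultdict


def build_file_index(commits):
    """Pivot commit-based records into a file-based index.

    `commits` is an iterable of dicts with the (assumed) keys
    'author' and 'files', where 'files' lists the paths a commit touched.
    """
    index = defaultdict(lambda: {"changes": 0, "authors": set()})
    for commit in commits:
        for path in commit["files"]:
            entry = index[path]
            entry["changes"] += 1          # one more commit touched this path
            entry["authors"].add(commit["author"])
    return index


def single_author_files(index):
    """Return the paths that only one author has ever touched, sorted."""
    return sorted(path for path, entry in index.items()
                  if len(entry["authors"]) == 1)


# Made-up sample data, not from a real repository.
commits = [
    {"author": "alice", "files": ["src/app.py", "README.md"]},
    {"author": "bob", "files": ["src/app.py"]},
    {"author": "bob", "files": ["tmp/scratch.txt"]},
]
index = build_file_index(commits)
print(single_author_files(index))  # ['README.md', 'tmp/scratch.txt']
```

Keying the index by path means a per-file question (who touched it, how often) becomes a single lookup instead of a walk over every commit.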

rubberduck203 · 2018-07-05T14:29:25Z

I’ve gotten very good results with this on the handful of occasions I’ve had a reason to use it!

At one client, I was able to identify a class that had a ton of churn. I was then able to cross-reference that with static analysis and code coverage to show them that high line count, high cyclomatic complexity, and poor test coverage correlated with a high bug and “touch” rate. Using that info, we chose to invest our time in testing and refactoring that monster, where the money was best invested. I love this approach; I just haven’t had a recent need for it.

As for #2, Git and LibGit2Sharp provide the file rename information in the TreeEntryChanges object. I’ve just not taken the time to properly process that info in this project yet.

Thanks for your interest. It’s nice to know other people are interested in my work on this.

Eibwen · 2018-07-05T19:31:04Z

That does sound like a great approach. I'll have to keep that idea in my back pocket and/or suggest it to friends.

After pointing your program at the repo I'm becoming more familiar with, I'm first of all definitely impressed at the speed at which v2 processed it. My impression has been that LibGit2Sharp is pretty slow, but it's good to see it can be improved. Some further thoughts:

Have the usage examples suggest piping out to a file, and/or a parameter to specify an output report file.

Also maybe have a parameter that can say "Exclude files with less than 20 changes", and/or "Only print the top 200 items". The reason being that on the repo I ran it on, it produced 14000 lines of output, which well exceeded the buffer of my command prompt.

I would imagine that with your data structures you could also report something like "Modified 3 times, last time was 3 years ago". I guess that is somewhat related to #4. I would imagine "stale files" would often be a bad sign too (although probably more often than not they are just stable, well-built features), and such a report could help with suggesting what to refactor out into a NuGet library when one needs to break up a monolithic repository.
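As a rough sketch of the "Modified 3 times, last time was 3 years ago" idea: it assumes a per-file list of commit timestamps has already been collected somehow. The function name `stale_report`, the `min_age_days` threshold, and the sample data are all hypothetical.

```python
from datetime import datetime, timezone


def stale_report(history, now, min_age_days=365):
    """List files whose most recent change is at least `min_age_days` old.

    `history` maps a path to the (assumed) list of timezone-aware commit
    datetimes for that path. Returns (path, change_count, age_in_days)
    tuples, stalest first.
    """
    rows = []
    for path, times in history.items():
        age_days = (now - max(times)).days   # age of the latest change
        if age_days >= min_age_days:
            rows.append((path, len(times), age_days))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Made-up sample data, not from a real repository.
utc = timezone.utc
history = {
    "lib/legacy.cs": [datetime(2015, 1, 10, tzinfo=utc),
                      datetime(2015, 3, 1, tzinfo=utc),
                      datetime(2015, 6, 30, tzinfo=utc)],
    "src/hot.cs": [datetime(2018, 6, 1, tzinfo=utc),
                   datetime(2018, 7, 1, tzinfo=utc)],
}
now = datetime(2018, 7, 5, tzinfo=utc)
rows = stale_report(history, now)
for path, count, age_days in rows:
    print(f"{path}: modified {count} times, last time was {age_days // 365} years ago")
```

Whether a stale file is dead weight or simply stable still needs a human look; the report only narrows down the candidates.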

rubberduck203 · 2018-07-05T20:52:29Z

> Have the usage examples suggest piping out to a file, and/or a parameter to specify an output report file.
>
> Also maybe have a parameter that can say "Exclude files with less than 20 changes", and/or "Only print the top 200 items". Reason being the repo I ran it on it produced 14000 lines of output, which well exceeded the buffer on my command prompt.

I would actually recommend using standard tools like sort, uniq, head, and tail for that purpose. I was trying to stick to the *nix philosophy of “Do one thing, do it well, and conform to a standard IO (text).” The date filter was only added because it was easier to handle inside the tool than via a pipe.
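For anyone without sort and head at hand, the same post-processing can be sketched in a few lines of Python. This assumes each line of the report is a change count, a tab, then a path; that layout is an assumption for illustration, not the tool's documented format, and `top_churn` is a hypothetical helper.

```python
def top_churn(lines, min_changes=20, limit=200):
    """Keep files with at least `min_changes` changes, highest count first.

    Each line is assumed to look like "<count>\t<path>". Header rows,
    blank lines, and anything else that does not parse are skipped.
    """
    rows = []
    for line in lines:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) != 2 or not parts[0].strip().isdigit():
            continue
        count = int(parts[0])
        if count >= min_changes:
            rows.append((count, parts[1]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# Made-up sample report, not real output.
sample = [
    "Commits\tPath",
    "42\tsrc/Monster.cs",
    "3\tREADME.md",
    "25\tsrc/Util.cs",
    "",
]
print(top_churn(sample))  # [(42, 'src/Monster.cs'), (25, 'src/Util.cs')]
```

Passing `sys.stdin` as `lines` would let it sit at the end of a pipe, which keeps the filtering outside the tool itself.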

You’re absolutely right that there should be some usage examples of how to do some of these things though. Thanks for the feedback. I’ll try to find some time to do that.

rubberduck203 · 2018-07-05T23:02:33Z

@Eibwen I just released a new version that makes it easier to pipe the output into other programs for more thorough analysis, and I updated the README. Are you one of the folks who downloaded the Windows version? I could use a hand with creating the equivalent DOS commands for the docs.

Eibwen · 2018-07-06T13:20:47Z

Yes, I did. What were you thinking? cmd.exe is fairly limited (I just looked up how to do head and it told me to just invoke PowerShell commands, hah), and with PowerShell I always have to google a fair amount, but I would be happy to help. (I normally use the bash shell for Windows that is included in the Git packages for anything semi-advanced on the command line.)

Really, in my experience I'd say most Windows developers/users would be accustomed enough to Sublime Text or other text editors that just manipulating the results in there would be quicker and easier than obscure PowerShell command strings.

rubberduck203 · 2018-07-06T13:39:47Z

Okay. Since you & I seem to be basically the only users, I’m going to just kick this can down the road until someone asks about it. Hopefully by then I’ll feel good about recommending the Windows Subsystem for Linux or have another Windows environment to work out the equivalent PowerShell.

Thanks so much for the feedback.

I’m going to close this issue, but feel free to snag my email from the Git log or ping me here if there’s anything else I can help with.

rubberduck203 mentioned this issue Jul 5, 2018

Add more usage examples #8

Closed

rubberduck203 closed this as completed Jul 6, 2018
