Estimated development time. #1276

Open
thisismygitrepo opened this issue Feb 21, 2024 · 6 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@thisismygitrepo

thisismygitrepo commented Feb 21, 2024

Summary 💡

Can we add an extra output line where we mention an estimate of how many hours went into development?

I'm afraid there is no well-maintained library that implements this, but I believe any simple formula with simple assumptions will do.

@thisismygitrepo thisismygitrepo added the enhancement New feature or request label Feb 21, 2024
@spenserblack
Collaborator

Fun idea! First we'll probably want a utility in place to detect files that are generated or vendored, so that committing e.g. dist/* or bootstrap.min.js won't incorrectly increase the calculated work time.
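A minimal sketch of such a filter, assuming simple name-based matching is enough for a first pass. The directory names and glob patterns below are hypothetical examples, not an existing onefetch list, and `is_generated_or_vendored` is an illustrative name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical examples only; a real list would need to be far more complete.
VENDORED_DIRS = {"dist", "vendor", "node_modules"}
GENERATED_GLOBS = ("*.min.js", "*.min.css", "*.lock")


def is_generated_or_vendored(path: str) -> bool:
    """Return True if the path looks generated or vendored (heuristic)."""
    parts = PurePosixPath(path).parts
    # Any parent directory in the vendored set excludes the file.
    if any(part in VENDORED_DIRS for part in parts[:-1]):
        return True
    # Otherwise match the file name against the generated-file globs.
    name = parts[-1] if parts else ""
    return any(fnmatchcase(name, pattern) for pattern in GENERATED_GLOBS)


def countable_files(paths):
    """Keep only the files that should count towards work time."""
    return [p for p in paths if not is_generated_or_vendored(p)]
```

With these patterns, `countable_files(["src/main.rs", "dist/app.js", "bootstrap.min.js"])` keeps only `src/main.rs`.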

After that, I'd assume we'd derive the work time from an estimated rate of lines of code per hour or bytes per hour (there are pros and cons to each). The simplest way would be to check the current files, but this won't take deletions into account. If a project is rewritten, we'd miss a lot of the work hours! The most accurate way would probably be to compare all the deltas of the commits discoverable in the current branch, but this will be more expensive and time-consuming.
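The difference between the two approaches can be sketched as follows, assuming the per-commit deltas come from text shaped like the output of `git log --numstat --format=` (tab-separated added, deleted, path, with `-` for binary files). The function names are illustrative, and counting only added lines as work is an assumption rather than a settled choice.

```python
def churn_from_numstat(numstat_text: str) -> tuple[int, int]:
    """Sum added and deleted lines over numstat-style text."""
    added = deleted = 0
    for line in numstat_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # blank separators or anything that is not a numstat row
        a, d, _path = fields
        if not (a.isdigit() and d.isdigit()):
            continue  # binary files are reported as "-"
        added += int(a)
        deleted += int(d)
    return added, deleted


def hours_from_lines(lines: int, lines_per_hour: float) -> float:
    """Convert a line count to hours at an assumed, caller-supplied rate."""
    return lines / lines_per_hour


# Two commits: the second rewrites most of src/main.rs.
sample = (
    "10\t0\tsrc/main.rs\n"
    "5\t0\tsrc/lib.rs\n"
    "-\t-\tlogo.png\n"
    "\n"
    "12\t10\tsrc/main.rs\n"
)
added, deleted = churn_from_numstat(sample)
snapshot_lines = added - deleted  # what a check of the current files would see
```

Here the snapshot sees 17 lines while the history contains 27 added lines, so the snapshot-based estimate misses the rewritten work at any rate.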

@thisismygitrepo
Author

thisismygitrepo commented Feb 21, 2024

Thanks for expressing interest. That's a lot of ideas for a first comment. We can start simple by hardcoding the exclusion of cache files and generated code. There are good tools that identify such files for all types of projects, e.g., kondo.

This repo has an implementation. The README refers to a 30-line algorithm with a decent explanation.

git-hours

I tried installing it but it didn't work.

This one worked but it vastly overestimated the time.

git-estimate

The unreasonable numbers are probably due to the caveats you raised in your first comment.

At the end of the README, the author refers to the first repo as a source of inspiration and cites its overuse of dependencies (JS libraries) as the reason for hacking together a standalone binary in Go.

@spenserblack
Collaborator

🤔 As I understand it, both of those use the time between commits to estimate the time worked. TBH I think this is a mistake. For example, I will often use git add -p to select the changes I want and ignore the changes I don't want for a commit, commit them, then repeat until I've committed all my changes. This practice means that I might have spent a significant amount of time, but my commits would all be grouped into the span of a few minutes. This could lead to a significant underestimation of time spent. One could also make these tools significantly overestimate time by simply making a small commit that took 5 minutes to write, waiting an hour, and making another small commit. The actual time spent might just be 10 minutes, but, AFAIK, these tools would consider that an hour of work.
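To make the two failure modes concrete, here is a generic sketch of a timestamp-based estimator. It is not the actual code of git-hours or git-estimate; the session threshold `max_gap_minutes` and the allowance `session_start_minutes` are hypothetical parameters.

```python
def hours_from_timestamps(commit_times, max_gap_minutes=120.0,
                          session_start_minutes=0.0):
    """Estimate hours from commit timestamps given in Unix seconds.

    Gaps up to max_gap_minutes count fully as work; a longer gap starts
    a new session, credited with session_start_minutes instead.
    """
    times = sorted(commit_times)
    if not times:
        return 0.0
    total_minutes = session_start_minutes  # allowance for the first commit
    for prev, cur in zip(times, times[1:]):
        gap = (cur - prev) / 60.0
        total_minutes += gap if gap <= max_gap_minutes else session_start_minutes
    return total_minutes / 60.0


# Hours of work split with `git add -p` into three commits two minutes apart.
burst = hours_from_timestamps([0, 120, 240])
# Two five-minute commits made one hour apart.
spaced = hours_from_timestamps([0, 3600])
```

Under these assumptions `burst` comes out at four minutes and `spaced` at a full hour, regardless of how much code either set of commits contains.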

IMO any tool that estimates time worked in a git repository would need to take code size into account (lines of code, file size, or something else). A quick Google search told me that the average developer writes 20 lines of code per hour, so that might be a good starting point.

@thisismygitrepo
Author

I see.
In general, you seem to have many reservations about any algorithm based on a particular git-committing behaviour, as it breaks down on repos developed by coders with different behaviour.

I like the irreducibly simple algorithm of dividing the number of lines by 20 to get the number of hours. I just used the number of lines reported by onefetch on one of my repos and I got 380 hours. This is a reasonable estimate IF I wrote it as is with zero changes.

  • Step 1 from here: find the total number of lines including all the changes (it's like integration in maths: start from the first commit, then add up the changed lines over time).
  • Step 2: I propose a different factor that incorporates the size of the repo. The factor of 20 lines per hour could be reasonable when the project is small. But when the project grows in complexity, you spend way more time observing the consequences of any small change on the overall behaviour.
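The two steps can be sketched together. This is only one possible shape for the size-dependent factor: the formula `base_rate / (1 + size / scale)` and the `scale` value are hypothetical, chosen so that the rate starts at 20 lines per hour and halves once the repo reaches `scale` lines.

```python
def estimate_hours(commits, base_rate=20.0, scale=10_000.0):
    """Accumulate hours over commits given as (added, deleted) line counts.

    Commits must be in chronological order. The writing rate shrinks as
    the repo grows (hypothetical formula, not a measured relationship).
    """
    size = 0      # lines in the repo before the current commit
    hours = 0.0
    for added, deleted in commits:
        rate = base_rate / (1.0 + size / scale)  # lines per hour at this size
        hours += added / rate                    # step 1: sum over all changes
        size = max(size + added - deleted, 0)    # step 2 input: current size
    return hours


# With the size effect switched off this reduces to total lines / 20;
# 380 hours corresponds to 7600 lines at that rate.
flat = estimate_hours([(7600, 0)], scale=float("inf"))
# The same 20 lines cost more once the repo already holds 10,000 lines.
grown = estimate_hours([(10_000, 0), (20, 0)])
```

In this sketch the 20 lines added on top of a 10,000-line repo cost two hours instead of one, which is the kind of slowdown step 2 describes.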

@spenserblack
Collaborator

  • I propose a different factor that incorporates the size of the repo. The factor of 20 lines per hour could be reasonable when the project is small. But when the project grows in complexity, you spend way more time observing the consequences of any small change on the overall behaviour.

I'm glad you're thinking about this! For the sake of a reasonable estimate, I wonder if "development time" should be reimagined as "coding time" -- i.e. the actual time spent writing code for the project. "Development" is a broad subject, and can include responding to issues, reviews, planning, etc.


🤔 I think that calculating time spent can be a serious undertaking with a lot of code complexity. Additionally, it would have a lot of utility outside of this project. For this reason, I believe it shouldn't necessarily be part of the onefetch project, but a completely separate crate/project that maybe onefetch can add as a dependency. @o2sh Thoughts?

@o2sh
Owner

o2sh commented Feb 26, 2024

I love the idea, thanks for the suggestion @thisismygitrepo.
Finding the "right" algorithm will be a trade-off between complexity and accuracy, and the solution will inherently be opinionated.

Personally, I'm okay with it being a rough estimate as long as the logic is sound and well-documented.

I believe it shouldn't necessarily be part of the onefetch project, but a completely separate crate/project that maybe onefetch can add as a dependency.

For the sake of simplicity in review, integration, and testing, we could start by incubating it within onefetch for the first iteration, and then later on extract it into its own crate. Alternatively, we could begin with it as a separate crate from the outset. Both approaches are feasible.

@thisismygitrepo If you're up for it, feel free to give it a try!

@o2sh o2sh added the help wanted Extra attention is needed label Mar 10, 2024
3 participants