Define classes and algorithm for effective operation of the release notes generation #103

Open
3 tasks
benedeki opened this issue Oct 14, 2024 · 3 comments
Labels
refactoring Improving code quality, paying off tech debt, aligning APIs spike Proof of concept, research and investigation tasks

Comments

@benedeki

Background

The current data structures, and the steps by which data are extracted from the GitHub API, might not be the most suitable for all the requested features.
Brainstorm and design the best data structures, and the program steps that fill and use them.

Desired Outcome

  1. Best data structures
  2. Algorithmic steps for processing

Tasks

@benedeki benedeki added refactoring Improving code quality, paying off tech debt, aligning APIs spike Proof of concept, research and investigation tasks labels Oct 14, 2024
@miroslavpojer
Collaborator

miroslavpojer commented Oct 14, 2024

Suggestion:

Several combinations of Issues <==> PRs <==> Commits can occur:

Issue, PR, Commit
0, 0, 1 - Direct commit
1, 0, 0 - Issue without PR
1, 1, 1 - Issue with 1 PR and 1 Commit
0, 1, 1 - PR without Issue, with 1 Commit
0, 1, Z - PR without Issue, with Z Commits

1, Y, Z - today's design: a row is built primarily for 1 Issue with Y PRs and Z Commits
X, Y, Z - weak point of today's design

Not solved: several Issues linked from one PR. The type of link (Fixed, Solved, Mention) is not considered.

Legend: X, Y and Z stand for arbitrary counts.

David's idea: define a class Line that represents one row of a chapter.
A line can contain:

  • X Issues (Class Issue)
  • Y PRs (Class PullRequest)
  • Z Commits (Class Commit)
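
A minimal sketch of what such a Line class could look like, written in the same Scala-like notation used later in this thread. The field names (`title`, `sha`, `message`) and the `kind` labels are assumptions made for illustration; only the three lists come from the idea above.

```scala
// Hypothetical minimal stand-ins for the three source types.
final case class Issue(number: Int, title: String)
final case class PullRequest(number: Int, title: String)
final case class Commit(sha: String, message: String)

// One row of a chapter: X issues, Y pull requests, Z commits.
final case class Line(
  issues: List[Issue] = Nil,
  pullRequests: List[PullRequest] = Nil,
  commits: List[Commit] = Nil
) {
  // (X, Y, Z) counts, in the order of the "Issue, PR, Commit" table.
  def shape: (Int, Int, Int) = (issues.size, pullRequests.size, commits.size)

  // Illustrative names for the combinations from the table.
  def kind: String = shape match {
    case (0, 0, 0)                    => "empty"
    case (0, 0, _)                    => "direct commit"
    case (_, 0, 0)                    => "issue without PR"
    case (0, _, _)                    => "PR without issue"
    case (1, y, _) if y > 0           => "one issue with PRs"
    case (x, y, _) if x > 1 && y > 0  => "many-to-many"
    case _                            => "other"
  }
}
```

With this shape, `Line(commits = List(Commit("abc123", "hotfix"))).kind` yields "direct commit", and the X, Y, Z case that today's design handles poorly shows up as "many-to-many".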

Problems

  • work style no. 1
    • I have 1 Issue and 1 .. N PRs that together bring a change => a series of small changes
      • does each PR hold release notes, and does it have to?
  • work style no. 2
    • I have 1 PR that brings the change and includes the release notes
      • the PR can link to several issues

Ideas to debate:

  • allow the point of view for output generation to be switched, i.e. a different way of working with Line instances
  • Issue-oriented: current state (data: one Issue per Line instance)
  • PR-oriented: possibly David's vision (data: one PR per Line instance)
  • line as a summary of all links (Issue to PR)
    • X Issues, Y PRs, Z Commits
      • where a line can be a series of supported placeholders and the code decides which of them can be used under the current conditions
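
To make the two points of view concrete, here is a small sketch that builds both from the same link data. It assumes the links are available as hypothetical `(issueNumber, prNumber)` pairs; the object and method names are illustrative only.

```scala
object Views {
  // Hypothetical input: one (issueNumber, prNumber) pair per Issue <==> PR link.
  type Link = (Int, Int)

  // Issue-oriented: one line per issue, listing every PR linked to it.
  def byIssue(links: List[Link]): Map[Int, List[Int]] =
    links.groupBy(_._1).map { case (issue, ls) => issue -> ls.map(_._2).sorted }

  // PR-oriented: one line per PR, listing every issue linked to it.
  def byPullRequest(links: List[Link]): Map[Int, List[Int]] =
    links.groupBy(_._2).map { case (pr, ls) => pr -> ls.map(_._1).sorted }
}
```

For links `List((1, 10), (1, 11), (2, 11))` the issue-oriented view has two lines (issue 1 with PRs 10 and 11, issue 2 with PR 11), while the PR-oriented view also has two lines but PR 11 now carries both issues. The same data supports either output; only the grouping key changes.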

@benedeki
Author

benedeki commented Oct 14, 2024

My suggestions:

The central class is ReleaseNotesEntry.

GitHub item classes

class SourceObject(var inARNEnty: Boolean)
class GitHubItem(number: Integer, ..) extends SourceObject
class Issue extends GitHubItem
class PullRequest extends GitHubItem
class Commit(commitMessage: String, ....) extends SourceObject

Release notes

class ReleaseNotesEntry(
  entries: List[String],
  issues: List[Issue],
  pullRequests: List[PullRequest],
  commits: List[Commit]
)

An entry can be created from either an Issue or a PR; each origin may deserve its own child class of ReleaseNotesEntry (needs to be considered).
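
One way to explore the "created from Issue or PR" question without child classes is a pair of factory methods. The sketch below is only an assumption-laden illustration: the `body` and `sha` fields, the default value of the flag, the `Release notes:` marker and the idea that the factory sets `inARNEnty` are all hypothetical and not part of the proposal.

```scala
// Only `number`, `commitMessage` and the flag come from the proposal;
// `body`, `sha` and the default flag value are assumed for this sketch.
class SourceObject(var inARNEnty: Boolean = false)
class GitHubItem(val number: Int, val body: String) extends SourceObject
class Issue(number: Int, body: String) extends GitHubItem(number, body)
class PullRequest(number: Int, body: String) extends GitHubItem(number, body)
class Commit(val sha: String, val commitMessage: String) extends SourceObject

final case class ReleaseNotesEntry(
  entries: List[String],
  issues: List[Issue],
  pullRequests: List[PullRequest],
  commits: List[Commit]
)

object ReleaseNotesEntry {
  // Lines after a "Release notes:" marker in a body (the marker is an assumption).
  private def notesIn(body: String): List[String] =
    body.split("\n").toList
      .dropWhile(line => !line.trim.equalsIgnoreCase("Release notes:"))
      .drop(1)
      .map(_.trim.stripPrefix("-").trim)
      .filter(_.nonEmpty)

  // Remember that these objects already belong to an entry.
  private def mark(items: List[SourceObject]): Unit =
    items.foreach { item => item.inARNEnty = true }

  def fromIssue(issue: Issue, prs: List[PullRequest], commits: List[Commit]): ReleaseNotesEntry = {
    mark(issue :: prs); mark(commits)
    ReleaseNotesEntry(notesIn(issue.body), List(issue), prs, commits)
  }

  def fromPullRequest(pr: PullRequest, issues: List[Issue], commits: List[Commit]): ReleaseNotesEntry = {
    mark(pr :: issues); mark(commits)
    ReleaseNotesEntry(notesIn(pr.body), issues, List(pr), commits)
  }
}
```

If the two origins later need different behaviour, the factories can be turned into child classes without changing callers much.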

Algorithm

  • create an MxN bipartite graph of issues and PRs that records how they are connected
  • link commits to PRs
  • release notes in Issue:
    • go through the issues, create a ReleaseNotesEntry based on each of them
  • release notes in PR:
    • go through the PRs, create a ReleaseNotesEntry based on each of them
  • take the remaining issues, PRs and commits to generate "special" ReleaseNotesEntry instances
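
The steps above can be sketched for the "release notes in Issue" case roughly as follows. This is a simplified illustration, not the planned implementation: issues and PRs are reduced to their numbers, commits to SHA strings, and all names (`Linker`, `Entry`, `fromIssues`) are hypothetical.

```scala
object Linker {
  final case class Entry(issues: Set[Int], pullRequests: Set[Int], commits: Set[String])

  def fromIssues(
    issues: Set[Int],
    prs: Set[Int],
    links: Set[(Int, Int)],            // (issue, pr) edges of the bipartite graph
    prCommits: Map[Int, Set[String]],  // commits linked to each PR
    allCommits: Set[String]            // every commit in the release
  ): List[Entry] = {
    // Adjacency of the issue side of the graph.
    val prsOfIssue: Map[Int, Set[Int]] =
      links.groupBy(_._1).map { case (i, ls) => i -> ls.map(_._2) }

    // One entry per issue, with its linked PRs and their commits.
    val main = issues.toList.sorted.map { i =>
      val linked = prsOfIssue.getOrElse(i, Set.empty[Int])
      Entry(Set(i), linked, linked.flatMap(p => prCommits.getOrElse(p, Set.empty[String])))
    }

    // Remaining PRs (not linked to any issue) become "special" entries.
    val usedPrs = main.flatMap(_.pullRequests).toSet
    val prOnly = (prs -- usedPrs).toList.sorted.map { p =>
      Entry(Set.empty, Set(p), prCommits.getOrElse(p, Set.empty[String]))
    }

    // Remaining commits (not part of any PR) are direct commits.
    val inSomePr = prCommits.values.flatten.toSet
    val direct = (allCommits -- inSomePr).toList.sorted.map { c =>
      Entry(Set.empty, Set.empty, Set(c))
    }

    main ++ prOnly ++ direct
  }
}
```

The PR-based variant would group the same edges by the PR side instead; the leftover handling stays the same.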

@miroslavpojer
Collaborator

miroslavpojer commented Oct 15, 2024

New user control: choose where release notes are defined: Issue or PR

  • Note: for now, release notes will be placed in the body of the Issue or PR

Class structure:

  • class ReleaseNotesEntry
    • entries can be an empty list ==> no release notes defined
class ReleaseNotesEntry(
  entries: List[String],
  issues: List[Issue],
  pullRequests: List[PullRequest],
  commits: List[Commit]
)
  • class SourceObject(var inARNEnty: Boolean)
    • holds the user's decision about the source of the release notes
  • class GitHubItem(number: Integer, ..) extends SourceObject
    • ancestor of Issue and PullRequest, holding shared attributes and logic
  • class Issue extends GitHubItem
  • class PullRequest extends GitHubItem
  • class Commit(commitMessage: String, ....) extends SourceObject

Algorithm

  • data mining

    • get all Issues since the latest release (in all states!) - 1 API call
    • get all PRs since the latest release - 1 API call
    • get the commits related to each PR - 1 API call per PR
    • get all commits since the latest release - 1 API call
  • data linking

    • create an MxN bipartite graph of issues and PRs
      • Note: connect Issues to PRs or PRs to Issues
    • identify direct commits
  • ReleaseNotesEntry instance creation

    • if release notes are in the Issue (M == 1):
      • go through the issues, create a ReleaseNotesEntry based on each of them
    • if release notes are in the PR (N == 1):
      • go through the PRs, create a ReleaseNotesEntry based on each of them
    • take the remaining issues, PRs and commits to generate "special ReleaseNotesEntry" types
  • ReleaseNotesEntry Categorization

    • assign entry instances to chapters by their labels and service chapters by metadata analysis
  • RLS notes generation

    • term: Release notes (text consisting of custom and service chapters and a link to the commits)
    • chapters
      • their control is not part of this spike
    • chapter rows
      • format set by the user or by a default value (supported by several examples in the documentation)
      • one row_format for all chapters (exceptions exist)
        • example: {pr-number} {issue-title} by {contributors}\n{rls-notes}
        • 1 - N lines == one chapter entry row
      • chapters with default row_format
        • format example: Commit: {sha} _{author}_ {contributors}\n{message}
        • the user cannot define a custom row format in the initial version of this spike!
        • types:
          • direct commit (see format example above)
          • issue only and PR only - the default format provides exactly the information needed for unique identification
            • Note: PR-only is now a service chapter too - duplicates are expected
      • placeholders
        • supported set of placeholders - see the documentation
        • if no data is available for a placeholder, then an alternative string is inserted instead
  • Open Points

    • re-evaluation of service chapter names and meaning:
      • currently: no labels ==> no chapter - Issues and PRs without a user-defined label
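
The row_format placeholders described above could be filled in along these lines. This is a sketch under assumptions: the `RowFormat` object, the `render` signature and the `"N/A"` alternative string are made up for illustration; only the `{placeholder}` syntax and the example format come from the notes above.

```scala
object RowFormat {
  // Matches placeholders such as {pr-number}, {sha} or {rls-notes}.
  private val Placeholder = """\{([a-z-]+)\}""".r

  // Replaces every placeholder with its value, or with `fallback`
  // (the "alternative string") when no data is available for it.
  def render(format: String, values: Map[String, String], fallback: String = "N/A"): String =
    Placeholder.replaceAllIn(
      format,
      m => java.util.regex.Matcher.quoteReplacement(values.getOrElse(m.group(1), fallback))
    )
}
```

For example, rendering `"{pr-number} {issue-title} by {contributors}\n{rls-notes}"` with values for `pr-number`, `issue-title` and `rls-notes` but none for `contributors` produces a two-line row in which the contributors slot holds the alternative string. `quoteReplacement` keeps characters such as `$` in release notes from being interpreted as regex group references.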

Summary

  • No special label is needed for the generator.
  • Special label(s) will be used only for GH check workflow (check the presence of Release notes)
