Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Split Github comments instead of trimming #301

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

andridzi
Copy link
Contributor

@andridzi andridzi commented Nov 7, 2024

  • Split Github comments instead of trimming
  • Delete kubechecks running comment instead of editing message

Can't do this for Gitlab because I don't have a way how to test it.
When enabled kubeconform checks it's possible to have markdown issues when comment is splitted in the middle of the report.

Partially fixes #299

@andridzi
Copy link
Contributor Author

@djeebus
can you please review it? I'm looking for this one to be released ASAP if possible, because it's already very much required feature from my side.
thanks in advance

@Greyeye
Copy link
Collaborator

Greyeye commented Nov 14, 2024

@andridzi, please add a test for splitComments.

@KyriosGN0
Copy link
Contributor

@andridzi are you planning on working on this PR? i would really like this feature.....

@andridzi
Copy link
Contributor Author

@KyriosGN0 , globally - yes, but not right now
I was trying to add a test for splitComments but failed
will give another try to it in few weeks I guess

@vovinacci
Copy link

vovinacci commented Feb 3, 2025

Reinforcements have arrived (c)

Hello @djeebus and @Greyeye,

I’ve re‐worked the implementation and added unit tests.

Many thanks to @andridzi for allowing me to incorporate enhancements into this PR, so we don’t end up with multiple forks and overlapping PRs.

The code and tests are extensively commented, as string manipulation can be rather intricate to revisit or debug. I trust the verbosity clarifies what’s happening under the hood.

Please let me know if you notice anything that might need adjusting or if you have any queries.

Cheers!

P.S. @andridzi, I’ve synced the fork to the latest changes. Could you kindly verify whether there are any unwanted modifications?

@andridzi
Copy link
Contributor Author

andridzi commented Feb 3, 2025

hello @vovinacci

thank you very much for jumping into this, really appreciated!

changes looks very good (especially test part, that very hard for me :) )

one small bug I found - looks like after <details><summary>Show Output</summary> there is a missing "```diff\n" at the beginning of each continued message.

everything else works just fine

@zapier-sre-bot
Copy link
Collaborator

Mergecat's Review

Click to read mergecats review!

😼 Mergecat review of pkg/vcs/github_client/message.go

@@ -16,7 +16,75 @@ import (
 	"github.com/zapier/kubechecks/telemetry"
 )
 
-const MaxCommentLength = 64 * 1024
+const (
+	MaxCommentLength = 64 * 1024
+
+	// Using string concatenation for better code readability.
+	// Cannot use raw string literal (backticks) here as the separators themselves contain backticks.
+	sepStart = "Continued from previous comment.\n" +
+		"<details><summary>Show Output</summary>\n\n" +
+		"```diff\n"
+
+	sepEnd = "\n```\n" +
+		"</details>\n" +
+		"<br>\n\n" +
+		"**Warning**: Output length greater than maximum allowed comment size. Continued in next comment."
+)
+
+// splitComment splits the given comment into chunks from the beginning,
+// ensuring that each decorated chunk does not exceed maxSize.
+// - The first chunk has no prefix but, if not the only chunk, is suffixed with sepEnd.
+// - Subsequent chunks are prefixed with sepStart. If they’re not final, they are also suffixed with sepEnd.
+// This forward‐splitting approach preserves the beginning of the comment.
+func splitComment(comment string, maxSize int, sepEnd string, sepStart string) []string {
+	// Guard: If the comment fits in one chunk, return it unsplit.
+	if len(comment) <= maxSize {
+		return []string{comment}
+	}
+	// Guard: if maxSize is too small to accommodate even one raw character with decorations,
+	// return the unsplit comment.
+	if maxSize < len(sepEnd)+1 || maxSize < len(sepStart)+1 {
+		return []string{comment}
+	}
+
+	// Check if we have capacity for subsequent chunks
+	if maxSize-len(sepStart)-len(sepEnd) <= 0 {
+		// No room for raw text if we try to use both prefix and suffix
+		// => fallback to unsplit
+		return []string{comment}
+	}
+
+	var parts []string
+
+	// Process the first chunk.
+	// For the first chunk (if a split is needed) we reserve space for sepEnd only.
+	firstRawCapacity := maxSize - len(sepEnd)
+	firstChunkRaw := comment[0:firstRawCapacity]
+	parts = append(parts, firstChunkRaw+sepEnd)
+	i := firstRawCapacity
+
+	// Process subsequent chunks.
+	for i < len(comment) {
+		remaining := len(comment) - i
+
+		// If the remaining text fits in one final chunk (with only the sepStart prefix),
+		// then create that final chunk without a trailing sepEnd.
+		if remaining <= maxSize-len(sepStart) {
+			parts = append(parts, sepStart+comment[i:])
+			break
+		} else {
+			// Otherwise, for a non-final chunk, reserve space for both prefix and suffix.
+			rawCapacity := maxSize - len(sepStart) - len(sepEnd)
+			// The following slice is guaranteed to be in range because we only land here
+			// if remaining > maxSize - len(sepStart). Consequently, rawCapacity <= remaining,
+			// ensuring i+rawCapacity is within the comment's length.
+			chunk := sepStart + comment[i:i+rawCapacity] + sepEnd
+			parts = append(parts, chunk)
+			i += rawCapacity
+		}
+	}
+	return parts
+}
 
 func (c *Client) PostMessage(ctx context.Context, pr vcs.PullRequest, message string) (*msg.Message, error) {
 	_, span := tracer.Start(ctx, "PostMessageToMergeRequest")
@@ -27,6 +95,11 @@ func (c *Client) PostMessage(ctx context.Context, pr vcs.PullRequest, message st
 		message = message[:MaxCommentLength]
 	}
 
+	if err := c.deleteLatestRunningComment(ctx, pr); err != nil {
+		log.Error().Err(err).Msg("failed to delete latest 'kubechecks running' comment")
+		return nil, err
+	}
+
 	log.Debug().Msgf("Posting message to PR %d in repo %s", pr.CheckID, pr.FullName)
 	comment, _, err := c.googleClient.Issues.CreateComment(
 		ctx,
@@ -48,30 +121,72 @@ func (c *Client) UpdateMessage(ctx context.Context, m *msg.Message, msg string)
 	_, span := tracer.Start(ctx, "UpdateMessage")
 	defer span.End()
 
-	if len(msg) > MaxCommentLength {
-		log.Warn().Int("original_length", len(msg)).Msg("trimming the comment size")
-		msg = msg[:MaxCommentLength]
+	comments := splitComment(msg, MaxCommentLength, sepEnd, sepStart)
+
+	owner, repo, ok := strings.Cut(m.Name, "/")
+	if !ok {
+		e := fmt.Errorf("invalid GitHub repository name: no '/' in %q", m.Name)
+		telemetry.SetError(span, e, "Invalid GitHub full repository name")
+		log.Error().Err(e).Msg("invalid GitHub repository name")
+		return e
 	}
 
-	log.Info().Msgf("Updating message for PR %d in repo %s", m.CheckID, m.Name)
+	pr := vcs.PullRequest{
+		Owner:    owner,
+		Name:     repo,
+		CheckID:  m.CheckID,
+		FullName: m.Name,
+	}
 
-	repoNameComponents := strings.Split(m.Name, "/")
-	comment, resp, err := c.googleClient.Issues.EditComment(
-		ctx,
-		repoNameComponents[0],
-		repoNameComponents[1],
-		int64(m.NoteID),
-		&github.IssueComment{Body: &msg},
-	)
+	log.Debug().Msgf("Updating message in PR %d in repo %s", pr.CheckID, pr.FullName)
 
-	if err != nil {
-		telemetry.SetError(span, err, "Update Pull Request comment")
-		log.Error().Err(err).Msgf("could not update message to PR, msg: %s, response: %+v", msg, resp)
+	if err := c.deleteLatestRunningComment(ctx, pr); err != nil {
 		return err
 	}
 
-	// update note id just in case it changed
-	m.NoteID = int(*comment.ID)
+	for _, comment := range comments {
+		cc, _, err := c.googleClient.Issues.CreateComment(
+			ctx, owner, repo, m.CheckID, &github.IssueComment{Body: &comment},
+		)
+		if err != nil {
+			telemetry.SetError(span, err, "Update Pull Request comment")
+			log.Error().Err(err).Msg("could not post updated message comment to PR")
+			return err
+		}
+		m.NoteID = int(*cc.ID)
+	}
+
+	return nil
+}
+
+func (c *Client) deleteLatestRunningComment(ctx context.Context, pr vcs.PullRequest) error {
+	_, span := tracer.Start(ctx, "deleteLatestRunningComment")
+	defer span.End()
+
+	existingComments, resp, err := c.googleClient.Issues.ListComments(
+		ctx, pr.Owner, pr.Name, pr.CheckID, &github.IssueListCommentsOptions{
+			Sort:      pkg.Pointer("created"),
+			Direction: pkg.Pointer("asc"),
+		},
+	)
+	if err != nil {
+		telemetry.SetError(span, err, "List Pull Request comments")
+		log.Error().Err(err).Msgf("could not retrieve existing PR comments, response: %+v", resp)
+		return fmt.Errorf("failed to list comments: %w", err)
+	}
+
+	// Find and delete the first running comment.
+	for _, existingComment := range existingComments {
+		if existingComment.Body != nil && strings.Contains(*existingComment.Body, ":hourglass: kubechecks running ... ") {
+			log.Debug().Msgf("Deleting 'kubechecks running' comment with ID %d", *existingComment.ID)
+			if r, e := c.googleClient.Issues.DeleteComment(ctx, pr.Owner, pr.Name, *existingComment.ID); e != nil {
+				telemetry.SetError(span, e, "Delete Pull Request comment")
+				log.Error().Err(e).Msgf("failed to delete 'kubechecks running' comment, response: %+v", r)
+				return fmt.Errorf("failed to delete 'kubechecks running' comment: %w", e)
+			}
+			break
+		}
+	}
 
 	return nil
 }
@@ -105,7 +220,7 @@ func (c *Client) hideOutdatedMessages(ctx context.Context, pr vcs.PullRequest, c
 
 	for _, comment := range comments {
 		if strings.EqualFold(comment.GetUser().GetLogin(), c.username) {
-			// Github API does not expose minimizeComment API. IT's only available from the GraphQL API
+			// GitHub API does not expose minimizeComment API. It's only available from the GraphQL API
 			// https://docs.github.com/en/graphql/reference/mutations#minimizecomment
 			var m struct {
 				MinimizeComment struct {

Feedback & Suggestions:

  1. Security: The deleteLatestRunningComment function retrieves and deletes comments based on a specific string pattern. Ensure that this operation is safe from potential race conditions or malicious comment manipulations. Consider adding additional checks or logging to verify the integrity of the comments being deleted. 🛡️

  2. Performance: The splitComment function is well-structured for splitting large comments. However, consider optimizing the loop by using a buffer or pre-allocating the parts slice to avoid multiple allocations, especially if dealing with a large number of comments. 🚀

  3. Error Handling: In deleteLatestRunningComment, when deleting a comment, if an error occurs, it logs the error and returns. Consider adding a retry mechanism or more detailed error handling to ensure robustness, especially in network-related operations. 🔄

  4. Code Readability: The use of strings.Cut in UpdateMessage is a nice touch for splitting the repository name. Ensure that this pattern is consistently used across the codebase for similar operations to maintain readability and consistency. 📚

  5. Logging: The logging messages are informative, but consider adding more context where necessary, such as including the PR URL or additional identifiers, to make debugging easier. 📝



Dependency Review

Click to read mergecats review!

No suggestions found

@vovinacci
Copy link

vovinacci commented Feb 4, 2025

@andridzi Fixed in 042f102.
I've overlooked that originally markdown backticks were part of the function and not the constant itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple comments instead of trimming comment size
6 participants