Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-12175. Audit logs in SCM shouldn't print delete txns #7805

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

Tejaskriya
Copy link
Contributor

@Tejaskriya Tejaskriya commented Feb 4, 2025

What changes were proposed in this pull request?

As a part of SCM audits for datanode's heartbeats, the command.toString for the commands being sent to the datanodes are logged. For most commands this results in a short string, but for deleteBlocks command, all the delete transactions are also added. This leads to a very verbose audit log.
SCM should add only the metadata response to the logs appropriately for each of the commands.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12175

How was this patch tested?

Tested manually.
Before the change:

2025-02-05 17:12:56,748 | INFO  | SCMAudit | user=hadoop | ip=172.18.0.10 | op=SEND_HEARTBEAT {datanodeUUID=b8ca7816-1298-4f9b-b962-a8b43351fb30, term=3, command=[commandType: deleteBlocksCommand deleteBlocksCommandProto { deletedBlocksTransactions { txID: 4 containerID: 4 localID: 115816896921600004 localID: 115816896921600005 localID: 115816896921600006 count: 0 } cmdId: 1738775521193 } term: 3 encodedToken: "" deadlineMsSinceEpoch: 0 ]} | ret=SUCCESS |

After the change:

2025-02-07 08:26:25,050 | INFO  | SCMAudit | user=hadoop | ip=172.18.0.4 | op=SEND_HEARTBEAT {datanodeUUID=45691298-4592-4dab-b4fd-8488660f479c, term=4, command=[commandType: deleteBlocksCommand deleteTransactionsCount: 1 cmdID: 1738916727618 encodedToken: "" deadlineMsSinceEpoch: 0]} | ret=SUCCESS |

The debug log added:

2025-02-07 14:25:03 2025-02-07 08:55:03,855 [IPC Server handler 9 on default port 9861] INFO server.SCMDatanodeHeartbeatDispatcher: Heartbeat dispatched: datanode=a4c34281-ee25-4b99-a63b-2b438064ffe8, Commands= [cmdID: 1738918447397 encodedToken: "" term: 3 deadlineMsSinceEpoch: 0 deleteBlocksTransactions: { txnId:1 containerID:4 deleteBlockCount:1 count:0 }]

@Tejaskriya Tejaskriya marked this pull request as ready for review February 5, 2025 08:36
Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Tejaskriya for the patch.

Please provide examples of audit log before/after the change, can take it from acceptance test logs (before, after).

@Tejaskriya
Copy link
Contributor Author

@adoroszlai Thanks for the review! I have made the changes and added the old and new audit log in the PR description.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the patch.

@Tejaskriya
Copy link
Contributor Author

@adoroszlai I have addressed all the comments, Thanks for suggesting these improvements!

Copy link
Contributor

@nandakumar131 nandakumar131 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Tejaskriya for working on this.
Overall the change looks good.
Let's retain cmdId, encodedToken and deadlineMsSinceEpoch in the audit log for deleteBlocksCommand as well.

Also, please add a debug level log which prints the transactions. We don't have to print the block IDs (Local IDs), instead we can print the total number of blocks in a container.

@nandakumar131
Copy link
Contributor

You can add the debug log in SCMDatanodeHeartbeatDispatcher#dispatch.

@Tejaskriya
Copy link
Contributor Author

Thanks for the review @nandakumar131 , I have added the extra details to the audit log and also added a debug log in SCMDatanodeHeartbeatDispatcher#dispatch.
The updated output is added to the PR description

Comment on lines +209 to +231
allCommands.append("cmdID: ").append(cmd.getId());
allCommands.append(" encodedToken: \"").append(cmd.getEncodedToken()).append("\"");
allCommands.append(" term: ").append(cmd.getTerm());
allCommands.append(" deadlineMsSinceEpoch: ").append(cmd.getDeadline());
if (cmd.getType().equals(deleteBlocksCommand)) {
DeleteBlocksCommand delCmd = (DeleteBlocksCommand) cmd;
allCommands.append(" deleteBlocksTransactions: {");
for (DeletedBlocksTransaction txn : delCmd.blocksTobeDeleted()) {
allCommands.append(" txnId:").append(txn.getTxID());
allCommands.append(" containerID:").append(txn.getContainerID());
allCommands.append(" deleteBlockCount:").append(txn.getLocalIDCount());
allCommands.append(" count:").append(txn.getCount());
allCommands.append(",");
}
allCommands.deleteCharAt(allCommands.length() - 1);
allCommands.append(" }");
} else {
allCommands.append(" ").append(cmd.getProto());
}
allCommands.append(", ");
}
int len = allCommands.length();
allCommands.delete(len - 2, len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCMCommand is not a protobuf object, you can override toString method in all the SCMCommand implementation and just logging the SCMCommand#toString here should be enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants