Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance blkdiscard for Improved Disk Initialization & Pre-commit Linting #41

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

MaozOvadia
Copy link

Enhance blkdiscard for Improved Disk Initialization & Pre-commit Linting

Description

This PR includes the following improvements:

1. Enhanced blkdiscard Functionality

  • Modified the blkdiscard function to accept multiple commands, enabling dynamic disk initialization.
  • Added a dual-step blkdiscard process:
    • Step 1: The first blkdiscard command ensures that existing data on the disk is erased. The behavior varies by environment:
      • On AWS, it uses a randomized pattern to overwrite the disk content, ensuring prior Aerospike data becomes unreadable.
      • On other environments, it zeroes out the disk content.
    • Step 2: The second blkdiscard command forces zeros to the first 8MB of the disk header, making the disk "Aerospike-ready".
  • This approach is particularly beneficial for dynamic environments, such as when using the OpenEBS LVM provisioner, where disks are created on-the-fly as part of PVC provisioning.

2. Improved Pre-commit Configuration for Linting

  • Updated the .pre-commit-config.yaml file to ensure golangci-lint runs on the entire codebase.
  • Added pass_filenames: false to the configuration, resolving type-check errors by ensuring that all files are linted, not just the modified ones.

Background & Context

In dynamic setups like AWS, where the OpenEBS LVM provisioner is used, disks are created on-the-fly during PVC creation. The previous approach, which used dd for disk initialization, led to significant delays (up to an hour) when adding new nodes to an Aerospike cluster.

To optimize this process, the blkdiscard command was enhanced to:

  • Ensure that any prior Aerospike data on the disk is irrecoverable by using randomized writes (specific to AWS).
  • Prepare the disk for Aerospike use by zeroing out the first 8MB of the header, thus eliminating the need for time-consuming dd operations.

The updated solution has been tested on an AWS setup and was found to be suitable for the team's needs. However, there was an internal discussion about potentially creating a more configurable option, such as a new initMethod, to handle more diverse use cases.


Key Changes

Disk Initialization Enhancements:

  • Modified blkdiscard to accept multiple commands.
  • Introduced a two-step blkdiscard process:
    1. Initial blkdiscard (random or zeros) to erase existing content.
    2. Secondary blkdiscard to zero out the first 8MB header.

Pre-commit Configuration:

  • Ensured that golangci-lint covers the entire codebase.
  • Updated configuration to address type-check issues by linting all files.

Testing & Validation

  • The modified blkdiscard approach has been successfully tested in an AWS environment using the OpenEBS LVM provisioner. The solution was found to significantly reduce node initialization times compared to the previous dd method.
  • The updated pre-commit configuration has been validated to ensure that golangci-lint runs comprehensively across the entire project.

Potential Future Enhancements

  • Consider adding a new initMethod option:
    • An option like blkdiscard-aws could be introduced for environments requiring specialized initialization (e.g., randomized writes on AWS).
    • This change would require updates to the Aerospike Kubernetes Operator (AKO) and the init container configuration.

Closing Notes

This PR aims to provide an immediate performance boost for disk initialization in dynamic environments, while also leaving room for future enhancements if more flexibility is desired. Thank you for your feedback and consideration!


- Added 'pass_filenames: false' to pre-commit configuration
- This change ensures golangci-lint runs on the entire codebase, not just modified files, resolving typecheck errors
- Modified blkdiscard function to accept and execute multiple commands
- Added dual blkdiscard commands:
  1. Standard blkdiscard to ensure existing data is erased (random on AWS, zeros elsewhere)
  2. Additional blkdiscard to zero out the 8MB header, making the disk Aerospike-ready
- This modification supports dynamic disk initialization, useful for environments like AWS where disks are created on-the-fly (e.g., OpenEBS LVM provisioner)
- Addresses need for efficient disk readiness without lengthy dd operations
@sud82
Copy link
Collaborator

sud82 commented Nov 22, 2024

Hi @MaozOvadia,

As you know that the blkdiscard command has different behavior across different platforms, it is safe to do the header cleaning only if it can make sure that no aerospike data will be returned. Therefore, our current blkdiscard initMethod was only for devices that support RZAT (Read zero after trim).

To further support the environments like AWS where blkdiscard can return random data (Non-aerospike data), we will give a different initMethod having the header cleaning procedure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants