Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider GMP implementation #1

Open
ocram opened this issue Nov 19, 2019 · 0 comments
Open

Consider GMP implementation #1

ocram opened this issue Nov 19, 2019 · 0 comments
Labels
enhancement New feature or request

Comments

@ocram
Copy link
Contributor

ocram commented Nov 19, 2019

Making use of the GMP (GNU Multiple Precision) extension (gmp) can potentially speed up the generation of long random strings:

public static function stringFromAlphabet(string $alphabet, int $length): string {
    // determine the base of the alphabet
    $base = \strlen($alphabet);
    // calculate the size of the range that random values are to be returned from, which can then be used as an upper bound (exclusive) with a lower bound (inclusive) of zero
    $desiredRangeGmp = \gmp_pow(\gmp_init($base, 10), $length);
    // calculate (log_2) the number of bits required (as a minimum) so that every integer in the range can be represented
    $bitsNeededGmp = \strlen(\gmp_strval($desiredRangeGmp, 2));
    // calculate the number of bytes required
    $bytesNeeded = \gmp_intval(\gmp_div_q($bitsNeededGmp, \gmp_init(8, 10), \GMP_ROUND_PLUSINF));
    // determine the maximum available range of random integers that the chosen number of bytes will provide
    $maximumRangeGmp = \gmp_pow(\gmp_pow(\gmp_init(2, 10), 8), $bytesNeeded);
    // extend the desired range to the largest whole multiple of its own that will still be within the maximum available range provided by the bytes, as an optimization so that many more numbers fall into the range that is used and significantly fewer numbers have to be discarded
    $extendedRangeGmp = \gmp_sub($maximumRangeGmp, \gmp_mod($maximumRangeGmp, $desiredRangeGmp));
    // determine the maximum decimal that falls into the extended range and can thus be accepted
    $maxDecimalGmp = \gmp_sub($extendedRangeGmp, \gmp_init(1, 10));

    // until bytes in the desired (extended) range have been found, perform rejection sampling and discard any bytes outside the range to achieve a uniform distribution
    do {
        // generate a sequence of random bytes with the required length
        $bytes = \Delight\Random\Random::bytes($bytesNeeded);
    } while (\gmp_cmp(\gmp_import($bytes), $maxDecimalGmp) === 1);

    // the bytes from the extended range can safely be converted to the target alphabet via a base conversion (without introducing bias)
    $str = \Delight\BaseConvert\Alphabet::convert($bytes, \Delight\BaseConvert\Alphabet::BYTE, $alphabet);
    // the modulo operation can be applied safely on numbers from the extended range (without introducing bias)
    $str = \substr($str, -$length);
    // pad the number with zeros at the left
    $str = \str_pad($str, $length, $alphabet[0], \STR_PAD_LEFT);

    return $str;
}

But this doesn’t seem to be faster for many short strings. So the question becomes when to use this. A draft implementation for deciding this is the following:

private static function shouldUseGmpInsteadOfMultipleRandomInts() {
    if (!\extension_loaded('gmp')) {
        return false;
    }

    if (\strncasecmp(\PHP_OS, 'WIN', 3) === 0) {
        if (\Delight\Alphabets\Alphabet::isStandardAlphabet($alphabet)) {
            return $length >= 8;
        }
        else {
            return $length >= 20;
        }
    }
    else {
        if (\Delight\Alphabets\Alphabet::isStandardAlphabet($alphabet)) {
            return $length >= 13;
        }
        else {
            return false;
        }
    }
}
@ocram ocram added the enhancement New feature or request label Nov 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant