Slower than std? #24

jonashaag · 2023-10-10T16:49:11Z

I was debugging this and wanted to write a reproducer. For some reason in my tests, std is always faster at parsing than this library. Anything wrong with my benchmark?

>>> import random
>>> open("/tmp/numbers","w").write("\n".join(str(random.randint(0, 1000000)) for _ in range(10000000)))

cargo run --example bench -r < /tmp/numbers

use std::io;
use std::str;
use std::str::FromStr;
use std::time::Instant;

use atoi::FromRadix10Signed;

fn main() {
    let mut buf_digits: Vec<u8> = Vec::new();
    let mut sum = 0;
    let texts: Vec<Vec<u8>> = io::stdin()
        .lines()
        .map(|l| l.unwrap().as_bytes().into())
        .collect();
    let now = Instant::now();
    for text in texts {
        let num = if true {
            // from_str
            let utf8 = str::from_utf8(&text).unwrap();
            i128::from_str(utf8).unwrap()
        } else if false {
            // from_str with . filter
            let utf8 = str::from_utf8(&text).unwrap();
            buf_digits.clear();
            buf_digits.extend(utf8.as_bytes().into_iter().filter(|&&c| c != b'.'));
            i128::from_str(str::from_utf8(&buf_digits).unwrap()).unwrap()
        } else if false {
            // from_radix
            let (num, _consumed) = i128::from_radix_10_signed(&text);
            num
        } else if false {
            // from_radix with . filter
            buf_digits.clear();
            buf_digits.extend(text.into_iter().filter(|&c| c != b'.'));
            let (num, _consumed) = i128::from_radix_10_signed(&buf_digits);
            num
        // } else if true {
        //     // atoi_radix10
        //     let utf8 = str::from_utf8(&text).unwrap();
        //     atoi_radix10::parse_from_str(utf8).unwrap()
        } else {
            panic!("select a benchmark");
        };
        if num < 0 {
            dbg!(num);
        }
        sum += num;
    }
    let elapsed_time = now.elapsed();
    println!("Took {} ms.", elapsed_time.as_millis());
    dbg!(sum);
}

pacman82 · 2023-10-10T20:35:13Z

This crate comes with a criterion benchmark suite. I would suggest checking out the repository and running it using cargo bench. You even get nice plots. I added a commit with some benchmarks for i128. This crate is less about crazy optimizations and more about avoiding the detour over UTF-8. On my system this seems to pay off. Your mileage may vary.
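
To illustrate what "avoiding the detour over utf8" can mean, here is a minimal std-only sketch. The helper `parse_i128_from_bytes` is hypothetical and written only for this example: it is not the atoi crate's implementation, and it ignores overflow. It reads digits directly from a `&[u8]`, while `parse_i128_via_utf8` (also a name made up here) first validates the bytes as UTF-8 and then calls `str::parse`.

```rust
use std::str;

/// Hypothetical helper, NOT the atoi crate's code: parses an optional sign
/// followed by ASCII digits straight from bytes. Returns the value and the
/// number of bytes consumed. Overflow wraps silently in this sketch.
fn parse_i128_from_bytes(text: &[u8]) -> (i128, usize) {
    let (negative, start) = match text.first().copied() {
        Some(b'-') => (true, 1),
        Some(b'+') => (false, 1),
        _ => (false, 0),
    };
    let mut value: i128 = 0;
    let mut consumed = start;
    for &byte in &text[start..] {
        if !byte.is_ascii_digit() {
            break; // stop at the first non-digit byte
        }
        value = value.wrapping_mul(10).wrapping_add(i128::from(byte - b'0'));
        consumed += 1;
    }
    if consumed == start {
        return (0, 0); // no digits found, nothing consumed
    }
    (if negative { value.wrapping_neg() } else { value }, consumed)
}

/// The "detour": validate the bytes as UTF-8 first, then parse the &str.
fn parse_i128_via_utf8(text: &[u8]) -> i128 {
    str::from_utf8(text).unwrap().parse::<i128>().unwrap()
}

fn main() {
    let text = b"-12345";
    let (direct, consumed) = parse_i128_from_bytes(text);
    let via_str = parse_i128_via_utf8(text);
    assert_eq!(direct, via_str);
    println!("{direct} ({consumed} bytes consumed), via str: {via_str}");
}
```

Both paths give the same value for plain decimal input. Which one is faster depends on the input and the machine, so it is worth measuring rather than assuming.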

jonashaag · 2023-10-11T08:44:22Z

Here's a benchmark suite that demonstrates the difference

use atoi::FromRadix10Signed;
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use std::str;
use std::str::FromStr;

use std::fs::read_to_string;

pub fn i128_signed_four_digit_number(c: &mut Criterion) {
    c.bench_function("signed i128 four digit number", |b| {
        let lines: Vec<Vec<u8>> = read_to_string("/tmp/numbers")
            .unwrap()
            .lines()
            .map(|l| l.as_bytes().into())
            .collect();
        b.iter(|| {
            black_box(&lines)
                .iter()
                .map(|l| i128::from_radix_10_signed(l).0)
                .collect::<Vec<_>>()
        })
    });
}

pub fn i128_through_utf8(c: &mut Criterion) {
    c.bench_function("i128 via UTF-8", |b| {
        let lines: Vec<Vec<u8>> = read_to_string("/tmp/numbers")
            .unwrap()
            .lines()
            .map(|l| l.as_bytes().into())
            .collect();
        b.iter(|| {
            black_box(&lines)
                .iter()
                .map(|l| {
                    let s = str::from_utf8(l).unwrap();
                    s.parse::<i128>().unwrap();
                    //i128::from_str(s).unwrap();
                    //atoi_radix10::parse_from_str(s).unwrap::<i128>();
                })
                .collect::<Vec<_>>()
        })
    });
}

criterion_group!(benches, i128_signed_four_digit_number, i128_through_utf8,);
criterion_main!(benches);

std is 4x faster. Anything wrong about my benchmark?

pacman82 · 2023-10-11T08:57:35Z

Not on the face of it. Wouldn't be able to reproduce without your numbers though.

jonashaag · 2023-10-11T09:01:51Z

See first post, it's just a bunch of ints
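
Looking back at the criterion suite above, one difference between the two benchmark functions may be worth ruling out. In `i128_through_utf8` the closure ends with `s.parse::<i128>().unwrap();` including the trailing semicolon, so it returns `()` and the `collect` builds a `Vec<()>`. The atoi benchmark collects a `Vec<i128>`. A vector of zero-sized values needs no heap storage for its elements, so the two benchmarks do different amounts of work around the parsing itself. Whether this explains the observed gap has not been measured here. The std-only sketch below only demonstrates the type difference; the function names are made up for this example.

```rust
use std::mem::size_of;
use std::str;

/// Mirrors the closure with a trailing semicolon: the parsed value is
/// dropped and each element of the result is `()`.
fn parse_discarding(lines: &[Vec<u8>]) -> Vec<()> {
    lines
        .iter()
        .map(|l| {
            let s = str::from_utf8(l).unwrap();
            s.parse::<i128>().unwrap();
        })
        .collect::<Vec<_>>()
}

/// Same closure without the semicolon: the parsed value is kept.
fn parse_keeping(lines: &[Vec<u8>]) -> Vec<i128> {
    lines
        .iter()
        .map(|l| {
            let s = str::from_utf8(l).unwrap();
            s.parse::<i128>().unwrap()
        })
        .collect::<Vec<_>>()
}

fn main() {
    let lines = vec![b"12".to_vec(), b"345".to_vec(), b"6789".to_vec()];
    let units = parse_discarding(&lines);
    let values = parse_keeping(&lines);
    println!(
        "{} unit elements of {} bytes each, {} i128 elements of {} bytes each",
        units.len(),
        size_of::<()>(),
        values.len(),
        size_of::<i128>()
    );
}
```

Dropping the semicolon in the benchmark closure, so that both functions collect a `Vec<i128>`, would make the two measurements more directly comparable.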
