
Error casting strings with large dates to Timestamp #7208

Open

ryzhyk opened this issue Feb 27, 2025 · 1 comment
Labels
enhancement Any new improvement worthy of an entry in the changelog

Comments

ryzhyk commented Feb 27, 2025

Describe the bug

Attempting to cast a string containing a valid ISO 8601 timestamp with a large, five-digit year (e.g., +10999-12-31T00:00:00) to a Timestamp type fails with:

Error parsing timestamp from '+10999-12-31T00:00:00': error parsing date

To Reproduce

use std::sync::Arc;

use arrow_array::{ArrayRef, StringArray};
use arrow_cast::display::FormatOptions;
use arrow_cast::{cast_with_options, CastOptions};
use arrow_schema::{DataType, TimeUnit};

#[test]
fn test_cast_string_with_large_timestamp_to_timestamp() {
    let array = Arc::new(StringArray::from(vec![Some("+10999-12-31T00:00:00")])) as ArrayRef;
    let to_type = DataType::Timestamp(TimeUnit::Second, None);
    // `safe: false` makes the kernel return an error instead of silently producing NULL.
    let options = CastOptions {
        safe: false,
        format_options: FormatOptions::default(),
    };
    // Panics with: Error parsing timestamp from '+10999-12-31T00:00:00': error parsing date
    let b = cast_with_options(&array, &to_type, &options).unwrap();
}
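For contrast (my addition, not part of the original report): appending the lines below inside the same test shows the identical cast succeeding for a four-digit year, which isolates the signed expanded-year form as the trigger:

// Extra check: the same cast with a 4-digit year parses fine,
// so only the signed 5-digit (expanded) year form fails.
let ok = Arc::new(StringArray::from(vec![Some("9999-12-31T00:00:00")])) as ArrayRef;
assert!(cast_with_options(&ok, &to_type, &options).is_ok());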

Expected behavior
The cast succeeds.
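Concretely, a sketch of the assertion a fixed cast could satisfy, appended to the test above. The epoch value is hand-computed (3_298_139 days × 86_400 s/day in the proleptic Gregorian calendar), so treat it as an assumption to verify rather than a known-good constant:

use arrow_array::{cast::AsArray, types::TimestampSecondType};

// +10999-12-31T00:00:00 UTC should be 284_959_209_600 seconds since the Unix
// epoch (hand-computed; verify independently before relying on it).
let ts = b.as_primitive::<TimestampSecondType>();
assert_eq!(ts.value(0), 284_959_209_600);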

Additional context

I think this issue is similar to #7073, but this one is for timestamps, whereas the other one affected dates.

I ran into this issue when parsing the metadata file of a Delta Lake table containing large dates. The table was created using Spark SQL, i.e., this issue occurs in the wild.

@ryzhyk ryzhyk added the bug label Feb 27, 2025
@tustvold tustvold added enhancement Any new improvement worthy of an entry in the changelog and removed bug labels Feb 27, 2025
@mbutrovich

I've actually been debugging a similar issue for DataFusion Comet and will open a related issue shortly. The problem may stem from the fact that Spark still defaults to writing INT96 for timestamps. In my case, we read large timestamp values back from a Parquet file, and arrow-rs coerces them into a Timestamp(TimeUnit::Nanosecond, None) by default, which cannot represent as large a date range as INT96.
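To put numbers on that range gap (my arithmetic, not from the comment): an i64 of nanoseconds since the Unix epoch spans only roughly 1677-09-21 through 2262-04-11, while second precision comfortably covers the year +10999:

// i64 nanoseconds since the epoch cover only ~292 years around 1970,
// so +10999 cannot be represented at TimeUnit::Nanosecond.
let ns_range_years = i64::MAX / 1_000_000_000 / 86_400 / 365;
assert_eq!(ns_range_years, 292);

// i64 seconds since the epoch cover ~292 billion years, far beyond +10999.
let s_range_years = i64::MAX / 86_400 / 365;
assert!(s_range_years > 292_000_000_000);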
