avoid decoding sample data as Ints #99

kolia · 2021-08-20T16:07:01Z

If the encoding offset and resolution are 0.0 and 1.0 respectively, the decoded samples.data can have eltype Int, which is not friendly for downstream uses of the data, including DSP.

This could be avoided by converting decoded data to have Float64 eltype, or made slightly less awkward to handle by implementing convert(Samples{D}, samples).

jrevels · 2021-08-31T13:19:48Z

In what situation does this arise? If you want the decoded representation to be Float64, you should use Float64 decoding parameters?

It's generally good style in Julia to avoid avoidable promotion/widening, and instead respect the caller's choice of input types as much as possible. To do otherwise is to leak an implementation detail and can be far more annoying to work around in practice (e.g. the classic CuArrays <-> Float32 autoconversion choice that resulted in lots of developer pain).

kolia · 2021-09-02T16:08:07Z

This can arise because of https://github.com/beacon-biosignals/Onda.jl/blob/master/src/samples.jl#L352
If sample_resolution_in_unit is 1.0 and sample_offset_in_unit is 0.0, then the decoded sample eltype will be an Int type, whereas in all other cases it will be Float64, since resolution and offset are both Float64s.

Maybe a good fix is to make that line convert decoded data to Float64.
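
To make the report concrete, here is a minimal sketch that reproduces the behavior. `decode_prepatch` is a hypothetical standalone copy of the logic shown on the removed lines of the diff in this thread, not Onda's own function, and the sample values are made up for illustration.

```julia
# Hypothetical standalone copy of the pre-patch logic (not Onda's own function).
function decode_prepatch(sample_resolution_in_unit, sample_offset_in_unit, sample_data)
    if sample_data isa AbstractArray
        # Identity shortcut: taken whenever the parameters compare equal to 1 and 0,
        # regardless of their types.
        sample_resolution_in_unit == 1 && sample_offset_in_unit == 0 && return sample_data
    end
    return sample_resolution_in_unit .* sample_data .+ sample_offset_in_unit
end

encoded = Int16[1, 2, 3]  # made-up encoded samples

identity_case = decode_prepatch(1.0, 0.0, encoded)  # shortcut: input returned as-is
general_case  = decode_prepatch(0.5, 0.0, encoded)  # broadcast: promoted to Float64

println(eltype(identity_case))  # Int16
println(eltype(general_case))   # Float64
```

So the same Float64 parameter types yield two different result eltypes depending only on the parameter values.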

jrevels · 2021-09-02T16:31:08Z

Ah. Yeah, I think it'd be okay to have the result in that case be promoted

> Maybe a good fix is to make that line convert decoded data to Float64.

ehhhh it should be converted to whatever the promotion should be based on the arguments, not to Float64

(and thus the aliasing optimization is still preserved in the case where the types already match)
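
As a rough illustration of "whatever the promotion should be", the sketch below computes the eltype that Julia's promotion rules give for a few argument combinations. `decoded_eltype` is a hypothetical helper, not part of Onda, and including the data eltype in the promotion is an assumption about what "based on the arguments" means here.

```julia
# Hypothetical helper (not part of Onda): the eltype that promotion would give
# for a resolution, an offset, and the encoded data.
decoded_eltype(resolution, offset, data) =
    promote_type(typeof(resolution), typeof(offset), eltype(data))

ints = Int16[1, 2]

t_f64  = decoded_eltype(1.0, 0.0, ints)            # Float64 parameters -> Float64
t_f32  = decoded_eltype(1.0f0, 0.0f0, ints)        # Float32 parameters -> Float32
t_same = decoded_eltype(1.0, 0.0, [1.0, 2.0])      # already Float64 -> no conversion needed

println((t_f64, t_f32, t_same))
```

In the last case the promoted type equals the input eltype, which is the situation where returning the input without copying stays valid.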

kolia · 2021-09-02T16:48:15Z

Ok, I was wondering which of convert.(Float64, sample_data) or sample_data .* 1.0 .+ 0.0 would be faster, and they turn out to be about the same.

using BenchmarkTools
x = rand(Int16, 6 * 200 * 3600 * 8)   # 6-channel 8-hour recording at 200 Hz
@btime $x .* 1.0 .+ 0.0
@btime convert.(Float64, $x)

Both give ~180 ms and 2 allocations totaling 263.67 MiB.

Same times and allocations for x of eltype Float32.

So maybe the best option is to special-case (1.0, 0.0) and return the input sample_data when promote_type(typeof(sample_offset_in_unit), eltype(sample_data)) == eltype(sample_data), and otherwise do the mult/add.

jrevels · 2021-09-02T17:17:38Z

> promote_type(typeof(sample_offset_in_unit), eltype(sample_data)) == eltype(sample_data)

yeah the patch I'd apply is this:

diff --git a/src/samples.jl b/src/samples.jl
index 4c806eb..a91cf0f 100644
--- a/src/samples.jl
+++ b/src/samples.jl
@@ -343,13 +343,17 @@ If:

     sample_data isa AbstractArray &&
     sample_resolution_in_unit == 1 &&
-    sample_offset_in_unit == 0
+    sample_offset_in_unit == 0 &&
+    promote_type(typeof(sample_resolution_in_unit), typeof(sample_offset_in_unit)) == eltype(sample_data)

 then this function is the identity and will return `sample_data` directly without copying.
 """
 function decode(sample_resolution_in_unit, sample_offset_in_unit, sample_data)
-    if sample_data isa AbstractArray
-        sample_resolution_in_unit == 1 && sample_offset_in_unit == 0 && return sample_data
+    if (sample_data isa AbstractArray &&
+        sample_resolution_in_unit == 1 &&
+        sample_offset_in_unit == 0 &&
+        promote_type(typeof(sample_resolution_in_unit), typeof(sample_offset_in_unit)) == eltype(sample_data))
+        return sample_data
     end
     return sample_resolution_in_unit .* sample_data .+ sample_offset_in_unit
 end
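
For reference, a small sketch of how the patched condition behaves. `decode_patched` is a hypothetical standalone copy of the added lines of the diff above with shortened argument names, not Onda's exported function, and the inputs are made up.

```julia
# Hypothetical standalone copy of the patched logic (not Onda's own function).
function decode_patched(resolution, offset, sample_data)
    if (sample_data isa AbstractArray &&
        resolution == 1 &&
        offset == 0 &&
        promote_type(typeof(resolution), typeof(offset)) == eltype(sample_data))
        return sample_data  # identity case: no copy, input is aliased
    end
    return resolution .* sample_data .+ offset
end

ints   = Int16[1, 2, 3]
floats = [1.0, 2.0, 3.0]

from_ints   = decode_patched(1.0, 0.0, ints)    # types differ: new Float64 array
from_floats = decode_patched(1.0, 0.0, floats)  # types match: same array returned

println(eltype(from_ints))       # Float64
println(from_floats === floats)  # true
```

With Float64 parameters the Int16 input is now promoted, while a Float64 input is still returned without copying.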

jrevels · 2022-11-02T18:49:40Z

Closed by #133; decode will now decode to Float64 by default, but you can pass in an extra type parameter to steer it.

jrevels mentioned this issue Oct 31, 2022

refactor to latest Legolas version + onda.signal v2 #133

Merged

jrevels closed this as completed Nov 2, 2022
