Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid decoding sample data as Ints #99

Closed
kolia opened this issue Aug 20, 2021 · 6 comments
Closed

avoid decoding sample data as Ints #99

kolia opened this issue Aug 20, 2021 · 6 comments

Comments

@kolia
Copy link

kolia commented Aug 20, 2021

If the encoding offset and resolution are Ints 0.0 and 1.0 respectively, the decoded samples.data can have eltype Int, which is not friendly for downstream uses of the data including DSP.

This could be avoided by either converting decoded data to have Float64 eltype, or made slightly less awkward to handle by implementing convert(Samples{D}, samples).

@jrevels
Copy link
Member

jrevels commented Aug 31, 2021

In what situation does this arise? if you want the decoded representation to be Float64, you should use Float64 decoding parameters?

It's generally good style in Julia to avoid avoidable promotion/widening, and instead respect the callers' choice of input types as much as possible. To do otherwise is to leak an implementation detail and can be far more annoying to workaround in practice (e.g. the classic CuArrays <-> Float32 autoconversion choice that resulted in lots of developer pain).

@kolia
Copy link
Author

kolia commented Sep 2, 2021

This can arise because of https://github.com/beacon-biosignals/Onda.jl/blob/master/src/samples.jl#L352
if sample_resolution_in_unit is 1.0 and sample_offset_in_unit is 0.0, then the decoded sample eltype will be Ints instead of Float64 for all other cases, since resolution and offset are both Float64s.

Maybe a good fix is to make that line convert decoded data to Float64.

@jrevels
Copy link
Member

jrevels commented Sep 2, 2021

Ah. Yeah, I think it'd be okay to have the result in that case be promoted

Maybe a good fix is to make that line convert decoded data to Float64.

ehhhh it should converted to whatever the promotion should be based on the arguments, not to Float64

(and thus the aliasing optimization is still preserved in the case where the types already match)

@kolia
Copy link
Author

kolia commented Sep 2, 2021

Ok, I was wondering which of convert.(Float64, sample_data) or sample_data .* 1.0 .+ 0.0 would be faster, and they turn out to be about the same.

using BenchmarkTools
x = rand(Int16, 200 * 3600 * 8)   # 6-channel 8 hour recording at 200Hz
@btime $x .* 1.0 .+ 0.0
@btime convert.(Float64, sample_data)

both give ~180ms and 2 allocations over 263.67Mb.

Same times and allocations for x of eltype Float32.

So maybe best is to just special-case (1.0, 0.0) and return the input sample_data when promote_type(typeof(sample_offset_in_unit), eltype(sample_data)) == eltype(sample_data), and otherwise do the mult/add.

@jrevels
Copy link
Member

jrevels commented Sep 2, 2021

promote_type(typeof(sample_offset_in_unit), eltype(sample_data)) == eltype(sample_data)

yeah the patch I'd apply is this:

diff --git a/src/samples.jl b/src/samples.jl
index 4c806eb..a91cf0f 100644
--- a/src/samples.jl
+++ b/src/samples.jl
@@ -343,13 +343,17 @@ If:

     sample_data isa AbstractArray &&
     sample_resolution_in_unit == 1 &&
-    sample_offset_in_unit == 0
+    sample_offset_in_unit == 0 &&
+    promote_type(typeof(sample_resolution_in_unit), typeof(sample_offset_in_unit)) == eltype(sample_data)

 then this function is the identity and will return `sample_data` directly without copying.
 """
 function decode(sample_resolution_in_unit, sample_offset_in_unit, sample_data)
-    if sample_data isa AbstractArray
-        sample_resolution_in_unit == 1 && sample_offset_in_unit == 0 && return sample_data
+    if (sample_data isa AbstractArray &&
+        sample_resolution_in_unit == 1 &&
+        sample_offset_in_unit == 0 &&
+        promote_type(typeof(sample_resolution_in_unit), typeof(sample_offset_in_unit)) == eltype(sample_data))
+        return sample_data
     end
     return sample_resolution_in_unit .* sample_data .+ sample_offset_in_unit
 end

@jrevels
Copy link
Member

jrevels commented Nov 2, 2022

closed by #133 ; decode will now decode to Float64 by default, but you can pass in an extra type parameter to steer it

@jrevels jrevels closed this as completed Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants