-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for clipping and rounding issues #267
base: 8.x
Are you sure you want to change the base?
Conversation
@@ -121,6 +121,17 @@ public double toGeoPhysical(double sample) { | |||
return rasterDataNode.scale(sample); | |||
} | |||
|
|||
private int toRaw(int sample) { | |||
final double rawSample = rasterDataNode.scaleInverse(sample); | |||
if (rawSample < -2147483648.0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which miracle Number ist this?
Why not
if (rawSample < Integer.MIN_VALUE)
if (rawSample < -2147483648.0) { | ||
return Integer.MIN_VALUE; | ||
} | ||
if (rawSample > 2147483647.0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same as above
} | ||
writableRaster.setSample(x, y, 0, sample); | ||
} | ||
|
||
private int clipOrRound(int sample, int lowerBound, int upperBound) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only a clip without round
Feel free to rename methods and extract constants as you like. I regard it as good programming style to compare variables of the same type only, but you may care about this or not. My only concern here is clipping and rounding. |
@@ -551,9 +605,33 @@ public void setSample(int x, int y, double sample) { | |||
if (scaled) { | |||
sample = toRaw(sample); | |||
} | |||
switch (rasterDataNode.getDataType()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If raw data run into a clipping there must be something wrong with geopysical data because they run outside the covered valid value range, defined by raw data type and scaling. In this case clipping the raw data value then turns a wrong geophysical value into a valid value.
This can not be right.
Clipping or other types of handling values outside the range should be done at the place where the TileImpl is used, because only the context knows how to handle values outside the range.
Clipping then should be done that way:
First calculate the geopysical clipping bound by scaling the raw min and raw max value (take care if signed or unsigned), then clip the geopysical.
Up to here al computations outside of TileImpl.
Then transform to raw and round to prevent the cast error.
An integer cast is neither a floor nor a round nor a cail operation.
A cast on a positive double value is equal to a floor operation.
But on negative values it is equal to a cail operation.
It changes the direction of the value change for negative numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree.
- A geophysical value that is outside the range of the raw data type is not invalid. It is just too large to be stored. It is better to clip it than to pass it uncontrolled.
- Casting of a
double
orfloat
to ashort
orbyte
yields rather weird results. Cast of a positivedouble
orfloat
to ashort
orbyte
is not equal to afloor
operation. Test it yourself. - A scientist developing an operator probably does not want to bother with clipping and data types. Often the data type is predetermined (from external data) and not always the same.
- The API documentation of
setSample
states that the values are clipped, which they are not. - A scientist's expectation probably is: he ensures that numbers (
double
orfloat
) are calculated correctly. How these values are encoded into an image or a file and decoded again is business of the software framework. Encoding may not be lossless. But if something cannot be encoded without loss, the software framework should handle it in a smart and easily predictible way that is as good an approximation to the original information as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- A geophysical value that is outside the range of the raw data type is not invalid. It is just too large to be stored. It is better to clip it than to pass it uncontrolled.
Thats exactly what I meant.
Values are invalid from the point of view of "which data can be saved" not from the Point of view, does the value make sence from the scientific poit of view of valid data range.
If the value is in a scientific valid range, but can not be saved, then the configuration of the store must be wrong. Then either the raw data type or the scaling must be wrong.
If a value is a scientific valid value, a raw clipping changes this scientific valid value. Does this make sence?
Ideally a raw date store should not manipulate data. Also a raw data store shall not know anything of the valid scientific value range. A raw data store shall only be able to tore store all the scientivic valid data of a cerain use case. For this a raw data store has to be configured the right way. Then scientific valid data can be stored using the scaling but without manipulation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my comments
Clipping or other types of handling geophysical values outside the defined geopysical value range should be done at the place where the TileImpl is used, because only the context knows how to handle values outside the range. TileImpl is not only used in CollocationOp but also in many other contexts. |
If new values are to be expected only slightly outside the valid range (defined by data type and scaling factors) because, for example, one has decided to use an interpolation type that is not a linear interpolation for a resampling, then the problem can also be circumvented by slightly reducing the scaling factor in the target raster. Then overshoots from the interpolation fit back into the raw data area. |
Well, the API documentation states that values are clipped. But they are not. The purpose of Your suggestions work technically but do not solve the encoding / decoding problem for raster data types, which are given and whose properties must not be changed. |
|
|
This pull request fixes (some) clipping and rounding issues due to (in my opinion) wrong implementations of
setSample(int x, int y, double / float / int sample)
inTileImpl
. The original implementation did notThe first point leads to unpredictible behaviour (may be predictable for a Java software developer, but not for a scientist who wants to code an operator). The second points leads to a systematic bias when written data are read again.
For example, as a consequence of these issues, the collocation operator produces extremely high (and wrong) signals from almost zero signal for
short
data types when higher than bilinear interpolation (like bicubic interpolation) is requested.Please review my changes. Note that they highlight the problems and are merely suggestions how to possibly solve them (in part).