forked from thorfdbg/difftest_ng
This is difftest_ng, "Difftest, the next generation".
Difftest is a program/framework that helps to find errors in image
compression algorithms. It computes a variety of error measures
between a reference image and a compressed and re-expanded
image. The error measures are targeted not at human vision, but at
automatically detecting common problems in image
codecs.
In addition, difftest_ng includes a couple of convenience functions,
including restricting the measurement only to a single component,
computing the FFT (or weighted FFT), filtering the image, measuring
the histogram and converting between various image formats. Currently,
difftest_ng supports pnm (ppm,pgm,pbm), pgx (JPEG 2000 reference
testing format), bmp, TIFF, multiple raw formats with very flexible
specifications, pfm, rgbe, png, exr and dpx.
difftest_ng compiles under GNU/Linux and probably some other operating
systems. It requires libpng, libgsl and libopenexr for its full
functionality; without these libraries, some of its operations are not
available.
difftest_ng is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.
difftest_ng is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-----------------------------------------------------------------------------------------------
Source and destination image formats are recognized by the file name extension. While this
is certainly not the most elegant method to identify image formats, it has proven to
be robust enough for the purpose of difftest_ng.
The following file name extensions are currently recognized:
.pnm,.pgm,.ppm,.pbm,.pfm,.pfs: File formats from the pnm (Portable Any Map) family.
The precise format does not depend on the extension, but
on the number of components and whether the data is float
or integer. pnm is the superset, pgm is grey scale, ppm
RGB, pbm binary data. While all the above represent
integer samples, pfs and pfm are floating point sample
formats.
.bmp: The Windows(tm) native bitmap format, covering 4 and 8
bits per sample, palette or true-color images of various
bitdepths.
.pgx: The file format defined by ISO/IEC 15444-4 (JPEG 2000)
used there for conformance testing. pgx can represent any
integer type of samples. A pgx file consists of a directory
file - the one with the .pgx ending that is specified, and
header and data files. The header files define the
dimensions and bitdepths of the data files. Data and header
may be concatenated together, difftest_ng understands this
case, but always writes separate headers as output.
.tif,.tiff The TIFF file format, specified and owned by Adobe.
TIFF supports multiple formats, including YUV sub-
sampling, palette files and floating-point formats.
.png PNG is a simple lossless image compression scheme for
internet images, where it replaced gif images. This
requires libpng to be available.
.rgbe,.hdr RGBE high-dynamic range images. While rgbe and hdr are
not exactly identical, they are close enough to be
supported as sub-formats of the same format. When
creating files, RGBE is created.
.dpx The DPX format specified by SMPTE, used mainly to
represent frame-based movie content. dpx can represent
integer and floating point samples of various bit-depths,
including YUV data and subsampling. Unfortunately,
the format is quite underspecified and not all software
seems to follow the specs (or an interpretation thereof)
precisely.
.exr The openEXR format by Industrial Light & Magic, a
format for representing high-dynamic range images. This
requires libopenexr to be available.
.raw,.craw,.v210,.yuv Raw formats. All raw formats are covered by a single
format converter that requires the specification of the
layout of the format as part of the file name. Use the
--rawhelp command line option to learn how to specify a raw
format.
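
As an illustration of extension-based format recognition, the following Python sketch maps the extensions listed above to a format family. It is not taken from the difftest_ng sources: the function name guess_format, the family labels, and the rule that everything after the first '@' is a raw layout specification are assumptions made for this example.

```python
import os

# Extension table transcribed from the list above; the family labels on
# the right-hand side are hypothetical names chosen for this sketch.
EXTENSION_FORMATS = {
    ".pnm": "pnm", ".pgm": "pnm", ".ppm": "pnm", ".pbm": "pnm",
    ".pfm": "pnm", ".pfs": "pnm",
    ".bmp": "bmp",
    ".pgx": "pgx",
    ".tif": "tiff", ".tiff": "tiff",
    ".png": "png",
    ".rgbe": "rgbe", ".hdr": "rgbe",
    ".dpx": "dpx",
    ".exr": "exr",
    ".raw": "raw", ".craw": "raw", ".v210": "raw", ".yuv": "raw",
}


def guess_format(filename):
    """Return the format family for a file name, or None if unknown."""
    # Assumption: for raw files, the layout specification follows the
    # first '@', so only the part in front of it carries the extension.
    base = filename.split("@", 1)[0]
    ext = os.path.splitext(base)[1]
    return EXTENSION_FORMATS.get(ext)
```

For example, guess_format("image.raw@640x480x3:{8=0}:{8=1}:{8=2}") yields "raw", while an unlisted extension yields None.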
-----------------------------------------------------------------------------------------------
Usage: difftest_ng [options] original distorted
where original and distorted are ppm,pbm,pgm,pfm,pfs,bmp,pgx,tif,png,exr,rgbe or raw (craw,v12,yuv) images
and options are one or more of
--psnr : measure the psnr with equal weights over all components
--maxsnr : measure the snr with equal weights over all components, normalize to maximum signal
--snr : measure the snr with equal weights over all components, normalize to source energy
--mse : measure the mean square error with equal weights over all components
--rmse : measure the root mean square error with equal weights over all components
--minpsnr : measure the minimum psnr over all components
--ycbcrpsnr : measure the psnr with weights derived from the YCbCr transformation
--yuvpsnr : measure the psnr with weights derived from the YUV transformation
--swpsnr : measure the psnr with weights coming from the subsampling factors
--mrse : measure the log of the mean relative square error with equal weights
--minmrse : measure the minimum mrse over all components
--ycbcrmrse : measure the mrse with weights derived from the YCbCr transformation
--yuvmrse : measure the mrse with weights derived from the YUV transformation
--swmrse : measure the mrse with weights coming from the subsampling factors
--peak : measure the peak relative error in dB over all components
--avgpeak : measure the peak relative error averaged over all components, in dB
--peakx : find the x position of the largest pixel error
--peaky : find the y position of the largest pixel error
--min : find the minimum value of original - distorted
--max : find the maximum value of original - distorted
--toe : find the minimum of the original
--head : find the maximum of the original
--drift : find the mean error (drift) between original and distorted
--mae : find the mean absolute error
--pae : find the peak absolute error
--stripe : measure a striping indicator that detects horizontal or vertical artifacts
--width : print the width of the images
--height : print the height of the images
--depth : print the number of components of the images
--precision comp : print the bit precision of the given component
--signed comp : print the signedness of the given component (-1 = signed, +1 = unsigned)
--float comp : print whether the indicated component is IEEE floating point (1 = yes, 0 = no)
--diff target : save the difference image (-i is an alternative form of this option)
--rawdiff target : similar to --diff, except that it doesn't scale the difference to maximum range
--sdiff scale trgt : generate a differential signal with an explicitly given scale
--suppress thres : suppress all pixels in the target that are less than a threshold away from the source
--mask roi : mask the source image by the mask image before applying the comparison
--notmask roi : mask the source image by the inverse of the mask
--convert target : save the original image unaltered, but possibly in a new format
--merge target : merge the two images together, add second as components of first
--fft target : save the fft of the difference image
--wfft target : save the windowed fft of the difference image
--filt x y r dst : run a radial filter around frequency x,y with radius r, saves the filtered image as dst
--nfilt x y r dst : similar to --filt, but the output is normalized to the full range
--comb x y r dst : apply a comb filter in direction x y and radius r
--ncomb x y r dst : similar to --comb, but the output is normalized to the full range
--hist target : generate a histogram plot. If "target" is -, write to stdout
--thres threshold : compute the ratio of pixels whose difference is larger than threshold
--colorhist size : generate reduced histogram separately for each component using the given bucket size
--maxfreqr : locate the absolute value of the most exposed frequency in the error image
--maxfreqx : locate the horizontal component of the most exposed frequency in the error image
--maxfreqy : locate the vertical component of the most exposed frequency in the error image
--maxfreqv : compute the domination ratio of the most exposed frequency in the error image
--patternidx : scan the FFT for suspicious patterns and output the likeliness of errors
--toflt dst : save a floating point version of the source image
--tohfl dst : save a half-float version of the source image
--touns bpp dst : save an unsigned integer version with bpp bits per pixel of the source image
--tosgn bpp dst : save a signed integer version with bpp bits per pixel of the source image
--asuns bpp : convert the two input images to unsigned bpp before further processing
--assgn bpp : convert the two input images to signed bpp before further processing
--gamma bpp gamma dst : perform a gamma correction on a floating point image to create
integer output
--invgamma gamma dst : perform an inverse gamma correction creating a floating point image
from integer
--toegamma slope gamma dst : perform a gamma transformation on the same data type with a given slope in
its toe region
--invtoegamma slope gamma dst : perform an inverse gamma transformation on the same data with parameters
as above
--halflog dst : represent a floating point image in IEEE half float format saved as 16 bit integers
--halfexp dst : read a 16-bit integer image using IEEE half float and save as floating point
--togamma bpp gamma: convert both images to gamma before applying the measurement (apply as a filter)
--fromgamma gamma : convert both images from a gamma before applying the measurement
--totoegamma slope gamma : perform a forwards gamma transformation (linear to gamma) with given
slope in the toe region
--fromtoegamma slope gamma : perform an inverse gamma transformation (gamma to linear) with given
slope in the toe region
--tohalflog : convert to 16-bit integer before comparing (apply as filter)
--fromhalflog : convert from 16-bit integer to float before comparing (apply as filter)
--tolog clamp : convert to logarithmic domain with clamp value before comparing
--topercept : convert from absolute luminance to a perceptually uniform space
--topq bits : convert floating point to SMPTE 2084 quantized to the given bits
--frompq : convert SMPTE 2084 quantized data to luminances
--tohlg bits : convert floating point to Hybrid Log Gamma with HEVC conventions
--fromhlg : convert Hybrid Log Gamma to linear luminance, 1000 nits peak
--invert : invert the source image before comparing
--flipx : flip the source horizontally before comparing
--flipy : flip the source vertically before comparing
--flipxextend : create a twice as wide image by flipping it over the right edge
--flipyextend : create a twice as high image by flipping it over the bottom edge
--shift dx dy : shift the image by the given amount right/bottom (or left/up if < 0)
--pad bpp dst : pad (right-aligned) a component into a larger bit-depth
--asprec bpp : set the bit-depth to bpp, padding input and output into the target bitdepth
--sub x y : subsample all components by the subsampling factors in x and y direction
--csub x y : subsample all but component 0 by the subsampling factors in x and y direction
--up x y : upsample all components by the subsampling factors in x and y direction
--up auto : upsample all components such that we get consistent 1x1 (444) sampling
--cup x y : upsample all but component 0 by the subsampling factors in x and y direction
--coup x y : co-sited upsampling of all components in x and y direction
--coup auto : co-sited upsampling, with automatic choice of upsampling factors
--cocup x y : co-sited upsampling of the chroma components in x and y direction
--boxup x y : upsample with a simple box filter
--boxcup x y : upsample the chroma components with a simple box filter
--clamp min max : clamp the image(s) to the specified range of sample values
--only component : acts as a filter and restricts all following operations to the given component
--upto component : restricts all following operations to components 0..component-1
--rgb : restricts the activity to at most the first three components
--crop x1 y1 x2 y2 : crop a rectangular image region (x1,y1)-(x2,y2). Edges are inclusive.
--cropd x1 y1 x2 y2: crop a rectangular image region (x1,y1)-(x2,y2) from the distorted image only.
--restore : un-do the restrictions of --crop and --only or --rgb
--toycbcr : convert images to 601 YCbCr before comparing
--toycbcrbl : convert images to 601 YCbCr before comparing, and include a black level
--tosignedycbcr : convert images to 601 YCbCr with signed chroma components
--fromycbcr : convert images from 601 YCbCr to RGB before comparing
--fromycbcrbl : convert images from 601 YCbCr to RGB before comparing, and remove the black level
--toycbcr709 : convert images to 709 YCbCr before comparing
--toycbcr709bl : convert images to 709 YCbCr before comparing, and include a black level
--fromycbcr709 : convert images from 709 YCbCr to RGB before comparing
--fromycbcr709bl : convert images from 709 YCbCr to RGB before comparing, and remove the black level
--toycbcr2020 : convert images to 2020 YCbCr before comparing
--toycbcr2020bl : convert images to 2020 YCbCr before comparing, and include a black level
--fromycbcr2020 : convert images from 2020 YCbCr to RGB before comparing
--fromycbcr2020bl : convert images from 2020 YCbCr to RGB before comparing, and remove the black level
--fromgrey : convert a grey-scale image to color by duplicating components
--torct : convert an image with the RCT from JPEG 2000
--tosignedrct : convert an image with the RCT from JPEG 2000, leaving chroma signed
--torctd : convert a 4 component RGGB image to YCbCr+DeltaG with the RCT
--torctd1 agmnt : convert a Bayer pattern with given arrangement with the above RCTD
--torctx agmnt : convert a Bayer pattern with given arrangement to RCT with
an improved longer averaging filter for green
--toydgcgcox agmnt : convert a Bayer pattern with given arrangement with an extended version
of the YDgCgCo transformation
--to422rct : convert a 422 image with green in component 0 to YCbCr
--to422signedrct : convert a 422 image with green in component 0, leaving chroma signed
--fromrct : convert an image back to RGB with the inverse RCT
--fromrctd : convert a YCbCr+DeltaG to a four-component RGGB with the inverse RCTD
--fromrctd1 agmnt : convert a YCbCr+DeltaG to a 1-component RGGB Bayer pattern
--fromrctx agmnt : convert an RCTX-converted image to a Bayer pattern image with the
given sample arrangement
--fromydgcgcox agmt: convert a YDgCgCoX-converted image to a Bayer pattern image with the
given sample arrangement
--from422rct : convert from YCbCr to RGB with green in channel 0
--todeltag : convert RGGB to RGB+DeltaG
--toycgco : convert an image with the YCgCo transformation
--tosignedycgco : convert an image to YCgCo leaving chroma signed
--tocycbcod : convert an RGGB image to YCgCo+DeltaG
--fromycgco : convert an image back to RGB with the inverse YCgCo transformation
--fromycgcod : convert a YCgCo+DeltaG to RGGB with the inverse RCT
--fromdeltag : convert RGB+DeltaG to RGGB
--toxyz : convert images from RGB to XYZ before comparing
--fromxyz : convert images from XYZ to RGB before comparing
--tolms : convert images from RGB to LMS before comparing
--fromlms : convert images from LMS to RGB before comparing
--xyztolms : convert images from XYZ to LMS before comparing
--lmstoxyz : convert images from LMS to XYZ before comparing
--scale a,b,c... : scale components by the indicated factors before comparing
--offset a,b,c... : offset component values by the indicated values before comparing
--tobayer : convert a four-component image to a Bayer-pattern image
--frombayer : convert a Bayer patterned grey-scale image to four components
--tobayersh agmnt: convert a four-component image in component order RGGB
to a Bayer patterned image of the given arrangement
--frombayersh agmnt: convert a Bayer patterned image in the given sample order
into a four-component image in RGGB order
--422tobayer agmnt: convert a 422 three-component image to a Bayer pattern image
where the argument describes the sample organization. It can be either
grbg,rggb,gbrg or bggr, and green becomes the luma component
--bayerto422 agmnt: convert a Bayer pattern image to a 422 three component image with luma
as green and Cb as red and Cr as blue component. The argument describes
the bayer pattern arrangement as above.
--debayer agmnt: de-Bayer a Bayer pattern image with bi-linear interpolation, agmnt describes
the sample organization and can be grbg,rggb,gbrg or bggr
--debayerahd argmt: de-Bayer with the Adaptive Homogeneity-Directed Demosaic Algorithm
--fill r,g,b,... : fill the source image with the given color
--paste x y : paste the distorted image at the given position into the source
--raw : encode output in raw if applicable
--ascii : encode output in ascii if applicable
--interleaved : encode output in interleaved samples if applicable
--separate : encode output in separate planes if applicable
--isyuv : override automatic YUV detection, sources are really in YUV
--isrgb : override automatic YUV detection, sources are really in RGB
--isfullrange : override automatic range detection, source has no head/toe region
--isreducedrange : override automatic range detection, source has head/toe region
--littleendian : use little endian output if applicable
--bigendian : use big endian output if applicable
--toabsradiance : multiply floating point samples by recorded radiance scale to convert to absolute radiance
--brief : use a brief (only numeric) output format
>,>=,==,!=,<=,< t : last result must be larger, larger or equal, equal, not equal,
smaller or equal or smaller than given threshold t.
Attention: Quoting required when used from the shell.
If the source image is '-/<width>x<height>x<depth>', it is replaced by a blank image of the
given dimensions. This image can be filled with any other color by --fill, see above.
If the distorted image file name equals '-', then the image is replaced by a blank image
--help : print this page
--rawhelp : print help on raw image formatting. First time users: PLEASE READ THIS.
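
To make the basic error measures concrete, here is a minimal Python sketch of the textbook definitions behind --mse, --rmse and --psnr for a single component. It is not the difftest_ng implementation: the function names are hypothetical, and the choice of peak value (2^bits - 1) and the handling of identical images are assumptions for illustration. How difftest_ng normalizes and weights across components is not shown here.

```python
import math


def mse(original, distorted):
    """Mean square error between two equally long sample sequences."""
    if len(original) != len(distorted) or not original:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((a - b) ** 2 for a, b in zip(original, distorted))
    return total / len(original)


def rmse(original, distorted):
    """Root mean square error."""
    return math.sqrt(mse(original, distorted))


def psnr(original, distorted, bits=8):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak signal is 2^bits - 1 (255 for 8-bit data).
    """
    peak = (1 << bits) - 1
    err = mse(original, distorted)
    if err == 0:
        return math.inf  # identical inputs: no noise at all
    return 10.0 * math.log10(peak * peak / err)
```

A result like this is what a threshold comparison such as '>' t would then be applied to.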
-----------------------------------------------------------------------------------------------
Raw formats are unframed and hence the formatting of the file must be specified
on the command line, here as part of the FILE NAME, not by separate options.
Raw files are specified by 'filename.raw@format', with an '@' (at) sign separating
the format specification from the file name. Note that the .raw extension is necessary.
The format specification itself consists of two parts, the image dimensions and
the layout of the data:
<width>x<height>x<depth>:<datalayout>
where <width> is the width, <height> the height and <depth> the number of
components in the image, without the angle brackets (pure numerical values).
Numbers are separated by 'x' (lower-case x).
This part MAY be omitted for saving since image dimensions are known.
The colon ':', however, must be present.
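
The file name syntax above can be assembled mechanically. The following Python sketch builds such a name from its parts; the helper raw_file_name is hypothetical and only illustrates the 'filename.raw@<width>x<height>x<depth>:<datalayout>' pattern, including the case where the dimensions are omitted for saving.

```python
def raw_file_name(path, layout, width=None, height=None, depth=None):
    """Build 'path@WxHxD:layout', or 'path@:layout' without dimensions."""
    dims = (width, height, depth)
    if all(d is None for d in dims):
        # Dimensions may be omitted when saving; '@' and ':' must stay.
        prefix = ""
    elif any(d is None for d in dims):
        raise ValueError("give all of width, height and depth, or none")
    else:
        prefix = "%dx%dx%d" % dims
    return "%s@%s:%s" % (path, prefix, layout)
```

With the layout "{8=0}:{8=1}:{8=2}", calling raw_file_name("image.raw", layout, 640, 480, 3) gives "image.raw@640x480x3:{8=0}:{8=1}:{8=2}".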
Image data can be either represented INTERLEAVED, that is, all components of a
single pixel are adjacent to each other, or SEPARATE, that is, each component is
described in a separate bitplane, and the bitplanes are stored adjacent to each
other. Additional padding bits might be present in the interleaved representation.
A single pixel is described by one or several fields, where each field encodes
the data of a single component or may simply be present for padding.
Fields can be either signed or unsigned, have a bit-width and an endianness.
In the interleaved presentation, fields can be bit-packed together, i.e. may share
bits in a byte, word or longword. If fields are bit-packed, the entire number of
bits must be either 8,16 or 32, and shares the endianness of all its components,
i.e. all components must indicate the same endianness.
In the separate presentation, bits of a component are packed near each other
without any padding. However, some components might be subsampled, i.e. may
contain less samples than others. This option does not exist for interleaved
data.
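
The difference between the two representations can be sketched in a few lines of Python. This is only an illustration on lists of sample values without subsampling, padding or bit-packing; the function names are hypothetical.

```python
def interleave(planes):
    """Turn per-component planes into one interleaved sample sequence.

    planes is a list of equally long sample lists, one per component.
    """
    return [sample for pixel in zip(*planes) for sample in pixel]


def separate(samples, depth):
    """Split an interleaved sample sequence into 'depth' planes."""
    return [samples[c::depth] for c in range(depth)]


# Two RGB pixels: (1, 3, 5) and (2, 4, 6).
planes = [[1, 2], [3, 4], [5, 6]]      # SEPARATE: R plane, G plane, B plane
pixels = interleave(planes)            # INTERLEAVED: R G B R G B
```

In the interleaved form all components of one pixel sit next to each other; in the separate form all samples of one component do.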
The format specification looks like this for interleaved data:
<packing>{<sign-flag><bits><endian>=<target>}:{<sign-flag><bits><endian>=<target>}:...
where <packing> is either + or -, indicating the packing order within the field
with +, which is the default, packing is from MSB to LSB, with '-' bits
are packed from LSB to MSB
where the curly brackets indicate the interleaved format
where <sign-flag> is an optional '+' or '-' sign indicating whether the component
is signed (then '-') or unsigned (then '+'). If omitted, the component is
unsigned.
where <bits> is a mandatory number of bits the component takes, e.g. 8
where <endian> is an optional endian indicator. It is '+' for big-endian and '-'
for little-endian data. If omitted, big-endian is assumed.
where <target> indicates to which component the data belongs, e.g. 0 for red
the target is separated by an equals sign '=' from the endianness.
For padding data, equal-sign and target are omitted.
where the colon ':' indicates that fields have separate endianness. This only
works for fields of widths 8,16,32 or 64 bits.
If the colon is replaced by a comma, the fields adjacent to the comma are bit-
packed into a single data unit. In total, up to 32 bits can be packed together,
and the total number of bits packed must be either 8, 16 or 32. Then, the endian-
ness of all fields packed together applies to the packed result, and must be
identical. By default, bit-packing reads from the MSB to the LSB. With the
optional minus sign in front, packing is from LSB to MSB.
Components may appear multiple times in the same format specification, which implies
subsampling. The subsampling factors are derived from the sample count per component.
The sample pattern is filled at each line end, potentially creating dummy samples that
will be skipped over when reading and which are written as zero on writing.
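
As a reading aid for the interleaved syntax, the Python sketch below parses a layout string into groups of fields and checks the size rules stated above. It is an illustrative parser written from this description, not code from difftest_ng; the function name and the dictionary keys are hypothetical, and it does not derive subsampling factors.

```python
import re

# One field: {<sign-flag><bits><endian>=<target>}; sign, endian and the
# '=<target>' part are optional (no target means padding).
FIELD = re.compile(r"^\{([+-]?)(\d+)([+-]?)(?:=(\d+))?\}$")


def parse_interleaved(layout):
    """Parse an interleaved layout into a list of groups.

    Groups are separated by ':'; fields inside a group are separated
    by ',' and are bit-packed together.
    """
    groups = []
    for group_text in layout.split(":"):
        lsb_first = group_text.startswith("-")   # optional packing flag
        fields = []
        for field_text in group_text.lstrip("+-").split(","):
            m = FIELD.match(field_text)
            if m is None:
                raise ValueError("cannot parse field %r" % field_text)
            sign, bits, endian, target = m.groups()
            fields.append({
                "signed": sign == "-",
                "bits": int(bits),
                "little_endian": endian == "-",
                "target": None if target is None else int(target),
            })
        total = sum(f["bits"] for f in fields)
        if len(fields) > 1 and total not in (8, 16, 32):
            raise ValueError("packed fields must total 8, 16 or 32 bits")
        if len(fields) == 1 and total not in (8, 16, 32, 64):
            raise ValueError("unpacked fields must be 8, 16, 32 or 64 bits")
        groups.append({"lsb_first": lsb_first, "fields": fields})
    return groups
```

For instance, "{8=2}:{8=1}:{8=0}:{8}" parses into four single-field groups, the last of which has no target and is therefore padding.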
The format for the separate representation uses square brackets instead:
<packing>[<sign-flag><bits><endian>=<target>]/<subx>x<suby>:...
and the syntax is as above, except that subsampling factors can be added to a field
description. They are separated by a slash '/' from the field description, followed
by the horizontal and vertical subsampling factors, separated by an 'x'.
The separate format also allows packing several components together into one field.
As above, packed fields are separated by a comma instead of a colon.
In such a case, the subsampling of the packed channels must be consistent and
identical within the same plane.
A padding channel is indicated by a missing target specification.
EXAMPLES:
A 640x480 RGB image with 8 bits per component encoded as RRRRRRRRGGGGGGGGBBBBBBBB
is denoted as this:
image.raw@640x480x3:{8=0}:{8=1}:{8=2}
Image dimensions can be omitted when saving, it then may be simplified to:
image.raw@:{8=0}:{8=1}:{8=2}
Note both '@' and ':' must be present.
Typical image formats:
{8=2}:{8=1}:{8=0}: 24 bits, pixel layout:
BBBBBBBBGGGGGGGGRRRRRRRR
{8=2}:{8=1}:{8=0}:{8} 32 bits, plus a pad byte:
BBBBBBBBGGGGGGGGRRRRRRRR00000000
{8=2}:{8=1}:{8=0}:{8=3} 32 bits plus alpha channel:
BBBBBBBBGGGGGGGGRRRRRRRRAAAAAAAA
[1=0] bit-packed 1bpp black & white image
[8=0]:[8=1]:[8=2] YUV or RGB in separate encoding, three planes, each 8bpp
[8=0]:[8=1]/2x2:[8=2]/2x2 YUV 420 in separate planes
[8=0]:[8=1]/2x1:[8=2]/2x1 YUV 422 in separate planes
[4],[12=0]:[4]/2x1,[12=1]/2x1:[4]/2x1,[12=2]/2x1
YUV 422 in separate planes, 12 bits per component
where each component is packed into 16 bits with
padding bits upfront, represented in big-endian
[4-],[12-=0]:[4-]/2x1,[12-=1]/2x1:[4-]/2x1,[12-=2]/2x1
YUV 422 12 bits/component as above, but little-endian.
{2-},{10-=2},{10-=1},{10-=0} 32 bits, pixel layout is ten bits per component with
two padding bits in front, packed into 32 bits which
is written in little-endian format. Represented as
little endian, this format reads
00BBBBBBBBBBGGGGGGGGGGRRRRRRRRRR
but in the file, bytes are shuffled around:
RRRRRRRR GGGGGGRR BBBBGGGG 00BBBBBB
{1-},{5-=2},{5-=1},{5-=0} 16 bits, pixel layout is five bits per component with
a single pad-bit upfront, packed into a 16-bit word
which is written in little-endian format. On a little
endian machine, this reads as
0BBBBBGGGGGRRRRR
but in the file, bytes are ordered reversely:
GGGRRRRR 0BBBBBGG
-{10-=1},{10-=0},{10-=2},{2-}:-{10-=0},{10-=1},{10-=0},{2-}:
-{10-=2},{10-=0},{10-=1},{2-}:-{10-=0},{10-=2},{10-=0},{2-}
This (single) line creates a raw format that complies with the
V210 pixel format. Each component is 10 bits wide,
three samples are left-aligned into one 32 bit word
causing two padding bits per 32 bit word. Each 32 bit
word is in little endian, and filled from LSB to MSB.
The sampling order is UYV - YUY - VYU - YVY.
Subsampling factors are derived from the sample counts
and padding is applied at the end of each line to one
complete cycle.
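
The byte shuffling described for the two little-endian packed examples, {2-},{10-=2},{10-=1},{10-=0} and {1-},{5-=2},{5-=1},{5-=0}, can be reproduced with a short Python sketch. The function names are hypothetical; the code assumes, in line with the bit diagrams above, that with the default MSB-to-LSB packing the first listed field occupies the most significant bits of the word.

```python
import struct


def pack_10bit_le(r, g, b):
    """Pack three 10-bit samples as 00BBBBBBBBBBGGGGGGGGGGRRRRRRRRRR.

    The 32-bit word is then written in little-endian byte order.
    """
    for sample in (r, g, b):
        if not 0 <= sample < (1 << 10):
            raise ValueError("samples must fit into 10 bits")
    word = (b << 20) | (g << 10) | r
    return struct.pack("<I", word)


def pack_5bit_le(r, g, b):
    """Pack three 5-bit samples as 0BBBBBGGGGGRRRRR, little-endian."""
    for sample in (r, g, b):
        if not 0 <= sample < (1 << 5):
            raise ValueError("samples must fit into 5 bits")
    word = (b << 10) | (g << 5) | r
    return struct.pack("<H", word)
```

Setting only the red component to its maximum fills the first byte in the file and the low bits of the second, which is the "RRRRRRRR GGGGGGRR ..." pattern shown above.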
-----------------------------------------------------------------------------------------------