-
For instance, let's take the ADNI and AIBL datasets. As of today, we are using a bunch of plain [...]
Take a typical record of the collection data for any LONI dataset, serialized as: [...]
A lot of information is lost by manipulating data as raw strings. For instance, most ages for PET scans in AIBL are 0, so these should be converted to [...]
All these rules will be difficult to capture in a common tabular file.
-
On the topic of memory efficiency (which is an issue for large datasets like ADNI), a well-typed dataframe can significantly reduce the load. As a quick benchmark, I downloaded an AIBL collection for 4 subjects (34 records in total) and computed the memory usage of the untyped (4.5 KB) and typed (3.3 KB) versions. That is already a reduction of more than a quarter for a very limited sample size.
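For anyone who wants to try this kind of comparison, here is a minimal sketch using pandas. It runs on synthetic records with LONI-like column names, not on the AIBL collection above, so the numbers it prints will differ from the ones I quoted. The column values, the ISO date format and the row count are all assumptions made for illustration.

```python
import pandas as pd

# Synthetic records (assumption): LONI-like columns, every cell a plain string.
n = 1000
untyped = pd.DataFrame(
    {
        "Group": ["Control", "Patient"] * (n // 2),
        "Sex": ["F", "M"] * (n // 2),
        "Age": [str(60 + i % 30) for i in range(n)],
        "Modality": ["MRI", "PET"] * (n // 2),
        "Acq Date": [f"2010-01-{1 + i % 28:02d}" for i in range(n)],
    },
    dtype="object",
)

# Typed version: categories for repeated labels, small integers for ages,
# datetime64 for dates (the date format here is an assumption).
typed = untyped.assign(
    **{
        "Group": untyped["Group"].astype("category"),
        "Sex": untyped["Sex"].astype("category"),
        "Age": pd.to_numeric(untyped["Age"], downcast="integer"),
        "Modality": untyped["Modality"].astype("category"),
        "Acq Date": pd.to_datetime(untyped["Acq Date"], format="%Y-%m-%d"),
    }
)

# deep=True counts the Python string objects, not just the pointers to them.
untyped_bytes = int(untyped.memory_usage(deep=True).sum())
typed_bytes = int(typed.memory_usage(deep=True).sum())
print(f"untyped: {untyped_bytes} B, typed: {typed_bytes} B")
print(f"reduction: {1 - typed_bytes / untyped_bytes:.0%}")
```

Most of the saving comes from the `category` columns, since each repeated label is stored once and the rows only hold small integer codes.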
-
For instance, here is an example of a typed description (in YAML form, but it could be JSON or TOML) that could be parsed:

attributes:
  - name: Image Data ID
    index: true
  - name: Subject
    type: string
    pattern: '\d{3}_S_\d{4}'
  - name: Group
    type: category
    values:
      - Control
      - Patient
  - name: Sex
    type: category
    values:
      - F
      - M
  - name: Age
    type: integer
    exclude:
      - 0
  - name: Visit
    type: integer
  - name: Modality
    type: category
    values:
      - MRI
      - PET
  - name: Description
    type: string
  - name: Type
    type: category
    values:
      - Original
      - Processed
  - name: Acq Date
    type: date
  - name: Format
    type: category
    values:
      - DCM
      - NIFTI
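To give an idea of how such a description could be consumed, here is a rough stdlib-only sketch. The schema is written by hand as the Python structure a YAML parser would typically return, and only a subset of the attributes is shown. The names `SCHEMA`, `DATE_FORMAT`, `convert` and `convert_record` are hypothetical. Two behaviours are assumptions on my side: values listed under `exclude` are turned into `None` (missing), and dates are read as month/day/year.

```python
import re
from datetime import datetime

# Hand-written equivalent of a subset of the description (assumption).
SCHEMA = [
    {"name": "Subject", "type": "string", "pattern": r"\d{3}_S_\d{4}"},
    {"name": "Group", "type": "category", "values": ["Control", "Patient"]},
    {"name": "Age", "type": "integer", "exclude": [0]},
    {"name": "Acq Date", "type": "date"},
]

DATE_FORMAT = "%m/%d/%Y"  # assumption: the real export format may differ


def convert(raw, attribute):
    """Convert one raw string according to its attribute description."""
    kind = attribute.get("type", "string")
    if kind == "integer":
        value = int(raw)
    elif kind == "date":
        value = datetime.strptime(raw, DATE_FORMAT).date()
    elif kind == "category":
        if raw not in attribute["values"]:
            raise ValueError(f"{attribute['name']}: unknown category {raw!r}")
        value = raw
    else:
        pattern = attribute.get("pattern")
        if pattern and not re.fullmatch(pattern, raw):
            raise ValueError(f"{attribute['name']}: {raw!r} does not match {pattern}")
        value = raw
    # Assumption: excluded values are treated as missing rather than as errors.
    if value in attribute.get("exclude", []):
        return None
    return value


def convert_record(record, schema=SCHEMA):
    """Apply the schema to a mapping of column name to raw string."""
    return {a["name"]: convert(record[a["name"]], a) for a in schema}


record = {"Subject": "002_S_0295", "Group": "Control", "Age": "0", "Acq Date": "4/18/2006"}
typed = convert_record(record)
print(typed)
```

Whether an excluded value should become missing or raise an error is a design choice that would need to be settled per attribute.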
-
It should be noted that imported typings don't work in a function called inside a nipype node.
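A possible explanation, stated as an assumption rather than confirmed nipype behaviour: a function wrapped in a node is shipped as source text and rebuilt in a fresh namespace, so names imported at module level (including those from `typing` used in annotations) are not defined there. The stdlib-only sketch below mimics that with `exec`. `rebuild` and `scale` are hypothetical names and no nipype API is used. On Python versions that evaluate annotations when the function is defined (before 3.14), rebuilding the first variant raises `NameError`, while quoting the annotations avoids the lookup.

```python
import textwrap


def rebuild(source, name):
    """Execute function source in an empty namespace and return the function.

    Stand-in (assumption) for a function being recreated from its source text.
    """
    namespace = {}
    exec(textwrap.dedent(source), namespace)
    return namespace[name]


# Annotation refers to a name that would normally come from a module-level import.
ANNOTATED = '''
def scale(value: Optional[float] = None) -> float:
    return 2.0 * (value or 0.0)
'''

# Same function with quoted annotations: nothing is looked up at definition time.
QUOTED = '''
def scale(value: "Optional[float]" = None) -> "float":
    return 2.0 * (value or 0.0)
'''

try:
    rebuild(ANNOTATED, "scale")
    eager_failure = False
except NameError:
    eager_failure = True
print("unquoted annotations failed:", eager_failure)

scale = rebuild(QUOTED, "scale")
print(scale(3.0))
```

Importing the required names inside the function body is another option when the types are needed at runtime.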
-
I have a bunch of tasks revolving around quality and integrity checking of the converters' input and output data. I have started experimenting with an encoding for some form of typed, declarative syntax for specifying important attributes such as input/output data types, invalid values, bound checks, and valid categories.
I am starting this topic for scratch notes and to gather feedback as this experiment progresses.