v0.4.3

@lettergram lettergram released this 22 Apr 19:15
· 424 commits to main since this release
2238d32

Runtime Changes

Migrating from v0.4.2 to v0.4.3 should reduce profiling time by 30-90%, depending largely on system resources and data size.

Notes

  • Removed requirement for tensorflow-addons
  • Library now works with tensorflow nightly (Python 3.9)
  • Added example on generating a new data labeler

Profiler

  • Added multiprocessing for data preprocessing
  • Improved histogram accuracy
  • Reduced histogram generation runtime
  • Added option to set the bin count for histograms
  • Expanded precision and switched to precision estimation (as opposed to exact calculations)
  • Pool size is now limited based on CPU and memory constraints
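
As an illustration of limiting pool size by CPU and memory, here is a minimal sketch. It is not the library's implementation: the helper name suggest_pool_size, its parameters, and the assumption that each worker may hold one full copy of the data are all hypothetical.

```python
import os


def suggest_pool_size(data_bytes, available_bytes, cpu_count=None, max_pool_size=None):
    """Return a worker count bounded by CPU count and memory headroom.

    Hypothetical helper: assumes each worker may hold one copy of the data.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Leave one core for the parent process, but always allow one worker.
    cpu_limit = max(1, cpu_count - 1)
    # Number of data copies that fit in the available memory (at least one).
    memory_limit = max(1, available_bytes // max(1, data_bytes))
    size = min(cpu_limit, memory_limit)
    # Optional hard cap supplied by the caller.
    if max_pool_size is not None:
        size = min(size, max_pool_size)
    return size
```

The returned value could then be passed as the processes argument of multiprocessing.Pool. How the available memory is measured is left to the caller, since the standard library offers no portable way to query it.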

Data

  • Improved JSON detection method
    • Default option pulls metadata and data separately (data.meta and data.data)
    • data.meta is the part of the JSON that contains no records
    • data.data is the part of the JSON that contains records
    • Added option to select which keys represent records
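
The metadata/record split described above can be pictured with a small standalone sketch. It does not use the library's reader: the function split_json, the record_keys parameter name, and the rule that "records" means a non-empty list of objects are assumptions made for illustration.

```python
import json


def split_json(payload, record_keys=None):
    """Split a parsed JSON object into (meta, data).

    Assumption: a value counts as records when it is a non-empty list of
    objects; `record_keys`, if given, names the record-bearing keys explicitly.
    """
    meta, data = {}, []
    for key, value in payload.items():
        if record_keys is not None:
            is_records = key in record_keys
        else:
            is_records = (
                isinstance(value, list)
                and len(value) > 0
                and all(isinstance(item, dict) for item in value)
            )
        if is_records:
            # A selected key may hold one record or a list of records.
            if isinstance(value, list):
                data.extend(value)
            else:
                data.append(value)
        else:
            meta[key] = value
    return meta, data


raw = '{"source": "sensor-a", "count": 2, "rows": [{"id": 1}, {"id": 2}]}'
meta, data = split_json(json.loads(raw))
```

Here meta holds the "source" and "count" entries and data holds the two row objects. Passing record_keys=["rows"] selects the same key explicitly instead of relying on the heuristic.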

Report

  • Precision report now contains additional details:

    "precision": {
        "min": int,
        "max": int,
        "mean": float,
        "var": float,
        "std": float,
        "sample_size": int,
        "margin_of_error": float,
        "confidence_level": float
    },
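
To show how fields of this shape could come from estimation rather than an exact pass, here is a rough sketch under stated assumptions. Precision is taken to mean the number of significant digits in a plain decimal string, the margin of error uses the normal approximation z * std / sqrt(n), and sample_ratio, seed, and both function names are hypothetical. These notes do not specify the formula the library actually uses.

```python
import math
import random
import statistics


def count_significant_digits(text):
    """Count significant digits in a plain decimal string (no exponent).

    Assumption: trailing zeros are treated as significant.
    """
    digits = text.strip().lstrip("+-").replace(".", "").lstrip("0")
    return len(digits)


def estimate_precision(values, sample_ratio=0.5, confidence_level=0.999, seed=0):
    """Estimate precision statistics from a random sample of the values."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate variance")
    rng = random.Random(seed)
    # Sample at least two values, and never more than are available.
    sample_size = max(2, min(len(values), math.ceil(len(values) * sample_ratio)))
    sample = rng.sample(list(values), sample_size)
    digits = [count_significant_digits(v) for v in sample]
    var = float(statistics.variance(digits))
    std = math.sqrt(var)
    # Two-sided z-score for the requested confidence level.
    z = statistics.NormalDist().inv_cdf((1 + confidence_level) / 2)
    return {
        "min": min(digits),
        "max": max(digits),
        "mean": statistics.fmean(digits),
        "var": var,
        "std": std,
        "sample_size": sample_size,
        "margin_of_error": z * std / math.sqrt(sample_size),
        "confidence_level": confidence_level,
    }
```

A smaller sample_ratio trades accuracy for speed: the sample shrinks and the reported margin_of_error tends to grow.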

Bug fixes

  • Fixed error in merging options
  • Fixed issue related to merging DateTimeColumns
  • Fixed multiprocessing on OSX
  • Fixed row calculations if min_true_samples is greater than zero