TornadoVM Changelog

This file summarizes the new features and major changes for each TornadoVM version.

TornadoVM 0.6

21/02/2020

16/12/2019

Initial support for Xilinx FPGAs
TornadoVM API classes are now Serializable
Initial support for local memory for reductions
Removed the JVMCI build with the local annotation patch. TornadoVM now requires an unmodified JDK 8 with JVMCI support
Support for multiple reductions within the same task-schedule
Fixed emulation mode on Intel FPGAs
Fixed reductions on Intel Integrated Graphics
TornadoVM driver OpenCL initialization and OpenCL code cache improved
Refactoring of the FPGA execution modes (full JIT and emulation modes improved).

14/10/2019

Profiler supported (See PROFILER)
- Use -Dtornado.profiler=True to enable profiler
- Use -Dtornado.profiler=True -Dtornado.profiler.save=True to dump the profiler logs
Feature extraction added (See PROFILER)
- Use -Dtornado.feature.extraction=True to enable code feature extraction
Mac OS X support (See INSTALL)
Automatic reductions composition (map-reduce) within the same task-schedule
Fixed a memory leak when running on GPUs
Bug fixes and stability improvements
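
The map-reduce composition listed in this release can be pictured with a small plain-Java sketch. It is only an illustration of the pattern (a map stage feeding a two-stage reduction that first builds per-group partial sums), not TornadoVM code: the class MapReduceSketch, its methods, and the groupSize parameter are hypothetical names chosen for this example.

```java
// Conceptual sketch in plain Java. This is NOT the TornadoVM API.
final class MapReduceSketch {

    // Map stage: square every element.
    static double[] map(double[] input) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] * input[i];
        }
        return out;
    }

    // First reduction stage: one partial sum per group of groupSize elements,
    // mimicking what each work-group would compute on an accelerator.
    static double[] partialSums(double[] values, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        int groups = (values.length + groupSize - 1) / groupSize;
        double[] partial = new double[groups];
        for (int i = 0; i < values.length; i++) {
            partial[i / groupSize] += values[i];
        }
        return partial;
    }

    // Second reduction stage: combine the partial sums into one value.
    static double mapReduce(double[] input, int groupSize) {
        double total = 0.0;
        for (double p : partialSums(map(input), groupSize)) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(mapReduce(data, 3));
    }
}
```

Splitting the reduction into partial sums and a final combine step is a common way to express reductions on parallel hardware; whether TornadoVM generates exactly this shape is an assumption of the sketch.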

22/07/2019

New Matrix 2D and Matrix 3D classes with type specializations.
New API call TaskSchedule#batch for batch processing. It lets programmers process more data than the accelerator's maximum capacity by splitting the work into batches of executions.
FPGA full automatic compilation pipeline.
FPGA options simplified:
- -Dtornado.precompiled.binary=<binary> for loading the bitstream.
- -Dtornado.opencl.userelative=True for using relative addresses.
- -Dtornado.opencl.codecache.loadbin=True removed.
Reductions support enhanced and fully automated on GPUs and CPUs.
Initial support for reductions on FPGAs.
Initial API for profiling tasks integrated.
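
The idea behind the batch processing entry in this release can be sketched in plain Java. This is a conceptual illustration only and does not use the TornadoVM API: BatchSketch, runOnDevice, and runInBatches are hypothetical names, and runOnDevice merely stands in for one accelerator execution over a chunk small enough to fit in device memory.

```java
import java.util.Arrays;

// Conceptual sketch in plain Java. This is NOT the TornadoVM API.
final class BatchSketch {

    // Hypothetical stand-in for one accelerator execution over a chunk
    // that fits in device memory: doubles every element of the chunk.
    static int[] runOnDevice(int[] chunk) {
        int[] out = new int[chunk.length];
        for (int i = 0; i < chunk.length; i++) {
            out[i] = chunk[i] * 2;
        }
        return out;
    }

    // Processes the input in consecutive chunks of at most batchSize elements
    // and writes each chunk's result back at the matching offset.
    static int[] runInBatches(int[] input, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int[] output = new int[input.length];
        for (int start = 0; start < input.length; start += batchSize) {
            int end = Math.min(start + batchSize, input.length);
            int[] result = runOnDevice(Arrays.copyOfRange(input, start, end));
            System.arraycopy(result, 0, output, start, result.length);
        }
        return output;
    }

    // Number of executions needed to cover the whole input.
    static int batchCount(int length, int batchSize) {
        return (length + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(runInBatches(data, 2)));
        System.out.println(batchCount(data.length, 2));
    }
}
```

The last batch may be smaller than the others, which is why the end index is clamped to the input length.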

25/02/2019

Rename to TornadoVM
Device selection for better performance (CPU, multi-core, GPU, FPGA) via an API for Dynamic Reconfiguration
- Added methods executeWithProfiler and executeWithProfilerSequential with an input policy.
- Policies: Policy.PERFORMANCE, Policy.END_2_END, and Policy.LATENCY implemented.
Basic heuristic for predicting the highest performing target device with Dynamic Reconfiguration
Initial FPGA integration for Altera FPGAs:
- Full JIT compilation mode
- Ahead-of-time compilation mode
- Emulation/debug mode
FPGA JIT compiler specializations
Added support for Java reductions:
- Compiler specializations for CPU and GPU reductions
Performance and stability fixes
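
As a rough illustration of policy-driven device selection, the plain-Java sketch below picks a device from recorded timings. It does not use the TornadoVM API, and the policy meanings are assumptions made for this example: PERFORMANCE is taken to compare kernel time only, and END_2_END to compare compilation plus kernel time. LATENCY is omitted because its selection rule is not described here. PolicySketch, SketchPolicy, Timing, and pickDevice are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual sketch in plain Java. This is NOT the TornadoVM API.
final class PolicySketch {

    // Hypothetical policies; their meanings are assumptions of this sketch.
    enum SketchPolicy { PERFORMANCE, END_2_END }

    // Measured times for one device, in nanoseconds.
    static final class Timing {
        final long compileNanos;
        final long kernelNanos;

        Timing(long compileNanos, long kernelNanos) {
            this.compileNanos = compileNanos;
            this.kernelNanos = kernelNanos;
        }
    }

    // Returns the device name with the lowest cost under the given policy,
    // or null when no timings are available.
    static String pickDevice(Map<String, Timing> timings, SketchPolicy policy) {
        String best = null;
        long bestCost = Long.MAX_VALUE;
        for (Map.Entry<String, Timing> entry : timings.entrySet()) {
            Timing t = entry.getValue();
            long cost = (policy == SketchPolicy.END_2_END)
                    ? t.compileNanos + t.kernelNanos
                    : t.kernelNanos;
            if (cost < bestCost) {
                bestCost = cost;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Timing> timings = new LinkedHashMap<>();
        timings.put("cpu", new Timing(10, 100));
        timings.put("gpu", new Timing(500, 20));
        System.out.println(pickDevice(timings, SketchPolicy.PERFORMANCE));
        System.out.println(pickDevice(timings, SketchPolicy.END_2_END));
    }
}
```

With these made-up numbers the two policies disagree: the device with the fastest kernel is not the fastest once compilation time is counted, which is the kind of trade-off a selection policy has to resolve.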

07/09/2018

Initial implementation of the Tornado compiler
Initial GPU/CPU code generation for OpenCL
Initial support in the runtime to execute OpenCL programs generated by the Tornado JIT compiler
Initial Tornado-API release (@Parallel annotation and TaskSchedules)
Multi-GPU execution enabled through multiple task-schedules