This repository has been archived by the owner on Dec 15, 2022. It is now read-only.
Batching & Flink
New & Improved:
- Any Kettle input step can now load data from anywhere (downside: the data needs to fit in memory)
- Any Output step, now with batching (using "row set size" of transformation)
- Beam Job Config Dialog cleanup
- Beam Job Config: added Flink options
- Local Flink runner support
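As a rough illustration of what output batching by "row set size" means, here is a minimal Python sketch. It is not the plugin's actual implementation (which lives in the Java code base); the function name `batch_rows` and the parameter `row_set_size` are made up for this example. Rows are buffered until the configured size is reached, then handed to the output in one go, with a final partial batch flushed at the end.

```python
from typing import Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def batch_rows(rows: Iterable[Row], row_set_size: int) -> Iterator[List[Row]]:
    """Group incoming rows into lists of at most row_set_size rows.

    Hypothetical helper: it only illustrates the general idea of
    buffering rows before passing them to an output step.
    """
    if row_set_size < 1:
        raise ValueError("row_set_size must be at least 1")

    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == row_set_size:
            # Buffer is full: hand the whole batch to the caller.
            yield batch
            batch = []

    # Flush whatever is left over at the end of the stream.
    if batch:
        yield batch


if __name__ == "__main__":
    for rows_to_write in batch_rows(range(7), row_set_size=3):
        print(rows_to_write)
```

A larger row set size means fewer, bigger writes to the output; a smaller one keeps less data buffered at any time.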
The libraries for this release add up to hundreds of MB, which is not practical to host on GitHub, so you need to download the plugin elsewhere. Please download this archive and unzip it in the <PDI>/plugins/ folder.
Then patch your Kettle CE version 8.2 by unzipping pdi-engine-configuration-8.2.0.0-342.zip into your Kettle distribution root folder, overwriting the existing files. Doing this adds "Beam" as a Run Configuration option alongside "Pentaho" and "Spark". The source code for this new "Beam Run Configuration" can be found in the engine-configuration plugin on branch 8.2.0.1 of pentaho-kettle.
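If you prefer to script the two unzip steps, the following is a minimal Python sketch using only the standard library. The function name `install_beam_plugin`, the PDI root path and the plugin archive file name are placeholders for this example, so substitute the locations of your own downloads. Only `pdi-engine-configuration-8.2.0.0-342.zip` is a file name taken from these notes.

```python
import zipfile
from pathlib import Path


def install_beam_plugin(pdi_root, plugin_zip, engine_config_zip):
    """Unzip the plugin into <PDI>/plugins/ and the patch over the PDI root.

    Hypothetical helper: all three arguments are paths you supply.
    """
    pdi_root = Path(pdi_root)
    plugins_dir = pdi_root / "plugins"
    plugins_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: the plugin archive goes into the plugins folder.
    with zipfile.ZipFile(plugin_zip) as archive:
        archive.extractall(plugins_dir)

    # Step 2: the engine-configuration patch is extracted over the
    # distribution root; files with the same name are overwritten.
    with zipfile.ZipFile(engine_config_zip) as archive:
        archive.extractall(pdi_root)


if __name__ == "__main__":
    install_beam_plugin(
        pdi_root="/opt/data-integration",      # placeholder PDI location
        plugin_zip="kettle-beam-plugin.zip",   # placeholder archive name
        engine_config_zip="pdi-engine-configuration-8.2.0.0-342.zip",
    )
```

Back up your Kettle folder first if you want to be able to undo the patch, since the second step replaces existing files.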
For configuration and usage please see the README.md file in this project.