This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Batching & Flink

@mattcasters mattcasters released this 15 Feb 08:49

New & Improved:

  • Any Kettle input step can now load data from anywhere (downside: the data needs to fit in memory)
  • Any Kettle output step, now with batching (the batch size is the "row set size" of the transformation)
  • Beam Job Config Dialog cleanup
  • Beam Job Config: added Flink options
  • Local Flink runner support
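
To illustrate the batching idea behind the output step item above, here is a minimal sketch in plain Java. It is not the plugin's actual code: the class `RowBatcher`, its methods, and the `sink` callback are hypothetical names made up for this example, and `batchSize` only stands in for the transformation's "row set size". Rows are buffered and handed to the sink once the buffer is full, with a final flush for the leftover rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of the plugin): buffers rows and passes
 * them to a sink in batches of at most batchSize rows.
 */
class RowBatcher<T> {
    private final int batchSize;           // stands in for the "row set size"
    private final Consumer<List<T>> sink;  // receives each completed batch
    private final List<T> buffer;

    RowBatcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Adds one row and flushes as soon as the buffer is full. */
    void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Flushes any rows still buffered; call once after the last row. */
    void finish() {
        if (!buffer.isEmpty()) {
            flush();
        }
    }

    private void flush() {
        // Hand over a copy so the sink can keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a batch size of 2 and three rows added, the sink receives one full batch right away and the remaining row only when `finish()` is called. In a real pipeline the final flush would have to happen wherever the runner signals that no more rows are coming for the current unit of work; where exactly the plugin does this is not stated in these notes.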

The libraries for this release add up to hundreds of MB, which is not practical to host on GitHub, so you need to download the plugin elsewhere. Please download this archive and unzip it in the <PDI>/plugins/ folder.

Then patch your Kettle CE version 8.2 by unzipping pdi-engine-configuration-8.2.0.0-342.zip into your Kettle distribution root folder, overwriting the existing files. Doing this adds "Beam" as a Run Configuration option alongside "Pentaho" and "Spark". The source code for this new "Beam Run Configuration" can be found on branch 8.2.0.1 of pentaho-kettle, in the engine-configuration plugin.

For configuration and usage please see the README.md file in this project.