From 0569fe2e677fff3eb610d96d9f4cee1f30ace1a4 Mon Sep 17 00:00:00 2001
From: "Ariel Shaqed (Scolnicov)"
Date: Tue, 30 Apr 2024 15:39:13 +0300
Subject: [PATCH] Document the correct Spark client version (0.13.0, maybe assembled) (#7708)

* Document the correct Spark client version (0.13.0, maybe assembled)

* [CR] [bug] Fix tab heading

Tab headings don't support `` `...` `` so don't use that there. Also add
words how to use the assembled JAR with spark-shell and friends.

* [bug] Avoid indentation in -tabs notation
---
 docs/reference/spark-client.md | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/docs/reference/spark-client.md b/docs/reference/spark-client.md
index 418260d1ca0..6420e460c1d 100644
--- a/docs/reference/spark-client.md
+++ b/docs/reference/spark-client.md
@@ -19,17 +19,31 @@ Utilize the power of Spark to interact with the metadata on lakeFS. Possible use
 Please note that Spark 2 is no longer supported with the lakeFS metadata client.
 {: .note }
 
-Start Spark Shell / PySpark with the `--packages` flag:
+The Spark metadata client is compiled for Spark 3.1.2 with Hadoop 3.2.1, but
+can work for other Spark versions and higher Hadoop versions.
 
-This client is compiled for Spark 3.1.2 with Hadoop 3.2.1, but can work for other Spark
-versions and higher Hadoop versions.
+
+ +
+Start Spark Shell / PySpark with the `--packages` flag, for instance:
 
 ```bash
-spark-shell --packages io.lakefs:lakefs-spark-client_2.12:0.11.0
+spark-shell --packages io.lakefs:lakefs-spark-client_2.12:0.13.0
 ```
 
-Alternatively an assembled jar is available on S3, at
-`s3://treeverse-clients-us-east/lakefs-spark-client/0.11.0/lakefs-spark-client-assembly-0.11.0.jar`
+Alternatively use the assembled jar (an "Überjar") on S3, from
+`s3://treeverse-clients-us-east/lakefs-spark-client/0.13.0/lakefs-spark-client-assembly-0.13.0.jar`
+by passing its path to `--jars`.
+The assembled jar is larger but shades several common libraries. Use it if Spark
+complains about bad classes or missing methods.
+
+
+Include this assembled jar (an "Überjar") from S3, from
+`s3://treeverse-clients-us-east/lakefs-spark-client/0.13.0/lakefs-spark-client-assembly-0.13.0.jar`.
+
+
 
 ## Configuration