Commit

doc update for tpch spark
Dhruv Garg committed Nov 22, 2024
1 parent 977f36a commit d793b25
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion rpc/spark_erdos_setup.md
@@ -36,7 +36,7 @@ make

Running `./dbgen` above creates a dataset of scale factor `s` of `1` (default) i.e. 1GB.

-> NOTE: Had updated the scala version to 2.13.0 in tpch.sbt
+> NOTE: Had updated the scala version to 2.13.0 in tpch.sbt. The sbt version used was `1.9.7`.
Next, we build the target for `tpch-spark`:
```bash
@@ -211,6 +211,9 @@ The above job submission is parameterized by `(DEADLINE, QUERY_NUM, DATASET_SIZE
`(120, 4, 50, 50)`.
> Refer to `launch_expt_script.py` in `tpch-spark` for more details on eligible values for these parameters and how they are used.
+> NOTE: By default, env variable `TPCH_INPUT_DATA_DIR` will look for `dbgen` inside the current working directory. While it works for `spark-submit`
+> issued from inside the `tpch-spark` repository, it needs to be explicitly set otherwise.
Once submitted, review the application's runtime status on the Spark Web UI.

### Shutdown cluster