Hello. I was wondering whether there is a tutorial for, or current support for, 1) running a PySpark processing job locally and 2) doing so with a custom base Docker (EMR) image? I see a tutorial for Dask using a ScriptProcessor, and also some code for an SKLearn-based processor. My goal is basically to set up a local testing/dev environment that uses the SageMaker Spark processor code. I'm guessing this is more complicated than the other use cases, since this processor is usually backed by an EMR cluster.
Hi @dcompgriff, PySparkProcessor will not work in local mode. It runs on a SageMaker Docker image and has nothing to do with EMR.
You can build your own Spark Docker image, use ScriptProcessor with it (the same as the Dask example), and run it locally.
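For anyone landing here, a minimal sketch of that suggestion using the SageMaker Python SDK. The image tag, role ARN, and script name below are placeholders, not values from this thread, and local mode requires the `sagemaker[local]` extra plus Docker installed:

```python
# Hypothetical example: run a PySpark script in SageMaker local mode
# with a custom-built Spark image, via ScriptProcessor.
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput

spark_processor = ScriptProcessor(
    image_uri="my-spark-processing:latest",  # placeholder: your locally built Spark image
    command=["python3"],                     # command used to run the submitted script
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role ARN
    instance_count=1,
    instance_type="local",                   # "local" runs the job on your machine
)

spark_processor.run(
    code="preprocess.py",  # placeholder: your PySpark script
    inputs=[ProcessingInput(source="./data", destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output")],
)
```

The image itself needs Spark and `python3` installed so the `command` above can launch a `SparkSession` inside the container; that part mirrors the Dask example linked in the docs, just with Spark in place of Dask.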