I am trying to follow the instructions at https://spark.rstudio.com/graphframes/ to run graphframes with Spark 2.1.0.
However, I am running into an issue similar to the one described in #7.
After installing Spark with:
sparklyr::spark_install(version = "2.1.0")
I can connect to Spark in a fresh R session via:
library(sparklyr)
sc <- spark_connect(master = "local", version = "2.1.0")
However, when I also load graphframes, I get the following error:
> library(sparklyr)
> library(graphframes)
> sc <- spark_connect(master = "local", version = "2.1.0", config = conf)
Ivy Default Cache set to: /Users/ludwig/.ivy2/cache
The jars for the packages stored in: /Users/ludwig/.ivy2/jars
:: loading settings :: url = jar:file:/Users/ludwig/spark/spark-2.1.0-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
graphframes#graphframes added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
:: resolution report :: resolve 1226ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
module not found: graphframes#graphframes;0.5.0-spark2.1-s_2.11
==== local-m2-cache: tried
file:/Users/ludwig/.m2/repository/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.pom
-- artifact graphframes#graphframes;0.5.0-spark2.1-s_2.11!graphframes.jar:
file:/Users/ludwig/.m2/repository/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.jar
==== local-ivy-cache: tried
/Users/ludwig/.ivy2/local/graphframes/graphframes/0.5.0-spark2.1-s_2.11/ivys/ivy.xml
-- artifact graphframes#graphframes;0.5.0-spark2.1-s_2.11!graphframes.jar:
/Users/ludwig/.ivy2/local/graphframes/graphframes/0.5.0-spark2.1-s_2.11/jars/graphframes.jar
==== central: tried
https://repo1.maven.org/maven2/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.pom
-- artifact graphframes#graphframes;0.5.0-spark2.1-s_2.11!graphframes.jar:
https://repo1.maven.org/maven2/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.jar
==== spark-packages: tried
http://dl.bintray.com/spark-packages/maven/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.pom
-- artifact graphframes#graphframes;0.5.0-spark2.1-s_2.11!graphframes.jar:
http://dl.bintray.com/spark-packages/maven/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: graphframes#graphframes;0.5.0-spark2.1-s_2.11: not found
::::::::::::::::::::::::::::::::::::::::::::::
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: graphframes#graphframes;0.5.0-spark2.1-s_2.11: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1078)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:296)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:160)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
I have tried using a more recent version of Spark (2.4.3), as well as putting the apparently missing graphframes jars directly into the jars directory, both without success.
Any advice on how to resolve this would be greatly appreciated. Thanks!
The problem seems to be that the default repositories sparklyr tries to download graphframes from (https://repo1.maven.org and http://dl.bintray.com) no longer host the graphframes jars.
Also, the code could be updated to pull the latest version of graphframes (v0.8.1, Sep 2020), which works with Spark 2.4 and higher, as done here.
I can provide a pull request if it seems worth incorporating these updates.
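Until such an update lands, one possible workaround is to point spark-submit at an additional repository, so that the coordinate from the log above can be resolved somewhere other than the two default repos. The snippet below is only a sketch under two assumptions that are not confirmed in this thread: that https://repos.spark-packages.org hosts the graphframes artifact in question, and that a `sparklyr.shell.repositories` config entry is forwarded to spark-submit as `--repositories` (sparklyr documents `sparklyr.shell.*` entries as spark-submit command-line parameters).

```r
library(sparklyr)
library(graphframes)

# Start from sparklyr's default configuration.
conf <- spark_config()

# Assumption: `sparklyr.shell.*` entries are passed to spark-submit as flags,
# so this entry becomes `--repositories https://repos.spark-packages.org`.
# Assumption: that repository hosts the graphframes jar that the default
# repos no longer serve.
conf[["sparklyr.shell.repositories"]] <- "https://repos.spark-packages.org"

# Same call as in the failing session, now with `conf` defined explicitly.
sc <- spark_connect(master = "local", version = "2.1.0", config = conf)
```

If the dependency still fails to resolve, the ":: problems summary ::" section of the Ivy output should list the extra repository among the locations tried, which at least confirms whether the option was picked up.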