Describe the bug
I added the following settings to the Spark configuration:
spark.jars=hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar
spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.mode=ENABLED
spark.spline.lineageDispatcher=hdfs
spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
spark.spline.lineageDispatcher.hdfs.fileNamePrefix=lineage_
spark.spline.lineageDispatcher.hdfs.fileBufferSize=4096
spark.spline.lineageDispatcher.hdfs.filePermissions=777
spark.driver.memory=4g
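To rule out the possibility that these settings never reach the driver, the same properties could also be passed explicitly as `--conf` flags to `spark-submit`. Below is a minimal sketch of building that command line. The helper names `parse_properties` and `to_submit_args` and the script name `job.py` are hypothetical and are not part of Spark or Spline; only a subset of the properties above is shown.

```python
import shlex

# A subset of the properties listed above, in key=value form.
PROPERTIES = """\
spark.jars=hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar
spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.mode=ENABLED
spark.spline.lineageDispatcher=hdfs
"""


def parse_properties(text):
    """Parse key=value lines into a dict, skipping blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def to_submit_args(conf, script):
    """Build a spark-submit argument list with one --conf flag per property."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", "{}={}".format(key, value)]
    args.append(script)
    return args


if __name__ == "__main__":
    # "job.py" is a placeholder for the actual PySpark script.
    args = to_submit_args(parse_properties(PROPERTIES), "job.py")
    print(" ".join(shlex.quote(a) for a in args))
```

Passing the flags on the command line sidesteps any doubt about which properties file the cluster actually loads.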
Then I ran this code:
from pyspark.sql import SparkSession

# Parentheses allow the builder chain to span several lines
spark = (
    SparkSession.builder
    .appName("Write DataFrame to HDFS as CSV")
    .getOrCreate()
)

# Create a sample DataFrame
data = [
    (1, "Alice", 28),
    (2, "Bob", 24),
    (3, "Cathy", 29),
]
columns = ["Id", "Name", "Age"]
df = spark.createDataFrame(data, columns)
df.show()

# Write the DataFrame to HDFS as a single CSV file with a header row
output_path = "hdfs:///tmp/sample_data"
(
    df.coalesce(1).write
    .option("header", True)
    .mode("overwrite")
    .csv(output_path)
)
print("DataFrame written to HDFS at {}".format(output_path))
spark.stop()
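A quick check that could be added before `spark.stop()` is to compare the expected settings against what the session actually sees. The sketch below is only an illustration: `missing_settings` is a hypothetical helper, and in the real job the `effective` dict would be built from `dict(spark.sparkContext.getConf().getAll())` rather than written by hand.

```python
def missing_settings(expected, effective):
    """Return {key: (expected_value, effective_value)} for every mismatch.

    effective_value is None when the key is absent from the session config.
    """
    diff = {}
    for key, want in expected.items():
        got = effective.get(key)
        if got != want:
            diff[key] = (want, got)
    return diff


if __name__ == "__main__":
    expected = {
        "spark.sql.queryExecutionListeners":
            "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener",
        "spark.spline.mode": "ENABLED",
    }
    # Hand-written stand-in for illustration only. In the job itself this
    # would be: effective = dict(spark.sparkContext.getConf().getAll())
    effective = {"spark.spline.mode": "ENABLED"}
    for key, (want, got) in missing_settings(expected, effective).items():
        print("{}: expected {!r}, session has {!r}".format(key, want, got))
```

An empty result would suggest the properties are being picked up, which narrows the problem down to the listener or the dispatcher itself.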
The log shows that everything ran OK, but when I list the lineage output directory:
hdfs dfs -ls /tmp/spline/lineage/
I see no data.
The log is attached.
Versions
Please provide versions of: Spline, Spark and Scala that were in use when the bug happened.
Spark: 2.4.8.7.2.18.0-641
Scala: 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_232)
Spline: spark-2.4-spline-agent-bundle_2.11, version 2.2.1