I'm trying to work with ClickHouse from PySpark using the ClickHouse connector, and I get parsing errors when I use SQL constructs specific to the ClickHouse dialect, such as `limit 1 by`.
Here are the code and examples of the errors I get. I tried different combinations of compatible PySpark, connector, client, HTTP client, and JDBC versions, but none of them worked.
Traceback (most recent call last):
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
df = ss.sql('select * from last_ts final limit 1 by app_name')
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/pyspark/sql/session.py", line 1631, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/py4j/java_gateway.py", line 1322, in __call__
return_value = get_return_value(
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/pyspark/errors/exceptions/captured.py", line 185, in deco
raise converted from None
pyspark.errors.exceptions.captured.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'by'.(line 1, pos 36)
== SQL ==
select * from last_ts final limit 1 by app_name
------------------------------------^^^
Traceback (most recent call last):
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
df = ss.sql('select * from last_ts settings final = 1')
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/pyspark/sql/session.py", line 1631, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/py4j/java_gateway.py", line 1322, in __call__
return_value = get_return_value(
File "/Users/repos/venvs/fx_reports/lib/python3.9/site-packages/pyspark/errors/exceptions/captured.py", line 185, in deco
raise converted from None
pyspark.errors.exceptions.captured.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'final'.(line 1, pos 31)
== SQL ==
select * from last_ts settings final = 1
-------------------------------^^^
dinkovv changed the title from "Parsing errors" to "SQL parsing syntax errors" on Dec 20, 2024
`from pyspark.sql import SparkSession

# Connector and client artifacts pulled in through spark.jars.packages
packages = [
    "com.clickhouse.spark:clickhouse-spark-runtime-3.4_2.12:0.8.0",
    "com.clickhouse:clickhouse-client:0.7.0",
    "com.clickhouse:clickhouse-http-client:0.7.0",
    "org.apache.httpcomponents.client5:httpclient5:5.2.1",
]

spark = (
    SparkSession.builder
    .config("spark.jars.packages", ",".join(packages))
    .getOrCreate()
)

# Register ClickHouse as a Spark catalog named "clickhouse"
spark.conf.set("spark.sql.catalog.clickhouse", "com.clickhouse.spark.ClickHouseCatalog")
spark.conf.set("spark.sql.catalog.clickhouse.host", "127.0.0.1")
spark.conf.set("spark.sql.catalog.clickhouse.protocol", "http")
spark.conf.set("spark.sql.catalog.clickhouse.http_port", "8123")
spark.conf.set("spark.sql.catalog.clickhouse.user", "default")
spark.conf.set("spark.sql.catalog.clickhouse.password", "123456")
spark.conf.set("spark.sql.catalog.clickhouse.database", "default")
spark.conf.set("spark.clickhouse.write.format", "json")

# First failing query from the tracebacks (the session is named `ss` there)
query = "select * from last_ts final limit 1 by app_name"
df = spark.sql(query)
df.show()`
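For comparison, both errors are `ParseException`s raised by Spark's own SQL parser, so the query text is rejected before the connector sends anything to ClickHouse. Below is a minimal sketch of how `limit 1 by app_name` might be rewritten in syntax Spark accepts, using a `row_number()` window. The helper `limit_by_to_window`, the ordering column `ts`, and the qualified table name `clickhouse.default.last_ts` are my assumptions for illustration, not part of the connector. It does not reproduce `final`.

```python
def limit_by_to_window(table: str, by: str, order_by: str, n: int = 1) -> str:
    """Build a Spark SQL query that keeps the first n rows per `by` value.

    Hypothetical helper: it only formats a string and never touches Spark.
    Spark requires an ORDER BY inside the window for row_number(), so an
    ordering expression has to be supplied explicitly.
    """
    return (
        "SELECT * FROM ("
        f"SELECT *, row_number() OVER (PARTITION BY {by} ORDER BY {order_by}) AS rn "
        f"FROM {table}"
        f") AS t WHERE rn <= {n}"
    )


# `ts` is an assumed timestamp column; replace it with a real column of last_ts.
query = limit_by_to_window(
    "clickhouse.default.last_ts", by="app_name", order_by="ts DESC"
)
print(query)

# With the session from the script above (not executed here):
# df = spark.sql(query).drop("rn")
```

Unlike `limit 1 by` without an `order by`, this version needs an explicit ordering, and the helper column `rn` has to be dropped afterwards.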