feat(spark)!: Transpile ANY to EXISTS #4305

VaggelisD · 2024-10-29T10:04:58Z

In the Hive dialect hierarchy, ANY is an aggregate function over BOOLEAN expressions, not an array/subquery operator as in other dialects, so transpiled queries such as the following do not work:

>>> sqlglot.parse_one("WITH t AS (SELECT ARRAY[1, 2, 3] AS col) SELECT * FROM t WHERE 1 <= ANY(col)", dialect="postgres").sql("spark")
'WITH t AS (SELECT ARRAY(1, 2, 3) AS col) SELECT * FROM t WHERE 1 <= ANY(col)'

spark-sql (default)> WITH t AS (SELECT ARRAY(1, 2, 3) AS col) SELECT * FROM t WHERE 1 <= ANY(col);
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve "any(col)" due to data type mismatch: Parameter 1 requires the "BOOLEAN" type, however "col" has the type "ARRAY<INT>".; line 1 pos 68;
...
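To make the mismatch concrete, here is a minimal pure-Python model of the two meanings of ANY. The function names `pg_any` and `spark_any_agg` are hypothetical and exist only for illustration; the sketch ignores NULL handling and empty inputs, and it is not how either engine is implemented.

```python
import operator
from typing import Any, Callable, Iterable, Sequence


def pg_any(value: Any, op: Callable[[Any, Any], bool], array: Sequence[Any]) -> bool:
    """Postgres-style quantifier: `value op ANY(array)` (NULLs ignored here)."""
    return any(op(value, element) for element in array)


def spark_any_agg(values: Iterable[Any]) -> bool:
    """Spark-style aggregate: any(expr) over the BOOLEAN values of a group."""
    values = list(values)
    for v in values:
        if not isinstance(v, bool):
            # Mirrors the spirit of DATATYPE_MISMATCH: the input must be BOOLEAN.
            raise TypeError(f"any() requires BOOLEAN input, got {type(v).__name__}")
    return any(values)


# Postgres reading of `1 <= ANY(col)` with col = [1, 2, 3].
print(pg_any(1, operator.le, [1, 2, 3]))

# Spark reading of `any(col)`: one row whose value is an array, not a boolean.
try:
    spark_any_agg([[1, 2, 3]])
except TypeError as exc:
    print(exc)
```

Under this model the Postgres form is a comparison quantified over array elements, while the Spark form only accepts booleans, which is why passing the array column through unchanged fails.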

However, Spark2+ supports the EXISTS() function which can process both ARRAY expressions and subqueries in a similar fashion. This PR:

1. Adds a transformation for the former (ARRAY) case, enabling the following path:

>>> sqlglot.parse_one("WITH t AS (SELECT ARRAY[1, 2, 3] AS col) SELECT * FROM t WHERE 1 <= ANY(col)", dialect="postgres").sql("spark")
'WITH t AS (SELECT ARRAY(1, 2, 3) AS col) SELECT * FROM t WHERE EXISTS(col, x -> 1 <= x)'

spark-sql (default)> WITH t AS (SELECT ARRAY(1, 2, 3) AS col) SELECT * FROM t WHERE EXISTS(col, x -> 1 <= x);
[1,2,3]

2. Extends the EXISTS class to also inherit from Func, making it possible to parse and construct it as a proper function.

Step (2) is not strictly required (we could also build an anonymous function; see the first commit), so it is added as a standalone commit on top of (1). If we don't want to keep it, it is trivial to drop.
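The rewrite in (1) can be sketched on a toy AST. Everything below (`AnyOp`, `Compare`, `to_spark_sql`, `spark_exists`) is a hypothetical stand-in written for this illustration and is not sqlglot's API or the code in sqlglot/transforms.py; it only shows the shape of the transformation, assuming the ANY operand is an array and the lambda variable is named `x` as in the output above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Union


@dataclass(frozen=True)
class AnyOp:
    """Hypothetical node for ANY(<array expression>)."""
    array: str


@dataclass(frozen=True)
class Compare:
    """Hypothetical node for <left> <op> <right>."""
    left: str
    op: str
    right: Union[str, AnyOp]


def to_spark_sql(node: Compare) -> str:
    """Render a comparison, rewriting `left op ANY(arr)` to `EXISTS(arr, x -> left op x)`."""
    if isinstance(node.right, AnyOp):
        return f"EXISTS({node.right.array}, x -> {node.left} {node.op} x)"
    return f"{node.left} {node.op} {node.right}"


def spark_exists(array: Sequence[Any], predicate: Callable[[Any], bool]) -> bool:
    """Toy model of the array form of EXISTS: true if any element satisfies the predicate."""
    return any(predicate(element) for element in array)


print(to_spark_sql(Compare("1", "<=", AnyOp("col"))))
print(spark_exists([1, 2, 3], lambda x: 1 <= x))
```

The comparison moves inside the lambda, with the array element taking the place of the ANY operand, so the left-hand side and the operator are preserved as written.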

Docs

Postgres ANY for ARRAY | Postgres ANY for subqueries | Spark/Databricks ANY | Spark/Databricks EXISTS

sqlglot/transforms.py

VaggelisD added 2 commits October 29, 2024 11:48

feat(spark): Transpile ANY operator to EXISTS

0ceabf8

Support EXISTS as a function

ecc0e94

VaggelisD changed the title ~~feat(spark): Transpile ANY to EXISTS~~ feat(spark)!: Transpile ANY to EXISTS Oct 29, 2024

georgesittas reviewed Oct 29, 2024

sqlglot/transforms.py (review thread resolved)

georgesittas approved these changes Oct 29, 2024

tobymao approved these changes Oct 29, 2024

VaggelisD merged commit e92904e into main Oct 29, 2024
6 checks passed

VaggelisD deleted the vaggelisd/spark_any branch October 29, 2024 15:44
