Note: This parser is not an official PostgreSQL project but an independent, unofficial tool.
postgresql-cst-parser
is a PostgreSQL-specific Concrete Syntax Tree (CST) parser developed in Pure Rust. This document describes the parser's features, development motivation, usage, and implementation details.
- PostgreSQL 17 Support: Supports the latest PostgreSQL 17 syntax.
- Structured CST Output: The generated CST strictly follows the structure defined in PostgreSQL's gram.y file.
- Utilizing
cstree
: Uses thecstree
crate for building syntax trees. - PL/pgSQL: Currently not supported.
This project was developed because we needed a library that can be used from Rust, supports all syntax, and (being written in Pure Rust) can be used with wasm-bindgen.
You can use it as follows:
use postgresql_cst_parser::{parse, syntax_kind::SyntaxKind};
fn main() {
// Parse SQL query and get the syntax tree
let sql = "SELECT tbl.a as a, tbl.b from TBL tbl WHERE tbl.a > 0;";
let root = parse(sql).unwrap();
// Example 1: Extract all column references from the query
let column_refs: Vec<String> = root
.descendants()
.filter(|node| node.kind() == SyntaxKind::columnref)
.map(|node| node.text().to_string())
.collect();
println!("Column references: {:?}", column_refs); // ["tbl.a", "tbl.b", "tbl.a"]
// Example 2: Find the WHERE condition
if let Some(where_clause) = root
.descendants()
.find(|node| node.kind() == SyntaxKind::where_clause)
{
println!("WHERE condition: {}", where_clause.text());
}
// Example 3: Get the selected table name
if let Some(relation_expr) = root
.descendants()
.find(|node| node.kind() == SyntaxKind::relation_expr)
{
if let Some(name_node) = relation_expr
.descendants()
.find(|node| node.kind() == SyntaxKind::ColId)
{
println!("Table name: {}", name_node.text());
}
}
// Example 4: Parse complex SQL and extract specific nodes
let complex_sql = "WITH data AS (SELECT id, value FROM source WHERE value > 10)
SELECT d.id, d.value, COUNT(*) OVER (PARTITION BY d.id)
FROM data d JOIN other o ON d.id = o.id
ORDER BY d.value DESC LIMIT 10;";
let complex_root = parse(complex_sql).unwrap();
// Extract CTEs (Common Table Expressions)
let ctes: Vec<_> = complex_root
.descendants()
.filter(|node| node.kind() == SyntaxKind::common_table_expr)
.collect();
// Extract window functions
let window_funcs: Vec<_> = complex_root
.descendants()
.filter(|node| node.kind() == SyntaxKind::over_clause)
.collect();
println!("Number of CTEs: {}", ctes.len());
println!("Number of window functions: {}", window_funcs.len());
}
Example of the generated syntax tree:
SELECT tbl.a as a from TBL tbl;
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected] "SELECT"
[email protected] " "
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected] "tbl"
[email protected]
[email protected]
[email protected] "."
[email protected]
[email protected]
[email protected] "a"
[email protected] " "
[email protected] "as"
[email protected] " "
[email protected]
[email protected] "a"
[email protected] " "
[email protected]
[email protected] "from"
[email protected] " "
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected] "TBL"
[email protected] " "
[email protected]
[email protected]
[email protected]
[email protected] "tbl"
[email protected] ";"
If you'd like to try this parser directly, you can experience it online here.
This implementation uses PostgreSQL's scan.l and gram.y with patches from libpg_query applied. scan.l
has been further rewritten for Rust, and based on scan.l
and gram.y
, a syntax parsing table has been created to build the parser.
kwlist.h
,parser.c
,scan.l
,gram.y
are under the PostgreSQL License.lexer_ported.rs
andgenerated.rs
contain code ported from PostgreSQL, so the ported parts are under the PostgreSQL License.- This project applies patches from libpg_query to
scan.l
andgram.y
, but the patches themselves are not included in this repository. - Other files are published under the MIT License.