Extensible SQL Lexer and Parser for Rust.
The goal of this project is to build a SQL lexer and parser capable of parsing SQL that conforms with the ANSI/ISO SQL standard while also making it easy to support custom dialects so that this crate can be used as a foundation for vendor-specific parsers.
This parser is currently being used by the DataFusion query engine, LocustDB, Ballista and GlueSQL.
Example
To parse a simple SELECT
statement:
use sqlparser::dialect::GenericDialect;
use sqlparser::parser::Parser;
let sql = "SELECT a, b, 123, myfunc(b)
FROM table_1
WHERE a > b AND b < 100
ORDER BY a DESC, b";
let dialect = GenericDialect {}; // or AnsiDialect, or your own dialect ...
let ast = Parser::parse_sql(&dialect, sql).unwrap();
println!("AST: {:?}", ast);
This outputs
AST: [Query(Query { ctes: [], body: Select(Select { distinct: false, projection: [UnnamedExpr(Identifier("a")), UnnamedExpr(Identifier("b")), UnnamedExpr(Value(Long(123))), UnnamedExpr(Function(Function { name: ObjectName(["myfunc"]), args: [Identifier("b")], over: None, distinct: false }))], from: [TableWithJoins { relation: Table { name: ObjectName(["table_1"]), alias: None, args: [], with_hints: [] }, joins: [] }], selection: Some(BinaryOp { left: BinaryOp { left: Identifier("a"), op: Gt, right: Identifier("b") }, op: And, right: BinaryOp { left: Identifier("b"), op: Lt, right: Value(Long(100)) } }), group_by: [], having: None }), order_by: [OrderByExpr { expr: Identifier("a"), asc: Some(false) }, OrderByExpr { expr: Identifier("b"), asc: None }], limit: None, offset: None, fetch: None })]
Command line
To parse a file and dump the results as JSON:
$ cargo run --features json_example --example cli FILENAME.sql [--dialectname]
Supporting custom SQL dialects
This is a work in progress, but we have some notes on writing a custom SQL parser.
Design
The core expression parser uses the Pratt Parser design, which is a top-down operator-precedence (TDOP) parser, while the surrounding SQL statement parser is a traditional, hand-written recursive descent parser. Eli Bendersky has a good tutorial on TDOP parsers, if you are interested in learning more about the technique.
Full Example
Let us look at a full SQL Parser Example.
Step 1. Write Code
Finally we need to write our code as follows:
(a). cli.rs
#![warn(clippy::all)]
/// A small command-line app to run the parser.
/// Run with cargo run --example cli
use std::fs;
use simple_logger::SimpleLogger;
use sqlparser::dialect::*;
use sqlparser::parser::Parser;
fn main() {
SimpleLogger::new().init().unwrap();
let filename = std::env::args().nth(1).expect(
r#"
No arguments provided!
Usage:
$ cargo run --example cli FILENAME.sql [--dialectname]
To print the parse results as JSON:
$ cargo run --feature json_example --example cli FILENAME.sql [--dialectname]
"#,
);
let dialect: Box<dyn Dialect> = match std::env::args().nth(2).unwrap_or_default().as_ref() {
"--ansi" => Box::new(AnsiDialect {}),
"--bigquery" => Box::new(BigQueryDialect {}),
"--postgres" => Box::new(PostgreSqlDialect {}),
"--ms" => Box::new(MsSqlDialect {}),
"--mysql" => Box::new(MySqlDialect {}),
"--snowflake" => Box::new(SnowflakeDialect {}),
"--hive" => Box::new(HiveDialect {}),
"--redshift" => Box::new(RedshiftSqlDialect {}),
"--generic" | "" => Box::new(GenericDialect {}),
s => panic!("Unexpected parameter: {}", s),
};
println!("Parsing from file '{}' using {:?}", &filename, dialect);
let contents = fs::read_to_string(&filename)
.unwrap_or_else(|_| panic!("Unable to read the file {}", &filename));
let without_bom = if contents.chars().next().unwrap() as u64 != 0xfeff {
contents.as_str()
} else {
let mut chars = contents.chars();
chars.next();
chars.as_str()
};
let parse_result = Parser::parse_sql(&*dialect, without_bom);
match parse_result {
Ok(statements) => {
println!(
"Round-trip:n'{}'",
statements
.iter()
.map(std::string::ToString::to_string)
.collect::<Vec<_>>()
.join("n")
);
if cfg!(feature = "json_example") {
#[cfg(feature = "json_example")]
{
let serialized = serde_json::to_string_pretty(&statements).unwrap();
println!("Serialized as JSON:n{}", serialized);
}
} else {
println!("Parse results:n{:#?}", statements);
}
std::process::exit(0);
}
Err(e) => {
println!("Error during parsing: {:?}", e);
std::process::exit(1);
}
}
}
(b). parse_select.rs
#![warn(clippy::all)]
use sqlparser::dialect::GenericDialect;
use sqlparser::parser::*;
fn main() {
let sql = "SELECT a, b, 123, myfunc(b)
FROM table_1
WHERE a > b AND b < 100
ORDER BY a DESC, b";
let dialect = GenericDialect {};
let ast = Parser::parse_sql(&dialect, sql).unwrap();
println!("AST: {:?}", ast);
}
Reference
Download the code below:
No. | Link |
---|---|
1. | Download Full Code |
2. | Read more here. |
3. | Follow code author here. |