Apache Calcite - When You Need the Calcite JDBC Driver

Preface

Code that queries data through Calcite usually obtains the root schema like this:

Connection connection = DriverManager.getConnection("jdbc:calcite:", info);
CalciteConnection calciteConnection = connection.unwrap(CalciteConnection.class);
SchemaPlus rootSchema = calciteConnection.getRootSchema();

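Not every use of Calcite needs this code, though. When Calcite serves only as a query optimizer or SQL parser, there is no need to go through the Calcite JDBC driver to reach a data source.

Optimizer or parser scenarios: no driver needed

If you only use Calcite to parse, validate, and optimize SQL, and do not need to access a data source through the JDBC interface, you can skip the JDBC connection code. Build and manipulate query plans directly with the Calcite API instead, for example by using the Frameworks utility class to set up the Calcite environment and configuration. Parsing SQL on its own is a common scenario, so it is not demonstrated here. The test below builds a schema by hand, converts a SQL statement to a RelNode tree, optimizes it with a HepPlanner, and converts the result back to SQL.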

    import org.apache.calcite.config.Lex;
    import org.apache.calcite.plan.hep.HepPlanner;
    import org.apache.calcite.plan.hep.HepProgram;
    import org.apache.calcite.plan.hep.HepProgramBuilder;
    import org.apache.calcite.rel.RelNode;
    import org.apache.calcite.rel.RelRoot;
    import org.apache.calcite.rel.rel2sql.RelToSqlConverter;
    import org.apache.calcite.rel.rel2sql.SqlImplementor.Result;
    import org.apache.calcite.rel.rules.CoreRules;
    import org.apache.calcite.rel.type.RelDataType;
    import org.apache.calcite.rel.type.RelDataTypeFactory;
    import org.apache.calcite.schema.SchemaPlus;
    import org.apache.calcite.schema.Table;
    import org.apache.calcite.schema.impl.AbstractSchema;
    import org.apache.calcite.schema.impl.AbstractTable;
    import org.apache.calcite.sql.SqlNode;
    import org.apache.calcite.sql.dialect.MysqlSqlDialect;
    import org.apache.calcite.sql.parser.SqlParser;
    import org.apache.calcite.sql.pretty.SqlPrettyWriter;
    import org.apache.calcite.sql.validate.SqlConformanceEnum;
    import org.apache.calcite.tools.FrameworkConfig;
    import org.apache.calcite.tools.Frameworks;
    import org.apache.calcite.tools.Planner;
    import org.junit.Test; // or org.junit.jupiter.api.Test for JUnit 5

    public class SqlToRelNodeTest {

        @Test
        public void testSqlToRelNode() throws Exception {
            // Build the schema by hand; no JDBC connection is involved
            SchemaPlus rootSchema = Frameworks.createRootSchema(true);
            SchemaPlus mySchema = rootSchema.add("MY_SCHEMA", new AbstractSchema());

            Table yourTable = new AbstractTable() {
                @Override
                public RelDataType getRowType(RelDataTypeFactory typeFactory) {
                    // For dynamically discovered tables, build the row type yourself
                    return typeFactory.builder()
                            .add("id", typeFactory.createJavaType(int.class))
                            .add("name", typeFactory.createJavaType(String.class))
                            .add("age", typeFactory.createJavaType(int.class))
                            .build();
                }
            };
            Table departmentTable = new AbstractTable() {
                @Override
                public RelDataType getRowType(RelDataTypeFactory typeFactory) {
                    return typeFactory.builder()
                            .add("id", typeFactory.createJavaType(int.class))
                            .add("department", typeFactory.createJavaType(String.class))
                            .add("location", typeFactory.createJavaType(String.class))
                            .build();
                }
            };
            mySchema.add("your_table", yourTable);
            mySchema.add("department_table", departmentTable);

            SqlParser.Config parserConfig = SqlParser.config()
                    .withLex(Lex.MYSQL)
                    .withConformance(SqlConformanceEnum.MYSQL_5);
            FrameworkConfig config = Frameworks.newConfigBuilder()
                    .parserConfig(parserConfig)
                    .defaultSchema(mySchema) // use the custom schema
                    .build();
            Planner planner = Frameworks.getPlanner(config);

            String sql = "SELECT A.id, A.name "
                    + "FROM (SELECT id, name FROM your_table WHERE age > 30) A "
                    + "JOIN (SELECT id, department FROM department_table WHERE location = 'NY') B "
                    + "ON A.id = B.id WHERE A.id > 100";
            // String sql = "SELECT * FROM your_table where id = 1 and name = 'you_name'";

            // Parse, validate, and convert to relational algebra
            SqlNode sqlNode = planner.parse(sql);
            SqlNode validatedSqlNode = planner.validate(sqlNode);
            RelRoot relRoot = planner.rel(validatedSqlNode);
            RelNode rootRelNode = relRoot.rel;
            System.out.println(rootRelNode.explain());

            HepProgram hepProgram = new HepProgramBuilder()
                    .addRuleInstance(CoreRules.FILTER_PROJECT_TRANSPOSE)
                    .addRuleInstance(CoreRules.FILTER_INTO_JOIN)
                    .addRuleInstance(CoreRules.FILTER_AGGREGATE_TRANSPOSE)
                    .addRuleInstance(CoreRules.FILTER_SET_OP_TRANSPOSE)
                    .addRuleInstance(CoreRules.PROJECT_FILTER_TRANSPOSE)
                    .addRuleInstance(CoreRules.PROJECT_JOIN_TRANSPOSE)
                    .build();

            // Create the HepPlanner, set the root RelNode, and optimize
            HepPlanner hepPlanner = new HepPlanner(hepProgram);
            hepPlanner.setRoot(rootRelNode);
            RelNode optimizedRelNode = hepPlanner.findBestExp();
            System.out.println("Optimized RelNode: \n" + optimizedRelNode.explain());

            // Convert the optimized RelNode back to SQL with RelToSqlConverter
            RelToSqlConverter relToSqlConverter = new RelToSqlConverter(MysqlSqlDialect.DEFAULT);
            Result result = relToSqlConverter.visitRoot(optimizedRelNode);
            SqlNode sqlNodeConverted = result.asStatement();

            // Format the SQL with SqlPrettyWriter and print it
            SqlPrettyWriter writer = new SqlPrettyWriter();
            String convertedSql = writer.format(sqlNodeConverted);
            System.out.println("Optimized SQL: " + convertedSql);
        }
    }

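Frameworks.createRootSchema(true) is the Apache Calcite method that creates a root schema. Its purpose and parameter are described below.

What Frameworks.createRootSchema(true) means

Purpose: the method creates a new root schema (a SchemaPlus), the top-level container that can hold any number of sub-schemas and tables. In Calcite, schemas organize and give access to database objects such as tables, views, and functions.
Parameter:
- boolean addMetadataSchema: whether to add a default metadata schema to the new root schema. true adds a sub-schema named "metadata", which holds system tables with definitions of tables, columns, and so on. false adds no metadata schema.
When to use each
- Default setup: parsing, validating, and optimizing queries with Calcite requires a schema to organize and resolve the data. createRootSchema(true) quickly gives you a root schema that includes the metadata schema, which is convenient for system-level queries and management.
- Custom setup: if you want full control over which schemas and tables are visible and do not need system metadata, pass false to get a leaner root schema.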

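Using Calcite as a JDBC driver

If you want to use Calcite through the JDBC interface, you do need code like

DriverManager.getConnection("jdbc:calcite:", info);

to open a connection. You can then run standard SQL and JDBC operations against Calcite just as you would against any other database.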

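The JDBC specification

JDBC (Java Database Connectivity) is part of the Java platform. It defines a standard set of APIs that let Java applications interact with a wide range of databases.

JDBC and drivers are closely tied: JDBC defines the interfaces, and a driver supplies the implementation behind them.

A driver is a library that implements the JDBC interfaces, usually provided by the database vendor. It translates JDBC API calls into database-specific calls and so acts as the bridge between a Java application and the database.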

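The Calcite driver

To execute SQL through the JDBC interfaces, Calcite has to implement its own driver and register it with the driver manager. The JDBC driver implementation lives in the org.apache.calcite.jdbc package. Per the specification, a driver must provide the following basic pieces:
Implementation steps

Implement the Driver interface: this is the entry point of a JDBC driver. Implement java.sql.Driver and register the driver.
Implement the Connection interface: it manages the connection to the database. Implement the methods of java.sql.Connection to support transaction management, closing the connection, and so on.
Implement the Statement interface: it executes SQL. Implement java.sql.Statement so that SQL statements can be executed and results returned.
Implement the ResultSet interface: it handles query results. Implement java.sql.ResultSet to give access to the result set.

Boilerplate code (a skeleton; the methods marked as omitted still have to be written):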

import java.sql.*;
import java.util.Properties;
import java.util.logging.Logger;

// Skeleton only: Connection, Statement and ResultSet declare many more
// abstract methods, all of which must be implemented before this compiles.

// 1. Implement the Driver interface
public class MyMemoryDriver implements Driver {
    static {
        // Register with DriverManager as soon as the class is initialized
        try {
            DriverManager.registerDriver(new MyMemoryDriver());
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        // Returning null tells DriverManager that this driver does not handle the URL
        return acceptsURL(url) ? new MyMemoryConnection() : null;
    }

    @Override
    public boolean acceptsURL(String url) throws SQLException {
        return url != null && url.startsWith("jdbc:mymemory:");
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) throws SQLException {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() {
        return 1;
    }

    @Override
    public int getMinorVersion() {
        return 0;
    }

    @Override
    public boolean jdbcCompliant() {
        return false;
    }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        // This driver does not use java.util.logging
        throw new SQLFeatureNotSupportedException();
    }
}

// 2. Implement the Connection interface
class MyMemoryConnection implements Connection {
    @Override
    public Statement createStatement() throws SQLException {
        return new MyMemoryStatement();
    }

    @Override
    public void close() throws SQLException {
        // Close the connection
    }

    @Override
    public boolean isClosed() throws SQLException {
        return false;
    }

    // Other Connection methods omitted...
}

// 3. Implement the Statement interface
class MyMemoryStatement implements Statement {
    @Override
    public ResultSet executeQuery(String sql) throws SQLException {
        return new MyMemoryResultSet();
    }

    @Override
    public void close() throws SQLException {
        // Close the statement
    }

    // Other Statement methods omitted...
}

// 4. Implement the ResultSet interface
class MyMemoryResultSet implements ResultSet {
    @Override
    public boolean next() throws SQLException {
        return false; // Simulate an empty result
    }

    @Override
    public void close() throws SQLException {
        // Close the result set
    }

    // Other ResultSet methods omitted...
}

The Driver implementation

org.apache.calcite.jdbc.Driver

Registration happens in a static initializer block:

public class Driver extends UnregisteredDriver {
  public static final String CONNECT_STRING_PREFIX = "jdbc:calcite:";

  protected final @Nullable Supplier<CalcitePrepare> prepareFactory;

  static {
    new Driver().register();
  }
  // ...
}

// In org.apache.calcite.avatica.UnregisteredDriver:
/** Registers this driver with the driver manager. */
protected void register() {
  try {
    DriverManager.registerDriver(this);
  } catch (SQLException e) {
    System.out.println("Error occurred while registering JDBC driver "
        + this + ": " + e.toString());
  }
}
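The static-block registration pattern above can be tried in isolation with nothing but the JDK. The following is a minimal sketch, not Calcite code: DemoDriver, StaticRegistrationDemo, registerByLoading, isRegistered, and the "jdbc:demo:" prefix are all invented for illustration. It shows that the driver only appears in DriverManager once its class has been initialized.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Collections;
import java.util.Properties;
import java.util.logging.Logger;

public class StaticRegistrationDemo {

    /** Hypothetical driver; the "jdbc:demo:" prefix is made up for this sketch. */
    public static class DemoDriver implements Driver {
        static {
            // Runs exactly once, when the JVM initializes DemoDriver
            try {
                DriverManager.registerDriver(new DemoDriver());
            } catch (SQLException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        @Override public Connection connect(String url, Properties info) {
            return null; // this sketch never opens a real connection
        }
        @Override public boolean acceptsURL(String url) {
            return url != null && url.startsWith("jdbc:demo:");
        }
        @Override public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }
        @Override public int getMajorVersion() { return 1; }
        @Override public int getMinorVersion() { return 0; }
        @Override public boolean jdbcCompliant() { return false; }
        @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException();
        }
    }

    /** Forces initialization of DemoDriver, which triggers its static block. */
    public static void registerByLoading() {
        try {
            Class.forName(DemoDriver.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if a DemoDriver instance is registered with DriverManager. */
    public static boolean isRegistered() {
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            if (d instanceof DemoDriver) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("before: " + isRegistered()); // before: false
        registerByLoading();
        System.out.println("after: " + isRegistered());  // after: true
    }
}
```

Since JDBC 4.0, DriverManager can also discover drivers through the service-provider mechanism (a META-INF/services/java.sql.Driver entry), so applications rarely need an explicit Class.forName call; the static block is still what performs the registration once the class is loaded.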

Other implementations

Connection: org.apache.calcite.jdbc.CalciteConnectionImpl
Statement: org.apache.calcite.jdbc.CalciteStatement, with org.apache.calcite.jdbc.CalcitePreparedStatement for prepared statements
ResultSet: org.apache.calcite.jdbc.CalciteResultSet

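Finding the driver by URL

DriverManager.getConnection("jdbc:calcite:", info);

In JDBC, the URL's main job is to let DriverManager identify and select the right driver.

Identifying the driver: DriverManager uses the JDBC URL to find a suitable driver. A JDBC URL normally has the form jdbc:subprotocol:subname. In this example, the "calcite" in "jdbc:calcite:" is the subprotocol that identifies Calcite. DriverManager offers the URL to each registered driver in turn, and each driver decides whether it recognizes it.
Invoking the driver: DriverManager asks each candidate driver to establish the connection.
The driver must implement the connect method of java.sql.Driver. That method checks whether the URL is in a format the driver supports; if so, it creates and returns a Connection from the URL and properties, and otherwise it returns null so that the next driver can be tried.
Establishing the connection: if the driver accepts the URL (the same check that acceptsURL exposes), it uses the supplied configuration (the info object) to establish the connection. The application then interacts with the database through the returned Connection object.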
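The lookup described above can be observed with two toy drivers and only the JDK. This is a hedged sketch under stated assumptions: PrefixDriver, UrlDispatchDemo, resolve, and the prefixes "jdbc:alpha:" and "jdbc:beta:" are invented, and the Connection is a dynamic-proxy stub used only to avoid hand-writing every Connection method. It is not how Calcite builds its connections.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

public class UrlDispatchDemo {

    /** Toy driver that claims exactly one URL prefix (hypothetical). */
    static final class PrefixDriver implements Driver {
        private final String prefix;

        PrefixDriver(String prefix) {
            this.prefix = prefix;
        }

        @Override public boolean acceptsURL(String url) {
            return url != null && url.startsWith(prefix);
        }

        @Override public Connection connect(String url, Properties info) {
            if (!acceptsURL(url)) {
                return null; // not ours: DriverManager moves on to the next driver
            }
            // Stub Connection: only toString and close are supported
            return (Connection) Proxy.newProxyInstance(
                    UrlDispatchDemo.class.getClassLoader(),
                    new Class<?>[] {Connection.class},
                    (proxy, method, args) -> {
                        switch (method.getName()) {
                            case "toString": return "connection from " + prefix;
                            case "close": return null;
                            default: throw new UnsupportedOperationException(method.getName());
                        }
                    });
        }

        @Override public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }
        @Override public int getMajorVersion() { return 1; }
        @Override public int getMinorVersion() { return 0; }
        @Override public boolean jdbcCompliant() { return false; }
        @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException();
        }
    }

    static {
        // Register both toy drivers once, when this class is initialized
        try {
            DriverManager.registerDriver(new PrefixDriver("jdbc:alpha:"));
            DriverManager.registerDriver(new PrefixDriver("jdbc:beta:"));
        } catch (SQLException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Returns a label for the driver that accepted the URL, or "none". */
    public static String resolve(String url) {
        try (Connection c = DriverManager.getConnection(url, new Properties())) {
            return c.toString();
        } catch (SQLException e) {
            return "none"; // DriverManager found no suitable driver
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("jdbc:alpha:x")); // connection from jdbc:alpha:
        System.out.println(resolve("jdbc:beta:"));   // connection from jdbc:beta:
        System.out.println(resolve("jdbc:gamma:"));  // none
    }
}
```

The point of the sketch is that DriverManager itself does not parse the subprotocol. Each driver inspects the URL and either returns a Connection or returns null, which is why a prefix such as "jdbc:calcite:" is enough to route the call to the Calcite driver.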

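Recap

After these steps you hold a connection and can reach the data sources registered in Calcite through the JDBC interfaces. Building its interaction interface on JDBC gives Apache Calcite two benefits:

Familiarity: JDBC (Java Database Connectivity) is the standard API for communication between Java applications and databases and is used throughout Java development. Choosing JDBC lets Java developers pick up Calcite immediately, without learning a new interface or programming model.
Compatibility: through JDBC, Calcite integrates with existing tools and frameworks (Java application servers, database client tools, and so on), which typically support JDBC already.

Conclusion

Whether you need the Calcite JDBC driver depends on the scenario. In the query optimizer example above, you can go through the driver, or you can create the schema the optimizer needs directly with the Calcite API. When you run queries through the JDBC interfaces, you first have to obtain a connection object through the Calcite driver.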