Using the JDBC connector to connect to MapR Drill fails without any exception


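Question

I am building a Java application that reads data from the file system (Parquet) in a MapR cluster. I initially used Apache Spark, but processing was quite slow, so I decided to use the Drill JDBC connection approach.

I followed the MapR documentation: https://mapr.com/docs/52/Drill/Using-JDBC-Driver-App.html

Here is my code.

Step 1:
I placed the driver JAR in a lib folder in my project path, as shown:

[project directory][1]
[1]: https://i.stack.imgur.com/APsZi.png

Step 2:
I imported the JAR into my Maven pom.xml: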

<dependency>
    <groupId>DrillJDBC41</groupId>
    <artifactId>DrillJDBC41</artifactId>
    <scope>system</scope>
    <version>1.0</version>
    <systemPath>${project.basedir}\src\lib\DrillJDBC41.jar</systemPath>
</dependency>

Step 3:
My code implementation:

// Class-level constants (fields of the enclosing class)
private static final String CONNECTION_URL = "jdbc:drill:zk=192.168.1.1:31010/drill/dev.maprcluster.com-drillbits;schema=dfs";
private static final String JDBC_DRIVER = "com.mapr.drill.jdbc41.Driver";

// Inside the method that runs the query; declared before the try so the finally block can see them
Connection con = null;
Statement stmt = null;
ResultSet rs = null;

try {
    // Define a plain query
    String query = "SELECT * FROM `dfs.default`.`/storage/products/data/d/report/2019/07/12` where unique_key = '00209220' LIMIT 30";

    // Register the driver using the class name
    Class.forName(JDBC_DRIVER);
    
    try {
        System.out.println("about establishing connection");
        con = DriverManager.getConnection(CONNECTION_URL);
        System.out.println("connection established");
    } catch (Exception e) {
        System.out.println("EXCEPTION OO");
        e.printStackTrace();
    }
    
    stmt = con.createStatement();
    
    System.out.println("trying to execute query");
    
    rs = stmt.executeQuery(query);
    
    System.out.println("gotten result set");
    
    while(rs.next()) {
        System.out.println("data is returned");
    }
} catch (SQLException se) {
    System.out.println("sql exception");
    se.printStackTrace();
} catch (Exception e) {
    e.printStackTrace();
} finally {
    try {
        System.out.println("entered finally block");
        if (rs != null) {
            rs.close();
        }
        if (stmt != null) {
            stmt.close();
        }
        if (con != null) {
            con.close();
        }
    } catch (SQLException se1) {
        se1.printStackTrace();
    }
}

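Problem:

The application builds fine, but when I try to fetch data, it stops after printing "about establishing connection" and goes straight into the finally block without throwing any exception.

I am not sure what the problem is.

I also tried another implementation using Apache Drill: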

<dependency>
    <groupId>org.apache.drill.exec</groupId>
    <artifactId>drill-jdbc-all</artifactId>
    <version>1.1.0</version>
</dependency>

I changed the driver to:

private static final String JDBC_DRIVER_DRILL = "org.apache.drill.jdbc.Driver";

The problem remains.

No error is thrown.

Output:

about establishing connection

entered finally block

Update

As advised, I caught Throwable and got the following error:

java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()Lcom/google/common/base/Stopwatch;
	at org.apache.drill.common.config.DrillConfig.create(DrillConfig.java:189)
	at org.apache.drill.common.config.DrillConfig.create(DrillConfig.java:163)
	at org.apache.drill.common.config.DrillConfig.forClient(DrillConfig.java:114)
	at com.mapr.drill.drill.client.DRJDBCClient.openSession(Unknown Source)
	at com.mapr.drill.drill.client.DRJDBCClient.<init>(Unknown Source)
	at com.mapr.drill.drill.core.DRJDBCConnection.connect(Unknown Source)
	at com.mapr.drill.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
	at com.mapr.drill.jdbc.common.AbstractDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:270)
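
The stack trace above is consistent with the silent failure: NoSuchMethodError extends java.lang.Error rather than Exception, so neither `catch (SQLException ...)` nor `catch (Exception ...)` matches it, and only the finally block runs. Stopwatch.createStarted() was added in Guava 15.0, so this error typically points to an older Guava version on the classpath. The following is a minimal sketch of the catch behavior only; the class and method names are hypothetical, and `connect()` is a stand-in that simulates the failure rather than calling Drill.

```java
public class ErrorVsExceptionDemo {

    // Stand-in for DriverManager.getConnection(...) failing with a linkage error.
    static void connect() {
        throw new NoSuchMethodError("com.google.common.base.Stopwatch.createStarted()");
    }

    // Mirrors the question's structure: catch (Exception) does not match an Error.
    static String catchExceptionOnly() {
        StringBuilder log = new StringBuilder();
        try {
            try {
                connect();
                log.append("connected;");
            } catch (Exception e) {
                log.append("caught Exception;");
            } finally {
                log.append("finally;");
            }
        } catch (Throwable t) {
            // Present only so the demo can report what escaped the inner block.
            log.append("escaped: ").append(t.getClass().getSimpleName());
        }
        return log.toString();
    }

    // Catching Throwable makes the Error visible before finally runs.
    static String catchThrowable() {
        StringBuilder log = new StringBuilder();
        try {
            connect();
            log.append("connected;");
        } catch (Throwable t) {
            log.append("caught ").append(t.getClass().getSimpleName()).append(";");
        } finally {
            log.append("finally;");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(catchExceptionOnly()); // finally;escaped: NoSuchMethodError
        System.out.println(catchThrowable());     // caught NoSuchMethodError;finally;
    }
}
```

Catching Throwable is useful while diagnosing a problem like this one; in production code it is usually better to fix the dependency conflict than to swallow Errors.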

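Answer 1

Score: 0

Looking at this line,

`private static final String CONNECTION_URL = "jdbc:drill:zk=192.168.1.1:31010/drill/dev.maprcluster.com-drillbits;schema=dfs";`

The documentation you linked to presents it as an example of connecting to a ZooKeeper cluster, and you copied the same server address used in that example. There are many cluster managers, such as YARN and Mesos. Make sure you are using the correct connection string for the right manager in your cluster, and that the server addresses correspond to the actual addresses in your cluster.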
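
For reference, Drill JDBC URLs generally take one of two forms: a ZooKeeper-based URL (`zk=`), where the driver looks up an available Drillbit, or a direct URL (`drillbit=`) that targets a single Drillbit. The sketch below only assembles the strings. The helper names, host names, and ports are hypothetical: 31010 is commonly the Drillbit user port, while ZooKeeper commonly listens on 2181 (5181 on MapR), so check the values for your own cluster.

```java
public class DrillUrlExamples {

    // ZooKeeper-based URL: the driver asks ZooKeeper for an available Drillbit.
    static String zkUrl(String zkQuorum, String zkRoot, String clusterId, String schema) {
        return "jdbc:drill:zk=" + zkQuorum + "/" + zkRoot + "/" + clusterId + ";schema=" + schema;
    }

    // Direct URL: the driver connects to one Drillbit and does not use ZooKeeper.
    static String drillbitUrl(String host, int port, String schema) {
        return "jdbc:drill:drillbit=" + host + ":" + port + ";schema=" + schema;
    }

    public static void main(String[] args) {
        // Hypothetical hosts and ports; substitute the values from your cluster.
        System.out.println(zkUrl("zk1.example.com:5181", "drill", "dev.maprcluster.com-drillbits", "dfs"));
        System.out.println(drillbitUrl("drillbit1.example.com", 31010, "dfs"));
    }
}
```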

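Also, move the code that obtains the connection into a separate class and handle the logic in a single try-catch block, for the sake of separation of concerns, instead of nesting try-catch blocks in your service class.
This would let you test and debug more easily and understand what is happening.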
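
A minimal sketch of that separation, assuming a hypothetical helper class named DrillConnectionCheck: it opens and closes a connection in one try block and reports any Throwable, so linkage errors such as NoSuchMethodError are not lost. It uses only the standard java.sql API; the default URL is a placeholder, not a working address.

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class DrillConnectionCheck {

    // Attempts to open and immediately close a connection; returns a one-line report.
    static String check(String url) {
        try (Connection con = DriverManager.getConnection(url)) {
            return "OK: connected to " + url;
        } catch (Throwable t) {
            // Throwable, not Exception, so Errors raised inside the driver are reported too.
            return "FAILED: " + t.getClass().getName() + ": " + t.getMessage();
        }
    }

    public static void main(String[] args) {
        // Pass the JDBC URL as the first argument; the default below is a placeholder.
        String url = args.length > 0 ? args[0] : "jdbc:drill:zk=<zk-host>:<port>/drill/<cluster-id>";
        System.out.println(check(url));
    }
}
```

Running this class on its own, outside the service code, shows whether the failure comes from the driver and its dependencies or from the surrounding application.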

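Finally, since there is no stack trace to indicate the error, the right approach is to debug the connection class (assuming you have extracted it) and step through it to see what happens on that line. That would help you figure out what to do next.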
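
One way to follow up on the NoSuchMethodError from the question's update is to check which JAR a class was actually loaded from, since an unexpected location often reveals a conflicting dependency version. This is a sketch using only standard JDK calls; the class name ClassOrigin and the method describe are hypothetical.

```java
import java.security.CodeSource;

public class ClassOrigin {

    // Reports where the named class was loaded from, or why it could not be loaded.
    static String describe(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null || src.getLocation() == null
                    ? className + " -> loaded by the JDK itself (no code source)"
                    : className + " -> " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace; the output shows which JAR supplies it.
        System.out.println(describe("com.google.common.base.Stopwatch"));
    }
}
```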


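Answer 2

Score: 0

I love Apache Drill, but is this really the best way to read Parquet data if you are worried about speed?

AvroParquetReader might be a better approach for reading a Parquet-formatted file directly.

You can also use the ParquetFileReader class directly. There is an example here:

https://www.jofre.de/?p=1459

and here:

https://www.arm64.ca/post/reading-parquet-files-java/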


huangapple
  • Posted on July 23, 2020 at 16:32:27
  • When reposting, please keep the link to this article: https://java.coder-hub.com/63050111.html