java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type. when running go example using apache beam spark runner
Question
I want to run the grades example provided by the Apache Beam Go SDK using the Spark runner, on a Spark cluster with one master and two workers (Spark 2.4.5). I start the job server with ./gradlew :runners:spark:2:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077
and submit the pipeline with: grades -runner=spark -endpoint=localhost:8099 -job_name=gradetest
However, I get the following error:
Job state: RUNNING
2021/10/08 11:42:34 (): org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type.TypeBindings.emptyBindings()Lcom/fasterxml/jackson/databind/type/TypeBindings;
at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:73)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:104)
at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:92)
at org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:199)
I assume the problem comes from the Spark job server, because I tested another pipeline using Python and got the same error.
Any help would be greatly appreciated.
Answer 1
Score: 0
Sorry for the late answer, I was just looking into related issues.
Unfortunately, the Spark 2 job-server was broken when the Jackson version used in Beam was bumped. By default, Spark loads classes from its own system classpath first, and that classpath contains a very outdated version of Jackson that is no longer compatible with Beam.
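One way to confirm this kind of class-loading conflict on a cluster is to ask the JVM where it actually resolved a class from. The small standalone diagnostic below is not part of Beam or Spark; it just uses the standard `ProtectionDomain`/`CodeSource` API, with the class name from the stack trace as its default argument:

```java
// Diagnostic helper for classpath conflicts: prints which jar (or directory)
// the JVM loaded a given class from. On a Spark node, pointing this at
// com.fasterxml.jackson.databind.type.TypeBindings would typically reveal
// Spark's own bundled (outdated) jackson-databind jar rather than Beam's.
public class WhichJar {

    // Returns the location the class was loaded from, or a note that it came
    // from the JDK's own bootstrap/platform class loader (whose classes
    // report no CodeSource).
    public static String whichJar(Class<?> cls) {
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "bootstrap/platform classpath"
                           : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Default to the class from the NoSuchMethodError above.
        String name = args.length > 0 ? args[0]
                : "com.fasterxml.jackson.databind.type.TypeBindings";
        try {
            System.out.println(name + " -> " + whichJar(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath at all");
        }
    }
}
```

Run on the Spark driver or an executor host, this shows whether the Jackson classes come from Spark's jars directory or from the application's own dependencies.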
While there are workarounds for this when submitting Beam Java apps to Spark, these unfortunately don't work for the Spark job-server. See https://github.com/apache/beam/issues/23568 for more details.
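For context, the usual workaround for a plain Beam Java app submitted with spark-submit is Spark's user-classpath-first settings, which (as noted above) do not help the portable job-server. A sketch, with placeholder jar and main-class names:

```shell
# Prefer the application's own jars (carrying Beam's newer Jackson) over
# Spark's bundled, outdated ones. The jar and main class are placeholders.
# This helps spark-submit'ted Beam Java apps, but NOT the Spark job-server.
spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class org.example.MyBeamPipeline \
  my-beam-pipeline.jar
```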
Also, Spark 2 support has been deprecated in Beam.
I recommend running the job-server for Spark 3 instead:
./gradlew :runners:spark:3:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077