java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type. when running Go example using Apache Beam Spark runner

Question

I want to run the grades example provided by the Apache Beam Go SDK, using the Spark runner on a Spark cluster with one master and two workers (Spark 2.4.5). I start the job server with ./gradlew :runners:spark:2:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077 and run the pipeline with the following command: grades -runner=spark -endpoint=localhost:8099 -job_name=gradetest

However, I get the following error:

Job state: RUNNING
2021/10/08 11:42:34  (): org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type.TypeBindings.emptyBindings()Lcom/fasterxml/jackson/databind/type/TypeBindings;
   at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:73)
   at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:104)
   at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:92)
   at org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:199)

I assume the problem comes from the Spark job server, because I tested another pipeline using Python and got the same error.
Any help would be greatly appreciated.

Answer 1

Score: 0

Sorry for the late answer, I was just looking into related issues.

Unfortunately, the Spark 2 job server was broken when the Jackson version used in Beam was bumped. By default, Spark loads classes from its own system classpath first, and that classpath contains a very outdated version of Jackson that is no longer compatible with Beam.
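To confirm which Jackson jar actually wins on the classpath, a small reflection-based check can help. This is a generic diagnostic sketch (the WhichJar class and its describe helper are my own, not part of Beam or Spark); it reports where a class was loaded from and whether the no-arg method named in the stack trace exists:

```java
import java.security.CodeSource;

// Diagnostic sketch: for NoSuchMethodError, find which jar a class was
// actually loaded from and whether the expected no-arg method is present.
public class WhichJar {
    static String describe(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            // CodeSource (and its location) can be null for bootstrap classes.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String location = (src == null || src.getLocation() == null)
                    ? "<bootstrap/system classpath>"
                    : src.getLocation().toString();
            boolean hasMethod;
            try {
                cls.getMethod(methodName);
                hasMethod = true;
            } catch (NoSuchMethodException e) {
                hasMethod = false;
            }
            return className + " loaded from " + location
                    + (hasMethod ? " (has " : " (missing ") + methodName + "())";
        } catch (ClassNotFoundException e) {
            return className + " not on classpath";
        }
    }

    public static void main(String[] args) {
        // The class/method names come from the stack trace in the question.
        System.out.println(describe(
                "com.fasterxml.jackson.databind.type.TypeBindings",
                "emptyBindings"));
    }
}
```

Run inside the Spark driver or executor JVM (for example from a small Beam Java pipeline, or a trivial spark-submit job), this would typically point at Spark's bundled, older jackson-databind jar, which lacks emptyBindings().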

While there are workarounds for this when submitting Beam Java apps to Spark, they unfortunately don't work for the Spark job server. See https://github.com/apache/beam/issues/23568 for more details.

Also, Spark 2 support has been deprecated in Beam.

I recommend running the job server for Spark 3 instead:

./gradlew :runners:spark:3:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077

huangapple
  • Posted on 2021-10-08 19:07:25
  • When reposting, please keep the original link: https://java.coder-hub.com/69494877.html