EMR cluster hangs in Step state 'Running/Pending'
Question
I am launching an EMR cluster through the Java SDK with a custom JAR step. The cluster launches successfully, but after bootstrapping, while the step is in the Pending/Running state, the cluster gets stuck.
I am not even able to SSH into the machine.
Here is my code to launch the cluster with the custom JAR step:
```java
String dataTrasnferJar = "s3://test/testApplication.jar";
if (dataTrasnferJar == null || dataTrasnferJar.isEmpty())
    throw new InvalidS3ObjectException(
            "EMR custom jar file path is null/empty. Please provide a valid jar file path");

// Step that runs the custom JAR; the cluster keeps running if the step fails.
HadoopJarStepConfig customJarConfig = new HadoopJarStepConfig().withJar(dataTrasnferJar);
StepConfig customJarStep = new StepConfig("Mongo_to_S3_Data_Transfer", customJarConfig)
        .withActionOnFailure(ActionOnFailure.CONTINUE);

AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.standard()
        .withCredentials(awsCredentialsProvider)
        .withRegion(region)
        .build();

Application spark = new Application().withName("Spark");
String clusterName = "my-cluster-" + System.currentTimeMillis();

// Cluster definition: release label, applications, steps, roles, and instances.
RunJobFlowRequest request = new RunJobFlowRequest()
        .withName(clusterName)
        .withReleaseLabel("emr-6.0.0")
        .withApplications(spark)
        .withVisibleToAllUsers(true)
        .withSteps(customJarStep)
        .withLogUri(loggingS3Bucket)
        .withServiceRole("EMR_DefaultRole")
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withInstances(new JobFlowInstancesConfig()
                .withEc2KeyName(key_pair)
                .withInstanceCount(instanceCount)
                .withEc2SubnetIds(subnetId)
                .withAdditionalMasterSecurityGroups(securityGroup)
                .withKeepJobFlowAliveWhenNoSteps(true)
                .withMasterInstanceType(instanceType));

RunJobFlowResult result = emr.runJobFlow(request);
```
</details>
# Answer 1
**Score**: 0
The emr-6.0.0 release is still in development. Can you try the same with emr-5.29.0?
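When comparing release labels, it may help to put a time limit on how long the step is allowed to stay in a Pending/Running state, so a hang is reported instead of waited on indefinitely. The following is a minimal sketch of such a bounded poll using only the Java standard library. `StepWatcher`, `waitForStep`, and `isInProgress` are hypothetical names introduced here, not part of the AWS SDK, and the `Supplier<String>` is an assumption standing in for whatever call returns the current step state (for example, a wrapper around the EMR `DescribeStep` operation).

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper, not part of the AWS SDK.
final class StepWatcher {

    // EMR step states that mean the step has not finished yet.
    private static final Set<String> IN_PROGRESS =
            new HashSet<>(Arrays.asList("PENDING", "RUNNING", "CANCEL_PENDING"));

    private StepWatcher() {
    }

    /** Returns true if the given state means the step is still pending or running. */
    static boolean isInProgress(String state) {
        return IN_PROGRESS.contains(state);
    }

    /**
     * Polls the supplier until it reports a finished state or the timeout elapses.
     * Returns the last state seen; if that state is still in progress,
     * the step did not finish within the timeout and may be stuck.
     */
    static String waitForStep(Supplier<String> stateSupplier,
                              Duration timeout,
                              Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        String state = stateSupplier.get();
        while (isInProgress(state) && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return state;
            }
            state = stateSupplier.get();
        }
        return state;
    }
}
```

If the returned state is still `PENDING` or `RUNNING` after the timeout, the caller can log the cluster ID and check the logs under the configured log URI, rather than blocking forever. The timeout and poll interval are left to the caller because reasonable values depend on the job.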