How to skip empty rows using Spring Batch

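Question

I am reading a fixed-length flat file with Spring Batch, and I would like to skip empty rows and incorrect rows during batch processing. In the example below, I also want to skip rows that start with the characters "------".

This can be done with a skip policy or in other ways. Here is an example that uses a skip policy: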

import org.springframework.batch.core.Job;
import org.springframework.batch.core.SkipListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.step.skip.SkipPolicy;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.LineMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.FixedLengthTokenizer;
import org.springframework.batch.item.file.transform.Range;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

@Configuration
@EnableBatchProcessing
public class SpringBatchConfig {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   StepBuilderFactory stepBuilderFactory,
                   ItemReader<Aluno> itemReader,
                   ItemWriter<Aluno> itemWriter) {

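        Step step = stepBuilderFactory.get("ETL-file-load")
                .<Aluno, Aluno>chunk(100)
                .reader(itemReader)
                .writer(itemWriter)
                .faultTolerant()
                .skipPolicy(skipPolicy())
                // Attach the skip listener; declaring it as a bean does not register it on the step
                .listener(skipListener())
                .build();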

        return jobBuilderFactory.get("ETL-Load")
                .incrementer(new RunIdIncrementer())
                .start(step)
                .build();
    }

    @Bean
    @StepScope
    public FlatFileItemReader<Aluno> itemReader(@Value("${input}") Resource resource) {
        FlatFileItemReader<Aluno> flatFileItemReader = new FlatFileItemReader<>();
        flatFileItemReader.setResource(resource);
        flatFileItemReader.setName("CSV-Reader");
        flatFileItemReader.setLinesToSkip(2);
        flatFileItemReader.setLineMapper(lineMapper());
        return flatFileItemReader;
    }

    @Bean
    public LineMapper<Aluno> lineMapper() {
        DefaultLineMapper<Aluno> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer());
        lineMapper.setFieldSetMapper(new AlunoFieldSetMapper());
        return lineMapper;
    }

    @Bean
    public FixedLengthTokenizer tokenizer() {
        FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
        tokenizer.setNames("name", "studentId", "courseCode");
        tokenizer.setColumns(new Range(1, 40), new Range(41, 48), new Range(49, 55));
        return tokenizer;
    }

    @Bean
    public SkipPolicy skipPolicy() {
        return new CustomSkipPolicy();
    }

    @Bean
    public SkipListener<Aluno, Aluno> skipListener() {
        return new CustomSkipListener();
    }
}

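Note that the example above relies on a custom skip policy (CustomSkipPolicy) and a custom skip listener (CustomSkipListener), and that Aluno and AlunoFieldSetMapper are not shown either. You need to implement these classes to suit your own situation.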
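
Since CustomSkipPolicy is left unimplemented above, here is a rough sketch of the decision it would have to make, written in plain Java so it runs without Spring Batch on the classpath. The class name SkippableLines and its methods are hypothetical, not part of any library. In a real job the same check could sit behind a Spring Batch extension point such as a SkipPolicy or a RecordSeparatorPolicy, and FlatFileItemReader also has a setComments setter for line prefixes to ignore, which may already be enough for the "------" rows.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Spring Batch class): decides which raw lines to drop.
final class SkippableLines {

    // Assumption taken from the question: separator rows start with six dashes.
    static final String SEPARATOR_PREFIX = "------";

    private SkippableLines() {
    }

    // True for null, empty or whitespace-only lines, and for separator rows.
    static boolean isSkippable(String line) {
        if (line == null) {
            return true;
        }
        String trimmed = line.trim();
        return trimmed.isEmpty() || trimmed.startsWith(SEPARATOR_PREFIX);
    }

    // Returns only the lines that should reach the tokenizer, in their original order.
    static List<String> keep(List<String> lines) {
        return lines.stream()
                .filter(line -> !isSkippable(line))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "---------------------------A---------------------------",
                "",
                "AARON THIAGO LOPES                       3099234 100-11",
                " ");
        // Print whatever survives the filter, one line per row.
        keep(raw).forEach(System.out::println);
    }
}
```

Filtering on the raw text keeps the rule independent of the tokenizer. Rows that are non-empty but malformed (wrong length, for example) are not covered by this check and would still need a skip policy or an explicit validation step.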

English original:

I'm reading a fixed-length flat file with Spring Batch, and I would like to skip empty rows and incorrect rows during batch processing. In the example below, I also want to skip rows that start with the characters "------".

Could you please help me by giving an example that uses a skip policy or some other approach?

My file:

---------------------------A---------------------------

AARON THIAGO LOPES                       3099234 100-11
 
AARON PAPA DA SILVA                      8610822 160-26

ABNER MENEZEZ SOUZA                      1494778 500-35

EDSON EDUARD MOZART                      1286664 500-34 
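
To make the layout of these rows concrete, here is a small plain-Java sketch that slices one row by column position and rejects rows of the wrong length. It assumes the 55-character layout used in the example configuration (columns 1-40, 41-48 and 49-55, named name, studentId and courseCode). The class FixedWidthSketch is hypothetical and only imitates what a fixed-length tokenizer does; it is not the Spring Batch implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a fixed-length tokenizer, for illustration only.
final class FixedWidthSketch {

    // Assumed record length: the last column ends at position 55.
    static final int LINE_LENGTH = 55;

    // Field names and 1-based, inclusive column ranges (assumed from the example configuration).
    private static final String[] NAMES = {"name", "studentId", "courseCode"};
    private static final int[][] RANGES = {{1, 40}, {41, 48}, {49, 55}};

    private FixedWidthSketch() {
    }

    // Splits a row into trimmed fields; throws if the row does not have the expected length.
    static Map<String, String> parse(String line) {
        if (line == null || line.length() != LINE_LENGTH) {
            throw new IllegalArgumentException("Expected a row of " + LINE_LENGTH + " characters");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            // Convert the 1-based inclusive range to substring's 0-based, end-exclusive indices.
            String raw = line.substring(RANGES[i][0] - 1, RANGES[i][1]);
            fields.put(NAMES[i], raw.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Build a 55-character row: name padded to 41 columns, 7-digit id, a space, 6-character code.
        String row = String.format("%-41s", "AARON THIAGO LOPES") + "3099234" + " " + "100-11";
        System.out.println(parse(row));
    }
}
```

An empty row or a row with stray trailing characters fails the length check here, which is the kind of "incorrect row" a fault-tolerant step would then have to skip.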


//Method that reads the file.

@Configuration
@EnableBatchProcessing
public class SpringBatchConfig {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   StepBuilderFactory stepBuilderFactory,
                   ItemReader<Aluno> itemReader,
                   ItemWriter<Aluno> itemWriter){

        Step step = stepBuilderFactory.get("ETL-file-load")
                .<Aluno, Aluno>chunk(100)
                .reader(itemReader)
                .writer(itemWriter)
                .build();

        return jobBuilderFactory.get("ETL-Load")
                .incrementer(new RunIdIncrementer())
                .start(step)
                .build();
    }

    @Bean
    public FlatFileItemReader<Aluno> itemReader(@Value("${input}") Resource resource) {
        FlatFileItemReader<Aluno> flatFileItemReader = new FlatFileItemReader<>();
        flatFileItemReader.setResource(resource);
        flatFileItemReader.setName("CSV-Reader");
        flatFileItemReader.setLinesToSkip(2);
        flatFileItemReader.setLineMapper(lineMapper());
        return flatFileItemReader;
    }

    // lineMapper() and any remaining beans are not shown in the post.
}
