Writing Excel Data with Spring Batch

This article walks through an example of writing data to an Excel file with Spring Batch.

1. Introduction to Apache POI

Apache POI is a Java library for working with Microsoft Office documents. It supports Excel, Word, PowerPoint, and even Visio.

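This article focuses on Excel, specifically the current xlsx format. For writing xlsx files, POI provides SXSSF, a streaming API with a low memory footprint. A sliding-window size sets how many rows are kept in memory. Once the window is exceeded, the oldest rows are flushed to a temporary file on disk, and at the end the data is moved to the target file.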
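The sliding-window idea can be sketched in plain Java without POI. This is only a toy model under stated assumptions: RowWindow and SlidingWindowDemo are hypothetical names, rows are plain strings, and the spill file is a text file. It is not how POI implements SXSSF internally.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a row window. Hypothetical class, not POI's implementation. */
class RowWindow {
    private final int windowSize;
    private final Deque<String> inMemory = new ArrayDeque<>();
    private Path tempFile;
    private BufferedWriter spill;
    private int rowsOnDisk = 0;

    RowWindow(int windowSize) {
        this.windowSize = windowSize;
        try {
            this.tempFile = Files.createTempFile("rows-", ".tmp");
            this.spill = Files.newBufferedWriter(tempFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Adds a row; rows beyond the window are spilled to the temp file, oldest first. */
    void addRow(String row) {
        inMemory.addLast(row);
        while (inMemory.size() > windowSize) {
            spillRow(inMemory.removeFirst());
        }
    }

    /** Spills whatever is still in memory, then moves the temp file to the target. */
    void writeTo(Path target) {
        try {
            while (!inMemory.isEmpty()) {
                spillRow(inMemory.removeFirst());
            }
            spill.close();
            Files.move(tempFile, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int rowsInMemory() { return inMemory.size(); }

    int rowsOnDisk() { return rowsOnDisk; }

    private void spillRow(String row) {
        try {
            spill.write(row);
            spill.newLine();
            rowsOnDisk++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class SlidingWindowDemo {
    /** Writes 10 rows through a window of 3 and returns the line count of the target file. */
    static int run() {
        try {
            RowWindow window = new RowWindow(3);
            for (int i = 0; i < 10; i++) {
                window.addRow("row-" + i);
            }
            Path target = Files.createTempFile("books-", ".txt");
            window.writeTo(target);
            int lines = Files.readAllLines(target).size();
            Files.delete(target);
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

However many rows are added, at most windowSize of them are held in memory at once, which is the property that makes this approach suitable for large exports.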

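Spring Batch supports CSV files out of the box but not xlsx files. The Excel reader/writer component available on GitHub is fairly old and does not support the POI SXSSF API.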

2. Spring Batch SXSSF Example

To keep things simple, this example covers only writing an Excel file. The project is built with Spring Boot, and the main dependencies are:

dependencies {
    // 'compile' is deprecated in newer Gradle versions; 'implementation' matches the entries below
    implementation group: 'org.apache.poi', name: 'poi', version: '4.0.1'
    implementation group: 'org.apache.poi', name: 'poi-ooxml', version: '4.0.1'
    implementation group: 'org.apache.poi', name: 'poi-ooxml-schemas', version: '4.0.1'
    implementation "com.google.guava:guava:23.0"

    implementation 'org.springframework.boot:spring-boot-starter-batch'
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    compileOnly 'org.projectlombok:lombok'
    runtimeOnly 'com.h2database:h2'

    annotationProcessor 'org.projectlombok:lombok'
    testImplementation('org.springframework.boot:spring-boot-starter-test') {
        exclude group: 'org.junit.vintage', module: 'junit-vintage-engine'
    }
    testImplementation 'org.springframework.batch:spring-batch-test'
}

The H2 database and JPA are used to demonstrate reading the data from a database. The implementation follows.

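Suppose a bookstore has a book sales system that needs to fetch book order information and export the list of books to Excel. The JPA code below stands in for the bookstore's application logic: it defines a Book entity and a Repository for reading the data from the database.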

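// javax.persistence applies to Spring Boot 2.x; Spring Boot 3 uses jakarta.persistence instead
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data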
@Builder
@Entity
@AllArgsConstructor
@NoArgsConstructor
public class Book {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String title;
    private String author;
    private String isbn;
}

Lombok's @Data annotation generates toString, equals, hashCode, getters, and setters for all fields.
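For readers unfamiliar with Lombok, the following hand-written class approximates what those annotations save you from typing. PlainBook is a hypothetical name used only for this sketch, and the code Lombok actually generates differs in detail (for example in how the hash code is computed).

```java
import java.util.Objects;

/** Hand-written approximation of a @Data class; PlainBook is a hypothetical name. */
class PlainBook {
    private Integer id;
    private String title;
    private String author;
    private String isbn;

    // Roughly what @AllArgsConstructor provides
    PlainBook(Integer id, String title, String author, String isbn) {
        this.id = id;
        this.title = title;
        this.author = author;
        this.isbn = isbn;
    }

    // Getters and setters, one pair per field
    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }
    public String getIsbn() { return isbn; }
    public void setIsbn(String isbn) { this.isbn = isbn; }

    // Two books are equal when all of their fields are equal
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainBook)) return false;
        PlainBook other = (PlainBook) o;
        return Objects.equals(id, other.id)
                && Objects.equals(title, other.title)
                && Objects.equals(author, other.author)
                && Objects.equals(isbn, other.isbn);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, title, author, isbn);
    }

    @Override
    public String toString() {
        return "PlainBook(id=" + id + ", title=" + title
                + ", author=" + author + ", isbn=" + isbn + ")";
    }
}
```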

The repository definition:

// The ID type parameter must match the type of Book.id, which is Integer
public interface BookRepository extends PagingAndSortingRepository<Book, Integer> {

}

Next, create a RepositoryItemReader that reads the data through JPA, with the JPA repository enabled. The reader is defined in the configuration class:

    @Bean
    public ItemReader<Book> bookReader(BookRepository repository) {
        RepositoryItemReader<Book> reader = new RepositoryItemReader<>();
        reader.setRepository(repository);
        reader.setMethodName("findAll");
        reader.setPageSize(CHUNK);
        reader.setSort(singletonMap("id", ASC));
        return reader;
    }
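A paging reader of this kind fetches one page at a time and hands items out one by one, returning null once the data is exhausted (the ItemReader contract). The plain-Java sketch below models that behavior. PagedReader and PagedReaderDemo are hypothetical names, the "repository" is an in-memory list, and the real RepositoryItemReader also deals with sorting, state, and restarts that are left out here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

/** Simplified stand-in for a paging reader; not Spring Batch's implementation. */
class PagedReader<T> {
    // (pageIndex, pageSize) -> items of that page
    private final BiFunction<Integer, Integer, List<T>> fetchPage;
    private final int pageSize;
    private int nextPage = 0;
    private boolean lastPageSeen = false;
    private Iterator<T> buffer = Collections.emptyIterator();

    PagedReader(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    /** Returns the next item, or null when no data is left. */
    T read() {
        if (!buffer.hasNext() && !lastPageSeen) {
            List<T> page = fetchPage.apply(nextPage++, pageSize);
            // A short (or empty) page means there is nothing after it
            lastPageSeen = page.size() < pageSize;
            buffer = page.iterator();
        }
        return buffer.hasNext() ? buffer.next() : null;
    }
}

class PagedReaderDemo {
    /** Reads the numbers 1..total through a PagedReader and returns them in order. */
    static List<Integer> readAll(int total, int pageSize) {
        List<Integer> table = new ArrayList<>();
        for (int i = 1; i <= total; i++) {
            table.add(i);
        }
        PagedReader<Integer> reader = new PagedReader<>((page, size) -> {
            int from = Math.min(page * size, table.size());
            int to = Math.min(from + size, table.size());
            return table.subList(from, to);
        }, pageSize);

        List<Integer> result = new ArrayList<>();
        for (Integer item = reader.read(); item != null; item = reader.read()) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(readAll(7, 3));
    }
}
```

Setting the page size equal to the chunk size, as the reader bean does with CHUNK, means each chunk corresponds to one page fetched from the repository.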

The books are read so that they can be written to Excel. All of the data is held in a Workbook, which is defined as a bean:

@Bean
public SXSSFWorkbook workbook() {
    return new SXSSFWorkbook(CHUNK);
}

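The number of rows SXSSFWorkbook keeps in memory is set by the static constant CHUNK. The workbook is then used to create the sheet. BookWriter is declared here first and explained in detail later: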

    @Bean
    public ItemWriter<Book> bookWriter(SXSSFWorkbook workbook) {
        SXSSFSheet sheet = workbook.createSheet("Books");
        return new BookWriter(sheet);
    }

With the reader and writer in place, the step and job can be defined. The complete configuration class:

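// Static imports used by bookReader (the remaining imports are omitted here)
import static java.util.Collections.singletonMap;
import static org.springframework.data.domain.Sort.Direction.ASC;

@Configuration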
@EnableBatchProcessing
public class BatchConfiguration {
    private static final Integer CHUNK = 100;

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;

    @Autowired
    public BatchConfiguration(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
    }

    @Bean
    public SXSSFWorkbook workbook() {
        return new SXSSFWorkbook(CHUNK);
    }

    @Bean
    public Job job(Step step, JobExecutionListener listener) {
        return jobBuilderFactory.get("exportBooksToXlsx")
                .start(step)
                .listener(listener)
                .build();
    }

    @Bean
    public Step step(ItemReader<Book> reader, ItemWriter<Book> writer) {
        return stepBuilderFactory.get("export")
                .<Book, Book>chunk(CHUNK)
                .reader(reader)
                .writer(writer)
                .build();
    }

    @Bean
    public ItemReader<Book> bookReader(BookRepository repository) {
        RepositoryItemReader<Book> reader = new RepositoryItemReader<>();
        reader.setRepository(repository);
        reader.setMethodName("findAll");
        reader.setPageSize(CHUNK);
        reader.setSort(singletonMap("id", ASC));
        return reader;
    }

    @Bean
    public ItemWriter<Book> bookWriter(SXSSFWorkbook workbook) {
        SXSSFSheet sheet = workbook.createSheet("Books");
        return new BookWriter(sheet);
    }

    @Bean
    JobListener jobListener(SXSSFWorkbook workbook, BookRepository bookRepository) {
        return new JobListener(workbook, bookRepository);
    }
}
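The step above is chunk-oriented: items are read one at a time, collected until CHUNK items are available, and then passed to the writer as a list. The plain-Java sketch below models only that loop. ChunkLoop and ChunkLoopDemo are hypothetical names, and transactions, listeners, and error handling that Spring Batch wraps around each chunk are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Minimal model of a chunk-oriented step; not Spring Batch's implementation. */
class ChunkLoop {
    /**
     * Reads until the reader returns null and hands the items to the writer
     * in lists of at most chunkSize. Returns the number of chunks written.
     */
    static <T> int run(Supplier<T> reader, Consumer<List<T>> writer, int chunkSize) {
        int chunks = 0;
        List<T> chunk = new ArrayList<>();
        for (T item = reader.get(); item != null; item = reader.get()) {
            chunk.add(item);
            if (chunk.size() == chunkSize) {
                writer.accept(chunk);
                chunk = new ArrayList<>();
                chunks++;
            }
        }
        // The last chunk may be smaller than chunkSize
        if (!chunk.isEmpty()) {
            writer.accept(chunk);
            chunks++;
        }
        return chunks;
    }
}

class ChunkLoopDemo {
    /** Returns the size of every chunk the writer receives for the given input. */
    static List<Integer> chunkSizes(int totalItems, int chunkSize) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= totalItems; i++) {
            items.add(i);
        }
        Iterator<Integer> source = items.iterator();
        List<Integer> sizes = new ArrayList<>();
        ChunkLoop.<Integer>run(
                () -> source.hasNext() ? source.next() : null,
                chunk -> sizes.add(chunk.size()),
                chunkSize);
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(7, 3));
    }
}
```

The point to take away is that the writer is invoked once per chunk rather than once per job, so any state it needs between invocations has to live in the writer itself.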

BookWriter is explained in detail below. It creates one row per book and stores each property in its own column.

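public class BookWriter implements ItemWriter<Book> {
    private final Sheet sheet;
    // Next row to write; row 0 holds the header, and the counter must survive across chunks
    private int nextRowNumber = 1;

    public BookWriter(Sheet sheet) {
        this.sheet = sheet;
    }

    @Override
    public void write(List<? extends Book> list) {
        // write() is called once per chunk, so continue after the last written row
        for (Book book : list) {
            writeRow(nextRowNumber++, book);
        }
    }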

    private void writeRow(int currentRowNumber, Book book) {
        List<String> columns = prepareColumns(book);
        Row row = this.sheet.createRow(currentRowNumber);
        for (int i = 0; i < columns.size(); i++) {
            writeCell(row, i, columns.get(i));
        }
    }

    // Column order matches the header titles: id, title, author, isbn
    private List<String> prepareColumns(Book book) {
        return asList(
                book.getId().toString(),
                book.getTitle(),
                book.getAuthor(),
                book.getIsbn()
        );
    }

    private void writeCell(Row row, int currentColumnNumber, String value) {
        Cell cell = row.createCell(currentColumnNumber);
        cell.setCellValue(value);
    }
}
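Because write is called once per chunk, the row index has to continue across calls rather than restart for each chunk. The plain-Java sketch below contrasts the two numbering strategies using a map as a stand-in for the sheet. RowNumbering is a hypothetical name. With a real SXSSF sheet, re-creating a row that has already been flushed to disk may be rejected with an exception rather than silently overwritten.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Contrasts per-chunk and continuous row numbering; the map stands in for a sheet. */
class RowNumbering {
    /** Restarts at row 1 for every chunk, so later chunks overwrite earlier rows. */
    static Map<Integer, String> restartPerChunk(List<List<String>> chunks) {
        Map<Integer, String> sheet = new TreeMap<>();
        for (List<String> chunk : chunks) {
            for (int i = 0; i < chunk.size(); i++) {
                sheet.put(i + 1, chunk.get(i));
            }
        }
        return sheet;
    }

    /** Keeps one counter for the whole job, so every item gets its own row. */
    static Map<Integer, String> continueAcrossChunks(List<List<String>> chunks) {
        Map<Integer, String> sheet = new TreeMap<>();
        int nextRow = 1; // row 0 is reserved for the header
        for (List<String> chunk : chunks) {
            for (String item : chunk) {
                sheet.put(nextRow++, item);
            }
        }
        return sheet;
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(restartPerChunk(chunks));      // {1=c, 2=b}
        System.out.println(continueAcrossChunks(chunks)); // {1=a, 2=b, 3=c}
    }
}
```

The difference only shows up when the job processes more items than fit in one chunk, which is why a small test data set can hide it.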

JobExecutionListener is invoked automatically when the job's state changes. When the job completes, the data is saved to an xlsx file through a FileOutputStream:


@Slf4j
public class JobListener implements JobExecutionListener {
    private final SXSSFWorkbook workbook;
    private final BookRepository bookRepository;


    public JobListener(SXSSFWorkbook workbook, BookRepository bookRepository) {
        this.workbook = workbook;
        this.bookRepository = bookRepository;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        BatchStatus batchStatus = jobExecution.getStatus().getBatchStatus();
        String resName = (String)jobExecution.getExecutionContext().get("resourceName");
        assert resName != null;

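        if (batchStatus == COMPLETED) {
            // try-with-resources closes the stream even if write() fails
            try (FileOutputStream fileOutputStream = new FileOutputStream(resName + ".xlsx")) {
                workbook.write(fileOutputStream);
            } catch (IOException e) {
                log.error(e.getMessage(), e);
            } finally {
                // Delete the temporary files SXSSF created on disk
                workbook.dispose();
            }
        }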
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        initializeBooks(bookRepository);

        ExecutionContext context = jobExecution.getExecutionContext();
        context.put("titles", Arrays.asList("id","title","author","isbn"));
        context.put("resourceName", "book-info");

        headRow((List<String>) Objects.requireNonNull(context.get("titles")));
    }

    private void headRow(List<String> titles){
        Row row = workbook.getSheetAt(0).createRow(0);
        for (int i = 0; i < titles.size(); i++) {
            writeCell(row, i, titles.get(i));
        }
    }

    private void writeCell(Row row, int currentColumnNumber, String value) {
        Cell cell = row.createCell(currentColumnNumber);
        cell.setCellValue(value);
    }

    private void initializeBooks(BookRepository bookRepository) {
        Set<Book> books = new HashSet<>();
        Book.BookBuilder builder = Book.builder();
        books.add(builder.author("John Doe").title("Forbid tails").isbn("1111-111-111-111").build());
        books.add(builder.author("Mary Doe").title("Not found title").isbn("2222-222-222-222").build());
        bookRepository.saveAll(books);
    }
}

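The initializeBooks method adds sample data. The beforeJob method is called before the job starts and dynamically sets the Excel header titles and the name of the exported file.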

3. Summary

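This article used Spring Batch to read data from a database and write it to Excel. POI's SXSSF API keeps memory usage low, which keeps the system from falling over when large volumes of data are exported.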
