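Writing Excel Data with Spring Batch
This article shows by example how to write data to an Excel file with Spring Batch.
1. Introduction to Apache POI
Apache POI is a Java library for reading and writing Microsoft Office documents. It supports Excel, Word, PowerPoint, and even Visio.
This article focuses on Excel and its current file format, xlsx. POI provides a low-memory API (SXSSF) for writing xlsx files. A sliding-window setting specifies how many rows are kept in memory; once the window size is exceeded, the oldest rows are flushed to a temporary file on disk, and at the end the data is written to the target file.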
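The sliding-window idea can be illustrated without POI at all. The following is a toy model in plain Java, not POI's actual implementation: `SlidingWindowSketch` and its methods are hypothetical names, and rows are plain strings rather than spreadsheet rows.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a sliding row window (an assumption for illustration, not POI code).
final class SlidingWindowSketch implements AutoCloseable {
    private final int windowSize;
    private final Deque<String> window = new ArrayDeque<>();
    private final Path tempFile;
    private final BufferedWriter spill;
    private int flushedRows = 0;

    SlidingWindowSketch(int windowSize) throws IOException {
        this.windowSize = windowSize;
        this.tempFile = Files.createTempFile("rows-", ".tmp");
        this.spill = Files.newBufferedWriter(tempFile);
    }

    // Keep at most windowSize rows in memory; the oldest row goes to disk.
    void addRow(String row) throws IOException {
        window.addLast(row);
        if (window.size() > windowSize) {
            spill.write(window.removeFirst());
            spill.newLine();
            flushedRows++;
        }
    }

    int rowsInMemory() { return window.size(); }

    int rowsFlushed() { return flushedRows; }

    @Override
    public void close() throws IOException {
        spill.close();
        Files.deleteIfExists(tempFile);
    }

    // Adds totalRows rows and returns {rows still in memory, rows flushed to disk}.
    static int[] simulate(int windowSize, int totalRows) {
        try (SlidingWindowSketch rows = new SlidingWindowSketch(windowSize)) {
            for (int i = 0; i < totalRows; i++) {
                rows.addRow("row-" + i);
            }
            return new int[] {rows.rowsInMemory(), rows.rowsFlushed()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a window of 100 and 250 rows added, `simulate(100, 250)` leaves 100 rows in memory and 150 on disk, so memory use is bounded by the window size rather than by the size of the export.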
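Spring Batch supports CSV files, but it does not support xlsx files out of the box. The Excel read/write component available on GitHub is fairly old and does not support the POI SXSSF API.
2. Spring Batch SXSSF Example
To keep things simple, this example concentrates on writing an Excel file. The project is built with Spring Boot, and the main dependencies are: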
dependencies {
    // 'compile' was removed in Gradle 7; 'implementation' replaces it
    implementation group: 'org.apache.poi', name: 'poi', version: '4.0.1'
    implementation group: 'org.apache.poi', name: 'poi-ooxml', version: '4.0.1'
    implementation group: 'org.apache.poi', name: 'poi-ooxml-schemas', version: '4.0.1'
    implementation "com.google.guava:guava:23.0"
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    compileOnly 'org.projectlombok:lombok'
    runtimeOnly 'com.h2database:h2'
    annotationProcessor 'org.projectlombok:lombok'
    testImplementation('org.springframework.boot:spring-boot-starter-test') {
        exclude group: 'org.junit.vintage', module: 'junit-vintage-engine'
    }
    testImplementation 'org.springframework.batch:spring-batch-test'
}
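An H2 database and JPA are used to demonstrate reading data from a database. The implementation follows.
Suppose a bookstore has a book sales system and needs to export its list of book information to Excel. The bookstore's application logic is represented with JPA: a Book entity and a repository that reads the data from the database.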
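import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@Builder
@Entity
@AllArgsConstructor
@NoArgsConstructor
public class Book {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String title;
    private String author;
    private String isbn;
}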
Lombok's @Data annotation generates toString, equals, hashCode, and getters for all fields, plus setters for all non-final fields.
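To make the effect of @Data concrete, here is a hand-written approximation for a class with two fields. It is a sketch only: `PlainBook` is a hypothetical class, and the code Lombok really generates differs in detail (for example in how it computes the hash code).

```java
import java.util.Objects;

// Hand-written approximation of what @Data adds; illustrative, not Lombok output.
final class PlainBook {
    private Integer id;
    private String title;

    public Integer getId() { return id; }

    public void setId(Integer id) { this.id = id; }

    public String getTitle() { return title; }

    public void setTitle(String title) { this.title = title; }

    // Two instances are equal when all of their fields are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainBook)) return false;
        PlainBook other = (PlainBook) o;
        return Objects.equals(id, other.id) && Objects.equals(title, other.title);
    }

    // Must be consistent with equals: equal objects share a hash code.
    @Override
    public int hashCode() { return Objects.hash(id, title); }

    @Override
    public String toString() { return "PlainBook(id=" + id + ", title=" + title + ")"; }
}
```

Writing this by hand for every entity is repetitive, which is the reason the article relies on the annotation instead.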
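The repository definition:
// The ID type parameter must match the type of Book.id, which is Integer
public interface BookRepository extends PagingAndSortingRepository<Book, Integer> {
}
Next, create a RepositoryItemReader that reads the data through the JPA repository. The read step is defined in the configuration class: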
@Bean
public ItemReader<Book> bookReader(BookRepository repository) {
    RepositoryItemReader<Book> reader = new RepositoryItemReader<>();
    reader.setRepository(repository);
    reader.setMethodName("findAll");
    reader.setPageSize(CHUNK);
    reader.setSort(singletonMap("id", ASC));
    return reader;
}
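RepositoryItemReader fetches the data page by page rather than in one query, which is why it is given both a page size and a sort order: a stable order keeps consecutive pages from skipping or repeating rows. The plain-Java sketch below imitates that loop over an in-memory list of ids. `PagedReadSketch` and its methods are hypothetical names and are not Spring Data or Spring Batch APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Imitates page-by-page reading over a sorted data set (illustrative only).
final class PagedReadSketch {

    // Returns one page of ids in ascending order; an empty list means "no more data".
    static List<Integer> page(List<Integer> ids, int pageNumber, int pageSize) {
        List<Integer> sorted = new ArrayList<>(ids);
        Collections.sort(sorted);
        int from = Math.min(pageNumber * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    // Requests page 0, 1, 2, ... until an empty page comes back.
    static List<Integer> readAll(List<Integer> ids, int pageSize) {
        List<Integer> items = new ArrayList<>();
        for (int pageNumber = 0; ; pageNumber++) {
            List<Integer> current = page(ids, pageNumber, pageSize);
            if (current.isEmpty()) {
                break;
            }
            items.addAll(current);
        }
        return items;
    }
}
```

Only one page is held at a time in this loop, which mirrors the reason for setting the page size to the same value as the chunk size.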
The books are read so that they can be written to Excel, and all of that data is held in a Workbook, so we define one as a bean:
@Bean
public SXSSFWorkbook workbook() {
    return new SXSSFWorkbook(CHUNK);
}
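The number of rows SXSSFWorkbook keeps in memory is defined by the static constant CHUNK. The workbook is then used to create the sheet. We declare BookWriter first and explain it in detail later: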
@Bean
public ItemWriter<Book> bookWriter(SXSSFWorkbook workbook) {
    SXSSFSheet sheet = workbook.createSheet("Books");
    return new BookWriter(sheet);
}
With both the reader and the writer in place, we can define the step and the job. The complete configuration class:
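import static java.util.Collections.singletonMap;
import static org.springframework.data.domain.Sort.Direction.ASC;

import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.data.RepositoryItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
    // Used as chunk size, reader page size, and SXSSF row window size
    private static final int CHUNK = 100;
    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;

    @Autowired
    public BatchConfiguration(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
    }

    @Bean
    public SXSSFWorkbook workbook() {
        return new SXSSFWorkbook(CHUNK);
    }

    @Bean
    public Job job(Step step, JobExecutionListener listener) {
        return jobBuilderFactory.get("exportBooksToXlsx")
                .start(step)
                .listener(listener)
                .build();
    }

    @Bean
    public Step step(ItemReader<Book> reader, ItemWriter<Book> writer) {
        return stepBuilderFactory.get("export")
                .<Book, Book>chunk(CHUNK)
                .reader(reader)
                .writer(writer)
                .build();
    }

    @Bean
    public ItemReader<Book> bookReader(BookRepository repository) {
        RepositoryItemReader<Book> reader = new RepositoryItemReader<>();
        reader.setRepository(repository);
        reader.setMethodName("findAll");
        reader.setPageSize(CHUNK);
        reader.setSort(singletonMap("id", ASC));
        return reader;
    }

    @Bean
    public ItemWriter<Book> bookWriter(SXSSFWorkbook workbook) {
        SXSSFSheet sheet = workbook.createSheet("Books");
        return new BookWriter(sheet);
    }

    @Bean
    JobListener jobListener(SXSSFWorkbook workbook, BookRepository bookRepository) {
        return new JobListener(workbook, bookRepository);
    }
}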
Now for the BookWriter class in detail. It creates one row per book and stores each property in its own column.
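import static java.util.Arrays.asList;

import java.util.List;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.springframework.batch.item.ItemWriter;

public class BookWriter implements ItemWriter<Book> {
    private final Sheet sheet;
    // Next free row index; row 0 is reserved for the header row
    private int nextRowNumber = 1;

    public BookWriter(Sheet sheet) {
        this.sheet = sheet;
    }

    // Called once per chunk, so the row index must carry over between calls
    @Override
    public void write(List<? extends Book> list) {
        for (Book book : list) {
            writeRow(nextRowNumber++, book);
        }
    }

    private void writeRow(int currentRowNumber, Book book) {
        List<String> columns = prepareColumns(book);
        Row row = this.sheet.createRow(currentRowNumber);
        for (int i = 0; i < columns.size(); i++) {
            writeCell(row, i, columns.get(i));
        }
    }

    // Column order matches the header titles: id, title, author, isbn
    private List<String> prepareColumns(Book book) {
        return asList(
                book.getId().toString(),
                book.getTitle(),
                book.getAuthor(),
                book.getIsbn()
        );
    }

    private void writeCell(Row row, int currentColumnNumber, String value) {
        Cell cell = row.createCell(currentColumnNumber);
        cell.setCellValue(value);
    }
}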
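Spring Batch calls the writer once per chunk, so a row index derived from an item's position inside the chunk restarts at every call, while a counter kept in the writer continues across calls. The plain-Java sketch below compares the two numbering schemes without any POI or Spring dependency; `RowNumberingSketch` and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Compares two ways of assigning sheet row indexes to chunked items (row 0 = header).
final class RowNumberingSketch {

    // Index taken from the position inside each chunk: restarts at 1 for every chunk.
    static List<Integer> perChunkIndexes(int totalItems, int chunkSize) {
        List<Integer> rows = new ArrayList<>();
        for (int start = 0; start < totalItems; start += chunkSize) {
            int size = Math.min(chunkSize, totalItems - start);
            for (int i = 0; i < size; i++) {
                rows.add(i + 1);
            }
        }
        return rows;
    }

    // Index taken from a counter that survives between chunks: every row is unique.
    static List<Integer> runningIndexes(int totalItems, int chunkSize) {
        List<Integer> rows = new ArrayList<>();
        int next = 1;
        for (int start = 0; start < totalItems; start += chunkSize) {
            int size = Math.min(chunkSize, totalItems - start);
            for (int i = 0; i < size; i++) {
                rows.add(next++);
            }
        }
        return rows;
    }
}
```

For five items in chunks of two, the per-chunk scheme yields the indexes 1, 2, 1, 2, 1, so later chunks would target rows that were already written, whereas the running counter yields 1 through 5.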
JobExecutionListener is called automatically when the job's state changes. When the job completes, the data is saved to an xlsx file through a FileOutputStream:
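import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import lombok.extern.slf4j.Slf4j;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.item.ExecutionContext;

@Slf4j
public class JobListener implements JobExecutionListener {
    private final SXSSFWorkbook workbook;
    private final BookRepository bookRepository;

    public JobListener(SXSSFWorkbook workbook, BookRepository bookRepository) {
        this.workbook = workbook;
        this.bookRepository = bookRepository;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        String resName = (String) jobExecution.getExecutionContext().get("resourceName");
        assert resName != null;
        try {
            if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
                // try-with-resources closes the stream even if write() throws
                try (FileOutputStream fileOutputStream = new FileOutputStream(resName + ".xlsx")) {
                    workbook.write(fileOutputStream);
                }
            }
        } catch (IOException e) {
            log.error(e.getMessage(), e);
        } finally {
            // Remove the temporary files SXSSF flushed to disk
            workbook.dispose();
        }
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        initializeBooks(bookRepository);
        ExecutionContext context = jobExecution.getExecutionContext();
        // Header titles, in the same order as the columns written by BookWriter
        context.put("titles", Arrays.asList("id", "title", "author", "isbn"));
        // Base name of the export file; ".xlsx" is appended in afterJob
        context.put("resourceName", "books");
        headRow((List<String>) Objects.requireNonNull(context.get("titles")));
    }

    // Writes the header into row 0 of the first sheet
    private void headRow(List<String> titles) {
        Row row = workbook.getSheetAt(0).createRow(0);
        for (int i = 0; i < titles.size(); i++) {
            writeCell(row, i, titles.get(i));
        }
    }

    private void writeCell(Row row, int currentColumnNumber, String value) {
        Cell cell = row.createCell(currentColumnNumber);
        cell.setCellValue(value);
    }

    // Inserts sample data so the job has something to export
    private void initializeBooks(BookRepository bookRepository) {
        Set<Book> books = new HashSet<>();
        Book.BookBuilder builder = Book.builder();
        books.add(builder.author("John Doe").title("Forbid tails").isbn("1111-111-111-111").build());
        books.add(builder.author("Mary Doe").title("Not found title").isbn("2222-222-222-222").build());
        bookRepository.saveAll(books);
    }
}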
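The initializeBooks method adds sample data. The beforeJob method is called before the job starts; it sets the Excel header titles and the name of the export file dynamically.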
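The header titles set in beforeJob and the cell values produced by the writer have to follow the same column order, yet they are defined in two different classes. One way to keep them from drifting apart is to define the columns in a single place. The following is a minimal sketch under that assumption; `ColumnLayoutSketch` and `BookRow` are hypothetical names and are not part of the project above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Derives header titles and cell values from one ordered column definition.
final class ColumnLayoutSketch {

    // Hypothetical stand-in for the Book entity.
    static final class BookRow {
        final Integer id;
        final String title;
        final String author;
        final String isbn;

        BookRow(Integer id, String title, String author, String isbn) {
            this.id = id;
            this.title = title;
            this.author = author;
            this.isbn = isbn;
        }
    }

    // LinkedHashMap preserves insertion order, so the column order is fixed once here.
    static final Map<String, Function<BookRow, String>> COLUMNS = new LinkedHashMap<>();

    static {
        COLUMNS.put("id", book -> String.valueOf(book.id));
        COLUMNS.put("title", book -> book.title);
        COLUMNS.put("author", book -> book.author);
        COLUMNS.put("isbn", book -> book.isbn);
    }

    // Titles for the header row, in column order.
    static List<String> header() {
        return new ArrayList<>(COLUMNS.keySet());
    }

    // Cell values for one data row, in the same column order as header().
    static List<String> values(BookRow book) {
        List<String> cells = new ArrayList<>();
        for (Function<BookRow, String> column : COLUMNS.values()) {
            cells.add(column.apply(book));
        }
        return cells;
    }
}
```

With this layout, adding or reordering a column is a change in one map, and both the header and the data rows pick it up.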
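3. Summary
This article used Spring Batch to read data from a database and write it to Excel. Using POI's SXSSF API keeps memory usage low, so large data volumes do not bring the system down.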