Using MySQL as the example database, this post walks through several efficient ways to batch-insert data and compares them by simulating 100,000 rows.
1. Environment configuration
application.yml:
server:
  port: 8086
spring:
  application:
    name: batch
  jpa:
    database: mysql
    show-sql: true
    properties:
      hibernate:
        dialect: org.hibernate.dialect.MySQL5InnoDBDialect
        generate_statistics: true
        jdbc:
          batch_size: 500
          batch_versioned_data: true
        order_inserts: true
        order_updates: true
  datasource:
    url: jdbc:mysql://localhost:3306/hr?rewriteBatchedStatements=true&serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8&useSSL=true&allowMultiQueries=true
    username: root
    password: ****
    driver-class-name: com.mysql.cj.jdbc.Driver
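The `rewriteBatchedStatements=true` URL parameter is the key setting for MySQL: it lets Connector/J collapse a JDBC batch of single-row INSERTs into one multi-row INSERT, cutting round trips drastically. A minimal sketch of that rewriting (illustration only — `RewriteSketch` is a made-up name, and the real rewriting happens inside the driver):

```java
class RewriteSketch {
    /** Mimics how the driver rewrites a batched single-row INSERT
     *  into one multi-row statement when rewriteBatchedStatements=true. */
    static String rewrite(String valuesClause, int rows) {
        StringBuilder sb = new StringBuilder("INSERT INTO t_user (id, name, age) VALUES ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) sb.append(',');
            sb.append(valuesClause);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three batched parameter sets become a single statement:
        System.out.println(rewrite("(?,?,?)", 3));
        // INSERT INTO t_user (id, name, age) VALUES (?,?,?),(?,?,?),(?,?,?)
    }
}
```

Without this parameter, `executeBatch()` still sends one network round trip per row, and none of the approaches below gets its full speedup.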
Test data:
private final List<User> userList = new ArrayList<>();

public void inits() {
    // simulate 100,000 rows
    for (int i = 0; i < 100000; i++) {
        User user = new User();
        user.setAge(i);
        user.setId(String.valueOf(i));
        user.setName("name" + i);
        userList.add(user);
    }
}
2. Comparing several batch-insert approaches
(1) JPA's saveAll method
JPA's saveAll is far too slow here: even with the configuration above, 100,000 rows take about 23 s. A likely culprit is that with manually assigned ids, Spring Data's save() considers every entity already persistent and calls merge(), issuing an extra SELECT per row.
(2) EntityManager's persist method
@PersistenceContext
private EntityManager em;
private static final int BATCH_SIZE = 10000;
/**
 * Batch insert; relies on the batch configuration above.
 * 100,000 rows took 2549 ms.
 * @param list
 */
@Transactional(rollbackFor = Exception.class)
public void batchInsertWithEntityManager(List<T> list){
    Iterator<T> iterator = list.iterator();
    int index = 0;
    while (iterator.hasNext()){
        em.persist(iterator.next());
        index++;
        if (index % BATCH_SIZE == 0){
            em.flush();
            em.clear();
        }
    }
    // flush the final, partial chunk
    if (index % BATCH_SIZE != 0){
        em.flush();
        em.clear();
    }
}
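The flush/clear cadence in this loop is what keeps the persistence context from growing without bound: after every BATCH_SIZE persists, pending inserts are pushed to the driver and the first-level cache is emptied, with one trailing flush for any remainder. A standalone sketch of that cadence, using a counter in place of a real EntityManager (`FlushCadence` is a made-up name for illustration):

```java
class FlushCadence {
    /** Returns how many flush() calls the loop in batchInsertWithEntityManager
     *  performs for a list of the given size, including the trailing flush
     *  for a partial final chunk. */
    static int flushCount(int listSize, int batchSize) {
        int flushes = 0;
        int index = 0;
        for (int i = 0; i < listSize; i++) {       // em.persist(...)
            index++;
            if (index % batchSize == 0) flushes++; // em.flush(); em.clear();
        }
        if (index % batchSize != 0) flushes++;     // trailing partial chunk
        return flushes;
    }

    public static void main(String[] args) {
        System.out.println(flushCount(100000, 10000)); // 10 full chunks
        System.out.println(flushCount(100001, 10000)); // 10 full + 1 partial = 11
    }
}
```

Skipping the clear() would keep all 100,000 managed entities in memory for the whole transaction; skipping the trailing flush would silently drop the last partial chunk.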
Result:
@Test
public void testBatchInsert(){
long saveStart = System.currentTimeMillis();
batchDao.batchInsertWithEntityManager(userList);
long saveEnd = System.currentTimeMillis();
System.out.println("the save total time is "+(saveEnd-saveStart)+" ms"); //the save total time is 2549 ms
userDao.deleteAllInBatch(); // single statement, bulk delete
}
(3) JdbcTemplate's batchUpdate method
@Autowired
private JdbcTemplate jdbcTemplate;
/**
 * Batch insert via JdbcTemplate.batchUpdate; the SQL is hand-written,
 * and configuration (the rewriteBatchedStatements URL parameter) is still required.
 * @param list
 */
public void batchWithJDBCTemplate(List<User> list){
    String sql = "INSERT INTO t_user (id, name, age) VALUES (?, ?, ?)";
jdbcTemplate.batchUpdate(sql,new BatchPreparedStatementSetter() {
@Override
public void setValues(PreparedStatement ps, int i) throws SQLException {
ps.setString(1,list.get(i).getId());
ps.setString(2,list.get(i).getName());
ps.setInt(3,list.get(i).getAge());
}
@Override
public int getBatchSize() {
return list.size();
}
});
}
Result:
@Test
public void testBatchWithJDBC(){
long saveStart = System.currentTimeMillis();
batchDao.batchWithJDBCTemplate(userList);
long saveEnd = System.currentTimeMillis();
System.out.println("the save total time is "+(saveEnd-saveStart)+" ms"); // the save total time is 1078 ms
userDao.deleteAllInBatch(); // single statement, bulk delete
}
(4) Native JDBC (raw SQL)
/**
 * Execute the batch with raw JDBC; no JPA configuration needed.
 * Resources are closed via try-with-resources, and any SQLException
 * propagates so the caller can react (the original silently swallowed it).
 * @param list
 */
public void batchWithNativeSql(List<User> list) throws SQLException {
    String sql = "INSERT INTO t_user (id, name, age) VALUES (?, ?, ?)";
    DataSource dataSource = jdbcTemplate.getDataSource();
    try (Connection connection = dataSource.getConnection();
         PreparedStatement ps = connection.prepareStatement(sql)) {
        connection.setAutoCommit(false);
        final int batchSize = 10000;
        int count = 0;
        for (User user : list) {
            ps.setString(1, user.getId());
            ps.setString(2, user.getName());
            ps.setInt(3, user.getAge());
            ps.addBatch();
            count++;
            // flush a full chunk, or the final partial chunk
            if (count % batchSize == 0 || count == list.size()) {
                ps.executeBatch();
                ps.clearBatch();
            }
        }
        connection.commit();
    }
}
Result:
@Test
public void testBatchWithNativeSql() throws SQLException {
long saveStart = System.currentTimeMillis();
batchDao.batchWithNativeSql(userList);
long saveEnd = System.currentTimeMillis();
System.out.println("the save total time is "+(saveEnd-saveStart)+" ms"); // the save total time is 899 ms
userDao.deleteAllInBatch();
}
3. Conclusion
Timings for 100,000 rows: saveAll ≈ 23 s; EntityManager persist with flush/clear ≈ 2549 ms; JdbcTemplate.batchUpdate ≈ 1078 ms; native JDBC batch ≈ 899 ms.
Both JdbcTemplate's batchUpdate and the native JDBC approach handle large volumes well; the former needs the configuration described above, while the latter does not. For small batches, pick whichever fits your needs.
Tip: if the configuration seems to have no effect, re-check application.yml first; note also that Hibernate silently disables JDBC batching for entities whose ids use IDENTITY generation (the manually assigned String ids used here are batch-friendly). Also check whether the table has triggers; if it does, talk to the owners — consider extracting the trigger's logic into explicit batch inserts into the tables involved.