Multithreading + JPA batch insert: generating 1,000,000 rows of test data in three minutes

I. Environment and scenario
II. Implementation steps
III. Notes and lessons learned
IV. Problems encountered
V. References
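I. Environment and scenario
 1. Project environment
  SpringBoot -- 1.5.9.RELEASE
  JDK -- 1.8
  Data source -- Druid
  Database -- MySQL
 2. Scenario
  Generate a large volume of data in a short time, for example 1,000,000 rows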
  
II. Implementation steps

1. pom file

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>
 <parent>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-parent</artifactId>
  <version>1.5.9.RELEASE</version>
  <relativePath/>
 </parent>
 <groupId>com.example</groupId>
 <artifactId>demo</artifactId>
 <version>0.0.1-SNAPSHOT</version>
 <name>demo</name>
 <description>Demo project for Spring Boot</description>

 <properties>
  <java.version>1.8</java.version>
 </properties>

 <dependencies>
  <dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-data-jpa</artifactId>
  </dependency>
  <dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-jdbc</artifactId>
  </dependency>
  <dependency>
   <groupId>com.alibaba</groupId>
   <artifactId>druid</artifactId>
   <version>1.0.31</version>
  </dependency>
  <dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-devtools</artifactId>
   <optional>true</optional>
  </dependency>
  <dependency>
   <groupId>mysql</groupId>
   <artifactId>mysql-connector-java</artifactId>
  </dependency>
  <dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-configuration-processor</artifactId>
   <optional>true</optional>
  </dependency>
  <dependency>
   <groupId>org.projectlombok</groupId>
   <artifactId>lombok</artifactId>
   <optional>true</optional>
  </dependency>
 </dependencies>
 <build>
  <plugins>
   <plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
   </plugin>
  </plugins>
 </build>
</project>

2. Data source configuration
2.1 Table DDL -- the primary key must be auto-increment

CREATE TABLE `t_emp` (
  `Id` int(11) NOT NULL AUTO_INCREMENT,
  `Name` varchar(255) DEFAULT NULL,
  `JoinTime` datetime DEFAULT NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

2.2 application.yml configuration

spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource      # data source type in use
  jpa:
    hibernate:
      naming:
        physical-strategy: org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
#   show-sql: true
    properties:
      hibernate:
        jdbc:
          batch_size: 800
          batch_versioned_data: true
        order_inserts: true
2.3 db.properties configuration

druid.driver-class-name=org.gjt.mm.mysql.Driver
druid.url=jdbc:mysql://localhost:33306/clouddb01?characterEncoding=utf-8&useSSL=false
druid.username=root
druid.password=xxxx
druid.minIdle=10
druid.testWhileIdle=false
druid.initialSize=300
druid.maxWait=200
druid.maxActive=300

2.4 Data source configuration class

package com.example.databatch.config;

import com.alibaba.druid.pool.DruidDataSource;
import lombok.Data;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;
import org.springframework.stereotype.Component;
import javax.sql.DataSource;

@Configuration
public class DatasourceConfig {
 @Autowired
 private DataSourceProperties dbProperties;

 @Bean
 public DataSource druid() {
  DruidDataSource dataSource = new DruidDataSource();
  dataSource.setUsername(dbProperties.getUsername());
  dataSource.setPassword(dbProperties.getPassword());
  dataSource.setDriverClassName(dbProperties.getDriverClassName());
  dataSource.setUrl(dbProperties.getUrl());
  dataSource.setMinIdle(dbProperties.getMinIdle());
  dataSource.setInitialSize(dbProperties.getInitialSize());
  dataSource.setMaxWait(dbProperties.getMaxWait());
  dataSource.setTestWhileIdle(dbProperties.isTestWhileIdle());
  dataSource.setMaxActive(dbProperties.getMaxActive());
  return dataSource;
 }
}

@Data
@Component
@PropertySource(value = "classpath:db.properties")
@ConfigurationProperties(prefix = "druid")
class DataSourceProperties {
 private String driverClassName;
 private String url;
 private String username;
 private String password;
 private Integer minIdle;
 private Integer initialSize;
 private Integer maxWait;
 private boolean testWhileIdle;
 private Integer maxActive;
}

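3. Entity class and business code
3.1 Entity class

package com.example.databatch.entities;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import java.util.Date;

/**
 * @ClassName Employee
 * @Description Employee record used for the batch insert
 * @Date 2020/2/20 12:30
 **/
@Data
@Entity
@ToString
@AllArgsConstructor
@NoArgsConstructor
@Table(name = "t_emp")
public class Employee {

 @Id
 @Column(name = "Id")
 @GeneratedValue(strategy = GenerationType.IDENTITY)
 private Integer id;

 @Column(name = "Name")
 private String name;

 // Works together with the Hibernate naming strategy set in application.yml:
 // org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
 @Column(name = "JoinTime")
 private Date joinTime;

 // Convenience constructor for new rows: the id stays null so the database generates it
 public Employee(String name, Date joinTime) {
  this.name = name;
  this.joinTime = joinTime;
 }
}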

3.2 Repository interface

package com.example.databatch.repository;

import com.example.databatch.entities.Employee;
import org.springframework.data.jpa.repository.JpaRepository;

public interface EmpRepository extends JpaRepository<Employee, Integer> {}

3.3 Random utility class

package com.example.databatch.utils;

import java.util.Random;

/**
 * @ClassName RandomUtil
 * @Date 2020/2/20 22:38
 **/
public class RandomUtil {
  private static final Random random = new Random();

  // Generates a random name: one uppercase letter followed by lowercase letters
  public static synchronized String generateName(){
    int length = random.nextInt(5) + 3; // name length, 3 to 7
    char[] chars = new char[length];
    for (int j = 0; j < length; j++){
      int charR = random.nextInt(26) + 'a'; // 26 letters, so 'z' can be produced too
      if (j == 0){
        chars[j] = (char) (charR - 32); // uppercase the first letter
      }else{
        chars[j] = (char) charR;
      }
    }
    return new String(chars);
  }

  public static void main(String[] args) {
    // Print a few sample names
    for (int i = 0; i < 10; i++)
      System.out.println(generateName());
  }
}
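
generateName() above is synchronized, so all 100 worker threads take turns on a single lock while generating names. A minimal sketch of a lock-free alternative based on java.util.concurrent.ThreadLocalRandom follows. The class name ThreadLocalNameGenerator is a placeholder of mine, not part of the original project, and I have not measured how much difference it makes in this test.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical lock-free variant of RandomUtil (the class name is an assumption)
class ThreadLocalNameGenerator {
    // Returns a name of 3 to 7 letters: one uppercase letter followed by lowercase letters
    static String generateName() {
        ThreadLocalRandom random = ThreadLocalRandom.current(); // per-thread generator, no shared lock
        int length = random.nextInt(3, 8); // upper bound is exclusive, so 3..7
        char[] chars = new char[length];
        chars[0] = (char) ('A' + random.nextInt(26));
        for (int i = 1; i < length; i++) {
            chars[i] = (char) ('a' + random.nextInt(26));
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // Print a few sample names
        for (int i = 0; i < 5; i++) {
            System.out.println(generateName());
        }
    }
}
```

Because each thread uses its own generator, the method needs no synchronized keyword.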

3.4 The main test service class

package com.example.databatch.service;

import com.example.databatch.entities.Employee;
import com.example.databatch.repository.EmpRepository;
import com.example.databatch.utils.RandomUtil;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Service;

import javax.annotation.PostConstruct;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * @ClassName DataBatchService
 * @Date 2020/2/20 22:26
 **/
@Service
@Slf4j
public class DataBatchService {
  // Incremented by many worker threads at once, so it must be atomic
  private final AtomicInteger threadId = new AtomicInteger(1);
  private int batchSize = 1000;
  @Autowired
  private EmpRepository empRepository;

  @PostConstruct
  public void init() {
    ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
    threadPoolTaskExecutor.setCorePoolSize(100);
    threadPoolTaskExecutor.setMaxPoolSize(100);
    threadPoolTaskExecutor.setKeepAliveSeconds(3000);
    threadPoolTaskExecutor.setQueueCapacity(500);
    // The thread pool must be initialized before use!
    threadPoolTaskExecutor.initialize();
    // 100 threads
    for (int i = 0; i < 100; i++) {
      Callable<String> callable = new Callable<String>() {
        @Override
        public String call() throws Exception {
          produceData();
          return "";
        }
      };
      threadPoolTaskExecutor.submit(callable);
    }
  }

  public void produceData() {
    log.info("-- thread-- " + threadId.getAndIncrement() + " is working");
    List<Employee> list = new ArrayList<Employee>();
    // Each thread produces 10,000 rows
    for (int i = 0; i < 10000; i++) {
      try {
        // The id is null so that the database generates it
        Employee emp = new Employee(null, RandomUtil.generateName(), new Date());
        list.add(emp);
        if (list.size() % batchSize == 0) {
          log.info("current list size {}", list.size());
          empRepository.save(list);
          empRepository.flush();
          //Thread.sleep(3);
          list.clear();
          log.info("---current list size {}", list.size());
        }
      } catch (Exception e) {
        log.error("An error occurred", e);
      }
    }
  }
}
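
The plan in the service above is 100 threads × 10,000 rows each, saved in batches of 1,000, which comes to 1,000 save calls for 1,000,000 rows. The sketch below models only that bookkeeping with a plain ExecutorService. The repository is replaced by counters, and BatchPlanSketch, saveBatch, produce and run are hypothetical names of mine. It also shows one possible way to wait for all workers and time the run, which the service itself does not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-alone model of DataBatchService: the "repository" is just a pair of counters
class BatchPlanSketch {
    static final AtomicLong rowsWritten = new AtomicLong();
    static final AtomicLong batchesWritten = new AtomicLong();

    // Stand-in for empRepository.save(list): records how many rows were "written"
    static void saveBatch(List<String> batch) {
        rowsWritten.addAndGet(batch.size());
        batchesWritten.incrementAndGet();
    }

    // Produces rowsPerThread rows and hands them over in chunks of batchSize
    static void produce(int rowsPerThread, int batchSize) {
        List<String> buffer = new ArrayList<>(batchSize);
        for (int i = 0; i < rowsPerThread; i++) {
            buffer.add("row-" + i);
            if (buffer.size() == batchSize) {
                saveBatch(buffer);
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) { // remainder when rowsPerThread is not a multiple of batchSize
            saveBatch(buffer);
        }
    }

    // Runs the workers, blocks until all of them are done, and returns the total rows written
    static long run(int threads, int rowsPerThread, int batchSize) {
        rowsWritten.set(0);
        batchesWritten.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> produce(rowsPerThread, batchSize));
        }
        pool.shutdown(); // accept no new tasks; the submitted ones still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return rowsWritten.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long rows = run(100, 10_000, 1_000);
        long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(rows + " rows in " + batchesWritten.get() + " batches, " + millis + " ms");
    }
}
```

For the real inserts to finish within three minutes, the database has to sustain roughly 5,600 rows per second overall (1,000,000 rows / 180 seconds).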

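3.5 Main application class

package com.example.databatch;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import java.io.IOException;

@SpringBootApplication
public class DataBatchApplication {
  public static void main(String[] args) throws IOException {
    SpringApplication.run(DataBatchApplication.class, args);
    // Keep the main thread blocked; this works around the BeanCreationNotAllowedException described later
    while(true) {
      System.in.read();
    }
  }
}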

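III. Notes and lessons learned
1. When the table name differs from the entity class name, add @Table(name = "t_emp") to the entity class.
2. When a column name differs from the entity field name, add @Column(name = "column name") to the field.
Note: if the database column names are not in "word1_word2" format, the naming strategy must also be set in the configuration file, as done with spring.jpa.hibernate.naming.physical-strategy in application.yml above [Reference 11].
3. To map a MySQL json column with JPA / Hibernate, map it directly to a String field in the entity, or see:
How to map MySQL json format to JPA / Hibernate
4. Also check the maximum number of connections configured on the database server (max_connections in MySQL). The Druid pool above uses initialSize=300 and maxActive=300, so the server has to allow at least that many connections.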
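
To illustrate the note under item 2: as far as I understand, Spring Boot's default physical naming strategy turns camel-case names into lower-case snake case, so @Column(name = "JoinTime") would be resolved as join_time unless PhysicalNamingStrategyStandardImpl is configured. The sketch below is a simplified approximation of that conversion, not the actual Spring implementation. NamingSketch and toSnakeCase are hypothetical names.

```java
// Simplified model of camel case to snake case conversion; all names here are hypothetical
class NamingSketch {
    // Inserts '_' between a lowercase letter and a following uppercase letter, then lowercases everything
    static String toSnakeCase(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (i > 0 && Character.isUpperCase(c) && Character.isLowerCase(name.charAt(i - 1))) {
                out.append('_');
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("JoinTime -> " + toSnakeCase("JoinTime")); // prints "JoinTime -> join_time"
        System.out.println("Name -> " + toSnakeCase("Name"));         // prints "Name -> name"
    }
}
```

With the t_emp table above, the column is literally named JoinTime, so any query generated against join_time depends on how the database treats column name case and underscores. Setting the standard strategy avoids the question by using the names exactly as written in @Column.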

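IV. Problems encountered
1. BeanCreationNotAllowedException: Singleton bean creation not allowed while singletons of this facto
These two posts pointed me in the right direction: https://blog.csdn.net/VICTOR_fusheng/article/details/79095991?utm_source=distribute.pc_relevant.none-task and https://blog.csdn.net/kyq_1024yahoocn/article/details/70155422
Solution: add the following code to the main method of the main application class:
while(true) {
  System.in.read();
}
2. For composite primary keys, see [Reference 9].
3. Does JPA run a select before an insert? [Reference 3]
 While testing I found that JPA really did issue a select first. Because I had not yet set a generation strategy on the primary key, it first looked up the record by the primary key marked on the entity to guarantee uniqueness, and only then inserted the new record. The serialization involved inevitably costs a lot of CPU, and if the primary key marked on the entity does not match the one defined in the database, the inserts get slower and slower as the row count grows. This is a major pitfall!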
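
My understanding of why the select appears: Spring Data JPA's save() treats an entity whose id is null as new and persists it directly, while an entity that already carries an id goes through merge, which loads the row by primary key first. The toy model below only illustrates that decision and counts the resulting statements. Every class and field in it is hypothetical, and it calls no Spring or Hibernate API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the persist-or-merge decision; all names are hypothetical
class SaveDecisionSketch {
    static class Row {
        Integer id;
        String name;
        Row(Integer id, String name) { this.id = id; this.name = name; }
    }

    private final Map<Integer, Row> table = new HashMap<>();
    private int nextId = 1;
    int selectCount = 0;
    int insertCount = 0;

    // A row without an id is treated as new and inserted directly;
    // a row that already has an id is looked up first (the extra SELECT)
    Row save(Row row) {
        if (row.id == null) {
            row.id = nextId++;       // models the database generating the key
            table.put(row.id, row);
            insertCount++;
        } else {
            selectCount++;           // models the lookup by primary key
            if (!table.containsKey(row.id)) {
                insertCount++;       // not found, so it is inserted after all
            }
            table.put(row.id, row);
        }
        return row;
    }

    public static void main(String[] args) {
        SaveDecisionSketch repo = new SaveDecisionSketch();
        repo.save(new Row(null, "Generated")); // no select
        repo.save(new Row(5000, "Assigned"));  // one select, then insert
        System.out.println("selects=" + repo.selectCount + ", inserts=" + repo.insertCount);
        // prints "selects=1, inserts=2"
    }
}
```

This matches what I saw: once the entity used @GeneratedValue and new rows were created without an id, the extra select per row went away.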

V. References
1. Computing permutations and combinations in Java https://blog.csdn.net/qq_21808961/article/details/77990940
2. JPA batch insert into MySQL https://www.jianshu.com/p/c50b1e233ef9
3. Does JPA run a select before an insert? https://blog.csdn.net/weixin_30867015/article/details/95404588
4. MySQL maximum connections and maximum single write size https://blog.csdn.net/superit401/article/details/97757273
5. Enabling batch insert and batch update in Spring Data JPA https://www.cnblogs.com/blog5277/p/10661096.html
6. Configuring and using thread pools in Spring Boot https://blog.csdn.net/mdw5521/article/details/79446075
7. Detailed thread pool configuration in Spring Boot https://blog.csdn.net/YoungLee16/article/details/84184056
8. JPA composite primary keys https://blog.csdn.net/u014268482/article/details/81027274
9. Using Spring Boot JPA and defining multiple primary keys https://blog.csdn.net/xx326664162/article/details/80053719
10. Permutations and combinations in Java (easy to follow) https://www.cnblogs.com/zzlback/p/10947064.html
11. Spring Data JPA naming strategies in Spring Boot https://www.cnblogs.com/lovechengyu/p/8032039.html
12. MySQL: viewing connection count, status, and maximum concurrency (recommended) https://www.cnblogs.com/haciont/p/6277675.html
