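Multithreading + JPA Batch Insert: Generating 1 Million Rows of Test Data in Three Minutes
I. Environment and Scenario
II. Implementation Steps
III. Notes and Lessons Learned
IV. Problems Encountered
V. References
I. Environment and Scenario
1. Project environment
Spring Boot -- 1.5.9.RELEASE
JDK -- 1.8
Data source -- Druid
Database -- MySQL
2. Scenario
Generate a large volume of data in a short time, for example 1 million rows
II. Implementation Steps
1. pom file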
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.9.RELEASE</version>
        <relativePath/>
    </parent>
    <groupId>com.example</groupId>
    <artifactId>demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>demo</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jdbc</artifactId>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <version>1.0.31</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
2. Data source configuration
2.1 Table creation statement -- note the auto-increment primary key
CREATE TABLE `t_emp` (
  `Id` int(11) NOT NULL AUTO_INCREMENT,
  `Name` varchar(255) DEFAULT NULL,
  `JoinTime` datetime DEFAULT NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
2.2 application.yml configuration
spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource # data source implementation to use
  jpa:
    hibernate:
      naming:
        physical-strategy: org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
    # show-sql: true
    properties:
      hibernate:
        jdbc:
          batch_size: 800
          batch_versioned_data: true
        order_inserts: true
2.3 db.properties configuration
druid.driver-class-name=org.gjt.mm.mysql.Driver
druid.url=jdbc:mysql://localhost:33306/clouddb01?characterEncoding=utf-8&useSSL=false
druid.username=root
druid.password=xxxx
druid.minIdle=10
druid.testWhileIdle=false
druid.initialSize=300
druid.maxWait=200
druid.maxActive=300
2.4 Data source configuration class
package com.example.databatch.config;

import com.alibaba.druid.pool.DruidDataSource;
import lombok.Data;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;
import org.springframework.stereotype.Component;

import javax.sql.DataSource;

@Configuration
public class DatasourceConfig {
    @Autowired
    private DataSourceProperties dbProperties;

    // Builds the Druid pool from the values bound from db.properties
    @Bean
    public DataSource druid() {
        DruidDataSource dataSource = new DruidDataSource();
        dataSource.setUsername(dbProperties.getUsername());
        dataSource.setPassword(dbProperties.getPassword());
        dataSource.setDriverClassName(dbProperties.getDriverClassName());
        dataSource.setUrl(dbProperties.getUrl());
        dataSource.setMinIdle(dbProperties.getMinIdle());
        dataSource.setInitialSize(dbProperties.getInitialSize());
        dataSource.setMaxWait(dbProperties.getMaxWait());
        dataSource.setTestWhileIdle(dbProperties.isTestWhileIdle());
        dataSource.setMaxActive(dbProperties.getMaxActive());
        return dataSource;
    }
}

// Binds every "druid.*" key in db.properties to the field of the same name
@Data
@Component
@PropertySource(value = "classpath:db.properties")
@ConfigurationProperties(prefix = "druid")
class DataSourceProperties {
    private String driverClassName;
    private String url;
    private String username;
    private String password;
    private Integer minIdle;
    private Integer initialSize;
    private Integer maxWait;
    private boolean testWhileIdle;
    private Integer maxActive;
}
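3. Entity class and business code
3.1 Entity class
package com.example.databatch.entities;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import java.util.Date;

/**
 * @ClassName Employee
 * @Description Employee data for batch insertion
 * @Date 2020/2/20 12:30
 **/
@Data
@Entity
@ToString
@AllArgsConstructor
@NoArgsConstructor
@Table(name = "t_emp")
public class Employee {
    @Id
    @Column(name = "Id")
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;
    @Column(name = "Name")
    private String name;
    // Works together with the Hibernate naming strategy set in application.yml:
    // org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
    @Column(name = "JoinTime")
    private Date joinTime;
}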
3.2 Repository interface
package com.example.databatch.repository;
import com.example.databatch.entities.Employee;
import org.springframework.data.jpa.repository.JpaRepository;
public interface EmpRepository extends JpaRepository<Employee, Integer> {}
3.3 Random utility class
package com.example.databatch.utils;

import java.util.Random;

/**
 * @ClassName RandomUtil
 * @Date 2020/2/20 22:38
 **/
public class RandomUtil {
    private static final Random random = new Random();

    // Returns a random name of 3 to 7 letters with a capitalized first letter
    public static synchronized String generateName() {
        int length = random.nextInt(5) + 3; // name length
        char[] chars = new char[length];
        for (int j = 0; j < length; j++) {
            // 26 letters: nextInt(25) would never produce 'z'
            int lower = random.nextInt(26) + 'a';
            if (j == 0) {
                chars[j] = (char) (lower - 32); // upper-case first letter
            } else {
                chars[j] = (char) lower;
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // Print a few samples to check the output by eye
        for (int i = 0; i < 10; i++) {
            System.out.println(generateName());
        }
    }
}
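With 100 worker threads calling `generateName()`, the `synchronized` keyword makes every call wait on a single lock. A minimal alternative sketch, assuming only the JDK, uses `ThreadLocalRandom` so each thread draws from its own generator. The class name `NameGenerator` is hypothetical and not part of the original project.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical lock-free variant of RandomUtil.generateName()
class NameGenerator {
    // Returns a capitalized name of 3 to 7 letters
    static String generateName() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        int length = random.nextInt(3, 8); // upper bound is exclusive: 3..7
        char[] chars = new char[length];
        chars[0] = (char) random.nextInt('A', 'Z' + 1);
        for (int i = 1; i < length; i++) {
            chars[i] = (char) random.nextInt('a', 'z' + 1);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(generateName());
        }
    }
}
```

Whether the lock matters in practice depends on the workload; here the database round trips most likely dominate, so this is a tidy-up rather than a guaranteed speed-up.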
3.4 Main test service class
package com.example.databatch.service;

import com.example.databatch.entities.Employee;
import com.example.databatch.repository.EmpRepository;
import com.example.databatch.utils.RandomUtil;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Service;

import javax.annotation.PostConstruct;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * @ClassName DataBatchService
 * @Date 2020/2/20 22:26
 **/
@Service
@Slf4j
public class DataBatchService {
    // Shared by all worker threads, so a plain int++ would race
    private final AtomicInteger threadId = new AtomicInteger(1);
    private final int batchSize = 1000;
    @Autowired
    private EmpRepository empRepository;

    @PostConstruct
    public void init() {
        ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
        threadPoolTaskExecutor.setCorePoolSize(100);
        threadPoolTaskExecutor.setMaxPoolSize(100);
        threadPoolTaskExecutor.setKeepAliveSeconds(3000);
        threadPoolTaskExecutor.setQueueCapacity(500);
        // The thread pool must be initialized before use!
        threadPoolTaskExecutor.initialize();
        // Submit 100 tasks, one per thread
        for (int i = 0; i < 100; i++) {
            Callable<Void> task = () -> {
                produceData();
                return null;
            };
            threadPoolTaskExecutor.submit(task);
        }
    }

    public void produceData() {
        log.info("-- thread-- {} is working", threadId.getAndIncrement());
        List<Employee> list = new ArrayList<>();
        // Each thread produces 10,000 rows
        for (int i = 0; i < 10000; i++) {
            try {
                // The id stays null so the database assigns it (IDENTITY strategy)
                Employee emp = new Employee(null, RandomUtil.generateName(), new Date());
                list.add(emp);
                if (list.size() % batchSize == 0) {
                    log.info("current list size {}", list.size());
                    empRepository.save(list);
                    empRepository.flush();
                    //Thread.sleep(3);
                    list.clear();
                    log.info("---current list size {}", list.size());
                }
            } catch (Exception e) {
                log.error("An error occurred", e);
            }
        }
    }
}
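The arithmetic behind the service is 100 threads × 10,000 rows = 1,000,000 rows, written in 100 × 10 = 1,000 batches of 1,000. The sketch below reproduces only that partitioning with plain JDK classes so it can be run without a database. `CountingSink` and `BatchPlanDemo` are hypothetical names, and the sink merely counts what a real repository would save. Unlike the service above, it also flushes a final partial batch, which matters as soon as the row count per thread is not a multiple of the batch size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the repository: it only counts rows and batches
class CountingSink {
    final AtomicLong rows = new AtomicLong();
    final AtomicLong batches = new AtomicLong();

    void saveBatch(List<String> batch) {
        rows.addAndGet(batch.size());
        batches.incrementAndGet();
    }
}

class BatchPlanDemo {
    // Runs `threads` workers; each produces `rowsPerThread` rows and hands them
    // to the sink in chunks of at most `batchSize`. Blocks until all are done.
    static CountingSink run(int threads, int rowsPerThread, int batchSize) {
        CountingSink sink = new CountingSink();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                List<String> buffer = new ArrayList<>(batchSize);
                for (int i = 0; i < rowsPerThread; i++) {
                    buffer.add("row-" + i);
                    if (buffer.size() == batchSize) {
                        sink.saveBatch(new ArrayList<>(buffer));
                        buffer.clear();
                    }
                }
                if (!buffer.isEmpty()) {
                    sink.saveBatch(buffer); // flush the final partial batch
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sink;
    }

    public static void main(String[] args) {
        CountingSink sink = run(100, 10_000, 1_000);
        System.out.println(sink.rows.get() + " rows in " + sink.batches.get() + " batches");
    }
}
```

One caveat worth checking against the Hibernate documentation: it states that JDBC-level insert batching is disabled for entities that use an IDENTITY generator, so with the mapping in section 3.1 the `batch_size` setting may have less effect than expected.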
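3.5 Main application class
package com.example.databatch;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import java.io.IOException;

@SpringBootApplication
public class DataBatchApplication {
    public static void main(String[] args) throws IOException {
        SpringApplication.run(DataBatchApplication.class, args);
        // Block the main thread so the JVM stays alive while the workers run
        // (see problem 1 in section IV)
        while (true) {
            System.in.read();
        }
    }
}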
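The `while (true)` loop keeps the process alive by blocking the main thread on standard input indefinitely, so the program never exits on its own. A possible alternative, sketched here with JDK classes only, is to block until the workers report completion through a `CountDownLatch`. `WaitForWorkers` and `runAndWait` are hypothetical names, and the task body is a placeholder for `produceData()`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class WaitForWorkers {
    // Starts `workers` tasks and blocks until every one of them has finished.
    // Returns how many tasks ran to completion.
    static int runAndWait(int workers) {
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(workers);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    completed.incrementAndGet(); // placeholder for produceData()
                } finally {
                    done.countDown(); // always signal, even if the task fails
                }
            });
        }
        try {
            done.await(); // the caller waits here instead of on System.in
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndWait(8) + " workers finished");
    }
}
```

In the Spring application the same idea would mean waiting on the latch before the context is allowed to close; where exactly to place that wait is a design choice this sketch does not settle.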
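III. Notes and Lessons Learned
1. When the table name differs from the entity class name, add @Table(name = "t_emp") to the entity class.
2. When a column name differs from the entity field name, add @Column(name = "column name") to that field.
Note: if the database column names are not in the "word1_word2" format, a naming strategy must be added to the configuration file [Reference 11]; this is the physical-strategy setting in the application.yml of section 2.2.
3. To map a MySQL JSON column in JPA / Hibernate, map it directly to a String field holding the JSON text in the entity class, or see:
"How to map the MySQL JSON format to JPA / Hibernate"
4. Also check the maximum number of connections the database was configured with at installation.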
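Item 4 comes down to a simple inequality: the pool sizes of all application instances together must stay below the server's connection limit. A minimal sketch of that check follows. `PoolSizeCheck`, `fits`, and the `reserved` margin are hypothetical, and the value 151 is the `max_connections` default given in the MySQL documentation, which a real server may override.

```java
class PoolSizeCheck {
    // True when every application instance can open its full pool without
    // exceeding the server limit. `reserved` connections are kept free for
    // admin tools and other clients (a hypothetical safety margin).
    static boolean fits(int maxActive, int instances, int serverMaxConnections, int reserved) {
        return (long) maxActive * instances <= (long) serverMaxConnections - reserved;
    }

    public static void main(String[] args) {
        // druid.maxActive=300 from db.properties against a default MySQL server
        System.out.println(fits(300, 1, 151, 5)); // prints false
        // A pool sized to the 100 worker threads
        System.out.println(fits(100, 1, 151, 5)); // prints true
    }
}
```

If the check fails, either raise `max_connections` on the server or lower `druid.maxActive` and `druid.initialSize`; 100 worker threads cannot use more than 100 connections at once in any case.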
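IV. Problems Encountered
1. BeanCreationNotAllowedException: Singleton bean creation not allowed while singletons of this facto
Two posts pointed me toward the fix: https://blog.csdn.net/VICTOR_fusheng/article/details/79095991?utm_source=distribute.pc_relevant.none-task and https://blog.csdn.net/kyq_1024yahoocn/article/details/70155422
Solution: add the following code to the main method of the main application class:
while(true) {
    System.in.read();
}
2. For configuring a composite primary key, see [Reference 9]
3. Does JPA run a select before an insert? [Reference 3]
In my tests, JPA did run a select first. Because I had not yet set a generation strategy on the primary key, JPA tried to guarantee primary key uniqueness by first looking up the record by the primary key marked in the entity class, and only then inserted the new record. The lookup inevitably involves serialization, which consumes a great deal of CPU, and if the marked primary key does not match the one defined in the database, it gets slower and slower as the row count grows. This is a major pitfall!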
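A likely explanation for the extra select, offered here as an assumption rather than something the tests above prove: Spring Data JPA's default `save()` calls `EntityManager.persist` when the entity looks new and `EntityManager.merge` otherwise, and by default an entity with a non-null wrapper-type id is not considered new. `merge` has to load the row before it can decide to insert. The toy model below captures only that decision; `SaveDecisionModel` and its methods are hypothetical and do not call any JPA API.

```java
class SaveDecisionModel {
    enum Action { PERSIST, MERGE }

    // Models the default "is this entity new?" rule for a wrapper-type id:
    // a null id means new, anything else means the row may already exist.
    static Action decide(Integer id) {
        return id == null ? Action.PERSIST : Action.MERGE;
    }

    // SQL statements the model predicts for saving one row that is not yet in
    // the table: persist issues only an INSERT, merge issues a SELECT first.
    static int statementsForNewRow(Integer id) {
        return decide(id) == Action.PERSIST ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println(decide(null) + " -> " + statementsForNewRow(null) + " statement(s)");
        System.out.println(decide(7) + " -> " + statementsForNewRow(7) + " statement(s)");
    }
}
```

Under this model, leaving the id null and letting `@GeneratedValue` assign it keeps every save on the insert-only path, which matches the fix described above.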
V. References
1. Computing permutations and combinations in Java https://blog.csdn.net/qq_21808961/article/details/77990940
2. JPA batch insert into MySQL https://www.jianshu.com/p/c50b1e233ef9
3. Does JPA run a select before an insert? https://blog.csdn.net/weixin_30867015/article/details/95404588
4. MySQL maximum connections and maximum single write size https://blog.csdn.net/superit401/article/details/97757273
5. Enabling batch insert and batch update in Spring Data JPA https://www.cnblogs.com/blog5277/p/10661096.html
6. Configuring and using thread pools in Spring Boot https://blog.csdn.net/mdw5521/article/details/79446075
7. Detailed thread pool configuration in Spring Boot https://blog.csdn.net/YoungLee16/article/details/84184056
8. JPA composite primary keys https://blog.csdn.net/u014268482/article/details/81027274
9. Using Spring Boot JPA and setting multiple primary keys https://blog.csdn.net/xx326664162/article/details/80053719
10. Permutations and combinations in Java (easy to follow) https://www.cnblogs.com/zzlback/p/10947064.html
11. Spring Data JPA naming strategies in Spring Boot https://www.cnblogs.com/lovechengyu/p/8032039.html
12. Viewing MySQL connection count, status, and maximum concurrency (recommended) https://www.cnblogs.com/haciont/p/6277675.html