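DataX support for PostgreSQL update
Introduction to DataX
DataX is an offline data synchronization tool/platform widely used within Alibaba Group. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SqlServer, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS.
Supporting incremental PostgreSQL update
We use DataX and want it to support incremental data import into PostgreSQL. Repository: https://gitee.com/cecotw/DataX
Link: https://pan.baidu.com/s/1mbEvLsDZZNWMYrTTTeYkAw Password: v97c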
Modify PostgresqlWriter.java
Remove the restriction:
Modify WriterUtil.java
Add the PostgreSQL insert-statement conversion:
public static String getWriteTemplate(List<String> columnHolders, List<String> valueHolders, String writeMode, DataBaseType dataBaseType, boolean forceUseUpdate) {
    boolean isWriteModeLegal = writeMode.trim().toLowerCase().startsWith("insert")
            || writeMode.trim().toLowerCase().startsWith("replace")
            || writeMode.trim().toLowerCase().startsWith("update");
    if (!isWriteModeLegal) {
        throw DataXException.asDataXException(DBUtilErrorCode.ILLEGAL_VALUE,
                String.format("The writeMode you configured (%s) is invalid. DataX currently supports only replace, update, or insert. Please check your configuration and correct it.", writeMode));
    }
    // && writeMode.trim().toLowerCase().startsWith("replace")
    String writeDataSqlTemplate;
    if (forceUseUpdate ||
            ((dataBaseType == DataBaseType.MySql || dataBaseType == DataBaseType.Tddl) && writeMode.trim().toLowerCase().startsWith("update"))
    ) {
        // This update form (ON DUPLICATE KEY UPDATE) is used only with MySQL
        writeDataSqlTemplate = new StringBuilder()
                .append("INSERT INTO %s (").append(StringUtils.join(columnHolders, ","))
                .append(") VALUES(").append(StringUtils.join(valueHolders, ","))
                .append(")")
                .append(onDuplicateKeyUpdateString(columnHolders))
                .toString();
    } else {
        if (dataBaseType == DataBaseType.PostgreSQL) {
            // PostgreSQL: plain INSERT followed by an ON CONFLICT clause built from writeMode
            writeDataSqlTemplate = new StringBuilder().append("INSERT INTO %s (")
                    .append(StringUtils.join(columnHolders, ","))
                    .append(") VALUES(").append(StringUtils.join(valueHolders, ","))
                    .append(")").append(onConFlictDoString(writeMode, columnHolders)).toString();
        } else {
            // Safeguard: if another database type mistakenly uses update, fall back to replace
            if (writeMode.trim().toLowerCase().startsWith("update")) {
                writeMode = "replace";
            }
            writeDataSqlTemplate = new StringBuilder().append(writeMode)
                    .append(" INTO %s (").append(StringUtils.join(columnHolders, ","))
                    .append(") VALUES(").append(StringUtils.join(valueHolders, ","))
                    .append(")").toString();
        }
    }
    return writeDataSqlTemplate;
}
Add the onConFlictDoString method:
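public static String onConFlictDoString(String conflict, List<String> columnHolders) {
    String mode = conflict.trim();
    // Only update mode needs an ON CONFLICT clause; other modes produce a plain INSERT.
    if (!mode.toLowerCase().startsWith("update")) {
        return "";
    }
    // Strip the leading "update" keyword to get the conflict target, e.g. "(seq,userid)".
    String target = mode.substring("update".length()).trim();
    StringBuilder sb = new StringBuilder();
    sb.append(" ON CONFLICT ");
    sb.append(target);
    sb.append(" DO ");
    if (columnHolders == null || columnHolders.size() < 1) {
        sb.append("NOTHING");
        return sb.toString();
    }
    sb.append("UPDATE SET ");
    boolean first = true;
    for (String column : columnHolders) {
        if (!first) {
            sb.append(",");
        } else {
            first = false;
        }
        // excluded.<column> refers to the value from the row that failed to insert
        sb.append(column);
        sb.append("=excluded.");
        sb.append(column);
    }
    return sb.toString();
}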
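To make the generated clause concrete, here is a minimal standalone sketch of the clause-building logic for update mode. The class name OnConflictDemo and the method name onConflictClause are hypothetical and exist only for this illustration; it uses JDK classes only and is not the code in WriterUtil.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the real method lives in WriterUtil.
class OnConflictDemo {

    // Builds " ON CONFLICT <target> DO UPDATE SET ..." for a writeMode such as "update (seq,userid)".
    static String onConflictClause(String writeMode, List<String> columns) {
        // Drop the leading "update" keyword, keeping the conflict target.
        String target = writeMode.trim().substring("update".length()).trim();
        StringBuilder sb = new StringBuilder(" ON CONFLICT ").append(target).append(" DO ");
        if (columns == null || columns.isEmpty()) {
            // No columns to overwrite: keep the existing row.
            return sb.append("NOTHING").toString();
        }
        sb.append("UPDATE SET ");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(",");
            }
            String column = columns.get(i);
            // excluded.<column> is the value proposed by the conflicting INSERT.
            sb.append(column).append("=excluded.").append(column);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(onConflictClause("update (seq,userid)", Arrays.asList("seq", "userid", "name")));
        System.out.println(onConflictClause("update (seq,userid)", Collections.<String>emptyList()));
    }
}
```

The first call prints the DO UPDATE form with every column assigned from excluded; the second shows the DO NOTHING branch taken when no columns are supplied.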
Result
{
    "job": {
        "setting": {
            "speed": {
                "byte": 1048576
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "postgresqlreader",
                    "parameter": {
                        "username": "postgres",
                        "password": "postgres",
                        "connection": [
                            {
                                "querySql": ["SELECT seq,userid,name FROM \"user\""],
                                "jdbcUrl": [
                                    "jdbc:postgresql://127.0.0.1:5432/postgres"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "postgresqlwriter",
                    "parameter": {
                        "username": "thsdb",
                        "password": "thsdb_outsev",
                        "column": [
                            "seq",
                            "userid",
                            "name"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:postgresql://127.0.0.1:5432/postgres",
                                "table": [
                                    "user1"
                                ]
                            }
                        ],
                        "writeMode": "update (seq,userid)"
                    }
                }
            }
        ]
    }
}
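For the writer settings in this job, the statement that ends up being prepared can be sketched as follows. This is an illustration under assumptions, not the actual DataX code path: the class UpsertStatementDemo and the method buildUpsert are hypothetical, and the value holders are assumed to be plain ? placeholders.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class showing the final statement for the job configuration above.
class UpsertStatementDemo {

    static String buildUpsert(String table, List<String> columns, String writeMode) {
        String cols = String.join(",", columns);
        // Assumption: one plain "?" placeholder per column.
        String holders = String.join(",", Collections.nCopies(columns.size(), "?"));
        // "update (seq,userid)" -> "(seq,userid)"
        String target = writeMode.trim().substring("update".length()).trim();
        StringBuilder set = new StringBuilder();
        for (String column : columns) {
            if (set.length() > 0) {
                set.append(",");
            }
            set.append(column).append("=excluded.").append(column);
        }
        // The template keeps %s for the table name, which is filled in last.
        String template = "INSERT INTO %s (" + cols + ") VALUES(" + holders + ")"
                + " ON CONFLICT " + target + " DO UPDATE SET " + set;
        return String.format(template, table);
    }

    public static void main(String[] args) {
        System.out.println(buildUpsert("user1", Arrays.asList("seq", "userid", "name"), "update (seq,userid)"));
    }
}
```

Note that PostgreSQL accepts ON CONFLICT (seq,userid) only from version 9.5 onward, and only when the target table has a unique index or constraint covering exactly those columns, so user1 needs one on (seq, userid).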
Source code
- For the modified DataX code, see https://gitee.com/cecotw/DataX