Serializing Java BigDecimal with Avro

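Background

For business reasons, I needed to serialize SparkSQL Array, Map, and Struct values with Avro into one large byte array for storage. The data includes Java BigDecimal values, so, following the Avro documentation, I defined the schema as follows: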

{
    "namespace": "com.bugboy.avro.bean",
    "type": "record",
    "name": "DecimalDemo",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "value", "type": {"type": "bytes", "logicalType": "decimal", "precision": 10, "scale": 2}}
    ]
}

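Even with this schema, serialization still ran into plenty of problems, for example java.math.BigDecimal cannot be cast to java.nio.ByteBuffer. I won't go into the details here. In short, I could not find a concrete implementation on the official site.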
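To see why a conversion step is needed at all, it helps to look at what the decimal logical type stores: per the Avro specification, the bytes hold the two's-complement, big-endian representation of the unscaled integer, while the scale lives only in the schema. The following is a minimal JDK-only sketch of that layout. The class DecimalBytesSketch and its methods are hypothetical names for illustration, not Avro API; in real code Avro's Conversions.DecimalConversion does this work.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Avro: mimics the byte layout of the decimal logical type.
class DecimalBytesSketch {

  // Encode: keep only the unscaled integer as big-endian two's-complement bytes.
  static ByteBuffer toBytes(BigDecimal value, int scale) {
    if (value.scale() != scale) {
      throw new IllegalArgumentException(
          "value scale " + value.scale() + " does not match schema scale " + scale);
    }
    return ByteBuffer.wrap(value.unscaledValue().toByteArray());
  }

  // Decode: rebuild the BigDecimal from the bytes plus the scale taken from the schema.
  static BigDecimal fromBytes(ByteBuffer buffer, int scale) {
    byte[] bytes = new byte[buffer.remaining()];
    buffer.duplicate().get(bytes); // duplicate() leaves the caller's position untouched
    return new BigDecimal(new BigInteger(bytes), scale);
  }

  public static void main(String[] args) {
    BigDecimal original = new BigDecimal("67.78"); // unscaled value 6778, scale 2
    ByteBuffer encoded = toBytes(original, 2);
    BigDecimal decoded = fromBytes(encoded, 2);
    System.out.println(encoded.remaining() + " bytes -> " + decoded); // prints "2 bytes -> 67.78"
  }
}
```

A BigDecimal is therefore never "a ByteBuffer" on its own; something has to perform this translation using the scale from the schema, which is why a plain cast fails.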

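Solution

The source code may well contain a matching demo, but time was tight and I could not read through it, so I fell back on the clumsy approach of debug-driven programming, configuring and probing one step at a time. After many attempts, I arrived at the following way to serialize and deserialize a Record (the counterpart of a SparkSQL Struct) that contains a BigDecimal.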

package com.bugboy.avro.bean;

import org.apache.avro.Conversions;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.math.BigDecimal;

public class SerRecord {

  public static void main(String[] args) throws IOException {
    // Build the record schema; alternatively, parse a predefined JSON definition to obtain the schema directly
    SchemaBuilder.BaseTypeBuilder<Schema> builder = SchemaBuilder.builder();
    SchemaBuilder.FieldAssembler<Schema> fieldAssembler = builder.record("Hello").namespace("").fields();
    fieldAssembler.name("id")
            .type(builder.stringType())
            .noDefault();
    Schema decimalSchema = builder.bytesType();
    LogicalTypes.decimal(10, 2).addToSchema(decimalSchema);
    fieldAssembler.name("value")
            .type(decimalSchema)
            .noDefault();
    Schema schema = fieldAssembler.endRecord(); // schema construction ends here

    // Prepare the Record
    GenericData.Record record = new GenericData.Record(schema);
    record.put("id", "001");
    record.put("value", BigDecimal.valueOf(67.78));

    // Serialize
    GenericDatumWriter<GenericData.Record> writer = new GenericDatumWriter<>();
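    // Register the DecimalConversion; without it, writing fails because BigDecimal cannot be cast to ByteBuffer.
    // Note: the no-arg writer and reader both use the shared GenericData.get() instance, so this registration also applies to the reader.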
    writer.getData().addLogicalTypeConversion(new Conversions.DecimalConversion());
    // The schema must be set, otherwise a NullPointerException is thrown
    writer.setSchema(schema);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(baos, null);
    // Serialize to obtain the byte array
    writer.write(record, encoder);
    byte[] bytes = baos.toByteArray();

    // Deserialize: turn the byte array back into a Record
    GenericDatumReader<GenericData.Record> reader = new GenericDatumReader<>();
    ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
    BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(bais, null);
    // Set the schema, otherwise a NullPointerException is thrown
    reader.setExpected(schema);
    reader.setSchema(schema);
    // Deserialize
    GenericData.Record newRecord = reader.read(null, decoder);
    // Read the values back to verify
    System.out.println(newRecord.get("id"));
    System.out.println(newRecord.get("value"));
  }
}
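
One thing the listing relies on implicitly: BigDecimal.valueOf(67.78) happens to have scale 2, which matches the schema's decimal(10, 2). As far as I know, the DecimalConversion in Avro 1.8.x rejects a value whose scale differs from the schema's, so values coming from elsewhere (SparkSQL, for instance) may need normalizing before record.put. Below is a minimal JDK-only sketch of such a step. The class DecimalScaleSketch, the method fit, and the choice of RoundingMode.HALF_UP are my own assumptions, not something the original setup prescribes.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not part of Avro: makes a BigDecimal fit decimal(precision, scale).
class DecimalScaleSketch {

  static BigDecimal fit(BigDecimal value, int precision, int scale) {
    // Force the schema's scale; HALF_UP is an assumed rounding policy, pick what the business needs.
    BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
    // precision() counts the digits of the unscaled value, which is what the schema limits.
    if (scaled.precision() > precision) {
      throw new ArithmeticException(
          scaled + " needs precision " + scaled.precision() + " but the schema allows " + precision);
    }
    return scaled;
  }

  public static void main(String[] args) {
    System.out.println(fit(new BigDecimal("67.7"), 10, 2));   // prints 67.70
    System.out.println(fit(new BigDecimal("67.785"), 10, 2)); // prints 67.79
  }
}
```

With a helper like this, the put call in the listing would become record.put("value", DecimalScaleSketch.fit(someValue, 10, 2)), where someValue stands for whatever BigDecimal arrives from upstream.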

The pom dependencies are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bugboy</groupId>
    <artifactId>avro-bean-serder</artifactId>
    <version>1.0.0</version>
    <properties>
        <avro-version>1.8.2</avro-version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro</artifactId>
            <version>${avro-version}</version>
        </dependency>
        <dependency>
            <groupId>joda-time</groupId>
            <artifactId>joda-time</artifactId>
            <version>2.9.4</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>${avro-version}</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>schema</goal>
                        </goals>
                        <configuration>
                            <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>
</project>

Closing thoughts

I can't go home for the New Year. So annoying!!!!!
