Parsing PNG Files with Node.js

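Before writing my previous post I skimmed the official Node documentation on Stream, and afterwards I wanted to write a few more demos with it. I chose a small program that reads and parses PNG images with Node (my thinking was that if PNG images could be parsed and generated easily, it would be easy to generate CAPTCHA images and send them to the front end). I ended up digging myself into a hole... PNG is fairly complicated (in my digital image processing courses I mostly worked with bmp and tiff, or simply used OpenCV and GDAL to read images in various formats, and I had never looked closely at the PNG format itself). Because of time constraints I only handle PNG images that are non-interlaced, non-indexed-colour, and use filter method 0 -_-||
Node's fs.createReadStream() creates a readable file stream. Here I use paused mode (see the previous post for an introduction to paused mode and flowing mode), where stream.read() gives fairly fine-grained control over how data is read from the readable stream: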

this.path = path;
this.stream = fs.createReadStream(this.path);
//Use paused mode
this.stream.pause();
this.stream.once('readable', ()=>{
   //Consume the readable stream with stream.read()
   // ......
});

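Many blog posts describe the PNG format in some detail, but almost all of them skip how the data in the IDAT chunk is decompressed and how it is filtered; I eventually worked it out from the official PNG specification. Here is the document: W3C - Portable Network Graphics (PNG) Specification (Second Edition)

PNG stands for Portable Network Graphics, a bitmap image format with lossless compression. It was designed to replace the GIF and TIFF file formats while adding features that GIF lacks.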

PNG File Structure

A complete PNG datastream consists of a PNG signature followed by a series of chunks; the first chunk is IHDR and the last is IEND.

PNG structure:
signature
chunk (IHDR)
…
chunk
…
chunk (IEND)

The official specification describes it as follows: This signature indicates that the remainder of the datastream contains a single PNG image, consisting of a series of chunks beginning with an IHDR chunk and ending with an IEND chunk.

PNG Signature

The PNG signature sits at the very start of a PNG file and takes 8 bytes, which in decimal are [137, 80, 78, 71, 13, 10, 26, 10]. The following function verifies the signature:

checkSignature(){
     // The PNG signature is 8 bytes long (1 byte = 8 bits)
     let buffer = this.stream.read(8);
     // read() returns null when fewer than 8 bytes are buffered
     if(buffer === null) throw new Error('It is not PNG file !');
     let signature = [137, 80, 78, 71, 13, 10, 26, 10];
     for(let i=0; i<signature.length; i++){
         let v = buffer.readUInt8(i);
         if(v !== signature[i]) 
             throw new Error('It is not PNG file !');
     }
     return true;
 }
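The same check can be written against an in-memory Buffer rather than a stream. The sketch below is only an illustration: `PNG_SIGNATURE` and `hasPngSignature` are hypothetical names that do not appear in the demo, and the function returns a boolean instead of throwing.

```javascript
// Hypothetical standalone helper; not part of the original demo class.
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

function hasPngSignature(buffer) {
    // Reject anything that is not a Buffer or is shorter than the signature.
    if (!Buffer.isBuffer(buffer) || buffer.length < PNG_SIGNATURE.length) {
        return false;
    }
    // Compare only the first 8 bytes against the expected signature.
    return buffer.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
}

console.log(hasPngSignature(Buffer.from('\x89PNG\r\n\x1a\n', 'latin1'))); // true
console.log(hasPngSignature(Buffer.from('GIF89a')));                      // false
```

Comparing with Buffer.equals avoids the explicit loop, at the cost of first collecting 8 bytes into a Buffer.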

PNG Chunk

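PNG defines two kinds of chunks: critical chunks, which are the standard chunks, and ancillary chunks, which are optional. There are four critical chunks (IHDR, PLTE, IDAT, IEND). Every PNG file must contain IHDR, IDAT, and IEND; PLTE is required only for indexed-colour images (colour type 3). PNG readers and writers must support the critical chunks. The specification does not require PNG encoders and decoders to handle ancillary chunks, but it encourages them to do so.
The table below lists the PNG chunk types; the first four are the critical chunks.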

Chunk name	Multiple allowed	Ordering constraints	Description
IHDR	No	Shall be first	Image header
PLTE	No	Before first IDAT	Palette
IDAT	Yes	Multiple IDAT chunks shall be consecutive	Image data
IEND	No	Shall be last	Image trailer

cHRM	No	Before PLTE and IDAT	Primary chromaticities and white point
gAMA	No	Before PLTE and IDAT	Image gamma
iCCP	No	Before PLTE and IDAT. If the iCCP chunk is present, the sRGB chunk should not be present.	Embedded ICC profile
sBIT	No	Before PLTE and IDAT	Significant bits
sRGB	No	Before PLTE and IDAT. If the sRGB chunk is present, the iCCP chunk should not be present.	Standard RGB colour space
bKGD	No	After PLTE; before IDAT	Background colour
hIST	No	After PLTE; before IDAT	Image histogram
tRNS	No	After PLTE; before IDAT	Transparency
pHYs	No	Before IDAT	Physical pixel dimensions
sPLT	Yes	Before IDAT	Suggested palette
tIME	No	None	Image last-modification time
iTXt	Yes	None	International textual data
tEXt	Yes	None	Textual data
zTXt	Yes	None	Compressed textual data

Each chunk consists of four parts (when Length = 0 there is no chunk data):

name	meaning
Length	A four-byte unsigned integer giving the number of bytes in the chunk’s data field. The length counts only the data field, not itself, the chunk type, or the CRC. Zero is a valid length. Although encoders and decoders should treat the length as unsigned, its value shall not exceed 2^31-1 bytes.
Chunk Type	A sequence of four bytes defining the chunk type. Each byte of a chunk type is restricted to the decimal values 65 to 90 and 97 to 122. These correspond to the uppercase and lowercase ISO 646 letters (A-Z and a-z) respectively for convenience in description and examination of PNG datastreams. Encoders and decoders shall treat the chunk types as fixed binary values, not character strings. For example, it would not be correct to represent the chunk type IDAT by the equivalents of those letters in the UCS 2 character set.
Chunk Data	The data bytes appropriate to the chunk type, if any. This field can be of zero length.
CRC	A four-byte CRC (Cyclic Redundancy Code) calculated on the preceding bytes in the chunk, including the chunk type field and chunk data fields, but not including the length field. The CRC can be used to check for corruption of the data. The CRC is always present, even for chunks containing no data.

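Length, Chunk Type, and CRC all have a fixed size (4 bytes each), while the size of Chunk Data is given by the value of Length. Parsing a chunk therefore starts by determining its type and the length of its data.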

  /**
   * Read the length and name of a chunk.
   * Length and Name (chunk type) sit at the start of every chunk
   * and take 4 bytes each.
   * @returns {{name: string, length: number}}
   */
  readHeadAndLength(){
      let buffer = this.stream.read(8);
      // Length is a 4-byte unsigned big-endian integer
      let length = buffer.readUInt32BE(0);
      // The chunk type is four ASCII letters
      let name = buffer.toString('ascii', 4, 8);
      return {name, length};
  }
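The demo skips the CRC field, but the chunk layout above is enough to verify it. The following is a minimal sketch, assuming the whole chunk is available in a Buffer; `crc32` and `parseChunk` are hypothetical helpers, and the CRC is the standard table-driven CRC-32 computed over the chunk type and chunk data.

```javascript
// Table-driven CRC-32 (reflected polynomial 0xEDB88320), as used by PNG.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
        c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    CRC_TABLE[n] = c >>> 0;
}

function crc32(bytes) {
    let c = 0xFFFFFFFF;
    for (const b of bytes) {
        c = CRC_TABLE[(c ^ b) & 0xFF] ^ (c >>> 8);
    }
    return (c ^ 0xFFFFFFFF) >>> 0;
}

// Hypothetical helper: parse one chunk starting at `offset` in a Buffer.
function parseChunk(buffer, offset = 0) {
    const length = buffer.readUInt32BE(offset);
    const type = buffer.toString('latin1', offset + 4, offset + 8);
    const data = buffer.subarray(offset + 8, offset + 8 + length);
    const storedCrc = buffer.readUInt32BE(offset + 8 + length);
    // The CRC covers the chunk type and chunk data, but not the length field.
    const crcOk = crc32(buffer.subarray(offset + 4, offset + 8 + length)) === storedCrc;
    return { type, length, data, crcOk, next: offset + 12 + length };
}

// The IEND chunk is always the same 12 bytes.
const iendChunk = parseChunk(Buffer.from('0000000049454e44ae426082', 'hex'));
console.log(iendChunk.type, iendChunk.crcOk); // IEND true
```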

The chunks my demo mainly parses are IHDR and IDAT; the latter is somewhat more complex. Chunks are parsed one by one through recursion:

 readChunk({name, length}){
     // Stop on a missing name or a zero-length chunk, which includes IEND
     if(!length || !name){
         console.log(name, length);
         return;
     }

     switch(name){
         case 'IHDR':
             this.readChunk(this.readIHDR(name, length));
             break;
         case 'IDAT':
             this.readChunk(this.readIDAT(name, length));
             break;
         case 'PLTE':
             // The PLTE (palette) chunk is not supported yet
             throw new Error('PLTE');
         default:
             // Skip all other chunks
             console.log('Skip',name,length);
             // length+4 is the size of data + CRC
             this.stream.read(length+4);
             this.readChunk(this.readHeadAndLength());
     }
 }
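The same traversal can also be written as a loop over an in-memory Buffer, which avoids recursion and stops explicitly at IEND. This is a sketch under that assumption, not the demo's code; `listChunks` and `fakeChunk` are hypothetical names, and the CRC fields are left as zeros because the loop never checks them.

```javascript
// Hypothetical helper: list the chunk types of a PNG held in a Buffer.
function listChunks(png) {
    const chunks = [];
    let offset = 8; // skip the 8-byte signature
    while (offset + 12 <= png.length) {
        const length = png.readUInt32BE(offset);
        const type = png.toString('latin1', offset + 4, offset + 8);
        chunks.push({ type, length });
        offset += 12 + length; // length + type + data + CRC
        if (type === 'IEND') break;
    }
    return chunks;
}

// Build a fake datastream just to exercise the loop (CRC left as zeros).
function fakeChunk(type, dataLength) {
    const chunk = Buffer.alloc(12 + dataLength);
    chunk.writeUInt32BE(dataLength, 0);
    chunk.write(type, 4, 'latin1');
    return chunk;
}

const fakePng = Buffer.concat([
    Buffer.alloc(8),
    fakeChunk('IHDR', 13),
    fakeChunk('IDAT', 49),
    fakeChunk('IEND', 0),
]);
console.log(listChunks(fakePng).map(c => c.type).join(' ')); // IHDR IDAT IEND
```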

IHDR Chunk

IHDR is the first chunk in a PNG datastream. It is the image header, and its chunk data consists of the following fields:

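Name	Length	Description
Width	4 bytes	Image width in pixels
Height	4 bytes	Image height in pixels
Bit depth	1 byte	Bits per sample. Indexed-colour: 1, 2, 4 or 8; greyscale: 1, 2, 4, 8 or 16; truecolour: 8 or 16
Colour type	1 byte	Colour type. 0: greyscale; 2: truecolour; 3: indexed-colour; 4: greyscale with alpha; 6: truecolour with alpha
Compression method	1 byte	Compression method (used to compress the IDAT chunk data)
Filter method	1 byte	Filter method
Interlace method	1 byte	Interlace method. 0: no interlace; 1: Adam7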

Knowing the layout of the IHDR data, the following code parses the IHDR chunk. This information is essential for parsing the IDAT data:

  readIHDR(name, length){
      if(name !== 'IHDR') throw new Error('IHDR ERROR !');

      this.info = {};
      // Width and height are 4-byte unsigned big-endian integers
      this.info.width = this.stream.read(4).readUInt32BE(0);
      this.info.height = this.stream.read(4).readUInt32BE(0);
      this.info.bitDepth = this.stream.read(1).readUInt8(0);
      this.info.coloType = this.stream.read(1).readUInt8(0);
      this.info.compression = this.stream.read(1).readUInt8(0);
      this.info.filter = this.stream.read(1).readUInt8(0);
      this.info.interlace = this.stream.read(1).readUInt8(0);
      console.log(this.info);
      // bands is the number of channels per pixel (e.g. 4 for RGBA)
      switch(this.info.coloType){
          case 0:
              this.info.bands = 1;
              break;
          case 2:
              this.info.bands = 3;
              break;
          case 3:
              // Indexed colour is not supported
              throw new Error('Do not support this color type !');
          case 4:
              this.info.bands = 2;
              break;
          case 6:
              this.info.bands = 4;
              break;
          default:
              throw new Error('Unknown color type !');
      }
      // Skip the CRC
      this.stream.read(4);
      // Return the next chunk header so that readChunk() can continue
      return this.readHeadAndLength();
  }

Take the image in the screenshot as an example: a 5*5 PNG with an alpha channel. The code above yields the following IHDR information:

{ width: 5,
  height: 5,
  bitDepth: 8,
  coloType: 6,
  compression: 0,
  filter: 0,
  interlace: 0 }

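From the IHDR information we can see that this image is non-interlaced, uses filter method 0, and is truecolour with alpha. Each channel takes 8 bits, so one pixel takes 4*8 = 32 bits.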
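The bits-per-pixel arithmetic can be derived directly from the 13 bytes of IHDR data. The sketch below assumes the chunk data has already been read into a Buffer; `CHANNELS` and `parseIHDR` are hypothetical names, and the field offsets follow the IHDR table above.

```javascript
// Channels per pixel for each colour type defined by the PNG specification.
const CHANNELS = { 0: 1, 2: 3, 3: 1, 4: 2, 6: 4 };

// Hypothetical helper: decode the 13-byte IHDR chunk data.
function parseIHDR(data) {
    if (data.length !== 13) throw new Error('IHDR data must be 13 bytes');
    const info = {
        width: data.readUInt32BE(0),
        height: data.readUInt32BE(4),
        bitDepth: data.readUInt8(8),
        colourType: data.readUInt8(9),
        compression: data.readUInt8(10),
        filter: data.readUInt8(11),
        interlace: data.readUInt8(12),
    };
    const channels = CHANNELS[info.colourType];
    if (channels === undefined) throw new Error('Unknown colour type');
    // One sample per channel, bitDepth bits per sample.
    info.bitsPerPixel = channels * info.bitDepth;
    return info;
}

// The 5x5, 8-bit, truecolour-with-alpha image from the example above.
const ihdrData = Buffer.from([0, 0, 0, 5, 0, 0, 0, 5, 8, 6, 0, 0, 0]);
console.log(parseIHDR(ihdrData).bitsPerPixel); // 32
```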

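IDAT Chunk

IDAT is the image data chunk. It stores the actual image data, and a datastream may contain several consecutive IDAT chunks. Once the structure of the IDAT chunk data is understood, parsing and generating PNG images becomes straightforward. The steps involved include decompression and filtering.

Decompressing the IDAT Chunk

With compression method 0, the only method the specification defines, the image data in IDAT chunks is compressed with DEFLATE, an LZ77 variant. For details on DEFLATE, see "DEFLATE Compressed Data Format Specification version 1.3" at http://www.ietf.org/rfc/rfc1951.txt . Node's zlib module can decompress it directly. The zlib module provides compression and decompression through Gzip and Deflate/Inflate, and is loaded like this: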

const zlib = require('zlib');

The following code inflates the chunk data into the filtered image data:

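readIDAT(name, length){
      if(name !== 'IDAT') throw new Error('IDAT ERROR !');

      let buffer = this.stream.read(length);
      // Inflate the chunk data to get the filtered image data.
      // This only works when the image has a single IDAT chunk; with several,
      // their data must be concatenated before inflating.
      this.data = zlib.unzipSync(buffer);
      console.log("Unzip length", this.data.length);

      // Skip the CRC
      this.stream.read(4);
      return this.readHeadAndLength();
  }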
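Because the compressed stream may be split across several consecutive IDAT chunks, a more general reader would collect every IDAT payload and inflate them together. A minimal sketch, assuming the payloads have already been gathered into an array in file order (`inflateIdat` and `idatParts` are hypothetical names):

```javascript
const zlib = require('zlib');

// Hypothetical helper: `idatParts` is an array of Buffers, one per IDAT
// chunk, in the order they appear in the file.
function inflateIdat(idatParts) {
    // All IDAT payloads form a single zlib stream, so join them first.
    return zlib.inflateSync(Buffer.concat(idatParts));
}

// Simulate an encoder that splits one compressed stream across two chunks.
const rawScanlines = Buffer.alloc(105, 7);
const compressed = zlib.deflateSync(rawScanlines);
const parts = [compressed.subarray(0, 5), compressed.subarray(5)];
console.log(inflateIdat(parts).equals(rawScanlines)); // true
```

Inflating either part on its own would fail or return incomplete data, which is why the parts are joined before the call.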

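For the image mentioned above, the IDAT chunk data is 49 bytes before decompression and 105 bytes after. The decompressed data starts at the top-left corner of the image. For this image (non-interlaced, filter method 0, truecolour with alpha), the data is laid out as RGBA RGBA RGBA ..., and each row begins with a 1-byte filter type. The following code returns the filter type of a row: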

 /**
  * Get the filter type of a row.
  * Every row starts with a 1-byte filter type.
  * @param row
  * @returns {*}
  */
 getFilterType(row){
     let offset = this.info.bitDepth/8;
     let pointer = row * this.info.width * offset * this.info.bands + row;
     // Read the first byte of the row
     return this.readNum(this.data, pointer, 8);
 }

Below is the decompressed IDAT chunk data (the filtered channel values of each pixel and the filter type of each row):

------Row0------
Filter type:1
[ 255, 0, 0, 255 ]
[ 0, 255, 255, 0 ]
[ 0, 1, 1, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row1------
Filter type:2
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row2------
Filter type:4
[ 0, 255, 255, 0 ]
[ 0, 0, 0, 0 ]
[ 1, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row3------
Filter type:1
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 3, 255, 255, 1 ]
------Row4------
Filter type:4
[ 255, 255, 255, 0 ]
[ 0, 0, 0, 1 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 1, 1, 1, 255 ]

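Notice that the second row, which should be identical to the first, is all zeros here. Its filter type is 2, the Up filter, which stores each byte as the difference from the corresponding byte in the row above, so a row identical to the one above becomes all zeros. This makes the data easier to compress and saves space.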
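To see why an identical row turns into zeros, here is a small sketch of the Up filter in both directions on a single pixel. `filterUp` and `unfilterUp` are hypothetical helpers that work on plain arrays of byte values, with arithmetic modulo 256 as the specification requires.

```javascript
// Hypothetical helpers illustrating the Up filter on one scanline.
function filterUp(row, prev) {
    // Store each byte as its difference from the byte above, modulo 256.
    return row.map((x, i) => (x - prev[i]) & 0xFF);
}

function unfilterUp(filtered, prev) {
    // Add the byte above back, again modulo 256.
    return filtered.map((x, i) => (x + prev[i]) & 0xFF);
}

const above = [255, 0, 0, 255];
const current = [...above];                     // identical to the row above
const upFiltered = filterUp(current, above);
console.log(upFiltered);                        // [ 0, 0, 0, 0 ]
console.log(unfilterUp(upFiltered, above));     // [ 255, 0, 0, 255 ]
```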

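Reversing the IDAT Filtering

For the details of PNG filtering, see the PNG Filtering section of the official specification.
Once the filtering is understood, the actual image data can be recovered. Filter method 0 defines five filter types: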

Type	Name
0	None
1	Sub
2	Up
3	Average
4	Paeth

Following the official specification, I wrote the method below to recover the data as it was before filtering:


/**
 * Reconstruct one byte of a row when filterMethod=0.
 * Every row has its own filter type.
 * Assumes the bytes before index in this.data already hold reconstructed values.
 * @param index          position of the current byte in this.data
 * @param start          position of the row's filter-type byte in this.data
 * @param filterType     filter type of the row (0-4)
 * @param colByteLength  bytes per row, including the filter-type byte
 * @returns {*}
 */
reconForNoneFilter(index, start, filterType, colByteLength){
    let pixelByteLength = this.info.bands*this.info.bitDepth/8;
    // a: byte to the left, b: byte above, c: byte to the upper left.
    // Each is 0 when it falls outside the image.
    let firstPixel = index-start-1<pixelByteLength;
    let firstRow = start<colByteLength;
    let a = firstPixel ? 0 : this.data[index-pixelByteLength];
    let b = firstRow ? 0 : this.data[index-colByteLength];
    let c = (firstPixel || firstRow) ? 0 : this.data[index-pixelByteLength-colByteLength];
    // All additions are modulo 256
    switch(filterType){
        case 0:
            //None
            return this.data[index];
        case 1:
            //Sub
            return (this.data[index] + a) & 0xFF;
        case 2:
            //Up
            return (this.data[index] + b) & 0xFF;
        case 3:
            //Average
            return (this.data[index] + Math.floor((a+b)/2)) & 0xFF;
        case 4:
            //Paeth
            {
                //PaethPredictor function
                let p = a + b - c;
                let pa = Math.abs(p - a), pb = Math.abs(p - b), pc = Math.abs(p - c);
                let Pr = 0;
                if(pa <= pb && pa <= pc)Pr = a;
                else if(pb <= pc)Pr = b;
                else Pr = c;

                // Recon(x) = Filt(x) + PaethPredictor(a, b, c)
                return (this.data[index] + Pr) & 0xFF;
            }
        default:
            throw new Error('recon failed');
    }
}
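For comparison, here is a self-contained sketch that reverses all five filter types over a whole buffer, writing the reconstructed bytes to a separate output so that the left, above, and upper-left neighbours are always reconstructed values. `paethPredictor` and `unfilter` are hypothetical names, and the sketch assumes a non-interlaced image with at least 8 bits per pixel.

```javascript
function paethPredictor(a, b, c) {
    const p = a + b - c;
    const pa = Math.abs(p - a), pb = Math.abs(p - b), pc = Math.abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    return pb <= pc ? b : c;
}

// Hypothetical helper: `data` holds `height` scanlines, each one filter-type
// byte followed by `rowBytes` filtered bytes; `bpp` is bytes per pixel.
function unfilter(data, height, rowBytes, bpp) {
    const out = Buffer.alloc(height * rowBytes);
    for (let y = 0; y < height; y++) {
        const type = data[y * (rowBytes + 1)];
        for (let i = 0; i < rowBytes; i++) {
            const x = data[y * (rowBytes + 1) + 1 + i];
            const pos = y * rowBytes + i;
            // Neighbours outside the image count as 0.
            const a = i >= bpp ? out[pos - bpp] : 0;                      // left
            const b = y > 0 ? out[pos - rowBytes] : 0;                    // above
            const c = i >= bpp && y > 0 ? out[pos - rowBytes - bpp] : 0;  // upper left
            let pred;
            switch (type) {
                case 0: pred = 0; break;                          // None
                case 1: pred = a; break;                          // Sub
                case 2: pred = b; break;                          // Up
                case 3: pred = Math.floor((a + b) / 2); break;    // Average
                case 4: pred = paethPredictor(a, b, c); break;    // Paeth
                default: throw new Error(`Unknown filter type ${type}`);
            }
            out[pos] = (x + pred) & 0xFF;
        }
    }
    return out;
}

// Two RGBA pixels per row: a Sub-filtered row followed by an Up-filtered copy.
const filteredRows = Buffer.from([
    1, 255, 0, 0, 255, 0, 255, 255, 0,
    2, 0, 0, 0, 0, 0, 0, 0, 0,
]);
console.log(unfilter(filteredRows, 2, 8, 4).toString('hex'));
// ff0000ffffffffffff0000ffffffffff
```

Writing to a separate buffer without filter-type bytes keeps the neighbour offsets simple, at the cost of a second allocation.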

The data recovered by my demo is as follows:

------Row0------
Filter type:1
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row1------
Filter type:2
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row2------
Filter type:4
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row3------
Filter type:1
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 255, 255, 255, 0 ]
------Row4------
Filter type:4
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 255, 255, 255, 0 ]

This matches the image mentioned earlier. ^_^

References
Analysing the PNG Image Structure
 W3C - Portable Network Graphics (PNG) Specification (Second Edition)

Code: https://git.oschina.net/liuyaqi/JSPNG/
