GZIP Compression in Java

When reverse engineering, printing a captured byte[] as hexadecimal often produces output like this:

1F8B08000000000000002597C712ABBC0E809FE69C ... (much omitted) ...

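The leading bytes give it away as GZIP-compressed data: 1F 8B is the GZIP magic number, and the following 08 means the payload is compressed with deflate.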
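As a minimal sketch of that check, the helper below tests whether a byte array starts with the GZIP signature 1F 8B followed by the deflate method byte 08. The class name `GzipMagic` and the method name `looksLikeGzip` are made up for illustration; they do not come from any library.

```java
import java.nio.charset.StandardCharsets;

public class GzipMagic {

    // Returns true if the data starts with the GZIP magic number (0x1F 0x8B)
    // followed by compression method 8 (deflate).
    public static boolean looksLikeGzip(byte[] data) {
        return data != null
                && data.length >= 3
                && (data[0] & 0xFF) == 0x1F
                && (data[1] & 0xFF) == 0x8B
                && (data[2] & 0xFF) == 0x08;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0x1F, (byte) 0x8B, 0x08, 0x00};
        System.out.println(looksLikeGzip(sample));                                    // true
        System.out.println(looksLikeGzip("hello".getBytes(StandardCharsets.UTF_8)));  // false
    }
}
```

A matching prefix is only a strong hint; the data is confirmed as GZIP once it actually decompresses.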

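In the several apps I have reverse engineered so far, nearly all of them use GZIP on their network traffic, compressing or decompressing the data they send and receive.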


Compression and decompression each use their own I/O stream:

GZIP compression: GZIPOutputStream

GZIP decompression: GZIPInputStream


Example code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class Gzip {

	public static void main(String[] args) throws IOException {
		String str = "hello world";

		// Use an explicit charset so the byte count does not depend on the platform default.
		byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
		System.out.println("Length before compression: " + bytes.length);
		byte[] gzipBytes = gzip(bytes);
		System.out.println("Length after compression: " + gzipBytes.length);
		System.out.println("Compressed: " + byteToHexString(gzipBytes));
		byte[] unGzipBytes = unGzip(gzipBytes);
		System.out.println("Decompressed: " + byteToHexString(unGzipBytes));
	}

	// Compresses the given bytes into GZIP format.
	public static byte[] gzip(byte[] content) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		// Closing the GZIP stream writes the trailer, so it must be closed
		// before the result is read from baos.
		try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
			gos.write(content);
		}
		return baos.toByteArray();
	}

	// Decompresses GZIP data back into the original bytes.
	public static byte[] unGzip(byte[] content) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(content))) {
			byte[] buffer = new byte[1024];
			int n;
			while ((n = gis.read(buffer)) != -1) {
				baos.write(buffer, 0, n);
			}
		}
		return baos.toByteArray();
	}

	// Formats each byte as two uppercase hexadecimal digits.
	public static String byteToHexString(byte[] bytes) {
		StringBuilder sb = new StringBuilder(bytes.length * 2);
		for (byte b : bytes) {
			sb.append(String.format("%02X", b & 0xFF));
		}
		return sb.toString();
	}

}

Output:


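The leading 1F8B0800000000000000 above is the 10-byte GZIP header: the magic number 1F 8B, the compression method 08 (deflate), a flags byte, a 4-byte modification time, an extra-flags byte, and an OS byte. The stream ends with an 8-byte trailer: a 4-byte CRC-32 of the original data, followed by 4 bytes (8 hex digits) holding the length of the data before compression.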
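To make the trailer layout concrete, here is a small sketch that compresses "hello world" and reads the last 8 bytes back as a CRC-32 and a length. The class `GzipTrailer` and its method names are hypothetical helpers written for this illustration; the byte layout follows the GZIP format, where both trailer fields are stored little-endian.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

public class GzipTrailer {

    // Compresses the input with GZIPOutputStream (hypothetical helper for this sketch).
    public static byte[] gzip(byte[] content) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Reads 4 bytes at the given offset as an unsigned little-endian integer.
    private static long readUInt32LE(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt() & 0xFFFFFFFFL;
    }

    // CRC-32 of the uncompressed data: the first 4 bytes of the 8-byte trailer.
    public static long storedCrc32(byte[] gz) {
        return readUInt32LE(gz, gz.length - 8);
    }

    // Uncompressed length (modulo 2^32): the last 4 bytes of the trailer.
    public static long storedLength(byte[] gz) {
        return readUInt32LE(gz, gz.length - 4);
    }

    public static void main(String[] args) {
        byte[] raw = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] gz = gzip(raw);

        CRC32 crc = new CRC32();
        crc.update(raw);

        System.out.println("stored length = " + storedLength(gz));                      // 11
        System.out.println("CRC matches   = " + (crc.getValue() == storedCrc32(gz)));   // true
    }
}
```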


Now look at the compressed output for the four strings "a", "abc", "abcde", and "asdfghjk":


They all follow essentially the same format.
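One consequence of that shared format is worth seeing in numbers: the 10-byte header and 8-byte trailer cost 18 bytes regardless of the input, so strings this short come out longer than they went in. The sketch below prints the sizes for the same four strings; `GzipShortInputs` and `gzippedSize` are illustrative names, not part of the original example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipShortInputs {

    // Returns the size in bytes of the GZIP output for the given string (UTF-8 encoded).
    public static int gzippedSize(String s) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    public static void main(String[] args) {
        String[] inputs = {"a", "abc", "abcde", "asdfghjk"};
        for (String s : inputs) {
            int before = s.getBytes(StandardCharsets.UTF_8).length;
            int after = gzippedSize(s);
            // The header and trailer alone account for 18 of the output bytes.
            System.out.println(s + ": " + before + " bytes -> " + after + " bytes");
        }
    }
}
```

GZIP only starts to pay off once the input is long enough for the deflate savings to outweigh this fixed overhead.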


As a final example, feed in a much longer input and look at the result:

----------------------data-------------------

"lng=113.902302&cx=d41bgtHhx7iFJPuvzpwy8o7tNKB4l%2BDauZTUvdnDoEjlplqvzoXiGkVpWGf9MpgWiLOwsj5gMdDH%0AZ%2FpwELvq67gex7Y6gtgOs4mu3JJctJ0agflWYuan9qX7aZufh%2FA4E2lsouyvze344TMxzjfpnFMV%0AwNJ%2B8TIUe6qg2vC1osSpQXirubdyOr3j1pJyShr4sogM7zkiJAdrynp0arvp%2Fx64DvnjxoEsThjw%0Aq6Ma7Eb%2FvhXBGq7fEAecLtPZvQCWyFRWk5NqrSCr6KaD90ACcaDuOYZ50%2BUQJDBJ4dovO%2FuVCFnO%0AXqEecWLsJkzLyQeq2CL2u5YUcWXN3WcJwLNRxQEK%2BVr98SNeDAkVA6SCGKYz6re5dCeBJiNf2aZT%0ARhI32h91JVZOvifG2nCCduaAUjxxMa1WVBT7EdGwgalmzo5Jnhop4zIxsiurD9ZR0nheGjnqDD05%0AsgRJzxCGIW%2BWqsrWPb3omJ9dSpSeBWShKN2cn4YVMNUQZChei1ggVtjcmfq0QCLFJVT4JaUmx%2BEL%0ACeN%2Bnd4bakTOGwehwz5QYxWJKey4Bx7wScyFbdCBM4H6tLSIHW3bFumlP4Jrj4cB8FL3g%2BNI2mmq%0A8el5wmFiJ8opoyEHzVYh8uMELV6PkqrEUrblRnG%2BYIRjkNo00ZUW6e%2FqAR%2Fku%2FIjggBWLrvwYbaF%0A6JQELcshV5eBkzC%2FUWZUvnWHx4fQ4rygmAiH0rgJkCRXugTb9b1LvM7Qh0VzqLeMlOhHUeY95Y0n%0ADmVU%2B3SwXDcqXGV7xAYN%2BYtIpaXUUE0Ym4S2t0RFqoi8c6QG78CBwmLbC8iIHV%2Bqed%2BaH7wj%2B%2FgC%0AB6Np%2BBa3nYsxHYO%2Be44k0vb4FMZfjTKcbGbX2oCOR2dxCyevR48%2BnN4TaTy7af17LK00qRKoabBS%0APMBLkLo74Ay2gVuBkRJ2m8WvGDjbjql5ECuXXX3xJqKOgmb8w6TVj%2FULXqzPbwWamInoCYg8Icke%0AAxklN8GBY9Goa%2Fe2oARB7us8wfNDnA6uHBFBqKfUwk08f5TqWEJQnx7DGB1H9NYJnRryAOWnLDK1%0APGRU%2Blxa98Tc1I9WxHjuNptVfbirLfRUkYJ7JHQefrJH0NEGVmafodDfz2minx1veSCn6dQD5X3z%0AkXEixFLLQqg1HTU4QhS53RMiVaJqRfOjFXlkn%2BP36XvqUZuY%2F4QSUHKm4CxYm8Mu2L8Mwh3xouXm%0AhOP9nfxGq3N5n8eRshksxHJap1fS4s7z843hzepKgo7rXspRj%2FqWJNFKW1%2BK8UCjlXm0A7Maoome%0A5QNVa5WgLVuLuCcpZQ7u1QV%2FD%2FL2nPy%2BMhhFya35A%2F0hHULiIZlIiFomiPl6wAststTPu7LU1MyY%0ArjvLV2ImEVCyd4RmfYEmNp5EtI6LnQL66zHQnQrWEz0hfccJLn7drpemlV%2F2bj6qa2MBp9tkODnM%0Ax7jHEVgWVoDdN0Rj0eobgRHZjeuGkO04%2F%2FS1qvAlMHy4ewcnSIgOyYzz6BpHXKXXx9hY2xqj5yp%2F%0A7Tje81R6hSK1BeBdS7Wz1gw7XmYWKiezW9F6XBw%2BQ0L5vR1F%2Bf%2B83v32hS8HDKmz5e8%2FhaVZC1So%0AX5wL2HAZe8wcYfzDYbEHTSjHaz2d5AdxuwdtOu99UZkjNm9rbIkscelofvBcMuDW6MK2ojCd%2FSnS%0AJlAmMQ0nn3sVZYwCBLiXKlee8IqoDgGuUhiA19RUkNYtMg6pk7%2BieLFkKy5z7yPerpy6Jt7PTN59%0ACqYWFP9fyCP44DNU5nL0Z04UTNQqyZl%2Ff7LDVzlXygXxmVdbmgfwo%2FahUpLpQyU5
KO%2BQwipRK4mr%0AGqssLcIvYaJxtATfXqGW1kUzysJFqXKeVY4aXk8DU%2FAsX91XXtBqzcLT%2Fjke1xZNxgKicF3Kte71%0AbkeUJl39n85xVNpJ5Xn%2FRqO%2F1uG69Svj3F3ShHcj4danJUX7b2NDOk5rKbVZSfefZsPJXNCB0lR2%0AQQGuMMYR4F1gkUZElgyidnPr7R4cOa9%2BuGf7txOYGSfXPyH1N8lXEd1A1Iyt1eDwaIkzbH4rGcVQ%0AK%2ByOWmHKeqJOBmWXfBm5ZHxJ0zBtAsbc2%2BFQbttZ17eE7%2BpvOvWiS9Enqy90vUQSaLdzmvQdlGYI%0AYlHijyZuDYCKgPibV1HtU6Iz%2Fes%2FKkGNCR64Y9r0E00zqw8PLSqCjdClWOZYv9K6xClLOApMMUdw%0AdPZ9UUQZ2ihOtPtd%2BP4U9DPIIhlHG4LRu%2B6sL2hMaV7PqbBNYA9EbgP41R2aSRdia4Nc0z95EUZW%0AdBrUcPdUl5FTe8CYobLYolevokSwVstKsLXO0vMWWbgHA2sXzruKVm4pGudizM4%2FL9TIBY8DTIxy%0AAOh2sVZNxM3DjDenWd0xviDByluT7SlU3BmWG27%2BGAPCsLpPdNkjp0dGSaLtf6c1ivQpzUSP0CtX%0ADD53lxukU5B%2FR9XNQpgkJ1OaH5DCu%2BuTr2zH3oUG3O2NVhEuE660c09ABtWG%2FM7u4KxOMzDSjL4%2F%0AgToQ8GDopo8reZWzwgfUvsXFYxPZIP0ZyZ3%2BzI4oBDo3cqbNaotdyxTNq4zDGgRS%2BSZ26xR%2FAzv4%0AmkYUrouEd6toPRSS439Acjv3vc5zWaW5yXmKXwfWSuJ5ETpgGssBaqvErrFJCZwFEJhj91eAE4uT%0AMtwPmugZCINSom8ogkSIey84Y5%2FeiOga7ZSqRDr6e8Vop8wjEEkQz8fpxb3movIQZaDcHVQdi2GJ%0ANLojDQ%2FyyYo3lQExBMLVZJwJiOCS%2BcKSyYGTAL3P4bVdrn2v%2Be%2FddHk3ylI%3D%0A&countrycode=86&type=0&lat=22.552802"

----------------------data-------------------

This gives the following result:

Screenshot 1


Screenshot 2


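In Screenshot 2, the last 4 bytes (8 hex digits), 500C0000, encode the length. They are stored in little-endian order: the leftmost byte is the least significant, and each byte further to the right is worth 256 times more than the one before it.

That is, 500C0000 = 0x0C * 256 + 0x50 = 12 * 256 + 80 = 3152.

This is exactly the length of the original data.

2371 / 3152 = 75.2%, so the compressed output is 75.2% of the original length.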
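The same arithmetic can be written as a short routine. This is only a sketch of the little-endian decoding applied to the hex digits 500C0000; `LittleEndianLength` and `decodeLittleEndianHex` are names invented for this example.

```java
public class LittleEndianLength {

    // Decodes a hex string such as "500C0000" as a little-endian unsigned integer:
    // the first byte is the least significant, and each later byte is shifted
    // 8 more bits to the left.
    public static long decodeLittleEndianHex(String hex) {
        long value = 0;
        for (int i = 0; i < hex.length(); i += 2) {
            long b = Integer.parseInt(hex.substring(i, i + 2), 16);
            value |= b << (8 * (i / 2));
        }
        return value;
    }

    public static void main(String[] args) {
        // 0x50 + 0x0C * 256 = 80 + 3072
        System.out.println(decodeLittleEndianHex("500C0000")); // 3152
    }
}
```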

