Unexpected characters appear at the start of each line when an FSDataOutputStream writes strings to a file with writeUTF in a MapReduce job

A simplified version of the code:

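FileSystem fs = FileSystem.get(job.getConfiguration());
FSDataOutputStream finf = fs.create(path);  // path: the org.apache.hadoop.fs.Path of the output file
finf.writeUTF(line);                        // line: the String to write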

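Screenshot of the output:
[image]
Copying the content into Notepad shows the following:
[image]
The extra characters look somewhat like control characters. I could not make sense of them, and the original data contains nothing like them, so the problem is most likely in how the data is written out.
Looking at the source of writeUTF, I found the following comment: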

void java.io.DataOutputStream.writeUTF(String str) throws IOException
Writes a string to the underlying output stream using modified UTF-8 encoding in a machine-independent manner. 
First, two bytes are written to the output stream as if by the writeShort method giving the number of bytes to follow. This value is the number of bytes actually written out, not the length of the string. Following the length, each character of the string is output, in sequence, using the modified UTF-8 encoding for the character. If no exception is thrown, the counter written is incremented by the total number of bytes written to the output stream. This will be at least two plus the length of str, and at most two plus thrice the length of str.

The key line:

First, two bytes are written to the output stream as if by the writeShort method giving the number of bytes to follow.

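So it writes two extra bytes first? Yes: they are a length prefix, not part of the text. writeUTF writes the number of encoded bytes that follow as a 16-bit value, high byte first, and in a text file those two bytes show up as the stray characters at the start of each line.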
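The prefix can be observed without a Hadoop cluster, because FSDataOutputStream inherits writeUTF from java.io.DataOutputStream. The following is only a minimal sketch: the class name WriteUtfPrefixDemo and its helper methods are made up for illustration, and an in-memory buffer stands in for the HDFS file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WriteUtfPrefixDemo {
    // Returns the raw bytes that DataOutputStream.writeUTF emits for s.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "abc\n" encodes to 4 bytes, so the prefix is 00 04.
        System.out.println(hex(writeUtfBytes("abc\n"))); // prints "00 04 61 62 63 0a"
    }
}
```

The first two bytes (00 04) are the length, and only the remaining four are the text itself. A text viewer shows the length bytes as unprintable or odd-looking characters, which matches the symptom above.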
Switching to the following code solved the problem:

finf.writeBytes(line);  // line: the String to write
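One caveat: DataOutputStream.writeBytes writes only the low eight bits of each char, so it is safe for ASCII text but mangles anything else, Chinese characters included. If the output may contain non-ASCII text, encoding the string explicitly and writing the resulting byte array is one alternative. The sketch below compares the two. PlainTextWriteDemo and its helpers are hypothetical names, and an in-memory buffer again stands in for the HDFS file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class PlainTextWriteDemo {
    // Writes s with writeBytes: no length prefix, but the high byte of each char is dropped.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Writes s as standard UTF-8 bytes: no length prefix and no data loss.
    static byte[] viaUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "\u4e2d\n"; // one CJK character (U+4E2D) plus a newline
        System.out.println(hex(viaWriteBytes(line))); // prints "2d 0a"
        System.out.println(hex(viaUtf8(line)));       // prints "e4 b8 ad 0a"
    }
}
```

In the job itself, the equivalent call would be finf.write(line.getBytes(StandardCharsets.UTF_8)), since FSDataOutputStream inherits write(byte[]) as well.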