Question:
What is the fastest, easiest tool or method to convert text files between character sets?
Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.
Anything goes: one-liners in your favorite scripting language, command-line tools or other utilities for the OS, web sites, etc.
Best solutions so far:
On Linux/UNIX/OS X/cygwin:
GNU iconv, suggested by Troels Arvin, is best used as a filter. It seems to be universally available. Example:
$ iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt
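For systems without iconv, the same conversion can be sketched in a scripting language. The following is a minimal Python sketch, not part of the original answers; the function name `convert_file` and its parameters are hypothetical. It works on raw bytes so line endings are left untouched, and it raises an error if a character has no equivalent in the target character set.

```python
from pathlib import Path


def convert_file(src, dst, from_enc="utf-8", to_enc="iso-8859-15"):
    """Re-encode the file src into dst (hypothetical helper)."""
    # Read raw bytes and decode explicitly, so Python does not
    # translate newlines along the way.
    text = Path(src).read_bytes().decode(from_enc)
    # encode() raises UnicodeEncodeError if a character cannot be
    # represented in to_enc.
    Path(dst).write_bytes(text.encode(to_enc))
```

Calling `convert_file("in.txt", "out.txt")` goes from UTF-8 to ISO-8859-15; swapping the two encoding arguments goes the other way.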
As pointed out by Ben, there is an online converter using iconv.
GNU recode (manual), suggested by Cheekysoft, will convert one or several files in place. Example:
$ recode UTF8..ISO-8859-15 in.txt
This one uses shorter aliases:
$ recode utf8..l9 in.txt
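Where recode is not installed, in-place conversion of several files can be approximated as below. This is only a sketch under assumptions: the name `recode_in_place` is hypothetical, and it does not reproduce recode's own behavior in detail. Each file is written to a temporary file first and swapped in only if the conversion succeeded, so a failed conversion leaves the original untouched.

```python
import os
import shutil
import tempfile


def recode_in_place(paths, from_enc="utf-8", to_enc="iso-8859-15"):
    """Convert each file in paths in place (hypothetical helper)."""
    for path in paths:
        with open(path, "rb") as f:
            # Raises before anything is written if the data cannot be converted.
            data = f.read().decode(from_enc).encode(to_enc)
        # Write next to the original so the final rename stays on one filesystem.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as out:
                out.write(data)
            shutil.copymode(path, tmp)  # keep the original permission bits
            os.replace(tmp, path)       # swap the converted file in
        except BaseException:
            os.unlink(tmp)
            raise
```

For example, `recode_in_place(["a.txt", "b.txt"])` converts both files from UTF-8 to ISO-8859-15.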
Recode also supports surfaces, which can be used to convert between different line ending types and encodings:
Convert newlines from LF (Unix) to CR-LF (DOS):
$ recode ../CR-LF in.txt
Base64-encode a file:
$ recode ../Base64 in.txt
You can also combine them.
Convert a Base64-encoded UTF-8 file with Unix line endings to a Base64-encoded Latin-1 file with DOS line endings:
$ recode utf8/Base64..l1/CR-LF/Base64 file.txt
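To make the combined conversion concrete, here is a rough Python sketch of the same pipeline: undo the Base64 layer, change the character set and line endings, then re-apply Base64. The function name `recode_base64_text` is hypothetical, and the output is an assumption-laden approximation: unlike recode, which may wrap its Base64 output into lines, `base64.b64encode` returns one unbroken string.

```python
import base64


def recode_base64_text(data, from_enc="utf-8", to_enc="iso-8859-1"):
    """Base64(UTF-8, LF) bytes in, Base64(Latin-1, CR-LF) bytes out."""
    # Strip the Base64 layer and decode the text.
    text = base64.b64decode(data).decode(from_enc)
    # Normalise to LF first so existing CR-LF pairs are not doubled.
    text = text.replace("\r\n", "\n").replace("\n", "\r\n")
    # Re-encode in the target charset and re-apply Base64.
    return base64.b64encode(text.encode(to_enc))
```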
On Windows with PowerShell (Jay Bazuzi):
PS C:\> gc -en utf8 in.txt | Out-File -en ascii out.txt
(No ISO-8859-15 support though; it says that supported charsets are unicode, utf7, utf8, utf32, ascii, bigendianunicode, default, and oem.)
Edit
Do you mean iso-8859-1 support? Using "String" does this, e.g. for the reverse direction:
gc -en string in.txt | Out-File -en utf8 out.txt
Note: The possible enumeration values are "Unknown, String, Unicode, Byte, BigEndianUnicode, UTF8, UTF7, Ascii".
CsCvt (Kalytta's Character Set Converter) is another great command-line conversion tool for Windows.
Solution:
Reference 1: https://stackoom.com/question/Gs8
Reference 2: Best way to convert text files between character sets?