WebSVN (2.3.3): fixing garbled Chinese text when viewing files

Environment: AIX 5.3, Apache 2.2.10, PHP 5.2.17, WebSVN 2.3.2

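Problem: Chinese in SVN commit comments displays correctly, and so do UTF-8 encoded files. But in files with no explicitly set encoding (C source files on AIX, for example), all Chinese text shows up garbled.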

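Solution: Reading the WebSVN source, the problem appeared to lie in the detectCharacterEncoding function, which lists only two encodings, ISO-8859-1 and UTF-8. Since most files on this system are GB18030 encoded, I simply replaced ISO-8859-1 with GB18030. Testing showed that this alone was not enough. The Apache server log showed: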

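"Wrong charset, conversion from `GB18030' to `GB18030//TRANSLIT//IGNORE' is not allowed" So the iconv call was also a problem. I removed the //TRANSLIT//IGNORE suffix from it, then made the same change to the iconv call in the toOutputEncoding function. With that, Chinese text displays correctly.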
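If the same WebSVN copy has to run on systems whose iconv does accept the suffix, one alternative to deleting it outright is to try the suffixed form first and fall back to the plain charset name. The following is only a minimal sketch of that idea; iconvWithFallback is a hypothetical helper name, not something that exists in WebSVN, and it was not part of the fix described here.

```php
<?php
// Hypothetical helper (not part of WebSVN): try the "//TRANSLIT//IGNORE"
// suffix first, then retry with the bare target charset if iconv rejects it.
function iconvWithFallback($from, $to, $str) {
    // "@" suppresses the "Wrong charset" notice on iconv implementations
    // (such as the one on AIX in this post) that do not accept the suffix.
    $out = @iconv($from, $to . '//TRANSLIT//IGNORE', $str);
    if ($out === false) {
        $out = @iconv($from, $to, $str);
    }
    return $out; // still false if both attempts failed
}
```

With such a helper, the two iconv calls in the functions below would become iconvWithFallback($item, $item, $str) and iconvWithFallback($enc, 'UTF-8', $str).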

The changes are as follows:

------------------

function detectCharacterEncoding($str) {
        $list = array(/*'ASCII',*/ 'UTF-8', 'GB18030');
      //  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ISO-8859-1 changed to GB18030
        if (function_exists('mb_detect_encoding')) {
                // @see http://de3.php.net/manual/en/function.mb-detect-encoding.php#81936
                // why appending an 'a' and specifying an encoding list is necessary
                return mb_detect_encoding($str.'a', $list);

        } else if (function_exists('iconv')) {
                foreach ($list as $item) {
                        //$encstr = iconv($item, $item.'//TRANSLIT//IGNORE', $str);
                                                  //    ^^^^^^^^^^^^^^^^^^^^ AIX iconv does not accept this suffix, so it is removed
                        $encstr = iconv($item, $item, $str);
                        if (md5($encstr) == md5($str)) return $item;
                }
        }

        return null;
}

function toOutputEncoding($str) {
        $enc = detectCharacterEncoding($str);

        if ($enc !== null && function_exists('mb_convert_encoding')) {
                $str = mb_convert_encoding($str, 'UTF-8', $enc);

        } else if ($enc !== null && function_exists('iconv')) {
                //$str = iconv($enc, 'UTF-8//TRANSLIT//IGNORE', $str);
                                         //  ^^^^^^^^^^^^^^^^^^^^ same as above, removed
                $str = iconv($enc, 'UTF-8', $str);

        } else {
                // @see http://w3.org/International/questions/qa-forms-utf-8.html
                $isUtf8 = preg_match('%^(?:
                        [\x09\x0A\x0D\x20-\x7E]              # ASCII
                        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
                        |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
                        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
                        |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
                        |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
                        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
                        |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
                        )*$%xs', $str
                );
                if (!$isUtf8) $str = utf8_encode($str);
        }

        return $str;
}
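To sanity-check the iconv branch outside WebSVN, a small standalone script can feed it known GB18030 bytes. This is a sketch under assumptions: the iconv extension is loaded, the underlying iconv library knows the GB18030 charset, and roundTrips is a hypothetical name. It compares strings directly instead of comparing md5 hashes as the patched function does, which is equivalent for this purpose.

```php
<?php
// Sample input: the two characters "中文" encoded as GB18030 (bytes D6 D0 CE C4).
$gb = "\xD6\xD0\xCE\xC4";

// Same idea as the patched iconv branch: a string is treated as being in
// $enc if converting it from $enc to $enc leaves it unchanged.
function roundTrips($enc, $str) {
    $out = @iconv($enc, $enc, $str);
    return $out !== false && $out === $str;
}

$isUtf8 = roundTrips('UTF-8', $gb);    // these bytes are not valid UTF-8
$isGb   = roundTrips('GB18030', $gb);  // but they are valid GB18030

// The conversion toOutputEncoding performs once GB18030 has been detected.
$utf8 = @iconv('GB18030', 'UTF-8', $gb);

var_dump($isUtf8, $isGb, bin2hex($utf8));
```

Because UTF-8 comes first in the list and these bytes fail the UTF-8 round trip, detection falls through to GB18030, which is why the order array('UTF-8', 'GB18030') works for a mix of UTF-8 and GB18030 files.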
