Fixing garbled Chinese text when viewing files in WebSVN (2.3.3)

Environment: AIX 5.3, Apache 2.2.10, PHP 5.2.17, WebSVN 2.3.2

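Problem: Chinese text in SVN commit comments displays correctly, and so do UTF-8 encoded files. Files saved without an explicit encoding (C source files on AIX, for example) show all Chinese characters as garbage.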

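Solution: Reading the WebSVN source, the problem appeared to be in the detectCharacterEncoding function, which lists only two encodings, ISO-8859-1 and UTF-8. Since most files on this system are GB18030 encoded, I simply replaced ISO-8859-1 with GB18030. Testing showed that this change alone did not work. The Apache server log showed: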

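"Wrong charset, conversion from `GB18030' to `GB18030//TRANSLIT//IGNORE' is not allowed"

So the iconv call was a problem as well. I removed the //TRANSLIT//IGNORE suffix there, then made the same change in the toOutputEncoding function, removing the suffix from its iconv call too. With that done, Chinese text displays correctly.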
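
As an alternative to deleting the suffix outright, the target charset could be probed once at runtime, so that the suffix is kept on platforms whose iconv accepts it. The following is only a sketch under that assumption; the helper name iconvTargetCharset is hypothetical and is not part of WebSVN.

```php
<?php
// Hypothetical helper (not part of WebSVN): returns the target charset
// string to pass to iconv(), with the //TRANSLIT//IGNORE suffix only
// when the local iconv implementation accepts it.
function iconvTargetCharset($charset) {
    $withSuffix = $charset.'//TRANSLIT//IGNORE';

    // iconv() returns false (and raises a notice, silenced here with @)
    // when the conversion descriptor cannot be opened, which is what the
    // "Wrong charset ... is not allowed" log entry reports.
    $probe = @iconv($charset, $withSuffix, 'a');

    return ($probe === false) ? $charset : $withSuffix;
}

// Usage: convert to UTF-8 with whichever form the platform supports.
$out = iconv('UTF-8', iconvTargetCharset('UTF-8'), 'abc');
```

On a system like the one described here the helper would fall back to the bare charset name, which is the same behaviour as the manual edit.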

The changes are as follows:

------------------

function detectCharacterEncoding($str) {
        $list = array(/*'ASCII',*/ 'UTF-8', 'GB18030');
      //  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ISO-8859-1 changed to GB18030
        if (function_exists('mb_detect_encoding')) {
                // @see http://de3.php.net/manual/en/function.mb-detect-encoding.php#81936
                // why appending an 'a' and specifying an encoding list is necessary
                return mb_detect_encoding($str.'a', $list);

        } else if (function_exists('iconv')) {
                foreach ($list as $item) {
                        //$encstr = iconv($item, $item.'//TRANSLIT//IGNORE', $str);
                                                  //    ^^^^^^^^^^^^^^^^^^^^ AIX's iconv does not support this suffix, so it is removed
                        $encstr = iconv($item, $item, $str);
                        // strict comparison, so "0e..." hashes are not compared as numbers
                        if (md5($encstr) === md5($str)) return $item;
                }
        }

        return null;
}

function toOutputEncoding($str) {
        $enc = detectCharacterEncoding($str);

        if ($enc !== null && function_exists('mb_convert_encoding')) {
                $str = mb_convert_encoding($str, 'UTF-8', $enc);

        } else if ($enc !== null && function_exists('iconv')) {
                //$str = iconv($enc, 'UTF-8//TRANSLIT//IGNORE', $str);
                                         //  ^^^^^^^^^^^^^^^^^^^^ same as above, removed
                $str = iconv($enc, 'UTF-8', $str);

        } else {
                // @see http://w3.org/International/questions/qa-forms-utf-8.html
                $isUtf8 = preg_match('%^(?:
                        [\x09\x0A\x0D\x20-\x7E]              # ASCII
                        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
                        |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
                        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
                        |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
                        |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
                        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
                        |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
                        )*$%xs', $str
                );
                if (!$isUtf8) $str = utf8_encode($str);
        }

        return $str;
}
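
To see why the order of the list (UTF-8 first, then GB18030) works in the iconv branch, the round-trip check can be tried on a small sample. This is a minimal sketch, assuming the local iconv knows the GB18030 charset; the function name roundTrips and the sample bytes are illustrative and not taken from WebSVN.

```php
<?php
// The two characters "中文" encoded as GB18030 (illustrative sample input).
$gbBytes = "\xD6\xD0\xCE\xC4";

// Same idea as the iconv branch of detectCharacterEncoding: a charset is
// accepted when converting the bytes to that same charset leaves them
// unchanged. iconv() fails on byte sequences that are invalid in $charset.
function roundTrips($charset, $bytes) {
    $out = @iconv($charset, $charset, $bytes);
    return $out !== false && $out === $bytes;
}

$isUtf8    = roundTrips('UTF-8', $gbBytes);   // these bytes are not valid UTF-8
$isGb18030 = roundTrips('GB18030', $gbBytes); // but they are valid GB18030

// Once the source charset is known, the conversion used in
// toOutputEncoding is a plain iconv() call without any suffix.
$utf8 = iconv('GB18030', 'UTF-8', $gbBytes);
echo bin2hex($utf8), "\n";
```

Because the sample is rejected as UTF-8, the loop moves on to GB18030, and the bytes are then converted to UTF-8 for output.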
