http://bugs.php.net/bug.php?id=37738
Problem Description:
------------
Simply put, basename() does not work with Japanese file paths. If the filename is Japanese, only the extension part of the filename is returned, so a filename "/folder/ファイル名.txt" resolves to just ".txt". I discovered the problem when calling basename() on the 'name' element of the $_FILES array for uploaded Japanese files; however, after testing, the bug occurs no matter how you supply the filename.
My PHP environment is running with UTF-8 internal encoding.
The code snippet below illustrates this perfectly.
Reproduce code:
---------------
<?php
// show normal behavior with roman filename
$filename='/myfolder/roman_filename.txt';
echo "The full filename of the romanized file is $filename.\n"; // /myfolder/roman_filename.txt
$basename=basename($filename);
echo "The basename of the romanized file is $basename.\n"; // roman_filename.txt
// show behavior with Japanese filename
$filename='/myfolder/日本語のファイル名.txt';
echo "The full filename of the Japanese file is $filename.\n"; // /myfolder/日本語のファイル名.txt
$basename=basename($filename);
echo "The basename of the Japanese file is $basename."; // .txt
?>
Expected result:
----------------
The full filename of the romanized file is /myfolder/roman_filename.txt.
The basename of the romanized file is roman_filename.txt.
The full filename of the Japanese file is /myfolder/日本語のファイル名.txt.
The basename of the Japanese file is 日本語のファイル名.txt.
Actual result:
--------------
The full filename of the romanized file is /myfolder/roman_filename.txt.
The basename of the romanized file is roman_filename.txt.
The full filename of the Japanese file is /myfolder/日本語のファイル名.txt.
The basename of the Japanese file is .txt.
Solution1:
$filename = substr($down, strrpos($down, '/') + 1); // $down holds the full path
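One caveat with the one-liner above: when the path contains no '/', strrpos() returns false, and false + 1 evaluates to 1, so the first character of the name is dropped. Below is a minimal sketch that guards against this. The helper name basename_by_slash() is made up for illustration and is not from the original post, and it assumes forward-slash separators only. Working on raw bytes is safe for UTF-8 paths, because every byte of a multibyte UTF-8 character is 0x80 or above and so can never be mistaken for '/'.

```php
<?php
// Hypothetical helper; the name is an assumption for illustration.
function basename_by_slash(string $path): string
{
    // Drop trailing slashes so '/myfolder/' yields 'myfolder', like basename().
    $path = rtrim($path, '/');

    $pos = strrpos($path, '/');
    if ($pos === false) {
        // No slash at all: the whole string is already the file name.
        return $path;
    }

    return substr($path, $pos + 1);
}

echo basename_by_slash('/myfolder/日本語のファイル名.txt'), "\n"; // 日本語のファイル名.txt
echo basename_by_slash('plain.txt'), "\n";                        // plain.txt
```

Unlike basename(), this sketch does not consult the locale, so its result does not depend on the server's settings.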
Solution2:
http://hi.baidu.com/janson6788/blog/item/91cdfa6d5c5d15f0431694b2.html
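PHP has a basename() function that extracts the filename part of a path string, i.e. the part after the last "/" or "\". On some platforms, however, such as my Fedora 11 + Apache setup, the Chinese part of a path containing Chinese characters is lost.
At first I wanted to write my own replacement function, but then I found a better approach online that uses a regular expression: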
function get_basename($filename)
{
    // Strip everything up to and including the last "/" or "\".
    return preg_replace('/^.+[\\\\\\/]/', '', $filename);
}
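This function is clearly better than the one I wrote myself (which I won't post :) ), in both speed and accuracy.
So from now on, whenever you need basename() in PHP, it is best to use this function instead.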
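A short usage sketch shows how the regular expression behaves. It also shows one limitation: because the pattern uses `.+`, it needs at least one character before the separator, so a file directly under the root such as "/file.txt" comes back unchanged. The sample paths and the second function, get_basename_relaxed(), are assumptions added for illustration and are not from the original post.

```php
<?php
// Pattern as given in the post: needs at least one character before the separator.
function get_basename($filename)
{
    return preg_replace('/^.+[\\\\\\/]/', '', $filename);
}

// Hypothetical variant (name and pattern are assumptions): '.*' also matches
// an empty prefix, so a file directly under the root is handled too.
function get_basename_relaxed($filename)
{
    return preg_replace('#^.*[\\\\/]#', '', $filename);
}

echo get_basename('/myfolder/中文文件.txt'), "\n";       // 中文文件.txt
echo get_basename('/file.txt'), "\n";                    // /file.txt (no match, unchanged)
echo get_basename_relaxed('/file.txt'), "\n";            // file.txt
echo get_basename_relaxed('C:\\dir\\file.txt'), "\n";    // file.txt
```

Both versions return an empty string for a path that ends in a separator, which differs from basename().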
Solution3:
http://www.zivee.cn/2010/02/php-basename-dependent-on-locale-setting/
According to this site, basename() depends on the locale setting. If a multibyte name comes back empty, you can fix it by calling setlocale() as follows:
<?php
setlocale(LC_ALL, 'zh_CN.UTF8');
// or any other locale that can handle multibyte characters.
?>
It is best to change the server's locale setting so the problem is solved globally.
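When the server configuration cannot be changed, the script can check whether the locale was actually applied, since setlocale() returns false if none of the requested locales is installed. The following is a minimal sketch. The list of candidate locale names is an assumption (availability varies by system; `locale -a` lists them on Linux), and it sets only LC_CTYPE on the assumption that character classification is the category basename() relies on.

```php
<?php
// Candidate locale names are assumptions; which ones exist depends on the system.
$candidates = ['zh_CN.UTF-8', 'zh_CN.UTF8', 'ja_JP.UTF-8', 'en_US.UTF-8', 'C.UTF-8'];

// setlocale() tries each name in turn and returns the one it applied, or false.
$applied = setlocale(LC_CTYPE, $candidates);

if ($applied === false) {
    // No UTF-8 locale is installed; use a locale-independent approach instead.
    echo "No UTF-8 locale available\n";
} else {
    echo "Using locale: $applied\n";
    echo basename('/myfolder/日本語のファイル名.txt'), "\n";
}
```

Restricting the call to LC_CTYPE leaves number and date formatting untouched, which LC_ALL would also change.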