運營總有各種各樣的需求,今天運營需要我做一個文件上傳的功能,文件格式是.txt文件,內容是每一行是一個uid,具體需求是,後臺上傳一份uid的白名單,如果用戶的uid在這份白名單上,則彈窗。總數是500萬左右(不定),目前文件是60多M。
接到這個需求後,我首先想到的是數據存在redis上的集合裏(因爲uid大部分是10位數,不適合用bitmap)。但是數據量太大,佔用資源還是很大的(每個彈窗的每個平臺和每種語言都有一份不同的白名單,後面運營說其實一次性不會上那麼多。所以決定這樣做,但彈窗下線後,把redis數據刪了)。
廢話不多說,直接上代碼。
後臺form 表單提交,因爲文件是60多M,所以採用分片上傳,用的是百度的webUpload。http://fex.baidu.com/webuploader
主要前端代碼如下
<!--文件分片上傳-->
<!--引入CSS-->
<link rel="stylesheet" type="text/css" href="/plugs/webUploader/webuploader.css">
<!--引入JS-->
<script type="text/javascript" src="/plugs/webUploader/webuploader.js"></script>
<div class="form-group wu-example" id="target_common" style="display: none;" class="wu-example">
<label class="col-md-3 control-label"></label>
<div class=" col-md-3 btns">
<div id="picker">選擇文件</div>
<!-- <button id="ctlBtn" class="btn btn-default">開始上傳</button>-->
</div>
<!--用來存放文件信息-->
<div id="thelist">
<div>
<a href=javascript:;"><?= isset($info['target_user_name']) && $info['target_user_name'] ? $info['target_user_name'] : '' ?></a>
</div>
</div>
<input type="hidden" id="target_common_temp_url" name="target_common_temp_url" />
</div>
<input type="hidden" id="target_user_name" name="target_user_name" value="<?= isset($info['target_user_name']) && !empty($info['target_user_name'])?$info['target_user_name']:''?>"/>
var uploader = WebUploader.create({
swf: '/plugs/webUploader/Uploader.swf',// swf文件路徑
server: '/toolcontent/operation_position/popwindow_target_upload',// 文件接收服務端。
pick: '#picker', // 選擇文件的按鈕。可選。內部根據當前運行是創建,可能是input元素,也可能是flash.
resize: false, // 不壓縮image, 默認如果是jpeg,文件上傳前會壓縮一把再上傳!
chunked: true, //是否要分片處理大文件上傳
chunkSize:2 * 1024 * 1024, //分片上傳,每片2M,默認是5M
auto: true,
chunkRetry : 2, //如果某個分片由於網絡問題出錯,允許自動重傳次數
//runtimeOrder: 'html5,flash',
accept: {
title: '文件',
extensions: 'txt',
mimeTypes: 'text/plain'
}
});
// 當有文件被添加進隊列的時候
uploader.on( 'fileQueued', function( file ) {
var $list = $("#thelist");
/*$list.append( '<div id="' + file.id + '" class="item">' +
'<h4 class="info">' + file.name + '</h4>' +
'<p class="state">等待上傳...</p>' +
'</div>' );*/
//只顯示一個
$list.html( '<div id="' + file.id + '" class="item">' +
'<h4 class="info">' + file.name + '</h4>' +
'<p class="state">等待上傳...</p>' +
'</div>' );
});
// 文件上傳過程中創建進度條實時顯示。
uploader.on( 'uploadProgress', function( file, percentage ) {
var $li = $( '#'+file.id ),
$percent = $li.find('.progress .progress-bar');
// 避免重複創建
if ( !$percent.length ) {
$percent = $('<div class="progress progress-striped active">' +
'<div class="progress-bar" role="progressbar" style="width: 0%">' +
'</div>' +
'</div>').appendTo( $li ).find('.progress-bar');
}
$li.find('p.state').text('上傳中');
$percent.css( 'width', percentage * 100 + '%' );
});
//文件上傳成功或者失敗管理
uploader.on( 'uploadSuccess', function(file,response ) {
console.log(response);
$("#target_common_temp_url").val(response.filePath);
$("#target_user_name").val(response.oldName);
$( '#'+file.id ).find('p.state').text('已上傳');
});
uploader.on( 'uploadError', function( file ) {
$( '#'+file.id ).find('p.state').text('上傳出錯');
});
//文件上傳完成
uploader.on( 'uploadComplete', function( file,response ) {
$( '#'+file.id ).find('.progress').fadeOut();
});
主要後端代碼:
public function main()
{
$targetDir = '/www/privdata/xxxx/target_user_tmp';//存放分片臨時目錄
$uploadDir = '/www/privdata/xxxx/target_user';//分片合併存放目錄
$cleanupTargetDir = true; // Remove old files
$maxFileAge = 5 * 3600; // Temp file age in seconds
// 創建文件夾
if (!file_exists($targetDir)) {
mkdir($targetDir,0777,true);
}
if (!file_exists($uploadDir)) {
mkdir($uploadDir,0777,true);
}
// 獲得文件名稱
if (isset($_REQUEST["name"])) {
$fileName = $_REQUEST["name"];
} elseif (!empty($_FILES)) {
$fileName = $_FILES["file"]["name"];
} else {
$fileName = uniqid("file_");
}
$oldName = $fileName;
$fileName = iconv('UTF-8','gb2312',$fileName);
$filePath = $targetDir . DIRECTORY_SEPARATOR . $fileName;
$chunk = isset($_REQUEST["chunk"]) ? intval($_REQUEST["chunk"]) : 0;
$chunks = isset($_REQUEST["chunks"]) ? intval($_REQUEST["chunks"]) : 1;
$response = [
'code' => 0,
'msg' => ''
];
// 移除舊文件
if ($cleanupTargetDir) {
if (!is_dir($targetDir) || !$dir = opendir($targetDir)) {
$response['msg'] = 'Failed to open temp directory111';
echo json_encode($response);exit;
}
while (($file = readdir($dir)) !== false) {
$tmpfilePath = $targetDir . DIRECTORY_SEPARATOR . $file;
// If temp file is current file proceed to the next
if ($tmpfilePath == "{$filePath}_{$chunk}.part" || $tmpfilePath == "{$filePath}_{$chunk}.parttmp") {
continue;
}
// Remove temp file if it is older than the max age and is not the current file
if (preg_match('/\.(part|parttmp)$/', $file) && (filemtime($tmpfilePath) < time() - $maxFileAge)) {
unlink($tmpfilePath);
}
}
closedir($dir);
}
// 打開臨時文件
if (!$out = fopen("{$filePath}_{$chunk}.parttmp", "wb")) {
$response['msg'] = 'Failed to open output stream222';
echo json_encode($response);exit;
}
if (!empty($_FILES)) {
if ($_FILES["file"]["error"] || !is_uploaded_file($_FILES["file"]["tmp_name"])) {
$response['msg'] = 'Failed to move uploaded file333';
echo json_encode($response);exit;
}
// Read binary input stream and append it to temp file
if (!$in = fopen($_FILES["file"]["tmp_name"], "rb")) {
$response['msg'] = 'Failed to open input stream444';
echo json_encode($response);exit;
}
} else {
if (!$in = fopen("php://input", "rb")) {
$response['msg'] = 'Failed to open input stream555';
echo json_encode($response);exit;
}
}
while ($buff = fread($in, 4096)) {
fwrite($out, $buff);
}
fclose($out);
fclose($in);
rename("{$filePath}_{$chunk}.parttmp", "{$filePath}_{$chunk}.part");
$done = true;
for( $index = 0; $index < $chunks; $index++ ) {
if ( !file_exists("{$filePath}_{$index}.part") ) {
$done = false;
break;
}
}
if ($done) {
$pathInfo = pathinfo($fileName);
$hashStr = substr(md5($pathInfo['basename']),8,16);
$hashName = time() . $hashStr . '.' .$pathInfo['extension'];
$uploadPath = $uploadDir . DIRECTORY_SEPARATOR .$hashName;
if (!$out = fopen($uploadPath, "wb")) {
$response['msg'] = 'Failed to open output stream';
echo json_encode($response);exit;
}
//flock($hander,LOCK_EX)文件鎖
if ( flock($out, LOCK_EX) ) {
for( $index = 0; $index < $chunks; $index++ ) {
if (!$in = fopen("{$filePath}_{$index}.part", "rb")) {
break;
}
while ($buff = fread($in, 4096)) {
fwrite($out, $buff);
}
fclose($in);
unlink("{$filePath}_{$index}.part");
}
flock($out, LOCK_UN);
}
fclose($out);
$response = [
'code' => 1,
'success'=>true,
'oldName'=>$oldName,
'filePath'=>$uploadPath,
// 'fileSize'=>$data['size'],
'fileSuffixes'=>$pathInfo['extension'], //文件後綴名
// 'file_id'=>$data['id'],
];
echo json_encode($response);exit;
}
$response = [
'code' => 1,
'success'=>true,
];
echo json_encode($response);exit;
}
現在大文件上傳解決完了,現在提交所有表單到服務端,然後服務端解析txt文件內容
如下代碼
function input_file($filename) {
$fp = fopen($filename,'r');//打開文件,如果打開失敗,本函數返回 FALSE。
if(!$fp){
return false;
}
/*$data = [];
$i = 0;
while (!feof($fp)) {
if ($i == 0) continue;
$i++;
$line = fgets($fp);
$line = str_replace("\n","",$line);
$data[] = $line;
}
fclose($fp);
return $data;*/
$str = '';
$buffer = 1024 * 1024;//每次讀取1024 * 1024字節
while (!feof($fp)) {
$str .= fread($fp, $buffer);
}
$arr = explode("\n", $str);
unset($arr[0]);
fclose($fp);
return $arr;
}
後臺解析txt文本壓力其實不大,主要是寫入redis集合中壓力很大。但redis集合可以一次性寫入多個value,代碼如下
public function setData($data)
{
if ($data) {
//先清空集合
$this->delData();
if (is_array($data)) {
//一次性寫1000個
$uidArr = [];
$success = $error = 0;
foreach ($data as $uid) {
if (trim($uid)) {
$uid = trim($uid);
$uidArr[] = $uid;
}
if (count($uidArr) > 1000) {
$res = $this->redis->sAdd($this->key, ...$uidArr);
if ($res) {
$success = $success + count($uidArr);
} else {
$error = $error + count($uidArr);
}
$uidArr = [];
}
}
//剩餘的寫入
if ($uidArr) {
$res = $this->redis->sAdd($this->key, ...$uidArr);
if ($res) {
$success = $success + count($uidArr);
} else {
$error = $error + count($uidArr);
}
}
return ['success'=>$success, 'error' => $error];
} else {
$this->redis->sAdd($this->key, $data);
}
$this->redis->expire($this->key, self::CACHE_TTL);
}
}
public function delData()
{
$this->redis->del($this->key);
}
可能出現的問題:
1、php 寫入大小的限制,修改php.ini文件,設置memory_limit = 2048M,默認是128M(Allowed memory size of 134217728 bytes exhausted (tried to allocate 4096 bytes) in xxxxxxxx on line 209)
2、如果是直接post 提交(本文不存在),則配置php.ini文件:設置post_max_size = 80M,默認才16M
3、nginx 配置:可以自己去查一下
client_max_body_size 100m;//最大上傳500M
client_body_buffer_size 100m;//最大上傳500M
proxy_read_timeout 300;//該指令設置與代理服務器的讀超時時間。它決定了nginx會等待多長時間來獲得請求的響應。這個時間不是獲得整個response的時間,而是兩次reading操作的時間。默認60秒
4、fastcgi配置,可以百度一下這個參數是幹嘛的
fastcgi_buffer_size 1024k;
fastcgi_buffers 64 1024k;
改進方法,由於數據太大,redis存儲要太久,讀取文件數據到數組佔用內存太高,所有這邊做一個優化:
前臺form提交表單,把分片上傳的文件path提交到後端。後端只需要用
if (filesize($target_common_temp_url) == 0) {
$this->ajax_error('txt文件沒有任何數據!');
}
判斷文件內容是否爲空,如果不爲空,插入redis,用yield和一次性插入多個value。這樣降低了內存的使用,也減少了時間,redis代碼如下:
public function setDataNew($file)
{
if (is_file($file)) {
$this->delData();//先清空集合
$uidArr = [];
$success = $error = 0;
foreach ($this->getLines($file) as $n => $line) {
if ($n == 0) continue; // 去掉第一行
$uidArr[] = (int)trim($line);
if (count($uidArr) > 20000) {
$res = $this->redis->sAdd($this->key, ...$uidArr);
if ($res) {
$success = $success + count($uidArr);
} else {
$error = $error + count($uidArr);
}
$uidArr = [];
}
}
if ($uidArr) { //剩餘的寫入
$res = $this->redis->sAdd($this->key, ...$uidArr);
if ($res) {
$success = $success + count($uidArr);
} else {
$error = $error + count($uidArr);
}
}
$this->redis->expire($this->key, self::CACHE_TTL);
return ['success'=>$success, 'error' => $error];
}
}
public function delData()
{
$this->redis->del($this->key);
}
//讀取文件
private function getLines($file)
{
$f = fopen($file, 'r');
try {
while ($line = fgets($f)) {
yield $line;
}
} finally {
fclose($f);
}
}
其他可能用得到需要調試的函數:
//用於判斷內存的使用
//echo $this->formatBytes(memory_get_peak_usage());
function formatBytes($bytes)
{
if ($bytes < 1024) {
return $bytes . "b";
} else if ($bytes < 1048576) {
return round($bytes / 1024, 2) . "kb";
}
return round($bytes / 1048576, 2) . 'mb';
}
/*
* 十三位時間戳,包含毫秒1535423356248
* https://blog.csdn.net/tcf_jingfeng/article/details/82143440
*/
function msectime()
{
list($msec, $sec) = explode(' ', microtime());
$msectime = (float)sprintf('%.0f', (floatval($msec) + floatval($sec)) * 1000);
return $msectime;
}