Symptoms
Every so often mongos became unreachable, and the mongos log filled with socket connection errors:
2017-08-08T17:09:31.095+0800 I SHARDING [conn52677] couldn't find database [d4ee211b4024a2d9e6bd57d4a28679339d6b0692] in config db
2017-08-08T17:09:33.768+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:09:33 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:10:03.779+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:10:03 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:10:33.787+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:10:33 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:11:03.798+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:11:03 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:11:33.806+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:11:33 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:12:03.815+0800 I SHARDING [LockPinger] cluster 10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607 pinged successfully at Tue Aug 8 17:12:03 2017 by distributed lock pinger '10.10.219.96:27607,10.10.223.127:27607,10.10.236.43:27607/6a0a848974c9:27017:1502169957:1804289383',
sleeping for 30000ms
2017-08-08T17:12:29.180+0800 I SHARDING [conn52677] put [d4ee211b4024a2d9e6bd57d4a28679339d6b0692] on: 044dd16e-7706-4a49-b4ff-73d86a99d6fd:044dd16e-7706-4a49-b4ff-73d86a99d6fd/10.10.131.20:27607,10.10.139.113:27607,10.10.140.40:27607
2017-08-08T17:12:29.181+0800 I SHARDING [conn52677] Exception thrown while processing query op for d4ee211b4024a2d9e6bd57d4a28679339d6b0692.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.59.129:30983]
2017-08-08T17:12:29.181+0800 I NETWORK [conn52677] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.10.59.129:30983]
2017-08-08T17:12:29.181+0800 I SHARDING [conn52678] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.237.57:16624]
2017-08-08T17:12:29.181+0800 I NETWORK [conn52678] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.10.237.57:16624]
2017-08-08T17:12:29.182+0800 I SHARDING [conn51881] Exception thrown while processing query op for 4773506c0c9ab0871c4593fd13250fa07d0d4e90.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.59.129:19704]
2017-08-08T17:12:29.182+0800 I NETWORK [conn51881] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.10.59.129:19704]
2017-08-08T17:12:29.182+0800 I SHARDING [conn47787] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.77.67:56779]
2017-08-08T17:12:29.182+0800 I SHARDING [conn46993] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.237.57:57994]
2017-08-08T17:12:29.182+0800 I SHARDING [conn46922] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.237.57:57898]
2017-08-08T17:12:29.182+0800 I SHARDING [conn44943] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.44.190:31490]
2017-08-08T17:12:29.182+0800 I SHARDING [conn46911] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.63.59:62378]
2017-08-08T17:12:29.183+0800 I SHARDING [conn46764] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.44.190:63754]
2017-08-08T17:12:29.183+0800 I SHARDING [conn46936] Exception thrown while processing query op for admin.$cmd :: caused by :: 9001 socket exception [SEND_ERROR] server [10.10.36.82:3261]
2017-08-08T17:12:29.183+0800 I NETWORK [conn47787] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.10.77.67:56779]
Troubleshooting steps
1. Suspected the PHP driver. Connected to mongos from the mongo shell once per second; when the failure recurred, the mongo shell did hang for a stretch (from 12:00 to 12:04 the shell client was stuck), which ruled out the driver.
2. All other monitoring looked normal; CPU and I/O on the affected mongos were low.
3. Set up connection-count monitoring at a 1-second interval and found that the connection count dropped sharply when the failure occurred.
4. Set up monitoring of open file descriptors and similar metrics; nothing turned up.
5. Finally noticed that every occurrence of the failure was accompanied by a pair of log entries like the following, and suspected they were the cause:
2017-08-08T17:09:31.095+0800 I SHARDING [conn52677] couldn't find database [d4ee211b4024a2d9e6bd57d4a28679339d6b0692] in config db
2017-08-08T17:12:29.180+0800 I SHARDING [conn52677] put [d4ee211b4024a2d9e6bd57d4a28679339d6b0692] on: 044dd16e-7706-4a49-b4ff-73d86a99d6fd:044dd16e-7706-4a49-b4ff-73d86a99d6fd/10.10.131.20:27607,10.10.139.113:27607,10.10.140.40:27607
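As an aside on step 3, the 1-second connection monitor boils down to sampling the mongos connection count (in practice, db.serverStatus().connections.current) and flagging sharp drops. A minimal sketch of the drop-detection logic; the 50% threshold is an arbitrary assumption, not what we actually used:

```python
def find_connection_drops(samples, drop_ratio=0.5):
    """Given (timestamp, connection_count) samples taken at a fixed
    interval, return the timestamps where the count fell by more than
    drop_ratio compared with the previous sample."""
    drops = []
    for (t_prev, c_prev), (t_cur, c_cur) in zip(samples, samples[1:]):
        if c_prev > 0 and c_cur < c_prev * (1 - drop_ratio):
            drops.append(t_cur)
    return drops

# Example: connections collapse at t=3, as seen during the outage.
samples = [(0, 980), (1, 1003), (2, 990), (3, 120), (4, 95)]
print(find_connection_drops(samples))  # [3]
```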
Detailed analysis
The two log entries from step 5 above correspond to creating a new database and then placing it on a shard.
This mongos cluster holds more than 20,000 databases across 5 shards, and new databases are created frequently.
The official documentation has this to say:
The mongos selects the primary shard when creating a new database by picking the shard in the cluster that has the least amount of data. mongos uses the totalSize field returned by the listDatabases command as a part of the selection criteria.
That is, when mongos creates a new database, it places it on the shard that currently holds the least data.
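The selection rule can be sketched as follows; the shard names and totalSize values are made up for illustration, and a real mongos reads totalSize from each shard's listDatabases reply rather than from a dict:

```python
def pick_primary_shard(shard_sizes):
    """Pick the shard with the smallest total data size, mirroring the
    documented rule: the primary shard for a new database is the shard
    holding the least amount of data."""
    return min(shard_sizes, key=shard_sizes.get)

# Hypothetical totalSize values (bytes) for a 5-shard cluster.
sizes = {
    "shard1": 120 * 2**30,
    "shard2": 95 * 2**30,
    "shard3": 130 * 2**30,
    "shard4": 101 * 2**30,
    "shard5": 99 * 2**30,
}
print(pick_primary_shard(sizes))  # shard2
```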
Following the log messages into the MongoDB source code shows that creating a database mainly comes down to calling the Shard class's pick() method, shown below.
pick() calls getStatus() on each shard and scans the results to choose the shard with the smallest data size as the place to create the database.
Shard Shard::pick(const Shard& current) {
    vector<Shard> all;
    staticShardInfo.getAllShards(all);
    if (all.size() == 0) {
        staticShardInfo.reload();
        staticShardInfo.getAllShards(all);
        if (all.size() == 0)
            return EMPTY;
    }

    // if current shard was provided, pick a different shard only if it is a better choice
    ShardStatus best = all[0].getStatus();
    if (current != EMPTY) {
        best = current.getStatus();
    }

    for (size_t i = 0; i < all.size(); i++) {
        ShardStatus t = all[i].getStatus();
        if (t < best)
            best = t;
    }

    LOG(1) << "best shard for new allocation is " << best << endl;
    return best.shard();
}
Looking next at getStatus(): it calls getShardDataSizeBytes() to obtain the shard's data size, and along the way calls getShardMongoVersion() (implemented via db.serverStatus()) to keep versions consistent, which is not what we care about here.
ShardStatus Shard::getStatus() const {
    return ShardStatus(
        *this, getShardDataSizeBytes(getConnString()), getShardMongoVersion(getConnString()));
}
getShardDataSizeBytes() turns out to simply run listDatabases on the shard, i.e. the command behind show dbs, which lists every database together with its size:
long long Shard::getShardDataSizeBytes(const string& shardHost) {
    ScopedDbConnection conn(shardHost);
    BSONObj listDatabases;
    bool ok = conn->runCommand("admin", BSON("listDatabases" << 1), listDatabases);
    conn.done();

    uassert(28599,
            str::stream() << "call to listDatabases on " << shardHost
                          << " failed: " << listDatabases,
            ok);

    BSONElement totalSizeElem = listDatabases["totalSize"];
    uassert(28590, "totalSize field not found in listDatabases", totalSizeElem.isNumber());
    return listDatabases["totalSize"].numberLong();
}
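In Python terms, the two uassert checks above amount to validating the command reply and reading its totalSize field. This sketch runs against a hand-written listDatabases-shaped dict rather than a live shard:

```python
def shard_data_size_bytes(reply, ok):
    """Mirror getShardDataSizeBytes: fail if the command failed or the
    reply lacks a numeric totalSize, otherwise return it."""
    if not ok:
        raise RuntimeError("call to listDatabases failed: %r" % (reply,))
    total = reply.get("totalSize")
    if not isinstance(total, (int, float)):
        raise RuntimeError("totalSize field not found in listDatabases")
    return int(total)

# A trimmed, hand-written reply in the shape listDatabases returns.
reply = {"databases": [{"name": "db1", "sizeOnDisk": 4096}],
         "totalSize": 4096, "ok": 1}
print(shard_data_size_bytes(reply, True))  # 4096
```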
Conclusion
This mongos simply has too many databases. Creating a new database takes a global lock and then runs show dbs (listDatabases) against every shard in turn to find the one with the least data; with this many databases that step is far too slow, and while it runs the mongos is effectively unavailable. A test confirmed that show dbs through this mongos takes over a minute, which matches the duration of each outage.
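The order of magnitude checks out: listDatabases has to visit every database on a shard, and pick() runs it serially against every shard while the lock is held. A back-of-the-envelope sketch; the 3 ms per-database cost and the even distribution of databases across shards are assumptions, not measurements:

```python
def estimated_pick_seconds(total_dbs, num_shards, ms_per_db=3.0):
    """Rough serial cost of pick(): one listDatabases per shard, each
    of which touches every database hosted on that shard."""
    dbs_per_shard = total_dbs / num_shards       # assume even spread
    per_shard = dbs_per_shard * ms_per_db / 1000.0
    return per_shard * num_shards                # shards scanned serially

# ~20,000 databases across 5 shards at an assumed 3 ms per database.
print(estimated_pick_seconds(20000, 5))  # 60.0
```

About a minute, consistent with both the measured show dbs time and the length of each outage.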