Original article: http://blog.csdn.net/mounty_fsc/article/details/51088361
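1 Introduction
QueuePair and Body are nested classes of DataReader. Each DataReader corresponds to one reading task, and each Body spawns one thread that reads from a database (e.g. examples/mnist/mnist_train_lmdb). A QueuePair is the link between the two and carries the communication between them.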
2 Source code
/**
 * @brief Reads data from a source to queues available to data layers.
 * A single reading thread is created per source, even if multiple solvers
 * are running in parallel, e.g. for multi-GPU training. This makes sure
 * databases are read sequentially, and that each solver accesses a different
 * subset of the database. Data is distributed to solvers in a round-robin
 * way to keep parallel training deterministic.
 */
class DataReader {
 public:
  ...
 protected:
  // Queue pairs are shared between a body and its readers
  class QueuePair {
   public:
    explicit QueuePair(int size);
    ~QueuePair();
    BlockingQueue<Datum*> free_;
    BlockingQueue<Datum*> full_;
  };
  // A single body is created per source
  class Body : public InternalThread {
   public:
    ...
   protected:
    void InternalThreadEntry();
    void read_one(db::Cursor* cursor, QueuePair* qp);
    const LayerParameter param_;
    BlockingQueue<shared_ptr<QueuePair> > new_queue_pairs_;
    ...
  };
  ...
  const shared_ptr<QueuePair> queue_pair_;
  shared_ptr<Body> body_;
  static map<const string, boost::weak_ptr<DataReader::Body> > bodies_;
};
2 Class QueuePair
DataReader::QueuePair::QueuePair(int size) {
  // Initialize the free queue with requested number of datums
  for (int i = 0; i < size; ++i) {
    free_.push(new Datum());
  }
}
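Notes:
- A QueuePair corresponds to one task queue. It buffers up to size samples read from the database (e.g. examples/mnist/mnist_train_lmdb).
- BlockingQueue is a thread-safe queue container. Its template type may be Datum, Batch, and so on; here it holds Datum pointers.
- BlockingQueue<Datum*> free_ is the queue of Datum objects that were freshly allocated with new and do not yet contain any raw (image) data.
- BlockingQueue<Datum*> full_ is the queue of Datum objects that have been filled from the database and contain raw (image) data.
- Datum is a single sample unit. See caffe.proto for its definition. In general, one Datum corresponds to one image (and its label).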
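The free_/full_ recycling described above can be sketched in standard C++. This is a minimal illustration under stated assumptions, not Caffe's code: Sample, SimpleBlockingQueue, SimpleQueuePair, produce_one and consume_one are hypothetical names, and std::mutex with std::condition_variable stands in for whatever Caffe's BlockingQueue uses internally.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>

// Hypothetical stand-in for caffe::Datum: one sample (payload + label).
struct Sample {
  std::string data;
  int label = 0;
};

// Minimal blocking queue (illustrative; not Caffe's BlockingQueue).
template <typename T>
class SimpleBlockingQueue {
 public:
  void push(const T& value) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(value);
    }
    not_empty_.notify_one();
  }

  // Blocks the calling thread until an element is available.
  T pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    T value = queue_.front();
    queue_.pop();
    return value;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable not_empty_;
  std::queue<T> queue_;
};

// Pair of queues: free_ holds empty samples, full_ holds filled ones.
struct SimpleQueuePair {
  explicit SimpleQueuePair(int size) {
    for (int i = 0; i < size; ++i) free_.push(new Sample());
  }
  ~SimpleQueuePair() {
    // The queues own their pointers, so drain and delete each one.
    while (free_.size() > 0) delete free_.pop();
    while (full_.size() > 0) delete full_.pop();
  }
  SimpleBlockingQueue<Sample*> free_;
  SimpleBlockingQueue<Sample*> full_;
};

// Producer step: take an empty sample, fill it, hand it over.
inline void produce_one(SimpleQueuePair* qp, const std::string& data,
                        int label) {
  Sample* s = qp->free_.pop();
  s->data = data;
  s->label = label;
  qp->full_.push(s);
}

// Consumer step: take a filled sample, use it, then recycle it.
inline int consume_one(SimpleQueuePair* qp) {
  Sample* s = qp->full_.pop();
  int label = s->label;
  qp->free_.push(s);
  return label;
}
```

Because every object only moves between the two queues, the total number of samples stays fixed at size, which bounds memory use and makes the producer wait whenever the consumer falls behind.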
3 Class Body
DataReader::Body::Body(const LayerParameter& param)
    : param_(param),
      new_queue_pairs_() {
  StartInternalThread();
}
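Notes:
1. Body inherits from InternalThread (covered in a separate post) and starts that thread in its constructor.
2. Body overrides DataReader::Body::InternalThreadEntry(). The actual reading from the database is implemented in that function; see Section 5 of this post.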
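The "start a worker thread in the constructor, run an overridden entry function" pattern can be sketched with std::thread. This is only an illustration under assumptions: ThreadBase and CountingBody are hypothetical classes, and the Start/Stop/must_stop members merely mimic the role that InternalThread plays here; they are not Caffe's actual implementation.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Illustrative base class loosely modeled on the role of InternalThread.
// All names and behavior here are assumptions, not Caffe's API.
class ThreadBase {
 public:
  virtual ~ThreadBase() { Stop(); }

  void Start() {
    stop_requested_ = false;
    thread_ = std::thread([this] { Entry(); });
  }

  void Stop() {
    stop_requested_ = true;
    if (thread_.joinable()) thread_.join();
  }

 protected:
  virtual void Entry() = 0;  // Thread body, supplied by subclasses.
  bool must_stop() const { return stop_requested_.load(); }

 private:
  std::atomic<bool> stop_requested_{false};
  std::thread thread_;
};

// Starts working as soon as it is constructed, as Body does.
class CountingBody : public ThreadBase {
 public:
  CountingBody() { Start(); }
  // Stop in the derived destructor so Entry() never runs on a
  // partially destroyed object.
  ~CountingBody() override { Stop(); }

  int count() const { return count_.load(); }

 protected:
  void Entry() override {
    while (!must_stop()) {
      ++count_;
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

 private:
  std::atomic<int> count_{0};
};
```

Constructing a CountingBody is enough to get the loop running in the background, and destroying it requests a stop and joins the thread.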
4 Class DataReader
The constructor of DataReader is as follows:
map<const string, weak_ptr<DataReader::Body> > DataReader::bodies_;
static boost::mutex bodies_mutex_;

DataReader::DataReader(const LayerParameter& param)
    : queue_pair_(new QueuePair(  //
        param.data_param().prefetch() * param.data_param().batch_size())) {
  // Get or create a body
  boost::mutex::scoped_lock lock(bodies_mutex_);
  string key = source_key(param);
  weak_ptr<Body>& weak = bodies_[key];
  body_ = weak.lock();
  if (!body_) {
    body_.reset(new Body(param));
    bodies_[key] = weak_ptr<Body>(body_);
  }
  body_->new_queue_pairs_.push(queue_pair_);
}
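Notes:
- Each database can have only one Body object. For example, for examples/mnist/mnist_train_lmdb, no matter how many DataReader objects exist in any thread, there is only one Body, because bodies_ is static and is looked up by source_key(param).
  - It follows that one Body object can serve multiple DataReader objects.
  - In addition, the static map<const string, weak_ptr<Body> > bodies_, which is shared by all DataReader objects, can track multiple Body objects (one per database), while each individual DataReader holds exactly one of them in body_.
- As the queue_pair_ initializer and the final body_->new_queue_pairs_.push(queue_pair_) call show, each DataReader corresponds to one reading task: buffering param.data_param().prefetch() * param.data_param().batch_size() samples (4×64 by default in LeNet5) from the database (e.g. examples/mnist/mnist_train_lmdb).
  - In other words, a DataReader is a task that uses its QueuePair (which also belongs to that task) to "tell" the Body to read N samples from a given database.
- As the body_.reset(new Body(param)) line shows, if no Body exists yet for a database (e.g. examples/mnist/mnist_train_lmdb), a new Body is created to handle it. Equivalently, a new thread uniquely associated with that database is created to serve it.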
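The "get or create" lookup in the constructor can be isolated in a small standard C++ sketch. It assumes hypothetical SourceBody and Reader types in place of Body and DataReader, and uses std::weak_ptr and std::mutex instead of the Boost equivalents; the source string plays the role of the key returned by source_key(param).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for DataReader::Body: one per data source.
struct SourceBody {
  explicit SourceBody(const std::string& source) : source_(source) {}
  std::string source_;
};

// Hypothetical stand-in for DataReader: shares a body with every other
// reader constructed with the same source key.
class Reader {
 public:
  explicit Reader(const std::string& source) {
    std::lock_guard<std::mutex> lock(registry_mutex());
    std::weak_ptr<SourceBody>& weak = registry()[source];
    body_ = weak.lock();  // Reuse the body if one is still alive.
    if (!body_) {
      body_ = std::make_shared<SourceBody>(source);
      weak = body_;  // Register it without extending its lifetime.
    }
  }

  const std::shared_ptr<SourceBody>& body() const { return body_; }

 private:
  // Function-local statics avoid static initialization order issues.
  static std::map<std::string, std::weak_ptr<SourceBody> >& registry() {
    static std::map<std::string, std::weak_ptr<SourceBody> > bodies;
    return bodies;
  }
  static std::mutex& registry_mutex() {
    static std::mutex m;
    return m;
  }

  std::shared_ptr<SourceBody> body_;
};
```

Storing weak_ptr in the registry means the registry never keeps a body alive by itself: the body is destroyed when the last reader that shares it goes away, and the next reader for that source creates a fresh one.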
5 Function DataReader::Body::InternalThreadEntry
void DataReader::Body::InternalThreadEntry() {
  ...
  vector<shared_ptr<QueuePair> > qps;
  try {
    ...
    // To ensure deterministic runs, only start running once all solvers
    // are ready. But solvers need to peek on one item during initialization,
    // so read one item, then wait for the next solver.
    for (int i = 0; i < solver_count; ++i) {
      shared_ptr<QueuePair> qp(new_queue_pairs_.pop());
      read_one(cursor.get(), qp.get());
      qps.push_back(qp);
    }
    // Main loop
    while (!must_stop()) {
      for (int i = 0; i < solver_count; ++i) {
        read_one(cursor.get(), qps[i].get());
      }
      ...
    }
  } catch (boost::thread_interrupted&) {
    // Interrupted exception is expected on shutdown
  }
}
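Notes:
- read_one() takes a Datum from the free_ queue of a QueuePair, reads a record from the database into that Datum, and then pushes it onto full_.
- As the body_->new_queue_pairs_.push(queue_pair_) call in the Section 4 constructor shows, when a new task (DataReader) arrives, its command queue (QueuePair) is placed into the pending command queue (new_queue_pairs_) of the corresponding database (Body).
- The first for loop reads one Datum for each solver's task. The main while loop then reads from the database repeatedly.
- In the main loop, free_ in the QueuePair is a thread-safe BlockingQueue<Datum*>. When free_ is empty, the reading thread blocks until some consumer has finished with a Datum from full_ and returned it to free_. The thread then reads the next record from the database into that Datum and pushes it back onto full_.
- In LeNet5 the total number of Datum objects in free_ plus full_ is 4×64 by default.
- When does this thread stop? Judging from the code above, the main loop exits once must_stop() returns true, and a boost::thread_interrupted exception raised during shutdown is caught and ends the function.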
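The round-robin order of the two loops above can be simulated without threads or a database. In this sketch, distribute_round_robin is a hypothetical helper: a plain vector stands in for the database cursor, one output vector per solver stands in for each full_ queue, and wrapping back to the first record at the end of the data is an assumption, since the body of read_one() is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Distributes records to solver_count consumers in round-robin order.
// Illustrative only: plain vectors replace the cursor and the queues,
// and rewinding at the end of the data is an assumption of this sketch.
inline std::vector<std::vector<int> > distribute_round_robin(
    const std::vector<int>& records, int solver_count, int rounds) {
  std::vector<std::vector<int> > per_solver(solver_count);
  if (records.empty()) return per_solver;
  std::size_t cursor = 0;
  for (int r = 0; r < rounds; ++r) {
    // One pass of the inner loop gives each solver exactly one record.
    for (int i = 0; i < solver_count; ++i) {
      per_solver[i].push_back(records[cursor]);
      cursor = (cursor + 1) % records.size();  // Rewind at the end.
    }
  }
  return per_solver;
}
```

With two solvers and records 0 to 5, solver 0 receives 0, 2, 4 and solver 1 receives 1, 3, 5. The assignment depends only on the order of the records and the number of solvers, which is why a single sequential reader keeps parallel training deterministic.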