Deep Learning Caffe Data Structures (5): A Detailed Walkthrough of blob.cpp, the Blob Implementation

In Caffe, the Blob class is implemented in blob.cpp, located under src/caffe/ in the Caffe root directory. This article walks through that file in detail.

#include <climits>
#include <vector>

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/syncedmem.hpp"
#include "caffe/util/math_functions.hpp"

These lines are the header files included by blob.cpp.

namespace caffe {

This line opens the caffe namespace.

template <typename Dtype>
void Blob<Dtype>::Reshape(const int num, const int channels, const int height,
    const int width) {
  vector<int> shape(4);
  shape[0] = num;
  shape[1] = channels;
  shape[2] = height;
  shape[3] = width;
  Reshape(shape);
}

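This is a Blob reshape function that takes the four dimensions num, channels, height, and width. It stores them in the vector shape and then calls the vector overload of Reshape to change the Blob's shape.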

template <typename Dtype>
void Blob<Dtype>::Reshape(const vector<int>& shape) {
  CHECK_LE(shape.size(), kMaxBlobAxes);
  count_ = 1;
  shape_.resize(shape.size());
  if (!shape_data_ || shape_data_->size() < shape.size() * sizeof(int)) {
    shape_data_.reset(new SyncedMemory(shape.size() * sizeof(int)));
  }
  int* shape_data = static_cast<int*>(shape_data_->mutable_cpu_data());
  for (int i = 0; i < shape.size(); ++i) {
    CHECK_GE(shape[i], 0);
    if (count_ != 0) {
      CHECK_LE(shape[i], INT_MAX / count_) << "blob size exceeds INT_MAX";
    }
    count_ *= shape[i];
    shape_[i] = shape[i];
    shape_data[i] = shape[i];
  }
  if (count_ > capacity_) {
    capacity_ = count_;
    data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
    diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
  }
}

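This overload does the real work of Reshape: it changes the Blob's shape according to the shape vector. Along the way it updates the class's member variables and, when needed, allocates memory for the data.

The function first checks that the number of axes in shape does not exceed kMaxBlobAxes, the maximum number of axes a Blob supports. It then resets count_ to 1 and resizes the member shape_ to the same number of axes as shape.

Next, if shape_data_ has not been allocated yet, or is too small to hold shape.size() integers, shape_data_.reset(new SyncedMemory(shape.size() * sizeof(int))); allocates it. The statement int* shape_data = static_cast<int*>(shape_data_->mutable_cpu_data()); then obtains shape_data, a writable CPU pointer to the memory managed by shape_data_.

The for loop visits every axis of shape. It checks that shape[i] is greater than or equal to 0, then (when count_ is nonzero) checks that shape[i] is at most INT_MAX / count_, so that the product cannot overflow an int. It then multiplies shape[i] into count_ and copies shape[i] into both shape_[i] and shape_data[i].

Finally, the function compares the element count count_ with the capacity capacity_. Only if count_ is larger does it raise capacity_ to count_ and allocate new memory for data_ and diff_, so reshaping to a smaller size reuses the existing buffers.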
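The counting and grow-only allocation logic can be tried outside Caffe. The following is a minimal sketch, not the Caffe implementation: MiniBlob is a hypothetical stand-in, a std::vector replaces SyncedMemory, and assert replaces the CHECK macros.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Blob: a plain vector replaces SyncedMemory.
struct MiniBlob {
  std::vector<int> shape_;
  std::vector<float> data_;  // plays the role of the data_ buffer
  int count_ = 0;
  int capacity_ = 0;

  void Reshape(const std::vector<int>& shape) {
    count_ = 1;
    shape_.resize(shape.size());
    for (std::size_t i = 0; i < shape.size(); ++i) {
      assert(shape[i] >= 0);
      if (count_ != 0) {
        // Guard against int overflow before multiplying.
        assert(shape[i] <= INT_MAX / count_);
      }
      count_ *= shape[i];
      shape_[i] = shape[i];
    }
    // Grow only: a smaller shape keeps the existing buffer.
    if (count_ > capacity_) {
      capacity_ = count_;
      data_.assign(capacity_, 0.0f);
    }
  }
};
```

Reshaping a MiniBlob to {2, 3, 4} and then to {2, 3} leaves count_ at 6 while capacity_ stays at 24, which is the behavior the capacity check produces.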

template <typename Dtype>
void Blob<Dtype>::Reshape(const BlobShape& shape) {
  CHECK_LE(shape.dim_size(), kMaxBlobAxes);
  vector<int> shape_vec(shape.dim_size());
  for (int i = 0; i < shape.dim_size(); ++i) {
    shape_vec[i] = shape.dim(i);
  }
  Reshape(shape_vec);
}

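This is another Reshape overload. Its input is shape, a variable of type BlobShape. The function first checks that the number of dimensions in shape does not exceed the maximum number of axes, then copies the size of each dimension into the vector shape_vec, and finally calls the vector overload of Reshape with shape_vec to change the Blob's shape.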

template <typename Dtype>
void Blob<Dtype>::ReshapeLike(const Blob<Dtype>& other) {
  Reshape(other.shape());
}

This function reshapes the current Blob to the same shape as other; it does so by calling Reshape.

template <typename Dtype>
Blob<Dtype>::Blob(const int num, const int channels, const int height,
    const int width)
  // capacity_ must be initialized before calling Reshape
  : capacity_(0) {
  Reshape(num, channels, height, width);
}

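This is a Blob constructor that takes the four dimensions num, channels, height, and width. It initializes capacity_ to 0 and then calls Reshape; Reshape compares count_ against capacity_, which is why capacity_ must be initialized first.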

template <typename Dtype>
Blob<Dtype>::Blob(const vector<int>& shape)
  // capacity_ must be initialized before calling Reshape
  : capacity_(0) {
  Reshape(shape);
}

This is another Blob constructor; its input is a shape vector.

template <typename Dtype>
const int* Blob<Dtype>::gpu_shape() const {
  CHECK(shape_data_);
  return (const int*)shape_data_->gpu_data();
}

This function returns a read-only pointer to the shape data on the GPU, obtained from shape_data_->gpu_data().

template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->cpu_data();
}

This function returns a read-only pointer to the data on the CPU.

template <typename Dtype>
void Blob<Dtype>::set_cpu_data(Dtype* data) {
  CHECK(data);
  // Make sure CPU and GPU sizes remain equal
  size_t size = count_ * sizeof(Dtype);
  if (data_->size() != size) {
    data_.reset(new SyncedMemory(size));
    diff_.reset(new SyncedMemory(size));
  }
  data_->set_cpu_data(data);
}

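This function sets the CPU data: data_ is told to use the buffer that data points to as its CPU data, in place of the buffer it used before. The pointer is handed over rather than the elements being copied. If the size of data_ differs from count_ * sizeof(Dtype), data_ and diff_ are first recreated with the matching size.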
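To illustrate what adopting an external buffer means, here is a minimal CPU-only sketch. MiniMemory and its members are hypothetical names invented for this example; it only imitates the idea of a memory object that either owns its buffer or borrows one from the caller, and it is not the SyncedMemory code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, CPU-only stand-in for a memory object like SyncedMemory.
class MiniMemory {
 public:
  explicit MiniMemory(std::size_t size)
      : ptr_(nullptr), size_(size), own_(false) {}
  ~MiniMemory() { if (own_) std::free(ptr_); }
  MiniMemory(const MiniMemory&) = delete;
  MiniMemory& operator=(const MiniMemory&) = delete;

  // Lazily allocate an owned, zero-filled buffer on first access.
  void* mutable_cpu_data() {
    if (!ptr_) {
      ptr_ = std::calloc(1, size_);
      own_ = true;
    }
    return ptr_;
  }

  // Adopt a caller-owned buffer: release any owned one, copy nothing.
  void set_cpu_data(void* data) {
    assert(data);
    if (own_) std::free(ptr_);
    ptr_ = data;
    own_ = false;
  }

  bool owns_buffer() const { return own_; }

 private:
  void* ptr_;
  std::size_t size_;
  bool own_;
};
```

After set_cpu_data the object reads and writes the caller's array directly, so the caller must keep that array alive for as long as the object uses it.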

template <typename Dtype>
const Dtype* Blob<Dtype>::gpu_data() const {
  CHECK(data_);
  return (const Dtype*)data_->gpu_data();
}

This function returns a read-only pointer to the data on the GPU.

template <typename Dtype>
void Blob<Dtype>::set_gpu_data(Dtype* data) {
  CHECK(data);
  // Make sure CPU and GPU sizes remain equal
  size_t size = count_ * sizeof(Dtype);
  if (data_->size() != size) {
    data_.reset(new SyncedMemory(size));
    diff_.reset(new SyncedMemory(size));
  }
  data_->set_gpu_data(data);
}

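This function sets the GPU data: data_ is told to use the buffer that data points to as its GPU data, in place of the buffer it used before. As with set_cpu_data, data_ and diff_ are first recreated if the size of data_ differs from count_ * sizeof(Dtype).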

template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_diff() const {
  CHECK(diff_);
  return (const Dtype*)diff_->cpu_data();
}

This function returns a read-only pointer to the diff on the CPU.

template <typename Dtype>
const Dtype* Blob<Dtype>::gpu_diff() const {
  CHECK(diff_);
  return (const Dtype*)diff_->gpu_data();
}

This function returns a read-only pointer to the diff on the GPU.

template <typename Dtype>
Dtype* Blob<Dtype>::mutable_cpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_cpu_data());
}

This function returns a readable and writable pointer to the data on the CPU.

template <typename Dtype>
Dtype* Blob<Dtype>::mutable_gpu_data() {
  CHECK(data_);
  return static_cast<Dtype*>(data_->mutable_gpu_data());
}

This function returns a readable and writable pointer to the data on the GPU.

template <typename Dtype>
Dtype* Blob<Dtype>::mutable_cpu_diff() {
  CHECK(diff_);
  return static_cast<Dtype*>(diff_->mutable_cpu_data());
}

This function returns a readable and writable pointer to the diff on the CPU.

template <typename Dtype>
Dtype* Blob<Dtype>::mutable_gpu_diff() {
  CHECK(diff_);
  return static_cast<Dtype*>(diff_->mutable_gpu_data());
}

This function returns a readable and writable pointer to the diff on the GPU.
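The difference between the read-only and the mutable accessors matters because the underlying memory tracks which copy is current. The sketch below models that idea under stated assumptions: MiniSynced is a hypothetical class, the "GPU" copy is simulated with a second host vector so the code runs anywhere, and the state names are borrowed from the SyncedMemory head values that appear in the listings.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a two-copy buffer; the "GPU" side is simulated
// with a second host vector.
class MiniSynced {
 public:
  enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit MiniSynced(int n)
      : cpu_(n, 0.0f), gpu_(n, 0.0f), head_(UNINITIALIZED) {}

  // Read-only access: refresh this side if it is stale, then return it.
  const float* cpu_data() { to_cpu(); return cpu_.data(); }
  const float* gpu_data() { to_gpu(); return gpu_.data(); }

  // Writable access: the other copy must now be treated as stale.
  float* mutable_cpu_data() {
    to_cpu();
    head_ = HEAD_AT_CPU;
    return cpu_.data();
  }
  float* mutable_gpu_data() {
    to_gpu();
    head_ = HEAD_AT_GPU;
    return gpu_.data();
  }

  Head head() const { return head_; }

 private:
  void to_cpu() {
    assert(cpu_.size() == gpu_.size());
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_CPU;
    } else if (head_ == HEAD_AT_GPU) {
      cpu_ = gpu_;  // simulated device-to-host copy
      head_ = SYNCED;
    }
  }
  void to_gpu() {
    assert(cpu_.size() == gpu_.size());
    if (head_ == UNINITIALIZED) {
      head_ = HEAD_AT_GPU;
    } else if (head_ == HEAD_AT_CPU) {
      gpu_ = cpu_;  // simulated host-to-device copy
      head_ = SYNCED;
    }
  }

  std::vector<float> cpu_, gpu_;
  Head head_;
};
```

In this model a read through cpu_data() or gpu_data() can leave the state at SYNCED, while any mutable accessor moves the head to the side being written. That is why code that only reads should prefer the const accessors: it avoids marking the other copy as stale.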

template <typename Dtype>
void Blob<Dtype>::ShareData(const Blob& other) {
  CHECK_EQ(count_, other.count());
  data_ = other.data();
}

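This function shares the data of another Blob. CHECK_EQ requires other to have the same element count as this Blob, and the program aborts otherwise. data_ is then assigned other's data pointer, so both Blobs refer to the same memory; no elements are copied.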

template <typename Dtype>
void Blob<Dtype>::ShareDiff(const Blob& other) {
  CHECK_EQ(count_, other.count());
  diff_ = other.diff();
}

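This function shares the diff of another Blob. CHECK_EQ requires other to have the same element count as this Blob, and the program aborts otherwise. diff_ is then assigned other's diff pointer, so both Blobs refer to the same memory; no elements are copied.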
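Sharing by pointer assignment can be demonstrated with std::shared_ptr alone. This is a minimal sketch under assumptions: SharingBlob is a hypothetical type, a std::vector stands in for the memory object, and assert stands in for CHECK_EQ.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in: a vector plays the role of the memory object.
struct SharingBlob {
  explicit SharingBlob(int count)
      : count_(count),
        data_(std::make_shared<std::vector<float> >(count, 0.0f)) {}

  void ShareData(const SharingBlob& other) {
    assert(count_ == other.count_);  // mirrors the element-count check
    data_ = other.data_;             // copies the pointer, not the elements
  }

  int count_;
  std::shared_ptr<std::vector<float> > data_;
};
```

After a.ShareData(b), a write through b is visible through a, and the buffer that a held before is released once nothing else refers to it.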

template <> void Blob<unsigned int>::Update() { NOT_IMPLEMENTED; }
template <> void Blob<int>::Update() { NOT_IMPLEMENTED; }

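These two lines are explicit specializations of Update for Blob<unsigned int> and Blob<int>. Both bodies are NOT_IMPLEMENTED, so Update is not supported for integer Blobs.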
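The pattern of specializing one member function of a class template for selected types, ahead of the generic definition, looks like this in isolation. Accumulator is a hypothetical class used only for illustration, and assert(false) is a stand-in for the NOT_IMPLEMENTED macro.

```cpp
#include <cassert>

template <typename T>
struct Accumulator {
  explicit Accumulator(T v) : value(v) {}
  void Update(T delta);
  T value;
};

// Explicit specialization for int, placed before the generic definition
// as in blob.cpp. The assert is a stand-in for NOT_IMPLEMENTED.
template <>
void Accumulator<int>::Update(int) {
  assert(false && "Update is not implemented for int");
}

// Generic definition, used for every type without a specialization.
template <typename T>
void Accumulator<T>::Update(T delta) {
  value -= delta;
}
```

Accumulator<float> uses the generic body, while Accumulator<int> resolves to the specialization and fails at run time if Update is ever called.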

template <typename Dtype>
void Blob<Dtype>::Update() {
  // We will perform update based on where the data is located.
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    // perform computation on CPU
    caffe_axpy<Dtype>(count_, Dtype(-1),
        static_cast<const Dtype*>(diff_->cpu_data()),
        static_cast<Dtype*>(data_->mutable_cpu_data()));
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    // perform computation on GPU
    caffe_gpu_axpy<Dtype>(count_, Dtype(-1),
        static_cast<const Dtype*>(diff_->gpu_data()),
        static_cast<Dtype*>(data_->mutable_gpu_data()));
#else
    NO_GPU;
#endif
    break;
  default:
    LOG(FATAL) << "Syncedmem not initialized.";
  }
}

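This is the general definition of Update. A switch statement on data_->head() determines where the data currently lives. If it is on the CPU, caffe_axpy is called with a coefficient of -1, which computes data_ = data_ - diff_ element by element. If the data is on the GPU, or synchronized between CPU and GPU, caffe_gpu_axpy performs the same computation on the GPU. Note that if Caffe was built with the CPU_ONLY option, the GPU code is not compiled and this branch reaches NO_GPU instead. caffe_axpy and caffe_gpu_axpy are declared in math_functions.hpp, located under include/caffe/util/ in the Caffe root directory; interested readers can study their implementations.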
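The arithmetic behind the update step is the BLAS axpy operation, y = alpha * x + y, applied with alpha = -1. A minimal CPU sketch follows; the functions axpy and update are hypothetical helpers written for this example, not the Caffe functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper with the meaning of BLAS axpy: y = alpha * x + y.
template <typename T>
void axpy(T alpha, const std::vector<T>& x, std::vector<T>* y) {
  assert(x.size() == y->size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    (*y)[i] += alpha * x[i];
  }
}

// With alpha = -1 this is the update step: data = data - diff.
template <typename T>
void update(const std::vector<T>& diff, std::vector<T>* data) {
  axpy(T(-1), diff, data);
}
```

For data = {1, 2, 3} and diff = {0.5, 0.5, 1}, update leaves data equal to {0.5, 1.5, 2}.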

template <> unsigned int Blob<unsigned int>::asum_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::asum_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

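These lines are explicit specializations, for unsigned int and int, of the function that computes the sum of absolute values (the L1 norm) of the data elements. Both are marked NOT_IMPLEMENTED.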

template <typename Dtype>
Dtype Blob<Dtype>::asum_data() const {
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    return caffe_cpu_asum(count_, cpu_data());
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
  {
    Dtype asum;
    caffe_gpu_asum(count_, gpu_data(), &asum);
    return asum;
  }
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return 0;
}

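The function above is the general definition of the sum of absolute values of the data elements. It follows the same pattern as Update: a switch statement determines where data_ lives. If it is on the CPU, caffe_cpu_asum computes the sum of the absolute values of the elements of data_. If the data is on the GPU, or synchronized between CPU and GPU, caffe_gpu_asum computes it instead. If Caffe was built with the CPU_ONLY option, the GPU code is not compiled and that branch reaches NO_GPU. If the memory is uninitialized, the function returns 0. caffe_cpu_asum and caffe_gpu_asum are likewise declared in math_functions.hpp.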
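For reference, the quantity being computed is the sum of |x[i]| over all elements. The sketch below is a plain-loop version; asum is a hypothetical helper written for this example, with a pointer-and-length signature loosely modeled on the call in the listing.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU helper: sum of absolute values (L1 norm) of n elements.
template <typename T>
T asum(int n, const T* x) {
  assert(n >= 0 && (n == 0 || x != nullptr));
  T total = T(0);
  for (int i = 0; i < n; ++i) {
    total += std::abs(x[i]);
  }
  return total;
}
```

For the three values -1, 2, and -3.5 the result is 6.5.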

template <> unsigned int Blob<unsigned int>::asum_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::asum_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

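These lines are explicit specializations, for unsigned int and int, of the function that computes the sum of absolute values (the L1 norm) of the diff elements. Both are marked NOT_IMPLEMENTED.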

template <typename Dtype>
Dtype Blob<Dtype>::asum_diff() const {
  if (!diff_) { return 0; }
  switch (diff_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    return caffe_cpu_asum(count_, cpu_diff());
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
  {
    Dtype asum;
    caffe_gpu_asum(count_, gpu_diff(), &asum);
    return asum;
  }
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << diff_->head();
  }
  return 0;
}

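The function above is the general definition of the sum of absolute values of the diff elements. Its implementation mirrors the data version, with diff_ in place of data_, so readers can analyze it on their own.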

template <> unsigned int Blob<unsigned int>::sumsq_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::sumsq_data() const {
  NOT_IMPLEMENTED;
  return 0;
}

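These lines are explicit specializations, for unsigned int and int, of the function that computes the sum of squares (the squared L2 norm) of the data elements. Both are marked NOT_IMPLEMENTED.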

template <typename Dtype>
Dtype Blob<Dtype>::sumsq_data() const {
  Dtype sumsq;
  const Dtype* data;
  if (!data_) { return 0; }
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    data = cpu_data();
    sumsq = caffe_cpu_dot(count_, data, data);
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    data = gpu_data();
    caffe_gpu_dot(count_, data, data, &sumsq);
#else
    NO_GPU;
#endif
    break;
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return sumsq;
}

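The function above is the general definition of the sum of squares of the data elements. It also uses a switch statement to determine where data_ lives. If it is on the CPU, caffe_cpu_dot takes the dot product of data_ with itself, which is the sum of squares of its elements. If the data is on the GPU, or synchronized between CPU and GPU, caffe_gpu_dot does the same on the GPU. If Caffe was built with the CPU_ONLY option, the GPU code is not compiled and that branch reaches NO_GPU. caffe_cpu_dot and caffe_gpu_dot are likewise declared in math_functions.hpp.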
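The identity used here is that the dot product of a vector with itself equals its sum of squares, and the L2 norm is the square root of that value. A minimal sketch, with dot, sumsq, and l2_norm as hypothetical helpers written for this example:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU helper: dot product of two n-element arrays.
template <typename T>
T dot(int n, const T* x, const T* y) {
  assert(n >= 0);
  T total = T(0);
  for (int i = 0; i < n; ++i) {
    total += x[i] * y[i];
  }
  return total;
}

// Sum of squares is the dot product of the array with itself.
template <typename T>
T sumsq(int n, const T* x) {
  return dot(n, x, x);
}

// The L2 norm is the square root of the sum of squares.
template <typename T>
T l2_norm(int n, const T* x) {
  return std::sqrt(sumsq(n, x));
}
```

For the vector (3, 4), sumsq gives 25 and l2_norm gives 5, which shows that the Blob function returns the squared norm rather than the norm itself.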

template <> unsigned int Blob<unsigned int>::sumsq_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

template <> int Blob<int>::sumsq_diff() const {
  NOT_IMPLEMENTED;
  return 0;
}

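These lines are explicit specializations, for unsigned int and int, of the function that computes the sum of squares (the squared L2 norm) of the diff elements. Both are marked NOT_IMPLEMENTED.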

template <typename Dtype>
Dtype Blob<Dtype>::sumsq_diff() const {
  Dtype sumsq;
  const Dtype* diff;
  if (!diff_) { return 0; }
  switch (diff_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    diff = cpu_diff();
    sumsq = caffe_cpu_dot(count_, diff, diff);
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    diff = gpu_diff();
    caffe_gpu_dot(count_, diff, diff, &sumsq);
    break;
#else
    NO_GPU;
#endif
  case SyncedMemory::UNINITIALIZED:
    return 0;
  default:
    LOG(FATAL) << "Unknown SyncedMemory head state: " << data_->head();
  }
  return sumsq;
}

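The function above is the general definition of the sum of squares of the diff elements. Its implementation closely follows the data version, with diff_ in place of data_, so readers can analyze it on their own. One quirk is visible in the listing: the message in the default branch prints data_->head() rather than diff_->head().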
