Resource | Using GluonCV's pretrained models from PyTorch: a genuinely handy computer vision library

By the Synced (機器之心) editorial team

Contributor: Siyuan (思源)

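In the first half of this year, the DMLC team released GluonCV, an easy-to-use computer vision toolkit. It carries on the strengths of Gluon, MXNet's imperative (dynamic-graph) interface, and lets users build complex deep neural networks quickly through a simple API. Because the toolkit is so convenient, many researchers have wanted to call it from other frameworks such as PyTorch. Dr. Hang Zhang, an applied scientist at Amazon AI, has ported GluonCV to PyTorch, so that pretrained models for image classification, semantic segmentation, and other tasks can be loaded directly in PyTorch.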

Project page: https://github.com/zhanghang1989/gluoncv-torch

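Dr. Zhang is a member of the DMLC team, and the GluonCV contributors list shows that he has contributed heavily to the project, so his GluonCV-Torch library is well worth trying. In this article we briefly introduce GluonCV-Torch and how to use it, and we also try out its pretrained models, including the DeepLabV3 semantic segmentation model.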

Introduction to GluonCV-Torch

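GluonCV already includes a large number of pretrained models and CV tools: more than 50 image classification models, object detection models such as SSD and YOLOv3, and semantic segmentation models such as FCN and DeepLabV3, as well as models for instance segmentation, generative adversarial networks, and person re-identification. GluonCV-Torch currently provides pretrained models for two of these areas, image classification and semantic segmentation. The classification models are all pretrained on ImageNet, while the semantic segmentation models are pretrained on Pascal VOC and ADE20K respectively.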

Installing GluonCV-Torch is very simple, as long as PyTorch is already installed:

pip install gluoncv-torch

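To load GluonCV models in PyTorch, simply import the gluoncvth module and request a model from it; the project describes these pretrained models as better than the ones in torchvision: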

import gluoncvth as gcv

model = gcv.models.resnet50(pretrained=True)
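To read a prediction out of a classifier like the one loaded above, the raw outputs (logits) are usually converted to probabilities and only the highest-scoring classes are kept. The following is a minimal, framework-free sketch of that step. The function name `top_k` and the toy scores are illustrative assumptions, not part of gluoncvth; with a real model the logits would typically come from `model(batch)`, and `torch.softmax` with `torch.topk` would do the same job on tensors.

```python
import math


def top_k(logits, k=5):
    """Return the k most probable (class_index, probability) pairs."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by descending probability and keep the first k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]


# Toy scores for a hypothetical 4-class problem.
ranked = top_k([1.0, 3.0, 0.5, 2.0], k=2)
```

Here `ranked` holds the two most likely class indices with their probabilities; mapping an index to a human-readable ImageNet label requires a separate label file, which this sketch does not cover.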

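For image classification, GluonCV-Torch offers highly accurate pretrained models behind a convenient interface. The results of the different pretrained models were shown in a table at this point:

[Table: accuracy of the different pretrained classification models]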

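For semantic segmentation, GluonCV-Torch mainly supports pretrained FCN, PSPNet, and DeepLabV3 models. DeepLabV3 is a very widely used open-source model that performs very well on semantic segmentation tasks. The results of these three models pretrained on Pascal VOC, a dataset whose images span 20 categories, were shown here:

[Table: results of the three segmentation models on Pascal VOC]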

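The results of the three segmentation models on ADE20K were shown next. ADE20K is a scene parsing dataset released by MIT; it covers many kinds of scenes and includes people, background, and objects.

[Table: results of the three segmentation models on ADE20K]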

Using GluonCV-Torch

In the project, Hang Zhang provides a simple usage example that calls the DeepLabV3 semantic segmentation model pretrained on ADE20K.

import torch
import gluoncvth

# Get the model
model = gluoncvth.models.get_deeplab_resnet101_ade(pretrained=True)
model.eval()

# Prepare the image
url = 'https://github.com/zhanghang1989/image-data/blob/master/encoding/' + \
    'segmentation/ade20k/ADE_val_00001142.jpg?raw=true'
filename = 'example.jpg'
img = gluoncvth.utils.load_image(
    gluoncvth.utils.download(url, filename)).unsqueeze(0)

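# Make prediction: keep the highest-scoring class for every pixel
output = model.evaluate(img)
predict = torch.max(output, 1)[1].cpu().numpy() + 1

# Get the color palette for visualization
# (the helper's name is spelled `get_mask_pallete` in the library)
mask = gluoncvth.utils.get_mask_pallete(predict, 'ade20k')
mask.save('output.png')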

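Running the code above automatically downloads a pretrained model of more than 200 MB from AWS cloud storage. The download can be slow from within China, in which case the file can be fetched with another tool and unpacked into the corresponding folder. We were able to install GluonCV-Torch under PyTorch 0.4.1 and run it successfully: inference on a single image took a little over 70 seconds on a CPU and a little over 10 seconds on a GPU (K80).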
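Timings like the CPU and GPU figures quoted above are easy to reproduce with a small helper. The sketch below uses only the standard library; the name `time_call` and the stand-in workload are our own assumptions. One caveat for GPU measurements: CUDA work is launched asynchronously, so calling `torch.cuda.synchronize()` before reading the clock is usually needed to get a meaningful number.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times.

    Returns (last_result, best_seconds), where best_seconds is the
    shortest wall-clock duration observed across the repeats.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in workload; with the segmentation example this could be
# time_call(model.evaluate, img) instead.
result, seconds = time_call(sum, range(100000))
```

Taking the best of several runs reduces the influence of one-off costs such as the first-call model download or cache warm-up.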

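To run inference on other images or to use a different pretrained model, simply change the image path passed to load_image and the constructor called on gluoncvth.models.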
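The expression `torch.max(output, 1)[1]` in the example picks, for every pixel, the index of the class with the highest score along dimension 1 (the class dimension). The `+ 1` appears to shift the result to 1-based labels so that index 0 stays free for unlabeled pixels in the ADE20K palette; that reading is our assumption, not something the project states. The pure-Python sketch below shows the same per-pixel argmax on nested lists; `per_pixel_labels` and the toy scores are hypothetical.

```python
def per_pixel_labels(scores, offset=1):
    """scores[c][y][x] holds the score of class c at pixel (y, x).

    Returns labels[y][x]: the index of the best class plus `offset`.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Argmax over the class axis for this single pixel.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best + offset)
        labels.append(row)
    return labels


# Toy scores: 3 classes over a 2x2 image.
scores = [
    [[0.9, 0.1], [0.2, 0.3]],   # class 0
    [[0.05, 0.7], [0.1, 0.3]],  # class 1
    [[0.05, 0.2], [0.7, 0.4]],  # class 2
]
labels = per_pixel_labels(scores)
```

The resulting label map has the same height and width as the input image, which is why it can be turned directly into a color mask.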

API Reference

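The above are only two simple examples; other models and usages will need to be adapted to the task at hand. The main model APIs currently available in GluonCV-Torch are listed below: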

ResNet

gluoncvth.models.resnet18(pretrained=True)
gluoncvth.models.resnet34(pretrained=True)
gluoncvth.models.resnet50(pretrained=True)
gluoncvth.models.resnet101(pretrained=True)
gluoncvth.models.resnet152(pretrained=True)

FCN

gluoncvth.models.get_fcn_resnet101_voc(pretrained=True)
gluoncvth.models.get_fcn_resnet101_ade(pretrained=True)

PSPNet

gluoncvth.models.get_psp_resnet101_voc(pretrained=True)
gluoncvth.models.get_psp_resnet101_ade(pretrained=True)

DeepLabV3

gluoncvth.models.get_deeplab_resnet101_voc(pretrained=True)
gluoncvth.models.get_deeplab_resnet101_ade(pretrained=True)
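The segmentation constructors listed above follow a regular naming pattern, so a model can be selected from two short strings. The sketch below is one way to do that; `seg_constructor_name` and `load_seg_model` are hypothetical helpers of ours, and they rely only on the constructor names shown in this list. The gluoncvth import happens inside the loader so the name helper can be used on its own.

```python
SEG_ARCHS = ("fcn", "psp", "deeplab")
SEG_DATASETS = ("voc", "ade")


def seg_constructor_name(arch, dataset):
    """Build a constructor name such as 'get_psp_resnet101_voc'."""
    if arch not in SEG_ARCHS:
        raise ValueError("unknown architecture: %r" % (arch,))
    if dataset not in SEG_DATASETS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    return "get_%s_resnet101_%s" % (arch, dataset)


def load_seg_model(arch, dataset, pretrained=True):
    """Look the constructor up on gluoncvth.models and call it."""
    import gluoncvth  # imported lazily; requires gluoncv-torch to be installed
    constructor = getattr(gluoncvth.models, seg_constructor_name(arch, dataset))
    return constructor(pretrained=pretrained)


name = seg_constructor_name("psp", "voc")
```

Calling `load_seg_model("deeplab", "ade")` would then be equivalent to the `get_deeplab_resnet101_ade(pretrained=True)` call used in the earlier example, including the weight download on first use.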
