CAM Implementation Walkthrough (PyTorch)

I previously wrote up a simplified version of this visualization (link to the simplified version); that version did not take the relationships between channels into account. This post walks through the CAM pipeline.
The next post will cover the Grad-CAM implementation.

Flowchart

[Figure: CAM flowchart]

Algorithm Overview

  1. Feed the image you want to visualize through the network and get its predicted class
  2. Grab the output feature maps of the last convolutional layer
  3. Use the predicted class to look up that class's weights, weight each channel of the feature maps accordingly, and sum them into a single-channel map

An Example

Suppose we feed in an image and the network predicts class 500 (out of 1000 classes). The captured feature maps have shape (1, 512, 13, 13), and suppose the classification head consists of a 1x1 convolution (which counts as part of the classifier here, not as the last conv layer) followed by global average pooling. The 1000 classes then come with 1000 sets of weights, i.e. 1000 different ways to weight the feature maps. Each set of weights attends to different things, which is why we need to know which class the image belongs to: once we know it is class 500, we simply take the 500th set of weights and apply it to the feature maps.
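To make the shapes concrete, here is a minimal NumPy sketch of that weighted sum (random arrays stand in for the real network outputs):

import numpy as np

features = np.random.rand(1, 512, 13, 13)   # stand-in for the last conv output
weights = np.random.rand(1000, 512)         # one 512-d weight vector per class

# Take class 500's weight vector and weight-sum the 512 channels
cam = weights[500].dot(features.reshape(512, 13 * 13))  # (512,) @ (512, 169) -> (169,)
cam = cam.reshape(13, 13)                               # a single-channel 13x13 heatmap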
CAM comes with one constraint: it relies on the global average pooling operation, so if the network ends in multiple fully connected layers, CAM no longer applies. Take VGG-16: three fully connected layers follow the last conv layer, and because the conv feature maps must be flattened before entering them, after three FC layers the connection back to individual channels can no longer be traced, so per-channel weights cannot be computed. That is where Grad-CAM comes in.
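You can see the problem directly by printing VGG-16's classification head (assuming torchvision is available):

from torchvision import models

# VGG-16 flattens the conv features into three fully connected layers,
# so the link back to individual channels is lost
print(models.vgg16().classifier)
# Sequential(
#   (0): Linear(in_features=25088, out_features=4096, bias=True)
#   ...
#   (6): Linear(in_features=4096, out_features=1000, bias=True)
# )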

Code Walkthrough

First, get the image, the class labels, and the model ready.
To download the class labels, first install axel:
sudo apt-get install axel
then run the download command:
axel -n 5 https://s3.amazonaws.com/outcome-blog/imagenet/labels.json
Image download:
axel -n 5 http://media.mlive.com/news_impact/photo/9933031-large.jpg
Model downloads:
squeezenet1_1: axel -n 5 https://download.pytorch.org/models/squeezenet1_1-f364aa15.pth
resnet18: axel -n 5 https://download.pytorch.org/models/resnet18-5c106cde.pth
densenet161: axel -n 5 https://download.pytorch.org/models/densenet161-8d451a50.pth

1. Import the packages and read the class labels

from PIL import Image
import torch
from torchvision import models, transforms
from torch.autograd import Variable
from torch.nn import functional as F
import numpy as np
import cv2
import json

# Read the ImageNet class labels
json_path = './cam/labels.json'
with open(json_path, 'r') as load_f:
    load_json = json.load(load_f)
classes = {int(key): value for (key, value)
           in load_json.items()}
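A quick sanity check on the label file (I'm assuming here that it maps all 1000 ImageNet indices to names):

print(len(classes))   # expect 1000
print(classes[0])     # e.g. 'tench, Tinca tinca'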

2. Read and preprocess the image

# Read an image belonging to one of the ImageNet classes
img_path = './cam/9933031-large.jpg'
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)

# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])

img_pil = Image.open(img_path)
img_tensor = preprocess(img_pil)
img_variable = Variable(img_tensor.unsqueeze(0))
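Note that Variable is a legacy API: since PyTorch 0.4 it is a no-op wrapper, so on newer versions you can pass the tensor directly:

# Variable is deprecated since PyTorch 0.4; a plain tensor works the same way
img_variable = img_tensor.unsqueeze(0)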

3. Load a pretrained model

# Load a pretrained model
model_id = 1
if model_id == 1:
    net = models.squeezenet1_1(pretrained=False)
    pthfile = r'./pretrained/squeezenet1_1-f364aa15.pth'
    net.load_state_dict(torch.load(pthfile))
    finalconv_name = 'features'  # the module that holds the conv features
elif model_id == 2:
    net = models.resnet18(pretrained=False)
    finalconv_name = 'layer4'
elif model_id == 3:
    net = models.densenet161(pretrained=False)
    finalconv_name = 'features'
net.eval()  # switch to evaluation mode
print(net)

I only downloaded squeezenet1_1; if you want to use the other two models, follow the same pattern and modify accordingly.
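For instance, the resnet18 branch might be filled in like this (a sketch, assuming you saved resnet18-5c106cde.pth under ./pretrained/ as with squeezenet):

net = models.resnet18(pretrained=False)
net.load_state_dict(torch.load(r'./pretrained/resnet18-5c106cde.pth'))
finalconv_name = 'layer4'   # the last conv block of ResNet-18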
Printing the model gives:

SqueezeNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2))
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (3): Fire(
      (squeeze): Conv2d(64, 16, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(16, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (4): Fire(
      (squeeze): Conv2d(128, 16, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(16, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (6): Fire(
      (squeeze): Conv2d(128, 32, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(32, 128, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (7): Fire(
      (squeeze): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(32, 128, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (8): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (9): Fire(
      (squeeze): Conv2d(256, 48, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(48, 192, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (10): Fire(
      (squeeze): Conv2d(384, 48, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(48, 192, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (11): Fire(
      (squeeze): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (12): Fire(
      (squeeze): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
  )
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1))
    (2): ReLU(inplace)
    (3): AdaptiveAvgPool2d(output_size=(1, 1))
  )
)

You can see that feature extraction happens in (features) and classification in (classifier).

4. Capture the feature maps

features_blobs = []     # will hold the captured feature maps later

def hook_feature(module, input, output):
    features_blobs.append(output.data.cpu().numpy())

# Capture the output of the features module
net._modules.get(finalconv_name).register_forward_hook(hook_feature)

register_forward_hook lets you grab the output of an intermediate layer; look it up if you want the details.
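Incidentally, register_forward_hook returns a handle, so if you keep it you can detach the hook once you are done collecting feature maps:

# Keeping the handle lets us remove the hook after the forward pass
handle = net._modules.get(finalconv_name).register_forward_hook(hook_feature)
# ... run the forward pass ...
handle.remove()   # stop collecting feature maps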

5. Get the weights

# Get the weights
params = list(net.parameters())
print(len(params))		# 52
weight_softmax = np.squeeze(params[-2].data.numpy())	# shape:(1000, 512)

params holds all of the model's weights, so how do we index the ones we need? Go back to the printed model: pooling and dropout layers carry no parameters, and if you count every parameterized conv operation you find 52 parameter tensors in total. What we want is the weight connecting the features module to the classifier, i.e. the parameters of (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1)) inside classifier. Since that conv's bias is the very last parameter (params[-1]), its weight sits at index -2.
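Counting parameters like this is a bit fragile; a more explicit alternative (module names taken from the printout above) is to index the classifier directly:

# net.classifier[1] is the Conv2d(512, 1000, kernel_size=(1, 1)) from the printout
weight_softmax = np.squeeze(net.classifier[1].weight.data.numpy())   # (1000, 512)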

logit = net(img_variable)               # forward pass through the network
print(logit.shape)                      # torch.Size([1, 1000])
print(params[-2].data.numpy().shape)    # 1000 sets of weights: (1000, 512, 1, 1)
print(features_blobs[0].shape)          # feature map shape: (1, 512, 13, 13)

# There are 1000 class scores; sort them and keep the sorted indices
h_x = F.softmax(logit, dim=1).data.squeeze()
print(h_x.shape)                        # torch.Size([1000])
probs, idx = h_x.sort(0, True)
probs = probs.numpy()                   # probabilities, descending
idx = idx.numpy()                       # class indices, highest probability first

# Look at the names and probabilities of the top-5 classes
for i in range(0, 5):
    print('{:.3f} -> {}'.format(probs[i], classes[idx[i]]))
'''
0.678 -> mountain bike, all-terrain bike, off-roader
0.088 -> bicycle-built-for-two, tandem bicycle, tandem
0.042 -> unicycle, monocycle
0.038 -> horse cart, horse-cart
0.019 -> lakeside, lakeshore

'''
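As an aside, torch.Tensor.topk retrieves the same top-5 without sorting all 1000 entries:

# Equivalent top-5 lookup without a full sort
top_probs, top_idx = h_x.topk(5)
for p, i in zip(top_probs.numpy(), top_idx.numpy()):
    print('{:.3f} -> {}'.format(p, classes[i]))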

6. Define the CAM computation function

# Define the function that computes the CAM
def returnCAM(feature_conv, weight_softmax, class_idx):
    # upsample the class activation map to 256 x 256
    size_upsample = (256, 256)
    bz, nc, h, w = feature_conv.shape
    output_cam = []
    # Apply the class weights to the conv features:
    #   weight_softmax.shape is (1000, 512)
    #   feature_conv.shape is (1, 512, 13, 13)
    # weight_softmax[class_idx] selects a single class's weights, so it is (1, 512)
    # feature_conv.reshape((nc, h * w)) turns the features into shape (512, 169)
    cam = weight_softmax[class_idx].dot(feature_conv.reshape((nc, h * w)))
    print(cam.shape)        # the matmul weights each channel; output shape is (1, 169)
    cam = cam.reshape(h, w) # fold back into a single 13 x 13 feature map
    # normalize every element to 0-1
    cam_img = (cam - cam.min()) / (cam.max() - cam.min())
    # then rescale to 0-255
    cam_img = np.uint8(255 * cam_img)
    output_cam.append(cv2.resize(cam_img, size_upsample))
    return output_cam
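Note that as written the function only handles one class index at a time (a longer list would break the reshape, since the matmul result would have one row per class). A sketch that loops over several classes, along the lines of the original CAM reference code:

def returnCAMs(feature_conv, weight_softmax, class_idxs):
    # Same computation as above, but one CAM per requested class
    size_upsample = (256, 256)
    bz, nc, h, w = feature_conv.shape
    output_cam = []
    for idx in class_idxs:
        cam = weight_softmax[idx].dot(feature_conv.reshape((nc, h * w)))  # (169,)
        cam = cam.reshape(h, w)
        cam_img = (cam - cam.min()) / (cam.max() - cam.min())
        output_cam.append(cv2.resize(np.uint8(255 * cam_img), size_upsample))
    return output_cam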

7. Generate the visualization images

# Generate the class activation map for the highest-probability class
CAMs = returnCAM(features_blobs[0], weight_softmax, [idx[0]])
# Blend the class activation map with the original image
img = cv2.imread(img_path)
height, width, _ = img.shape
heatmap = cv2.applyColorMap(cv2.resize(CAMs[0], (width, height)), cv2.COLORMAP_JET)
result = heatmap * 0.3 + img * 0.7
cv2.imwrite('CAM0.jpg', result)

I won't repeat what cv2.applyColorMap does here; it was covered in the previous post.
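For the blending step itself, cv2.addWeighted does the same job and keeps the result in uint8:

# Equivalent blend that stays in uint8
result = cv2.addWeighted(heatmap, 0.3, img, 0.7, 0)
cv2.imwrite('CAM0.jpg', result)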
[Figure: CAM for the top-1 class blended with the input image]


# Generate the class activation map for the fifth-ranked class
CAMs = returnCAM(features_blobs[0], weight_softmax, [idx[4]])
# Blend the class activation map with the original image
img = cv2.imread(img_path)
height, width, _ = img.shape
heatmap = cv2.applyColorMap(cv2.resize(CAMs[0], (width, height)), cv2.COLORMAP_JET)
result = heatmap * 0.3 + img * 0.7
cv2.imwrite('CAM1.jpg', result)

[Figure: CAM for the fifth-ranked class blended with the input image]
The difference is obvious at a glance.

References:
https://blog.csdn.net/qq_36825778/article/details/104193642
https://blog.csdn.net/u014264373/article/details/85415921
