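Getting Started with Microsoft NNI

[GiantPandaCV Introduction] Neural Network Intelligence (NNI) is a toolkit that helps users design and tune neural network architectures and hyperparameters for machine learning models. It is easy to use, scalable, flexible, and efficient. This article covers the basic concepts of NNI and a beginner tutorial that trains a model on MNIST. It was first published on GiantPandaCV and may not be reproduced without permission.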

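1. Overview

NNI has the following features:

Easy to use: it can be installed with pip, and results can be viewed through a command-line tool.
Scalable: it supports different compute resources and multiple computing platforms, and can run in parallel across them.
Flexible: NNI ships with hyperparameter tuning algorithms, NAS algorithms, early stopping algorithms, and more.
Efficient: NNI is continuously optimized at both the system level and the algorithm level.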

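Basic concepts:

Experiment: one task, for example finding the best neural network architecture. It consists of an AutoML algorithm plus multiple Trials.
Search Space: the predefined space to search, such as hyperparameter ranges or a limit on the number of blocks.
Configuration: an instance drawn from the search space, i.e. one specific set of hyperparameter values fixed from that space.
Trial: an individual attempt, i.e. one run based on a particular Configuration.
Tuner: contains an AutoML algorithm and generates a new Configuration for the next Trial.
Assessor: analyzes a Trial's intermediate results to decide whether the Trial should be terminated early.
Training platform: the environment in which Trials actually run, such as the local machine, remote servers, or a cluster.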
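
To make the Assessor's role more concrete, here is a minimal sketch of a median-style early stopping rule. It is a simplified illustration, not NNI's built-in implementation; the function name should_stop_early and the min_completed threshold are assumptions made for this example.

```python
from statistics import median


def should_stop_early(history, completed, min_completed=3):
    """Return True if the running trial looks worse than the median finished trial.

    history:   intermediate accuracies of the running trial, one per epoch.
    completed: accuracy curves (lists) of trials that have already finished.
    """
    step = len(history)
    # Only compare against finished trials that reached the same epoch.
    peers = [curve[:step] for curve in completed if len(curve) >= step]
    if step == 0 or len(peers) < min_completed:
        return False  # not enough evidence yet, keep training
    peer_means = [sum(curve) / step for curve in peers]
    # Stop if even the best epoch so far is below the median running average.
    return max(history) < median(peer_means)


completed = [[0.90, 0.95, 0.97], [0.85, 0.92, 0.96], [0.80, 0.90, 0.94]]
print(should_stop_early([0.30, 0.35], completed))  # a weak trial
print(should_stop_early([0.91, 0.96], completed))  # a strong trial
```

The rule only fires once enough finished trials exist to compare against, so the first few trials always run to completion.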

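The architecture is shown in the figure below:

nnictl: the command-line tool. It controls the web server and provides other management functions; users manage experiments through this command.
NNI Core: the internal core, which implements the web UI, the nnimanager controller, the training services, and other core components.
Advisor: comprises the Tuner and the Assessor, which generate the next trial and evaluate that trial, respectively.
The right side represents the training platforms: trials are dispatched to the various platforms, and each one completes a single attempt.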

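2. Usage Logic

An Experiment runs as follows:

The Tuner receives the search space and generates configurations.
The generated configurations are submitted to the training platforms.
The training results from each platform are returned to the Tuner.
The Tuner then generates new configurations.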
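
The loop above can be sketched in a few lines of plain Python. This is a toy illustration only: ToyTuner, generate_configuration, receive_result, and fake_trial are hypothetical names invented for this example and are not part of NNI. The tuner here does random search, so it stores results without using them; a real tuner such as TPE would use the returned results to guide the next configuration.

```python
import random


class ToyTuner:
    """Hypothetical stand-in for a tuner; not NNI's Tuner class."""

    def __init__(self, search_space, seed=0):
        self.search_space = search_space
        self.rng = random.Random(seed)
        self.results = []  # (configuration, metric) pairs reported by trials

    def generate_configuration(self):
        # Random search: pick one candidate value per hyperparameter.
        return {name: self.rng.choice(values)
                for name, values in self.search_space.items()}

    def receive_result(self, configuration, metric):
        # Store the trial result so the best configuration can be reported.
        self.results.append((configuration, metric))

    def best(self):
        return max(self.results, key=lambda pair: pair[1])


def fake_trial(configuration):
    # Placeholder for real training: the score peaks at lr == 0.01.
    return 1.0 - abs(configuration["lr"] - 0.01)


tuner = ToyTuner({"lr": [0.0001, 0.001, 0.01, 0.1], "hidden_size": [128, 256]})
for _ in range(20):
    config = tuner.generate_configuration()  # the tuner proposes a configuration
    metric = fake_trial(config)              # a training platform runs the trial
    tuner.receive_result(config, metric)     # the result goes back to the tuner

best_config, best_metric = tuner.best()
print(best_config, best_metric)
```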

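From the user's side, the workflow is:

Define the search space by writing a JSON file in the required format.
Modify the existing model code to add the NNI API calls.
Define the experiment configuration by setting the required parameters in config.yml.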

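3. Features

Hyperparameter tuning: the core feature. NNI provides many popular automatic tuning algorithms and early stopping algorithms.
General NAS framework: lets users specify candidate architectures and gives NAS researchers a simple interface for developing new NAS algorithms. NNI supports several one-shot NAS algorithms, which can be run directly without launching an NNI experiment. If you also need to tune hyperparameters, however, you must launch an NNI experiment.
Model compression: a compressed network usually has a smaller model size and faster inference, without a significant drop in model performance. Model compression in NNI includes pruning and quantization algorithms.
Automatic feature engineering: finds the most effective features for downstream tasks.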

4. Installation

Installing on Linux:

python3 -m pip install --upgrade nni

Using NNI in Docker:

docker pull msranni/nni:latest

Installing on Windows:

pip install cython wheel
python -m pip install --upgrade nni

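5. Getting-Started Experiment

This section uses MNIST to demonstrate how to find the best hyperparameters for an MNIST model. The official tutorial uses TensorFlow 1.x as its example and, at the time of writing, did not yet support TensorFlow 2.x. I only have TF2 and PyTorch environments locally, so I use PyTorch for the demonstration. The demo code comes from the official repository: https://github.com/microsoft/nni/blob/master/examples/trials/mnist-pytorch

Pseudocode: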

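Output: one set of optimal hyperparameters

For t = 0, 1, 2, ..., maxTrialNum - 1:
    hyperparameter = choose a set of parameters from the search space
    final_result = run_trial_and_evaluate(hyperparameter)
    report final_result to NNI
    If the time limit is reached:
        stop the experiment
Return the best result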

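Network definition:

# Imports used by the code excerpts in this section
import logging

import nni
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from nni.utils import merge_parameter
from torchvision import datasets, transforms

logger = logging.getLogger('mnist_AutoML')

class Net(nn.Module):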
    def __init__(self, hidden_size):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, hidden_size)
        self.fc2 = nn.Linear(hidden_size, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

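This is essentially an ordinary PyTorch network, except that the constructor takes one hyperparameter, hidden_size, which NNI is responsible for searching.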
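
The 4*4*50 input size of fc1 follows from the layer shapes. As a quick check, assuming the standard 28x28 MNIST input and no padding, the arithmetic can be worked through like this (conv_out is a helper name introduced only for this example):

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output width of a convolution or pooling layer on a square input.
    return (size + 2 * padding - kernel) // stride + 1


size = 28                     # MNIST images are 28x28
size = conv_out(size, 5)      # conv1, 5x5 kernel, stride 1 -> 24
size = conv_out(size, 2, 2)   # max_pool2d(x, 2, 2)        -> 12
size = conv_out(size, 5)      # conv2, 5x5 kernel, stride 1 -> 8
size = conv_out(size, 2, 2)   # max_pool2d(x, 2, 2)        -> 4

flat_features = size * size * 50  # conv2 produces 50 channels
print(size, flat_features)        # prints "4 800"
```

So the flattened feature vector has 800 elements, which is why only the width of the hidden layer after it is left open for tuning.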

Step 1: Build the search space file

{
    "batch_size": {"_type":"choice", "_value": [16, 32, 64, 128]},
    "hidden_size":{"_type":"choice","_value":[128, 256, 512, 1024]},
    "lr":{"_type":"choice","_value":[0.0001, 0.001, 0.01, 0.1]},
    "momentum":{"_type":"uniform","_value":[0, 1]}
}

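As you can see, the search covers batch_size, hidden_size, lr, and momentum, and the entries use a couple of different values for _type.

choice means picking one of the values listed in _value; uniform means sampling the hyperparameter from a uniform distribution over the range given in _value.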
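
To illustrate what these two types mean, here is a small sketch that draws one configuration from the search space above using only the standard library. It mimics what a random tuner might do and is not NNI's own sampling code; the sample function is an assumption made for this example.

```python
import json
import random

SEARCH_SPACE_JSON = """
{
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
    "momentum": {"_type": "uniform", "_value": [0, 1]}
}
"""


def sample(search_space, rng):
    """Draw one configuration (one value per hyperparameter)."""
    config = {}
    for name, spec in search_space.items():
        if spec["_type"] == "choice":
            # Pick one of the listed candidate values.
            config[name] = rng.choice(spec["_value"])
        elif spec["_type"] == "uniform":
            # Draw a float uniformly between the two bounds.
            low, high = spec["_value"]
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type: {spec['_type']}")
    return config


search_space = json.loads(SEARCH_SPACE_JSON)
config = sample(search_space, random.Random(0))
print(config)
```

Each call yields one Configuration in the sense defined earlier: a dictionary with a concrete value for every key in the search space.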

Step 2: Add the NNI API calls to fetch hyperparameters from NNI and report results back

try:
    # get parameters from tuner
    tuner_params = nni.get_next_parameter()
    logger.debug(tuner_params)
    params = vars(merge_parameter(get_params(), tuner_params))
    print(params)
    main(params)
except Exception as exception:
    logger.exception(exception)
    raise

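On the third line, nni.get_next_parameter() asks the tuner for the next configuration. The parameters are merged with the defaults and passed to main (seventh line), which then runs one trial based on that configuration.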
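
The merge step can be pictured as "tuner values override script defaults". The following is a rough standalone sketch of that idea. The functions get_default_params and merge and the default values are hypothetical stand-ins written for this example; they are not the example's get_params or NNI's merge_parameter.

```python
import argparse


def get_default_params():
    # Hypothetical defaults; the real example defines its own argument list.
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--hidden_size", type=int, default=512)
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--momentum", type=float, default=0.5)
    parser.add_argument("--epochs", type=int, default=10)
    # Parse an empty list so the sketch ignores the real command line.
    args, _ = parser.parse_known_args([])
    return args


def merge(base, override):
    # Illustrative stand-in for the merge step: tuner values win over defaults.
    for key, value in override.items():
        if not hasattr(base, key):
            raise ValueError(f"unknown hyperparameter: {key}")
        setattr(base, key, value)
    return base


# What a tuner might return for one trial (values taken from the search space).
tuner_params = {"batch_size": 32, "hidden_size": 128, "lr": 0.001, "momentum": 0.9}
params = vars(merge(get_default_params(), tuner_params))
print(params)
```

Keys that the tuner does not control, such as epochs here, keep their default values, which is why the script still works with a search space that covers only some of its arguments.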

Inside main, the hidden_size, lr, momentum, and other parameters are read from args:

def main(args):
    use_cuda = not args['no_cuda'] and torch.cuda.is_available()

    torch.manual_seed(args['seed'])

    device = torch.device("cuda" if use_cuda else "cpu")

    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

    data_dir = args['data_dir']

    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST(data_dir, train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args['batch_size'], shuffle=True, **kwargs)

    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST(data_dir, train=False, transform=transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))
        ])),
        batch_size=1000, shuffle=True, **kwargs)

    hidden_size = args['hidden_size']

    model = Net(hidden_size=hidden_size).to(device)
    
    optimizer = optim.SGD(model.parameters(), lr=args['lr'],
                          momentum=args['momentum'])

    for epoch in range(1, args['epochs'] + 1):
        train(args, model, device, train_loader, optimizer, epoch)
        test_acc = test(args, model, device, test_loader)

        # report intermediate result
        nni.report_intermediate_result(test_acc)
        logger.debug('test accuracy %g', test_acc)
        logger.debug('Pipe send intermediate result done.')

    # report final result
    nni.report_final_result(test_acc)
    logger.debug('Final result is %g', test_acc)
    logger.debug('Send final result done.')

Reporting results:

for epoch in range(1, args['epochs'] + 1):
    train(args, model, device, train_loader, optimizer, epoch)
    test_acc = test(args, model, device, test_loader)

    # report intermediate result
    nni.report_intermediate_result(test_acc)
    logger.debug('test accuracy %g', test_acc)
    logger.debug('Pipe send intermediate result done.')

# report final result
nni.report_final_result(test_acc)
logger.debug('Final result is %g', test_acc)
logger.debug('Send final result done.')

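The key calls are nni.report_intermediate_result, which reports an intermediate result after each epoch, and nni.report_final_result, which reports the final result of the trial.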

Step 3: Define the configuration file, declaring the search space and the Trial

authorName: pprp
experimentName: example_mnist_pytorch
trialConcurrency: 1 # number of trials run concurrently
maxExecDuration: 1h # maximum duration of the whole experiment
maxTrialNum: 10 # maximum number of trials to run
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json # JSON file that defines the search space
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE # tuner algorithm to use
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: python3 mnist.py # command that launches one trial
  codeDir: .
  gpuNum: 1 # number of GPUs per trial

With everything in place, start the MNIST Experiment from the command line:

nnictl create --config config.yml

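Open the link shown in the figure above (the Web UI URL printed by nnictl) to see the NNI Web UI.

The official tutorial is based on TensorFlow 1.x; for details, see https://nni.readthedocs.io/zh/stable/Tutorial/QuickStart.html

More tutorials on using NAS will follow; stay tuned.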
