class mxnet.gluon.Trainer(params, optimizer, optimizer_params=None, kvstore='device', compression_params=None, update_on_kvstore=None)
This class applies an Optimizer to a set of Parameters; a Trainer should be used together with autograd.
Parameters
- params – ParameterDict; the set of parameters to optimize, typically obtained via net.collect_params().
- optimizer – str or an mxnet.optimizer.Optimizer instance, e.g. the string 'sgd'.
- optimizer_params – dict of arguments passed to the optimizer constructor, e.g. learning_rate, wd (weight decay), clip_gradient, lr_scheduler; see the optimizer's constructor for the full list.
- kvstore (str or KVStore) – kvstore type for multi-gpu and distributed training. See help on mxnet.kvstore.create for more information.
- compression_params (dict) – Specifies the type of gradient compression and additional arguments depending on the type of compression being used. For example, 2bit compression requires a threshold; the arguments would then be {'type': '2bit', 'threshold': 0.5}. See the mxnet.KVStore.set_gradient_compression method for more details on gradient compression.
- update_on_kvstore (bool, default None) – Whether to perform parameter updates on the kvstore. If None, the trainer will choose the more suitable option depending on the type of kvstore. If the update_on_kvstore argument is provided, the environment variable MXNET_UPDATE_ON_KVSTORE will be ignored.
Properties
- learning_rate (float) – The current learning rate of the optimizer. Given an Optimizer object optimizer, its learning rate can be accessed as optimizer.learning_rate; on a Trainer, the current learning rate can be read via trainer.learning_rate.
Methods
- allreduce_grads(): For each parameter, reduce the gradients from the different contexts.
- load_states(fname):Loads trainer states (e.g. optimizer, momentum) from a file.
- save_states(fname):Saves trainer states (e.g. optimizer, momentum) to a file.
- set_learning_rate(lr): Sets a new learning rate.
- step(batch_size, ignore_stale_grad=False): Should be called after autograd.backward() and outside of autograd.record(). For ordinary parameter updates, step() is enough; it internally calls allreduce_grads() followed by update(). If the gradients need some transformation first, such as gradient clipping, call allreduce_grads() and update() manually instead (see the sketch after this list).
- update(batch_size, ignore_stale_grad=False): Makes one step of parameter update.
  - batch_size (int) – Batch size of data processed. The gradient will be normalized by 1/batch_size. Set this to 1 if the loss was normalized manually with loss = mean(loss).
  - ignore_stale_grad (bool, optional, default=False) – If true, ignores Parameters with stale gradients (gradients that have not been updated by backward since the last step) and skips their update.
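A minimal sketch, assuming a toy single-layer net and random data (not from the text above): the ordinary trainer.step() path, and the manual allreduce_grads()/update() path used when gradients need a transform such as clipping. The manual path assumes update_on_kvstore is not forced to True (the default single-device setup).

```python
from mxnet import autograd, gluon, nd

net = gluon.nn.Dense(1)
net.initialize()
loss_fn = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1, 'wd': 1e-4})

data = nd.random.normal(shape=(8, 4))
label = nd.random.normal(shape=(8, 1))

# Ordinary update: step() calls allreduce_grads() and update() internally.
with autograd.record():
    loss = loss_fn(net(data), label)
loss.backward()
trainer.step(batch_size=data.shape[0])

# Manual path: reduce gradients, clip them, then apply the update.
with autograd.record():
    loss = loss_fn(net(data), label)
loss.backward()
trainer.allreduce_grads()
grads = [p.grad() for p in net.collect_params().values() if p.grad_req != 'null']
gluon.utils.clip_global_norm(grads, max_norm=1.0)
trainer.update(batch_size=data.shape[0])
```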
class mxnet.optimizer.Optimizer(rescale_grad=1.0, param_idx2name=None, wd=0.0, clip_gradient=None, learning_rate=0.01, lr_scheduler=None, sym=None, begin_num_update=0, multi_precision=False, param_dict=None)
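Instead of a string, Trainer also accepts an Optimizer object. A minimal sketch (all hyperparameter values here are made up) of building an mxnet.optimizer.SGD instance with an attached lr_scheduler and handing it to a Trainer:

```python
import mxnet as mx
from mxnet import gluon

net = gluon.nn.Dense(1)
net.initialize()

schedule = mx.lr_scheduler.FactorScheduler(step=1000, factor=0.9, base_lr=0.05)
opt = mx.optimizer.SGD(learning_rate=0.05, momentum=0.9, wd=1e-4,
                       lr_scheduler=schedule)
trainer = gluon.Trainer(net.collect_params(), opt)

# Equivalent string form: the scheduler rides along in optimizer_params.
# trainer = gluon.Trainer(net.collect_params(), 'sgd',
#                         {'learning_rate': 0.05, 'momentum': 0.9,
#                          'wd': 1e-4, 'lr_scheduler': schedule})

print(trainer.learning_rate)   # reads the learning_rate property shown above
```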
class mxnet.lr_scheduler.LRScheduler(base_lr=0.01, warmup_steps=0, warmup_begin_lr=0, warmup_mode='linear')
Base class of a learning rate scheduler.
A scheduler returns a new learning rate based on the number of updates that have been performed.
Parameters:
- base_lr (float, optional) – The initial learning rate.
- warmup_steps (int) – number of warmup steps used before this scheduler starts decay
- warmup_begin_lr (float) – if using warmup, the learning rate from which it starts warming up
- warmup_mode (string) – warmup can be done in two modes: 'linear' mode gradually increases the lr with each step in equal increments; 'constant' mode keeps the lr at warmup_begin_lr for warmup_steps.
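A scheduler is simply called with the current number of updates and returns the new learning rate. A minimal sketch of a custom subclass (the halving rule and all numbers are assumptions, not from the text):

```python
from mxnet import lr_scheduler

class StepDownScheduler(lr_scheduler.LRScheduler):
    """Hypothetical scheduler: halve the lr every `step` updates."""
    def __init__(self, step, base_lr=0.01):
        super(StepDownScheduler, self).__init__(base_lr=base_lr)
        self.step = step

    def __call__(self, num_update):
        # num_update = number of optimizer updates performed so far
        return self.base_lr * (0.5 ** (num_update // self.step))

sched = StepDownScheduler(step=100, base_lr=0.1)
print([sched(i) for i in (0, 99, 100, 250)])  # [0.1, 0.1, 0.05, 0.025]
```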
class mxnet.lr_scheduler.FactorScheduler(step, factor=1, stop_factor_lr=1e-08, base_lr=0.01, warmup_steps=0, warmup_begin_lr=0, warmup_mode='linear')
Reduce the learning rate by a factor for every n steps.
It returns a new learning rate by:
base_lr * pow(factor, floor(num_update/step))
Parameters:
- step (int) – Changes the learning rate for every n updates.
- factor (float, optional) – The multiplicative factor by which the learning rate is scaled at each change.
- stop_factor_lr (float, optional) – Stop updating the learning rate if it is less than this value.
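A quick numeric check of this rule (the step and factor values are made up). Note that FactorScheduler keeps internal state, so it should be queried with non-decreasing num_update values:

```python
from mxnet import lr_scheduler

# lr = base_lr * factor ** floor(num_update / step)
sched = lr_scheduler.FactorScheduler(step=100, factor=0.5, base_lr=0.1)
for num_update in (50, 150, 250):
    print(num_update, sched(num_update))   # ~0.1, ~0.05, ~0.025
```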
In addition, the following schedulers are available:
class mxnet.lr_scheduler.MultiFactorScheduler(
step,
factor=1,
base_lr=0.01,
warmup_steps=0,
warmup_begin_lr=0,
warmup_mode='linear'
)
class mxnet.lr_scheduler.PolyScheduler(
max_update,
base_lr=0.01,
pwr=2,
final_lr=0,
warmup_steps=0,
warmup_begin_lr=0,
warmup_mode='linear'
)
class mxnet.lr_scheduler.CosineScheduler(
max_update,
base_lr=0.01,
final_lr=0,
warmup_steps=0,
warmup_begin_lr=0,
warmup_mode='linear'
)
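A combined sketch of the three schedulers above, with made-up max_update, step, and warmup values; each is called with a few update counts to show how the learning rate evolves:

```python
from mxnet import lr_scheduler

multi = lr_scheduler.MultiFactorScheduler(step=[300, 700, 900], factor=0.5,
                                          base_lr=0.1)
poly = lr_scheduler.PolyScheduler(max_update=1000, base_lr=0.1, pwr=2,
                                  final_lr=0.001,
                                  warmup_steps=50, warmup_begin_lr=0.0)
cosine = lr_scheduler.CosineScheduler(max_update=1000, base_lr=0.1,
                                      final_lr=0.001,
                                      warmup_steps=50, warmup_begin_lr=0.0)

# MultiFactorScheduler halves the lr at updates 300, 700 and 900;
# PolyScheduler and CosineScheduler decay smoothly from base_lr to final_lr
# by max_update, after a 50-step linear warmup.
for num_update in (25, 100, 500, 1000):
    print(num_update, multi(num_update), poly(num_update), cosine(num_update))
```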