2019.11.6 note

DeepGCNs: Making GCNs Go as Deep as CNNs

Graph Convolutional Networks (GCNs) offer an alternative to CNNs that accepts non-Euclidean data as input to a neural network. While GCNs already achieve encouraging results, they have so far been limited to shallow architectures of 2 to 4 layers because of vanishing gradients during training. The authors transfer concepts such as residual/dense connections and dilated convolutions from CNNs to GCNs in order to train very deep GCNs successfully, and they demonstrate the benefit of deep GCNs, with as many as 112 layers, experimentally across various datasets and tasks.

For example, EdgeConv GCN:

(figure omitted: EdgeConv GCN layer)

The residual version is:

(figure omitted: residual EdgeConv layer)
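A minimal sketch of the residual idea, with a simple mean aggregator standing in for EdgeConv (the paper's EdgeConv uses max-aggregation over learned edge features; shapes and the ReLU/mean choices here are illustrative assumptions):

```python
import numpy as np

def gcn_layer(h, adj, w):
    """Plain GCN-style layer: aggregate neighbor features, then transform.
    h: (num_nodes, d) node features, adj: (num_nodes, num_nodes) adjacency,
    w: (d, d) weights. A stand-in for F(G_l, W_l)."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-8  # avoid divide-by-zero for isolated nodes
    agg = (adj @ h) / deg                        # mean over neighbors
    return np.maximum(agg @ w, 0)                # ReLU

def res_gcn_layer(h, adj, w):
    """Residual version: G_{l+1} = F(G_l, W_l) + G_l. The identity path keeps
    gradients flowing, which is what lets the stack go very deep."""
    return gcn_layer(h, adj, w) + h

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))                      # 5 nodes, 8 features each
adj = (rng.random((5, 5)) < 0.4).astype(float)   # random adjacency
w = rng.normal(size=(8, 8)) * 0.1
out = res_gcn_layer(h, adj, w)
print(out.shape)  # (5, 8)
```

Note that with zero weights the residual layer reduces exactly to the identity, which is why deep stacks of such layers remain trainable.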

RETHINKING DATA AUGMENTATION: SELF-SUPERVISION AND SELF-DISTILLATION

In supervised settings, a common practice for data augmentation is to assign the same label to all augmented samples of the same source. However, if the augmentation introduces a large distributional discrepancy among them (e.g., rotations), forcing label invariance may be too hard a task and often hurts performance. To tackle this, the authors propose a simple yet effective idea: learn the joint distribution of the original labels and the self-supervised (augmentation) labels of the augmented samples. The joint learning framework is easier to train, and it enables aggregated inference, which combines the predictions from the different augmented samples to improve performance. Further, to speed up the aggregation at inference time, they also propose a self-distillation-style knowledge transfer technique that distills the knowledge of the augmentations into the model itself.
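A toy sketch of the joint-label idea for rotation augmentation (the class/rotation counts, label encoding, and the logit-averaging aggregation below are my own illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

NUM_CLASSES, NUM_ROTATIONS = 10, 4  # e.g., 10 classes x {0, 90, 180, 270} degrees

def joint_label(class_label, rotation_id):
    """Map a (class, rotation) pair to a single joint label in [0, K*M)."""
    return class_label * NUM_ROTATIONS + rotation_id

def aggregate_inference(joint_logits):
    """joint_logits: (M, K*M) logits, one row per augmented copy, where row r
    was produced from the input rotated by rotation r. For each class, average
    the logit of the matching (class, rotation) joint label across copies."""
    scores = np.zeros(NUM_CLASSES)
    for r in range(NUM_ROTATIONS):
        for c in range(NUM_CLASSES):
            scores[c] += joint_logits[r, joint_label(c, r)]
    return scores / NUM_ROTATIONS

rng = np.random.default_rng(0)
logits = rng.normal(size=(NUM_ROTATIONS, NUM_CLASSES * NUM_ROTATIONS))
scores = aggregate_inference(logits)
print(scores.shape)  # (10,)
```

The self-distillation step in the paper then trains a single forward pass to mimic these aggregated scores, so the M-copy inference cost is paid only during training.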

(figure omitted)

SOFTMAX IS NOT AN ARTIFICIAL TRICK: AN INFORMATION-THEORETIC VIEW OF SOFTMAX IN NEURAL NETWORKS

Despite the great popularity of applying softmax to map the non-normalised outputs of a neural network to a probability distribution over the predicted classes, this normalised exponential transformation can still seem artificial, and a theoretical framework that incorporates softmax as an intrinsic component has been lacking. The authors view neural networks that embed softmax from an information-theoretic perspective. Under this view, log-softmax can be derived naturally and mathematically as an inherent component of a neural network for evaluating the conditional mutual information between network output vectors and labels given an input datum. They show that training deterministic neural networks by maximising log-softmax is equivalent to enlarging this conditional mutual information, i.e., feeding label information into the network outputs. They also generalise the information-theoretic perspective to neural networks with stochasticity and derive upper and lower bounds on log-softmax. In theory, this view offers a rationale for embedding softmax in neural networks; in practice, they demonstrate a computer-vision example of how to employ the view to filter out targeted objects in images.
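The log-softmax quantity the paper analyses can be written down directly; a minimal numerically stable version (the shift by max(z) is standard practice, not specific to this paper):

```python
import numpy as np

def log_softmax(z):
    """log softmax(z)_i = z_i - log(sum_j exp(z_j)).
    Subtracting max(z) first avoids overflow without changing the result,
    since softmax is invariant to adding a constant to all logits."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

z = np.array([2.0, 1.0, 0.1])
ls = log_softmax(z)
print(np.exp(ls).sum())  # the implied probabilities sum to 1
# Maximising log_softmax(z)[y] for the true label y is exactly minimising
# the usual cross-entropy loss; the paper's claim is that this maximisation
# also enlarges the conditional mutual information between outputs and labels.
```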

STABILIZING TRANSFORMERS FOR REINFORCEMENT LEARNING

They stabilize transformers for reinforcement learning and propose the Gated Transformer-XL (GTrXL) for RL tasks. They show that GTrXL, trained using the same losses, has stability and performance that consistently matches or exceeds a competitive LSTM baseline, including on more reactive tasks where memory is less critical. GTrXL offers an easy-to-train, simple-to-implement, yet substantially more expressive architectural alternative to the multi-layer LSTMs ubiquitously used for RL agents in partially observable environments.
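The key change in GTrXL is replacing each residual connection x + sublayer(x) with a gating layer; the paper's best variant is a GRU-style gate. A rough numpy sketch of such a gate (the weight shapes, initialisation, and exact parameterisation below are illustrative assumptions, not the paper's precise formulation):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_gate(x, y, params, bias=2.0):
    """GRU-style gate g(x, y) used in place of the residual sum x + y.
    x: sublayer input (the skip path), y: sublayer output.
    params: dict of (d, d) matrices Wr, Ur, Wz, Uz, Wg, Ug (hypothetical names).
    The positive bias pushes the update gate z toward 0 at initialisation,
    so the layer starts out close to the identity, which is what stabilises
    early RL training."""
    r = sigmoid(y @ params["Wr"] + x @ params["Ur"])          # reset gate
    z = sigmoid(y @ params["Wz"] + x @ params["Uz"] - bias)   # update gate
    h = np.tanh(y @ params["Wg"] + (r * x) @ params["Ug"])    # candidate
    return (1.0 - z) * x + z * h

rng = np.random.default_rng(0)
d = 6
params = {k: rng.normal(size=(d, d)) * 0.1
          for k in ["Wr", "Ur", "Wz", "Uz", "Wg", "Ug"]}
x = rng.normal(size=(3, d))  # e.g., 3 tokens of width 6
y = rng.normal(size=(3, d))  # sublayer output for the same tokens
g = gru_gate(x, y, params)
print(g.shape)  # (3, 6)
```

As the bias grows, z goes to 0 and the gate output collapses to the skip path x, which is the identity-at-initialisation behaviour described above.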
