The article builds a bridge between the Gestalt principles of human perception (which describe how visual elements are interpreted) and modeling with convolutional neural networks. It suggests that a single principle, adaptation to the statistical structure of the environment, might suffice to explain perceptual mechanisms.
Playing hard exploration games by watching YouTube
The article presents a method to map unaligned videos from multiple sources to a common representation using self-supervised objectives constructed over both time and modality (i.e. vision and sound). The embedding of YouTube videos in this representation is then used to construct a reward function that encourages an agent to imitate human gameplay.
Attention Is All You Need
The article proposes a new, simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. The model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. The Transformer also generalizes well to other tasks.