# Frontiers in Deep Learning and Reinforcement Learning

First of all, this post does not claim to exhaustively narrate everything discussed during the summer school. There are slides and video lectures for that, and I highly recommend that you go through them. Rather, in this post I will mention only ideas that are either 1) new to me, 2) old but on which I have gained a new perspective and deeper understanding, or 3) plain interesting. Some of these are mentioned in the slides or during the talks. Lastly, I do not claim to have understood all the topics presented; what I am writing here is based solely on my own understanding. Please refer to the actual slides or video lectures for a detailed treatment of each subject.

The first part was about deep learning and the second part was on reinforcement learning.

The first part started with an overview of Machine Learning by Doina Precup (McGill University). She covered parametric vs non-parametric models, the bias-variance trade-off, regularisation, etc.

The next part was an introduction to Neural Networks by Hugo Larochelle (Google Brain). He talked about the basics of neural network learning by backpropagation, etc. He also went through the hyperparameters that need to be tuned, especially during optimisation. At the end of his talk, he mentioned one-shot and zero-shot learning and the need to design new architectures based on intuition.

Yoshua Bengio (MILA, Uni of Montreal) gave the next talk on Recurrent Neural Networks. He described the sequence-to-sequence network as an encoder-decoder architecture, akin to an autoencoder. He also emphasised that the multiplicative interactions in the network increase its expressive power. Regarding the gradients: on the one hand, if the eigenvalues of the recurrent Jacobian are < 1 in magnitude, the gradient approaches zero as it is propagated back through many time steps, hence the vanishing gradient problem; on the other hand, if the eigenvalues are > 1, the gradient grows without bound, hence the exploding gradient problem. He said that in GRUs gradients are copied instead of updated. Furthermore, he said that the main gate in the LSTM network is the forget gate and that how the gates work is not yet fully understood. He also mentioned the use of attention to make the network focus on the important information. (It is to be noted that he and his group pioneered work on both RNNs and the attention mechanism.) Lastly, he said that backpropagation through time (BPTT), which is how RNNs are trained, does not seem biologically plausible because our brain does not backpropagate errors into the past in order to learn something. (During the summer school, Yoshua Bengio was awarded the Order of Canada for his contributions in this field.)
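
The eigenvalue argument for vanishing and exploding gradients can be illustrated with a small NumPy sketch. This is my own toy example, not something from the talk: it assumes a purely linear recurrence (no nonlinearity), and the function name `gradient_norm_after` and its parameters are hypothetical. The recurrent matrix is a scaled orthogonal matrix, so all of its eigenvalues have magnitude exactly `scale`.

```python
import numpy as np


def gradient_norm_after(steps, scale, size=4, seed=0):
    """Norm of a gradient backpropagated `steps` times through a linear RNN.

    Assumption for illustration: the recurrent matrix is `scale` times a
    random orthogonal matrix, so every eigenvalue has magnitude `scale`.
    """
    rng = np.random.default_rng(seed)
    # QR decomposition of a random matrix yields an orthogonal factor q.
    q, _ = np.linalg.qr(rng.standard_normal((size, size)))
    w = scale * q
    grad = np.ones(size)  # gradient arriving at the last time step
    for _ in range(steps):
        # One step of backpropagation through time in a linear RNN.
        grad = w.T @ grad
    return np.linalg.norm(grad)


if __name__ == "__main__":
    for scale in (0.9, 1.0, 1.1):
        print(scale, gradient_norm_after(50, scale))
```

Because the orthogonal factor preserves norms, the gradient norm here changes by exactly a factor of `scale` per step: it shrinks towards zero for `scale < 1` and blows up for `scale > 1`. Real RNNs have nonlinearities and non-orthogonal weights, so this only shows the trend.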

The next talk was about Probabilistic Numerics by Mike Osborne (Oxford). This area is new to me, but one thing he said intrigued me: “integration beats optimisation”. He shared a link on the subject for those interested in going into this field.

The next talk was about Generative Models by Ian Goodfellow (Google Brain). He began by describing the taxonomy of generative models and then explained how Generative Adversarial Networks (GANs) work. What interested me is the idea that GANs can be used to generate simulated training data for other models. In his second talk, he discussed the state of the art in GANs and related generative models such as PixelCNN.

The next talk was given by Mike Osborne (Oxford) on the Future of Work. He presented statistics on which professions are likely to be affected by the coming age of intelligent machines, but he also emphasised that, although this may happen, history shows that new jobs also emerge and that it is up to policy-makers to manage the transition well. He also said that while AI has advanced dramatically, a cleaning assistant robot in the house would not be feasible even in 20-30 years, despite how simple that job seems.

The next talk was about Convolutional Neural Networks (CNNs) by Richard Zemel (Uni of Toronto). He talked about the advances in CNNs, but one thing that was particularly interesting to me was highway networks, which can dynamically determine when data is passed through unchanged or transformed.
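
To make the gating idea behind highway networks concrete, here is a minimal NumPy sketch of a single highway layer, following the standard formulation y = T(x) · H(x) + (1 − T(x)) · x. It is not taken from the talk: the function and parameter names (`highway_layer`, `w_h`, `b_h`, `w_t`, `b_t`) are my own, and the choice of tanh for the transform is an assumption.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer; input and output must have the same width."""
    h = np.tanh(x @ w_h + b_h)  # candidate transform H(x)
    t = sigmoid(x @ w_t + b_t)  # transform gate T(x), values in (0, 1)
    # The carry gate is 1 - T(x): where t is near 0 the input passes through
    # unchanged, where t is near 1 the transformed value is used instead.
    return t * h + (1.0 - t) * x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 3))
    w_h = rng.standard_normal((3, 3))
    w_t = rng.standard_normal((3, 3))
    # A negative gate bias starts the layer close to the identity mapping.
    print(highway_layer(x, w_h, np.zeros(3), w_t, np.full(3, -2.0)))
```

The gate is computed from the input itself, which is what lets the layer decide per example, and per unit, how much to transform and how much to carry through.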

Raquel Urtasun (Head, Uber Toronto; Prof, Uni of Toronto) gave the next talk about Deep Structured Models. She emphasised that the outputs of these models are statistically dependent. Problems with structure in the output are usually approached using Markov Random Fields (MRFs), but only as a post-processing step. The complex dependencies that arise when multiple variables are predicted fit nicely with the idea of using graphical models, hence the natural connection to deep learning. With deep structured models, complex dependencies can be modelled using one loss function, much like in multi-task learning. Her advice for exploring this convergence of graphical models and deep learning is to think about how to encode what we know and what we don’t know about the problem. At the end, she mentioned energy-based models as a way of looking at the loss functions.

The next talk was on Natural Language Processing by Phil Blunsom (Oxford). He first differentiated computational linguistics from natural language processing. He then talked about language modelling and mentioned that there are better datasets for this task, such as WikiText, and that we should no longer use the old-fashioned PTB or similar ones because of the inherent biases present in the data. He traced the evolution of methods from n-gram models to neural models, up to RNNs and more sophisticated versions such as recurrent highway networks, which are the state of the art on this task.

In the second part of his talk, he discussed the advances and frontiers of NLP. For example, he asked what is actually limiting us: is it the thought or the syntax? He suggested that GANs may be useful in modelling structures. He mentioned a recent work on learning word compositionality in sentences in which reinforcement learning was used to select structures that maximise task-based rewards. Generally, how to capture structure is still an open problem. He mentioned their research on grounding words to learn their meaning, i.e., they experimented with linguistic symbols in order to interpret the meaning of an utterance. However, to emphasise the complexity of the problem, language acquisition is an open problem in itself. He pointed to the large body of research in psychology on language acquisition.

The next talk was about Computational Neuroscience by Blake Richards and Surya Ganguli. The emphasis of this talk was on how neuroscience can learn from deep learning and vice versa.

The other talks included Graphical Models + Deep Learning by Max Welling and Matt Johnson, and Learning to Learn by Nando de Freitas, who emphasised that truly end-to-end learning only happens if it includes parameter learning as well.

The second part, on Reinforcement Learning, was opened by Joelle Pineau (McGill University). She gave a very good introduction to the subject, especially in differentiating the jargon involved. Pieter Abbeel (UC Berkeley) then gave a talk on Policy Search, followed by Rich Sutton’s talk on TD learning. Much of the content was new to me, so I will not attempt to repeat it all here :)

The most interesting ideas from this summer school that are relevant to my research at this point come from the talks on GANs, grounding, deep structured models and RL.

To all the organisers of the summer school, I could not thank you more. I feel so lucky to have been among the few admitted to attend it. The summer school has opened up to me the vast horizon of research frontiers in this field.

*Thank you for scrolling*