Francois Chollet @ Tokyo University
@fchollet
title: Deep Learning: current capabilities, limitations, and future perspectives
software engineer at Google Brain
some small implementation details turned out to be critical
- like "how do you encode knowledge and reasoning in a computer"
- scale makes a difference in deep learning (vs linear regression)
- very large parametric models trained on many many samples
- a layer is a geometric transformation that turns one vector into another
- a sequence of simple transformations that turns a high-dimensional input space into a lower-dimensional representation space
- enough data
- == a dense sampling of the 'input × output' space
- chess requires no innate priors
- therefore humans and models alike have to start learning from scratch
- therefore models can achieve arbitrary levels of skill
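The "layer as a geometric transformation" idea above can be sketched in plain NumPy. This is a toy illustration (not code from the talk): each layer is just an affine map followed by a pointwise nonlinearity, and stacking them chains simple transformations from one vector space to another.

```python
import numpy as np

def dense(x, w, b):
    # One layer: an affine geometric transformation (x @ w + b)
    # followed by a pointwise ReLU nonlinearity.
    return np.maximum(0.0, x @ w + b)

rng = np.random.default_rng(0)
# A chain of simple transforms: 784-d input -> 64-d -> 32-d -> 10-d output.
w1, b1 = rng.standard_normal((784, 64)) * 0.01, np.zeros(64)
w2, b2 = rng.standard_normal((64, 32)) * 0.01, np.zeros(32)
w3, b3 = rng.standard_normal((32, 10)) * 0.01, np.zeros(10)

x = rng.standard_normal((1, 784))      # one input vector
h = dense(dense(x, w1, b1), w2, b2)    # intermediate representations
y = h @ w3 + b3                        # final linear map into the output space
print(y.shape)  # (1, 10)
```

Training then amounts to tuning the parameters (w, b) of this hard-coded chain of geometric transforms, which is the framing the later "program synthesis" point contrasts against.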
limitations
- extreme sensitivity to adversarial perturbations
- etc from his book
- it can match templates of arbitrary complexity
deep learning is pattern recognition
- what can pattern recognition solve?
- any problem that can be mapped to pattern recognition can be solved
- humans can cover a lot of ground with very little data
- extreme abstraction of meaning from data
ai of the future
- needs
- better metrics
- the right kpi
- many benchmarks today focus on skill, not intelligence
- an ambitious new benchmark is needed to measure progress
- can a strong player in StarCraft 1 play StarCraft 2 well?
- yes, within a few rounds
- it looks different, with different features and different units, but the player is able to generalize over them.
- richer models
- many interesting problems cannot be expressed as a stack of layers
- learning in "machine learning" will be more program synthesis than tuning the parameters of a hard-coded geometric transform
- future AI systems will blend pattern recognition (geometric intelligence) with abstraction & reasoning (symbolic intelligence)
- but how do we scale symbolic intelligence?
- we frame it as a differentiable model and search it?
- we can't learn these modules on every new task (too complex, too little data)
- we'll need a library of reusable symbolic & geometric modules
- in order to create stronger priors
- lifelong learning
- abstract literally means 'reusable'
long term vision
- meta learning
- look at GitHub, grab the libraries that solve parts of the problem, put them together, solve the problem, then contribute the result back into the library (e.g. npm / GitHub)
- a model that learns to map
- inputs (problems) to solution functions (solution space == npm library / github)
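The "problems to solution functions" mapping above can be caricatured as a lookup into a shared function library. A toy sketch under invented names (in the vision, a learned meta-model would do this retrieval and composition, not a hand-written dict):

```python
# A shared library of reusable solution functions (the npm/GitHub analogy).
LIBRARY = {
    "sort": sorted,
    "reverse": lambda xs: list(reversed(xs)),
    "dedupe": lambda xs: list(dict.fromkeys(xs)),  # order-preserving dedupe
}

def solve(problem_tag, data):
    # Map a problem description to a solution function, then apply it.
    fn = LIBRARY[problem_tag]
    return fn(data)

print(solve("sort", [3, 1, 2]))    # [1, 2, 3]
print(solve("dedupe", [1, 1, 2]))  # [1, 2]
```

The interesting (open) part is learning the mapping itself and adding newly synthesized functions back to the library, which is what makes this lifelong meta-learning rather than a static lookup.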