Week 1 of Neural Networks and Deep Learning
Introduction to Deep Learning
These are my notes on week #1 ("Introduction to Deep Learning") of the Neural Networks and Deep Learning Coursera MOOC.
Overview
Video #1 runs through an overview of all 5 courses in the specialisation.
What is a Neural Network?
Video #2 asks What Is A Neural Network? For example, samples of house size (x) and house price (y) can be fitted with a straight line using linear regression. Since the house price can never be negative, the curve ends up looking like _/. This simple graph is a neural network: given an input size, an output price can be predicted. This shape appears frequently in NN literature, and is known as a ReLU (Rectified Linear Unit).
Input house size can be expanded to many more features like ZIP code, number of bedrooms, etc - “home features”.
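To make the _/ shape concrete, here's a minimal sketch (my own, not course code) of a single ReLU "neuron" mapping home features to a price; the feature values, weights and bias are made up purely for illustration.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: max(0, z) -- the _/ shape from the video."""
    return np.maximum(0, z)

# Hypothetical home features and made-up weights, for illustration only.
features = np.array([2100.0, 3.0, 2.0])       # size (sq ft), bedrooms, bathrooms
weights  = np.array([180.0, 5000.0, 3000.0])  # invented weights
bias     = -20000.0

price = relu(weights @ features + bias)       # never negative, thanks to the ReLU
print(price)
```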
Supervised learning
Video #3 talks about Supervised Learning. It discusses different applications, and which type of NN is best suited, e.g.
| Input (x) | Output (y) | Application | Type | Data |
|---|---|---|---|---|
| Home features | Price | Real estate | Standard NN | Structured (e.g. CSV) |
| Ad, user info | Click? (0/1) | Online ads | Standard NN | Structured |
| Image | Classification | Photo tagging | CNN | Unstructured |
| Audio | Text transcript | Speech recog | RNN (because it’s 1-dimensional time-series) | Unstructured |
| English | Chinese | Machine translation | RNN (also sequence data) | Unstructured |
| Image, Radar | Position of other cars | Autonomous driving | Custom/Hybrid | Unstructured |
Why is Deep Learning taking off?
Video #4 asks why deep learning is taking off. (Well, I’ve personally been fascinated with neural nets since the late nineties, but here we go…) Answer: data sets are getting larger, and NNs keep improving with data scale where traditional algorithms plateau. Ng briefly touches on why the sigmoid is falling out of favour and being replaced by ReLU: the sigmoid has a near-zero gradient in the two outermost regions (the tails of the curve), which slows down training, because if you implement gradient descent and the gradient is ~0, the parameters change very slowly. With the ReLU, the gradient is 1 for all positive values. (The fact that the left region has a 0 slope and how it impacts training will probably become more apparent in later videos, but I suspect some manner of pruning/drop-out will come into play here.)
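A quick numpy sketch (my own illustration, not from the course) of that gradient argument: the sigmoid's derivative is essentially zero in both tails, while the ReLU's derivative is exactly 1 for any positive input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # nearly 0 in both tails

def relu_grad(z):
    return (z > 0).astype(float)  # 1 for positive z, 0 otherwise

zs = np.array([-10.0, -1.0, 1.0, 10.0])
print("sigmoid'(z):", sigmoid_grad(zs))  # roughly [4.5e-05  0.20  0.20  4.5e-05]
print("relu'(z):   ", relu_grad(zs))     # [0. 0. 1. 1.]
```

So a unit stuck in a sigmoid tail barely updates under gradient descent, whereas a ReLU unit with a positive input keeps receiving full-sized gradients.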
About the course
Video #5 talks about the curriculum again.
Resources
Video #6 mentions course resources:
- forum
- yup, that’s it
Heroes of Deep Learning
Video #7, the last and optional video, is an interview with Geoffrey Hinton, which I can wholeheartedly recommend. Interestingly, around the 12m30s mark Hinton notes that he was working on variational methods, and it just so happens that people in statistics were working on the same problem, but they didn’t know about each other at the time. (Which goes to show that we can all do with better communication, and/or opening our eyes more often.) Around the 23m mark Hinton touches on Capsule Networks, which is something I’m also excited about.
Advice:
- read the literature, but not too much of it!
- notice what everyone does wrong, and do it right
Around the 36m mark, Hinton says the thing which is on everyone’s lips right now: we’re not programming computers anymore, we’re showing computers.