Singularity Summit – Peter Norvig Director of Research at Google #SS12

Peter Norvig (Director of Research, Google)

He went over the last part of his 2007 Singularity Summit talk

Probabilistic First Order Logic (in 2007 he thought it was needed, but on review it is not)
Hierarchical Representation and Problem Solving (yes, progress made and needed it)
Learning over the above (yes)
Lots of data (yes, used 10 million videos and photos from YouTube)
Online (yes, this is partial – they loaded it offline and used it online)
Efficiently (yes)

Important research for the automatic image categorization work

Google trained a picture-identifying system on 10 million YouTube videos
Tens of thousands of nodes. Each node identified a particular thing.
Cats, people, yellow flowers, etc.

Sparse Coding (Olshausen and Field, 1996)

Deep Belief Networks (Hinton, Osindero and Teh, 2006, 16 pages)

Hinton tutorial on Deep belief networks (100 pages)

Peter Norvig answered questions about AI on Reddit back in 2010

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations (2009, 8 pages)

They were able to get the eyes and noses identified on faces, and key features of cars and chairs

Google put together the deep learning team

Andrew Ng
Geoffrey Hinton
Jeff Dean (the go-to person for scaling solutions to warehouse-scale data centers)
16,000 CPUs across 1,000 servers
1 billion parameters, 200 by 200 pixel images (100 times bigger than the best prior solution, but still 100 to 1,000 times smaller than what people have)

They are not duplicating the brain

They applied deep learning to speech recognition and got a 37% error reduction. This is the largest error reduction in a number of years and represents a lot of progress.
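
To make the 37% figure concrete, here is a minimal arithmetic sketch. It assumes the number is a relative reduction in error rate, which is how such results are usually quoted but which the talk notes do not confirm. The 20% baseline error rate is purely hypothetical and was not given in the talk.

```python
def error_after_reduction(baseline_error, relative_reduction):
    """Error rate left after a relative reduction (e.g. 0.37 for 37%)."""
    return baseline_error * (1.0 - relative_reduction)


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error that was removed."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical baseline: a system that gets 20% of words wrong.
baseline = 0.20
improved = error_after_reduction(baseline, 0.37)

print(f"baseline error: {baseline:.1%}")
print(f"error after a 37% relative reduction: {improved:.1%}")
```

Under that assumed baseline, the error rate would fall from 20% to roughly 12.6%, not to a rate 37 percentage points lower.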

This approach is hierarchical, learned, unsupervised, and uses lots of data.

Machine translation

They also have machine translation of languages (e.g., German to English)
Find millions of translated language pairs.
Match phrases to phrases, then use jigsaw-puzzle-solving techniques to fit them together
Three submodels are combined
Translation model, target model (expected word counts in the target language), distortion model
Optimize a formula that combines the functions
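
The combination of the three submodels above can be sketched as a weighted sum of log-probabilities, with the highest-scoring candidate chosen as the translation. This is a minimal illustration and not Google's actual formula. The weights, the candidate sentences, and all probabilities below are made up for the example, and the function names are hypothetical.

```python
import math

# Hypothetical weights for the three submodels; a real system would tune these.
WEIGHTS = {"translation": 1.0, "target": 1.0, "distortion": 0.5}


def combined_score(log_probs, weights=WEIGHTS):
    """Weighted sum of the log-probabilities from the three submodels."""
    return sum(weights[name] * log_probs[name] for name in weights)


def best_translation(candidates, weights=WEIGHTS):
    """Return the text of the candidate with the highest combined score."""
    best = max(candidates, key=lambda c: combined_score(c["log_probs"], weights))
    return best["text"]


# Made-up candidates for the German phrase "das ist ein kleines Haus".
candidates = [
    {"text": "this is a small house",
     "log_probs": {"translation": math.log(0.30),   # phrases match the source
                   "target": math.log(0.020),       # reads as fluent English
                   "distortion": math.log(0.9)}},   # little reordering
    {"text": "this is a house small",
     "log_probs": {"translation": math.log(0.30),
                   "target": math.log(0.001),       # unlikely English word order
                   "distortion": math.log(0.9)}},
    {"text": "small house this is a",
     "log_probs": {"translation": math.log(0.30),
                   "target": math.log(0.0005),
                   "distortion": math.log(0.2)}},   # heavy reordering penalized
]

print(best_translation(candidates))
```

All three candidates use the same phrase translations, so the target model and the distortion model decide between them. That is the role of those two submodels in the description above.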

A Successful Online AI Course

Sebastian Thrun and Peter Norvig

160,000 students in 200 countries
The goal is to get closer to one-on-one tutoring
Education is the Afghanistan of Technology. Technologies try to conquer education but then fail over 10 years.
Thomas Edison predicted that films used in education would replace books in 10 years.
Feedback needs to flow between the teacher and the student, and in particular back to the student.

Where to go from here

Need to move beyond still images of cats to full video and other inputs
Computer vision as a field has moved away from 3D toward 2D
Many 2D images from many angles are an approximation of 3D
Current systems cannot handle deformable objects like jellyfish
For language, they have plateaued by looking only at words
They now add syntax, word clustering, word endings, word classes, and more hierarchical clustering

Software and algorithmic advances will be more important than hardware for the next 5-10 years at least.
