Demystifying AI and Machine Learning: What You Need to Know
Good evening, everybody. Welcome to Stern. My name is Natalia Levina, and I'm the director of the Fubon Center for Technology, Business and Innovation, if you haven't heard about the Fubon Center yet, or registered without noticing the center's profile. Our center was launched in April of 2018, so we are, strictly speaking, right about one year old. The mission of our center is to support interdisciplinary work related to the topics of technology and innovation. I am lucky to have three amazing co-directors of the center, two of whom are here: Kathleen DeRose, if you can wave, she's there, leading the FinTech part of our center, and in a minute I'll introduce my colleague, Professor Foster Provost. We are also lucky to have Professor Melissa Schilling, who I'm sure some of you had as your instructor, as I see many familiar faces of my former students.
With that said, we are very fortunate to launch our new series on artificial intelligence in business, which is led by Professor Foster Provost, and this is the first of our series of talks. Today's talk, as you know, focuses on machine learning and artificial intelligence. On April 15th, in case you missed it, it's in your bookmarks, we have the next talk, which focuses on the topic of algorithmic fairness, and in September, in case you forget about us over the summer, we will have another talk focused on the issues of artificial intelligence and the changing nature of work. You can find wonderful videos from our prior events on our website, so if you missed our FinTech conference, which was held in October, all the talks and notes for the talks you can find there. And finally, in case you really want to capture and share the great insights from today with your colleagues, we will have a full video of today's event posted in a few days, and for
anybody who registered, either on site or ahead of time, will receive a link to the video, so don't worry if you miss a few phrases; you'll have the video.
Without further ado, I'd like to introduce my colleague and co-director of the Fubon Center, Professor Foster Provost. Professor Provost is Professor of Data Science and Information Systems and Andre Meyer Faculty Fellow at NYU Stern School of Business. He is one of the co-directors of the Fubon Center, and he's also the former director of the Center for Data Science, which some of you might have heard about, one of the early interdisciplinary initiatives around data science at NYU. He previously served as editor-in-chief of the journal Machine Learning and was elected as a founding member of the International Machine Learning Society. Foster's research has won numerous awards. Most recently he won the 2017 European Research Paper of the Year award; in 2016 he won the best paper award in one of our top journals, Information Systems Research. Perhaps worth mentioning as particularly important: he has received best paper awards from the ACM SIGKDD conference across three decades, so in academia we think that's real quality, when he wins the best paper over 30 years.
Professor Provost has extensive experience not only in academia but also in business. He worked for five years in industry after receiving his PhD in computer science; in particular, as part of that experience, he won the President's Award at NYNEX Science and Technology, that's now Verizon, in case people are young and don't know what NYNEX is. His book, Data Science for Business, is a perennial bestseller and is used in many, many data science courses across the world, in business schools and beyond. Foster has designed, and, sorry, Foster... Oh yes,
I am almost done; there are a lot of good things one could say about you. So I just want to mention that Professor Provost has designed data science architecture for a number of startups; you can find them on his website. But the most notable and recent award is that CFO Tech Outlook in 2019 named one of the companies he's involved with now, Detectica, as one of the biggest innovators of the year in AI. So without further ado, and sorry for the long introduction, I'd like to welcome Professor Provost.
Thanks a lot, Natalia, and thanks, everybody. Natalia mentioned that this is our inaugural event in our series on demystifying AI and machine learning. Artificial intelligence and machine learning: I don't know about you, but I seem to be encountering them more and more when I read the newspaper, or Business Week, or just about anything. I've been studying and building AI systems for 30 years, and I'm mystified by what the heck this stuff is. I read articles and I don't know what the heck's going on. It seems like either there's some dark magic that,
you know, is sort of inside there, that is going to automate away all of our jobs, as apparently some of my colleagues in business schools are telling their students, and we should either be terrified by this or anxiously awaiting all the leisure that this is going to give us. Some people think there are going to be cars driving themselves around the streets of Manhattan. And there's actual discussion, not just science fiction, about when it is that machines are going to become smarter than we are. By the way, aren't they already?
So we thought we would try to do our part here at the Fubon Center to demystify this, at least a little bit, piece by piece. To that end I have the great pleasure of welcoming my friend and colleague Pedro Domingos. Pedro is a professor of computer science at the University of Washington, and he's head of machine learning research here at D. E. Shaw, and he's the author of the book The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, which, if you haven't read it yet, read it, it's great. I guess it came out in 2015, and in 2016 Bill Gates recommended it as one of the two books that you have to read about artificial intelligence. And I think last year Chinese President Xi Jinping was caught with it on his bookshelf. I don't actually know how, as the president of China, you get caught with something on your bookshelf, but I really like that part of the story.
Pedro and I got to know each other as young AI guys back in the mid 90s, and we continued to work on similar topics over and over again. Pedro has won the top innovation award in data science.
He's a fellow of the Association for the Advancement of Artificial Intelligence. He's received a Fulbright scholarship, a Sloan fellowship, the National Science Foundation's CAREER Award, and lots of best paper awards. His papers, I tell my students, if it's written by Pedro Domingos, just read it.
One more thing, and Natalia mentioned this, but I just wanted to give another plug before we actually see the part of the show you came for, which is Pedro's talk. In two weeks we have the second installment of our series; because tonight's was postponed due to the snow day, they're close to each other. We're going to have Solon Barocas, who's faculty at Cornell and a researcher at the New York City lab of Microsoft Research, and is one of the world's experts at
the confluence of ethics, law, and machine learning. He's going to come and talk to us about the fact that, as you'll learn a little from Pedro, a machine learning algorithm is a computer program that writes other computer programs. And as, increasingly, the computer programs that are doing things in business and in government are actually written by other computer programs, we have to ask ourselves what the ethical implications are, and whether there is even a strong ethical foundation. This is what Solon is an expert in, and so he's going to come and talk to us about algorithmic fairness and business. That'll be two weeks from this evening. Okay, so without further ado, let me turn the mic over to Pedro, and thanks so much.
All right, thank you, everybody, for being here. Thanks, Foster, for bringing me here. It's a particular pleasure given how long we've known each other and how different things were when we started out doing AI. As Foster mentioned, there's a lot of excitement about AI today, but also a lot of confusion. Some people, like Sundar Pichai, the CEO of Google, think it's the greatest invention in history since fire. Others think it's a bunch of hype. There's a lot of discussion, some of it quite heated. There are also a lot of people who have to make decisions, business decisions, policy decisions, personal decisions, that have to do with AI, and at the same time people's understanding of AI, I think, is really incomplete and often at odds with reality. So what I would like to do in the next 45 minutes is try to give a little bit of perspective on what AI really is and what it can and can't do, because I think one of the biggest points of confusion is what we can do with AI today and what we cannot yet. With that understanding, I think, all our debates and decisions can be much better ones. So what is AI? AI
is getting computers to do things that previously could only be done by humans. That's a very ambitious and also very broad goal. It includes things like, for example, reasoning, problem-solving, planning, acquiring common-sense knowledge and using it, understanding speech and language, understanding what we see in the world, being able to navigate through the world and manipulate things, and, very importantly, learning. Associated with each of these capabilities there's a subfield of AI; in particular, machine learning, which Foster and I have worked on, is the subfield of AI that deals with getting computers to learn.
Now, there have been two major eras of AI. There was the big AI summer of the 80s, and that was the first coming of AI, but it didn't really succeed, and
the reason we are succeeding today is that the field changed in how it approaches problems. The approach that prevailed in the 70s and 80s was what were, and are, called knowledge-based systems. The idea here is that we manually encode into the computer all the knowledge that it needs to solve problems. If we wanted to do medical diagnosis, we would interview a bunch of doctors and write down the rules by which to diagnose people. This unfortunately failed because of what is called the knowledge acquisition bottleneck. There's just too much knowledge, and you're never done acquiring more and more knowledge, more and more rules, and it's never enough; it's always very brittle.
This caused us to change our emphasis to doing something else, which started really taking off in the 90s and today is really what underlies all the big successes of AI, and that is machine learning. The idea in machine learning is that we're not going to program the computer how to do things; we're going to have the computer learn from data, by observing people, by trying to figure out what's going on. This tends to work much better, and in particular it scales much better, because as we get more data, in a sense we get more knowledge for free. As the amount of data available has been exploding in recent years, computers have started getting way smarter by taking advantage of it using machine learning.
So this may seem like black magic, like Foster was saying. How exactly does a computer learn from data? Well, there are five main ways. One of them is to work a little bit like scientists using the scientific method: fill in the gaps in your existing knowledge, make hypotheses, test them, refine them. This is probably the oldest one. Another
one, which is actually the dominant one today, is to emulate the brain. Your brain is the greatest learning machine the world has ever seen, so, as usual, when engineers are behind the competition, what they do is reverse engineer it. So let's reverse engineer the competition. This is what is called deep learning; it's also known as neural networks and connectionism. Another one is to simulate evolution. Evolution produces amazing things like you and me, so why don't we do that on the computer? The fourth one is predicated on the observation that all knowledge that is learned from data is necessarily uncertain. We're never sure, and this is one of the big barriers, so what we should do is quantify the uncertainty and then systematically reduce it; the more we reduce it, the more we know. And finally, there's an approach that's very common in human beings and how they solve problems, which is to notice similarities between
new situations and old ones, and then by analogy try to infer what to do in the new situation from the old one.
Associated with each of these approaches there's a whole school of thought, a whole paradigm, in machine learning, and what I'm going to do here is give you just the highlights of what each one does and some examples of the things that have been done with it. The first one is associated with the symbolists. It has its origins in logic and philosophy. One of the interesting things about machine learning, at least for my taste, is that every paradigm has its roots in a different field of science, so under the guise of studying machine learning you can study all sorts of things and never be bored. Each of these schools also has its own master algorithm, an algorithm that you can use, in principle, to learn any knowledge from data. The master algorithm of the symbolists is what is called inverse deduction, and we'll see in just a second what it is. For each of these algorithms there's a theorem that says that if you give it enough data, it will learn anything. Of course, whether you can learn with a realistic amount of data and computing is a different matter, but at least those theorems are there.
Then there are the connectionists. These are the people I mentioned who are obviously influenced by neuroscience. Their master algorithm is backpropagation. You all have it in your pocket right now, powering the speech recognition and other things in your cell phone. The evolutionaries are the people who are influenced, obviously, by evolutionary biology. Their master algorithm is something called genetic programming, or more generally genetic algorithms: algorithms inspired by genetics. The Bayesians come from statistics. They're the people who worry about uncertainty, and their approach
is probabilistic inference based on Bayes' theorem, which is where the name comes from. And then, finally, the analogizers, the people who see learning and reasoning as doing analogies and playing them out. They actually have a diverse set of influences, but probably the most important one is from psychology, because there's a lot of experimental evidence that humans do this, and
the most important algorithm in this class is what are called kernel machines, or support vector machines.
So let's start with the symbolists. Here are three of the most prominent symbolists in the world: Tom Mitchell at Carnegie Mellon, Steve Muggleton in the UK, and Ross Quinlan in Australia. The basic idea behind symbolic learning is actually a really brilliant insight, and it is the following. What is induction? Machine learning is induction: inducing general rules from specific facts. This is a very hard thing to do. What we do know very well how to do, however, is the opposite, deduction: going from general rules to specific facts. And in the history of mathematics, people have over and over again succeeded by taking something they didn't understand and viewing it as the inverse operation of something they did understand. For example, we can understand subtraction as the inverse of addition, or integration as the inverse of differentiation. So the idea here is: let us do the same thing, but with induction and deduction as the inverse operations.
So, for example, addition lets you answer questions like: if I add 2 and 2, what do I get? The answer is 4. That's not the deepest thing you'll hear today. Subtraction gives us the answer to the inverse question, which is: what do I need to add to 2 in order to get 4? And the answer, of course, is 2. Similarly, deduction lets you answer questions like: if I know that Socrates is human and that humans are mortal, what can I infer from that? Well, I can infer that Socrates is mortal. That's deduction; that's easy. Induction is the opposite. It's saying: if I know that Socrates is human, what else do I need to know in order to infer that he's mortal? And the answer, of course, is that if I know that humans are mortal, then I can infer that Socrates is mortal.
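
As a rough illustration of the Socrates example, here is a minimal Python sketch of deduction and its inverse. It is a toy under stated assumptions, not the symbolists' actual algorithm: facts are (predicate, entity) pairs, a rule (body, head) is read as "for every X, body(X) implies head(X)", and the function names `deduce` and `induce` are made up for this sketch. Real systems work in first-order logic and must choose among many candidate rules.

```python
def deduce(facts, rules):
    """Deduction: apply every rule until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for pred, entity in list(known):
                if pred == body and (head, entity) not in known:
                    known.add((head, entity))
                    changed = True
    return known


def induce(facts, rules, target):
    """Inverse deduction: propose rules that would let us deduce `target`."""
    head, entity = target
    known = deduce(facts, rules)
    if target in known:
        return []  # already explained, nothing to learn
    # Any predicate known to hold for this entity is a candidate rule body.
    return [(pred, head) for pred, e in sorted(known) if e == entity and pred != head]


facts = {("human", "socrates")}
rules = []

# What else do I need to know in order to infer that Socrates is mortal?
candidates = induce(facts, rules, ("mortal", "socrates"))
rules.extend(candidates)  # the induced rule: human(X) implies mortal(X)

# The new rule now applies to cases it was never learned from.
print(deduce(facts | {("human", "plato")}, rules))
```

The last line shows the point about composable knowledge: the rule induced from Socrates also lets the program deduce that Plato is mortal.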
I've just added that new rule, humans are mortal, to my knowledge base, and in the future I can combine that rule, chain it, to answer questions and do inferences in situations that are potentially very different from the situation in which I learned it. This ability to learn composable knowledge is something that only the symbolists have; it's one of the most important things they have. Now, of course, this is all in natural language, and computers don't understand natural language. They do this in a formal language, like, for example, first-order logic, but the basic idea is the same.
As I mentioned, this is a little bit like scientists at work: formulating hypotheses to explain what they see, testing them, refining them. In fact, one of the most exciting and eye-opening applications of symbolic learning today is precisely to automate science. For example, the biologist in this picture is not this guy; that's a machine learning researcher by the name of Ross King. The biologist in the picture is the machine in the background. Ross is at the University of Manchester, and he's built a series of machines; the first one was called Adam and the current one is called Eve. They are complete scientists in a box. Eve starts out with some basic knowledge of molecular biology, DNA, proteins, and so on, and then it's given, for example, a model organism to study, like, say, yeast. Then it starts formulating hypotheses using the inverse deduction process that I just described, and then it actually designs and carries out, all by itself, the experiments to test those hypotheses. So what you have there are things like microarrays and gene sequencers and whatnot. Then, based on that, it refines the hypotheses and keeps going, and
recently Eve discovered a new malaria drug that is now being tested. And once you have one robot scientist like this, there is nothing keeping you from making millions. Imagine you're a biologist: now, instead of having 10 postdocs, you can have 10 million and make progress a million times faster. And they don't get grumpy, they don't need to sleep, they have no rights. It's amazing.
Now, the connectionists look at all of this and say: well, yeah, but most learning is not scientists in a lab coat or mathematicians doing deductions with pencil and paper. Most learning is done by people in real life, and of course the learning engine that powers all of this is your brain. So if only we could understand how the brain works, then everything else would follow. These are the connectionists, so called because all the knowledge is in the connections between your neurons. Here are the three leading connectionists in the world; in fact, just last week they won the Turing Award, the Nobel Prize of computer science. Highly deserved. The leader of that whole school is Geoff Hinton. He's actually been doing this since the 1970s, since he was a grad student, against the wishes of his advisor. He really believes that there is a single algorithm by which the brain learns, and he's going to discover it, and he's been at it for 40 years. In fact, he tells the story of coming home from work one day saying, "Yes, I did it, I figured out how the brain works," and his daughter looked up at him and said, "Oh, Dad, not again." So he's persistent, and his quest has had its ups and downs, but it's really starting to pay off. In particular, he is one of the inventors of backprop, which is the algorithm that powers just about everything these days. Two other very
important ones are Yann LeCun and Yoshua Bengio. Yann, of course, is right here at NYU.
So how does connectionism work? Well, let's follow this process of reverse engineering the competition. Do we understand the competition? Roughly. We know roughly how a neuron works, so we can try to build a computer model of how it works, and then assemble a network, because the brain is a network of neurons, and try to make it learn in roughly the same way that the brain learns. So how does a neuron work? What do we need to extract from it? Well, a neuron is a really, really weird kind of cell. It's a cell that looks like a microscopic tree. There's a trunk, called the axon, and then it branches out; the branches are called the dendrites. The neuron sends electrical discharges down the axon and out through the dendrites. The outgoing dendrites of one neuron make contact with the roots of other neurons, which, confusingly, are also called dendrites, and the place where they meet is called the synapse. These synapses can be more or less efficient at passing the current, and in particular, when two neurons fire at the same time, the synapse becomes more efficient,
meaning it becomes easier for the upstream neuron to make the downstream neuron fire. To the best of our knowledge, everything you know, everything you've ever learned, is contained in the strengths of your synapses, in the connections between your neurons; hence the name connectionism.
So here's a very simple model of this. What a neuron does is take a bunch of inputs; they could be other neurons, they could be the pixels in the camera or in the retina. It multiplies each one by a weight, which is how efficient the synapse is, and then, if the weighted sum exceeds the threshold, it fires; otherwise it doesn't fire. So, for example, if I put a picture of a cat here, it should fire, because it is a cat; the output should be one. If it's not firing, then the neuron is doing something wrong and I need to fix it. Frank Rosenblatt in the 50s figured out how to do this for one neuron. What people couldn't figure out for many years was how to do this when you have a whole network of neurons. Your brain is a network of ten to a hundred billion neurons, and each one of them has a thousand to ten thousand connections, so this is just on a different scale.
Now, the solution to that problem was precisely the backpropagation algorithm that Geoff Hinton and others came up with. Backprop at heart is a very simple idea. The problem I'm trying to solve is: if my output is wrong, who do I blame?
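
Before moving on to whole networks, the single-neuron case described above (weighted inputs, a threshold, and a fix whenever the output is wrong) can be sketched in a few lines of Python. This is a minimal sketch in the spirit of the classic perceptron learning rule, not code from the talk; the two "cat" features and the training examples are hypothetical toy data.

```python
def fires(weights, threshold, inputs):
    """The neuron fires (outputs 1) if the weighted sum exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def train(examples, n_inputs, rate=1.0, epochs=50):
    """Perceptron rule: nudge weights and threshold after every wrong output."""
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - fires(weights, threshold, inputs)  # -1, 0, or +1
            for i, x in enumerate(inputs):
                weights[i] += rate * error * x  # strengthen or weaken each synapse
            threshold -= rate * error  # make firing easier or harder
    return weights, threshold


# Hypothetical features: [has_pointy_ears, has_whiskers]; 1 means "cat".
examples = [
    ([1, 1], 1),
    ([1, 0], 0),
    ([0, 1], 0),
    ([0, 0], 0),
]

weights, threshold = train(examples, n_inputs=2)
for inputs, target in examples:
    print(inputs, "->", fires(weights, threshold, inputs), "(wanted", target, ")")
```

This only works when a single weighted sum can separate the two classes, which is exactly the limitation that made learning in whole networks of neurons the hard problem.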
This question of who to blame is often called the credit assignment problem in machine learning, and backpropagation is the solution to it. Maybe it should be called the blame assignment problem: who do we blame when something goes wrong? The idea is: I'm going to take each neuron and each weight and ask, if I tweak this weight a little bit, if I increase it, does the error at my output go up or down? If increasing the weight makes the output error go down, then I'll increase the weight; if it makes the error go up, I'll decrease it. And what I'm going to do in order to make this efficient is start at the output and figure this out layer by layer, based on what I've already found for the downstream weights, which is why this is called error backpropagation, or backprop for short: it's propagating the errors back and then learning based on that.
Now, as I mentioned, backprop is used for all sorts of things these days, but probably the most famous application is still the Google cat network. This was actually on the front page of the New York Times a few years ago. I never thought that one day I'd see a machine learning algorithm on the front page of the New York Times, but that's the first time it happened. The Google cat network was trained by looking at YouTube videos. In fact, it was the journalist John Markoff who called it the Google cat network. It can actually recognize a lot of things, cats and dogs and people, but it can recognize cats better than anything else, which is why he called it the Google cat network. The reason for that is actually very simple: there's more video of cats than of anything else on YouTube, because people just love to upload videos of their cats.
So the network just sat there and watched hours and hours of cat videos and other videos; maybe a better name would have been the Google couch potato network. But then, at the end of the day, it did this amazing thing, which is actually recognizing things like cats and dogs, which might
sound like a very easy problem to us but is extremely hard for a computer to solve, and now the results of this are everywhere.
Now, the evolutionaries look at this and go: well, sure, that's nice, that you can tweak the weights on a brain and make it learn that way, but where did the brain itself come from? In deep learning we actually have to design the architecture of the network by hand; in fact, we spend a lot of time doing that. But in the case of our brain, the brain was designed by evolution, and we roughly know how evolution works, so why don't we simulate that on a computer? Except that instead of evolving plants and animals, we're going to evolve programs and electronic circuits and so on. John Holland was the guy who really pushed this idea in the beginning and for many years; John Koza and Hod Lipson are two more recent people in this area.
So how does this work? Well, John called these genetic algorithms because they are algorithms inspired by genetics, and the idea is like this. At any point in time you have a population of individuals, each of which, in the case of nature, is defined by a genome, which is a sequence of DNA base pairs. For computers they can just be bit strings, because it doesn't matter how many values you have. Each of those defines a program, and what we do is execute that program on the task, and the ones that do better receive a higher fitness score, again by analogy with biology. Then the ones with the highest fitness scores get to reproduce. You literally take two of these programs and you mate them: you create a new genome that is a crossover of the father genome and the mother genome, and that's a child, and then you add a few random mutations. You make a bunch of children like this and you have a new generation.
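
The loop just described (a population of bit strings, a fitness score, crossover of two parents, a few random mutations) can be sketched as follows. This is a minimal sketch under stated assumptions: the fitness function simply counts the ones in the genome, a stand-in for "run the program on the task and score it", and the population size, mutation rate, and rule that the top half reproduces are arbitrary choices for illustration.

```python
import random

GENOME_LENGTH = 20


def fitness(genome):
    """Stand-in task: the more 1 bits, the fitter the individual."""
    return sum(genome)


def crossover(mother, father, rng):
    """Child takes the mother's bits up to a random point, then the father's."""
    point = rng.randrange(1, len(mother))
    return mother[:point] + father[point:]


def mutate(genome, rng, rate=0.02):
    """Flip each bit independently with a small probability."""
    return [1 - bit if rng.random() < rate else bit for bit in genome]


def evolve(population, rng, generations=50):
    """Return the best individual ever seen and the final generation."""
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(ranked) // 2]  # the fittest half gets to reproduce
        population = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(len(ranked))
        ]
        best = max(population + [best], key=fitness)
    return best, population


rng = random.Random(0)  # fixed seed so runs are repeatable
initial = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)] for _ in range(30)]
best, final = evolve(initial, rng)
print("best initial fitness:", fitness(max(initial, key=fitness)))
print("best evolved fitness:", fitness(best))
```

In genetic programming the genome would encode an actual program or circuit and the fitness function would run it on the task; the selection, crossover, and mutation steps keep the same shape.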
The thing that's amazing is that if you do this starting with completely random strings, after a few thousand generations this is actually doing things that in many cases human engineers can't. The evolutionaries have actually gotten a bunch of patents from the US Patent Office for things like radios and amplifiers invented by genetic algorithms, and nobody understands exactly how they work, because they don't work according to our principles, but they actually do their job better. These days the evolutionaries are having a lot of fun creating no longer just circuits or programs but actual hardware, physical robots. This little spider is from Hod Lipson's lab. It runs around and runs away from you and so on. It's exciting, and maybe also a little scary: if Terminator ever comes to pass, maybe this is how it began. Of course, these spiders are nowhere near ready to take over the world, but they've come a long way from the random soup of parts that they started out as. In fact, the way to do this is they start by doing it in a simulation.
Once the robots have evolved to a point where we can actually make them, they get 3D printed, and then in each generation the robots that do best at whatever task or measure we're trying to optimize get to program the 3D printer to produce the next generation. So this is the state of the art in evolutionary learning.
Now, the connectionists and the evolutionaries, for all the quarrels that they have between them, have something very important in common, which is that they do learning that is inspired by biology, whether it's the brain or evolution. Most machine learning researchers actually do not think this is such a great idea, because biology is very random; who knows whether it produced the optimal result? What we should do is figure out from first principles what the optimal thing to do is and then implement that on the computer. The poster children of this approach to machine learning are the Bayesians, and of course the fundamental principle from which all learning derives, for them, is Bayes' theorem. If your algorithm is not compatible with Bayes' theorem, it must be wrong. So the Bayesians are the most die-hard of all the machine learning schools, and they say this of themselves. In part this is because for 200 years in statistics they were a persecuted minority, so they had to get very religious about it to survive, and they did, which is a good thing, because these days, on the back of computer power and so on, Bayesian learning is really on the ascendant, not just in machine learning but in statistics as well. Here are three famous Bayesians: David Heckerman, Michael Jordan, and Judea Pearl. Pearl actually won the Turing Award a few years ago for inventing something called Bayesian networks, which is a very powerful representation for Bayesian learning.
So what is Bayesian learning? Well,
as I said, the cornerstone of Bayesian learning is Bayes' theorem, and Bayesians love it so much that a few years ago there was a Bayesian learning startup, in London I think, that actually created a big neon sign of Bayes' theorem and hung it outside their offices for the whole city to see. So what is this theorem, for those of you who don't know it, and how does machine learning based on Bayes' theorem work? Well, the idea is the following. I have a set of hypotheses that I could use to explain the world, and I don't know which one is true. What I have is some probability that each hypothesis is true. I start out with what is called the prior probability of the hypothesis, which is how much I believe in it before I even see any data. It's called the prior because it's prior to seeing the evidence. Okay, and then
I start seeing data, and the hypotheses that are compatible with the data become more probable, and the ones that aren't become less probable. The probability of the data given the hypothesis is called its likelihood, and in fact this is also what frequentist statisticians use. So hopefully what happens after I've seen a bunch of data is that it starts to become clear which are the good hypotheses and which are the not-so-good ones. The product of the likelihood and the prior probability is what is called the posterior probability, which is how much I believe in a hypothesis after seeing the evidence. And then I have to divide by this thing called the marginal to make sure the probabilities add up to one, but that's not that important for our purposes. So this is a very sound way to figure out how much I believe in each hypothesis and to update it according to the evidence. With powerful hypothesis classes like Bayesian networks this is extremely computationally difficult, and a lot of the smarts of the Bayesians have been in coming up with ways to make that computation viable. Of course, Moore's law has also helped a lot. And Bayesian learning has been used for a lot of things. In particular, your first self-driving car will probably have a Bayesian network helping the car figure out what the world is that it's in, where it is on that map, and so on. But one application of Bayesian learning that we are all familiar with and grateful for is spam filtering. This was actually David Heckerman's idea, and the idea here is very simple: I just have two hypotheses. I'm looking at an email; one hypothesis is that this email is spam, and the other hypothesis is that it's so-called ham, you know, a good email. And now the evidence is the contents of the email. So, for example, if the email contains the word "free" in all capitals, that makes it more likely to be spam.
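The update just described (prior times likelihood, divided by the marginal) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `posterior` and every number below are invented for the example and do not come from the talk or from any real spam filter.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of hypotheses.

    priors:      {hypothesis: P(hypothesis)}
    likelihoods: {hypothesis: P(evidence | hypothesis)}
    returns:     {hypothesis: P(hypothesis | evidence)}
    """
    # Numerator for each hypothesis: likelihood times prior.
    unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
    # The marginal P(evidence) is what makes the posteriors add up to one.
    marginal = sum(unnormalized.values())
    return {h: value / marginal for h, value in unnormalized.items()}


# Invented prior: before reading the email, spam and ham are equally likely.
priors = {"spam": 0.5, "ham": 0.5}
# Invented likelihoods: how often the word "FREE" shows up in each kind of email.
likelihood_of_free = {"spam": 0.30, "ham": 0.02}

beliefs = posterior(priors, likelihood_of_free)
print(beliefs)
```

With these made-up numbers, seeing "FREE" moves the belief in spam from 0.5 to roughly 0.94. Further evidence, such as additional words, could be folded in by using the posterior as the next prior, which is the simplifying assumption behind naive Bayes filters.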
And if that word is followed by the word "Viagra", that makes it way more likely to be spam. You know, this is a real example. On the other hand, if it contains your best friend's name on the signature line, that makes it more likely to be ham. And this works surprisingly well. These days there are spam filters based on all sorts of machine learning ideas, but this was the first one and is still one of the most widely used and most effective. So, finally, the analogizers have a very different approach to all of this, and in fact, in my experience, this is the approach that most people, most non-machine-learning experts, find the most intuitive, perhaps because it's what we do. The idea is that when we have a problem to solve, what we do is find in our memory a similar problem in our past experience, and then we transfer the solution from the case that we remember to the case that we need to solve now. The simplest algorithm of this type is something called the nearest neighbor algorithm; we'll see in just a second what that is. One of the main people responsible for establishing nearest neighbor as an algorithm was Peter Hart. Vladimir Vapnik created kernel machines, which are the most sophisticated algorithms of this class, or at least the most sophisticated of the widely used ones. Another famous analogizer is Douglas Hofstadter, whom you may know as the author of Gödel, Escher, Bach. In fact, he coined the term analogizer, and his most recent book is 500 pages arguing that everything in cognition, learning, reasoning, you name it, is all just analogy in action. So he really does believe that analogy is the master algorithm. So how does nearest neighbor work? Let me introduce that to you by way of a simple puzzle, or, you know, a simple exercise. I give you the map of two countries; one is called Posistan and one is called Negaland.
The plus signs are the main cities in Posistan and the minus signs are the main cities in Negaland, and that's
all I give you. And now what I ask you is: where is the border between Posistan and Negaland? Now, of course, we don't know exactly, because the cities don't determine the border, but if you just look at those plus and minus signs you can probably roughly guess where the border lies. And nearest neighbor is exactly a heuristic for solving this problem. The heuristic that nearest neighbor follows is that a point will be in Posistan if it's closer to a plus sign, to a positive city, than to any negative one. So, for example, that line right there is the set of points that are at the same distance from that plus and that minus. Very simple, but incredibly effective. In fact, what Peter Hart did back in the 60s was prove that, despite the simplicity of this algorithm, you could learn anything with it given enough data. So you could actually say that this was in some sense the first real machine learning algorithm, the first one that could learn without limit as you gave it more data. Prior to that there were only statistical algorithms, like various kinds of linear classifiers, that could only learn so much, and nearest neighbor is really the first full-blown machine learning algorithm. And it's very successful, even for things like medical diagnosis. For example, you need to diagnose a new patient and you know nothing about medicine: if you have a database of patient records, you can just find the patient with the most similar symptoms and assume that the diagnosis is the same. You give it a database of a few hundred cases and it will probably beat human doctors at diagnosis of that problem, even after all their years in med school.
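The plus-and-minus puzzle can be sketched as a one-nearest-neighbor classifier. This is a minimal illustration, not the exact procedure shown on the slide: the coordinates are invented, straight-line distance is an assumption, and `nearest_neighbor` is a hypothetical helper name.

```python
import math


def nearest_neighbor(query, examples):
    """Return the label of the example closest to `query`.

    examples: list of ((x, y), label) pairs.
    """
    # Pick the example whose point has the smallest Euclidean distance to the query.
    _closest_point, closest_label = min(
        examples, key=lambda example: math.dist(query, example[0])
    )
    return closest_label


# Hypothetical map coordinates: "+" cities are in Posistan, "-" cities in Negaland.
cities = [
    ((1.0, 1.0), "+"),
    ((2.0, 3.0), "+"),
    ((1.5, 4.0), "+"),
    ((6.0, 1.0), "-"),
    ((7.0, 3.5), "-"),
    ((5.5, 5.0), "-"),
]

west_label = nearest_neighbor((2.5, 2.0), cities)  # nearest city is the "+" at (2.0, 3.0)
east_label = nearest_neighbor((6.5, 2.0), cities)  # nearest city is the "-" at (6.0, 1.0)
print(west_label, east_label)
```

The border never has to be stored explicitly. It is implied by the examples: it is the set of points equally close to the nearest plus and the nearest minus, and it shifts as more cities are added.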
It makes you want to cry for the doctors, but hey, I hope none of you are doctors. Or patients? Yeah, I mean, for patients it's good, because now medicine could be much cheaper and continuously available on your smartphone, and so on. Nearest neighbor has some shortcomings that are solved by kernel machines, but let me skip over that. It's a very widely used type of learning, and has been for decades. Probably the most famous and economically important application of this type of learning is recommender systems. And the big insight in recommender systems was the following. Suppose I'm Netflix and I want to recommend a movie to you. In the beginning people were trying to do it using the characteristics of the movie: what genre do you like, action or drama, the actors you like, the directors. But taste is a subtle thing, and that doesn't work very well. The important insight was when people said: you know what I'm going to do? I'm going to find people who have similar taste to yours. I'm going to find your nearest neighbors in taste space, and then if they like a movie that you haven't seen, I'll recommend that to you, because since your tastes are similar, you'll probably like it as well. How do I decide how similar people are? Well, that's why you give all those star ratings. If I tend to give five stars to a movie that you give five stars to, and vice versa, then we have similar tastes. And your nearest neighbors in taste space could be in China or New Zealand, it doesn't matter: if they tend to like or dislike the same things, then I can make predictions based on that.
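The "nearest neighbors in taste space" idea can be sketched as follows. This is a toy illustration under stated assumptions, not how Netflix or Amazon actually do it: the people, movies, and star ratings are invented, only a single neighbor is used, and `taste_distance` and `recommend` are hypothetical helper names.

```python
import math

# Invented star ratings (1 to 5); every name and title here is a placeholder.
ratings = {
    "you": {"Alien": 5, "Heat": 4, "Amelie": 1},
    "ana": {"Alien": 5, "Heat": 5, "Amelie": 1, "Ran": 5},
    "ben": {"Alien": 1, "Heat": 2, "Amelie": 5, "Up": 5},
}


def taste_distance(a, b):
    """Root-mean-square gap in stars over the movies both people rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return math.inf  # no overlap, so nothing to compare
    total = sum((ratings[a][m] - ratings[b][m]) ** 2 for m in shared)
    return math.sqrt(total / len(shared))


def recommend(user):
    """Suggest the nearest neighbor's best-rated movie that `user` has not seen."""
    others = [person for person in ratings if person != user]
    neighbor = min(others, key=lambda person: taste_distance(user, person))
    unseen = {m: stars for m, stars in ratings[neighbor].items() if m not in ratings[user]}
    return max(unseen, key=unseen.get) if unseen else None


suggestion = recommend("you")
print(suggestion)
```

Here "ana" rates the shared movies almost exactly as "you" do, so her unseen five-star movie is what gets suggested. A practical system would presumably average over many neighbors and correct for how generous each person is with stars; both refinements are left out to keep the sketch short.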
Recommender systems are reportedly about a third of Amazon's entire business, meaning a third of the things they sell come from the recommendations, and for Netflix it's three quarters. So three quarters of Netflix's business comes from the recommender system. And of course every e-commerce website worth its salt, not to mention the likes of Spotify and Pandora, et cetera, et cetera, they all use this. Okay, to summarize. We've met the five main types of machine learning. We've seen that each one of them has a problem that it solves better than all the others: for the symbolists it's knowledge composition, for the connectionists it's credit assignment, for the evolutionaries it's discovering structure, for the Bayesians it's handling uncertainty, and for the analogizers it's reasoning using similarity. And we've also seen that each one of them has a master algorithm, an algorithm that in principle is capable of learning any knowledge from data. And the more optimistic members of each of these tribes, these days probably most prominently the connectionists, think that they have it: you're going to learn everything using backprop. But most people in this field do not believe that, and the reason is simple: these five problems are all real, and each of these algorithms only solves one of them. We're not going to have a true master algorithm until we have an algorithm that solves all five of those problems at the same time. So the question becomes: can we unify all these algorithms into a single, true master algorithm that really does solve all those five problems? At first this seems like a really hard problem. In fact, some people used to say that it was impossible, because superficially these algorithms look so different; how could you possibly unify them? Well, in reality, once you notice that they are all made of the same three components, then
it becomes much easier, because if you can unify each of those components, then you're done. So what are those components? Well, the first one is representation. Representation is the choice of language in which you represent what you've learned. In humans it could be English or Chinese or another natural language. For programming it's usually things like Java and Python and whatnot. In AI we tend to use more abstract languages, like first-order logic, which I already mentioned, or graphical models, which include Bayesian networks as a special case. And if we can somehow combine those two into one, first-order logic and graphical models, then we actually have a representation that covers all of the things that we've talked about. And in fact a number of ways of doing that have been developed. The most successful ones are what are called probabilistic logics, which, as the name implies, have an element of probability and an element of logic. In particular, Markov logic networks, which are probably the most widely used ones, are just formulas in first-order logic with weights attached. If you really believe in a formula, then you give it a high weight; if not so much, then you give it a lower weight. And then a state of the world is more probable if more formulas are true in it and formulas with higher weights are true in it. So this is the basic idea. Now the next question is: if I have a representation, how do I decide what is a good versus a bad hypothesis in that representation? I need some kind of objective function, some kind of scoring function, to decide what's good and what's not. People here have used all sorts of things, but most of them are special cases of the posterior probability that Bayesians use. So in some sense this is the easiest part of the problem, and we largely already have a solution to it.
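The weighted-formula idea can be sketched with a deliberately tiny stand-in. The assumptions are loud ones: the formulas below are propositional rather than first-order (a real Markov logic network counts the true groundings of each first-order formula), and the atoms, formulas, and weights are all invented for the example. Each state of the world is scored by the exponential of the summed weights of the formulas it satisfies, and the scores are then normalized.

```python
import itertools
import math

# Atoms of a made-up two-variable domain.
atoms = ["smokes", "cancer"]

# (weight, formula) pairs; each formula takes a world (dict of atom -> bool).
weighted_formulas = [
    (1.5, lambda world: (not world["smokes"]) or world["cancer"]),  # smokes implies cancer
    (0.5, lambda world: not world["smokes"]),                       # most people do not smoke
]


def score(world):
    """exp of the total weight of the formulas that are true in this world."""
    return math.exp(sum(weight for weight, formula in weighted_formulas if formula(world)))


# Enumerate every possible world and normalize the scores into probabilities.
worlds = [
    dict(zip(atoms, values))
    for values in itertools.product([False, True], repeat=len(atoms))
]
normalizer = sum(score(world) for world in worlds)
probability = {tuple(world.values()): score(world) / normalizer for world in worlds}

for world in worlds:
    print(world, round(probability[tuple(world.values())], 3))
```

The world that violates both formulas (smokes, no cancer) comes out least probable but not impossible, which is the point of attaching weights instead of treating the formulas as hard constraints. Enumerating all worlds only works for toy domains; it is used here purely to keep the sketch self-contained.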
More generally, however, the evaluation function should actually not be a part of the algorithm; it should be something that is provided by the user. If you're a company, for example, this could be some measure of return on investment, or the click-through rate if you're in the ad business. If you're a user, maybe this should be some measure of your happiness, like how much you liked the movie or the order or the song or whatever. And then what the algorithm does is take that measure and find the hypothesis that optimizes it, which
brings us to the last problem, which is optimization: finding, in that space, the hypothesis that maximizes the evaluation function. And here there's a natural combination of ideas from the evolutionaries and from the connectionists. Remember, a formula in first-order logic is just a tree of subformulas combined by ands and ors and nots and implies and so on, so we can discover it using genetic programming. We have genomes that are the different possible formulas, and we evolve them that way. But then, of course, our formulas also have weights, and to learn those weights we can use backpropagation: backpropagation through the chains of reasoning by which we solved problems and answered questions in the past. And so at this point we have something that looks pretty close to being a unification of all the five master algorithms. Now, does that mean that we're done? Well, some people, again the more optimistic ones, believe that we are close to being done. I actually don't think we're anywhere close to being done, and I think one main problem is that there are very important ideas in machine learning that haven't been discovered by any of these schools yet. And in a way it's harder for us, the specialists, to figure that out, because we're really thinking along the tracks of a particular paradigm. So I actually have more hope that the answer will come from people outside the field. So if you have any ideas, please let me know, so I can publish them. So, to conclude, looking forward: once we have such a universal learner, what will it make possible that is not possible today? For example, we would all like to have a home robot that cooks dinner, does the dishes, makes the beds, maybe even looks after the children. Why don't we have that yet? Well,
first of all, you can't do that without machine learning; I think there's universal agreement about that. But second of all, a home robot, in the process of a normal day, encounters every one of those five problems, and therefore it needs an algorithm, a learning system, that can handle all five, and until we have that we won't have home robots. Here's another one. Wouldn't it be nice if, when you go on the web, instead of typing in some search keywords and seeing some pages that maybe are relevant, you could actually just ask questions and get answers? It would, and in fact Google, Microsoft, Facebook, etc. are all trying to solve this problem: turn the web into a big knowledge base that you can then just ask questions to. But of course, in order to do that, first of all you need a very powerful representation, at the level of first-order logic, otherwise it's not going to be very good. But then the web is full of contradictory knowledge and noise and ambiguities and whatnot, so you need probability for that as well. So again, until we've unified those five algorithms, we're not going to be able to solve this problem. Here's another one, perhaps the most important one: curing cancer. I've already mentioned that machine learning algorithms are very good at medical diagnosis, in fact typically better than human doctors at any one medical diagnosis problem, if they have enough of the right data to train on. But we certainly have not cured cancer using machine learning. Why is that? The reason is that cancer is not a single disease, and therefore there will never be a single drug, or it's very unlikely that there will ever be a single drug, that cures cancer. Every patient's tumor is different, and so everyone needs a different cure. And the
solution, researchers increasingly believe, is a machine learning system that takes in the patient's genome, the cancer's mutations, the patient's medical history, and then predicts, for that patient, what is the drug that is going to cure their cancer. But again, in order to do that you need to understand how cells work at a very fine level, and there's an enormous amount of research going on in this, using a lot of data from things like microarrays and sequencers and whatnot, and a lot of very advanced machine learning to do this. But we're not there yet, and again, we won't be able to do this until we solve all of those five problems, because they're all very much present here as well. And finally, going back to recommender systems. Recommender systems are part of everybody's life these days, right? You're continually using recommender systems, even when you might not realize that you are. For example, Facebook is choosing what posts to show you; Twitter is basically a big machine learning algorithm figuring out what tweets to show to whom; and so on: Amazon, Netflix, et cetera, et cetera. But each of these recommender systems is not that great yet, and one big reason for that is that it only knows a sliver of you. Netflix knows your taste in movies, but that's it. Spotify knows your taste in music, but that's it. What I would really like to have as a consumer is a recommender system that learns from all the data that I ever generate, and then it has a 360-degree picture of me. It knows me really, really well; it knows me better than my best friend, and then based on that it can recommend things for me at every step of life, small things and big things, from what to eat today, what restaurant to pick, to where to go for college, you
know, where to move. These days, even things like... a third of all marriages in America today start on the internet, and the matchmakers are machine learning algorithms. You used to have village matchmakers, but now that village matchmaker is a machine learning algorithm. So there are children alive today who would not have been born if not for machine learning. But if you ask the parents, "How did you get paired?", it was the algorithm. Right? This is the world that we live in today. And the truth is, for all that the CIOs of those companies say, their matching algorithms are not very good, largely because they don't know the people. You fill in a form and somehow they match people based on that; that's not going to work. If you're going to entrust such an important decision to an algorithm, it had better know, for example, your
taste in music; maybe people with similar taste in music are more compatible. Where you've traveled to... your entire life is potentially relevant. But then, in order to do that, even supposing that you've pooled all that data and you've dealt with all the privacy and data security problems, which is a whole nest of worms on its own, then you still need an algorithm that can actually combine all of that data into the great picture of you. And all the big tech companies are trying to do this. You see them in the virtual assistants: Siri is trying to do this, Cortana is trying to do this, Alexa, Google Now. But we don't quite have the learner that is able to do that yet. But if we do manage to create the master algorithm, then we will, and I think we will all have happier and more productive lives for that. Thank you.
Thanks so much, Pedro. We thought we'd spend a little bit of time... We were going to put a picture of a fire up there for our fireside chat, but we didn't. Oh, we can put up the picture of the series, yeah. So, in the spirit of demystifying... By the way, thanks, I thought the talk was great. I don't know about you guys, but who cares. I'm just kidding. So, when you're dealing with people out in the world who really need to know more than they do about machine learning and AI, what do you think is the key... if you could fix one thing that people didn't understand, what would that thing be, about AI and machine learning or whatever you want?
Great question, and in fact I think there is a very clear candidate for what is the single biggest and most harmful misconception that people have about AI, and it's very simple: people confuse artificial intelligence with human intelligence. We are always projecting onto AIs human qualities like free will and consciousness
and emotions and the will to power, and a whole range of things that AIs just don't have. This is natural, right? Because human beings, as I was saying, reason by analogy, and when we're faced with a new phenomenon we always try to reduce it to the ones that we already know, and human or animal intelligence is the only one that we know on the planet. But as a result of that, we tend to treat AIs, for better and worse, as if they were kind of like humans, which leads us into a lot of errors about, first of all, what they can and can't do, and about what the dangers and opportunities are. And in a way, if there's only one thing that I hope people would come away with from a talk like this, or from reading The Master Algorithm, it's a much better picture of what AI really is. It's not a human being in disguise, right? There's no homunculus inside that computer system. Its intelligence is just the ability to solve problems, and they solve problems, and learn to solve them, using the kinds of techniques that we discovered. And in some sense there is no black magic there. Those of us in the field know that there's no black magic, even though it looks like black magic from the outside.
So, I kind of joked at the very outset: aren't they already smarter than we are? And so there certainly are ways in which the AI systems are far smarter, dimensions along which they're far smarter than we are, and dimensions along which they're essentially stupid.
Yeah. In fact, you know what the dimension is along which the machines are the farthest ahead of us? It's in doing arithmetic. Right? We don't even give them credit for that, but a machine can beat a human at arithmetic by a factor of, what, a billion to one, or a trillion. And of course this doesn't even register with us, but remember, "computer" started out as a job description. There were people whose job was to compute, and it was considered something that required a certain level of education and intelligence, to just do the math correctly, and now we don't even pay attention to that anymore. And so it feels like, oh, will machines be smarter than people? Well, in some things they are way smarter, in some things they are now becoming smarter than people, and in some things they are very, very far from being as smart as people.
Yeah, so we could start to think about... I'm going to ask some questions about jobs. We can start to think about what types of problems you might expect the computers to do really well. I mean, some of the earliest applications of machine learning and AI to business problems were things like credit scoring, right, where it's very important, of course, to estimate the probability of default. And people are absolutely miserable at estimating the probability of anything. But once you have lots and lots of data, it's just arithmetic. And so those would be areas where it wasn't just that you could replace people because the machine could automate their job, to automate them away. You know, decades
and decades ago, the machines just blew people away at doing things like estimating the likelihood of somebody defaulting.
Here's something that people in the field of AI in the beginning got very wrong, and in fact a lot of people today still get wrong, because it's a very natural mistake to make, but now we really know better than that. This was the notion, very intuitive maybe, that the first jobs to be automated will be the blue-collar ones, because anybody can do those jobs, whereas a doctor, a lawyer, a tax accountant, those take a college degree. Actually, it's more like the opposite. It's much easier to automate the job of a doctor or of a financial advisor than it is to automate the job of a construction worker. We are very far from having a robot that can walk around a construction site without tripping over itself, let alone actually doing something useful. The reason this is counterintuitive is that evolution spent 500 million years evolving us to be competent in the physical world, and we are really good at it, and we take it completely for granted. Whereas the things that you have to go to college for, those are the things that evolution did not spend any time evolving you for, and therefore we're really beginners at that, and so computers can blow past us in that very, very quickly. So the real division is not so much blue collar versus white collar as: is it a routine job, versus is it a job that requires a lot of flexibility? If your job is something fairly narrowly defined, and where there's a lot of data, machines can probably learn to do it well. As a rule of thumb, the more of your brain your job uses, the safer it is, because if it requires a lot of common-sense knowledge, a lot of integration of information from different sources, a lot of combining
abstract thinking with manipulating things in the real world, this is what is well beyond the capability of computers, and will probably continue to be for the foreseeable future.
And in blue-collar work, it was really the work that had been processized, right? I mean, assembly line work and so on, where somebody had laid it out and made it systematic, and so you didn't have to move around a lot, and that's the robots.
Yeah, the more flexible your environment is, the harder it is for robots. In fact, industrial robotics is a classic example of this. Industrial robots can do what they do because a factory is a highly controlled environment, whereas the home is a highly uncontrolled environment, which is why it's very hard for home robots. But maybe an even better example of that is transportation. We already have robots that fly planes, right? In fact, human pilots, forgive me if you're a pilot, but they're mostly superfluous these days, because an airplane in the air is actually a very simple environment. Driving a truck on the freeway is actually much more complicated, but the freeway is still not a very variable environment, and it's where the frontier is these days. I think we will have self-driving trucks that drive maybe on the freeway, not in the city, coming online fairly soon. But driving a car in the city, like in Manhattan, with everybody honking at you and pedestrians running in front of you and whatnot, that's a whole different degree of chaos and variability, and we're not there yet.
Yeah, so let's talk about that, because that's another thing I started out with: the Turing test for self-driving cars is driving in Manhattan. So when will we see self-driving cars en masse, not little tests, en masse, on the streets of Manhattan, in
our lifetime?
They're already everywhere, right? It's just that the drivers are robots disguised as humans. Well, it's April 1st. So, well, this is a very good question. Let me give you three answers to this. The Waymo answer: Waymo is the company that is the leader in this, and they're very confident. They say, you know, any day now. Surprisingly confident, in my opinion, but hey. At the other end of the spectrum there are people who say, like, not in our lifetimes. Now, the reason to say not in our lifetime is everything that we've been talking about. The reason Waymo is confident, and they're not pulling it out of thin air, is the successes that they've already achieved in a lot of these smaller tests in communities and so on and so forth. Exactly how long it's going to take... there are a lot of wild cards there. My personal opinion is the following. Driving a car in the city is what is called an AI-complete problem, meaning that to solve that problem you need to solve every aspect of AI. It's not just the vision and the control, it's the tactical driving. The thing that really defeats self-driving cars is interacting with humans, right? Humans will play games of chicken with you. And people seeing self-driving cars, at first they're weirded out and scared, but ten minutes later they're just bored, because the self-driving car is like a grandma: it drives very slowly, it obeys all the laws. Nobody obeys all the traffic laws, but self-driving cars have to. So this is one aspect. But the other thing to realize, so this is an argument for "it's going to take a long time", but another one, for "it's going to happen sooner than you think", is that cities
are not going to stay unchanged while self-driving cars evolve for them as they are now. Cities will evolve to meet self-driving cars in the middle, just like they evolved to meet horseless carriages and trains and everything else, right?
We're going to set up the environment in ways that make it easier for the self-driving cars. There are going to be guides on the streets; maybe cars will be talking among themselves by RFID, not by voice. But there are going to be all these ways in which we make things more adapted to self-driving cars. So it's an interesting question. I think it's not going to happen very, very soon, but I think it will probably happen in our lifetimes.
We can... What? I'm sorry? Easier to navigate. So one of the things, I mean, let's go back to the other-humans issue, right? You know, I think that, so...
By the way, sorry, let me add something. The whole problem is those pesky humans. Back in the 90s there was this test that was done in California, where they took a section of the 5 freeway, that's the freeway that runs along all of California, and they gave it over to self-driving cars. And those self-driving cars actually had almost no AI; they followed magnetic guides that were installed and they kept a certain constant dista