Research Forum Keynote: Research in the Era of AI
Hi, I'm really pleased and excited to be here for this first Microsoft Research Forum, a series that we have here out of Microsoft Research to carry out some important conversations with the research and scientific community.

This past year has been quite a memorable one, with some incredible advances, particularly in AI, and I'll spend a little bit of time talking about AI here to get us started. But before doing that, I thought I would try to at least share how I see what is happening in the broader context of scientific disruption. To do that, I want to go all the way back to the 1700s and the emerging science of biology, the science of living things.

In the 1700s, it was well understood by the end of that century that all living things were made up of cells, everything from trees and plants to bugs, animals, and human beings. But a fundamental scientific mystery that lingered for decades was: where do cells come from? A prevailing theory was this concept of cell crystallization. It had been understood in other areas that sometimes hard materials would crystallize into existence from fluid materials, and so the thought was that out of living fluids, under just the right conditions, cells would crystallize into existence. A lot of biological research of the time was centered around that theory, and in fact quite a few important and useful things came out of that line of research, research that even has impact medically today.

Now, of course, there was an alternative theory, which I think is credited to Robert Remak, that in fact cells get created through a process of cell division, and we know that this is true today. But it was really considered an alternative theory until Rudolf Virchow was actually able to witness the mitosis of cells, the division of cells, and in fact coined the aphorism that all cells come from other living cells. This had a very significant impact on Virchow's
research and his research into what is now known as pathology. Overnight, whole research legacies were rendered largely invalid, because the whole concept of cell crystallization was then known to be invalid. But even the very foundational infrastructure of research at the time changed. In fact, after Virchow, to call yourself a researcher in biology you had to have access to a new piece of research infrastructure called the microscope, and you had to be good at using it. So while the researchers themselves of the time were not invalidated, they were disrupted in a really fundamental way. And of course, the discovery of mitosis really set biology research on the path ultimately to the discovery of DNA and the remarkable kinds of medical and biological advances we see in the field now.

I tell that story because when I think about it, and I learned it first from the great biology researcher and medical scientist Sid Mukherjee at Columbia, I think about what we as computer scientists are going through today. We've now witnessed the incredible potential power of machine learning systems at scale and of specific architectures like neural transformers. There are many possibilities, there are many challenges, and there are many mysteries. Furthermore, the infrastructure of what we do as computer science researchers, particularly in areas related to artificial intelligence, has changed. In the same way that biology researchers needed access to new infrastructure like microscopes, at least in the mid-1800s when Virchow made his discovery, today for a large segment of the kinds of research that we do, we now realize we need new types of infrastructure: large data sets, access to large-scale GPU compute, and even other training pipelines and foundations. What we're seeing is that this is affecting virtually everything that we do today. And so as we work together as a research community in computer science, we are in
this incredibly exciting stage, a stage of being disrupted personally as researchers, many of us finding large parts of what we had been working on being changed, disrupted, or even invalidated, with a whole new vista of possibilities in front of us. We are just incredibly excited within Microsoft Research to be living through this. There are difficult moments, to be sure, but also a sense of joy, a joy that comes from the realization that we are now living through something that is very special and very rare.

So now, what has this meant? To discuss that a little, if you will permit me, I'd like to focus on the research, particularly in AI, within Microsoft Research. In this past year we had the opportunity to do something unusual, and while I use the word opportunity, it was also a challenge. In our ongoing collaboration with OpenAI, when the new model that we now call GPT-4 was being made available for investigation and study within Microsoft Research, toward the end of 2022, we were, for various reasons, required to do something that is exceptionally unusual for Microsoft Research, and that is to work in secret for a period of several weeks. This is exceptionally unusual because almost all of the research we do at Microsoft is done in collaboration with external researchers, particularly at great universities all around the world. So really for the first time we were doing some core research in secret, and it remained secret until the public release of GPT-4 in March of 2023.

That March of 2023 marked the time when we were finally allowed to speak publicly about GPT-4, in the wake of OpenAI's public announcement of that model, and allowed to publish the initial findings of our own internal study and investigation of this model. That led to a paper that was tantalizingly titled the Sparks of Artificial General
Intelligence, or what is now sometimes referred to as the Sparks paper. That Sparks paper was really a turning point for many of us in the research community. It tried to show a series of example interactions with this new large language model that defied complete explanation in terms of the emergence of apparent cognitive capabilities. It was also a somewhat edgy or even controversial paper, because of our then lack of ability to fully explain the core mechanisms behind these apparent capabilities.

At the same time, it was a real chance to finally reach out and establish collaborations with many of you here today, applications and collaborations to understand: To what extent are these models able to do planning? What is the nature of their causal reasoning and causal inference, and counterfactual reasoning? What is the interchange between the fundamental reasoning abilities of these models and world knowledge? To what extent can they handle commonsense reasoning and decision-making in controversial or morally charged circumstances? And fundamentally, what could this mean more broadly for us as people, for the communities we live in, and for societies?

The Sparks paper was just the beginning, and with many of you we've had a series of important research advances that have been deepening our understanding in these and many other areas. We've been trying to put these things into action as we have also worked with our own Microsoft product developers, as well as researchers and product developers in other organizations and other companies. We've really had to come to grips with the impact of AI technology on the world and on people. In our internal collaborations with our Bing product team, we devoted considerable effort to trying to understand the guardrails around responsible AI. In fact, today our responsible AI organization within Microsoft has hundreds of people, really forming around understanding the impact not just of the
potential harms and risks that AI technologies can bring when deployed widely at scale, but the broader opportunities, both for benefit as well as risk, on society as a whole.

So I'd like to say just a few more words about this whole concept of responsible AI. In fact, I prefer to think of this as the area of AI and its impact in society, or AI and society for short. For us in this new AI era, we started in a difficult space, because we devoted some of our best expertise across Microsoft, and specifically Microsoft Research, to building the guardrails and understanding the risks of our first integrations of GPT-4 into our products like Bing. That devotion in secret ended up being noticeable to the research community after the release of ChatGPT, when we were in a somewhat difficult position in Microsoft Research in having to remain silent while the research community was starting to really delve into the research questions around AI safety, of ChatGPT in that case and then later of GPT-4.

What has happened since then, I think, is a renaissance in our understanding, jointly with all of you, of the nature of AI risks and the whole debate around the future of AI and society: not only the future of work, which we care about a great deal in our own research here at Microsoft, but the impact on communities, on relationships, and on societies, in core areas like medicine and healthcare, education, finance, law, you name it. I'm extremely excited about what this has meant for our own research within Microsoft in an area that we call sociotechnical systems, and specifically in something we call FATE: fairness, accountability, transparency, and ethics. There has never been a more exciting time, and there have never been larger and more challenging research questions, as well as bigger and more relevant opportunities for the impact of this research. We can't be more excited to be working with all of you on many of these things.

Within Microsoft, this has also had a
transformative effect. We have evolved from having a single organization for responsible AI to now deeply integrating responsible AI, and more broadly AI and societal impact thinking, into literally every engineering group across the company, as well as areas in finance, security, and our legal departments. And so as we think about the future of AI and society, we really look forward to, and will be depending on, our collaborations with all of you.

Now, it doesn't just stop there. There are tremendous other areas. The actual cost of training, the necessity or not of scale, and the emergence of alternative models have become extremely important. So another thread of research that has been exceptionally important in AI for us over the past year has been in the emerging area of open-source and small language model development, and we've been very proud to have shared with the research community, in open-source form, a series of models called Phi. The Phi models are exceptionally interesting because they've taken a new approach to the construction of training data, synthesizing data to really focus on specific reasoning and problem-solving strategies as opposed to world knowledge. This has led to a series of models, starting with Phi-1, Phi-1.5, and now Phi-2, that have been devoted to open source in the hopes of encouraging and supporting the open research community in developing greater understanding of the transparency and the limits of these types of models, to understand better the alignment issues, and to have further explication in the pre-training phase of areas related to AI safety and risk.

We've also been looking at platform and model orchestration. What will this world look like in the future, where there may be many models together? We've been extremely proud of our work on AutoGen. AutoGen has provided, again in the open-source community, a way to very easily and rapidly get multiple AI systems collaborating together as independent agents to solve problems more
easily: to, for example, have one model interact with a human being to solve problems, another model look over their shoulders to ensure that nothing goes wrong, and maybe even a third agent, which is a human being in the loop, doing various kinds of checks and balances.

We've been studying a tremendous amount about how we can extend our ability to train these models for specific domains. Our work on Orca and Orca 2 has really helped shed more light on the nature of training data, and our work on the LLaVA model specialized to medical image generation, LLaVA-Med, has shown real promise for a future of specialized models devoted to aspects of medicine.

Now, while I'm talking about model specialization, this interplay between specialization and generalization has been another major theme for Microsoft Research AI over the past year. The basic question is: do we need intense specialization? Nowhere has that question been more pertinent than in the area of healthcare and medicine. Do we need to take AI models to med school in order for them to be useful in the medical domain? The question is still a mystery to us. In fact, we released a series of prompting scripts that automate the creation of chain-of-thought augmentation of prompts, called Promptbase, and its application to medicine, called Medprompt, which shows that GPT-4, when suitably prompted, still outperforms any existing specialized model. So to date we still have a mystery as to the role of specialization. Furthermore, we have some hints that specialization may lead to the loss of some cognitive function. A really fun paper to read out of Microsoft Research and our Azure division is entitled "Who's Harry Potter?", where we show that even a small amount of specialized training of large language models can get a large language model to forget everything it ever knew about Harry Potter. A humorous title, but it makes an important point in deepening understanding of the role of specialization.

All
of this put together, of course, is just the tip of a very, very large iceberg, one that is growing at tremendous speed. In fact, today we are seeing so much of AI research happening in the world, in social media, at the speed of conversation, to the point where even our top-tier AI researchers feel at times that their heads are spinning. But working together, providing openness, providing greater access, we can definitely see, looking back on this past year, that we've made tremendous progress. And it's not just the new discoveries that we have jointly made together, but also in deepening understanding of how to care about and mitigate the downstream harms and risks of AI as we see them emerge, as well as the broader societal impacts: what this will mean for the future of work, for the future of life, and for the future of our relationships with technology.

Now, as we think more about AI, it has just infected, and I'm using the word for a reason, almost all of the research that we do across Microsoft Research. Whether it's in security and privacy, in sociotechnical systems, in program analysis and verification, you name it, generative AI has been having an impact. I can't tell you how surprised and tickled I was the first time I saw our program analysis research group using GPT-4 to help synthesize loop invariants for the purposes of furthering program verification analysis. Just really cool, a little bit amusing, maybe even a little bit scary, but just amazing no matter how you look at it.

Now, I use the word infected because, of course, one area that has had a special place within Microsoft Research over the last few years has been in areas related to healthcare and medicine. We saw early on the potential impact that GPT-4 would have in healthcare and medicine, and in fact I myself co-authored, with Carey Goldberg, who is a science writer, and Zak Kohane from
Harvard Medical School, a book on the potential impact of GPT-4 on healthcare and medicine. We had already in place a Microsoft Research lab called Health Futures that had been working on large language models such as BioGPT and PubMedBERT for the purpose of creating knowledge bases and supporting decisions in clinical settings, much of that in collaboration with our partners at Nuance, the makers of really the most widely used medical transcription and note-taking technologies.

Today, GPT-4 and the emergence of large language models have really changed everything, and in fact have broadened beyond the narrow confines of healthcare and medicine to other parts of science, other natural sciences: the discovery of new materials, the discovery of catalysts to help climate science, the potential to take these sorts of models and make them multiscale so that we can now start modeling weather patterns and predicting weather events ahead of time. In recognition of all of that, we've created a new Microsoft Research lab called AI for Science.

We're very proud that already we're seeing some tremendous results from this. Working with collaborators at the Pacific Northwest National Laboratory, we very rapidly were able to synthesize the discovery of new electrolyte substances combining sodium and lithium in ways that could be the foundation of a new generation of solid-state batteries that make dramatically lowered use of what is oftentimes referred to as white gold, or lithium. We've furthermore been able to work with the Global Health Drug Discovery Institute, or GHDDI, in very rapidly discovering new inhibitors that may form the foundation of new drug treatments for diseases such as tuberculosis and coronaviruses. So the future is just incredibly bright, and as we extend beyond language and medical images and other types of two-dimensional images to 3D structure learning and the ability to make predictions about the structure of physical
systems like molecules, we see just an incredibly bright future ahead of all of us.

As I think about our future as fellow researchers, as scientists, I just see a tremendous reason to be optimistic. We're in an era where what we do has never mattered more than it matters right now. The things that we're working on have tremendous technological power to really empower and reach every person on the planet and make their lives better in so many different ways. And it's something that we can just do together, and go directly from the laboratory into the real world. So I really am hoping, and I'm looking forward to, us continuing in a spirit of collaboration, in a spirit of openness, to help ensure that that future is as vibrant and bright as we know it can be, while at the same time being clear-eyed about the potential risks, risks that we don't even understand or realize yet today. But together we can do what scientists have always done in the past, which is to ensure that we get as many of the benefits as possible out of emerging new technologies while mitigating the downstream harms and risks. If we do that together, if we do it with the right spirit and attitude, I think the future is incredibly bright. I know I'm really looking forward to doing it with you.

Thank you again for joining us, and I hope you enjoy this first Microsoft Research Forum.
2024-02-04 07:52