How open source can democratize AI

Mo, thank you for joining me today.

Thank you so much for having me.

You have just about the most Irish name ever.

I do. Very proud.

You weren't born in Ireland?

No, my grandparents.

Oh, your grandparents. Oh, I see. Where did you grow up?

New York. Queens.

Oh, I see. So tell me a little bit about how you got to Red Hat. What was your path?

When I was in high school, I was a chatty teenage girl on the phone. We had one phone line. My older brother was studying computer science at the local state college, and he had to telnet in to compile his homework. One phone line, and I'm on it all the time. He got very frustrated, and he needed a compiler to do his homework, so he bought Red Hat Linux from CompUSA, brought it home, and that was on the family computer. So I learned Linux, and I started playing around with it. I really liked it because you could customize everything, like the entire user interface. You could actually modify the code of the programs you were using to do what you wanted. And for me it was really cool, because especially when you're a kid, people tell you this is the way things are and you just have to deal with it. It's nice to be like, I'm going to make things the way I want.

Modify the code you're playing with.

Yeah, it was amazing. And it was just such a time, and, like, before it was cool, I was doing it. What I saw in that was sort of the potential, number one, of a community of people working together. The internet existed. It was slow, it involved modems, but there were people you could talk to who would give you tips, and you'd share information. This collaborative building something together is really something special, right? I could file a complaint to whatever large software company made whatever software I was into, or I could go to an open source software community and be like, hey guys, I think we should do this. And they'd be like, yeah, okay, I'll help, I'll pitch in. So you don't feel powerless. You feel like you can have an impact, and that was really
exciting to me. However, open source software has a reputation for not having the best user interface, not the best user experience. So I ended up studying computer science and electronic media, dual major, and then I did human-computer interaction as my master's. My thought was, wouldn't it be nice if this free software, accessible to anybody, was easier to use, so more people could use it and take advantage of it? So, long story short, I ended up going to Red Hat saying, hey, I want to learn how you guys work, let me embed in your team. I dropped out of my graduate program, and I'm like, I want to do this for a living, this is cooler. So I thought, this is the way to go, and I've been there ever since. They haven't been able to get rid of me.

To backtrack just a little bit, you were talking about the sense of community that surrounds this way of thinking about software. Talk a little bit more about what that community is like, the benefits of that community, why it appeals to you.

Sure. Well, you know, part of the reason I actually ended up going to the graduate school track is that suddenly you're a peer of your professors and you're working side by side with them. At some point they retire, and you're the next generation. So it's sharing information, building on the work of others, in sort of this cycle that extends past the human lifespan. And the open source model is very similar, but you're actually building something. And it's something in me, I'm just really attracted to that. I don't like talking about stuff, I like doing stuff. With open source software, the software doesn't cost anything, the code is out there, and it generally uses open standards for the file formats. I can open up files today that I created in open source tools as a high school student, because they were using open formats, and that software still exists. I can still compile the code, and it's an active community project. These things can outlast any single company, in the same way that the academic community
has been going on for so many years and hopefully will continue moving on. So it was not just the community around it, but the knowledge sharing, and also bringing up the next generation as well. All of that stuff really appealed to me. And also, at the center of it, the fact that we could democratize it by following this open source process and feel like we have some control. We're not at the mercy of some faceless corporation making changes while we have no impact. That really appealed to me too.

Yeah. For those of us who are not software aficionados, take a step backwards and give me a kind of description of terms. What's the opposite of open source?

Proprietary is what we say.

So, specifically and practically, what would be the difference between something that was open source and something that was proprietary?

Sure. So there's a lot of difference. With open source software, you get these rights when you're given the software. You get the right to be able to share it, and the different licenses that are considered open source have different little things that you have to be aware of. With proprietary code, it's 100% the company's copyright. A lot of times, when you sign your employment contract for a software company and you write code for them, you don't even own it. You sign over your rights to the company. So if you leave the company, the code doesn't go with you. It stays in the ownership of that company. So then when one company buys out another and kills a product, that code's gone. It's gone.

For a business, why would a business want to have open source code as opposed to proprietary?

Well, for the same reasons. Say you're a business. You've invested all this money into this software platform, right? You've upskilled your employees on it, and it's a core part of your business. And then a few years later that company goes out of business, or something happens. Or even something less drastic: you really need this feature, but for the company
that makes the software, it's not in their best interest, it's not worth the investment, they're not going to do it. How do you get that feature? You have to completely migrate to another solution, and this is something that's core to your business, so that's going to be a big deal to migrate. But if it's open source, you could hire a team of experts, software engineers who are able to go do this for you: go into the upstream software community, implement the feature that you want, and it'll be rolled into the next version of that company's software. So even if that company didn't want to implement the feature, if they did it open source, they would inherit that feature from the upstream community, is what we call it. So you have some control over the situation. If it's open source, you have an opportunity to actually effect change in the product, and you could then pick it up, or pay somebody else to pick it up, or another company could form and pick it up and keep it going. So there's more possibilities if it's open source. It's like an insurance policy, almost.

So innovation, from the standpoint of the customer, is a lot easier when you're working in an open source environment.

Absolutely, yeah.

So now at Red Hat you're working with something called InstructLab. Tell us a little bit about what that is.

So the thing that really excites me about getting to work on this project is that AI has been sort of the scary thing for me, because it's one of those things where, in order to be able to pre-train a model, you have to have unobtainium GPUs, you have to have rich resources, it takes months, it takes expertise. There's a small handful of companies that can build a model from pre-training to something usable. And it kind of feels like those early days when I was delving into software. In the same way, I think if more people could contribute to AI models, then it wouldn't be just influenced by whichever company had the resources to build it, and
there's been a lot of emphasis on pre-training models: taking massive, terabyte-scale data sets, throwing them through masses of GPUs over months of time, spending hundreds of millions of dollars to build a base model. But what InstructLab does is say, okay, you have a base model, we're going to fine-tune it on the other end. It takes less compute resources. The way we've built InstructLab, you can play around with the technology and learn it on an off-the-shelf laptop that you can actually buy. So in this way we're enabling a much broader set of people to play with AI, to contribute to it, to modify it. And I'll tell you one story from Red Hat. Shuchi, who is our chief diversity officer, is very interested in inclusive language in open source software. She doesn't have any experience with AI. We have a community model that we have an upstream project around, for people to contribute knowledge and skills to the model. She's like, I want to teach the model how to use inclusive language, like replace this word with this word, or this word with this word. I'm like, oh, that's so cool. So she paired up with Nicholas, who is a technical guy at Red Hat, and they built and submitted a skill to the model, so that you can just tell the model, can you please take this document and translate this language to more inclusive language, and it will do it. And they submitted it to the community. They were so proud. That's the kind of thing that, you know, maybe a company wouldn't be incentivized to do, but if you have some tooling that's open source and something that anybody could access, then those communities could actually get together and build that knowledge into AI models.

Just so I understand: what you guys have is the structure for an AI system. In other cases, individual companies own and train their own AI systems. It takes an enormous amount of resources. They hoover up all kinds of information, train it according to their own hidden set of rules, and then a customer might use that for some price. What you're
saying is, in the same way that we democratized the writing of software before, let's democratize the training of an AI system, so anyone can contribute here and teach the model the things that they're interested in teaching the model. I'm guessing, and you correct me: on the one hand, this model, at least in the beginning, is going to have a lot fewer resources available to it, but on the other hand, it's going to have a much more diverse set of inputs.

That's right. And the other thing is that IBM, as part of this project, has something called the Granite model family, and they've donated some Granite models. So these are the ones that take the months and terabytes of data and all the GPUs to train. IBM has created one of those, and they have listed out and linked to the data sets that they used, and they talk about the relative proportions they used when pre-training. So it's not just a black box. You know where the data came from, which is a pretty open position to take. That is what we recommend as the base. So you take this base Granite model that IBM has provided, and you use the InstructLab tooling that Red Hat works on to fine-tune the model and make it whatever you want.

I want to go back for a moment to the partnership between IBM and Red Hat here, with them providing the Granite model to your InstructLab. Is this the first time Red Hat and IBM have collaborated like this?

I think it's something that's been going on. Another product within the Red Hat family would be OpenShift AI, where they collaborate a lot with the IBM Research team. vLLM is one of the components of that product. There's a nice kind of exchange and collaboration between the two companies.

Yeah. How large is the potential community of people who might contribute to InstructLab?

It could be thousands of people. I mean, we'll see. It's early days. This is early technology that was invented at IBM Research, and they partnered with us at Red Hat to kind of build
the software around it. There's still more to go. Right now we have a team in the community that's actually trying to build a web interface to make it easier for anybody to contribute. So we have a lot of that sort of user experience for the contributor to the model to work out, which we're still actively building. But my vision for it is, going back to that academic model of learning from others and building upon it over time, it would be very good for us to go out and try to collaborate with academics of all fields. Like, hey, the model doesn't know about your field, would you like to put something into the model about your field so it knows about it? Or even, you know, talk to the model: it got it wrong, let's correct it. Can we lean on your expertise to correct it and make sure it gets it right? And sort of use that community model as a way for everybody to collaborate. Because before InstructLab, my understanding is, if you wanted to take a model that's open source licensed and play with it, you could do that. You could take a model off the shelf from Hugging Face and fine-tune it yourself. But it's a bit of a dead end, because you made your contributions, but there's no way for other people to collaborate with you. The way that we've built this, based on how the technology works, everybody can contribute to it. This is something that you can keep growing and growing over time.

Yeah. What's the level of expertise necessary to be a contributor?

You don't need to be a data scientist, and you don't need to have exotic hardware. Honestly, if you don't even have laptop hardware that meets the spec for doing InstructLab's laptop version, you can submit it to the community and then we'll actually build it for you. We have bots and stuff that do that. And we're hoping over time to make that more accessible, first by having a user interface and then maybe later on having a web service.

Yeah. So give me an example of how
a business might make use of InstructLab.

One of the things that businesses are doing with AI right now is using hosted API services. They're quite expensive. Businesses are finding value, but it's hard given the amount of money they're spending. And one of the things that's a little scary about it too is, you have very sensitive internal documents, and you have employees maybe not understanding what they're actually doing. Because, you know, how would you, if you're not technical enough? When you're asking said public web service AI model for information and you're copy-pasting internal company documents, it's going across the internet into another company's hands, and that company probably shouldn't have access to that. So what both Red Hat and IBM are looking at in this space: the InstructLab model is very modest. It's a 7 billion parameter model, very small. It's very cheap to serve inference on a 7 billion parameter model. It's competing with trillion-parameter models that are hosted. You take this small model that is cheap to run inference on, you train it with your own company's proprietary data, inside the walls of your company, on your own hardware, and you can do all sorts of actual data analysis on your most sensitive data and have the confidence that it's not left the premises.

In that use case, you're not actually training the model for everyone. You're just taking it and doing some private stuff on it.

Exactly.

Which doesn't leave the building. But that's separate from an interaction where you're doing something that contributes overall.

Right. And that's something maybe that I should be more clear about. There are sort of two tracks here, and this is very Red Hat classic: you have your upstream community track, and you have your business product track. The upstream community track is just enabling anybody to contribute to a model in a collaborative way and play with it. The downstream, business-oriented product track is, now take that tech that we've honed and developed in the open
community and apply it to your business knowledge and skills.

Let's do an imaginary case study.

Sure.

I'm a law firm. I'm in entertainment law. I have 100 clients who are big stars, and they all have incredibly complicated contracts. I feed a thousand of my company's contracts from the last 10 years into the model, and then every time I have a new contract, I ask the model, am I missing something? Can you go back and look through all our own contracts and show me a contract that is missing key components or exposes us to some liability? In that case, the model would know my law firm's contracts really, really well. It's as if it's been working at my law firm. It's not distracted by other people's particular styles, or a bunch of contracts from the utility industry. It knows entertainment law contracts.

Exactly, yeah. And you can train it in your own image, your style of doing things. It's something that your company can produce that is uniquely helpful to you. No third party could do that, because no third party understands how you do business and understands your history and your documents. So it's sort of a way of getting value out of the stuff you already have sitting in a file cabinet somewhere. It's very cool.

Yeah. Give me a real-world case study where you think the business use case would be really powerful. What's a business that really could see an advantage to using InstructLab in this way?

The demo that I've given a couple of times at different events used an imaginary insurance company. So, say you have this company, and you have to recommend repairs for various types of claims. You've been doing this for years. You know that if the windshield's broken, and you've gotten into this type of accident, and it's this model of car, these are the kinds of things you want to look at. You could talk to any insurance agent in the field and they'd be like, oh, you know, it's a Tesla, you might want to look at the battery, or something. Like,
they'll have some latent knowledge, so you can take that and train it into the model. Honestly, I think these kinds of new technologies are better when they're less visible. So say you have the claims agents in the field, and they have this tool, and they're entering the claim data. They're on the scene at the car, and it might say, oh look, I see this is a Ford Fiesta, these are things you want to look at for this type of accident. As you're entering the data, it could be going through the knowledge you had loaded into the model and making these suggestions based on your company's background. And hey, you know, let's not make the same mistake twice. Let's make new mistakes, and let's learn from the stuff we already did. So that's one example, but there's so many different industries and ways that this could help, and it could make those agents in the field more efficient.

Have you had anyone talk to you about using InstructLab in a way that surprised you?

I mean, some people have done funky things, sort of playing with the skills stuff. That's where I see a lot of creativity. The difference between knowledge and skills is that knowledge is pretty understandable, right? Like, oh, historical insurance claims, or legal contracts. Skills are a little different. So whenever somebody submits a skill, it tends to be really creative, because it's not something that's super intuitive. Somebody submitted a skill, I don't know how well it worked, but it was making ASCII art. Like, draw me a dog, and it would do an ASCII art dog. I mean, it's stuff that you can do programmatically. One that was actually very helpful was, you know, take this table of data and convert it to this format. Like, oh, that's nice, that actually saves me time.

How far away are we from the day when I, Malcolm Gladwell, technology ignoramus, can go home and easily interact with InstructLab?

Maybe a few months.

A few months? I thought you were going to say a few years.

No, I think
it'd be a few months.

Wow.

I hope. It's the power of open source innovation.

Yeah. Oh, that's really interesting. I'm always taken by surprise. I'm still thinking in 20th-century terms about how long things take, and you're in the 22nd century, as far as I can tell.

Honestly, the InstructLab core invention was invented in a hotel room at an AI conference in December, with an amazing group of IBM Research guys. December of 2023.

Wait, back up. You have to tell the story.

This group of guys we've been working with, they were at this conference together. And it's a really funny story, because, you know, it's hard to get access to GPUs. Even at IBM it's hard to get access. Everybody wants access. They did it over Christmas break because nobody was using the cluster at the time, and they ran all of these experiments. And I'm like, whoa, this is really cool.

Wait, and their idea was, we can do a stripped-down AI model? And was the idea, even back then, to combine it with Granite? What was the core, the original idea?

The original idea, there are multiple aspects to it. One of the aspects actually came on later, but it starts at the beginning of the workflow: you're using a taxonomy to organize how you're fine-tuning the model. The old approach, they call it the blender approach: you just take a bunch of data of roughly the type you'd like, and you kind of throw it in and then see what comes out. Don't like it? Okay, throw in more, try again, see what comes out. They had used this taxonomy technique, so you actually build a taxonomy of categories and subfolders of, this is the knowledge and skills that we want to train into the model. That way you're systematic about what you're adding, and you can also identify gaps pretty easily. Oh, I don't have a category for that, let me add that. So that's one of the parts of the invention here.

Point number one is, let's be intentional and deliberate in how we build and
train this thing.

Yeah. And then the next component would be, okay, so it's actually quite expensive. Part of the expense of tuning models, and just training models in general, is coming up with the data. What they wanted to do is have a technique where you could have just a little bit of data and expand it with something they're calling synthetic data generation. And this is where you have this student and teacher workflow. So you have your taxonomy. The taxonomy has the knowledge, like a business's knowledge documents, their insurance claims, and it has these quizzes that you write, and that's to teach the model. So I'm writing a quiz, just like you do in school: you read the chapter on the American Revolution and then you answer a 10-question quiz. Well, you're giving the model a quiz. You need at least five questions and answers, and the answers need to be taken from the context of the document. Then you run it through a process called synthetic data generation. It looks at the document, so it'll look at the history chapter, it'll look at the questions and answers, and then it'll go back to that original document and come up with more questions and answers based on the format of the ones you made. So you can take five questions and answers and amplify them into 100 questions and answers, 200 questions and answers. And it's a second model that is making the questions and answers. So it's synthetic data generation, using an AI model to make the questions. We use an open source model to do that. So that's the second part. And then the third part is, we have a multiphase tuning technique to actually take the synthetic data and basically bake it into the model. So that's the approach. As a general philosophy, part of the approach is using Granite, because we know where the data came from. Another part is the fact that we're using small models that are cheap to run inference on. They're small enough that you can tune them on laptop hardware. You
don't need all the fancy, expensive GPU mania. You're good. So it's sort of a whole system. It's not any one component, but the approach they took was somewhat novel, and they were very excited when they saw the experimental results. There was a meeting between Red Hat and IBM. It was actually an IBM Research meeting that Red Hatters were invited to. And I think the Red Hatters involved sort of saw the potential: whoa, we can make models open source, finally. Rather than them just being these endless dead forks, we can make it so people can contribute back and collaborate around it. So that's when Red Hat became interested in it, and we sort of worked together. The research engineers from IBM Research who came up with the technique, and then my team, the software engineers who know how to take research code and productize it into actually runnable, supportable software, kind of got together. We've been hanging out in the Boston office at Red Hat and building it out. April 18th was when we went open source, and we made all of our repositories with all of the code public. And right now we're working towards a product release, a supported product.

How long did it take you to be convinced of the value of this idea? I mean, people get together in this hotel room, they're running these experiments over Christmas. Are you aware of the experiments as they're running them?

No, I didn't find out till February.

You find out in February. So they come to you in February and they say, Mo... Can you recreate that conversation?

Well, our CEO, Matt Hicks, and then Jeremy Eder, who's one of our distinguished engineers, and Steve Watt, who's a VP, were present, I think, at that meeting. So they kind of brought it back to us and said, listen, we've invited these IBM Research folks to come visit in Boston. Work with them, see, does this have any merit? Could we build something from it? And so they gave us some presentations. We were very excited. When they came to us, it only had support for
Mac laptops. Of course, you know, Red Hat, we're Linux people, so we're like, all right, we've got to fix that. So a bunch of the junior engineers around the office came in, and they're like, okay, we're going to build Linux support, and they had it within a couple of days. It was crazy, because this was just meant to be, hey guys, these are invited guests visiting our office, see what happens. We ended up doing weeks of hackfests and late-night pizzas in the conference room, playing around with it and learning. It was very fun, very cool.

Is anyone else doing anything like this?

It's not my understanding that anybody else is doing it yet. Maybe others will try. A lot of the focus has been on that pre-training phase. But for us, again, that fine-tuning is more accessible, because you don't need all the exotic hardware. It doesn't take months. You can do it on a laptop. You can do a smoke test version of it in less than an hour.

What does "smoke test" mean?

Smoke test means you're not doing a full fine-tuning on the model. It's a different tuning process. It's kind of lower quality, so it'll run on lower-grade hardware, and you can kind of see, hmm, did it move the model or not? But it's not going to give you the full picture. You need higher-end hardware to actually do the full thing. So that's what the product will enable you to do once it's launched. You're going to need the GPUs, but when you have them, we'll help you make the best usage of them.

Yeah. And there's a little detail I want to go back to.

Sure.

In order to run the tests on this idea way back when, they needed time on the GPUs. This would be in-house at IBM, and they were quiet at Christmas. So how much time would you need on the GPUs to get proof of concept?

Well, it's sort of a lot of trial and error, right? There's a lot about this stuff where you come up with a hypothesis, you test it out, did it work or not? Okay. It's just like, you know, in
the lab with, you know, Bunsen burners and beakers and whatever. So it really depends. It can be hours, it can be days. It really depends on what they're trying to do. And then sometimes they can cut the time down with the number of GPUs you have. So, like, I have a cluster of 8 GPUs, okay, it might take a day. But then if I can get 32, I can pipeline it and make it go faster and get it down to a few hours. So it really depends. But when everybody's home for the holidays, it's a lovely playground to get that stuff going fast.

Yeah. Let's jump forward one year. Tell me the status of this project. Tell me who's using it, tell me how big it is. Give me your optimistic but plausible prediction about what InstructLab looks like a year from now.

A year from now, I would like to see a vibrant community around not just building knowledge and skills into a model, but coming up with better techniques and innovation around how we do it. I'd like to see the contributor experience refined as we grow more and more contributors. So, like, a year from now Malcolm Gladwell could come impart some of his wisdom into the model, and it wouldn't be difficult, it wouldn't be a big lift. I would love to see the user interface tooling for doing that be more sophisticated. I would love to see more people taking this and using it, even if maybe you're not sharing it with the community but you're using it for some private usage. I'll give you an example. I'm in contact with a fellow who is doing AI research, and he's working with doctors. They're GPs in an area of Canada where there's not enough GPs for the number of patients. So, you know, anything you can do to save doctors time to get to the next patient. One of the things that he has been doing experiments with is, can we use an open source licensed model that the doctor can run on their laptop, so they don't have to worry about all of the different privacy rules, like, it's private, it's on the laptop right there,
take his live transcription of his conversation with the patient, and then convert it automatically to a SOAP format that can be entered in the database? Typically this will take a doctor 15 to 20 minutes of paperwork. Why not save him the paperwork? At least have the model take a stab.

And does the model then hold on to that information when he interacts with the model again?

Well, that's the thing. Not with InstructLab. Maybe that could be a future development. Once you're doing inference, it's not ingesting what you're saying to it back in. It's only in the fine-tuning phase. So the idea would be the doctor could maybe load in past patient data as knowledge, and then when he's trying to diagnose, maybe, you know what I'm saying? But the main idea is, somebody might have some private usage. I would love to see more usage of this tool to enable people who otherwise never would have had access to this type of technology. Like, you know, a small country GP doctor doesn't have GPUs. They're not going to hire some company to custom build them a model. But maybe on the weekend, if he's a techie guy, he could play with an interesting model.

I mean, the more you talk, the more I'm realizing that the simplicity of this model is the killer app here. Once you know you can run it on a laptop, you have democratized use in a way that's inconceivable with some of these other, much more complex ones. But that's interesting, because one would have thought intuitively, at the beginning, that the winner is going to be the one with the biggest, most complex version. And you're saying, actually, no, there's a whole series of uses where being lean and focused actually enables a whole class of uses. Maybe another way of saying this is, who wouldn't be a potential InstructLab customer?

We don't know yet. It's so new. We haven't really had enough people experimenting and playing with it and trying out all the things yet. But that's the thing that's
so exciting about it. I can't wait to see what people do.

Is this the most exciting thing you've worked on in your career?

I think so. I think so.

Well, we are reaching the end of our time, but before we finish, we're going to do a little speed round.

Sure.

All right. Complete the following sentence: in five years, AI will be...

Boring. It will be integrated. It'll just work, and there will be no "now with AI" thing. It'll be normal.

What's the number one thing that people misunderstand about AI?

It's just matrix algebra. It's just numbers. It's not sentient. It's not coming to take us over. It's just numbers.

You're on Team Humanity.

Yeah.

You're on Team Human. Good. What advice would you give yourself 10 years ago to better prepare for today?

Learn Python, for real. It's a programming language that is extensively used in the community. I've always dabbled in it, but I wish I had taken it more seriously.

Yeah. Did you say you had a daughter?

I have three daughters.

You have three daughters. I have two. If you've got three, you're on your own. Are you making them study Python?

I am actually trying to do that. We're using a micro:bit microcontroller tool to do a custom video game controller. They prefer Scratch because it's a visual programming language, but it has a Python interface too, and I'm pushing them towards Python.

Good. Chatbots and image generators are the biggest things in consumer AI right now. What do you think is the next big business application?

Private models. Small models fine-tuned on your company's data, for you to use exclusively.

Are you using AI in your own personal life these days?

Honestly, I think a lot of us are using it and we don't even realize it. I mean, I'm an aficionado of foreign languages. There are translation programs that are built using machine learning underneath. One of the things I've been dabbling with lately is using text summarization, because I tend to be very loquacious in my note-taking, and that is not so
useful for other people who would just like a paragraph. So that's something I've been experimenting with myself, just to help my everyday work.

Yeah. We hear many definitions of open related to technology. What's your definition of open, and how does it help you innovate?

My definition of open is basically sharing and being vulnerable. Not just sharing in a have-a-cookie way, but in a "you know what, I don't actually know how this works, could you help me?" way. And being open to being wrong, being open to somebody helping you, and making that collaboration work. So it's not just about the artifact you're opening. It's your approach, how you do things, being open.

Yeah. Mo, I think that wraps us up. How can listeners follow your work and learn more about Granite and InstructLab?

Sure. You can visit our project web page at instructlab. or you can visit our GitHub at github.com/instructlab. We have lots of instructions on how to get involved in InstructLab.

Wonderful. Thank you so much.

Thank you, Malcolm.
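
The seed-data rules described in the conversation (a taxonomy of knowledge and skills categories, and at least five question-and-answer pairs per document, with answers taken from the document) can be illustrated with a minimal Python sketch. This is not InstructLab's actual file format or API. The names `SeedExample`, `check_seed_examples`, and `find_gaps`, the nested-dict taxonomy, and the strict "answer must appear verbatim in the document" check are all assumptions made for this example.

```python
from dataclasses import dataclass

# "At least five questions and answers", as stated in the interview.
MIN_SEED_EXAMPLES = 5


@dataclass
class SeedExample:
    """One hand-written quiz item (hypothetical structure)."""
    question: str
    answer: str


def _normalize(text):
    # Lowercase and collapse whitespace so line breaks do not matter.
    return " ".join(text.lower().split())


def check_seed_examples(document, examples):
    """Return a list of problems; an empty list means the seed set passes.

    Assumption: "taken from the context of the document" is approximated
    here as "the answer text appears verbatim in the document".
    """
    problems = []
    if len(examples) < MIN_SEED_EXAMPLES:
        problems.append(
            f"need at least {MIN_SEED_EXAMPLES} examples, got {len(examples)}"
        )
    doc = _normalize(document)
    for i, example in enumerate(examples, start=1):
        if _normalize(example.answer) not in doc:
            problems.append(f"example {i}: answer not found in document")
    return problems


def find_gaps(taxonomy, prefix=()):
    """Yield the path of every leaf category that has no seed examples yet.

    The taxonomy is modeled as nested dicts (categories and subfolders)
    whose leaves are lists of SeedExample.
    """
    for name, child in taxonomy.items():
        path = prefix + (name,)
        if isinstance(child, dict):
            yield from find_gaps(child, path)
        elif not child:
            yield path
```

A contributor could run `check_seed_examples` on a document before handing it to the teacher model, and `find_gaps` to spot the "I don't have a category for that" cases mentioned above. The amplification step itself (five pairs into hundreds) needs a second model and is deliberately left out of this sketch.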

2024-09-10 08:30
