Applied AI Challenge Industry Day - August 31, 2023

>> Hello. Good morning, good afternoon, and good evening depending on where you are in these great United States. We at the Artificial Intelligence Community of Practice within the General Services Administration are thrilled to have you join us for our Third Applied AI Challenge Industry Day. My name is Eboni J.D. Freeman, and I'm the Communities of Practice Lead within the GSA Centers of Excellence.

And I am so excited to meet all 102 of you, hear these phenomenal presentations, and learn more about the wonderful world of large language models. Before we begin, here are a few reminders to ensure each of us has the best time possible. As I mentioned, this event is being hosted by the Artificial Intelligence Community of Practice, which is a government-wide network of innovators that accelerates the thoughtful adoption of AI in the federal government.

If you haven't already joined the AI Community of Practice, please follow the instructions that are going to be dropped in the Q&A. It is open to all government employees and provides cutting-edge news, tools, and forums for all things AI. For today's presentations, we'll have a dedicated Q&A at the end of each focus area session. These Q&As will be led by the Centers of Excellence's Omar Saeb. He's the acquisition lead. Please feel free to drop your questions and comments directly in the Q&A, where we'll have a member of the AI Community of Practice team monitoring to make sure your questions are answered now or in our follow-up.

Additionally, there will be a survey link that will appear during the break, when you exit the event, and also at the end of the event in this browser window. So we're going to hit it three times with that survey link. Please take some time to provide your comments, as it informs how, when, and whether or not we organize more AI challenges in the future. Along the way, you can find the link to closed captioning in the chat box.

We have a lot of great material today and anticipate that this event will last approximately three hours. Today's event will be recorded, it's already recording, and our team at the General Services Administration will share the recording as soon as possible. Now, to ask questions to the presenters, which we know you're going to want to, all you need to do is open the Q&A window using the instructions stated on the slide in front of you. So first, you're going to enter your question in the Q&A box, then click Send. Second, if the host or panelist replies via the Q&A, you'll see a reply in the Q&A window, pretty helpful. Third, you can click the thumbs up icon to like a question.

So if you think it's a really good question and you want to make sure that one gets answered live, go ahead and give it a thumbs up. Or you can click the red thumbs up icon to unlike the comment. So if you're like, maybe it's not as good of a question as I originally thought. Once you've asked your question, our presenters may answer you via that same Q&A window, or they may answer your question live. So you might get a little shout out.

The purpose of this AI challenge is to showcase real-world use cases of AI to improve the lives of the American people. Today, August 31st, we will have six incredible vendors who will showcase their large language model solutions within four market segments related to climate, equity, economy, and customer experience. We know today's event will connect promising AI technologies with agencies and their AI program offices across the government, thereby increasing AI maturity within and across agencies for the benefit of the American people. Here's a preview of what to expect during today's event. We are currently in the welcome portion, and soon we will hear from our very impressive keynote speakers.

We are honored to have Drew Myklegard, the Deputy Federal Chief Information Officer within the Office of Management and Budget, and Conrad Stosz, the Director of AI for the Office of the Federal Chief Information Officer at the White House, join us today. After our keynotes, we will hear six AI proposals. Hope you're ready for them.

The content of the AI presentations is organized by our four focus areas. First, you'll hear from the folks focused on climate. Next, we'll hear from two teams concentrating on equity. Then we'll have a 10-minute break because everybody needs them.

When we come back, we will hear proposals focused on the economy, and then we'll finish with our last segment on customer experience. Finally, we'll wrap up our event with the most important thank yous and some next steps for all of us. Well, before we move on, we would love to hear from you. There are 144 of you who showed up and we want to hear from you. So we've got our first poll question.

Specifically, we're going to be asking, which topic area are you most interested in today? Oh, you all are coming in quick. How are your fingers moving this fast, you all? Oh, my gosh. More than half of you all have already answered. I just said it two seconds ago. You all are so good.

Okay. We're going to close it in just about 10 seconds. So if you haven't answered yet, this is your time. There's 79% of you. Oh, we hit the 80% mark. Oh, excellent.

So, wow. You know, that's a bit surprising. Thirty-four percent of you are interested in customer experience, but it's a pretty much even split amongst the other three. Well, I hope you feel that same level of energy for all the categories.

Before we go forward, let's take a quick, quick peek backwards into the brief history of the AI challenge. On June 30th, the General Services Administration, AI Community of Practice, AI CoP, that's what we like to call it, posted this LLM Industry Day on We posted this challenge as an invitation for businesses, nonprofits, and academic institutions to submit their AI technologies aimed to improve the lives of the American people.

In total, we received 88 submissions. That's a pretty good number, you all. We evaluated them based on four criteria. Clarity: How well does the submission concisely describe the technology? Impact: What is the potential impact of the proposed technology? Mission: How does the submission serve the American people? You're going to keep hearing that one today, you all. Feasibility: Is the approach described capable of being met?

So you and I know that these are the best of the best proposals received. That's what you're going to hear today. And, you know, our esteemed audience members, our hope is that while you are listening, you're also scheming and dreaming about how these technologies could help your agency's mission deliver more effective and efficient government services for the American people. And this work wouldn't be possible without Eric, Michelle, Omar, Ryan, Nate, Danielle, Jenny, and Ed from the GSA Centers of Excellence.

And now it is my overwhelming pleasure to introduce our first keynote speaker. Again, we're honored to have Drew Myklegard, the Deputy Federal Chief Information Officer with the Office of Management and Budget. Drew is joined by Conrad Stosz, the Director of AI for the Office of the Federal Chief Information Officer at the White House, to come and share a few words. Drew and Conrad, take it away.

>> Thanks, Eboni. I'm going to go first and then I'll turn it over to Conrad quickly. I just want to thank everybody, you know, right before Labor Day, and I hope that you all have some great plans for this week and after this exciting industry day. Signing in and looking at this challenge reminded me of when I first came into VA. Out of the military, my first project was to run an app.

We were building an app at VA on and how much work the team goes through like getting these on or getting these on, like answering all the questions, consolidating the materials, making the hard decisions, and keeping the timeline. So I just want to echo Eboni's shout out to the team for all their hard work and also thanks to the vendor community. It was a really good introduction for me in government to see how the vendor community can really interact with government, you know, in an atypical way. This was over 10 years ago.

So this was when it was a brand new platform, and it's great to see that this has become a part of the way that the federal government is able to leverage this in facing and working on some of our hardest challenges. So at OMB, where Conrad and I both work, we in the IT department review about $120 billion in federal IT spend portfolios. It's very large. We want to make sure that it's well managed, delivering great value, and prioritized toward things that matter to accomplishing the agencies' missions and delivering value to our citizens. Very specifically, we're charged with stopping foreign intrusions into US agencies, providing expertise in areas like digital identity and AI, redefining security expectations for software in the cloud, improving digital experience, and driving digital transformation.

We partner across the overall spend of the federal government and our mission areas with our OMB budget colleagues to make sure that the critical programs that are being run at agencies get funded and that IT is appropriately budgeted for and properly executed over time. Specifically in AI, what we've seen in the last couple years, through our AI use case inventory and just general reaching out to agencies, is many diverse use cases for AI in the federal government. A number of them you're seeing here today, and we've seen them in some of the other AI challenges that the AI Center of Excellence has done. So you've seen improving healthcare, safeguarding the environment, and protecting our nation from cyber threats.

Obviously, this is a fast-moving area and we need to ensure that the federal government is able to keep up and provide cutting-edge services while also managing the risks of AI like bias and discrimination. Specifically, in the large language model area, there are some incredible opportunities, as all of us have seen since last December and probably earlier on in the federal government. We really see this as an opportunity to innovate and deliver on agency missions, but we do see a very important role with new considerations and risks.

You know, balancing those two is always challenging, but you want to stay excited about the opportunities and aware of the risks. So we're also very energized about this challenge. And of the 88, unfortunately only six could make, you know, the presentation at Industry Day today. And then I'm looking forward to September. I think it's the 15th, mid-September, when you guys announce the winner. As Deputy Federal CIO, it's my office's role to ensure that we have strong safeguards in place to ensure the accuracy of any AI-generated information.

Especially before we start using LLMs to augment government systems, we also want to empower our federal employees with access to cutting-edge tools and ensure they have the ability to experiment and learn, which is why challenges like this are so important. It's also our role in the Office of the Chief Information Officer to ensure technology has adequate safeguards in place so that employees can take advantage of this powerful tech while also adhering to federal information and cybersecurity policy. As we've updated our policy in the areas of cybersecurity and federal information over the last couple of years, we also have to consider more than just the security of federal information, which is why OMB is releasing further guidance for agencies to ensure their development, procurement, and use of AI are centered on safeguarding the American people's rights and safety. But we are just at the beginning of the journey under the leadership of our Director of AI, Conrad. And so I'd like to turn it over to him and let him talk more in depth about some of our key efforts and what progress we're making. Conrad.

>> Thanks, Drew. Yeah. This is a great time to be having this challenge. Obviously, generative AI and LLMs have a bit of attention on them right now. And it's a great time to be hearing from industry on some of the ways that those can specifically be applied to the government.

So as Drew talked about briefly, OFCIO, our office within OMB, has a statutory responsibility to define and to promote the responsible use of AI and to ensure that agencies have the means to manage the risks associated with that use. So here in OMB, we are actually releasing guidance relatively soon on AI and government use of AI. But as the technology is advancing so rapidly, the federal enterprise still has a lot to learn, obviously. So we really welcome opportunities like this for federal agencies to get face-to-face with how AI, and generative AI in particular, can help their missions, but we're also strong believers in safeguards and in managing AI risks. This is helpful for that too, right? We definitely see the need for federal agencies to get hands-on with the technology to learn about it, to familiarize themselves and experiment, in order to be able to both use it and manage its risks effectively.

So to share some of the knowledge that agencies are gaining, both in avenues like this and in their own experimentation, we're actively working with agencies to inventory their AI use cases, collect those, share them between agencies, and provide some sort of connective tissue across many activities across government. And we're also committed to building and growing government AI expertise and awareness among the federal workforce. And one of the things I just want to briefly highlight is that we've been partnering with GSA's AI Community of Practice and with faculty from Stanford University to pilot a new AI training for the federal workforce.

And that's happening soon in the near future and we're really looking forward to it, and it should be open to most government folks. So it's critical, you know, to provide the federal workforce with opportunities to gain and increase their AI knowledge. And during this training, employees will get a chance to hear perspectives from Stanford professors. It's not necessarily going to be authoritative training on everything you need to know about AI.

You can't fit that into a short training, unfortunately, but you'll get to hear a lot of input and perspectives from leading professors on the science behind AI, managing the risks of AI, and ways in which AI can benefit the federal government. So if you're interested in that, in addition to the things we're going to learn today, I recommend and encourage folks to check out GSA's AI Community of Practice page, which has opportunities to register as a follow-up to these great presentations today. So I'll wrap it up there and just say I greatly appreciate the opportunity to be with you all and to hear from the great finalists and learn more about the ways that AI can be used to further drive some of the priorities of the government.

So, thanks.

>> Thank you, Drew and Conrad, for your encouraging words and support. I see we have a question for you all in the Q&A. We'll make sure to get it over to you after the event so you can answer. Also, thank you for pumping up the Stanford HAI event. We thank you for that little throw-in there. Without further ado, I hope you're ready for our first category.

First, we have an AI submission that helps address the climate crisis. Welcome to the stage, Nexcepta.

>> Hello, everyone. This is Kamal Tawasul from Nexcepta.

All right. And can you hear my voice okay? Okay, great. I'm not seeing the share screen yet. Oh, now I can see it. Perfect. Sounds good.

Okay. Okay, here we go. This is our presentation. All right. So very nice.

It's a great experience to be here. Thank you very much for inviting us or selecting us. Let me put down the hide.

Okay. My name is Kamal Tawasul. I'm a Principal Scientist at Nexcepta. Today, I'll be presenting our solution called Climate Sense. We built Climate Sense as a large language model solution to address the climate crisis and enhance public awareness. Nexcepta is a small business based in Gaithersburg, Maryland, and we have a lot of efforts in developing novel AI/ML technologies, mainly focusing on NLP, computer vision, and time series data. So with that, here we go.

Okay. So as a motivation, the climate crisis is a global problem with wide effects on the American people, as well as anybody around the world. I'm sure that most of you have seen the images of devastation that the wildfires have caused in Lahaina, Maui. And that was just in August.

If you go back a few months, you can see that climate change has amplified India's monsoon season with catastrophic outcomes. In June, India experienced major floods in the Himachal Pradesh region. And another extreme event that we have seen this year, starting in February, was the longest-lived storm ever recorded in the world. And that is Cyclone Freddy. More than 400 people were killed and thousands of homes were destroyed. That cyclone traveled a path of 5,500 miles within a span of 36 days.

And the extreme events that we have seen this year are not limited to these three examples. For example, the wildfires in Canada affected air pollution on the East Coast. As you can see, all these extreme events are a global problem, although they might be happening in another country or in another region. So with the climate crisis, we are going to see the scale, duration, frequency, and impact of these extreme events substantially increasing. So, unfortunately, some human activities contribute to significant destruction, degradation, or fragmentation of the nature around us, in our communities, in our states, or even in our countries.

For example, activities such as draining, untreated sewage exhaustion, unplanned coastal development, trawling, overfishing, and deforestation can have direct or indirect results on soil erosion, rising sea levels, biodiversity loss, species loss, coral bleaching, and mangrove depletion. So why did we build Climate Sense? Our goal is to increase awareness and align potential solutions with local and federal policy makers, organizations, researchers, educators, individuals, and the public in general. However, there are many challenges in connecting these potential solutions with the policymakers and local communities.

For example, as an interesting fact, today an average American consumes about the equivalent of 34 gigabytes of data and information every day. So the scale of information that one needs to sift through in order to reach the right information source is very large. Another challenge is that the information resource that you're looking for could be in another language or could be coming from another geographical location, so that you may not be following that data stream, right? Another challenge is that the information that you are looking for can be local news that is not advertised in the national outlets, but it would provide a solution to your problem.

So we have built Climate Sense as a large language model solution to understand, analyze, and organize volumes of climate-related data rapidly and accurately, and deliver valuable insights with actionable knowledge. In Climate Sense, we have an LLM-based information retrieval solution. We can provide personalized recommendations, and we built a conversational NLP capability.

We analyze resources coming from 141 countries, spanning more than 40 languages and 60,000 resources. Our platform uses multiple large language models for information retrieval to select a set of resources from a large collection of data. It selects the document, text, article, government report, patents, and so on, and provides semantically related keywords that you can expand your research on.
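A minimal sketch of the retrieval step described here: picking a few resources out of a larger collection by semantic similarity to a query. The talk does not say how Climate Sense stores or scores documents, so the `Resource` class, the `retrieve` function, and the three-dimensional toy vectors below are all illustrative assumptions. In a real system the vectors would come from a language model encoder.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    title: str
    kind: str               # e.g. "news", "academic", "government"
    embedding: List[float]  # toy vector (assumption); normally produced by an encoder


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_embedding: List[float], resources: List[Resource], k: int = 2) -> List[Resource]:
    """Return the k resources whose embeddings are closest to the query."""
    ranked = sorted(
        resources,
        key=lambda r: cosine(query_embedding, r.embedding),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical collection with hand-made vectors.
COLLECTION = [
    Resource("Levee upgrades for river towns", "government", [0.9, 0.1, 0.0]),
    Resource("Fire-hardened home design", "news", [0.1, 0.9, 0.1]),
    Resource("Prescribed burns reduce fuel load", "academic", [0.2, 0.8, 0.0]),
]

# Stands in for the embedding of a query such as "wildfire mitigation strategies".
query = [0.0, 1.0, 0.0]
top = retrieve(query, COLLECTION)
for resource in top:
    print(resource.kind, "-", resource.title)
```

With these toy vectors the two wildfire-related resources rank above the flood-related one, which is the behaviour the selection step is meant to produce.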

And there is a conversational NLP capability, so that you can interact with the article, with the text, with the report, with the documents that you are reading. So, why do we use a large language model? Large language models are very good at transforming text into a numeric representation so that we can run the AI/ML applications that we know how to use. But large language models go much beyond that. Large language models are large and complex models that can understand the context. They have very good natural language understanding capabilities. That context awareness enables the LLMs to generate coherent and contextually relevant text, making them suitable for tasks like content generation and natural language understanding.

For example, you can use an LLM to classify a given document, paragraph, or article; search within a document; extract named entities; rewrite a given document or text; cluster multiple documents; or summarize a given document, right? In our system, we are using a large language model that has been trained on over 50 different languages with multiple datasets. And the advantage of using a multilingual LLM is to leverage knowledge and patterns that the model can learn from one language to improve its performance in another language. This enables the model to generalize its understanding across languages, potentially even to languages that we didn't train on. It was an explicit channel. In developing these models, we have followed ethical and responsible AI principles and practices in building and fine-tuning our LLM.

So what does Climate Sense do? How does it work? The multilingual large language model can sift through the large quantities of resources. Those could be online or in databases that we have already curated, and it selects the semantically relevant ones. We can apply advanced text analytics to tailor the content to the user and provide climate crisis information and potential solutions.

If you enable it, the Climate Sense platform can send you notifications when a new and relevant resource is discovered. So what are the expected benefits of Climate Sense? One of the major benefits of Climate Sense is to promote sustainable practices through extraction and dissemination of relevant climate crisis information to different languages, to different demographics. And those could even include things like climate-risk-aware investment strategies, right? The second goal is to raise awareness about the climate crisis and identify indicators of climate change around you.

In your neighborhood, in your state, or even in your country. All these benefits will provide time and cost effectiveness. So let's take a look at our platform. This is our login page. After a quick registration, you can log in to your account with your email and password, which will then take you to our homepage. This is the dashboard.

On the homepage, you will see that we have curated several climate-related collections for you to browse as an initial step. You can click them here or you can click them on the left side under Topics. If you want to remove them, you can remove them, or you can add a new topic that you want to follow. We also have a query box, where you can ask a question and apply any filters for language, time, and geography. And this query could be in any of the 40 languages that our LLM uses.
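The language, time, and geography filters mentioned here can be pictured as simple predicates applied to article metadata before or after ranking. The sketch below is only an illustration: the `Article` fields, the `apply_filters` function, and the sample records are hypothetical, and the talk does not describe how the platform implements its filters.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Article:
    title: str
    language: str    # language code such as "en" (assumed field)
    country: str     # country code such as "US" (assumed field)
    published: date


def apply_filters(
    articles: Iterable[Article],
    language: Optional[str] = None,
    country: Optional[str] = None,
    since: Optional[date] = None,
) -> List[Article]:
    """Keep only the articles that satisfy every filter that was supplied."""
    kept = []
    for article in articles:
        if language is not None and article.language != language:
            continue
        if country is not None and article.country != country:
            continue
        if since is not None and article.published < since:
            continue
        kept.append(article)
    return kept


# Hypothetical sample data.
ARTICLES = [
    Article("Flood barriers tested", "en", "US", date(2023, 8, 20)),
    Article("Proteccion contra inundaciones", "es", "MX", date(2023, 8, 25)),
    Article("Older levee study", "en", "US", date(2022, 5, 1)),
]

recent_us_english = apply_filters(
    ARTICLES, language="en", country="US", since=date(2023, 1, 1)
)
print([a.title for a in recent_us_english])
```

Leaving a filter as `None` means it is not applied, so calling `apply_filters(ARTICLES)` with no arguments returns the whole collection.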

If there's any resource that you like, you can add it to your favorites. And if there's a resource that you have read in our platform and want to go back to, you can see it through the history on the left pane. If you click on any of these topics, it will take you to this page. For example, this one is about flood protection. This is a collection of recent articles on the topic of flood protection and flood protection strategies. In our platform, if you ask a question in the query box, for example, "Can you find wildfire mitigation strategies?", these are the articles that are provided as of August 28th.

And you can click on any of these resources, which will then take you to this page. On this page, you can see the article title, the article source, the image from the source, a summary that we provide using our LLM, and a word cloud generated using another LLM. If you like the article, you can click on the title, which will then take you to the outside source. In this case, it is the, right?

In the system, you can interact with the article. You can ask a question about this article, such as, "Can you summarize this text with action items to mitigate climate crisis?" The LLM will read through the article and then respond to you with the action items it sees in the text. For example, it can say that the action items to mitigate climate crisis and respond to wildfires in an area are an emergency response and evacuation plan, prioritizing which residents to rescue first, establishing emergency shelters, coordinating transportation, and so on. So this is text that is contained in the article. So let's do a demo and take a look at our capabilities.
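One common way to get answers that stay within an article is to place the article text in the prompt and instruct the model to answer only from it. The talk does not say how Climate Sense does this, so the sketch below is an assumption for illustration: `build_article_prompt` and its `max_chars` cap are hypothetical, and no model is actually called.

```python
def build_article_prompt(article_text: str, question: str, max_chars: int = 4000) -> str:
    """Assemble a prompt that asks a model to answer only from the given article."""
    # Crude length cap by characters; a real system would usually count tokens.
    excerpt = article_text[:max_chars]
    return (
        "Answer the question using only the article below. "
        "If the article does not contain the answer, say so.\n\n"
        f"Article:\n{excerpt}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical article text; the question is the one used in the talk.
prompt = build_article_prompt(
    "Officials urged residents to prepare evacuation plans and emergency shelters.",
    "Can you summarize this text with action items to mitigate climate crisis?",
)
print(prompt)
```

The resulting string would then be sent to whichever language model the system uses; that call is omitted here because the model and its API are not named in the talk.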

So this is our landing page. This is our login page. You can enter. For the sake of time, I'll just move forward, right? So these are our home, favorites, and history.

These are the topics that we have created as an initial step. You can ask your queries and apply your filters here. If you want to focus on any of these, say you just want to see news articles, academic studies, or government resources, you can click on those. And if you search for flood protection, you can actually see the articles here.

And for the forest fires, this is the collection of items. If you click on any of these, you go to another page for the article. It has a summary that we provide and an image from the article, and you can click through to the outside source if you want to read it. In the system, with our LLM and our conversational NLP capability, you can ask, "Please summarize this text with action items to mitigate climate crisis."

Click on Query, and the query will return the action items to take in the case of fire and what you need to do proactively to mitigate its effects on the population around it. Let's test another example. For example, if you say, "Can you find wildfires mitigation strategies?", you can select any of the languages, apply filters, and click Query, which will then take you to another page, right? And if you click on any of the texts, this is an article that was published in USA Today about the vulnerability of other places in the United States to wildfires. It's called "The next Maui could be anywhere."

And it has a lot of good resources in this article. So if you ask the system, "What are the lessons learned in this article?", it'll show you the query results with the lessons that are contained in this article. So who are the end users of Climate Sense? We have identified three categories of end users, all the way from individuals to local communities to international decision makers. For example, this is a miracle house in Lahaina that was recently restored. And while the whole neighborhood was destroyed, it actually suffered very minor damage.

It was fire-hardened, and that can constitute a very good model in rebuilding Lahaina. For local communities, an example is Tucson, Arizona, where the community worked together to restore the native vegetation to improve their drought resilience. In the statewide, national, or international case, one good example is restoring coastal habitat. There is a Blue Carbon National Action Plan in the United States. It is an international case from the World Economic Forum.

That constitutes a very good model for the international, statewide, or national decision makers, and they can use our platform to reach information, and they can even use it for addressing geopolitical issues. So our goal was to improve public awareness and share sustainable practices in real time using large language models. We hope that you enjoyed our presentation. And if you have any questions, you can send us email at, and I would like to thank you for the opportunity to present our work. Thank you.

>> Wow, how intriguing. Thank you so much. Now I'll pass the virtual mic to Omar to lead the climate Q&A.

>> Thank you, Eboni. I appreciate it. Just a couple quick reminders: please use the Q&A function for any of your questions and not the chat function. Also, we're going to do about eight minutes for questions for any Q&A that comes through.

Also, please indicate who the question is for, by name or by the particular company. And if you can repeat the question and read the question out loud, that would be great for the actual presenter. I'll actually address the first one from James. So this is for Kamal: in your knowledge base, you appear to be storing the full body of news articles. How do you deal with sources that prohibit data scraping, for example, New York Times?

>> Yeah, that's a very good question. Some of them do prohibit it and some of them do not, although there's a paywall.

Now, if there's a paywall, since we are not showing the text itself, we are not constrained by royalties to them; we are just summarizing the content, right? If there is limited information that we can crawl through, for example, in New York Times, we will only be showing you whatever is available through getting that article. That is a little bit about royalties and, uh, the cases that they have. But as we grow this knowledge base and we can pay them, then we can expand our resources. So right now, we are already covering 60,000 resources. As we get more funding, we can actually subscribe to them and expand our coverage of the ones that are behind paywalls. So I hope that answers that question.

Right now we are taking whatever is available, which is usually the first two paragraphs. But as we get more funding, we can actually get subscriptions and expand our knowledge database as well. Thank you for the question, though.

>> Yeah. We appreciate it. You've also got another question. One, how does your product improve upon other commonly available ways to get information? And then one other one: how do you quality control data in so many languages?

>> Okay. Two great questions actually. So the first question was -- can you repeat the first question again? Sorry.

>> Yeah. No problem.

How does your product improve upon other commonly available ways to get information? >> Definitely. So, for example, take Google, right? We are not competing. We don't want to compete with Google itself, right? But if you write a complex query in Google Search, it actually struggles in doing the complex searches, right? For example, they have released a new -- I'm not sure. Maybe some of you have seen the generative AI capabilities that are in trial mode in Google, right? For example, we just ran a test: if you search for what is climate crisis on Google, it actually does run generative AI on the backend, and it provides a good answer to you. But if you say what could be a good LLM application to mitigate climate crisis, it actually falls back to the old case, to the old search, and it doesn't provide you good solutions, right? In our case, we are using a large language model to understand the similarities between the query and the data sources out there.

It's actually a very good way of understanding semantically similar things. Semantic similarity is very critical because, although you may not use the same word, in that embedding space words are very close to their synonyms, so you can capture a lot of words and understand their semantic similarity much better than with keyword searches, right? So keyword search, or PageRank-type search capabilities, are actually outdated in the sense that they don't capture the semantic quality or semantic capability that we are looking for, right? Because PageRank is more about how the websites are connected to each other, irrespective, in most cases, of the text itself, right? Whereas keyword search is more about how often that keyword comes up in that article.
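As an illustration of the contrast drawn here between keyword matching and embedding similarity, the following is a minimal sketch, not the presenter's system. The three-dimensional vectors are made up by hand for this example; a real system would obtain much larger embeddings from a language model.

```python
import math

# Hypothetical, hand-picked "embeddings" for illustration only.
# Real embeddings come from a trained model and have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "warming": (0.9, 0.1, 0.0),
    "heating": (0.85, 0.2, 0.05),
    "banana": (0.0, 0.1, 0.95),
}


def keyword_score(query_word, document_words):
    """Count exact occurrences of the query word in the document."""
    return document_words.count(query_word)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


document = ["global", "heating", "is", "accelerating"]

# Exact keyword match finds nothing, because the document uses a synonym.
keyword_hits = keyword_score("warming", document)

# Embedding similarity still rates the synonym as close and an unrelated word as far.
synonym_similarity = cosine_similarity(
    TOY_EMBEDDINGS["warming"], TOY_EMBEDDINGS["heating"]
)
unrelated_similarity = cosine_similarity(
    TOY_EMBEDDINGS["warming"], TOY_EMBEDDINGS["banana"]
)

print(keyword_hits, round(synonym_similarity, 3), round(unrelated_similarity, 3))
```

With these toy vectors the keyword count is zero while the synonym pair scores close to 1, which is the behavior the speaker attributes to semantic search.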

But if you are searching for something that's a synonym, then it'll actually be a zero, right? So that's why LLMs have opened up a new world to us, where we can do semantic search, which we didn't have before in most of the prior technologies. For data quality, it is all about, in my opinion, the data sources that you select. For example, in our AI model, we are actually selecting, as an initial case, models that are trained on vetted datasets. If your AI model has been trained on unvetted datasets like forums or some discussion areas, right? They're actually unfiltered. They might have biases.

They might have toxic content, which we want to avoid anyway. Therefore, if you train your AI algorithm on a dataset that has been vetted, that you know has been cleaned and doesn't have, you know, the bias, toxic elements, you know, the bad words and so on, it's actually a very good way of understanding the quality of the data that your AI model is trained on, right? So this is about training your AI. If you train it with biased, with toxic, you know, text, you will have subsequent effects from that, right? It's just like your child: if they watch some bad content on the TV, they'll be seeing that bad content in their real life. Same for your AI model. It'll have the same, unfortunately, capabilities.

So that's for training the AI model with the data that we are using. The data that we are showing is actually about, you know, the new news articles that are being published. So if there's any bias based on a news article's political, you know, situation, we don't have much control over that one. So that's the only thing about the data quality: you know, this is an online resource. But we are not limited to news articles. We are actually also looking at government reports, which are very critical for other, you know, communities, so that they can know what the government is planning to do and take, you know, parallel actions to mitigate the climate crisis or its effects on them. I hope that answers the question.

>> Thank you, Kamal. We've got a couple more questions as well. So multilingual language models may have varying performance depending on the language, in particular, for low-resource languages.

Did you assess performance of Climate Sense for different languages? Did you notice differences in the relevance of results to queries or in the quality of summary for different languages? >> That's actually an excellent question as well. That is true. But it's also true that if you have a single model just for one language -- so there were like some interesting like natural language processing workshops where they opened up some challenges.

And in those challenges, the ones that always performed the best, or landed in the top ranks, were the ones that were trained across different languages, right? So large language models are very good in their performance, right? For low-resource languages, it is true that their performance is not that good. But since we are using large language models, if a low-resource language has a related language in our training database, that's actually a very good way of expanding or understanding, sort of, the similarities. So, just like I mentioned in one of the slides, since we know the patterns in the, let's say, cousin language, we can actually carry those into the low-resource language as well. So that's a very good case.

But since a multilingual model is based on 50 different languages, if a cousin, a related language, is within our training data, it is actually a very good way of improving our performance or capturing the semantic similarity there. But a very good question. Yeah. >> All right. One more question. How is the data updated? Is the internet queried on the fly based on the user query, or are there time-based crawls with the database updated in a batch? Do you have a process for evaluating differing information on the same topic? >> That's true. That's a very good question.

That's why in our presentation, we said we are doing both of them at the same time. So we do continuously update our system. But if there's a new query and enough time has passed since our last query, we do a new query and fetch whatever is not updated in our system, right? So that's a very good way of getting online resources as well as the ones that we have already cached in our system. That brings us an advantage for latency, right? So when the user clicks on the Query, we want to make sure that the information is given to them as soon as possible. Like nobody will wait for another minute. So by pre-caching through our database and also getting the new data at certain intervals, we can actually present it to the users with as little latency as possible. >> Thank you so much, Kamal.
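The answer above describes serving cached results when they are recent and issuing a fresh query once enough time has passed. A minimal sketch of that idea follows, assuming a single in-memory cache and a fixed maximum age. The class name `FreshnessCache`, the `fetch` callback, and the 60-second threshold are hypothetical and not taken from the presentation.

```python
import time


class FreshnessCache:
    """Serve cached results while they are fresh; refetch once they are stale."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch            # function that performs the live query
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the demo can control time
        self._entries = {}             # query -> (time_fetched, result)

    def get(self, query):
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]            # fresh enough: low-latency cached answer
        result = self._fetch(query)    # missing or stale: query the live source
        self._entries[query] = (now, result)
        return result


# Demo with a fake clock and a fake fetch function (both stand-ins).
calls = []
fake_now = [0.0]


def fake_fetch(query):
    calls.append(query)
    return f"results for {query!r} (fetch #{len(calls)})"


cache = FreshnessCache(fake_fetch, max_age_seconds=60, clock=lambda: fake_now[0])
first = cache.get("climate crisis")
second = cache.get("climate crisis")   # within 60 s: served from the cache
fake_now[0] = 120.0
third = cache.get("climate crisis")    # 120 s later: stale, so fetched again
```

A background job that refreshes entries at fixed intervals, as the speaker also mentions, could call `get` on known queries on a schedule; that part is left out of this sketch.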

I'm going to hand over to Eboni. >> Thank you, Omar, and thank you, Nexcepta Team. >> Thank you very much. >> We have two submissions focused on equity for all Americans.

Take it away ASMI AI. >> Hi, thank you. Let me get my slides up.

Okay. That was a very interesting presentation. I am Nathan Harmon from ASMI AI, and I'm excited to present Jibber Jabber. Jibber Jabber is a large language model based command and control for radio frequencies that enables humanitarian efforts. As we were just hearing, there are a lot of natural disasters occurring in our world, and whether it's the Maui fires that have been extremely deadly, maybe the most deadly fire in over a century in the US, we still have people missing. And the impacts of these natural disasters are often accentuated by the rapidly changing conditions and the difficulty with communication and the resulting failures in communication. RF provides an invisible backbone that's relied upon by emergency response teams.

There is a huge amount of information that exists in the spectrum that we only partially utilize. The parts that we do utilize, we are very dependent on, whether it's two-way radios, blue force trackers, emergency alert systems, satellite phones. And when these systems fail or we don't utilize them to their full extent, we do not respond as we should to these events. Beyond just the data that exists from the response teams, there's also a lot of information being exchanged on the civilian side from walkie-talkies or news broadcasts over the FM and AM radio stations or cell phones and smart watches. When you can utilize all of this information, it really enhances your ability to respond to these situations, but oftentimes due to the sheer complexity and the knowledge requirements to utilize it properly, we fail to do that. Some of these complexities just come from knowing where in frequencies these devices exist or their access strategies or encoding.

And this puts a huge burden on human operators both in the amount of training they require to get into this field as well as just when you're in the field operating, you have all this extra complexity that takes away from your response mission. This is what we've built Jibber Jabber to solve. Jibber Jabber takes a lot of this cognitive burden off operators. So rather than having to know the exact frequency, data access type, and device type when you need to go collect some data, you can now just say something like the prompt that's shown on the right here. Is anyone requesting help on a walkie-talkie? From there, Jibber Jabber will leverage a large language model in order to autonomously go and find the correct frequencies and access strategies for achieving this objective that the user has provided.

And then we also have developed a radio stack that can drive a software defined radio. So then the large language model is capable of then operating this radio stack to go collect this data. The data is then demodulated into audio. This audio is then translated into the language that the operator speaks and then transcribed back into text. And then that allows the large language model to then query this as we were just seeing, so then you can ask questions about these streams of data such as, is anyone requesting help? If we just once again think about an emergency response situation, you might have one team that has had some conditions change and now their evacuation route has become unsafe while another team has successfully established a safe zone.

This information needs to be quickly communicated to the correct parties. And a system like Jibber Jabber can do this all automatically. It can listen to the conversations that are happening, understand the context that's happening, and facilitate this communication. This next slide, we're going to look at a video recording of actually running the Jibber Jabber system. This is using real hardware and real over-the-air data.

No simulation. It's all live. Was live when we recorded the video. So given that we're using large language models in order to drive this system, they also understand the parameters of the system that they're running so they can train a user how to operate them. So you can just ask questions like, you know, what are your capabilities? Or even questions that are RF related to increase your education level on these things. You know, what is an FM signal? How is that modulated? What types of data are commonly held there? Beyond the training aspect, you can also now just ask it to accomplish tasks.

So a simple one, collect FM radio. And then it'll tell you how it thinks it's going to achieve that, and then give you the actual parameters that's setting the radio to. And now you can see this live spectrogram showing all these channels of FM radio stations. So each of those lines is a different FM radio station.

And that ends up being a lot of data. So you can see as it starts collecting and demodulating all these streams, you end up getting a lot of text, a lot of information far beyond what one human could really comprehend at a single time. Once all of this data is transcribed, you can then go and select one of these transcripts and start interrogating it, whether it's just asking for a summary. So this was a BBC news channel. And so by asking for a summary of it, you end up getting a summary of the news from that day, or you can ask if specific topics are being discussed such as, you know, is somebody requesting help over this RF channel. Beyond just a simple command, you can ask it to do more semantically rich tasks.

So "look for a hiker": that doesn't specifically tell it what to do in the RF domain, but it's intelligent enough to realize that, well, hikers might have walkie-talkies on them. And so then it goes and tunes the radio stack into the walkie-talkie band. There was not a lot going on in the walkie-talkie band in the lab. So we just said some stuff over a walkie-talkie, right? The background is getting blurred out.

And then the system will pick that up and transcribe that text as well. It goes. And so then that was what I said on the walkie-talkie in the lab. So the capabilities that we just saw there were this built-in training mechanism. So it's very important to be able to train your operators and this system will let their operators just ask questions as they go in their own words, and then provide feedback as far as questions relating to the RF domain.

It's also capable of accomplishing objectives autonomously. So you can provide it with this objective of looking for missing persons, and it will go ahead and do intelligent things using the spectrum to try to locate them, such as looking for walkie-talkies. The radio stack is capable of detecting, identifying, and demodulating AM and FM signals. Our translation and transcription currently works in 57 languages and we're looking to increase that. And then finally, we can extract insight from that collected data, such as looking for people requesting help. The system is a microservice architecture.

So if you already have a collection platform, such as your own radio, we currently support the VITA 49 format for serialization. So you can just plug your radio in and our system will run, but you can plug in at any of these levels. So if you have a radio and your own radio stack as far as processing these signals, then you can just plug into our natural language processing. As far as where we're headed, we have really been working a lot on these first two things with the continuous feedback into the LLMs. So you ask it to do a task, like look for a missing hiker, and likely the first thing it'll choose to do is go look for a walkie-talkie, but maybe the hiker didn't bring a walkie-talkie with them. And so it's going to try to collect data for a while, it's not going to find anything, and then it's going to go ahead and move on and start looking for other RF emissions.

Maybe there's some Bluetooth coming off a smartwatch or some sort of signal coming off of a cell phone. And so it can continue to take in what it's collected and change its behavior to attempt to accomplish the objective. Then we've also been testing running on an NVIDIA Orin. It's a small embedded platform. It's, you know, a four-inch cube basically. And this has required, you know, fine-tuning smaller language models that are very task-specific that are capable of fitting on this smaller hardware footprint.
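The feedback behavior described above (try one kind of emission, find nothing, move on to the next) can be sketched as a simple fallback loop. This is an illustration under stated assumptions, not ASMI AI's implementation: the band names, the `collect` callback, and the fixed ordering are hypothetical, whereas in the described system a language model chooses the next step from what has been collected so far.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def search_with_fallback(
    candidate_bands: Iterable[str],
    collect: Callable[[str], List[str]],
) -> Tuple[Optional[str], List[str]]:
    """Try each band in order and stop at the first one that yields transcripts."""
    for band in candidate_bands:
        transcripts = collect(band)   # stand-in for tuning, demodulating, transcribing
        if transcripts:
            return band, transcripts  # something was heard: report it
        # Nothing heard on this band, so fall through to the next candidate.
    return None, []


def fake_collect(band: str) -> List[str]:
    """Hypothetical collector: only the Bluetooth band has any activity."""
    observed = {"bluetooth": ["smartwatch beacon detected"]}
    return observed.get(band, [])


# The hiker did not bring a walkie-talkie, so the first band comes back empty.
band, found = search_with_fallback(
    ["walkie-talkie", "bluetooth", "cellular"], fake_collect
)
print(band, found)
```

A fuller version would also bound how long each collection attempt runs before giving up, which corresponds to the "collect data for a while" step in the talk.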

So that's the Jibber Jabber system. And if you have interest in this, we'd love to hear from you. You can contact me over email and please check out our website and blog. >> Wow, how interesting. Thank you. Now we'll hear from Figure Eight.

Don't you just love a Zoom snafu, y'all? I just love them. >> Hi, everyone. Let me go ahead and share my screen. Hi, everyone.

My name is Tim. I'm the head of product here at Figure Eight Federal, and today I want to present our solution for an LLM based chatbot focused on driving equitable access to tax and regulatory information. Every American deserves an equal chance to get ahead. And a big part of that is breaking down the financial and other equity related barriers related to access to key information and getting answers to questions that allow individuals to innovate quickly and chase their dreams. Before I get started, I want to take a second to talk about who we are and why we're passionate about this topic. Figure Eight Federal's primary investor is Appen, the leading data enrichment company for AI technologies.

At Figure Eight Federal, we are a small team that migrated from Appen with the sole mission of empowering federal employees with technology solutions that enable them to drive impact for the American people. And to date, we've had early success with both our parent company's commercial clients as well as federal agencies, where we focus on making data visible, accessible, understandable, and trustworthy for LLM initiatives. So I know what you're likely thinking, who cares about tax and regulatory information? This is a boring topic. And what does it have to do with equity? And what I'd like to tell you is a quick story that shows why we care about this topic and why we're passionate about solving some of the pain points related to this. So years ago, as a broke engineering college student, I had a dream of leveraging technology to connect urban and rural communities with access to non-GMO plants and natural garden products.

And the biggest challenge I faced was understanding the tax and regulatory information associated with starting and running a small business. I remember staying on hold with the IRS for over three hours trying to get answers to my questions. And some of my wealthier classmates would recommend, oh, hire a law firm or an accountant to explain stuff. But at the time, that was beyond my means. And as a result, I ended up having to spend a lot of time researching tax and regulatory information associated with the space that I was operating that small business in. My girlfriend is from Brazil.

Recently, she and some of her Brazilian friends wanted to start a business in the US and actually, tax and regulatory codes are very different in the US than they are in Brazil. And as an immigrant, she's facing an increased challenge from an equity perspective, gaining access to the information that she needs. And so hearing about my girlfriend and her struggles and her friends' struggles reminded me of my college days and the struggle I had to find access to tax and regulatory information, even as somebody who grew up here in the DC area. And so it got me asking, what if there was a way to leverage LLMs to create a chatbot experience that could provide customized insights into complicated tax and regulatory information regardless of your financial status, where you're from, or what you're trying to achieve? What if there was a way to empower government agencies to have full control to easily modify LLMs, to check for and mitigate things like bias or hallucinations and keep legal information up to date in that chatbot itself? So at Figure Eight Federal, we believe that an LLM chatbot is the answer to providing equitable access to customized insights on tax and regulatory information. It will not only reduce the heavy demand on agencies like the IRS and others with regards to phone support, but it will also solve a lot of the traditional issues with chatbots not being able to provide helpful enough information. The key issue though with using an LLM for providing insights on things like tax and regulatory information is finding an adaptive approach that empowers government employees to continuously monitor and improve the LLM, to evaluate it for safety, reliability, and accuracy, and to identify and mitigate bias in real time.

So at Figure Eight Federal, we have experience developing training data for a number of major foundational LLMs on the market today, as well as customizing these LLMs for various enterprise applications through things like fine-tuning and reinforcement learning with human feedback. And when customizing an LLM, there are a lot of things under the hood that are required to optimize it for a niche use case and really to identify and mitigate the potential bias that could be in the original training data that was used to develop that LLM, to ensure you're providing not only equitable access to information, but accurate information as well. So we've developed a platform called the Figure Eight Federal LLM optimization platform that empowers agencies to ingest tax and regulatory information, connect to foundational LLMs, and then develop, test, and validate fine-tuning data to reduce bias and hallucinations, and ensure information is provided in a way that gives fair access to critical information that can help small businesses get off the ground. So our system also provides a frontend chatbot interface that agencies can deploy and share with end clients. And so instead of a static approach, taking a one-time customization of an LLM for a particular purpose, in this case, tax and regulatory information, what we've done is created an end-to-end solution that provides full government oversight through integration of government subject matter experts, continual monitoring, full transparency, and integrated red teaming to ensure that there are checks in place to evaluate for things like accuracy, reliability, and effectiveness for all users across all demographics and backgrounds. So in the frontend, we've designed a simple interface that allows users to upload information about their business and to provide the LLM with context for supporting them with chats on questions related to tax and regulatory information customized to their small business.

It contains simple-to-use buttons for things like indicating that the response is too complicated, asking for an example, or clarifying exceptions. The frontend chat interface provides a feedback loop to the Figure Eight Federal LLM optimization platform on the backend that can escalate chats that need additional insights and clarification from human support teams. So if we go back to the overall picture of the platform, what we have is a number of modules that provide things like the structuring of raw data from things like existing IRS webpages, tax laws, and other agency provided data, leveraging our workflow engine, quality control modules, and then the integrated red teaming.

We provide a tool set for identifying and mitigating bias and ensuring that inaccurate answers are flagged and sufficient training data is developed to optimize the LLM, to perform its additional improvements over time. And so one of the key differentiators here is actually our smart tasking module, which allows government employees to test and evaluate the optimization of this tech -- the text of the chatbot by connecting it to a wide range of end user groups. And so government employees have access to a diverse crowd through our parent company Appen directly within the platform that allows them to rapidly assemble diverse user groups where they can reward individuals that provide feedback on testing the quality and effectiveness of this tax and regulation chatbot in answering their need.

And this is key in really being able to identify bias before deploying it to a broader public audience. And so just wanted to highlight a couple of the modules on the backend to help make this a reality. We have a no-code/low-code interface that we developed, where government employees can have full customizability of developing the fine-tuning data to optimize those LLMs that feed into the chatbot interface. And we have a workflow module. And what is really key about the workflow module is it allows government employees to be able to automatically route information that is connected to that frontend chatbot so that they can integrate red teaming and quality flow technology to be able to identify issues and mitigate those issues quickly and effectively.

So just a highlight of the quality control module, this module actually monitors over 200 different human factors to ensure the quality of the fine-tuning datasets when an issue arises with the chatbot, where additional insights and information need to be provided to fine-tune the chatbot for additional information and context related to some new tax or regulation related to small business. And basically, the overall approach of combining these elements in the background behind the chatbot is really to focus on enabling an agency to deploy a chatbot for something very complicated like tax and regulatory information. But instead of having the static one-time approach, empowering the government employees to be able to quickly adapt and continuously monitor and flag where certain users aren't able to get access to the type of information that they are seeking through the chatbot. And so that's why we developed this as more of a comprehensive solution of not only just the LLM interface and a way for people to interact with this chatbot, but also the backend process for government agencies to have more of a government guided approach to ensure that they are taking into account the needs of all the different users that are interacting with the chatbot to gain relevant information related to tax and regulatory insights for their particular business, while providing that oversight and integration to adapt it as changes to the tax and regulatory codes develop over time. And so we see our tool as a way to not just be a one-time COTS solution, but to be something that allows an integrated teaming approach, empowering folks within the government to fully customize and adapt the chatbot's interface and also rapidly task those user groups, leveraging our global crowd for individuals from different countries that may be immigrating here and want to find certain information.
Being able to test across, you know, different industries for small businesses that are focused on, you know, manufacturing or maybe they're focused on service type businesses, allows the IRS or other agencies to really be able to take a driver's seat perspective in monitoring and ensuring that a chatbot that they provide to the public is fully adaptable, and customizable, and can solve potential issues that arise that could be leading to bias or an inequitable approach with how individuals are trying to get access to tax and regulatory information.

And I want to thank you guys for inviting us for this presentation, and I look forward to your questions. >> Wow, now I'll pass the mic to Omar to lead the equity Q&A. >> Thank you, Eboni. I appreciate it. We're going to kind of go for the first particular question that came up.

This is for Jibber Jabber, right? This is going to be for Jibber Jabber as well as -- is there a particular reason you chose to showcase a live video versus a live demo? >> Yeah. The main reason is, you know, you want it to go perfectly during the presentation, but also, you

2023-09-26 11:18
