Large-Scale 3D Digital Twins with AI and Unreal Engine | Unreal Fest 2023
Hello, everyone. Thank you for spending this hour with me. Hopefully, it's going to be interesting for you. I will do my best to do so and to make it interesting.
My name is Hannes Walter. I'm heading the synthetic training solutions team at Blackshark. And I'm looking forward to running you through my presentation and taking your questions afterwards, of course. So I structured the presentation as you can see-- so a few words about Blackshark for those who don't know us, then why the creation of large-scale 3D digital twins is necessary and is needed, and what exactly is needed.
So I want to elaborate on that a little bit to set the ground. Then I will quickly run you through our process and how we are doing it, show you some use cases that are possible based on Blackshark Technologies, and also give an overview of what's included-- so the Unreal plugin, of course, but what else can you get from Blackshark and how is it structured and how can you use it and integrate it. And last but not least, how you get started with us-- so this is not so obvious sometimes. And I want to sketch out that a little bit as well.
So for those who are not familiar with Blackshark, Blackshark is an Austrian company. So our headquarters is in Graz, where most of our team is located. We are about 100 people. But we have an office in the US as well. And as Sepp mentioned already, we have been doing this for quite a while already.
The background is that our sister company, Bongfish, is a video gaming studio which was founded 20 years ago. And their focus and stronghold was procedural terrain generation for the last 20 years, basically. Even I was part of the first snowboard game, large-scale snowboard game where the team procedurally generated the terrain. And the team did several video games together with Microsoft.
And that's why Microsoft approached us about seven years ago and asked the team if we can help to reconstruct the entire globe in photorealistic 3D for the then-upcoming Microsoft Flight Simulator 2020. And our CEO, Michael, made the right decision. So we have a pretty good university dealing with 3D computer graphics and gaming and geospatial as well-- so an interesting mixture for this kind of work.
And so he made the right decision and started to build an AI and machine learning team within the company in order to be able to reconstruct and to solve this problem for the Microsoft Flight Simulator. That was published and, of course, for us, was a very big reference. And later, when Satya Nadella called Michael, it was clear that this is something bigger. And we wanted to offer it and provide it to other industries and so on as well. And that was basically the foundation of Blackshark.
We had a Series A funding round a couple of years ago. You see that we have very interesting partners-- with Epic, of course. Microsoft invested in us. AWS is very relevant and a good partner as well.
Maxar Technologies-- the world's largest satellite imagery provider, one of our core partners. And Kevin and Esther are here in the room as well. So we can follow up on that-- and NVIDIA. Unfortunately, we still can't publish any customer names. That's, for me, personally, a bit of a pain. But it is how it is.
So why are large-scale 3D twin creation capabilities needed? Some simulation scenarios simply need a vast scale, sometimes global scale. Think of global data simulation, for instance, just to give you an example, or a flat-level simulation, large-scale battleground simulation-- large-scale gaming, of course, but then the border between simulation is crossed already. But it's simply-- you probably know it, and that's why you are here in this session.
It's needed. But existing methods are simply very costly and time-consuming. So either manual 3D twin modeling, CAD modeling, or whatever tool a human being is using, or semi-automated with some kind of semiprocedural approach, or photogrammetry, where you need a lot of vantages and shots, of course, to calculate it-- anyway, it's both costly and time-consuming. But we want to have a complete 3D digital twin because incorrect or incomplete data results in wrong simulation results, and outdated 3D twin information can lead to imprecise training of humans, for instance.
So within the pilot training industry, this is very well-known for a very long time. The pilots need to be trained on the most current airport data, for instance, just to give you an example. And also, we want to have it as immersive and realistic as possible because in a simulation, or in a simulator for humans, this is important for the effectiveness of the training results. Now, we could argue there are global 3D digital twins out there already, like, for instance, Google Earth. But, of course, they are not-- or they might look very good for the human eye from a certain distance. But they are not ready for simulation.
This is, of course, an extreme example here. But the problem with photogrammetry used for simulation is always the same. It's lacking semantic information.
It's not vectorized. So we need to come up with another solution. And also, the coverage, or the global coverage, is limited. So there are still many, many regions where no photogrammetric data is available.
We could also argue, OK, there are vector-based data sets out there already, like OpenStreetMap. But also, OpenStreetMap has a very limited coverage. So, yeah, this was out there already when we started. So we needed to come up with another solution and another approach to be able to reconstruct the entire globe.
In general, especially for training and simulation, what's needed is a 3D digital twin that is consistent with the real world. That might not be that important for gaming. But for simulation and training, of course, we want to have a consistent one. And at least we aim for it.
We need to have a vector-based representation that covers all relevant aspects. We want to have raster data, terrain data, vector data all integrated into this so that we can use it and leverage it for the simulation. If we go down an AI-based simulation approach, and we want to optimize it, then we need the semantic information and the metadata embedded. And it should cover the whole infrastructure of the globe without any gaps or missing areas. And, of course, here, especially at the Unreal Fest, we want to have it as realistic and high-quality as possible and high-performance at the same time, which is kind of a contradiction. So this was also a challenge for us, and we needed to be very creative to overcome all those requirements and fulfill it, basically.
And imagine the Microsoft Flight Simulator. It's a massive multiplayer game played by many, many users at the same time. And we want to stream it to everyone, the entire globe.
Everyone can decide where they're going to fly and so on. So, yeah, scale is a tricky part here as well, as we are talking about petabytes of data just for the satellite data alone. So what is our approach? We see here satellite data, Maxar Vivid, 50-centimeter in Italy. And we train our artificial intelligence to extract certain features from this satellite data.
So we only need and have this two-dimensional satellite data. But we can still extract 3D information about the buildings, plus some metadata regarding the building roof type, even what kind of building type it might be, and so on. So whatever we can extract with artificial intelligence, we extract. And we use that to reconstruct it in a procedural approach in a photorealistic manner.
And for simulation, of course, the semantic information is still embedded, even the material attributes for all the building details and so on. Just to give you a quick heads-up how it's done and, yeah, it's simulation-ready. It's the entire globe. So with the right shadow casting, you can really simulate how long a shadow would be on a specific date or position and so on. So let's have a closer look how we are going to do that or how we are doing it. We have several products and toolkits along the end-to-end geospatial platform and pipeline, which allows us to turn freshly collected or available sensor data by all means into a simulation-ready 3D environment fully automatically.
So first, our OCN data lake solution can host, manage, tile, and stream petabytes of data. And it's cloud-native, fully scalable. So you can introduce any kind of resolution to this data lake, also your vector data. Basically, it's a big data dump. But it's organized in a very smart way. And then our geospatial intelligence solutions, called ORCA, leverage this data and extract the semantic information, as I've shown you.
But we also have a tool which allows anyone to easily extract specific objects and segment specific objects with no code at all. So it's really a no-code tool which can be used by anyone. And you can, by this, train your custom ML models and apply them at very large scale on all the data within OCN or define where they should be applied. And then, basically, we are streaming in and using all this information to turn it into this photorealistic 3D environment, either within the Unreal plugin, or we can export it in all kinds of formats, like OpenUSD, FBX, OBJ, glTF. If you're interested in 3D Tiles, we should talk later on as well.
And then, basically, you can build your simulation on top of it. Or another view on it-- basically, we have the imagery. We extract certain features, like the footprints, building volumes, metadata, and reconstruct it at runtime in a realistic way. I mentioned the labeling tool.
And here you see how it's done. On the left-hand side, you see freshly collected imagery. And this is point zero in labeling or introducing a new model. So, basically, everyone can easily just paint and indicate in green what their object of choice is. And with cyan, you basically tell the model, that's not what I'm looking for. And on the right-hand side, you get instant feedback what the model understood so far.
And with every additional stroke, you increase the accuracy of the model, basically. And within an hour, you have a full-blown precise model, fully scalable, which you can apply on your own data or on external data sets. And this was also done out of necessity because we were a small team, and we still needed to train models to extract features on the entire globe. So nowadays, it's clear, OK, yeah, we need annotated data. But, yeah, as I said, out of necessity, we need to be creative so that we were able to label all the buildings for the globe with two data labelers.
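As a rough illustration of the stroke-based labeling loop described above, here is a toy sketch in Python. It is not Blackshark's tool or model: the class name `StrokeClassifier`, the labels, and the nearest-centroid rule on RGB values are all assumptions chosen to keep the example small. It only shows the idea that positive (green) and negative (cyan) strokes feed a model whose prediction can be refreshed after every stroke.

```python
from statistics import mean


class StrokeClassifier:
    """Toy pixel classifier (hypothetical): nearest centroid in RGB space."""

    def __init__(self):
        # Pixels painted by the user, grouped by label.
        self.samples = {"object": [], "background": []}

    def add_stroke(self, label, pixels):
        """Add the RGB pixels covered by one stroke ('object' or 'background')."""
        self.samples[label].extend(pixels)

    def _centroid(self, label):
        pts = self.samples[label]
        return tuple(mean(p[i] for p in pts) for i in range(3))

    def predict(self, pixel):
        """Return the closer label, or None until both labels have samples."""
        if not self.samples["object"] or not self.samples["background"]:
            return None

        def dist2(center):
            return sum((a - b) ** 2 for a, b in zip(pixel, center))

        obj = dist2(self._centroid("object"))
        bg = dist2(self._centroid("background"))
        return "object" if obj < bg else "background"


clf = StrokeClassifier()
clf.add_stroke("object", [(200, 60, 50), (190, 70, 60)])      # e.g. red roofs
clf.add_stroke("background", [(40, 120, 40), (50, 110, 45)])  # e.g. vegetation
print(clf.predict((195, 65, 55)))
```

A production system would use a learned segmentation model rather than color centroids; the point here is only the feedback loop of stroke, retrain, and preview.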
And after that, we applied it large-scale, as I mentioned already. So here you see a small reference project we did on the Iberian Peninsula. And just to give you an example why this is necessary, you see here indicated in yellow all the OSM buildings to a certain point. Microsoft published their building footprints as well.
And this is what our artificial intelligence extracts out of it. So for us, it's really about high-frequency, large-scale mapping and feature extraction and leveraging this data for the simulation applications. One very interesting use case that's possible based on these very fast and scalable feature extraction capabilities is, for instance, change detection.
So here you see an example where we compared a Maxar Vivid vintage from 2021 in green with a vintage one day after the earthquake in Syria and Turkey. And you see that the artificial intelligence then can't find the buildings anymore. And basically, you can guide and direct the disaster relief teams accordingly, especially for very remote areas.
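The change-detection idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual pipeline: footprints are simplified to axis-aligned boxes `(x0, y0, x1, y1)`, and the function names and the 0.5 overlap threshold are hypothetical. A building from the earlier vintage is flagged when no footprint extracted from the later vintage overlaps it sufficiently.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def missing_buildings(before, after, min_iou=0.5):
    """Footprints from 'before' that have no sufficiently overlapping match in 'after'."""
    return [b for b in before if all(iou(b, a) < min_iou for a in after)]


before = [(0, 0, 10, 10), (20, 20, 30, 30)]
after = [(0, 0, 10, 10.5)]  # the second building is no longer detected
print(missing_buildings(before, after))
```

Real footprints are polygons in geographic coordinates, so a real comparison would use polygon overlap and a spatial index instead of a pairwise loop.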
So this is basically our end-to-end pipeline. And now let's watch a short video summarizing this and also give you some impressions about SYNTH3D. [VIDEO PLAYBACK] - Blackshark.ai presents SYNTH3D, the first photorealistic 3D replica of the surface and infrastructure of the entire globe.
It contains all the terrain, vegetation, and infrastructure, and is ready for large-scale simulation training and visualization applications. SYNTH3D includes every country, city, building, and detail in a region-specific and geotypical way. In addition to realistic textures, it contains semantic metadata and material attributes of all objects.
Blackshark's digital airport generator turns airport ground charts into high-fidelity 3D airport models all embedded in SYNTH3D. SYNTH3D is derived from Maxar Vivid global satellite imagery base map by Blackshark's AI-powered geospatial platform, extracting relevant information with unprecedented speed and scale. And as Maxar Vivid is continuously updated, SYNTH3D always stays current. With the Blackshark Globe plugin for Unreal Engine, any georeferenced data can easily be integrated and combined with Blackshark's SYNTH3D and is ready to be integrated into your application wherever it's needed. At Blackshark, we put the whole world in your hands. [END PLAYBACK] Yeah, as mentioned, SYNTH3D is a joint brand, basically, between Maxar and Blackshark.
So Maxar is providing the 50-centimeter, cloud-free, color-corrected, orthorectified satellite data. And we are getting constant updates. And by this, we can leverage this satellite data, pick it up, including the realistic 3D terrain, the vegetation, our buildings and so on, and make it accessible for you. And just some impressions of what this looks like in different areas of the globe. Of course, it's not geospecific. So one specific building in reality might look different.
But for some use cases, this is sufficient. And for some regions, like in China, there is no other way to get such realistic 3D data and 3D twins of the area. And we basically can provide and cover the entire globe at this level of fidelity. So what are potential use cases? Or maybe, before that-- so SYNTH3D is basically extracted from the satellite data.
But as we also see later, our pipeline can ingest any kind of building vectors. So if it's about precise building vectors as well, then we don't apply our geospatial intelligence to extract it. But we can, for instance, use OSM, or you have your own building vectors and so on.
And then, of course, we have the accurate building volumes plus the synthesized facades and textures on top of it. So what are typical use cases for it? Of course, flight simulation, as we are coming from Microsoft Flight Simulator and gaming-- now we are doing many projects for the professional flight simulation business, as it was indicated in the video already. We also have an airport generator. In this case, we are not only ingesting satellite data, but official airport ground charts and approach plates.
And our pipeline and artificial intelligence turn that into simulation-ready 3D digital airport models, including the light systems, the markings, and so on. So for the light systems and markings, we stick to the official FAA rules and apply them so that the light systems, markings, and so on meet the standards and the requirements. AEC and city visualization-- so, yeah, we're doing quite a few projects in this space for several reasons.
Sometimes, there is no other data available. So this is one reason. Or with our semantic building vectors, it's very easy to style, to colorize the building volumes, based on any kind of input data.
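Colorizing building volumes from an input attribute, for instance energy consumption per building, might look like the following Python sketch. The function names, the `energy_kwh` field, and the green-to-red ramp are assumptions for illustration only and are not part of the Blackshark SDK.

```python
def value_to_rgb(value, vmin, vmax):
    """Map a value to a green (low) to red (high) color, clamped to [vmin, vmax]."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    return (round(255 * t), round(255 * (1 - t)), 0)


def colorize_buildings(buildings, vmin, vmax, key="energy_kwh"):
    """Return a copy of each building record with an added 'color' entry."""
    return [{**b, "color": value_to_rgb(b[key], vmin, vmax)} for b in buildings]


buildings = [
    {"id": "a", "energy_kwh": 0},
    {"id": "b", "energy_kwh": 100},
]
for b in colorize_buildings(buildings, vmin=0, vmax=100):
    print(b["id"], b["color"])
```

Because the color is derived per building from its own metadata, the same rule can be applied to any number of buildings without manual styling.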
So you can easily visualize and colorize each individual building based on, for instance, energy consumption and so on, and, again, at global scale, very large scale, still performant and so on. So here we're doing very interesting projects as well, just to give you an overview of what's possible. And you know your own use cases the best. That's an interesting one as well.
So until now, we only had the landmass reconstructed, or what we've seen so far. But as I mentioned, our OCN platform can basically ingest any kind of digital elevation model. So we did a project also with bathymetry-- and so, yeah, very flexible to conflate and integrate all kinds of external data sources and basically expose the best data set for a specific region that's available. And in this case, it was about port reconstruction, of course. All the containers were procedurally placed.
So our team is doing a great job here in leveraging artificial intelligence and all procedural methods possible. Also a very interesting use case to either visualize and reconstruct existing energy projects-- so in the US, there are about 64,000 wind turbines, and the geolocation is publicly available in an API. So we can easily stream in the location for each of the wind turbines and reconstruct all of US, including the wind turbines. Or it can be used to visualize planned projects or visualize also a national-size grid infrastructure, place all the poles, and so on.
For solar panels, it's interesting to have the semantic information underneath because based on that, we can optimize and place the solar panels, rule-based. Everything you have seen so far is basically covered by satellite data. And now it's going to be interesting because with Maxar's procedural content generation-- sorry, with Unreal's procedural content generation capabilities, of course, on top of our semantic data, you can go wild, and you can style and increase the level of fidelity, even on the ground to the level you need, basically.
So all the grass, the details, and so on here are done procedurally. So my colleagues just came up with some examples. Or, of course, you can replace the Blackshark generic buildings with your own building because you want to come up with a game, and it should be styled-- it should be styled in marshmallow style or whatever. So the semantic 3D canvas is there, and we expose it.
And based on that, you can integrate it and style it as you need, but still based on realistic data. Visual sensor simulation is a very interesting use case. And maybe, also, we concentrate, really, on the static globe.
So we will introduce some seasonal shaders and so on. But we don't do any dynamic objects and so on. So we really concentrate on the terrain. And the rest on top of it is up to you, based on our SDK and the plugin which can then be leveraged to integrate it.
In this case, it was about simulating certain scenarios for delivery drones. What if a child runs out? Do the sensor and the artificial intelligence of the drone behave in the right way? Based on the material attributes I mentioned already, of course, we can do physics-based sensor simulation-- so night-vision goggle simulation, for instance, or whatever sensor is needed. Battlefield simulation and scenario simulation in all regards-- so either photorealistically or only visualizing certain aspects, or use the semantic canvas and ground to simulate autonomous agents and systems. Based on the realistic view, also, synthetic training data generation is a very interesting use case because we have the entire globe in 3D.
So depending on the angle of a sensor you want to simulate, certain objects you want to identify or you want to have annotated data for are partly hidden. And all this can be easily simulated-- weather conditions and so on-- based on our synthetic 3D twin. So what do you get from Blackshark if you're interested in our stuff? First, we need some data-- so raster data, vector data, terrain data, 3D assets, custom assets, whatsoever. We have our own data lake, which is hosted on Azure.
But maybe you have your own data, and it's hosted somewhere already. Anyway, with our OCN platform, we can do the data management. It can be used, as I mentioned already, to introduce additional data, to tile it automatically, to stream it, everything that's necessary. And if you want to leverage our ORCA GeoINT capabilities and ML models, that can be done as well. So, basically, the ORCA part here is an option. So either you have all your own data or we go the route via the ORCA containers and extract certain features, like building footprints and heights and so on.
And that's then streamed in by the OCN platform into our OCTO 3D simulation environment. So this is basically our Blackshark SDK, which comes with a lot of features. And, of course, in this context I want to highlight the Unreal Globe plugin. This is still the main way to get hands on our data, our capabilities, and so on. But as I mentioned, we have static exports as well. But then we are limited in the size, of course.
So there is no way to export the entire globe in OBJ or FBX-- not possible. So either you have your own data lake or you stream it in. Apart from the Unreal Globe plugin, we have some other streaming connectors as well.
Omniverse here is very interesting. And as I mentioned, 3D Tiles is a very interesting format as well. Yeah, and from the plugin, of course, it's streamed into the Unreal engine, or it's embedded there. And you might use it alongside with all kinds of other awesome Unreal plugins and data and so on.
Or you use it in your existing 3D software or stream it into Omniverse and build your application on top of it-- so a high-level view of how the whole thing is built up, basically. And now let's have a closer look on the Unreal plugin. So you see the plugin here.
It's a screenshot. And the core functionality, of course, is that you can control lat-long, altitude, so lat-long, where you want to fly with the camera, the altitude. But also, yeah, I just screenshotted some parts of it. So all the registered areas of interest which we provide to you are easily accessible via the plugin-- all kinds of different terrain modes, object modes, to visualize it, either in a realistic way to visualize and expose the materials, the pure vectors, the terrain vectors versus synthesized terrain textures and so on and so on and so on. It's a small universe already for me as well.
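To make the lat-long and altitude camera controls more concrete: a globe renderer typically needs those geodetic coordinates as Cartesian positions. The Python sketch below uses the standard WGS 84 geodetic-to-ECEF conversion. Whether the Blackshark plugin uses this exact frame internally is an assumption, and `geodetic_to_ecef` is an illustrative name, not a plugin function.

```python
import math

# WGS 84 ellipsoid constants
WGS84_A = 6378137.0                 # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563       # flattening


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to ECEF x, y, z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + alt_m) * math.sin(lat)
    return x, y, z


# Camera 500 m above Barcelona (approximate coordinates).
print(geodetic_to_ecef(41.4036, 2.1744, 500.0))
```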
So the team is doing an awesome job here. Yeah, another example-- so the buildings rendered in boxes or with the roof type information or not, and so on. Yeah, that's a very interesting one. We just had a panel discussion before this session, and I mentioned that exposing all this metadata was a big leap for us because before that, it was kind of a black box.
And it looked nice, but it was a bit difficult for our customers to integrate it and, of course, to get hold of it. But as you can see, now we are exposing all kinds of metadata for each object and building. So what I did here is basically just right-click and show what's available in terms of metadata.
And either you use it via the API or a blueprint. So that's all there and possible. Yeah, some additional very important features I want to highlight here-- exclusion and overwrite polygon for buildings, terrain, vegetation, texture. So maybe you want to introduce your own buildings or a certain building because you have a much higher-detailed geospecific model of the building.
So you simply say, OK, I don't want to use this Blackshark building here but use mine instead. And you can manipulate the terrain. You can use your terrain data, your vegetation instead for certain areas. That's a very important feature for many use cases. Some of the larger cities come with some prebaked landmark buildings because they are so unique and relevant for their visual presence.
But, of course, you can introduce and also request additional ones from Blackshark. 3D asset instancing to keep the performance high, then flattening and manipulation-- so, for instance, around airports, you might want to flatten the runway, or you want to introduce a higher-resolution runway model you have at hand already. We just launched VR support. So the Varjo XR-3 is supported currently, which is interesting for many, many use cases as well.
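Going back to the exclusion and overwrite polygons mentioned above, the underlying operation can be sketched as a point-in-polygon filter. This Python example is an assumption-laden illustration, not the plugin's API: buildings are reduced to a hypothetical `centroid` field, coordinates are planar, and the ray-casting test is a standard technique swapped in for whatever the SDK actually does.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is the (x, y) point inside the polygon (list of vertices)?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def apply_exclusion(buildings, polygon):
    """Drop generated buildings whose centroid lies inside the exclusion polygon."""
    return [b for b in buildings if not point_in_polygon(b["centroid"], polygon)]


exclusion = [(0, 0), (10, 0), (10, 10), (0, 10)]
generated = [{"id": 1, "centroid": (5, 5)}, {"id": 2, "centroid": (15, 5)}]
print(apply_exclusion(generated, exclusion))
```

The freed area can then be filled with a custom, geospecific model in place of the generic building.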
And now I want to go a bit more in detail into our vision, how Blackshark can be used. So, basically, we expose our semantic features and the 3D canvas. And based on that, everyone can introduce procedurally their own assets, for instance. Or the most convenient way here is, of course, to use, for instance, Quixel assets and the procedural content generation capabilities of the Unreal Engine to come up with any kind of globe as you need it, basically, in a procedural, very large-scale manner. And you can replace, basically, all our stuff and so on if needed. And otherwise, you can go with our base map and base layer-- just some impressions.
So here, yeah, these are Quixel trees, for instance. And with Nanite, of course, the performance is still very high. And, yeah, we just introduced Raycasts for all kinds of use cases, for teleportation, line-of-sight simulation, whatsoever, all accessible via the SDK and the plugin. Next big step here will be to have some prebaked collision meshes for certain areas in different levels of detail. So what you get-- quick summary, end-to-end platform converting all kinds of sensor data into 3D terrains. If it's interesting to you, you can use our containerized object segmentation and feature extraction models and methods.
Scalable and cloud-native platform from the ground up, the automated 3D terrain generation pipeline and service, either as SYNTH3D or based on your own data-- everything ready to go at the global scale-- SYNTH3D, of course, based on Maxar Vivid data, and the Airport Generator, as I mentioned, and all that integrable and easy to use based on the toolkit and the Unreal Engine plugin. Now, how to get started-- if you're interested in that, contact us at Blackshark.ai. Or, of course, you send me a personal email. But this is our main channel for all the requests. And the bad news is you need to sign up in order to get access to our SDK, to the plugin. We need to know who you are.
Of course, we are very happy to hear from you to discuss your use case, your demands, what's missing, and so on, and very open to that. But still, we can't give access to it to everyone. So the last line-- so requests from sanctioned countries don't get access. And, of course, you need to have some kind of budget, use case, outlook for us.
If you have that, we have an early access program, which can be filled out online, includes an NDA, and off you go. Then you basically get the first data set with the SDK, the documentation. And you can start to test it, to integrate it, to give us feedback. And we can give you support for specific questions and so on. What's, of course, also necessary is some indication of what area of interest we are talking about. What size? Where is it? What update rates are you interested in? So is it OK that it's derived from a two-year-old satellite data set? Or we have, for instance, projects where the customer wants the most recent satellite data collection.
And then we can source it, for instance, from our partner, Maxar. Yeah, I think-- yeah, we can have a quick look into a demo as well. So what you see is running on my gaming laptop.
We are in Barcelona. Sagrada Familia is obviously a landmark building where we replaced the building, the generic building, with the high-detailed model. In this case, we used the 50-centimeter Vivid satellite data as ground texture. And here, of course, it's going to be interesting if we replace that procedurally with some ground textures.
As I mentioned, ready for nighttime simulation-- we can cast here. And basically, it all starts with the 3D terrain, so Maxar Vivid. And then our feature extraction capabilities detect the building volumes. And in the next step, then we synthesize it. Yeah, of course, I didn't mention it-- LOD management for the entire terrain, very interesting and relevant to keep the performance up for the buildings.
So we have currently 3D-- three levels of detail for the buildings. But you see here you can flexibly define the distance for the building rendering. I don't want to go too wild here on my gaming laptop. But let's try it.
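The distance-based level-of-detail selection for buildings can be illustrated with a small Python sketch. The three levels come from the talk; the threshold distances and the function name `select_lod` are hypothetical values chosen for the example, not the plugin's defaults.

```python
import bisect


def select_lod(distance_m, thresholds_m=(500.0, 2000.0, 8000.0)):
    """Pick a building LOD from camera distance.

    Returns 0 for the most detailed level, up to len(thresholds_m) - 1,
    or None when the building is beyond the last threshold and is not rendered.
    The default thresholds are illustrative assumptions.
    """
    index = bisect.bisect_right(thresholds_m, distance_m)
    return index if index < len(thresholds_m) else None


for d in (100.0, 1500.0, 5000.0, 10000.0):
    print(d, select_lod(d))
```

Raising the last threshold corresponds to extending the building render distance, which costs memory and frame time.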
Could use more RAM here. Yeah, and maybe let's switch to semantic mode or material mode on the ground. We can do the same for the buildings as well, expose the building materials, as I mentioned, for sensor simulation, for instance. Let's go to semantic buildings and satellite data. In this case, we placed our own vegetation, our own tree types. And we are about to introduce regional-specific tree types so that you don't need to take care of the right tree types, depending on the [INAUDIBLE] and the region where you want to get access to our globe.
And maybe let's also have a look on an airport. San Francisco-- because here I'm going to show you also-- so here you see the vectorized runway and marking. So here we get vectorized input data.
And we place it on top or beside the satellite data as ground texture. Of course, we can hone it here, [INAUDIBLE]. But it's autogenerated, as I mentioned, including the light system, as I mentioned.
Yeah, I think I pretty much covered all the relevant aspects. For more details, of course, just drop an email or let's discuss after the session if there are any. Thank you. [APPLAUSE]