Mixed Reality is super important to VR!
Showing virtual reality to others is an incredibly important part of VR in the long run. Normally, we see a view from the player’s perspective, but that can be jarring to watch, especially since we turn our heads so quickly! Mixed reality is one of the best solutions to showing what it really feels like to be in VR!
Being in VR is great, but watching it in first person isn't optimal!
How is it done today?
This new frontier of mixed reality is in its infancy, but it's generally accomplished in video editing software by slapping video footage on top of engine footage, or by sandwiching video footage of the person playing in between two separate renders of the engine content — foreground and background. This is a big pain with some significant drawbacks: poor performance, reliance on external software (which is cumbersome and fragile), seams between the foreground and background footage, and an inability to cast consistent shadows across the separate renders. It also falls short for many intricate use cases, one major one being that the entire human being sits at a perfectly flat depth (more on that later).
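To make the flat-depth limitation concrete, here's a toy NumPy sketch of the sandwich approach (the function and array names are hypothetical, not from any real compositing tool): the keyed webcam layer gets pasted at a single depth between two engine renders, so the whole player is either entirely in front of or entirely behind any given object.

```python
import numpy as np

def sandwich_composite(fg_rgba, video_rgb, video_mask, bg_rgb):
    """Toy 'sandwich' composite: foreground render over keyed video over
    background render. The player occupies one flat depth layer."""
    out = bg_rgb.copy()
    # Paste every keyed player pixel on top of the background render;
    # there is no per-pixel depth, just one global layer order.
    out[video_mask] = video_rgb[video_mask]
    # The foreground render covers the player wherever it has any opacity.
    fg_opaque = fg_rgba[..., 3:4] > 0
    return np.where(fg_opaque, fg_rgba[..., :3], out)
```

Because the whole player lives in one layer, a hand reaching past a foreground object can't sort in front of it while the body stays behind — which is exactly the problem per-pixel depth solves.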
First glimpse at Owlchemy’s solution
We knew there had to be a better solution. Introducing Owlchemy's new method of doing mixed reality: Depth-based Realtime In-app Mixed Reality Compositing. Using a special stereo depth camera, everything is done in-engine: a custom plugin and a custom shader green-screen the user and depth-sort them directly into the engine itself!
Correct full-body depth sorting! Hiding under desks without an HMD! All in Unity! Madness!
What are the advantages?
- Per-pixel depth!
- We can actually sort the player properly in complex environments, rather than having a flat representation! You can see the hand reaching forward over a filing cabinet while the rest of the body sorts behind. Knowing the depth, we can avoid so many sorting and occlusion problems!
- No extra software to stream!
- No need for OBS for real-time compositing and no need to spend days compositing in After Effects in post! That means streaming setups can be simplified, extra apps can be cut from the content pipeline, and the entire process of producing mixed reality content gets much easier!
- No compositing complications! (seams between layers and shadow issues)
- Using the sandwiching method, you lose the ability to cast a shadow between the foreground and background cameras. With the rendering happening in-engine, shadows act as they should and there are no seams.
- Static or tracked dolly camera mode!
- We can dolly the camera with a tracked controller to allow for shots with motion!
- All on one machine!
- No need for multiple machines to composite mixed reality footage!
- Doesn’t actually require wearing the HMD!
- Without needing cues from tracked controllers or a tracked HMD, you can still stand in a VR scene. This also means you can have multiple users standing in mixed reality without any hardware on them.
- Now that the user is in-engine, we can utilize dynamic lighting to achieve more believable results!
Lighting! Real-time shaky cam! Excuse the green screen quality. It's a $100 green screen setup! 🙂
How does it work?
Using a stereo depth camera (in this case a ZED Stereo Camera), we record both video and depth data of the user on a green screen at 1080p and 30fps. The stereo depth camera is essential, since infrared-based cameras (Kinect, etc.) can and will interfere with many VR tracking solutions. We then pump the stereo data in real time into Unity using a custom plugin, and a custom shader cuts out and depth-sorts the user directly in the engine renderer, which yields this amazing result. Not only is there no external application required to set this up, but you don't even need to be wearing an HMD for this to work.
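The core per-pixel test can be sketched in a few lines. This is just an illustrative NumPy mock-up under our own assumptions (a crude green-dominance key and depth maps already aligned to the engine camera), not Owlchemy's actual shader or plugin: for each pixel, show the camera feed only where it isn't green screen and the camera's measured depth is closer than the engine's depth buffer.

```python
import numpy as np

def chroma_key_green(rgb, margin=40):
    """Crude green-dominance key (hypothetical threshold); a production
    keyer would handle spill and sensor noise far more gracefully."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (g - r > margin) & (g - b > margin)

def mixed_reality_composite(cam_rgb, cam_depth, engine_rgb, engine_depth):
    """Per-pixel composite: the camera pixel wins only where it is not
    green screen AND physically closer than the rendered scene geometry."""
    is_green = chroma_key_green(cam_rgb)
    player = (~is_green) & (cam_depth < engine_depth)
    return np.where(player[..., None], cam_rgb, engine_rgb)
```

This is why a hand can reach in front of a filing cabinet while the torso sorts behind it: each pixel carries its own depth, so occlusion falls out of a single comparison instead of a fixed layer order.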
Multiple people in mixed reality! Brainy!
Developing this pipeline was a large technical challenge, and we encountered many potentially show-stopping problems: wrangling 1080p video with depth data into Unity at 30fps without impacting performance, so that the user in VR can still hit 90fps in their HMD; calibrating the camera and video, which was a deeply complicated issue; and syncing the depth feed with the engine renderer so they align properly in the final result. After significant research and engineering we were able to solve these problems, and the result is definitely worth the deep dive.

Job Simulator presents a huge problem for mixed reality because its close-quarters environments surround the player in all directions and players can interact with the entire world. Essentially, Job Simulator is the worst-case scenario for mixed reality: we can't get away with simple foreground/background sorting where the player is essentially static and all the action happens in front of them (a la Space Pirate Trainer). If a solution can work for Job Simulator, it can likely be a universal solution for all content.
Who will benefit from this?
We are super excited about all the possibilities this unlocks for mixed reality. Streamers and content creators will have a much simpler, more performant pipeline for getting mixed reality content into the wild. They'll also be able to address their audience in a VR environment without an HMD blocking their face. Complex VR scenes can be correctly sorted with real-world objects brought into view of the camera. Conference-goers will have a much easier time understanding what's going on under the HMD. The possibilities this unlocks are hard to predict, and we likely haven't yet seen some of the other valuable use cases for this method!
Great! How do I get it?
We'll need more time to flesh out our tech, but rest assured, we have plans in the works to eventually share some of it outside the walls of Owlchemy Labs.
UPDATE! Mixed Reality Part 2 *NEW*
Check out our latest update on our mixed reality tech and sign up for the beta here: