
I'm creating an iOS app in which I intend to display realistic, high-quality renders in AR. While experimenting with these three options, I'm still unsure which of them I should build my app's framework around: SceneKit, RealityKit or Metal.

I've read that SceneKit is built on top of Metal, but I'm not sure whether it's worth the time and effort to program custom shaders as opposed to using what SceneKit offers by default. Regarding RealityKit, I don't need any of the animations or special effects that it offers, just the photorealistic rendering side.

Which of the three is the best to develop for AR and High-Quality Model Rendering?

1 Answer

Updated: September 01, 2023.

TL;DR


This post is not only about RealityKit, SceneKit & Metal, but also about related frameworks.



RealityKit


RealityKit is the youngest SDK in Apple's family of rendering technologies. This high-level framework was released in 2019. It's made for AR / VR projects, has simplified settings for multi-user experiences and can be used on iOS, visionOS, macOS and tvOS. RealityKit performs multithreaded rendering and async loading, and provides a networking layer to implement state synchronization. The Vision Pro headset supports foveated rendering based on the RealityKit engine.

There's no Objective-C legacy: RealityKit supports Swift and the Metal Shading Language, and has a rather declarative syntax, like the one in SwiftUI. The main advantage of RealityKit is its ability to customize Reality Composer Pro scenes, and it can be a powerful supplement to ARKit. In RealityKit, the main content is entities (ModelEntity, AnchorEntity, TriggerVolume, BodyTrackedEntity, PointLight, SpotLight, DirectionalLight and PerspectiveCamera) that carry components; entities like ModelEntity can also be created from resources. There are significantly more components in visionOS than in iOS/macOS.

The framework runs the Entity Component System (ECS) on the CPU to manage tasks like physics, animations, audio processing, and network synchronization. But RealityKit also relies heavily on GPU hardware to perform multithreaded rendering. RealityKit has eight materials: MaterialX (an open standard launched by ILM), UnlitMaterial, SimpleMaterial, PortalMaterial, PhysicallyBasedMaterial (with 18 AOVs for controlling surface look), OcclusionMaterial, VideoMaterial and, of course, CustomMaterial.
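
Here's a minimal sketch (with assumed, purely illustrative values) of configuring a PhysicallyBasedMaterial and applying it to a generated primitive:

import RealityKit
import UIKit

// A metallic, low-roughness PBR material (assumed values).
var pbr = PhysicallyBasedMaterial()
pbr.baseColor = .init(tint: .systemTeal)
pbr.roughness = .init(floatLiteral: 0.2)
pbr.metallic = .init(floatLiteral: 1.0)

// Apply it to a generated sphere.
let pbrSphere = ModelEntity(mesh: .generateSphere(radius: 0.1),
                            materials: [pbr])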


Pay particular attention to shadows on iOS – devices with A9...A11 chipsets produce projective (a.k.a. depth map) shadows, but on devices with A12+ and M1+ chipsets we can render raytraced shadows. Fake (pre-baked) shadows may also be an option. Many contemporary render features in RealityKit are On by default: camera's depth of field, face/person occlusion, grounding shadows, motion blur, camera grain, etc. Note that keeping the main CPU thread below a 16 ms timeframe does not guarantee 60 fps performance.
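
If some of these default effects are too costly for your scene, they can be switched off through the view's render options. A minimal sketch, assuming arView is an existing ARView:

import RealityKit

// Opt out of a few default render effects to save GPU time.
func disableCostlyEffects(in arView: ARView) {
    arView.renderOptions.insert(.disableDepthOfField)
    arView.renderOptions.insert(.disableMotionBlur)
    arView.renderOptions.insert(.disableCameraGrain)
    arView.renderOptions.insert(.disableGroundingShadows)
}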


RealityKit reads in the .usdz, .rcproject and .reality file formats. It supports transform and asset animation, rigid body dynamics, mesh descriptors, HDR Image Based Lighting, a particle system (at the moment particles work in visionOS and tvOS only), a playback animation controller, PBR materials, raycasting and spatial audio. All scene models must be tethered to anchors. The framework automatically generates and uses mipmaps – a series of progressively lower-resolution variants of objects' textures that improve render times when applied to distant objects. RealityKit works with the polygonal mesh generated by the Scene Reconstruction feature. I also want to add a few words about AR Quick Look – a zero-configuration framework that's built on the RealityKit engine and conceived as a tool for fast AR visualization.

visionOS app's sample code:

import SwiftUI
import RealityKit

struct ContentView: View {
    var body: some View {
        RealityView { content in
            let model = ModelEntity(mesh: .generateSphere(radius: 0.1))
            model.position.z = -1.0
           
            let anchor = AnchorEntity(.head)
            anchor.anchoring.trackingMode = .continuous
            anchor.addChild(model)
            
            content.add(anchor)
        }
    }
}

Conclusion: RealityKit gives you high-quality render technology and up-to-date AR capabilities out of the box. It supports the LiDAR Scanner and Photogrammetry tools, and plays Reality Composer's behaviors through its Notification API. RealityKit can be used as a standalone framework or as a partner of ARKit and MetalKit. It has access to fragment/pixel shaders and geometry modifiers via Metal scripting and CustomMaterials. The Reality family software has CLI and GUI tools for fast and easy USDZ conversion. RealityKit apps now also run in Xcode's visionOS simulator.

RealityKit works with UIKit storyboards and with SwiftUI interfaces. It requires a minimum of boilerplate code. For example, RealityKit has a very simple setup for models' collisions and gestures (pan, rotate, pinch), including alternative 2D gestures, and it supports tracked raycasting. It also favors composition over inheritance, so it's rather a Protocol-Oriented Programming framework – tight coupling in most cases is no longer a problem in your code. RealityKit fits perfectly with the Combine reactive paradigm, which helps handle publishers, subscribers and asynchronous events. A prime example is the subscribe(to:on:_:) generic instance method, which returns an object representing the subscription to an event stream, like SceneEvents.Update.self, triggered once per frame interval (60 fps).
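
A minimal sketch of such a subscription, assuming arView is an existing ARView and the returned Cancellable is stored somewhere with a sufficiently long lifetime:

import RealityKit
import Combine

var updateSubscription: Cancellable?

func subscribeToFrameUpdates(in arView: ARView) {
    // Fires once per rendered frame; event.deltaTime is the time since the previous frame.
    updateSubscription = arView.scene.subscribe(to: SceneEvents.Update.self) { event in
        print("Frame delta:", event.deltaTime)
    }
}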

RealityKit's visionOS views are RealityView and Model3D.

@available(xrOS 1.0, *)
@available(iOS, unavailable)
public struct RealityView<Content> : View where Content : View 

@available(xrOS 1.0, *)
@available(iOS, unavailable)
public struct Model3D<Content> : View where Content : View

RealityKit's iOS/macOS view is ARView.

@available(OSX 10.15, iOS 13.0, *)
@objc open class ARView : ARViewBase

typealias ARViewBase = NSView            // AppKit view
typealias ARViewBase = UIView            // UIKit view


SceneKit


SceneKit is a high-level framework as well – the oldest one in Apple's family of rendering technologies. It was released in 2012. SceneKit was conceived for VR and can be run on iOS, macOS, tvOS and watchOS. For AR projects you can use it only in conjunction with ARKit. SceneKit supports both Objective-C and Swift. In SceneKit the main unit is a node (SCNNode class) that has its own hierarchy and can store a light (SCNLight), a camera (SCNCamera), a geometry (SCNGeometry), a particle system (SCNParticleSystem), or audio players (SCNAudioPlayer). The main advantage of SceneKit is that it's highly customizable: it can change geometry and materials at runtime, it has morphers, skinners and constraints, and it renders a scene at up to 120 fps. There are Blinn, Constant, Lambert, Phong, ShadowOnly and PBR shading models.

An occlusion shader is also available in SceneKit, but only in a custom form (there's no out-of-the-box occlusion material here like the one we can find in RealityKit). If you need a video material in an SCNScene, assign an AVPlayer object to the material's diffuse.contents, or use a SpriteKit scene with an SKVideoNode.
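
A minimal sketch of the AVPlayer approach, assuming a hypothetical intro.mp4 bundled with the app:

import SceneKit
import AVFoundation

let videoURL = Bundle.main.url(forResource: "intro", withExtension: "mp4")!
let player = AVPlayer(url: videoURL)

// A 16:9 plane whose diffuse channel plays the video.
let videoPlane = SCNNode(geometry: SCNPlane(width: 1.6, height: 0.9))
videoPlane.geometry?.firstMaterial?.diffuse.contents = player
player.play()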


SceneKit allows you render such effects as beautiful deep depth of field, a cinematic vignette effect, screen space reflections, or even CoreImage filters with animated parameters. Also we can use a SCNProgram object to perform custom rendering – it's a complete Metal or OpenGL shader program that replaces SceneKit's rendering of a material or even geometry. SceneKit's reliable companion is a Model I/O library that carries out import, export, and models' manipulation using a common infrastructure. SceneKit performs a single-threaded rendering (it uses a secondary thread, not the main thread, of course).
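
For instance, here's a minimal sketch of attaching an animatable Core Image filter to a node (sphereNode is an assumed, already existing SCNNode; the values are illustrative):

import SceneKit
import CoreImage
import QuartzCore

let blur = CIFilter(name: "CIGaussianBlur")!
blur.setValue(8.0, forKey: kCIInputRadiusKey)
blur.name = "blur"                          // the name becomes part of the animation keyPath
sphereNode.filters = [blur]

// Animate the filter parameter via the "filters.<filterName>.<parameterName>" keyPath.
let animation = CABasicAnimation(keyPath: "filters.blur.inputRadius")
animation.fromValue = 8.0
animation.toValue = 0.0
animation.duration = 1.5
sphereNode.addAnimation(animation, forKey: nil)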

iOS app's sample code:

import SwiftUI
import SceneKit

struct ContentView: View {
    @State private var scene = SCNScene()
    let options: SceneView.Options = [.autoenablesDefaultLighting]
    let sphere = SCNNode(geometry: SCNSphere(radius: 0.1))

    func createSphere() {
        sphere.geometry?.firstMaterial?.diffuse.contents = UIColor.purple
        scene.rootNode.addChildNode(sphere)
    }
    var body: some View {
        SceneView(scene: scene, options: options)
            .ignoresSafeArea()
            .onAppear {
                self.createSphere()
            }
    }
}

SceneKit reads in several file formats, including .usdz, .dae and .scn. It supports nested asset animation, dynamics, particles, PBR materials, HDR IBL and spatial audio. For implicit and explicit transform animation of any node you can use the SCNAction, SCNTransaction and CAAnimation classes, though collision setup in SceneKit is a little bit complicated. To create a modular and scalable game architecture with SceneKit we need to implement GameplayKit's entity-component pattern.
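
A minimal sketch of the SCNAction approach (sphereNode is an assumed, already existing SCNNode):

import SceneKit

// Move the node up while spinning it around its Y axis, both over two seconds.
let move = SCNAction.moveBy(x: 0, y: 0.5, z: 0, duration: 2)
let spin = SCNAction.rotateBy(x: 0, y: .pi * 2, z: 0, duration: 2)
sphereNode.runAction(.group([move, spin]))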

Conclusion: SceneKit gives you high-quality render technology (but at first you need to set up physicallyBased shaders), although for AR projects you can use it only with ARKit. SceneKit is highly customizable, can be used with Swift and Objective-C, and gives you a set of useful renderer(...) instance methods coming from the ARSCNViewDelegate protocol that allow you to update AR models and tracked anchors at 60 fps. It works with UIKit and SwiftUI (despite the fact that there's no SceneKit+SwiftUI template in Xcode). There are obvious signs that Apple might deprecate this framework within the next 2 years – SceneKit hasn't been significantly updated since 2017, excluding minor changes like the clearCoat material property, SSR and SwiftUI's SceneView. Note that in visionOS, SceneKit content isn't 3D; it's placed in a 2D window.

         SceneView(scene: SCNScene? = nil,
             pointOfView: SCNNode? = nil,
                 options: SceneView.Options = [],
preferredFramesPerSecond: Int = 60,
        antialiasingMode: SCNAntialiasingMode = .multisampling4X,
                delegate: SCNSceneRendererDelegate? = nil,
               technique: SCNTechnique? = nil)

While RealityKit is getting more and more mature, SceneKit still boasts some advantages over it. At least one thing that Swift developers tend to forget is that Objective-C SceneKit apps ensure faster compile times.

SceneKit's views are SCNView and SceneView.

@available(iOS 8.0, tvOS 9.0, *)
open class SCNView : UIView, SCNSceneRenderer, SCNTechniqueSupport 
 
@available(OSX 10.8, *)
open class SCNView : NSView, SCNSceneRenderer, SCNTechniqueSupport

@available(iOS 14.0, macOS 11.0, tvOS 14.0, watchOS 7.0, *)
public struct SceneView : View


Metal | MetalKit


To be precise, Metal is not a rendering technology but rather a GPU accelerator with the ability to use a rich shading language (MSL). It was released in 2014. It's a low-level framework. Metal is implemented everywhere – in RealityKit, SceneKit, ARKit, CoreML, Vision, AVFoundation, etc. Metal combines functions similar to the deprecated OpenGL ES and OpenCL under the hood of just one API. Of course, Metal can be used as a renderer for advanced 3D graphics. Metal renders not only reflections but also refractions and subsurface scattering phenomena.

According to Apple documentation: "Metal is a C++ based programming language that developers can use to write code that is executed on the GPU for graphics and general-purpose data-parallel computations. Since Metal is based on C++, developers will find it familiar and easy to use. With Metal, both graphics and compute programs can be written with a single, unified language, which allows tighter integration between the two."

In addition to Metal, you can use the MetalKit module (released in 2015), which helps build Metal apps quicker and easier using far less code. It renders graphics in a standard Metal view, loads textures from many sources, and works efficiently with models provided by the Model I/O framework.
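
A minimal sketch of configuring such a view, assuming the default system GPU is available (values are illustrative):

import MetalKit

let device = MTLCreateSystemDefaultDevice()!
let mtkView = MTKView(frame: CGRect(x: 0, y: 0, width: 400, height: 400),
                      device: device)
mtkView.clearColor = MTLClearColor(red: 0.1, green: 0.1, blue: 0.12, alpha: 1.0)
mtkView.colorPixelFormat = .bgra8Unorm
mtkView.preferredFramesPerSecond = 60
// A custom MTKViewDelegate would then issue draw calls every frame.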

Metal begins to shine when you need to render a considerably greater number of polygons or 3D particles than SceneKit or RealityKit is capable of rendering.

You might be surprised to learn that Metal can be used not only in the Apple ecosystem but also on Windows. Here's a link where you can download the Metal Developer Tools for Windows.

Sample code:

import MetalKit
    
// `Primitive` and `Scene` are assumed custom helper classes, not framework types.
class RedCube: Primitive { ... }

class CubeScene: Scene {

    override init(device: MTLDevice) {           
        super.init(device: device)
            
        let redCube = RedCube(withDevice: device)
        objects.append(redCube)
        redCube.translate(direction: float3(0,0,-10))
        add(child: redCube)
    }
    override func render(commandEncoder: MTLRenderCommandEncoder, 
                              deltaTime: Float) {
        objects.forEach { 
            $0.rotate(angle: deltaTime, axis: float3(1, 1,-1)) 
        }
        super.render(commandEncoder: commandEncoder, deltaTime: deltaTime)
    }
}

Conclusion: Developers usually use the Metal framework to implement high-quality GPU rendering for games with sophisticated 3D environments, for video-processing apps like Final Cut Pro and Nuke, for 3D apps like Maya, or for big-data apps that must perform well in scientific research. Consider that ray tracing in Metal is much quicker than in RealityKit.

MetalKit's native view is MTKView.

@available(iOS 9.0, tvOS 9.0, *)
open class MTKView : UIView, NSCoding, CALayerDelegate

@available(OSX 10.11, *)
open class MTKView : NSView, NSCoding, CALayerDelegate


USD + Hydra


Pixar's Universal Scene Description isn't just another file format for describing 3D scenes; it's rather an open and extensible ecosystem for describing, composing, simulating, and collaborating within a 3D environment. USD files represent the scene as a hierarchy of primitives (or prims), i.e. scene layout, geometry, materials, animations, lights, cameras, transform hierarchies, etc. You can create USD files either through a UI (in Reality Composer Pro, Maya or Blender) or programmatically in Python. The key feature of USD is the ability to compose layers. The main power of the layer paradigm is non-destructive editing: you can start with an existing scene and add new layers containing your edits.

At the moment there are four members in the Universal Scene Description family:

  • USDA (ascii)
  • USDC and USD (binary)
  • USDZ (zip archive)

Another important thing about USD is that it comes with Hydra, which is a medium between the current scene graph and a renderer. In other words, Hydra is a modern rendering architecture optimized for handling very large scenes. Thanks to Hydra, the USD file format supports almost any renderer – Storm, Metal, RenderMan, Vulkan – i.e. Hydra enables communication between multiple renderers and scene graphs.


Here's what Pixar's official documentation says about Hydra:

Hydra is the imaging framework that ships as part of the USD distribution. It connects scene delegates (that consume scene data) and render delegates (that send the scene data to particular renderers), in such a way that render and scene delegates can be mixed and matched as applications and consumers’ needs dictate.


Hydra’s first and primary render delegate is the rasterizing Storm renderer, which began as a modern OpenGL renderer, and which has now incorporated a “graphics interface” abstraction that allows Storm to use Vulkan, Metal, and potentially any rasterizing rendering API’s. Storm is highly scalable, multi-pass, and uses OpenSubdiv for mesh rendering.


SpriteKit


SpriteKit is Apple's node-based framework for creating and rendering 2D games and 2D graphics. It was released in 2013. You can use SpriteKit as a standalone API or together with SceneKit and ARKit. Its main feature is the ability to draw sprites with physics, 2D text and shapes, images and video, and even to rasterize SceneKit scenes. In SpriteKit you can write code in Objective-C or Swift.

The SwiftUI view that renders a SpriteKit scene looks like this:

        SpriteView(scene: SKScene,
              transition: SKTransition? = nil,
                isPaused: Bool = false,
preferredFramesPerSecond: Int = 60,
                 options: SpriteView.Options = [.shouldCullNonVisibleNodes],
            debugOptions: SpriteView.DebugOptions,
            shouldRender: @escaping (TimeInterval) -> Bool = { _ in true } )
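
A minimal usage sketch with a hypothetical one-label SKScene subclass:

import SwiftUI
import SpriteKit

// A tiny scene: a purple background with a single text node in the middle.
class HelloScene: SKScene {
    override func didMove(to view: SKView) {
        backgroundColor = .purple
        let label = SKLabelNode(text: "Hello, SpriteKit!")
        label.position = CGPoint(x: frame.midX, y: frame.midY)
        addChild(label)
    }
}

struct SpriteContentView: View {
    var body: some View {
        SpriteView(scene: HelloScene(size: CGSize(width: 300, height: 400)))
            .ignoresSafeArea()
    }
}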

Official documentation: "SpriteKit is a general-purpose 2D framework that leverages Metal to achieve high-performance rendering, while offering a simple programming interface to make it easy to create games and other graphics-intensive apps. Using a rich set of animations and physics behaviors, you can quickly add life to your visual elements and gracefully transition between screens".

SpriteKit works with two view types (UIKit and AppKit), and a SwiftUI version is available too.

@available(iOS 7.0, tvOS 9.0, *)
open class SKView : UIView

@available(OSX 10.9, *)
open class SKView : NSView

@available(iOS 14.0, macOS 11.0, tvOS 14.0, watchOS 7.0, *)
public struct SpriteView : View


ARKit


Speaking of RealityKit and SceneKit, it's impossible not to mention ARKit. iOS RealityKit, like iOS ARKit, is based on ARSession and ARConfiguration objects; thus, when ARKit is updated, RealityKit is updated too. iOS ARKit is the umbrella framework that includes SceneKit, SpriteKit, AVFoundation, Vision, UIKit and CoreMotion dependencies. ARKit is available in both languages – Swift and Objective-C. The framework supports the same 4x4 matrices as SceneKit, RealityKit and Metal.

ARKit's collaboration with the NearbyInteraction framework allows users to acquire the precise position of nearby devices with U1+ chips (you can get distance, direction and IDs), using the visual capabilities of ARKit. isCameraAssistanceEnabled is a Boolean property that combines the spatial awareness of ARKit with NearbyInteraction to improve the accuracy of a nearby object's position.

import NearbyInteraction
import ARKit
    
let session = NISession()
    
@available(iOS 16.0, *)
private func acceptDiscoveryToken(_ peerDiscoveryToken: NIDiscoveryToken) {
    let config = NINearbyPeerConfiguration(peerToken: peerDiscoveryToken)
    config.isCameraAssistanceEnabled = true
    session.run(config)
}

ARKit, like ARCore, is good at geospatial tracking. Since RealityKit does not currently have its own equivalent of ARGeoAnchor, for geospatial experiences RealityKit can only be used in conjunction with ARKit.

There's no rendering engine inside ARKit. This module is only responsible for high-quality camera/object tracking and scene understanding (plane detection, ray-casting, scene reconstruction and light estimation). However, ARKit is capable of processing the data of a canonical face mesh (ARFaceGeometry class) as well as the vertices, normals, triangular faces, and classifications of a reconstructed geometry (ARMeshGeometry class). ARKit can also capture a 4K HDR video feed.
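
A minimal sketch of enabling scene reconstruction, assuming arView is an existing RealityKit ARView running on a LiDAR-equipped device:

import ARKit
import RealityKit

func runSceneReconstruction(on arView: ARView) {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal, .vertical]
    // Scene reconstruction requires a LiDAR Scanner, so check for support first.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
        config.sceneReconstruction = .meshWithClassification
    }
    arView.session.run(config)
}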

Here are some types of views ARKit is able to work with – ARSCNView, ARSKView, ARView, and MTKView.

@available(iOS 11.0, *)
open class ARSCNView : SCNView, ARSessionProviding

@available(iOS 11.0, *)
open class ARSKView : SKView, ARSessionProviding

@available(iOS 13.0, *)
@objc open class ARView : ARViewBase

@available(iOS 9.0, *)
open class MTKView : UIView, NSCoding, CALayerDelegate

If you need more info on ARKit and its capabilities, please read this post.


RoomPlan


RoomPlan is a magic wand for architects, interior/game designers, real estate agents and e-commerce representatives. Powered by ARKit and RealityKit, RoomPlan is a framework that utilizes the RGB camera and LiDAR Scanner on an iOS device to create a 3D floor plan of a room, including such characteristics as dimensions and object types. Its machine learning algorithm is capable of discerning walls, windows, doors, stairs, household appliances, furniture and plumbing.

The framework's on-screen instructions explain how to move the device to capture an interior. When the scan is done, the view displays a small-scale version of the scanned room for the user to approve. RoomPlan outputs the result as a parametric USDZ model, which makes it easy to modify individual components of the scanned room in 3D software.

Sample code in Swift:

import UIKit
import RoomPlan

// Inside a UIViewController conforming to RoomCaptureViewDelegate & RoomCaptureSessionDelegate.

private var rcView: RoomCaptureView!
private var rcSessionConfig = RoomCaptureSession.Configuration()
private var isScanning: Bool = false

private func setupRoomCaptureView() {
    rcView = RoomCaptureView(frame: self.view.bounds)
    rcView.captureSession.delegate = self
    rcView.delegate = self

    self.view.insertSubview(rcView, at: 0)
}

private func startSession() {
    self.isScanning = true
    rcView?.captureSession.run(configuration: rcSessionConfig)
}

RoomPlan uses RealityKit's ARView and also has its own dedicated view.

@available(OSX 10.15, iOS 13.0, *)
@objc open class ARView : ARViewBase

@available(iOS 16.0, *)
@objc class RoomCaptureView : UIView


AR Quick Look


AR Quick Look is a zero-config framework that uses RealityKit's rendering engine to display AR experiences. Built-in iOS apps, like Safari, Mail or Notes, use AR Quick Look to display .reality and .usdz models with animation, spatial sound, contact shadows and PBR materials. Unlike RealityKit, AR Quick Look automatically plays an animation back if one exists. It also boasts an extra levitation gesture and fully integrated Apple Pay.

You can embed AR Quick Look views into the mobile version of your website with the HTML rel="ar" attribute.

<html>
    <body>
        <div>
            <a href="/folder/character.usdz" rel="ar">
                <img src="/folder/char-image.jpg">
            </a>
        </div>
    </body>
</html>

To use it in your iOS app, you need an instance of QLPreviewController and QLPreviewItem.
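
A minimal sketch, assuming a hypothetical toy_robot.usdz file bundled with the app:

import UIKit
import QuickLook
import ARKit

class QuickLookViewController: UIViewController, QLPreviewControllerDataSource {

    // Call this (e.g. from a button action) to present the AR Quick Look preview.
    func presentARQuickLook() {
        let previewController = QLPreviewController()
        previewController.dataSource = self
        present(previewController, animated: true)
    }

    func numberOfPreviewItems(in controller: QLPreviewController) -> Int { 1 }

    func previewController(_ controller: QLPreviewController,
                           previewItemAt index: Int) -> QLPreviewItem {
        // Hypothetical model shipped in the app bundle.
        let url = Bundle.main.url(forResource: "toy_robot", withExtension: "usdz")!
        return ARQuickLookPreviewItem(fileAt: url)
    }
}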

AR QuickLook's native view controller.

@available(iOS 13.0, *)
class QLPreviewController : UIViewController
  • That's a great answer! I am using SceneKit mainly to create custom geometry. It's sad that the LiDAR features are not available for SceneKit. Also wondering what role SceneKit will play in the future ... – Peter Pohlmann Nov 15 '20 at 15:38
  • Thank you very much Andy for your detailed answer. – Acammm Mar 24 '21 at 07:18
  • @PeterPohlmann it seems LiDAR stuff is available in SceneKit to some degree though. e.g. you can use `arSceneView.session.run(configuration)` where configuration is of type `ARWorldTrackingConfiguration` which can `configuration.sceneReconstruction = .meshWithClassification` – grantespo May 26 '21 at 17:46
  • Super neat answer! – Moose Aug 01 '21 at 06:18
  • @PeterPohlmann You can forget SceneKit. RealityKit will be its replacement since RealityKit offers a .nonAR mode. SceneKit is single-threaded, while RealityKit is multi-threaded. I learned about the .nonAR mode a few days ago. – Darkwonder Jul 07 '22 at 13:33
  • SceneKit still outperforms RealityKit... loading more than 1 complex 3D model in RealityKit results in excessive memory leaks and crashes compared to SceneKit. Not usable. Filed a radar for that. https://feedbackassistant.apple.com/feedback/9239141 Also no USDZ export from RealityKit and many other missing features. And this year no significant updates for RealityKit. So I'm not even sure if RealityKit will be running on the Glasses,... – Peter Pohlmann Jul 26 '22 at 10:32
  • @PeterPohlmann, I have three questions: 1. What models did you use – `usdz` or `reality`? 2. What texture size did these models have? 3. And have you turned off `Depth-Of-Field`, `Occlusion`, `ContactShadows` and `MotionBlur` in RealityKit's `.renderOptions`? – Andy Jazz Jul 26 '22 at 10:42
  • It's all USDZ files. I made a test app to load multiple copies of Apple's standard USDZ assets. Just ran it again and it looks much better now. Not many leaks. Should have done it before I commented but it was an issue at the time. Anyhow I'm stuck with SceneKit because of the USDZ export feature but other than that RealityKit would be my choice now obviously. But we should all learn Metal I guess... – Peter Pohlmann Jul 26 '22 at 11:03
  • @AndyJazz Awesome synopsis -- so what is your update for 2023+ ?? – mobibob May 08 '23 at 23:21
  • @mobibob, I think it'll be something like xrOS (realityOS) – Andy Jazz May 09 '23 at 06:35