whitesmith.co

ARKit introduction

Ricardo Pereira

At WWDC17, Apple announced a new iOS framework called ARKit. It's a framework that "allows you to easily create unparalleled augmented reality experiences for iPhone and iPad". The framework ships with iOS 11 (currently in beta) and is only supported on iOS devices powered by Apple's A9 or A10 chips. This means it won't work on older devices like the iPhone 5S or iPad Mini. In addition, you can't use it in the simulator, so you have to update your iPhone or iPad to the latest beta (the iOS 11 SDK is only available in Xcode 9).

AR - creating the illusion that virtual objects are placed in a physical world using a camera.

Augmented Reality isn't new, but AR is getting a lot of buzz right now because of Apple's new framework. One of the first, and probably the most famous, apps to show us the power of AR was Pokemon Go. Building apps with the same interactivity as Pokemon Go isn't easy, and that's why I think ARKit will make a difference.

With this new framework, AR becomes more accessible to developers because iOS now has native AR support. ARKit estimates lighting using the camera sensor, analyses what the camera sees to find horizontal planes like tables and floors, and can place and track objects on anchor points. You can render 3D objects using Metal or SceneKit, or with third-party tools like Unity and Unreal Engine. ARKit does all of this with great performance and it's well documented.

Need some ideas for using ARKit? Check madewitharkit to get a feel for what's possible when you incorporate this framework into your apps.

Measure objects using ARKit

One of the projects I really liked was the "AR Measure App Demo":

They created a virtual ruler that is precise when compared to a real one, and I was stunned. I thought to myself: "I need to try this!", so I decided to create my own measuring app using ARKit.

I started by watching the Introducing ARKit: Augmented Reality for iOS video from WWDC17. Then I read the documentation and played with the demo app (Placing Objects in Augmented Reality). After that, I had a sense of what I could use and how things work. One good tip I took from the demo: scene units map to meters in ARKit.

The distance between two nodes

I wanted a basic app where you simply tap on the screen to select points and it calculates the distance in meters between the last tap and the previous one. So, I created a new project using Swift and SceneKit:

Create project step 1
Create project step 2

The "Augmented Reality App" template provides us with base code to start from. There's a ViewController that implements the AR scene view delegate (ARSCNViewDelegate) and it already has an IBOutlet for an ARSCNView, which is nice because that's the view for displaying AR with 3D SceneKit content using the camera.

Disclaimer: I've only played with SceneKit once, so I have some basic knowledge of it. If you don't have that knowledge, or any experience with 3D rendering in Metal, OpenGL or Unity, then I suggest you look at one of them before playing with ARKit, because it will help you understand the code that I'll present (for example, 3D concepts like vectors and matrices and the general operations that can be performed on them).

I removed the current scene that loads the ship.scn asset from viewDidLoad because I want to start with a clean environment (nothing in the camera view).

Then I added a UITapGestureRecognizer to the main view to recognize a tap gesture for adding a node. An SCNNode is "a structural element of a scene graph, representing a position and transform in a 3D coordinate space", to which it's possible to attach geometry, lights, cameras, or other displayable content. I decided to use a sphere as the geometry. I wanted the node 10 cm in front of the camera, so I needed the current frame to get the position and orientation of the camera in world coordinate space.

Red is the `x` axis, green is the `y` axis and blue is the `z` axis.

To achieve the 10 cm translation, I needed to set the z component of the matrix's fourth column. In camera space the camera looks along the negative z axis, so a negative value places the node in front of the camera and a positive value places it behind. If you use 0, the node sits exactly at the current camera position.

@objc func handleTapGesture(sender: UITapGestureRecognizer) {
    if sender.state != .ended {
        return
    }
    guard let currentFrame = sceneView.session.currentFrame else {
        return
    }
    // Create a transform with a translation of 0.1 meters (10 cm) in front of the camera
    var translation = matrix_identity_float4x4
    translation.columns.3.z = -0.1
    // Create a small red sphere and add it to the scene at the computed transform
    let sphere = SCNSphere(radius: 0.005)
    sphere.firstMaterial?.diffuse.contents = UIColor.red
    sphere.firstMaterial?.lightingModel = .constant
    sphere.firstMaterial?.isDoubleSided = true
    let sphereNode = SCNNode(geometry: sphere)
    sphereNode.simdTransform = matrix_multiply(currentFrame.camera.transform, translation)
    sceneView.scene.rootNode.addChildNode(sphereNode)
}

The translation is a matrix with 4 rows and 4 columns. This is how a point in 3D space is represented so that transformations like translation, scaling, rotation, reflection, skewing, and combinations of these can be applied (you can learn more by searching for OpenGL Matrices).
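
To make the matrix step more concrete, here is a minimal sketch in plain Swift, without ARKit or simd, of what multiplying the camera transform by the translation does. The Matrix4 type and the camera pose are hypothetical stand-ins invented for this illustration; the type only mimics how matrix_float4x4 stores its data as four columns, with the position in the fourth one.

```swift
// Hypothetical stand-in for simd's matrix_float4x4: four columns of four floats.
struct Matrix4 {
    var columns: [[Float]]

    // Identity matrix, the counterpart of matrix_identity_float4x4.
    static let identity = Matrix4(columns: [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ])

    // Standard matrix product: element (row, col) is the sum over k of lhs(row, k) * rhs(k, col).
    static func * (lhs: Matrix4, rhs: Matrix4) -> Matrix4 {
        var result = Matrix4(columns: Array(repeating: [Float](repeating: 0, count: 4), count: 4))
        for col in 0..<4 {
            for row in 0..<4 {
                var sum: Float = 0
                for k in 0..<4 {
                    sum += lhs.columns[k][row] * rhs.columns[col][k]
                }
                result.columns[col][row] = sum
            }
        }
        return result
    }

    // The position encoded by a transform is the x, y, z of its fourth column.
    var position: (x: Float, y: Float, z: Float) {
        return (columns[3][0], columns[3][1], columns[3][2])
    }
}

// Assumed camera pose for the example: no rotation, 1.5 m above the world origin.
var camera = Matrix4.identity
camera.columns[3] = [0, 1.5, 0, 1]

// 10 cm along the camera's negative z axis, i.e. in front of the camera.
var translation = Matrix4.identity
translation.columns[3][2] = -0.1

let nodeTransform = camera * translation
print(nodeTransform.position)
```

With this unrotated camera, the node ends up at x = 0, y = 1.5, z = -0.1: the camera's own position shifted 10 cm along its viewing direction. When the camera is rotated, the same multiplication moves the node along wherever the camera is actually looking, which is why the translation is applied in camera space rather than added to the world position.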

The final step was calculating the distance between two nodes. The distance between two points can be calculated using the three-dimensional Euclidean distance formula:

Euclidean Distance formula in 3D

I subtracted the end node position from the start node position (two 3D vectors), resulting in a new vector, and then I applied the formula |a| = sqrt((ax * ax) + (ay * ay) + (az * az)). This gives you the length of the resulting vector, which is the same as the distance between node A and node B.

func distance(startNode: SCNNode, endNode: SCNNode) -> Float {  
    let vector = SCNVector3Make(startNode.position.x - endNode.position.x, startNode.position.y - endNode.position.y, startNode.position.z - endNode.position.z)
    // Scene units map to meters in ARKit.
    return sqrtf(vector.x * vector.x + vector.y * vector.y + vector.z * vector.z)
}

self.distanceLabel.text = String(format: "%.2f", distance(startNode: nodeA, endNode: nodeB)) + "m"  

You can check this implementation here.
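
As a quick sanity check of the formula, here is a small self-contained sketch that runs without SceneKit. Point3 is a hypothetical stand-in for SCNVector3, and the two positions are made-up values chosen so the result is easy to verify by hand (a 3-4-5 triangle scaled down to 0.3, 0.4 and 0.5 meters).

```swift
import Foundation

// Hypothetical stand-in for SCNVector3 so the example runs without SceneKit.
struct Point3 {
    var x: Float
    var y: Float
    var z: Float
}

// Length of the vector from `start` to `end`, in scene units (meters in ARKit).
func distance(from start: Point3, to end: Point3) -> Float {
    let dx = end.x - start.x
    let dy = end.y - start.y
    let dz = end.z - start.z
    return (dx * dx + dy * dy + dz * dz).squareRoot()
}

// Made-up positions: 30 cm apart on x and 40 cm apart on y.
let nodeA = Point3(x: 0, y: 0, z: 0)
let nodeB = Point3(x: 0.3, y: 0.4, z: 0)

let meters = distance(from: nodeA, to: nodeB)
// Same formatting as the distance label: two decimal places plus the unit.
let label = String(format: "%.2f", meters) + "m"
print(label)
```

The order of the subtraction doesn't matter here, because every component is squared before being summed.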

Enhance measurement

After the first implementation, I noticed that the measurement wasn't precise because you can't guarantee that node A and node B are on the same surface. To solve that, I needed the plane detection feature. Vertical plane detection isn't supported (yet), but it's possible to activate horizontal plane detection with one line of code, configuration.planeDetection = .horizontal 🙌, and then ARKit will automatically add, update or remove plane anchors in the current session. You can observe those changes by implementing the session(_:didAdd:), session(_:didUpdate:) and session(_:didRemove:) methods of the ARSessionDelegate protocol (make sure you check that the anchor parameter is an ARPlaneAnchor).

The user should know when a horizontal plane is available in order to start adding points for measurement. Apple's ARKit demo implements a square indicator, which I thought was available through the sceneView.debugOptions property, but it's not.

Plane detection in action

So, I borrowed the FocusSquare class from Apple's demo.

Finally, the last question: how do you place a node on the nearest plane? I already knew how to place a node where the camera is, but how would I get the distance to the nearest plane? The answer is hitTest(_:types:). This method searches the camera image for valid surfaces at a specified point in view coordinates and returns a list of hit test results ordered from nearest to farthest (in distance from the camera).

let planeHitTestResults = sceneView.hitTest(view.center, types: .existingPlaneUsingExtent)  
if let result = planeHitTestResults.first {  
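    // positionFromTransform is not part of SceneKit; it is assumed to be a helper extension
    // (like the one in Apple's demo) that reads x, y, z from the transform's fourth column.
    let hitPosition = SCNVector3.positionFromTransform(result.worldTransform)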
    let sphere = SCNSphere(radius: 0.005)
    sphere.firstMaterial?.diffuse.contents = UIColor.red
    sphere.firstMaterial?.lightingModel = .constant
    sphere.firstMaterial?.isDoubleSided = true
    let node = SCNNode(geometry: sphere)
    node.position = hitPosition //<--
    sceneView.scene.rootNode.addChildNode(node)
}

I used the center of the main view as the aim point and, as the default plane, I took the first item of the list (the nearest plane).
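
ARKit's hit test is internal to the framework, but the geometry behind "find where the aim point lands on a horizontal plane" can be illustrated with a plain ray and plane intersection. The sketch below is not ARKit's algorithm and ignores the plane's extent; Vec3, Ray and intersectHorizontalPlane are hypothetical names, and the camera setup is made up for the example.

```swift
struct Vec3 {
    var x: Float
    var y: Float
    var z: Float
}

// A ray leaving the camera: origin is the camera position,
// direction is where the aimed screen point looks in world space.
struct Ray {
    var origin: Vec3
    var direction: Vec3
}

// Returns the point where the ray meets the horizontal plane y = planeY,
// or nil when the ray is parallel to the plane or the plane lies behind the camera.
func intersectHorizontalPlane(_ ray: Ray, planeY: Float) -> Vec3? {
    // A ray with no vertical component never reaches a horizontal plane.
    guard abs(ray.direction.y) > 1e-6 else { return nil }
    // Solve origin.y + t * direction.y == planeY for t.
    let t = (planeY - ray.origin.y) / ray.direction.y
    // A negative t means the intersection is behind the camera.
    guard t >= 0 else { return nil }
    return Vec3(x: ray.origin.x + t * ray.direction.x,
                y: planeY,
                z: ray.origin.z + t * ray.direction.z)
}

// Assumed setup: camera 1 m above a floor at y = 0, looking forward and down at 45 degrees.
let ray = Ray(origin: Vec3(x: 0, y: 1, z: 0),
              direction: Vec3(x: 0, y: -1, z: -1))

if let hit = intersectHorizontalPlane(ray, planeY: 0) {
    print("Hit at", hit.x, hit.y, hit.z)
}
```

In this setup the ray reaches the floor 1 m in front of the camera, at x = 0, y = 0, z = -1. Two points placed this way on the same plane share the same height, which is what makes the measured distance between them more reliable than with nodes floating in front of the camera.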

As a final touch, I implemented the session(_:cameraDidChangeTrackingState:) method, which lets you observe changes in the device's position tracking state.

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {  
    switch camera.trackingState {
    case .notAvailable:
        trackingStateLabel.text = "Tracking not available"
        trackingStateLabel.textColor = .red
    case .normal:
        trackingStateLabel.text = "Tracking normal"
        trackingStateLabel.textColor = .green
    case .limited(let reason):
        switch reason {
        case .excessiveMotion:
            trackingStateLabel.text = "Tracking limited: excessive motion"
        case .insufficientFeatures:
            trackingStateLabel.text = "Tracking limited: insufficient features"
        case .none:
            trackingStateLabel.text = "Tracking limited"
        case .initializing:
            trackingStateLabel.text = "Tracking limited: initializing"
        }
        trackingStateLabel.textColor = .yellow
    }
}

You can check the full project and test it on your device.

I'm really impressed with the power of AR. There are plenty of use cases to explore. If you have a project where we can help you with AR, don't hesitate to contact us.

