Selection Interaction for visionOS with Unity PolySpatial — Apple Vision Pro Hand Interactions

Alberto Garcia
4 min read · Feb 8, 2024

--

Unity did it again. They have bridged development access to a new platform through an easy-to-use SDK. To address the needs of visionOS and its Shared Space (apps running side by side as multiple windows and interactable, independent 3D volumes), Unity created PolySpatial. As the name suggests, this collection of technologies targets most features that any AR or MR app could need. In this example, I will share one of the most fundamental interactions for any app: a selection, similar to a mouse click, performed by pinching the thumb and index finger together.

Before we start, you should know that Unity PolySpatial and visionOS support are only available to Unity Pro, Unity Enterprise, and Unity Industry users. You also need a Mac with Apple silicon (M1, M2, or later) and Xcode 15.2; if you don't have visionOS hardware, you can use the simulator built into Xcode. Here's a look at the rest of the requirements.

Let's set up the project in Unity 2022.3.18f1 or higher. I'm using URP and have installed the following packages:

  • com.unity.polyspatial (1.0.3)
  • com.unity.xr.visionos (1.0.3)
  • com.unity.polyspatial.visionos (1.0.3)
  • com.unity.polyspatial.xr (1.0.3)

Within the PolySpatial package in the Package Manager, you can find and import the samples. This example shows how to adapt the Balloon Gallery interaction for your own purposes.

This sample works by touching an object directly with the index finger or by tapping (pinching) your index finger and thumb together. The magic of the Vision Pro lies in its eye-tracking capabilities: your eyes act as the cursor, pointing at the exact object you want to interact with.

Let's take a look at the code and make our adjustments.

using Unity.PolySpatial.InputDevices;
using UnityEngine;
using UnityEngine.InputSystem.EnhancedTouch;
using UnityEngine.InputSystem.LowLevel;
using Touch = UnityEngine.InputSystem.EnhancedTouch.Touch;
using TouchPhase = UnityEngine.InputSystem.TouchPhase;

namespace PolySpatial.Samples
{
    public class GalleryInputManager : MonoBehaviour
    {
        [SerializeField]
        Transform m_InputAxisTransform;

        void OnEnable()
        {
            // enable enhanced touch support so we can poll active touches and their input phases
            EnhancedTouchSupport.Enable();
        }

        void Update()
        {
            var activeTouches = Touch.activeTouches;

            if (activeTouches.Count > 0)
            {
                var primaryTouchData = EnhancedSpatialPointerSupport.GetPointerState(activeTouches[0]);
                if (activeTouches[0].phase == TouchPhase.Began)
                {
                    // allow balloons to be popped with a poke or indirect pinch
                    if (primaryTouchData.Kind == SpatialPointerKind.IndirectPinch || primaryTouchData.Kind == SpatialPointerKind.Touch)
                    {
                        var balloonObject = primaryTouchData.targetObject;
                        if (balloonObject != null)
                        {
                            if (balloonObject.TryGetComponent(out BalloonBehavior balloon))
                            {
                                balloon.Pop();
                            }
                        }
                    }

                    // update input gizmo
                    m_InputAxisTransform.SetPositionAndRotation(primaryTouchData.interactionPosition, primaryTouchData.inputDeviceRotation);
                }

                // visualize input gizmo while input is maintained
                if (activeTouches[0].phase == TouchPhase.Moved)
                {
                    m_InputAxisTransform.SetPositionAndRotation(primaryTouchData.interactionPosition, primaryTouchData.inputDeviceRotation);
                }
            }
        }
    }
}

This script works alongside the BalloonBehavior script, calling its Pop() method to run the popping animation with the particle effect.
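The sample's BalloonBehavior is not reproduced here, but a hypothetical minimal version might look like the sketch below. The particle prefab field and its name are my assumptions, not the sample's actual code:

```csharp
using UnityEngine;

// Hypothetical sketch of a balloon component with a Pop() method.
public class BalloonBehavior : MonoBehaviour
{
    [SerializeField]
    private ParticleSystem m_PopParticlePrefab; // assumed: a burst effect prefab

    public void Pop()
    {
        // spawn the particle burst where the balloon was
        if (m_PopParticlePrefab != null)
            Instantiate(m_PopParticlePrefab, transform.position, Quaternion.identity);

        // remove the balloon itself
        Destroy(gameObject);
    }
}
```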

I will use this base script to create an interaction where clicking on three different game objects sets a path of pillars coming down to form a bridge. I will use a switch statement so that a single tag covers all three objects, and so that players are free to click the three elements in whatever order they choose. Let's take a look at the edited script. All of the changes occur within the Update method:

void Update()
{
    var activeTouches = Touch.activeTouches;

    if (activeTouches.Count > 0)
    {
        var primaryTouchData = EnhancedSpatialPointerSupport.GetPointerState(activeTouches[0]);
        if (activeTouches[0].phase == TouchPhase.Began)
        {
            // allow object to be interacted with a poke or indirect pinch
            if (primaryTouchData.Kind == SpatialPointerKind.IndirectPinch || primaryTouchData.Kind == SpatialPointerKind.Touch)
            {
                var interactedObject = primaryTouchData.targetObject;
                if (interactedObject != null && interactedObject.tag == "Track")
                {
                    if (interactedObject.TryGetComponent(out PillarButton pillar))
                    {
                        pillar.CallTracks();
                    }
                }
                else if (interactedObject != null && interactedObject.tag == "UIButton")
                {
                    if (interactedObject.TryGetComponent(out UIButton button))
                    {
                        button.UIButtonPress();
                    }
                }
                else if (interactedObject != null && interactedObject.tag == "ChangeScene")
                {
                    if (interactedObject.TryGetComponent(out ChangeScene loader))
                    {
                        loader.LoadMainScene();
                    }
                }
            }

            // update input gizmo
            m_InputAxisTransform.SetPositionAndRotation(primaryTouchData.interactionPosition, primaryTouchData.inputDeviceRotation);
        }

        // visualize input gizmo while input is maintained
        if (activeTouches[0].phase == TouchPhase.Moved)
        {
            m_InputAxisTransform.SetPositionAndRotation(primaryTouchData.interactionPosition, primaryTouchData.inputDeviceRotation);
        }
    }
}

Looking at how this new input system works: the Began touch phase starts a new interaction, and once an interacted object is detected, we can customize the target object's response by matching each tag to the script we expect to find on that object. The same interaction script can handle UI buttons and scene changes, as well as our example of setting the pillars for our bridge. Here is how the PillarButton script responds to this touch interaction:
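The UIButton and ChangeScene components referenced in Update are not shown in this article. As a hypothetical minimal sketch, they could look like the following; the method names match the calls in Update, but everything else (fields, scene name) is an assumption:

```csharp
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical UI button component; replace the log with your own response.
public class UIButton : MonoBehaviour
{
    public void UIButtonPress()
    {
        Debug.Log($"{name} pressed");
    }
}

// Hypothetical scene loader; "Main" is an assumed scene name that must
// exist in the project's build settings.
public class ChangeScene : MonoBehaviour
{
    [SerializeField]
    private string m_MainSceneName = "Main";

    public void LoadMainScene()
    {
        SceneManager.LoadScene(m_MainSceneName);
    }
}
```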

using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class PillarButton : MonoBehaviour
{
    [SerializeField]
    private int _pillarID;
    [SerializeField]
    private TrainTracksBehaviour _tracksScript;
    private Collider _thisCollider;

    private void Start()
    {
        _thisCollider = GetComponent<Collider>();
    }

    public void CallTracks()
    {
        switch (_pillarID)
        {
            case 0:
                _tracksScript.SetTrackOne();
                _thisCollider.enabled = false;
                break;
            case 1:
                _tracksScript.SetTrackTwo();
                _thisCollider.enabled = false;
                break;
            case 2:
                _tracksScript.SetTrackThree();
                _thisCollider.enabled = false;
                break;
        }
    }
}
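PillarButton delegates to a TrainTracksBehaviour that isn't shown in the article. A minimal sketch of what such a component could do is below; the field names, drop distance, and coroutine-based movement are all my assumptions, not the article's actual implementation:

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical sketch: each SetTrackX() lowers one pillar into place.
public class TrainTracksBehaviour : MonoBehaviour
{
    [SerializeField] private Transform m_PillarOne;
    [SerializeField] private Transform m_PillarTwo;
    [SerializeField] private Transform m_PillarThree;
    [SerializeField] private float m_DropDistance = 2f; // assumed meters to descend
    [SerializeField] private float m_DropSpeed = 1f;    // assumed meters per second

    public void SetTrackOne() => StartCoroutine(LowerPillar(m_PillarOne));
    public void SetTrackTwo() => StartCoroutine(LowerPillar(m_PillarTwo));
    public void SetTrackThree() => StartCoroutine(LowerPillar(m_PillarThree));

    private IEnumerator LowerPillar(Transform pillar)
    {
        var target = pillar.position + Vector3.down * m_DropDistance;
        while (pillar.position != target)
        {
            // move a little each frame until the pillar reaches its target
            pillar.position = Vector3.MoveTowards(pillar.position, target, m_DropSpeed * Time.deltaTime);
            yield return null;
        }
    }
}
```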

By using the switch statement, I can give each interactable button a _pillarID, let players click the buttons in any order, and fit all of this interaction within one script. This is our final result:

The PolySpatial package elegantly simplifies interacting with 3D objects on the Vision Pro. In the next article, I will discuss the manipulation sample that comes with PolySpatial and how to customize it for a specific use case.
