Good video about Vision Pro by this new creator I found over the last few days:
{{video https://www.youtube.com/watch?v=rElOETrXpaM}}
I think the object permanence of apps will be one of the “idioms” of spatial computing that ends up just making sense, the kind of thing we won’t look back from. You can imagine having certain apps strewn about your house and certain apps scattered around your office, much like you would physical objects. Our minds are used to mapping physical spaces, and I bet after a few generations of this tech we’ll realize this is a more comfortable way to manage human-computer interaction. It might not always be the most efficient, but I think it may become the most intuitive.
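For what that permanence looks like on the developer side, visionOS exposes world anchors through ARKit, which the system persists across launches. A minimal sketch, assuming an immersive space and a placeholder transform (none of this comes from the video), might look like:

```swift
import ARKit
import simd

// Minimal sketch: pin a piece of content to a fixed spot in the room.
// The transform passed in is a placeholder for wherever the user left the app.
func pinContent(at originFromContent: simd_float4x4) async throws {
    let session = ARKitSession()
    let worldTracking = WorldTrackingProvider()
    try await session.run([worldTracking])

    // WorldAnchors are persisted by the system across app launches,
    // which is what gives an app its "object permanence" in a space.
    let anchor = WorldAnchor(originFromAnchorTransform: originFromContent)
    try await worldTracking.addAnchor(anchor)

    // React when the anchor is (re)localized and place content there.
    for await update in worldTracking.anchorUpdates {
        if update.anchor.id == anchor.id, update.anchor.isTracked {
            // Position a RealityKit entity at update.anchor.originFromAnchorTransform.
        }
    }
}
```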
Related to this is object permanence in multi-user experiences. I’m glad, but not surprised, to see Apple thinking carefully about the perspectives of multiple users in remote experiences, both in terms of audio and video. If I’m not looking at you, you should be able to see that. The same applies to in-person multi-user experiences: if multiple users are wearing these headsets in the same room, there needs to be a shared perspective (and graceful ways to manage unshared experiences). In other words, pointing at something needs to work. I expect them to take this thinking into the greater world (see ARGeoAnchor in ARKit).
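The geo-anchor piece is already shipping in ARKit on iOS. A rough sketch of pinning shared content to a real-world coordinate, with made-up coordinates purely for illustration:

```swift
import ARKit
import CoreLocation

// Rough sketch: anchor content to a real-world coordinate so every user's
// device resolves it to the same physical spot. Coordinates are placeholders.
func addSharedLandmark(to session: ARSession) {
    ARGeoTrackingConfiguration.checkAvailability { available, _ in
        guard available else { return }

        session.run(ARGeoTrackingConfiguration())

        // Everyone who adds this anchor sees it at the same place on Earth,
        // which is the kind of shared perspective "pointing at something" needs.
        let coordinate = CLLocationCoordinate2D(latitude: 34.0522, longitude: -118.2437)
        let landmark = ARGeoAnchor(coordinate: coordinate)
        session.add(anchor: landmark)
    }
}
```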
Like you alluded to, I think the most worrisome thing in these reviews is the mention of friction with using eye tracking as the main input method. I suspect our eyes move more with our thoughts than with our actions, and we may have to align the two (and thus slow our thinking) to work with this interface. To be fair, this is kind of how the existing mouse interaction works: you can’t really click without checking that your pointer is where you expect it to be. But notably, proficient computer users try to stay on the keyboard as much as possible so they can have the comfort of separating actions from where their eyes are looking, which is more efficient. Perhaps the solution will lie in changing the interface (something other than buttons and text fields that isn’t just voice), or in some new version of a mouse that lets us move in 3D and use tactility to navigate the interface without having to look so deliberately.
After going to the XR Guild event today (yesterday), participating in the Designing for Themed Entertainment and Experiences class last night, watching all these Vision Pro reviews, and thinking about how to design and build MIX both for locational AR and for visionOS… the thoughts have been brewing for a while.
Quite exhausted, but hopefully I’ll have opportunities to put them down soon and flesh out more of my perspective on what spatial interfaces can be, and on spatial computing in the built environment.