User interface paradigms for manipulation of and interaction with a 3D audiovisual immersive environment

This research is being conducted primarily by Dalia El-Shimy and Jessica Ip, and is joint work with François Bérard, Zack Settel, Georgios Marentakis, and Stephen McAdams.


Motivated in part by questions raised through the AudioScape project, we are interested in interaction and control within a 3D environment. In particular, we need to develop an effective interface for object instantiation, positioning, viewing, and other parameter control that moves beyond the limited (and often bewilderingly complex) keyboard and mouse, especially within the context of performance. We can divide the problem into the actions (or gestures) the user needs to perform, the choice of sensor to acquire these input gestures, and the appropriate feedback to indicate to the user what has been recognized and/or performed.

As a starting point, the initial action we consider necessary for almost all others is object or target "selection". Thus, our first experiment will investigate the comparative performance benefits of graphical (e.g. Zhai's silk cursor approach), auditory (e.g. a click or beep when a target is acquired), and haptic (e.g. a vibration transduced through a Wii Remote) feedback, either individually or in combination. Next, we wish to consider how requirements change as one moves from a 2D (conventional screen) display, to an immersive (multi-screen) display, to a stereoscopic (3D) display, and similarly, as audio evolves from a monophonic, to a stereo, to a fully spatialized modality. Related questions concern target density and distance, particularly in a 3D environment. We consider a large range of possible input devices, although to reduce complexity in the early stages, we will likely confine ourselves to a Wii Remote and/or a motion capture system.
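As a rough illustration of the selection problem, one plausible acquisition criterion for a volume cursor such as Zhai's silk cursor is that the target falls entirely inside the semi-transparent cursor volume. The sketch below (a hypothetical test, not the study's actual implementation) checks a spherical target against an axis-aligned, box-shaped cursor:

```python
import numpy as np

def target_acquired(cursor_center, cursor_half_extent, target_center, target_radius):
    """Hypothetical acquisition test: a spherical target counts as
    acquired when it lies entirely inside an axis-aligned, box-shaped
    volume cursor."""
    offset = np.abs(np.asarray(target_center, float) - np.asarray(cursor_center, float))
    # On every axis, the target's offset plus its radius must fit
    # within the cursor's half-extent.
    return bool(np.all(offset + target_radius <= cursor_half_extent))
```

Feedback (graphical, auditory, or haptic) would then be triggered whenever this predicate becomes true.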

3D Interaction Experiment


We conducted a study comparing the performance of devices with different degrees of freedom (DoF) in a 3D placement task. Two display conditions were tested as separate experiments: a 1280x1024 LCD display and a stereo display using two 1024x768 projectors. In both, we compared four input devices for the placement of a silk cursor over a smaller red target: the mouse, the SpaceNavigator, a depth slider, and the Wii Remote. Each experiment was divided into several blocks whose difficulty was varied according to Fitts' law. During the experiments, each participant's performance (time and number of errors) was measured while biosignals were recorded passively.
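For reference, block difficulty under Fitts' law can be quantified by the index of difficulty. The sketch below uses the common Shannon formulation; the exact formulation and parameters used in the study are not specified here:

```python
import math

def index_of_difficulty(distance, width):
    """Fitts' index of difficulty in bits (Shannon formulation):
    blocks get harder as targets become farther away and/or smaller."""
    return math.log2(distance / width + 1)
```

For example, doubling the distance to a target of the same size raises the index and should lengthen placement time.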

Left to right: depth slider (with and without slider apparatus), SpaceNavigator, Wii Remote.

In both experiments, we displayed a perspective view of the scene in the acquisition task for all non-mouse devices. For the mouse, we divided the screen into four views: three corresponding to the top, side, and front orthographic views of the scene, and a fourth showing the perspective view. Since the mouse is limited to two DoF, these 2D views are necessary for mouse interaction. The rationale behind using different display modalities for the mouse and the other devices is to provide the optimal output mechanism for each device, allowing a fair comparison.

a) Three orthographic views and one perspective view for the mouse.

b) Maximized perspective view used for all other devices.
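The distinction between the two kinds of view can be sketched with minimal projection functions: an orthographic view simply drops the depth axis, while a perspective view divides by depth so distant objects shrink. This is a simplified pinhole model for illustration only, not the renderer used in the study:

```python
def orthographic_front(point):
    """Front orthographic view: discard depth (z) entirely,
    as in the mouse condition's 2D views."""
    x, y, z = point
    return (x, y)

def perspective(point, focal=1.0):
    """Simplified pinhole perspective: scale by focal length over depth,
    so the same offset appears smaller at greater depth."""
    x, y, z = point
    return (focal * x / z, focal * y / z)
```

The orthographic views make depth unambiguous on one axis per view, which is why they suit a 2-DoF device like the mouse.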

Experiment One

We hypothesized that the higher-DoF devices would outperform lower-DoF devices such as the mouse. We believed users would feel more comfortable interacting in 3D space with higher-DoF devices, and that this would be reflected in the performance results.

Experiment one results, colours denote type of device. Top left: Average time and number of errors. Top right: Average placement time and number of errors over trial blocks. Bottom left: Average biosignal readings. Bottom right: Average biosignal readings over trial blocks.

The results show the mouse outperforming all other devices. Given this unexpected outcome, we hypothesized that the mouse's advantage was due to the inappropriate and insufficient display mechanism provided for the higher-DoF devices. To follow up, we designed a second experiment using a stereoscopic display for the SpaceNavigator, depth slider, and Wii Remote. The display for the mouse condition remained the same, with four views (three orthographic and one perspective).

Experiment Two

For the second experiment, we hypothesized that participants would perform better with the higher-DoF devices given a more appropriate display output (stereoscopic projection). After the results of the first experiment, however, we were uncertain whether the mouse would still outperform the other devices.

Experiment two results, graphs correspond to previously shown graphs.

The results of the second experiment differed significantly from those of the first. Although the mouse still performed significantly better, the other devices improved substantially. From these results, we conclude that many factors (such as the display mechanism) contribute to the success of different devices in a 3D placement task.

Future Work

Future work will explore volume specification via user-plotted points around a target 3D shape using high-DoF devices. The user will draw a convex hull around a tumor or blood vessel in the brain, and accuracy will be measured by the area difference between the convex hull and the target within a given time limit.

Last update: April 27, 2009