Lessons learnt from building Mixed Reality/AR applications for training

Having spent 5 years building instructor-led MR training applications, I thought it might be useful to jot down some of the lessons learnt. If you are interested in more or have specific questions, just drop me a message (simon(@)scenaria.live).

Summary (appropriate, as I am writing on Feb 14th):

KISS, KISS, KISS … (Keep It Simple Stupid).

By that I mean keep it simple for the instructor and trainees. Our analogy was the automatic gearbox: keeping it simple for the user meant that behind the scenes we had to overcome many difficult technical and logistical problems.

One thing I would say is that, despite the list that follows, headset-based Mixed Reality instructor-led training has the potential to deliver a good level of ROI.


Typically, our clients were corporates involved in industrial activities, with an initial budget of $20k for a 2+ person training solution (including hardware). Training took place anywhere there was space (offices, canteens, outside, training centres, the production floor), and the location could change at the last minute or during a session.

The training was instructor-led, so it always involved 2 or more synced MR devices (we used HoloLens, Nreal and iPad Pro).


From the client’s point of view: Reducing the time and cost of training, and improving its success, where PowerPoint or the 'real thing' could not be used. Being able to measure the ROI.

From the instructor’s point of view: Reliability and zero setup time (take the headset out of the box, place it on the trainee’s head and go). The ability to control the training event and to generate new content or changes quickly (well under 24 hours).

From the trainee’s point of view: Maximise the time and effort spent learning, and do it in a safe, supportive environment free of distractions.

Top lessons


With multiple users, changes to scenes (during the training) should happen within the individual's viewport. This meant that we kept changes either local to a user or resettable, so that each user could have a go. Generally, we also tried to keep away from any grand or spooky changes.

The ability to resize and reposition the entire training environment (or an element of it) to fit the changing physical space was important.
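To illustrate the idea, here is a minimal sketch (in Python for brevity, not our production code) of treating the whole training environment as children of a single root, so that one scale, rotation and offset refits everything to the room. The names `RootTransform` and `fit_scale`, and the 0.5 m safety margin, are assumptions made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class RootTransform:
    """Hypothetical transform applied to the root of the whole scene."""
    scale: float = 1.0               # uniform scale factor
    yaw_deg: float = 0.0             # rotation about the vertical (y) axis
    offset: tuple = (0.0, 0.0, 0.0)  # translation in metres

    def apply(self, point):
        """Map a point from scene space to room space: scale, rotate, translate."""
        x, y, z = (c * self.scale for c in point)
        a = math.radians(self.yaw_deg)
        rx = x * math.cos(a) + z * math.sin(a)
        rz = -x * math.sin(a) + z * math.cos(a)
        ox, oy, oz = self.offset
        return (rx + ox, y + oy, rz + oz)


def fit_scale(scene_w, scene_d, room_w, room_d, margin=0.5):
    """Largest uniform scale (capped at 1.0) that fits the scene footprint in the room."""
    usable_w = max(room_w - 2 * margin, 0.0)
    usable_d = max(room_d - 2 * margin, 0.0)
    return min(1.0, usable_w / scene_w, usable_d / scene_d)
```

For example, a scene with a 10 m by 6 m footprint moved into a 6 m by 4 m space would be shown at half size, because `fit_scale(10, 6, 6, 4)` returns 0.5. Every asset position then goes through the same `apply` call, so nothing has to be re-authored when the room changes.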

We found that we had a lot of assets in a scene, so we relied on:

  1. Very low polygon assets.
  2. Hiding assets not in use.
  3. Removing asset faces (where required).
  4. Switching scenes rather than cramming stuff into a single scene.

As supplied, the training material often required a huge amount of work to simplify and turn into usable content. This process required multiple iterations and absorbed most of the project time budget.

Trainers are constantly looking to change/adapt their training materials (often with little notice).

This would not have been economic if we had had to write C# code each time. Using a simple scripting language, downloadable assets and a prebuilt 3D 'scene player' made it possible.
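As a rough illustration of the 'scene player' idea, the sketch below interprets a tiny line-based script against a scene held as a dictionary. It is written in Python for brevity and is not our actual scripting language: the commands `show`, `hide` and `move`, and the function `run_script`, are hypothetical names invented for this example.

```python
def run_script(script, scene=None):
    """Apply a tiny line-based script to a scene dict and return the scene.

    Hypothetical commands: show <asset>, hide <asset>, move <asset> <x> <y> <z>.
    """
    scene = {} if scene is None else scene
    for line_no, raw in enumerate(script.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # allow comments and blank lines
        if not line:
            continue
        cmd, *args = line.split()
        if cmd in ("show", "hide") and len(args) == 1:
            asset = scene.setdefault(args[0], {"visible": False, "pos": (0.0, 0.0, 0.0)})
            asset["visible"] = (cmd == "show")
        elif cmd == "move" and len(args) == 4:
            asset = scene.setdefault(args[0], {"visible": False, "pos": (0.0, 0.0, 0.0)})
            asset["pos"] = tuple(float(v) for v in args[1:])
        else:
            # Fail loudly with the line number so content authors can fix the script.
            raise ValueError(f"line {line_no}: cannot parse {raw!r}")
    return scene
```

The point of the design is that a trainer's change request becomes an edit to a text file and perhaps a new downloadable asset, while the player installed on the headsets stays untouched.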



We gave up using user gestures to control actions in scenes. There were many reasons:

  1. The trainees didn’t know them and training them to use them got in the way of delivering the ‘real’ training.
  2. Instructors and trainees alike would have animated conversations, point, and do strange things like roll on the ground (to look under assets). Gestures were repeatedly being triggered at the wrong time.
  3. Some training activities required people to be reasonably close together and this led to falsely triggering each other's gestures.
  4. Hand tracking was of no use. Trainers and trainees often had to wear PPE (gloves, reflective jackets, face masks etc.). People would also hold things such as clipboards, coffee cups, tablets and phones (and make calls during training). Again, this just got in the way.

Even 'simple' gestures were unpredictable and led to confusion. 

In the end, we used head gaze and a 3rd party clicker to control all interactions. This also worked well as we moved to a more heterogeneous environment where different devices meant different gestures (or none) were available. 

It was relatively easy to manage the clickers and stop trainees from doing things. If all else failed, the instructor could simply remove them.
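For readers who have not built this kind of interaction, here is a minimal sketch of head gaze plus clicker selection: a ray is cast from the head along the gaze direction, and a click acts on the nearest target the ray passes through. It is in Python for brevity and is an illustration only. The spherical targets, the function names and the `clicker_enabled` flag (standing in for the instructor taking the clicker away) are all assumptions for this example.

```python
import math


def gaze_target(head_pos, gaze_dir, targets):
    """Return the name of the nearest target hit by the gaze ray, or None.

    targets maps a name to (centre, radius); each target is treated as a sphere.
    """
    length = math.sqrt(sum(c * c for c in gaze_dir))
    if length == 0:
        return None
    d = tuple(c / length for c in gaze_dir)  # normalised gaze direction
    best_name, best_t = None, math.inf
    for name, (centre, radius) in targets.items():
        oc = tuple(c - h for c, h in zip(centre, head_pos))
        t = sum(a * b for a, b in zip(oc, d))  # distance along the ray to closest approach
        if t < 0:
            continue  # target is behind the user
        dist_sq = sum(c * c for c in oc) - t * t  # squared distance from ray to centre
        if dist_sq <= radius * radius and t < best_t:
            best_name, best_t = name, t
    return best_name


def on_click(head_pos, gaze_dir, targets, clicker_enabled=True):
    """A click selects whatever is under the gaze, unless the clicker is disabled."""
    if not clicker_enabled:
        return None
    return gaze_target(head_pos, gaze_dir, targets)
```

Because the only inputs are a head pose and a button press, the same logic can run on any device that reports where the user is looking, which is part of why this approach coped with mixed hardware.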


Playback: Though it was a multiuser environment, we restricted the use of sounds to local headsets and specific prompts or video playback. We found that environmental noise (instructor, or 3rd party) would mean the trainee couldn’t hear or needed to replay the item to hear it.

With HoloLens 1 we found that trainees would put the headset on and inadvertently change the volume (the buttons are easy to catch); mostly they seemed to turn the sound off.

Attempting to sync audio to video and broadcast audio across multiple headsets suffered from lip-sync issues. We stopped using it.

Voice recognition: Suffered from external noise, false triggering and an inability to correctly recognise trigger commands. We stopped using it.

Other stuff:

IPD: Headsets were handed from person to person, and there was never any time to set up IPD.

Anchors: For a lot of reasons, we gave up using world or room anchors and instead relied on manual anchor syncing that was instructor-driven and easy to repeat during the session whenever positioning was lost.

We also found that this manual syncing was important as we moved to a heterogeneous headset environment (i.e. mixing multiple types/generations of headsets and iPads), where the manual syncing continued to work without issue.
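One way to picture manual syncing (a sketch under stated assumptions, not a description of our implementation) is that every device is pointed at the same physical reference spot and records where that spot sits in its own local coordinates. The shared scene is then defined relative to that spot, so each device can place it without any platform anchor service. In the Python below, the function name `make_scene_to_local` and the choice of a position plus a yaw angle as the recorded pose are assumptions for illustration.

```python
import math


def make_scene_to_local(marker_pos, marker_yaw_deg):
    """Build a function mapping shared scene coordinates into one device's local space.

    marker_pos and marker_yaw_deg are where this device observed the shared
    reference spot, expressed in the device's own coordinate system.
    """
    a = math.radians(marker_yaw_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    mx, my, mz = marker_pos

    def to_local(point):
        x, y, z = point
        # Rotate about the vertical axis, then translate to the observed marker position.
        return (mx + x * cos_a + z * sin_a, my + y, mz - x * sin_a + z * cos_a)

    return to_local
```

Re-syncing during a session is then just a matter of recording the reference pose again and rebuilding the mapping. Nothing in it is specific to one headset type, which fits with manual syncing surviving the move to mixed devices.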


Kiosk Mode: We never achieved kiosk mode on HoloLens 1, but we felt it was important for future projects to remove all 3rd party distractions, e.g. the OS interface, and thus prevent misinterpreted system-related gestures, audio prompts, pop-up messages, update prompts etc.

We have lost count of the number of training sessions spoilt by inadvertent bloom gestures on HoloLens 1.

Security: Our goal was zero setup time, so anything that got in the way of the trainer was removed or turned off. In this sort of environment physical security was more important than user access controls.   

Updates: Where possible we disabled all system OS updates/changes and general internet access for the headsets. Clients lost training sessions as the devices updated themselves.

  1. As delivered, the devices didn’t generally need any updates or security patches. We felt that it was more important to have a stable, fixed environment with as few changes as possible. Infrequent IT visits were used to do controlled manual updates.
  2. Devices were only turned on to do training, so they had no time for updates.
  3. Remote IT support for devices was made more difficult by the unpredictability of updates.


So that's just a quick list. If you are interested in more or have specific questions, just drop me a message (simon(@)scenaria.live).


