
2021, 3(5): 397–406. Published: 2021-10-20. DOI: 10.1016/j.vrih.2021.09.001

Abstract

Background
In mega-biodiverse environments, where different species are more likely to be heard than seen, species monitoring is generally performed using bioacoustics methodologies. Furthermore, since bird vocalizations are reasonable estimators of biodiversity, their monitoring is of great importance in the formulation of conservation policies. However, birdsong recognition is an arduous task that requires dedicated training to achieve mastery, and such training is costly in time and money because the relevant information is difficult to access, whether on field trips or in specialized databases. Immersive technology based on virtual reality (VR) and spatial audio may improve species monitoring by enhancing information accessibility, interaction, and user engagement.
Methods
This study used spatial audio, a Bluetooth controller, and a head-mounted display (HMD) to conduct an immersive training experience in VR. Participants moved inside a virtual world using a Bluetooth controller, while their task was to recognize targeted birdsongs. We measured the accuracy of recognition and user engagement according to the User Engagement Scale.
Results
The experimental results revealed significantly higher engagement and accuracy for participants in the VR-based training system than in a traditional computer-based training system. All four dimensions of the user engagement scale received high ratings from the participants, suggesting that VR-based training provides a motivating and attractive environment for learning demanding tasks when the design appropriately exploits the sensory system and the interactivity of virtual reality.
Conclusions
The accuracy and engagement of the VR-based training system were significantly higher than those of traditional training. Future research will focus on developing a variety of realistic ecosystems and their associated birds to increase the information on newer bird species within the training system. Finally, the proposed VR-based training system must be tested with additional participants and over a longer duration to measure information recall and recognition mastery among users.

Content

1 Introduction
Colombia ranks among the twelve most megadiverse nations and is second in the world in overall biodiversity[1]. In addition, Colombia is the most avian-diverse country, owing to its nearly two thousand different species of birds[2]. However, the list of endangered species has been growing every year due to the reduction of natural forest, which is lost to illegal deforestation driven by rapid urbanization, the rise of extractive industries, and farming[3]. Because bird presence is a reliable ecological indicator, bird identification is of great importance for biodiversity monitoring. Presently, bioacoustics is the most widely used methodology to monitor birds, as most of them are easier to hear than to see in the field[4].
Few professionals develop the ability to recognize the sounds of many bird species, which makes monitoring avian activity through sound difficult. Furthermore, training volunteers entails additional costs and lengthy learning processes that may take more than a year. Indeed, learning to identify and tag birds is not an easy task because it requires constant practice[4]. There are two traditional methods of acquiring the required knowledge: real and virtual. The first involves guided sighting hikes to perform visual and acoustic observations of birds. This traditional method must be supported by expert guides in the field, which incurs additional costs. The second comprises active learning through remote study of photographs and sound databases to identify bird species. Unfortunately, both methods have limited accessibility and interactivity[5]. High associated costs reduce accessibility. For example, field trips are expensive and depend on the availability of expert guides and appropriate environmental conditions, so time and money are wasted if the visit is not handled appropriately. In addition, access to specialized birdsong databases is restricted and expensive. Moreover, both training methods depend strongly on expert guides, who are often not experienced educators and fall short when sharing knowledge or motivating trainees; this translates into reduced interactivity.
These challenges motivated the creation of an immersive birdsong recognition training process. The aim was to create a systematic interactive training program that would allow individuals interested in birds to learn and identify them in an interactive virtual reality environment. The system focuses on improving information accessibility and interactivity using inexpensive equipment that can be purchased and used easily by anyone.
1.1 Immersive technology for training
The notion of immersive technology refers to any technology that blurs the boundary between the real and virtual worlds by immersing the user fully or partially in a virtual experience[6]. Virtual reality (VR) is based on a computer-generated three-dimensional environment that completely simulates a new version of the physical world. Users who explore this type of environment interact with the virtual world as if they were exploring it in reality, using different displays and input devices. Therefore, virtual environments can provide a rich, interactive, and engaging context that can support experiential learning. Moreover, the ability of VR to engage users through enjoyable multimodal experiences that immerse them in focused tasks can lead to higher mastery and retention of knowledge[7]. Although virtual reality has been around for over 40 years, it has become available for consumer use only recently. Various devices have been introduced to display immersive content, from expensive head-mounted displays (HMDs) such as the HTC and Oculus headsets to economical optical adapters that use available mobile devices as displays. The most economically accessible and adaptable way to use VR technology is through cardboard-based adapters, such as Google Cardboard, which are built from inexpensive DIY materials. By using this type of optical adapter, the training system becomes more accessible for general use.
However, previous research on VR training has reported negative responses associated with immersive technology use. For example, researchers found that some HMD users experienced motion sickness, physical discomfort, and/or cognitive overload[8,9]. Therefore, the appropriate design of immersive experiences must take into account the limits of perceptual comfort and avoid overstimulating users. Researchers have also suggested that, for training purposes, immersive experiences should incorporate different stages of learning tasks[10]. In this way, trainees will seek to accomplish increasingly difficult tasks that provide them with a sense of self-progress, which may enhance the learning process and user engagement. In contrast, reported positive outcomes of using immersive technology for training include improved effectiveness and task performance, a more positive attitude toward the learning task, reduced task completion time, and increased accuracy rates[9-13]. Although the widespread use of immersive technology is at an early stage, designing training tasks that take advantage of the experiential learning potential of VR can achieve systematically positive results where traditional training is highly inaccessible.
1.2 Spatial audio for HMD-based VR
Spatial audio refers to 360° sound reproduction around the listener. Spatial audio resembles what we hear in real life and has the potential to enhance the listener's immersion in virtual experiences. Specifically, for HMD-based VR experiences, spatial audio enhances the interaction by using dynamic binaural audio synthesis based on head-related transfer functions (HRTFs), virtual loudspeakers, and headset orientation data[14]. However, HRTFs are highly personalized and require individualized calibration using expensive equipment. To avoid this limitation, generic HRTFs are usually employed to render sufficiently convincing spatialized sound in VR[15]. Such audio synthesis can provide realistic immersive experiences with standard headphones, mobile phones, and cardboard headsets. Thus, the experience is enhanced because the virtual world images and other visual stimuli can be associated with the location and distance of the corresponding sound source.
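To make the idea concrete, the following is a minimal Java sketch of binaural panning under a crude spherical-head approximation (the Woodworth ITD formula plus a simple interaural level difference). It illustrates the principle only; the study itself relies on generic HRTF rendering, and the head radius, attenuation factor, and class name here are our assumptions.

```java
// Minimal binaural panning sketch: an illustrative approximation
// (Woodworth spherical-head ITD plus a crude ILD gain), NOT the
// HRTF-based renderer used in the study.
public final class SimpleBinauralPanner {

    private static final double HEAD_RADIUS_M = 0.0875; // average head radius
    private static final double SPEED_OF_SOUND = 343.0; // m/s

    /** Renders a mono signal to stereo for a source at the given azimuth
     *  (radians, 0 = front, positive = right; valid for |azimuth| <= pi/2). */
    public static double[][] render(double[] mono, int sampleRate, double azimuth) {
        // Interaural time difference (Woodworth formula).
        double itdSeconds = (HEAD_RADIUS_M / SPEED_OF_SOUND)
                * (Math.abs(azimuth) + Math.sin(Math.abs(azimuth)));
        int delaySamples = (int) Math.round(itdSeconds * sampleRate);

        // Crude interaural level difference: attenuate the far ear.
        double farGain = 1.0 - 0.4 * Math.abs(Math.sin(azimuth));

        double[] left = new double[mono.length + delaySamples];
        double[] right = new double[mono.length + delaySamples];
        boolean sourceOnRight = azimuth > 0;
        for (int i = 0; i < mono.length; i++) {
            if (sourceOnRight) {
                right[i] += mono[i];                          // near ear: direct
                left[i + delaySamples] += mono[i] * farGain;  // far ear: delayed, quieter
            } else {
                left[i] += mono[i];
                right[i + delaySamples] += mono[i] * farGain;
            }
        }
        return new double[][] { left, right };
    }
}
```

In a dynamic HMD scenario, the azimuth would be recomputed from the headset orientation on every frame, which is what makes the binaural synthesis feel anchored in the virtual world.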
2 Methods
We conducted a between-subjects study that aimed to compare the engagement and accuracy of birdsong recognition using a traditional training system against a training system based on VR. Participants were assigned to one of two groups, and both groups had the same learning goals. The learning steps were as follows: First, participants heard a birdsong associated with a particular bird; then, they heard another audio file with different sound sources and tried to identify the time when the targeted bird sang. The test audio file was labeled by an expert to provide a ground truth in which all the birds that sang were correctly recognized. Finally, all participants responded to the user engagement scale test to assess their perceived engagement during the training activity. The participants were between 18 and 54 years of age and had normal visual and auditory perception. The University Committee on Human Research approved the study, and the subjects voluntarily agreed to participate by giving verbal informed consent after the experimenter explained the project in general terms.
2.1 Equipment
The participants wore an economical optical adapter (VR box virtual reality glasses) into which a smartphone was inserted to serve as the HMD. In addition, they used 7.1 headphones (VanTop Technology and Innovation Co. Ltd., Shenzhen, Guangdong) to reproduce the spatial audio. The participants interacted with the virtual world using a standard Bluetooth controller connected to the smartphone (Figure 1). The controller had four buttons and a multidirectional stick that was used to navigate the VR environment. The four buttons were programmed as triggers for bird recognition in the VR-based environment. The VR mobile application was developed for the Android operating system. The mobile phone used in the training test was a standard Android phone (A51, Samsung, South Korea) with a 1080×2400 (FHD+) screen resolution, 4GB of RAM, and an octa-core (2.3GHz) processor.
2.2 Training task
The training task began by presenting information corresponding to a particular bird, i.e., the target bird. This information included the scientific name, common name, an image of the bird, and audio samples of the specific birdsong. This learning phase was identical for both experimental groups, the traditional training group and the VR-based immersive training group. The participants could remain at this learning stage until they were confident in their ability to identify the birdsong. Once they decided that they could recognize the birdsong, the participants entered the next phase of the task, the test. For both conditions, the duration of this phase was identical (278s), corresponding to the reproduction of an audio file containing the song of the target bird, repeated several times, along with other birds. The influence of time on the task was avoided by controlling the duration of the test. For the traditional training, a simple computer interface was designed and implemented in Java to allow participants to play an audio file and mark the onset time at which the target birdsong was reproduced; participants made this labeling decision by pressing the space bar. The reproduced audio was a single-microphone recording, the most common condition in traditional training.
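The paper does not reproduce the Java interface, but its core mechanism, logging a timestamp on every space-bar press for later comparison against the expert labels, can be sketched as follows. The class name and layout are hypothetical, and audio playback is elided.

```java
import javax.swing.*;
import java.awt.event.*;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the traditional training interface: plays back a
// recording and logs the elapsed time whenever the space bar is pressed.
public class OnsetLabeler extends JFrame {

    private final List<Double> onsetSeconds = new ArrayList<>();
    private final long startNanos;

    public OnsetLabeler() {
        super("Birdsong onset labeling");
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        setSize(400, 120);
        add(new JLabel("Press SPACE when you hear the target bird", SwingConstants.CENTER));

        addKeyListener(new KeyAdapter() {
            @Override public void keyPressed(KeyEvent e) {
                if (e.getKeyCode() == KeyEvent.VK_SPACE) {
                    double t = (System.nanoTime() - startNanos) / 1e9;
                    onsetSeconds.add(t); // compared later against expert ground truth
                    System.out.printf("Onset marked at %.2f s%n", t);
                }
            }
        });
        startNanos = System.nanoTime(); // audio playback would start here
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> new OnsetLabeler().setVisible(true));
    }
}
```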
In contrast, for the VR immersive training, the participants listened to 3D sound based on a six-microphone array recording of the same environment. Moreover, the subjects visualized a virtual forest that stretched in all directions (Figure 2). The task was to move through the virtual world and recognize the times at which the targeted bird sang. This temporal labeling was performed using a Bluetooth controller. The graphics and tasks were designed and implemented in Unity. Each participant was able to look around the environment by moving their head and was allowed to move in any direction to follow the sound sources. The 3D spatial sound was reproduced to permit the correct identification of the sound sources in terms of location and distance. The VR participants were free to move around the VR environment's active area, set to 60m×60m. The participants were told to pay attention to the birdsongs and move toward the sound sources. However, there was no guarantee that the participants correctly followed this instruction, as they could move to any location in the VR environment without limitation. Therefore, they could end up in locations where the target birdsongs were harder to identify due to low loudness.
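For illustration, the two mechanics just described, confining the listener to the 60m×60m active area and attenuating songs with distance, can be sketched as follows. The study's scene was built in Unity; this Java fragment only mirrors the logic, and the rolloff model and names are our assumptions.

```java
// Conceptual sketch (in Java, though the study's scene was built in Unity)
// of two mechanics described above: keeping the listener inside the
// 60m x 60m active area, and attenuating a birdsong with distance, which
// is why distant target songs were harder to identify.
public final class VirtualForestAudio {

    private static final double HALF_SIDE = 30.0; // 60 m x 60 m active area

    /** Clamps a listener position (x, z) to the active area. */
    public static double[] clampToArea(double x, double z) {
        return new double[] {
            Math.max(-HALF_SIDE, Math.min(HALF_SIDE, x)),
            Math.max(-HALF_SIDE, Math.min(HALF_SIDE, z))
        };
    }

    /** Inverse-distance gain, the typical game-engine audio rolloff model. */
    public static double distanceGain(double[] listener, double[] source) {
        double dx = listener[0] - source[0];
        double dz = listener[1] - source[1];
        double d = Math.max(1.0, Math.sqrt(dx * dx + dz * dz)); // avoid blow-up near source
        return 1.0 / d;
    }
}
```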
2.3 Data collection
For each participant, we recorded their age and the number of times they listened to the birdsong audio in the learning phase. In the test phase, the accuracy of each participant was registered by comparing their onset-time identifications with the ground truth provided by the expert. Accuracy was calculated using a grading system based on how long the participant took to recognize the audio. If the participant recognized the bird while the song played, or within the two seconds after the song ended, they scored the maximum of 100 points. A selection 3 seconds off scored 85 points; 4 seconds off, 80; 5 seconds off, 75; and 6 and 7 seconds off scored 70 and 65 points, respectively. At the end of the test trial, the accumulated score was divided by the number of times the target birdsong was played. For example, if the target birdsong was played five times in the test trial but a participant identified it correctly only three times, scoring 75 points for each correct recognition event, the overall score was (75×3)/5=45 points.
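The grading rules can be summarized in code. The Java sketch below reproduces the scheme above, including the worked example; treating offsets beyond 7s (and misses) as 0 points is our assumption, since the paper does not state it.

```java
// Sketch of the grading scheme described above. Offsets beyond 7 s score 0,
// which is our assumption: the paper only lists scores up to a 7 s delay.
public final class AccuracyScore {

    /** Score for one recognition event, given the delay (in whole seconds)
     *  between the end of the target song and the participant's key press.
     *  A press during the song, or up to 2 s after it, counts as on time. */
    public static int eventScore(int delaySeconds) {
        if (delaySeconds <= 2) return 100;
        switch (delaySeconds) {
            case 3: return 85;
            case 4: return 80;
            case 5: return 75;
            case 6: return 70;
            case 7: return 65;
            default: return 0; // assumption: misses and later presses score 0
        }
    }

    /** Overall score: accumulated event scores divided by the number of
     *  target-song occurrences in the test audio. */
    public static double overallScore(int[] eventScores, int songOccurrences) {
        int sum = 0;
        for (int s : eventScores) sum += s;
        return (double) sum / songOccurrences;
    }

    public static void main(String[] args) {
        // Worked example from the text: three correct recognitions worth 75
        // points each, out of five target-song occurrences -> (75*3)/5 = 45.
        System.out.println(overallScore(new int[] {75, 75, 75}, 5)); // prints 45.0
    }
}
```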
After each test trial, all participants responded to the User Engagement Scale-Short Form (UES-SF)[16] to assess their perceived engagement with the respective training activity. The UES-SF divides engagement into four dimensions: focus attention, perceived usability, aesthetic appeal, and endurability. Each dimension is measured using three questions rated on a five-point Likert scale: (1) strongly disagree, (2) disagree, (3) neither agree nor disagree, (4) agree, and (5) strongly agree. The focus attention dimension relates to the feeling of being lost in the experience and losing track of time. The perceived usability dimension captures the participants' perceived control over the interaction and their levels of frustration. The aesthetic appeal dimension accounts for the perceived sensory appeal of the interface. Finally, the endurability dimension measures the participants' perceptions of reward and satisfaction. Table 1 presents the questions corresponding to each dimension, and a sketch of how dimension scores are aggregated follows the table.
Table 1  UES-SF questions used to measure perceived engagement

| UES-SF Dimension | Question |
| --- | --- |
| Focus Attention (FA) | FA1: I lost myself in this experience |
| | FA2: I was absorbed in this experience |
| | FA3: The time I spent using the application just slipped away |
| Perceived Usability (PU) | PU1: I felt frustrated while using this application |
| | PU2: I found this application confusing to use |
| | PU3: Using this application was taxing |
| Aesthetic Appeal (AA) | AA1: This application was attractive |
| | AA2: This application was aesthetically appealing |
| | AA3: This application appealed to my senses |
| Endurability (E) | E1: Using the application was worthwhile |
| | E2: My experience was rewarding |
| | E3: I felt interested in this experience |
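As referenced above, each UES-SF dimension score is the mean of its three Likert items, which matches the per-question means reported later in Table 4. A minimal sketch of that aggregation (item ordering follows Table 1; the class and method names are ours):

```java
// Sketch of per-dimension UES-SF aggregation: each of the four dimensions
// is the mean of its three 5-point Likert items.
public final class UesSfScore {

    /** responses: 12 Likert ratings (1-5) ordered FA1..FA3, PU1..PU3,
     *  AA1..AA3, E1..E3. Returns the {FA, PU, AA, E} dimension means. */
    public static double[] dimensionMeans(int[] responses) {
        if (responses.length != 12) throw new IllegalArgumentException("12 items expected");
        double[] dims = new double[4];
        for (int d = 0; d < 4; d++) {
            dims[d] = (responses[3 * d] + responses[3 * d + 1] + responses[3 * d + 2]) / 3.0;
        }
        return dims;
    }
}
```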
2.4 Statistical analysis
For each measured parameter, descriptive statistics were calculated, and significant differences were tested using a two-sample t-test at a 95% confidence level. There is controversy over which type of statistical procedure is appropriate for analyzing 5-point Likert items, such as the UES-SF[17-19]. For instance, the Wilcoxon signed-rank test is recommended as a non-parametric alternative to the t-test when the distribution of the difference between two samples' means cannot be assumed to be normal. However, the recommendation to choose a non-parametric procedure over the t-test, particularly with small sample sizes and Likert scale data, appears to be groundless, even when the t-test assumptions are violated[18]. In any case, running both statistical analyses (t-test and Wilcoxon signed-rank test) produced similar results. We decided to present the t-test values, as they are the most commonly used when reporting differences measured with the user engagement scale[20,21]. Additionally, the Pearson correlation coefficient was calculated for each pair of parameters to measure the linear relationships between all variables.
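As an illustration of this analysis pipeline, the sketch below runs a two-sample t-test, the Wilcoxon signed-rank statistic, and a Pearson correlation with Apache Commons Math. The library choice and the per-participant engagement values are our assumptions (the paper does not name its analysis tooling); the repetition and accuracy vectors correspond to the traditional group in Table 2.

```java
import org.apache.commons.math3.stat.correlation.PearsonsCorrelation;
import org.apache.commons.math3.stat.inference.TTest;
import org.apache.commons.math3.stat.inference.WilcoxonSignedRankTest;

// Illustrative sketch of the statistical analyses described above.
public class EngagementStats {

    public static void main(String[] args) {
        // Hypothetical per-participant overall UES-SF means for each group.
        double[] traditional = {2.9, 3.1, 2.5, 3.3, 2.7, 3.0, 2.6, 3.2, 2.8};
        double[] vrBased     = {4.1, 4.3, 3.9, 4.4, 4.0, 4.2, 3.8, 4.5, 4.1};

        // Two-sample t-test p-value, evaluated at the 95% confidence level.
        double pT = new TTest().tTest(traditional, vrBased);

        // Non-parametric alternative: Wilcoxon signed-rank statistic.
        double w = new WilcoxonSignedRankTest()
                .wilcoxonSignedRank(traditional, vrBased);

        // Pearson correlation between, e.g., repetitions and accuracy
        // (values from the traditional group in Table 2).
        double[] repetitions = {2, 3, 1, 2, 7, 1, 5, 3, 2};
        double[] accuracy    = {26.7, 30.0, 40.0, 25.4, 75.0, 35.0, 26.0, 25.0, 30.0};
        double r = new PearsonsCorrelation().correlation(repetitions, accuracy);

        System.out.printf("t-test p=%.3f, Wilcoxon W=%.1f, Pearson r=%.2f%n", pT, w, r);
    }
}
```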
3 Results
Eighteen participants completed the study. Eight of them were female, and the participants' mean age was 28.0±10.8 years.
Summary results describing the age, the number of repetitions in the learning phase, and the accuracy points for the traditional and VR-based training groups are presented in Tables 2 and 3, respectively.
Table 2  Individual results for the traditional training group

| User ID | Age | Repetitions in Learning Phase | Accuracy Points |
| --- | --- | --- | --- |
| User 2 | 18 | 2 | 26.7 |
| User 4 | 18 | 3 | 30.0 |
| User 7 | 24 | 1 | 40.0 |
| User 10 | 24 | 2 | 25.4 |
| User 11 | 22 | 7 | 75.0 |
| User 12 | 22 | 1 | 35.0 |
| User 16 | 18 | 5 | 26.0 |
| User 17 | 44 | 3 | 25.0 |
| User 18 | 43 | 2 | 30.0 |
| Total | 25.9 (10.3) | 2.9 (1.9) | 34.8 (15.9) |

Notes: Total values are presented as mean (±SD).
Table 3  Individual results for the VR-based training group

| User ID | Age | Repetitions in Learning Phase | Accuracy Points |
| --- | --- | --- | --- |
| User 1 | 21 | 6 | 87.7 |
| User 3 | 23 | 2 | 77.8 |
| User 5 | 45 | 1 | 40.0 |
| User 6 | 21 | 2 | 70.4 |
| User 8 | 24 | 2 | 65.0 |
| User 9 | 27 | 3 | 76.0 |
| User 13 | 25 | 1 | 67.8 |
| User 14 | 54 | 3 | 77.0 |
| User 15 | 32 | 4 | 78.0 |
| Total | 30.2 (11.6) | 2.6 (1.6) | 71.1 (13.4) |

Notes: Total values are presented as mean (±SD).
When comparing the age of the nine participants in the traditional training group (M=25.9 years; SD=10.3 years) against that of the nine participants in the VR-based training group (M=30.2 years; SD=11.6 years), no significant differences were found (t(8)=-0.79, p=0.45>0.1). Similar results were obtained for the number of repetitions in the learning phase, where the averages of both training groups were not significantly different, t(8)=0.2, p=0.84>0.1. However, a significant difference (t(8)=-3.9, p<0.01) was observed for the average accuracy points between groups. The accuracy of the participants who used the VR-based training was significantly higher (M=71.1 points; SD=13.4 points) than that of the participants in the traditional training (M=34.8 points; SD=15.9 points). Moreover, Pearson's correlation analysis revealed no significant relationship between the age of a participant and their accuracy (r(18)=0.001, p=0.99>0.1). Additionally, the correlation between the age of a participant and the number of repetitions in the learning phase was not significant (r(18)=0.16, p=0.51>0.1). Finally, no significant correlation (r(18)=0.35, p=0.15>0.1) was observed between the number of repetitions in the learning phase and the accuracy points. Figure 3 plots the accuracy points against the number of repetitions.
The UES-SF results revealed that the overall engagement in the VR-based training (M=4.1; SD=0.8) was rated significantly higher (t(107)=9.8; p<0.001) than that in the traditional training (M=2.9; SD=1.1). For the focus attention dimension, the participants rated the VR-based experience (M=4.4; SD=0.6) significantly higher (t(26)=6.3; p<0.001) than the traditional training (M=2.5; SD=1.2). Similarly, the perceived usability was rated significantly higher (t(26)=3.7; p<0.005) for the VR-based training (M=3.3; SD=0.6) than for the traditional training (M=2.7; SD=1.1). The aesthetic appeal was rated significantly higher (t(26)=5.3; p<0.001) for the VR-based training (M=4.4; SD=0.7) than for the traditional training (M=3.3; SD=0.8). Finally, the endurability dimension was rated significantly higher (t(26)=5.0; p<0.001) for the VR-based training (M=4.4; SD=0.6) than for the traditional training (M=3.1; SD=1.1). Table 4 and Figure 4 show the UES-SF results broken down by question and training group.
Table 4  UES-SF results for both trainings

| UES-SF Dimension | Question | Traditional Training | VR-based Training |
| --- | --- | --- | --- |
| Focus Attention (FA) | FA1 | 2.8 (1.1) | 4.4 (0.5) |
| | FA2 | 2.6 (1.2) | 4.7 (0.5) |
| | FA3 | 2.3 (1.5) | 4.1 (0.8) |
| Perceived Usability (PU) | PU1 | 2.6 (1.4) | 3.1 (0.6) |
| | PU2 | 3.6 (0.5) | 3.7 (0.5) |
| | PU3 | 2.0 (1.3) | 3.2 (0.7) |
| Aesthetic Appeal (AA) | AA1 | 3.3 (0.5) | 4.6 (0.5) |
| | AA2 | 3.6 (0.9) | 4.8 (0.4) |
| | AA3 | 3.0 (1.0) | 3.9 (0.8) |
| Endurability (E) | E1 | 3.7 (0.5) | 4.0 (0.7) |
| | E2 | 3.0 (0.9) | 4.7 (0.5) |
| | E3 | 2.8 (1.5) | 4.6 (0.5) |

Notes: UES-SF ratings are presented as mean (±SD).
4 Discussion
A VR-based training system for the recognition of birdsongs was designed and implemented using easy-to-find and low-cost equipment. It was characterized mainly by the reproduction of spatial audio and enhanced interactivity with the virtual world. To test the training, two groups of participants were recruited and assigned to either a traditional version of the training or the VR-based training. Both types of training followed the same steps: first, a self-controlled learning phase, followed by an accuracy test. The last step consisted of a user engagement survey based on the UES-SF. The age of the participants was not significantly different between the two groups. Our experimental results revealed that age was not a determining factor of accuracy in the training; for example, older participants had accuracy points similar to those of younger users. This finding was surprising because previous research has documented performance in VR environments declining with age[22,23]. However, it is possible that because our training was designed as a short experience, it did not frustrate or fatigue users enough to produce the age-related decline in performance reported elsewhere. Nevertheless, we recognize the need to expand the sample to include more participants (older and younger) to generalize this result.
Regarding the first step of the interaction, the self-controlled learning stage, the number of times the participants chose to replay the audio was not correlated with the accuracy points obtained in the test phase. This surprising result also contradicts earlier research showing that repetition in a recognition task positively impacts subsequent recognition performance[24]. In particular, Figure 3 shows two participants who can be considered outliers of their respective groups and who do follow this trend. In the traditional training group, user 11 repeated the recording seven times and scored 75 accuracy points, which is similar to the results of the VR-based training group. Conversely, in the VR-based training group, user 5 reproduced the learning audio only once and scored only 40 accuracy points. Both users were outliers in their respective groups because the accuracy of the other participants did not depend on the number of times they repeated the audio. This difference may arise from different motivations for participation. According to comments conveyed to the experimenter, user 11 was intrinsically motivated to perform well in the task and enjoyed the learning phase. On the contrary, user 5 revealed a lack of desire to achieve high scores, preferring to enjoy the relaxing atmosphere of the VR-based training interaction. These insights reveal the importance of designing experiences for already motivated users. Apart from these two outliers, the rest of the users had similar numbers of repetitions and similar accuracy within their respective groups.
The significant difference in accuracy between the two types of training may indicate that VR-based training is more effective than traditional training. Indeed, according to the UES-SF results, the participants perceived the VR-based training as significantly more engaging than the traditional method in all four UES dimensions. Greater focus attention may lead to increased performance in recognition activities because participants avoid distractions and concentrate on a specific task[25,26]. Usability was perceived as significantly higher for VR-based training than for traditional computer-based training. This result validates our design of a VR-based experience that transforms a tedious traditional task, such as practicing sound recognition, into an immersive and interactive experience in which users have more control and less frustration. In addition, the aesthetic appeal of VR-based training was rated significantly higher. The use of spatial sound in an immersive and interactive 3D environment appeals to the multimodal sensory system of participants[14]. Finally, the perceptions of endurability and satisfaction in the VR-based system were significantly higher than those in the traditional training, possibly indicating that the system motivated users to interact and that the design of the task did not overstimulate them. However, it remains unclear whether these results depend on the novelty of the VR-based approach to the participants. Prolonged exposure to the VR-based training system may reveal whether the novelty wears off and whether accuracy and engagement are affected[27].
In conclusion, when our VR-based training system was tested against a traditional training system, the experimental results revealed higher user engagement and accuracy, validating the experience design. Future research will focus on developing a variety of realistic ecosystems with their associated birds to increase the information on newer bird species in the training system. The emphasis will be on accurately reproducing the spatialized sounds, for which carefully calibrated and personalized HRTFs will be measured for each participant. Furthermore, a gamification module with cooperative and social interactions may enhance the experience and motivate participants to continue with the training. Finally, the training system must be tested with additional participants and for a longer duration to measure information recall and recognition mastery among users.

References

1. Myers N, Mittermeier R A, Mittermeier C G, da Fonseca G A, Kent J. Biodiversity hotspots for conservation priorities. Nature, 2000, 403(6772): 853–858. DOI:10.1038/35002501
2. Baptiste M P, Loaiza L M G, Acevedo-Charry O, Acosta-Galvis A R, Wong L J. Global register of introduced and invasive species: Colombia. 2018
3. Krause T. Reducing deforestation in Colombia while building peace and pursuing business as usual extractivism? Journal of Political Ecology, 2020, 27(1). DOI:10.2458/v27i1.23186
4. Kvsn R R, Montgomery J, Garg S, Charleston M. Bioacoustics data analysis: a taxonomy, survey and open challenges. IEEE Access, 2020, 8: 57684–57708. DOI:10.1109/access.2020.2978547
5. Venier L A, Mazerolle M J, Rodgers A, McIlwrick K A, Holmes S, Thompson D. Comparison of semiautomated bird song recognition with manual detection of recorded bird song samples. Avian Conservation and Ecology, 2017, 12(2): art2. DOI:10.5751/ace-01029-120202
6. Suh A, Prophet J. The state of immersive technology research: a literature analysis. Computers in Human Behavior, 2018, 86: 77–90. DOI:10.1016/j.chb.2018.04.019
7. Alrehaili E A, Al Osman H. A virtual reality role-playing serious game for experiential learning. Interactive Learning Environments, 2019, 1–14. DOI:10.1080/10494820.2019.1703008
8. Goh D H L, Lee C S, Razikin K. Interfaces for accessing location-based information on mobile devices: an empirical evaluation. Journal of the Association for Information Science and Technology, 2016, 67(12): 2882–2896. DOI:10.1002/asi.23566
9. Munafo J, Diedrick M, Stoffregen T A. The virtual reality head-mounted display Oculus Rift induces motion sickness and is sexist in its effects. Experimental Brain Research, 2017, 235(3): 889–901. DOI:10.1007/s00221-016-4846-7
10. Ibáñez M B, Di-Serio Á, Villarán-Molina D, Delgado-Kloos C. Support for augmented reality simulation systems: the effects of scaffolding on learning outcomes and behavior patterns. IEEE Transactions on Learning Technologies, 2016, 9(1): 46–56. DOI:10.1109/tlt.2015.2445761
11. Frank J A, Kapila V. Mixed-reality learning environments: integrating mobile interfaces with laboratory test-beds. Computers & Education, 2017, 110: 88–104. DOI:10.1016/j.compedu.2017.02.009
12. Loup-Escande E, Frenoy R, Poplimont G, Thouvenin I, Gapenne O, Megalakaki O. Contributions of mixed reality in a calligraphy learning task: effects of supplementary visual feedback and expertise on cognitive load, user experience and gestural performance. Computers in Human Behavior, 2017, 75: 42–49. DOI:10.1016/j.chb.2017.05.006
13. Ke F F, Lee S, Xu X H. Teaching training in a mixed-reality integrated learning environment. Computers in Human Behavior, 2016, 62: 212–220. DOI:10.1016/j.chb.2016.03.094
14. Hong J, He J J, Lam B, Gupta R, Gan W S. Spatial audio for soundscape design: recording and reproduction. Applied Sciences, 2017, 7(6): 627. DOI:10.3390/app7060627
15. Berger C C, Gonzalez-Franco M, Tajadura-Jiménez A, Florencio D, Zhang Z Y. Generic HRTFs may be good enough in virtual reality: improving source localization through cross-modal plasticity. Frontiers in Neuroscience, 2018, 12: 21. DOI:10.3389/fnins.2018.00021
16. O'Brien H L, Cairns P, Hall M. A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. International Journal of Human-Computer Studies, 2018, 112: 28–39. DOI:10.1016/j.ijhcs.2018.01.004
17. de Winter J F C, Dodou D. Five-point Likert items: t test versus Mann-Whitney-Wilcoxon (addendum added October 2012). Practical Assessment, Research, and Evaluation, 2019
18. Meek G E, Ozgur C, Dunning K. Comparison of the t vs. Wilcoxon signed-rank test for Likert scale data and small samples. Journal of Modern Applied Statistical Methods, 2007, 6(1): 91–106. DOI:10.22237/jmasm/1177992540
19. Harpe S E. How to analyze Likert and other rating scale data. Currents in Pharmacy Teaching and Learning, 2015, 7(6): 836–850. DOI:10.1016/j.cptl.2015.08.001
20. Nguyen D, Meixner G. Gamified augmented reality training for an assembly task: a study about user engagement. In: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems. IEEE, 2019. DOI:10.15439/2019f136
21. Ruan S, Jiang L W, Xu J, Tham B J K, Qiu Z N, Zhu Y S, Murnane E L, Brunskill E, Landay J A. QuizBot: a dialogue-based adaptive learning system for factual knowledge. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Glasgow, Scotland, UK. New York: ACM, 2019. DOI:10.1145/3290605.3300587
22. Coxon M, Kelly N, Page S. Individual differences in virtual reality: are spatial presence and spatial ability linked? Virtual Reality, 2016, 20(4): 203–212. DOI:10.1007/s10055-016-0292-x
23. Arino J J, Juan M C, Gil-Gómez J A, Mollá R. A comparative study using an autostereoscopic display with augmented and virtual reality. Behaviour & Information Technology, 2014, 33(6): 646–655. DOI:10.1080/0144929x.2013.815277
24. Kaplan A D, Cruit J, Endsley M, Beers S M, Sawyer B D, Hancock P A. The effects of virtual reality, augmented reality, and mixed reality as training enhancement methods: a meta-analysis. Human Factors, 2021, 63(4): 706–726. DOI:10.1177/0018720820904229
25. Huang H M, Rauch U, Liaw S S. Investigating learners' attitudes toward virtual reality learning environments: based on a constructivist approach. Computers & Education, 2010, 55(3): 1171–1182. DOI:10.1016/j.compedu.2010.05.014
26. Huang T L, Liao S L. Creating e-shopping multisensory flow experience through augmented-reality interactive technology. Internet Research, 2017, 27(2): 449–475. DOI:10.1108/intr-11-2015-0321
27. Rutter C E, Dahlquist L M, Weiss K E. Sustained efficacy of virtual reality distraction. The Journal of Pain, 2009, 10(4): 391–397. DOI:10.1016/j.jpain.2008.09.016