Usability and Effectiveness of Immersive Virtual Grocery Shopping for Assessing Cognitive Fatigue in Healthy Controls: Protocol for a Randomized Controlled Trial

Background: Cognitive fatigue (CF) is a human response to stimulation and stress and is a common comorbidity in many medical conditions that can result in serious consequences; however, studying CF under controlled conditions is difficult. Immersive virtual reality provides an experimental environment that enables the precise measurement of the response of an individual to complex stimuli in a controlled environment. Objective: We aim to examine the development of an immersive virtual shopping experience to measure subjective and objective indicators of CF induced by instrumental activities of daily living. Methods: We will recruit 84 healthy participants (aged 18-75 years) for a 2-phase study. Phase 1 is a user experience study for testing the software functionality, user interface, and realism of the virtual shopping environment. Phase 2 uses a 3-arm randomized controlled trial to determine the effect that the immersive environment has on fatigue. Participants will be randomized into 1 of 3 conditions exploring fatigue response during a typical human activity (grocery shopping). The level of cognitive and emotional challenges will change during each activity. The primary outcome of phase 1 is the experience of user interface difficulties. The primary outcome of phase 2 is self-reported CF. The core secondary phase 2 outcomes include subjective cognitive load, change in task performance behavior, and eye tracking. Phase 2 uses within-subject repeated measures analysis of variance to compare pre- and postfatigue measures under 3 conditions (control, cognitive challenge, and emotional challenge). Results: This study was approved by the scientific review committee of the National Institute of Nursing Research and was identified as an exempt study by the institutional review board of the National Institutes of Health. Data collection will begin in spring 2021.
Conclusions: Immersive virtual reality may be a useful research platform for simulating the induction of CF associated with the cognitive and emotional challenges of instrumental activities of daily living. Trial Registration: ClinicalTrials.gov NCT04883359; http://clinicaltrials.gov/ct2/show/NCT04883359 International Registered Report Identifier (IRRID): PRR1-10.2196/28073 (JMIR Res Protoc 2021;10(8):e28073) doi: 10.2196/28073


Background
The application of digital technologies to improve the monitoring and treatment of chronic clinical conditions is an emerging field in medical research and practice. At the most basic level, the maintenance of and nearly instantaneous access to medical records, which facilitates tracking and coordination of care among providers, is an example of how digital technologies have directly influenced the practice of medicine. The steady increase in apps and digital devices developed to track health-related behaviors and monitor physiological data is a testament to the interest in technology and the potentially powerful role that it will play in the future of medicine. These tools may become most useful for aiding health care in the gaps between formal treatment (eg, hospital, clinic, and doctor visit) and day-to-day living in extended or chronic conditions. For example, individuals with chronic medical conditions often experience significant symptoms of cognitive fatigue (CF); however, it is a challenge for clinicians to evaluate the impact of this symptom on daily activities. Technological solutions potentially provide greater insight into the impact of symptomatology on the quality of life. Researchers and clinicians alike have a profound interest in technology and its current and future role in health care delivery.

Immersive Virtual Reality
Immersive virtual reality (VR) technology has been increasingly used by researchers in many fields as a tool to observe and measure the responses of individuals to complex stimuli in a controlled environment [1][2][3]. Auditory and visual stimuli induce in users the sense that they are in a space different from where their physical body is located. Usual tasks (locomotion, pointing, and grasping) are accomplished in a modified manner using ancillary equipment (eg, hand controllers and sensor gloves). Immersive VR environments enable researchers to study psychological phenomena that are more closely connected to the subjective experience of an individual (eg, a tall building to elicit fear), to recreate situations that elicit symptoms (eg, anxiety), or to measure specific skills (eg, a kitchen to evaluate home safety). VR environments have been used to evaluate human and environmental factors associated with performing important instrumental activities of daily living (IADLs) such as driving [4], navigating public transportation [5], cooking [6], social relatedness [7], and grocery shopping [8]. The relative advantage of virtual environments over physical spaces is the ability to safely expose individuals to situations that may pose a risk in real life (eg, driving while distracted) and the ability to create controlled environments that would be extremely difficult to duplicate in a consistent, standardized fashion in real-life simulations.

Implications of CF
CF is a common human experience that can result in serious negative consequences, such as mistakes [9,10] and accidents [11][12][13]. Although most healthy people experience some degree of CF at varying times, CF can become a debilitating and life-altering experience for individuals diagnosed with chronic medical conditions [14][15][16]. Debilitating levels of CF occur as a frequent comorbid symptom in a range of medical [17], neurological [18], and acquired conditions [19], particularly those affecting the integrity of neuronal processes [20,21]. The serious consequences of CF at work, during daily activities, and as a potential cause of disability across a broad spectrum of clinical conditions make the study of objective and subjective fatigue in healthy and clinical populations a priority across multiple disciplines.

CF Induction
The most well-established model for inducing CF under experimental conditions is prolonged cognitive performance. Specifically, participants perform a cognitive task for an extended period (eg, 15-120 minutes) and assessments of fatigue level occur before, during, and after the fatiguing task. Various cognitive tasks reliably induce subjective feelings of fatigue, including continuously performed attention [22,23], inhibition [20,24,25], working memory [26][27][28], and complex cognitive activities [29][30][31]. Tasks requiring continuous visual monitoring for critical events produce a highly replicable phenomenon called the vigilance decrement [32], which has a moderate effect size [33]. Factors affecting the onset of vigilance decrement include image quality [34], response frequency [35], rest breaks and secondary task interruption [36], and multitasking [37]. Moderately complex cognitive functions such as working memory [27,[38][39][40][41] and inhibitory control [24,[42][43][44][45][46][47][48] tasks produce subjective feelings of fatigue but inconsistently produce performance decrements. Simple and complex vigilance tasks produce CF; however, these laboratory tasks may not best represent how CF occurs in daily life as boredom and task disengagement may account for observed vigilance decrement effects [38]. A better approach to understand CF for clinical purposes may require the evaluation and assessment of fatigue in typical daily living activities.

Work Task and Environment Characteristics and CF
Work fatigue studies target tasks and environmental characteristics that produce CF in everyday activities. Close visual work involving inspection, comparison, or identification of details on visual images [49][50][51][52][53] and high rates of decision-making are sources of work fatigue [54][55][56]. Work interruptions interfering with workflow increase feelings of frustration [57], stress [58], and feelings of emotional exhaustion [59]. Work interruptions cause a loss of focus [60] and increase cognitive workload [61], mental effort, annoyance, frustration, and sense of time pressure [62,63]. Random, uncontrollable, interruptions in the middle of a task [61,62] that require immediate attention induce the most stress [62][63][64]. Individual differences in personality impact the level of perceived stress and fatigue associated with work-related tasks [63]. Work requiring intensive visual inspection or high rates of decision-making induce fatigue, and environmental factors such as distractions and interruptions significantly increase perceived frustration, workload, and fatigue.
driving [66,67]. Time to fatigue while driving is hastened by extra cognitive demands, stress, distractions, multitasking, and environmental factors [66][67][68], although time on task and monotony are most impactful [69,70]. Personal characteristics associated with driving fatigue include fatigue proneness, dislike of driving, and coping style [66]. Surprisingly, few studies have evaluated the relationship between IADLs and CF; however, such assessments offer tremendous potential for discerning points for clinical intervention. There is some evidence that apathy, depression, and impaired cognitive functioning are risk factors for difficulties in performing IADLs [71,72]. A public transit study demonstrated that a common IADL induces cognitive workload in real life, task experience moderates perceived workload, and immersive VR provides a close approximation of the cognitive effects observed in real life [5]. The extended performance of a daily activity may induce CF, and the effects are moderated by individual and environmental factors. Grocery shopping provides an apt task for assessing CF.

Immersive VR and Grocery Shopping
Virtual shopping environments have been used to evaluate how cognitive functions might operate in real-life situations [73][74][75] and may prove effective for the study of CF. Grocery shopping requires a combination of low and high levels of cognitive processes [8,[73][74][75]. Looking for a specific product requires visual inspection, scanning, and focused attention. Traversing a shopping store requires visual attention (eg, looking for signs), spatial mapping, working memory, memory, and executive functioning [8,[73][74][75][76][77]. Virtual shopping environments have been used successfully among individuals with significant cognitive impairment [8,[78][79][80][81], and virtual shopping tasks correlate with real-life shopping activities [81,82].
We identified immersive VR grocery shopping as a suitable model to study fatigue associated with an IADL because it provides familiar but complex visual stimuli, affords the opportunity to search and choose, and presents the participant with well-known but complex cognitive challenges, such as comparisons, discernment, and decision-making. A potential disadvantage of using immersive VR to study CF is the risk of physical distress and eye strain in VR, which may confound the experience of CF or its measurement [83][84][85]. The risk of eye strain and other physical symptoms is reduced when high-quality head-mounted display (HMD) devices are used, motion is performed using physical walking or teleporting, the field of view is large, and each eye receives high-quality images [84]. In some cases, the realism of the environment must be sacrificed to reduce side effects.
We propose a 2-phase evaluation of the CF induction in VR. In phase 1, we will explore the feasibility of using immersive VR as a platform for studying CF using a user experience (UX) research methodology. We will use a combination of qualitative and quantitative approaches to identify components of the VR interface or environment that may contribute to feelings of eye strain or distress or make the shopping task difficult to perform. Phase 2 explores cognitive, environmental, and individual characteristics associated with VR-based grocery shopping-induced CF.

Objectives
Despite extensive research on CF, questions remain regarding the individual and environmental characteristics that relate to CF, particularly in daily living activities. Prior studies evaluating CF in daily activities have primarily focused on driving [5,70] or very specific job-related activities [49,53,86]. We will use immersive VR to control environmental and task characteristics to identify factors that affect the onset of fatigue. Grocery shopping is used as a fatigue-inducing activity because it requires multiple simple and complex cognitive functions, has been identified as a significant cause of CF in susceptible individuals [87], and is susceptible to disruption by disability [88]. On the basis of previous research, engaging healthy participants using virtual shopping environments indicates the feasibility and acceptability of VR and therefore provides the best chance of detecting the CF response [5,86,89].
In the experiment, we will replicate numerous cognitive aspects of shopping, including simultaneous and successive engagement of multiple cognitive processes including working memory, spatial planning, inhibitory control, visual search, inspection, and comparison, reading and applying information from nutritional labels, and decision-making. We can manipulate the mental workload through specific task requirements. In addition, we can test the relative effect of environmental factors, such as the effect of sound and visual cues on CF and workload, by introducing the presence of interruptions, distractions, and goal interference. In a controlled shopping environment, where interruptions can be planned carefully, as the participant executes goal-directed behaviors, real-life frustrations such as poor shelf organization and item placement, crowded conditions, noise, and other disruptions can be implemented. In future studies, the virtual shopping environment will allow us to test hypotheses related to the relationship of task difficulty, perceived task difficulty, environmental disruptions, and feelings of frustration with CF. Initial trials will use healthy controls, and subsequent studies will evaluate CF in clinical populations.
The aim of phase 1 is to evaluate the design elements of the virtual shopping environment to identify any factors that may hinder the ability of participants to effectively perform tasks in the virtual environment, identify the risk of physical distress, and obtain user feedback about realism and functionality. The primary hypothesis for the UX study is that the VR environment will be acceptable; however, some users will exhibit minor difficulties using the controllers and interacting with the environment. The primary outcome measures will be observational ratings assessing user difficulties with controller use, interacting with objects, and moving in the environment. The secondary hypotheses include that participants will report only minimal feelings of distress, will report that the virtual grocery store appears realistic and immersive, and will provide a general positive response to the experience with additional helpful ideas about how the experience could be improved.
The primary aim of phase 2 is to evaluate individual and environmental characteristics associated with susceptibility to experiencing CF in the context of performing an IADL, specifically shopping. Our primary hypothesis in phase 2 is that individuals performing structured grocery tasks will report more CF than those engaged in simple exploratory behavior in the grocery store and that individuals experiencing distractions and interruptions will report more fatigue than those who do not experience interruptions. The primary outcome measure in phase 2 is the participants' self-reported change in CF by shopping experience. The secondary aims of phase 2 are to identify performance and eye-tracking measures that objectively identify fatigue and to identify cognitive abilities, personality characteristics, shopping experience, or transient mood states that affect susceptibility to fatigue during shopping. Specific secondary exploratory hypotheses include that perceived workload increases with time on task for structured tasks and disruptive environments, percent eye closure and gaze shift increase with time on task and are associated with self-reported fatigue, and shopping accuracy declines with time on task.

Study Design
This will be a 2-phase development (UX) and implementation (eg, randomized controlled trial) research protocol. The two phases share the same general immersive VR environment, as shown in Figure 1. The two phases diverge in the non-VR-related procedures used in each protocol. The VR sequence in each phase will follow the standard model commonly used in CF induction studies, that is, baseline cognitive assessment, baseline subjective fatigue and workload assessment, fatigue induction with a midpoint (eg, at 15 minutes) subjective assessment of fatigue, finishing with a postassessment of fatigue, and cognitive assessment. Each of these elements is shown in Figure 1. Randomization will be used in each study to assign participants to 1 of the 3 grocery shopping experiences: shopping exploration, standard shopping, and shopping interference. In each study phase, participants will complete a brief self-reported medical history to rule out conditions associated with chronic fatigue, cognitive impairment, or susceptibility to seizures. Participants in both studies will complete the Virtual Reality Symptom Questionnaire (VRSQ) [90] before VR immersion and immediately after VR immersion. These procedures will help differentiate the impact of VR immersion from the fatigue induced by the shopping task.
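The allocation to the 3 shopping experiences could be generated in several ways; the protocol does not state the randomization method. The following is a minimal sketch that assumes permuted blocks of 3, which keeps the arms balanced after every complete block. The function name, the seed, and the block size are illustrative assumptions, not part of the study software.

```python
import random
from collections import Counter

# Condition labels taken from the protocol text.
CONDITIONS = ("shopping exploration", "standard shopping", "shopping interference")


def permuted_block_schedule(n_participants, conditions=CONDITIONS, seed=2021):
    """Return one condition per participant using permuted blocks.

    Assumption: each block contains every condition exactly once in random
    order, so the arms are balanced after every complete block.
    """
    rng = random.Random(seed)  # fixed seed makes the schedule reproducible
    schedule = []
    while len(schedule) < n_participants:
        block = list(conditions)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_participants]


phase1_schedule = permuted_block_schedule(24)  # phase 1 sample size
phase2_schedule = permuted_block_schedule(60)  # phase 2 sample size
phase1_counts = Counter(phase1_schedule)
phase2_counts = Counter(phase2_schedule)
```

With sample sizes that are multiples of 3, this yields 8 participants per condition in phase 1 and 20 per condition in phase 2. Stratification (eg, by age group and sex in phase 1) would require a separate schedule per stratum.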
The phase 1 study will evaluate the participants' capacity to learn to interact with objects in the virtual environment, navigate within the grocery store environment, read and respond to information and questionnaires, and identify any early adverse effects of VR exposure. The data collected from this study will be used to improve the VR interface and modify the participant interactions or the length of exposure. The study staff will observe the engagement of the participant in the immersive task by viewing the person as well as by viewing their exact point of view on a separate computer screen. Participants will complete rating scales including feelings of presence [91] in the shopping environment, self-reported simulator sickness symptoms [90], and shopping values or experience [92]. All participants will complete a standardized UX interview. The phase 2 study protocol, detailed in Figure 2, will incorporate additional self-report and performance measures (see Table 1 for lists of measures in each phase). Additional measures include state and trait measures of fatigue [93,94], current emotional state (ie, anxiety and depression) [94], personality traits [95], and cognitive functioning [96]. These measures will be completed before the VR portion of the study with a 1-hour break between completing additional study measures and VR immersion. Similar to phase 1, participants will complete measures of presence [91] and shopping values or experience [92] to assess the impact of realism, shopping as a pleasant versus utilitarian task, and frequency of grocery shopping in real life on fatigue and performance. A brief post-VR interview will be completed to obtain additional insight about the environment and to debrief participants about the purpose of the study.

Kitchen Tasks
The participants will be seated while performing the tasks in the kitchen environment. The participant will appear to be seated at a kitchen table with a pillbox and pill bottles in front of them.
In the first task, the participant will be instructed to correctly select the pillbox compartment (labeled with the days of the week) where each pill belongs. A calendar on the table shows an image of each pill and the pillbox location (eg, Sunday or Monday); when a pill appears in front of the examinee, they will select the correct pillbox location by using a scroll and trigger pull sequence. An animation sequence will show the pill entering the selected location. Another pill will appear with a sound alert until 120 seconds have passed or 120 pills have been sorted.
In the working memory task, the participant will be shown a series of pills by day and time of day associations. Using the same calendar concept, the participant will see where 2 pills are to be placed in the pillbox (eg, red pill in the morning on Monday and blue pill in the evening on Friday) for 10 seconds. They will be instructed to remember the location of each pill. The key will be taken out of view and the pills will appear one at a time. Each participant will select the location where each pill belongs using a scroll and trigger sequence. The task will increase in difficulty, with the number of pill locations to recall rising to 3, 4, 5, and 6. The task will end when the participant obtains four consecutive scores of zero.
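The discontinue rule for the working memory task (stop after four consecutive scores of zero) can be sketched as follows. This is an illustration only: the per-trial scoring scale, the function name, and the example scores are assumptions and do not come from the study software.

```python
def administer_until_discontinue(trial_scores, max_consecutive_zeros=4):
    """Return the scores actually administered before the stopping rule fires.

    trial_scores: iterable of per-trial scores, where 0 means that no pill
    location was recalled correctly (the scoring scale is an assumption).
    """
    administered = []
    zero_run = 0  # length of the current run of zero scores
    for score in trial_scores:
        administered.append(score)
        zero_run = zero_run + 1 if score == 0 else 0
        if zero_run >= max_consecutive_zeros:
            break  # four zeros in a row: the task ends here
    return administered


# Hypothetical sequence: an isolated zero does not end the task,
# but the later run of four zeros does.
example_scores = [2, 2, 1, 0, 3, 0, 0, 0, 0, 4, 4]
given = administered = administer_until_discontinue(example_scores)
```

In this example, the task stops after the ninth trial, and the two remaining trials are never presented.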

Shopping Tasks and Experiences
Participants will remain seated during all VR shopping experiences. Product labels are legible for brands and specific products without selecting the object. Product selection enables the viewing of all product details. Products will be selected off a shelf using a wand controller acting as a pointing device (eg, laser beam), followed by a point, highlight, and trigger pull selection sequence. Selected product labels will appear on a virtual cell phone in front of the participant with a menu of options (eg, buy product, return product, and review shopping list). Participants traverse the store using a restricted teleport feature. Movement will be restricted to create a more realistic experience and to avoid long-distance movements that might result in disorientation and difficulties in learning the store layout. Figure 3 presents a screenshot of the grocery store and shows the cell phone, products, and aisle.

Shopping Training
The participant will appear in a small version of the grocery store. They will be instructed on how to use the virtual cell phone to check their shopping list, review items in their cart, and answer text messages with the left-hand controller. They will be instructed on how to teleport and select items off the shelf with the right-hand controller. In the shopping training task, the participant must follow specific directions and must correctly put four items in their cart from the shopping list, one of which must be returned to the shelf, and they must answer a text to complete the shopping training. To complete the shopping portion of the task, the participant will need to teleport successfully to multiple shelves and aisles.

Shopping Experience Number 1
Experience number 1 will be a control experience that will allow the participant to explore the grocery store with no specific task to complete. The shopping environment includes a few avatars and some low music in the background to simulate a realistic shopping environment during off-hours. All shopping actions will be enabled, and the participant may select items and place them in the shopping cart. The only requirement will be that they remain in the environment for 30 minutes. The control experience will evaluate whether the VR environment itself induces significant fatigue that may confound the interpretation of task-specific fatigue induction.

Shopping Experience Number 2
Experience number 2 is a standard shopping experience designed to mimic a realistic shopping experience during a typical day. Participants will be provided with a shopping scenario. They will be told that they are shopping for sick friends. The participant will try to obtain as many items as possible from the cell phone shopping list. Participants will traverse the grocery store to find objects on the list and place them in the shopping cart. Avatars are present in the store but do not hinder progress or create any specific distractions. The background sound includes typical background noise, music, and overhead announcements. This condition assesses the cognitive load and fatigue related to the mental activity of shopping.

Shopping Experience Number 3
Experience number 3 will be the standard shopping experience with frustrating and interrupting events. Participants will be provided the same shopping scenario as experience number 2; however, this shopping experience will be designed to mimic very high traffic, a holiday shopping experience, store crowding, misplaced items, and loud distractions. In addition to environmental stressors, the cell phone will receive texts from the friend requesting changes to the grocery list after items have already been selected. Text alerts will be short, repetitive, high-pitched sounds that continue until the text is answered. The progress of the participants will be impeded by an avatar standing in front of a needed item, an aisle blocked for a spill, or a pallet blocking access to a specific shelf area. The sounds of a baby crying, people talking, coughing, laughing, and sneezing are present. The music and announcements are played at a slightly higher volume than in the standard shopping condition. This condition will assess the cognitive load and fatigue related to the mental activity of shopping in the presence of distractions and frustrating events.

Fatigue Assessment
Fatigue induction studies evaluate real-time changes in fatigue symptoms by self-reporting, performance, and eye tracking. An adapted version of the Visual Analog Scale-Fatigue (VAS-F) [97] will be used as a state fatigue measure given its history of use in fatigue induction research [27,98,99]. A closely linked concept to CF is cognitive workload. Cognitive workload applies an ergonomic and human factors model (eg, elements of a job or task that create a feeling of mental work) to understand fatigue as it relates to sustained work performance [100][101][102][103][104]. The NASA-TLX is a commonly used measure of workload [105,106]. In addition to subjective measures, there are two approaches to using performance data to objectively measure fatigue: measuring change in performance on the induction task or using a pre- versus postintervention cognitive assessment [26,29,107,108]. Tests of reaction time [42], working memory [23], and inhibitory control [32] are used to assess fatigue effects.
Psychophysiological measures identify objective brain or autonomic nervous system indicators of fatigue using EEG (electroencephalogram) [22,23,30,109], ERP (event related potential) [28,31,86,110], functional brain imaging [27,40,107,108], and ECG (electrocardiogram) [29,30] to measure changes in brain or cardiovascular activity associated with fatigue. Of the various physiological indicators, eye tracking has emerged as a promising, noninvasive tool for identifying objective measures of CF. Eye tracking studies show changes in blink rate, percent eye closure, gaze fixation (eg, length and location), and gaze shift rate are associated with CF [30,[49][50][51][111][112][113][114]. Changes in gaze shift rate may indicate use of less efficient lower-level cognitive processing [49] and a centralized fixation can indicate a loss of full attention to the task [115]. Several sources, using different task demands, show changes in visual activity as the time on task increases.

Engineering and Technology
The virtual environment was created using Unity 3D (Unity Technologies). Products will be created by the graphics design team using digital image files obtained from the product manufacturer, labels scanned from acquired grocery items, or modified from items purchased through the Unity Asset Store. All labels are converted into 3D objects using a variety of programs and techniques. The design team will develop a cohesive store branding and coordinated color scheme for store assets. Within the environment, near objects will be displayed with a high degree of visual detail, whereas distant objects will have reduced detail. General product labeling will be legible without selecting the object; however, specific product information (eg, reduced sodium or nutritional values) will only be legible after product selection.
The VIVE Pro Eye (HTC Corporation) will be the HMD device used in each study. This device has a 2880 × 1600-pixel display resolution and includes eye tracking and high-resolution surround sound and allows for the use of glasses and adjustable optics that are designed to minimize eye fatigue and cybersickness. Participants will interact with the virtual environment and objects within the environment by using a wand. The VR program is delivered to the HMD via a display port from a Dell Precision workstation 7920. MHz RDIMM ECC (error correction code) memory. This equipment will have adequate processing power, graphical speed and resolution, and memory to provide a vivid, smooth immersive experience. Eye-tracking data will be collected from individual participants using the integrated eye-tracking system contained within the HTC Vive Pro HMD. The data sampled by the HMD eye tracker include data output (eye information): timestamp (device and system), gaze origin, gaze direction, pupil position, pupil size, and eye openness, which are captured every 200 ms.
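As an illustration of how eye-openness samples might be reduced to fatigue indicators such as percent eye closure, consider the following minimal sketch. It assumes that eye openness is reported on a 0-to-1 scale and that a sample counts as closed below a threshold of 0.2; the scale, the threshold, the record layout, and the example values are all assumptions rather than documented properties of the device or the study pipeline.

```python
from dataclasses import dataclass


@dataclass
class EyeSample:
    timestamp_ms: int
    eye_openness: float  # assumed scale: 0.0 = fully closed, 1.0 = fully open


def percent_eye_closure(samples, closed_below=0.2):
    """Percentage of samples in which the eye counts as closed."""
    if not samples:
        raise ValueError("no samples")
    closed = sum(1 for s in samples if s.eye_openness < closed_below)
    return 100.0 * closed / len(samples)


def count_closures(samples, closed_below=0.2):
    """Number of open-to-closed transitions (a rough proxy for blinks)."""
    closures = 0
    was_closed = False
    for s in samples:
        is_closed = s.eye_openness < closed_below
        if is_closed and not was_closed:
            closures += 1
        was_closed = is_closed
    return closures


# Hypothetical 2-second window sampled every 200 ms.
openness = [1.0, 0.9, 0.1, 0.05, 0.8, 1.0, 0.15, 0.9, 1.0, 1.0]
window = [EyeSample(i * 200, value) for i, value in enumerate(openness)]
closure_pct = percent_eye_closure(window)
closure_events = count_closures(window)
```

At a 200-ms sampling interval, brief blinks may fall between samples, so counts derived this way would likely underestimate the true blink rate.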

Participants
The participants will be recruited from a local metro region. We anticipate that participant background characteristics (eg, education, ethnicity, sex, and age) will be representative of the metro area. Recruitment will be managed by the National Institutes of Health (NIH) Office of Patient Recruitment, using local flyers; the Office of Patient Recruitment website; and posts on social media, including Facebook and Twitter. Participants will be remunerated to participate in the study. All protocol activities will take place in a local NIH facility in Bethesda, Maryland.
For each phase of the study, participants will be healthy individuals aged 18-75 years. Recruitment for phase 1 will target an older (≥55 years) and younger group (18-54 years) with 50% targeted for each group, stratified by sex. Recruiting a broad age range will ensure usability among older individuals, as future apps will likely involve older adult clinical populations. The sample will be stratified by sex, as some research suggests that women may experience immersive VR differently from men [116,117]. The phase 1 sample size will be 24, with 8 participants completing each of the three shopping conditions. Evaluating participants from a variety of backgrounds is important in UX research to identify any systematic issues in the interface, content, or instructions. The phase 2 study will recruit 60 healthy individuals aged 18-75 years. For this study, there will be no targeted recruitment of older adults, as any design issues specifically associated with subject age will be addressed before phase 2. The sample size was determined based on the calculated effect sizes of fatigue induction studies that used the VAS-F (Cohen d=0. 65

Overview
This statistical analysis plan was reviewed by the National Institute of Nursing Research (NINR) statistician. All data will be processed, cleaned, and analyzed using SAS 9.4 (SAS Institute). The data analysis approach for phase 1 focuses on descriptive and nonparametric tests. The primary goal of the phase 1 study is to evaluate the measures and identify any interface issues that cause participants to have problems interacting with the environment or that produce unexpected physical symptoms. The phase 2 study will test specific hypotheses using inferential statistics.

Phase 1
The data analysis for phase 1 will inform decisions related to programming, data outputs, adequacy of obtained score distributions, evaluation of the psychometric quality of the cognitive tests, and identification of any potential confounds (eg, length of VR exposure) that could impact future studies. We will examine the initial evidence for the fatigue induction effects of the three shopping conditions. We will use frequency and nonparametric procedures to evaluate the rates of observed difficulties using controllers, interacting with the environment, and following instructions generally, by age groups and by sex. We will compare self-reported feelings of distress before entering the VR environment to the self-reported symptoms after exiting the VR environment. For this analysis, the Wilcoxon signed-rank test will be used. Secondary analyses evaluate distributions of key dependent measures including self-reported CF and workload, eye-tracking data (eg, blink rate, percent eye closure, gaze fixation length, and gaze shifts), and performance data for shopping and cognitive tasks (eg, correct response and response speed), as having a score distribution of several SDs will be important when the measures are applied in hypothesis testing. Following a structured interview, responses will be analyzed for common interface or immersive content issues (eg, difficulty teleporting, difficulty reading text, and problems accessing grocery list).
For example, in phase 1, we will compare the participants' self-reported physical symptoms and eye-related symptoms from the VRSQ before versus after completing the VR shopping experience. We will use the Wilcoxon signed-rank test, given the high probability of a nonnormal distribution in the dependent measure. This comparison will provide evidence to determine whether the VR environment produces physical distress or eyestrain. We will compute the total scores for each of the observation scales. These totals indicate the number of times the participant had difficulties with the interface. We will compare the frequencies of interface problems in older and younger and male and female subjects using the chi-square test. These are structured statistical analyses planned as part of the formal UX results. Exploratory procedures will be used to identify the relationship between user behavior (eg, number of items reviewed, distance traveled in the environment, and accuracy of shopping behaviors) and measures of shopping enjoyment, shopping experience, and sense of immersion in the environment. For these analyses, we will use Spearman rank-order correlations.
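To make the paired pre-versus-post comparison concrete, the following is a minimal sketch of the Wilcoxon signed-rank statistic in plain Python. It is an illustration only and not the SAS procedure planned for the study: the function names and the VRSQ totals are hypothetical, and the P value relies on a simple normal approximation without tie or continuity correction.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Return (W_plus, W_minus, two_sided_p) for paired scores.

    Zero differences are dropped. The p-value uses the plain normal
    approximation (no tie or continuity correction), so it is only a
    rough guide for small samples.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    return w_plus, w_minus, erfc(abs(z) / sqrt(2))


# Hypothetical VRSQ totals for six participants (made-up numbers).
pre_vr = [2, 3, 1, 4, 2, 5]
post_vr = [4, 3, 2, 7, 1, 9]
w_plus, w_minus, p_value = wilcoxon_signed_rank(pre_vr, post_vr)
```

A large imbalance between the positive and negative rank sums would suggest that symptom scores shift systematically after VR exposure.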

Phase 2
We will use the repeated measures analysis of variance (ANOVA) with the VAS-F and NASA-TLX as repeated dependent variables by shopping experience (fixed) to test the hypothesis that grocery shopping creates fatigue and workload, particularly when the person experiences interruptions and distractions. Secondary analyses will evaluate whether objective indicators of fatigue, such as eye tracking, shopping performance, and cognitive functioning (eg, pre- and postshopping processing speed and working memory), differ significantly by shopping experience using repeated measures ANOVA. A third series of analyses will evaluate the relationship among individual characteristics, perceived CF, and workload. We will primarily use correlation to evaluate the relationship among pre-existing symptoms of fatigue, anxiety, depression, personality traits, and fatigue susceptibility. Cognitive measures from the NIH toolbox will be correlated with perceived fatigue and workload to identify whether cognitive abilities influence the perception of cognitive workload and fatigue.
For example, provided that the distribution of the dependent measures allows for this statistical procedure, we will use the repeated measures ANOVA to test the primary hypothesis that the cognitive activity of shopping for specific items will create a greater perception of mental workload and fatigue compared with just exploring the environment. Similarly, we will use an appropriate correlation procedure to compare activity measures, such as distance traversed in the store, number of items selected and reviewed, and efficiency and accuracy of shopping activity, with perceptions of fatigue and workload. Correlation procedures will be used to assess the relationships of constructs such as personality style, cognitive ability, and fatigue susceptibility with self-reported mental workload and fatigue to identify individual differences in fatigue susceptibility. Eye-tracking measures such as percent eye closure will be explored as possible objective indicators of fatigue by serving as dependent measures in the repeated measures ANOVA by shopping experience and in correlational analyses with self-reported fatigue and workload. The actual analysis will consider the appropriateness of each procedure for each specific variable distribution.
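The logic of comparing pre- and postshopping scores across the 3 conditions can be sketched as follows. With only 2 time points, the condition-by-time interaction of a mixed ANOVA is equivalent to a one-way ANOVA on post-minus-pre change scores, so the sketch computes that F statistic directly. This is an illustrative simplification under that assumption and not the SAS model that will be fitted; the function names and VAS-F values are hypothetical.

```python
from statistics import mean


def change_scores(pre, post):
    """Post-minus-pre change for each participant."""
    return [b - a for a, b in zip(pre, post)]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical VAS-F scores as (pre, post) lists per condition (made-up numbers).
vas_f = {
    "control": ([10, 12, 11], [11, 14, 14]),
    "cognitive challenge": ([10, 11, 12], [14, 16, 18]),
    "emotional challenge": ([9, 10, 11], [16, 18, 20]),
}
changes = [change_scores(pre, post) for pre, post in vas_f.values()]
f_stat, df_between, df_within = one_way_anova_f(changes)
```

The resulting F value would be compared against the F distribution with the returned degrees of freedom; that lookup is omitted here to keep the sketch dependency free.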

Results
This study was approved by the scientific review committee of the NINR and identified as an exempt study by the institutional review board of the NIH. Data collection will begin in spring 2021.

Overview
The development of a complex, immersive VR environment requires close collaboration between individuals from multiple disciplines. The iterative design of the grocery store involves simulation of activities (eg, selecting objects using various techniques), legibility assessment of various product creation strategies, user testing by team members to identify potential sources of physical discomfort (eg, effect of antialiasing on visual acuity and developing headache), comparison of movement modalities (eg, walking vs sitting), ambient environmental factors (eg, store sounds and signage), and sizing of store elements (eg, shelf height, length, and store size). In addition, the research team will implement several simulations to evaluate the software performance and integrity of the data outputs. For each activity, the team of engineers, graphic designers, clinical experts, and researchers evaluated the relative impact of design on study requirements, UX, and software functionality. This process requires a high degree of communication and knowledge sharing.
The digital development process is fraught with potential pitfalls, particularly if team communication breaks down, and a collaborative spirit is diminished. For example, the design of the user interface can have a significant effect on the cognitive demands of using the software. If not created collaboratively, the resulting user interface may create a confound in the interpretation of the cognitive processes required for performing an IADL, as unintended skills may be introduced into the process. When communication is effective, multiple options for the experience are evaluated, such as comparing the use of different processes to remove an individual item from a shelf. Some of these options produce unintended consequences associated with product legibility and the potential for users to develop headaches from the experience. However, a seemingly less natural object selection process (eg, point and trigger pull) alleviates these issues with only a slight reduction in the sense of realism. Similarly, creating intricately detailed products had a negative effect on software functionality (eg, lower flicker fusion rate), which produces an unpleasant experience for the user. By reducing the object vectors and polygons, it is possible to maintain a high degree of realism without interfering with the software functionality. Researchers wishing to deploy complex, immersive VR experiences must anticipate the myriad of factors that potentially introduce confounding variance that reduces the fidelity of an intervention or the measurement of key constructs. In our experience, team communication of design requirements, relying on an interdisciplinary set of skills and knowledge, continuous informal UX testing, and applying an iterative design approach are necessary for effectively using VR as a research platform.

Strengths and Limitations of This Study
CF is a complex phenomenon influenced by task, environment, personal experience, and individual differences. Our experimental conditions included a familiar task performed in a realistic immersive VR environment that allows for the precise control of stimuli. The ability to control stimuli and timing of events will enable us to determine the relative contribution of distraction, boredom, task complexity, and person characteristics on the development of CF. The strength of the immersive VR experience is the capacity to create a cognitive experience that closely aligns with real-life demands. Our ability to control the presence and timing of interfering factors enables us to assess environmental influences that would be almost impossible to standardize using an actual grocery store.
The immersive VR environment allows us to seamlessly use multiple measures of CF. We will use subjective indicators of CF and workload to better understand how perceived fatigue (eg, physical fatigue: tiredness and sleepiness; cognitive fatigue: efficiency and difficulty in concentrating) and workload (eg, mental demand, effort, and frustration) relate to the effects of tasks, environments, and other factors. Potential objective measures of CF, such as changes in behavior (eg, performance efficiency, shopping list rechecking, rate, and the efficiency of movement) and changes in eye movement, can be measured unobtrusively. The use of a randomized controlled design is a strength of this study. Participants will be randomly assigned to 1 of the 3 shopping conditions to control for any confounding effects of person-level background characteristics (eg, age) that may affect fatigue or reactions to the VR experience.
The primary weakness of the study is the potential for the immersive VR environment itself to create feelings of eye strain and fatigue. This visual effect of the VR environment may have a stronger impact compared with the fatigue effects of the cognitive task, reducing the observed differences between experiences. We are mitigating the potential of VR-induced fatigue by using high-resolution HMDs. In addition, we will measure symptoms of physical distress pre- and postimmersion to identify any signs of physical distress that could affect the levels of self-reported fatigue. We are limiting the potential for motion sickness by using the teleport function for movement and other changes to the visual presentation to minimize any potential for headaches. The UX study is performed to specifically address questions of usability, including identifying any factors that might produce physical discomfort.

Conclusions
Our initial informal user testing indicated a high sense of immersion and realism in the virtual shopping experience. We will continue to modify the shopping experience to meet the research goals of evaluating the effect of cognitive and emotional factors that influence fatigue onset. The store size will be 18,000 square feet, consistent with the dimensions of a small grocery store in the United States with hundreds of unique items created. Additional products are being created to give the store correct proportionality, typicality in selection options, and a visual experience that is consistent with a grocery shopping experience in the United States. The software will be ready for formal UX testing as outlined in this paper in the spring of 2021. We anticipate that the virtual shopping experience will provide a wealth of data related to the experience of CF while performing routine activities.

Introduction
Early childhood caries (ECC) is a significant public health problem in children aged 6 years and under living in South Africa [1]. According to the Global Burden of Disease study, there are 532 million prevalent cases of untreated dental caries in primary teeth [2].
Untreated dental caries has many adverse effects that can affect physical development, including increased absenteeism from school [3], low BMI [4,5], negative educational outcomes [3], and poor oral health-related quality of life [6,7].
Children are at the highest risk of developing dental caries as they are vulnerable and depend on their caregivers for their dietary needs and oral hygiene. Dental caries develops over time and is a consequence of the demineralization of tooth enamel by acids produced during the metabolism of sugars by cariogenic bacteria [8]. The early stages of the disease are often asymptomatic, but if left untreated, dental caries can result in severe pain and life-threatening infections.
Global statistics show an inconsistent prevalence of ECC between different continents and within the same country. In 2007, the prevalence of ECC in children under 5 years of age was 40% in Brazil [9], and in 2016, it varied between 41.9% and 16% in 2 separate districts in India [10,11]. Ismail and Sohn [12] conducted a systematic review in China and reported that the prevalence of ECC in the country was between 78.6% and 85.5%. A later study by Zhang et al [13] recorded prevalence rates between 0.3% and 70.7% in children aged 1-6 years in the same country.
The most recent prevalence rate of ECC in China was documented by Zeng et al [14], who recorded it to be 49.13% in preschool children between ages 3 and 5 years in a southeast Chinese province. Similar varying prevalence rates were recorded across continents (ranging from 22.9% in India to 90% in Indonesia) [15].
In South Africa, the national prevalence rate of ECC is 60% among children under 6 years of age [1]. The prevalence of dental caries in children aged between 2 and 4 years in Johannesburg was 47.74% [21], whereas in the Western Cape, this varied from 71.6% [22] to 80% [23].
For the allocation of resources necessary to manage ECC effectively, it is important to understand the demographics of South Africa. The country is inhabited by 55.7 million people, among which 10.3% are under the age of 5 years [24]. Approximately 20% of children reside with either a grandparent or a caregiver [25], and 13.1% of households live in informal dwellings. Many families lack access to basic amenities including electricity, clean water, food, and a stable income [26] and more than one-quarter of the population rely on social grants, particularly in the poorest provinces [25]. Furthermore, the prevalence of HIV in the country was estimated to be 13.1% in 2018 [24]. With the high level of poverty, lack of access to infrastructure, and high HIV prevalence, the prevention of ECC has not been a priority in this country.
The purpose of this study is to determine the prevalence and severity of ECC in South Africa in children under 6 years of age. To our knowledge, this will be the first scientifically conducted systematic review on the prevalence of ECC in South Africa.

Study and Ethics Approval
This protocol was prepared in accordance with the PRISMA-P (Preferred Reporting Items for Systematic reviews and Meta-Analyses for Protocols) guidelines [27]. Ethics approval was not required as this is not a primary study involving participants. The study protocol was registered with PROSPERO (CRD42018112161) on November 21, 2018.

Types of Studies
Cross-sectional and cohort studies reporting the prevalence of ECC in healthy children aged 6 years and under living in South Africa will be included in the review. This is a prevalence/incidence study, and consequently, no interventions will be assessed. The primary outcome is the prevalence/incidence and severity of ECC. The severity of ECC will be measured using the WHO guidelines in infants and children up to the age of 6 years. The WHO criteria include dmft scores (decayed, missing, and filled teeth; lower case indicates deciduous teeth) and the percentage of children that are caries free (including noncavitated caries [white spot lesions]).
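As an illustration of the outcome measures described above, the following sketch computes a dmft score per child (the sum of decayed, missing, and filled primary teeth) and the percentage of children who are caries free. The class and function names and the examination records are hypothetical. The sketch treats a child with dmft = 0 as caries free and does not model the handling of noncavitated (white spot) lesions, which is an assumed simplification.

```python
from dataclasses import dataclass


@dataclass
class ChildExam:
    """Counts of primary teeth per child (hypothetical record layout)."""
    decayed: int
    missing: int  # missing because of caries
    filled: int

    @property
    def dmft(self):
        return self.decayed + self.missing + self.filled


def summarize(exams):
    """Return (mean dmft, percentage of children with dmft == 0)."""
    n = len(exams)
    mean_dmft = sum(e.dmft for e in exams) / n
    caries_free_pct = 100 * sum(1 for e in exams if e.dmft == 0) / n
    return mean_dmft, caries_free_pct


# Made-up examination records for four children.
exams = [ChildExam(2, 0, 1), ChildExam(0, 0, 0), ChildExam(1, 1, 0), ChildExam(0, 0, 0)]
mean_dmft, caries_free_pct = summarize(exams)
```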
Scientific articles published in all South African official languages will be included in the review. Non-English articles will be translated by the Department of Foreign Languages, University of the Western Cape or a reputable translation company. To authenticate the translations, we will cross-reference the original article with the English abstract (which is usually available online), and reverse translations will be conducted to ensure their correctness.
Commentaries/letters and other gray literature will be excluded from this review.
Secondary searching (PEARLing) will be conducted (PEARLing is a search strategy where the reference lists of all the studies, whether included or excluded, are identified for possible inclusion). Manual searching will not be conducted due to the difficulty in replicating this method.

Study Selection
The articles will be uploaded into Rayyan [28] and screened in 2 stages. Two review authors (FK-D and TR) will independently assess the titles and abstracts of the studies and compare them against the inclusion criteria. The full texts of eligible papers and those that contain insufficient information will be sourced.
Other reviewers will be consulted when a disagreement pertaining to the inclusion of a publication arises. The searching process will include all prevalence studies up to November 15, 2020. All eligible studies will be included, and authors will be contacted if any clarification is needed.
After reading the full-text articles, those that do not meet the inclusion criteria will be discarded and the reasons will be recorded in the "Characteristics of excluded studies" table. The reference list of all included publications will be reviewed for additional eligible studies.

Data Extraction and Management
Two reviewers (FK-D and TR) will independently extract data onto a standardized data extraction form (initially piloted on a small sample of studies) using Microsoft Excel (2014). Upon completion of data collection, the data will be uploaded to the University of the Western Cape's data repository for safekeeping [29]. The data extraction form will be pilot tested, and the independent reviewers will be trained on how to extract data. The completed forms will be compared, and any differences in opinion will be resolved by discussion and consultation with the other reviewers. If any information from the studies is unclear or missing, the corresponding authors of the original papers will be contacted (where feasible). Study information will include author, title, year of publication, study design, and year in which the study was conducted. Participant-level data will include age, the province where the study was conducted, dmft score and standard deviation, number of cases and total sample size, and urban/rural setting. Pooled prevalence will be obtained by dividing the number of participants with caries by the number of participants in the whole population, and the data will be analyzed using Stata (StataCorp LLC). Pooled standard deviations will be calculated using the Cohen (1998) formula [30].
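The two pooling calculations described above can be sketched as follows. The pooled prevalence follows the stated rule (total cases divided by total participants). For the pooled standard deviation, the sketch assumes the common pooled-variance form, the square root of the sum of (n - 1) times the squared SD divided by the sum of (n - 1), extended to any number of studies; whether this matches the cited formula [30] in every detail is an assumption. Function names and input values are hypothetical.

```python
from math import sqrt


def pooled_prevalence(studies):
    """studies: list of (cases, sample_size). Total cases over total participants."""
    total_cases = sum(cases for cases, _ in studies)
    total_n = sum(n for _, n in studies)
    return total_cases / total_n


def pooled_sd(groups):
    """groups: list of (sample_size, sd). Assumed pooled-variance formula."""
    numerator = sum((n - 1) * sd ** 2 for n, sd in groups)
    denominator = sum(n - 1 for n, _ in groups)
    return sqrt(numerator / denominator)


# Made-up study summaries.
prevalence = pooled_prevalence([(30, 100), (50, 100)])
sd = pooled_sd([(11, 1.0), (11, 3.0)])
```

Note that this simple ratio weights each study by its sample size and does not account for between-study variation.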

Availability of Data and Materials
All data, irrespective of the quality of publication, will be included in the review. If details on study publications cannot be obtained, a librarian will be consulted, and if the study remains non-obtainable, it will not be included in the qualitative or quantitative analysis.

Study Quality and Risk of Bias Assessment
The quality assessment of studies will be conducted using the Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Studies Reporting Prevalence Data [31].

Analysis of Study Findings
A meta-analysis using a random-effects model will be conducted in Stata 17 only if 4 or more studies with similar comparisons report the same outcomes.
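For readers unfamiliar with random-effects pooling, the following is a minimal sketch outside Stata. The protocol does not name an estimator, so the DerSimonian-Laird method is assumed here, applied to untransformed proportions with variance p(1 - p)/n; both choices are simplifications. The function name, the `min_studies` parameter, and the study counts are hypothetical.

```python
from math import sqrt


def random_effects_prevalence(studies, min_studies=4):
    """studies: list of (cases, sample_size).

    Returns (pooled, ci_low, ci_high), or None when fewer than
    min_studies studies are available. Assumes DerSimonian-Laird
    on raw proportions (no logit or arcsine transformation).
    """
    if len(studies) < min_studies:
        return None
    props = [cases / n for cases, n in studies]
    variances = [p * (1 - p) / n for p, (_, n) in zip(props, studies)]
    if any(v == 0 for v in variances):
        raise ValueError("a prevalence of 0 or 1 needs a transformation or correction")
    w = [1 / v for v in variances]
    fixed = sum(wi * p for wi, p in zip(w, props)) / sum(w)
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, props))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * p for wi, p in zip(w_re, props)) / sum(w_re)
    se = sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


# Made-up counts from four hypothetical studies.
result = random_effects_prevalence([(20, 100), (40, 100), (60, 100), (80, 100)])
```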

Assessment of Heterogeneity
This review will include diverse modalities of interventions and will result in heterogeneity of the content of interventions, outcomes, and outcome measures. We will contemplate the feasibility of conducting a meta-analysis on a subgroup of included studies once the data have been extracted. Where feasible, we will assess the statistical heterogeneity in the meta-analysis by visually inspecting the scatter of effect estimates on the forest plots, by using the Cochran Q test (.10 level of significance), and by using the I² statistic [32].
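The two heterogeneity statistics named above can be computed from study-level effect estimates and their variances, as in the sketch below. It uses inverse-variance weights and the standard definition of I² as the percentage of Q in excess of its degrees of freedom, floored at 0. The function name and the input values are hypothetical, and the chi-square P value for Q is omitted because it requires a statistical library.

```python
def cochran_q_and_i2(estimates, variances):
    """Return (Q, degrees of freedom, I-squared in percent)."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Made-up prevalence estimates with equal variances.
q, df, i2 = cochran_q_and_i2([0.2, 0.4, 0.6, 0.8], [0.01, 0.01, 0.01, 0.01])
```

In practice, Q would be compared with a chi-square distribution on the returned degrees of freedom at the .10 level.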

Assessment of Reporting Biases
Where possible, we will use multiple sources of data, including those from unpublished trials. Should a meta-analysis be conducted, we will assess publication bias according to the recommendations described in the Cochrane Handbook for Systematic Reviews of Interventions [32]. Reporting biases such as selective reporting, duplication, and language of publication will be investigated.

Analysis of Subgroups or Subsets
We will use a subgroup analysis to examine heterogeneity using Stata 17. This will include exploring the influence of factors such as participant age, province, and urban/rural status. If sufficient numbers of studies are included, a meta-analysis will be performed.

Results
This protocol was registered with PROSPERO in October 2018, and the electronic searches were completed by November 15, 2020. The original search yielded 2247 articles.

Principal Results
The study aims to assess the prevalence and severity of dental disease in children under the age of 6 years. The South African government does not regularly monitor the dental disease of children or adults. The last national oral health survey was conducted in 2004 in children only; adults were excluded [33]. There are plans to determine the disease prevalence and severity in South Africa in the next few years. Until then, this review will inform the dental and medical fraternity about the prevalence of ECC in South Africa.

Conclusions
There are very few studies detailing the prevalence and severity of dental disease in young children in South Africa. It is imperative that we monitor the trends of dental disease in children to inform stakeholders of this burden. Dental disease is a noncommunicable disease, and is associated with childhood obesity and childhood diabetes. More efforts need to be made to prevent the onset of dental disease, and thus prevent the incidence of other noncommunicable diseases in the future leaders of South Africa.

Background
The workload for health care workers has remained high for many years [1,2]. Several factors contribute to this trend and result in different effects for employees and the health care system [3]. Factors that promote a high workload include understaffing, long working hours [4], and information overload [5]. Work-related stress has become one of the main challenges in the health care sector [6] and has different impacts on employees. Nurses in particular report high levels of work-related stress that can lead to negative physical and psychological effects for them as well as for their patients [7]. Nurses describe themselves as feeling empty and report depressive symptoms [8,9]. In Germany, health care professionals (HCPs) have an above-average number of sick days compared to workers in other sectors; overall, there was a 29% increase in sick days between 2004 and 2018 [10]. In addition to musculoskeletal disease diagnoses, which account for the majority of sick leaves, absences due to mental illness are increasing significantly [11].
Partly responsible for the workload-promoting factors described above are the consequences of demographic changes that have led to an increase in the number of multimorbid older adult patients and a decline in the number of nursing staff. The transformation process of digitization in health care is a chance to counteract this change and its consequences. However, in Germany in particular, the process is proceeding very slowly; Germany is ranked 19th of 27 countries in Bertelsmann's Digital Health Index [12]. The application of digital health technology (DHT) is an important factor of this digitization process. DHT in the context of this work means technologies that are directly linked to outpatient and inpatient care and are applied by nurses or physicians. DHT includes hospital information systems (HIS), medical devices, and other digital applications that support patient care from the perspective of HCPs.
In addition to the positive effects of the use of DHT, there is also evidence to suggest that the use of DHT causes an extra load. This may be due to a lack of usability and user involvement as well as poor implementation processes [13,14].
Poor usability and other factors rooted in technologies can cause a high mental workload (MWL) [14]. High workloads can result in a more error-prone performance-even for experts-induced by difficulties in decision-making processes [15].
Working with patients can be considered a safety-critical environment. This means that many tasks, varying in complexity, occur within limited time windows.
In this context, decisions must be made all the time and are supported by different systems (eg, HIS) through the structured and standardized presentation of information. The interaction between users and systems is complex and interdependent, which makes it difficult to predict the effects of the systems on the users [16].
High workload or overload caused by several factors (including technology) can have a severe impact. Aside from the negative impact on patient care due to a potential increase in errors, overload can also have a negative impact on the health of HCPs, potentially resulting in technostress, mental health issues (eg, depression, burnout), and decreased job satisfaction. These are only a few of the potential negative effects of overload [17].
There is growing evidence that DHT are contributing to increasing mental health problems (eg, burnout) among health care workers [18,19].
In order to identify possible causes of mental health problems in physicians and nurses (eg, emerging burnout [20]), the investigation of MWL in different situations is a possible approach.

Mental Workload
MWL can be defined using different approaches and is usually influenced by different and multiple factors. It is multidimensional, multifaceted, and one of the most important variables to understand and predict human performance.
The possible definitional approaches of workload can be derived from two different perspectives: (1) MWL as an external variable referring to task requirements (ie, the amount of work and the number of tasks to be completed in a limited time [task load]) and (2) interaction between task and human resources resulting in a subjective psychological experience [21,22].
Summarizing different approaches, we can define MWL as the amount of attentional resources that are required to perform a task mediated by task demands and experience [15,23,24]. Following this definition, the state of overload is reached when the task demands are too high while the user's resources are limited. In contrast to this is the condition of underload, which occurs when the task requirements are too low while resources are sufficient. In both cases, the result is poorer performance [25]. Mental states such as a high workload or underload play a critical role in the occurrence of errors as well as preventable adverse events [26]. Regardless of how competent and/or experienced an HCP is, this type of mental state can lead to a higher frequency of errors.

Assessment of Mental Workload
MWL assessments were first developed and applied in other safety-critical environments such as aviation/aerospace and nuclear power plants. Safety-critical environments have similar conditions (already described). Due to these similar conditions, workload assessment could also be a useful approach in the clinical setting.
MWL can be assessed using different techniques. A distinction between analytical and empirical methods may be drawn. Analytical methods tend to be used in system development, while empirical methods are employed when workload is to be measured directly in the executing system or in the simulation [21].
Analytical assessment methods are simulation models, expert opinions, or task analyses. Empirical methods are distinguished into three different categories: performance measures, subjective methods, and physiological techniques [15]. Performance measures refer to the measures of the primary and secondary task.
Depending on the situation and the underlying question, one or more of these techniques are appropriate to apply. Several factors should be considered when selecting assessments, including sensitivity, diagnostic accuracy, intrusiveness, validity, reliability, simplicity of use, and user acceptance [27].

Objectives
DHT may contribute to the heavy workload in health care. MWL can best reflect the workload caused by technology. In addition to the existence of some methodological issues (eg, assessing MWL in the field), there are also some knowledge gaps concerning MWL caused by DHT.
The planned systematic review intends to identify the impact of DHT, particularly HIS, on the MWL of health care workers. In addition, the review will aim to assess what methods are currently being used in health care to measure MWL relating to DHT. In particular, the application of eye tracking or pupillometry as an assessment method will be investigated.

Research Questions
The review will seek to answer the following research questions:

Study Registration
The protocol is registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42021233271). This protocol follows the PRISMA-P (Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols) 2015 guidelines [28].

Eligibility Criteria
We define the inclusion criteria for this systematic review according to the PICO framework [29] and the research questions. Inclusion criteria relate to the study population (P), intervention (I), outcome (O) of the study, and study setting (C). In addition to these criteria, we include studies by study design as detailed below.

Study Design
All types of study designs reporting original primary data as well as systematic reviews that align with our other inclusion criteria will be included. We will exclude commentaries, letters, guidelines, and narrative reviews.

Study Participants
We focus on HCPs who work with HIS or DHT and who are directly engaged in patient care. These can be nurses, physicians, radiology assistants, or other clinicians. It is essential that the participants are supported by the HIS/DHT in their daily work with patients. We exclude studies that focus on patients who use digital technologies.

Intervention
We include studies that investigate the effects that HIS/DHT have on workers' MWL. The focus lies on the evaluation of whether there is a direct or indirect effect of DHT on workers' MWL. Since the second research question concerns the extent to which eye tracking is commonly used as a measurement method, we focus on the inclusion of studies that apply eye tracking. We exclude studies that investigate related constructs such as technostress.

Study Setting
We include all studies that take place in inpatient or outpatient care. We exclude studies that focus on the measurement of MWL in other contexts (eg, aviation).

Information Sources
The following databases were systematically searched between February 28 and March 15, 2021, using defined keywords (and synonyms) like "mental workload," "health information system," "assessment," "health care professionals," and "eye tracking" that resulted in specified search strings: MEDLINE (PubMed), Web of Science, Academic Search Premier and CINAHL (both EBSCO), and PsycINFO. Additionally, we will search for relevant research in the reference sections of included studies as well as those of relevant recently published reviews. Following PRISMA-P [28], we organized the search terms by database and question in a separate document (Multimedia Appendix 1).

Search Strategy
The search strategy includes four categories, each represented by keywords and synonyms: technologies used (eg, HIS), population (eg, health care professionals), methods (eg, assessment), and MWL. In addition, eye tracking will be added for questions 2.1 and 2.2. The terms are linked by the Boolean operators AND or OR.
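The way the categories combine can be sketched as follows: synonyms within a category are joined with OR, and the categories are joined with AND. The keywords below are taken from the examples in this protocol, whereas the synonyms shown ("cognitive load", "nurses") and the function name are hypothetical placeholders; the actual strings are those documented in Multimedia Appendix 1.

```python
def build_search_string(categories):
    """Join terms within a category with OR and categories with AND."""
    groups = []
    for terms in categories.values():
        quoted = [f'"{term}"' for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Keywords from the protocol plus placeholder synonyms (assumed, for illustration).
categories = {
    "technology": ["health information system"],
    "population": ["health care professionals", "nurses"],
    "methods": ["assessment"],
    "outcome": ["mental workload", "cognitive load"],
}
query = build_search_string(categories)

# For the eye-tracking questions, one more category is appended.
eye_query = build_search_string({**categories, "eye tracking": ["eye tracking"]})
```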
We restrict our search to articles published in the period between 2000 and 2021. This search time frame was chosen because it documents the development of the current generation of prehospital communication technology, such as telemedicine and electronic patient care reports [30]. The literature search is limited to articles written in English or German since both reviewers have a sufficiently high level of fluency in these languages.

Data Management
Citavi is used for literature handling (ie, import and further screening). The Rayyan web-based screening tool is used to perform abstract screening and full-text analysis in a structured way. In this context, the inclusion and exclusion criteria are also provided; they will be the basis for the abovementioned analysis process. The included articles will then be imported to a Microsoft Excel (Microsoft Corp) spreadsheet.

Selection Process
The selection process will be performed by two reviewers (LK and BB; if a consensus cannot be reached, ML and RR will serve as additional reviewers) according to the PRISMA guidelines and will be displayed in a flowchart. First, both reviewers will assess the studies regarding the inclusion and exclusion criteria for abstract screening. In the next step, the full texts of the resulting studies will again be assessed independently. Finally, we will search the references of the papers for further potentially eligible studies. In case of disagreements in any of the phases, a discussion between the two reviewers (LK and BB) based on the inclusion criteria will be attempted first. If the discussion is inconclusive, a third reviewer (ML or RR) will be involved.

Data Collection Process
For data extraction, an Excel spreadsheet based on the outcomes of the review will be used. To ensure uniformity across reviewers, we will conduct a pretest standardization exercise before starting the data extraction process. Each reviewer will extract the themes of interest to an Excel spreadsheet. The extracted data items are presented below.

Data Items
LK and BB will read the full texts and extract information concerning identified and relevant aspects of the studies. We will differentiate between main study characteristics, measurements and outcomes, and relevant findings and recommendations. The aspects are aggregated in Table 1, Table 2, and Table 3. In addition to the descriptive presentation of study characteristics and findings, we aim to extract factors or aspects of DHT that contribute to increased MWL. Furthermore, we would like to extract how the included studies assess workload and in which settings eye tracking is used with regard to specific outcomes. Based on the extraction, we would like to develop an overview of the methods that can be used to measure MWL caused by DHT and that provide meaningful and valid data.
The methods, settings, and outcomes will be organized into logical categories that are rated by the reviewers. The typical categories of methods referring to MWL assessments are analytical or empirical techniques. Typical categories for settings are laboratory or field. Categories referring to assessed outcomes have to be defined during the reviewing process. In each category, we will extract how often an indicator for a category was applied (category percentage, ie, method applied/N studies) and how often combinations of specific indicators were used (total percentage, eg, method A with setting B and outcome C; combination applied/N studies). A typical indicator for the category empirical technique would be a questionnaire. If an indicator was identified, the reviewers fill the corresponding cell with a 1; if no indicator was identified (eg, if the method was not applied), the cell is filled with a 0. An example is displayed in Figure 1.
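The two percentages described above can be sketched as a short calculation. This is a minimal illustration under stated assumptions: the four studies and the indicator names (questionnaire, laboratory, eye_tracking) are invented example data, and the function names are hypothetical; they do not reflect the actual extraction table.

```python
# Illustrative sketch only: the rows below are made-up example data, not
# results of the review. Each row is one study; 1 = indicator identified,
# 0 = indicator not identified.
studies = [
    {"questionnaire": 1, "laboratory": 1, "eye_tracking": 0},
    {"questionnaire": 1, "laboratory": 0, "eye_tracking": 1},
    {"questionnaire": 0, "laboratory": 1, "eye_tracking": 1},
    {"questionnaire": 1, "laboratory": 1, "eye_tracking": 1},
]


def category_percentage(rows, indicator):
    """Share of studies (in percent) in which a single indicator was applied."""
    return 100 * sum(row[indicator] for row in rows) / len(rows)


def combination_percentage(rows, indicators):
    """Share of studies (in percent) in which all listed indicators co-occur."""
    hits = sum(all(row[name] == 1 for name in indicators) for row in rows)
    return 100 * hits / len(rows)


questionnaire_share = category_percentage(studies, "questionnaire")
lab_questionnaire_share = combination_percentage(
    studies, ["questionnaire", "laboratory"]
)
print(questionnaire_share, lab_questionnaire_share)
```

In this example data, the questionnaire indicator appears in 3 of 4 studies (75%), and the combination of questionnaire and laboratory setting appears in 2 of 4 studies (50%).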

Outcomes
The primary outcome of the first research question is to explore the correlations between DHT and the MWL of HCPs. The secondary outcome is to investigate the type of effect (direct/indirect) DHT has on the MWL of HCPs as well as the aspects of DHT that contribute to MWL.
The primary outcome of the second research question is the exploration of the best method to determine this relationship. Particular attention will be given to the role of eye tracking technology, which will be included as a secondary outcome.

Risk of Bias in Individual Studies
For the review, two authors will independently rate the methodological quality of the identified studies using the Joanna Briggs Institute Critical Appraisal Tool [31]. An initial screening of studies that could be included indicates a small proportion of studies with an experimental design and adequately defined criteria for conducting the study and analyzing the data. Disagreements will be resolved via discussion (LK and BB) or by a third reviewer (ML or RR), if necessary.

Data Analysis and Synthesis
After screening the search results, we do not expect to be able to conduct a meta-analysis. A first look revealed that comparing the study designs and effect measures of studies will be difficult. This may be explained by the explorative character of the review and the potentially low level of evidence, especially regarding eye tracking. Instead, we will perform a descriptive analysis to summarize the data. To do this, we will first compare the studies in terms of the evaluation methods used (qualitative, quantitative, or mixed methods), followed by a comparison of survey methods.
For data synthesis, we use two nonquantitative approaches: tabulation and a narrative approach. Table 1 and Table 2 describe the tabular synthesis of potential findings.
In a first step, all main characteristics of each study will be extracted (ie, study design, setting of target population, sample size, age, sex, population type). Studies that do not report those main characteristics and those with a sample size under 20 participants will be excluded. We will analyze studies regarding objectives, outcomes, and assessments, as well as type of DHT. Data on overall MWL in studies, MWL levels related to DHT, quality criteria of assessments, applied eye tracking, and outcomes assessed via eye tracking will be extracted.
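The exclusion rule described above can be expressed as a simple check. The following is a minimal sketch, assuming each extracted study is stored as a dictionary; the field names and the helper `is_eligible` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: field names are hypothetical labels for the main
# characteristics listed in the text.
REQUIRED_FIELDS = (
    "study_design",
    "setting",
    "sample_size",
    "age",
    "sex",
    "population_type",
)


def is_eligible(study, min_sample_size=20):
    """Keep a study only if all main characteristics are reported and the
    sample size is not under the minimum (20 participants)."""
    if any(study.get(field) is None for field in REQUIRED_FIELDS):
        return False
    return study["sample_size"] >= min_sample_size


example = {
    "study_design": "cross-sectional",
    "setting": "laboratory",
    "sample_size": 20,
    "age": "mean 35 years",
    "sex": "mixed",
    "population_type": "nurses",
}
print(is_eligible(example))
```

A study with exactly 20 participants is retained under this reading, because only studies with fewer than 20 participants are excluded.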
All included studies are evaluated with regard to their risk of bias. A textual narrative synthesis of all included studies will be made and the comparable findings will be synthesized. Additionally, a descriptive analysis of eye tracking measures is planned.

Results
As the systematic review is currently ongoing, no results are available yet. The preliminary searches have been completed, and the piloting of the study selection process as well as the formal screening against eligibility criteria have started. We are currently analyzing the data and expect to complete the review in summer 2021.

Discussion
The aim of the review is to show which methods are currently used to measure MWL in health care and the impact of such technologies on the workload of HCPs. Additionally, the role of eye tracking should be evaluated.
In the discussion section of the review, we will discuss the results and the methodological quality of the findings, strengths and weaknesses of the review (limitations), and research gaps and opportunities for future research.