Published in Vol 13 (2024)

Benchmarking Mental Health Status Using Passive Sensor Data: Protocol for a Prospective Observational Study



Department of Psychology, University of Utah, Salt Lake City, UT, United States

Corresponding Author:

Robyn E Kilshaw, MS

Department of Psychology

University of Utah

380 S 1530 E BEH S 502

Salt Lake City, UT, 84112

United States

Phone: 1 (801) 581 6124

Fax: 1 (801) 581 5841


Background: Computational psychiatry has the potential to advance the diagnosis, mechanistic understanding, and treatment of mental health conditions. Promising results from clinical samples have led to calls to extend these methods to mental health risk assessment in the general public; however, data typically used with clinical samples are neither available nor scalable for research in the general population. Digital phenotyping addresses this by capitalizing on the multimodal and widely available data created by sensors embedded in personal digital devices (eg, smartphones) and is a promising approach to extending computational psychiatry methods to improve mental health risk assessment in the general population.

Objective: Building on recommendations from existing computational psychiatry and digital phenotyping work, we aim to create the first computational psychiatry data set that is tailored to studying mental health risk in the general population; includes multimodal, sensor-based behavioral features; and is designed to be widely shared across academia, industry, and government using gold standard methods for privacy, confidentiality, and data integrity.

Methods: We are using a stratified, random sampling design with 2 crossed factors (difficulties with emotion regulation and perceived life stress) to recruit a sample of 400 community-dwelling adults balanced across high- and low-risk for episodic mental health conditions. Participants first complete self-report questionnaires assessing current and lifetime psychiatric and medical diagnoses and treatment, and current psychosocial functioning. Participants then complete a 7-day in situ data collection phase that includes providing daily audio recordings, passive sensor data collected from smartphones, self-reports of daily mood and significant events, and a verbal description of the significant daily events during a nightly phone call. Participants complete the same baseline questionnaires 6 and 12 months after this phase. Self-report questionnaires will be scored using standard methods. Raw audio and passive sensor data will be processed to create a suite of daily summary features (eg, time spent at home).

Results: Data collection began in June 2022 and is expected to conclude by July 2024. To date, 310 participants have consented to the study; 149 have completed the baseline questionnaire and 7-day intensive data collection phase; and 61 and 31 have completed the 6- and 12-month follow-up questionnaires, respectively. Once completed, the proposed data set will be made available to academic researchers, industry, and the government using a stepped approach to maximize data privacy.

Conclusions: This data set is designed as a complementary approach to current computational psychiatry and digital phenotyping research, with the goal of advancing mental health risk assessment within the general population. This data set aims to support the field’s move away from siloed research laboratories collecting proprietary data and toward interdisciplinary collaborations that incorporate clinical, technical, and quantitative expertise at all stages of the research process.

International Registered Report Identifier (IRRID): DERR1-10.2196/53857

JMIR Res Protoc 2024;13:e53857




Modern mental health care is the product of a tremendous volume of basic and applied research. Until recently, the majority of this research has relied on methods and practices (eg, randomized controlled trials [1]) that require large amounts of human labor to recruit participants (eg, through in-person recruitment at clinics or calls to members of research registries) and collect the necessary data (eg, through clinical interviews, performance-based tasks, or manual extraction from eHealth records). These methods and practices also tend to generate siloed data sets that are difficult to share beyond the original study team (see [2] for a review). Though existing research using these methods and practices has produced very valuable findings and greatly improved treatment options for individuals coping with mental health challenges, the pace of development is slow [3], and recent estimates suggest that only 40% to 50% of Americans who need mental health care receive treatment [4]. Novel approaches to mental health research are needed to address these challenges, and recent advances in computational analysis and digital technology have the potential to do just that (see [5-7] for reviews).

Computational psychiatry is an approach that includes both theory- and data-driven applications of mathematical modeling with the goal of advancing the diagnosis, mechanistic understanding, and treatment of mental health conditions [8-10]. For example, on the theory-driven side, reinforcement-learning models applied to functional magnetic resonance imaging data have helped predict posttreatment outcomes (ie, abstinence vs relapse) in patients with alcohol dependence [11], and Bayesian models of cognitive task performance have improved our understanding of how the ability to flexibly update previous beliefs differs between diagnoses (see [12] for a review). Similarly, data-driven studies using machine learning models with high-dimensional data (eg, electroencephalogram and magnetic resonance imaging data) have been able to accurately distinguish patients with schizophrenia from controls [13,14] and have also identified multivariate “biomarkers” that have helped improve pharmacological treatment-response prediction in patients with depression [15-17].

Enthusiasm about this body of work has led to numerous calls to extend these methods to mental health risk assessment in the general public [18-20]. However, one of the largest barriers to doing so is that many of the data sources used in computational psychiatry research with clinical samples (eg, brain imagery data and standardized task performance data) are not available for community-dwelling individuals in the general population. As a result, researchers are increasingly leveraging the near ubiquity of personal digital devices (PDDs) [21] (eg, smartphones and smartwatches) as a more scalable and accessible means of collecting high-dimensional behavioral, contextual, and even physiological data streams from individuals [22,23]. Research of this type represents a subset of the broader computational psychiatry field commonly referred to as digital phenotyping.

Digital phenotyping refers to using quantitative methods with PDD data—particularly passive sensor data (eg, GPS, accelerometry, call and text logs, and app usage)—to identify behavioral phenotypes or “digital biomarkers” relevant to mental health [24-26]. Thus far, digital phenotyping research has primarily focused on using computational methods such as machine learning models for detecting, monitoring, and predicting changes in symptom severity as well as predicting or improving treatment response in clinical samples (for reviews, see [27-29]). For example, machine learning algorithms applied to passive sensor data have been able to successfully identify depressive and manic episodes in individuals with bipolar disorder [30,31] and predict psychotic relapses in patients with schizophrenia [32]. Furthermore, digital biomarkers derived from PDD data have demonstrated reasonable accuracy for predicting treatment response to transcranial magnetic stimulation in patients with depression [33]. Finally, a growing body of research demonstrates that passive sensor data from PDDs can also be used to identify common risk factors for mental health conditions (eg, stress, depressed mood, and anxiety) in clinical [34] and student samples [35].

This nascent body of research provides an empirical and methodological foundation for extending digital phenotyping to identify markers of mental health risk in the general population. While, to the best of our knowledge, no studies have done this using a prospective study design with community-dwelling adults, existing computational psychiatry and digital phenotyping research, along with guidance from leading advocates in these fields, provides a strong set of recommendations for generating a data set that is well-suited to this objective. These recommendations include: (1) using a transdiagnostic and dynamic understanding of mental health to guide study design and data collection [20,36]; (2) incorporating the data requirements of cutting-edge computational methods into data collection methods [37,38]; and (3) using careful consent procedures and data curation methods to ensure that data can be safely and ethically shared with researchers across academia, industry, and government so as to harness the expertise of diverse professionals working toward improving mental health [39].


On the National Institute of Mental Health Data Archive [40], there currently exists a small number of shareable computational psychiatry data sets that include longitudinal PDD data from community-dwelling participants [41,42], while other such data sets are in the process of being created [43,44]. Nevertheless, all of these data sets comprise participants who either meet specific diagnostic criteria (eg, trauma exposure [42], diagnosis with a serious mental illness [43], and binge drinking [44]) or developmental characteristics (eg, school-attending adolescents) and are best suited for identifying digital biomarkers of mental health risk in clearly defined subsamples of the population. Therefore, as a complement to this existing work, we aim to create the first computational psychiatry data set that is tailored to studying mental health risk in the adult general population, includes PDD sensor-based behavioral features, and is designed to be widely shared across academia, industry, and government using gold standard guidelines for privacy, confidentiality, and data integrity.

To maximize the relevance and use of this data set for researchers from a wide array of disciplines, we will use the guidelines listed above to ensure our proposed data set is optimized for both mental health and computational considerations. On the mental health side, our proposed data set will be informed by state-of-the-art etiological and phenomenological models of mental health and mental health disorders. On the computational side, our proposed methods are designed to generate a high-dimensional, multimodal, and multirate feature set with maximum variability across levels of measurement and analysis, a balanced classification design, and minimal missing data [22,38]. These considerations will therefore support the primary objective of this project, which is to create a data set that, through its design and accessibility, has the potential to advance mental health risk assessment in the general population using digital phenotyping methods. Although we do not have specific hypotheses or analytic plans guiding the creation of this data set, we believe that in combination with computational methods, the data we collect could be used to investigate research questions such as the following: can PDD sensor-based features predict the likelihood of a future mental health event (eg, receiving a psychiatric diagnosis or treatment) at rates significantly above chance? and does incorporating information about preexisting vulnerability factors (eg, difficulties with emotion regulation and past mental health conditions) improve the accuracy of these predictions?


We are recruiting 400 individuals at varying levels of risk for experiencing an episodic mental health disorder (eg, depression, anxiety, and adjustment disorder) during their lifetime. To achieve this goal, we are using a stratified, random sampling design with 2 crossed factors: (1) difficulties with emotion regulation [45], a transdiagnostic risk factor for a wide range of mental health conditions [46], and (2) perceived overall life stress during the past 30 days [47]. These 2 factors are assessed during screening, and eligible individuals will be included in the sample such that 40% to 60% of participants report higher than average difficulties regulating strong emotions and 40% to 60% report currently experiencing significant life stress. This sampling design and the choice of these 2 factors were guided by the diathesis-stress model of psychopathology, which suggests that many episodic (ie, non-neurodevelopmental [48]) psychological disorders are the result of an interaction between preexisting vulnerabilities (in this case, difficulties with emotion regulation) and stress due to life experiences [49]. Participants must also meet the following eligibility criteria: be 18 years of age or older, currently live in Utah, have a smartphone with an active cellular data plan and an Apple or Android operating system, be able to speak and read English fluently, and receive their health care through either the University of Utah Health (UHealth) or Intermountain Health Care (IMHC) systems.
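The marginal quotas described above can be monitored with a short check during enrollment. The following is a minimal sketch and not the study's enrollment software: the function name, the record layout, and the example sample are hypothetical, and only the 40% to 60% bounds come from the text.

```python
def within_quota(flags, low=0.40, high=0.60):
    """Return True if the share of True values in `flags` lies within [low, high]."""
    if not flags:
        raise ValueError("no participants enrolled yet")
    share = sum(flags) / len(flags)
    return low <= share <= high


# Hypothetical enrolled sample: one (high_ders, high_stress) pair per participant.
enrolled = [
    (True, True), (True, False), (True, False),
    (False, True), (False, True), (False, False),
]

ders_ok = within_quota([high_ders for high_ders, _ in enrolled])        # 3 of 6 are high
stress_ok = within_quota([high_stress for _, high_stress in enrolled])  # 3 of 6 are high
```

Checking each factor separately mirrors the text, which states the 40% to 60% target for each factor rather than for each cell of the crossed design.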

Individuals who report suicidal ideation in the past 3 months or any history of a suicide attempt, active mania, or psychosis (ie, severe mental health problems potentially requiring hospitalization during their participation) during screening are ineligible to participate. Initially, individuals who reported current symptoms of a substance use disorder (eg, daily binge drinking) and were not engaged in substance use treatment were also ineligible for participation; however, we received additional funding after beginning the study that allowed us to increase the study staffing so that we could remove this exclusion criterion. This change to the exclusion criteria occurred approximately 6 months into recruitment, after 180 participants—almost half of our target sample size of 400—had provided consent. Finally, individuals with a history of conviction for a violent crime; a history of child abuse or neglect perpetration substantiated by child protective services; and, for individuals in committed relationships, recent physical intimate partner violence are also excluded for potential ethical reasons. Specifically, licensed mental health providers and physicians are legally required to report instances of each of these final criteria to authorities in most states in the United States, and this duty is not waived by a certificate of confidentiality. The presence of these events would therefore render the data set unshareable for most purposes.


Participants are recruited using web-based advertising through the University of Utah websites, social media websites, and listservs, as well as paper fliers posted throughout the Salt Lake City area. Interested participants complete a web-based screening survey to determine eligibility. Eligible participants are then forwarded to an electronic consent form, and consenting participants complete another web-based battery of questionnaires to assess current and lifetime psychiatric and medical diagnosis and treatment, current psychological symptoms and social functioning, and demographic variables. Participants who complete this baseline battery are compensated US $25 and are scheduled for a videoconference call during which study staff instruct them on how to use the study equipment (ie, audio recorder) that is mailed to them.

Following this call, participants begin a continuous 7-day period of intensive data collection that includes wearing the audio recorder during waking hours, providing raw sensor data collected through a smartphone app (Beiwe [50]), completing a brief survey to assess mood and important events at the end of each day, and responding to a brief phone call from study staff every evening. During this phone call, study staff assess compliance with study procedures, ask participants to describe the most positive and negative events from that day, and inquire whether there is any period of the recording from that day that they wish to delete. Providing participants the opportunity to review the contents of their recordings is necessary to meet ethical standards for long-term, ambulatory data collection [51]. After returning the audio recorder, participants are compensated US $10 and US $4.28 for every day that they provided raw sensor (PDD) data and audio recording, respectively.

Participants also complete web-based questionnaires 6 and 12 months after the 7-day intensive data collection phase to reassess current psychological symptoms and social functioning, as well as psychiatric and medical diagnosis and treatment during the intervening time; they are compensated US $20 for the completion of each of these questionnaires. See Figure 1 for the complete procedure flow diagram. All of these procedures are described in more depth below.

Figure 1. Procedures flowchart. DERS: Difficulties with Emotion Regulation Scale; PSS-4: Perceived Stress Scale-4.

Data Privacy and Confidentiality

To maximize data privacy and security, data from all sources are encoded and can only be matched using a key maintained in a password-protected file only accessible to approved study personnel. Study data collected from the Beiwe app are encrypted in transit and at rest, and no identifiable data are stored on participants’ devices. For complete details on all the security features of the Beiwe research platform, see [52]. Audio files are saved to a microSD card and returned with the recorders through registered mail. All digital data, both in raw and processed formats, are stored in a Health Insurance Portability and Accountability Act (HIPAA)–compliant protected environment maintained by the University of Utah Center for High-Performance Computing.

Data Sources

Self-Report Questionnaires

Participants are given the option to skip any self-report questionnaire item. See [53] for full versions of all publicly available standardized measures as well as any nonstandardized questionnaires included in this study (ie, demographics questionnaire, clinical history questionnaire, and daily events questionnaire).

Difficulties With Emotion Regulation Scale

The Difficulties with Emotion Regulation Scale (DERS) [45] is a well-validated and widely used measure of subjective emotion regulation ability [54]. A total of 36 items are responded to on a 5-point Likert scale (from 1=“almost never [0%-10%]” to 5=“almost always [91%-100%]”; range of scores 36-180), such that higher total scores indicate greater difficulty regulating emotions. Overall, 6 subscale scores can also be generated, capturing individuals’ lack of emotional awareness, lack of emotional clarity, nonacceptance of emotions, limited access to emotional regulation strategies, difficulties engaging in goal-directed behavior, and difficulties inhibiting impulsive responses in the context of strong emotions. The DERS is administered as part of the screening survey and is only presented to individuals who have already met all other inclusion and exclusion criteria. The mean total DERS score from a community sample [55] is used to classify participants as having high (ie, total score >81.64) versus low (ie, total score ≤81.64) difficulties with emotion regulation for enrollment purposes.

Perceived Stress Scale-4

The Perceived Stress Scale-4 (PSS-4) [47] is a 4-item short form of the widely used Perceived Stress Scale, which measures individuals’ subjective level of life stress during the previous month. The PSS-4 displays acceptable psychometric properties in nonclinical populations [47,56] and is used to assess how unpredictable and overwhelming individuals currently find their lives. Items are responded to on a 5-point scale anchored by 0 (never) and 4 (very often), and higher total scores indicate greater perceived stress (range of scores 0-16). The PSS-4 is only presented to eligible participants as part of the screening survey. The mean total score from a validation sample of community participants [56] is used to identify participants experiencing high (ie, total score >6.11) versus low (ie, total score ≤6.11) life stress for enrollment purposes.
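The two enrollment cutoffs can be combined to place a screened individual in one cell of the 2 × 2 sampling design. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes the total scores have already been computed with the instruments' standard scoring (including any reverse-keyed items, which the text does not detail). The cutoffs and score ranges are those stated above.

```python
DERS_CUTOFF = 81.64  # community-sample mean total DERS score cited in the text
PSS4_CUTOFF = 6.11   # community validation-sample mean total PSS-4 score cited in the text


def classify_stratum(ders_total, pss4_total):
    """Return (emotion regulation difficulty level, life stress level) for one person."""
    if not 36 <= ders_total <= 180:
        raise ValueError("DERS total must be between 36 and 180")
    if not 0 <= pss4_total <= 16:
        raise ValueError("PSS-4 total must be between 0 and 16")
    ders_level = "high" if ders_total > DERS_CUTOFF else "low"
    stress_level = "high" if pss4_total > PSS4_CUTOFF else "low"
    return ders_level, stress_level
```

Because both totals are integers, the cutoffs effectively separate DERS totals of 82 or more from 81 or less, and PSS-4 totals of 7 or more from 6 or less.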

Demographics Questionnaire

As part of the baseline battery, participants are asked to report on a number of demographic factors, including age, biological sex, gender, sexuality, relationship status, race, ethnicity, spoken languages, religion, education history, and current employment and income.

Clinical History Questionnaire

Participants are asked to provide information about their current and past physical and mental health, including smoking status; history of a major medical event (eg, cancer, diabetes, or stroke); current and past psychiatric diagnoses; current and past use of psychiatric medication; or other mental health treatment, including when and what service or medication was used. This questionnaire was created for this study and is included in the baseline battery as well as the 6- and 12-month follow-up questionnaires as a way of assessing any significant physical or mental health changes during the study period.

Depression, Anxiety, and Stress Scale-21

The Depression, Anxiety, and Stress Scale-21 (DASS-21) [57] is a 21-item questionnaire designed to measure depression, anxiety, and tension or stress during the past week. It is a short form of the widely used and well-validated original DASS-42 [58]. Individuals respond to items on a 4-point Likert scale from 0 (did not apply to me at all) to 3 (applied to me very much or most of the time), and 3 subscale scores are produced corresponding with the 3 negative emotional states. The DASS-21 is administered at baseline as well as at the 6- and 12-month time points.

Tobacco, Alcohol, Prescription Medications, and Other Substance Tool

The Tobacco, Alcohol, Prescription Medications, and Other Substance (TAPS) [59] tool is a 2-part measure of substance use. In the first part, individuals respond to 4 items assessing how frequently (ranging from “never” to “daily or almost daily”) they have used tobacco, alcohol, illicit drugs, or prescription drugs for nonmedical reasons in the past year. Any individual who screens positive in part 1 (ie, responds with anything other than “never”) then completes part 2, which consists of a brief assessment of use-related behaviors during the past 3 months. Scores from part 2 can be used to generate 3 levels of risk for each substance endorsed (ie, no use in the past 3 months, problem use, and higher risk). The TAPS tool has demonstrated adequate psychometric properties as a screening measure for high-risk substance use behaviors in adult primary care patients [59]. We are including a measure of substance use in our data set because this is a well-established, transdiagnostic behavioral marker of risk, and in combination with our affective marker of risk (ie, difficulty regulating emotions), it will improve the precision of risk estimation [60]. This measure is administered at baseline as well as at the 6- and 12-month time points.
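The gating between part 1 and part 2 of the TAPS tool can be expressed compactly. This is a hypothetical sketch of the rule as described above (any part 1 response other than “never” triggers part 2); the function name and substance keys are illustrative, and the part 2 risk scoring itself is not reproduced here.

```python
def positive_substances(part_one_responses):
    """Return the substances whose part 1 response is anything other than 'never'."""
    return [
        substance
        for substance, answer in part_one_responses.items()
        if answer.strip().lower() != "never"
    ]


# Hypothetical part 1 responses using the two anchor labels quoted in the text.
responses = {
    "tobacco": "never",
    "alcohol": "daily or almost daily",
    "illicit drugs": "never",
    "prescription drugs (nonmedical)": "never",
}

follow_up = positive_substances(responses)  # substances that require part 2
```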

Life Functioning Questionnaire

The Life Functioning Questionnaire (LFQ) [61] is a 2-part questionnaire designed to assess individuals’ subjective difficulty functioning in 4 domains of life (leisure time with friends, leisure time with family, duties at work or school, and duties at home) during the past month. In part 1 of the LFQ, individuals indicate the degree of problems (from 1 [no problems] to 4 [severe problems]) they have experienced within each domain in terms of the amount of time spent on related activities, amount of conflict experienced, level of enjoyment, and self-assessed performance (for work and home duties only). For duties in the work or school domain, individuals are also asked to indicate the number of days they were absent as well as the factors that contributed to their absence (eg, mental or physical health symptoms and interpersonal difficulties). Part 2 asks additional questions about individuals’ work situation during the previous month, previous full-time work, and reasons for leaving, as well as their living and financial status in the past 6 months. The LFQ was originally designed to assess functional capacity in psychiatric patients and has demonstrated adequate psychometric properties with adult inpatients seeking treatment for mood disorders [61]. In this study, the LFQ is administered at baseline and at the 6- and 12-month time points.

Brief Symptom Inventory-18

The Brief Symptom Inventory-18 (BSI-18) [62] is an 18-item measure that assesses the level of distress a person has experienced in the past day due to various psychological (eg, feelings of worthlessness) and physical (eg, pains in the heart or chest) symptoms. Questions are responded to on a 5-point scale from 0 (not at all) to 4 (extremely) and can be summed to produce 3 subscale scores (depression, anxiety, and somatization) as well as a global severity index that measures overall psychological distress. The BSI-18 has been validated and normed with community samples and is acceptable to use repeatedly as a measure of symptom change [63]. The BSI-18 is administered nightly during the 7-day intensive data collection phase of this study.

Daily Positive and Negative Mood Questionnaire

Aspects of participants’ daily mood are assessed using a version of the Positive and Negative Affect Schedule [64] described by Smyth et al [65]. Participants rate their current level of 4 positive and 5 negative mood adjectives on a 7-point scale anchored by 0 (not at all) and 6 (extremely). Items are summed to produce positive and negative mood subscales that demonstrated acceptable psychometric properties in a previous sample of community participants [65]. This measure is administered nightly during the 7-day intensive data collection phase of this study.
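As a minimal sketch of the scoring described above, the two subscales can be computed by summing the 4 positive and 5 negative item ratings. The function name and the positional item order are assumptions; the item counts and the 0 to 6 response range come from the text.

```python
def mood_subscales(positive_items, negative_items):
    """Sum 4 positive and 5 negative mood ratings (each 0-6) into two subscale scores."""
    if len(positive_items) != 4 or len(negative_items) != 5:
        raise ValueError("expected 4 positive and 5 negative item ratings")
    for rating in [*positive_items, *negative_items]:
        if not 0 <= rating <= 6:
            raise ValueError("each rating must be between 0 and 6")
    return sum(positive_items), sum(negative_items)
```

Under these rules, the positive subscale ranges from 0 to 24 and the negative subscale from 0 to 30.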

Daily Events Questionnaire

The Daily Events Questionnaire asks individuals to select from a list of common daily stressors (eg, a lot of work at school or work or a financial problem) and people (eg, a friend or spouse) to assess if any troublesome events happened to them since they woke up that morning, as well as if they experienced any tension or arguments with anyone. Individuals can select as many options as apply and have the option to describe any “other” event or relationship not listed. For each selected event, individuals are then asked to indicate approximately when this event occurred and how distressed they felt during the event, from 1 (not at all) to 10 (very much). Similarly, for each person that a participant indicates they had an argument with, they are asked to provide approximately when the argument began and ended, as well as how distressed they felt and how satisfied they felt with the outcome of the event (using the same 10-point response scale). This measure has been used previously to successfully identify the approximate timing of various distressing events throughout the day [66]. In this study, it is administered nightly during the 7-day intensive data collection phase.

Daily Call With Study Staff

Study staff call participants each evening near the end of the day (ie, between 6 PM and 8 PM) to inquire about compliance with study procedures, problems with study equipment, and the nature and timing of participants’ most positive and negative events of the day. A transcript of the semistructured interview questions used during these phone calls is available on our Open Science Framework site [53].

Audio Data

In situ audio is continuously recorded while participants are awake at 24-bit/48 kHz using omnidirectional Lavalier microphones connected to miniature field recorders. Recordings are segmented into smaller, 15-minute-long files to optimize data transfer and increase data processing efficiency.

We have carefully considered legal and ethical issues in proposing these in situ recordings. Audio recordings are governed by wiretapping laws, which vary from state to state. Utah is a single-party consent state, meaning that as long as a study participant consents to be recorded, other individuals captured on those recordings do not need to additionally consent. However, the ethical principles of beneficence, nonmaleficence, autonomy, and justice require that other individuals who may be recorded need to be aware of that possibility and given the opportunity to not be recorded. For these reasons, participants are instructed to wear a badge that states that they are participating in a study that records audio during daily life and will be allowed to pause the recording whenever an individual they come into contact with requests that they do so, or the participant wishes or is required to do so by policy or law [51]. We have used these methods in previous work, and they typically generate 10 to 14 hours of audio per day.
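The recording parameters above imply a predictable data volume, which can be estimated as follows. This is a back-of-the-envelope sketch: it assumes uncompressed PCM audio and a single (mono) channel, neither of which is stated in the text, and the function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 48_000    # 48 kHz, from the text
BYTES_PER_SAMPLE = 24 // 8  # 24-bit samples, from the text
SEGMENT_SECONDS = 15 * 60   # 15-minute files, from the text
CHANNELS = 1                # assumption: mono recording


def segment_bytes(channels=CHANNELS):
    """Uncompressed size in bytes of one 15-minute segment (file headers ignored)."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * SEGMENT_SECONDS * channels


def segments_per_day(hours_recorded):
    """Number of 15-minute files needed to hold `hours_recorded` hours of audio."""
    return math.ceil(hours_recorded * 3600 / SEGMENT_SECONDS)


low_day_bytes = segments_per_day(10) * segment_bytes()   # 10 hours of audio
high_day_bytes = segments_per_day(14) * segment_bytes()  # 14 hours of audio
```

Under these assumptions, each segment is about 130 MB, and a typical day of 10 to 14 hours yields 40 to 56 files, or roughly 5.2 to 7.3 GB per participant per day.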

Passive Sensor Data

Raw smartphone sensor data are collected using Beiwe, a cross-platform digital phenotyping app created by Onnela and colleagues [25,50]. The Beiwe platform collects raw data generated by smartphone sensors, including, but not limited to, GPS and accelerometry, Wi-Fi connectivity, Bluetooth device scans, phone and screen status, and phone call and SMS text message event logs linked to onboard device contacts. The specific data collected for a participant is determined by the sensors on their smartphone and the policies of the smartphone manufacturer [52].

Medical Records

The state of Utah maintains the Utah Population Database (UPDB) and All Payer Claims Database, which are standardized (ie, use the Systematized Nomenclature of Medicine–Clinical Terms), digital archives of medical encounters in the UHealth and IMHC systems, and insurance claims filed in the state, respectively. These databases are generated for research purposes and will be used to verify and augment participant reports. These databases represent a unique and valuable opportunity because it is well documented that retrospective reports of psychiatric history fail to identify ~25% to 40% of true positives relative to medical records [67]. Relevant records from these databases will be linked to other participant data. UPDB policy dictates that these records will not be shareable, and individuals or entities wishing to access them will have to seek permission directly from the database administrators.

Planned Data Processing

Relevant total and subscale scores will be computed from the self-report questionnaires using standard scoring protocols and included, along with raw item responses, for all self-report measures in the final data set. In addition to raw DERS scores, age- and gender-adjusted t-scores computed using the methods in [55] will be included in the final data set. Annotations of participants’ most positive and negative daily events will also be available in the data set. To create these, study staff will use the information provided by participants during the daily phone calls to annotate details about their most positive and negative events for each day, including the beginning and end time of the event, and the type (eg, interaction with another person or financial problem) and nature of the event (eg, positive vs negative). Study staff will additionally annotate speaker IDs (eg, participant and female 1), emotional expressions, and communication behaviors (eg, arguing or cooperating) using information from the corresponding recorded audio.
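For readers unfamiliar with t-scores, the conventional transformation rescales a raw score so that the normative group has a mean of 50 and an SD of 10. The sketch below shows only that generic transformation; it is not the specific normative procedure in [55], and the normative mean and SD in the example are hypothetical placeholders rather than published values.

```python
def t_score(raw_total, norm_mean, norm_sd):
    """Conventional T-score: 50 + 10 * z, where z is relative to a normative group."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return 50 + 10 * (raw_total - norm_mean) / norm_sd


# Hypothetical norms for one age-by-gender group (placeholders, not values from [55]).
example = t_score(raw_total=100, norm_mean=80.0, norm_sd=20.0)
```

In an age- and gender-adjusted version, the normative mean and SD would be selected (or modeled) according to each participant's age and gender before applying the same formula.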

Audio recordings will be processed with the openSMILE toolkit [70] to generate gold-standard acoustic feature sets used in behavioral signal processing [68] and affective computing [69] research. These methods produce 88 acoustic variables that represent frequency-, energy-, and spectral-related aspects of speech and ambient noise. Acoustic variables will be generated over the smallest window of time possible for each variable and then downsampled to produce one summary score per acoustic variable for each 1 second of the recording.
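The downsampling step described above can be sketched as follows. This is a minimal illustration that assumes frame-level values for a single acoustic variable arrive as (time in seconds, value) pairs and that the 1-second summary score is the arithmetic mean. The protocol does not specify the summary statistic, and the function name is hypothetical rather than part of openSMILE.

```python
from collections import defaultdict
from statistics import fmean


def downsample_to_seconds(frames):
    """Average frame-level values within each 1-second bin.

    frames: iterable of (time_in_seconds, value) pairs for one acoustic
    variable. Returns a dict mapping the integer second index to the mean
    of all frame values whose timestamps fall in [second, second + 1).
    """
    buckets = defaultdict(list)
    for time_s, value in frames:
        buckets[int(time_s // 1)].append(value)
    # Using the mean as the summary score is an assumption of this sketch.
    return {sec: fmean(vals) for sec, vals in sorted(buckets.items())}


if __name__ == "__main__":
    frames = [(0.0, 1.0), (0.5, 3.0), (1.0, 10.0), (1.25, 20.0)]
    print(downsample_to_seconds(frames))
```

Seconds that contain no frames are simply absent from the result, which leaves the choice of how to treat gaps to the analyst.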

Raw PDD sensor data will be processed using Forest [71], a freely available library for analyzing Beiwe data developed by the creators of Beiwe. Similar to the acoustic variables described above, Forest produces summary variables for each sensor type that quantify a wide range of behavioral and contextual information. For example, outputs of GPS data include time spent at home, total distance traveled, physical circadian rhythm, and the type (eg, shop, restaurant, and place of worship) and duration of locations visited. The location type is generated using information from the open-source platform OpenStreetMap [72]. It is particularly valuable for the current data set because it will allow for the creation of a library of geographically referenced place tags that index risky (eg, amenity=bar; amenity=tobacco retailer) and protective (eg, amenity=library; amenity=gym) locations, which will increase the potential information value of the database by providing additional context for the passive sensor data streams collected. Another example of the summary features available from Forest is the output from call and text logs, which includes the total number of calls received, the total number of unique callers, and the total duration of calls received. Additional details about all summary variables are available on the Forest GitHub page [71]. Summary variables will be generated for each day and included in our final data set.
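To make the GPS summary features concrete, the sketch below computes two of the daily variables named above, total distance traveled and time spent at home, from a time-ordered list of GPS fixes. It is an illustration under simplifying assumptions and is not Forest's algorithm: the function names, the 100 m home radius, and the rule that an interval counts as "at home" when it starts within that radius are all assumptions of this example, and missing-data handling is omitted.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def daily_gps_summary(fixes, home, home_radius_m=100.0):
    """Summarize one day of GPS fixes.

    fixes: time-ordered list of (unix_time_s, lat, lon) tuples.
    home: (lat, lon) of the participant's home (assumed known here).
    An interval between consecutive fixes counts as time at home when the
    fix that starts the interval lies within home_radius_m of home.
    """
    distance_m = 0.0
    time_at_home_s = 0.0
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        distance_m += haversine_m(lat0, lon0, lat1, lon1)
        if haversine_m(lat0, lon0, home[0], home[1]) <= home_radius_m:
            time_at_home_s += t1 - t0
    return {"distance_m": distance_m, "time_at_home_s": time_at_home_s}


if __name__ == "__main__":
    home = (40.0, -111.0)
    fixes = [
        (0, 40.0, -111.0),
        (600, 40.0, -111.0),
        (1200, 40.01, -111.0),
        (1800, 40.01, -111.0),
    ]
    print(daily_gps_summary(fixes, home))
```

In practice, raw smartphone GPS is sampled irregularly and contains gaps, so a production pipeline would need imputation and home-location detection steps that this sketch leaves out.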

Ethical Considerations

All study procedures were approved by the University of Utah Institutional Review Board (00149365). Eligible participants are given the opportunity to review all study procedures, including planned data-sharing processes, and are provided with contact information for the study’s principal investigator to answer any additional questions before signing the consent form online. Several measures are in place to protect the privacy and confidentiality of participants’ data (see Data Privacy and Confidentiality and Results sections for additional details). These include using a secure and password-protected database for storing all study data, only sharing deidentified and nonsensitive data publicly, and requiring a more stringent data use agreement and relevant ethics approval to be provided by individuals requesting access to identifiable data sources (eg, raw audio data). All participants are offered US $165 as compensation for completing all study procedures.

Results

Data collection for this project began in June 2022 and is expected to conclude by July 2024. To date, 310 participants have consented to the study; 149 have completed the baseline questionnaire and 7-day intensive data collection phase; and 61 (ie, 85% of eligible participants) and 31 (ie, 91% of eligible participants) have completed the 6- and 12-month follow-up questionnaires, respectively.

Once completed, the proposed data set will be made available consistent with the findable, accessible, interoperable, and reusable (FAIR) guidelines for data management and stewardship [73]. We will publish a description of the data set in a general science outlet with a broad readership to increase awareness of it. We will also index the data set on recommended data banks, such as the Open Science Framework and Science Data Bank [74], to increase its findability.

We will use a stepped approach to making the data set accessible to academic researchers, industry, and government. Deidentified, nonsensitive data (eg, raw item and scale scores from self-report measures, summary features from acoustic and passive sensor data, and annotations of daily events) will be publicly available for download directly from data repositories after completion of a brief data use agreement but without contact with the study team. Identifiable and other sensitive data (eg, raw audio recordings and raw passive sensor data) will only be made available for download from a University of Utah server after researchers have provided evidence of the necessary approvals from their institutional review board or comparable entity, completed a more stringent data use agreement, and communicated with a member of the study team. Data will be made available to academic and government researchers at no cost and licensed to commercial entities.

Discussion


Computational psychiatry and digital phenotyping have the potential to make significant contributions to mental health research and treatment development. Digital phenotyping studies have already demonstrated that by applying machine learning techniques to PDD data, it may be possible to predict upcoming mental health events (eg, manic episodes and hospitalization) in clinical samples [32] and detect mental health risk factors (eg, increasing stress and depressed mood) in students [35]. These findings are paving the way for a future where digital markers of increased risk could be used to trigger just-in-time interventions that connect individuals with their mental health care team when they need support most and before their symptoms worsen [75]. Similarly, ongoing computational psychiatry work aims to improve mental health treatment by using machine learning to identify the precise cognitive and neurobiological mechanisms underlying psychiatric disorders and their symptoms [76]. These findings may then be used to gain a better understanding of which treatments work best for which patients at what time.

Whereas much of this existing work focuses on predicting and treating mental health symptoms in individuals who meet certain diagnostic criteria or have received treatment for a psychiatric diagnosis, we are proposing a data set that is designed to advance the field’s ability to identify community-dwelling members of the adult general population at risk for a future mental health event. This is a complementary approach to current computational psychiatry and digital phenotyping research and has the potential to improve mental health risk assessment within the general population. The proposed data set possesses several strengths that make it well suited for this goal. First, in response to numerous calls for more theory-driven digital phenotyping studies [20], both the sample we are recruiting and the data we are collecting are guided by a well-established, transdiagnostic model of psychopathology development: the diathesis-stress model [49]. Second, instead of focusing on stable individual differences or 1-time measures of risk, the proposed data set will include multimodal data collected across multiple time scales and levels of analysis. By including digital traces of behavior that can be linked to both time and context, the proposed data set will therefore allow for the identification of features that better capture the dynamic nature of mental health risk [36,37]. Third, by using the Beiwe platform to collect raw PDD data, future analyses will not be limited by inconsistencies in the algorithms used to derive summary statistics or features from different devices [22]. Similarly, including raw PDD data in this data set means that not only is it designed to support current computational techniques, but it will continue to be relevant for future quantitative developments. Finally, the shareable nature of this data set will encourage interdisciplinary collaboration and ideally maximize the rate, rigor, and accuracy of the predictive machine learning models that are developed from it.

Limitations


Although the proposed data set possesses many strengths, there are also some limitations to acknowledge. First, the types of digital data included in this data set are limited to what is available through the Beiwe platform [52]. Although it is possible that other passive data streams (eg, app usage log) may carry digital traces of participants’ behaviors that are informative for mental health prediction, our decision to use only what is available from Beiwe was guided by the desire to prioritize participant privacy and the shareability of the data set, which are 2 features supported by the design and maintenance of Beiwe. Second, some of the data sources captured by Beiwe are not consistent across Apple and Android devices due to variability in the sensors installed on different devices and the policies of different smartphone manufacturers (eg, call and SMS text message logs are only available on Android devices [52]). This may limit some of the analyses that can be performed and features that can be developed if certain data sources are only available from a small subset of the total sample. Third, our decision to exclude participants with suicidal thoughts and behaviors or other persistent and severe mental health conditions (ie, active mania or psychosis) means that the predictive models arising from this data set will not be optimized for identifying an increase in risk for the mental health events associated with these conditions (eg, inpatient hospitalization). However, such events are less likely in the general population and are not the primary outcome of interest for the proposed data set.

Conclusions


Computational psychiatry and digital phenotyping have been lauded as pillars of the next great revolution in mental health care [7,8,39]. The past 2 decades have seen a dramatic increase in the number of mental health studies using these methodologies with PDD data. Furthermore, rapid developments in digital technology and quantitative analysis suggest that the potential benefits of PDD data and computational techniques for mental health research and treatment development will continue to expand. To achieve these benefits, it will be important for the field to move away from siloed research laboratories collecting proprietary data and toward interdisciplinary collaborations that use clinical, technical, and quantitative expertise to produce widely applicable and shareable data sets.


REK conceptualized and designed the study, led data collection, and drafted the manuscript. AB, OE, and EB assisted with data collection and provided feedback and edits for the manuscript. FRL provided clinical oversight for the study and feedback and edits for the manuscript. BRWB conceptualized and designed the study and provided feedback and edits on the manuscript.

REK was supported, in part, by funding from the Social Sciences and Humanities Research Council of Canada.

Data Availability

Upon completion of this study, the data sets generated and analyzed during this study will be available in a widely accessible data repository.

Conflicts of Interest

None declared.

  1. Bickman L. Improving mental health services: a 50-year journey from randomized experiments to artificial intelligence and precision mental health. Adm Policy Ment Health. 2020;47(5):795-843.
  2. Tackett JL, Brandes CM, King KM, Markon KE. Psychology's replication crisis and clinical psychological science. Annu Rev Clin Psychol. 2019;15:579-604.
  3. Morris ZS, Wooding S, Grant J. The answer is 17 years, what is the question: understanding time lags in translational research. J R Soc Med. 2011;104(12):510-520.
  4. Reinert M, Fritze D, Nguyen T. The state of mental health in America 2022. Mental Health America. Alexandria VA.; 2021. URL: [accessed 2024-02-28]
  5. Bucci S, Schwannauer M, Berry N. The digital revolution and its impact on mental health care. Psychol Psychother. 2019;92(2):277-297.
  6. Montag C, Sindermann C, Baumeister H. Digital phenotyping in psychological and medical sciences: a reflection about necessary prerequisites to reduce harm and increase benefits. Curr Opin Psychol. 2020;36:19-24.
  7. Insel TR. Digital phenotyping: technology for a new science of behavior. JAMA. 2017;318(13):1215-1216.
  8. Huys QJM, Maia TV, Frank MJ. Computational psychiatry as a bridge from neuroscience to clinical applications. Nat Neurosci. 2016;19(3):404-413.
  9. Maia TV. Introduction to the series on computational psychiatry. Clin Psychol Sci. 2015;3(3):374-377.
  10. Stephan KE, Mathys C. Computational approaches to psychiatry. Curr Opin Neurobiol. 2014;25:85-92.
  11. Garbusow M, Schad DJ, Sebold M, Friedel E, Bernhardt N, Koch SP, et al. Pavlovian-to-instrumental transfer effects in the nucleus accumbens relate to relapse in alcohol dependence. Addict Biol. 2016;21(3):719-731.
  12. Huys QJM, Browning M, Paulus MP, Frank MJ. Advances in the computational understanding of mental illness. Neuropsychopharmacology. 2021;46(1):3-19.
  13. Silva RF, Castro E, Gupta CN, Cetin M, Arbabshirani M, Potluru VK, et al. The tenth annual MLSP competition: schizophrenia classification challenge. 2014. Presented at: 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP); September 21-24, 2014; Reims, France.
  14. Solin A, Särkkä S. The 10th annual MLSP competition: first place. 2014. Presented at: 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP); September 21-24, 2014; Reims, France.
  15. Khodayari-Rostamabad A, Reilly JP, Hasey GM, de Bruin H, Maccrimmon DJ. A machine learning approach using EEG data to predict response to SSRI treatment for major depressive disorder. Clin Neurophysiol. 2013;124(10):1975-1985.
  16. Etkin A, Patenaude B, Song YJC, Usherwood T, Rekshan W, Schatzberg AF, et al. A cognitive-emotional biomarker for predicting remission with antidepressant medications: a report from the iSPOT-D trial. Neuropsychopharmacology. 2015;40(6):1332-1342.
  17. Korgaonkar MS, Rekshan W, Gordon E, Rush AJ, Williams LM, Blasey C, et al. Magnetic resonance imaging measures of brain structure to predict antidepressant treatment outcome in major depressive disorder. EBioMedicine. 2015;2(1):37-45.
  18. Baumgartner R. Precision medicine and digital phenotyping: digital medicine's way from more data to better health. Big Data Soc. 2021;8(2):205395172110664.
  19. Onnela JP, Rauch SL. Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology. 2016;41(7):1691-1696.
  20. Davidson BI. The crossroads of digital phenotyping. Gen Hosp Psychiatry. 2022;74:126-132.
  21. Perrin A. Mobile technology and home broadband 2021. Pew Research Center. 2021. URL: [accessed 2024-01-22]
  22. Onnela JP. Opportunities and challenges in the collection and analysis of digital phenotyping data. Neuropsychopharmacology. 2021;46(1):45-54.
  23. Mohr DC, Zhang M, Schueller SM. Personal sensing: understanding mental health using ubiquitous sensors and machine learning. Annu Rev Clin Psychol. 2017;13:23-47.
  24. Dagum P, Montag C. Ethical considerations of digital phenotyping from the perspective of a healthcare practitioner. In: Montag C, Baumeister H, editors. Digital Phenotyping and Mobile Sensing: New Developments in Psychoinformatics. Cham. Springer Nature; 2019;13-29.
  25. Torous J, Kiang MV, Lorme J, Onnela JP. New tools for new research in psychiatry: a scalable and customizable platform to empower data driven smartphone research. JMIR Ment Health. 2016;3(2):e16.
  26. Birk RH, Samuel G. Can digital data diagnose mental health problems? a sociological exploration of 'digital phenotyping'. Sociol Health Illn. 2020;42(8):1873-1887.
  27. Huckvale K, Venkatesh S, Christensen H. Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety. NPJ Digit Med. 2019;2:88.
  28. Dlima SD, Shevade S, Menezes SR, Ganju A. Digital phenotyping in health using machine learning approaches: scoping review. JMIR Bioinform Biotech. 2022;3(1):e39618.
  29. Moura I, Teles A, Viana D, Marques J, Coutinho L, Silva F. Digital phenotyping of mental health using multimodal sensing of multiple situations of interest: a systematic literature review. J Biomed Inform. 2023;138:104278.
  30. Gershon A, Ram N, Johnson SL, Harvey AG, Zeitzer JM. Daily actigraphy profiles distinguish depressive and interepisode states in bipolar disorder. Clin Psychol Sci. 2016;4(4):641-650.
  31. Saeb S, Zhang M, Karr CJ, Schueller SM, Corden ME, Kording KP, et al. Mobile phone sensor correlates of depressive symptom severity in daily-life behavior: an exploratory study. J Med Internet Res. 2015;17(7):e175.
  32. Barnett I, Torous J, Staples P, Sandoval L, Keshavan M, Onnela JP. Relapse prediction in schizophrenia through digital phenotyping: a pilot study. Neuropsychopharmacology. 2018;43(8):1660-1666.
  33. Kelkar RS, Currey D, Nagendra S, Mehta UM, Sreeraj VS, Torous J, et al. Utility of smartphone-based digital phenotyping biomarkers in assessing treatment response to transcranial magnetic stimulation in depression: proof-of-concept study. JMIR Form Res. 2023;7:e40197.
  34. Jacobson NC, Chung YJ. Passive sensing of prediction of moment-to-moment depressed mood among undergraduates with clinical levels of depression sample using smartphones. Sensors (Basel). 2020;20(12):3572.
  35. Wang W, Nepal S, Huckins JF, Hernandez L, Vojdanovski V, Mack D, et al. First-gen lens: assessing mental health of first-generation students across their first year at college using mobile sensing. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2022;6(2):95.
  36. Hitchcock PF, Fried EI, Frank MJ. Computational psychiatry needs time and context. Annu Rev Psychol. 2022;73:243-270.
  37. Cohen AS, Cox CR, Masucci MD, Le TP, Cowan T, Coghill LM, et al. Digital phenotyping using multimodal data. Curr Behav Neurosci Rep. 2020;7(4):212-220.
  38. Rutledge RB, Chekroud AM, Huys QJ. Machine learning and big data in psychiatry: toward clinical applications. Curr Opin Neurobiol. 2019;55:152-159.
  39. Browning M, Carter CS, Chatham C, Den Ouden H, Gillan CM, Baker JT, et al. Realizing the clinical potential of computational psychiatry: report from the Banbury Center meeting, February 2019. Biol Psychiatry. 2020;88(2):e5-e10.
  40. NDA data repositories. NIMH Data Archive. URL: [accessed 2024-02-07]
  41. Garavan H, Bartsch H, Conway K, Decastro A, Goldstein RZ, Heeringa S, et al. Recruiting the ABCD sample: design considerations and procedures. Dev Cogn Neurosci. 2018;32:16-22.
  42. McLean SA, Ressler K, Koenen KC, Neylan T, Germine L, Jovanovic T, et al. The AURORA study: a longitudinal, multimodal library of brain biology and function after traumatic stress exposure. Mol Psychiatry. 2020;25(2):283-296.
  43. Granholm E. Context-aware mobile intervention for social recovery in serious mental illness. NIMH Data Archive. URL: [accessed 2024-02-07]
  44. Chung T. Smartphone sensors to detect shifts toward healthy behavior during alcohol treatment. NIMH Data Archive. URL: [accessed 2024-02-07]
  45. Gratz KL, Roemer L. Multidimensional assessment of emotion regulation and dysregulation: development, factor structure, and initial validation of the difficulties in emotion regulation scale. J Psychopathol Behav Assess. 2008;30(4):315-315.
  46. Aldao A, Gee DG, De Los Reyes A, Seager I. Emotion regulation as a transdiagnostic factor in the development of internalizing and externalizing psychopathology: current and future directions. Dev Psychopathol. 2016;28(4pt1):927-946.
  47. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. 1983;24(4):385-396.
  48. Neurodevelopmental disorders. In: Diagnostic and Statistical Manual of Mental Disorders, 5th Edition, Text Revision (DSM-5-TR). Washington, DC. American Psychiatric Association; 2022.
  49. Chaplin TM, Cole PM. The role of emotion regulation in the development of psychopathology. In: Hankin BL, Abela JRZ, editors. Development of Psychopathology: A Vulnerability-Stress Perspective. Thousand Oaks, CA. SAGE Publications, Inc; 2005.
  50. Onnela JP, Dixon C, Griffin K, Jaenicke T, Minowada L, Esterkin S, et al. Beiwe: a data collection platform for high-throughput digital phenotyping. J Open Source Softw. 2021;6(68):3417.
  51. Mehl MR, Robbins M. Naturalistic observation sampling: the electronically activated recorder. In: Handbook of Research Methods for Studying Daily Life. New York, NY. Guilford Press; 2012;176-192.
  52. Onnela Lab Contributors. Beiwe documentation. GitHub. URL: [accessed 2023-08-16]
  53. Kilshaw RE, Baucom BRW. Benchmarking mental health status using passive sensor data. Open Science Foundation. URL: [accessed 2024-01-26]
  54. Hallion LS, Steinman SA, Tolin DF, Diefenbach GJ. Psychometric properties of the Difficulties in Emotion Regulation Scale (DERS) and its short forms in adults with emotional disorders. Front Psychol. 2018;9:539.
  55. Giromini L, Ales F, de Campora G, Zennaro A, Pignolo C. Developing age and gender adjusted normative reference values for the Difficulties in Emotion Regulation Scale (DERS). J Psychopathol Behav Assess. 2017;39(4):705-714.
  56. Warttig SL, Forshaw MJ, South J, White AK. New, normative, English-sample data for the short form Perceived Stress Scale (PSS-4). J Health Psychol. 2013;18(12):1617-1628.
  57. Henry JD, Crawford JR. The short-form version of the Depression Anxiety Stress Scales (DASS-21): construct validity and normative data in a large non-clinical sample. Br J Clin Psychol. 2005;44(Pt 2):227-239.
  58. Lovibond SH, Lovibond PF. Manual for the Depression Anxiety Stress Scales, 2nd Edition. Sydney. Psychology Foundation of Australia; 1996.
  59. McNeely J, Wu LT, Subramaniam G, Sharma G, Cathers LA, Svikis D, et al. Performance of the Tobacco, Alcohol, Prescription medication, and other Substance use (TAPS) tool for substance use screening in primary care patients. Ann Intern Med. 2016;165(10):690-699.
  60. Eaton NR, Rodriguez-Seijas C, Carragher N, Krueger RF. Transdiagnostic factors of psychopathology and substance use disorders: a review. Soc Psychiatry Psychiatr Epidemiol. 2015;50(2):171-182.
  61. Altshuler L, Mintz J, Leight K. The Life Functioning Questionnaire (LFQ): a brief, gender-neutral scale assessing functional outcome. Psychiatry Res. 2002;112(2):161-182.
  62. Derogatis LR. BSI 18, Brief Symptom Inventory 18: Administration, Scoring and Procedures Manual. Minneapolis, MN. NCS Pearson, Inc; 2001.
  63. Derogatis L, Fitzpatrick M. The SCL-90-R, the Brief Symptom Inventory (BSI), and the BSI-18. In: Maruish ME, editor. The Use of Psychological Testing for Treatment Planning and Outcomes Assessment: Volume 3, 3rd Edition. Mahwah, N.J. Lawrence Erlbaum Associates; 2004;1-41.
  64. Watson D, Clark LA, Tellegen A. Development and validation of brief measures of positive and negative affect: the PANAS scales. J Pers Soc Psychol. 1988;54(6):1063-1070.
  65. Smyth JM, Zawadzki MJ, Santuzzi AM, Filipkowski KB. Examining the effects of perceived social support on momentary mood and symptom reports in asthma and arthritis patients. Psychol Health. 2014;29(7):813-831.
  66. Baucom BRW, Baucom KJW, Hogan JN, Crenshaw AO, Bourne SV, Crowell SE, et al. Cardiovascular reactivity during marital conflict in laboratory and naturalistic settings: differential associations with relationship and individual functioning across contexts. Fam Process. 2018;57(3):662-678.
  67. Vieira LS, Nguyen B, Nutley SK, Bertolace L, Ordway A, Simpson H, et al. Self-reporting of psychiatric illness in an online patient registry is a good indicator of the existence of psychiatric illness. J Psychiatr Res. 2022;151:34-41.
  68. Narayanan S, Georgiou PG. Behavioral signal processing: deriving human behavioral informatics from speech and language: computational techniques are presented to analyze and model expressed and perceived human behavior-variedly characterized as typical, atypical, distressed, and disordered-from speech and language cues and their applications in health, commerce, education, and beyond. Proc IEEE Inst Electr Electron Eng. 2013;101(5):1203-1233.
  69. Eyben F, Scherer KR, Schuller BW, Sundberg J, Andre E, Busso C, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing. IEEE Trans Affective Comput. 2016;7(2):190-202.
  70. Eyben F, Schuller B. openSMILE:): the Munich open-source large-scale multimedia feature extractor. ACM SIGMultimedia Rec. 2015;6(4):4-13.
  71. Onnela Lab Contributors. Forest. URL: [accessed 2023-08-16]
  72. OpenStreetMap Contributors. Planet OSM. URL: [accessed 2023-08-16]
  73. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
  74. Data repository guidance. Nature. URL: [accessed 2022-05-25]
  75. Marsch LA. Opportunities and needs in digital phenotyping. Neuropsychopharmacology. 2018;43(8):1637-1638.
  76. Gueguen MCM, Schweitzer EM, Konova AB. Computational theory-driven studies of reinforcement learning and decision-making in addiction: what have we learned? Curr Opin Behav Sci. 2021;38:40-48.

BSI-18: Brief Symptom Inventory-18
DASS-21: Depression, Anxiety, and Stress Scale-21
DERS: Difficulties with Emotion Regulation Scale
HIPAA: Health Insurance Portability and Accountability Act
IMHC: Intermountain Health Care
LFQ: Life Functioning Questionnaire
PDD: personal digital device
PSS-4: Perceived Stress Scale-4
TAPS: Tobacco, Alcohol, Prescription medications, and Other Substance
UHealth: University of Utah Health
UPDB: Utah Population Database

Edited by A Mavragani; submitted 21.10.23; peer-reviewed by B Montezano, A Hudon; comments to author 02.12.23; revised version received 27.01.24; accepted 22.02.24; published 27.03.24.


©Robyn E Kilshaw, Abigail Boggins, Olivia Everett, Emma Butner, Feea R Leifker, Brian R W Baucom. Originally published in JMIR Research Protocols, 27.03.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.