This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on http://www.researchprotocols.org, as well as this copyright and license information must be included.
Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required.
This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses.
We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4).
An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test.
Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use.
Evidence-based treatment guidelines in health care are predominantly based on nomothetic, group-based research. Samples of patients are investigated to find general laws of symptomatology and functioning, which are then generalized to all individual members of the investigated population [
Allport was an early advocate of the idiographic, individual-based approach. In the 1960s and 1970s, the enthusiasm for idiographic research diminished. It was qualified as unscientific [
A number of research examples can be found in which intensive time series designs are used to map the mental and physical functioning of individual people [
In another study, Rosmalen et al [
Despite the promising examples described above, there still is a significant gap between the research context in which intensive time series analysis is experimented with and health care practice in which individual patients may profit from its results. An important challenge is the substantial burden that data collection and processing put on patients and researchers. Patients have to complete at least 50 assessments, and preferably even more [
Intensive time series analysis can only be applied in daily care practice when certain requirements are met. First, data collection and data management should be standardized to some extent, so as to enable professionals and patients to select relevant assessment domains from a prespecified set of measures. This is to prevent a situation in which intensive time series data collection needs to be built from scratch for every individual patient. Second, to deploy intensive time series in the course of a treatment process, as a diagnostic means, or as a method to evaluate treatment effects, time series data need to be available in real time so that the outcomes can be used immediately. Third, it should be possible to conduct a reliable analysis of time series data, without extensive statistical training. Fourth, professionals and patients should be able to interpret the output of intensive time series and to understand how the results relate to their particular care context.
The latter 2 conditions, which allow for a situation in which the researcher becomes superfluous, may be the hardest and most fundamental conditions to meet. So far, analysis of time series data has always required advanced statistical expertise, including extensive knowledge of the statistical procedures and a high level of experience.
There are several forms of time series data. Time series can be event-based, in which the assessments follow a specific event, or time-based, in which the assessments are performed at specific time points. Moreover, time-based assessments can be conducted either at fixed or random moments. Each method has its own purposes. If data are collected at fixed moments, with equidistant intervals in between time points, temporal dynamics between variables can be analyzed by a method such as vector autoregressive modeling (VAR) [
The “vector” term in vector autoregressive modeling refers to the multivariate character, which is an extension of the single variable autoregressive model. VAR models consist of a set of regression equations in which all variables are treated as endogenous variables, meaning that they function as both outcome and predictor. VAR analysis can be conducted without a prior hypothesis about the direction of the association between variables. A statistical test called the “Granger causality test” can be used to examine whether the lagged values of one variable x are useful in the prediction of values of another variable y. If so, it is said that variable x
In the VAR modeling process, researchers are broadly faced with 2 main tasks, namely (1) to build statistical models and conduct a reliable, iterative analysis to evaluate the validity of these models and (2) to choose the best model with which they can work. The first task is predominantly a statistical one. Although the researcher has to make some choices, such as which variables to include in the VAR and the maximum lag length (ie, the maximum number of previous observations that contain relevant information for estimating the current observations), the biggest part of this task consists of statistical analysis conducted with predefined tests. By means of residual diagnostics, the models are checked for assumptions of stability, “white noise” (ie, no residual autocorrelation), homoscedasticity, and normality, based on which valid models can be selected. The second task is less statistical. Choosing the “best” model out of all valid models is mostly an informed choice of content. It is based on a combination of statistical parameters (eg, model selection criteria like the Akaike information criterion [AIC] or the Bayesian information criterion [BIC]), theoretical assumptions about the data, and common sense. The researcher plays a crucial role here.
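The two model selection criteria mentioned above can be illustrated with a minimal sketch. It uses the textbook definitions of AIC and BIC; the candidate models, log-likelihoods, parameter counts, and number of observations are invented for illustration and do not come from the study. Statistical packages may scale these criteria or count parameters differently, so values need not match across software.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2*ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits: (label, log-likelihood, number of estimated parameters).
candidates = [("lag 1", -310.0, 6), ("lag 2", -305.5, 10)]
n_obs = 90  # assumed number of daily observations

for label, ll, k in candidates:
    print(label, round(aic(ll, k), 2), round(bic(ll, k, n_obs), 2))
```

With these made-up numbers the two criteria disagree: AIC is lower for the lag 2 model, whereas BIC, which penalizes extra parameters more heavily at this sample size, is lower for the lag 1 model. This is the kind of situation in which a researcher has to make an informed choice.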
Quantitative idiographic assessment has been shown to be promising, but application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing innovative technology.
We provide a proof-of-principle that might bring idiographic assessments closer to health care practice by automating analytical processes. We developed a Web-based application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. We report on our experiences with the program in reanalyzing a set of time series data.
To evaluate the outcomes of our automated analysis, we reanalyzed data that were previously analyzed in a manual analysis in a study by Rosmalen et al [
Patients were asked to complete daily measures of depressive symptoms and physical activity every evening, during a period of 3 months. Depressive symptoms were measured with the depression module of the Patient Health Questionnaire [
To encourage compliance with the daily assessments, patients were promised that they would be provided with a personal report of the assessment results after completion of the assessments. They were also offered a small gift certificate of €25. During the study period, one patient dropped out after 2 weeks because he was too busy at work and could not manage to complete the daily assessments. This patient was not included in the analysis.
Our starting point was the study by Rosmalen et al [
AutoVAR was developed to take over those actions that, in the manual analysis, can only be conducted by a statistical expert. The approach that AutoVAR follows is to test all possible models within given restrictions and to summarize the outcomes of all valid models. When the program is running, AutoVAR creates time plots for each selected variable, defines the possible VAR models, checks all models for validity, and finally presents all valid models. AutoVAR is freely accessible online and it is accompanied by documentation and a user example [
The total number of possible VAR models is determined by the combinatorial search space. AutoVAR’s combinatorial search space is defined by multiple factors:
The lag length. The lag length refers to the maximum number of previous observations that contain relevant information for estimating the current observations. AutoVAR tests all lag lengths, up to a maximum set by the user by typing the number into the box “Max. lag.” In this paper, the maximum lag length was set to 2, following the procedure by Rosmalen et al [
Potential need for inclusion of a trend variable. AutoVAR checks whether a time series is stationary around a trend with the Phillips-Perron test [
Potential need for inclusion of seasonal variables. AutoVAR checks whether seasonal variables should be included using dummies for the weekdays (if AutoVAR’s option “timestamps” is checked). By default, AutoVAR evaluates every model twice: once with and once without dummy variables for weekdays. In a manual analysis, dummy variables for weekdays are added when it seems to make sense, for instance when a lag of 7 is indicated by lag length selection criteria.
Potential presence of outliers. Outlier values that violate model assumptions are accounted for in AutoVAR and manual analyses by including a dummy variable as an exogenous variable (eg, 0/1). In AutoVAR, outliers are defined as values larger than 3.5, 3.0, or 2.5 standard deviations from the mean of the residuals. AutoVAR will first test a model without outliers; if this model is invalid, it will test a model with outliers that deviate 3.5 standard deviations from the mean of the residuals; if the model is still invalid, it will test a model with outliers that deviate 3.0 standard deviations. If this still yields no valid model, AutoVAR will stop, unless the option is checked to look for outliers of 2.5 standard deviations. In a manual analysis, the presence of outliers is determined by looking for extraordinary values in the time plots of the (residuals of the) variables and based on additional information provided by the patient.
Log transformation. AutoVAR constructs and calculates each model with and without log transformation. In a manual analysis, a researcher determines whether a log transformation is necessary based on a normality test, such as the Skewness-Kurtosis test [
Potential need for constraints put to model parameters. Like the manual procedure, AutoVAR sets to “0” those parameters that do not significantly contribute to the model, starting with the parameter that has the highest
Potential need for exogenous variables added to the model, based on additional patient information. Sometimes time plots show strange characteristics (eg, an unexpected increase in activity) that may be explained by external factors (eg, change of jobs). In AutoVAR, these external factors can be added to the model, by having the user select them as “additional exogenous variables.” In a manual analysis, the researcher adds additional exogenous variables to the model as part of the regular analysis procedure.
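To give a feel for how such a combinatorial search space grows, the following minimal sketch enumerates candidate specifications over three of the factors listed above: lag length, weekday dummies, and log transformation. The function and field names are hypothetical and this is not AutoVAR's implementation. Trend inclusion, outlier dummies, and parameter constraints are left out because, as described above, they are decided by tests during the analysis rather than enumerated up front.

```python
from itertools import product

def candidate_specs(max_lag: int):
    """Yield one dict per combination of the illustrative design options."""
    lags = range(1, max_lag + 1)
    for lag, weekday_dummies, log_transform in product(
        lags, (False, True), (False, True)
    ):
        yield {
            "lag": lag,
            "weekday_dummies": weekday_dummies,
            "log_transform": log_transform,
        }

# A maximum lag of 2, as used in this paper.
specs = list(candidate_specs(max_lag=2))
print(len(specs))  # 2 lag lengths x 2 x 2 = 8 specifications
```

Each additional yes/no factor doubles the number of candidate models, which is part of why testing them all by hand is impractical.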
After each model is estimated, AutoVAR checks it for validity by means of an automated residual diagnostics procedure, in which 4 assumptions are tested. The stability assumption is checked by the eigenvalue stability condition, the “white noise” assumption by a Portmanteau test on the residuals, the homoscedasticity assumption by a Portmanteau test on the squares of the residuals, and the normality assumption by the Skewness-Kurtosis test (see [
The validity of models also plays a role in the total number of models that AutoVAR runs. Strictly speaking, AutoVAR does not run all possible models defined by the combinatorial search space, but only the nonredundant ones. Of all the models that AutoVAR considers, it filters out the redundant models prior to running the final model calculations. AutoVAR considers a model redundant when it is not needed for optimization of the data modeling. For instance, a valid model without modeled outliers makes a model with the exact same model specifications but with modeled outliers redundant. This is to say that AutoVAR always tries to fit the simplest model (eg, without outliers) to the data first and only resorts to more complex models (with outliers) when these simple models do not suffice (ie, when they violate one or more of the model assumptions). This procedure has consequences for the number of valid models that can be fitted to the data. If simple models do not suffice to fit the data, AutoVAR has to resort to more complex models and thus the total number of possible models increases. For instance, if a model without outliers is not valid, AutoVAR will widen the combinatorial search space to include models with outliers. As a result, the total number of valid model fits for complex models will often be higher than the total number of valid model fits for simpler models. Finally, for all valid models, AutoVAR calculates AIC and BIC scores. AutoVAR presents the valid models in ascending order of AIC or BIC score, so that the best (ie, lowest) score comes first. If the ordering of models based on AIC scores differs from the ordering based on BIC scores, AutoVAR will present the ordering based on AIC by default. However, users have the option to order the models by BIC score instead by checking a box on the Advanced Settings page. Results of Granger causality tests are summarized in an image.
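The simplest-first logic and the ordering of valid models can be sketched as follows. This is an illustrative reading of the procedure described above, not AutoVAR's source code: the function names are hypothetical, the validity check is passed in as a placeholder callable, and the model scores in the usage example are invented.

```python
from typing import Callable, Optional

def first_valid_outlier_level(
    is_valid: Callable[[Optional[float]], bool],
    allow_2_5: bool = False,
) -> Optional[float]:
    """Return the first outlier threshold (in SD) that gives a valid model.

    None stands for "no outlier dummies". Levels are tried from simple to
    complex; a LookupError is raised if no permitted level yields validity.
    """
    levels = [None, 3.5, 3.0] + ([2.5] if allow_2_5 else [])
    for level in levels:
        if is_valid(level):
            return level  # more complex variants are now redundant
    raise LookupError("no valid model at any permitted outlier level")

def rank_models(models, criterion="aic"):
    """Order valid models ascending by the chosen criterion (lowest first)."""
    return sorted(models, key=lambda m: m[criterion])

# Hypothetical example: the model only becomes valid at 3.0 SD or lower.
level = first_valid_outlier_level(lambda lv: lv is not None and lv <= 3.0)

# Invented scores for three valid models.
valid = [
    {"name": "A", "aic": 410.2, "bic": 431.0},
    {"name": "B", "aic": 402.7, "bic": 436.5},
    {"name": "C", "aic": 405.1, "bic": 428.3},
]
by_aic = [m["name"] for m in rank_models(valid)]
by_bic = [m["name"] for m in rank_models(valid, criterion="bic")]
```

Stopping at the first valid level is what keeps redundant models out of the final calculations in this sketch; the default criterion mirrors the described default of ordering by AIC.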
The AutoVAR procedure deviates from the manual procedure in two important respects. First, AutoVAR tests all possible VAR models within a given combinatorial search space, whereas a researcher tests a selection of models based on statistical and theoretical considerations. Second, AutoVAR orders the valid models and presents all of them in a Granger causality image, whereas a researcher evaluates the models and chooses one “best” model.
AutoVAR screenshot.
The basic VAR model used in this study was the same model as the one used by Rosmalen et al. The model consists of a system of two endogenous variables, namely, depression and physical activity, which are shown in
In these equations
There are 4 main assumptions that need to be met for a VAR model to be valid: (1) the stability assumption requires that the VAR model is stable (ie, that it is stationary over time), (2) the “white noise” assumption requires a model to leave no autocorrelation in the residuals, (3) the homoscedasticity assumption requires homogeneity of variance over time, and (4) the normality assumption requires the residuals to be normally distributed.
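As an illustration of the first assumption, the eigenvalue stability condition can be checked directly for the simplest case of a two-variable VAR with a single lag: the model is stable when all eigenvalues of the 2×2 lag coefficient matrix lie strictly inside the unit circle. The sketch below is restricted to that case (for longer lags the same condition is applied to the companion matrix), and the coefficient values are invented rather than taken from the study.

```python
import cmath

def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    (p, q), (r, s) = a
    trace = p + s
    det = p * s - q * r
    disc = cmath.sqrt(trace * trace - 4 * det)  # may be complex
    return (trace + disc) / 2, (trace - disc) / 2

def is_stable(a) -> bool:
    """True if every eigenvalue has modulus strictly below 1."""
    return all(abs(ev) < 1 for ev in eigenvalues_2x2(a))

# Hypothetical lag-1 coefficients for (depression, activity).
coefficients = [[0.5, 0.1],
                [0.2, 0.3]]
print(is_stable(coefficients))
```

Using `cmath.sqrt` keeps the check valid when the eigenvalues are complex, which corresponds to oscillating dynamics between the two variables.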
In the Rosmalen et al study, the VAR analyses were performed in STATA 11 software, using the suite of VAR commands [
The endogenous variables for depression and physical activity used in this study.
For patient 1 of the study by Rosmalen et al [
Comparison of AutoVAR output versus manual analysis output.

|  | AutoVAR analysis | Manual analysis |
| --- | --- | --- |
| Patient 1 |  |  |
| Granger causality Wald test | Increase activity → decrease depression ( | Increase activity → decrease depression ( |
| Lag length | 2 | 2 |
| Trend variable included | No | No |
| Weekday dummies included | No | No |
| Outlier variables | Outlier dummies for day 4 (Depression) and day 13 (Activity) | Outlier dummies for day 4 (Depression) and day 13 (Activity) |
| Log transformation | No | No |
| BIC | 655.41 | 655.89 |
| AIC | 631.22 | 631.70 |
| Patient 2 |  |  |
| Granger causality Wald test | Not significant | Not significant |
| Lag length | 1 | 1 |
| Trend variable included | No | No |
| Weekday dummies included | No | No |
| Outlier variables | Outlier dummy for day 12 (Depression) | Outlier dummy for day 12 (Depression) |
| Log transformation | Yes | Yes |
| BIC | 390.07 | 386.15 |
| AIC | 381.49 | 375.43 |
| Patient 3 |  |  |
| Granger causality Wald test | Increase depression → decrease activity ( | Increase depression → decrease activity ( |
| Lag length | 2 | 2 |
| Trend variable included | Yes | Yes |
| Weekday dummies included | Yes | Yes |
| Outlier variables | Outlier dummy for day 5 (Depression) | Outlier dummy for day 5 (Depression) |
| Log transformation | No | No |
| BIC | 307.21 | 304.64 |
| AIC | 275.06 | 283.21 |
| Patient 4 |  |  |
| Granger causality Wald test | Increase depression → decrease activity ( | Increase depression → decrease activity ( |
| Lag length | 1 | 1 |
| Trend variable included | No | No |
| Weekday dummies included | No | No |
| Outlier variables | Outlier dummy for day 27 (Depression) | Outlier dummy for day 27 (Depression) |
| Log transformation | Yes | Yes |
| BIC | 398.59 | 398.59 |
| AIC | 386.23 | 386.23 |
^{a}T is the number of time points at which patients completed a measure.
The results of the Granger causality tests of all valid models are summarized visually, in a rather self-explanatory image in
For the other 3 patients in the Rosmalen et al study, the Granger causality images generated by AutoVAR are also presented in
Granger causality plots.
Comparing the output generated by AutoVAR to the outcomes resulting from the manual analysis described by Rosmalen et al, we found rather similar results in terms of model specification, model validity, information criteria, and Granger causality estimates (see
In this paper, we provided a potential solution to bridge the gap between the use of intensive time series analysis in research and health care practice by automating the analysis processes. Results suggest that automated time series analysis is feasible and that the output can be presented in an intuitive way. Automated analysis can make the role of the statistical interpretation less important and, as such, it saves a significant amount of time. Whereas AutoVAR generates results in a few seconds, manual analysis may take several days. Automated analytical procedures and accessible visual presentation of statistical outcomes might pave the way for health care professionals and patients to use methods such as EMA as an integral part of the treatment trajectory, without extensive training. As such, general treatment guidelines based on nomothetic research could be complemented by idiographic-based information. This may support health care professionals in taking a tailored treatment approach. Although the personal narrative of patients remains an important basis for tailor-made treatment, intensive time series assessments can add information that professionals are unable to see with the naked eye. EMA may be particularly valuable in those situations in which treatment trajectories have become stuck, when patients do not sufficiently benefit from treatment, and professionals do not know why. Furthermore, since completing EMA assessments can be quite an investment, an automated EMA approach may be especially suitable for settings in which patients receive long-term treatment for a chronic disease, such as depression or heart disease, in which controlling, instead of curing, is the main focus. The creation of a thorough and detailed patient profile of symptoms, behaviors, and experiences can help to shape the treatment toward individual needs.
Apart from EMA being an instrument to support professionals, we may also speculate that automated time series analysis provides opportunities for using EMA as part of self-management processes. If patients are able to analyze and interpret their own data, they may find it helpful to monitor themselves and map their symptoms or functioning in certain situations or periods. A promising perspective is sketched by Nikles et al [
AutoVAR is promising, but the application needs further validation and refinement prior to implementation in health care practice. In this study, we applied AutoVAR to replicate the results of the manual analysis conducted by Rosmalen et al. Analysis of additional datasets is needed in order to validate the application for general use. Whereas the output of AutoVAR was rather similar to the manual output of the Rosmalen et al study and the most important output, namely the directions of the Granger causality relationships, was identical, the model selection criteria (AIC and BIC) were not exactly the same in the different procedures. This may be due to differences in optimization algorithms in STATA versus R, and the discrepancies between the statistical packages therefore need closer scrutiny in future research. An important question in this context is how to determine the validity of different procedures. In this paper, we compared automated analysis to manual analysis. Nevertheless, the manual analysis need not be the gold standard. The major advantage of a manual procedure is that a researcher can make informed decisions about the analysis process in a way that an application like AutoVAR can perhaps never do. These decisions are, however, subjective. They may depend on the researcher’s experience, preference, and “staying power.” As a consequence, valid time series models might be overlooked in a manual procedure. AutoVAR, in contrast, takes into account all possible models, thus following a more objective procedure. A limitation of this latter procedure is the risk of capitalization on chance. By testing many models, AutoVAR may generate more incidental findings.
In the current version of AutoVAR, we tried to minimize this risk in 3 ways: (1) by not running redundant models, (2) by an extensive check of validity assumptions, and (3) by summarizing the results of the Granger causality tests in an image in which the thickness of the arrow indicates the probability of the effect.
The automated processes of the current version of AutoVAR need to be optimized. AutoVAR cannot yet handle missing data. VAR models can be processed with missing values, but this is suboptimal as this usually decreases the number of observations considerably, and thus decreases statistical power. Data collected from assessments completed at nonequidistant time intervals need to be preprocessed before AutoVAR can analyze them. There is as yet no functionality in AutoVAR to use spline smoothing and resampling of data. Moreover, AutoVAR currently functions best when several settings are set manually. The lag length is one of these settings. AutoVAR also has several options that users can choose to check or leave blank, such as setting timestamps and adding additional exogenous variables based on patient information. These issues need to be solved before using automated analysis in health care practice. In addition, the user interface of AutoVAR has a rather technical look and feel and therefore needs a radical redesign to meet the criteria of user-friendliness for health care practice. We are currently working on an improved version of AutoVAR in which we will account for these issues.
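To make the preprocessing step for nonequidistant assessments concrete, the sketch below resamples irregularly timed observations onto an equidistant grid. It uses plain linear interpolation as a simple stand-in, not the spline smoothing mentioned above, and it is not part of AutoVAR; the function name and example values are hypothetical. Whether interpolated values are appropriate for a given VAR analysis is a separate question that this sketch does not address.

```python
def resample_linear(times, values, step):
    """Linearly interpolate (times, values) onto an equidistant grid.

    Assumes `times` is strictly increasing and has at least 2 entries.
    The grid starts at times[0] and advances by `step` up to times[-1].
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more matching time/value pairs")
    n_points = int((times[-1] - times[0]) / step + 1e-9) + 1
    grid, resampled = [], []
    i = 0  # index of the left end of the current interval
    for k in range(n_points):
        t = times[0] + k * step
        # Advance to the interval [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        weight = (t - times[i]) / (times[i + 1] - times[i])
        grid.append(t)
        resampled.append(values[i] + weight * (values[i + 1] - values[i]))
    return grid, resampled

# Hypothetical diary entries completed at irregular times (in days).
grid, resampled = resample_linear([0.0, 1.5, 3.0], [0.0, 3.0, 6.0], step=1.0)
```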
One of the most important limitations of idiographic analyses compared to nomothetic analyses is their presumed limited generalizability. What holds for one individual is not necessarily true for another. Nevertheless, the question is whether this limitation needs to be overcome in the context of health care practice, for in this context the presumed weakness of idiographic research can also be considered one of its main strengths. If the main aim is to elucidate the specific temporal patterns of symptoms or experiences, and their triggers and effects on functioning within one specific patient, then the argument of generalizability to a larger population does not hold. The principal requirement for a meaningful use of intensive time series analysis as a supportive means in diagnostics and treatment of a specific individual is that the models selected provide a good description of the dynamic relationships in the EMA data registered by that very individual. What remains, however, is the issue of generalizability over time, within an individual. Whether the results of time series analysis need to be generalizable to the individual patient at multiple moments depends on the context. In those treatment contexts in which one is mainly interested in the temporal dynamics of variables in a specific time window, a single time series analysis may suffice and its results do not need to be generalizable to other points in time. If, on the other hand, one wants to generalize within one individual over time, for instance when the aim is to unveil the temporal dynamics of variables that are assumed to be rather stable, a second time series analysis is needed to confirm the explorative results of the first analysis.
Finally, instead of having nomothetic research replaced by idiographic research, the ideal situation may be a combination of both. Gates et al [
The benefits of automated time series analysis can only be fully exploited when it is embedded in an “EMA-friendly” health care context. Just like the analysis and interpretation processes, the collection and management of data also need to be facilitated. This may best be realized by integrating time series assessments in the existing information technology infrastructure used by professionals and patients, such as systems for routine outcome monitoring (ROM). In the Netherlands, almost all mental health organizations use electronic ROM systems, which offer professionals and patients the opportunity to select and complete questionnaires and other measurements, of which the results are automatically presented in the electronic patient files. These systems were created for the mandatory yearly routine assessments among patients in which health care effects are examined. However, several systems have been extended with functionality for frequent and repeated assessments; for instance, by means of a diary app [
To facilitate intensive time series measurements, the electronic monitoring systems should include a specified set of reliable instruments that are appropriate for time series analysis of particular variables. From this set of instruments, health care professionals can select the relevant variables for specific patients. Time series diaries might also be automatically composed by having variables selected based on deviating scores on completed ROM measures. Time series measurements need not be restricted to self-report questionnaires. Current technological developments have given rise to smart and consumer-priced mobile devices measuring heart rate, activity, sleeping behavior, and so on. An increasing number of devices have a so-called open application programming interface, meaning that the data collected by these devices can be used by and be integrated into existing applications. Provided that they are validated, these devices can be excellent EMA data collectors. They often collect data automatically, so that minimal input is needed from the person who carries the device.
If patients are willing to participate in intensive time series measurement, they will have to deal with a long series of assessments. Motivation to complete the assessments is therefore crucial. A key element in motivating patients for EMA data collection is demonstrating to patients the personal and theoretical benefits EMA can have for them prior to the assessments [
Future studies should examine whether patients and care professionals are actually willing and able to use time series analysis in an individual care trajectory and how intensive time series analysis can best be integrated into the daily care practice. In addition, we need to investigate whether tailored treatment advice, based on the analysis, can improve clinical outcomes. After all, this is the ultimate test to determine the actual validity of intensive time series analysis for health care practice.
In this paper, we have conducted a proof-of-principle study that has demonstrated the viability of a quantified idiographic approach in health care practice by using automated time series analysis. Compared to a manual procedure, the automated procedure is more robust and saves a significant amount of time. In addition, the output of automated time series analysis can be presented in an intuitive way. These findings may pave the way for health care professionals and those in need of care to use intensive time series analysis as an integral part of the treatment trajectory, without extensive statistical training.
AIC: Akaike information criterion
BIC: Bayesian information criterion
EMA: ecological momentary assessment
ROM: routine outcome monitoring
T: time points
VAR: vector autoregressive
LvdK, ACE, MA, and SS were supported by ZonMw, the Netherlands organization for health research and development; Fonds Psychische Gezondheid; ICT regie; and the Dutch Ministry of Health, Welfare and Sport (grant number 300020011). JGMR and HR were supported by the University Medical Center Groningen.
LvdK wrote the manuscript, with input from all authors. ACE wrote the software for the AutoVAR application, with support of EHB. JGMR, HR, and PdJ conceived the study, with help of SS and MA. JGMR, HR, and SS contributed to the funding of the study. All authors participated in the interpretation of the results, critically reviewed, and approved the final manuscript.
None declared.