This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on http://www.researchprotocols.org, as well as this copyright and license information must be included.
A considerable proportion of outdoor physical activity (PA) is done on sidewalks and streets, necessitating the development of a reliable measure of PA performed in these settings. The Block Walk Method (BWM) is one of the more common approaches for this purpose. Although it utilizes reliable observation techniques and displays criterion validity, it has remained relatively unchanged since its introduction in 2006. It is a nontechnical, labor-intensive, first-generation method. Advancing the BWM would contribute significantly to our understanding of PA behavior.
This study will develop and test a new BWM that utilizes a wearable video device (WVD) and computer video analysis to assess PAs performed on sidewalks and streets. The specific aims are to improve the BWM by incorporating a WVD (eyeglasses with a high-definition video camera in the frame) into the methodology and advance this WVD-enhanced BWM by applying machine learning and recognition software to automatically extract information on PAs occurring on the sidewalks and streets from the videos.
Trained observers (1 wearing and 1 not wearing the WVD) will walk together at a set pace along predetermined 1000 ft sidewalk and street observation routes representing low, medium, and high walkable areas. During the walks, the non-WVD observer will use the traditional BWM to record the numbers of individuals standing, sitting, walking, biking, and running in observation fields along the routes. The WVD observer will continuously video the observation fields. Later, 2 investigators will view the videos to determine the number of individuals performing PAs in the observation fields. The video data will then be analyzed automatically using multiple deep convolutional neural networks (CNNs) to determine the number of humans in the observation fields and the type of PAs performed. Bland Altman methods and intraclass correlation coefficients (ICCs) will be used to assess agreement. Potential sources of error such as occlusions (eg, trees) will be assessed using moderator analyses.
Outcomes from this study are pending; however, preliminary studies supporting the research protocol indicate that the BWM is reliable for determining the PA mode (Cramer V=.89;
We expect the new approach will enhance measurement accuracy while reducing the burden of data collection. In the future, the capabilities of the WVD-CNN system will be expanded to allow for the determination of other characteristics captured in videos such as caloric expenditure and environmental conditions.
PRR1-10.2196/12976
Physical inactivity facilitates the development of chronic diseases including obesity, cardiovascular disease, type 2 diabetes, and some cancers and independently contributes to nearly 11% of total annual US health care expenditures [
Neighborhood built environment characteristics have been studied extensively over the past 10 years and are some of the strongest correlates of PA [
Studies and evaluations of PAs performed on sidewalks and streets, whether to detect changes in usage or determine how associated environmental conditions impact their usage, necessitate a reliable, accurate, and easily administered approach for assessing PA. Self-report questionnaires are hampered by recall bias, plus they have not been adequately validated for geo-locating PAs [
In contrast, the observation method is a reliable approach to counting the number of individuals engaged in various PAs in different environmental settings [
The BWM uses time sampling techniques in which observers walk predefined segments of sidewalks and streets at a set pace while systematically recording the number of individuals performing activities of interest (eg, walking and cycling). Compared with pedestrian counts, the BWM captures a substantially greater proportion of the sidewalks and streets and thus a wider spectrum of environmental exposures and a richer context in which to explore PA behavior. Mobile observers, as used in the BWM, provide an objective, precise, scientifically rigorous, and replicable way to assess PAs performed in diverse environmental conditions. Despite the BWM’s many benefits, it has not been updated since its introduction in 2006, and limitations inherent in its original design are still present. In its current form, the BWM is time-consuming, requires extensive training, and has questionable accuracy when observing larger groups.
The extension of video technology within mobile and wearable video devices (WVDs) provides extraordinary opportunities for objectively measuring georeferenced imagery including sidewalk and street users in real time. It is now feasible to leverage these technologies to supplement or replace the traditional observational methods used by the BWM. Until recently, video recording devices were bulky, and the video resolutions were crude. Video recorders can now be embedded into the frame of a pair of sunglasses or attached to an unmanned aerial vehicle to provide a completely new, more robust vantage point. Although video capture has not been used to study PAs on sidewalks and streets, it has been used along with computer vision techniques to identify and classify people in different PA intensities (eg, light, moderate, and vigorous) [
As described above, the BWM (and PA observation methods in general) has limitations. Whether today’s technology can be used to alleviate these limitations in human populations is virtually unknown. The proposed study seeks to develop and test a new BWM that utilizes a WVD and computer video analysis to assess PAs performed on sidewalks and streets. The following aims will be completed to accomplish this objective:
For this cross-sectional study, we will first identify low, medium, and high walkability areas in cities of different sizes. Afterwards, we will randomly select a sample of observation routes (1000-ft-long street segments) from each walkability and city stratum. The BWM will then be conducted along each observation route on 2 different days and at 6 different times. Two observers will perform the BWM simultaneously: 1 observer will follow the traditional BWM procedures, whereas the other will walk side by side with this observer and record video using the WVD. Later, 2 investigators will review the videos and, based on the BWM criteria for counting individuals, derive independent counts of individuals being physically active on sidewalks and streets. Comparative analyses will be conducted to determine the equivalence of the 2 approaches.
We are stratifying our sample to observe PAs occurring along sidewalks and streets given a wide range of conditions related to city size and walkability. We selected 3 cities: West Chester, Pennsylvania; Wilmington, Delaware; and Philadelphia, Pennsylvania that are small, medium, and large in terms of population, respectively (
Drawing from our familiarity with the study cities and examinations of aerial maps, we will identify 3 neighborhoods per city that we estimate to be of low, medium, and high walkability. This is being done to streamline the process because there are 44, 92, and 160 defined neighborhoods in West Chester, Wilmington, and Philadelphia, respectively. Afterwards, we will measure walkability for each selected neighborhood using WalkScore. As WalkScores can vary within a neighborhood, we will base a neighborhood’s WalkScore on the average of WalkScores for 10 randomly selected addresses drawn from a list of all addresses in the neighborhood. This process will be repeated until 1 low (WalkScore ≤33), 1 medium (WalkScore >33 to ≤66), and 1 high (WalkScore >66) walkable neighborhood is located in each city, giving us a total of 9 neighborhoods. We are using WalkScore because it is a valid measure for estimating walkability [
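The neighborhood classification rule described above can be illustrated with a minimal sketch. It assumes the address-level WalkScores have already been obtained; the function name `classify_walkability` and the input format are hypothetical and not part of the study protocol. Only the averaging step and the cut points (33 and 66) come from the text.

```python
from statistics import mean


def classify_walkability(address_scores):
    """Average address-level WalkScores and assign a walkability stratum.

    address_scores: WalkScores (0-100) for the randomly selected addresses
    in one neighborhood (the protocol uses 10 addresses per neighborhood).
    Returns (average score, stratum label).
    """
    if not address_scores:
        raise ValueError("at least one address score is required")
    avg = mean(address_scores)
    # Cut points taken from the text: <=33 low, >33 to <=66 medium, >66 high.
    if avg <= 33:
        stratum = "low"
    elif avg <= 66:
        stratum = "medium"
    else:
        stratum = "high"
    return avg, stratum
```

For example, a neighborhood whose sampled addresses average 60 would be assigned to the medium stratum.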
The total linear length of sidewalks and streets in the 9 neighborhoods will be estimated using the ruler tool in Google Earth (a geobrowser that accesses satellite, aerial imagery, and other geographic data to represent the Earth as a 3-dimensional globe). The ruler tool is a geographical information systems-based application with submeter resolution. We have found the ruler tool to be accurate to within ±1.5% for measuring street segment lengths. Based on our previous work, we expect an average of 180,000 total linear ft. of sidewalks and streets per neighborhood [
Each observation route in a neighborhood will be observed 3 times on a weekday and 3 times on a weekend day, which will give us a stable estimate of the outcome variable [
During an observation period, 2 trained observers (1 wearing a WVD and the other not wearing a WVD) will traverse an observation route at a pace of 100 ft/min (50 steps/min [largo]; step length 2 ft; pace set by a battery-powered metronome). The observer without the WVD will record the number of individuals engaging in the targeted activities within an observation field. The observation field will be defined as a line extending to the left and right of the observer’s shoulders, perpendicular to the observer’s plane of motion. The observation fields are expected to range in width from 30 to 70 ft and include both sidewalks (if present) and the streets associated with an observation route. Individuals will be counted only if they cross a parallel plane of motion with the observer (
Block Walk Method procedure.
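The pacing arithmetic behind the procedure can be checked with a short sketch. It assumes the 2 ft figure is the distance covered per metronome beat (one step); the function names are illustrative only.

```python
def walking_speed_ft_per_min(steps_per_min, step_length_ft):
    """Speed implied by a metronome cadence and a fixed distance per step."""
    return steps_per_min * step_length_ft


def route_duration_min(route_length_ft, steps_per_min, step_length_ft):
    """Minutes needed to traverse one observation route at the set pace."""
    speed = walking_speed_ft_per_min(steps_per_min, step_length_ft)
    return route_length_ft / speed
```

With 50 steps/min and 2 ft per step, the speed is 100 ft/min, so a single pass along a 1000 ft observation route takes 10 minutes.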
The primary outcome variable for aim 1 is the number of individuals observed walking, cycling, running, and standing/sitting along each observation route per 50 min of observation.
The study’s 2 principal investigators will conduct independent evaluations of the videos obtained during the BWM. This will be done over a 1-year period beginning after the first week of BWMs is completed. They will use the BWM criteria to count individuals walking, cycling, running, and sitting/standing on sidewalks and streets along the observation routes.
All observers will participate in 2 training sessions before beginning data collection. During the first training session, they will be given detailed instructions on the BWM and procedures to be used. The second training session will involve mock field observations.
Data on meteorological conditions (rainfall, relative humidity, temperature, wind speed, and barometric pressure) for the exact time of day observations are conducted will be obtained from an automated weather sensor system located at the local airport.
The Pivothead Smart (Pivothead, Denver, CO) is a state-of-the-art, noninvasive WVD indistinguishable from a pair of normal sunglasses (
The Pivothead sunglasses used in this study.
Example of Pivothead sunglasses being worn.
High resolution image taken with Pivothead glasses.
We will use the videos of observation routes assessed in aim 1.
The WVD video data, along with annotated ground truth for each human and feature of interest, will be analyzed automatically using multiple deep CNNs. The first deep CNN will be used in collaboration with the Simple, Online, and Real-time Tracking algorithms to determine the number of humans in BWM videos who cross the path of the observer (criteria for being counted) and the distance they traveled per unit time before crossing paths with the observer. For each human in the video, a bounding box will be drawn around their pixels, with identifying information such as faces blurred automatically. Once the humans in the scene are identified, activity recognition will be the next step. Activities will include standing/sitting, walking, cycling, and running. For bicycle riders, the answer is already given by the detection algorithm. For other activities, a new, separate deep network can be applied to classify the target behavior. An activity is a temporal event that is defined across many frames, so a recurrent neural network will need to be designed to handle this. These networks must be tested and fine-tuned for ground-level views. There are several state-of-the-art networks to choose from, but because of the dynamic nature and heterogeneous viewpoints, a new network architecture may be necessary. The output of the automatic methods can be compared against ground truth to give an accuracy score for how reliable the automatic methods are.
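To make the counting step concrete, the following is a minimal sketch of how tracker output might be turned into a count of individuals who pass the observer. It is not the study's implementation. It assumes that a detector and tracker have already produced, for each track ID, a list of (frame index, horizontal box center in pixels), and it uses a hypothetical proxy for the BWM criterion: a person is counted when their track is last seen near the left or right edge of the frame, as would happen when they pass the forward-facing camera. The name `count_crossings` and the `edge_margin` parameter are illustrative assumptions.

```python
def count_crossings(tracks, frame_width, edge_margin=0.05):
    """Count tracks whose last observed position lies near a frame edge.

    tracks: dict mapping track_id -> list of (frame_index, x_center_px).
    frame_width: video frame width in pixels.
    edge_margin: fraction of the frame width treated as the edge zone
                 (hypothetical tuning parameter).
    """
    left_limit = frame_width * edge_margin
    right_limit = frame_width * (1 - edge_margin)
    counted = set()
    for track_id, observations in tracks.items():
        if not observations:
            continue  # skip empty tracks
        # Use the observation with the largest frame index as the exit point.
        _, last_x = max(observations, key=lambda obs: obs[0])
        if last_x <= left_limit or last_x >= right_limit:
            counted.add(track_id)
    return len(counted)
```

A rule of this kind would need to be validated against the annotated ground truth, because occlusions or lost tracks can end a track in the middle of the frame.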
The primary outcome variable for aim 2 is the number of individuals observed walking, cycling, running, and sitting/standing along each observation route per 50 min of observation.
Before developing statistical models, an examination of the univariate distribution of variables will be conducted (eg, scatter plots). Statistics such as means or proportions, SEs, ranges, and estimates of skewness and kurtosis will be derived. Data transformation procedures (eg, logarithmic) may be applied to quantitative variables whose distribution shows considerable departure from normality. Bland-Altman plots will be used to assess agreement on quantitative measures between the traditional BWM and the WVD manual video analysis and between the WVD manual video analysis and the automated video analysis [
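As an illustration of the agreement analysis, the sketch below computes the quantities usually shown on a Bland-Altman plot: the mean difference (bias) between 2 methods and the 95% limits of agreement (bias ± 1.96 SD of the differences). The function name and the input format (paired counts per observation route) are assumptions made for illustration; the protocol does not specify an implementation.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    method_a, method_b: equal-length sequences of counts for the same
    observation routes, eg, traditional BWM vs manual video review.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods must have the same number of values")
    if len(method_a) < 2:
        raise ValueError("at least 2 paired values are required")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias near 0 with narrow limits of agreement would indicate that the 2 methods yield similar counts.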
Our research team has published 3 peer-reviewed journal articles examining the use of the BWM. In the first study, the BWM was used in 12 urban US census block groups to record the number of individuals walking, cycling, and running on sidewalks and streets and the geographical location (address) where they were observed [
As the first study was limited to urban areas, we conducted a second study of the BWM in suburban settings [
The third study was designed to determine if PAs observed using the BWM were associated with environmental characteristics [
We have previously deployed CNNs to detect cars, bikes, and pedestrians at busy intersections in collaboration with the Delaware Department of Transportation. Using a GoPro Hero Silver 3 with 720p resolution at 30 fps, videos of pedestrians and cars were recorded over the course of a few hours. Using a modification of You Only Look Once with additional postprocessing, pedestrians, bicycle riders, and cars were automatically and accurately detected from the video (97% agreement with human detection). Tracking was performed with the Simple, Online, and Real-time Tracking algorithm, which uses a deep network for feature extraction and matching and a Kalman filter to improve the reliability [
Efforts to increase PA need to reach a large portion of the population, and community-level interventions are highly recommended for this purpose. To support accurate assessment of their effectiveness, the proposed study will develop a new BWM that uses current technology to capture and analyze video data for the purpose of measuring PAs performed on sidewalks and streets. At this study’s completion, we expect to have demonstrated that a WVD can be used to improve the acquisition and accuracy of data collected using the BWM and that machine learning and recognition software can be used to automatically extract information on PAs occurring on the sidewalks and streets from the videos.
The outcomes from this study have the potential to establish new levels of accuracy for measuring PA on sidewalks and streets and advance the study of PA by using machine learning (deep CNNs) to automatically extract relevant data from the videos. In addition, the proposed study will lead to further developments in this area that will allow for other important characteristics captured by the WVD to be determined with deep CNNs including geographical-level (eg, street segment and park) caloric expenditure, demographics (eg, sex and age), health status (eg, body mass index) as well as current environmental conditions that could affect PA (eg, acts of incivility and weather). Therefore, the potential exists for this study to not only create a novel and valuable tool for researchers but develop an approach that could be easily used by public health officials, government agencies, and numerous other community groups.
We expect the WVD to experience technical difficulties at times. In recent months, we have been working with the Pivothead Smart, and on a few occasions the device stopped recording during use or failed to upload videos to a computer, which was traced to a faulty cable. To correct or minimize these issues, we will provide observers with a reserve pair of glasses and keep additional cables on hand.
It is probable that some observations will be conducted in high-crime areas, making it unsafe for data collectors. We have encountered this in previous studies and addressed this by having a law enforcement officer accompany data collectors when necessary.
The Hawthorne effect is the alteration of behavior by the subjects of a study because of their awareness of being observed. Although this is a valid concern, in our past studies using the BWM we have not found any noticeable reaction to the observers. This is likely because the observers do not stand out and appear simply as individuals walking down the sidewalk. If people do react to the observers, it would most likely be because the observers walk at a slow pace and periodically write in a notebook while walking. We expect this concern to be eliminated with the use of the video glasses, which are indistinguishable from regular glasses.
City characteristics.
NIH Summary Statement.
BWM: Block Walk Method
CNN: convolutional neural network
GPS: global positioning system
ICC: intraclass correlation coefficient
PA: physical activity
WVD: wearable video device
None declared.