This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.
Though artificial intelligence (AI) has the potential to augment the patient-physician relationship in primary care, bias in intelligent health care systems has the potential to differentially impact vulnerable patient populations.
The purpose of this scoping review is to summarize the extent to which AI systems in primary care examine the inherent bias toward or against vulnerable populations and appraise how these systems have mitigated the impact of such biases during their development.
We will conduct a search update from an existing scoping review to identify studies on AI and primary care in the following databases: Medline-OVID, Embase, CINAHL, Cochrane Library, Web of Science, Scopus, IEEE Xplore, ACM Digital Library, MathSciNet, AAAI, and arXiv. Two screeners will independently review all abstracts, titles, and full-text articles. The team will extract data using a structured data extraction form and synthesize the results in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines.
This review will provide an assessment of the current state of health care equity within AI for primary care. Specifically, we will identify the degree to which vulnerable patients have been included, assess how bias is interpreted and documented, and understand the extent to which harmful biases are addressed. As of October 2020, the scoping review is in the title- and abstract-screening stage. The results are expected to be submitted for publication in fall 2021.
AI applications in primary care are becoming increasingly common tools in health care delivery and in preventative care efforts for underserved populations. This scoping review will show the extent to which studies on AI in primary care employ a health equity lens and take steps to mitigate bias.
PRR1-10.2196/27799
Artificial intelligence (AI) is a field of computer science that aims to create systems that are capable of independent reasoning [
Vulnerable populations in health care, such as women and transgender individuals, Black and Latinx populations, and those with low socioeconomic status, represent cohorts of individuals who experience significant baseline health disparities and are at heightened risk of being affected by algorithmic bias [
Primary care is the cornerstone of health care delivery and serves, in theory, as the entry point for most patients into the health care setting [
We selected a scoping review as the best method for assessing the research landscape of AI and health equity in primary care because it offers a way to systematically identify key research gaps, opportunities, evidence, and concepts in this understudied space. This type of review differs from systematic reviews and meta-analyses in that it does not narrow the parameters of the review to a specific quality assessment. Instead, it is a systematic approach to examine the landscape of a research field using broad questions to examine both empirical and conceptual aspects [
A committee of medical professionals at different levels (medical students and attending physicians) with expertise in multiple domains (AI, primary care, and fairness in machine learning) and training in the recognition of health care disparities defined the scope of this study. We used the methodology of Arksey and O’Malley [
Research questions.
Research questions | Operational definitions |
What is the representation of vulnerable individuals in the intended target population for any study on artificial intelligence within primary care? | Vulnerable populations are defined as those with known disparities as described by the following categories: place of residence (eg, rural); race, ethnicity (eg, Black); occupation (eg, coal miners); gender, sex (eg, transgender); religion (eg, Amish); education (eg, high-school only); socioeconomic status (eg, low income); social capital (eg, isolation) |
How well do current studies on artificial intelligence in primary care report different types of bias that may be perpetuated as health disparities by their systems? | Data extraction elements ( |
What interventions do current studies on artificial intelligence in primary care use to address harmful effects of pre-existing biases in their systems? | Example interventions are listed below. Preprocessing: modified data sources; preprocessing data for fairness. Model development: demographic parity; equalized odds/opportunity; disparity regularization; counterfactual fairness. Postprocessing: subgroup analysis; meta-regression; quality assurance |
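Two of the model-development criteria named in the table above, demographic parity and equalized odds, can be made concrete with a small worked example. The following is a minimal sketch and not a method prescribed by this protocol: it assumes binary labels and predictions and a single group attribute, and the function names and toy data are hypothetical. Taking the larger of the true-positive-rate gap and the false-positive-rate gap is one common convention for summarizing equalized odds, assumed here for illustration.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return per-group selection rate, true-positive rate, and false-positive rate."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pred_pos"] += p
        if t == 1:
            c["pos"] += 1
            c["tp"] += p
        else:
            c["neg"] += 1
            c["fp"] += p
    rates = {}
    for g, c in counts.items():
        # Assumption: every group contains both outcome classes; otherwise
        # the rates are undefined and we stop rather than guess.
        if c["pos"] == 0 or c["neg"] == 0:
            raise ValueError(f"group {g!r} lacks positive or negative cases")
        rates[g] = {
            "selection_rate": c["pred_pos"] / c["n"],
            "tpr": c["tp"] / c["pos"],
            "fpr": c["fp"] / c["neg"],
        }
    return rates


def demographic_parity_difference(y_true, y_pred, groups):
    """Largest gap in the rate of positive predictions between any two groups."""
    sel = [r["selection_rate"] for r in rates_by_group(y_true, y_pred, groups).values()]
    return max(sel) - min(sel)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group TPR gap and the between-group FPR gap."""
    rates = rates_by_group(y_true, y_pred, groups).values()
    tprs = [r["tpr"] for r in rates]
    fprs = [r["fpr"] for r in rates]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Hypothetical toy data: two groups, "A" and "B", with four patients each.
toy_true = [1, 1, 0, 0, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

if __name__ == "__main__":
    print(demographic_parity_difference(toy_true, toy_pred, toy_groups))
    print(equalized_odds_difference(toy_true, toy_pred, toy_groups))
```

A value of 0 for either quantity would indicate no measured gap between groups on that criterion; larger values indicate a larger disparity.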
To guide the search strategy for our scoping review, we have developed a number of protocols and parameters. We will use Covidence [
To retrieve all AI and primary care literature, we will use a search strategy and eligibility criteria similar to those documented by Kueper et al [
We will also apply the search strategy and screening criteria to any new articles published since Kueper et al [
Once this process is complete, a final PRISMA flow diagram [
In line with Kueper et al [
We built a preliminary data framework in accordance with the suggestions of Daudt et al [
Data extraction elements.
Category | Elements appraised |
Reviewer information |
Reviewer name; reviewer comments |
Bibliometrics |
First and last name of the first author; title; source; year of publication; country; status of publication |
Primary care function (adapted from Kueper et al [ |
Diagnostic decision support: artificial intelligence–assisted diagnostics; Treatment decision support: artificial intelligence–assisted treatment, including remote management of care; Referral support: artificial intelligence–assisted support for any portion of the referral process, especially for direct referrals of patients to specialist services; Scheduling assistance: models for optimizing clinic schedules and overbooking; Future state prediction: artificial intelligence offering predictions about the future state, such as consult service utilization or prognosis of existing conditions (this excludes predictions of one’s chances of developing a health condition in the near term, which falls under diagnostic decision support); Health care utilization analyses: artificial intelligence extracts information retrospectively to understand more about the current processes or interactions within a health care system; Knowledge base and ontology construction or use; Information extraction: artificial intelligence extracts knowledge from structured or unstructured data sources; Descriptive information provision: artificial intelligence summarizes existing data in interpretable or useful ways; Other: function not represented above, but specifics of function will still be recorded in case a new category emerges |
Author-reported intended end-users |
The intended user of the artificial intelligence product, including but not limited to patients, physicians, nurses, nurse practitioners, administrators, researchers, others, and unknown (if an end-user is not specified as the tool was still in development, a researcher was designated) |
Target health condition (adapted from Kueper et al [ |
General; Diabetes; Cancer, non-skin; Heart valves, murmurs; Musculoskeletal/joint; Dementia, cognitive impairment; Lung apnea, chronic obstructive pulmonary disease; Chronic disease, frailty; Skin cancer; Stroke, neurological; Psychiatric; Coronary artery disease; Heart failure; Hypertension; Other cardiovascular disease; Gastrointestinal/liver; Ear, nose, and throat; Eye and retina; Trauma, emergency surgery; Infection; Metabolic; Kidney and urinary tract; Immunization, reactions; Skin disorders; Obesity; Pediatric/developmental; Other |
Data set |
Size: number of unique patients; Time period if applicable; Source of data: electronic health record, national registry, claims, remote monitoring devices (ie, smart watch or mobile phone), other (specified), or unknown; Number of institutions: single or multiple; Setting (urban, rural, both, or unknown): We use the United States Census’ County Classification Lookup Table [
Compliance with “Ethics Guidelines for Trustworthy AI” [ |
Human agency and oversight: how well does the algorithm support human decision-making and permit oversight on its predictions? Technical robustness and safety: how well-suited is the algorithm for its intended use? How well does it mitigate harm? Privacy and data governance: how well does the algorithm’s data ingestion and analysis pipeline respect patient privacy (eg, HIPAA compliance) and enforce safeguards against unpermitted access? Transparency: does the artificial intelligence algorithm explain reasons for its outputs in a traceable and interpretable way? Diversity, nondiscrimination, and fairness: how biased is the algorithm with regard to its performance? How easy is it for stakeholders to provide feedback on the algorithm’s performance for its continuous development? Societal and environmental well-being: what are the societal (eg, dehumanizing relationships) and ecological (eg, energy consumption) impacts of the algorithm? Accountability: who is held responsible to ensure the algorithm’s development, outcomes, harm, and regulation? |
Model fairness and focus on health equity: is the main purpose of the study specifically outlined to improve health for a vulnerable population (yes/no)? | Must be explicitly stated in the introduction or abstract as motivation for the paper to focus on at least 1 vulnerable population (though there may be other populations studied as well) defined by any of the following categories, which are largely based on the NIMHD Research Framework [ Place of residence (eg, rural); race, ethnicity (eg, Black African American or Latinx); occupation (eg, coal miners); gender, sex (eg, transgender); religion (eg, Amish); education (eg, low); socioeconomic status (eg, low income); social capital (eg, isolation). Does the study include key variables that could reflect disparities across protected classes (eg, age, sex, or race/ethnicity)? If reported, do they include these variables in their evaluation (eg, subgroup analysis to demonstrate equal performance)? Existing biases: does the study discuss biases or potential repercussions related to vulnerable populations? [ Historical bias (ie, data retrieval); representation bias (ie, population representation); measurement bias; aggregation bias; evaluation bias; deployment bias. Bias mitigation: does the study attempt to reduce existing biases, either explicitly or implicitly? If so, what methodology do they employ? Preoutput (changes to the algorithm or input data); postoutput (user education, transparency, and specifying the use case); other |
Stage of the study |
Methodological development: generation of novel artificial intelligence methods or modification of existing artificial intelligence methods to accomplish a task relevant to primary care. Retrospective data analysis or model development: developed an artificial intelligence model trained on retrospectively collected data to identify trends or perform a task that awaits prospective validation. Evaluation: artificial intelligence implemented in the intended setting as part of a pilot study, such as a prospective cohort study or randomized controlled trial. Postimplementation: assessing the impact of an artificial intelligence implementation after officially deployed in its intended setting. |
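To illustrate how a structured data extraction form built from the elements above might be represented, the following is a minimal sketch only. It covers a subset of the listed categories, and the class and field names (`ExtractionRecord`, `StudyStage`, `BiasType`) are hypothetical choices for illustration, not the form actually used by the review team.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class StudyStage(Enum):
    """Stage-of-study categories, worded as in the data extraction table."""
    METHODOLOGICAL_DEVELOPMENT = "Methodological development"
    RETROSPECTIVE = "Retrospective data analysis or model development"
    EVALUATION = "Evaluation"
    POSTIMPLEMENTATION = "Postimplementation"


class BiasType(Enum):
    """Existing-bias categories, worded as in the data extraction table."""
    HISTORICAL = "Historical bias"
    REPRESENTATION = "Representation bias"
    MEASUREMENT = "Measurement bias"
    AGGREGATION = "Aggregation bias"
    EVALUATION = "Evaluation bias"
    DEPLOYMENT = "Deployment bias"


# Allowed values for the bias mitigation element; None means no attempt reported.
MITIGATION_OPTIONS = {None, "Preoutput", "Postoutput", "Other"}


@dataclass
class ExtractionRecord:
    """One reviewer's extraction for one included study (hypothetical subset of fields)."""
    reviewer_name: str
    title: str
    year: int
    primary_care_function: str
    target_health_condition: str
    stage: StudyStage
    equity_is_main_purpose: bool
    biases_discussed: List[BiasType] = field(default_factory=list)
    mitigation_approach: Optional[str] = None

    def __post_init__(self):
        # Reject free-text values so that later tallies stay consistent.
        if self.mitigation_approach not in MITIGATION_OPTIONS:
            raise ValueError(f"unknown mitigation approach: {self.mitigation_approach!r}")


if __name__ == "__main__":
    # Hypothetical example entry; not a real study.
    record = ExtractionRecord(
        reviewer_name="Reviewer 1",
        title="Example study",
        year=2020,
        primary_care_function="Diagnostic decision support",
        target_health_condition="Diabetes",
        stage=StudyStage.RETROSPECTIVE,
        equity_is_main_purpose=False,
        biases_discussed=[BiasType.REPRESENTATION],
        mitigation_approach="Preoutput",
    )
    print(record.stage.value)
```

Constraining categorical elements to fixed values, as sketched here, is one way to keep two independent reviewers' entries comparable.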
Our analysis will involve both a descriptive numerical summary and an interpretive synthesis. While our approach in stage 5 will be an iterative process, we will use this section to first provide descriptive tables, frequency tables, and visual representation of the results. Further synthesis will be performed to identify current obstacles, gaps, and opportunities in the literature.
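As a rough illustration of the descriptive numerical summary, the sketch below tallies one extracted element across studies into a frequency table. It is an assumption-laden example rather than the analysis plan itself: the `frequency_table` helper, the dictionary keys, and the toy records are all hypothetical.

```python
from collections import Counter


def frequency_table(records, key):
    """Return (value, count, percent) rows for `key`, most frequent first.

    Records missing `key` are counted under "Unknown". Ties are broken
    alphabetically so the output order is deterministic.
    """
    counts = Counter(r.get(key, "Unknown") for r in records)
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(value, n, round(100 * n / total, 1)) for value, n in ordered]


# Hypothetical extracted records; not real study data.
toy_records = [
    {"stage": "Evaluation"},
    {"stage": "Retrospective data analysis or model development"},
    {"stage": "Retrospective data analysis or model development"},
    {"stage": "Methodological development"},
]

if __name__ == "__main__":
    for value, n, pct in frequency_table(toy_records, "stage"):
        print(f"{value}: {n} ({pct}%)")
```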
Our scoping review will include consultation with other AI researchers in academia, nonprofit, and industry to enhance the perspective, applicability, and purpose of our study and ultimately offer more practical recommendations. We will engage with stakeholders at three timepoints: (1) prior to the submission of this protocol, (2) during the finalization of the data collection framework, and (3) at the end of the study during the collation, summarization, and reporting of the results.
Electronic database searches were conducted in October 2020, and title and abstract screening is currently underway. We expect to complete the remaining steps of the scoping review, including publication, by fall 2021.
To our knowledge, this will be the first scoping review that applies an equity lens to the existing literature on AI in primary care. Primary care has a large potential to reduce costs and improve quality of life, especially for underserved populations [
After completing this scoping review, we will write a briefing paper to address the implications of the findings in a narrative. We will also develop a manuscript and PRISMA-ScR checklist to submit for publication.
Our scoping review will not incorporate a peer review process for our search strategy, although this is recommended in the Peer Review of Electronic Search Strategies [
AI has immense potential to improve the patient-physician relationship by augmenting physician capabilities. Primary care is an especially viable area for the integration of AI, given its role as an early entry point, the broad range of vulnerable populations it serves, the heavy toll that socioeconomic factors take on patient care, and the need to address these factors to manage disease more effectively. However, algorithms are susceptible to performance disparities across different subgroups, which may further reinforce pre-existing health inequities if not rigorously assessed before deployment. With this scoping review protocol, we aim to provide a process to assess the state of AI in primary care for vulnerable populations.
Search strategy and overview for Kueper et al [15].
PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) flow diagram from Kueper et al [15].
AI: artificial intelligence
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews
We are very grateful to Jill Barr-Walker, who served as our clinical librarian consultant in this study and assisted us with protocol development and validation of our search queries. US received funding from the National Institutes of Health’s National Cancer Institute Midcareer Investigator Award (grant K24CA212294).
US is funded by the National Institutes of Health’s National Cancer Institute, the California Healthcare Foundation, the Center for Care Innovation, the United States Food and Drug Administration, the National Library of Medicine, and the Commonwealth Fund. She is also supported by an unrestricted gift from the Doctors Company Foundation. She has received prior funding from the United States Department of Health and Human Services’ Agency for Healthcare Research and Quality, Gordon and Betty Moore Foundation, and the Blue Shield of California Foundation. She holds contract funding from AppliedVR, Inquisithealth, and Somnology. Furthermore, US serves as a scientific/expert advisor for the nonprofit organizations HealthTech 4 Medicaid and HopeLab. She has been a clinical advisor for Omada Health and an advisory panel member for Doximity. SS is a co-founder and equity holder in Monogram Orthopedics. JHC is supported in part by the National Institutes of Health/National Library of Medicine via Award R56LM013365 and Stanford Clinical Excellence Research Center (CERC), is the co-founder of Reaction Explorer LLC, which develops and licenses organic chemistry education software, and has been paid consulting or speaker fees by the National Institute on Drug Abuse Clinical Trials Network, Tuolc Inc, Roche Inc, and Younker Hyde MacFarlane PLLC.