Background: Lupus is a complex autoimmune disease that is difficult to diagnose and treat. It is estimated that at least 1.5 million Americans have lupus, with more than 16,000 new cases of lupus being reported annually in the United States. Social media provides a platform for patients to find rheumatologists and peers and build awareness of the condition. Researchers have suggested that the social network Twitter may serve as a rich avenue for exploring how patients communicate about their health issues. However, there is a lack of research about the characteristics of lupus patients on Twitter and their attitudes toward using Twitter for engaging them with their health care.
Objective: This study has two objectives: (1) to conduct a content analysis of Twitter data published by users (in English) in the United States between September 1, 2017 and October 31, 2018 to identify patients who publicly discuss their lupus condition and to assess their expressed health themes and (2) to conduct a cross-sectional survey among these lupus patients on Twitter to study their attitudes toward using Twitter for engaging them with their health care.
Methods: This is a mixed methods study that analyzes retrospective Twitter data and conducts a cross-sectional survey among lupus patients on Twitter. We used Symplur Signals, a health care social media analytics platform, to access the Twitter data and analyze user-generated posts that include keywords related to lupus. We will use descriptive statistics to analyze the data and identify the most prevalent topics in the Twitter content among lupus patients. We will further conduct self-report surveys via Twitter by inviting all identified lupus patients who discuss their lupus condition on Twitter. The goal of the survey is to collect data about the characteristics of lupus patients (eg, gender, race/ethnicity, educational level) and their attitudes toward using Twitter for engaging them with their health care.
Results: This study has been funded by the National Center for Advancing Translational Sciences through a Clinical and Translational Science Award. The institutional review board at the University of Southern California (HS-19-00048) approved the study. Data extraction and cleaning are complete. We obtained 47,715 Twitter posts containing terms related to “lupus” from users in the United States published in English between September 1, 2017 and October 31, 2018. We included 40,885 posts in the analysis. Data analysis was completed in Fall 2020.
Conclusions: The data obtained in this pilot study will shed light on whether Twitter provides a promising data source for garnering health-related attitudes among lupus patients. The data will also help to determine whether Twitter might serve as a potential outreach platform for raising awareness of lupus among patients and implementing related health education interventions.
International Registered Report Identifier (IRRID): DERR1-10.2196/15716
Background and Rationale
Lupus is a chronic disease characterized by an autoimmune response that can range in its frequency and affect any part of the body (skin, joints, and organs). It is estimated that at least 1.5 million Americans have lupus, with more than 16,000 new cases of lupus being reported annually in the United States. The condition mostly strikes women of childbearing age, and women of color are 2-3 times more likely to develop lupus than Caucasian women. However, the disease can present in men and children as well.
Lupus is a difficult disease to diagnose, as its symptoms can often mimic those of other diseases. Systemic lupus erythematosus (SLE), the most common form of lupus, has been reported to remain undiagnosed in some populations for an average of 6 years. SLE tends to present more abruptly and cause more damage in patients of color. This often comes in the form of a spike in disease activity called a “flare,” which, without treatment, can lead to organ damage and failure. Therefore, early diagnosis is essential for patients with lupus.
As most people with lupus develop the disease between the ages of 15 years and 44 years, we hypothesize that social media provides a potentially promising tool for raising awareness and supporting early diagnosis and management of lupus. This study aims to shed light on the use of Twitter among patients who publicly discuss their lupus condition on the platform and to assess their attitudes toward using Twitter to engage them with their health care.
The term “social media” describes widely accessible web-based and mobile technologies that allow users to view, create, and share information online and to participate in social networking. Social media provides both a unique data source for data mining of health concerns and related attitudes and an unprecedented opportunity for delivering information to large segments of the population as well as hard-to-reach subpopulations. Today, more than 70% of American adults use at least one type of social media.
The Social Network Twitter
The social network Twitter is used by 23% of American adults, and users are diverse, including Hispanics (25%), Blacks (24%), and Whites (21%). Twitter users can post short messages (tweets) that are limited to 280 characters. They can search for any public message and engage with tweets (ie, they can “like,” reply to, and “retweet” [share] them). By default, Twitter account information such as the profile name, description, and location is public unless a user opts out and makes the account private. Previous research has suggested that Twitter provides a “rich and promising avenue for exploring how patients conceptualize and communicate about their health issues.” The increasing use of Twitter among members of disease communities is further evidenced by the abundance of disease and health topic hashtags used in the messages. A hashtag is a word or phrase preceded by a hash or pound sign (#) and used to identify messages on a specific topic (eg, #lupus, #spoonies). However, there is little information about the use of social media among lupus patients or about their attitudes toward using Twitter for engaging them with their health care.
Previous Research on Social Media and Lupus
The emergence of social media has created new sources of analyzable data and led to new research fields (ie, infodemiology and infoveillance). The data that social media users generate through their online activities are referred to as their digital footprint or social mediome. On Twitter, for example, health surveillance researchers have used these data to gain insight into public perspectives on a variety of diseases and health topics such as influenza, autism, schizophrenia, smoking, and HIV/AIDS. In some cases, social media user data demonstrated a correlation between disease prevalence and the frequency with which Twitter users discussed a disease. The investigators are not aware of lupus-related surveillance research that involved the social network Twitter.
However, previous research examined user-generated content about lupus on Facebook. The authors looked at the representation of health conditions and found that lupus-related pages ranked the highest for patient support. Additionally, a patient commentary highlighted the use of social media, in particular Twitter, among lupus patients to find rheumatologists, specialist care, and peers and to build awareness of their health needs and experiences. To our knowledge, there are no studies that have leveraged Twitter to improve the understanding of attitudes among patients with lupus.
Study Objective and Research Questions
This study has two objectives: (1) to conduct a content analysis of Twitter data published by users (in English) in the United States between September 1, 2017 and October 31, 2018 to identify patients who publicly discuss their lupus condition and to assess their expressed health themes and (2) to conduct a cross-sectional survey among the lupus patients on Twitter to study their attitudes toward using Twitter for engaging them with their health care.
Our findings will shed light on whether Twitter provides a promising data source for garnering insights and attitudes about lupus expressed among patients. The findings will help to determine whether Twitter might serve as a potential outreach platform for raising awareness of lupus among patients and implementing related health education interventions.
This is a mixed methods study that analyzes retrospective Twitter data and conducts a cross-sectional survey among lupus patients on Twitter.
This study will analyze user-generated posts in English from the social network Twitter that include keywords related to “lupus” (the keyword list is provided as a multimedia appendix) and were published between September 1, 2017 and October 31, 2018. To access public Twitter user data, we used Symplur Signals, a health care social media analytics platform. We limited the dataset to posts from users with locations in the United States.
Twitter posts containing terms related to “lupus” were obtained for the period between September 1, 2017 and October 31, 2018. We applied the approach suggested by Kim et al to develop the search filters. The terms can appear in the text of the post or in an accompanying hashtag, for example, Lupus or #LupusChat. We selected keywords and hashtags based on expert knowledge (clinicians, social media experts) and a systematic search of topic-related language based on data in Symplur Signals.
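The matching rule described above (a term may appear in the post text or as a hashtag) can be sketched as follows. This is a minimal illustration and not the Symplur Signals query itself: the term list is a hypothetical stand-in for the study's keyword list, and the function name `matches_lupus_terms` is an assumption made for this example.

```python
import re

# Hypothetical term list for illustration only; the study's actual keywords
# and hashtags are listed in its multimedia appendix.
LUPUS_TERMS = ["lupus", "lupuschat"]

# Match a term as a whole word or as a hashtag, ignoring case.
_PATTERN = re.compile(
    r"(?<!\w)#?(?:" + "|".join(re.escape(t) for t in LUPUS_TERMS) + r")(?!\w)",
    re.IGNORECASE,
)


def matches_lupus_terms(text: str) -> bool:
    """Return True if the post text contains any search term or hashtag."""
    return _PATTERN.search(text) is not None
```

Whole-word matching is a design choice of this sketch: it avoids counting unrelated words that merely contain a search term as a substring.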
Data Cleaning and Debiasing
The following types of irrelevant tweets were excluded: (1) non-English language tweets identified using the method of Lui and Baldwin (langid.py), (2) retweets (ie, messages shared by Twitter users that other users composed), and (3) messages that originated from outside the United States. Locating users in the United States was accomplished using a mapped location filter provided by Twitter GNIP through the “Profile Geo Enrichment” algorithm (formerly known as GNIP’s Profile Geo 2.0, which was acquired by Twitter). This Twitter data service is among the most commonly used data sources in academic Twitter surveillance research. To determine a user’s location, the algorithm uses a number of data points, including the self-reported “Bio Location” in the Twitter user profile and geotracking data if available. The Profile Geo service adds “structured geodata relevant to the user location value by geocoding and normalizing location strings where possible.” Research using a similar multi-indicator method to infer the location of the user showed the capability of locating 92% of all tweets. However, the Profile Geo service only attempts to determine the best choice for the geographic place described in the profile location string. We acknowledge that the results may not be accurate in all cases due to factors such as multiple places with similar names or ambiguous names. If a value is not provided in a user’s profile location field, the Profile Geo service does not provide a classification.
Because we aim to understand patient attitudes, we relied on machine learning to identify Twitter posts by social bots or marketing-oriented accounts that could influence the results and introduce bias. We used the program BotOrNot to identify those Twitter accounts. Messages from these accounts were removed from the dataset to focus the analysis on patient perspectives. BotOrNot has a reported detection accuracy above 95%.
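The three exclusion rules above can be sketched as a simple filter. This is an illustrative sketch under stated assumptions, not the study's pipeline: the `Post` fields and the function name `clean_posts` are hypothetical, the language detector is passed in as a function, and posts without a mapped location are dropped here because the dataset is limited to users located in the United States.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Post:
    text: str
    is_retweet: bool
    country_code: Optional[str]  # None when no profile location could be mapped


def clean_posts(
    posts: Iterable[Post], detect_language: Callable[[str], str]
) -> List[Post]:
    """Apply the three exclusion rules: non-English, retweets, non-US."""
    kept = []
    for post in posts:
        if detect_language(post.text) != "en":
            continue  # rule 1: non-English post
        if post.is_retweet:
            continue  # rule 2: retweet
        if post.country_code != "US":
            continue  # rule 3: outside the US or location unknown (assumption)
        kept.append(post)
    return kept
```

If the langid.py package is installed, one option would be to pass `lambda text: langid.classify(text)[0]` as `detect_language`, since `langid.classify` returns a (language, score) pair.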
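Removing posts from accounts flagged as bots can be sketched as below. The sketch assumes that each account already has a bot-likelihood score between 0 and 1 from a classifier such as BotOrNot; the 0.5 cutoff, the data layout, and the function name `remove_bot_posts` are assumptions for illustration, as the protocol does not state a threshold.

```python
from typing import Dict, Iterable, List, Tuple


def remove_bot_posts(
    posts: Iterable[Tuple[str, str]],
    bot_scores: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the protocol
) -> List[Tuple[str, str]]:
    """Keep (account_id, text) pairs whose account scores below the threshold.

    Accounts without a score are kept; this is an assumption of the sketch.
    """
    return [
        (account, text)
        for account, text in posts
        if bot_scores.get(account, 0.0) < threshold
    ]
```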
Data Collection and Confidentiality
Any identifying and personal health information was redacted from the dataset by the coders. Since the “Tweet ID,” “Tweet URL,” “Profile thumbnail URL,” “Username,” and “Display Name” in the dataset can potentially identify a person directly, we removed these from the initial data collection sheet and used a unique code identifier instead. We maintained the link between the unique code and the identifiable elements in a separate file. We retained the data only for use in this project and destroyed the identifiable information (Tweet ID, Tweet URL, Profile thumbnail URL, Username, and Display Name) prior to the data analysis, as requested by the local IRB.
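The de-identification step described above can be sketched as follows: identifying fields are moved into a separate link table keyed by a random code, and only the code stays with the analysis data. The field names, the per-record code, and the function name `pseudonymize` are assumptions for illustration, not the study's actual data schema.

```python
import secrets
from typing import Dict, List, Tuple

# Hypothetical column names corresponding to the identifying elements.
IDENTIFYING_FIELDS = (
    "tweet_id",
    "tweet_url",
    "profile_thumbnail_url",
    "username",
    "display_name",
)


def pseudonymize(
    records: List[Dict[str, str]],
) -> Tuple[List[Dict[str, str]], Dict[str, Dict[str, str]]]:
    """Split records into de-identified rows and a separate link table."""
    analysis_rows = []
    link_table: Dict[str, Dict[str, str]] = {}
    for record in records:
        code = secrets.token_hex(8)
        while code in link_table:  # regenerate in the unlikely case of a clash
            code = secrets.token_hex(8)
        link_table[code] = {
            field: record[field] for field in IDENTIFYING_FIELDS if field in record
        }
        row = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        row["study_code"] = code
        analysis_rows.append(row)
    return analysis_rows, link_table
```

Keeping the link table in a separate file means it can be destroyed before analysis without touching the analysis dataset.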
The data will be retained in a secure database called REDCap at USC. The anonymous data will be kept for future research. Individuals are informed that they should not participate in the study if they do not want their data kept.
The proposed survey study involves contacting lupus patients in the United States who discuss their health in English on Twitter. Eligible survey respondents will be patients with lupus 18 years of age and older. To focus on feasibility, we will limit this pilot to lupus patients who discuss their health on Twitter. Other individuals who talk about how the condition affects a family member or friend (eg, parents, siblings) will be excluded from this study.
The goal of the survey is to collect data about the characteristics of lupus patients (eg, gender, race/ethnicity, educational level) and their attitudes toward using Twitter for health care engagement (eg, How concerned are you about researchers using Twitter user information to identify patients with lupus? How interested are you in getting information related to lupus via Twitter? How interested are you in receiving personalized information about ongoing research and clinical research opportunities on Twitter?). The full survey is provided as a multimedia appendix.
Recruiting Patients With Lupus via Twitter
We will conduct self-report surveys via Twitter by inviting all identified lupus patients who discuss their health on Twitter. We will recruit via the project Twitter account using a personalized message package approach (the recruitment messages are provided as a multimedia appendix) and replying to a user’s most recent Twitter message in which they mention their lupus condition. Sending multiple messages will allow us to introduce the research project and research team to ensure investigator transparency, ask recipients to follow the project Twitter account, and remind them of the privacy risks of using Twitter. Via the URL link in the message, interested users will be directed to a webpage that includes more information about the study (a screenshot of the study information page is provided as a multimedia appendix). The page will be hosted by the USC Clinical Studies Directory, a public tool that allows anyone to search for clinical research studies at USC. Only those recipients who decide to follow the project account will be able to receive the link to the survey via a private, direct message. The survey will be available in English and can be completed on any computer, tablet, or smartphone. In the case of no response, reminders will be sent up to 4 weeks after the initial contact.
Eligible lupus patients who are 18 years and older will proceed to the information sheet form and access the survey once they give consent via a check box in the survey form.
Survey participants will be able to enter a raffle to win one of three US $100 gift cards after they complete the survey.
We will use a standard coding approach for characterizing the Twitter messages and users. Two independent team members will use a range of text classifiers (the coding table is provided as a multimedia appendix) to identify a priori and emergent code categories in the Twitter posts. We will further characterize the users of the Twitter accounts who generated the posts (the user code categories are provided as a multimedia appendix) based on information available in a user’s Twitter profile (ie, username, description, profile image). Cohen kappa will be calculated for each code category to assess interrater reliability. An average Cohen kappa greater than 0.8 across all categories will be considered substantial for this research. The project principal investigators will help to build consensus for instances where coders disagree.
We will use descriptive statistics to analyze the data and identify the most prevalent topics in the Twitter content. Units of analysis will be unique terms in posts as well as the number of Twitter messages and users. For each analysis, we will present findings in a topic co-occurrence matrix in which the diagonal indicates the prevalence of each topic and the off-diagonal cells indicate topic overlap. The number of posts containing both of two topics is found at the intersection of the row and column for these topics. We will further describe patient characteristics such as age, gender, and race/ethnicity as well as the survey responses. We will use multiple regression to assess which variables (eg, demographics) are significantly associated with acceptance of using Twitter for health care engagement. Analyses will be performed in SPSS (v.24), using a significance level of .05 for statistical tests.
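For reference, Cohen kappa compares the observed agreement between two coders, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that calculation for one code category; the function name and the example labels are illustrative and not taken from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)
```

For example, two coders who agree on 9 of 10 binary labels, with marginals of 5/5 and 4/6, have p_o = 0.9 and p_e = 0.5, giving a kappa of 0.8.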
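The matrix described above, with topic prevalence on the diagonal and pairwise topic overlap off the diagonal, can be sketched as follows. The topic names in the usage note and the function name `topic_matrix` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Iterable, List, Set


def topic_matrix(post_topics: Iterable[Set[str]], topics: List[str]) -> List[List[int]]:
    """Count posts per topic (diagonal) and per topic pair (off-diagonal)."""
    index = {topic: i for i, topic in enumerate(topics)}
    matrix = [[0] * len(topics) for _ in topics]
    for assigned in post_topics:
        present = sorted(index[t] for t in assigned if t in index)
        for i in present:
            matrix[i][i] += 1  # post mentions topic i
        for i, j in combinations(present, 2):
            matrix[i][j] += 1  # post mentions both topics i and j
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix
```

With topics `["symptoms", "treatment", "support"]`, a post coded as `{"symptoms", "treatment"}` adds 1 to both diagonal cells and 1 to the two cells where those topics intersect.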
Sample Size Calculation
The sample size estimate (and survey protocol) is based on previous similar research that demonstrated the usefulness of user data to identify and engage cancer patients on Twitter.
Twitter data from users in Los Angeles County posted over the course of 12 months were used to identify 134 cancer patients who had discussed their cancer condition on Twitter. Nearly one-quarter (33/134, 24.63%) of them responded positively to the outreach on Twitter that was focused on clinical trial recruitment. As the prevalence of SLE is lower in the United States, with 20-150 reported cases per 100,000, compared to the cancer incidence, which is 439 per 100,000 men and women per year (based on 2011-2015 cases), we anticipate a lower number of people who discuss their lupus condition on Twitter. In this pilot study, we anticipate identifying around 100-300 Twitter accounts of lupus patients across the United States. We expect that at least 25% of these lupus patients, whom we will contact to participate in the survey study, will complete the survey.
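The arithmetic behind these expectations can be made explicit. The sketch below recomputes the response rate from the earlier cancer study and applies the assumed 25% completion rate to the anticipated range of 100-300 accounts, which corresponds to roughly 25-75 completed surveys as a lower bound. The function name is illustrative.

```python
import math


def expected_completions(n_accounts: int, completion_rate: float) -> int:
    """Expected number of completed surveys, rounded down."""
    return math.floor(n_accounts * completion_rate)


# Response rate observed in the earlier cancer study (33 of 134 patients).
prior_rate_percent = round(100 * 33 / 134, 2)

# Anticipated range of identified lupus accounts and assumed completion rate.
low_estimate = expected_completions(100, 0.25)
high_estimate = expected_completions(300, 0.25)
```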
This study presents minimal-risk research. We will use public data from the social network Twitter. We will de-identify any subject’s names or Twitter handles, and they will not appear in the analysis dataset. We have implemented a number of measures to ensure data security and confidentiality (see Data Collection and Confidentiality section). We will further abide by USC IRB regulations and the USC Privacy of Personal Information policy. In general, all data will be entered into a computer and database that are password protected. The data will be stored using appropriate, secure computer software and encrypted computers.
Dissemination of Study Findings
The study authors plan to publish the study findings in a peer-reviewed journal and at topic-related conferences (to be determined at a later date). All listed authors or contributors are compliant with guidelines outlined by the International Committee of Medical Journal Editors for author inclusion in a published work. Furthermore, to support research transparency and reproducibility, we will share the de-identified research data after publication of the study results. We will share the de-identified data on Figshare, a repository where users can make all of their research outputs available in a citable, shareable, and discoverable manner.
This study was approved by the IRB at USC (Protocol HS-19-00048; the approval notice is provided as a multimedia appendix). Data extraction and cleaning are complete. The detailed data extraction and cleaning flow diagram is provided as a multimedia appendix. We obtained 47,715 Twitter posts containing terms related to “lupus” from users in the United States published in English between September 1, 2017 and October 31, 2018. After removing duplicates, retweets, non-English tweets, and Twitter posts from commercial and bot-like accounts, 40,885 posts were included in the analysis. Data analysis was completed in Fall 2020.
The generalizability of the study is somewhat limited, and we recognize that the use of social media data could also lead to potential bias. Social media research and social media–based interventions favor those with internet access. Twitter users tend to be younger (38% are 18-29 years of age), college graduates (32%), and located in urban areas (26%). Nonetheless, it is worth mentioning that social media users have grown more representative of the broader population; for example, they include the Black population (24%) as well as Whites (21%) and Hispanics (25%). Additionally, Twitter messages from locations outside the United States and messages in non-English languages such as Spanish will not be included. It is also possible that fewer lupus patients discuss their health on Twitter than we anticipate. We addressed this issue by searching Twitter data from users across the United States. However, even if we identify lupus patients on Twitter, it is possible that a lower number of them will engage and take the survey. To incentivize survey completion, participants who complete the survey will be able to enter a raffle to win one of 10 US $100 gift cards.
Finally, we will take several steps to reduce the chance of fraudulent survey responses on Twitter, including sharing the survey link only via private messages on Twitter once a user has followed the project account on Twitter. In the case that the majority of users seems reluctant to follow the Twitter account, we will send reminder messages with the personalized link to the survey via a public reply message on Twitter to increase the survey response rate.
This pilot project will provide preliminary data and practical insight into the application of publicly available Twitter data to gain a better understanding of lupus patients who publicly discuss their condition on Twitter and their attitudes toward using the platform to engage them with their health care. The data will also help to determine whether Twitter might serve as a potential outreach platform for raising awareness of lupus and implementing related health interventions.
The development of the study protocol and the implementation of the study have been supported by the Southern California Clinical and Translational Science Institute (SC CTSI) through grant UL1TR000130 from the National Center for Advancing Translational Sciences (NCATS) of the NIH. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Conflicts of Interest
The authors have no conflicts of interest to report and will not be rewarded in any way, financially or otherwise, by Symplur.com. Members of the Symplur.com team are included in neither the data analysis nor the interpretation of the study findings.
Lupus-related keywords and hashtags used for the Twitter search. The selection is based on data from Symplur Signals. PDF File (Adobe PDF File), 80 KB
Survey. PDF File (Adobe PDF File), 86 KB
Twitter recruitment messages. PDF File (Adobe PDF File), 48 KB
Study information page. PNG File, 726 KB
Coding table used for identifying main themes in lupus-related Twitter posts. PDF File (Adobe PDF File), 55 KB
Code categories to classify Twitter users. PDF File (Adobe PDF File), 42 KB
IRB approval notice. PDF File (Adobe PDF File), 1140 KB
Data extraction and cleaning flow diagram. PDF File (Adobe PDF File), 54 KB
- What is lupus? Lupus Foundation of America. URL: https://resources.lupus.org/entry/what-is-lupus [accessed 2021-04-24]
- Rees F, Doherty M, Lanyon P, Davenport G, Riley RD, Zhang W, et al. Early Clinical Features in Systemic Lupus Erythematosus: Can They Be Used to Achieve Earlier Diagnosis? A Risk Prediction Model. Arthritis Care Res (Hoboken) 2017 Jun;69(6):833-841.
- Amsden LB, Davidson PT, Fevrier HB, Goldfien R, Herrinton LJ. Improving the quality of care and patient experience of care during the diagnosis of lupus: a qualitative study of primary care. Lupus 2018 Jun;27(7):1088-1099.
- Dizon DS, Graham D, Thompson MA, Johnson LJ, Johnston C, Fisch MJ, et al. Practical guidance: the use of social media in oncology practice. J Oncol Pract 2012 Sep;8(5):e114-e124.
- Obar JA, Wildman SS. Social media definition and the governance challenge: an introduction to the special issue. Telecommunications Policy 2015;39(9):745-750.
- Lober WB, Flowers JL. Consumer empowerment in health care amid the internet and social media. Semin Oncol Nurs 2011 Aug;27(3):169-182.
- Zeraatkar K, Ahmadi M. Trends of infodemiology studies: a scoping review. Health Info Libr J 2018 Jun;35(2):91-120.
- Sinnenberg L, Buttenheim AM, Padrez K, Mancheno C, Ungar L, Merchant RM. Twitter as a Tool for Health Research: A Systematic Review. Am J Public Health 2017 Jan;107(1):e1-e8.
- Carson KV, Ameer F, Sayehmiri K, Hnin K, van Agteren JE, Sayehmiri F, et al. Mass media interventions for preventing smoking in young people. Cochrane Database Syst Rev 2017 Jun 02;6:CD001006.
- Gold J, Pedrana AE, Stoove MA, Chang S, Howard S, Asselin J, et al. Developing health promotion interventions on social networking sites: recommendations from The FaceSpace Project. J Med Internet Res 2012 Feb 28;14(1):e30.
- Bender JL, Cyr AB, Arbuckle L, Ferris LE. Ethics and Privacy Implications of Using the Internet and Social Media to Recruit Participants for Health Research: A Privacy-by-Design Framework for Online Recruitment. J Med Internet Res 2017 Apr 06;19(4):e104.
- Social Media Fact Sheet. Pew Research Center: Internet & Technology. Pew Research; 2021 Apr 21. URL: https://www.pewresearch.org/internet/fact-sheet/social-media/ [accessed 2021-04-27]
- Twitter terms of service. Twitter. URL: https://twitter.com/en/tos [accessed 2019-07-09]
- Xu S, Markson C, Costello KL, Xing CY, Demissie K, Llanos AA. Leveraging Social Media to Promote Public Health Knowledge: Example of Cancer Awareness via Twitter. JMIR Public Health Surveill 2016;2(1):e17.
- Rosenkrantz AB, Labib A, Pysarenko K, Prabhu V. What Do Patients Tweet About Their Mammography Experience? Acad Radiol 2016 Nov;23(11):1367-1371.
- Pinho-Costa L, Yakubu K, Hoedebecke K, Laranjo L, Reichel CP, Colon-Gonzalez MDC, et al. Healthcare hashtag index development: Identifying global impact in social media. J Biomed Inform 2016 Oct;63:390-399.
- Pemmaraju N, Utengen A, Gupta V, Kiladjian J, Mesa R, Thompson MA. Social Media and Myeloproliferative Neoplasms (MPN): Analysis of Advanced Metrics From the First Year of a New Twitter Community: #MPNSM. Curr Hematol Malig Rep 2016 Dec;11(6):456-461.
- Chiang AL, Vartabedian B, Spiegel B. Harnessing the Hashtag: A Standard Approach to GI Dialogue on Social Media. Am J Gastroenterol 2016 Aug;111(8):1082-1084.
- Sedrak MS, Cohen RB, Merchant RM, Schapira MM. Cancer Communication in the Social Media Age. JAMA Oncol 2016 Jun 01;2(6):822-823.
- Utengen A. The Rise of Patient Communities on Twitter - Twitter Visualized. Symplur. 2012 Dec 10. URL: https://www.symplur.com/shorts/the-rise-of-patient-communities-on-twitter-visualized/ [accessed 2019-07-21]
- Reuter K, Danve A, Deodhar A. Harnessing the power of social media: how can it help in axial spondyloarthritis research? Curr Opin Rheumatol 2019 Jul;31(4):321-328.
- Eysenbach G. Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet. J Med Internet Res 2009 Mar 27;11(1):e11.
- Zhang D, Guo B, Li B, Yu Z. Extracting Social and Community Intelligence from Digital Footprints: An Emerging Research Area. In: Yu Z, Liscano R, Chen G, Zhang D, Zhou X, editors. Ubiquitous Intelligence and Computing. Berlin Heidelberg: Springer Publishing Company; 2010:4-18.
- Asch DA, Rader DJ, Merchant RM. Mining the social mediome. Trends Mol Med 2015 Sep;21(9):528-529.
- Wagner M, Lampos V, Cox IJ, Pebody R. The added value of online user-generated content in traditional methods for influenza surveillance. Sci Rep 2018 Sep 18;8(1):13963.
- Hswen Y, Gopaluni A, Brownstein JS, Hawkins JB. Using Twitter to Detect Psychological Characteristics of Self-Identified Persons With Autism Spectrum Disorder: A Feasibility Study. JMIR Mhealth Uhealth 2019 Feb 12;7(2):e12264.
- Hswen Y, Naslund JA, Brownstein JS, Hawkins JB. Monitoring Online Discussions About Suicide Among Twitter Users With Schizophrenia: Exploratory Study. JMIR Ment Health 2018 Dec 13;5(4):e11483.
- Malik A, Li Y, Karbasian H, Hamari J, Johri A. Live, Love, Juul: User and Content Analysis of Twitter Posts about Juul. Am J Health Behav 2019 Mar 01;43(2):326-336.
- Nielsen RC, Luengo-Oroz M, Mello MB, Paz J, Pantin C, Erkkola T. Social Media Monitoring of Discrimination and HIV Testing in Brazil, 2014-2015. AIDS Behav 2017 Jul;21(Suppl 1):114-120.
- Bychkov D, Young S. Social media as a tool to monitor adherence to HIV antiretroviral therapy. J Clin Transl Res 2018 Dec 17;3(Suppl 3):407-410.
- Tufts C, Polsky D, Volpp KG, Groeneveld PW, Ungar L, Merchant RM, et al. Characterizing Tweet Volume and Content About Common Health Conditions Across Pennsylvania: Retrospective Analysis. JMIR Public Health Surveill 2018 Dec 06;4(4):e10834.
- Hale TM, Pathipati AS, Zan S, Jethwani K. Representation of health conditions on Facebook: content analysis and evaluation of user engagement. J Med Internet Res 2014 Aug 04;16(8):e182.
- Greene A. Patient commentary: social media provides patients with support, information, and friendship. BMJ 2015 Feb 10;350:h256.
- Symplur Signals. Symplur. URL: https://www.symplur.com/signals/ [accessed 2019-07-09]
- Kim Y, Huang J, Emery S. Garbage in, Garbage Out: Data Collection, Quality Assessment and Reporting Standards for Social Media Data Use in Health Research, Infodemiology and Digital Disease Detection. J Med Internet Res 2016 Feb 26;18(2):e41.
- Lui M, Baldwin T. langid.py: An off-the-shelf language identification tool. 2012 Presented at: ACL 2012 System Demonstrations; July 2012; Jeju Island, Korea. URL: https://www.aclweb.org/anthology/P12-3005
- Profile Geo. Twitter Developer. URL: https://developer.twitter.com/en/docs/tweets/enrichments/overview/profile-geo.html [accessed 2019-11-28]
- Lienemann BA, Unger JB, Cruz TB, Chu K. Methods for Coding Tobacco-Related Twitter Data: A Systematic Review. J Med Internet Res 2017 Mar 31;19(3):e91.
- Schulz A, Hadjakos A, Heiko P, Johannes N, Mahlhauser M. A Multi-Indicator Approach for Geolocalization of Tweets. 2013 Presented at: 7th International AAAI Conference on Weblogs and Social Media; July 8–11, 2013; Cambridge, MA. URL: https://www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/view/6063/6397
- Allem J, Ferrara E. The Importance of Debiasing Social Media Data to Better Understand E-Cigarette-Related Attitudes and Behaviors. J Med Internet Res 2016 Aug 09;18(8):e219.
- Ferrara E, Varol O, Davis C, Menczer F, Flammini A. The rise of social bots. Commun ACM 2016 Jun 24;59(7):96-104.
- Davis CA, Varol O, Ferrara E, Flammini A, Menczer F. Botornot: A system to evaluate social bots. 2016 Presented at: 25th International Conference Companion on World Wide Web; April 2016; Montréal, Québec, Canada.
- Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009 Apr;42(2):377-381.
- Lupus patients on Twitter: What do they think about using Twitter to engage them with their health? Keck Medicine of USC. URL: https://clinicaltrials.keckmedicine.org/lupus-patients-on-twitter-what-do-they-think-about-using-twitter-to-engage-them-with-their-health?locale=en/ [accessed 2019-07-15]
- Cohen J. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement 1960;20(1):37-46.
- McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb) 2012;22(3):276-282.
- Reuter K, Angyan P, Le N, MacLennan A, Cole S, Bluthenthal RN, et al. Monitoring Twitter Conversations for Targeted Recruitment in Cancer Trials in Los Angeles County: Protocol for a Mixed-Methods Pilot Study. JMIR Res Protoc 2018 Sep 25;7(9):e177.
- Lawrence RC, Helmick CG, Arnett FC, Deyo RA, Felson DT, Giannini EH, et al. Estimates of the prevalence of arthritis and selected musculoskeletal disorders in the United States. Arthritis Rheum 1998 May;41(5):778-799.
- Cancer Statistics. National Cancer Institute. URL: https://www.cancer.gov/about-cancer/understanding/statistics [accessed 2019-11-27]
IRB: institutional review board
NCATS: National Center for Advancing Translational Sciences
NIH: National Institutes of Health
REDCap: Research Electronic Data Capture
SLE: systemic lupus erythematosus
USC: University of Southern California
Edited by G Eysenbach; submitted 04.08.19; peer-reviewed by O Leal Neto, D Mather; comments to author 31.08.19; revised version received 28.11.19; accepted 04.02.20; published 06.05.21
Copyright
©Alden Bunyan, Swamy Venuturupalli, Katja Reuter. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 06.05.2021.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.