JMIR Publications

JMIR Research Protocols



Published on 11.10.17 in Vol 6, No 10 (2017): October


    Original Paper

    Knowledge Management Framework for Emerging Infectious Diseases Preparedness and Response: Design and Development of Public Health Document Ontology

    1. Fu Foundation School of Engineering and Applied Science, Department of Chemical Engineering, Columbia University, New York, NY, United States

    2. Mailman School of Public Health, Department of Epidemiology, Columbia University, New York, NY, United States

    3. Center for the Management of Systemic Risk, Columbia University, New York, NY, United States

    Corresponding Author:

    Venkat Venkatasubramanian, PhD

    Fu Foundation School of Engineering and Applied Science

    Department of Chemical Engineering

    Columbia University

    500 W. 120th St., Mudd 801

    New York, NY,

    United States

    Phone: 1 212 854 2487



    Background: There are increasing concerns about our preparedness and timely coordinated response across the globe to cope with emerging infectious diseases (EIDs). This poses practical challenges that require exploiting novel knowledge management approaches effectively.

    Objective: This work aims to develop an ontology-driven knowledge management framework that addresses the existing challenges in sharing and reusing public health knowledge.

    Methods: We propose a systems engineering-inspired ontology-driven knowledge management approach. It decomposes public health knowledge into concepts and relations and organizes the elements of knowledge based on the teleological functions. Both knowledge and semantic rules are stored in an ontology and retrieved to answer queries regarding EID preparedness and response.

    Results: A hybrid concept extraction was implemented in this work. The quality of the ontology was evaluated using the formal evaluation method Ontology Quality Evaluation Framework.

    Conclusions: Our approach is a potentially effective methodology for managing public health knowledge. Accuracy and comprehensiveness of the ontology can be improved as more knowledge is stored. In the future, a survey will be conducted to collect queries from public health practitioners. The reasoning capacity of the ontology will be evaluated using the queries and hypothetical outbreaks. We suggest the importance of developing a knowledge sharing standard like the Gene Ontology for the public health domain.

    JMIR Res Protoc 2017;6(10):e196





    The 2014 Ebola epidemic in West Africa reminded the public health community again of the weaknesses in preparing for and responding to emerging infectious diseases (EIDs). The epidemic directly affected the health and economies of multiple countries in West Africa for 2 years and resulted in 11,299 deaths among 28,599 suspected infections [1]. The initial international response was regarded as slow and uncoordinated by many experts [2], an indication of the poor application of the lessons learned from prior global pandemics.

    Effective coordination and communication of information among different stakeholders are necessary components of a strong response to an EID outbreak [3]. Public health coordination and communication require not only sharing resources and specialties but also sharing, managing, and using knowledge effectively. This is a recognized challenge in practice [4-8]. Knowledge sharing and management is not a task for a single government. It needs the collaboration of multiple groups across several sectors. Such effort, however, is usually hindered by geographical, temporal, and political constraints. The lack of a strong public health infrastructure in many countries and the persistent problems in our global health governance structure could exacerbate the crisis and complicate the collaboration [4]. The spatial-temporal dynamics of outbreaks further complicate the real-time preparedness and response processes [9-11]. Moreover, the public health community still struggles with how to use the knowledge from prior pandemics to make prompt decisions under current conditions.

    Different approaches have been employed to address this challenge. Recent progress includes influenza information management [12], public health meta-knowledge analysis [13], and public health surveillance [14]. Semantic reasoning has been used to address the spatial-temporal difficulties of epidemic management [9]. However, advances in the knowledge management of public health have been limited. In this work, we demonstrate how to apply systems engineering concepts to develop a knowledge management framework facilitated by ontology and semantic reasoning.

    The public health system is a complex adaptive system [6]. We can tackle its complexity using a systems engineering–based approach [15]. Systems engineering, first proposed by Bell Telephone Laboratories in the 1940s [16], describes an interdisciplinary engineering methodology that focuses on how to design and manage complex systems. It emphasizes the joint effect of system components, their dynamical interactions, and the environment. Systems engineering promotes the development of risk management in various industries, including aerospace, defense, chemical, and nuclear. Venkatasubramanian [17] discusses the necessity of the systems engineering idea for risk management in a complex system. Leveson [18] develops a systems engineering–based modeling framework to assess risks of engineered systems. There are other similar efforts in different domains [19-21]. EID preparedness and response resemble risk management in many engineering disciplines. Recently, systems engineering concepts have gained considerable attention in the public health community. The National Academy of Engineering and the Institute of Medicine have advocated the widespread application of systems engineering tools [22]. Systems engineering methods such as Markov models are used to enhance public health preparedness [23].

    Building on these efforts, we propose a novel systems engineering–inspired, ontology-driven knowledge management approach. In this work, we demonstrate how to develop the ontology and semantic rules to manage knowledge and support decision making. This ontology could also serve as a part of other applications, such as a public health training or practice tool. Its flexibility enables integration with other ontologies.


    Overall Architecture

    Public health knowledge management aims to systematically manage tasks and support decision making, and it views implicit and explicit knowledge as key strategic resources [24]. Knowledge management requires the storage, retrieval, and utilization of public health knowledge. We propose the ontology-driven knowledge management approach shown in Figure 1, which decomposes public health documents into elements of knowledge and stores them in an ontology, namely, the Public Health Document Ontology (OntoPH). OntoPH was developed using ontology competency questions as guidance. Grüninger and Fox [25] state that an ontology should answer competency questions that are posed based on the motivation of the ontology. Competency questions define the terminology and specify the definitions and constraints of the terminology. Knowledge is modeled using the terminology and retrieved via semantic rules. An inference engine accesses knowledge models and assembles and manipulates elements of knowledge in the ontology to draw conclusions about EID preparedness and response.

    Public health knowledge is mainly preserved in public health documents, which include guidelines, procedures, and academic publications. They are the most important media for sharing, storing, and managing knowledge because they are vetted, of high quality, generated by an authoritative content source, verifiable by a trusted source, and regularly updated [5]. To support decision making, OntoPH’s corpus should meet at least 2 requirements: breadth and depth. Breadth means the corpus should cover many, if not all, fields that are involved in public health decision making. Depth means the corpus should contain not only global-level guidelines but also local-level procedures.

    Figure 1. Systems engineering inspired ontology-driven knowledge management approach.

    Function-Based Knowledge Representation

    The first task is to represent the knowledge preserved in public health documents. Effective knowledge storage and retrieval require a knowledge representation that addresses both hierarchical complexity and semantic heterogeneity. The hierarchical complexity of public health knowledge is rooted in the multiple layers of public health activities. Public health practitioners need different chunks of knowledge in various contexts to prepare for and respond to EIDs. Health workers in the clinic, for example, need knowledge about disease diagnosis, whereas the department of health wants to know how to manage and coordinate. Knowledge always serves a purpose. The health workers’ knowledge leads to accurate diagnoses. The department of health’s knowledge achieves effective emergency response. Multiple layers of public health activities are linked via their purposes. For example, to better respond to emergencies, departments of health require the health workers to diagnose diseases more effectively.

    Semantic heterogeneity, on the other hand, is the result of the cross-reference of public health knowledge, which is a mixture of various fields such as medical science, epidemiology, biology, and engineering [8]. For instance, the knowledge of physician training lies in the intersection of medical science (ie, what skills to train) and management science (ie, how to train). Nonetheless, the 2 aspects share the same purpose (ie, training physicians for better EID preparedness). A recent study by Venkatasubramanian and Zhang [26] finds that complex system activities usually have 4 common purposes: communication, control, implementation, and monitoring. Training, as part of education, is an important type of implementation activity.

    One can resolve both hierarchical complexity and semantic heterogeneity by identifying the purpose of knowledge, for a piece of knowledge could serve different purposes under different conditions. Venkatasubramanian and Zhang [26] identify the importance of means-end relation in complex system risk management and propose a systems engineering framework to explicate the relation. Adopting this idea, our approach models elements of knowledge based on their means-end relations. We use teleological functions to represent the purposes of knowledge elements. Unlike mathematical functions that map a set of inputs onto a set of permissible outputs, teleological functions emphasize the means to realize a goal by indicating the common purpose between 2 connected entities. The 4 common purposes induce 4 types of teleological functions. A function-based knowledge representation has been used in many fields including engineering [27-30] and data science [31].

    To develop such a function-based knowledge representation, we first classify public health documents into 2 categories: general documents that contain general public health principles and specific documents that store evidence-based procedures. There is a gap between the 2 types of documents: general documents are usually too general to implement, whereas specific documents are mostly event specific, which limits their usefulness for new events. We organize the knowledge of general documents as a teleological function of that of specific documents: $\text{knowledge}_{\text{general doc}} = f(\text{knowledge}_{\text{specific doc 1}}, \text{knowledge}_{\text{specific doc 2}}, \ldots)$, where $f$ is a teleological function. Specific activities expand a general guideline with specific recommendations. For example, since the 2009 influenza A H1N1 pandemic, many specific documents have discussed vaccination preparedness and distribution [32,33]. The World Health Organization (WHO) has also issued general guidelines for vaccination preparation during the pandemic [34]. The function vaccination describes activities related to vaccination preparedness and distribution. Therefore, the equation can be rewritten as $\text{knowledge}_{[34]} = \text{vaccination}(\text{knowledge}_{[32]}, \text{knowledge}_{[33]})$, meaning that the WHO guidelines about vaccination can be expanded with specific activities, which bridges the gap.

    The function-based knowledge representation is depicted as a tree structure in Figure 2. The root of the tree is a public health document, and the leaves are the event-based procedures. A general document (eg, g1) contains general knowledge expressions (eg, ge1.1 and ge1.2). A general knowledge expression specifies a teleological function. For instance, the WHO guideline [34] points out the roles of the health and nonhealth sectors in vaccination sharing and distribution activities. We can label this knowledge expression with a function vaccination (eg, f2). Specific guidelines (eg, s2) elaborate the teleological functions and define many specific knowledge expressions (eg, se1.2). Specific knowledge expressions can further indicate subfunctions (eg, sf1.2), which include detailed procedures and instructions. Unlike specific procedures, teleological functions are event independent. The same functions can apply to different events with similar fundamental lessons. The tree structure demonstrates how general documents and specific documents are linked via teleological functions. The function-based knowledge representation handles the hierarchical complexity through the tree structure of documents and manages the semantic heterogeneity by grouping distinct activities under the same function. Teleological functions define the scope and intention of the specific documents. They let a specific document elaborate a general document by adding actionable items.
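
    The tree that links general and specific documents through teleological functions can be sketched in a few lines of Python. This is only an illustrative sketch, not the OntoPH implementation: the class names, the `expand` helper, and the procedure strings are all assumptions introduced here, and only the reference numbers [32-34] and the function name vaccination come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class SpecificDocument:
    """An event-specific document holding actionable procedures."""
    doc_id: str
    procedures: list[str] = field(default_factory=list)


@dataclass
class GeneralExpression:
    """A general knowledge expression labeled with one teleological function."""
    text: str
    function: str
    elaborated_by: list[SpecificDocument] = field(default_factory=list)


@dataclass
class GeneralDocument:
    """A general document, the root of the tree."""
    doc_id: str
    expressions: list[GeneralExpression] = field(default_factory=list)


def expand(general: GeneralDocument, function: str) -> list[str]:
    """Collect the specific procedures that elaborate `function` in a general document."""
    return [
        procedure
        for expression in general.expressions
        if expression.function == function
        for specific in expression.elaborated_by
        for procedure in specific.procedures
    ]


# Placeholder content: the procedure strings are invented for illustration only.
doc32 = SpecificDocument("ref32", ["procedure A from ref 32"])
doc33 = SpecificDocument("ref33", ["procedure B from ref 33"])
who_guideline = GeneralDocument(
    "ref34",
    [GeneralExpression("roles of health and nonhealth sectors", "vaccination", [doc32, doc33])],
)

print(expand(who_guideline, "vaccination"))
```

    Grouping procedures under a function name, rather than under an event, mirrors the point that teleological functions are event independent: a new specific document only needs to be attached to an existing function to become retrievable.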

    Figure 2. The tree structure of function-based knowledge.

    Ontology Development


    An ontology is a formal description of entities and their properties, relationships, and constraints [25]. Ontologies are widely used in information systems and knowledge management. An ontology consists of classes, individuals, and properties. Classes are collections of concepts in the domain of discourse. Individuals are instances of each class. Properties are relations between classes, value restrictions, or instance descriptions in the domain of discourse. An ontology models knowledge by axiomatizing concepts as well as the relationships between them [35]. Knowledge is defined and organized in layers (see Multimedia Appendix 1). Terms with similar meanings are classified as synonyms. A list of synonyms is defined as a concept. Concepts form a hierarchy and are connected by relations. Concepts and relations constitute general axioms that represent the knowledge of the domain of discourse. Figure 3 shows the ontology development process, which consists of 3 steps: (1) concept extraction: extracting knowledge from the corpus; (2) ontology assembly: decomposing knowledge into terms, relations, constraints, and descriptions and integrating these components to form an ontology; and (3) reasoning: creating semantic rules to enable knowledge retrieval.

    Figure 3. Ontology development process.

    Concept Extraction

    Two concept extraction methods are available: manual annotation and natural language processing (NLP) annotation. Manual annotation requires domain experts to review and annotate every term in the corpus per predefined criteria. It provides high accuracy but requires tremendous human effort. NLP annotation, on the other hand, automatically recognizes and classifies terms into predefined categories [36]. It is much more efficient than manual annotation but comes at the cost of accuracy. Usually, NLP-based information retrieval performs clustering or classification to identify key concepts. Its performance is usually measured by precision and recall [37].
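
    As a brief illustration of the two measures, the sketch below compares a set of automatically extracted terms against an expert-annotated reference set. The function name and both term sets are hypothetical and are not taken from the OntoPH corpus.

```python
def precision_recall(extracted: set[str], reference: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of extracted terms against a reference annotation."""
    true_positives = len(extracted & reference)
    # Precision: share of extracted terms that the experts also annotated.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of expert-annotated terms that the extraction found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Invented example sets, for illustration only.
extracted_terms = {"vaccination", "quarantine", "airline", "report"}
expert_terms = {"vaccination", "quarantine", "surveillance"}

precision, recall = precision_recall(extracted_terms, expert_terms)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```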

    Ontology Assembly

    OntoPH includes 199 classes, 78 properties, and 1234 axioms (see Multimedia Appendices 2-8). We developed the general structure of OntoPH based on the Legal Knowledge Interchange Format (LKIF) core ontology. The LKIF core ontology was developed by the consortium of the European project for Standardized Transparent Representations in order to Extend Legal Accessibility (ESTRELLA) to meet a continuing need for a standard vocabulary of basic legal terms [38]. We expanded this legal term vocabulary to include public health vocabulary.

    OntoPH has a modular structure. Modularization improves the reusability, scalability, and maintainability of an ontology [39,40]. OntoPH has 7 modules: space-time, agent, action, role, process, document, and event. Inheriting all modules, the OntoPH core ontology has 9 main classes (Textbox 1). The space class defines spatial concepts such as region and nation. The time class describes temporal concepts such as time point or period. The resource class specifies resources used for public health preparation and response. The action class defines potential actions for an EID event. Actions are categorized according to the 4 basic teleological functions: communication, control, implementation, and monitoring [26]. Subclasses of the action class represent specific functions under the 4 basic functions. The process class describes both continuous and discrete event flows. The agent class lists all the intelligent and nonintelligent agents involved in a process or an action. The description class describes the state and role of any agent, action, or process. The medium class summarizes different types of public health documents, such as legal or nonbinding documents. Last, the expression class represents the knowledge expressions of the documents.

    Textbox 1. Ontology main classes and subclasses.

    OntoPH properties (see Multimedia Appendices 6 and 7) define the relationships between classes and subclasses. For instance, participate (Figure 4) has a domain of role and a range of action, indicating that a role participates in some actions. This property has an inverse of participate_by. OntoPH contains individuals extracted from public health documents. For example, legal_role, a subclass of role, has individuals of emergency committee and public health authority (Figure 5).
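
    The domain, range, and inverse of a property such as participate can be mimicked with a small Python sketch. It simplifies OWL semantics in two ways that should be kept in mind: an OWL reasoner uses domain and range axioms to infer class membership rather than to reject assertions, and the subclass legal_role is flattened into role here. The `ObjectProperty` class, the `assert_property` helper, and the individual `report_case` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectProperty:
    name: str
    domain: str   # class of the subject
    range: str    # class of the object
    inverse: str  # name of the inverse property


# Class membership of individuals. "emergency_committee" comes from the text;
# "report_case" is a hypothetical action individual added for illustration.
individual_class = {"emergency_committee": "role", "report_case": "action"}

participate = ObjectProperty(
    "participate", domain="role", range="action", inverse="participate_by"
)


def assert_property(prop: ObjectProperty, subject: str, obj: str) -> list[tuple[str, str, str]]:
    """Check domain and range, then return the asserted triple and its inverse."""
    if individual_class.get(subject) != prop.domain:
        raise ValueError(f"{subject} is not in the domain {prop.domain}")
    if individual_class.get(obj) != prop.range:
        raise ValueError(f"{obj} is not in the range {prop.range}")
    return [(subject, prop.name, obj), (obj, prop.inverse, subject)]


triples = assert_property(participate, "emergency_committee", "report_case")
for triple in triples:
    print(triple)
```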

    Figure 4. Protégé screenshot for property participate.
    Figure 5. Protégé screenshot for individuals of Legal_role.

    Semantic Rules and Reasoning

    OntoPH is developed using the Web Ontology Language (OWL) in the Protégé environment [41]. Logic-based semantic rules allow OWL to “exploit the considerable existing body of logical reasoning to fulfill important logical requirements” [42]. The rules entail answers to the competency questions. OntoPH answers 3 types of questions: (1) the relation between actions and roles, (2) the relation between roles and the condition of interest, and (3) the relation between actions and the condition of interest. OntoPH uses the time, space, resource, and process classes to describe the conditions of an EID outbreak. Hence, we can construct the following informal competency questions:

    1. What action must a role perform?
    2. What are the roles specified by an action?
    3. What are the actions required under a condition of interest?
    4. What are the roles specified under a condition of interest?

    Informal competency questions should be translated to a formal format so that an ontology can retrieve the elements of knowledge to answer them [25]. We denote $T_{\text{ontology}}$ as a set of axioms in the ontology, $G_{\text{ground}}$ as a set of ground instances, and $Q$ as a first-order sentence using only predicates in the language of $T_{\text{ontology}}$. We can formulate the formal translations for the 4 informal competency questions.

    1. Let $Q(\text{action})$ denote a sentence that describes some actions. Given a ground formula $G_{\text{role}}$ defining instances of a role, determine the possible actions, as shown in Figure 6.
    2. Let $Q(\text{role})$ denote a sentence that describes some roles. Given a ground formula $G_{\text{action}}$ defining instances of an action, determine the possible roles, as shown in Figure 7.
    3. Let $Q(\text{action})$ denote a sentence that describes some actions. Given a ground formula $G_{\text{condition}}$ defining instances of a condition, determine the possible actions, as shown in Figure 8.
    4. Let $Q(\text{role})$ denote a sentence that describes some roles. Given a ground formula $G_{\text{condition}}$ defining instances of a condition, determine the possible roles, as shown in Figure 9.

    Semantic rules link axioms T with instances G and entail a first-order sentence Q, which is the answer to the competency question.
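
    Read this way, each competency question is an entailment problem in the style of Grüninger and Fox [25]. The block below is one plausible way to write the 4 questions under that reading. It is a sketch based on the definitions above and is not necessarily the exact notation used in Figures 6-9, which are not reproduced here.

```latex
\begin{align}
T_{\text{ontology}} \cup G_{\text{role}} &\models Q(\text{action}) \\
T_{\text{ontology}} \cup G_{\text{action}} &\models Q(\text{role}) \\
T_{\text{ontology}} \cup G_{\text{condition}} &\models Q(\text{action}) \\
T_{\text{ontology}} \cup G_{\text{condition}} &\models Q(\text{role})
\end{align}
```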

    Semantic rules are created using the Semantic Web Rule Language (SWRL), a rule language for the semantic web. SWRL rules use unary predicates for describing classes and data types, binary predicates for properties, and some special built-in n-ary predicates [43]. An example SWRL rule is shown in Textbox 2.

    Textbox 2. A simple example of semantic web rule language rule.

    This rule describes the assertion that someone is a child of married parents. Letters with a question mark (eg, ?x) denote variables. Person(?x) indicates that a variable x is a person. The binary relation hasParent(?x, ?y) indicates that person x has a parent y. The formal formula is shown in Figure 10, which reads: for persons x, y, and z, if x has parent y, x has parent z, and y and z are spouses, then x is a child of married parents. SWRL rules thus translate natural language assertions into computable forms.
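
    To make the reading of this rule concrete, the sketch below applies it to a handful of facts in plain Python. The rule text in the comment is reconstructed from the description above, and the predicate names hasSpouse and ChildOfMarriedParents are assumptions because Textbox 2 is not reproduced here. The individuals are invented, and a real deployment would rely on an OWL/SWRL reasoner rather than hand-written matching.

```python
from itertools import product

# Assumed SWRL form of the rule described in the text:
#   Person(?x) ^ hasParent(?x, ?y) ^ hasParent(?x, ?z) ^ hasSpouse(?y, ?z)
#       -> ChildOfMarriedParents(?x)

# Facts as (predicate, arguments) tuples; the individuals are invented.
facts = {
    ("Person", ("ann",)),
    ("hasParent", ("ann", "bob")),
    ("hasParent", ("ann", "cat")),
    ("hasSpouse", ("bob", "cat")),
    ("Person", ("dan",)),
    ("hasParent", ("dan", "bob")),
}


def child_of_married_parents(facts):
    """Return the ChildOfMarriedParents facts entailed by the rule above."""
    persons = {args[0] for pred, args in facts if pred == "Person"}
    parent_of = {args for pred, args in facts if pred == "hasParent"}
    spouses = {args for pred, args in facts if pred == "hasSpouse"}
    inferred = set()
    # Try every pair of hasParent facts and keep those that satisfy the rule body.
    for (x, y), (x2, z) in product(parent_of, repeat=2):
        if x == x2 and x in persons and (y, z) in spouses:
            inferred.add(("ChildOfMarriedParents", (x,)))
    return inferred


inferred = child_of_married_parents(facts)
print(inferred)
```

    Only ann satisfies the rule body in this toy fact set, because dan has a single recorded parent.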

    We create SWRL rules in 3 steps. First, public health experts review documents and identify knowledge expressions. For example, the WHO Technical Advice for Case Management of Influenza A (H1N1) in Air Transport (WHO Advice Air Transport) [44] is a WHO-issued guideline for air transportation case management. It specifies the procedures that the pilot in command should follow when a suspicious case is identified. We identify a knowledge expression pilot_in_command_action under the expression class. Second, public health experts create logic expressions for knowledge expressions. This intermediate step translates a procedure into a formal representation. For example, the pilot_in_command_action can be written as logic expressions, as shown in Figure 11.

    Logic expressions and natural language are interchangeable. The first expression in Figure 11 shows that WHO Advice Air Transport contains specifications about pilot actions. The pilot in command should report any suspicious activities on the flight. The second expression in Figure 11 shows that WHO Advice Air Transport requires communication between agencies. The public health authority should communicate with other agencies. Third, public health experts work with ontology engineers to develop the SWRL rules based on the logic expressions from step 2. Textbox 3 shows the SWRL rule created for the same example. The rule first states the knowledge expression and its parent document. Then, it specifies the roles (Pilot and PH_authority) and the expected actions.

    Textbox 3. Semantic web rule language rule for the pilot_in_command_action example.

    Logical inference connects documents with knowledge expressions. An inference process is depicted in Figure 12. WHO Advice Air Transport carries many knowledge expressions. One of them specifies the actions of the pilot in command for an EID emergency during a flight. This piece of knowledge then implies that pilots and public health authorities should report suspicious cases and communicate with each other promptly.
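
    The chain from document to knowledge expression to role and action can be imitated with two lookup tables. The sketch below is a loose illustration, not the rule in Textbox 3: the action names `report_suspicious_case` and `communicate_with_agencies`, the dictionaries, and the `infer` helper are all hypothetical, while the document, expression, and role names follow the text.

```python
# Hypothetical stand-ins for the ontology content; the actual SWRL rule in
# Textbox 3 is not reproduced here.
document_expressions = {
    "WHO_Advice_Air_Transport": ["pilot_in_command_action"],
}

# Each knowledge expression maps to the (role, action) pairs it implies.
expression_rules = {
    "pilot_in_command_action": [
        ("Pilot", "report_suspicious_case"),
        ("PH_authority", "communicate_with_agencies"),
    ],
}


def infer(document: str) -> list[str]:
    """Chain document -> knowledge expression -> (role, action) sentences."""
    sentences = []
    for expression in document_expressions.get(document, []):
        for role, action in expression_rules.get(expression, []):
            sentences.append(f"{role} performs {action}")
    return sentences


results = infer("WHO_Advice_Air_Transport")
for sentence in results:
    print(sentence)
```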

    Reasoning results are presented per individual. Figure 13 shows the reasoning results of Mayor’s Office of Emergency Management under the department class. Given an individual, we obtain a list of sentences, such as “Mayor’s Office of Emergency Management performs delivery strategy.” These sentences in fact are the elements of knowledge.

    Figure 6. The formal expression of competency question 1.
    Figure 7. The formal expression of competency question 2.
    Figure 8. The formal expression of competency question 3.
    Figure 9. The formal expression of competency question 4.
    Figure 10. The formal expression of someone is a child of married parents.
    Figure 11. The formal expression of pilot in command action.
    Figure 12. An inference process.
    Figure 13. Reasoning results for Office_of_Emergency_Management.


    Concept Extraction Results

    The corpus, with 135,946 words in total, consists of the US Code [45], federal level regulations [32,33,46,47], international health regulations [34,44,48,49], and pandemic evaluations of outbreak responses [50,51]. Together they cover all the aforementioned types of public health documents. The US Code is the generic legal document, which ensures that the ontology aligns with laws. The federal regulations and the international health regulations are guidelines regarding surveillance, transportation, and preparedness. The evaluations are chosen per disease. H1N1 influenza and West Nile virus are the 2 diseases chosen for illustration. These 2 cases were selected because they are well-studied recent emerging diseases with an impact on health resources both locally and globally. In addition, their impacts on health and their geographical coverage are both significant. We wanted to evaluate case examples where the primary infection risk is associated with different transmission routes in order to evaluate the potential for having a unified framework for EIDs.

    We implemented a hybrid concept extraction approach. NLP methods are used to preprocess the corpus. By removing stop words and tagging the parts of speech, one can extract the meaningful and most frequent terms and relations using text mining tools such as KH Coder [52]. The classification work is done manually, with 2 domain experts reviewing every term and relation and deciding their descriptions and constraints. OntoPH is built upon these terms and relations. Domain experts and ontology engineers work collaboratively to select and annotate documents. Such a team-based method has been used extensively in many scientific studies and applications, such as hazard and operability analysis in chemical engineering [53]. The team should be as small as possible while maintaining sufficient expertise. In a series of meetings, team members work together to select documents. Conflicts must be resolved before the list of documents is finalized. Each domain expert annotates a part of the corpus and reviews the others’ annotations. This practice keeps the corpus and annotation as objective as possible.
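
    The preprocessing step can be approximated with the Python standard library, as in the sketch below. It covers only stop-word removal and frequency counting. Part-of-speech tagging, which the study performed with a text mining tool, is omitted, and the stop-word list, the `frequent_terms` helper, and the sample sentences are invented for illustration.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "should", "any", "on", "is", "with"}


def frequent_terms(text: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Lowercase, tokenize, drop stop words, and return the most frequent terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [token for token in tokens if token not in STOP_WORDS]
    return Counter(kept).most_common(top_n)


# Made-up sample text, not a quotation from the corpus.
sample = (
    "The pilot in command should report any suspected case. "
    "The public health authority should communicate with the pilot."
)

terms = frequent_terms(sample, top_n=1)
print(terms)
```

    The resulting candidate list would then go to the domain experts for manual classification, which is the manual half of the hybrid approach.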

    Ontology Evaluation

    The quality of an ontology is critical. It affects not only the quality of reasoning results but also the effectiveness of the application. An ontology can be evaluated on many aspects, namely, vocabulary, syntax, structure, semantics, representation, and context [24]. Extensive research has been conducted to formally evaluate the quality of ontologies [24,54-57]. Among these methods, we follow the Ontology Quality Evaluation Framework (OQuaRE) approach [55], which adapts the International Organization for Standardization standards for Software Quality Requirements and Evaluation. OQuaRE assesses 6 characteristics and 39 subcharacteristics of an ontology using quality metrics. Quality metrics are composed of primitive and derived measurements. Primitive measurements are metrics that can be measured directly on the ontology, such as the number of classes and the number of relations. Derived measurements are combinations of primitive ones [55]. OQuaRE rates every aspect of an ontology on a scale of 1 to 5 (1=not acceptable and 5=exceeds the requirement). The final score is the arithmetic average of the individual scores of all characteristics. The details of this method can be found in Duque-Ramos et al [55]. We include 30 of the 39 subcharacteristics in our evaluation. The other 9 subcharacteristics, which require subjective expert assessment, are excluded. The evaluation results of the OntoPH core ontology are presented in Table 1. The evaluation indicates that the OntoPH core ontology is satisfactory, with an average score of 4. Problems were found with redundancy and controlled vocabulary, mainly due to the relatively small corpus size.
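
    The final aggregation step is a plain arithmetic mean over characteristic scores, which the sketch below illustrates. The scores and the generic characteristic names are placeholders invented for this example; they are not the values reported in Table 1, and the `overall_score` helper is hypothetical.

```python
from statistics import mean

# Placeholder scores on the 1-5 scale, invented for illustration; these are
# not the values in Table 1 and the names are generic stand-ins.
characteristic_scores = {
    "characteristic_1": 5,
    "characteristic_2": 4,
    "characteristic_3": 3,
}


def overall_score(scores: dict[str, float]) -> float:
    """Validate each score against the 1-5 scale and return the arithmetic mean."""
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} score {value} is outside the 1-5 scale")
    return mean(list(scores.values()))


score = overall_score(characteristic_scores)
print(score)
```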

    Table 1. Ontology evaluation results.


    Principal Findings

    The possibility of using ontology and semantic reasoning in public health decision making has been recognized in literature [58]. In this work, we adapt this idea and our previous experience in knowledge management in the pharmaceutical industry [59] to derive a detailed methodology on how to develop such a tool. We introduce the systems engineering–inspired ontology-driven framework for public health knowledge management. We demonstrate how complex and heterogeneous public health knowledge can be modeled and stored in an ontology. Previous work has focused on local activities, such as activities within a health care network [60]. OntoPH extends the scope from local level to global/national level by focusing on general documents.

    OntoPH’s strength is threefold. First, it stores the knowledge in public health documents as classes, relations, and instances. Public health documents, including guidelines, procedures, and academic publications, are important sources of knowledge. Even though medical records, geographic information system data, and disease information have been studied and stored in ontologies [60,61], to our knowledge, there is no ontology for public health documents. OntoPH provides this missing piece of public health knowledge management. Second, we present a flexible knowledge management framework. OntoPH implements a modular structure, which ensures its extensibility. For example, the space-time module can be extended using time ontologies [60,62] and World Wide Web Consortium spatial ontologies [63]. It is also possible to add new modules. If disease information is needed, we can create a new disease module, which inherits the disease ontology [61]. This modular structure makes OntoPH a potential generic public health knowledge center. Third, OntoPH can manage the hierarchical complexity and heterogeneity of public health knowledge. Elements of knowledge are effectively organized by the teleological functions that highlight the means-end relations.

    This framework could be particularly useful in low- and middle-income countries (LMICs). A lack of resources and public health experts in LMICs usually makes a knowledge management system difficult to implement. Nonetheless, OntoPH’s general knowledge is widely applicable. By expanding the data sources to include LMIC-specific knowledge [64] and connecting with other ontologies [61-63,65], OntoPH could become a useful tool to help LMICs respond to an outbreak quickly, at both the national and local levels.

    Potentially, OntoPH can support decision making by answering user queries. For example, given an outbreak scenario, a user could list questions regarding disease identification, transmission prevention, disease control, and risk mitigation. With enough prestored knowledge, OntoPH could answer the list of questions by producing logical assertions with respect to each question.
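    As a rough illustration of how prestored knowledge plus semantic rules could yield assertions that answer a question, the sketch below runs a small forward-chaining loop over triples. It is a simplified stand-in for SWRL reasoning, not the reasoner used in this work; the facts, the rule, and all names (`recommends`, `servesGoal`, `supportsGoal`, and the document identifiers) are assumptions made for the example.

```python
def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings, or None on mismatch."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # variable: bind it, or check the existing binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                   # constant: must be identical
            return None
    return new


def apply_rule(body, head, facts):
    """Instantiate the rule head for every way the body patterns match the facts."""
    solutions = [{}]
    for pattern in body:
        extended = []
        for bindings in solutions:
            for fact in facts:
                result = match(pattern, fact, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return {tuple(b.get(term, term) for term in head) for b in solutions}


def forward_chain(facts, rules):
    """Apply all rules repeatedly until no new assertion is produced."""
    known = set(facts)
    while True:
        derived = set()
        for body, head in rules:
            derived |= apply_rule(body, head, known)
        if derived <= known:
            return known
        known |= derived


# Hypothetical prestored knowledge.
facts = {
    ("who_air_transport_advice", "recommends", "exit_screening"),
    ("exit_screening", "servesGoal", "transmission_prevention"),
    ("ihr_2005", "recommends", "event_notification"),
    ("event_notification", "servesGoal", "disease_identification"),
}

# Hypothetical rule: a document supports a goal if it recommends a measure serving that goal.
rules = [
    ([("?doc", "recommends", "?measure"), ("?measure", "servesGoal", "?goal")],
     ("?doc", "supportsGoal", "?goal")),
]

kb = forward_chain(facts, rules)

# Query: which documents support transmission prevention?
answers = sorted(s for s, p, o in kb
                 if p == "supportsGoal" and o == "transmission_prevention")
```

    Here the user question becomes a filter over the inferred assertions, so each answer traces back to stored facts and a stated rule.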


    At this stage, however, there are some limitations. First, the training document corpus is relatively small. Only 5 general documents and 7 specific documents are prestored because annotation is done manually. It will require a concerted effort to annotate and develop a more extensive public health knowledge base for widespread application. Nonetheless, the current corpus is comprehensive enough for proof of concept. Second, the selection of documents is subjective. When the corpus is small, the accuracy of reasoning results depends on the document selection rather than on the knowledge base as a whole. Increasing the size of the corpus and stating queries precisely will improve reasoning accuracy in general. In addition, rule-based reasoning has intrinsic limitations: semantic rules are subjective, and SWRL rules rarely allow ternary relations, which limits the power of the SWRL representation. Third, the current framework is restricted to public health documents and lacks information from other data sources, such as geographic information system data, news articles, and social media feeds. This limits OntoPH’s real-time usage. Moreover, the current knowledge representation cannot capture knowledge in research articles that do not fit the knowledge model. However, the basic and domain ontologies, such as the space-time, resource, role, and agent modules, contain fundamental public health knowledge, which makes the knowledge framework extendable to cover research articles. This, of course, requires further study of new knowledge representations. Potentially, a research article knowledge expression module could be developed and incorporated into OntoPH.
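    The ternary-relation limitation has a common workaround in ontology modeling: a statement with three participants is reified as an intermediate node linked to each participant by a binary relation. The sketch below illustrates that pattern on plain triples. It is not taken from OntoPH, and the relation and role names (`Recommendation`, `hasAgent`, `hasAction`, `hasTarget`) are hypothetical.

```python
from itertools import count

_ids = count(1)  # source of fresh identifiers for reified statement nodes


def reify(relation, roles):
    """Turn one n-ary statement into binary triples around a fresh node."""
    node = f"{relation}_{next(_ids)}"
    triples = {(node, "instanceOf", relation)}
    for role, value in roles.items():
        triples.add((node, role, value))
    return node, triples


def lookup(triples, relation, **known):
    """Find statement nodes of the given relation whose roles match the known values."""
    nodes = {s for s, p, o in triples if p == "instanceOf" and o == relation}
    return sorted(
        n for n in nodes
        if all((n, role, value) in triples for role, value in known.items())
    )


# "WHO recommends isolation for suspected cases" has three participants,
# so it is stored as one Recommendation node with three binary links.
node, triples = reify(
    "Recommendation",
    {"hasAgent": "WHO", "hasAction": "isolation", "hasTarget": "suspected_case"},
)

matches = lookup(triples, "Recommendation", hasAgent="WHO", hasTarget="suspected_case")
```

    The cost of this pattern is that rules must mention the intermediate node explicitly, which makes rule bodies longer and is one reason ternary knowledge is harder to express in a rule language built around unary and binary predicates.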

    Future Work

    Future work will address these limitations and evaluate OntoPH’s reasoning capacity. Adopting artificial intelligence techniques would significantly reduce the human effort and thus remove many of the limitations. Specifically, a term extraction module implementing NLP techniques such as topic modeling would enable automated concept classification of public health documents, reducing the amount of work required for annotation. Enriching the data sources will improve OntoPH’s capacity for real-time response. We plan to expand the corpus by incorporating expert opinions. A survey eliciting expert feedback on what to include in the corpus will be conducted. A systematic literature review on the effectiveness of policies and interventions could also help us determine which documents to include.
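    To give a sense of what an automated term extraction step might look like, the sketch below ranks repeated two-word phrases in a text as candidate concepts. It uses simple frequency counting, which is a much cruder technique than the topic modeling mentioned above and is shown only as a stand-in; the stopword list, the function name `candidate_terms`, and the sample text are all assumptions.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list for the example.
STOPWORDS = {"the", "of", "and", "to", "a", "in", "for", "is", "are",
             "be", "should", "with", "by", "on"}


def candidate_terms(text, top_n=5):
    """Return two-word phrases that occur more than once, most frequent first.

    Stopwords are removed before pairing, so a phrase may bridge a dropped word.
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    bigrams = Counter(zip(tokens, tokens[1:]))
    return [" ".join(pair) for pair, n in bigrams.most_common(top_n) if n > 1]


sample = (
    "Health authorities should report suspected cases. "
    "Suspected cases are isolated by health authorities. "
    "Isolation of suspected cases reduces transmission."
)

terms = candidate_terms(sample)
```

    Candidates produced this way would still need review by an annotator before being added to the ontology as classes or instances.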

    To evaluate this method, we will collect a list of general queries regarding EID preparedness and response from public health experts and practitioners. Moreover, we will test OntoPH’s reasoning capacity on hypothetical outbreaks. These full-scale case studies will provide us with valuable information on how to improve the usability and accuracy of OntoPH decision support.


    In recent decades, many EID outbreaks and epidemics have resulted in considerable human disability and mortality, in part due to ineffective coordination or slow response at the start of the outbreak. Responding to EID outbreaks is intrinsically challenging because of the uncertainties associated with EIDs, specifically the level of risk and the potential impact of their spread in a population. During an outbreak, evidence-based public health policies developed by public health authorities, legislators, and other government officials facilitate the implementation of a strong public health response. However, there are structural and political forces that prevent decision makers from making evidence-based policies in response to outbreaks. Therefore, it is necessary to have in place a mechanism to easily identify evidence in order to evaluate the consequences of public health or policy actions recommended to address these public health emergencies. An ontology framework for public health outbreak response would cut the time spent aggregating expert opinions during the initial stages of an outbreak. It would also assist public health administrators and government officials in determining next steps based on individual- and systems-level factors associated with the outbreak.

    Our approach is a potentially effective methodology for EID preparedness and response. It manages complex knowledge via a function-based knowledge representation. It introduces a systematic way of storing, retrieving, and using public health knowledge. The accuracy and comprehensiveness of the ontology can be improved as more knowledge is stored. We advocate that the public health community work toward the goal of developing a Gene Ontology-like [66] knowledge sharing standard. OntoPH demonstrates the possibility of knowledge management for EID emergency preparedness and response.


    We would like to thank Dr Yu Luo for his constructive comments on ontology design. ZZ, MG, SM, and VV designed the research; ZZ and MG built the ontology; ZZ and MG developed the semantic rules; and ZZ wrote the manuscript with the input of MG and VV. This work is supported in part by Columbia University and the Center for the Management of Systemic Risk.

    Conflicts of Interest

    None declared.

    Multimedia Appendix 1

    Ontology layer representation (Adapted from Cimiano [35]).

    PNG File, 225KB

    Multimedia Appendix 2

    Ontology classes part 1.

    PNG File, 34KB

    Multimedia Appendix 3

    Ontology classes part 2.

    PNG File, 25KB

    Multimedia Appendix 4

    Ontology classes part 3.

    PNG File, 25KB

    Multimedia Appendix 5

    Ontology classes part 4.

    PNG File, 26KB

    Multimedia Appendix 6

    Ontology properties part 1.

    PNG File, 18KB

    Multimedia Appendix 7

    Ontology properties part 2.

    PNG File, 16KB

    Multimedia Appendix 8

    Ontology individuals (selected).

    PNG File, 30KB


    1. World Health Organization. Ebola Data and Statistics. 2015. [accessed 2016-01-15]
    2. Tomori O. Will Africa's future epidemic ride on forgotten lessons from the Ebola epidemic? BMC Med 2015 May 14;13:116.
    3. Stoto MA, Nelson C, Higdon MA, Kraemer J, Hites L, Singleton C. Lessons about the state and local public health system response to the 2009 H1N1 pandemic: a workshop summary. J Public Health Manag Pract 2013;19(5):428-435.
    4. Oshitani H, Kamigaki T, Suzuki A. Major issues and challenges of influenza pandemic preparedness in developing countries. Emerg Infect Dis 2008;14(6):0839.
    5. Revere D, Turner AM, Madhavan A, Rambo N, Bugni PF, Kimball A, et al. Understanding the information needs of public health practitioners: a literature review to inform design of an interactive digital knowledge management system. J Biomed Inform 2007 Aug;40(4):410-421.
    6. Bloom B. Crossing the quality chasm: a new health system for the 21st century. JAMA 2002;287(5):646-647.
    7. LaPelle NR, Luckmann R, Simpson EH, Martin ER. Identifying strategies to improve access to credible and relevant information for public health professionals: a qualitative study. BMC Public Health 2006 Apr 05;6:89.
    8. Ho K, Peter WWP. Harnessing the social web for health and wellness: issues for research and knowledge translation. J Med Internet Res 2014;16(2):e34.
    9. Li S, Mackaness W. A multi-agent-based, semantic-driven system for decision support in epidemic management. Health Inform J 2015;21(3):195-208.
    10. Ostfeld RS, Glass GE, Keesing F. Spatial epidemiology: an emerging (or re-emerging) discipline. Trends Ecol Evol 2005 Jun;20(6):328-336.
    11. Mao L, Bian L. Spatial-temporal transmission of influenza and its health risks in an urbanized area. Comp Environ Urban Syst 2010;34(3):204-215.
    12. Keselman A, Rosemblat G, Kilicoglu H, Fiszman M, Jin H, Shin D, et al. Adapting semantic natural language processing technology to address information overload in influenza epidemic management. J Am Soc Inf Sci Technol 2010 Dec 01;61(12):2531-2543.
    13. Trinquart L, Galea S. Mapping epidemiology's past to inform its future: metaknowledge analysis of epidemiologic topics in leading journals, 1974-2013. Am J Epidemiol 2015 Jul 15;182(2):93-104.
    14. Neill D. New directions in artificial intelligence for public health surveillance. IEEE Intel Sys 2012;27(1):56-59.
    15. Trochim WM, Cabrera DA, Milstein B, Gallagher RS, Leischow SJ. Practical challenges of systems thinking and modeling in public health. Am J Public Health 2006 Mar;96(3):538-546.
    16. Schlager KJ. Systems engineering—key to modern development. IRE Tran. Eng Manage 1956 Jul;EM-3(3):64-66. [CrossRef]
    17. Venkatasubramanian V. Systemic failures: challenges and opportunities in risk management in complex systems. AIChE J 2010 Nov 30;57(1):2-9.
    18. Leveson NG, Stephanopoulos G. A system-theoretic, control-inspired view and approach to process safety. AIChE J 2013 Nov 28;60(1):2-14.
    19. Bookstaber R, Glasserman P, Iyengar G, Luo Y, Venkatasubramanian V, Zhang Z. Process systems engineering as a modeling paradigm for analyzing systemic risk in financial networks. J Invest 2015 May;24(2):147-162.
    20. Garvey P. Analytical Methods for Risk Management: A Systems Engineering Perspective. Boca Raton: CRC Press; 2008.
    21. Sage A, Rouse W. Handbook of Systems Engineering and Management. Hoboken: John Wiley & Sons; 2009.
    22. Kopach-Konrad R, Lawley M, Criswell M, Hasan I, Chakraborty S, Pekny J, et al. Applying systems engineering principles in improving health care delivery. J Gen Intern Med 2007 Dec;22 Suppl 3:431-437.
    23. Yaylali E, Ivy JS, Taheri J. Systems engineering methods for enhancing the value stream in public health preparedness: the role of Markov models, simulation, and optimization. Public Health Rep 2014;129 Suppl 4:145-153.
    24. Staab S, Studer R. Handbook on Ontologies, Vol. 1. 2nd Edition. Berlin: Springer Science & Business Media; 2013.
    25. Grüninger M, Fox M. Methodology for the design and evaluation of ontologies. 1995 Presented at: IJCAI'95, Workshop on Basic Ontological Issues in Knowledge Sharing; 1995; Montreal.
    26. Venkatasubramanian V, Zhang Z. TeCSMART: a hierarchical framework for modeling and analyzing systemic risk in sociotechnical systems. AIChE J 2016 May 24;62(9):3065-3084.
    27. Heussen K, Lind M. Decomposing objectives and functions in power system operation and control. 2009 Presented at: IEEE PES/IAS Conference on Sustainable Alternative Energ; 2009; Valencia.
    28. Lind M. Modeling goals and functions of complex industrial plants. Appl Artif Intell 1994 Apr;8(2):259-283.
    29. Chittaro L. Functional diagnosis and prescription of measurements using effort and flow variables. IEE Proc-Control Theory Appl 1995;142(5):420-432.
    30. Chittaro L, Guida G, Tasso C, Toppano E. Functional and teleological knowledge in the multimodeling approach for reasoning about physical systems: a case study in diagnosis. IEEE Trans Syst Man Cybern 1993;23(6):1718-1751.
    31. Kopena J, Regli W. Functional modeling of engineering designs for the semantic Web. IEEE Data Engineer Bull 2003;26(4):55-61.
    32. Council of the European Union. Council conclusions on lessons learned from the A/H1N1 pandemic—health security in the European Union. Brussels: Council of the European Union; 2010.   URL: https:/​/ec.​​health/​/sites/health/files/preparedness_response/docs/council_lessonsh1n1_en.​pdf [accessed 2017-09-25] [WebCite Cache]
    33. Learning the lessons from the H1N1 vaccination campaign for health care workers. London: UK Department of Health; 2010. [accessed 2017-09-25]
    34. World Health Organization. Pandemic Influenza Preparedness and Response Guide. 2009. [accessed 2017-09-25]
    35. Cimiano P. Ontology Learning and Population From Text: Algorithms, Evaluation and Applications. New York: Springer; 2006.
    36. Carley K, Columbus D, Landwehr P. Automap User's Guide 2013. 2013. [accessed 2017-09-25]
    37. Riloff E, Wiebe J. Learning extraction patterns for subjective expressions. 2003 Presented at: Proceedings of the conference on Empirical methods in natural language processing; 2003; Stroudsburg.
    38. Hoekstra R, Breuker J, Di Bello M, Boer A. The LKIF Core Ontology of Basic Legal Concepts. 2007 Presented at: Proceedings of the Workshop on Legal Ontologies and Artificial Intelligence Techniques (LOAIT); 2007; Stanford.
    39. d'Aquin M, Schlicht A, Stuckenschmidt H, Sabou M. Ontology modularization for knowledge selection: experiments and evaluations. 2007 Presented at: International Conference on Database and Expert Systems Applications; 2007; Regensburg.
    40. Grau BC, Horrocks A, Kazakov Y, Sattler U. A logical framework for modularity of ontologies. 2007 Presented at: IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence; 2007; Hyderabad.
    41. Musen M. The Protege project: A look back and a look forward. AI Matters 2015;1(4).
    42. Wang XH, Zhang DQ, Gu T, Pung HK. Ontology based context modeling and reasoning using OWL. 2004 Presented at: Proceedings of the Second IEEE Annual Conference; 2004; Washington.
    43. Kuba M. OWL 2 and SWRL tutorial. [accessed 2015-12-07]
    44. World Health Organization. WHO technical advice for case management of Influenza A(H1N1) in air transport. Geneva: World Health Organization; 2009. [accessed 2017-09-25]
    45. US Code: Title 42, Chapter 6A.
    46. Title 42, Chapter 1. Code of Federal Regulations. [accessed 2015-12-07]
    47. Department of Health and Human Services. Foreign quarantines: etiological agents, hosts, and vectors. 2010. [accessed 2017-09-25]
    48. World Health Organization. International Health Regulations. World Health Organization; 2005. [accessed 2017-09-25]
    49. World Health Organization. Protocol for assessing national surveillance and response capacities for the international health regulations. 2010. [accessed 2017-09-25]
    50. Fineberg HV. Pandemic preparedness and response—lessons from the H1N1 influenza of 2009. N Engl J Med 2014 Apr 03;370(14):1335-1342. [CrossRef] [Medline]
    51. Asnis DS, Conetta R, Teixeira AA, Waldman G, Sampson BA. The West Nile Virus outbreak of 1999 in New York: the Flushing Hospital experience. Clin Infect Dis 2000 Mar;30(3):413-418.
    52. Higuchi K. KH Coder. [accessed 2017-09-12]
    53. Venkatasubramanian V, Rengaswamy R, Yin K, Kavuri SN. A review of process fault detection and diagnosis. Comp Chem Engineer 2003 Mar;27(3):293-311.
    54. Burton-Jones A, Storey VC, Sugumaran V, Ahluwalia P. A semiotic metrics suite for assessing the quality of ontologies. Data Knowl Engineer 2005 Oct;55(1):84-102.
    55. Duque-Ramos A, Boeker M, Jansen L, Schulz S, Iniesta M, Fernández-Breis JT. Evaluating the Good Ontology Design Guideline (GoodOD) with the ontology quality requirements and evaluation method and metrics (OQuaRE). PLoS One 2014 Aug;9(8).
    56. Brank J, Grobelnik M, Mladenic D. A survey of ontology evaluation techniques. 2005 Presented at: Proceedings of 8th International Multi-Conf Information Society; 2005; Lisbon.
    57. Maedche A, Staab S. Measuring similarity between ontologies. 2002 Presented at: International Conference on Knowledge Engineering and Knowledge Management; 2002; London.
    58. Bures V, Otcenásková T, Cech P, Antos K. A proposal for a computer-based framework of support for public health in the management of biological incidents: the Czech Republic experience. Perspect Public Health 2012 Nov;132(6):292-298.
    59. Venkatasubramanian V, Zhao C, Joglekar G, Jain A, Hailemariam L, Suresh P, et al. Ontological informatics infrastructure for pharmaceutical product development and manufacturing. Comput Chem Engineer 2006 Sep;30(10-12):1482-1496.
    60. Rao R, Makkithaya K, Gupta N. Ontology based semantic representation for Public Health data integration. 2014 Presented at: International Conference on Contemporary Computing and Informatics; 2014; Mysore.
    61. Schriml LM, Arze C, Nadendla S, Chang YW, Mazaitis M, Felix V, et al. Disease ontology: a backbone for disease semantic integration. Nucleic Acids Res 2012 Jan;40:D940-D946.
    62. Hobbs JR, Pan F. An ontology of time for the semantic web. 2004 Mar 01 Presented at: ACM Transactions on Asian Language Information Processing (TALIP) - Special Issue on Temporal Information Processing; 2004; New York p. 66-85.
    63. Lieberman J, Singh R, Goad C. W3C geospatial ontologies. W3C Organization.
    64. Nolen L, Braveman P, Dachs J. Strengthening health information systems to address health equity challenges. B World Health Organ 2005;83(8):597-603.
    65. Tao, Wei W, Solbrig R, Savova G, Chute C. CNTRO: a semantic web ontology for temporal relation inferencing in clinical narratives. 2010 Presented at: AMIA Annual Symposium Proceedings; 2010; Washington.
    66. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. Nat Genet 2000 May;25(1):25-29.


    EID: emerging infectious disease
    LKIF: Legal Knowledge Interchange Format
    OntoPH: Public Health Document Ontology
    OQuaRE: Ontology Quality Evaluation Framework
    OWL: web ontology language
    NLP: natural language processing
    SWRL: semantic web rule language
    WHO: World Health Organization
    LMIC: low- and middle-income country

    Edited by T Sanchez; submitted 05.05.17; peer-reviewed by C Fu, J Du, Z He, A Lau; comments to author 30.05.17; revised version received 19.07.17; accepted 11.09.17; published 11.10.17

    ©Zhizun Zhang, Mila C Gonzalez, Stephen S Morse, Venkat Venkatasubramanian. Originally published in JMIR Research Protocols, 11.10.2017.

    This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.