Eugene Santos Jr.

Sydney E. Junkins 1887 Professor of Engineering

Research Interests

Nonlinear decision-making; innovative reasoning; emergent behavior; probabilistic reasoning; adversarial modeling; intent inferencing; user modeling; information retrieval; evolutionary computation; socio-cultural modeling; intelligent systems; artificial intelligence

Education

  • BS, Mathematics and Computer Science, Youngstown State University 1985
  • MS, Mathematics, Youngstown State University 1986
  • ScM, Computer Science, Brown University 1988
  • PhD, Computer Science, Brown University 1992

Awards

  • Fellow, AAAS (2016)
  • Fellow, IEEE (2012)

Professional Activities

  • Member, Department of the Air Force Scientific Advisory Board
  • Northeast Region Director, Sigma Xi Scientific Research Society
  • Editorial Board, Computational Intelligence
  • Editorial Board, Journal of Intelligent Information Systems
  • Editorial Board, Journal of Experimental and Theoretical Artificial Intelligence
  • Associate Editor, International Journal of Image and Graphics
  • Former Editor-in-Chief, IEEE Transactions on Systems, Man, and Cybernetics: Part B and IEEE Transactions on Cybernetics

Research Projects

  • Connections Hypothesis Provider in NCATS

    The Connections Hypothesis Provider (CHP) is a service built by Dartmouth College (PI: Dr. Eugene Santos) and Tufts University (Co-PI: Joseph Gormley) in collaboration with the National Center for Advancing Translational Sciences (NCATS). CHP aims to leverage clinical data along with structured biochemical knowledge to derive a computational representation of pathway structures and molecular components that supports human- and machine-driven interpretation, enables pathway-based biomarker discovery, and aids in the drug development process. In its current version, CHP supports queries about how genetic, therapeutic, and patient clinical features (e.g., tumor staging) contribute to patient survival, as computed within the context of our test pilot: a robust breast cancer dataset from The Cancer Genome Atlas (TCGA). We are using this as a proving ground for our system's basic operations as we work to incorporate structured pathway knowledge and pathway analysis methods into the tool.
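
    The kind of question CHP answers can be sketched in a few lines. The patient records, gene-mutation flag, and survival threshold below are invented for illustration; the real service computes such probabilities over TCGA breast cancer data rather than a hard-coded table.

        # Hypothetical miniature of a CHP-style survival query:
        # P(survival > threshold | gene mutation status), by counting.
        records = [
            # (has_mutation, survival_days) -- made-up patients
            (True, 1200), (True, 450), (True, 2000), (True, 300),
            (False, 1500), (False, 900), (False, 2500), (False, 1800),
        ]

        def p_survival(records, threshold, mutated):
            """P(survival > threshold | mutation status) by simple counting."""
            matching = [days for m, days in records if m == mutated]
            return sum(days > threshold for days in matching) / len(matching)

        print(p_survival(records, 1000, mutated=True))   # 0.5
        print(p_survival(records, 1000, mutated=False))  # 0.75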

  • Nonlinear decision-making

    To advance the science of decision-making as it pertains to how people learn to make decisions and how this process can be captured computationally, we are specifically addressing the challenge of how nonlinear decisions can be learned from data, experience, and even interactions with other decision-makers. Nonlinear thinking is a prized human ability, applied across virtually every domain when problems are hard and all known ways of addressing them fail to provide an adequate solution. For example, when all available choices are bad, must we settle for the least bad one? The ability to discover a new choice has been called nonlinear, innovative, intuitive, emergent, or "outside-the-box" thinking. It is well documented that humans can often excel at such thinking when data are scarce (or overwhelming), uncertainty is significant, and what is known or provided is riddled with contradictions. However, how this can be replicated computationally for a machine has yet to be fully addressed or understood in extant research.

  • Multi-source knowledge fusion and learning

    Real-world complex systems can be observed from many different angles or perspectives, and datasets collected from different perspectives often emphasize different types of features. This results in inconsistent beliefs about what is relevant to the system, how relevant features are related to one another, and what statistical properties these features possess. Many methods have been proposed to combine such diverse information sources. However, current algorithms learn from each dataset separately and then combine the individual outputs, since this is easier to do with heterogeneous datasets whose feature correlations are unknown. This approach, although convenient and intuitive, cannot capture the logical linkages between the datasets. To understand the interactions among variables learned from noisy, incomplete datasets, we are exploring algorithms that naturally fuse the datasets based on their shared variables and induce new variable relationships, as sketched below.
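
    A minimal sketch of the fusion idea, using bare dependency edges in place of full probabilistic models: two structures learned from separate datasets are unioned, and new candidate relationships are induced by chaining through their shared variables. All variable names here are hypothetical.

        # Dependency structures hypothetically learned from two datasets.
        edges_a = {("smoking", "inflammation"), ("age", "inflammation")}
        edges_b = {("inflammation", "survival"), ("stage", "survival")}

        def fuse(edges_a, edges_b):
            """Union the structures, then induce candidate links that
            chain through variables shared by both datasets."""
            vars_a = {v for e in edges_a for v in e}
            vars_b = {v for e in edges_b for v in e}
            shared = vars_a & vars_b
            induced = {(x, z) for (x, y) in edges_a for (y2, z) in edges_b
                       if y == y2 and y in shared and x != z}
            return edges_a | edges_b, induced

        fused, induced = fuse(edges_a, edges_b)
        print(induced)  # {('smoking', 'survival'), ('age', 'survival')}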

  • Innovative reasoning and emergent learning

    Can we teach computers to think outside the box? In other words, is it possible to replicate the innovative decision process computationally for machine learning? Extant research in machine learning has typically focused either on (a) building predictive models of a single, internally consistent target or on (b) a single task or decision in isolation, and only rarely both, given the difficulties already posed by these more restrictive problems. The successes and utility of modern machine learning are clearly evident in numerous applications across many domains, ever more so now with Big Data. Yet focus (a) has made machine learning of complex targets (e.g., systems of systems, complex systems, a human) very elusive, because such models inherently assume the target behaves as expected. This precludes the ability to learn emergent, unexpected, or innovative behaviors.

  • Complex and emergent behavior modeling

    In existing attempts to model complex systems, one critical aspect that has not been clearly addressed is the underlying mechanism for integrating the numerous "pieces" and "parts" that make up the target. Combining pieces is the process of aggregation and must handle inconsistencies among the pieces. Combining parts is the process of composition, in which the parts are encapsulations of information with a set of meaningful operations defined on them. Parts are functional in nature and thus are driven by function composition. Extant research has not directly addressed this distinction, resulting in mathematically ad hoc models that are opaque to analysis. We propose to develop a single, rigorous, comprehensive computational framework that is axiomatic and provides the capabilities needed to model complex systems, based on a new model of complex adaptive Bayesian Knowledge Bases and a novel, powerful analytical framework capable of holistic, end-to-end quantitative analysis of performance, robustness, vulnerability, and the impacts of change on the targets being modeled. Furthermore, our results will be applicable to numerous domains of public purpose, from crisis and catastrophe management for natural disasters and disease outbreaks to assessing the well-being of our financial system and national infrastructures.
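
    The aggregation/composition distinction can be pictured with a toy sketch. The estimates, weights, and part functions below are invented; the actual framework operates over Bayesian Knowledge Bases, not bare Python functions.

        def aggregate(estimates, weights):
            """Aggregation: reconcile inconsistent probability estimates
            of the same event from several pieces (weighted average)."""
            return sum(p * w for p, w in zip(estimates, weights)) / sum(weights)

        def compose(*parts):
            """Composition: parts are functions with defined interfaces;
            the whole is their function composition."""
            def composed(x):
                for part in parts:
                    x = part(x)
                return x
            return composed

        print(aggregate([0.2, 0.6, 0.9], weights=[1, 2, 1]))  # 0.575
        sensor = lambda raw: raw * 0.5           # part 1: scale a reading
        classifier = lambda v: v > 0.3           # part 2: threshold it
        print(compose(sensor, classifier)(0.9))  # True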

  • High performance search and optimization

    High performance search and optimization aims to develop new models and algorithms for solving challenging engineering problems in domains such as mission planning and logistics, manufacturing process optimization, composite materials production, distributed plant scheduling and management, and policy evaluation.

  • Distributed information retrieval

    Distributed information retrieval aims to develop a large-scale information retrieval architecture that can be deployed effectively and efficiently in distributed environments. Heterogeneity of information (in content, format, and source) is the typical issue that must be identified and handled in such environments. Our objective is to develop a unified architecture called I-FGM (intelligent foraging, gathering, and matching) for dealing with the massive amount of information in a dynamic search space on large-scale distributed platforms. The system explores the information space while continuously identifying and updating promising candidate information. Specific metrics are also being developed for performance evaluation.
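
    The foraging loop at the heart of this idea can be sketched with a priority queue: candidates are processed a piece at a time, and whichever partial match currently looks most promising is refined first. The query, documents, and scoring below are toy stand-ins for I-FGM's distributed agents.

        import heapq

        query = {"bayesian", "intent", "model"}
        docs = {
            "d1": "a bayesian model of user intent and goals".split(),
            "d2": "weather report for the northeast region".split(),
            "d3": "intent inferencing with a bayesian model".split(),
        }

        def partial_score(words, seen):
            """Match score over only the portion of a document seen so far."""
            return len(query & set(words[:seen])) / len(query)

        # (negated score, words seen, doc id); refine the current best
        # candidate by one more word per iteration until all are done.
        heap = [(0.0, 0, d) for d in docs]
        heapq.heapify(heap)
        while heap:
            neg, seen, d = heapq.heappop(heap)
            if seen == len(docs[d]):
                print(d, "final score:", -neg)
                continue
            heapq.heappush(heap, (-partial_score(docs[d], seen + 1), seen + 1, d))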

  • Bayesian knowledge bases, engineering, verification, and validation

    Bayesian knowledge bases, engineering, verification, and validation focuses on the fundamental problem of probabilistically modeling knowledge in order to represent and reason about information in a theoretically sound manner. The world is replete with incompleteness, imprecision, and inconsistency, which makes capturing even everyday tasks, processes, and activities very difficult, let alone the decision-making of experts or other complex phenomena. Improperly modeling uncertainty leads to numerous anomalies in reasoning as well as increased computational difficulties.
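
    A Bayesian knowledge base can be pictured as a set of weighted if-then rules over variable assignments; unlike a Bayesian network, probabilities need not be specified for every condition, so incompleteness is tolerated rather than forbidden. The rules and query below are invented, and real BKB inference is considerably richer than this toy.

        # (antecedent assignments, consequent assignment, probability)
        rules = [
            (set(), ("rain", True), 0.3),
            (set(), ("rain", False), 0.7),
            ({("rain", True)}, ("wet", True), 0.9),
            ({("rain", False)}, ("wet", True), 0.1),
            # no rule for wet=False: incompleteness is allowed
        ]

        def world_weight(world, rules):
            """Product of rule probabilities supporting each assignment;
            a world with an unsupported assignment gets weight 0."""
            weight = 1.0
            for assignment in world:
                support = [p for ante, cons, p in rules
                           if cons == assignment and ante <= world]
                if not support:
                    return 0.0
                weight *= max(support)
            return weight

        # Belief revision: most probable complete world given wet=True.
        worlds = [{("rain", r), ("wet", True)} for r in (True, False)]
        best = max(worlds, key=lambda w: world_weight(w, rules))
        print(sorted(best), world_weight(best, rules))  # rain=True world, ~0.27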

  • Information processing and summarization

    Information processing and summarization are critical areas of research that study how we can develop stand-alone algorithms, as well as algorithms that work in concert with humans, to handle and process information in a variety of forms. The goal is to extract the meaning (or semantics) of the information in order to better manipulate and reason about it and to present it to the human user. This is fundamental to solving problems such as avoiding information overload and providing effective summarization.
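
    As one concrete instance, extractive summarization can be sketched by scoring sentences on the frequency of their content words and keeping the top scorer. The text, stop-word list, and scoring rule are illustrative stand-ins for the semantics-driven processing described above.

        from collections import Counter

        text = ("Information overload is a growing problem. Summarization "
                "selects the sentences that carry the most meaning. Good "
                "summaries reduce overload for the human user.")

        stop = {"is", "a", "the", "that", "for"}
        tokens = [w.lower().strip(".,") for w in text.split()]
        freq = Counter(t for t in tokens if t not in stop)

        def score(sentence):
            """Average content-word frequency within the sentence."""
            toks = [w.lower().strip(".,") for w in sentence.split()
                    if w.lower() not in stop]
            return sum(freq[t] for t in toks) / len(toks)

        sentences = [s.strip() for s in text.split(".") if s.strip()]
        print(max(sentences, key=score))  # the highest-scoring sentence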

  • Influence of culture and society on attitudes and behaviors

    Influence of culture and society on attitudes and behaviors aims to build and employ social, cultural, and political data-driven models to explore and explain attitudes and behaviors. The efforts involve classifying the factors that play significant roles in attitudes and behaviors, abstracting general rules from traditional research such as sociological case studies, studying the inferencing structures that allow different factors to influence decision-making, reasoning from different points of view, and applying these models to predict behavior.

  • Adversary intent inferencing and adversarial modeling

    Adversary intent inferencing and adversarial modeling investigates the feasibility of developing and utilizing an adversary intent inferencing model as a core element of predictive analyses and simulations that establish emergent adversarial behavior. Our aim is to use this intelligent adversary model to predict adversary intentions, explain adversary goals, and predict enemy actions, in order to generate the alternative futures critical to course of action (COA) analysis. Such a system will allow planners to gauge and evaluate the effectiveness of alternative plans under varying actions and reactions to friendly COAs. The approach can also be applied in a broad range of other areas, as illustrated below.
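
    A minimal sketch of intent-driven prediction: maintain a posterior over hypothesized adversary goals as actions are observed, then predict the next action as the one most probable under that posterior. The goals, actions, and probabilities are all invented for illustration; the actual model structures intent with far more than a flat prior over goals.

        priors = {"seize_bridge": 0.5, "defend_city": 0.5}
        likelihood = {  # P(action | goal), hypothetical numbers
            "seize_bridge": {"move_north": 0.7, "dig_in": 0.1, "probe_river": 0.2},
            "defend_city":  {"move_north": 0.1, "dig_in": 0.7, "probe_river": 0.2},
        }

        def update(posterior, action):
            """Bayes update of the goal posterior after one observed action."""
            scaled = {g: p * likelihood[g][action] for g, p in posterior.items()}
            z = sum(scaled.values())
            return {g: p / z for g, p in scaled.items()}

        def predict_next(posterior):
            """Most probable next action, marginalizing over goals."""
            actions = likelihood["seize_bridge"].keys()
            return max(actions, key=lambda a: sum(posterior[g] * likelihood[g][a]
                                                  for g in posterior))

        posterior = update(priors, "probe_river")    # uninformative: stays 50/50
        posterior = update(posterior, "move_north")  # now favors seize_bridge
        print(posterior, predict_next(posterior))    # ... 'move_north'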

  • User modeling and user intent inferencing

    User modeling and user intent inferencing involves building dynamic cognitive user models that can predict the goals and intentions of a user in order to understand and ultimately provide proactive assistance with user tasks, such as information gathering. The key is to capture the user's intent by answering questions such as: what is the user's current focus, why is the user pursuing certain goals, and how will the user achieve them? The efforts involve machine learning, knowledge representation, intent inferencing, and establishment of proper evaluation metrics. This work has been applied to assisting with intelligent information retrieval and enhancing the effectiveness of intelligence analysts.
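
    One simple way to picture the "what is the user's current focus" question: weight the keywords of successive queries and decay older ones, so the model's answer tracks the interaction as it unfolds. The queries and decay rate below are illustrative only, not our deployed user model.

        from collections import defaultdict

        DECAY = 0.5  # older interests fade by half with each new query
        focus = defaultdict(float)

        def observe(query):
            """Fold one query into the dynamic user model."""
            for kw in focus:
                focus[kw] *= DECAY
            for kw in query.lower().split():
                focus[kw] += 1.0

        for q in ["submarine detection sonar", "sonar arrays",
                  "arrays signal processing"]:
            observe(q)

        print(max(focus, key=focus.get))  # 'arrays': the current focus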

  • Artificial intelligence

    What is the nature of intelligence? Can we make machines that are intelligent? Machines that think like human beings or think differently? Can machines think even better than humans? What are the implications? These and other questions are being investigated.

  • Deception detection

    Deception detection aims to automatically detect deceptive actions and infer the intentions behind them. Our objectives are to (1) develop a framework for categorizing and classifying the errors an expert may commit, since not all errors are deception; and (2) design algorithms for automatic deception detection capable of providing detailed evidential information, an explanation of the deceiver's intent, and an analysis of the deception's impact. Like insider threat, deception detection can occur in any number of scenarios and domains, and the two problems are often interrelated.
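
    One cue for separating honest error from deception can be sketched as follows: random mistakes scatter in both directions, while deceptive reports tend to skew systematically toward the reporter's interest. The readings and the bias statistic below are invented for illustration and are far simpler than the evidential framework described above.

        def directional_bias(reported, actual):
            """Fraction of erroneous reports that err in the same direction."""
            errors = [r - a for r, a in zip(reported, actual) if r != a]
            if not errors:
                return 0.0
            one_way = max(sum(e > 0 for e in errors), sum(e < 0 for e in errors))
            return one_way / len(errors)

        honest = directional_bias([10, 13, 9, 12], actual=[10, 12, 10, 12])
        suspect = directional_bias([14, 15, 13, 16], actual=[10, 12, 10, 12])
        print(honest, suspect)  # 0.5 vs 1.0: consistent inflation flags review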

  • Insider threat

    Insider threat and deception detection are two areas that focus on user actions and their impacts on the systems with which users interact. Insider threat research aims to understand and prevent malicious activities instigated by "trusted" users of complex computer and information systems. Such activities cover a broad spectrum, ranging from simple theft of confidential data to subtler alterations of system performance or information. Examples of the latter include a minor perturbation of a component specification in a manufacturing process that ripples into final component failure, or influencing decision-makers by modifying the flow and content of their information. The goal is to model insider threats in order to predict behavior and ultimately infer the insider's goals and intentions, as sketched below.
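
    A minimal sketch of one ingredient: profile a trusted user's normal actions, then score new activity by how surprising it is under that profile. The action log, smoothing, and alert threshold are invented for illustration; intent inference would sit on top of such signals.

        from collections import Counter
        from math import log2

        history = ["read_doc", "read_doc", "edit_spec", "read_doc",
                   "edit_spec", "read_doc", "read_doc", "read_doc"]
        profile = Counter(history)
        total = sum(profile.values())

        def surprise(action):
            """Self-information -log2 P(action), with add-one smoothing
            so never-seen actions are scored rather than crashing."""
            p = (profile[action] + 1) / (total + len(profile) + 1)
            return -log2(p)

        for action in ["read_doc", "edit_spec", "export_database"]:
            s = surprise(action)
            print(f"{action}: {s:.2f} bits", "[ALERT]" if s > 3.0 else "[ok]")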

Courses

  • ENGS 52: Introduction to Operations Research
  • ENGS 65: Engineering Software Design
  • ENGG 418: Applied Natural Language Processing

Videos

Artificial Intelligence and Responsible Design

Around theCUBE, Unpacking AI Panel

AI in Medicine and the Role of State Governments