“Detecting Model Performance Deterioration Using Distribution Divergence Measures”
Friday, December 4 | 1 – 1:30 p.m.
Amin Tajgardoon, Graduate Student, ISP
Abstract: We study the correlation between distribution divergence and model performance deterioration. The hypothesis is that as datasets deviate from the original distribution, the performance of the original model further deteriorates. We experiment with several divergence measures and multiple synthetic and real-world datasets to test this hypothesis. I will present the preliminary results of our study and discuss future work, including the development of a divergence criterion for determining the significance of the divergence and predicting the model's degradation. Such a criterion is of special importance in unsupervised problems where the test labels are unknown and the model's performance cannot be measured directly. Moreover, I will discuss similar ongoing work in which we aim to detect performance deterioration in unsupervised domain adaptation methods.
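As a rough illustration of what a distribution divergence measure computes, the sketch below compares a histogram of a training feature with a histogram of the same feature in new data. The abstract does not name the measures used in the study; the Jensen-Shannon divergence, the binning, and all function names here are assumptions chosen for illustration only.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same bins.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def histogram(values, edges):
    """Normalized histogram over half-open bins [edges[i], edges[i+1]).
    Values outside the edges are ignored; at least one value must fall inside."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

if __name__ == "__main__":
    edges = [0.0, 0.25, 0.5, 0.75, 1.0]          # hypothetical bin edges
    train = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7]       # hypothetical training feature values
    shifted = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]    # hypothetical drifted values
    print(js_divergence(histogram(train, edges), histogram(shifted, edges)))
```

A larger value indicates that the new data sits further from the training distribution; whether a given value is large enough to predict degradation is exactly the open criterion the abstract describes.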
“Reasoning About Complex Media from Weak Multi-Modal Supervision”
Friday, December 4 | 12:30 – 1 p.m.
Adriana Kovashka, Professor, ISP
Abstract: In a world of abundant information targeting multiple senses, and increasingly powerful media, we need new mechanisms to model content. Techniques for representing individual channels, such as visual data or textual data, have greatly improved, and some techniques exist to model the relationship between channels that are “mirror images” of each other and contain the same semantics. However, multimodal data in the real world contains little redundancy; the visual and textual channels complement each other. We examine the relationship between multiple channels in complex media, in two domains, advertisements and political articles.
First, we collect a large dataset of advertisements and public service announcements, covering almost forty topics (ranging from automobiles and clothing, to health and domestic violence). We pose decoding the ads as automatically answering the questions “What should the viewer do, according to the ad” (the suggested action), and “Why should the viewer do the suggested action, according to the ad” (the suggested reason). We train a variety of algorithms to choose the appropriate action-reason statement, given the ad image and potentially a slogan embedded in it. The task is challenging because of the great diversity in how different users annotate an ad, even if they draw similar conclusions. One approach mines information from external knowledge bases, but there is a plethora of information that can be retrieved yet is not relevant. We show how to automatically transform the training data in order to focus our approach’s attention on relevant facts, without relevance annotations for training. We also present an approach for learning to recognize new concepts given supervision only in the form of noisy captions.
Second, we collect a dataset of multimodal political articles containing lengthy text and a small number of images. We learn to predict the political bias of the article, as well as perform cross-modal retrieval despite large visual variability for the same topic. To infer political bias, we use generative modeling to show how the face of the same politician appears differently at each end of the political spectrum. To understand how image and text contribute to persuasion and bias, we learn to retrieve sentences for a given image, and vice versa. The task is challenging because, unlike image-text pairs in captioning, the images and text in political articles overlap in only a very abstract sense. We impose a loss requiring images that correspond to similar text to lie close together in a projection space, even if they appear very diverse purely visually. We show that our loss significantly improves performance in conjunction with a variety of existing recent losses. We also propose new weighting mechanisms to prioritize abstract image-text relationships during training.
“Hacking Your Way to RL in the Real World (Someday)”
Friday, November 20 | 12:30 – 1:30 p.m.
Edward Grefenstette, Research Scientist, Facebook AI Research; Honorary Associate Professor, UCL
Abstract: Deep Reinforcement Learning has produced some impressive results—mostly in a particular kind of game or game-like setting—which are worthy of praise. However, we (eventually) want agents which do practical stuff in the real world. Is the real world like these games? What's the realism gap? Are there games that close it a little? All these questions will (possibly) be partially and vaguely answered as I present a new environment for RL research which will help push the boundaries of our field (without setting your cluster on fire), and discuss some recent methods developed to scratch the surface of the challenges associated with it.
Bio: Edward Grefenstette is a Research Scientist at Facebook AI Research, and Honorary Associate Professor at UCL. He was previously a Staff Research Scientist at DeepMind, and a Junior Research Fellow within Oxford’s Department of Computer Science and Somerville College. He completed his DPhil (PhD) at the University of Oxford in 2013 under the supervision of Profs Coecke and Pulman, and Dr Sadrzadeh, working on applying category-theoretic tools–initially developed to model quantum information flow–to model compositionality of distributed representations in natural language semantics. His recent research has covered topics at the intersection of deep learning and machine reasoning, addressing questions such as how neural networks can model or understand logic and mathematics, infer implicit or human-readable programs, or learn to understand instructions from simulation.
“Evaluating Effect of Microsoft Hololens on Extraneous Cognitive Load During Simulated Cervical Lateral Mass Screw Placement”
Friday, November 6 | 12:30 – 1:30 p.m.
Dmitriy Babichenko, Clinical Associate Professor, SCI
Abstract: The use of augmented reality (AR) is widely accepted as a feasible training, planning, and prototyping tool. Unlike virtual reality (VR), which implies a complete immersion in a virtual world, AR adds digital elements to a live view by using a headset or camera on a smartphone. The ability to project digital elements into the physical world, combined with the Federal Food and Drug Administration (FDA) approval to use the Microsoft HoloLens in surgical procedures, presents a unique opportunity to develop novel neurosurgical and orthopaedic surgery applications of AR, specifically in spine surgery. Placement of any spinal instrumentation is not free from complication, even with modern image-guidance platforms. The potential of AR in spine surgery lies in its ability to enhance the surgeon’s operative awareness by allowing them to project CT-generated 3D models of the patient’s own bony anatomy with overlaid pre-planned screw trajectories, thus in theory increasing operative efficiency while reducing operative error. Prior to applying AR in the operative suite, the potential negatives associated with using AR technologies in the operative theater, namely their effect on extraneous cognitive load and on task performance, need to be addressed. A matched crossover trial design was used with a combined group of 22 neurosurgery and orthopaedic surgery residents, ranging in their training from the second Post-Graduate Year (PGY-2) to chief resident (PGY-7 for neurosurgery and PGY-5 for orthopaedic surgery, respectively). Participants were asked to place cervical lateral mass screws in a standardized, 3D-printed cervical spine with and without the Microsoft HoloLens 1 headset worn. Lateral mass screws were placed bilaterally at C4 to C6, with six cervical lateral mass screws placed by each participant in each trial, for a total of 12 screws per participant.
Overall time to drill six pilot holes, time for placement of each individual screw, pilot hole proximity to a predetermined entry point as defined by the Magerl method, and presence of medial/lateral breaches were assessed and used as surrogate measures of mental taxation. The SURG-TLX questionnaire, a validated measure of extraneous cognitive load, was also used to compare cognitive strain of the task with and without the HoloLens 1.
“Multimodal Communication: Commonsense, Grounding, and Computation”
Friday, October 23 | 12:50 – 1:30 p.m.
Malihe Alikhani, Assistant Professor, SCI; ISP Faculty Candidate
Abstract: From the gestures that accompany speech to images in social media posts, humans effortlessly combine words with visual presentations. Communication succeeds even though visual and spatial representations are not necessarily wired to syntax and conventions, and do not always replicate appearance. Machines, however, are not equipped to understand and generate such presentations due to people’s pervasive reliance on commonsense and world knowledge in relating words and external presentations. I show the potential of discourse modeling for solving the problem of multimodal communication. I start by presenting a novel framework for modeling and learning a deeper combined understanding of text and images by classifying inferential relations to predict temporal, causal, and logical entailments in context. This enables systems to make inferences with high accuracy while revealing author expectations and social-context preferences. I proceed to design methods for generating text based on visual input that use these inferences to provide users with key requested information. The results show a dramatic improvement in the consistency and quality of the generated text by decreasing spurious information by half. Finally, I describe the design of two multimodal communicative systems that can reason on the context of interactions in the areas of human-robot collaboration and conversational artificial intelligence and describe my research vision: to build human-level communicative systems and grounded artificial intelligence by leveraging the cognitive science of language use.
“Survey of the Effects of Image Domain Shifts for Different Methods of Visual Question Answering”
Friday, October 23 | 12:30 – 12:50 p.m.
Tristan Maidment, Graduate Student, ISP
Abstract: Visual Question Answering (VQA) takes visual and linguistic recognition and combines them in a multi-modal reasoning task. Modern VQA systems are typically limited to a single VQA dataset, due to dataset-specific domain differences or due to dataset overfitting. To better understand how VQA algorithms handle shifts in domain, we introduce a synthetic domain shift in the image space, using style transfer techniques to modify the look of the images. Five different VQA algorithms are compared, each of which uses a different methodology for representing the different modalities and a different reasoning mechanism within the combined spaces. Each algorithm’s response to domain shifts is monitored, and various fine-tuning techniques and domain adaptation methods are explored to improve domain generalization. We demonstrate how different method types exhibit different levels of domain shift robustness and provide insight into means of improving domain generalization in VQA.
“Physics Guided Machine Learning: A New Paradigm for Scientific Knowledge Discovery”
Friday, October 9 | 12:30 – 1:30 p.m.
Xiaowei Jia, Assistant Professor, SCI; ISP Faculty Candidate
Abstract: Data science and machine learning models, which have found tremendous success in several commercial applications where large-scale data is available, e.g., computer vision and natural language processing, have met with limited success in scientific domains. Traditionally, physics-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to incomplete or inaccurate representations of the physical processes being modeled. Given rapid data growth due to advances in sensor technologies, there is a tremendous opportunity to systematically advance modeling in these domains by using machine learning methods. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery since the “black box” use of ML often leads to serious false discoveries in scientific applications. Because the hypothesis space of scientific applications is often complex and exponentially large, an uninformed data-driven search can easily select a highly complex model that is neither generalizable nor physically interpretable, resulting in the discovery of spurious relationships, predictors, and patterns. This problem becomes worse when there is a scarcity of labeled samples, which is quite common in science and engineering domains.
My work aims to build the foundations of physics-guided machine learning by exploring several ways of bringing scientific knowledge and machine learning models together. My work has the potential to greatly advance the pace of discovery in a number of scientific and engineering disciplines where physics-based models are used, e.g., hydrology, agriculture, climate science, materials science, power engineering and biomedicine.
“Skill Complexity and Labor Resilience in the Future of Work”
Friday, September 25 | 12:30 – 1:30 p.m.
Morgan Frank, Assistant Professor, SCI; ISP Faculty Candidate
Abstract: Rapidly advancing cognitive technologies, such as artificial intelligence (AI), have the potential to drastically impact modern society and to shape the future of work. Although a given technology impacts demand for only a narrow set of workplace skills, modern empirical work explains employment trends with coarse labor distinctions between cognitive and physical or routine and non-routine work. In this talk, I explore the complex ways that skills and employment undergird labor dynamics in the US. I perform an unsupervised analysis of specific workplace skills as a network whose aggregate and refined topology grant new insights into job polarization and workers' career mobility. Since these inter-skill connections predict career mobility, I construct a map of US occupations that captures worker transition rates between employment opportunities and, in combination with urban employment data, predicts workers' spatial mobility. These refined models that connect workplace skills to both inter-city and intra-city dynamics enable new insights and new input data sources for real-time labor trends at the level of specific technologies and specific workplace skills. I demonstrate how simple measures for skills within a labor market contribute to the differential impact of automation across US cities of different sizes, and how more complicated measures for job connectivity indicate economic resilience to labor shocks, including labor shifts from COVID-19. These results suggest that preparing for AI and the future of work may best be achieved by fostering resilient workforces and adaptable workers.
“Instance-Specific Causal Bayesian Network Structure Learning”
Thursday, September 24 | 10 a.m. – Noon
Fattaneh Jabbari, Graduate Student, ISP
Abstract: Much of science consists of discovering and modeling causal relationships in nature. Causal knowledge provides insight into the mechanisms acting currently (e.g., the side-effects caused by a new medication) and the prediction of outcomes that will follow when actions are taken (e.g., the chance that a disease will be cured if a particular medication is taken). In the past 30 years, there has been tremendous progress in developing computational methods for discovering causal knowledge from observational data. Some of the most significant progress in causal discovery research has occurred using causal Bayesian networks (CBNs). A CBN is a probabilistic graphical model that includes nodes and edges. Each node corresponds to a domain variable and each edge (or arc) is interpreted as a causal relationship between a parent node (a cause) and a child node (an effect), relative to the other nodes in the network.
In this dissertation, I focus on two problems: (1) developing efficient CBN structure learning methods that learn CBNs in the presence of latent variables (i.e., unmeasured or hidden variables). Handling latent variables is important in causal discovery since they can induce dependencies that need to be distinguished from direct causation. (2) developing instance-specific CBN structure learning algorithms to learn a CBN that is specific to an instance (e.g., patient), both with and without latent variables. Learning instance-specific CBNs is important in many areas of science, especially the biomedical domain; however, it is an under-studied research problem. In this dissertation, I develop various novel instance-specific CBN structure learning methods and evaluate them using simulated and real-world data.
“Program Assessment Meeting”
Friday, September 11 | 12:30 – 1:30 p.m.
“ISP Faculty Presentations”
Friday, August 28 | 12:30 – 1:30 p.m.
This week we will have presentations from some of the ISP faculty who will talk about their research. The faculty are: Dr. Diane Litman (CS/ISP/LRDC), Dr. Sera Linardi (GSPIA/ISP), and Dr. Shandong Wu (Radiology, DBMI, BIOENG).
Dissertation Defense: Jaromir Savelka
Monday, April 20 | 2 – 3:30 p.m.
“Discovering Sentences for Argumentation About Meaning of Statutory Terms”
Abstract: In this work I studied, designed, and evaluated computational methods to support interpretation of statutory terms. Understanding statutes is difficult because the abstract rules they express must account for diverse situations, even those not yet encountered. The interpretation involves an investigation of how a particular term has been referred to, explained, interpreted, or applied in the past. Going through the list of results manually is labor intensive. A response to a search query may consist of hundreds or thousands of documents. I investigated the feasibility of developing a system that would respond to a query with a list of sentences that mention the term in a way that is useful for understanding and elaborating its meaning. I treat the discovery of sentences for argumentation about the meaning of statutory terms as a special case of ad hoc document retrieval. The specifics include retrieval of short texts (sentences), specialized document types (legal case texts), and, above all, the unique definition of document relevance.
This work makes a number of contributions to the areas of legal information retrieval and legal text analytics. First, a novel task of discovering sentences for argumentation about the meaning of statutory terms is proposed. This is a task lawyers routinely perform using a combination of manual and computational approaches. Second, a data set comprising 42 queries (26,959 sentences) was assembled to support the experiments presented here. Third, by systematically assessing the performance of a number of traditional information retrieval techniques, I position this novel task in the context of a large body of work on ad hoc document retrieval. Fourth, I assembled a unique list of 129 descriptive features that model the retrieved sentences, their relationships to the terms of interest, as well as the statutory provisions they come from. I demonstrate how the proposed feature set could be utilized in learning-to-rank settings by showing how a number of machine learning algorithms learn to rank the sentences with very reasonable effectiveness. Fifth, I analyze the effectiveness of fine-tuning pre-trained language models in the context of this special task and demonstrate a very promising direction for future work.
“Tensor Mixed Graphical Model”
Friday, April 17 | 1 – 1:30 p.m.
Haiyi Mao, Graduate Student, ISP
Abstract: We consider the graphical model with continuous variables, discrete variables and tensor variables. Lee and Hastie (2013) proved that pseudo-likelihood with different regressions and L1 regularizers can learn a sparse and meaningful graph structure. Here we extend this work with tensor variables, which can be seen as high-dimensional multi-way arrays. Unlike previous works, which flatten the tensor into a long vector and treat each entry as one variable, we take a tensor as a single variable to keep its structural information. With low-rank and sparsity constraints on the connectivity matrices (tensors), we can define new types of edges for tensor-to-tensor variables, tensor-to-continuous variables and tensor-to-discrete variables. In the end we will show some preliminary results demonstrating that the model can learn meaningful linear relations between variables.
“Parameter Estimation and Time Series Modeling of Ordinary Differential Equations (ODE) for Wound Healing”
Friday, April 17
Jun Luo, Graduate Student, ISP
Abstract: Ordinary Differential Equations (ODEs) are often used to model dynamic systems and have been successfully applied to many fields, including population prediction, decay of radioactive material, and electric circuits. The inverse problem, parameter fitting, is often challenging due to the dimensionality of the parameters and their correlations. Meanwhile, thanks to the rapid development of Machine Learning techniques, it has become possible to more accurately predict time series data from systems that can be modeled by ODEs. In this work, we focus on the parameter fitting problem and on time series prediction for an ODE application in the wound healing process. We use a Markov Chain Monte Carlo method to estimate parameters from data, which can potentially represent customized settings for individuals across a population. Long Short-Term Memory (LSTM) based time series models are used to predict tendencies for different concentrations that we measure in wound tissues. Experiments performed on synthesized data show that our models are capable of estimating parameters for an ODE wound healing model with 42 parameters and of predicting time series data given the history.
Thesis Defense: Sanya Taneja
Friday, April 10 | 3:30 – 5 p.m.
“Bayesian Networks for Diagnosing Childhood Malaria in Malawi”
Abstract: Infectious diseases such as malaria are responsible for the majority of under-five deaths in low- and middle-income countries. Accurate diagnosis and management of illnesses can help in reducing the global burden of childhood morbidity and mortality. While trained healthcare workers deliver treatment for common childhood illnesses in healthcare facilities in Malawi, there is a significant lack of diagnostic support in rural health centers. With recent trends in artificial intelligence in global health, we hypothesize that a data-driven approach to diagnosis of childhood illnesses may address the challenges faced in health centers in low-resource countries such as Malawi. In this study, we aim to utilize Bayesian networks to diagnose cases of childhood malaria in Malawi. We develop two Bayesian diagnostic models for classification of malaria using clinical signs and symptoms. The first model is created manually, while the other combines an Augmented Naïve Bayes approach with expert knowledge. The models are learnt using a national survey dataset which contains sick child observations including patient information, diagnosis, and symptoms. The target malaria diagnosis is taken as the result of the malaria rapid diagnostic test (mRDT). The performance of the Bayesian models is further compared to traditional machine learning classifiers on the basis of accuracy, AUC, precision, F1 score, sensitivity and specificity. We also present an experimental framework that can be used to model the malaria diagnostic support in the rural health centers. The manually created Bayesian model achieves an accuracy of 63.6% with an AUC of 0.58. The augmented naïve Bayes model considers associations between the variables and achieves an accuracy of 62.7%. The Bayesian models outperform the logistic regression and random forest models in the classification of the disease. 
Bayesian models provide a powerful, efficient and data-driven tool for diagnosis of childhood illness that can lead to a more evidence-based clinical practice in Malawi. The simplicity and interpretability of Bayesian models offer a unique approach to diagnostic support in low-resource countries. As Bayesian models are representative of the population from which the data has been derived, this approach can be generalized to other childhood illnesses in different regions of the world.
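For readers less familiar with the evaluation measures named in the abstract above (accuracy, precision, F1 score, sensitivity, specificity), the following minimal sketch shows how they follow from a binary confusion matrix. The function name and the counts in the example are hypothetical and are not taken from the study; AUC is omitted because it requires ranked scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.
    tp/fp/tn/fn = true positives, false positives, true negatives, false negatives."""
    sensitivity = tp / (tp + fn)          # recall on the positive (e.g. mRDT-positive) class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=40, fp=10, tn=30, fn=20).items():
        print(f"{name}: {value:.3f}")
```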
“Towards Understanding Depression in Parent-Child Interactions”
Friday, April 3 | 1 – 1:30 p.m.
Maneesh Bilalpur, Graduate Student, ISP
Abstract: Automated multimodal behaviour prediction methods find use in both clinical and in-the-wild applications. In this presentation, we explain the characteristics of our dataset obtained through a longitudinal study of depression in family environments, specifically in Mother-Adolescent interactions. We give a brief introduction to the nature of our annotation process and how it captures various aspects of behaviour through constructs. We also explain some of the widely used multimodal (vision and speech) features in affect analysis. Finally, we present our results on a multimodal approach to predict the constructs.
“Multi-Domain Learning by Meta-Learning: Taking Optimal Steps in Multi-Domain Loss Landscapes by Inner Loop Maximum a Posteriori Examination”
Friday, April 3 | 12:30 – 1 p.m.
Anthony Sicilia, Graduate Student, ISP
Abstract: We consider a model-agnostic solution to the problem of Multi-Domain Learning (MDL). While a number of solutions to the problem of MDL exist, they primarily consist of model-dependent methods which designate separate sets of shared and domain-specific parameters, additionally making architectural specifications to accommodate these parameters. While some of these methods are effective, they are challenging to apply in problem spaces where certain standard model architectures are well accepted; e.g. the UNet architecture in the problem of Semantic Segmentation. To this end, we consider a weighted loss function (perhaps the simplest solution to MDL) and extend it to an effective procedure by employing techniques from the recently active area of learning-to-learn (meta-learning). Specifically, we take inner-loop gradient steps to dynamically estimate posterior distributions over the hyper-parameters of our loss function. The immediate result is a method which requires no additional model parameters and no architectural specification; instead, only a relatively efficient algorithmic modification is needed to improve performance in MDL. We demonstrate our solution on a fitting problem in medical imaging, specifically, in automatic segmentation of white matter hyper-intensity (WMH) where we take our domains to be two distinct imaging modalities (T1-MR and FLAIR) with a significant difference in underlying distribution and a large information imbalance.
“Adaptation and Synchronization in Teams”
Friday, March 27 | 1 – 1:30 p.m.
Huao Li, Graduate Student, ISP
Abstract: Mutual adaptation has been considered crucial in human teams and human-agent teams. However, there has been little research on the individual adaptation process and its influence on (a) team state and/or (b) team performance in either human-human or human-agent teams (HATs). This effect is difficult to untangle since adaptation leads to non-stationary team dynamics. Good individual adaptation should contribute to team synchronization, which is especially important in heterogeneous, tightly coupled teams where team strategies must change to deal with individual differences, changing contexts and team reorganization. Most existing research on assessment of team processes uses retrospective team member reports and thus cannot capture the dynamic and emergent nature of team processes needed to develop computational models. In the presented work, we (a) investigated the hypothesis that team performance is influenced by individual capability, individual adaptation, and team synchronization, and (b) characterized these factors. The application of our research findings and proposed quantitative methods to developing adaptive agents for human-autonomy teaming is discussed.
“Sub-Volume GAN for High Resolution 3D Medical Images Synthesis”
Friday, March 27 | 12:30 – 1 p.m.
Li Sun, Graduate Student, ISP
Abstract: Generative adversarial networks (GANs) have been applied successfully in medical image analysis, including data augmentation and image-to-image translation. Limited by memory, most current GAN models, especially 3D GANs, are trained on low resolution medical images. In this work, we propose a novel end-to-end GAN architecture that can be trained on 3D high resolution images. The key idea is to introduce a subsample layer to reduce the dimensionality of feature maps in the generator during training. At test time, the subsample layer can be removed to directly generate the full image volume at 256^3. An encoder is incorporated into the model to learn feature representations of images for disease severity prediction. Experiments performed on 3D thorax CT and brain MRI demonstrate that our approach can generate images of better quality than baseline models.
“Generating High-Quality Images for Weakly Supervised Learning, Semi-Supervised Learning, and Transfer Learning Via Conditional Generative Adversarial Network”
Friday, March 6 | 1 – 1:30 p.m.
Yanwu Xu, Graduate Student, ISP
Abstract: Conditional generative models have enjoyed remarkable progress over the past few years. One of the popular conditional models is Auxiliary Classifier GAN (AC-GAN), which generates highly discriminative images by extending the loss function of GAN with an auxiliary classifier. However, the diversity of the samples generated by AC-GAN tends to decrease as the number of classes increases, hence limiting its power on large-scale data. To address this issue, we identify the source of the low diversity theoretically. We propose Twin Auxiliary Classifiers Generative Adversarial Net (TAC-GAN), which further benefits from a new player that interacts with the other players (the generator and the discriminator) in GAN. We also study the application of our model to weakly supervised learning (learning from complementary labeled data), semi-supervised learning (learning from unlabeled data) and transfer learning (domain adaptation).
“Inaccurate Labels in Weakly Supervised Deep Learning: Automatic Identification and Correction and Their Impact on Classification Performance”
Friday, March 6 | 12:30 – 1 p.m.
Degan Hao, Graduate Student, ISP
Abstract: In data-driven deep learning-based modeling, data quality may substantially influence classification performance. Correct data labeling for deep learning modeling is critical. In weakly-supervised learning, a challenge lies in dealing with potentially inaccurate or mislabeled training data. In this paper, we proposed an automated methodological framework to identify mislabeled data using two metric functions, namely, Cross-entropy Loss, which indicates divergence between a prediction and ground truth, and Influence function, which reflects the dependence of a model on data. After correcting the identified mislabels, we measured their impact on the classification performance. We also compared the mislabeling effects in three experiments on two different real-world clinical questions. A total of 10,500 images were studied in the contexts of clinical breast density category classification and breast cancer malignancy diagnosis. We used intentionally flipped labels as mislabels to evaluate the proposed method at varying proportions of mislabeled data included in model training. We also compared the effects of our method to two published schemes for breast density category classification. Experimental results show that when the dataset contains 10% mislabeled data, our method can automatically identify up to 98% of these mislabeled data by examining the top 30% of the full dataset. Furthermore, we show that correcting the identified mislabels leads to an improvement in the classification performance. Our method provides a feasible solution for weakly-supervised deep learning modeling in dealing with inaccurate labels.
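The loss-based half of the idea described above can be sketched as follows: rank training samples by per-sample cross-entropy against their given labels, then inspect only the highest-loss fraction. This is a simplified illustration under assumptions, not the authors' implementation; the influence-function metric is not covered, and the function names, the toy probabilities, and the 25% cut-off are all hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Per-sample cross-entropy: negative log-probability assigned to the given label."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)

def flag_suspects(pred_probs, labels, fraction):
    """Indices of the top `fraction` of samples ranked by loss, highest loss first."""
    losses = [cross_entropy(p, y) for p, y in zip(pred_probs, labels)]
    order = sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)
    k = max(1, math.ceil(fraction * len(labels)))
    return order[:k]

def recall_of_flipped(flagged, flipped):
    """Share of known (intentionally flipped) labels found among the flagged indices."""
    flipped = set(flipped)
    return len(flipped.intersection(flagged)) / len(flipped)

if __name__ == "__main__":
    # Hypothetical model outputs for four samples over two classes.
    pred_probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.6, 0.4]]
    labels = [0, 1, 1, 0]          # sample 2 carries a deliberately flipped label
    flagged = flag_suspects(pred_probs, labels, fraction=0.25)
    print(flagged, recall_of_flipped(flagged, flipped=[2]))
```

Evaluating with intentionally flipped labels, as the abstract describes, is what makes a recall figure like this computable: the set of true mislabels is known in advance.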
“An Instance-Specific Algorithm for Learning the Structure of Causal Bayesian Networks Containing Latent Variables”
Friday, February 21 | 1 – 1:30 p.m.
Fattaneh Jabbari, Graduate Student, ISP
Abstract: Almost all algorithms for learning causal Bayesian networks (CBNs) from observational data assume that the instances in the population share the same causal structure. While accurately learning such population-wide CBN models is useful, learning CBNs that are specific to each instance is often important as well. For example, a breast cancer tumor in a patient (instance) is often a composite of causal mechanisms, where each of these individual causal mechanisms may appear relatively frequently in breast cancer tumors of other patients, but the particular combination of mechanisms is unique to the current tumor. Therefore, it is critical to discover the specific set of causal mechanisms that are operating in each patient to understand and treat that particular patient effectively. We introduce a novel instance-specific causal structure learning algorithm that uses partial ancestral graphs (PAGs) to model latent confounders. Simulations indicate that the proposed instance-specific method can improve structure-discovery performance compared to an existing PAG-learning method called GFCI, which is not instance-specific. We also report results that support the existence of instance-specific causal relationships in real-world datasets.
“Studying Cardiovascular Risk More Precisely Via Markers of Subclinical Atherosclerosis in the Heart SCORE Cohort”
Friday, February 21 | 12:30 – 1 p.m.
Mahbaneh Eshaghzadeh Torbati, Graduate Student, ISP
Abstract: Current practice guidelines for the primary prevention of cardiovascular disease (CVD) encourage risk-reducing strategies that are based on an individual’s global cardiovascular risk, as estimated by tools such as the Framingham Risk Score (FRS). However, the currently used tools are imprecise, as they misestimate the risk of future CVD events. To deal with this problem, markers of subclinical atherosclerosis have been recommended as complements to the risk estimators. As these markers may behave differently in subgroups of race and risk, we hypothesized that studying the markers within these subgroups may give more precise insight into how much they improve on FRS in predicting cardiovascular events. Our results from studying coronary artery calcification, carotid intima media thickness, and ankle brachial index in the Heart Strategies Concentrating on Risk Evaluation (Heart SCORE) cohort show that subgrouping reveals otherwise hidden prediction improvements from these markers in subgroups of race and CVD risk.
Dissertation Defense: Johnathan Young
Thursday, February 20 | 10 – 11 a.m.
“Deep Learning for Causal Structure Learning Applied to Cancer Pathway Discovery”
“Overcoming Challenges in Neuroimaging Applications with Computer Vision”
Friday, February 7 | 1 – 1:30 p.m.
Seong Jae Hwang, Assistant Professor, SCI
Abstract: Many problems in neuroimaging aim to better understand the underlying mechanisms of various neurodegenerative diseases such as Alzheimer’s disease. For instance, to better understand the structural integrity of a brain suffering from a neurodegenerative disease, we analyze the brain network to derive its association with disease-specific risk factors. Such neuroimaging tasks, if accomplished using appropriate methods, may directly or indirectly aid subsequent clinical applications. I will show how some statistical and machine learning models can help us overcome various data- and domain-specific challenges. Several examples will be presented, including some of our recent preliminary work on brain image segmentation.
“Teaching Machines to Read Human Rights Reports and Measure Violations in Higher Resolution: Introducing PULSAR 2.0”
Friday, February 7 | 12:30 – 1 p.m.
Mike Colaresi, William S. Dietrich II Professor of Political Science, Department of Political Science
Abstract: The accelerating availability of information from human rights monitors such as Amnesty International, Human Rights Watch, and the US State Department has led to new opportunities to measure repression and human rights protections in higher resolution. However, to date, most approaches that attempt to automatically structure textual reports use simple, lower-dimensional observations such as word counts, which ignore syntax and word order. While these representations are useful for some applications, they limit the inferences scholars and policy-makers can extract from human rights reports. We present PULSAR 2.0, a new system that takes syntax and word order into account. PULSAR uniquely allows researchers to extract both the judgements and the aspects/rights being judged from texts at scale. We illustrate that this more detailed information is useful both for improving predictions of physical integrity rights and women's political rights and for generating machine learning models that are more interpretable than conventional specifications. This latter benefit holds the promise of coherently connecting qualitative and quantitative analyses of human rights texts.
“Intelligent Systems: Past, Present and Future!”
Friday, January 24 | 12:30 – 1:30 p.m.
Ganesh Mani, Adjunct Faculty, Carnegie Mellon University
Abstract: This informal talk will discuss the evolution, over the last few decades, of systems that purport to exhibit behavior that many humans would consider useful, clever or intelligent. Starting with a framework that considers the world’s building blocks to be atoms, bits and cells, I will discuss the implications of innovation around these building blocks, fueled by AI. I will articulate metaphors for the use of AI systems and enumerate a few future Grand Challenges, which may serve as a compass for students and faculty with respect to research directions. A perspective on different industry use cases will also be provided, potentially helping students calibrate and home in on career opportunities congruent with their strengths, their interests and where the field is headed. I will also touch upon the new AI being Augmented Intelligence via the HuMachine framework and its broad implications for technology and society.
Bio: Dr. Ganesh Mani is on the adjunct faculty of Carnegie Mellon University and is considered a thought leader in imbuing AI into many areas, spanning Health, Wealth and Wisdom. He has worn multiple hats, including entrepreneur, investment manager and researcher. His articles have appeared in many forums, ranging from Neural Computation to Bloomberg Opinion. He has a PhD in AI / Computer Science and an MBA in Finance from the University of Wisconsin-Madison, in addition to an undergraduate degree in Computer Science from the Indian Institute of Technology, Bombay. He has consulted for and serves on the advisory boards of many leading institutions.
Annual ISP Celebration
Friday, January 17 | 12:30 – 1:30 p.m.
Various ISP Faculty Pitch Research Projects to Students
Friday, 10 | 12:30 – 1:30 p.m.