Past Events - 2018

“Automated Machine-Learning Based Differential Diagnosis of Heart Diseases from Emergency Department Visits”

Friday, December 7 | 1 – 1:30 p.m.

Diyang Xue, Graduate Student, Intelligent Systems Program

Abstract: Generating a differential diagnosis is a subjective process that relies primarily on a physician’s experience and could benefit from a machine-learning-based decision support tool. However, no differential diagnosis models for heart diseases that can be built automatically from electronic health records are currently available for emergency department (ED) settings. To develop and evaluate a decision-tree-based differential diagnosis model built automatically from structured and unstructured ED data, we merged 205 heart-disease-related ICD-9-CM codes into six categories of cardiac disease based on clinical similarity. Subjects were emergency department visits to 15 hospitals in the University of Pittsburgh Medical Center (UPMC) Health System whose primary diagnosis included one of the 205 ICD-9-CM codes for heart disease. We used a decision tree algorithm to learn rules for differentiating these six categories from unstructured electronic health record data parsed with natural language processing. The results show that a decision tree model can be built automatically from structured and unstructured ED data and can perform differential diagnosis across the six categories of cardiac disease. We also explored other algorithms to improve on the performance of the classic decision tree algorithm.
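
For readers unfamiliar with the approach, the toy sketch below shows what a decision-tree classifier over text-derived features looks like in scikit-learn; the notes, labels, and TF-IDF features are invented stand-ins, not the UPMC data or the NLP parser used in the study.

```python
# Minimal sketch: a decision-tree differential-diagnosis classifier trained on
# text features from ED notes. Data, labels, and preprocessing are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical corpus: free-text ED notes paired with one of six cardiac categories.
notes = [
    "chest pain radiating to left arm, elevated troponin",
    "palpitations, irregular rhythm on ECG",
    "dyspnea on exertion, lower extremity edema",
    "syncope, bradycardia noted on monitor",
    "sharp chest pain worse when lying flat, friction rub",
    "fatigue, murmur heard on auscultation",
] * 20  # repeated only to give the toy example enough rows
labels = ["ischemic", "arrhythmia", "heart_failure",
          "conduction", "pericardial", "valvular"] * 20

X_train, X_test, y_train, y_test = train_test_split(
    notes, labels, test_size=0.25, random_state=0, stratify=labels)

# TF-IDF stands in for the NLP-derived features mentioned in the abstract.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    DecisionTreeClassifier(max_depth=5, random_state=0),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```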

“Semantic Pleonasm Detection”

Friday, December 7 | 12:30 – 1 p.m.

Omid Kashefi, Graduate Student, Intelligent Systems Program

Abstract: Pleonasms are words that are redundant. To aid the development of systems that detect pleonasms in text, we introduce an annotated corpus of semantic pleonasms. We validate the integrity of the corpus with inter-annotator agreement analyses. We also compare it against alternative resources in terms of their effects on several automatic redundancy detection methods.

“Transfer Learning for Bayesian Case Detection Systems”

Friday, November 30 | 1 – 1:30 p.m.

Ye Ye, Graduate Student, Intelligent Systems Program

Abstract: In this age of big biomedical data, massive amounts of data have been produced worldwide. If we could effectively share all the information accumulated from existing resources, we might develop a deeper understanding of biomedical problems and find better solutions.

Compared to traditional machine learning techniques, transfer learning techniques explicitly account for differences between the sharing parties in order to provide a smooth transfer of knowledge from a source party to a target party. Most well-established techniques focus on sharing data, while recent techniques have begun to explore the possibility of sharing models. Model-sharing techniques are especially appealing in the biomedical domain because they pose far fewer privacy risks. Unfortunately, most model-transferring techniques cannot handle heterogeneous scenarios, in which feature spaces and marginal and conditional distributions differ among the sharing parties; such scenarios are common in biomedical data.

My dissertation developed an innovative transfer learning framework for sharing data or models under heterogeneous scenarios. Heuristic scores were designed to integrate source information with target data while allowing the injection of target-specific features for better localization. Both synthetic and real-world datasets were used to test two hypotheses: 1) transfer learning is better than using a model constructed from target data only; 2) transfer learning is better than direct adoption of the source model. A comprehensive analysis was conducted to investigate the conditions under which these two hypotheses hold and, more generally, the factors that affect the effectiveness of transfer learning, providing empirical guidance about when and what to share.

My research contributes to the fields of machine learning, medical informatics, and disease surveillance. It enables knowledge sharing under heterogeneous scenarios and provides methodologies for diagnosing transfer learning performance across tasks with varying degrees of feature-space overlap, distributional similarity, and sample size. The model-transferring algorithm can be viewed as a new Bayesian network learning algorithm with a flexible representation of prior knowledge that allows partial feature coverage. To the best of my knowledge, this is the first exploration of model transfer for biomedical data in heterogeneous scenarios. My work shows the potential for rapid development of a case detection system for an emergent, unknown disease and demonstrates the system's transferability and adaptability.

“Identifying Incidental Findings from Radiology Reports of Trauma Patients: An Evaluation of Automated Feature Representation Methods”

Friday, November 30 | 12:30 – 1 p.m.

Gaurav Trivedi, Graduate Student, Intelligent Systems Program

Abstract: Radiologic imaging of trauma patients often uncovers findings that are unrelated to the trauma. Identifying these incidental findings in clinical notes is necessary for proper follow-up. We developed and evaluated an automated pipeline to identify incidental findings in radiology reports of trauma patients at the sentence and section levels using a variety of feature representations. We annotated a corpus of over 4,000 reports and investigated several feature representations, including traditional word and concept (such as SNOMED-CT) representations as well as word and concept embeddings. We evaluated these representations using traditional machine learning as well as CNN-based deep learning methods. Our results show that the best performance was achieved by using CNNs with pre-trained embeddings at both the sentence and section levels. This provides evidence that such a pipeline is likely to be clinically useful for identifying incidental findings in radiology reports of trauma patients.
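
As a rough, assumption-laden illustration of the best-performing configuration named above (a CNN over pre-trained embeddings), the sketch below builds such a sentence classifier in PyTorch; the vocabulary, embedding matrix, and labels are random placeholders rather than the clinical embeddings or annotated corpus from the study.

```python
# Minimal sketch of a CNN sentence classifier over pre-trained word embeddings.
import torch
import torch.nn as nn

vocab_size, embed_dim, max_len, n_classes = 5000, 100, 60, 2
pretrained = torch.randn(vocab_size, embed_dim)  # stand-in for real pre-trained vectors

class SentenceCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.conv = nn.Conv1d(embed_dim, 128, kernel_size=5)
        self.fc = nn.Linear(128, n_classes)

    def forward(self, x):                    # x: (batch, max_len) word ids
        e = self.embed(x).transpose(1, 2)    # -> (batch, embed_dim, max_len)
        h = torch.relu(self.conv(e))         # -> (batch, 128, max_len - 4)
        h = h.max(dim=2).values              # global max pooling over positions
        return self.fc(h)

model = SentenceCNN()
ids = torch.randint(1, vocab_size, (32, max_len))   # toy batch of encoded sentences
labels = torch.randint(0, n_classes, (32,))          # toy incidental-finding labels
loss = nn.CrossEntropyLoss()(model(ids), labels)
loss.backward()                                      # one illustrative training step
```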

“Combining Attention-Based Approach and Textual Context for Evidence Sequence Prediction”

Friday, November 16 | 1 – 1:30 p.m.

Mengdi Wang, Graduate Student, Intelligent Systems Program

Abstract: With the rapid expansion of the Web, large volumes of event-related data are available in numerous fields such as e-commerce, social activities, and electronic health records. These events are correlated, so the patterns of past events may help to predict future events. Understanding those correlations is therefore crucial for an important task called “event sequence prediction”: given the observed event sequence in the past, the goal is to predict what kind of events will occur at what time in the future. One classical mathematical tool for modeling sequences is the point process, but it has the drawback of strong assumptions about the generative process that may not reflect reality. Recent studies have explored Recurrent Neural Networks (RNNs) to learn a more general representation of the underlying dynamics, but they suffer from long-sequence modeling and scalability issues. In addition, existing methods ignore the textual context (such as a tweet, a blog post, or a Facebook update), which has the potential to help categorize the events. In this paper, we propose to jointly model the event sequences (i.e., event type and time stamp) and the textual context using an attention-based neural network, the Transformer. The key idea of our combined approach is to automatically learn the underlying dependencies among events from the event sequence history. Experiments on large-scale synthetic and real-world datasets demonstrate that our model achieves better performance than both classical and RNN-based models.
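
To make the general setup concrete, here is a toy sketch (not the authors' model) of a Transformer encoder over event sequences that embeds each event's type and time gap and predicts the next event type; the textual-context encoder described in the abstract is omitted, and all dimensions and data are placeholders.

```python
# Toy attention-based model for event sequence prediction with a causal mask.
import torch
import torch.nn as nn

n_types, d_model, seq_len, batch = 10, 64, 20, 8

type_embed = nn.Embedding(n_types, d_model)
time_proj = nn.Linear(1, d_model)                 # embeds inter-event time gaps
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2)
head = nn.Linear(d_model, n_types)                # next-event-type classifier

types = torch.randint(0, n_types, (batch, seq_len))   # toy event types
gaps = torch.rand(batch, seq_len, 1)                  # toy inter-event time gaps
x = type_embed(types) + time_proj(gaps)

# Causal mask so position t only attends to events up to t.
mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
h = encoder(x, mask=mask)
logits = head(h[:, :-1, :])                           # predict event t+1 from the prefix
loss = nn.CrossEntropyLoss()(logits.reshape(-1, n_types), types[:, 1:].reshape(-1))
loss.backward()
```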

“Using Neural Networks to Inform Causal Structure When Inputs Cause Outputs”

Friday, November 16 | 12:30 – 1 p.m.

Jonathan Young, Graduate Student, Intelligent Systems Program

Abstract: In this work, we explore the extent to which deep neural networks can be used to find causal structure when the variables in a dataset can be split into two sets and the causal direction between these two sets is known. More concretely, we used data where the inputs were known to cause the outputs (i.e., the causal direction between inputs and outputs is known) but the latent structure is unknown. We present results on multiple simulated datasets and plan to use these algorithms to explore biological data, specifically cancer data, in the future. Cancer data will be used as a motivating example throughout the talk.

Dissertation Defense: Ye Ye

Monday, November 12 | 8 – 10 a.m.

“Transfer Learning for Bayesian Case Detection Systems”

Committee Members:

  • Dr. Fuchiang (Rich) Tsui, Associate Professor, Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia (Chair and Advisor) 
  • Dr. Michael M. Wagner, Professor, Biomedical Informatics and Intelligent Systems, University of Pittsburgh (Cochair) 
  • Dr. Gregory F. Cooper, Professor, Biomedical Informatics and Intelligent Systems, University of Pittsburgh 
  • Dr. Jeremy Weiss, Assistant Professor, Health Informatics at Heinz College, CMU

Abstract: In this age of big biomedical data, massive amounts of data have been produced worldwide. If we could effectively share all the information accumulated from existing resources, we might develop a deeper understanding of biomedical problems and find better solutions.

Compared to traditional machine learning techniques, transfer learning techniques explicitly account for differences between the sharing parties in order to provide a smooth transfer of knowledge from a source party to a target party. Most well-established techniques focus on sharing data, while recent techniques have begun to explore the possibility of sharing models. Model-sharing techniques are especially appealing in the biomedical domain because they pose far fewer privacy risks. Unfortunately, most model-transferring techniques cannot handle heterogeneous scenarios, in which feature spaces and marginal and conditional distributions differ among the sharing parties; such scenarios are common in biomedical data.

My dissertation developed an innovative transfer learning framework for sharing data or models under heterogeneous scenarios. Heuristic scores were designed to integrate source information with target data while allowing the injection of target-specific features for better localization. Both synthetic and real-world datasets were used to test two hypotheses: 1) transfer learning is better than using a model constructed from target data only; 2) transfer learning is better than direct adoption of the source model. A comprehensive analysis was conducted to investigate the conditions under which these two hypotheses hold and, more generally, the factors that affect the effectiveness of transfer learning, providing empirical guidance about when and what to share.

My research contributes to the fields of machine learning, medical informatics, and disease surveillance. It enables knowledge sharing under heterogeneous scenarios and provides methodologies for diagnosing transfer learning performance across tasks with varying degrees of feature-space overlap, distributional similarity, and sample size. The model-transferring algorithm can be viewed as a new Bayesian network learning algorithm with a flexible representation of prior knowledge that allows partial feature coverage. To the best of my knowledge, this is the first exploration of model transfer for biomedical data in heterogeneous scenarios. My work shows the potential for rapid development of a case detection system for an emergent, unknown disease and demonstrates the system's transferability and adaptability.

“AI Meets Cytology”

Friday, November 2 | 12:30 – 1:30 p.m.

Keith Callenberg and Adit Sanghvi, Director of Machine Learning, UPMC Enterprises

Abstract: Bladder cancer is the sixth most common cancer in the United States with more than 80,000 new cases in 2018. Early detection by urine cytology greatly improves intervention success, but review of cytological slides is challenging: pathologists have to search for 10 malignant cells in glass slide specimens that can range from 100,000 to more than 300,000 cells. Many efforts are underway to optimize this procedure with digital whole slide imaging workflows, but limited attention has so far been given to automating cytology with artificial intelligence. We built a multi-level model based on clinically-relevant cellular features, strategically combining traditional machine learning and deep learning. The model demonstrates high accuracy at identifying malignant cells, as well as high accuracy at predicting the whole slide cytology diagnosis. In this talk we will describe our approach to model development, discuss the main challenges we faced, and review results on a held-out validation set.

“Deep Learning for Computational Drug Discovery”

Friday, October 19 | 12:30 – 1:30 p.m.

David Koes, Assistant Professor, School of Medicine

“Instance-Specific Bayesian Network Structure Learning”

Friday, October 5 | 1 – 1:30 p.m.

Fattaneh Jabbari, Graduate Student, Intelligent Systems Program

Abstract: Bayesian network (BN) structure learning algorithms are almost always designed to recover the structure that models the relationships that are shared by the instances in a population. While accurately learning such population-wide Bayesian networks is useful, learning Bayesian networks that are specific to each instance is often important as well. For example, to understand and treat a patient (instance), it is critical to understand the specific causal mechanisms that are operating in that particular patient. We introduce an instance-specific BN structure learning method that searches the space of Bayesian networks to build a model that is specific to an instance by guiding the search based on attributes of the given instance (e.g., patient symptoms, signs, lab results, and genotype). The structure discovery performance of the proposed method is compared to an existing state-of-the-art BN structure learning method, namely an implementation of the Greedy Equivalence Search algorithm called FGES, using both simulated and real data.
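
To give non-specialists a feel for score-based structure search, the toy sketch below scores candidate parent sets of a single target variable with BIC on synthetic discrete data; it illustrates only the population-wide ingredient, not the instance-specific search or the FGES implementation mentioned above, and all variables and data are made up.

```python
# Toy score-based selection of a parent set for one target variable, using BIC.
import numpy as np
import pandas as pd
from itertools import combinations

rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "symptom": rng.integers(0, 2, n),
    "lab": rng.integers(0, 2, n),
    "genotype": rng.integers(0, 2, n),
})
# In this toy generator the target depends (noisily) on "symptom" only.
data["disease"] = (data["symptom"].to_numpy() ^ (rng.random(n) < 0.1)).astype(int)

def bic(target, parents, df):
    """Multinomial log-likelihood of target given its parents, minus a BIC penalty."""
    if parents:
        counts = df.groupby(list(parents))[target].value_counts().unstack(fill_value=0)
    else:
        counts = df[target].value_counts().to_frame().T
    probs = (counts + 1e-9).div(counts.sum(axis=1) + 2e-9, axis=0)
    loglik = (counts * np.log(probs)).to_numpy().sum()
    n_params = counts.shape[0] * (counts.shape[1] - 1)
    return loglik - 0.5 * n_params * np.log(len(df))

candidates = ["symptom", "lab", "genotype"]
scores = {ps: bic("disease", ps, data)
          for k in range(len(candidates) + 1)
          for ps in combinations(candidates, k)}
print(max(scores, key=scores.get))   # with this generator, ('symptom',) should win
```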

“Dialogue State Tracking for Conversational Image Editing”

Friday, October 5 | 12:20 – 1 p.m.

Zahra Rahimi, Graduate Student, Intelligent Systems Program

Abstract: In this talk, I’ll discuss “conversational image editing,” a novel real-world application domain combining dialogue, visual information, and the use of computer vision. I will focus specifically on designing the main components of the system, including Natural Language Understanding and State Tracking. State tracking refers to maintaining distributions over the user's goal at every step of the dialogue as they make edits to an image. I will further discuss the findings and challenges specific to this application domain.

“Chat and Chunk Phases in Conversation and What They Tell us About How to Talk”

Friday, September 21 | 12:30 – 1:30 p.m.

Emer Gilmartin, Trinity College Dublin

Abstract: Much human talk is casual and multiparty, forming a soundtrack to social bonding and mutual co-presence rather than strictly exchanging information in order to complete a well-defined practical task. While there has been much work on the technological and task-based applications of spoken interaction, between humans and more recently in spoken dialogue technology, there has been less focus on casual or social talk, which is almost universally present whenever people congregate. HCI applications capable of participating as a speaker or listener in such talk would be useful for companionship, educational, and social applications. However, such applications require dialogue structure beyond the adjacency-pair sequences that may be sufficient to model a well-defined task. While there is a body of theory on multiparty casual talk, there is a lack of work quantifying such talk. Longer casual conversations have been observed to follow a sequence of chat and chunk phases, in which short interactive stretches of talk (chats) are interleaved with longer turns (chunks) in which a single participant dominates the conversation, often telling a story or giving an extended opinion. My PhD work concentrated on these phases and how they manifest in casual talk.

In this talk I will provide an overview of the structure of casual conversation, and of chat and chunk phases in particular, reporting corpus-based studies of how these elements occur in multiparty casual talk. It appears that dialogue features such as timing and the distribution of laughter and overlap differ between chat and chunk phases, which could have major implications for interaction design. I will discuss how such knowledge might be used in the design of casual interfaces, and describe my current Fulbright project investigating how chunks in particular may be used in spoken language modules for an online tutor for migrants living, working, or studying in a new country.

Bio: Emer Gilmartin is a researcher at the ADAPT Centre, Trinity College Dublin, Ireland. She works on spoken dialogue and spoken dialogue technology and is interested in the use of dialogue technology for language learning. She holds degrees in Mechanical Engineering (B.E.), Speech and Language Technology (Dip. PostGrad), and Theoretical Linguistics (M.Phil), and her PhD work was on the structure of multiparty casual conversation. Prior to her work in academia, she worked in English language teaching as a teacher and teacher trainer, and as Executive Manager of Ireland's programme for Language and Integration for adult and child migrants. She is currently a Fulbright TechImpact Scholar at the Language Technologies Institute at CMU, where she is working on the creation and testing of web-based language learning resources for migrants and refugees.

ISP Picnic

Friday, September 14 | 3 – 6 p.m.

“Accounting for Noise in Microfoundations of Information Aggregation”

Friday, September 7 | 12:30 – 1:30 p.m.

Sera Linardi, Associate Professor, Graduate School of Public and International Affairs

Abstract: This paper shows that the basic unit of information aggregation described by the Geanakoplos and Polemarchakis (1982) posterior revision process does not always produce public statistics that are closer to the full-information posterior than the common prior. I study this process of back-and-forth communication between two individuals with private signals by introducing white noise into payoff computations, defining the evolution of common knowledge, and providing conjectures on the resulting public statistics. I then develop a computational method to rank information structures ex ante by their tolerance to noise. Subjects’ behavior in a laboratory experiment is consistent with the model’s prediction: although the posterior revision process does move reports toward each other and toward the full-information posterior, noise persists and aggregation is incomplete. As predicted, aggregation attempts in the two least noise-tolerant information structures result in public statistics that perform worse than the common prior.

Dissertation Defense: Roya Hosseini

Tuesday, July 24 | 10 a.m. – Noon

“Program Construction Examples in Computer Science Education: From Static Text to Adaptive and Engaging Learning Technology”

Committee

  • Peter Brusilovsky (Chair) (ISP & SCI, PITT)
  • Christian D. Schunn (ISP & Psychology, PITT)
  • Diane Litman (ISP & SCI, PITT)
  • Vincent Aleven (HCII, CMU)

Abstract: My dissertation is situated in the field of computer science education research, specifically the learning and teaching of programming. This is a critical area of study: learning to program is difficult, and the need for programming knowledge and skills is growing now more than ever. This research is particularly focused on how to support a student's acquisition of program construction skills through worked examples, one of the best practices for acquiring cognitive skills in STEM areas.

While learning from examples is superior to problem-solving for novices, it is not recommended for intermediate learners with sufficient knowledge, who require more attention to problem-solving. Thus, it is critical for example-based learning environments to adapt the amount and type of assistance given to the student's needs. This important matter has only recently received attention in a few select STEM areas and is still unexplored in the programming domain. The learning technologies used in programming courses mostly focus on supporting student problem-solving activities and, with few exceptions, examples are mostly absent or presented in a static, non-engaging form.

To fill existing gaps in the area of learning from programming examples, my dissertation explores a new genre of worked examples that are both adaptive and engaging, to support students in the acquisition of program construction skills. My research examines how to personalize the generation of examples and how to determine the best sequence of examples and problems, based on the student's evolving level of knowledge. It also includes a series of studies created to assess the effectiveness of the proposed technologies and, more broadly, to investigate the role of worked examples in the process of acquiring programming skills.

Results of our studies show the positive impact that examples have on student engagement, problem-solving, and learning. Adaptive technologies were also found to be beneficial: The adaptive generation of examples had a positive impact on learning and problem-solving performance. The adaptive sequencing of examples and problems engaged students more persistently in activities, resulting in some positive effects on learning.

“Multi-Variate Phenotype Association Test on Chronic Obstructive Pulmonary Disease Using Image Features”

Friday, April 20 | 1 – 1:30 p.m.

Javad Rahimikollu, Graduate Student, Intelligent Systems Program

Abstract: Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of death in the US and worldwide. Although cigarette smoking is one of the major environmental risk factors, not all smokers develop debilitating COPD. Furthermore, family-based studies show the importance of genetic factors in increasing the risk of the disease. Genome-Wide Association Studies (GWAS) of COPD have identified several genetic loci associated with the disease. Classically, univariate regression has been used to identify a set of statistically significant genetic variants with respect to a phenotype of interest. Traditionally, GWAS is performed on a one-dimensional phenotype such as respiratory measurements. However, COPD is highly heterogeneous, and a single measurement cannot characterize this heterogeneity. Computerized tomography (CT) techniques are increasingly used for diagnosing the disease. CT captures the anatomical variability induced by the disease; hence it can be viewed as a rich, high-dimensional phenotype. In this talk, I will review the classical methods for identifying risk factors using univariate respiratory measurements. I will discuss why using a high-dimensional phenotype can be a challenging statistical task and will discuss potential solutions to tackle these challenges.

“Evaluating the Impacts of Gamification in Using Recommender Systems”

Friday, April 20 | 12:30 – 1 p.m.

Saba Dadsetan, Graduate Student, Intelligent Systems Program

Abstract: Mastery Grid is a successful implementation of an open learner model that allows students to see their progress, relative to their past attempts, on the problems (questions and examples) in each topic. The task recommender is one of the newest features of Mastery Grid; it suggests the best next activity based on the student's past records. The algorithm behind this recommender system cannot be evaluated properly because some students ignore the recommended activities. In this project, to encourage students to click on suggested activities more often, we introduce gamification techniques such as achievement badges. These badges are awarded to students according to how many points they collect in our point system. Finally, by comparing data collected from each student's profile at the end of one semester with data from the same class in previous semesters without this tool, we will assess how effective this gamification tool is and how much more interactive it makes Mastery Grid.

“Detecting Agent Mentions in U.S. Court Decisions”

Friday, April 6 | 1 – 1:30 p.m.

Jaromir Savelka, Graduate Student, Intelligent Systems Program

Abstract: Case law analysis is a significant component of research on almost any legal issue, and understanding which agents are involved and mentioned in a decision is an integral part of that analysis. In this paper we present a first experiment in automatically detecting mentions of different agents in court decisions. We defined a lightweight and easily extensible hierarchy of agents that play important roles in the decisions. We used the types from the hierarchy to annotate a corpus of US court decisions. The resulting data set enabled us to test the hypothesis that mentions of agents in the decisions can be detected automatically. Conditional random field models trained on the data set were shown to be very promising in this respect.
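
For illustration only, the sketch below sets up a token-level conditional random field with sklearn-crfsuite in the usual BIO style; the example sentences, agent labels, and feature template are invented and do not reflect the paper's actual hierarchy or annotation scheme.

```python
# Minimal sketch: a CRF tagging tokens of court decisions with agent types.
import sklearn_crfsuite

def token_features(tokens, i):
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "is_upper": tok.isupper(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

# Toy annotated sentences with invented agent labels in BIO notation.
sentences = [
    ["The", "plaintiff", "appealed", "to", "the", "Ninth", "Circuit", "."],
    ["The", "defendant", "moved", "to", "dismiss", "."],
]
labels = [
    ["O", "B-PARTY", "O", "O", "O", "B-COURT", "I-COURT", "O"],
    ["O", "B-PARTY", "O", "O", "O", "O"],
]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))
```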

“Patient-Specific Explanations for Predictions of Clinical Outcomes”

Friday, April 6 | 12:30 – 1 p.m.

Mohammadamin Tajgardoon, Graduate Student, Intelligent Systems Program

Abstract: Machine learning models are increasingly being developed to predict clinical outcomes such as mortality, morbidity, and adverse events. Sophisticated predictive models with excellent performance are reported in the literature at an increasing pace. In most cases these models are regarded as black boxes that produce a prediction for an outcome. However, for such models to be practically useful in clinical care, it is critical to provide clinicians with simple and reliable patient-specific explanations for each prediction. We developed machine learning models to predict severe complications in patients with community-acquired pneumonia (CAP) and evaluated patient-specific explanations with physicians. Our method uses LIME, which generates a patient-specific linear model that provides a feature-relevance ranking. Physician evaluators showed good agreement on the patient-specific explanations generated to augment predictions of clinical outcomes. Such explanations can be useful in interpreting predictions of clinical outcomes.
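
As a minimal sketch of how LIME produces such a case-level ranking, the example below explains a single prediction of a tabular classifier; the features, synthetic data, and random-forest model are placeholders standing in for the study's CAP models.

```python
# Rough sketch: a patient-specific explanation from LIME for a tabular risk model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
feature_names = ["age", "respiratory_rate", "systolic_bp", "blood_urea_nitrogen"]
X = rng.normal(size=(500, len(feature_names)))
y = (X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=feature_names,
    class_names=["no complication", "severe complication"],
    mode="classification")

# Explain the prediction for one (synthetic) patient: a local linear model
# yields a relevance ranking of features for this specific case.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
print(explanation.as_list())
```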

“Transparency and Explanation in Deep Reinforcement Learning Neural Networks”

Friday, March 23 | 1 – 1:30 p.m.

Michael Lewis, Professor, Department of Informatics and Networked Systems

Abstract: For autonomous AI systems to be accepted and trusted, users should be able to understand the reasoning process of the system; that is, the system should be transparent. System transparency enables humans to form coherent explanations of the system’s decisions and actions. Transparency is important not only for user trust, but also for software debugging and certification. In recent years, deep neural networks have made great advances in multiple application areas. However, deep neural networks are opaque, and while mechanisms for making their behavior more transparent have been proposed and demonstrated, the effectiveness of these methods for tasks outside of classification has not been verified. Deep Reinforcement Learning Networks (DRLNs) have been extremely successful in accurately learning action control in image-input domains, such as Atari games, where successful play can be learned from pixels alone.

Our study extends these methods by:

(a) incorporating explicit object recognition processing into deep reinforcement learning models and

(b) introducing “object saliency maps” to provide visualization of the internal states of DRLNs, thus enabling the formation of “explanations.”

We present computational results and human experiments in matching saliency maps to game play and predicting actions in the immediate future.

“Enabling Translational Medicine Using Bayesian Rule Learning with Informative Structure Priors”

Friday, March 23 | 12:30 – 1 p.m.

Jeya Balasubramanian, Graduate Student, Intelligent Systems Program

Abstract: Translational medicine involves improving clinical practice by harnessing knowledge from the basic sciences. An important task in clinical practice is developing predictive models of clinical outcomes from biomedical datasets. Typical biomedical datasets suffer from the problem of high dimensionality, where a large number of candidate variables can explain a few observations. Data mining algorithms applied to such datasets can easily get stuck in local optima or infer models with spurious variables that are predictive of the outcome by chance. As a result, these algorithms may learn suboptimal or meaningless models. In biomedicine, in addition to the dataset, we often have related domain knowledge from the basic sciences that can assist in data mining. This knowledge can come from domain literature, an expert, curated knowledge bases (such as ontologies), or datasets from other related studies. Developing intelligent systems that integrate this domain knowledge into the model-learning process is vital for more informed model learning. In addition to meaningful models with good predictive performance, biomedicine benefits from learning comprehensible (human-readable) models that can subsequently be verified by a domain expert. Bayesian Rule Learning (BRL) is a data mining method that has been shown to be successful in learning predictive rule models from high-dimensional biomedical datasets; these models are comprehensible and have good predictive performance. BRL learns predictive rules from constrained Bayesian networks (BNs) inferred from data by searching the BN model space. In this project, we develop an intelligent system called iBRL that uses BRL’s Bayesian framework to incorporate prior domain knowledge through informative structure priors. We demonstrate iBRL on a real-world dataset to identify differentially expressed genes in lung cancer cells, and then evaluate the impact of incorporating prior domain knowledge about the dataset.

ISP 30 Year Celebration

Thursday, March 15 – Friday, March 16

“Automated Face Analysis and Synthesis for Affective Computing”

Friday, March 2 | 12:30 – 1:30 p.m.

Jeffrey Cohn, Professor, Department of Psychology

Abstract: Facial expression communicates emotion, intention, and physical state, and regulates interpersonal behavior.  Manual measurement is labor intensive and difficult to scale. Convergence of computer vision, machine learning, and behavioral science has made possible automated face analysis (AFA), a central pillar of affective computing.  I will present 1) human-observer based approaches to measurement that inform AFA; 2) current state of the art in detecting occurrence and intensity of facial actions; 3) applications in face-to-face interaction, emotion, and psychopathology; 4) expression transfer, and 5) new directions in body pose detection and analysis.

Bio:  Jeffrey Cohn is Professor of Psychology and Psychiatry at the University of Pittsburgh and Adjunct Professor at the Robotics Institute, Carnegie Mellon University. He leads interdisciplinary and inter-institutional efforts to develop advanced methods of automatic analysis and synthesis of face and body movement and applies them to research in human emotion, communication, psychopathology, and biomedicine. His research program is supported by grants from the U.S. National Institutes of Health and U.S. National Science Foundation among other sponsors. He Chairs the Steering Committee of the IEEE International Conference on Automatic Face and Gesture Recognition (FG) and has served as General Chair of international conferences on automatic face and gesture recognition, affective computing, and multimodal interfaces.

“Leveraging Human Knowledge for Better Statistical Generalizations”

Friday, February 16 | 12:30 – 1:30 p.m.

Yulia Tsvetkov, Assistant Professor, Carnegie Mellon University

Abstract: Although we are living in the era of big data, when it is easy to obtain billions or trillions of words from the web, there are many scenarios in which we cannot be too data-hungry. For example, even in a billion-word corpus, there is a long tail of rare and out-of-vocabulary words. Next, language is not always paired with correlated events: corpora contain what people said, but not what they meant, how they understood things, or what they did in response to the language. Finally, the vast majority of the world’s languages barely exist on the web at all. I'll present model-based approaches that incorporate prior knowledge in novel ways to alleviate the problem of missing or skewed data. I'll show (1) how neural language models can benefit from cross-linguistic knowledge; (2) how insight into linguistic coherence, prototypicality, simplicity, and diversity of data helps improve learning in non-convex NLP models; and (3) how knowledge about a speaker can be used for domain and style transfer. I’ll conclude with an overview of ongoing research projects.

Bio: Yulia Tsvetkov is an assistant professor in the Language Technologies Institute at Carnegie Mellon University. Her research interests lie at or near the intersection of machine learning, natural language processing, social science, and linguistics. Her current research projects focus on language technologies for social good, including advancing NLP technologies for resource-poor languages spoken by millions of people, developing approaches to promote civility in communication (e.g., modeling gender bias in texts and debiasing), identifying strategies that undermine the democratic process (e.g., political framing and agenda-setting in digital media). Prior to joining CMU, Yulia was a postdoc in the Stanford NLP Group; she received her PhD from Carnegie Mellon University.

“Pathway-Level Information Extractor (PLIER): A Generative Model for Gene Expression Data”

Friday, February 2 | 12:30 – 1:30 p.m.

Maria Chikina, Assistant Professor, Department of Computational and Systems Biology

Abstract: The increasing ease of collecting genome-scale data has rapidly accelerated its use in all areas of biomedical science. Translating genome-scale data into testable hypotheses, on the other hand, is challenging and remains an active area of method development. In this talk we present a machine learning approach to producing data representations guided by a mechanistic understanding of the data-generating process. Our method is a new constrained matrix decomposition approach that directly aligns a lower-dimensional representation with known biological pathways. The approach provides state-of-the-art accuracy in reconstructing known upstream variables and yields new insights into the architecture of genetic regulation.

Annual ISP Assessment

Friday, January 19 | 12:30 – 1:30 p.m.

ISP Celebration

Friday, January 12 | 12:30 – 1:30 p.m.