Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Janyce Wiebe, Intelligent Systems Program, University of Pittsburgh
Exploiting Subjectivity Classification to Improve Information Extraction; and Multi-perspective Question Answering Using the OpQA Corpus
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Daqing He, Intelligent Systems Program, University of Pittsburgh
Designing a Digital Library for Learning Digital Libraries
In this very informal talk, I will discuss our ongoing effort to design and develop an integrated, interactive, and intuitive learning environment for students to learn digital library topics. The learning environment is built on top of a digital library system to take advantage of the digital library's integrated storage, organization, retrieval, and presentation functionalities. Topics of interest include the motivation for adopting a digital library system as a learning environment, the identification of documents, the construction of a course-oriented topic ontology, and some future activities involving video recording, social navigation, and visualization.
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Olivier Bodenreider, National Library of Medicine
Acquiring ontological relations from biomedical resources
Ontologies are essential resources for applications such as text mining and knowledge discovery. Biomedicine has a long tradition of collecting and organizing the names of entities of biomedical interest. Terminologies such as the Medical Subject Headings (MeSH), the International Classification of Diseases (ICD) and the Gene Ontology illustrate this tradition. For the most part, however, terminological knowledge consists of hierarchical relations rather than associative (or trans-ontological) relations. This talk will present methods for identifying relations among terms (e.g., synonymy, hypernymy) and among concepts, with special emphasis on trans-ontological relations. Both lexical and statistical approaches will be presented.
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Frederick Crabbe, Computer Science Department, United States Naval Academy
Compromise Strategies for Action Selection
Compromise behavior is a property of action selection in a behavior-based robotics architecture where, rather than selecting an action that is best for any single one of its active subgoals, the robot selects an action that is a compromise between the subgoals. For historical reasons, it is widely thought that this property is a necessary component of every behavior-based architecture. This talk will discuss compromise behavior and its possible definitions, examining in detail its utility to an agent. It will conclude that the property is not as useful as commonly believed, at least in the form in which it is usually presented. It will also suggest why this confusion occurred, and in what ways compromise behavior might be truly useful.
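To make the distinction concrete, here is a minimal sketch in Python. The action names, subgoal names, and utility values are invented for illustration and do not come from the talk, and summing utilities across subgoals is only one of several possible definitions of compromise.

```python
# Hypothetical utilities: how good each action is for each active subgoal.
# All names and numbers here are illustrative assumptions.
UTILITIES = {
    "go_left":     {"reach_food": 1.0, "reach_water": 0.0},
    "go_right":    {"reach_food": 0.0, "reach_water": 0.9},
    "go_straight": {"reach_food": 0.7, "reach_water": 0.6},
}


def winner_take_all(utilities):
    """Pick the action that scores highest for any single subgoal."""
    return max(utilities, key=lambda a: max(utilities[a].values()))


def compromise(utilities):
    """Pick the action with the highest utility summed over all subgoals."""
    return max(utilities, key=lambda a: sum(utilities[a].values()))


print(winner_take_all(UTILITIES))  # go_left
print(compromise(UTILITIES))       # go_straight
```

With these numbers the compromise action is not the best choice for either subgoal alone, which is the situation whose usefulness the talk questions.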
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Wendy Chapman, Intelligent Systems, University of Pittsburgh
Manual Annotations and Medical Language Processing
There are many uses for manual annotations from textual documents in medical language processing. I will describe two studies we have performed involving manual annotation from clinical reports. In the first study, we created an annotation schema to help physician experts annotate clinical conditions from reports. We measured agreement with and without the schema to determine the usefulness of the schema. We also compared lay people to physicians to determine whether physician expertise is really necessary for the task.
In the second study, we used manual annotations of 60 clinical conditions in Emergency Department reports to evaluate an algorithm we developed for integrating individual annotations in order to determine whether a condition was acute, chronic, or absent in the report. Our methodology measured the accuracy of the algorithm and pointed out areas where the algorithm was not accomplishing its goal of imitating a human expert's performance.
Shi-Kuo Chang, Intelligent Systems, University of Pittsburgh
A Chronobot for Time and Knowledge Exchange and Management
The Chronobot is a device for time and knowledge exchange and management. The concept of the Chronobot is derived from a science fiction short story written by me some twenty years ago.
Recently the Industrial Technology Research Institute (ITRI) and Institute for Information Industry (III), two leading research institutes in Taiwan, invited me to lead a two-year pioneering project to put my ideas into practice and build a realistic device. The Chronobot was thus conceived. To put it in the simplest terms, the Chronobot allows a group of people to exchange time and knowledge. It is a platform for time management and knowledge management. In this talk I will describe the concept of the Chronobot, its basic mechanism for time exchange, the negotiation protocols, and its application to e-learning. I will also give a demo of the Chronobot prototype and present the application scenarios. Research issues related to e-learning and visual communication will then be discussed.
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Machine Learning for Incident Detection
Modern highways in urban areas are continuously monitored with an array of sensors. This allows for early detection of traffic accidents and improved emergency response times. The characteristics of traffic flow vary widely from site to site. Because of this, incident detection algorithms require manual analysis and calibration at each site where they are deployed. Machine learning techniques are rarely encountered in the theory and practice of incident detection. To understand why this is so and how it could change, we examine ML models for the task and identify the performance constraints and bottlenecks that hinder their more widespread use. I will give an overview of challenges particular to traffic and incident data (data sparsity, a heavily skewed class distribution, imperfect labeling, and violations of the i.i.d. assumption) and propose a simple model that accounts for them.
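The effect of a heavily skewed class distribution can be illustrated with a small Python sketch. The data below are invented for illustration (1000 sensor intervals, 5 of them incidents), and the code is not the model proposed in the talk: it only shows why plain accuracy is a misleading measure when incidents are rare.

```python
def evaluate(labels, predictions):
    """Return (accuracy, detection_rate) for binary incident labels (1 = incident)."""
    pairs = list(zip(labels, predictions))
    correct = sum(1 for y, p in pairs if y == p)
    incidents = sum(labels)
    detected = sum(1 for y, p in pairs if y == 1 and p == 1)
    accuracy = correct / len(labels)
    detection_rate = detected / incidents if incidents else 0.0
    return accuracy, detection_rate


# Hypothetical data: incidents are rare compared with normal traffic.
labels = [0] * 995 + [1] * 5

# A trivial "detector" that never raises an alarm.
never_alarm = [0] * len(labels)

accuracy, detection_rate = evaluate(labels, never_alarm)
print(f"accuracy={accuracy:.3f}, detection_rate={detection_rate:.3f}")
```

The never-alarm detector is right on 99.5% of intervals yet detects no incidents at all, so evaluation on such data needs measures such as detection rate and false alarm rate rather than accuracy alone.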
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Jialan Que, Intelligent Systems Program, University of Pittsburgh
Timeliness Study of Radiology & Microbiology Report in A Healthcare System
We developed a framework to measure the timeliness of two data types (radiology and microbiology reports) for detection of diseases such as inhalational anthrax (IA) in a healthcare system. We measured the timeliness of a data type as the delay between patient registration in an emergency department (ED) and receipt of that data type by a biosurveillance system. We also determined the lower and upper bounds of median delay time (LMDT and UMDT) for the two data types to be available for detection of a single IA case. The study provides a range of delay times for detection of a single IA case within a healthcare system, and it may benefit outbreak planning and outbreak model simulation.
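The basic timeliness measure described above can be sketched in a few lines of Python. The timestamps are invented for illustration, and the sketch computes only a plain median delay for one data type; it does not reproduce the LMDT and UMDT bounds, whose definitions are not given here.

```python
from datetime import datetime
from statistics import median


def delay_hours(registered, received):
    """Delay in hours between ED registration and receipt of the report."""
    return (received - registered).total_seconds() / 3600.0


# Hypothetical records for one data type:
# (ED registration time, time the report reached the biosurveillance system)
records = [
    (datetime(2006, 1, 1, 8, 0), datetime(2006, 1, 1, 10, 0)),    # 2 hours
    (datetime(2006, 1, 1, 9, 0), datetime(2006, 1, 1, 15, 0)),    # 6 hours
    (datetime(2006, 1, 2, 7, 30), datetime(2006, 1, 2, 10, 30)),  # 3 hours
]

delays = [delay_hours(registered, received) for registered, received in records]
median_delay = median(delays)
print(median_delay)  # 3.0
```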
Sung-Young Jung, Intelligent Systems Program, University of Pittsburgh
Modeling Preference: Integrating Content-based and Collaborative Preferences on Documents
Modeling preference has become one of the main issues in personalization systems, which analyze user behavior, extract information about what the user likes, and predict what the user will select in the future. Various studies have made use of user preferences, but their notions of preference did not correspond to the common-sense meaning of the term. A statistical framework was proposed to establish the concept of preference and overcome this problem. However, integrating content-based and collaborative preferences is another issue: they are different information sources, and it is not easy to combine them in a statistical framework.
Time: 12:00 noon
Place: 5317 Sennott Square
Speakers:
Guang Xiang, Intelligent Systems Program, University of Pittsburgh
Dimensionality reduction of high-throughput proteomic data using partial discriminative projections
High-throughput proteomic profiling is a promising technique that has been shown to achieve good results in detecting diseases and discovering disease biomarkers. However, the high dimensionality of typical proteomic profile data hinders the application of machine learning techniques to profile analysis and often leads to poor classification models. In this talk, we propose a new dimensionality reduction approach using multiple discriminative projections. By projecting the original features along the most discriminative directions and focusing on hard instances via a boosting-like mechanism, our method works on a reduced low-dimensional aggregate feature space and significantly improves the performance of standard classification algorithms like SVM.
Richard Pelikan, Intelligent Systems, University of Pittsburgh
Beating a dead source: Utilizing prior information with latent variable modeling of TOF-MS Spectra
Mass spectrometry is a highly vaunted tool for the analysis of the human proteome. However, the intrinsically high dimensionality of the data it produces makes it difficult to learn from. In particular, since these spectra are studied for differences between healthy and diseased patients, we are pressed further to find discriminatory patterns which are present in one group and not in the other. A group of statistical learning techniques called "latent variable models" offers to learn these regulatory patterns for us, but under high dimensionality, it becomes difficult to extract anything meaningful.
In this talk, I will demonstrate the effect of my latest research, which uses prior information from publicly available biology databases to generate expectations of naturally occurring biological systems. I then attempt to verify their existence using automatic relevance determination (ARD), an after-effect of Bayesian modeling which accepts suggestions for sources only if they appear relevant to the reconstruction of data. Unacceptable sources are killed, making the model less complex and reducing the dimensionality of the data to a smaller set of regulatory (latent) variables, which can be useful in patient diagnosis.