Department of Biomedical Informatics - University of Pittsburgh

University of Pittsburgh Department of Biomedical Informatics Lecture Announcement

Speaker: Rohit J. Kate, M.S., Ph.D. Candidate
Department of Computer Sciences,
University of Texas at Austin

Tuesday, August 14, 2007
2:00 pm - 3:00 pm
Room M-184 VALE [?]
200 Meyran Avenue

Title: “Learning for Semantic Parsing with Kernels under Various Forms of Supervision”

Abstract: Semantic parsing involves deep semantic analysis that maps natural language sentences to their formal, executable meaning representations. This is a challenging problem and is critical for developing computing systems that understand natural language input. In this talk, I will present a new machine learning approach to semantic parsing that uses string-kernel-based classification. It takes natural language sentences paired with their formal meaning representations as training data. For every production in the formal language grammar, a support vector machine (SVM) classifier is trained using string similarity as the kernel. Meaning representations for novel natural language sentences are obtained by finding the most probable semantic parse using these classifiers. This method does not use any hard-matching rules and, unlike previous and other recent methods, does not use grammar rules for natural language, probabilistic or otherwise, which makes it more robust to noisy input.
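
To give a rough sense of what "string similarity as the kernel" can mean, here is a minimal sketch of one simple string kernel: a dot product over counts of shared contiguous word n-grams. The abstract does not say which kernel the talk uses, so this particular choice, along with the names `ngram_counts`, `string_kernel`, `normalized_kernel` and the parameter `max_n`, is an assumption made for illustration only.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, max_n=2):
    """Count every contiguous word n-gram of length 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def string_kernel(s, t, max_n=2):
    """Unnormalized kernel: dot product of the two n-gram count vectors."""
    a = ngram_counts(s.lower().split(), max_n)
    b = ngram_counts(t.lower().split(), max_n)
    return sum(a[g] * b[g] for g in a if g in b)


def normalized_kernel(s, t, max_n=2):
    """Cosine-normalized kernel in [0, 1]; 0.0 if either string is empty."""
    denom = sqrt(string_kernel(s, s, max_n) * string_kernel(t, t, max_n))
    return string_kernel(s, t, max_n) / denom if denom else 0.0


# Illustrative sentences (not from the talk): 3 shared words + 2 shared bigrams.
print(string_kernel("which rivers run through texas",
                    "what rivers run through ohio"))  # 5
```

A kernel of this kind scores two sentences as similar when they share word sequences, without requiring any grammar for the natural language. A kernel matrix built from such a function could then be handed to an SVM implementation that accepts precomputed kernels.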

Besides being robust, this approach is also flexible and able to learn under a wide range of supervision. A simple extension using transductive SVMs enables the system to do semi-supervised learning and improve its performance by using unannotated sentences, which are usually readily available. Another extension involving EM-like retraining makes the system capable of learning under ambiguous supervision, in which the correct meaning representation for each sentence is not explicitly given; instead, a set of possible meaning representations is given. This weaker and more general form of supervision is more representative of a natural training environment for a language-learning system requiring minimal human supervision. Experimental results show that the resulting system is able to cope with ambiguities and learn accurate semantic parsers.
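
The idea of learning from ambiguous supervision can be sketched with a toy retraining loop. The abstract gives no details of the EM-like procedure, so everything here is an assumption for illustration: the stand-in "model" is a table of word and meaning-representation co-occurrence counts rather than SVM classifiers, the loop makes hard choices at each step, and the names `train`, `score`, `retrain` and the example data are hypothetical.

```python
from collections import defaultdict


def train(pairs):
    """Stand-in model: weighted co-occurrence counts of (word, MR)."""
    counts = defaultdict(float)
    for sentence, mr, weight in pairs:
        for word in sentence.split():
            counts[(word, mr)] += weight
    return counts


def score(model, sentence, mr):
    """How strongly the words of the sentence are associated with this MR."""
    return sum(model.get((word, mr), 0.0) for word in sentence.split())


def retrain(data, iterations=5):
    """data: list of (sentence, candidate MRs). Returns one chosen MR each."""
    # Start by spreading each sentence's weight evenly over its candidates.
    pairs = [(s, mr, 1.0 / len(cands))
             for s, cands in data for mr in sorted(cands)]
    model = train(pairs)
    chosen = []
    for _ in range(iterations):
        # Pick the best-scoring candidate per sentence under the current model.
        chosen = [max(sorted(cands), key=lambda mr: score(model, s, mr))
                  for s, cands in data]
        # Retrain on the now-disambiguated pairs.
        model = train([(s, mr, 1.0) for (s, _), mr in zip(data, chosen)])
    return chosen


# Hypothetical toy data: two sentences come with two candidate MRs each.
data = [
    ("dog barks", ["bark(dog)", "sleep(cat)"]),
    ("dog barks loudly", ["bark(dog)"]),
    ("cat sleeps", ["sleep(cat)", "bark(dog)"]),
    ("cat sleeps soundly", ["sleep(cat)"]),
]
print(retrain(data))
```

In this toy setting the unambiguous sentences pull the ambiguous ones toward the candidate whose words they share, which is the intuition behind retraining on progressively disambiguated pairs.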

For more information: www.dbmi.pitt.edu or 412.647.7113