Theresa:

In this message and the next one are your comprehensive exam questions. Feel free to ask questions of any of us.

Please answer each question in 7-10 double-spaced (not-too-small font) pages. Typically, ISP students have nine days to complete the written portion of their comprehensive exams. You are free to hand it in earlier than that, if you wish.

===================================================================

Professor Ashley's question is in Word and will be sent in the next message.

===================================================================

Professor Hwa's question:

Consider applying the learning algorithms discussed in the following readings to the problem of summarizing customer product reviews (as the task is conceived of in Hu and Liu (2004)):

* Blum & Mitchell
* Mitchell, chapter 3
* Mitchell, chapter 6
* Mitchell, chapter 8
* Scholkopf

Determine whether the algorithms might be well-suited to the problem. Compare the relative advantages and disadvantages of the different approaches. Discuss in terms of

* the availability of required resources,
* the choice of problem representations,
* computational complexity considerations

What are the biggest challenges for each approach? Are they insurmountable?

Because summarization is not on your reading list, don't focus on the summarization step per se, but rather on determining ingredients to be included in such summaries.

===================================================================

Professor Wiebe's question:

Compare and contrast:

* Different approaches to creating affective and subjectivity lexicons.

Consider the following dimensions:

* criteria for including entries in the lexicon (i.e., what types of lexical items does the approach aim to include?) Which seem more appropriate for supporting NLP applications?
* What types of information should be included about the dictionary entries, to support NLP applications? Sketch how you envision the information being used in NLP applications (consider representative NLP applications -- you don't need to be exhaustive).
* How could the dictionary be populated (entries plus information about them)? How feasible are the approaches? What resources would be needed? What performance might we expect?

===================================================================