COMP90042 Natural Language Processing
The University of Melbourne
School of Computing and Information Systems

Final Exam
Semester 1, 2021

Exam duration: 165 minutes (15 minutes reading time + 120 minutes writing time + 30 minutes technical buffer time)

Length: This paper has 5 pages (including this cover page) and 8 questions. You should attempt all questions.

Instructions to students:
• This exam is worth a total of 120 marks and counts for 40% of your final grade.
• You can read the question paper on a monitor, or print it.
• You are recommended to write your answers on blank A4 paper. Note that some answers require drawing diagrams or tables.
• You will need to scan or take a photo of your answers and upload them via Gradescope. Be sure to label the scans/photos with the question numbers (-10% penalty for each unlabelled question).
• Please answer all questions. Please write your student ID and question number on every page.

Format: Open Book
• While you are undertaking this assessment you are permitted to:
  – make use of the textbooks, lecture slides and workshop materials.
• While you are undertaking this assessment you must not:
  – make use of any messaging or communications technology;
  – make use of any world-wide web or internet-based resources such as Wikipedia, Stack Overflow, or Google and other search services;
  – act in any manner that could be regarded as providing assistance to another student who is undertaking this assessment, or will in the future be undertaking this assessment.
• The work you submit must be based on your own knowledge and skills, without assistance from any other person.

Total marks: 120 (40% of subject)
Students must attempt all questions

Section A: Short Answer Questions [45 marks]

Answer each of the questions in this section as briefly as possible.
Expect to answer each sub-question in no more than several lines.

Question 1: General Concepts [24 marks]
a) What is a “sequence labelling” task and how does it differ from independent prediction? Explain using “part-of-speech tagging” as an example. [6 marks]
b) Compare and contrast “antecedent restrictions” and “preferences” in “anaphora resolution”. You should also provide examples of these restrictions and preferences. [6 marks]
c) What is the “exposure bias” problem in “machine translation”? [6 marks]
d) Why do we use the “IOB tagging scheme” in “named entity recognition”? [6 marks]

Question 2: Distributional Semantics [9 marks]
a) How can we learn “word vectors” using “count-based methods”? [6 marks]
b) Qualitatively, how will the word vectors differ when we use “document” vs. “word context”? [3 marks]

Question 3: Context-Free Grammar [12 marks]
a) Explain two limitations of the “context-free” assumption as part of a “context-free grammar”, with the aid of an example for each limitation. [6 marks]
b) What negative effect does “head lexicalisation” have on the grammar? Does “parent conditioning” have a similar issue? You should provide examples as part of your explanation. [6 marks]

Section B: Method Questions [45 marks]

In this section you are asked to demonstrate your conceptual understanding of the methods that we have studied in this subject.

Question 4: Dependency Grammar [18 marks]
a) What is “projectivity” in a dependency tree, and why is this property important in dependency parsing? [3 marks]
b) Which arc or arcs are “non-projective” in the following tree? Explain why they are non-projective. [6 marks]
   (Dependency tree figure not reproduced in this copy.)
c) Show a sequence of parsing steps using a “transition-based parser” that will produce the dependency tree below. Be sure to include the state of the stack and buffer at every step.
   (Dependency tree figure not reproduced in this copy.)
[9 marks]

Question 5: Loglikelihood Ratio [15 marks]
The “loglikelihood ratio” is used in summarisation to measure the “saliency” of a word compared to a background corpus. In the second task of the project, to understand the nature of rumour vs. non-rumour source tweets, one analysis we can do is to extract salient hashtags in rumour source tweets and non-rumour source tweets to understand the topical differences between them. Illustrate, with an example and equations, how you can apply the loglikelihood ratio to extract salient hashtags in these two types of source tweets.

Question 6: Ethics [12 marks]
You’re tasked to develop an NLP application to predict the “intelligence quotient (IQ)” scores of high school students based on their essays written for a range of topics. Discuss at least three ethical implications of this application.

Section C: Algorithmic Questions [30 marks]

In this section you are asked to demonstrate your understanding of the methods that we have studied in this subject, in being able to perform algorithmic calculations.

Question 7: N-gram Language Models [15 marks]
This question asks you to calculate probabilities for “N-gram language models”. You should leave your answers as fractions. Consider the following table, which collects the counts of words that occur after salted in a corpus.

Word      Count   Unsmoothed     Smoothed Probability
                  Probability    Absolute Discounting   Katz Backoff
egg       6       ?              ?                      ?
caramel   4       ?              ?                      ?
fish      3       ?              ?                      ?
peanuts   2       ?              ?                      ?
butter    0       ?              ?                      ?
salted    0       ?              ?                      ?

E.g. the bigram salted egg occurs 6 times, while salted caramel occurs 4 times.

a) Assuming the 6 distinct words in the table are all the words in the vocabulary, compute the bigram probabilities for all the bigrams listed in the table without any smoothing. Hint: you should fill in the missing values for the “Unsmoothed Probability” column in the table, and demonstrate how you arrive at these values.
[3 marks]
b) Compute the bigram probabilities for all bigrams listed in the table using “absolute discounting”, with a discount factor of 0.2. Hint: you should fill in the missing values for the “Absolute Discounting” column in the table, and demonstrate how you arrive at these values. [6 marks]
c) Compute the bigram probabilities for all bigrams listed in the table using “Katz Backoff”, with the same discount factor of 0.2. Use the corpus below (2 sentences) for computing unigram probabilities. For simplicity, you do not need to consider special tokens (ending or starting tokens), and may assume all the unique words in the 2 sentences as your vocabulary when computing the unigram probabilities. Hint: you should fill in the missing values for the “Katz Backoff” column in the table, and demonstrate how you arrive at these values. [6 marks]

butter in batter will make batter salted
but better butter will make batter better

Question 8: Topic Models [15 marks]
Consider training a “latent Dirichlet allocation” (LDA) topic model using the following corpus with 3 documents (d1, d2, d3). To initialise the training process, each word token is randomly allocated to a topic (e.g. peck/t3 means peck is assigned topic t3). Hyper-parameters of the topic model are set as follows: (1) number of topics T = 3; (2) document-topic prior α = 0.5; and (3) topic-word prior β = 0.1.

d1: peck/t3 pickled/t1 peppers/t1
d2: peter/t1 piper/t2 picked/t3 peppers/t2
d3: peppers/t2 piper/t3 peck/t3 peppers/t1

a) Compute the probability over the topics (t1, t2, t3) if you were to sample a new topic for the first word (peck) in d1 for a training step. You should show the co-occurrence tables that are relevant to producing your solution. [9 marks]
b) Assume now that the topic model is trained. You are now given a new document: pickled peppers popped. Describe how LDA infers the topics for this new document.
Note: you do not need to show equations or tables here. [6 marks]

— End of Exam —
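For readers working through the smoothing schemes in Question 7, the three estimators can be cross-checked mechanically. The sketch below is illustrative Python, not a model answer; the variable names are my own, and the counts, discount factor, and unigram corpus are the ones given in the question.

```python
from fractions import Fraction

# Bigram counts of words following "salted" (from the Question 7 table).
bigram_counts = {"egg": 6, "caramel": 4, "fish": 3, "peanuts": 2,
                 "butter": 0, "salted": 0}
total = sum(bigram_counts.values())     # 15 bigram tokens starting with "salted"
d = Fraction(2, 10)                     # discount factor 0.2

# (a) Unsmoothed MLE: P(w | salted) = count(salted, w) / total.
unsmoothed = {w: Fraction(c, total) for w, c in bigram_counts.items()}

# (b) Absolute discounting: subtract d from each seen bigram count and share
# the freed mass (num_seen * d / total) evenly among the unseen continuations.
seen = [w for w, c in bigram_counts.items() if c > 0]
unseen = [w for w, c in bigram_counts.items() if c == 0]
freed = len(seen) * d / total
abs_disc = {w: (c - d) / total if c > 0 else freed / len(unseen)
            for w, c in bigram_counts.items()}

# (c) Katz backoff: same discounted estimate for seen bigrams, but the freed
# mass is split among the unseen continuations in proportion to their unigram
# probabilities in the 2-sentence corpus from the question.
corpus = ("butter in batter will make batter salted "
          "but better butter will make batter better").split()
unigram = {w: Fraction(corpus.count(w), len(corpus)) for w in set(corpus)}
unseen_mass = sum(unigram[w] for w in unseen)
katz = dict(abs_disc)
for w in unseen:
    katz[w] = freed * unigram[w] / unseen_mass
```

Using `Fraction` keeps every probability exact, matching the question's instruction to leave answers as fractions; each of the three dictionaries should form a proper distribution (sum to 1) over the six table words.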
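Question 8a relies on the standard collapsed Gibbs sampling update for LDA, P(z = t) ∝ (n_{d,t} + α)(n_{t,w} + β)/(n_t + Vβ), where all counts exclude the token being resampled. The sketch below is an illustration of this general update applied to the toy corpus in the question (illustrative Python, not a model answer; function and variable names are my own).

```python
from fractions import Fraction

# Toy corpus from Question 8: each token paired with its current topic (1-3).
docs = [
    [("peck", 3), ("pickled", 1), ("peppers", 1)],                # d1
    [("peter", 1), ("piper", 2), ("picked", 3), ("peppers", 2)],  # d2
    [("peppers", 2), ("piper", 3), ("peck", 3), ("peppers", 1)],  # d3
]
alpha, beta = Fraction(1, 2), Fraction(1, 10)  # priors from the question
T = 3
V = len({w for doc in docs for w, _ in doc})   # 6 word types

def resample_dist(d_idx, pos):
    """Collapsed-Gibbs topic distribution for the token docs[d_idx][pos]."""
    word, _ = docs[d_idx][pos]
    weights = []
    for t in range(1, T + 1):
        # Counts with the current assignment of this token excluded.
        n_dt = sum(1 for i, (_, z) in enumerate(docs[d_idx])
                   if z == t and i != pos)                    # topic t in doc d
        n_tw = sum(1 for di, doc in enumerate(docs)
                   for i, (w, z) in enumerate(doc)
                   if w == word and z == t
                   and not (di == d_idx and i == pos))        # word under topic t
        n_t = sum(1 for di, doc in enumerate(docs)
                  for i, (_, z) in enumerate(doc)
                  if z == t and not (di == d_idx and i == pos))  # topic t total
        weights.append((n_dt + alpha) * (n_tw + beta) / (n_t + V * beta))
    norm = sum(weights)
    return [w / norm for w in weights]

# Distribution over (t1, t2, t3) for the first word of d1 ("peck").
probs = resample_dist(0, 0)
```

Because d1's other two tokens are both assigned t1, the document-topic factor favours t1, while the topic-word factor favours t3 (the only topic that has seen peck elsewhere); with these priors the topic-word factor dominates.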