COMP202 ASSIGNMENT
It is very important that you follow the directions as closely as possible. The directions, while perhaps tedious, are designed to make it as easy as possible for the TAs to mark the assignments by letting them run your code, in some cases through automated tests. While these tests will never be used to determine your entire grade, they speed up the process significantly, which allows the TAs to provide better feedback and not waste time on administrative details. Plus, if the TA is in a good mood while grading, that increases the chance of them giving out partial marks. :)

Up to 30% can be removed for bad indentation of your code, omitted comments, or poor code structure.

To get full marks, you must:

• Follow all directions below.
  – In particular, make sure that all file names and function names are spelled exactly as described in this document. Otherwise, a 50% penalty will be applied.
• Make sure that your code runs.
  – Code with errors will receive a very low mark.
• Write your name and student ID as a comment at the top of all .py files you hand in.
• Name your variables appropriately.
  – The purpose of each variable should be obvious from the name.
• Comment your work.
  – A comment on every line is not needed, but there should be enough comments to fully understand your program.
• Avoid writing repetitive code; call helper functions instead! You are welcome to add additional functions if you think this can increase the readability of your code.
• Keep lines of code short enough that the TA does NOT have to scroll horizontally to read the whole thing. Vertical spacing is also important when writing code: separate each block of code (also within a function) with an empty line.

Part 1 (0 points): Warm-up

Do NOT submit this part, as it will not be graded. However, doing these exercises might help you to do the second part of the assignment, which will be graded. If you have difficulties with the questions of Part 1, then we suggest that you consult the TAs during their office hours; they can help you and work with you through the warm-up questions. You are responsible for knowing all of the material in these questions.

Warm-up Question 1 (0 points)
Write a function same_elements which takes as input a two-dimensional list and returns True if all the elements in each sublist are the same, and False otherwise. For example,

>>> same_elements([[1, 1, 1], ['a', 'a'], [6]])
True
>>> same_elements([[1, 6, 1], [6, 6]])
False

Warm-up Question 2 (0 points)
Write a function flatten_list which takes as input a two-dimensional list and returns a one-dimensional list containing all the elements of the sublists. For example,

>>> flatten_list([[1, 2], [3], ['a', 'b', 'c']])
[1, 2, 3, 'a', 'b', 'c']
>>> flatten_list([[]])
[]

Warm-up Question 3 (0 points)
Complete the case study on multidimensional lists presented in class on Friday, October 30. You can find the instructions on myCourses (Content > Live sessions > Extra practice > Case study (Oct 30)).

Warm-up Question 4 (0 points)
Write a function get_most_valuable_key which takes as input a dictionary mapping strings to integers. The function returns the key which is mapped to the largest value. For example,

>>> get_most_valuable_key({'a' : 3, 'b': 6, 'g': 0, 'q': 9})
'q'
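If you are stuck on this warm-up, one possible approach (an ungraded sketch, not the only valid solution) is to scan the dictionary while remembering the best key seen so far:

def get_most_valuable_key(d):
    # Remember the key mapped to the largest value seen so far.
    best_key = None
    for key in d:
        if best_key is None or d[key] > d[best_key]:
            best_key = key
    return best_key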
Warm-up Question 5 (0 points)
Write a function add_dicts which takes as input two dictionaries mapping strings to integers. The function returns a dictionary which is the result of merging the two input dictionaries; that is, if a key appears in both dictionaries, then the two values are added together. For example,

>>> d1 = {'a':5, 'b':2, 'd':-1}
>>> d2 = {'a':7, 'b':1, 'c':5}
>>> add_dicts(d1, d2) == {'a': 12, 'b': 3, 'c': 5, 'd': -1}
True

Warm-up Question 6 (0 points)
Create a function reverse_dict which takes as input a dictionary d and returns a dictionary where the values in d are now keys, each mapping to a list containing all the keys in d which mapped to them. For example,

>>> a = reverse_dict({'a': 3, 'b': 2, 'c': 3, 'd': 5, 'e': 2, 'f': 3})
>>> a == {3 : ['a', 'c', 'f'], 2 : ['b', 'e'], 5 : ['d']}
True

Note that the order of the elements in the lists might not be the same, and that's ok!

Part 2

The questions in this part of the assignment will be graded. This assignment is adapted from an assignment created by Michael Guerzhoy (University of Toronto), Jackie Chi Kit Cheung (McGill University), and François Pitt (University of Toronto).

The main learning objectives for this assignment are:

• Apply what you have learned about lists, both one-dimensional and multidimensional.
• Apply what you have learned about dictionaries.
• Understand how to test functions that return dictionaries.
• Solidify your understanding of working with loops and strings.
• Create a more complex program which consists of several modules.
• Understand how to write a docstring and use doctest when working with dictionaries.
• Learn to identify when using the function enumerate can help you write cleaner code.
• Apply what you have learned about file IO and string manipulation.

Note that the assignment is designed for you to practice what you have learned in the videos up to and including Week 11.4. For this reason, you are NOT allowed to use anything seen after Week 11.4 or not seen in class at all. You will be heavily penalized if you do so.

For full marks, in addition to the points listed on page 1, make sure to add the appropriate documentation string (docstring) to all the functions you write. The docstring must contain the following:

• The type contract of the function.
• A description of what the function is expected to do.
• At least 3 examples of calls to the function (except when the function has only one possible output, in which case you can provide only one example). You are allowed to use at most one example per function from this pdf.

Examples

For each question, we provide several examples of how your code should behave. All examples are given as if you were to call the functions from the shell. When you upload your code to codePost, some of these examples will be run automatically to check that your code outputs the same results as given in the examples. However, it is your responsibility to make sure your code/functions work for any input, not just the ones shown in the examples. When the time comes to grade your assignment, we will run additional, private tests that may use inputs not seen in the examples.

Furthermore, please note that your code files for this question and all others should not contain any function calls in the main body of the program (i.e., outside of any functions). Code that does not conform in this manner will automatically fail the tests on codePost and be heavily penalized. It is OK to place function calls in the main body of your code for testing purposes, but if you do so, make certain that you remove them before submitting. Please review what you have learned in video 5.2 if you'd like to add code to your modules which executes only when you run your files.
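For reference, the standard Python idiom for code that runs only when a file is executed directly (and not when it is imported) is the __main__ guard — we assume this is the pattern video 5.2 refers to:

if __name__ == "__main__":
    # This block runs only when the file is executed directly,
    # not when the module is imported by another file or by the grader.
    import doctest
    doctest.testmod()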
Question 1: Identify Synonyms (100 points)

One type of question encountered in the Test of English as a Foreign Language (TOEFL) is the “Synonym Question”, where students are asked to pick a synonym of a word out of a list of alternatives. For example:

1. vexed (Answer: (a) annoyed)
(a) annoyed
(b) amused
(c) frightened
(d) excited

For this assignment, you will build an intelligent system that can learn to answer questions like this one. In order to do that, the system will approximate the semantic similarity of any pair of words. The semantic similarity between two words is a measure of the closeness of their meanings. For example, the semantic similarity between “car” and “vehicle” is high, while that between “car” and “flower” is low.

In order to answer the TOEFL question, you will compute the semantic similarity between the word you are given and all the possible answers, and pick the answer with the highest semantic similarity to the given word. More precisely, given a word w and a list of potential synonyms s1, s2, s3, s4, we compute the similarities of (w, s1), (w, s2), (w, s3), (w, s4) and choose the word whose similarity to w is the highest.

We will measure the semantic similarity of a pair of words by first computing a semantic descriptor vector for each of the words, and then implementing different similarity measures between the two vectors (for example, you will implement a function that computes the cosine similarity).

Given a text with n words denoted by (w1, w2, ..., wn) and a word w, let desc_w be the semantic descriptor vector of w computed using the text. desc_w is an n-dimensional vector. The i-th coordinate of desc_w (i.e. the entry of the vector in position i) is the number of sentences in which both w and wi occur (if w occurs several times in a sentence, that sentence contributes once per occurrence of w, as the example below illustrates). For efficiency's sake, for this assignment we will represent semantic descriptor vectors as dictionaries, not storing the zeros that correspond to words which don't co-occur with w.

For example, suppose we are given the following text (an extract from Animal Farm by George Orwell):

All the habits of Man are evil. And, above all, no animal must ever tyrannise over his own kind. Weak or strong, clever or simple, we are all brothers. No animal must ever kill any other animal. All animals are equal.

The word “evil” only occurs in the first sentence. Since each word in that sentence occurs in exactly one sentence with the word “evil”, its semantic descriptor vector is:

{'all': 1, 'the': 1, 'habits': 1, 'of': 1, 'man': 1, 'are': 1}

The word “animal” only appears in the second and fourth sentences, but in the fourth sentence it appears twice, so the fourth sentence contributes twice to each count. Its semantic descriptor vector would be:

{'and': 1, 'above': 1, 'all': 1, 'no': 3, 'must': 3, 'ever': 3, 'tyrannise': 1, 'over': 1, 'his': 1, 'own': 1, 'kind': 1, 'kill': 2, 'any': 2, 'other': 2}

We store all words in all-lowercase, since we don't consider, for example, “Man” and “man” to be different words. We do, however, consider “animal” and “animals”, or “am” and “is”, to be different words. We discard all punctuation.

Vectors

Given two vectors u = {u1, u2, ..., uN} and v = {v1, v2, ..., vN}, we can compute the dot product between the two vectors using the following formula:

$u \cdot v = \sum_{i=1}^{N} u_i v_i$

We cannot apply the formula directly to our semantic descriptors since we do not store the entries which are equal to zero. However, we can still compute the dot product between vectors by only considering the positive entries. For example, the dot product between the semantic descriptor vectors of “evil” and “animal” is 1 · 1 = 1. This is because the word “all” is the only key the two semantic descriptor vectors have in common, and in both dictionaries “all” maps to the value 1.
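To make the sparse computation concrete, here is a minimal sketch of a dot product over dictionary-based vectors (the name sparse_dot is ours, for illustration only; the graded version of this logic belongs in get_dot_product below):

def sparse_dot(u, v):
    # Only keys present in BOTH dictionaries contribute to the sum;
    # every other pairing involves an implicit zero.
    total = 0
    for key in u:
        if key in v:
            total += u[key] * v[key]
    return total

For instance, sparse_dot({'all': 1, 'the': 1}, {'all': 1, 'no': 3}) returns 1, matching the “evil”/“animal” example above.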
Similarly, given a vector v = {v1, v2, ..., vN}, we can define its norm using the following formula:

$\|v\| = \sqrt{\sum_{i=1}^{N} v_i^2}$

Once again we apply the formula to our semantic descriptors considering only the positive entries. For example, the norm of the semantic descriptor vector of “evil” is 2.4494... and the norm of the semantic descriptor vector of “animal” is 6.8556....

With this in mind we can compute the semantic similarity between two words using a similarity measure. For instance, the cosine similarity measure between two vectors u = {u1, u2, ..., uN} and v = {v1, v2, ..., vN} is defined as:

$\mathrm{sim}(u, v) = \frac{u \cdot v}{\|u\| \, \|v\|} = \frac{\sum_{i=1}^{N} u_i v_i}{\sqrt{\left(\sum_{i=1}^{N} u_i^2\right)\left(\sum_{i=1}^{N} v_i^2\right)}}$

So the cosine similarity of “evil” and “animal”, given the semantic descriptors above, is

$\frac{1 \cdot 1}{2.4494\ldots \cdot 6.8556\ldots} = 0.0595\ldots$

Vectors Utility Functions

Let's start by creating a module called vectors_utils.py which contains several helper functions to work with vectors. Note that, as indicated above, we will be using dictionaries mapping keys to integer values to represent vectors. In this module you are allowed to import and use the math module.

For full marks, all the following functions must be part of this module (a short sketch of two of them appears after this list):

• add_vectors: given two dictionaries representing vectors, it adds the second vector to the first one. This function is void, and it modifies only the first input dictionary. For example:

>>> v1 = {'a' : 1, 'b' : 3}
>>> v2 = {'a' : 1, 'c' : 1}
>>> add_vectors(v1, v2)
>>> len(v1)
3
>>> v1['a']
2
>>> v1 == {'a' : 2, 'b' : 3, 'c' : 1}
True
>>> v2 == {'a' : 1, 'c' : 1}
True

• sub_vectors: given two dictionaries representing vectors, it returns a dictionary which is the result of subtracting the second vector from the first one. This function must not modify either of the input dictionaries. For example:

>>> d1 = {'a' : 3, 'b': 2}
>>> d2 = {'a': 2, 'c': 1, 'b': 2}
>>> d = sub_vectors(d1, d2)
>>> d == {'a': 1, 'c' : -1}
True
>>> d1 == {'a' : 3, 'b': 2}
True
>>> d2 == {'a': 2, 'c': 1, 'b': 2}
True

• merge_dicts_of_vectors: given two dictionaries whose values are dictionaries representing vectors, the function modifies the first input by merging it with the second one. This means that if both dictionaries contain the same key, then in the merged dictionary that same key will map to the sum of the two vectors. Note that this is a void function and it modifies only the first input dictionary. For example:

>>> d1 = {'a' : {'apple': 2}, 'p' : {'pear': 1, 'plum': 3}}
>>> d2 = {'p' : {'papaya' : 6}}
>>> merge_dicts_of_vectors(d1, d2)
>>> len(d1)
2
>>> len(d1['p'])
3
>>> d1['a'] == {'apple': 2}
True
>>> d1['p'] == {'pear': 1, 'plum': 3, 'papaya' : 6}
True
>>> d2 == {'p' : {'papaya' : 6}}
True
>>> merge_dicts_of_vectors(d2, d1)
>>> d2['a']['apple']
2
>>> d2['p']['papaya']
12

• get_dot_product: given two dictionaries representing vectors, returns the dot product of the two vectors. As explained in the previous section, given two vectors u = {u1, u2, ..., uN} and v = {v1, v2, ..., vN}, the dot product is $\sum_{i=1}^{N} u_i v_i$. For example,

>>> v1 = {'a' : 3, 'b': 2}
>>> v2 = {'a': 2, 'c': 1, 'b': 2}
>>> get_dot_product(v1, v2)
10
>>> v3 = {'a' : 3, 'b': 2}
>>> v4 = {'c': 1}
>>> get_dot_product(v3, v4)
0

• get_vector_norm: given a dictionary representing a vector, returns the norm of that vector. As explained in the previous section, given a vector v = {v1, v2, ..., vN}, its norm is $\|v\| = \sqrt{\sum_{i=1}^{N} v_i^2}$. For example,

>>> v1 = {'a' : 3, 'b': 4}
>>> get_vector_norm(v1)
5.0
>>> v2 = {'a': 2, 'c': 3, 'b': 2}
>>> round(get_vector_norm(v2), 3)
4.123

• normalize_vector: given a dictionary representing a vector, the function modifies the dictionary by dividing each value by the norm of the vector. Given a vector v = {v1, v2, ..., vN}, we can normalize it by multiplying it by the inverse of its norm (i.e. 1/‖v‖). Note that this function does not return any values. If the input vector has a norm of zero, then do not modify the vector. For example:

>>> v1 = {'a' : 3, 'b': 4}
>>> normalize_vector(v1)
>>> v1['a']
0.6
>>> v1['b']
0.8
>>> v2 = {'a': 2, 'c': 3, 'b': 2}
>>> normalize_vector(v2)
>>> round(v2['c'], 3)
0.728
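As promised above, here is a minimal sketch of two of these functions (one possible implementation among many; yours may differ as long as it meets the specifications and the docstring requirements):

import math

def add_vectors(v1, v2):
    # Void function: mutates v1 in place and leaves v2 untouched.
    for key in v2:
        if key in v1:
            v1[key] += v2[key]
        else:
            v1[key] = v2[key]

def get_vector_norm(v):
    # Square root of the sum of squared entries; keys that are
    # absent are implicit zeros and contribute nothing.
    total = 0
    for key in v:
        total += v[key] ** 2
    return math.sqrt(total)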
Similarity Measures

We can now create a module called similarity_measures.py which contains several functions that allow us to compute the similarity between two vectors. Note that, as indicated above, we will be using dictionaries mapping keys to integer values to represent vectors. As always, you can add helper functions if you want to. Please make sure to reduce code repetition as much as possible. For this module you can assume that all strings only contain lowercase letters of the English alphabet.

For full marks, all the following functions must be part of this module (a sketch of the cosine similarity appears after this list):

• get_semantic_descriptor: given a string w representing a single word and a list s representing all the words in a sentence, returns a dictionary representing the semantic descriptor vector of the word w computed from the sentence s. For example:

>>> s1 = ['all', 'the', 'habits', 'of', 'man', 'are', 'evil']
>>> s2 = ['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal']
>>> desc1 = get_semantic_descriptor('evil', s1)
>>> desc1['all']
1
>>> len(desc1)
6
>>> 'animal' in desc1
False
>>> desc2 = get_semantic_descriptor('animal', s2)
>>> desc2 == {'no': 1, 'must': 1, 'ever': 1, 'kill': 1, 'any': 1, 'other': 1}
True
>>> get_semantic_descriptor('animal', s1)
{}

• get_all_semantic_descriptors: takes as input a list of lists representing the words in a text, where each sentence in the text is represented by a sublist of the input list. The function returns a dictionary d such that for every word w that appears in at least one of the sentences, d[w] is itself a dictionary which represents the semantic descriptor vector of w (note: the variable names here are arbitrary).
For example:

>>> s = [['all', 'the', 'habits', 'of', 'man', 'are', 'evil'], \
['and', 'above', 'all', 'no', 'animal', 'must', 'ever', 'tyrannise', 'over', 'his', 'own', 'kind'], \
['weak', 'or', 'strong', 'clever', 'or', 'simple', 'we', 'are', 'all', 'brothers'], \
['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal'], \
['all', 'animals', 'are', 'equal']]
>>> d = get_all_semantic_descriptors(s)
>>> d['animal']['must']
3
>>> d['evil'] == {'all': 1, 'the': 1, 'habits': 1, 'of': 1, 'man': 1, 'are': 1}
True

• get_cos_sim: given two dictionaries representing semantic descriptor vectors, returns the cosine similarity between the two. As seen before, the cosine similarity between two vectors u = {u1, u2, ..., uN} and v = {v1, v2, ..., vN} is defined as:

$\mathrm{sim}(u, v) = \frac{u \cdot v}{\|u\| \, \|v\|} = \frac{\sum_{i=1}^{N} u_i v_i}{\sqrt{\left(\sum_{i=1}^{N} u_i^2\right)\left(\sum_{i=1}^{N} v_i^2\right)}}$

If the norm of one of the two vectors is 0, then a ZeroDivisionError will be raised. That's ok! Just be mindful about this when you use this function. For example,

>>> round(get_cos_sim({"a": 1, "b": 2, "c": 3}, {"b": 4, "c": 5, "d": 6}), 2)
0.7
>>> s = [['all', 'the', 'habits', 'of', 'man', 'are', 'evil'], \
['and', 'above', 'all', 'no', 'animal', 'must', 'ever', 'tyrannise', 'over', 'his', 'own', 'kind'], \
['weak', 'or', 'strong', 'clever', 'or', 'simple', 'we', 'are', 'all', 'brothers'], \
['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal'], \
['all', 'animals', 'are', 'equal']]
>>> d = get_all_semantic_descriptors(s)
>>> v1 = d['evil']
>>> v2 = d['animal']
>>> round(get_cos_sim(v1, v2), 4)
0.0595

• get_euc_sim: given two dictionaries representing semantic descriptor vectors, returns the similarity between the two using the negative euclidean distance. This similarity measure is computed using the following formula:

$\mathrm{sim}_{\mathrm{euc}}(v_1, v_2) = -\|v_1 - v_2\|$

where v1 and v2 are two vectors. Remember that ‖·‖ is the notation indicating the norm of a vector; see the intro section for the formula that defines it. For example:

>>> round(get_euc_sim({"a": 1, "b": 2, "c": 3}, {"b": 4, "c": 5, "d": 6}), 2)
-6.71
>>> s = [['all', 'the', 'habits', 'of', 'man', 'are', 'evil'], \
['and', 'above', 'all', 'no', 'animal', 'must', 'ever', 'tyrannise', 'over', 'his', 'own', 'kind'], \
['weak', 'or', 'strong', 'clever', 'or', 'simple', 'we', 'are', 'all', 'brothers'], \
['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal'], \
['all', 'animals', 'are', 'equal']]
>>> d = get_all_semantic_descriptors(s)
>>> v1 = d['evil']
>>> v2 = d['animal']
>>> round(get_euc_sim(v1, v2), 4)
-7.1414

• get_norm_euc_sim: given two dictionaries representing semantic descriptor vectors, returns the similarity between the two using the negative euclidean distance between the normalized vectors. This similarity measure is computed using the following formula:

$\mathrm{sim}_{\mathrm{eucnorm}}(v_1, v_2) = -\left\| \frac{v_1}{\|v_1\|} - \frac{v_2}{\|v_2\|} \right\|$

where v1 and v2 are two vectors. Remember that ‖·‖ is the notation indicating the norm of a vector; see the intro section for the formula that defines it. This function should not modify the input dictionaries. For example:

>>> round(get_norm_euc_sim({"a": 1, "b": 2, "c": 3}, {"b": 4, "c": 5, "d": 6}), 2)
-0.77
>>> s = [['all', 'the', 'habits', 'of', 'man', 'are', 'evil'], \
['and', 'above', 'all', 'no', 'animal', 'must', 'ever', 'tyrannise', 'over', 'his', 'own', 'kind'], \
['weak', 'or', 'strong', 'clever', 'or', 'simple', 'we', 'are', 'all', 'brothers'], \
['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal'], \
['all', 'animals', 'are', 'equal']]
>>> d = get_all_semantic_descriptors(s)
>>> v1 = d['evil']
>>> v2 = d['animal']
>>> round(get_norm_euc_sim(v1, v2), 4)
-1.3715
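As referenced above, here is one possible shape for get_cos_sim, reusing the helpers from vectors_utils.py (a minimal sketch, assuming those functions behave as specified):

from vectors_utils import get_dot_product, get_vector_norm

def get_cos_sim(u, v):
    # Dot product divided by the product of the norms. If either norm
    # is 0.0, the division raises ZeroDivisionError, which is exactly
    # the behaviour the specification asks for.
    return get_dot_product(u, v) / (get_vector_norm(u) * get_vector_norm(v))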
Processing files

We are now ready to create a module called file_processing.py which contains several functions that allow us to read a file and extract a dictionary of semantic descriptors from it. As always, you can add helper functions if you want to.

For full marks, all the following functions must be part of this module (a sketch of one possible sentence-splitting strategy appears after this list):

• get_sentences: given a string, returns a list of strings, each representing one of the sentences from the input string. You should assume that the punctuation marks ".", "!", "?" always separate sentences, and that they are the only punctuation that separates sentences. Sentences should not begin nor end with a space character. There must be no empty strings in the output list. Note that if the input string does not contain any of the characters ".", "!", "?", then the function will consider the whole string as a single sentence. For example,

>>> text = "No animal must ever kill any other animal. All animals are equal."
>>> get_sentences(text)
['No animal must ever kill any other animal', 'All animals are equal']
>>> t = "Are you insane? Of course I want to leave the Dursleys! Have you got a house? When can I move in?"
>>> get_sentences(t)
['Are you insane', 'Of course I want to leave the Dursleys', 'Have you got a house', 'When can I move in']

• get_word_breakdown: given a string, returns a 2D list of strings. Each sublist contains the strings representing the words of one sentence. You should assume that the punctuation marks ".", "!", "?" always separate sentences, and that they are the only punctuation that separates sentences. Besides that, you should assume that the only punctuation present in the texts is the following: [',', '-', '--', ':', ';', '"', "'"]. The strings representing words should not contain any white space (i.e. space characters, tabs, or new lines) and all their characters must be lowercase. There must be no empty strings in the output lists. For example,

>>> text = "All the habits of Man are evil. And, above all, no animal must ever tyrannise over his \
own kind. Weak or strong, clever or simple, we are all brothers. No animal must ever kill \
any other animal. All animals are equal."
>>> s = [['all', 'the', 'habits', 'of', 'man', 'are', 'evil'], \
['and', 'above', 'all', 'no', 'animal', 'must', 'ever', 'tyrannise', 'over', 'his', 'own', 'kind'], \
['weak', 'or', 'strong', 'clever', 'or', 'simple', 'we', 'are', 'all', 'brothers'], \
['no', 'animal', 'must', 'ever', 'kill', 'any', 'other', 'animal'], \
['all', 'animals', 'are', 'equal']]
>>> w = get_word_breakdown(text)
>>> s == w
True

• build_semantic_descriptors_from_files: given a list of file names (strings) as input, returns a dictionary of the semantic descriptors of all the words in the files received as input, with the files treated as a single text. To open a file use open(filename, "r", encoding="utf-8"), where filename is a string. For example, assume that the following text is written inside a file named animal_farm.txt:

All the habits of Man are evil. And, above all, no animal must ever tyrannise over his own kind. Weak or strong, clever or simple, we are all brothers. No animal must ever kill any other animal. All animals are equal.

And the following text is written inside a file named alice.txt:

"If you didn't sign it," said the King, "that only makes the matter worse. You must have meant some mischief, or else you'd have signed your name like an honest man." There was a general clapping of hands at this: it was the first really clever thing the King had said that day.

Then,

>>> d = build_semantic_descriptors_from_files(['animal_farm.txt'])
>>> d['animal']['must']
3
>>> d['evil'] == {'all': 1, 'the': 1, 'habits': 1, 'of': 1, 'man': 1, 'are': 1}
True
>>> d = build_semantic_descriptors_from_files(['animal_farm.txt', 'alice.txt'])
>>> 'king' in d['clever']
True
>>> 'brothers' in d['clever']
True
>>> len(d['man'])
21
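As mentioned above, one common strategy for get_sentences (a sketch, not the required approach) is to normalize '!' and '?' to '.' and then split on '.':

def get_sentences(text):
    # Treat all three terminators uniformly by replacing '!' and '?'
    # with '.', then split the text on '.'.
    text = text.replace('!', '.').replace('?', '.')
    sentences = []
    for piece in text.split('.'):
        piece = piece.strip()   # drop surrounding whitespace
        if piece != '':         # skip empty fragments
            sentences.append(piece)
    return sentences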
Guessing synonyms

Finally, we can create a module called synonyms_solver.py which contains the functions that allow our program to answer synonym questions. As always, you can add helper functions if you want to.

For full marks, all the following functions must be part of this module (a sketch of the core logic appears after this list):

• most_sim_word: This function takes four inputs: a string word, a list of strings choices, a dictionary semantic_descriptors which is built according to the requirements for get_all_semantic_descriptors, and a similarity function similarity_fn. The function returns the element of choices which has the largest semantic similarity to word, with the semantic similarity computed using the data in semantic_descriptors and the similarity function similarity_fn. The similarity function is a function which takes in two sparse vectors stored as dictionaries and returns a float; an example of such a function is get_cos_sim. If the semantic similarity between two words cannot be computed, it is considered to be float('-inf'). In case of a tie between several elements in choices, the one with the smallest index in choices should be returned (e.g., if there is a tie between choices[5] and choices[7], choices[5] is returned). If the function cannot compute the semantic similarity between word and any of the strings in choices, then it should return an empty string. For example,

>>> choices = ['dog', 'cat', 'horse']
>>> c = {'furry' : 3, 'grumpy' : 5, 'nimble' : 4}
>>> f = {'furry' : 2, 'nimble' : 5}
>>> d = {'furry' : 3, 'bark' : 5, 'loyal' : 8}
>>> h = {'race' : 4, 'queen' : 2}
>>> sem_descs = {'cat' : c, 'feline' : f, 'dog' : d, 'horse' : h}
>>> most_sim_word('feline', choices, sem_descs, get_cos_sim)
'cat'

• run_sim_test: This function takes three inputs: a string filename, a dictionary semantic_descriptors, and a function similarity_fn. The string is the name of a file in the same format as test.txt. The function returns the percentage (i.e., a float between 0.0 and 100.0) of questions on which most_sim_word guesses the answer correctly, using the semantic descriptors stored in semantic_descriptors and the similarity function similarity_fn.

The format of test.txt is as follows. On each line, we are given a word (all-lowercase), the correct answer, and the choices. For example, the line:

feline cat dog cat horse

represents the question:

feline: (a) dog (b) cat (c) horse

and indicates that the correct answer is “cat”. For example,

>>> descriptors = build_semantic_descriptors_from_files(['test.txt'])
>>> run_sim_test('test.txt', descriptors, get_cos_sim)
15.0

• generate_bar_graph: given a list of similarity functions and a string filename (which is the name of a file in the same format as test.txt), generates a bar graph (using matplotlib) where the performance of each function on the given test file is plotted. The graph should be saved in a file named synonyms_test_results.png.

Download the novels Swann's Way by Marcel Proust and War and Peace by Leo Tolstoy from Project Gutenberg, and use them (at the same time) to build a semantic descriptors dictionary. Please save the novels inside files with the following names: swanns_way.txt and war_and_peace.txt. Note: the program may take several minutes to run (or more, if your implementation is inefficient). The novels are available at the following URLs:

http://www.gutenberg.org/cache/epub/7178/pg7178.txt
http://www.gutenberg.org/cache/epub/2600/pg2600.txt
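As referenced above, the core of most_sim_word is a "best so far" scan over choices, and generate_bar_graph is a thin wrapper around run_sim_test plus matplotlib. The sketch below is one possible shape; it assumes that a missing descriptor (KeyError) or a zero norm (ZeroDivisionError) are the situations in which the similarity "cannot be computed", and that the descriptors for the bar graph are built from the two novels:

import matplotlib.pyplot as plt
from file_processing import build_semantic_descriptors_from_files

def most_sim_word(word, choices, semantic_descriptors, similarity_fn):
    # Keep the best choice seen so far; ties keep the earlier choice
    # because we only replace on a strictly greater similarity.
    best_choice = ''
    best_sim = float('-inf')
    for choice in choices:
        try:
            sim = similarity_fn(semantic_descriptors[word],
                                semantic_descriptors[choice])
        except (KeyError, ZeroDivisionError):
            sim = float('-inf')   # similarity cannot be computed
        if sim > best_sim:
            best_sim = sim
            best_choice = choice
    return best_choice

def generate_bar_graph(similarity_fns, filename):
    # run_sim_test is assumed to be defined earlier in this module.
    descriptors = build_semantic_descriptors_from_files(
        ['swanns_way.txt', 'war_and_peace.txt'])
    names = [fn.__name__ for fn in similarity_fns]
    scores = [run_sim_test(filename, descriptors, fn)
              for fn in similarity_fns]
    plt.bar(names, scores)                    # one bar per function
    plt.ylabel('Percentage of correct guesses')
    plt.savefig('synonyms_test_results.png')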
What To Submit

You must submit all your files on codePost (https://codepost.io/). The files you should submit are listed below. Any deviation from these requirements may lead to lost marks.

vectors_utils.py
similarity_measures.py
file_processing.py
synonyms_solver.py
README.txt

In the README.txt file, you can tell the TA about any issues you ran into doing this assignment. If you point out an error that you know occurs in your program, it may lead the TA to give you more partial credit. Remember that this assignment, like all others, is an individual assignment and must represent the entirety of your own work. You are permitted to discuss it verbally with your peers, as long as no written notes are taken. If you do discuss it with anyone, please make note of those people in this README.txt file. If you didn't talk to anybody and have nothing you want to tell the TA, just say "nothing to report" in the file.