Date: 2021-08-31

INFO1112 2021 Assignment 1 (version 2)
Due Week 4, Thursday 2nd September at 11:59PM (Midnight)
This assignment is worth 10% of your overall grade for the course.
Assessment
The assignment will be marked with an automatic testing system on Ed. A mark
will be given based on a percentage of tests passed (5%) and a manual mark will
be given for overall style, quality, readability, etc (3%). You are expected to write
your own tests and submit them with your code, and a mark will be given based on
coverage and manual inspection of your tests (2%).
There will be public test cases made available for you to test against, but there will
also be extra non-public tests used for marking. Success with the public tests
doesn't guarantee your program will pass the private tests.
This assignment involves writing a program (called texta) to transform text files
according to a set of commands in a file.
Unix and Unix-like systems (Linux etc) have a large number of programs for
manipulating text. In fact, the original Unix was developed to help people write
technical documentation. In those days everything was stored as plain text and
specialised document processing systems did the typesetting to produce the final
document. We still use a lot of plain text files today. There are still widely used
document processing systems such as LaTeX that take a plain text file and interpret
commands in the file to display typeset text. LaTeX is almost universal in the technical
academic world (CS, Eng, Science…). Text formats such as CSV (comma
separated values) are used to store data files. Log files that store information about
activity in a system are all stored in text.
There are many things we might like to do with a text file. For example, pick out all
lines that contain a certain string, or replace a certain string with another string, or
pick out particular substrings of the line. A good example of this occurs when you
analyse log files. A web server log might accumulate many thousands of entries
(lines) each day. If I want to find out how many people accessed a certain URL I
want to extract all relevant lines and count them. I might want the total number of
accesses from a particular client machine, or how many total bytes were transferred.
All of these examples require analysis of the text log file.
Unix has many commands for doing this analysis. Each of these commands does a
small task (for example, selecting lines that match a pattern) and we can string them
together in a shell pipeline.
For this assignment we're going to write a program that implements a few of these
operations and is controlled by a file of instructions. It will be a sort of Swiss
Army Knife that does everything.
The texta Command
Your command (called "texta") will read a file containing instructions (commands)
to do something with the input text. It will read lines from the input files and apply
each command, in sequence, to each line, sending the result to output.
The commands to implement are:
filter regexp # selects lines that match the regular expression regexp
fields "delimiter-string" a b c d # divides a source line into fields using the delimiter-string and keeps only the fields numbered a b c d, in the given order
replace "string1" "string2" # replaces string1 with string2
count # at the end of a run, prints a count of the number of output lines on stderr
A comment may be placed at the end of any command using a hash character (#).
Each command must consist of a single line.
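One possible way to split a command line into its parts is sketched below. The function names strip_comment and split_command are hypothetical, not part of the specification. The sketch assumes that a hash inside double quotes belongs to the string rather than starting a comment, which the specification does not state, and it returns quoted strings and bare words separately so that a regexp can be accepted in either form.

```python
import re


def strip_comment(line):
    """Remove a trailing # comment, ignoring any # inside double quotes (assumption)."""
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "#" and not in_quotes:
            return line[:i].rstrip()
    return line.rstrip()


def split_command(line):
    """Return (name, quoted_strings, bare_words), or None for a blank line."""
    body = strip_comment(line).strip()
    if not body:
        return None
    parts = body.split(None, 1)
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    # Quoted arguments cannot contain a double quote, so [^"]* is enough.
    quoted = re.findall(r'"([^"]*)"', rest)
    # Whatever is left after removing the quoted strings are bare words.
    bare = re.sub(r'"[^"]*"', " ", rest).split()
    return name, quoted, bare
```

For example, split_command('fields "," 2 # email only') gives ("fields", [","], ["2"]). Checking the command name and the number of arguments is left to the caller.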
filter
The regexp argument specifies a regular expression which is matched against each line. If
the expression matches, the line is passed to the next stage. If the expression does not
match, that line is skipped. You can implement regular expression matching easily in
Python using the "re" module. V2: Note that the double quote character cannot appear in a
regexp in this exercise. This is to make the exercise a bit easier - regular expressions can
normally include quotes.
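A minimal sketch of the filter stage using the re module follows. The name make_filter is hypothetical. It assumes the expression may match anywhere in the line (re.search rather than re.match), and that a rejected line is signalled by returning None.

```python
import re


def make_filter(regexp):
    """Build a stage that keeps a line only if regexp matches somewhere in it."""
    compiled = re.compile(regexp)  # raises re.error if the pattern is invalid

    def stage(line):
        # search() looks for a match anywhere in the line, not just at the start
        return line if compiled.search(line) else None

    return stage
```

Compiling once when the command file is read means an invalid pattern can be reported against its command line number before any input is processed.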
fields
The delimiter-string is any string of characters enclosed in double quotes ("). It is used to
break the line into a set of fields, numbered from 0. The numbers a b c etc. are used to
select the order that these fields are to be written (separated by delimiter-string) to the
output. This allows the line to be re-ordered, fields deleted etc. Note that the double quote
character cannot appear in a delimiter-string, and an empty string ("") means any white
space. V2: If the delimiter string is empty, output fields are separated with a single
space.
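The fields behaviour described above might be sketched as follows. The name apply_fields is hypothetical. The sketch assumes the line has already had its trailing newline removed and that the field numbers were validated as non-negative integers when the command file was read. A field number beyond the end of the line raises IndexError here, and how to report that is left to the caller.

```python
def apply_fields(delimiter, indexes, line):
    """Split line on delimiter and rejoin the selected fields in the given order."""
    if delimiter == "":
        parts = line.split()      # empty delimiter: split on any whitespace
        separator = " "           # and join the output with a single space
    else:
        parts = line.split(delimiter)
        separator = delimiter
    # parts[i] raises IndexError if the line has too few fields
    return separator.join(parts[i] for i in indexes)
```

With the sample data, apply_fields(",", [2], "Jim,Smith,jsmi4321@uni.sydney.edu.au,INFO1110") keeps only the email address.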
replace
This allows one string to be replaced by another. The strings are enclosed in double quotes.
This is done in one pass, left to right. If string1 is empty ("") it means any amount of
whitespace is to be replaced. If string2 is empty ("") it means remove string1. V2: The double
quote character cannot appear in the strings.
V2: a newline character (\n) is illegal in any of the strings in filter, fields or
replace.
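A sketch of the replace stage under one reading of the rules above. The name apply_replace is hypothetical. It assumes that, when string1 is empty, each run of whitespace is replaced by one copy of string2, and that the line no longer carries its trailing newline.

```python
import re


def apply_replace(string1, string2, line):
    """Replace string1 by string2 in a single left-to-right pass."""
    if string1 == "":
        # Empty string1: every run of whitespace becomes string2.
        # A function is used as the replacement so that backslashes in
        # string2 are inserted literally instead of being treated as escapes.
        return re.sub(r"\s+", lambda match: string2, line)
    # str.replace scans left to right and does not rescan replaced text.
    return line.replace(string1, string2)
```

An empty string2 needs no special case, since replacing by the empty string removes string1.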
count
This command sets a flag to tell texta to print the count of output lines to the standard error
file when the texta program finishes.
The general form of the texta command is:
texta cmdfile [file1 [file2 …]]
cmdfile is the file containing commands (eg filter etc)
file1 is a file to apply the commands to
file2… are more files to apply the commands to
If there are no files specified, texta reads from the standard input.
The processed lines are written to the standard output.
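The overall flow (read from the named files or standard input, run every command on every line, write the survivors) could be organised roughly as below. The names input_lines and process_lines are hypothetical, and each stage is assumed to be a function that takes a line and returns either the transformed line or None to drop it.

```python
import sys


def input_lines(paths, err=sys.stderr):
    """Yield lines from each named file, or from standard input if none are named."""
    if not paths:
        yield from sys.stdin
        return
    for path in paths:
        try:
            with open(path) as handle:
                yield from handle
        except OSError:
            # Unreadable file: report it and move on to the next one.
            err.write("Error: file " + path + " not readable\n")


def process_lines(stages, lines, out=sys.stdout):
    """Apply each stage in order to each line; return the number of lines written."""
    written = 0
    for raw in lines:
        line = raw.rstrip("\n")
        for stage in stages:
            line = stage(line)
            if line is None:  # a stage rejected the line
                break
        else:
            # Runs only when no stage rejected the line.
            out.write(line + "\n")
            written += 1
    return written
```

A main function might call process_lines(stages, input_lines(sys.argv[2:])) and, if the count command was seen, print the returned value to sys.stderr once all input is consumed.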
Error handling and assumptions
You should make sure the arguments to your texta command are correct: there must be a
cmdfile, and it must be readable and contain commands. If there are filenames, the files must
exist and be readable. You should check that the commands are legal, i.e. one of filter,
fields, replace, count, and that they have the right number of arguments. Also check
that the arguments are legal, e.g. no negative field numbers or non-integers.
Error messages must have the form:
Error: file name not readable
Error: command line N: bad field number
Error: command line N: incorrect number of strings in replace
Error: command line N: message
Error: message
where name is one of the file names on the command line, N is the line number in the
command file, and message is an informative error message.
V2: If an input file is unreadable give an error message and move on to the next file.
V2: The philosophy with errors is to give an informative message and try to keep going.
For example, if a line doesn't have the correct number of fields you should give an error
message on stderr but then go to the next command for the same line.
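The message formats and the field number check could be handled by two small helpers, sketched here. The names report and parse_field_numbers are hypothetical, and returning None for an invalid list is just one convention the caller could use before printing "bad field number".

```python
import re
import sys


def report(message, line_number=None, stream=sys.stderr):
    """Write an error in the required form, with the command line number if known."""
    if line_number is None:
        stream.write("Error: " + message + "\n")
    else:
        stream.write("Error: command line " + str(line_number) + ": " + message + "\n")


def parse_field_numbers(words):
    """Return the field numbers as ints, or None if any word is not a non-negative integer."""
    numbers = []
    for word in words:
        # Only plain decimal digits are accepted, so "-1", "1.5" and "x" all fail.
        if not re.fullmatch(r"[0-9]+", word):
            return None
        numbers.append(int(word))
    return numbers
```

Because report only writes a message and returns, the caller stays free to continue with the next command or the next file, in line with the keep-going philosophy.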
V2: It is impossible to cover all possible input states in a specification like this. If you find
something you think is ambiguous in the specification make your own judgement and justify
it with comments in the code. The marker will be reading your code.
Usage Examples
Given an input file (called testdata) in the comma separated value spreadsheet format
(CSV) containing the following data:
Jim,Smith,jsmi4321@uni.sydney.edu.au,INFO1110
Jane,Smith,jsmi1234@uni.sydney.edu.au,INFO1113
Bill,Smith,jsmi9876@uni.sydney.edu.au,INFO1110
And the following set of commands in the file testcmds:
filter "INFO1110" # select students in INFO1110
fields "," 2 # replace matching lines with the email address
count # print the number of lines
Then the command:
texta testcmds testdata
will output a set of lines containing email addresses only for students that are in INFO1110. It
will also print the count of output lines to the standard error file.
Standard output:
jsmi4321@uni.sydney.edu.au
jsmi9876@uni.sydney.edu.au
Standard error:
2
Another example with the same input but testcmds containing:
fields "," 2
replace "@uni.sydney.edu.au" "" # extract unikey
This will take the email address field and remove the "@uni.sydney.edu.au" part.
Standard output:
jsmi4321
jsmi1234
jsmi9876
Standard error will not have any output.
An example showing error messages
With testcmds containing:
fields "," x
Standard output:
Standard error:
Error: command line 1: bad field number
Implementation
The assignment is to be implemented in Python as a script which handles command
line arguments. A set of scaffold files will be provided. You are expected to write
legible code with good style.
The only Python modules which you are allowed to import are os, sys, and re. If you
want to use an additional module which will not trivialize the assignment, ask your
tutor, and the allowed library list may be extended.
Testing
You are expected to write a number of test cases for your program. These should be
simple input/output tests. Example tests will be included in the scaffold. You are
expected to test every execution path of your code.
Submitting your code
An Ed Lesson workspace will be available for you to test and submit your code.
Public test cases will be released up to Sunday the 29th of August. Additionally,
there will be a set of unreleased test cases which will be run against your code after
the due date.
Any attempt to deceive the automatic marking system will be subject to academic
dishonesty proceedings.
PLEASE NOTE
Sometimes we find typos or other errors in
specifications. Sometimes the specification could be
clearer. Students and tutors often make great
suggestions for improving the specification.
Therefore, this assignment specification may be
clarified up to the start of week 4. No major changes
will be made. Revised versions will be clearly marked
and the most recent version announced to the class
via Canvas and Ed Discussions.
Academic Declaration
By submitting this assignment you declare the following:
I declare that I have read and understood the University of Sydney Student
Plagiarism: Coursework Policy and Procedure, and except where specifically
acknowledged, the work contained in this assignment/project is my own work, and
has not been copied from other sources or been previously submitted for award or
assessment.
I understand that failure to comply with the Student Plagiarism: Coursework Policy
and Procedure can lead to severe penalties as outlined under Chapter 8 of the
University of Sydney By-Law 1999 (as amended). These penalties may be imposed
in cases where any significant portion of my submitted work has been copied without
proper acknowledgement from other sources, including published works, the
Internet, existing programs, the work of other students, or work previously submitted
for other awards or assessments.
I realise that I may be asked to identify those portions of the work contributed by me
and required to demonstrate my knowledge of the relevant material by answering
oral questions or by undertaking supplementary work, either written or in the
laboratory, in order to arrive at the final assessment mark.
I acknowledge that the School of Computer Science, in assessing this assignment,
may reproduce it entirely, may provide a copy to another member of faculty, and/or
communicate a copy of this assignment to a plagiarism checking service or in-house
computer program, and that a copy of the assignment may be maintained by the
service or the School of Computer Science for the purpose of future plagiarism
checking.

