2021/4/15 COMP2041 21T1 — Assignment 2: speed, Speed
https://cgi.cse.unsw.edu.au/~cs2041/21T1/assignments/ass2/index.html 1/11
Assignment 2: speed, Speed
version: 0.4 last updated: 2021-03-15 2000
NOTE:
The material in the lecture notes will not be sufficient by itself to allow you to complete this assignment.
You may need to search the command-line and on-line documentation for Perl, Sed, Regex, etc.
Being able to search documentation efficiently for the information you need is a very useful skill for any kind of computing
work.
Aims
This assignment aims to give you:
practice in Perl programming generally.
a clear and concrete understanding of sed's core semantics.
Introduction
Your task in this assignment is to implement Speed, a subset of the important Unix/Linux tool Sed.
You will do this in Perl, hence the name Speed.
Sed is a very complex program which has many commands.
You will implement only a few of the most important commands.
You will also be given a number of simplifying assumptions, which make your task easier.
Speed is a POSIX-compatible subset of sed with extended regular expressions (EREs).
On CSE systems you would run sed -r.
You must implement Speed in Perl only.
See the Permitted Languages section below for more information.
Reference implementation
Many aspects of this assignment are not fully specified in this document;
instead, you must match the behaviour of the reference implementation: 2041 speed
Provision of a reference implementation is a common method to provide or define an operational specification,
and it's something you will likely need to do after you leave UNSW.
Discovering and matching the reference implementation's behaviour is deliberately part of the assignment,
and will take some thought.
If you discover what you believe to be a bug in the reference implementation, report it in the class forum.
Andrew and Dylan may fix the bug, or indicate that you do not need to match the reference implementation's behaviour in this case.
Speed Commands
Subset 0
In subset 0 speed.pl will always be given a single Speed command as a command-line argument.
The Speed command will be one of 'q', 'p', 'd', or 's' (see below).
The only other command-line argument possible in subset 0 is the -n option.
Input files will not be specified in subset 0.
For subset 0 speed.pl need only read from standard input.
It is possible that, in response to student queries, there will be minor changes to subset 2.
This may entail minor changes to work on subset 2 you have already completed.
Autotests are incomplete. More will be added.
Subset 0: addresses
All Speed commands in subset 0 can optionally be preceded by an address specifying the line(s) they apply to.
In subset 0, this address can be either a line number or a regex.
The line number must be a positive integer.
The regex must be delimited with slash / characters.
Subset 0: Regexes
In subset 0, regexes can be any ERE, except for the following limitations.
In subset 0, you can assume backslashes \ do not appear in address or substitution regexes.
In subset 0, you can assume semicolons ; do not appear in address or substitution regexes.
In subset 0, you can assume commas , do not appear in address or substitution regexes.
In subset 0, regexes are delimited with slash / characters, so you can assume slashes do not appear in regexes.
In subset 0, you can assume the regex is correct. You do not have to check for errors in the regex.
In subset 0, you can assume the regex is compatible with Perl.
In other words, the regex can be used as a Perl regular expression and will have the same meaning.
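As a rough cross-check of the ERE rules above, the sketch below runs two of the address regexes from this document through sed -E. It assumes GNU sed is installed and uses it as a stand-in for 2041 speed, which it may not match in every detail; the variable names are illustrative only.

```shell
# Assumption: GNU sed -E (extended regular expressions) stands in for 2041 speed.
# {3} is an ERE interval, so it needs no backslashes.
first_triple=$(seq 100 1000 | sed -E -n '/1{3}/p' | head -n 1)
# + and $ keep their usual ERE meaning as well.
ends_in_5=$(seq 500 510 | sed -E -n '/^.+5$/p')
echo "$first_triple"   # 111
echo "$ends_in_5"      # 505
```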
Subset 0: q - quit command
The Speed q command causes speed.pl to exit, for example:
$ seq 1 5 | 2041 speed '3q'
1
2
3
$ seq 9 20 | 2041 speed '3q'
9
10
11
$ seq 10 15 | 2041 speed '/.1/q'
10
11
$ seq 500 600 | 2041 speed '/^.+5$/q'
500
501
502
503
504
505
$ seq 100 1000 | 2041 speed '/1{3}/q'
100
101
102
103
104
105
106
107
108
109
110
111
Subset 0: p - print command
The Speed p command prints the input line, for example:
$ seq 1 5 | 2041 speed '2p'
1
2
2
3
4
5
$ seq 7 11 | 2041 speed '4p'
7
8
9
10
10
11
$ seq 65 85 | 2041 speed '/^7/p'
65
66
67
68
69
70
70
71
71
72
72
73
73
74
74
75
75
76
76
77
77
78
78
79
79
80
81
82
83
84
85
Subset 0: d - delete command
The Speed d command deletes the input line, for example:
$ seq 1 5 | 2041 speed '4d'
1
2
3
5
$ seq 1 100 | 2041 speed '/.{2}/d'
1
2
3
4
5
6
7
8
9
$ seq 11 20 | 2041 speed '/[2468]/d'
11
13
15
17
19
Subset 0: s - substitute command
The Speed s command replaces the specified regex on the input line.
$ seq 1 5 | 2041 speed 's/[15]/zzz/'
zzz
2
3
4
zzz
$ seq 10 20 | 2041 speed 's/[15]/zzz/'
zzz0
zzz1
zzz2
zzz3
zzz4
zzz5
zzz6
zzz7
zzz8
zzz9
20
$ seq 100 111 | 2041 speed 's/11/zzz/'
100
101
102
103
104
105
106
107
108
109
zzz0
zzz1
The substitute command can optionally be followed by the modifier character g, for example:
$ echo Hello Andrew | 2041 speed 's/e//'
Hllo Andrew
$ echo Hello Andrew | 2041 speed 's/e//g'
Hllo Andrw
g is the only permitted modifier character.
Just like the other commands, the substitute command can be given an address specifying the lines it applies to:
$ seq 11 19 | 2041 speed '5s/1/2/'
11
12
13
14
25
16
17
18
19
$ seq 51 60 | 2041 speed '5s/5/9/g'
51
52
53
54
99
56
57
58
59
60
Subset 0: -n command line option
The Speed -n command line option stops input lines being printed by default.
$ seq 1 5 | 2041 speed -n '3p'
3
$ seq 2 3 20 | 2041 speed -n '/^1/p'
11
14
17
The -n command line option is only useful in conjunction with the p command,
but can still be used with the other commands.
Subset 1
Subset 1 is more difficult. You will need to spend some time understanding the semantics (meaning) of these operations, by running
the reference implementation and researching the equivalent sed operations.
Note the assessment scheme recognises this difficulty.
Subset 1: addresses
In subset 1, $ can be used as an address.
It matches the last line, for example:
$ seq 1 5 | 2041 speed '$d'
1
2
3
4
$ seq 1 10000 | 2041 speed -n '$p'
10000
In subset 1, Speed commands can optionally be preceded by a comma-separated pair of addresses specifying the start and finish of
the range of lines the command applies to, for example:
$ seq 10 21 | 2041 speed '3,5d'
10
11
15
16
17
18
19
20
21
$ seq 10 21 | 2041 speed '3,/2/d'
10
11
21
$ seq 10 21 | 2041 speed '/2/,4d'
10
11
14
15
16
17
18
19
Comma-separated pairs of addresses can be used with the p, d, and s commands.
Subset 1: Regexes
All the rules from Subset 0 about regexes still apply, except:
In subset 1, substitute regexes are not always delimited with slash / characters,
so you cannot assume slashes do not appear in regexes.
You can assume that whatever the delimiter is, it will not appear in regexes.
Subset 1: s - substitute command
In subset 1, any non-whitespace character may be used to delimit a substitute command, for example:
$ seq 1 5 | 2041 speed 'sX[15]XzzzX'
zzz
2
3
4
zzz
$ seq 1 5 | 2041 speed 's?[15]?zzz?'
zzz
2
3
4
zzz
$ seq 1 5 | 2041 speed 's_[15]_zzz_'
zzz
2
3
4
zzz
$ seq 1 5 | 2041 speed 'sX[15]Xz/z/zX'
z/z/z
2
3
4
z/z/z
Subset 1: Multiple Commands
In subset 1, multiple Speed commands can be supplied separated by semicolons ; or newlines, for example:
$ seq 1 5 | 2041 speed '4q;/2/d'
1
3
4
$ seq 1 5 | 2041 speed '/2/d;4q'
1
3
4
$ seq 1 20 | 2041 speed '/2$/,/8$/d;4,6p'
1
9
10
11
19
20
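In the last example above, 4,6p prints nothing extra because lines 4 to 6 have already been deleted by the time p is reached. The sketch below isolates that effect. It assumes GNU sed -E behaves like the reference implementation here, and the variable names are illustrative.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
# Commands run in order for each input line; d discards the line and
# moves on to the next one, so a later p never sees it.
d_then_p=$(seq 1 3 | sed -E '2d;2p')
# With the order reversed, p prints line 2 before d removes it.
p_then_d=$(seq 1 3 | sed -E '2p;2d')
printf '%s\n' "$d_then_p"   # 1 and 3
printf '%s\n' "$p_then_d"   # 1, 2 and 3
```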
Subset 1: -f command line option
The Speed -f command line option reads Speed commands from the specified file, for example:
$ echo 4q > commands.speed
$ echo /2/d >> commands.speed
$ seq 1 5 | 2041 speed -f commands.speed
1
3
4
$ echo /2/d > commands.speed
$ echo 4q >> commands.speed
$ seq 1 5 | 2041 speed -f commands.speed
1
3
4
Commands in the file can be separated by semicolons ; or newlines.
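A command file can mix both separators. The sketch below is a minimal illustration, assuming GNU sed -E as a stand-in for 2041 speed; the temporary file and the variable names are hypothetical.

```shell
# Assumption: GNU sed -E stands in for 2041 speed; the script file is a throwaway.
script=$(mktemp)
# First line holds one command; second line holds two commands joined by a semicolon.
printf '%s\n' '/2/d' '4q;s/3/three/' > "$script"
from_file=$(seq 1 5 | sed -E -f "$script")
rm -f "$script"
# Line 2 is deleted, line 3 is rewritten, and line 4 quits before s runs.
printf '%s\n' "$from_file"
```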
Subset 1: Input Files
In subset 1, input files can be specified on the command line:
$ seq 1 2 > two.txt
$ seq 1 5 > five.txt
$ 2041 speed '4q;/2/d' two.txt five.txt
1
1
2
$ seq 1 2 > two.txt
$ seq 1 5 > five.txt
$ 2041 speed '4q;/2/d' five.txt two.txt
1
3
4
$ echo 4q > commands.speed
$ echo /2/d >> commands.speed
$ seq 1 2 > two.txt
$ seq 1 5 > five.txt
$ 2041 speed -f commands.speed two.txt five.txt
1
1
2
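The examples above suggest that the input files are treated as one continuous stream, so line numbers do not restart at each file. The sketch below checks this with GNU sed -E, which is assumed (not confirmed) to match the reference implementation, including for the $ address; the directory and variable names are illustrative.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
dir=$(mktemp -d)
seq 1 2 > "$dir/two.txt"
seq 1 5 > "$dir/five.txt"
# Line numbers keep counting across files: line 4 overall is line 2 of five.txt.
fourth=$(sed -E -n '4p' "$dir/two.txt" "$dir/five.txt")
# $ matches only the last line of the last file.
last=$(sed -E -n '$p' "$dir/two.txt" "$dir/five.txt")
rm -r "$dir"
echo "$fourth"   # 2
echo "$last"     # 5
```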
Subset 1: Comments & White Space
In subset 1, whitespace can appear before and/or after commands and addresses.
In subset 1, '#' can be used as a comment character, for example:
$ seq 24 42 | 2041 speed ' 3, 17 d # comment'
24
25
41
42
On both the command line and in a command file, a semicolon ; or newline ends a comment.
$ seq 24 42 | 2041 speed '/2/d # delete ; 4 q # quit'
30
31
33
34
35
36
37
38
39
40
41
Subset 2
Subset 2 is even more difficult. You will need to spend considerable time understanding the semantics of these operations, by
running the reference implementation, and/or researching the equivalent sed operations.
Note the assessment scheme recognises this difficulty.
Subset 2: Regexes
In subset 2, backslash \ may appear in regexes.
In subset 2, the character used to delimit the regex may appear in the regex itself.
Subset 2: s - substitute command
In subset 2, you cannot assume that the replacement string is compatible with Perl, and you cannot assume it can be used in a Perl
substitute command with the same effect.
In subset 2, the character used to delimit the substitute command may appear in the regex or replacement string.
In subset 2, backslash may appear in the regex or replacement string.
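The sketch below shows a few sed replacement-string features that differ from a Perl substitute command, plus a backslash-escaped delimiter. It assumes GNU sed -E as a stand-in for 2041 speed; the reference implementation remains the authority on edge cases, and the variable names are illustrative.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
# & in the replacement stands for the whole match (Perl would use $&).
wrapped=$(echo 'line 5' | sed -E 's/[0-9]+/<&>/')
# \1 and \2 refer to capture groups (a Perl replacement would use $1 and $2).
swapped=$(echo 'hello world' | sed -E 's/(hello) (world)/\2 \1/')
# A backslash lets the delimiter appear in the regex and in the replacement.
slashes=$(echo 'a/b' | sed -E 's/a\/b/x\/y/')
echo "$wrapped"   # line <5>
echo "$swapped"   # world hello
echo "$slashes"   # x/y
```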
Subset 2: -i command line option
The Speed -i command line option replaces file contents with the output of the Speed commands. You should use a temporary file.
$ seq 1 5 >five.txt
$ cat five.txt
1
2
3
4
5
$ 2041 speed -i /[24]/d five.txt
$ cat five.txt
1
3
5
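One way to read the temporary-file advice is to write the output to a scratch file and then move it over the original. The sketch below shows that pattern with sed -E standing in for the Speed commands; the file names and variables are hypothetical, and this is not necessarily how the reference implementation does it.

```shell
# Assumption: sed -E stands in for 2041 speed; both files are throwaways.
file=$(mktemp)
tmp=$(mktemp)
seq 1 5 > "$file"
# Write the edited lines to the scratch file, then replace the original only on success.
sed -E '/[24]/d' "$file" > "$tmp" && mv "$tmp" "$file"
after=$(cat "$file")
rm -f "$file"
printf '%s\n' "$after"   # 1, 3 and 5
```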
Subset 2: Multiple Commands
In subset 2, semicolons can appear inside Speed commands.
$ echo 'Punctuation characters include . , ; :'|2041 speed 's/;/semicolon/g;/;/q'
Punctuation characters include . , semicolon :
In subset 2, a newline can be used to separate Speed commands passed as a command-line argument.
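Both points can be tried out with sed -E, assumed here to behave like 2041 speed. The variable names are illustrative.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
# A newline inside the quoted argument separates the two commands.
newline_sep=$(seq 1 5 | sed -E '4q
/2/d')
# The ; inside s/;/.../g and inside /;/ belongs to the command; only the
# ; after g separates the two commands.
text='Punctuation characters include . , ; :'
semis=$(echo "$text" | sed -E 's/;/semicolon/g;/;/q')
printf '%s\n' "$newline_sep"   # 1, 3 and 4
echo "$semis"
```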
Subset 2: : - label command
The Speed : command indicates where b and t commands should continue execution.
There cannot be an address before a label command.
Subset 2: b - branch command
The Speed b command branches to the specified label; if the label is omitted, it branches to the end of the script.
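A minimal sketch of both forms of b, using GNU sed -E as an assumed stand-in for 2041 speed. The label name skip and the variable names are made up for illustration.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
# b with a label: lines containing 2 or 4 jump past the s command.
to_label=$(seq 1 5 | sed -E '
/[24]/b skip
s/$/ odd/
:skip
')
# b without a label: jump to the end of the script; the line is still printed.
to_end=$(seq 1 3 | sed -E '
/2/b
s/$/!/
')
printf '%s\n' "$to_label"
printf '%s\n' "$to_end"
```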
Subset 2: t - conditional branch command
The Speed t command behaves the same as the b command except it branches only if there has been a successful substitute
command since the last input line was read and since the last t command.
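A common use of t is a loop that repeats a substitution until it no longer matches. The sketch below groups digits with commas. It assumes GNU sed -E behaves like 2041 speed, and the label name again is arbitrary.

```shell
# Assumption: GNU sed -E stands in for 2041 speed.
# Each pass inserts one comma in front of a group of three digits that is
# followed by a comma or the end of the line. t jumps back to :again only
# after a successful substitution, so the loop stops once s fails to match.
grouped=$(echo 1234567 | sed -E '
:again
s/([0-9])([0-9]{3})($|,)/\1,\2\3/
t again
')
echo "$grouped"   # 1,234,567
```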
Subset 2: a - append command
The Speed a command appends the specified text.
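For orientation, this is how the equivalent command behaves in GNU sed, using its one-line form where the text follows a on the same line. Whether 2041 speed accepts exactly this syntax is an assumption to check against the reference implementation.

```shell
# Assumption: GNU sed's one-line form of a stands in for the Speed a command.
# The text is printed after the addressed line.
appended=$(seq 1 3 | sed -E '2a hello')
printf '%s\n' "$appended"   # 1, 2, hello, 3
```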