Date: 2024-08-27

INFO1112 - Assignment 1 – mysh
Your task is to implement your own Unix shell!
Ultimately, your shell should support the following features:
- a custom initialisation file, .myshrc, which is formatted as JSON
- running and executing any program on the system's PATH
- defining your own custom commands using JSON files
- additional quality-of-life features to mimic other Unix shells
Without further ado, let's break down the main components of how the shell works!
Starting and Running the Shell
Starting the Shell
Starting the shell is simply done by running:
    python3 mysh.py
The shell expects no additional arguments, and any additional arguments are silently ignored.
Running the Shell
When the shell is running, you will see the prompt that has been set from the PROMPT environment variable,
and the user will be able to type in commands to be executed:
    >>
Effectively, your shell will run in an infinite loop until it is terminated, either by EOF (e.g. Ctrl + D), or
by running the exit built-in command (see the exit built-in command for more details). You can think of the
shell as an "event-driven" program: it waits for a command input (the event), which it then reacts to.
While the prompt is being displayed, there are a couple of extra things to keep in mind:
- If the user enters Ctrl + C, but no command is running (i.e. the prompt is being displayed), Ctrl + C should
  be silently ignored, and a new prompt should be displayed.
- If the user inputs a blank command (an empty line, with or without whitespace), the shell should simply
  display a new prompt.
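The loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: handle_line is a hypothetical callback standing in for whatever parses and executes a command, the ">> " fallback prompt is a guess based on the prompt shown above, and printing a newline after Ctrl + C is one possible way to get onto a fresh line. The exit built-in is not handled here.

```python
import os


def run_shell(handle_line, read_line=input):
    """Prompt, read one line, dispatch it; return once EOF is reached.

    handle_line and read_line are hypothetical hooks used for illustration.
    """
    while True:
        # The fallback value is an assumption, not something the spec states.
        prompt = os.environ.get("PROMPT", ">> ")
        try:
            line = read_line(prompt)
        except KeyboardInterrupt:
            # Ctrl + C at the prompt: ignore it and show a new prompt.
            print()
            continue
        except EOFError:
            # Ctrl + D: print the extra newline, then stop.
            print()
            return
        if not line.strip():
            # Blank command (empty or whitespace only): just prompt again.
            continue
        handle_line(line)
```

Passing the reader in as a parameter keeps the loop easy to exercise without a real terminal; in the shell itself the default input would be used.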
Note
If your shell is terminated by EOF, make sure to print an extra newline character! (Thank you to the
students who helped raise this to be compatible with Ed test cases, as this was initially overlooked!)
Syntax and Execution
Splitting a Line Into Arguments
Like other Unix shells, mysh is a line-based command parser. A user inputs one line at a time, and the line is
executed as a command.
When we input one line, the line is split by whitespace (spaces, tabs, etc.), with one exception covered
below, and each "word" forms an "argument". For example, the line:
    hello there how are you
can be split into a list of arguments (using Python syntax) as:
    ["hello", "there", "how", "are", "you"]
There is one exception to this: quoted strings (both single and double quotes)! The outer level of quotes is
stripped from the quoted string. If a string is inside quotes and it includes whitespace, the whitespace is
preserved.
For example, the below line:
    hello there "how are you" 'going today'
will be split as:
    ["hello", "there", "how are you", "going today"]
while the below line:
    hello there 'my name is "Sarah"'
will be split as:
    ["hello", "there", 'my name is "Sarah"']
Quotes can also be escaped by typing a backslash (\) before the quote character. This means that if you have a
quoted argument (single or double quotes), and type the same type of quote character preceded by a
backslash (e.g. \" or \' ), it won't be interpreted as a closing quote, but as a literal quote character, similar
to the behaviour of Python.
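As a rough illustration of these splitting rules, the sketch below uses Python's shlex module. One detail worth knowing: in POSIX mode shlex only honours backslash escapes inside double quotes by default, so this sketch sets the escapedquotes attribute to cover single quotes as well. The function names split_line and parse_or_report are made up for this example, and treating every ValueError as an unterminated quote is a simplification (shlex also raises ValueError for a trailing backslash).

```python
import shlex
import sys


def split_line(line):
    """Split one input line into arguments, stripping the outer quotes."""
    lexer = shlex.shlex(line, posix=True)
    lexer.whitespace_split = True  # split on whitespace only
    lexer.commenters = ""          # do not treat '#' as a comment start
    lexer.escapedquotes = "\"'"    # allow \" and \' inside both quote types
    return list(lexer)


def parse_or_report(line):
    """Return the argument list, or None after printing the syntax error."""
    try:
        return split_line(line)
    except ValueError:
        # Simplification: any shlex parse failure is reported the same way.
        print("mysh: syntax error: unterminated quote", file=sys.stderr)
        return None
```

For instance, split_line applied to the line hello there "how are you" 'going today' gives four arguments, with the inner whitespace preserved.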
For example, with escaped quotes, the below line:
    my "\"string\"" which 'is \'quoted\''
will be split as:
    ["my", '"string"', "which", "is 'quoted'"]
Tip
The shlex module will be incredibly useful to help parse these strings, and will meet the edge cases
below! Please use it!
Edge cases
- It is valid to have double quotes inside single quotes (as in the above example), and single quotes inside
  double quotes.
- It is valid for quotes to be spliced inside of arguments, and they can even be next to each other, as long
  as the outer layer of quotes is stripped out! For example, the line below:
      hello there "quotes are""next to 'each'"-other
  should be split as:
      ["hello", "there", "quotes arenext to 'each'-other"]
Error cases
- An unterminated quoted argument should be considered a syntax error by your program, which should
  output an error message to stderr in the format below. For example, for the line:
      line with "unterminated string
  the program should output:
      mysh: syntax error: unterminated quote
  to stderr .
- Note that unterminated quote characters are valid if they are inside quotes of a different type. For
  example, the line:
      this is "Tom's PC"
  can be validly split as:
      ["this", "is", "Tom's PC"]
Executing Commands
When a line is split, the first "word" is always treated as the name of the command to execute. For example,
the line:
    sort input.txt -o output.txt
will be parsed as:
    ["sort", "input.txt", "-o", "output.txt"]
with "sort" being the name of the command (or program) we wish to execute, and ["input.txt", "-o",
"output.txt"] being the arguments passed to the "sort" program.
The shell first attempts to match the command name to a built-in command (see Built-in Commands for all
built-in commands). Otherwise, it will execute an executable on the system with this name, either by searching
the system's PATH for a matching executable name, or by executing the executable given by an absolute or
relative path. For instance:
    ./my_compiled_prog
would execute an executable my_compiled_prog in the user's current directory.
If a valid command is entered, the shell will execute the command and wait for it to complete before
displaying the prompt again and asking the user to input another command.
Tip
The info you learn about fork and exec in Week 3 will be very useful for helping to implement the ability
to run programs in your shell! You can also find implementations of these system calls as functions in the
Python os module.
Important
If a command is run in a separate process, it is important to set the process group of the child process to
a brand new one, and set the new foreground terminal process group to the new group the child
process is in. This is so that functionality such as pressing Ctrl + C only affects the child, rather than the
parent process.
In short, in the child process, the only other thing you'll need to do after forking is to create a new process
group by calling os.setpgid(0, 0) (before doing whatever you need to do to exec ).
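A minimal sketch of this fork-and-exec flow is given below, assuming Python 3.9+ (for os.waitstatus_to_exitcode). The child side follows the os.setpgid(0, 0) advice above; the parent side follows the numbered parent-process steps listed in this section. Several details are assumptions made for illustration rather than requirements: the function name run_foreground, the exit status 127 when exec fails, ignoring SIGTTOU while the parent hands the terminal back and forth, and silently skipping terminal handling when /dev/tty cannot be opened. Error messages for missing commands are not handled here.

```python
import os
import signal


def _set_foreground(tty_fd, pgid):
    """Best-effort tcsetpgrp; failures are ignored in this sketch."""
    try:
        os.tcsetpgrp(tty_fd, pgid)
    except OSError:
        pass


def run_foreground(argv):
    """Run argv in its own process group and return its exit code."""
    child_pid = os.fork()
    if child_pid == 0:
        try:
            os.setpgid(0, 0)           # child: become leader of a new group
            os.execvp(argv[0], argv)   # only returns (raises) on failure
        finally:
            os._exit(127)              # assumption: status used when exec fails

    try:
        os.setpgid(child_pid, child_pid)   # parent: same call, avoids the race
    except PermissionError:
        pass                               # child has already called exec
    child_pgid = os.getpgid(child_pid)

    try:
        tty_fd = os.open("/dev/tty", os.O_RDWR)
    except OSError:
        tty_fd = None                      # no controlling terminal available

    # Ignore SIGTTOU so the parent is not stopped when it reclaims the terminal.
    old_handler = signal.signal(signal.SIGTTOU, signal.SIG_IGN)
    try:
        if tty_fd is not None:
            _set_foreground(tty_fd, child_pgid)
        _, status = os.waitpid(child_pid, 0)
        if tty_fd is not None:
            _set_foreground(tty_fd, os.getpgrp())
    finally:
        signal.signal(signal.SIGTTOU, old_handler)
        if tty_fd is not None:
            os.close(tty_fd)
    return os.waitstatus_to_exitcode(status)
```

The try/finally in the child guarantees that a failed exec can never fall through into the parent's code path, which would otherwise leave two copies of the shell running.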
In the parent process, there are a couple of steps that need to be followed after forking:
1. First, also try to create a new process group for the child process, by adding:
       try:
           os.setpgid(child_pid, child_pid)
       except PermissionError:
           # Child has already set new process group!
           pass
   Why do we need both? It's a little tricky at first, and there's no need to fully understand it, but
   essentially: since each process effectively runs immediately after forking, if the parent were to set the
   process group (call os.setpgid ) after the child has called an exec function, the call would fail and
   raise a PermissionError. Setting the process group in both the parent and child processes
   prevents this race condition from impacting our program's functionality (Thank you to the student who
   identified this in Ed thread #159!).
2. Get the child's new process group ID with os.getpgid(child_pid) .
3. Open the current terminal device, located at /dev/tty , to get its file descriptor.
4. Set the terminal foreground process group with os.tcsetpgrp(<file descriptor>, child_pgid) .
5. Wait for the child process to complete.
6. Restore the terminal foreground process group back to the parent's process group with
   os.tcsetpgrp(<file descriptor>, parent_pgid) .
7. Make sure the /dev/tty file descriptor is closed as well!
Useful functions:
- os.setpgid
- os.getpgid
- os.tcsetpgrp
- os.getpgrp
Error cases
- If the first argument does not contain a slash and does not refer to a built-in command or
  executable on the PATH , the shell should output:
      mysh: command not found:
  to stderr .
- If the first argument contains a slash anywhere within it (indicating it is a relative or absolute path), and it
  does not refer to a valid file or directory, the shell should output:
      mysh: no such file or directory:
  to stderr .
- If the first argument contains a slash and is a valid path, but cannot be executed because it is a
  directory, the shell should output:
      mysh: is a directory:
  to stderr .
- If the first argument contains a slash and is a valid path, but the user cannot execute the file because
  they lack the appropriate executable permissions, the shell should output:
      mysh: permission denied:
  to stderr .
Pipes
Like other Unix shells, one of the features that mysh will support is piping the stdout result of one command
as stdin input for another command!
In POSIX shells, such as Bash, commands are piped from one to another using the pipe operator ( | ). For
instance, in the line below:
    echo "Hello!" | cat
both echo and cat are immediately started by the shell. echo prints Hello! to what would normally be
stdout , but the shell captures this output instead. Meanwhile, cat is waiting for input to come
from what would normally be stdin , but the shell instead feeds it the output of echo . The result is
that after echo prints Hello! , the shell passes this output along to cat as if it came from
stdin , and cat reads it and prints Hello! to the terminal screen.
In mysh, pipes operate in the same way. The syntax for pipes is below:
    <command> <arguments> | <command> <arguments> | ... | <command> <arguments>
Each pipe operator ( | ) is placed between 2 commands with arguments, with the stdout output of the
previous command being captured and fed as stdin to the next command. All commands in a
sequence of pipes (also known as a pipeline) start simultaneously, and each command waits for input to read
(as if it was from stdin ) from the previous command.
For example, in the line:
    a b c | d e|f
the command a is run with arguments b c , and its stdout output is passed as stdin to command d (run
with argument e ); in turn, d 's stdout output is passed as stdin to command f .
However, if the pipe operator is in single or double quotes, it is not interpreted as a shell pipe, but rather as a
literal '|' character instead.
For instance, in the line:
    a | b 'c | d'
we interpret this as passing the stdout of command a to the stdin of command b , which was invoked with
the single argument c | d (just a string).
Tip
The scaffold for the assignment contains a module, parsing , which contains a function,
split_by_pipe_op , which you can use to split strings by an unquoted pipe operator!
Tip
The os.pipe , as well as the os.dup2 / os.dup functions will be incredibly useful here!
Important
Each process which is executed as part of a pipeline must be part of the same process group while
executing, with this process group being set as the foreground terminal process group, so that if the
user wants to terminate execution of the pipeline early (e.g. by pressing Ctrl + C), all processes in the
pipeline will terminate. See the Executing Commands section for more details about setting this.
Error cases
- If a line contains a trailing pipe operator without a command after it, for example:
      a | b |
  the shell should output:
      mysh: syntax error: expected command after pipe
  to stderr .
- If a line contains multiple pipes, but there is no command between 2 pipes, for example:
      a | b | | c
  this should output the same syntax error to stderr as above.
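To make the pipeline mechanics more concrete, here is one possible sketch using os.pipe and os.dup2, assuming Python 3.9+ (for os.waitstatus_to_exitcode). It assumes the line has already been split into one argument list per command, and it places every child in the first child's process group. The name run_pipeline and the 127 exit status on exec failure are illustrative choices, and the terminal foreground handling (os.tcsetpgrp) and syntax-error checks are left out to keep the example short.

```python
import os


def run_pipeline(commands):
    """Run argv lists connected by pipes; return the last command's exit code."""
    pids = []
    pgid = 0           # 0 makes the first child the leader of a new group
    prev_read = None   # read end of the pipe feeding the next command
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        read_fd, write_fd = (None, None) if is_last else os.pipe()
        pid = os.fork()
        if pid == 0:
            try:
                os.setpgid(0, pgid)
                if prev_read is not None:
                    os.dup2(prev_read, 0)   # stdin comes from the previous pipe
                    os.close(prev_read)
                if write_fd is not None:
                    os.dup2(write_fd, 1)    # stdout goes into the next pipe
                    os.close(write_fd)
                    os.close(read_fd)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)               # assumption: status when exec fails
        if pgid == 0:
            pgid = pid                      # later children join this group
        try:
            os.setpgid(pid, pgid)           # mirror the child's call (race fix)
        except PermissionError:
            pass
        pids.append(pid)
        # The parent must close its copies, or readers never see end-of-file.
        if prev_read is not None:
            os.close(prev_read)
        if write_fd is not None:
            os.close(write_fd)
        prev_read = read_fd
    status = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

All children are forked before any of them is waited on, which matches the requirement that the commands in a pipeline start simultaneously.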
Using Shell Variables
Just like other Unix shells, mysh supports defining your own variables, which are saved into the process's
environment variables, and can be used as part of the shell prompt.
Defining your own variables can be done via the var command, and usage of this command can be found in its
corresponding section, so let's talk about how we can use defined shell variables first.
Shell variables may be used as part of some command line input with the ${} syntax, where the variable's
name is written between the braces. If an argument contains this syntax, it should be replaced with the value
of the environment variable with that name.
For example, if we have an environment variable greeting with the value Hello, world! , and the user
enters the line:
    echo ${greeting}
${greeting} should be substituted with its value, with the line being interpreted as:
    echo Hello, world!
However, the user may choose to "escape" this syntax with a backslash ('\'), as they may want it to be
interpreted as a literal string rather than as a variable. To "escape" the variable usage syntax, the user
writes a backslash before the $ symbol, i.e. \${greeting} . This means that for the line:
    echo \${greeting}
the line should instead be interpreted as:
    echo ${greeting}
i.e. calling the echo command, with the literal string ${greeting} as a single argument.
If the variable is used inside a quoted argument (either single or double quotes), substitution should occur as
normal. For example, if the name environment variable had the value Rosanna , the line:
    echo "My name is ${name}"
would be interpreted as:
    echo "My name is Rosanna"
The same goes for an argument inside single quotes. For instance, the line:
    echo 'My name is ${name}'
would be interpreted as:
    echo 'My name is Rosanna'
Error cases
- If the variable name contains characters which are invalid for a variable name (see var ), a syntax error
  should be printed to stderr in the form:
      mysh: syntax error: invalid characters for variable
  The command which contained the invalid variable name should also not be executed.
  If there are multiple variables with invalid characters, mysh will only raise an error on the first invalid
  variable.
- If there is no environment variable with the given name, the variable usage will be
  substituted (if applicable) by an empty string. For instance, if no variable called name exists, the following line:
      echo "My name is ${name}"
  will be parsed as:
      echo "My name is "
Built-in Commands
As part of mysh, some commands are defined as "built-in" commands; that is to say, these commands do not
necessarily run a separate executable, but are instead executed as part of some built-in functionality of the
shell.
var
The var built-in command is used to set the value of environment variables within mysh. The syntax for
this command is:
    var [-s] <variable> <value>
where <variable> is the name of the shell variable, which can contain:
- any alphabetical letters A-Z, capital or lowercase
- any digits 0-9
- underscores (_)
and <value> represents the value of the shell variable.
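The variable-name rule above (letters, digits and underscores) combines with the ${} substitution rules from the Using Shell Variables section, and one way to sketch both is with a regular expression. Everything named here (substitute, InvalidVariable, the two patterns) is hypothetical, and a few behaviours are assumptions: an empty name is treated as invalid, an escaped usage is left literal without validating its name, and substitution is applied to a plain string without regard to where it happens relative to argument splitting.

```python
import os
import re

VALID_NAME = re.compile(r"[A-Za-z0-9_]+")
# An optional backslash, then ${ ... } with anything except '}' inside.
USAGE = re.compile(r"(\\?)\$\{([^}]*)\}")


class InvalidVariable(Exception):
    """Raised for the first variable usage whose name has invalid characters."""


def substitute(text, env=os.environ):
    """Replace each ${name} in text with env[name], or '' if it is unset."""
    def replace(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # \${name}: drop the backslash and keep the usage as literal text.
            return "${" + name + "}"
        if not VALID_NAME.fullmatch(name):
            raise InvalidVariable(name)
        return env.get(name, "")
    return USAGE.sub(replace, text)
```

Because re.sub visits matches from left to right, the exception is raised for the first invalid name only; a caller could catch it, print the syntax error to stderr, and skip executing the command.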
The -s flag is optional; see the Setting the result of a command subsection for more details. In this case,
more than 1 argument may be provided.
Important
All shell variables are interpreted as strings, as these must be stored in the process's environment, which
only supports string environment variables. See os.environ for more info.
In basic usage, if the user were to input:
    var count 0
the shell would create a new environment variable count , with its value set to 0 .
If the user were to input:
    var greeting "Hello, world!"
the shell would create a new environment variable greeting , with its value set to Hello, world! .
If var is successfully called, there should be no additional output created.
Setting the result of a command
The var command also accepts one optional flag: -s . This flag allows you to set a variable to a
value that is the stdout result of a command, rather than a string value.
If the -s flag is specified, the syntax for var will look like the below:
    var -s <variable> <command>
<command> is a single argument which represents the execution of a command as if it were to be
executed by the shell. This may be just a command name (e.g. ls ), a command with arguments (e.g. "ls -l
~" ), or a pipeline (e.g. 'ls -1 ~ | wc -l' ), with the latter 2 options being entered as a quoted argument so
that they are passed as a single argument.
For example, if the user was to input:
    var -s add_res "expr 2 - 5"
the shell should store the stdout result of executing the command expr 2 - 5 into the environment variable
add_res .
Similarly, if the user was to input:
    var -s usyd_info_units 'sort usyd-scs-units.csv | grep "INFO"'
the shell should store the result of executing sort usyd-scs-units.csv | grep "INFO" into the environment
variable usyd_info_units .
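The two forms of var could be sketched as below. This is a deliberately simplified illustration: builtin_var is a made-up name, argument-count and variable-name validation are omitted, and subprocess.run is used only as a stand-in for executing the command through the shell's own machinery, so pipelines and built-ins inside the -s argument are not supported here. The captured output is stored exactly as produced; whether a trailing newline should be kept is not something this sketch decides.

```python
import os
import shlex
import subprocess


def builtin_var(args):
    """Handle the arguments that follow the var command name."""
    if args and args[0] == "-s":
        name, command = args[1], args[2]
        # Stand-in for running the command as the shell itself would.
        result = subprocess.run(
            shlex.split(command), stdout=subprocess.PIPE, text=True
        )
        value = result.stdout
    else:
        name, value = args[0], args[1]
    # Environment variables only hold strings, so the value is stored as text.
    os.environ[name] = value
```

For example, builtin_var(["count", "0"]) stores the string "0", not the integer 0, under the name count.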
Edge cases
Error cases
pwd
Like the pwd command found in most Unix shells, the pwd built-in command should print the shell's current
working directory to stdout as an absolute path.
The user may optionally pass one flag: -P , which, if passed, will additionally resolve all symbolic links
in the shell's current working directory to their real path before printing this as an absolute path to stdout .
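A small sketch of the difference between the two modes follows. It assumes the shell keeps track of a "logical" working directory (for instance in the PWD environment variable, kept up to date by whatever changes directory), because os.getcwd() returns a path in which symbolic links have already been resolved. The names pwd_output and builtin_pwd are hypothetical, and handling of invalid flags is left out.

```python
import os


def pwd_output(logical_cwd, physical=False):
    """Return the path pwd would print (pwd -P when physical is True)."""
    if physical:
        # Resolve every symbolic link to its real location.
        return os.path.realpath(logical_cwd)
    return os.path.abspath(logical_cwd)


def builtin_pwd(args):
    """Print the working directory; args are the words after 'pwd'."""
    # Assumption: PWD holds the logical path; fall back to the real one.
    logical_cwd = os.environ.get("PWD", os.getcwd())
    physical = bool(args) and args[0] == "-P"
    print(pwd_output(logical_cwd, physical))
```

Keeping the path computation in a function that takes the directory as a parameter makes the symlink behaviour easy to check without changing the process's actual working directory.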
Example usage
If the user was in their home directory (e.g. /home/bob ), the following:
Flag arguments ( -s , as well as any invalid flags) are only interpreted as flags if they are the first
argument passed to var . If they are in any other position, they are interpreted as a regular argument to
var (following the syntax requirements above).
If the user enters any flag arguments which are not -s , the shell should output the following to stderr :
For instance, if the user enters var -t test_var test_value , the shell should output:
If multiple options are specified, such as var -tu , the first invalid option found should be outputted.
var: invalid option: -