HexBot Robot Environment (A2 Update 26/09/22)
COMP3702 Artificial Intelligence 2022
You have been tasked with developing a planning algorithm for automatically controlling HexBot,
a multi-purpose robot which operates in a hexagonal environment, and has the capability to
push and pull ‘Widgets’ in order to reposition and rotate them to target locations and
orientations. To aid you in this task, we have provided a simulator and visualisation for the
HexBot robot environment which you will interface with to develop your solution.
For A2, the HexGrid environment has been extended to model non-deterministic outcomes of
actions. Cost and action validity are now replaced by a reward function where action costs are
represented by negative received rewards, with additional penalties (i.e. negative rewards)
being incurred when a collision occurs (between the robot or a widget and an obstacle, or
between widgets). Updates to this document are shown in blue text.
Hexagonal Grid
The environment is represented by a hexagonal grid. Each cell of the hex grid is indexed by
(row, column) coordinates. The hex grid is indexed top to bottom, left to right (i.e. the top left
corner has coordinates (0, 0) and the bottom right corner has coordinates (n_rows-1, n_cols-1)).
Even numbered columns (starting from zero) are in the top half of the row, odd numbered
columns are in the bottom half of the row. e.g.
row 0, col 0              row 0, col 2         ...
             row 0, col 1              row 0, col 3
row 1, col 0              row 1, col 2         ...
             row 1, col 1              row 1, col 3
    ...          ...          ...          ...
Two cells in the hex grid are considered adjacent if they share an edge. For each non-border
cell, there are 6 adjacent cells.
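The adjacency rule above can be sketched in code. This is a minimal illustration, not part of the provided simulator: the function name `neighbours` and its arguments are assumptions made for this example. It relies only on the indexing convention stated above (even columns sit in the top half of a row, odd columns in the bottom half).

```python
def neighbours(row, col, n_rows, n_cols):
    """Return the in-bounds cells which share an edge with (row, col)."""
    if col % 2 == 0:
        # Even column (top half of the row): diagonal neighbours lie in
        # the row above and in the same row.
        deltas = [(-1, 0), (1, 0), (-1, -1), (-1, 1), (0, -1), (0, 1)]
    else:
        # Odd column (bottom half of the row): diagonal neighbours lie in
        # the same row and in the row below.
        deltas = [(-1, 0), (1, 0), (0, -1), (0, 1), (1, -1), (1, 1)]
    cells = [(row + d_row, col + d_col) for d_row, d_col in deltas]
    return [(r, c) for r, c in cells if 0 <= r < n_rows and 0 <= c < n_cols]
```

For example, on a 3 x 3 grid the corner cell (0, 0) has only two adjacent cells, (1, 0) and (0, 1), while the non-border cell (1, 1) has all six.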
Robot
The HexBot robot occupies a single cell in the hex grid. In the visualisation, the robot is
represented by the cell marked with the character ‘R’. The side of the cell marked with ‘*’
represents the front of the robot. The state of the robot is defined by its (row, column)
coordinates and its orientation (i.e. the direction its front side is pointing towards).
The robot has 4 available nominal actions:
● Forward → move to the adjacent cell in the direction of the front of the robot (keeping the
same orientation)
● Reverse → move to the adjacent cell in the opposite direction to the front of the robot
(keeping the same orientation)
● Spin Left → rotate left (relative to the robot’s front, i.e. counterclockwise) by 60 degrees
(staying in the same cell)
● Spin Right → rotate right (i.e. clockwise) by 60 degrees (staying in the same cell)
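The four nominal actions can be sketched as a deterministic transition on the robot state. This is an illustrative sketch only: the orientation names, action strings and the functions `step` and `apply_action` are assumptions for this example and may not match the provided simulator. It ignores drift, double moves, collisions and widgets.

```python
# Hypothetical orientation names, listed in clockwise order: the entry at
# index i + 1 is 60 degrees clockwise of the entry at index i.
DIRECTIONS = ["UP", "UP_RIGHT", "DOWN_RIGHT", "DOWN", "DOWN_LEFT", "UP_LEFT"]

# (d_row, d_col) for each direction, for even and odd columns respectively.
EVEN = {"UP": (-1, 0), "UP_RIGHT": (-1, 1), "DOWN_RIGHT": (0, 1),
        "DOWN": (1, 0), "DOWN_LEFT": (0, -1), "UP_LEFT": (-1, -1)}
ODD = {"UP": (-1, 0), "UP_RIGHT": (0, 1), "DOWN_RIGHT": (1, 1),
       "DOWN": (1, 0), "DOWN_LEFT": (1, -1), "UP_LEFT": (0, -1)}


def step(row, col, direction):
    """Return the cell adjacent to (row, col) in the given direction."""
    d_row, d_col = (EVEN if col % 2 == 0 else ODD)[direction]
    return row + d_row, col + d_col


def apply_action(state, action):
    """Apply a nominal action to a (row, col, facing) robot state."""
    row, col, facing = state
    i = DIRECTIONS.index(facing)
    if action == "SPIN_LEFT":    # counterclockwise, same cell
        return row, col, DIRECTIONS[(i - 1) % 6]
    if action == "SPIN_RIGHT":   # clockwise, same cell
        return row, col, DIRECTIONS[(i + 1) % 6]
    if action == "FORWARD":      # move towards the front, same orientation
        return (*step(row, col, facing), facing)
    if action == "REVERSE":      # move away from the front, same orientation
        return (*step(row, col, DIRECTIONS[(i + 3) % 6]), facing)
    raise ValueError(f"unknown action: {action}")
```

Under these assumptions, performing Forward and then Reverse returns the robot to its starting cell with its orientation unchanged.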
Each time the robot selects an action, there is a fixed probability (given as a parameter of each
testcase) for the robot to ‘drift’ by 60 degrees in a clockwise or counterclockwise direction
(separate probabilities for each drift direction) before the selected nominal action is performed.
The probability of drift occurring depends on which nominal action is selected, with some
actions more likely to result in drift. Drifting CW and CCW are mutually exclusive events. Drift
occurring does not cause any additional cost/reward penalty to be incurred except in the case
where the movement direction resulting from drift causes a collision to occur.
Additionally, there is a fixed probability (also given as a parameter of each testcase) for the
robot to ‘double move’, i.e. perform the nominal selected action twice. The probability of a
double move occurring depends on which action is selected. Double movement may occur
simultaneously with drift (CW or CCW). Double movement does not cause additional
cost/reward penalty to be incurred (i.e. the movement is ‘two for the price of one’) except where
the double movement results in a collision occurring.
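One way to reason about these stochastic outcomes is to enumerate every combination of drift and double move for a selected action. The sketch below assumes that the double move event is independent of the drift event; the text above only states that the two may occur together, so treat independence as an assumption to check against the simulator. The function and parameter names are hypothetical.

```python
def outcome_distribution(p_drift_cw, p_drift_ccw, p_double):
    """Map each (drift, double_move) combination to its probability.

    drift is "CW", "CCW" or None; double_move is True or False.
    ASSUMPTION: double move is independent of drift.
    """
    # Drifting CW and CCW are mutually exclusive, so the remainder is "no drift".
    drift = {"CW": p_drift_cw, "CCW": p_drift_ccw,
             None: 1.0 - p_drift_cw - p_drift_ccw}
    double = {True: p_double, False: 1.0 - p_double}
    return {(d, m): p_d * p_m
            for d, p_d in drift.items()
            for m, p_m in double.items()}
```

The six probabilities sum to one, and the probabilities passed in would be those given for the selected nominal action in the testcase.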
The reward received after each action is the minimum/most negative out of the rewards
received for the nominal action and any additional (drift/double move) actions.
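In other words, the rewards of the individual movements are not summed. A minimal sketch, with a hypothetical function name and illustrative reward values:

```python
def step_reward(movement_rewards):
    """Combine the rewards of the nominal and any drift/double movements.

    The received reward is the most negative of the individual rewards,
    not their sum.
    """
    return min(movement_rewards)
```

For instance, if a drift movement collides for a reward of -10.0 and the nominal action would have cost -1.0, `step_reward([-10.0, -1.0])` gives -10.0.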
The robot is equipped with a gripper on its front side which allows it to manipulate Widgets.
When the robot is positioned with its front side adjacent to a widget, performing the ‘Forward’
action will result in the Widget being pushed, while performing the ‘Reverse’ action will result in
the Widget being pulled.
Obstacles
Some cells in the hex grid are obstacles. In the visualisation, these cells are filled with the
character ‘X’. Any action which causes the robot or any part of a Widget to enter an obstacle cell
results in a collision, causing the agent to receive a negative obstacle collision penalty (given as
a parameter of each testcase) as its reward. This reward replaces the movement cost which the
agent would have otherwise incurred. The outside boundary of the hex grid behaves in the same
way as an obstacle.
The environment now also contains a second obstacle type, called ‘hazards’.
Hazards behave in the same way as obstacles, but when a collision occurs, a different (larger)
penalty is received as the reward. As a result, avoiding collisions
with hazards has greater importance than avoiding collisions with obstacles. Hazards are
represented by ‘!!!’ in the visualisation.
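The relationship between movement cost, obstacle collisions and hazard collisions can be sketched as follows. The function name, the cell labels and the convention that costs and penalties are passed as positive magnitudes are all assumptions for this example; check how the testcase files store these values.

```python
def movement_reward(cell_kind, action_cost, collision_penalty, hazard_penalty):
    """Reward for one movement whose destination is of the given kind.

    cell_kind is "free", "obstacle" or "hazard" (hypothetical labels; the
    grid boundary behaves like "obstacle"). All numeric arguments are
    assumed to be positive magnitudes.
    """
    if cell_kind == "hazard":
        return -hazard_penalty      # replaces the movement cost
    if cell_kind == "obstacle":
        return -collision_penalty   # replaces the movement cost
    return -action_cost
```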
Widgets
Widgets are objects which occupy multiple cells of the hexagonal grid, and can be rotated and
translated by the HexBot robot. The state of each widget is defined by its centre position (row,
column) coordinates and its orientation. Widgets have rotational symmetries - orientations which
are rotationally symmetric are considered to be the same.
In the visualisation, each Widget in the environment is assigned a unique letter ‘a’, ‘b’, ‘c’, etc.
Cells which are occupied by a widget are marked with the letter assigned to that widget
(surrounded by round brackets). The centre position of the widget is marked by the uppercase
version of the letter, while all other cells occupied by the widget are marked with the lowercase.
Three widget types are possible, called Widget3, Widget4 and Widget5, where the trailing
number denotes the number of cells occupied by the widget. The shapes of these three Widget
types and each of their possible orientations are shown below.
Widget3
VERTICAL
  _____
 /     \              SLANT_RIGHT                   SLANT_LEFT
/  (a)  \                       _____           _____
\       /                      /     \         /     \
 \_____/                 _____/  (a)  \       /  (a)  \_____
 /     \                /     \       /       \       /     \
/  (A)  \         _____/  (A)  \_____/         \_____/  (A)  \_____
\       /        /     \       /                     \       /     \
 \_____/        /  (a)  \_____/                       \_____/  (a)  \
 /     \        \       /                                   \       /
/  (a)  \        \_____/                                     \_____/
\       /
 \_____/
Widget4
          UP                             DOWN
         _____                    _____         _____
        /     \                  /     \       /     \
       /  (a)  \                /  (a)  \_____/  (a)  \
       \       /                \       /     \       /
        \_____/                  \_____/  (A)  \_____/
        /     \                        \       /
  _____/  (A)  \_____                   \_____/
 /     \       /     \                  /     \
/  (a)  \_____/  (a)  \                /  (a)  \
\       /     \       /                \       /
 \_____/       \_____/                  \_____/
Widget5
                                    SLANT_RIGHT                   SLANT_LEFT
                                       _____                         _____
      HORIZONTAL                      /     \                       /     \
  _____         _____                /  (a)  \_____           _____/  (a)  \
 /     \       /     \               \       /     \         /     \       /
/  (a)  \_____/  (a)  \               \_____/  (a)  \       /  (a)  \_____/
\       /     \       /               /     \       /       \       /     \
 \_____/  (A)  \_____/          _____/  (A)  \_____/         \_____/  (A)  \_____
 /     \       /     \         /     \       /                     \       /     \
/  (a)  \_____/  (a)  \       /  (a)  \_____/                       \_____/  (a)  \
\       /     \       /       \       /     \                       /     \       /
 \_____/       \_____/         \_____/  (a)  \                     /  (a)  \_____/
                                     \       /                     \       /
                                      \_____/                       \_____/
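The shapes above can be summarised as the set of directions in which each widget extends one cell from its centre. The sketch below reads those directions off the diagrams. The direction names, the table `WIDGET_ARMS` and the function `widget_cells` are hypothetical names for this example and are not taken from the provided simulator.

```python
# (d_row, d_col) for each direction, for even and odd columns respectively.
EVEN = {"UP": (-1, 0), "UP_RIGHT": (-1, 1), "DOWN_RIGHT": (0, 1),
        "DOWN": (1, 0), "DOWN_LEFT": (0, -1), "UP_LEFT": (-1, -1)}
ODD = {"UP": (-1, 0), "UP_RIGHT": (0, 1), "DOWN_RIGHT": (1, 1),
       "DOWN": (1, 0), "DOWN_LEFT": (1, -1), "UP_LEFT": (0, -1)}

# Directions from the centre cell to each non-centre cell of the widget.
WIDGET_ARMS = {
    ("Widget3", "VERTICAL"): ["UP", "DOWN"],
    ("Widget3", "SLANT_RIGHT"): ["UP_RIGHT", "DOWN_LEFT"],
    ("Widget3", "SLANT_LEFT"): ["UP_LEFT", "DOWN_RIGHT"],
    ("Widget4", "UP"): ["UP", "DOWN_LEFT", "DOWN_RIGHT"],
    ("Widget4", "DOWN"): ["DOWN", "UP_LEFT", "UP_RIGHT"],
    ("Widget5", "HORIZONTAL"): ["UP_LEFT", "UP_RIGHT", "DOWN_LEFT", "DOWN_RIGHT"],
    ("Widget5", "SLANT_RIGHT"): ["UP", "DOWN", "UP_RIGHT", "DOWN_LEFT"],
    ("Widget5", "SLANT_LEFT"): ["UP", "DOWN", "UP_LEFT", "DOWN_RIGHT"],
}


def widget_cells(widget_type, centre, orientation):
    """Return the set of (row, col) cells occupied by a widget."""
    row, col = centre
    offsets = EVEN if col % 2 == 0 else ODD
    cells = {centre}
    for direction in WIDGET_ARMS[(widget_type, orientation)]:
        d_row, d_col = offsets[direction]
        cells.add((row + d_row, col + d_col))
    return cells
```

For example, a Widget4 in the UP orientation centred at (2, 2) occupies (2, 2), (1, 2), (2, 1) and (2, 3) under this indexing.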
Two types of widget movement are possible - translation (change in centre position) and rotation
(change in orientation).
Translation occurs when the robot is positioned with its front side adjacent to one of the widget’s
cells such that the robot’s orientation is in line with the widget’s centre position. Translation
results in the centre position of the widget moving in the same direction as the robot. The
orientation of the widget does not change when translation occurs. Translation can occur for
both the ‘Forward’ and ‘Reverse’ actions. For an action which results in translation to
be valid, the new position of all cells of the moved widget must not intersect with the
environment boundary, obstacles, the cells of any other widgets or the robot’s new position.
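The validity condition for translation amounts to a set-intersection check on the widget’s new cells. A minimal sketch, assuming cells are stored as (row, col) tuples in Python sets; the function and parameter names are hypothetical:

```python
def translation_is_valid(new_widget_cells, robot_new_pos, blocked_cells,
                         other_widget_cells, n_rows, n_cols):
    """Check whether a translated widget ends up in a collision-free position.

    blocked_cells holds obstacle (and hazard) cells; other_widget_cells holds
    every cell occupied by the remaining widgets.
    """
    cells = set(new_widget_cells)
    # Every new cell must lie inside the environment boundary.
    if any(not (0 <= r < n_rows and 0 <= c < n_cols) for r, c in cells):
        return False
    # No new cell may overlap an obstacle, another widget or the robot.
    if cells & set(blocked_cells) or cells & set(other_widget_cells):
        return False
    return robot_new_pos not in cells
```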
Rotation occurs when the robot’s new position intersects one of the cells of a widget but the
robot’s orientation does not point towards the centre of that widget. Rotation results in the
widget spinning around its centre point, causing the widget to change orientation. The position
of the centre point does not change when rotation occurs. Rotation can only occur for the
‘Forward’ action - performing ‘Reverse’ in a situation where ‘Forward’ would result in a widget
rotation is considered invalid.
The following diagrams show which moves result in translation or rotation for each widget type:
The arrows indicate directions from which the robot can push or pull a widget in order to cause a
translation or rotation of the widget. Pushing in a direction which is not marked with an arrow is
considered invalid.
Targets
The hex grid contains a number of ‘target’ cells. In the visualisation, these cells are marked with
‘tgt’. For a HexBot environment to be considered solved, each target cell must be occupied by
part of a Widget. The number of targets in an environment is always less than or equal to the
total number of cells occupied by all Widgets.
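The solved condition can be sketched as a subset test: every target cell must appear among the cells occupied by the widgets. The function name and the representation of widgets as sets of (row, col) tuples are assumptions for this example.

```python
def is_solved(target_cells, widget_cell_sets):
    """Return True if every target cell is covered by part of some widget."""
    occupied = set()
    for cells in widget_cell_sets:
        occupied |= set(cells)
    return set(target_cells) <= occupied
```

Note that widget cells which do not lie on a target are allowed, since the number of targets may be smaller than the total number of widget cells.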

