xuebaunion@vip.163.com

3551 Trousdale Rkwy, University Park, Los Angeles, CA

留学生论文指导和课程辅导

无忧GPA：https://www.essaygpa.com

工作时间：全年无休-早上8点到凌晨3点

扫码添加客服微信

扫描添加客服微信

Python代写 - COMP3160 ARTIFICIAL INTELLIGENCE

时间：2020-11-02

Assignment 2 FINAL (Weight: 20%)

Evolutionary Algorithms for Adversarial Game Playing

Due: 11:55pm, Nov 06, 2020 (Friday, Week 13)

The goal of this assignment is to appreciate the efficacy of Evolutionary Algorithms, specif-

ically Genetic Algorithm (GA), in the context of game theory. In this assignment you will

be using the DEAP package for Genetic Algorithm in order to evolve strategies for repeat-

edly playing 3-Person Prisoners Dilemma described below with a different storyline:1

Three old friends, P1, P2 and P3, always wanted to skydive but never got the oppor-

tunity. An opportunity arose when they met each other recently in Sydney for a class

reunion, and decided to go for it. There are two skydiving packages available: the

BasicTandem package (pay a certain fee and dive with an instructor), and the Tandem-

Plus package (pay an additional $600, and also get specially videoed from multiple

angles while diving, and keep the video). All three friends think TandemPlus is better

than BasicTandem. They also think it is not worth the extra $600, but would be willing

to pay up to $300 extra for it instead. It is understood that they will evenly split the

total cost of their skydiving. Nonetheless they are thrilled at the prospect of this joint

adventure, and the ensuing pleasure is valued at $400 by each. Each friend orders the

package she wants without consultation or communication with others. If you were

P1, would you opt B (the BasicTandem package), or P (the TandemPlus package)?

Pj & Pk

0 B 1 B 2 Bs

Pi

B 0 2 4

P 1 3 7

Table 1: Payoffs to Pi, under how many of {Pj, Pk} play B; 1 unit = $100.

The payoff matrix for 3PD is given in Table 1. To see how this table is meant to

be used, suppose both P1 and P2 choose the basic package, but P3 chooses the plus

package. We want to calculate P2’s payoff. Setting i = 2 and {j, k} = {1, 3}, we see

Pi plays B (so payoffs in top row), and only one of Pj and Pk plays B (so we restrict to

the column under 1B); and determine that P2’s payoff is 2 (valued at $200). It is easily

verified that that is the case. The extra cost $600 due to P3’s choice of P is evenly split

between the three, P2 contributing $200 towards it. From the perspective of P2, this

extra cost is more than compensated by the pleasure she gets (valued $400), and so her

net gain is $200, represented as 2 in Table 1. The payoffs to each of the three friends

under alternative arrangements (e.g., if all of them opt P) can be similarly verified.

You will be using the DEAP package for Genetic Algorithm in order to evolve strategies

for playing Iterated 3-Player Prisoners Dilemma (3IPD). Two papers on the application of

GA to PD – one to IPD and the other to nIPD – are provided in the Assignment folder.

1Also called the Unscrupulous Diners Dilemma

1

Task Specification

Note: You are advised to go through the two supplied papers:

i “Using GA to Develop Strategies for IPD,” by A Haider, and

ii “An Experimental Study of N-Person IPD Games,” by X Yao and PJ. Darwen

in the given order before proceeding with the assignment tasks. Give particular attention

to Sections 4.1 and 2.1 of the respective works.

1. BACKGROUND KNOWLEDGE ASSESSMENT [3 marks]

(a) Analysing the Payoff matrix provided in Table 1, determine if a Nash Equilib-

rium exists for the game 3PD. If so, identify at least one of its Nash Equilibria,

and explain why it is so.

(b) Suppose we want to represent strategies for playing 3IPD of memory depth 2

in the context of GA. How many bits shall we need to represent the individuals,

and why? Answer in no more than 50 words.

(c) Consider a strategy (individual/chromosome) of memory-depth 2 for playing

3IPD. Explain how you would represent the memory bits and the default moves

in this individual.

2. IMPLEMENTATION IN PYTHON [15 marks]

(a) Implement the function:

payoff_to_ind1(individual1, individual2,

individual3, game):

returns payoff to individual1

Note: payoff is determined by latest moves obtained from respective appro-

priate memory locations of the individuals and the provided payoff matrix for

the game game. (Assume that the game is 3PD and memory-depth is 2.)

(b) Implement the function:

move_by_ind1(individual1, individual2,

individual3, round):

returns individual1’s move

Note: individual1’s next move is based on all the three individuals’ earlier

moves and individual1’s strategy (encoded in individual1’s chromo-

some). The move to be returned can be B/P, or C/D, or 0/1 depending on

your representation. Note that in early rounds some default moves are made.

Assume memory-depth of 2.

(c) Implement the function:

process_move(individual, move, memory_depth):

returns success / failure

Note: individual’s relevant memory bits are appropriately updated based

on its latest move move and memory depth.

2

(d) Implement the function:

def eval_function(individual1, individual2,

individual3, m_depth, n_rounds):

returns score to individual1

Note: individual1 iteratively plays 3PD against the other two for a num-

ber of rounds given by n rounds, and its score is returned.

(e) Implement, using the DEAP package, genetic evolution of strategies for play-

ing 3IPD. Assume a memory depth of 2. Based on your implementation, briefly

describe the best 3IPD-individual you generated via GA, and what parameters

(fitness function, type of crossover, mutation rate, etc.) you used for that pur-

pose. Explain why you chose those specific parameters.

3. ANALYSIS [2 marks]

(a) Describe in English the behaviour of the 3IPD-strategy you obtained via task 2e

above. Exploit any pattern you notice in it for this purpose.

(b) What would you consider to be a good counterpart of the strategy Tit for Tat

in 3IPD? Compare it with the best strategy you obtained via task 2e in terms

of traits such as being nice, being forgiving and being provocable. Make an

attempt to explain any major difference in these two strategies in terms of their

payoff structures.

What to Submit, and When

You will submit two files:

1.

2.

Your code file should include all the Python codes you wrote for this assignment.

Your report file should include all the answers (including the Python codes copied-and-

pasted). It must be submitted in the pdf format. It must have as cover page the one that has

been supplied (as part of the document template), duly filled and signed. Also, if relevant,

note in the last section anything relevant that is worth noting.

You will submit the files in two stages. In the first stage you must submit two draft files

(that you will be able to update) by 11:55pm, Friday Week 12:

(a) of the program file

least the implementation of functions specified in Tasks 2a and 2b.

(b) of your report file

swers to Tasks 1a-1c. You can modify these files while preparing your final ver-

sion. However, failure to submit the two draft files by the required date will attract

a penalty of 4 marks. The final version of these three files must be submitted by

11:55pm, Friday Week 13.

3