xuebaunion@vip.163.com

3551 Trousdale Rkwy, University Park, Los Angeles, CA

留学生论文指导和课程辅导

无忧GPA：https://www.essaygpa.com

工作时间：全年无休-早上8点到凌晨3点

微信客服：xiaoxionga100

微信客服：ITCS521

Matlab代写-MATH 368

时间：2021-04-26

MATH 368

Dr. Andreas Alpers

Project 1

Submit by: 30.04.2021

Breaking permutation ciphers using

Markov Chain Monte Carlo

1. Introduction

Supposedly already used by the ancient Spartans, permutation ciphers have been around since

many centuries. Breaking of such ciphers is commonly considered to be rather difficult. In this

project we want to implement an MCMC method to break permutation ciphers. In particular, we

want to decipher a specific text that you should download from the course webpage.

We will assume that the plain and ciphertext contain only the 27 characters A,B,C, . . . ,Z, ,

with ‘ ’ denoting the space character. The set alph := {A,B,C, . . . ,Z, } is called alphabet. There

is no linebreak, punctuation, or anything else.

In permutation ciphers all the 27 characters in the plain text remain the same but their positions

are rearranged in a different order. The plain text is first split into blocks of a prescribed size ` (the

so-called key length). Characters in each block are permuted according to a same permutation σ

(the so-called key).

For example, for ` = 5 and key

σ = (σ(1), σ(2), σ(3), σ(4), σ(5)) = (2, 4, 5, 1, 3)

one obtains1

plaintext = VOLLE︸ ︷︷ ︸ YBALL︸ ︷︷ ︸

ciphertext =

︷ ︸︸ ︷

LVEOL

︷ ︸︸ ︷

LYLBA

Decryption therefore amounts to finding the inverse σ−1 of the (unknown) permutation σ.

2. Swap and slide moves

Using Metropolis-Hastings the aim is to generate a Markov Chain whose states are the permu-

tations σ of ` elements. Two different types of proposals/moves should be considered in the

proposal step: swap moves and slide moves.

A swap move swaps two randomly selected entries of the current key σ. For instance, if for the

above σ the selected entries are 2 and 5 then the new key will be

σ′ = (5, 4,2, 1, 3) .

For a slide move one first needs to draw (uniformly at random) three numbers b ∈ [1, ` − 2],

k1 ∈ [1, `− b+ 1], and k2 ∈ [0, `− b]. Then, from σ the entries at positions k1, k1 + 1, . . . , k1 + b− 1

are deleted and again inserted (in the same order) after position k2. For instance, for the above σ

with b = 2, k1 = 3, k2 = 0, one obtains

σ = (2, 4,5,1, 3) → (2, 4, 3) → σ′ = (5,1, 2, 4, 3) .

1In this notation for σ, the character in each block is moved from position i to σ(i).

Page 1 of 3

3. Steps of the project

We can break this project down into several steps.

(S0) Download the file ciphertext.txt from the canvas course page https://liverpool.

instructure.com/courses/19760 (see link in the top announcement). The file contains

a single line of 756 characters, all in alph. This is the ciphertext to be decrypted. Assume

that the plain text was written in English and encrypted with a permutation cipher with key

length ` = 9.

(S1) Download a large English text, such as Tolstoy’s War and Peace from http://www.

gutenberg.org/files/2600/2600-0.txt, and compute the following:

(a) freq(c), the relative frequencies of the characters c ∈ alph in English language (i.e., set

freq(c) to be the number of occurrences of c in your English text and normalize this

such that

∑

c∈alph freq(c) = 1).

(b) B(x, y), a matrix that records the number of character transitions from x ∈ alph to

y ∈ alph in your English text (in other words, B(x, y) holds the value how often x

is followed by y in the English text). By dividing each row B(x, ·) by ∑y∈alphB(x, y)

obtain the matrix TransB.

(S2) For numerical stability we want to work with the logarithm version of the Metropolis-

Hastings algorithm discussed in class. Therefore, compute logfreq(c) := log(freq(c)) and

logTransB(x, y) := log(TransB(x, y)) for all c, x, y ∈ alph (with log denoting the natural

logarithm).

(S3) To any string f of n characters from alph we can associate a plausibility

P (f) := logfreq(f(1)) +

n−1∑

i=1

logTransB(f(i), f(i+ 1)),

with f(i) denoting the ith character in f. Further, let f−j denote the string that results

from deleting the first j blocks (i.e., j` characters) from the ciphertext.

Give a matlab implementation of the Metropolis-Hastings algorithm that takes f−j as input

(j drawn uniformly at random from the interval [0, 30]) and, starting with the identity

permutation, performs a prescribed number MaxIt of iterations. Only slide moves should

be proposed in the first MaxIt/2 iterations, whereas only swap moves are to be used in the

final MaxIt/2 iterations.

(Note that the acceptance probability for accepting the move that changes the current string f

to the new string f ∗ is given by exp(min{0, P (f ∗)− P (f)}) as we work with logarithms.)

(S4) Run the Metropolis-Hastings algorithm 20 times (each run consisting of MaxIt = 6, 000

Metropolis-Hastings iterations, i.e., new proposals of f ∗). Return for the most plausible f

(the f with largest P (f) value obtained over all iterations and runs of the algorithm) the

corresponding permutation, and decrypt with it the whole ciphertext. If your code contains

no bug you will obtain the plain text (or, at least, a good approximation to it).

Page 2 of 3

4. Hints for debugging

(i) Before analyzing War and Peace you should convert all the letters to upper case. Delete

the first 842 and the last 364 lines of the text file (they just contain the table of contents

and technical comments inserted by Project Gutenberg; the remaining text should start with

the words ‘CHAPTER I’ and end with the last sentence of the book). Moreover delete all

characters that are not in alph.

(ii) To improve convergence, it may be helpful to replace −∞ values in logTransB by a small

value, say, −12.

(iii) You might want to test your implementation by decrypting a text that you encrypted yourself.

(iv) It might be helpful to write (and test separately) a function that takes a permutation and a

string as input and returns the permuted string.

(v) If your code (including the text analysis) takes more than a few minutes on a standard laptop,

then there is probably somewhere a bug.

5. What and how to submit

Each group is expected to submit a single PDF containing answers/code for Q1-Q4 stated

below (submit via Canvas as group submission). Each member of the group is expected to take

part in the peer moderation on Canvas using Buddycheck upon submission (failure to do so,

will result in a subtraction of 5 points from your individual grade after moderation).

Q1: [15 points] What is the best decryption of the ciphertext obtained by your MCMC

method? State also the corresponding P (f) value and the σ−1 that you use for the

decryption.

(Can you guess the original author of the lines of the ciphertext?)

Q2: [20 points] What are the five least frequent characters in War and Peace?

(State them in increasing order.)

Q3: [20 points] What are the five most frequent character transitions in War and Peace?

(State them in decreasing order.)

Q4: [45 points] Submit your complete, commented matlab code for decrypting the text (in-

cluding your MCMC implementation and code for analyzing English language texts).

Make sure that the part containing the implementation of the swap and slide moves is

well documented.

Academic Integrity

The work submitted must be academically sound, without elements of plagiarism or

other features of poor practice. More resources on Academic Integrity, and the conse-

quences of Academic Misconduct, can be found here, and the examiner expects that all

students complete the KnowHow Academic Integrity module (to be found on Canvas,

under KnowHow: Study for Success) before submitting any assignment.

Advice and responsibilities for students working in groups are given in Paragraph 5 of the

Code of Practice on Assessments Appendix G. The examiner expects that all students

familiarise themselves with this section of the document before commencing group work.

Page 3 of 3

学霸联盟

Dr. Andreas Alpers

Project 1

Submit by: 30.04.2021

Breaking permutation ciphers using

Markov Chain Monte Carlo

1. Introduction

Supposedly already used by the ancient Spartans, permutation ciphers have been around since

many centuries. Breaking of such ciphers is commonly considered to be rather difficult. In this

project we want to implement an MCMC method to break permutation ciphers. In particular, we

want to decipher a specific text that you should download from the course webpage.

We will assume that the plain and ciphertext contain only the 27 characters A,B,C, . . . ,Z, ,

with ‘ ’ denoting the space character. The set alph := {A,B,C, . . . ,Z, } is called alphabet. There

is no linebreak, punctuation, or anything else.

In permutation ciphers all the 27 characters in the plain text remain the same but their positions

are rearranged in a different order. The plain text is first split into blocks of a prescribed size ` (the

so-called key length). Characters in each block are permuted according to a same permutation σ

(the so-called key).

For example, for ` = 5 and key

σ = (σ(1), σ(2), σ(3), σ(4), σ(5)) = (2, 4, 5, 1, 3)

one obtains1

plaintext = VOLLE︸ ︷︷ ︸ YBALL︸ ︷︷ ︸

ciphertext =

︷ ︸︸ ︷

LVEOL

︷ ︸︸ ︷

LYLBA

Decryption therefore amounts to finding the inverse σ−1 of the (unknown) permutation σ.

2. Swap and slide moves

Using Metropolis-Hastings the aim is to generate a Markov Chain whose states are the permu-

tations σ of ` elements. Two different types of proposals/moves should be considered in the

proposal step: swap moves and slide moves.

A swap move swaps two randomly selected entries of the current key σ. For instance, if for the

above σ the selected entries are 2 and 5 then the new key will be

σ′ = (5, 4,2, 1, 3) .

For a slide move one first needs to draw (uniformly at random) three numbers b ∈ [1, ` − 2],

k1 ∈ [1, `− b+ 1], and k2 ∈ [0, `− b]. Then, from σ the entries at positions k1, k1 + 1, . . . , k1 + b− 1

are deleted and again inserted (in the same order) after position k2. For instance, for the above σ

with b = 2, k1 = 3, k2 = 0, one obtains

σ = (2, 4,5,1, 3) → (2, 4, 3) → σ′ = (5,1, 2, 4, 3) .

1In this notation for σ, the character in each block is moved from position i to σ(i).

Page 1 of 3

3. Steps of the project

We can break this project down into several steps.

(S0) Download the file ciphertext.txt from the canvas course page https://liverpool.

instructure.com/courses/19760 (see link in the top announcement). The file contains

a single line of 756 characters, all in alph. This is the ciphertext to be decrypted. Assume

that the plain text was written in English and encrypted with a permutation cipher with key

length ` = 9.

(S1) Download a large English text, such as Tolstoy’s War and Peace from http://www.

gutenberg.org/files/2600/2600-0.txt, and compute the following:

(a) freq(c), the relative frequencies of the characters c ∈ alph in English language (i.e., set

freq(c) to be the number of occurrences of c in your English text and normalize this

such that

∑

c∈alph freq(c) = 1).

(b) B(x, y), a matrix that records the number of character transitions from x ∈ alph to

y ∈ alph in your English text (in other words, B(x, y) holds the value how often x

is followed by y in the English text). By dividing each row B(x, ·) by ∑y∈alphB(x, y)

obtain the matrix TransB.

(S2) For numerical stability we want to work with the logarithm version of the Metropolis-

Hastings algorithm discussed in class. Therefore, compute logfreq(c) := log(freq(c)) and

logTransB(x, y) := log(TransB(x, y)) for all c, x, y ∈ alph (with log denoting the natural

logarithm).

(S3) To any string f of n characters from alph we can associate a plausibility

P (f) := logfreq(f(1)) +

n−1∑

i=1

logTransB(f(i), f(i+ 1)),

with f(i) denoting the ith character in f. Further, let f−j denote the string that results

from deleting the first j blocks (i.e., j` characters) from the ciphertext.

Give a matlab implementation of the Metropolis-Hastings algorithm that takes f−j as input

(j drawn uniformly at random from the interval [0, 30]) and, starting with the identity

permutation, performs a prescribed number MaxIt of iterations. Only slide moves should

be proposed in the first MaxIt/2 iterations, whereas only swap moves are to be used in the

final MaxIt/2 iterations.

(Note that the acceptance probability for accepting the move that changes the current string f

to the new string f ∗ is given by exp(min{0, P (f ∗)− P (f)}) as we work with logarithms.)

(S4) Run the Metropolis-Hastings algorithm 20 times (each run consisting of MaxIt = 6, 000

Metropolis-Hastings iterations, i.e., new proposals of f ∗). Return for the most plausible f

(the f with largest P (f) value obtained over all iterations and runs of the algorithm) the

corresponding permutation, and decrypt with it the whole ciphertext. If your code contains

no bug you will obtain the plain text (or, at least, a good approximation to it).

Page 2 of 3

4. Hints for debugging

(i) Before analyzing War and Peace you should convert all the letters to upper case. Delete

the first 842 and the last 364 lines of the text file (they just contain the table of contents

and technical comments inserted by Project Gutenberg; the remaining text should start with

the words ‘CHAPTER I’ and end with the last sentence of the book). Moreover delete all

characters that are not in alph.

(ii) To improve convergence, it may be helpful to replace −∞ values in logTransB by a small

value, say, −12.

(iii) You might want to test your implementation by decrypting a text that you encrypted yourself.

(iv) It might be helpful to write (and test separately) a function that takes a permutation and a

string as input and returns the permuted string.

(v) If your code (including the text analysis) takes more than a few minutes on a standard laptop,

then there is probably somewhere a bug.

5. What and how to submit

Each group is expected to submit a single PDF containing answers/code for Q1-Q4 stated

below (submit via Canvas as group submission). Each member of the group is expected to take

part in the peer moderation on Canvas using Buddycheck upon submission (failure to do so,

will result in a subtraction of 5 points from your individual grade after moderation).

Q1: [15 points] What is the best decryption of the ciphertext obtained by your MCMC

method? State also the corresponding P (f) value and the σ−1 that you use for the

decryption.

(Can you guess the original author of the lines of the ciphertext?)

Q2: [20 points] What are the five least frequent characters in War and Peace?

(State them in increasing order.)

Q3: [20 points] What are the five most frequent character transitions in War and Peace?

(State them in decreasing order.)

Q4: [45 points] Submit your complete, commented matlab code for decrypting the text (in-

cluding your MCMC implementation and code for analyzing English language texts).

Make sure that the part containing the implementation of the swap and slide moves is

well documented.

Academic Integrity

The work submitted must be academically sound, without elements of plagiarism or

other features of poor practice. More resources on Academic Integrity, and the conse-

quences of Academic Misconduct, can be found here, and the examiner expects that all

students complete the KnowHow Academic Integrity module (to be found on Canvas,

under KnowHow: Study for Success) before submitting any assignment.

Advice and responsibilities for students working in groups are given in Paragraph 5 of the

Code of Practice on Assessments Appendix G. The examiner expects that all students

familiarise themselves with this section of the document before commencing group work.

Page 3 of 3

学霸联盟