Translated: October 2013
Revised: September 2015 and January 2020
This work is licensed under a
Attribution-NoDerivs 3.0 Creative Commons license.
An encrypted message arrives
—Menelaus, you’ve got an email from somebody called Paris.
—Paris? That’s an old friend of mine.
—Look at the email’s content.
-----BEGIN BLAISE MESSAGE----- B8 B6 CB C8 C0 9E BA C5 C3 C0 C6 B1 C6 D2 A8 91 C7 8D D7 C7 C4 CE B5 91 D3 CB 91 D1 AE D9 95 C2 C9 BF B5 C1 D5 B6 AC 8D D9 C4 D0 A1 C2 B6 9F D0 B9 C3 8D C2 BA CE CE 90 C1 C4 CE C4 CD BB A0 BE A2 D0 B5 91 CC C1 C5 AC 8D C9 95 BE BB BE 98 D3 9C C1 D3 C1 A0 CA CB 9A C7 BA D3 C4 91 C7 C1 A0 B6 C9 D3 BD C0 D1 C1 9F -----END BLAISE MESSAGE-----
—It looks like an encrypted message, but why does he send it to me?
I ran into him days ago after years without seeing him and we were
talking about… Ah, of course, I remember I told him that I’m quite
unmotivated lately; that I don’t get projects that suppose an
intellectual challenge. I believe he must have sent this so I have
something interesting to think about. Very kind of him!
—Will you try to decrypt it?
—Our current projects have no imminent deadlines, so I see no reason I shouldn’t do it. Let’s start. What can we say about this message at first sight?
—The most obvious is that there is a begin tag and an end tag.
—Yes, and it looks like it was generated with a cryptosystem called
—I don’t know it.
—Me neither. Let’s look for the program on the Internet.
The “BLAISE” cryptosystem
—I see, you write some text and a password down and it allows you to
encrypt or decrypt it. Let’s see what happens if we enter Paris' message
and a random password… Of course, it turns into rubbish. Let’s continue
studying the message. What can we say about the content between the
begin and the end tag of the message?
—They look like bytes in hexadecimal notation.
—Exactly. Let’s run Racket’s REPL in order to make some tests.
menelaus@sparta:~$ racket Welcome to Racket v7.5. >
—Let’s save the message’s content in a variable.
(define msg " B8 B6 CB C8 C0 9E BA C5 C3 C0 C6 B1 C6 D2 A8 91 C7 8D D7 C7 C4 CE B5 91 D3 CB 91 D1 AE D9 95 C2 C9 BF B5 C1 D5 B6 AC 8D D9 C4 D0 A1 C2 B6 9F D0 B9 C3 8D C2 BA CE CE 90 C1 C4 CE C4 CD BB A0 BE A2 D0 B5 91 CC C1 C5 AC 8D C9 95 BE BB BE 98 D3 9C C1 D3 C1 A0 CA CB 9A C7 BA D3 C4 91 C7 C1 A0 B6 C9 D3 BD C0 D1 C1 9F ")
—Let’s define a procedure to get the raw encrypted message: it’ll have to split the string on whitespace and convert each resulting hexadecimal number into an integer.
(define (unarmor str) (sequence-map (curryr string->number 16) (string-split str)))
—Let’s also write down a procedure to get the sorted list of each value’s frequency.
(require math/statistics) (define (sorted-frequencies nums) (sort (hash->list (samples->hash nums)) < #:key cdr))
—And let’s get the result using both procedures.
> (sorted-frequencies (unarmor msg)) '((191 . 1) (174 . 1) (200 . 1) (184 . 1) (154 . 1) (161 . 1) (185 . 1) (168 . 1) (215 . 1) (177 . 1) (205 . 1) (152 . 1) (144 . 1) (204 . 1) (213 . 1) (162 . 1) (158 . 1) (210 . 1) (189 . 1) (202 . 1) (156 . 1) (217 . 2) (149 . 2) (209 . 2) (198 . 2) (197 . 2) (187 . 2) (195 . 2) (172 . 2) (159 . 2) (192 . 3) (160 . 3) (186 . 3) (203 . 3) (181 . 3) (194 . 3) (190 . 3) (201 . 3) (208 . 3) (199 . 4) (206 . 4) (141 . 4) (182 . 4) (196 . 5) (145 . 5) (211 . 5) (193 . 7))
—We can see the values' frequency isn’t regularly distributed.
—And what does that mean?
—That it isn’t a strong cryptosystem; it’s most likely designed by an amateur.
—Encrypting the text
AAAA with password
AB gets us the ciphertext
AE AF AE AF. If we encrypt
BBBB with the same password we have
AF B0 AF B0, i.e., the same than before but with each value
incremented by one.
—What follows from this?
—That each ciphertext byte is obtained by applying a function to a character of the plaintext and a character of the password. The first character of the text with the first of the password, the second with the second and so on until the password ends. Then start over with the first password character. There’s only one thing left to discover: the magic number.
—The magic number?
—A fixed number that’s also used in the function, which we’ll find out by subtracting to the ciphertext byte the value of its corresponding plaintext character and password character.
> (- #xAE (char->integer #\A) (char->integer #\A)) 44 > (- #xAF (char->integer #\A) (char->integer #\B)) 44
—See? In both cases the magic number is
44. Now we can write the
(define (decrypt nums keys) (sequence-map (lambda (n k) (integer->char (modulo (- n k 44) 128))) (in-parallel nums (in-cycle (sequence-map char->integer keys)))))
—Let’s try a dictionary attack. Since only uppercase letters without diacritical marks are accepted by the program, we’ll need a procedure to convert the dictionary words as they are read.
(define normalize-word (compose (curryr string-replace #px"\\W+" "") string-upcase string-normalize-nfkd))
—Finally, a procedure that tries to decrypt the ciphertext with each dictionary word. It’ll look for a prefix on each result to know if it is the correct word, and if it matches it’ll return the word.
(define (attack nums word-list prefix) (with-input-from-file word-list (lambda () (for/first ((w (sequence-map normalize-word (in-lines))) #:when (sequence-andmap eqv? (in-parallel (decrypt nums w) prefix))) w))))
—Let’s find an English word list file on the
Internet… OK, this one will do. Now we must decide which prefix to
—It might work. Let’s try.
> (attack (unarmor msg) "english.txt" "HELLO MENELAUS")
—Gee, I wonder why Paris chose this word as password. Well, now we simply have to use it to decrypt the message.
Menelaus read the decrypted message and stared at it for some seconds. He suddenly paled, stood up abruptly and grabbed the phone.