July 11, 2019

Cryptopals Challenges: #1-3

Problem #1: Convert hex to base64

There is an easy way and a hard way to do this problem.  The hard way is to write your own conversion function.  If you already understand the ideas behind the different bases, though, that is not really necessary.  Instead, you can save the time and effort and use Python's built-in libraries!

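import base64

def HexTo64(text):
    # Decode the hex string to raw bytes, then base64-encode those bytes.
    return base64.b64encode(bytes.fromhex(text)).decode()

def HexFrom64(text):
    # Inverse: base64-decode to raw bytes, then render them as hex.
    return base64.b64decode(text).hex()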
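
For comparison, here is a rough sketch of what the "hard way" might look like. It is only an illustration, not a replacement for the base64 module: the names manual_hex_to_base64 and B64_ALPHABET are made up for this example. The idea is that every 3 bytes (24 bits) are regrouped into four 6-bit values, and each value indexes into the 64-character alphabet.

```python
import base64

# Standard base64 alphabet: index 0-63 maps to one output character.
B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def manual_hex_to_base64(hex_text):
    """Hypothetical hand-rolled converter, for illustration only."""
    raw = bytes.fromhex(hex_text)
    out = []
    for i in range(0, len(raw), 3):
        chunk = raw[i:i + 3]
        # Pack up to three bytes into one 24-bit integer, zero-padded on the right.
        n = int.from_bytes(chunk.ljust(3, b"\x00"), "big")
        # Split the 24 bits into four 6-bit alphabet indices.
        quad = [B64_ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that only encode padding bytes are replaced with '='.
        pad = 3 - len(chunk)
        if pad:
            quad[-pad:] = "=" * pad
        out.append("".join(quad))
    return "".join(out)

# The hand-rolled version should agree with the library on any input.
sample = "4d616e"  # the ASCII bytes of "Man"
print(manual_hex_to_base64(sample))  # TWFu
print(manual_hex_to_base64(sample) == base64.b64encode(bytes.fromhex(sample)).decode())
```

Seeing the bit shuffling written out makes it clear why the library call is the better choice for the actual challenge.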

Problem #2: Fixed XOR

This problem asks us to take a string, hex decode it, and XOR it against another string of the same length.  Again, Python gives us the tools to make this challenge very easy: int() with base 16 parses each hex string into an integer.

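def XORHexStrings(one, two):
    # Parse each hex string as an integer and XOR the two values.
    result = int(one, 16) ^ int(two, 16)
    # Format without the "0x" prefix, padded so leading zeros are not lost.
    return format(result, "x").zfill(len(one))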
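
An alternative sketch, in case you would rather stay in bytes than go through int(): decode both strings and XOR them pair by pair. The name xor_hex_bytes is my own for this example, and the length check is an assumption about how mismatched inputs should be handled.

```python
def xor_hex_bytes(hex_a, hex_b):
    """Hypothetical byte-wise variant: XOR two equal-length hex strings."""
    a = bytes.fromhex(hex_a)
    b = bytes.fromhex(hex_b)
    if len(a) != len(b):
        # Assumption: unequal buffers are treated as an error.
        raise ValueError("inputs must be the same length")
    # zip pairs up the bytes; iterating bytes yields ints, so ^ works directly.
    return bytes(x ^ y for x, y in zip(a, b)).hex()

print(xor_hex_bytes("0f0f", "f0f0"))  # ffff
print(xor_hex_bytes("00ff", "00ff"))  # 0000
```

Because the result is built from bytes, leading zero bytes are kept and there is no "0x" prefix to strip.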

Problem #3: Single-byte XOR cipher

This is the first problem in the cryptopals set that requires more than a couple of lines of code. Truthfully, my solution may be more verbose and complicated than others because of my inexperience with Python.

Step 1: Writing the XOR Function

Our first task is to write a function which takes a hex string and XORs each byte with a single character.  What this means is that if we have "abc" ⊕ "b" we would get the string: (a ⊕ b) + (b ⊕ b) + (c ⊕ b).  Since we are given the string in hex, we can get the bytes using:

bytes.fromhex(hex_string)

We then write a function which uses Python's for-each loop to iterate through the bytes and build a word from the XORed pairs.

def SingleCharXOR(ciphertext_bytes, character):
    word = ""

    # Iterating over bytes yields ints, so each one can be XORed with the key directly.
    for byte in ciphertext_bytes:
        word += chr(byte ^ character)

    return word
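
To make the "abc" ⊕ "b" example concrete, here is a small worked sketch. The helper single_char_xor_demo is a stand-in written just for this illustration; it does the same per-byte XOR in one line.

```python
def single_char_xor_demo(data, key):
    """Illustrative helper: XOR every byte of data with one key byte."""
    return "".join(chr(byte ^ key) for byte in data)

hex_input = b"abc".hex()  # "616263"
result = single_char_xor_demo(bytes.fromhex(hex_input), ord("b"))
print(repr(result))  # '\x03\x00\x01'

# XOR is its own inverse: applying the same key again restores the input.
restored = single_char_xor_demo(result.encode("latin-1"), ord("b"))
print(restored)  # abc
```

Note that b ⊕ b is zero, which is why the middle character comes out as a null byte. The round trip is the property the rest of this problem relies on: XORing the ciphertext with the right key gives back the plaintext.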

Step 2: Frequency Analysis

Since we now have a function which can XOR against a single character, the goal is to determine which character was used to encode the string. To do this we weight each character by its frequency in English. This information can be found on Wikipedia under "Letter frequency".  First, we build a dictionary of frequencies for each character.

frequency = {
    'a': 8.167, 'b': 1.492, 'c': 2.782, 'd': 4.253, 'e': 12.702,
    'f': 2.228, 'g': 2.015, 'h': 6.094, 'i': 6.966, 'j': 0.153,
    'k': 0.772, 'l': 4.025, 'm': 2.406, 'n': 6.749, 'o': 7.507,
    'p': 1.929, 'q': 0.095, 'r': 5.987, 's': 6.327, 't': 9.056,
    'u': 2.758, 'v': 0.978, 'w': 2.360, 'x': 0.150, 'y': 1.974,
    'z': 0.074, ' ': 12.8
}

Note that I did not include capital letters in the frequency table.  Since capitals may appear in the plaintext, we convert each uppercase character code to lowercase by adding 32, the ASCII offset between 'A' and 'a'.  We then wrap this in a function that returns the frequency of the character, or zero if it does not exist in the dictionary.

def GetScore(char):
    # 65..90 are the ASCII codes for 'A'..'Z'; the range must include 'Z' itself.
    if 65 <= char <= 90:
        char += 32

    frequency = {
        'a': 8.167, 'b': 1.492, 'c': 2.782, 'd': 4.253, 'e': 12.702,
        'f': 2.228, 'g': 2.015, 'h': 6.094, 'i': 6.966, 'j': 0.153,
        'k': 0.772, 'l': 4.025, 'm': 2.406, 'n': 6.749, 'o': 7.507,
        'p': 1.929, 'q': 0.095, 'r': 5.987, 's': 6.327, 't': 9.056,
        'u': 2.758, 'v': 0.978, 'w': 2.360, 'x': 0.150, 'y': 1.974,
        'z': 0.074, ' ': 12.8
    }

    return frequency.get(chr(char), 0)
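
As a quick sanity check on the add-32 trick, here is a tiny sketch (the helper name to_lower_ascii is hypothetical). It only touches the codes 65 through 90, so digits, punctuation, and letters that are already lowercase pass through unchanged.

```python
def to_lower_ascii(code):
    """Illustrative helper: fold an uppercase ASCII code to lowercase."""
    # 'A' is 65 and 'a' is 97, so the two cases are exactly 32 apart.
    if 65 <= code <= 90:
        return code + 32
    return code

print(chr(to_lower_ascii(ord("A"))))  # a
print(chr(to_lower_ascii(ord("Z"))))  # z
print(chr(to_lower_ascii(ord("!"))))  # !
```

The boundary values are worth testing explicitly: an off-by-one in the range check would quietly give 'A' or 'Z' a score of zero.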

Next, we add a weighting system to the single-character XOR function so that it keeps a running total of the character weights and returns that total along with the phrase.

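def SingleCharXOR(ciphertext_bytes, character):
    word = ""
    frequency = 0

    for byte in ciphertext_bytes:
        word += chr(byte ^ character)
        # Add the English-frequency weight of the decoded character.
        frequency += GetScore(byte ^ character)

    return word, frequency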

Step 3: Determining the Key

To determine the key we loop through each candidate character while keeping track of the highest score seen so far.  If a phrase scores higher than the previous best, we note the phrase and the character.

value_highest = 0
best_phrase = ""
character = ""

# Try every 7-bit ASCII value as the key and keep the best-scoring result.
for i in range(128):
    phrase, value = SingleCharXOR(bytes.fromhex("<String Removed>"), i)
    if value_highest < value:
        character = chr(i)
        best_phrase = phrase
        value_highest = value
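
Putting the three steps together, here is a self-contained sketch you can run without the challenge ciphertext. Everything in the demo is made up for illustration: the sentence, the key "K", and the names FREQ, score_text, and crack_single_byte_xor. It encrypts a known sentence and then checks that frequency analysis recovers the key.

```python
# English letter frequencies in percent; the space weight follows the table above.
FREQ = {
    'a': 8.167, 'b': 1.492, 'c': 2.782, 'd': 4.253, 'e': 12.702,
    'f': 2.228, 'g': 2.015, 'h': 6.094, 'i': 6.966, 'j': 0.153,
    'k': 0.772, 'l': 4.025, 'm': 2.406, 'n': 6.749, 'o': 7.507,
    'p': 1.929, 'q': 0.095, 'r': 5.987, 's': 6.327, 't': 9.056,
    'u': 2.758, 'v': 0.978, 'w': 2.360, 'x': 0.150, 'y': 1.974,
    'z': 0.074, ' ': 12.8
}

def score_text(text):
    """Sum the frequency weight of every character (case-insensitive)."""
    return sum(FREQ.get(ch, 0) for ch in text.lower())

def crack_single_byte_xor(ciphertext):
    """Try each 7-bit key and return the (key, phrase) with the highest score."""
    best_key, best_phrase, best_score = None, "", 0.0
    for key in range(128):
        phrase = "".join(chr(b ^ key) for b in ciphertext)
        score = score_text(phrase)
        if score > best_score:
            best_key, best_phrase, best_score = key, phrase, score
    return best_key, best_phrase

# Demo data: a made-up sentence and key, NOT the challenge ciphertext.
demo_plaintext = "the rain in spain stays mainly in the plain"
demo_key = ord("K")
demo_ciphertext = bytes(b ^ demo_key for b in demo_plaintext.encode("ascii"))

key, phrase = crack_single_byte_xor(demo_ciphertext)
print(chr(key), phrase)  # K the rain in spain stays mainly in the plain
```

The heavy weight on the space character does a lot of the work here: a wrong key turns every space into something that scores zero, so the correct key tends to win by a wide margin on ordinary English sentences.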

From the frequency analysis we find that the XOR character used is "X".  The decrypted ciphertext sets the tone for the rest of cryptopals: Vanilla Ice quotes.

plaintext: Cooking MC's like a pound of bacon