We “dodododo” participated in TSG CTF 2021 as members of zer0pts. This article is a writeup of “Advanced Fisher”.

In this challenge, the flag is encoded by morse code and 440Hz On-Off-Keying. However, generated signals are shuffled, so we cannot decode in a normal way.

This is the distributed script:

import soundfile as sf
import numpy as np
from utils import signals_to_string, string_to_signals

flag = open('flag.txt').read()
signals = string_to_signals(flag)

assert(flag.startswith('TSGCTF{') and flag.endswith('}'))
assert(len(signals) == 473)

wave = np.empty(len(signals) * 2000)
for i in range(len(wave)):
  if signals[i // 2000] == 0:
    wave[i] = 0.0
  else:
    # Generate 440Hz morse code
    wave[i] = np.sin(i * 440 / 44100 * (2 * np.pi))

# Fisher here :)
np.random.shuffle(wave)

sf.write('result.wav', wave, 44100, 'PCM_24')

$\sin(\lfloor \frac{440}{44100} 2\pi i \rfloor)$ produces 2205 different values, and each symbol of morse code uses 2000 of them. We cannot recover randomized signals so I totaled the number of appearance by sin value. Then I came up with that we can represent these numbers as the below simple linear equations using correlation matrix $\mathbf{A}$ and appearance count vector $\mathbf{c}$. $$ \mathbf{Ax}=\mathbf{c} $$

This is my script of solving the linear system.

import numpy as np
import soundfile as sf
from utils import signals_to_string
import sys

data, rate = sf.read(open('./result.wav', 'rb'))

def rev_value(v):
    v = np.arcsin(v)
    v /= 2 * np.pi
    v *= 44100
    v = int(np.round(v))
    v = int(np.round(v/ 10) * 10)
    return v


table = []
for i in range(5000):
    k = np.sin(i * 440 / 44100 * (2 * np.pi))
    table.append(rev_value(k))

dic = {}
for val in data:
    val = rev_value(val)
    if val not in dic:
        dic[val] = 0

    dic[val] += 1


indexes = set()
bucket = [ [] for i in range(473) ]
for i in range(473 * 2000):
    val = np.sin(i * 440 / 44100 * (2 * np.pi))
    val = rev_value(val)

    if val == 0:
        continue

    bucket[i // 2000].append(val)
    indexes.add(int(val))

indexes = list(indexes)
rev_index = dict()
for i in range(len(indexes)):
    rev_index[indexes[i]] = i

M = []
for i in range(473):
    vec = [0] * len(indexes)
    for j in range(len(bucket[i])):
        vec[rev_index[bucket[i][j]]] = 1
    M.append(vec)

vec = []
for idx in indexes:
    if idx not in dic:
        vec.append(0)
        sys.stderr.write('aaaa' + str(idx) + '\n')
    else:
        vec.append(dic[idx])

A = Matrix(M)
v = vector(vec)

ans = A.transpose().solve_right(v)

print(ans)
ans = list(ans)[46:-32]
print(signals_to_string(ans))

Result:

(2, 2, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0,
 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0,
 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0,
 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0,
 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1,
 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0,
 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0,
 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1,
 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1,
 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0,
 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
F{THE-TRUE-F1SHERM4N-U53S-M0RSE-COE

This script decodes only binary sequense in the solution because the solution includes 2 that is not used in morse code. During the contest, we guessed the flag by this fragment of the flag.

In fact, the rank of $\mathbf{A}$ is 441, and it is less than the length of the symbols. It can be recovered by using the flag format TSGCTF{, so I wrote an additional script.

from ans import ans
from utils import signals_to_string, string_to_signals

sig = string_to_signals('TSGCTF{')

ans = list(ans)

for i in range(len(sig)):
    if sig[i] != ans[i]:
        ans[i + 441] = ans[i] - sig[i]
        ans[i] = sig[i]

print(signals_to_string(ans))

Result: TSGCTF{THE-TRUE-F1SHERM4N-U53S-M0RSE-CODE}