Categories
System Administration

Parsing the EU Digital COVID Certificate

I recently go my first Covid-19 shot, and shortly after that I received my QR code that is supposed to prove to anybody that I’m vaccinated.

Of course, I was interested in the actual technical implementation of the thing so I started tinkering around. Initially, I wasn’t sure whether the QR code just links to a webpage or is a standalone document, but reading the documentation quickly made clear that it simply contains a signed document.

There is a python snippet floating around that does all the decoding (decode base45, decompress, decode cbor, ignore signature, cbor decode), but I was wondering whether I can just use standard Linux command-line tools to do the same.

To my big surprise, I could not find a simple base45 decoder to be used on the command line. Neither the GNU coreutils (which supports a bunch of encodings) or basez have implemented base45 yet. For a quick coding exercise, getting code into the coreutils was a bit too ambitious, so I rolled my own in a few minutes. I haven’t done coding in C for quite some time, so this was fun. (while probably not necessary, this should be doable as a one-liner in Perl)

Next step: The libz decompression. Here, too, I was surprised that my Debian box did not have a simple program ready to do it. After some googling, I found that
zlib-flate -uncompress
does the trick. zlib-flate is included in the qpdf package.

Update: The debian package zlib1g-dev contains a few example programs, zpipe.c does exactly what I need. Just copy the file from /usr/share/doc/zlib1g-dev/examples/ , compile with “cc -o zpipe zpipe.c -lz”.

Then cbor: at first this looked easy, there is a libcbor0 in Debian and appropriate modules for Perl and Python. But a command-line tool? That’s not included by default.

Initially, I tried using
python3 -m cbor2.tool
and then continue with jq to select the right element (jq ‘.”CBORTag:18″[2]’ seems to do the trick) but I ran into encoding issues: I did not manage to extract the element in binary form, I always had escape sequences like ‘\x04’ or ‘\u0012’ in there which are not suitable for next step of cbor decoding.

Update: I’ve now read a bit more about the cbor standard and on the things it can do that JSON can’t, is handling raw binary data. JSON knows strings, yes, but they are character strings encoded in utf-8, not sequences of octets. Back in the old days of latin1 that might not have mattered that much, but these days you really shouldn’t mix those two.

So this is still work in progress, watch this space for updates.