LKV373A: state of the reverse engineering

This article is part of series about reverse-engineering LKV373A HDMI extender. Other parts are available at:

Part 1: Firmware image format
Part 2: Identifying processor architecture
Part 3: Reverse engineering instruction set architecture
Part 4: Crafting ELF
Part 5: Porting objdump
Part 6: State of the reverse engineering
Part 7: radare2 plugin for easier reverse engineering of OpenRISC 1000 (or1k)

Last time, I showed how to do objdump, able to disassemble instructions for not yet supported processor – LKV373A encoder. This time, as promised in part 5, I am just publishing, what I was able to do.

Reverse engineering repo

Repository is located, as usually, on Github, here. The most important file there is printout of encoder firmware, generated with my fork of objdump. Also, there are few scripts, I used to make process more automatic. Especially useful for someone, who might want to reproduce the process or continue my work might be the ELF generator script. It is written in Python and uses my ELF-creation library from part 4. As there is no way to install the library where it should be – /usr/lib/pythonX.X/site-packages, I used MAKEELFPATH environment variable to find makeelf.

Less important scripts, but still sometimes useful, are the ones to generate graph of function cross-references. They are able of generating .dot file, which can be converted to PNG (which is bad way of making it useful – it is 32768 pixels wide) or opened somewhere. And I used Gephi to read the final graph.

binutils development

Also binutils fork was improved a little bit. It is now possible to see symbols on every jump and call instruction. Moreover I did a kind of hack to be able to see references to data, as they mostly consisted of two instructions – first setting upper half of register, then adding lower half. So in that case, objdump is now displaying content of register, which most of the time don’t work, as objdump parses code linearly, without caring about jumps. But, if this opcode pair is used to really reference some address, it is quite reliable, so it is possible to see the address, as well as symbol name, in same way as with calls. Also, there is much more information about opcode types, so completely unknown instructions had been eliminated. Some of them were not used a single time and were renamed to resXX – for reserved.

Does it make sense?

Now, as I know really a lot about the firmware, it is time to try to answer the question in the heading. Well, the goal of reversing the encoder was to find compression and checksum algorithm for SMEDIA/ITE firmware blob. And, as far as I know now, it is really likely that at least compression algorithm is in fact somewhere here. I can say that, because I’ve seen routine for processing the data that seem to be compressed in SMEDIA/SMAZ container. Moreover, SMEDIA itself is also processed here, in original form, as one of the firmware’s responsibilities is firmware upgrade, so it is being copied to some internal ROM. So, if both – compressed and decompressed versions of firmware is here, then decompression also should be somewhere.

Now, this reverse engineering work might take really, really long time to get some results, but on the other hand, I might find, what I am looking for today. So, now the topic will probably seem to be dead for some undefined time.