Creating one-file Linux distribution with docker

Posted on May 9, 2021 by Kamil (aka. v3l0c1r4pt0r)

Few months ago I wrote a tutorial about creating Linux distribution consisting of just busybox as its userspace. In the meantime I worked a bit with docker and it sounded like nice next step in learning docker to automate the process of creating Linux distribution using it. As a result, today I present Linux distribution built with docker and based on my previous tutorial. I called it busy-linux due to it consisting of only busybox at the moment. My plan is to develop it further, most likely for private purposes only, so there might not be much happening in the project, but for sure I want to create dynamically linked variant in the near future, as this is what my use case requires. In the meantime feel free to try it yourself. Continue reading “Creating one-file Linux distribution with docker” →

Reading and programming 93Cx6 EEPROM with Digispark

Posted on December 14, 2020 - December 14, 2020 by Kamil (aka. v3l0c1r4pt0r)

For some time already, I am working on a big reverse engineering topic. I hope, I will be able to present something on that in future. Of course this would be something almost unique, if finished. For now I want to present a tool that I made while working on this big thing (as a side note, it’s not the first one, cc-factory was also created for that purpose).

What I had to do, was to read contents of EEPROM, that I found on board, I am analyzing. It is quite obscure, as Google did not return anything useful (beside Taobao auctions). Fortunately I learned that chip it is connected to expect EEPROM from the 93Cx6 series. So, to not break anything, I bought few similar memory chips from usual source. In the meantime, I found that this thing talks Microwire protocol, which is quite similar, but not identical to SPI. This means that flashrom is not an option here. It is however similar enough to SPI that some people were successful in talking to these EEPROMs on SPI bus. Unfortunately, I did not have any device that was confirmed successful and I did not want to experiment with low chance of success. Luckily for me, there is simple Arduino library, that bit-bangs the protocol. I am not a big fan of Arduino, but I have few Digispark boards, so I decided to give it a try. Obviously, the fact that this post appeared means, I was successful. Nevertheless, it was not that easy. At least for me, so I share my experience, just in case someone have similar problem. Enough of this, let’s read (and write, if you want) Microwire 93C56 EEPROM with Arduino sketch and Digispark board, via USB virtual serial port. Because, why not? 🙂 Continue reading “Reading and programming 93Cx6 EEPROM with Digispark” →

Busybox-based Linux distro from scratch

Posted on November 14, 2020 - December 1, 2020 by Kamil (aka. v3l0c1r4pt0r)

Canon SD MMC card 16 MiB — 16 MiB SD/MMC card. Made in Japan in 2005

Today, I would like to show something different, than usual reverse-engineering, that appears on my blog usually. I needed to prepare a Linux distro for myself to be able to run it on my PC. But not the ordinary operating system that we download from webpage, then use fancy graphical installer to select, what we want and where. My goals were very specific. First was to have it custom-compiled. With that in mind there aren’t many choices left (maybe Gentoo?). Second was to not cross 16 MiB boundary. Why exactly that? That’s simple. I have old (15 years old to be precise) SD/MMC card made for Canon of exactly that size. Quick check showed me that this is possible. I tried buildroot and it failed to fulfill second requirement and I decided not to continue, despite the obvious optimizations on kernel modules, I could do. It’s simply too complex for such a simple task. If not buildroot, then let’s go and see how to do such thing from scratch!

The plan

Basically the plan is to have custom Linux distro compiled from scratch. It may sound like something incredibly complex and hard to do. But it’s not. There are just few problems one must learn on how to overcome. The most problematic constraint in my case is, obviously, 16 MiB limit. To not exceed it, I have to use busybox as my userspace. This by the way simplifies distro development significantly. Busybox works the way, that, if linked statically, requires only one, single binary to be able to work correctly. So, to sum up, on software side, we need Linux and busybox. You may wonder, how do I want to boot that system, then? Well. I said I need Linux 🙂 Maybe some people know, some does not, that Linux is itself a boot loader of some kind. At least, when using UEFI and this is what I want to use, it can be loaded directly by UEFI firmware. But that’s another thing to note – I will describe a way to prepare a distro for UEFI – it won’t be as simple as that, for legacy BIOS.

The whole plan will look as follows:

Get compiler
Compile Linux kernel
Compile busybox (statically and stripped!)
Prepare initramfs with whole userspace
Format drive as EFI System Partition
Combine kernel and initramfs into single binary
Optionally sign the binary, in case we want Secure Boot to be enabled
Add entry to embedded UEFI boot manager

In the meantime, I am going to show few ways to debug the system, in case of any problems. Continue reading “Busybox-based Linux distro from scratch” →

Meet CC Factory – a factory for cross compilers

Posted on October 21, 2020 - November 10, 2020 by Kamil (aka. v3l0c1r4pt0r)

Having a tailored cross compiler is a problem I encountered couple of times in the past. Of course there are solutions to that problem like great crosstool-ng or more complex buildroot. In most cases crosstool-ng (ct-ng) can solve them. But whatever the tool we use, it has always its own drawbacks. For ct-ng these are small number of supported versions of toolchain components and huge dependence of environment, where it is started. The latter is even more problematic, because of the way continuing interrupted build work in ct-ng. Obviously if you want to build in example one compiler for ARM and one for MIPS, both consisting of latest tools, then it is not a problem.

But I have another use case for compiling toolchains. I do some reverse engineering from time to time. Nowadays many products have Linux under the hood and often there is no chance to get any SDK for them. But having ability to build something for the device can help a lot, either to run it there, or link with found tools and run in emulator. But I could also imagine that outside the reverse engineering field there might be a need to get toolchain in exact configuration, which is sadly not available via ct-ng or buildroot. Anyway, in any case where ct-ng or buildroot are not applicable, there is third way – docker. And this is the way I chose. This is how CC Factory appeared. It is docker container that builds gcc cross compiler on first startup and lands you in an container that have working compiler for the platform of your choice. And it does not require big effort to port it for the next architecture, or different tool version, unless the changes between the versions were really significant. Continue reading “Meet CC Factory – a factory for cross compilers” →

Peugeot 407 rain sensor pinout

Posted on August 3, 2020 - August 3, 2020 by Kamil (aka. v3l0c1r4pt0r)

I had a project in mind featuring rain sensor, present in Peugeot 407 car, among others, probably. However, reality is that it is on my todo list for years now and I don’t think I will do it ever. But, I already bought the sensor in the past and it was lying in my drawer. So why not to disassemble it and figure out the pinout.

Identification

To be precise in what I am talking about, here are some magic values, I know about this thing:

Manufacturer: Bosch
Part number: 1 397 212
Another number: 96 524 903 80
Peugeot part number: 6405 CW

Continue reading “Peugeot 407 rain sensor pinout” →

Playing with GF-07 GPS device

Posted on December 11, 2019 by Kamil (aka. v3l0c1r4pt0r)

GF-07 is dirt-cheap GPS locator. You put SIM card in it, send SMS and you know where it is. That’s it. But not for me. I like to know what I am using, especially if it is that cheap and such obscure device as this one. It comes together with manual that is written in so bad English that I barely understand anything. Immediately after opening SIM slot, one can see few test pads. Fortunately all of them are described in silkscreen. Let’s see what can be done with it as a one-evening hack.

GF-07 opened — SIM cover opened (pin headers visible)

Continue reading “Playing with GF-07 GPS device” →

Hacking Android’s Bluetooth application to receive any file (outside whitelist)

Posted on December 9, 2019 - December 12, 2019 by Kamil (aka. v3l0c1r4pt0r)

Bluetooth application in AOSP hardcodes strange constraint in form of whitelist of MIME types that are allowed to be received. Also LineageOS seems to have that code compiled for some reason. There is no other way to allow any file transfer than to make your own Bluetooth apk and install it. In this tutorial I will show how to recompile APK and install it in system.

Before I start I have to warn anybody trying to follow the tutorial: THERE IS NO WARRANTY THAT THIS METHOD WILL WORK, I am not responsible for bootloops, broken Bluetooth or any other harm made to your device. You, and only you are responsible for your device, so don’t do it, if you don’t know what you’re doing! Continue reading “Hacking Android’s Bluetooth application to receive any file (outside whitelist)” →

How Android smartphone is spying on you?

Posted on November 20, 2019 - October 4, 2021 by Kamil (aka. v3l0c1r4pt0r)

The fact that Android knows a lot about its user and don’t keep that information for itself is quite well known. But how much data is sent to the outside world on first connection to WiFi? Which apps are responsible for pushing the data into the web? I will try to show that on example of Xiaomi Redmi Go.

For the purpose of the test, I created network that is not forwarding any packet outside. Before performing the test I installed few APKs. Only one is known to contact strange servers – File Manager (com.rhmsoft.fm). I marked them with (*) on a list. Rest should not have any influence on the results. They were: Termux, AFWall+, F-Droid, Magisk and Aurora Store. Continue reading “How Android smartphone is spying on you?” →

New VCI+A-BT (DS150E) ST-Link pinout

Posted on August 7, 2019 - August 8, 2019 by Kamil (aka. v3l0c1r4pt0r)

This device could easily be found at the Chinese sellers. They advertise them as DS150E. Under the hood there is nice STM32 and STM8 pair. One is present on main PCB (in my case TCS MAIN V6.0), the other on relay board (TCS+RELAY V3.0). Both chips have their ST-Link headers broken out. If one want to read/write the firmware installed in internal flash, it should be as easy as connecting few of these pins. Continue reading “New VCI+A-BT (DS150E) ST-Link pinout” →

LKV373A: porting objdump

Posted on December 30, 2017 - October 17, 2019 by Kamil (aka. v3l0c1r4pt0r)

This article is part of series about reverse-engineering LKV373A HDMI extender. Other parts are available at:

Part 1: Firmware image format
Part 2: Identifying processor architecture
Part 3: Reverse engineering instruction set architecture
Part 4: Crafting ELF
Part 5: Porting objdump
Part 6: State of the reverse engineering
Part 7: radare2 plugin for easier reverse engineering of OpenRISC 1000 (or1k)

After part number four, we already have ELF file, storing all the data we found in firmware image, described in a way that should make our analysis easier. Moreover, we have ability to define new symbols inside our ELF file. The next step is to add support for our custom architecture into objdump and this is what I want to show in this tutorial.

Finding best architecture to copy

If we want to set up new architecture in objdump code, we need to learn interfaces that need to be implemented. It would be easier if we can use some existing code to do so. After some looking into the binutils’ code I learned that what is of special interest are bfd and opcodes libraries. They contain code dedicated to particular architectures. The first one seem to be related to object file handling (which in our case is ELF), so we should not tinker with it too much. Second one is related to disassembling binary programs, so is what we are looking for.

I did some quick examination of source code related to popular architectures and it seems not to be easy to adjust to our needs. Architecture I found to be best suitable for modification is Microblaze. Its source seem to be quite well-written, clean and short. Also from my research of architecture name for LKV373A (part 2, failed by the way) I also remember it is quite similar to the one present in LKV373A, so it is even better decision to use it.

Compiling objdump for target architecture

At first it is useful to learn how to compile objdump, so it will be able to disassemble program written for our target. Microblaze is not really a mainstream architecture, so there aren’t many programs compiled for it available online after typing 'microblaze program elf' into usual search engine. However, I was able to find 2 of them, so I was able to verify that compilation worked. If you can’t find any, I uploaded these to MEGA, so they can serve as test cases. First one is minimal valid file, the other one is quite huge.

Compilation is very easy. The only thing that needs to be done beside usual ./configure && make && make install is adding target option to configure script. So, the script looks as follows:

./configure --target=microblaze-elf

Of course, install step can safely be skipped as well as compilation of other tools, beside objdump. objdump itself seem to be built using make binutils/objdump. However it can’t be build successfully using that shortcut, so whole binutils package must be configured the way, everything not buildable is excluded from the build.

Setting up own architecture

Next step is to add support for our brand new, custom architecture to binutils’ configuration files and copy microblaze sources, so they will simulate our architecture, until we will write our own implementation. Then it should be possible to test objdump again, against our sample microblaze programs and disassembly should still work.

Even without any modification to binutils’ source or configs, it should be possible to configure it for any random architecture. The only constraint is format of the target string: ARCH-OS-FORMAT, where FORMAT is most likely to be elf. So, if we pass lkv373a-unknown-elf as target, it will work. -unknown part is usually skipped and this will not work. If we need it to work, config.sub must be modified. config.sub is used to convert any string, passed to configure into canonical form, so in our case lkv373a-unknown-elf. If it detects, that it is already in canonical form, it does nothing.

Final configure command will be slightly more complex, as we have to disable some parts, that are not of our interest and requires additional effort to work:

./configure --target=lkv373a-unknown-elf --disable-gas --disable-ld --disable-gdb

Although passing something random as target option works on configure stage, it will obviously fail on make stage. What make is doing at first is configuring all the sublibraries. What is of our interest is bfd and opcodes. And the first one fails. So this is the first problem, we need to get rid of.

bfd/config.bfd

The purpose of this file is to set some environment variables depending on target architecture. If it does not know the architecture, it returns error to caller, which is probably bfd’s configure script, called by make. According to documentation in file header, it sets following variables:

targ_defvec – default vector. This links target to list of objects that will provide support for ELF file built for specific architecture (stored in bfd/configure.ac)
targ_selvecs – list of other selected vectors. Useful e.g. when we need support for both 32- and 64-bit ELFs. Not needed here.
targ64_selvecs – 64-bit related stuff. Used when target can be both 32- and 64-bit, meaningless in our case.
targ_archs – name of the symbol storing bfd_arch_info_type structure. It provides description of architecture to support.
targ_cflags – looks like some hack to add extra CFLAGS to compiler. We don’t care.
targ_underscore – not sure what it is, should have no impact on our goals (possible values are yes or no)

To sum up, what we need to do on this step is to define default vector, we will later add to configure.ac and set name of architecture description structure. The structure itself will be defined later. Finally, I ended up with the following patch:

@@ -173,6 +173,7 @@ hppa*)     targ_archs=bfd_hppa_arch ;;
 i[3-7]86)   targ_archs=bfd_i386_arch ;;
 i370)     targ_archs=bfd_i370_arch ;;
 ia16)     targ_archs=bfd_i386_arch ;;
+lkv373a)  targ_archs=bfd_lkv373a_arch ;;
 lm32)           targ_archs=bfd_lm32_arch ;;
 m6811*|m68hc11*) targ_archs="bfd_m68hc11_arch bfd_m68hc12_arch bfd_m9s12x_arch bfd_m9s12xg_arch" ;;
 m6812*|m68hc12*) targ_archs="bfd_m68hc12_arch bfd_m68hc11_arch bfd_m9s12x_arch bfd_m9s12xg_arch" ;;
@@ -924,6 +925,10 @@ case "${targ}" in
     targ_defvec=iq2000_elf32_vec
     ;;

+  lkv373a*-*)
+    targ_defvec=lkv373a_elf32_vec
+    ;;
+
   lm32-*-elf | lm32-*-rtems*)
     targ_defvec=lm32_elf32_vec
     targ_selvecs=lm32_elf32_fdpic_vec

bfd/configure.ac

Now we need to define vector, we just declared to use for lkv373a architecture.

505     k1om_elf64_fbsd_vec)         tb="$tb elf64-x86-64.lo elfxx-x86.lo elf-ifunc.lo elf-nacl.lo elf64.lo $elf"; target_size=64 ;;
506     lkv373a_elf32_vec)           tb="$tb elf32-lkv373a.lo elf32.lo $elf" ;;
507     l1om_elf64_vec)              tb="$tb elf64-x86-64.lo elfxx-x86.lo elf-ifunc.lo elf-nacl.lo elf64.lo $elf"; target_size=64 ;;

Unfortunately, as we did modifications to .ac script, we now need to rebuild our configure. From my experience, any tinkering with autohell, after solving one problem, creates 5 more. We need to get into bfd directory and reconfigure project:

cd bfd
autoreconf

Now, if it worked for you, you should definitely go, play some lottery 🙂 . For me it said that I need exactly same version of autoconf as used by binutils’ developers. Because autoconf is so great, probably what I will show now is completely useless for anyone, but hacks I needed to do are at first to add:

20 m4_define([_GCC_AUTOCONF_VERSION], [2.69])

to the beginning of configure.ac file. Then bfd/doc/Makefile.am contains removed cygnus option at the beginning, in AUTOMAKE_OPTIONS, so we need to remove it. After that doing automake --add-missing, as autoreconf suggests, and then again autoreconf should solve the problem. But, as I said, this will probably not work for you. I can only wish you good luck.

(if were following the steps, you might have noticed that autoconf complained about not being in version 2.64 and we overridden version from 2.69 to 2.69 and it worked 🙂 , don’t ask me, why, please!)

After this step, compilation should start (and obviously will fail miserably on bfd as it misses few symbols). Now its time to make bfd compilable.

bfd/elf32-lkv373a.c

This file is meant to provide support for custom features of ELF file. As we don’t have any, we can safely do nothing here. Good template of such file is elf32-m88k.c as it does exactly this.

One thing that seem to be important here is EM value of architecture described. EM is an enum used in ELF file to define target architecture, so it might be required to adjust in our new elf32-lkv373a.c file. By the way definition of this value have to be added to include/elf/common.h:

433 /* LKV373A architecture */
434 #define EM_LKV373A              0x373a

It might also be a good idea to add it to elfcpp/elfcpp.h. To make the file compile, it is necessary to add following to bfd/bfd-in2.h as value of bfd_architecture enum:

2398   bfd_arch_lkv373a,    /* LKV373A */

bfd/archures.c

As we declared bfd_lkv373a_arch as symbol with CPU description structure, we now need to add this declaration to archures.c, as this is the file, where it will be used. We have to add:

611 extern const bfd_arch_info_type bfd_l1om_arch;
612 extern const bfd_arch_info_type bfd_lkv373a_arch;
613 extern const bfd_arch_info_type bfd_lm32_arch;

bfd/targets.c

Similar situation is in targets.c file. Here we have to provide declaration of our vector as bfd_target. This will be another structure, which seem to be generated automatically, so we should not care about it.

704 extern const bfd_target l1om_elf64_fbsd_vec;
705 extern const bfd_target lkv373a_elf32_vec;
706 extern const bfd_target lm32_elf32_vec;

bfd/cpu-lkv373a.c

This last file, we need in bfd, provides bfd_arch_info_type structure and… that’s it! Can be easily borrowed from cpu-microblaze.c with only slight modifications. One thing that needs explanation here is section_align_power. As far as I understand it, it is power of two to which the beginning of the section in memory must be aligned. It should be safe to put 0 here, as we are not going to load our ELF into memory.

This should close the bfd part of initialization. As you can see, there was no development at all to be done here. Let’s now go to opcodes library.

opcodes/configure.ac

At first we need to define objects to build for LKV373A architecture in opcodes library. This is quite similar to what we had to do in configure.ac of bfd library.

282         bfd_iq2000_arch)        ta="$ta iq2000-asm.lo iq2000-desc.lo iq2000-dis.lo iq2000-ibld.lo iq2000-opc.lo" using_cgen=yes ;;
283         bfd_lkv373a_arch)       ta="$ta lkv373a-dis.lo" ;;
284         bfd_lm32_arch)          ta="$ta lm32-asm.lo lm32-desc.lo lm32-dis.lo lm32-ibld.lo lm32-opc.lo lm32-opinst.lo" using_cgen=yes ;;

Hopefully, -dis file will be enough to be implemented. I’ve made a copy from microblaze configuration. The same way we will copy whole source file and any related headers in the next step.

Now, similarly to bfd’s configure.ac, we have to reconfigure it. And again, nobody knows what errors we will encounter.

opcodes/disassemble.c

The only thing that have to be done here is to set pointer of disassemble function. For this following snippets should be added:

53 #define ARCH_lkv373a

255 #ifdef ARCH_lkv373a
256     case bfd_arch_lkv373a:
257       disassemble = print_insn_lkv373a;
258       break;
259 #endif

And to disassemble.h:

62 extern int print_insn_lkv373a           (bfd_vma, disassemble_info *);

opcodes/lkv373a-dis.c

This is, where real stuff will happen. As our goal, for now, is not to make implementation of LKV373A architecture, but rather set everything up, so objdump will build, we can copy source file from microblaze-dis.c. It is also required to copy headers, related to MicroBlaze, used by this file, so:

opcodes/microblaze-dis.h
opcodes/microblaze-opc.h
opcodes/microblaze-opcm.h

And change include directives in them to link to lkv373a file, rather than microblaze ones.

Now, optionally we could change names of any symbols referring to name microblaze, but this should not be required, as original microblaze files should not be included in the build. The only change than need to be done is print_insn_microblaze into print_insn_lkv373a, as this is what we added to disassemble.c.

You should now be able to compile working objdump with LKV373A support (of course with wrong implementation, for now). We can now verify that everything works on slightly modified ELF file for MicroBlaze architecture (EM field must point to LKV373A – value must be 0x373a). Well done!

NOTE: all the steps, done till now are available on tutorial-setup tag in repository on Github.

Functions to implement

Now, finally the real fun starts. Bindings between opcodes library and objdump itself, require at least print_insn_lkv373a to be implemented.

What should happen inside this function is quite simple and can be described in following steps:

Gets bfd_vma and struct disassemble_info (called info below) as parameters
Read raw data containing instructions using info->read_memory_func
Call info->memory_error_func in case of any errors
Use info->fprintf_func to print disassembled instruction into info->stream
Optionally use info->symbol_at_address_func to determine if there is any symbol declared at address decoded from instructions
If symbol exists, call info->print_address_func
Return number of bytes consumed

Following is some documentation, I wrote for easier implementation (mostly translated inline comments), of functions to be called:

  /**
   * \brief Function used to get bytes to disassemble
   *
   * \param memaddr Address of the current instruction
   * \param myaddr Buffer, where the bytes will be stored
   * \param length Number of bytes to read
   * \param dinfo Pointer to info structure
   *
   * \return errno value or 0 for success
   */
  int (*read_memory_func)
    (bfd_vma memaddr, bfd_byte *myaddr, unsigned int length,
     struct disassemble_info *dinfo);

  /**
   * \brief Call if unrecoverable error occurred
   *
   * \param status errno from read_memory_func
   * \param memaddr Address of current instruction
   * \param dinfo Pointer to info structure
   */
  void (*memory_error_func)
    (int status, bfd_vma memaddr, struct disassemble_info *dinfo);

  /**
   * \brief Pointer to fprintf
   *
   * \param stream Pass info->stream here
   * \param char Format string
   * \param ... vargs
   *
   * \return Number of characters printed
   */
  typedef int (*fprintf_ftype) (void *, const char*, ...) ATTRIBUTE_FPTR_PRINTF_2;

  /**
   * \brief Determines if there is a symbol at the given ADDR
   *
   * \param addr Address to check
   * \param dinfo Pointer to info structure
   *
   * \return If there is returns 1, otherwise returns 0
   * \retval 1 If there is any symbol at ADDR
   * \retval 0 If there is no symbol at ADDR
   */
  int (* symbol_at_address_func)
    (bfd_vma addr, struct disassemble_info *dinfo);

  /**
   * \brief Print symbol name at ADDR
   *
   * \param addr Address at which symbol exists
   * \param dinfo Pointer to info structure
   */
  /* Function called to print ADDR.  */
  void (*print_address_func)
    (bfd_vma addr, struct disassemble_info *dinfo);

For easier start of development, this commit can be used as template. You can find effects of implementation according to this description on lkv373a branch of my binutils fork on Github. After this step, you should have working objdump, able to disassemble architecture of your choice.

Alternative way

According to binutils’ documentation, porting to new architectures should be done using different approach. Instead of copying sources from other architectures, developers should write CPU description files (cpu/ directory) and then use CGEN to generate all necessary files. However, I found these files way too complicated comparing to goal, I wanted to achieve, therefore I used the shortcut. In reality, however, this might be a better way, as the final result should be the support for new architecture not only in objdump, but also in e.g. GAS (GNU assembler). If you want to go that way, another useful resource might be description of CPU description language.

Plans for the future

As I am now able to speed up reverse engineering of both instruction set and LKV373A firmware, I am planning to create public repository of my progress and guess operations done by some more opcodes as I already know only few of them. So, I will probably push some more commits to binutils repo as well. I hope this will enable me to gain some more knowledge about LKV373A and allow, me or someone else, to reverse engineer second part of the firmware, which seem to be way more interesting that the one, I was reverse engineering till now.