SADVE – tiny program for computing #define values

While tinkering with spy camera, I found one detail that is significantly slowing the process of reverse engineering and debugging the applications, installed on its embedded Linux platform – finding final values of preprocessor directives and sometimes also results of sizeof() operator.

As I am not aware of any existing solution for that problem (I guess there might be some included in one of the more sophisticated IDEs, however I use Vim for development) it is good reason to create one. By the way I used cmake template I published some days ago to bootstrap the project.

Usage

Ease of use was the main goal here, as it is obviously possible to create improvised solution by creating hello-world type of program, including required headers and printing the symbol we want to compute value of.

So, to be able to use SADVE, you just have to clone the repo and use standard cmake installation commands and you’re done:

mkdir -p build
cd build
cmake ..
make
sudo make install

Then you can call it like below:

sadve -d AF_INET sys/socket.h

And you should get 2 as an answer. That’s it. If instead you want to get size of some structure, you can type:

sadve -s sockaddr sys/socket.h

And you should get size of sockaddr structure. Obviously, you can see full usage with sadve --help.

Internals

Internally the program simply automates the process I described in the first paragraph of Usage – it applies what is desired by user to hello-world-like template and compiles. Therefore it might not be the best idea to make it a backend for web service available for general public, at least without a lot of isolation and input sanitization. However for private usage this should be enough. If you are interested in doing such task with cmake, I encourage you to dive into source code on Github.

To speed the process up, I had to store all cmake build files in ~/.cache with no interface for cleaning it up.

Using CMocka for unit testing C code

Writing unit tests along with the source code (or even before the code itself – see TDD) is currently very popular among programmers writing in languages like Java or C#. For C code, however it is a bit different. There are only a few frameworks enabling the possibility to write unit tests. One of them is quite special – it allows to mock functions. And its name is CMocka. Unfortunately there are not many resources that describes the process of setting up cmocka, especially together with cmake to allow programmers add new executables, tests and mocks without unnecessary overhead. But before showing how to do it, let’s go back to basics (if you already know them, you can skip next heading).

What is mocking?

Mocking is a mechanism that allows to substitute object, we do not want to test, with empty implementation, which we can further configure to do whatever we like, like simulate errors. Usually objects are mocked, because we import them from some external library and this is not purpose of unit testing to test these external dependencies. This is how they work in object-oriented languages like the ones mentioned at the beginning. In C, it is only a bit different in that, instead of mocking the object (which does not exist in C), we mock functions. Similarly to mocking object, this allows us to control behavior of external function and i.e. test reaction of our code to errors.

To give very short impression about the possibilities, it opens let’s say we want to test function that accepts connection to a socket. How could we test such a function without touching the accept() function? Probably we would need another program that performs connect(). In such a simple case it requires us to write second program to cover just one case. Then, what if we want to check reaction for failed connection establishment? Manpage tells us, ECONNABORTED is returned in that case. So, how to force our test program to break the connection before second side gets return from accept() function? Do you see how complex it gets?

But, what if we could link our test program (the one that calls function using accept()) to our own accept()? Then at first we could write our fake function so it pretends that real socket had been created and return some positive integer. In second case, we could just set errno to ECONNABORTED and return -1. That’s it!

This, and by the way, much more is possible with CMocka. Let’s see how to configure simplest possible project to use cmocka with cmake.

Hey cmake, do my unit tests!

For purpose of this tutorial, let’s say we have quite simple project. We just started it, so it’s perfect time to enable unit testing for TDD approach. It consists of one console (CLI) application split into two modules. First one – program is just main() function and another function it calls. Second one – module provide just one function that is also called by main(). For configuration, we have two extremely short CMakeLists.txt files:

CMakeLists.txt
cmake_minimum_required (VERSION 3.0) project (cmocka_template) add_subdirectory(src)
src/CMakeLists.txt
add_executable(program program.c module.c)

You can view the code itself on Github. With this configuration it is now possible to make the project with cmake. To enable testing, at first we have to append following script to main CMakeLists.txt:

27 list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake/Modules")
28 
29 # cmocka
30 option(ENABLE_TESTS "Perform unit tests after build" OFF)
31 if (ENABLE_TESTS)
32   find_package(CMocka CONFIG REQUIRED)
33   include(AddCMockaTest)
34   include(AddMockedTest)
35   add_subdirectory(test)
36   enable_testing()
37 endif(ENABLE_TESTS)

In line 27, we tell cmake to look for additional cmake scripts to include in cmake/Modules directory. This would be required by includes in lines 33 and 34.

Then, we define option to enable/disable tests. This will allow users not willing to run unit tests, build the project without having to satisfy dependency to CMocka in line 32.

In that line FindCmocka will be called and will setup few variables. Of our interest would be ${CMOCKA_LIBRARIES}. This points to CMocka library, we have to link with all our test programs.

Includes in lines 33 and 34, provides functions with the analogous names, first one is part of CMocka sources, the other define simple wrapper on the first one, so user do not have to type all the parameters that usually stays the same from test to test.

At last we include another build script from subdirectory and enable testing.

With help of the mentioned add_mocked_test wrapper in simplest case, where we do not use any functions external to the module, all we have to do is call add_mocked_test(module). However, if we do call some external functions, we have to call it like below and provide sources with these functions:

add_mocked_test(program SOURCES ${CMAKE_SOURCE_DIR}/src/module.c)

Alternatively, we can also join many sources into library and pass it like that:

add_mocked_test(module LINK_LIBRARIES mylib)

This finally becomes -lmylib in ld.

Simplest test

As we should have complete build script, we can now write simple test:

test/test_main.c
24 #include <stdarg.h> 25 #include <stddef.h> 26 #include <setjmp.h> 27 #include <cmocka.h> 28 29 #define main __real_main 30 #include "program.c" 31 #undef main 32 33 typedef struct {int a; int b; int expected;} vector_t; 34 35 const vector_t vectors[] = { 36 {0,1,0}, 37 {1,0,0}, 38 {1,1,1}, 39 {2,3,6}, 40 }; 41 42 static void test_internal(void **state) 43 { 44 int actual; 45 int i; 46 47 for (i = 0; i < sizeof(vectors)/sizeof(vector_t); i++) 48 { 49 /* get i-th inputs and expected values as vector */ 50 const vector_t *vector = &vectors[i]; 51 52 /* call function under test */ 53 actual = internal(vector->a, vector->b); 54 55 /* assert result */ 56 assert_int_equal(vector->expected, actual); 57 } 58 } 59 60 int main() 61 { 62 const struct CMUnitTest tests[] = { 63 cmocka_unit_test(test_internal), 64 }; 65 66 return cmocka_run_group_tests(tests, NULL, NULL); 67 }

At first we have to include some headers. Then in lines 29 and 31, preprocessor hack is done. This is because we already have a main() function used to define test suite. So to not have it redeclared, we temporarily change its name to __real_main, so this is how our program.c/main() will be called, when included in line 30. For tests with modules not containing main() function this is superfluous.

In lines 33-40, we define sets of data to feed into our function with expected results. This is useful if we have many such vectors. For single inputs/outputs pair this is unnecessary. In line 50 one vector is extracted from that array.

Then in line 53 function under test is called and in line 56 its value is asserted with assert_int_equal.

Finally in main() function we have to define list of tests in this suite and call cmocka_run_group_tests to do rest of the job for us. Done.

add_mocked_test internals

For better understanding of the process, let’s have a short look into add_mocked_test() function. It’s source is as simple as:

cmake/Modules/AddMockedTest.cmake
32 function(add_mocked_test name) 33 # parse arguments passed to the function 34 set(options ) 35 set(oneValueArgs ) 36 set(multiValueArgs SOURCES COMPILE_OPTIONS LINK_LIBRARIES LINK_OPTIONS) 37 cmake_parse_arguments(ADD_MOCKED_TEST "${options}" "${oneValueArgs}" 38 "${multiValueArgs}" ${ARGN} ) 39 40 # define test 41 add_cmocka_test(test_${name} 42 SOURCES test_${name}.c ${ADD_MOCKED_TEST_SOURCES} 43 COMPILE_OPTIONS ${DEFAULT_C_COMPILE_FLAGS} 44 ${ADD_MOCKED_TEST_COMPILE_OPTIONS} 45 LINK_LIBRARIES ${CMOCKA_LIBRARIES} 46 ${ADD_MOCKED_TEST_LINK_LIBRARIES} 47 LINK_OPTIONS ${ADD_MOCKED_TEST_LINK_OPTIONS}) 48 49 # allow using includes from src/ directory 50 target_include_directories(test_${name} PRIVATE ${CMAKE_SOURCE_DIR}/src) 51 endfunction(add_mocked_test)

At the beginning, arguments are parsed, which is outside of the scope of this article (those interested could read the official documentation). What is important in this part is that SOURCES from the function call becomes ADD_MOCKED_TEST_SOURCES, COMPILE_OPTIONS are ADD_MOCKED_TEST_COMPILE_OPTIONS and so on.

Then in line 41 add_cmocka_test function is called. This functions does some of the job for us. For example, we do not have to worry about defining executable, linking libraries to it and making it a test called by CTest.

Then in line 42 source file list is passed to it, so it can link them into final executable. We can also pass our own compiler and linker flags, so they are routed to it in lines 43-44 and 47.

Last thing it does is to link this test executable with all required libraries and the only requirements for CMocka to work is to pass its shared library using CMOCKA_LIBRARIES variable, which is available thanks to finding CMocka package in CMakeLists.txt.

Beside that, we make C headers in src/ directory visible to our test program in line 50. That’s it.

Enabling mocks

Killer feature of CMocka however is its API for creating mocked functions. To use it, on CMake side all we have to do, beside what we already did is add flags for linker (to be precise -Wl,--wrap=function_name for every function to be mocked). As can be seen in add_mocked_test source, we could use LINK_OPTIONS argument for that purpose. However it is timesaving to have this integrated into the function’s interface. All we have to do is add a loop for creating argument list:

cmake/Modules/AddMockedTest.cmake
40 # create link flags for mocks 41 set(link_flags "") 42 foreach (mock ${ADD_MOCKED_TEST_MOCKS}) 43 set(link_flags "${link_flags} -Wl,--wrap=${mock}") 44 endforeach(mock)

Then add new argument and modify LINK_OPTIONS of add_cmocka_test to pass that list:

53                   LINK_OPTIONS ${link_flags} ${ADD_MOCKED_TEST_LINK_OPTIONS})

Now, it should work. Then on the source side we can create new test for existing test_program test suite:

79 static void test_main(void **state)
80 {
81   int expected = 0;
82   int actual;
83 
84   /* expect parameters to printf call */
85   expect_string(__wrap_printf, format, "%d\n");
86   expect_value(__wrap_printf, param1, 60);
87 
88   /* printf should return 3 */
89   will_return(__wrap_printf, 3);
90 
91   /* call __real_main as this is main() from program.c */
92   actual = __real_main(0, NULL);
93 
94   /* assert that main return success */
95   assert_int_equal(expected, actual);
96 }

And write mocked object:

43 int __wrap_printf (const char *format, ...)
44 {
45   int param1;
46 
47   /* extract result from vargs ('printf("%d\n", result)') */
48   va_list args;
49   va_start(args, format);
50   param1 = va_arg(args, int);
51   va_end(args);
52 
53   /* ensure that parameters match expecteds in expect_*() calls  */
54   check_expected_ptr(format);
55   check_expected(param1);
56 
57   /* get mocked return value from will_return() call */
58   return mock();
59 }

Now let’s take a look at what is happening here. At first in function test_main, in lines 85-86 we are telling CMocka that printf function is expected to be called with parameter format of a certain content and param1 should have value 60. param1 here is first variadic argument for printf function and its name is completely arbitrary. The key to working mock is to use this same name inside of mocked function.

In line 89, we say mocked printf to return 3 (this is number of bytes written to console and will then be ignored by code in __real_main, but for completeness it is set to proper value here).

Finally, we call the function under test and assert its result, the same way as in the example without mocks above.

On the other side is the mocked printf implementation. In languages like Java it is common to have this mock automatically generated based on the rules provided by user. In C with CMocka it is not so easy, we have to write the mock ourselves. However it is not very hard.

As we chosen variadic function for our mock and we want to check if these variadic arguments are as expected, we have to extract them at first, which is done in lines 48-51. As a result we have param1 variable (notice that it is local variable). Remember that this must match exactly what we declared in test body using one of expect_* functions. This name is how function-parameter pair is extracted from internal dictionaries.

Then in lines 54-55 arguments are checked for having expected contents. Note that there is no difference in how we treat positional and variadic parameters in this step.

Finally in line 58, we extract value that we want to be returned from printf (declared in line 89). This may seem superfluous in such a simple case, however if we want use this mocked printf in more than one test, it is really useful feature. Have also in mind that linker does not give us any interface to have two or more mocks for one function. So this way we have chance to write one universal mock.

Final word

This tutorial was made during my work on template for CMocka+CMake projects. So now, as you understand how it is done, you can just go to my Github and clone the template to make it part of your project. The template is licensed under MIT license, so I don’t care about what you do with it, as long as you leave original copyright.

LKV373A: porting objdump

This article is part of series about reverse-engineering LKV373A HDMI extender. Other parts are available at:

After part number four, we already have ELF file, storing all the data we found in firmware image, described in a way that should make our analysis easier. Moreover, we have ability to define new symbols inside our ELF file. The next step is to add support for our custom architecture into objdump and this is what I want to show in this tutorial.

Finding best architecture to copy

If we want to set up new architecture in objdump code, we need to learn interfaces that need to be implemented. It would be easier if we can use some existing code to do so. After some looking into the binutils’ code I learned that what is of special interest are bfd and opcodes libraries. They contain code dedicated to particular architectures. The first one seem to be related to object file handling (which in our case is ELF), so we should not tinker with it too much. Second one is related to disassembling binary programs, so is what we are looking for.

I did some quick examination of source code related to popular architectures and it seems not to be easy to adjust to our needs. Architecture I found to be best suitable for modification is Microblaze. Its source seem to be quite well-written, clean and short. Also from my research of architecture name for LKV373A (part 2, failed by the way) I also remember it is quite similar to the one present in LKV373A, so it is even better decision to use it.

Compiling objdump for target architecture

At first it is useful to learn how to compile objdump, so it will be able to disassemble program written for our target. Microblaze is not really a mainstream architecture, so there aren’t many programs compiled for it available online after typing 'microblaze program elf' into usual search engine. However, I was able to find 2 of them, so I was able to verify that compilation worked. If you can’t find any, I uploaded these to MEGA, so they can serve as test cases. First one is minimal valid file, the other one is quite huge.

Compilation is very easy. The only thing that needs to be done beside usual ./configure && make && make install is adding target option to configure script. So, the script looks as follows:

./configure --target=microblaze-elf

Of course, install step can safely be skipped as well as compilation of other tools, beside objdump. objdump itself seem to be built using make binutils/objdump. However it can’t be build successfully using that shortcut, so whole binutils package must be configured the way, everything not buildable is excluded from the build.

Setting up own architecture

Next step is to add support for our brand new, custom architecture to binutils’ configuration files and copy microblaze sources, so they will simulate our architecture, until we will write our own implementation. Then it should be possible to test objdump again, against our sample microblaze programs and disassembly should still work.

Even without any modification to binutils’ source or configs, it should be possible to configure it for any random architecture. The only constraint is format of the target string: ARCH-OS-FORMAT, where FORMAT is most likely to be elf. So, if we pass lkv373a-unknown-elf as target, it will work. -unknown part is usually skipped and this will not work. If we need it to work, config.sub must be modified. config.sub is used to convert any string, passed to configure into canonical form, so in our case lkv373a-unknown-elf. If it detects, that it is already in canonical form, it does nothing.

Final configure command will be slightly more complex, as we have to disable some parts, that are not of our interest and requires additional effort to work:

./configure --target=lkv373a-unknown-elf --disable-gas --disable-ld --disable-gdb

Although passing something random as target option works on configure stage, it will obviously fail on make stage. What make is doing at first is configuring all the sublibraries. What is of our interest is bfd and opcodes. And the first one fails. So this is the first problem, we need to get rid of.

bfd/config.bfd

The purpose of this file is to set some environment variables depending on target architecture. If it does not know the architecture, it returns error to caller, which is probably bfd’s configure script, called by make. According to documentation in file header, it sets following variables:

  1. targ_defvec – default vector. This links target to list of objects that will provide support for ELF file built for specific architecture (stored in bfd/configure.ac)
  2. targ_selvecs – list of other selected vectors. Useful e.g. when we need support for both 32- and 64-bit ELFs. Not needed here.
  3. targ64_selvecs – 64-bit related stuff. Used when target can be both 32- and 64-bit, meaningless in our case.
  4. targ_archs – name of the symbol storing bfd_arch_info_type structure. It provides description of architecture to support.
  5. targ_cflags – looks like some hack to add extra CFLAGS to compiler. We don’t care.
  6. targ_underscore – not sure what it is, should have no impact on our goals (possible values are yes or no)

To sum up, what we need to do on this step is to define default vector, we will later add to configure.ac and set name of architecture description structure. The structure itself will be defined later. Finally, I ended up with the following patch:

@@ -173,6 +173,7 @@ hppa*)     targ_archs=bfd_hppa_arch ;;
 i[3-7]86)   targ_archs=bfd_i386_arch ;;
 i370)     targ_archs=bfd_i370_arch ;;
 ia16)     targ_archs=bfd_i386_arch ;;
+lkv373a)  targ_archs=bfd_lkv373a_arch ;;
 lm32)           targ_archs=bfd_lm32_arch ;;
 m6811*|m68hc11*) targ_archs="bfd_m68hc11_arch bfd_m68hc12_arch bfd_m9s12x_arch bfd_m9s12xg_arch" ;;
 m6812*|m68hc12*) targ_archs="bfd_m68hc12_arch bfd_m68hc11_arch bfd_m9s12x_arch bfd_m9s12xg_arch" ;;
@@ -924,6 +925,10 @@ case "${targ}" in
     targ_defvec=iq2000_elf32_vec
     ;;

+  lkv373a*-*)
+    targ_defvec=lkv373a_elf32_vec
+    ;;
+
   lm32-*-elf | lm32-*-rtems*)
     targ_defvec=lm32_elf32_vec
     targ_selvecs=lm32_elf32_fdpic_vec

bfd/configure.ac

Now we need to define vector, we just declared to use for lkv373a architecture.

505     k1om_elf64_fbsd_vec)         tb="$tb elf64-x86-64.lo elfxx-x86.lo elf-ifunc.lo elf-nacl.lo elf64.lo $elf"; target_size=64 ;;
506     lkv373a_elf32_vec)           tb="$tb elf32-lkv373a.lo elf32.lo $elf" ;;
507     l1om_elf64_vec)              tb="$tb elf64-x86-64.lo elfxx-x86.lo elf-ifunc.lo elf-nacl.lo elf64.lo $elf"; target_size=64 ;;

Unfortunately, as we did modifications to .ac script, we now need to rebuild our configure. From my experience, any tinkering with autohell, after solving one problem, creates 5 more. We need to get into bfd directory and reconfigure project:

cd bfd
autoreconf

Now, if it worked for you, you should definitely go, play some lottery 🙂 . For me it said that I need exactly same version of autoconf as used by binutils’ developers. Because autoconf is so great, probably what I will show now is completely useless for anyone, but hacks I needed to do are at first to add:

20 m4_define([_GCC_AUTOCONF_VERSION], [2.69])

to the beginning of configure.ac file. Then bfd/doc/Makefile.am contains removed cygnus option at the beginning, in AUTOMAKE_OPTIONS, so we need to remove it. After that doing automake --add-missing, as autoreconf suggests, and then again autoreconf should solve the problem. But, as I said, this will probably not work for you. I can only wish you good luck.

(if were following the steps, you might have noticed that autoconf complained about not being in version 2.64 and we overridden version from 2.69 to 2.69 and it worked 🙂 , don’t ask me, why, please!)

After this step, compilation should start (and obviously will fail miserably on bfd as it misses few symbols). Now its time to make bfd compilable.

bfd/elf32-lkv373a.c

This file is meant to provide support for custom features of ELF file. As we don’t have any, we can safely do nothing here. Good template of such file is elf32-m88k.c as it does exactly this.

One thing that seem to be important here is EM value of architecture described. EM is an enum used in ELF file to define target architecture, so it might be required to adjust in our new elf32-lkv373a.c file. By the way definition of this value have to be added to include/elf/common.h:

433 /* LKV373A architecture */
434 #define EM_LKV373A              0x373a

It might also be a good idea to add it to elfcpp/elfcpp.h. To make the file compile, it is necessary to add following to bfd/bfd-in2.h as value of bfd_architecture enum:

2398   bfd_arch_lkv373a,    /* LKV373A */

bfd/archures.c

As we declared bfd_lkv373a_arch as symbol with CPU description structure, we now need to add this declaration to archures.c, as this is the file, where it will be used. We have to add:

611 extern const bfd_arch_info_type bfd_l1om_arch;
612 extern const bfd_arch_info_type bfd_lkv373a_arch;
613 extern const bfd_arch_info_type bfd_lm32_arch;

bfd/targets.c

Similar situation is in targets.c file. Here we have to provide declaration of our vector as bfd_target. This will be another structure, which seem to be generated automatically, so we should not care about it.

704 extern const bfd_target l1om_elf64_fbsd_vec;
705 extern const bfd_target lkv373a_elf32_vec;
706 extern const bfd_target lm32_elf32_vec;

bfd/cpu-lkv373a.c

This last file, we need in bfd, provides bfd_arch_info_type structure and… that’s it! Can be easily borrowed from cpu-microblaze.c with only slight modifications. One thing that needs explanation here is section_align_power. As far as I understand it, it is power of two to which the beginning of the section in memory must be aligned. It should be safe to put 0 here, as we are not going to load our ELF into memory.

This should close the bfd part of initialization. As you can see, there was no development at all to be done here. Let’s now go to opcodes library.

opcodes/configure.ac

At first we need to define objects to build for LKV373A architecture in opcodes library. This is quite similar to what we had to do in configure.ac of bfd library.

282         bfd_iq2000_arch)        ta="$ta iq2000-asm.lo iq2000-desc.lo iq2000-dis.lo iq2000-ibld.lo iq2000-opc.lo" using_cgen=yes ;;
283         bfd_lkv373a_arch)       ta="$ta lkv373a-dis.lo" ;;
284         bfd_lm32_arch)          ta="$ta lm32-asm.lo lm32-desc.lo lm32-dis.lo lm32-ibld.lo lm32-opc.lo lm32-opinst.lo" using_cgen=yes ;;

Hopefully, -dis file will be enough to be implemented. I’ve made a copy from microblaze configuration. The same way we will copy whole source file and any related headers in the next step.

Now, similarly to bfd’s configure.ac, we have to reconfigure it. And again, nobody knows what errors we will encounter.

opcodes/disassemble.c

The only thing that have to be done here is to set pointer of disassemble function. For this following snippets should be added:

53 #define ARCH_lkv373a
255 #ifdef ARCH_lkv373a
256     case bfd_arch_lkv373a:
257       disassemble = print_insn_lkv373a;
258       break;
259 #endif

And to disassemble.h:

62 extern int print_insn_lkv373a           (bfd_vma, disassemble_info *);

opcodes/lkv373a-dis.c

This is, where real stuff will happen. As our goal, for now, is not to make implementation of LKV373A architecture, but rather set everything up, so objdump will build, we can copy source file from microblaze-dis.c. It is also required to copy headers, related to MicroBlaze, used by this file, so:

  • opcodes/microblaze-dis.h
  • opcodes/microblaze-opc.h
  • opcodes/microblaze-opcm.h

And change include directives in them to link to lkv373a file, rather than microblaze ones.

Now, optionally we could change names of any symbols referring to name microblaze, but this should not be required, as original microblaze files should not be included in the build. The only change than need to be done is print_insn_microblaze into print_insn_lkv373a, as this is what we added to disassemble.c.

You should now be able to compile working objdump with LKV373A support (of course with wrong implementation, for now). We can now verify that everything works on slightly modified ELF file for MicroBlaze architecture (EM field must point to LKV373A – value must be 0x373a). Well done!

NOTE: all the steps, done till now are available on tutorial-setup tag in repository on Github.

Functions to implement

Now, finally the real fun starts. Bindings between opcodes library and objdump itself, require at least print_insn_lkv373a to be implemented.

What should happen inside this function is quite simple and can be described in following steps:

  1. Gets bfd_vma and struct disassemble_info (called info below) as parameters
  2. Read raw data containing instructions using info->read_memory_func
  3. Call info->memory_error_func in case of any errors
  4. Use info->fprintf_func to print disassembled instruction into info->stream
  5. Optionally use info->symbol_at_address_func to determine if there is any symbol declared at address decoded from instructions
  6. If symbol exists, call info->print_address_func
  7. Return number of bytes consumed

Following is some documentation, I wrote for easier implementation (mostly translated inline comments), of functions to be called:

  /**
   * \brief Function used to get bytes to disassemble
   *
   * \param memaddr Address of the current instruction
   * \param myaddr Buffer, where the bytes will be stored
   * \param length Number of bytes to read
   * \param dinfo Pointer to info structure
   *
   * \return errno value or 0 for success
   */
  int (*read_memory_func)
    (bfd_vma memaddr, bfd_byte *myaddr, unsigned int length,
     struct disassemble_info *dinfo);
  /**
   * \brief Call if unrecoverable error occurred
   *
   * \param status errno from read_memory_func
   * \param memaddr Address of current instruction
   * \param dinfo Pointer to info structure
   */
  void (*memory_error_func)
    (int status, bfd_vma memaddr, struct disassemble_info *dinfo);
  /**
   * \brief Pointer to fprintf
   *
   * \param stream Pass info->stream here
   * \param char Format string
   * \param ... vargs
   *
   * \return Number of characters printed
   */
  typedef int (*fprintf_ftype) (void *, const char*, ...) ATTRIBUTE_FPTR_PRINTF_2;
  /**
   * \brief Determines if there is a symbol at the given ADDR
   *
   * \param addr Address to check
   * \param dinfo Pointer to info structure
   *
   * \return If there is returns 1, otherwise returns 0
   * \retval 1 If there is any symbol at ADDR
   * \retval 0 If there is no symbol at ADDR
   */
  int (* symbol_at_address_func)
    (bfd_vma addr, struct disassemble_info *dinfo);
  /**
   * \brief Print symbol name at ADDR
   *
   * \param addr Address at which symbol exists
   * \param dinfo Pointer to info structure
   */
  /* Function called to print ADDR.  */
  void (*print_address_func)
    (bfd_vma addr, struct disassemble_info *dinfo);

For easier start of development, this commit can be used as template. You can find effects of implementation according to this description on lkv373a branch of my binutils fork on Github. After this step, you should have working objdump, able to disassemble architecture of your choice.

Alternative way

According to binutils’ documentation, porting to new architectures should be done using different approach. Instead of copying sources from other architectures, developers should write CPU description files (cpu/ directory) and then use CGEN to generate all necessary files. However, I found these files way too complicated comparing to goal, I wanted to achieve, therefore I used the shortcut. In reality, however, this might be a better way, as the final result should be the support for new architecture not only in objdump, but also in e.g. GAS (GNU assembler). If you want to go that way, another useful resource might be description of CPU description language.

Plans for the future

As I am now able to speed up reverse engineering of both instruction set and LKV373A firmware, I am planning to create public repository of my progress and guess operations done by some more opcodes as I already know only few of them. So, I will probably push some more commits to binutils repo as well. I hope this will enable me to gain some more knowledge about LKV373A and allow, me or someone else, to reverse engineer second part of the firmware, which seem to be way more interesting that the one, I was reverse engineering till now.