NOTE: This post was imported from my previous blog – v3l0c1r4pt0r.tk. It was originally published on 22nd June 2014.
As promised in my previous post, I’m publishing a description of Microsoft’s SDC file format. First I’d like to explain what an SDC file is. SDC stands for Secure Download Cabinet/Secure Digital Container. It is used by Microsoft in its Dreamspark program (formerly MSDNAA). In theory it is a secure container that can be sent over the Internet without additional encryption and should prevent its content from being read by any third party. But that’s the theory; let’s look at how it works in practice.
Overview
First let’s look at the packing process. Let’s say we are at Microsoft and we want to “secure” some data. We have a file (or possibly a few files), e.g. a Windows ISO. Next we generate a random number and write it down somewhere. Now we use the least significant byte of that number to XOR EVERY single byte of that file. Now it may be considered secure 🙂 But some day Microsoft realized it isn’t enough. So what did they do? They used deflate (the compression method used e.g. in zip and gzip). Actually there are two versions of the deflate stream: one with all the headers a tool like binwalk needs to recognize the compression method, and another without any header.
Now it is time to combine all the files into one. Of course we still need to know some information about them (e.g. their size before/after compression and their file name). After concatenation we need to compute the CRC of all the data we have so far.
Finally we need to build the file header. First we write the header size. Then the actual header starts. This matters because it is where the encrypted region begins. It holds some information about the header itself and then about each file. It is possibly padded with random data (I don’t know for sure). Now we need two random 32-byte keys consisting of printable characters. We use the first to encrypt the file names and the second to encrypt the whole header (except its size). Finally we concatenate the header with the rest, and there we have an SDC file.
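To make the two building blocks above concrete, here is a minimal Python sketch of the one-byte XOR and of both deflate variants (with the zlib header and without any header). The function names are made up for illustration, and the sketch says nothing about the order of the steps or the compression level the real packer uses.

```python
import zlib


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the least significant byte of key."""
    k = key & 0xFF
    return bytes(b ^ k for b in data)


def deflate(data: bytes, raw: bool) -> bytes:
    """Compress with deflate; raw=True omits the zlib header and checksum."""
    wbits = -zlib.MAX_WBITS if raw else zlib.MAX_WBITS
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return comp.compress(data) + comp.flush()


def inflate(data: bytes, raw: bool) -> bytes:
    """Inverse of deflate(); raw must match the variant used to compress."""
    wbits = -zlib.MAX_WBITS if raw else zlib.MAX_WBITS
    return zlib.decompress(data, wbits)
```

XORing twice with the same key gives back the original bytes, so the same helper serves for packing and unpacking. The zlib-wrapped variant starts with the byte 0x78, which is the kind of marker a signature scanner can pick up; the raw variant has no such marker.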
Header format
So, we have a basic overview of the format; now let’s look at the details. You think it isn’t secure, huh? It gets worse. On the right you can see an example header after decryption. The first four bytes give the size of the header, counting from the next byte. After that we have an area encrypted using Blowfish (sometimes referred to as blowfish-compat) in ECB mode (Electronic CodeBook) with the header key stored in the edv variable of the web page linked from the SDX file. In that area we have 3 dwords describing the header itself. The first is the header signature. It can be one of the following values: 0xb4, 0xb5, 0xc4, 0xd1. All I know for now is that the one with sig = 0xd1 can store files larger than 4 GiB. The next value is an interesting one. It looks like it is used to “encrypt” the file name in memory so that static analysis would not find it. As in the other cases it is “very advanced encryption”, the same situation as with the whole file: take the whole buffer, iterate through it and XOR with the value’s LSB. I have to admit that this one even does the job. Next we have something called header size. Actually it is probably the number of files packed in the container. While reversing I concluded that SDM iterates from 0 to that number, and on each iteration reads 0x38 bytes from the file. Next it probably reads fileNameLength and fileName, so the whole header must have the format:
<size><description><0x38-bytes-of-file-description><fileNameLength><fileName><0x38...>
and so on until we reach headerSize. Then come a lot of values that are not needed to unpack the file. The first of them is the offset of the file name. While its value is usually 0 (at least in newer headers with Blowfish encryption), it is probably still possible to encounter a file with a value greater than zero. If that happens, the first thing to do is probably to decrypt the file name and then move the pointer that many bytes to the right. The next value describes the file attributes. I didn’t bother working out which bit means which attribute, but I suppose it is the same map as in FAT (see my libfatdino library). The next three values are timestamps (creation, access and modification). They are all in the Windows 64-bit format called “file time”, used for instance by the .NET Framework’s DateTime class (the DateTime.FromFileTime method): the number of 100-nanosecond ticks elapsed since the epoch at midnight on 1st January 1601, and I suppose the value is unsigned. That format is interesting to compare with the other approach to storing a date in a 64-bit value, used on Linux. The UNIX timestamp traditionally uses 1st January 1970 as its epoch and is usually a signed value. It isn’t as precise as the Windows one (it counts only seconds), but it ends about 300 billion (10^9) years in the future and, since it is signed, reaches equally far into the past. By comparison, the Windows date will wrap around the year 60000 AD and cannot store any date before 1601. I know that is still unreachable (like 4 billion computers seemed in the 80s 🙂) but it is good to know :) After that we have the size of the compressed file (beware of the difference between the 64-bit variant and the 32-bit one). When we have a container with only one file, the equation
compressedSize + headerSize + 4 == sdcSize
should always be true. The next value is the uncompressed size of the file, which can be used to check whether the file has been downloaded completely. After that there is a boolean that indicates whether the file is deflated (compressed), another one-byte value that is probably reserved for future use, and one word of padding, which is also interesting because it looks like it contains random numbers (really?). After that comes more padding (this time empty), followed by the size of the file name. This may be a bit tricky because the size we have here is the size AFTER decryption, and Blowfish works on blocks, so the encrypted data must have a length divisible by 8. So to decrypt the name we need to round the size up to the next multiple of 8. The file name is encrypted using the same method as the header itself, with the file name key from edv.
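A few of the details above are easy to get wrong, so here is a small Python sketch of three helpers: splitting a container at the size prefix, converting a “file time” value, and rounding the file name size up to the Blowfish block size. The helper names are hypothetical, and the little-endian byte order of the 4-byte size prefix is my assumption, not something established above.

```python
import struct
from datetime import datetime, timedelta, timezone

# Epoch of the Windows "file time" format.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def split_container(blob: bytes):
    """Split an SDC blob into (encrypted header, data area).

    Assumption: the 4-byte size prefix is little-endian.
    """
    (header_size,) = struct.unpack_from("<I", blob, 0)
    header = blob[4:4 + header_size]
    data = blob[4 + header_size:]
    return header, data


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert 100-nanosecond ticks since 1601-01-01 to a UTC datetime."""
    # datetime only has microsecond resolution, so the last digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def padded_length(n: int, block: int = 8) -> int:
    """Round n up to the next multiple of the cipher block size."""
    return (n + block - 1) // block * block
```

For a single-file container, the length of the data area returned by split_container should equal compressedSize, which is the same check as the equation above. A 13-byte file name would occupy padded_length(13) bytes, i.e. 16, in the encrypted header.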
Decryption key
Now something more about the keystring (edv). Its format is:
<crc>^^<fileNameKey><headerKey><xorKey>
where:
- <crc> is a checksum of the whole data area of the file (everything except the header size and the header)
- <fileNameKey> is the key used to encrypt the file names
- <headerKey> is the key used to encrypt the whole header
- <xorKey> is the key used to “encrypt” the files
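Splitting the keystring can be sketched in a few lines of Python. This is only an illustration under assumptions: the two keys are taken to be exactly 32 characters each (the key length mentioned in the overview), the checksum and the XOR key are left as strings because their numeric encoding isn’t described here, and parse_edv and Keystring are made-up names.

```python
from typing import NamedTuple

KEY_LEN = 32  # assumed length of each Blowfish key, as in the overview


class Keystring(NamedTuple):
    crc: str
    file_name_key: str
    header_key: str
    xor_key: str


def parse_edv(edv: str) -> Keystring:
    """Split '<crc>^^<fileNameKey><headerKey><xorKey>' into its parts."""
    crc, sep, rest = edv.partition("^^")
    if not sep or len(rest) <= 2 * KEY_LEN:
        raise ValueError("unexpected keystring layout")
    return Keystring(
        crc=crc,
        file_name_key=rest[:KEY_LEN],
        header_key=rest[KEY_LEN:2 * KEY_LEN],
        xor_key=rest[2 * KEY_LEN:],  # whatever remains after both keys
    )
```

Keeping the XOR key as a string avoids guessing whether it is written in decimal or hex; only its least significant byte matters for the “encryption” anyway.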
Security of the whole program
People who are familiar with security should already see how insecure SDM is. For everyone else, here is a short description.
- First of all, the files themselves AREN’T ENCRYPTED in any way. They are only XORed with a one-byte key. XOR on its own is very weak protection, even with a much longer repeating key, because many file formats have predictable bytes (this applies to EXEs, ISOs and ZIPs, the formats most common on Dreamspark). Those predictable bytes are usually at the beginning (the header), which typically contains so-called magic bytes that identify the file format. So when we know which byte to expect, we can XOR that expected byte with the actual byte, and very probably we get the “encryption” key.
- Deflate, which is used to hide these patterns from the end user, is just a compression method. We don’t need anything special to decompress the data.
- ECB, which is used as the Blowfish mode of operation, is the least secure block cipher mode. Identical plaintext blocks always encrypt to identical ciphertext blocks, so it can reveal some parts of the data without actual decryption (see: Wikipedia).
- All the data SDM sends to and downloads from Microsoft’s servers is UNENCRYPTED. Everything (the user’s request, the SDC itself and the decryption keys) is plaintext, so knowing what an SDC looks like we can decrypt a file even when it is not intended for us and we merely sit somewhere along its route. Furthermore, a malicious node can modify the file on the fly and, e.g., put a backdoor into it, for instance into a Windows image.
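The known-plaintext trick from the first point fits in a few lines of Python. This is a generic sketch, not code taken from SDM: recover_xor_key is a made-up name, and it only applies where the XORed bytes can be compared directly with the magic bytes we expect.

```python
from typing import Optional


def recover_xor_key(obfuscated: bytes, magic: bytes) -> Optional[int]:
    """Guess a one-byte XOR key from the expected magic bytes.

    Returns the key if every magic byte is consistent with it, else None.
    """
    if not magic or len(obfuscated) < len(magic):
        return None
    # One known byte is enough to compute a candidate key...
    key = obfuscated[0] ^ magic[0]
    # ...and the remaining known bytes confirm or reject it.
    if all(c ^ key == m for c, m in zip(obfuscated, magic)):
        return key
    return None


# Demo with the "MZ" magic of an EXE file and an arbitrary key.
plain = b"MZ\x90\x00\x03\x00\x00\x00"
obfuscated = bytes(b ^ 0x5A for b in plain)
recovered = recover_xor_key(obfuscated, b"MZ")
```

With only 256 possible keys, even without any known magic bytes it would be enough to try them all and keep the one whose output looks like a valid file.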
Conclusion
For all the above reasons, Secure Download Manager cannot be called software for securely downloading files from Microsoft’s servers. Its users are just as INSECURE as users downloading e.g. their copy of Windows from warez sites. Both are susceptible to MITM attacks.
So we still don’t know the answer to why Microsoft uses dedicated software to distribute its software. The only answer I have is that it is just to make life difficult for users who don’t run Microsoft’s operating system. If I were one of the decision-makers, like those in the European Commission, I would consider whether this policy is intended only to protect Microsoft’s operating system monopoly.
Update 20.07.2014
Description updated thanks to GMMan and his great work on reverse engineering the whole program. He also reminded me about older variants of SDC files. I currently have samples of files with the 0xb3, 0xb5 and 0xd1 signatures. At the moment I know that there are also the signatures 0xa9, 0xb2, 0xb4 and 0xc4, and it is possible that such files are still reachable through Dreamspark. It is also likely that Microsoft (or Kivuto at Microsoft’s request) will create a new format, so if you have a sample of a file with a different header, please let me know in the comments!