Learning malware analysis. Solving Lab 9-1 from the "OllyDbg" chapter of the "Practical Malware Analysis" book


Welcome!

First, I have to tell you that the name of the chapter I'm going to present is a bit misleading here. I will not use OllyDbg to solve the exercises, since Immunity Debugger is my choice. For me, ImmDbg has a nicer UI than OllyDbg because it offers a dark scheme, which is an important setting for me. Besides that, Immunity Debugger works the same as OllyDbg, so my decision came down to the dark scheme alone. :)

I'm happy that I can use a debugger since it's very useful for examining complex code dynamically. Now I can simply run a debugger on a chosen section of the executable code and look at the registers and the memory dump - based on this information it's relatively easy to tell what a given piece of complex code really does. After this introduction, let's move on to the analysis of the first malicious program from the exercises.




Lab 9-1

Analyze the malware found in the file Lab09-01.exe using OllyDbg and IDA Pro to answer the following questions. This malware was initially analyzed in Chapter 3 labs using basic static and dynamic analysis techniques.

Questions:

1. How can you get this malware to install itself?

2. What are the command-line options for this program? What is the password requirement?

3. How can you use OllyDbg to permanently patch this malware, so that it doesn’t require the special command-line password?

4. What are the host-based indicators of this malware?

5. What are the different actions this malware can be instructed to take via the network?

6. Are there any useful network-based signatures for this malware?

Analysis:

Password analysis:

Since we already did basic static and dynamic analysis on this file in the Chapter 3 labs, I'm not going to repeat these techniques. Let's examine the malware with IDA and Immunity Debugger right away.


The above picture shows the beginning of the main function of the malware. (__alloca_probe, which is actually the first call inside main, isn't interesting for us.) This label has the virtual address 0x402B1D, which assumes that the image is loaded at its base address of 0x400000, the default for 32-bit Windows executables. Let's launch a debugger now to set a software breakpoint on that label. Obviously, it's a good idea to take a snapshot first, since we are going to run the malware inside a debugger, right?

After opening the file in OllyDbg or Immunity Debugger, let's press Ctrl+G and type 0x402B1D to jump to the label's address.
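The relation between the address typed into Ctrl+G and the image base can be sketched as follows. This is only an illustration: the helper names and the relocated base 0x1230000 are made up, and only 0x402B1D and 0x400000 come from the analysis above.

```python
IMAGE_BASE = 0x400000   # base address of the image, as stated above
MAIN_VA = 0x402B1D      # address of the main label seen in IDA

def va_to_rva(va, image_base=IMAGE_BASE):
    """Offset of an address from the start of the loaded image."""
    return va - image_base

def rebase(va, new_base, old_base=IMAGE_BASE):
    """Address of the same instruction if the image were loaded elsewhere."""
    return new_base + va_to_rva(va, old_base)

print(hex(va_to_rva(MAIN_VA)))
print(hex(rebase(MAIN_VA, 0x1230000)))  # 0x1230000 is a hypothetical base
```

If the debugger ever shows the module at a different base, the second helper gives the address to type into Ctrl+G instead.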


As you can see, we are in the correct location inside the executable - that's the beginning of the main function. We can add a bookmark to "mark" this label, which gives us easier access to it, since we can then jump to this location using a shortcut instead of an address. Right-click on the instruction at address 0x402B1D -> Bookmark -> Insert Bookmark 0. Now we can jump to the beginning of main using the Alt+0 shortcut. Everything is set, so let's analyze the malware using IDA, and whenever we run into problems we'll use a debugger to examine the exact state of the malicious program at runtime.

Actually, I'm going to use a debugger now. I'm too lazy to work out what [ecx+eax*4-4] stands for, so I will pass arguments to the program and see what value ends up in the edx register after executing the above instruction. To pass command-line arguments to the executable, let's open it in a debugger again.


The marked part of the window is the place for command-line arguments. Let's type first <space> second into this input field and jump to the 0x402B1D address again. We have to set a breakpoint at this location, run the file and take exactly three debugging steps (Step Into or Step Over) to be able to see what MOV EDX,DWORD PTR DS:[ECX+EAX*4-4] does.



Unfortunately, the Registers section in this picture is barely visible, but I can confirm that EDX stores a pointer to the last argument passed on the command line. Now we can rename var_4 to last_arg_ptr and continue the analysis. Next, we have the sub_402510 subroutine, which receives the pointer to the last argument. This subroutine does some math operations, so let's use a debugger to understand what is going on inside this function. I think this approach is faster than doing static analysis.
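For those who are less lazy than me, here is a small sketch of why [ecx+eax*4-4] ends up pointing at the last argument. It assumes that ECX holds the address of the argv array and EAX holds argc (the debugger session above only confirmed the result in EDX), and the array address 0x00AB0000 is made up.

```python
POINTER_SIZE = 4  # pointers are 4 bytes wide in a 32-bit process

def last_arg_slot(argv_base, argc):
    """Effective address of [ecx+eax*4-4], assuming ecx=argv and eax=argc."""
    return argv_base + argc * POINTER_SIZE - POINTER_SIZE

def build_argv_table(argv_base, args):
    """Map each argv slot address to the string it would point to."""
    return {argv_base + i * POINTER_SIZE: arg for i, arg in enumerate(args)}

args = ["Lab09-01.exe", "first", "second"]
argv_base = 0x00AB0000  # hypothetical address of the argv array
table = build_argv_table(argv_base, args)
slot = last_arg_slot(argv_base, len(args))
print(hex(slot), table[slot])
```

In other words, the expression is just argv[argc - 1] written in pointer arithmetic.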




After executing the first math operations, we can clearly see that ecx stores the length of the last command-line argument.


The above code tells us that the length of the passed argument has to be exactly 4. In our case, the argument was 6 characters long, so we have to restart the malware in the debugger and pass a 4-character argument to it.


This time I've passed "asdf" as the argument that is exactly 4 characters long. Let's set the breakpoint on the comparison address which is 0x402527 and run the program. In this situation, the malware should jump into the next check. And the malware does exactly what we are expecting. 




The second check is rather easy to follow. It simply checks if the first character of the passed argument is 'a'. So we have three characters to go. I'm lucky since the first character in "asdf" is 'a' - that's why I can investigate the third check without restarting the whole process inside the debugger. :)




Inside the third check, which tests the second character, the ecx register holds the result of second_char - 'a'. You can easily examine this using a debugger. The check requires this result to be 1, so solving the simple equation gives us: second_char - 'a' = 1 -> second_char = 1 + 'a' -> second_char = 'b' :) The second char in "asdf" isn't 'b', so I'm going to restart the malware inside the debugger and pass the "abcd" string as the argument this time. The fourth check, which tests the third character of the argument, is at address 0x402563 in virtual memory.


This math trick is more complex than the other ones, but advanced dynamic analysis magic is here to help us! At the end of this check, we have a comparison between the EAX and EDX registers. The EDX register holds the third character of the passed argument, so this is 'c' in our case. EAX has the value 0x63, which is the character 'c'. EDX has to be equal to EAX, therefore -> third_char == 0x63 -> third_char == 'c'. Again, I'm lucky since the third char of my argument is 'c', so I'm jumping to the last check.



The last character from the passed argument is stored inside the EAX register before CMP instruction execution. ECX is equal to 0x64 and if EAX == ECX -> fourth_char == 0x64 -> fourth_char == 'd' so it's now clear that the last character inside the password has to be 'd'.

Password for the malware: abcd. I guessed it correctly. :)

We've recreated the password correctly, so let's dive into further analysis and suppose that the typed password isn't correct. In that case we jump directly into sub_402410; otherwise, the subroutine is skipped - let's have a look at this screen:
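The five checks walked through above can be collected into one small function. This is only my reconstruction of the observed behavior, not the malware's actual code: the function name is made up, and the "math trick" of the fourth check is simplified to the constant 0x63 that showed up in EAX.

```python
def check_password(arg):
    """Re-creation of the checks observed in sub_402510 (a reconstruction)."""
    if len(arg) != 4:                      # first check: length must be 4
        return False
    if arg[0] != 'a':                      # second check: first char is 'a'
        return False
    if ord(arg[1]) - ord('a') != 1:        # third check: second_char - 'a' == 1
        return False
    if ord(arg[2]) != 0x63:                # fourth check: compared with 0x63 ('c')
        return False
    if ord(arg[3]) != 0x64:                # fifth check: compared with 0x64 ('d')
        return False
    return True

for candidate in ("second", "asdf", "abcd"):
    print(candidate, check_password(candidate))
```

The three candidates are the same arguments that were tried in the debugger, in the same order.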


Let's see what the malware does inside the sub_402410 when the typed password is incorrect.


The marked part of the subroutine is responsible for retrieving the short form path of the malware. Thus, in the [ebp+malwareShortPath] buffer we have the short form of the current process path.
The next instructions are not that easy to analyze, so I'm going to examine them with the help of the debugger. The mov edi, offset aCDel instruction is at address 0x00402449, so we have to start the debugger with an incorrect password and set a breakpoint at 0x00402449.







After the sixth rep <something> instruction the malware concatenates "/c del " command with the malware's path. The earlier rep operations are insignificant.



The two rep operations from the above screenshot concatenate the "/c del <malwarePath>" command with " >> NUL". Thanks to dynamic analysis, we now know that the malware probably deletes itself from the victim's machine when an incorrect password is passed. Now we are here in the code:


The rep operation wasn't executed in my case, but it's probably there to finish building the final command, and whether it executes probably depends on the length of the malware's path. Next, we have the call to the ShellExecuteA function, and this is what it looks like inside the debugger:


So we have this exact call -> ShellExecuteA(NULL, NULL, "cmd.exe", "/c del <malwarePath> >> NUL", NULL, 0); Now it's clear that the malware attempts to delete itself from the victim's machine if the passed password is incorrect. Fortunately, Windows does not allow the file to be deleted since it's opened in ImmunityDbg. If the passed password is correct the malware executes this code:
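To make the string building easier to follow, here is a sketch of what the rep operations assemble. It only builds the argument tuple and never runs anything; the function name and the example short path are made up for illustration.

```python
def build_self_delete_call(malware_short_path):
    """Return the ShellExecuteA arguments as they appeared in the debugger."""
    parameters = "/c del " + malware_short_path + " >> NUL"
    # (hwnd, lpOperation, lpFile, lpParameters, lpDirectory, nShowCmd)
    return (None, None, "cmd.exe", parameters, None, 0)

# Hypothetical short-form path, only to show the shape of the command.
call = build_self_delete_call("C:\\LABS\\LAB09-~1.EXE")
print(call[2], call[3])
```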



So we are dealing with the _mbscmp function. According to MSDN, the _mbscmp function compares two multibyte strings and returns 0 if they are equal. Most likely [ebp+var_1820] holds the second console argument (argv[1]), and it's compared with the "-in" string inside the _mbscmp function. Let's examine it inside the debugger to be 100% sure. Set a breakpoint on the call _mbscmp instruction and run the malware with these arguments -> "<malware's filename> -in abcd". (Obviously inside the debugger.)


As you can see I was right since the _mbscmp is called with two "-in" strings as the arguments, thus one of them has to be our argv[1]. Let's test it again, but this time with "-xx" as the second console argument.


Now we can be sure that the first argument passed to _mbscmp is always argv[1]. "-in" is a special command that can be used to make the malware do some malicious work. Along with the _mbscmp calls we have a couple of if-statements for the other commands responsible for the malware's configuration, and these commands are: "-re", "-c", "-cc". Obviously, we have to understand what each of them does. Let's start with the "-in" command.

"-in" option analysis:



The malware checks the number of arguments passed via the console. For the time being, let's check what happens if the number of arguments is 3 and the configuration command is "-in". In this situation, our malicious program executes this code:


Let's jump into sub_4025B0. This is a subroutine that accepts exactly two arguments.


sub_4025B0 is very simple. It retrieves the full path of the malware by calling GetModuleFileNameA with NULL as the first argument. The next call is to the _splitpath function, which in our case writes the malware's filename to the location pointed to by the sub_4025B0 argument. This means that [ebp+ServiceName] stores the malware's filename. (Obviously, it is a place on the stack.)


If the malware's filename was retrieved correctly, and ServiceName is therefore set to the malware's filename, then the above code is executed.



sub_402600 contains a lot of rep operations that are a bit too complex to analyze statically. If we look deeper into this subroutine we can see calls to service routines such as OpenSCManagerA, CreateServiceA and ChangeServiceConfigA, thus we can skip the analysis of the rep operations to save some time. Instead, I will use a debugger to investigate the calls to the important WinAPI functions in order to understand what is going on when someone uses the "-in" option.


The call to OpenSCManagerA is not as meaningful for the analysis as the call to OpenServiceA. If we focus on OpenServiceA we can get an important piece of information -> what is the name of the malicious service? Therefore, let's set a breakpoint on the call ds:OpenServiceA instruction and look at the stack.



With the help of the debugger, I can tell that the malware attempts to open a service named "Lab09-01" (more precisely, the service name is the malware's filename) with SERVICE_ALL_ACCESS access rights.
If the service with this name doesn't exist on the victim's machine the malware executes this branch:




Obviously, before the call to CreateServiceA there are multiple rep operations, but they are complex (as usual) and not as important as the call itself. Let's set a breakpoint on the call ds:CreateServiceA instruction and look at the stack to get useful information about the malware's service.


A debugger is such an amazing tool. After hitting the breakpoint we are able to read every single argument passed to the CreateServiceA function. We can now tell that the malware's ServiceName is "Lab09-01", DisplayName is "Lab09-01 Manager Service" and BinaryPathName is "C:\Windows\system32\Lab09-01.exe". Another meaningful piece of information is that the service will start automatically at system boot. (StartType = SERVICE_AUTO_START) When CreateServiceA returns (the result might be success or failure - it doesn't matter in the case of this malware) the malicious program closes all handles. If the malicious service is created without any failure, the malware goes through this path:
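The arguments read from the stack all follow from the malware's filename, which can be sketched like this. The function name and the example path C:\labs\Lab09-01.exe are made up, the system root C:\Windows is an assumption about the test machine, and ntpath is used only so that the Windows path rules apply on any OS.

```python
import ntpath  # Windows path handling, usable on any operating system

def service_parameters(malware_path, system_root="C:\\Windows"):
    """Derive the CreateServiceA arguments observed in the debugger."""
    filename = ntpath.basename(malware_path)
    stem, _extension = ntpath.splitext(filename)
    return {
        "ServiceName": stem,
        "DisplayName": stem + " Manager Service",
        "StartType": "SERVICE_AUTO_START",
        "BinaryPathName": ntpath.join(system_root, "system32", filename),
    }

# Hypothetical location of the sample on the analysis machine.
for name, value in service_parameters("C:\\labs\\Lab09-01.exe").items():
    print(name, "=", value)
```

Renaming the sample before running it with "-in" should therefore change all three strings, which is worth keeping in mind when writing host-based indicators.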







Let's check what each of these functions does with the help of the debugger. 

ExpandEnvironmentStringsA:



This WinAPI function simply expands the %SYSTEMROOT% environment variable to its current value. After executing this function the malware has the fully qualified binary path of the newly created service. DestString -> "C:\Windows\system32\Lab09-01.exe" (most likely %SYSTEMROOT% == "C:\Windows")

GetModuleFileNameA:



After executing GetModuleFileNameA(NULL, &f, 0x400); the f variable holds the malware's path. f = <malware's current path>

CopyFileA:



This piece of code copies the malware to C:\Windows\system32\<malware's filename>. This path was passed as the BinaryPathName in the earlier CreateServiceA call. We now know that the malicious service is the malware itself.

sub_4015B0:

This subroutine was written by the malware's author, thus we have to analyze its behavior using IDA. The only argument passed to this function is the malware's new path -> C:\Windows\system32\<malware's filename>.


At the beginning of this subroutine, the malware gets a path to the system's directory which is C:\Windows\system32\.


If the earlier operation was successful then the above code is executed. Let's see inside a debugger what the malicious program does here.




We are at the first instruction after the rep operations series. The rep operations were responsible for assembling C:\Windows\system32\kernel32.dll path. This path lies inside the [ebp+SystemDirectoryPath] buffer. Therefore the call sub_4014E0 instruction in C-Like code looks like this: sub_4014E0("C:\Windows\system32\<malware's filename>", "C:\Windows\system32\kernel32.dll");


We need to analyze sub_4014E0 to continue the analysis of sub_4015B0.

sub_4014E0:


The start of this subroutine is very simple and can be analyzed with the help of IDA's "Symbolic constant" option. The malware tries to read kernel32.dll.


Next, the malicious program gets the information about the last write time, the last access time, and the creation time of the kernel32.dll object.


Finally, the subroutine sets the last write time, the last access time, and the creation time of the malware's file to the same values as those of kernel32.dll. And that's it. We can rename this subroutine to mask_malware_file_time.
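The idea behind this subroutine can be sketched with a harmless experiment on two empty temporary files. Keep in mind that this is not the malware's code: the function and file names are made up, and os.utime can only set the access and modification times, so the creation time from the analysis above is not covered here.

```python
import os
import tempfile

def copy_file_times(target_path, reference_path):
    """Give the target the access and modification times of the reference."""
    reference_stat = os.stat(reference_path)
    os.utime(target_path,
             ns=(reference_stat.st_atime_ns, reference_stat.st_mtime_ns))

with tempfile.TemporaryDirectory() as workdir:
    reference = os.path.join(workdir, "reference.bin")  # stands in for kernel32.dll
    target = os.path.join(workdir, "target.bin")        # stands in for the malware
    for path in (reference, target):
        with open(path, "wb"):
            pass
    os.utime(reference, (946684800, 946684800))  # 2000-01-01 00:00:00 UTC
    copy_file_times(target, reference)
    copied_mtime = int(os.stat(target).st_mtime)

print(copied_mtime)
```

This is why a freshly dropped file in system32 can look as old as the system DLLs when you sort the directory by date.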

sub_4015B0 continue:

This subroutine returns after the call to mask_malware_file_time, thus we can rename it to something like MAIN_mask_malware_file_time.

And now we have this piece of code to analyze, which is obviously the last one in the branch:


Well, we have some numbers and the practicalmalwareanalysis.com domain so it's clear that we are dealing with some network communication now through the HTTP protocol. Let's jump into sub_401070 to see what's going on.

sub_401070 [C-like call: sub_401070("ups", "practicalmalwareanalysis.com",  "80", "60");]:


At the beginning of the function, there are a lot of rep operations. We can try to set a breakpoint on the marked instruction and check what lies in [ebp+var_4] after all these operations.


The mov edx, [ebp+var_4] instruction isn't very interesting but the content of some buffer placed on the stack after rep operations looks suspicious.


The next block of instructions creates a registry key inside "HKLM\SOFTWARE\Microsoft" named "XPS" with KEY_ALL_ACCESS privileges.


Then, the malware sets the "Configuration: " value under the XPS key to the content of the [ebp+Data] buffer. We have to check what is in this buffer, and a debugger will give us that information. So let's set a breakpoint on the push edx instruction at address 0x4011C4. The value of the EDX register is:


... "ups". That's weird but it is what it is. After setting the value, sub_401070 closes all handles and returns. We can check the HKLM\SOFTWARE\Microsoft\XPS key in the registry to be sure that the malware really set the "Configuration: " value to "ups". Obviously, to do this you have to step over the RegSetValueExA call inside the debugger.


And that's much better! Now, this makes sense. As you can see, the "Configuration: " value under HKLM\SOFTWARE\Microsoft\XPS is set to "ups\x00http://practicalmalwareanalysis.com\x0080\x0060". The debugger showed only "ups" earlier most likely because the data contains null bytes and the string display stops at the first one. You can be sure that this malware will use the data inside the key. Now we can rename sub_401070 to set_networking_config.
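The layout of the stored data is easy to reproduce. The sketch below joins the four strings with null bytes and splits them back; the function names are made up, and the exact handling of a trailing null byte is an assumption on my side.

```python
def pack_config(fields):
    """Join the fields with NUL separators, as seen in the registry data."""
    return b"\x00".join(field.encode("ascii") for field in fields)

def unpack_config(blob):
    """Split the registry data back into its fields (trailing NULs ignored)."""
    return [part.decode("ascii") for part in blob.rstrip(b"\x00").split(b"\x00")]

blob = pack_config(["ups", "http://practicalmalwareanalysis.com", "80", "60"])
print(blob)
print(unpack_config(blob))
```

Printing only up to the first null byte gives exactly the "ups" that showed up in EDX.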

The analysis of the "default" branch used by the malware's "-in" option is done. Let's examine the branch taken if the service already exists on the victim's system. So, suppose that OpenServiceA returned a non-zero value. In such a situation this code is executed:


As you remember, [ebp+NewFileName] has "C:\Windows\system32\<malware's filename>" inside. Thus, if the malicious service already exists, the malware changes the start type of the service to SERVICE_AUTO_START to ensure that it will always start at system boot. After executing this piece of code the "default" branch is taken again.

The "-in" option can take an additional argument because of this code:


If argc > 4 then the malware deletes itself from the machine since the number of command-line options is incorrect while using "-in". But you can see that if argc == 4 then everything is correct -> <malwares_executable> -in <additional_option> abcd. This additional_option is the name of the malicious service that will be installed immediately. 


To sum up "-in" option: 
The malware installs itself as a service using this option. By default the service is named after the malware's filename, the display name is "<malware's filename> Manager Service" and the BinaryPath is "C:\Windows\system32\<malware's filename>" instead of the CWD path. The malware accepts an additional argument after "-in" - the name of the malicious service to install, so the service can be named however we want. The service is set to start automatically at system boot. In addition, the malware sets the last access, last write, and creation times of its file to the same values as those of kernel32.dll. It also creates a networking configuration inside the registry to be able to talk to the attacker's server through the HTTP protocol. This configuration is set to http://practicalmalwareanalysis.com and the ports available to use are 80 and 60.

"-re" option analysis:


The next configuration option which the malware looks for is "-re". This option has to be placed as the second command-line argument.


If the second command-line option is actually "-re" then this code is executed:

 

This code tells us that the "-re" feature also takes an additional argument, which we will investigate later on. Let's see what happens if we execute the malware with the "-re" option alone (so if argc == 3).



This is the only code that is executed when the "-re" option alone is passed to the malware. The first part of this block obviously stores the name of the malware's file in the [ebp+malware_filename] buffer. This filename is passed to sub_402900 as the only argument. Let's analyze this subroutine to understand what is going on inside the malware when the "-re" option is in use.

sub_402900:


Most of this code is already known to us from the earlier "-in" option analysis. It's fairly simple - if the malicious service exists on the victim's system then it's deleted by the DeleteService WinAPI function. But the service alone is only one part of the whole malware. After removing the service we have the code responsible for deleting the rest of the malware - the binary of the service, the configuration inside the registry, and the malware's executable itself. Every trace has to be deleted to ensure that the victim will not get any information about the malicious actions taken on the system.



Firstly, the malware closes all handles to the deleted service and gets the malware's filename.


Then we have a couple of rep operations that construct the correct path of the malicious service binary - %SYSTEMROOT%\system32\<malware's filename>. ExpandEnvironmentStringsA is used to expand the %SYSTEMROOT% environment string to C:\Windows, and as a result the malware gets the C:\Windows\system32\<malware's filename> path inside the [ebp+FileName] buffer. We can confirm it by using a debugger. Remember that if you want the program to stop at the call to ExpandEnvironmentStringsA inside the "-re" option code, the service has to be installed on the system.


The execution of the malware is suspended right after the call ExpandEnvironmentStringsA instruction. 0x12DF44 is the address of the [ebp+FileName] buffer and as I mentioned earlier it has "C:\Windows\system32\<malware's filename>" path inside. After setting the correct path this code is executed:



The C:\Windows\system32\<malware's filename> file is deleted using DeleteFileA. Next, we have to check what value is inside the offset unk_40EB60.


So, unk_40EB60 has only null bytes inside. set_networking_config will put these null bytes into the HKLM\SOFTWARE\Microsoft\XPS "Configuration: " value and as a result of this, the malware's networking configuration is cleared. Now, it's time to examine the new function sub_401210.

sub_401210:


Just from investigating the call instructions and the called WinAPI functions along with their arguments, we can be sure that this subroutine deletes the "Configuration: " value inside "HKLM\SOFTWARE\Microsoft\XPS". We can rename this mystery subroutine to delete_networking_key.

sub_402900 (continue):

And that's the end of the analysis of the sub_402900 function. The malware removes everything associated with the malicious service inside this function. The service, the binary of the service, and the networking configuration placed inside the key along with the key itself - everything is removed. Therefore, we can rename this subroutine to something like remove_mal_srv_with_config.

This is what happens if the number of command-line arguments is exactly 4 (if argc > 4 the malware removes itself):


The situation is the same as with the "-in" option. The malware can remove the service with the name passed in argv[2]. Thus, the syntax of the "-re" option looks like this: <malware's executable> -re <[OPTIONAL] service name> abcd.

To sum up "-re" option:
This option is used for removing the whole service from the victim's system. Not only the service is removed but also its binary file and the networking configuration written to the registry during the service's installation. The "-re" option also has an optional parameter which allows the attacker to pass the name of the service to remove.

"-c" option analysis:

At the beginning we have the standard check if the second command-line option is actually "-c":


Right after the check, there is an if statement construct -> if (argc == 7). 


When someone passes exactly 7 command-line arguments while using "-c", everything is correct and the malware executes the malicious code. Otherwise, it deletes itself from the victim's machine. Let's suppose that the attacker passed 7 command-line arguments with the "-c" option and let's analyze the "red" branch.
You can see at first glance that the call to the set_networking_config function amounts to set_networking_config(argv[2], argv[3], argv[4], argv[5]); Let's rename the variables to make the analysis easier to follow.


It looks like the attacker can set some (probably networking) configuration of the malware. From the "-in" option analysis we know that there are exactly four fields to set within the "Configuration: " registry value: <some string>\x00<a domain>\x00<first port>\x00<second port>\x00. Without having seen the code that uses this configuration we can't be 100% sure what it is about, but I still think that this config is used for networking communication.

To sum up "-c" option:
With the help of this option the attacker can configure the malware. We haven't analyzed the code that uses this configuration yet, but I think that this is a setup for networking communication with some server. If this is true then we can configure a domain and ports simply by passing the arguments in the correct order on the console.

"-cc" option analysis:

The standard check if the second option is actually "-cc":


Then we have the check against the number of arguments passed through the console. 


With the "-cc" option the malicious code runs only if argc == 3. Otherwise, the malware deletes itself from the machine. Let's examine the first interesting block of code and the sub_401280 function called inside it. From the passed arguments I would guess that we are dealing with four buffers of 1024 bytes each.

sub_401280:

This is the beginning of the subroutine:


The smart approach is to look at the value moved into the [ebp+cbData] variable. It's exactly 4097, which is equal to 4 * 1024 + 1. As you already know, this function accepts four buffers of 1024 bytes each as parameters. The RegQueryValueExA function gets the data from the "Configuration: " value under HKLM\SOFTWARE\Microsoft\XPS and places it into [ebp+Data], a buffer whose size is given by [ebp+cbData], so the size of [ebp+Data] is exactly 4097 bytes. Next, we have quite a long piece of code with string operations, and this is the beginning of it:


As you can see the data from the "Configuration: " key ([ebp+Data]) is used in these operations.
And this is the end of the strings operations and the end of the subroutine.


The passed buffers are also in use. I suppose that this subroutine gets the malware's networking configuration from the registry and writes it into the four buffers. But to make sure, let's examine it using a debugger.

First of all, let's set breakpoints on the lea instructions before the call to the subroutine to check the addresses of the passed buffers.


I will list the addresses of the buffers passed to the subroutine. 
- Fourth buffer address: 0x12F364
- Third buffer address: 0x12E764
- Second buffer address: 0x12EF64
- First buffer address: 0x12EB64
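A quick calculation on these addresses supports the earlier guess about the buffer sizes. Only the four addresses and the 4097 value come from the debugger session; the variable names are mine.

```python
buffers = {
    "first": 0x12EB64,
    "second": 0x12EF64,
    "third": 0x12E764,
    "fourth": 0x12F364,
}

# Sort the buffers by address and measure the distance between neighbours.
ordered = sorted(buffers.items(), key=lambda item: item[1])
gaps = [upper[1] - lower[1] for lower, upper in zip(ordered, ordered[1:])]

print([name for name, _address in ordered])
print(gaps)
print(4 * 1024 + 1)  # the value moved into [ebp+cbData]
```

Each buffer starts exactly 0x400 bytes after the previous one on the stack, which fits four local buffers of 1024 bytes each.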

Now that we have the needed memory locations, let's set a breakpoint after the call to the subroutine and check the content of these buffers. We are here:


The malware's configuration at the moment:


These screenshots are evidence that the sub_401280 is responsible for placing the content of the malware's configuration into concrete buffers passed as the arguments. Let's rename this subroutine to read_config_into_args.


This is the last block of code to analyze when dealing with "-cc" option. To understand these instructions we have to look into sub_402E7E function.

sub_402E7E:

It's clear that this subroutine accepts exactly four buffers with the malware's configuration inside and the mystery string - "k:%s h:%s p:%s per:%s\n".


The whole subroutine is not that long, as you can see. The __stbuf and __ftbuf functions are used for preparing a buffer for printing. sub_403A88 is long and complex, but given the format string I think that sub_402E7E is responsible for preparing and printing the malware's configuration to the console. I'm not sure if I'm right, thus I'm going to launch the malware in a safe environment with this option provided.


Using the "-c" option I set the malware's configuration, and with the help of the "-cc" switch the malware wrote the new configuration to the console. Therefore, we can rename sub_402E7E to print_config.
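The format string found in the binary can be tried out directly, since Python's % operator understands the same %s placeholders. In this sketch I assume that the four buffers are passed in the same order in which they are stored in the registry, and I use the default configuration values from the "-in" analysis; the function name is made up.

```python
CONFIG_FORMAT = "k:%s h:%s p:%s per:%s\n"  # format string found in the binary

def format_config(fields):
    """Fill the format string with the four configuration fields."""
    return CONFIG_FORMAT % tuple(fields)

default_fields = ["ups", "http://practicalmalwareanalysis.com", "80", "60"]
print(format_config(default_fields), end="")
```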

Because we now know what each command of the malware does I'm going to sum up the result of the investigation.

The list of malware's options:

Name                                         Description

-in <OPTIONAL: service name>                 Installs the malware as a service, masks its file-time and writes
                                             the malware's default configuration into the hardcoded registry key.

-re <OPTIONAL: service name>                 Removes the malware's service along with the configuration
                                             from the machine.

-c <some string> <domain> <port1> <port2>    Configures the malware.

-cc                                          Prints the malware's configuration to the console.

If someone tries to use any of the options with the incorrect number of arguments the malware removes itself from the machine.
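The whole option handling can be summed up as a small dispatch function. This is my own reconstruction from the argc checks seen above, not decompiled code: the action names are invented, the case without any option is only labeled here because it is analyzed in the next part, and treating an unknown option as a reason for self-deletion is an assumption.

```python
PASSWORD = "abcd"

def dispatch(argv):
    """Map a command line to the action the malware would take."""
    argc = len(argv)
    if argc == 1:
        return "run_without_options"   # analyzed in the next part
    if argv[-1] != PASSWORD:           # the password is always the last argument
        return "delete_self"
    option = argv[1]
    if option == "-in" and argc in (3, 4):
        return "install"
    if option == "-re" and argc in (3, 4):
        return "remove"
    if option == "-c" and argc == 7:
        return "configure"
    if option == "-cc" and argc == 3:
        return "print_config"
    return "delete_self"               # wrong argument count or unknown option

print(dispatch(["Lab09-01.exe", "-in", "abcd"]))
print(dispatch(["Lab09-01.exe", "-cc", "asdf"]))
```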

The rest of the malware analysis

The malware obviously has some interesting options to play with, but what happens if there is only one command-line argument, which is the malware itself?


This is the only code that the malware executes when no configuration option is given. That being said, let's jump into sub_401000. We have to remember that when this subroutine returns 0 the malware removes itself.

sub_401000:


First of all, this subroutine attempts to open the malware's configuration XPS key. If it's done it queries the key for the configuration value but doesn't place this data anywhere. The function returns 1 if the configuration exists and 0 otherwise. With this in mind, we can safely rename the subroutine to check_if_config_exists. So the malware deletes itself when the configuration doesn't exist at the time of launching the malware without any additional arguments.

The next very important function to analyze is sub_402360.

sub_402360:

The subroutine begins with the loop:



The whole function is in the loop's body. Each time the loop starts a new iteration it sets the eax register to 1, and then test eax, eax checks whether eax is equal to 0; if it were, the loop would end. But obviously, this condition can never be true, thus we are dealing with an infinite loop.

The first thing executed inside the loop is reading the configuration.


If the malware couldn't read the configuration, it returns from the function with 1 as the exit code. Otherwise, we have the configuration written into the four local buffers, and this code takes control:


_atoi means "ASCII to integer", so it converts a string to the int type. Next, we have a user-defined function called sub_402020, therefore we must pause the analysis of sub_402360 and start analyzing the new one.

sub_402020:

Right at the beginning, we are dealing with the new user-defined function:


The marked function is quite large, thus I'm going to analyze its behavior using a debugger.

sub_401E60:

The beginning of this subroutine is like this:


The first two functions are responsible for reading specific parts of the malware's config. The next function, named sub_401D80, is also complicated, so I'm going to step over it and check the content of the passed buffer, whose size is (most likely) 16 bytes. The address of the buffer is 0x12C2F0.


sub_401D80 function initializes the buffer to the mystery ASCII string "IQDQ/GHY3.sTv". This value might depend on the malware's configuration, therefore I'm going to check it once again, but with default configuration data.


After I changed the config, the data generated by the subroutine was different. But this doesn't prove that it depends on the configuration, since the generated data might depend on the time. Let's see if the content of the buffer changes after each run of the program.


The data changes each time the program runs, so it might depend on time. A visible pattern exists across all of the generated strings: <4 characters>/<4 characters>.<3-character extension>. Let's check what the malware does with this string, since it might simply be encrypted. Another interesting fact is that the passed buffer is exactly 16 bytes in size. This means that 2 or 3 bytes are free at the end of this buffer, depending on whether the string is terminated with a null byte or not.
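The observed pattern can be written down as a regular expression, which might come in handy later when looking for network-based signatures. The expression below is only my reading of the few samples seen in the debugger, the names are made up, and the 16-byte buffer size is still the "most likely" guess from above.

```python
import re

# <4 characters>/<4 characters>.<3 characters>, as seen in the debugger
GENERATED_PATTERN = re.compile(r".{4}/.{4}\..{3}")
BUFFER_SIZE = 16  # assumed size of the buffer passed to sub_401D80

def looks_like_generated_string(text):
    """Check whether the whole text follows the observed pattern."""
    return GENERATED_PATTERN.fullmatch(text) is not None

sample = "IQDQ/GHY3.sTv"
print(looks_like_generated_string(sample))
print(len(sample), BUFFER_SIZE - len(sample), BUFFER_SIZE - len(sample) - 1)
```

The last line shows the string length and the free bytes in the buffer without and with the terminating null byte.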


As you can see the mystery string is passed to the sub_401AF0 function along with domain, port, and some buffers. 

sub_401AF0 (inside sub_401E60):


These instructions most likely create a socket. Let's see what sub_401640 does, but without giving it a separate section in the article.

The subroutine is responsible for making a connection between the infected machine and the attacker's server. The server and the port are taken from the malware's config. You can see it in the screenshots from the debugger:


(After the above call the function returned NULL, since I hadn't launched the Ubuntu machine with the fake network configuration. If you want to analyze the malware's networking behavior dynamically, you should set up the network first.)


As we know, under the usual x86 calling conventions a function returns its value in the EAX register. Since I've now set up the fake network, the malware found the IP address of my Linux machine and placed it into the structure at the address 0x149BE0, as shown above. Next, the malware creates a socket to make the TCP connection possible:


The call is like this -> socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

After converting the port from host byte order to network byte order (little-endian to big-endian) with the htons Winsock call, the malware goes straight to the connect function:


The MSDN documentation for this function says that the second argument is a pointer to a sockaddr structure. This structure contains information about the connection's target. Let's take a look at this interesting object.


Here is how the sockaddr_in structure looks:
Family - 0x2, which is 2 (AF_INET)
Port - stored in network (big-endian) byte order; its value is 0x50, which is 80 (HTTP)
IP Address - each byte corresponds to one IP octet, so the target's IP is 10.0.0.1. (This is the IP address of my Ubuntu machine. In a real scenario this would be the IP address of the C&C server.)
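The layout above can be reproduced with a few lines of Python. This is only a sketch of how a 16-byte sockaddr_in looks in memory on a little-endian x86 machine; pack_sockaddr_in is a name I made up for this illustration, and the values are the ones from my lab setup.

```python
import socket
import struct

AF_INET = 2  # value of the Family field seen in the dump


def pack_sockaddr_in(ip: str, port: int) -> bytes:
    """Build the 16 bytes of a sockaddr_in as they appear in the memory dump."""
    family = struct.pack("<H", AF_INET)   # 16-bit, host (little-endian) order
    net_port = struct.pack(">H", port)    # 16-bit, network (big-endian) order, like htons
    address = socket.inet_aton(ip)        # 4 bytes, one per octet
    padding = b"\x00" * 8                 # sin_zero
    return family + net_port + address + padding


blob = pack_sockaddr_in("10.0.0.1", 80)
print(blob.hex(" "))
```

The first eight bytes come out as 02 00 00 50 0a 00 00 01: the family, then the port 0x0050 with the high byte first, then the four octets of 10.0.0.1.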

When the connection is established, the malware returns from sub_401640, so we can rename it to connect_to_cc_server.

Now, we can continue the sub_401AF0 analysis. Here is the code that is executed when the malware establishes the connection to the C&C server:



A couple of rep operations later...


A quick look at the above pictures is enough to be almost sure that the rep operations place the "GET <something> HTTP/1.0\r\n\r\n" string into the buffer at the [ebp+buf] address. Let's check with the debugger whether I'm right.


The address of the buffer is in the EAX register just before the call to the send function. So it's now clear that the rep operations build the string that is sent to the C&C server. The mystery string is used in the very first GET request made by the malware. To investigate what this string is for, we would have to set up a real network and contact the real server ("practicalmalwareanalysis.com" in our case). It might be some kind of handshake between the malware and the C&C server.
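What the rep operations assemble can be sketched in Python like this. build_beacon_request is a hypothetical name, and I assume the standard "HTTP/1.0" spelling and that the mystery string is placed directly after "GET " with no extra characters added.

```python
def build_beacon_request(resource: str) -> bytes:
    """Concatenate the three pieces: "GET ", the generated string, and the trailer."""
    return b"GET " + resource.encode("ascii") + b" HTTP/1.0\r\n\r\n"


request = build_beacon_request("IQDQ/GHY3.sTv")
print(request)
```

This is the buffer that would then be handed to send.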

It's logical that the send function is followed by a call to the recv function.


This is the first block in the body of the loop. You have to admit that it's pretty straightforward. It simply receives up to 512 bytes of data from the C&C server. Next, there is an if statement that checks whether the number of received bytes is less than or equal to 0. If it's true, the malware executes this:


If the conditional jump is taken, recv returned a negative value, which signals an error, and the malicious program shuts down the socket.


If the malware receives zero bytes, it jumps to the code above, which checks whether more than 0 bytes were received. If so, the CPU jumps back to the beginning of the loop; otherwise the malware shuts down the socket. Now let's check what happens when the malware receives more than 0 bytes from the server.


This code is a perfect candidate for analysis with a debugger. So let's set up the internet connection between the VM host and the server. I will check whether practicalmalwareanalysis.com returns data that makes the malware jump into the above code. The network is set up, so let's put a breakpoint on mov edx, [ebp+var_408] and see if we can hit it.


Fortunately, the breakpoint is hit, but now we have to be careful since the malware can access the Internet. Now we are at the second breakpoint just before the call to _strstr:


The data received from "practicalmalwareanalysis.com" is just "400 Bad Request", so nothing special or useful. :) But we know that the malware uses the _strstr function to look for "\r\n\r\n", the blank line that ends the header section of an HTTP request or response.

If _strstr finds the end of a correct HTTP response from the server, the malware shuts down the socket, places the number of bytes received into the buffer, and returns from the subroutine.
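The receive logic described above can be approximated with the following Python sketch. It is a simplification under my own assumptions, not a translation of the disassembly: receive_until_header_end is a made-up name, the recv argument is any callable that returns bytes, and an empty result stands in for the "0 or fewer bytes" case (Python's socket.recv returns b"" on close and raises an exception on error, instead of returning a negative number).

```python
HEADER_END = b"\r\n\r\n"
CHUNK_SIZE = 512  # the malware asks recv for 512 bytes at a time


def receive_until_header_end(recv) -> bytes:
    """Collect chunks until the blank line is seen or the peer stops sending."""
    data = b""
    while True:
        chunk = recv(CHUNK_SIZE)
        if not chunk:            # nothing received: stop, as the malware does
            break
        data += chunk
        if HEADER_END in data:   # same role as the _strstr check
            break
    return data
```

With a real connection you would pass sock.recv as the argument; for a quick experiment any function that hands out prepared chunks works as well.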

After this analysis process, we can rename the sub_401AF0 to get_init_data_from_server.
sub_401AF0 end

sub_401E60 (continue):

After getting the first "init" data from the server the malware tries to find this "'`'`'" string and deals with it using rep operations.



The last instructions after rep:


These instructions suggest that the malware parses the "init" string and concatenates the data within the [ebp+arg_0] buffer. This code is quite complex, so I will not analyze it statically. I can't do it dynamically either, since I don't have the IP address of the real C&C server. For that reason, I would guess that the malware gets some kind of "handshake" from the C&C server. I renamed sub_401E60 to connect_and_handshake_with_the_server.
sub_401E60 end

sub_402020 (continue):

After making a connection with the server the malware executes this code:


[ebp+msg] here is the [ebp+arg_0] buffer from the last analyzed subroutine. Now it's clear that the "init" command from the server is not a greeting, but the exact command to execute on the victim's system. 
Here we have the list of the available commands:
SLEEP, UPLOAD, DOWNLOAD, CMD, NOTHING

From now on we can treat this malware as a backdoor.

The backdoor analysis

We are still inside sub_402020 where the commands are executed. 

SLEEP:


From the calls to the _strtok and atoi functions, we know that the syntax of the SLEEP command is: SLEEP <number>. The number returned from atoi is multiplied by 1000 and passed to the Sleep WinAPI function, which takes milliseconds, so [ebp+var_404] is the number_of_seconds_to_sleep. As a result, the full syntax is: SLEEP <number of seconds to sleep>
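Here is a small Python sketch of that conversion. sleep_milliseconds is a hypothetical helper, and I assume a single space as the token delimiter. Note one difference: Python's int raises an error on garbage input, while atoi would quietly return 0.

```python
def sleep_milliseconds(command: str) -> int:
    """Turn "SLEEP <seconds>" into the millisecond value passed to Sleep."""
    name, _, argument = command.partition(" ")
    if name != "SLEEP":
        raise ValueError("not a SLEEP command")
    seconds = int(argument)   # atoi in the original code
    return seconds * 1000     # Sleep expects milliseconds


print(sleep_milliseconds("SLEEP 30"))
```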

UPLOAD:


IDA recognized the string tokens of the UPLOAD command because they are used further on in calls to WinAPI functions. This feature saves time, and we can calmly jump right into sub_4019E0, which is the main function of the UPLOAD command.
First of all, the malware tries to connect to the C&C server. If the connection is established, the malicious program executes this code:


The malware creates a writable file on the infected machine. This means that the command names reflect the server's point of view: the UPLOAD command is sent by the attacker from the server, so the attacker uploads a file to the infected machine.


Calls to recv and WriteFile are strong indicators that we are dealing with uploading a file from the server to the infected machine and that is exactly what malware does while executing the UPLOAD command.
Syntax: UPLOAD <port> <file to upload to the infected machine>

DOWNLOAD:


From these instructions, we can learn the syntax of the command. It's fairly clear that the syntax is: DOWNLOAD <port> <file to download from the infected machine>. Now let's see what sub_401870 does.


The DOWNLOAD command opens an existing file on the infected machine for reading. Obviously, the malware does this only after a successful connection to the server.


It then reads the opened file and places the file content into [ebp+buf].


After reading, the malware sends the whole content of the file inside a loop to ensure that each byte of the file is delivered straight to the attacker. This is the end of the DOWNLOAD command's code, since after sending successfully the malware shuts down the socket.

CMD:


The above code gives us information about the syntax of the command. 
Syntax: CMD <port> <command to execute on the infected machine>

popen is the function that executes the system's command passed as the first argument and creates a pipe between the calling process and the executed command. The malware calls this function with the "rb" mode which tells us that the attacker would like to read some data from our machine. Then this code is executed:


The malware calls a subroutine in this way -> sub_401790(domain, port, pipe_with_result_of_command); And here is the code of the subroutine:


The very first chunk of code is responsible for connecting with the server on the port indicated by the sent command. The next action in the CMD command chain is reading binary data from the pipe created earlier with the help of the fread function. This data represents a result of the executed command and is placed into [ebp+buf], thus let's rename it to command_result_data.


The above code is almost the same as in the DOWNLOAD command. The malware simply sends the result of the command executed on the infected machine to the attacker's server. This is what the communication looks like:


Now that we know what sub_401790 does, we can change its name to send_command_result.

NOTHING:


The last command available with this backdoor is NOTHING which does... exactly nothing. :) The code from the screenshot executes and the malware returns from the backdoor's main function.
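To summarize the syntax of all five commands, here is a hedged Python sketch of a parser. parse_command is a name I made up, and the single-space delimiter is an assumption: the analysis above only shows that _strtok splits the message into tokens. The sketch returns a plain (name, port, argument) tuple and does not execute anything.

```python
def parse_command(message: str):
    """Split a server message into (name, port, argument)."""
    name, _, rest = message.strip().partition(" ")
    if name == "NOTHING":
        return (name, None, None)
    if name == "SLEEP":
        return (name, None, rest)      # rest holds the number of seconds
    if name in ("UPLOAD", "DOWNLOAD", "CMD"):
        port, _, argument = rest.partition(" ")
        return (name, int(port), argument)   # file name or command line
    raise ValueError("unknown command: " + name)


print(parse_command("DOWNLOAD 8080 C:\\secret.txt"))
print(parse_command("CMD 9000 ipconfig /all"))
```

The port is kept separate because UPLOAD, DOWNLOAD and CMD each open a new connection to the server on the port given in the command.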


After the malware executes the command, it returns here:


The backdoor_main function is obviously responsible for parsing and executing the commands from the server, and it returns 0 when nothing fails. If this function returns 0, the malware sleeps for [ebp+sleep_time_in_seconds] seconds, then jumps to the beginning of the infinite loop and waits for another command. The sleeping time is the last member of the malware's configuration in the registry.

The end of the malware analysis

So now we can answer the questions:

1. How can you get this malware to install itself?

The malware installs itself as a service when it is run with the "-in" option and the correct password passed as the last command-line argument. That being said, I would install the malware in this way (supposing that the malware's filename is Lab09-01.exe) -> Lab09-01.exe -in abcd or Lab09-01.exe -in <service-name> abcd.

2. What are the command-line options for this program? What is the password requirement?

The command-line options for this program are: "-in", "-re", "-c", "-cc". The password requirement is "abcd".

3. How can you use OllyDbg to permanently patch this malware, so that it doesn’t require the special command-line password?

Let's do this for educational purposes. I'm going to use Immunity Debugger instead of OllyDbg.


If the password_validation function returns a non-zero value, the password is correct. What I'm going to do is change the jnz conditional jump to a plain jmp. After this little patch, the malware will jump to the configuration branch for any password we want.

 Let's search for the address of this jnz instruction and find it inside the debugger. Then right-click Binary -> Edit.


When the conditional jump is taken, the malware jumps exactly 7 bytes ahead. So we have to change this instruction to jmp $+7, where $ in NASM is the address of the first byte of the current instruction.
To find the encoding of this instruction I'm going to use Metasploit's tool called nasm-shell.


So the correct encoding of the jmp instruction is EB 05: the opcode EB followed by a displacement of 5, which is counted from the end of the 2-byte instruction. Let's change the bytes of jnz inside the debugger, right-click on the CPU window, and choose Copy to executable -> All modifications. In the newly created window in the debugger, right-click and choose Save file to save the patched file to disk. I've saved it as "Lab09-01-patched.exe". Let's test if we can read the configuration of the malware using some random password:
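The arithmetic behind EB 05 can be checked with a short Python sketch. Both helper names are made up for this illustration, and I assume the original jnz is the 2-byte short form (75 05), so the replacement fits in place without shifting any other byte. The sample bytes are a toy buffer, not the real file contents.

```python
def encode_short_jmp(distance_from_start: int) -> bytes:
    """Encode "jmp $+distance" as a 2-byte short jump (EB rel8)."""
    rel8 = distance_from_start - 2   # displacement counts from the next instruction
    if not -128 <= rel8 <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([0xEB, rel8 & 0xFF])


def patch_bytes(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Replace bytes at offset, but only if the expected bytes are really there."""
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at the patch offset")
    return image[:offset] + replacement + image[offset + len(replacement):]


toy = bytes.fromhex("90 75 05 90")   # nop / jnz $+7 / nop
patched = patch_bytes(toy, 1, bytes.fromhex("75 05"), encode_short_jmp(7))
print(patched.hex(" "))
```

Checking the expected bytes before writing is a cheap way to avoid patching the wrong offset.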


As you can see it works. :)

4. What are the host-based indicators of this malware?

I think the strongest host-based indicator of this malware is the service running on the system. The name of the service might be set by the attacker, but it's always called <something> Manager Service. Another host-based indicator is the malware's file time, which is equal to the file time of kernel32.dll. The binary path of the service is also a strong host-based indicator, since the binary resides inside the %SYSTEMROOT%\system32 directory and is named <service name>.exe. Finally, the registry entry holding the malware's configuration is obviously a meaningful host-based indicator. If a machine has the HKLM\SOFTWARE\Microsoft\XPS key with a Configuration value, it is infected by the backdoor.

5. What are the different actions this malware can be instructed to take via the network?

The malware can be instructed to execute these commands: SLEEP, DOWNLOAD, UPLOAD, CMD, NOTHING. I've described them in the "The backdoor analysis" section of this article.

6. Are there any useful network-based signatures for this malware?

In the default configuration, the malware connects to the practicalmalwareanalysis.com domain, and this host is certainly a network-based signature for this malware. Obviously, in a real scenario, we would have to deal with one or more C&C servers. The malware talks to the server over HTTP (by default). The malware's endpoint can vary across infected machines, since the attacker can easily change the networking configuration using the "-c" option.


That's the end of this pretty long analysis of an interesting backdoor. I hope you've learned something, just as I did. Thanks for reading. Cheers!
