
T-110.6220 Special Course in Communications Security

Spring 2008: Malware Analysis and Antivirus Technologies (5 ECTS) P V

Homework instructions



There will be 3 rounds of homeworks for this course.

For each of the homework rounds, you must get at least 25 % of the maximum points. In addition, you must get at least 50 % of the total points of all the homework rounds.

The total points you get, together with the points from the course project, will define your course grade.

You should return your answers by using the Optima system. Suitable formats for the documents are .txt, .doc, and .pdf. When naming your answer files, use the prefix YourLastName_homeworkX (where X is the number of the homework round) for all files you submit. Remember to include your name inside the files as well.

Please do not discuss the answers to the homeworks in public before the deadline date, even if you are not attending the course! All general questions related to the homeworks can be sent to the course newsgroup: opinnot.tik.t1106220.


Homework 1

1. Windows OS basics (3 points)

Are the following claims true or false?

  1. Two threads inside the same process share the same memory space.
  2. The Service Control Manager stores configuration data about services in the registry.
  3. The information about which users can read a specific file is contained in the access token associated with that file.
  4. The sysenter instruction is a privileged instruction that can only be executed by code running in kernel mode.
  5. The Native API is the recommended API to use when developing most Windows applications.
  6. The Windows API function GetProcAddress locates the virtual address of a function exported by a dynamic-link library.

2. Windows registry (3 points)

Take a look at an example export of the HKEY_CURRENT_USER\Software\Microsoft\Windows registry branch with a Unicode-aware text editor.

Based on this, answer the following questions:

  1. What is the purpose of the HKEY_CURRENT_USER branch?
  2. This part of the registry contains a very common launchpoint. What is it?
  3. Which of the applications that are automatically launched is most likely to be malware? Why?

3. Calling conventions and Windows memory management (3 points)

The following disassembly is part of a user-mode application running on a default installation of Windows XP.

...
0042D9B0  xor         eax,eax 
0042D9B2  push        eax  
0042D9B3  call        dword ptr [myfunc] 
0042D9B6  mov         ecx,80494678h 
0042D9BB  mov         dword ptr [ecx],eax 
0042D9BD  push        eax  
0042D9BE  call        dword ptr [myfunc2] 
...

Answer the following questions:

  1. Please explain briefly what the code does.
  2. Which calling convention does the code use: cdecl, stdcall or fastcall? Justify your answer.
  3. Most likely, not all of the code will be able to run. What's wrong with the code? What will happen when it's executed?

4. C compilers and structures, representing data (3 points)

Strings can be represented in many ways. Some API functions on Windows expect Unicode strings to be stored in UNICODE_STRING format. An explanation of UNICODE_STRING can be found in MSDN.

The structure definition in the operating system headers looks like this:

typedef struct _UNICODE_STRING {
    USHORT Length;
    USHORT MaximumLength;
    PWSTR  Buffer;
} UNICODE_STRING;
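As a rough illustration of how such a structure maps onto raw bytes, the following Python sketch reads the three members from a byte buffer. It is only a sketch under stated assumptions: a 32-bit pointer, no padding between the members, and a byte order chosen by the caller. The function names and the sample buffer are hypothetical and are not taken from the memory dump in this question.

```python
import struct


def parse_unicode_string(memory: bytes, offset: int, byteorder: str = "<"):
    """Read (Length, MaximumLength, Buffer) from a UNICODE_STRING image.

    Assumptions (not confirmed by the assignment): 4-byte pointer and no
    padding between members. byteorder is a struct prefix, "<" for
    little-endian or ">" for big-endian.
    """
    return struct.unpack_from(byteorder + "HHI", memory, offset)


def read_string(memory: bytes, base_address: int, length: int,
                buffer_ptr: int, byteorder: str = "<") -> str:
    """Decode `length` bytes of UTF-16 text located at address buffer_ptr.

    base_address is the virtual address that memory[0] corresponds to.
    """
    start = buffer_ptr - base_address
    codec = "utf-16-le" if byteorder == "<" else "utf-16-be"
    return memory[start:start + length].decode(codec)


# Hypothetical buffer mapped at address 0x1000: an 8-byte structure,
# 8 bytes of filler, then the text "Hi!" with a 2-byte terminator.
BASE = 0x1000
sample = (struct.pack("<HHI", 6, 8, 0x1010)
          + bytes(8)
          + "Hi!".encode("utf-16-le") + b"\x00\x00")

length, maximum_length, buffer_ptr = parse_unicode_string(sample, 0)
text = read_string(sample, BASE, length, buffer_ptr)
```

Calling parse_unicode_string with byteorder=">" on the same buffer yields implausible values, which is one way to compare the two conventions on real data.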

In the partial memory dumps below, address 0x0012FF48 contains a UNICODE_STRING structure.

0x0012FF30  cc cc cc cc cc cc cc cc  ........
0x0012FF38  cc cc cc cc cc cc cc cc  ........
0x0012FF40  cc cc cc cc cc cc cc cc  ........
0x0012FF48  04 00 06 00 94 3c 48 00  .....<H.
0x0012FF50  cc cc cc cc b8 ff 12 00  ........
0x0012FF58  32 e0 42 00 02 00 00 00  2.B.....
0x0012FF60  a0 46 33 00 38 47 33 00  .F3.8G3.
0x0012FF68  3d a8 ec 52 00 00 00 00  =..R....
0x0012FF70  f4 f9 9d 08 00 40 fd 7f  .....@..
0x0012FF78  ce 00 00 c0 01 05 00 00  ........
0x0012FF80  00 00 00 00 28 0a 00 00  ....(...
0x0012FF88  01 00 00 00 58 33 15 00  ....X3..
0x0012FF90  02 00 00 00 00 00 00 00  ........
...
0x00483C00  00 00 00 00 00 00 00 00  ........
0x00483C08  00 00 00 00 00 00 00 00  ........
0x00483C10  00 00 00 00 00 00 00 00  ........
0x00483C18  00 00 00 00 00 00 00 00  ........
0x00483C20  00 00 00 00 00 00 00 00  ........
0x00483C28  00 00 00 00 00 00 00 00  ........
0x00483C30  00 00 00 00 00 00 00 00  ........
0x00483C38  00 00 00 00 00 00 00 00  ........
0x00483C40  00 00 00 00 00 00 00 00  ........
0x00483C48  00 00 00 00 00 00 00 00  ........
0x00483C50  00 00 00 00 6b 1c b3 47  ....k..G
0x00483C58  00 00 00 00 02 00 00 00  ........
0x00483C60  3c 00 00 00 78 10 09 00  <...x...
0x00483C68  78 70 06 00 5f 43 49 61  xp.._CIa
0x00483C70  63 6f 73 00 00 00 00 00  cos.....
0x00483C78  6d 00 73 00 76 00 63 00  m.s.v.c.
0x00483C80  72 00 74 00 2e 00 64 00  r.t...d.
0x00483C88  6c 00 6c 00 00 00 00 00  l.l.....
0x00483C90  00 00 00 00 38 00 ac 20  ....8.. 
0x00483C98  00 00 00 00 00 00 00 00  ........
0x00483CA0  00 00 00 00 00 00 00 00  ........
0x00483CA8  00 00 00 00 00 00 00 00  ........
0x00483CB0  6e 00 74 00 64 00 6c 00  n.t.d.l.
0x00483CB8  6c 00 2e 00 64 00 6c 00  l...d.l.
0x00483CC0  6c 00 00 00 00 00 00 00  l.......
0x00483CC8  52 74 6c 49 6e 69 74 55  RtlInitU
0x00483CD0  6e 69 63 6f 64 65 53 74  nicodeSt
0x00483CD8  72 69 6e 67 00 00 00 00  ring....

Based on the documentation and the memory dump, answer the following questions:

  1. Do you think the data in memory is stored using little-endian or big-endian convention? Justify your answer.
  2. What are the values of the UNICODE_STRING member variables? What are the meanings of these values? What Unicode string does this structure represent?
  3. At compile time, did the compiler align or pad the structure? Why or why not?

5. Reverse engineering I (6 points)

Using IDA Pro 4.9, analyze this sample (SHA-1: 47647b240b570a7cc2840b6777d5f1d19ae82820) and write a description of what the sample does. The sample is not malicious!

Please keep your answer relatively short (less than 2000 characters with spaces).

6. Reverse Engineering II (6 points)

Using IDA Pro 4.9, analyze this sample (SHA-1: 2bbd464046eddb552711c7468493fd38220b7012) and write a description of what the sample does. The sample is not malicious!

Please keep your answer relatively short (less than 2000 characters with spaces).

7. Effort (1 point)

How many hours did you spend answering these questions?


Homework 2

1. Debuggers, emulators and virtualization (6 points)

The following claims refer to the Windows operating system. Are they true or false?

  1. A process can have only one debugger attached to it at a time.
  2. The structured exception handlers (SEH) are stored in a linked list in the Process Environment Block (PEB) of a process.
  3. When a PE executable is launched, the first instructions executed from the image are always from the entrypoint as specified in the PE header.
  4. A debugger receives a first-chance notification of an exception immediately after the application's own exception handlers fail to handle it.

Read the paper A Comparison of Software and Hardware Techniques for x86 Virtualization and answer the following questions:

  1. Why does the basic fetch-decode-execute emulator not meet Popek and Goldberg's requirements for a VMM?
  2. Why is the popf instruction problematic for virtualization? How does VMware solve this problem without hardware support?

2. Disassemblers (6 points)

The following hexadecimal byte sequences represent Intel x86 instructions. Disassemble them and explain the meaning of each part of the instruction, including:

You will need to refer to Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference Manual.

For example, for a byte sequence 81 4D 10 00 00 20 00, we would have:

This gives us "or [ebp+10h], 200000h"
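The example above can also be worked through mechanically. The sketch below is a deliberately narrow illustration, not a disassembler: it handles only opcode 81 (the "group 1" instructions with a 32-bit immediate) with a ModR/M byte whose mod field is 01 (register plus 8-bit displacement) and no SIB byte. The function names are hypothetical; the register numbering and the group 1 operation table follow the Intel instruction set reference.

```python
import struct

# Operation selected by the ModR/M reg field for opcode 81 (group 1).
GROUP1_OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]
# 32-bit general-purpose registers in encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte: int):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def decode_group1_disp8(code: bytes) -> str:
    """Decode 81 /n with mod=01: op [reg32 + disp8], imm32."""
    if code[0] != 0x81:
        raise ValueError("only opcode 81 is handled in this sketch")
    mod, reg, rm = split_modrm(code[1])
    if mod != 1 or rm == 4:
        raise ValueError("only [reg+disp8] without a SIB byte is handled")
    disp8 = struct.unpack_from("<b", code, 2)[0]   # signed 8-bit displacement
    imm32 = struct.unpack_from("<I", code, 3)[0]   # little-endian immediate
    sign = "+" if disp8 >= 0 else "-"
    return f"{GROUP1_OPS[reg]} [{REG32[rm]}{sign}{abs(disp8):X}h], {imm32:X}h"


example = decode_group1_disp8(bytes.fromhex("81 4D 10 00 00 20 00"))
```

Here the ModR/M byte 4D splits into mod=01, reg=001 and rm=101, which selects the or operation and the ebp register, so example holds the same text as the result quoted above.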

Here are the byte sequences:

  1. 89 C2
  2. F3 AB
  3. DC 74 24 30
  4. 64 81 3D 30 00 00 00 00 F0 FD 7F

3. Debugging I (6 points)

Using OllyDbg and any other tools you find suitable, analyze the following executable:

Write a short analysis of what the executable does.

Note that you don't have to comment each line of assembly. The sample was written with a high-level language (C), and the executable contains a lot of uninteresting code. Focus only on the interesting parts!

4. Debugging II (6 points)

Using OllyDbg and any other tools you find suitable, analyze the following executable:

Write a short analysis of what the executable does.

The sample contains code that is meant to make debugging difficult. Make sure to document the antidebugging features as well.

5. Effort (1 point)

How many hours did you spend answering these questions?


Homework 3

1. PE/COFF file format (9 points)

Take a look at this sample file. It's a console mode PE executable. Answer the following questions based on the file.

1.1 Basic information

  1. What sections does the file have? Which sections seem to contain code?
  2. The file contains resources. What information can you find about them from the PE header?
  3. The file has a digital signature. Who is the signer? Which field in the PE header includes information about where the signature data is located? Hint: if you want to view the signature details through the GUI, rename the file to .exe, right-click and select "Properties".

1.2 RVAs

An important concept related to the PE header is the Relative Virtual Address (RVA). Many times, you will need to convert RVAs into offsets in the file on disk. Most tools will do this for you, but it's important to understand how it's done.

Convert the following RVAs into offsets in the file on disk and explain the steps to do this:

  1. 0x0000f016
  2. 0x00001229
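The general shape of such a conversion can be sketched in Python as follows. This is a sketch under assumptions: the section values are hypothetical and do not come from the sample file, the field names mirror the section header fields VirtualAddress, VirtualSize, PointerToRawData and SizeOfRawData, and RVAs that fall inside the headers are not handled.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int      # RVA where the section starts in memory
    virtual_size: int         # size of the section in memory
    pointer_to_raw_data: int  # file offset where the section data starts
    size_of_raw_data: int     # size of the section data on disk


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Map an RVA to a file offset, or None if no file data backs it."""
    for section in sections:
        start = section.virtual_address
        span = max(section.virtual_size, section.size_of_raw_data)
        if start <= rva < start + span:
            delta = rva - start
            if delta >= section.size_of_raw_data:
                return None  # inside the section, but past its data on disk
            return section.pointer_to_raw_data + delta
    return None  # the RVA does not fall inside any listed section


# Hypothetical section table, for illustration only.
SECTIONS = [
    Section(".text", 0x1000, 0x2345, 0x400, 0x2400),
    Section(".data", 0x4000, 0x1000, 0x2800, 0x200),
]

offset = rva_to_file_offset(0x1234, SECTIONS)
```

With these made-up values, RVA 0x1234 lies 0x234 bytes into .text, so the offset is 0x400 + 0x234.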

1.3 Fixing the header

The file can't be executed. Why? Using a hex editor, fix the file to run on your Windows machine and see what it prints out. Note that as you fixed the file, you also made the digital signature invalid. Did you see any warnings when you executed the file?

2. Runtime packer basics (6 points)

The sample file from question 1 has been packed with a runtime packer. Analyse the file with suitable tools (hex editor, debugger) and answer the following questions.

  1. Briefly explain the principle of how runtime executable packers work. What does the concept of original entry-point (OEP) mean?
  2. Which packer was used to pack the sample file? How did you identify it? What other, more reliable methods could be used to identify the packer?
  3. Using the OllyDbg debugger, let the executable unpack itself into memory so you can step through the code at the OEP. Explain how you did this.

3. Security and Malware in 2008: Banking Trojans (9 points)

A specific malware threat targeted at online banking has become increasingly common. Banking trojans target bank account transactions and information to steal money from online bank users.

The following papers are a good introduction to the topic:

  1. The paper by Stahlberg describes different technologies that banking trojans use to monitor the user and spy on the transaction data. Find online descriptions of banking trojans that use the following techniques: Your answer should include a link to a description (or a weblog entry) of a banking trojan for each of the techniques. Example: Infostealer.Banker.C uses hooking to intercept transaction data.
  2. Describe three different things a normal home user could implement to mitigate the risk of being affected by a banking trojan. Analyze the solutions from different perspectives: usability, security, cost, etc. Limit your answer to approximately 500 words.
  3. Describe three solutions a financial institution offering online banking services could implement to better protect its customers from the threat of banking trojans. Think broadly -- there are many possible ways to tackle the problem (cryptography, virtualization, security software, hardware solutions...). Again, analyze the solutions from different perspectives. Limit your answer to approximately 500 words.

4. Effort (1 point)

How many hours did you spend answering these questions?


Extra homework (10 points)

You must agree with the course staff beforehand that you can do the extra homework assignment. Please send an e-mail to T-110.6220(at)tml.hut.fi. This assignment is ONLY for students who have not completed the minimum requirements for the homework rounds. This includes students who
  1. have not received 25 % of the points for one or more homework rounds (6/25 p), or
  2. have not received 50 % of the total points (37/75 p)
The assignment will be graded as pass/fail and it will be worth 10 points. If you pass, you will complete the minimum requirements for the homework rounds, even if your total is less than 50 %. Note that completing this assignment will not guarantee that you pass the course! You still must get enough total points from the homeworks and the course project. The point limits will be published after the course project has been graded.

Assignment instructions

Your task is to analyze this sample and to find the password for your own student ID number.

Briefly describe how you came up with the solution (no more than 200 words).