2of1
half a nibble of another

Decoding obfuscated strings in What’s App Dalvik Executable bytecode

Filed Under (Hacking, Open source, Reverse engineering) by 2of1 on 09-07-2012

I was poking around in the What’s App Android DEX this evening. One notices immediately that they are using various obfuscations, one of which is a string obfuscation (this is NOT DexGuard, maybe Zelix KlassMaster?).
It took me a little while to learn some Dalvik bytecode, but the obfuscation seems fairly straightforward.

Let’s take a look at one such function:

CODE:0010EFB8   Method 6552 (0x1998):_
CODE:0010EFB8    static void
CODE:0010EFB8   com.whatsapp.n0.<clinit>()
CODE:0010EFB8   const/16                        v2, 0x5B
CODE:0010EFBC   const/16                        v3, 0x39
CODE:0010EFC0   const/16                        v1, 0x19
CODE:0010EFC4   const/16                        v4, 9
CODE:0010EFC8   const/4                         v6, 0
CODE:0010EFCA   const/4                         v0, 2
CODE:0010EFCC   new-array                       v9, v0, <t: String[]>
CODE:0010EFD0   const-string                    v0, aK_20 # "k>^"
CODE:0010EFD4   invoke-virtual                  {v0}, <ref String.toCharArray() imp. @ _def_String_toCharArray@L>
CODE:0010EFDA   move-result-object              v0
CODE:0010EFDC   array-length                    v5, v0
CODE:0010EFDE   move                            v7, v5
CODE:0010EFE0   move                            v8, v6
CODE:0010EFE2   move-object                     v5, v0
CODE:0010EFE4
CODE:0010EFE4 loc_10EFE4:                             # CODE XREF: n0__clinit_@V+9Aj
CODE:0010EFE4   if-gt                           v7, v8, loc_10F034
CODE:0010EFE8   new-instance                    v0, <t: String>
CODE:0010EFEC   invoke-direct                   {v0, v5}, <void String.<init>(ref) imp. @ unk_2EB58>
CODE:0010EFF2   invoke-virtual                  {v0}, <ref String.intern() imp. @ _def_String_intern@L>
CODE:0010EFF8   move-result-object              v0
CODE:0010EFFA   aput-object                     v0, v9, v6
CODE:0010EFFE   const/4                         v8, 1
CODE:0010F000   const-string                    v0, aE_10 # "E"
CODE:0010F004   invoke-virtual                  {v0}, <ref String.toCharArray() imp. @ _def_String_toCharArray@L>
CODE:0010F00A   move-result-object              v0
CODE:0010F00C   array-length                    v5, v0
CODE:0010F00E   move                            v7, v6
CODE:0010F010   move                            v6, v5
CODE:0010F012   move-object                     v5, v0
CODE:0010F014
CODE:0010F014 loc_10F014:                             # CODE XREF: n0__clinit_@V+CCj
CODE:0010F014   if-gt                           v6, v7, loc_10F066
CODE:0010F018   new-instance                    v0, <t: String>
CODE:0010F01C   invoke-direct                   {v0, v5}, <void String.<init>(ref) imp. @ unk_2EB58>
CODE:0010F022   invoke-virtual                  {v0}, <ref String.intern() imp. @ _def_String_intern@L>
CODE:0010F028   move-result-object              v0
CODE:0010F02A   aput-object                     v0, v9, v8
CODE:0010F02E   sput-object                     v9, n0_z
CODE:0010F032
CODE:0010F032 locret:
CODE:0010F032   return-void
CODE:0010F034 # ---------------------------------------------------------------------------
CODE:0010F034
CODE:0010F034 loc_10F034:                             # CODE XREF: n0__clinit_@V:loc_10EFE4j
CODE:0010F034   aget-char                       v10, v5, v8
CODE:0010F038   rem-int/lit8                    v0, v8, 5
CODE:0010F03C   packed-switch                   v0, switchdata_10F098
CODE:0010F042 # ---------------------------------------------------------------------------
CODE:0010F042
CODE:0010F042 loc_10F042:                             # CODE XREF: n0__clinit_@V+84j
CODE:0010F042   move                            v0, v4 # default:
CODE:0010F044
CODE:0010F044 loc_10F044:                             # CODE XREF: n0__clinit_@V+9Ej
CODE:0010F044                                         # n0__clinit_@V+A2j ...
CODE:0010F044   xor-int/2addr                   v0, v10
CODE:0010F046   int-to-char                     v0, v0
CODE:0010F048   aput-char                       v0, v5, v8
CODE:0010F04C   add-int/lit8                    v0, v8, 1
CODE:0010F050   move                            v8, v0
CODE:0010F052   goto                            loc_10EFE4
CODE:0010F054 # ---------------------------------------------------------------------------
CODE:0010F054
CODE:0010F054 loc_10F054:                             # CODE XREF: n0__clinit_@V+84j
CODE:0010F054   move                            v0, v1 # case 0: // (0x0)
CODE:0010F056   goto                            loc_10F044
CODE:0010F058 # ---------------------------------------------------------------------------
CODE:0010F058
CODE:0010F058 loc_10F058:                             # CODE XREF: n0__clinit_@V+84j
CODE:0010F058   move                            v0, v2 # case 1: // (0x1)
CODE:0010F05A   goto                            loc_10F044
CODE:0010F05C # ---------------------------------------------------------------------------
CODE:0010F05C
CODE:0010F05C loc_10F05C:                             # CODE XREF: n0__clinit_@V+84j
CODE:0010F05C   move                            v0, v3 # case 2: // (0x2)
CODE:0010F05E   goto                            loc_10F044
CODE:0010F060 # ---------------------------------------------------------------------------
CODE:0010F060
CODE:0010F060 loc_10F060:                             # CODE XREF: n0__clinit_@V+84j
CODE:0010F060   const/16                        v0, 0x75 # case 3: // (0x3)
CODE:0010F064   goto                            loc_10F044
CODE:0010F066 # ---------------------------------------------------------------------------
CODE:0010F066
CODE:0010F066 loc_10F066:                             # CODE XREF: n0__clinit_@V:loc_10F014j
CODE:0010F066   aget-char                       v10, v5, v7
CODE:0010F06A   rem-int/lit8                    v0, v7, 5
CODE:0010F06E   packed-switch                   v0, switchdata_10F0B0
CODE:0010F074 # ---------------------------------------------------------------------------
CODE:0010F074
CODE:0010F074 loc_10F074:                             # CODE XREF: n0__clinit_@V+B6j
CODE:0010F074   move                            v0, v4 # default:
CODE:0010F076
CODE:0010F076 loc_10F076:                             # CODE XREF: n0__clinit_@V+D0j
CODE:0010F076                                         # n0__clinit_@V+D4j ...
CODE:0010F076   xor-int/2addr                   v0, v10
CODE:0010F078   int-to-char                     v0, v0
CODE:0010F07A   aput-char                       v0, v5, v7
CODE:0010F07E   add-int/lit8                    v0, v7, 1
CODE:0010F082   move                            v7, v0
CODE:0010F084   goto                            loc_10F014
CODE:0010F086 # ---------------------------------------------------------------------------
CODE:0010F086
CODE:0010F086 loc_10F086:                             # CODE XREF: n0__clinit_@V+B6j
CODE:0010F086   move                            v0, v1 # case 0: // (0x0)
CODE:0010F088   goto                            loc_10F076
CODE:0010F08A # ---------------------------------------------------------------------------
CODE:0010F08A
CODE:0010F08A loc_10F08A:                             # CODE XREF: n0__clinit_@V+B6j
CODE:0010F08A   move                            v0, v2 # case 1: // (0x1)
CODE:0010F08C   goto                            loc_10F076
CODE:0010F08E # ---------------------------------------------------------------------------
CODE:0010F08E
CODE:0010F08E loc_10F08E:                             # CODE XREF: n0__clinit_@V+B6j
CODE:0010F08E   move                            v0, v3 # case 2: // (0x2)
CODE:0010F090   goto                            loc_10F076
CODE:0010F092 # ---------------------------------------------------------------------------
CODE:0010F092
CODE:0010F092 loc_10F092:                             # CODE XREF: n0__clinit_@V+B6j
CODE:0010F092   const/16                        v0, 0x75 # case 3: // (0x3)
CODE:0010F096   goto                            loc_10F076
CODE:0010F096 # ---------------------------------------------------------------------------
CODE:0010F098 switchdata_10F098:                      # DATA XREF: n0__clinit_@V+84r
CODE:0010F098   .short 0x100
CODE:0010F09A   .short 4
CODE:0010F09C   .int 0
CODE:0010F0A0   .int 0xC, 0xE, 0x10, 0x12
CODE:0010F0B0 switchdata_10F0B0:                      # DATA XREF: n0__clinit_@V+B6r
CODE:0010F0B0   .short 0x100
CODE:0010F0B2   .short 4
CODE:0010F0B4   .int 0
CODE:0010F0B8   .int 0xC, 0xE, 0x10, 0x12
CODE:0010F0B8   Method End

The first thing we notice is that the function is assigning some literal values to registers v1-v4. These values appear to remain unchanged through the decoding code.

Moving along in the code, we can see that there’s some sort of mangled string being assigned @ 0x0010EFD0.
Pulling out the bytes of the string up to its NULL terminator yields the following (shown as hex):

6b3e5e1c7a6d3e4b5a7971345710267a344c1b7d6b224e147d7a335c0726783d4d107b6d3e41016a713a57126c7d7b551a66722e4936666c354d07705a345d10297f295618295a344c1b7d6b22691d66773e701b6f767b5f1460753e5d
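
As a quick sanity check, the first three bytes of that dump (6b 3e 5e) are the ASCII characters k>^, which is what the disassembler shows next to aK_20. The fourth byte (0x1c) is a control character. Here is a minimal sketch, assuming the string preview in the listing simply stops at the first non-printable byte; the helper name printable_prefix is mine, not something from the tool:

```python
import binascii

def printable_prefix(data):
    """Return the leading run of printable ASCII characters in `data`."""
    out = []
    for b in bytearray(data):
        # Stop at the first byte outside the printable ASCII range.
        if not 0x20 <= b <= 0x7e:
            break
        out.append(chr(b))
    return "".join(out)

# First five bytes of the hex dump above.
head = binascii.unhexlify("6b3e5e1c7a")
print(printable_prefix(head))
```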

The function will now loop through each character of the string and perform a transform on it in order to decrypt it. The transform is a simple XOR.

Remember those 4 registers we set up with literals in the beginning? These are the values used for the XOR. In fact there are 5 values in this function; the fifth one is simply used as a direct literal instead of having been placed in a register.
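
Reading the register setup together with the switch cases gives the order of the key bytes. The sketch below only restates the listing in Python; the names REGISTERS, CASES, DEFAULT and key are my own labels and do not appear anywhere in the DEX:

```python
# Literals loaded at the top of <clinit> (values copied from the listing).
REGISTERS = {"v1": 0x19, "v2": 0x5B, "v3": 0x39, "v4": 0x09}

# The value each packed-switch case moves into v0 before the XOR.
# Case 3 loads 0x75 directly with const/16 instead of using a register.
CASES = {
    0: REGISTERS["v1"],
    1: REGISTERS["v2"],
    2: REGISTERS["v3"],
    3: 0x75,
}
# The default branch uses v4.
DEFAULT = REGISTERS["v4"]

# index % 5 is always in 0..4, so the default branch only handles remainder 4.
key = [CASES.get(r, DEFAULT) for r in range(5)]
print([hex(k) for k in key])
```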

The ‘packed-switch’ instruction decides which XOR value to use based on the current character index in the string MOD 5. This allows us to repeatedly cycle through the XOR ‘keys’.
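
To see how the switch picks a branch, the payload at switchdata_10F098 can be resolved by hand. In the Dalvik format, a packed-switch payload holds an identifier (0x0100), an entry count, the first key, and then one branch offset per key. The offsets are counted in 16-bit code units relative to the packed-switch instruction itself. A rough sketch using the addresses from the listing (the function name resolve_packed_switch is hypothetical):

```python
def resolve_packed_switch(insn_addr, first_key, targets):
    """Map each switch key to the byte address of its branch target."""
    # Offsets are in 16-bit code units, relative to the switch instruction,
    # so each one is doubled to get a byte offset.
    return {first_key + i: insn_addr + 2 * off for i, off in enumerate(targets)}

# packed-switch at 0x10F03C; payload: first_key = 0, targets = 0xC, 0xE, 0x10, 0x12
table = resolve_packed_switch(0x10F03C, 0, [0xC, 0xE, 0x10, 0x12])
for k in sorted(table):
    print("case %d -> loc_%X" % (k, table[k]))

# A remainder of 4 has no entry in the table, so execution falls through to
# the default branch right after the switch (loc_10F042 in the listing).
print(4 in table)
```

The resolved addresses line up with the case 0 to case 3 labels in the disassembly, and the second switch (switchdata_10F0B0) works the same way for the second string.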

Here’s a Python function that performs the decryption:

def decode_string(encoded_str, key):
    # Work on raw byte values so the XOR operates on integers.
    encoded = bytearray(encoded_str)

    # XOR each byte with the key byte selected by its index modulo the key length.
    return ''.join(chr(b ^ key[i % len(key)]) for i, b in enumerate(encoded))

And in use:

import binascii
 
key = [0x19, 0x5b, 0x39, 0x75, 0x09]
encoded_str = binascii.unhexlify("6b3e5e1c7a6d3e4b5a7971345710267a344c1b7d6b224e147d7a335c0726783d4d107b6d3e41016a713a57126c7d7b551a66722e4936666c354d07705a345d10297f295618295a344c1b7d6b22691d66773e701b6f767b5f1460753e5d")
 
print(decode_string(encoded_str, key))

Which in this case yields:

register/phone/countrywatcher/aftertextchanged lookupCountryCode from CountryPhoneInfo failed
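
Since XOR is its own inverse, running the plaintext back through the same transform should reproduce the encoded bytes, which makes for an easy check of the key. A small sketch (xor_cycle is a name I made up; it is the same transform as decode_string, restated so the snippet runs on its own):

```python
import binascii

KEY = [0x19, 0x5b, 0x39, 0x75, 0x09]

def xor_cycle(data, key):
    """XOR every byte of `data` with the key, cycling through the key bytes."""
    return bytes(bytearray(b ^ key[i % len(key)]
                           for i, b in enumerate(bytearray(data))))

plain = (b"register/phone/countrywatcher/aftertextchanged "
         b"lookupCountryCode from CountryPhoneInfo failed")
encoded = xor_cycle(plain, KEY)

# The re-encoded bytes begin with 6b 3e 5e 1c 7a, i.e. the "k>^" from the listing.
print(binascii.hexlify(encoded[:5]))
# Applying the transform a second time gives the plaintext back.
print(xor_cycle(encoded, KEY) == plain)
```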

Happy Hacking!


Comments:

5 Responses to “Decoding obfuscated strings in What’s App Dalvik Executable bytecode”

  1. Shane on 19 Jul 2012 at 12:36 pm

    Is there some simple way to copy from dalvik explorer into an app to decode it? I just wanna find something I shouldn’t have erased. Not go to school for ten years

  2. 2of1 on 19 Jul 2012 at 12:46 pm

    Sorry I don’t understand your question?

  3. Moritz on 30 Nov 2012 at 12:35 am

    Hey,

    where do you get this String at 0010EFD0?

    All I find is k>^

  4. william on 12 Dec 2012 at 8:03 pm

    Nice. Could you post the compiled tool? or the decompiled whatsapp?

  5. 2of1 on 7 Feb 2013 at 8:33 am

    @Moritz
    The string is stored at aK_20. IDA just shows some of the printable characters (it’s not the entire string).

    @william
    That’s left as an exercise for you :) .

© 2of1.