Decoding obfuscated strings in WhatsApp Dalvik Executable bytecode
Filed Under (Hacking, Open source, Reverse engineering) by 2of1 on 09-07-2012
I was poking around in the WhatsApp Android DEX this evening. It is immediately apparent that they are using various obfuscations, one of which is a string obfuscation (this is NOT DexGuard, maybe Zelix KlassMaster?).
It took me a little while to learn some Dalvik bytecode, but the obfuscation seems fairly straightforward.
Let’s take a look at one such function:
CODE:0010EFB8 Method 6552 (0x1998):_
CODE:0010EFB8 static void
CODE:0010EFB8 com.whatsapp.n0.<clinit>()
CODE:0010EFB8 const/16 v2, 0x5B
CODE:0010EFBC const/16 v3, 0x39
CODE:0010EFC0 const/16 v1, 0x19
CODE:0010EFC4 const/16 v4, 9
CODE:0010EFC8 const/4 v6, 0
CODE:0010EFCA const/4 v0, 2
CODE:0010EFCC new-array v9, v0, <t: String[]>
CODE:0010EFD0 const-string v0, aK_20 # "k>^"
CODE:0010EFD4 invoke-virtual {v0}, <ref String.toCharArray() imp. @ _def_String_toCharArray@L>
CODE:0010EFDA move-result-object v0
CODE:0010EFDC array-length v5, v0
CODE:0010EFDE move v7, v5
CODE:0010EFE0 move v8, v6
CODE:0010EFE2 move-object v5, v0
CODE:0010EFE4
CODE:0010EFE4 loc_10EFE4: # CODE XREF: n0__clinit_@V+9Aj
CODE:0010EFE4 if-gt v7, v8, loc_10F034
CODE:0010EFE8 new-instance v0, <t: String>
CODE:0010EFEC invoke-direct {v0, v5}, <void String.<init>(ref) imp. @ unk_2EB58>
CODE:0010EFF2 invoke-virtual {v0}, <ref String.intern() imp. @ _def_String_intern@L>
CODE:0010EFF8 move-result-object v0
CODE:0010EFFA aput-object v0, v9, v6
CODE:0010EFFE const/4 v8, 1
CODE:0010F000 const-string v0, aE_10 # "E"
CODE:0010F004 invoke-virtual {v0}, <ref String.toCharArray() imp. @ _def_String_toCharArray@L>
CODE:0010F00A move-result-object v0
CODE:0010F00C array-length v5, v0
CODE:0010F00E move v7, v6
CODE:0010F010 move v6, v5
CODE:0010F012 move-object v5, v0
CODE:0010F014
CODE:0010F014 loc_10F014: # CODE XREF: n0__clinit_@V+CCj
CODE:0010F014 if-gt v6, v7, loc_10F066
CODE:0010F018 new-instance v0, <t: String>
CODE:0010F01C invoke-direct {v0, v5}, <void String.<init>(ref) imp. @ unk_2EB58>
CODE:0010F022 invoke-virtual {v0}, <ref String.intern() imp. @ _def_String_intern@L>
CODE:0010F028 move-result-object v0
CODE:0010F02A aput-object v0, v9, v8
CODE:0010F02E sput-object v9, n0_z
CODE:0010F032
CODE:0010F032 locret:
CODE:0010F032 return-void
CODE:0010F034 # ---------------------------------------------------------------------------
CODE:0010F034
CODE:0010F034 loc_10F034: # CODE XREF: n0__clinit_@V:loc_10EFE4j
CODE:0010F034 aget-char v10, v5, v8
CODE:0010F038 rem-int/lit8 v0, v8, 5
CODE:0010F03C packed-switch v0, switchdata_10F098
CODE:0010F042 # ---------------------------------------------------------------------------
CODE:0010F042
CODE:0010F042 loc_10F042: # CODE XREF: n0__clinit_@V+84j
CODE:0010F042 move v0, v4 # default:
CODE:0010F044
CODE:0010F044 loc_10F044: # CODE XREF: n0__clinit_@V+9Ej
CODE:0010F044 # n0__clinit_@V+A2j ...
CODE:0010F044 xor-int/2addr v0, v10
CODE:0010F046 int-to-char v0, v0
CODE:0010F048 aput-char v0, v5, v8
CODE:0010F04C add-int/lit8 v0, v8, 1
CODE:0010F050 move v8, v0
CODE:0010F052 goto loc_10EFE4
CODE:0010F054 # ---------------------------------------------------------------------------
CODE:0010F054
CODE:0010F054 loc_10F054: # CODE XREF: n0__clinit_@V+84j
CODE:0010F054 move v0, v1 # case 0: // (0x0)
CODE:0010F056 goto loc_10F044
CODE:0010F058 # ---------------------------------------------------------------------------
CODE:0010F058
CODE:0010F058 loc_10F058: # CODE XREF: n0__clinit_@V+84j
CODE:0010F058 move v0, v2 # case 1: // (0x1)
CODE:0010F05A goto loc_10F044
CODE:0010F05C # ---------------------------------------------------------------------------
CODE:0010F05C
CODE:0010F05C loc_10F05C: # CODE XREF: n0__clinit_@V+84j
CODE:0010F05C move v0, v3 # case 2: // (0x2)
CODE:0010F05E goto loc_10F044
CODE:0010F060 # ---------------------------------------------------------------------------
CODE:0010F060
CODE:0010F060 loc_10F060: # CODE XREF: n0__clinit_@V+84j
CODE:0010F060 const/16 v0, 0x75 # case 3: // (0x3)
CODE:0010F064 goto loc_10F044
CODE:0010F066 # ---------------------------------------------------------------------------
CODE:0010F066
CODE:0010F066 loc_10F066: # CODE XREF: n0__clinit_@V:loc_10F014j
CODE:0010F066 aget-char v10, v5, v7
CODE:0010F06A rem-int/lit8 v0, v7, 5
CODE:0010F06E packed-switch v0, switchdata_10F0B0
CODE:0010F074 # ---------------------------------------------------------------------------
CODE:0010F074
CODE:0010F074 loc_10F074: # CODE XREF: n0__clinit_@V+B6j
CODE:0010F074 move v0, v4 # default:
CODE:0010F076
CODE:0010F076 loc_10F076: # CODE XREF: n0__clinit_@V+D0j
CODE:0010F076 # n0__clinit_@V+D4j ...
CODE:0010F076 xor-int/2addr v0, v10
CODE:0010F078 int-to-char v0, v0
CODE:0010F07A aput-char v0, v5, v7
CODE:0010F07E add-int/lit8 v0, v7, 1
CODE:0010F082 move v7, v0
CODE:0010F084 goto loc_10F014
CODE:0010F086 # ---------------------------------------------------------------------------
CODE:0010F086
CODE:0010F086 loc_10F086: # CODE XREF: n0__clinit_@V+B6j
CODE:0010F086 move v0, v1 # case 0: // (0x0)
CODE:0010F088 goto loc_10F076
CODE:0010F08A # ---------------------------------------------------------------------------
CODE:0010F08A
CODE:0010F08A loc_10F08A: # CODE XREF: n0__clinit_@V+B6j
CODE:0010F08A move v0, v2 # case 1: // (0x1)
CODE:0010F08C goto loc_10F076
CODE:0010F08E # ---------------------------------------------------------------------------
CODE:0010F08E
CODE:0010F08E loc_10F08E: # CODE XREF: n0__clinit_@V+B6j
CODE:0010F08E move v0, v3 # case 2: // (0x2)
CODE:0010F090 goto loc_10F076
CODE:0010F092 # ---------------------------------------------------------------------------
CODE:0010F092
CODE:0010F092 loc_10F092: # CODE XREF: n0__clinit_@V+B6j
CODE:0010F092 const/16 v0, 0x75 # case 3: // (0x3)
CODE:0010F096 goto loc_10F076
CODE:0010F096 # ---------------------------------------------------------------------------
CODE:0010F098 switchdata_10F098: # DATA XREF: n0__clinit_@V+84r
CODE:0010F098 .short 0x100
CODE:0010F09A .short 4
CODE:0010F09C .int 0
CODE:0010F0A0 .int 0xC, 0xE, 0x10, 0x12
CODE:0010F0B0 switchdata_10F0B0: # DATA XREF: n0__clinit_@V+B6r
CODE:0010F0B0 .short 0x100
CODE:0010F0B2 .short 4
CODE:0010F0B4 .int 0
CODE:0010F0B8 .int 0xC, 0xE, 0x10, 0x12
CODE:0010F0B8 Method End
The first thing we notice is that the function assigns some literal values to registers v1-v4. These values remain unchanged throughout the decoding code.
Moving along in the code, we can see that there’s some sort of mangled string being assigned at 0x0010EFD0.
Reading the string's bytes up to its NUL terminator yields the following (shown as hex):
6b3e5e1c7a6d3e4b5a7971345710267a344c1b7d6b224e147d7a335c0726783d4d107b6d3e41016a713a57126c7d7b551a66722e4936666c354d07705a345d10297f295618295a344c1b7d6b22691d66773e701b6f767b5f1460753e5d
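The listing labels this string as "k>^", which is presumably just the leading run of printable ASCII bytes before the first control character. A small sketch of that idea follows; `printable_prefix` is a helper name made up for this illustration, and the input is only the first five bytes of the dump above:

```python
import binascii

def printable_prefix(data):
    """Return the leading run of printable ASCII characters in data."""
    prefix = ''
    for byte in bytearray(data):
        # Printable ASCII spans 0x20 (space) through 0x7e (~).
        if not 0x20 <= byte <= 0x7e:
            break
        prefix += chr(byte)
    return prefix

# First five bytes of the encoded string shown above.
head = binascii.unhexlify("6b3e5e1c7a")
print(printable_prefix(head))  # k>^
```

The fourth byte, 0x1c, is a control character, so the visible label stops after three characters even though the string continues.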
The function will now loop through each character of the string and perform a transform on it in order to decrypt it. The transform is a simple XOR.
Remember those 4 registers we set up with literals in the beginning? These are the values used for the XOR. In fact there are 5 values in this function; the fifth one is simply used as a direct literal instead of having been placed in a register.
The ‘packed-switch’ instruction decides which XOR value to use based on the current character index in the string MOD 5. This allows us to repeatedly cycle through the XOR ‘keys’.
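As a rough Python rendering of that switch, each case maps the index modulo 5 to one XOR value, with the default branch covering a remainder of 4. The function name `select_key` is invented here; the constants and register names in the comments come from the listing:

```python
def select_key(index):
    """Mirror the packed-switch: pick the XOR value for a character index."""
    case = index % 5          # rem-int/lit8 v0, v8, 5
    if case == 0:
        return 0x19           # case 0: v1
    if case == 1:
        return 0x5B           # case 1: v2
    if case == 2:
        return 0x39           # case 2: v3
    if case == 3:
        return 0x75           # case 3: inline literal
    return 0x09               # default branch: v4

# The three characters the disassembler displays for the string.
encoded = "k>^"
decoded = ''.join(chr(ord(c) ^ select_key(i)) for i, c in enumerate(encoded))
print(decoded)  # reg
```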
Here’s a Python function that performs the decryption:
def decode_string(encoded_str, key):
    # Work on raw byte values so each one can be XORed directly.
    encoded_str = bytearray(encoded_str)
    decoded_str = ''
    for i in range(len(encoded_str)):
        # Cycle through the five XOR values by character index.
        decoded_str += chr(encoded_str[i] ^ key[i % 5])
    return decoded_str
And in use:
import binascii

# XOR values in switch order: case 0, 1, 2, 3, default
key = [0x19, 0x5b, 0x39, 0x75, 0x09]
encoded_str = binascii.unhexlify("6b3e5e1c7a6d3e4b5a7971345710267a344c1b7d6b224e147d7a335c0726783d4d107b6d3e41016a713a57126c7d7b551a66722e4936666c354d07705a345d10297f295618295a344c1b7d6b22691d66773e701b6f767b5f1460753e5d")
print(decode_string(encoded_str, key))
Which in this case yields:
register/phone/countrywatcher/aftertextchanged lookupCountryCode from CountryPhoneInfo failed
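Since XOR is its own inverse, running the same transform over the plaintext should reproduce the obfuscated bytes. A minimal sketch that checks this against the start of the hex dump; `xor_cycle` is a name made up for this example, not something taken from the app:

```python
import binascii

KEY = [0x19, 0x5b, 0x39, 0x75, 0x09]

def xor_cycle(data, key):
    """XOR each byte of data with the key, cycling through the key values."""
    return bytes(bytearray(b ^ key[i % len(key)]
                           for i, b in enumerate(bytearray(data))))

plain = b"register/phone/"
encoded = xor_cycle(plain, KEY)
print(binascii.hexlify(encoded).decode("ascii"))  # 6b3e5e1c7a6d3e4b5a797134571026
print(xor_cycle(encoded, KEY) == plain)           # True
```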
Happy Hacking!




Is there some simple way to copy from dalvik explorer into an app to decode it? I just wanna find something I shouldn’t have erased. Not go to school for ten years
Sorry I don’t understand your question?
Hey,
where do you get this String at 0010EFD0?
All I find is k>^
Nice. Could you post the compiled tool? or the decompiled whatsapp?
@Moritz
The string is stored at aK_20. IDA just shows some of the printable characters (it’s not the entire string).
@william
That’s left as an exercise for you