As the world shifts from compiled languages such as C, C++ and
Pascal to scripting languages such as Python, Perl, PHP and
JavaScript, the exposure of intellectual property (the source code)
grows with it. While “fat clients”, usually written in C and C++,
were previously shipped as compiled machine-code executables, more
modern applications written in .NET and Java consist of bytecode,
which “is the intermediate representation of Java programs” (Peter
Haggar, 2001). The same applies to .NET applications, which can be
disassembled using tools shipped with the .NET Framework SDK (such
as ILDASM) and decompiled back into source code (Gabriel Torok and
Bill Leach, 2003). With web technologies such as HTML, JavaScript
and Cascading Style Sheets (CSS), where the source has to be
downloaded to the client side in order to be executed by the web
browser, the end user has unrestricted access to the entire source
code.
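The point about bytecode can be illustrated with a minimal sketch. It uses Python rather than Java or .NET, on the assumption that Python's standard `dis` module is a fair stand-in for tools such as ILDASM; the function `apply_discount` is a hypothetical example, not taken from any of the cited sources. The sketch shows that the compiled form still carries the original variable names.

```python
import dis


def apply_discount(price, rate):
    # Hypothetical business logic that the author might want to keep private.
    discounted = price * (1 - rate)
    return discounted


# Disassemble the compiled function into a human-readable listing.
listing = dis.Bytecode(apply_discount).dis()
print(listing)

# The code object keeps the names of arguments and local variables.
print(apply_discount.__code__.co_varnames)
```

The exact instruction listing differs between Python versions, but the names `price`, `rate` and `discounted` remain visible to anyone who has the compiled code.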
The ability to access source code can serve both legitimate and
malicious purposes. For example, security tools decompile Java
applets and Flash content in order to perform “static analysis to
understand their behaviours” (Telecomworldwire, 2009). Moreover,
software developers can use disassembly for debugging. On the other
hand, the same ability can be used to reverse engineer the source
code, which directly undermines the protection of intellectual
property.
One obvious way to try to protect the source code, and thus the
intellectual property it carries, is to use obfuscation (Gabriel
Torok and Bill Leach, 2003; Peter Haggar, 2001; Tony Patton, 2008).
Regardless of the language used to develop the application,
obfuscation usually means:
- replacing variable names with meaningless character strings
- replacing constants with expressions
- replacing decimal values with hexadecimal, octal or binary representations
- adding dummy functions and loops
- removing comments
- concatenating all lines of the source code
In effect, obfuscation changes the source code to make it difficult
for the “reader” to understand the logic behind it. It can be seen
as “your kid sister encryption”: “cryptography that will stop your
kid sister from reading your files” (Bruce Schneier, 1996). Of
course, a persistent “reader” can invest enough time and resources
to reproduce the source code (deobfuscate it) by applying
obfuscation principles in reverse.
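As a rough illustration of applying one obfuscation principle in reverse, the following sketch undoes "replacement of constants with expressions" by folding constant arithmetic back into plain decimal values. It assumes Python 3.9 or later (for `ast.unparse`); the names `ConstantFolder` and `fold_constants` are hypothetical helpers, and a real deobfuscator would need to handle far more cases.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to evaluate.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on numeric literals with the computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested expressions first
        op = _OPS.get(type(node.op))
        left, right = node.left, node.right
        if (
            op is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            folded = ast.Constant(value=op(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Return the source with constant expressions folded to literals."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold_constants("timeout = 0x1e * (1 + 1)"))
```

Hexadecimal, octal and binary literals are parsed into ordinary integers, so they come back out in decimal form without any extra work. Meaningful variable names and comments, by contrast, cannot be recovered this way, because obfuscation discards them.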
Bibliography
- Telecomworldwire, 2009. “HP unveils HP SWFScan free web security tool”. Telecomworldwire (M2), Regional Business News, EBSCOhost (accessed: October 28, 2011).
- Bruce Schneier, 1996. “Applied Cryptography”. Wiley; 2nd Edition. Preface.
- Gabriel Torok and Bill Leach, 2003. “Thwart Reverse Engineering of Your Visual Basic .NET or C# Code” [online]. Microsoft. Available from: http://msdn.microsoft.com/en-us/magazine/cc164058.aspx (accessed: October 28, 2011).
- H.M. Deitel, P.J. Deitel and A.B. Goldberg, 2004. “Internet & World Wide Web How to Program”. 3rd Edition. Pearson Education Inc. Upper Saddle River, New Jersey.
- Peter Haggar, 2001. “Java bytecode: Understanding bytecode makes you a better programmer” [online]. IBM. Available from: http://www.ibm.com/developerworks/ibm/library/it-haggar_bytecode/ (accessed: October 28, 2011).
- Tony Patton, 2008. “Protect your JavaScript with obfuscation” [online]. TechRepublic. Available from: http://www.techrepublic.com/blog/programming-and-development/protect-your-javascript-with-obfuscation/762 (accessed: October 28, 2011).