Saturday, October 29, 2011

Protecting Code

As the world shifts from compiled languages such as C, C++ and Pascal to scripting languages such as Python, Perl, PHP and JavaScript, the exposure of intellectual property (the source code) grows with it. Previously, “fat clients”, usually written in C and C++, were compiled into machine code executables. More modern applications written in Java consist of bytecode, which “is the intermediate representation of Java programs” (Peter Haggar, 2001) and which keeps much more of the original program structure than machine code does. The same applies to .NET applications, which can be disassembled using tools shipped with the .NET Framework SDK (such as ILDASM) and decompiled back into source code (Gabriel Torok and Bill Leach, 2003). With web technologies such as HTML, JavaScript and Cascading Style Sheets (CSS), the source has to be downloaded to the client in order to be executed by the web browser, so the end user has unrestricted access to the entire source code.
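
As a loose analogy in Python (not Java or .NET, which the cited articles cover), the standard `dis` module gives a feel for how much of a program survives compilation to bytecode. The function `apply_discount` and its business rule are made up for this sketch.

```python
import dis


def apply_discount(price, customer_years):
    """Hypothetical business rule used only for illustration."""
    loyalty_rate = 0.05 if customer_years > 3 else 0.0
    return price * (1 - loyalty_rate)


code = apply_discount.__code__

# Local variable names are stored in the compiled code object.
print(code.co_varnames)

# Literal constants are stored alongside the instructions as well
# (the exact contents of this tuple vary between Python versions).
print(code.co_consts)

# Human-readable listing of the bytecode instructions.
dis.dis(apply_discount)
```

The listing still shows the original identifiers and constants, which is the kind of information a decompiler relies on.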

The ability to access source code can serve both legitimate and malicious purposes. For example, security tools decompile Java applets and Flash files in order to perform “static analysis to understand their behaviours” (Telecomworldwire, 2009). Moreover, software developers can use disassembly for debugging. On the other hand, the same ability can be used to reverse engineer the source code, which directly undermines the protection of intellectual property.

One obvious way to try to protect the source code, and thus the intellectual property it carries, is to use obfuscation (Gabriel Torok and Bill Leach, 2003; Peter Haggar, 2001; Tony Patton, 2008). Regardless of the language used to develop the application, obfuscation usually means:

replacement of variable names with non-meaningful character strings
replacement of constants with expressions
replacement of decimal values with hexadecimal, octal and binary representations
addition of dummy functions and loops
removal of comments
concatenation of all lines of the source code
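
Several of the techniques listed above can be sketched by hand on a small Python function. The function `order_total`, its discount rule and the obfuscated twin `a` are invented for illustration; real obfuscators automate these transformations and typically go much further.

```python
def order_total(unit_price, quantity):
    """Readable original: 10% discount on orders of 100 units or more."""
    discount_threshold = 100
    discount_rate = 0.10
    total = unit_price * quantity
    if quantity >= discount_threshold:
        total -= total * discount_rate
    return total


# Hand-obfuscated equivalent: meaningless names, constants written as
# expressions in hexadecimal, octal and binary, a dummy loop, no docstring,
# and statements concatenated onto as few lines as Python allows.
def a(b, c):
    d = 0x32 << 1; e = (0o7 - 6) / 0xA; f = b * c
    for g in range(0b11): h = g ^ g
    if c >= d: f -= f * e
    return f


# Both versions compute the same result for the same input.
print(order_total(2.5, 120), a(2.5, 120))
```

The second function is harder to read, yet every step can be undone by evaluating the constant expressions and renaming the variables, which is why obfuscation only slows a determined reader down.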

In a way, the process of obfuscation changes the source code to make it difficult for the “reader” to understand the logic behind it. Obfuscation could be seen as “your kid sister encryption”, that is, “cryptography that will stop your kid sister from reading your files” (Bruce Schneier, 1996). Of course, a persistent “reader” can invest enough time and resources to reproduce the source code (deobfuscate it) by applying the obfuscation principles in reverse.

Bibliography

Telecomworldwire, 2009. 'HP unveils HP SWFScan free web security tool' 2009, Telecomworldwire (M2), Regional Business News, EBSCOhost, viewed 28 October 2011.
Bruce Schneier, 1996. “Applied Cryptography”. Wiley; 2nd Edition. Preface.
Gabriel Torok and Bill Leach, 2003. “Thwart Reverse Engineering of Your Visual Basic .NET or C# Code” [online]. Microsoft. Available from: http://msdn.microsoft.com/en-us/magazine/cc164058.aspx (accessed: October 28, 2011).
H.M. Deitel, P.J. Deitel and A.B. Goldberg, 2004. “Internet & World Wide Web How to Program”. 3rd Edition. Pearson Education Inc. Upper Saddle River, New Jersey.
Peter Haggar, 2001. “Java bytecode: Understanding bytecode makes you a better programmer” [online]. IBM. Available from: http://www.ibm.com/developerworks/ibm/library/it-haggar_bytecode/ (accessed: October 28, 2011).
Tony Patton, 2008. “Protect your JavaScript with obfuscation” [online]. TechRepublic. Available from: http://www.techrepublic.com/blog/programming-and-development/protect-your-javascript-with-obfuscation/762 (accessed: October 28, 2011).

Saturday, October 22, 2011

Adaptive Web Site Design

Paul De Bra (1999) identifies a number of issues related to adaptive web site design, including “the separation of a conceptual representation of an application domain from the content of the actual Web-site, the separation of content from adaptation issues, the structure and granularity of user models, the role of a user and application context”. This essay discusses the separation of conceptual representation and the role of the user in the application context, more than ten years after publication of the original article.

Modern web application development frameworks such as .NET, Spring Framework, JavaServer Faces, Apache Orchestra, Grails and Struts offer a clear separation between the application's representation and its content. The separation is achieved by implementing the Model-View-Controller (MVC) architecture, where the “Model” layer is responsible for storing and managing access to the relevant pieces of data, the “View” layer is responsible for rendering and laying out the data, and the “Controller” layer is responsible for interaction with the end user (i.e. the Internet browser). The entire content no longer has to be “stored” statically in the HTML page; instead, it is generated dynamically based on input received from the user. Moreover, the HTML5 Web Storage API greatly increases the storage capacity (compared to HTTP session cookies), which allows a web application to store structured data on the client side (WHATWG, 2011). This could further facilitate user-centric web site design, for example by storing user preferences or a data cache.
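
The division of responsibilities between the three layers can be sketched in a few lines of Python. This is a minimal illustration, not the API of any framework named above; the class names `ArticleModel`, `ArticleView` and `ArticleController` and the sample articles are assumptions made for the example.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class ArticleModel:
    """Model: stores the data and manages access to it."""
    articles: dict = field(default_factory=lambda: {
        "mvc": "Model-View-Controller separates data from presentation.",
        "storage": "Web Storage keeps structured data on the client.",
    })

    def find(self, slug):
        return self.articles.get(slug)


class ArticleView:
    """View: renders data as HTML and knows nothing about storage."""

    def render(self, slug, body):
        return f"<h1>{escape(slug)}</h1><p>{escape(body)}</p>"

    def render_not_found(self, slug):
        return f"<h1>Not found</h1><p>No article named {escape(slug)}.</p>"


class ArticleController:
    """Controller: turns a user request into a model lookup and a view call."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def handle(self, requested_slug):
        body = self.model.find(requested_slug)
        if body is None:
            return self.view.render_not_found(requested_slug)
        return self.view.render(requested_slug, body)


controller = ArticleController(ArticleModel(), ArticleView())
print(controller.handle("mvc"))
print(controller.handle("missing"))
```

Because the page is produced on each request from the model's data, changing the stored content or swapping the view for a different layout does not require touching the other layers.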

On the other hand, when discussing “the role of a user and application context” (Paul De Bra, 1999), the methodology and the technology are not as mature. Qiuyuan Jimmy Li ties the issue to the organization of the web application structure and notes that the majority of web sites do not adapt the content to the individual user. Instead, the web server “provides the same content that has been created beforehand to everyone who visits the site” (Qiuyuan Jimmy Li, 2007). As an alternative, he suggests a framework which accounts for users' cognitive styles and adapts the information content for each individual user. Justin Brickell et al. (2006) take a slightly different approach and suggest mining site access logs to identify access patterns and user behavior such as scrolling, time spent on each page, etc. The collected information could be used for shortcutting, the “process of providing links to users’ eventual goals while skipping over the in-between pages” (Brickell et al., 2006).
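
The idea of shortcutting can be illustrated with a simple frequency count over access logs. The sketch below is not the algorithm from the cited paper; the session data, the function name `suggest_shortcuts` and the rule that a session's last page is the visitor's goal are all simplifying assumptions.

```python
from collections import Counter

# Hypothetical sessions: the ordered pages each visitor requested.
sessions = [
    ["/home", "/products", "/products/laptops", "/products/laptops/x1"],
    ["/home", "/products", "/products/laptops", "/products/laptops/x1"],
    ["/home", "/about"],
    ["/home", "/support", "/support/contact"],
    ["/home", "/products", "/products/laptops/x1"],
]


def suggest_shortcuts(sessions, min_visits=2):
    """Return {entry_page: [goal_page, ...]} for frequently reached goals.

    A session's last page is treated as the visitor's goal, which is a
    simplifying assumption rather than the cited paper's method.
    """
    pair_counts = Counter()
    for pages in sessions:
        # Only count trips that had at least one in-between page to skip.
        if len(pages) >= 3:
            pair_counts[(pages[0], pages[-1])] += 1

    shortcuts = {}
    for (entry, goal), visits in pair_counts.most_common():
        if visits >= min_visits:
            shortcuts.setdefault(entry, []).append(goal)
    return shortcuts


print(suggest_shortcuts(sessions))
```

A site could use the returned mapping to add direct links on each entry page to the goals that visitors most often reach from it.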

In addition, it is important to highlight the security and privacy issues when discussing adaptive web site design. In order to provide customized content, a web application needs to collect personal data about individual users and their behavior patterns. For example, Google Gmail uses automated scanning and filtering technology to “show relevant ads” (Google, 2011). Some individuals could consider this an intrusion into their privacy, especially if the processed message contains sensitive information such as health records or financial information.

Bibliography

Google, 2011. “FAQ about Gmail, Security & Privacy” [online]. Available from: http://mail.google.com/support/bin/answer.py?hl=en&answer=1304609 (accessed: October 22, 2011).
H.M. Deitel, P.J. Deitel and A.B. Goldberg, 2004. “Internet & World Wide Web How to Program”. 3rd Edition. Pearson Education Inc. Upper Saddle River, New Jersey.
Justin Brickell, Inderjit S. Dhillon and Dharmendra S. Modha, 2006. “Adaptive Website Design using Caching Algorithms” [online]. Available from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.155.5537&rep=rep1&type=pdf (accessed: October 22, 2011).
Paul De Bra, 1999. “Design Issues in Adaptive Web-Site Development” [online]. Available from: http://wwwis.win.tue.nl/~debra//asum99/debra/debra.html (accessed: October 22, 2011).
Qiuyuan Jimmy Li, 2007. “Design and Implementation of a User-Adaptive Website with Information Pallets” [online]. Available from: http://dspace.mit.edu/bitstream/handle/1721.1/45636/367589980.pdf?sequence=1 (accessed: October 22, 2011).
WHATWG, 2011. “HTML – Web Storage” [online]. Available from: http://www.whatwg.org/specs/web-apps/current-work/multipage/webstorage.html#webstorage (accessed: October 22, 2011).