Code signing is a common mechanism that authors of executable code use to assert their authorship of that code and to provide integrity assurance to the users of the code that an unauthorized third party has not subsequently modified the code in any way. Code signing is widely used to protect software that is distributed over the Internet. It is also widely used for mobile code security, being a core element of the mobile code security systems of both Microsoft's ActiveX and JavaSoft's Java applet systems. Despite this widespread use, common misunderstandings have arisen concerning the actual security benefits provided by code signing. This article addresses this issue. It explains how code signing works, including its dependence upon underlying Public Key Infrastructure (PKI) technologies.
Motivation for Code Signing
Code signing, which is also known as object signing in certain programming environments, is a subset of electronic document signing. In many ways code signing is a simplification of the more generic technology in that generally only a single signature is permitted and that signature pertains to the entire file. That is, code signing usually does not support multiple signatures, encryption of (data) content, dynamic data placement, or sectional signing, which are commonly available in many document-signing systems. As a result, code signing provides only authenticity and integrity for electronic executable files; it does not provide privacy, authentication, or authorization, which are supported by several electronic document-signing approaches.
A signature provides authenticity by assuring users as to where the code came from—who really signed it. If the certificate originated from a trusted third-party Certificate Authority (CA), then the certificate embedded in the digital signature as part of the code-signing process provides the assurance that the CA has certified that the code signer is who he or she claims to be. Integrity occurs by using a signed hash function as evidence that the resulting code has not been tampered with since it was signed.
In the pre-Internet era, software was distributed in a packaged manner via branding or trusted sales outlets. It frequently came in a shrink wrapped form directly from the vendor or a trusted distributor. In the Internet era, software is often distributed via the Web, by e-mail, or by file transfer. Code signing provides users with a similar level of assurance as to software authenticity in this comparatively anonymous—and comparatively insecure—new distribution paradigm as was previously offered by packaged software in the pre-Internet era.
In all cases, what is assured is the authorship of the software, including the verification that third parties have not subsequently modified the code. In no case does the user receive any assurance that the code itself is safe to run or actually does what it claims. Thus, the actual value of code signing remains a function of the reliability and integrity of its author. Code signing, therefore, is solely a mechanism for software creators to assert their authorship of the product and validate that it has not been modified. In no case does it provide the end user with any claim as to the quality, intent, or safety of the code.
How Code Signing Works
Code signing appends a digital signature to the executable code itself. This digital signature provides enough information to authenticate the signer as well as to ensure that the code has not been subsequently modified.
Code signing is an application within a PKI system. A PKI is a distributed infrastructure that supports the distribution and management of public keys and digital certificates. A digital certificate is a signed assertion (via a digital signature) by a trusted third party, known as the Certificate Authority (CA), which correlates a public key to some other piece of information, such as the name of the legitimate holder of the private key associated with that public key. This binding is then used to establish the identity of that individual. All system participants can verify the name-key binding of any presented certificate by merely applying the public key of the CA to verify the CA digital signature. This verification process occurs without involving the CA.
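The CA-signed name-key binding described above can be illustrated with a minimal Python sketch. It does not reflect how any real PKI product encodes certificates: the dictionary layout, the function names (make_keypair, issue_certificate, verify_certificate), and the toy RSA parameters (publicly known Mersenne primes, no padding) are all assumptions chosen so that the example runs with only the standard library. It shows a CA signing a binding and a participant checking it with nothing but the CA's public key.

```python
import hashlib
import json

E = 65537  # conventional RSA public exponent


def make_keypair(p, q, e=E):
    """Return ((e, n), (d, n)) toy RSA keys. INSECURE: illustration only."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (e, n), (d, n)


def digest_int(data: bytes) -> int:
    """SHA-256 digest of data, interpreted as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def encode(body: dict) -> bytes:
    """Deterministic byte encoding of the certificate body (assumed format)."""
    return json.dumps(body, sort_keys=True).encode()


def issue_certificate(subject, subject_public, ca_name, ca_private):
    """The CA binds a name to a public key by signing the pair."""
    body = {"subject": subject, "public_key": list(subject_public), "issuer": ca_name}
    d, n = ca_private
    return {"body": body, "ca_signature": pow(digest_int(encode(body)), d, n)}


def verify_certificate(cert, ca_public):
    """Any participant can check the binding using only the CA public key."""
    e, n = ca_public
    return pow(cert["ca_signature"], e, n) == digest_int(encode(cert["body"]))


# Hypothetical CA and developer key pairs built from known Mersenne primes.
ca_public, ca_private = make_keypair(2**521 - 1, 2**607 - 1)
dev_public, dev_private = make_keypair(2**127 - 1, 2**89 - 1)

cert = issue_certificate("Example Developer", dev_public, "Example CA", ca_private)
print(verify_certificate(cert, ca_public))
```

Note that verify_certificate never contacts the CA; it needs only the CA public key, which matches the offline verification property described in the text.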
The term public key reflects the fact that the cryptographic underpinnings of PKI systems rely upon asymmetric ciphers that use two related but different keys: a public key, which is generally known, and a private key, which should be known only by the legitimate holder of the key pair. This approach is known as public-key cryptography and contrasts directly with symmetric ciphers, which require the two entities to share an identical secret key in order to encrypt or decrypt information.
The certificates used to sign code can be obtained in two ways: They are either created by the code signers themselves by using one of the code-signing toolkits or obtained from a CA. The signed code itself reveals the certificate origin, clearly indicating which alternative was used. The preference of code-signing systems (and of the users of signed code) is that the certificates come from a CA, and CAs, to earn the fee they charge for issuing certificates, are expected to perform "due diligence" to establish and verify the identity of the individual or institution identified by the certificate. As such, the CA stands behind (validates) the digital certificate, certifying that it was indeed issued only to the individual (or group) identified by the certificate and that the identity of that individual (or group) has been verified as stated. The CA then digitally signs the certificate in order to formally bind this verified identity with a given private and public key pair, which is logically contained within the certificate itself. This key pair will subsequently be used in the code-signing process. Self-created certificates, by contrast, are unconstrained as to the identities they may impersonate.
Figure 1: Code-Signing Process
Code signing itself is accomplished as follows: Developers use a hash function on their code to compute a digest, which is also known as a one-way hash. The hash function securely compresses code of arbitrary length into a fixed-length digest result. The most common hash function algorithms used in code signing are the Secure Hash Algorithm (SHA), Message Digest Algorithm 4 (MD4), or MD5. The resulting length of the digest is a function of the hash function algorithm, but a common digest length is 128 bits. The digest is then encrypted using the developer's private key, which corresponds to the public key contained in the developer's certificate. A package containing the encrypted digest and the developer's digital certificate is encapsulated into a special structure called the signature block. The signature block is then appended to the executable code to form the signed code.
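As a rough illustration of the signing steps just described, the following Python sketch hashes the code, transforms the digest with a private key, and bundles the result with a certificate into a signature block. Everything here is an assumption made for the example: the names sign_code and package, the dictionary layout of the signature block, the use of SHA-256 rather than the MD4 or MD5 algorithms named above, and the toy unpadded RSA key built from publicly known Mersenne primes. Real code-signing systems use their own formats and properly generated keys.

```python
import hashlib

# Toy RSA key pair. INSECURE: both prime factors are public knowledge.
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def sign_code(code: bytes, private_key, certificate: dict) -> dict:
    """Build a signature block: signed digest plus the signer's certificate."""
    d, n = private_key
    # Compress code of arbitrary length into a fixed-length digest.
    digest = int.from_bytes(hashlib.sha256(code).digest(), "big")
    return {
        "digest_algorithm": "sha256",
        # "Encrypting" the digest with the private key (textbook RSA).
        "signed_digest": pow(digest, d, n),
        "certificate": certificate,
    }


def package(code: bytes, signature_block: dict) -> dict:
    """Attach the signature block to the code to form the signed code."""
    return {"code": code, "signature_block": signature_block}


# Hypothetical certificate carrying the public half of the key pair.
certificate = {"subject": "Example Developer", "public_key": (E, N)}
code = b"print('hello')"
signed = package(code, sign_code(code, (D, N), certificate))
print(sorted(signed["signature_block"]))
```

Only the public key travels in the certificate; the private key (D, N) stays with the developer and is used solely to produce the signed digest.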
In a Java context, the signed Java byte code is called a JAR file. First introduced in the Java Developer's Kit (JDK) version 1.1, this capability was greatly expanded with Java 2.
Figure 2: Code Verification Process
At some subsequent time, this signed code will be presented to a recipient, usually through the agency of a code-signing verification tool on the recipient's computer. This tool will inspect the signature block to verify the authenticity and integrity of the received code. This inspection is done in the following manner, as shown in Figure 2:
1. The certificate is inspected from the signature block to verify that it is recognizable to the code-signing verification system as a correctly formatted certificate.
2. If it is, the certificate identifies the hash function algorithm that was used to create the signed digest within the received signature block. With this information, the same hash algorithm code that was used to create the original digest is then applied to the received executable code, creating a digest value, which then is temporarily stored. If it is not a correctly formatted certificate, then the code-signing verification process fails.
3. The signed digest value is then taken from the signature block and decrypted with the code signer's public key, revealing the digest value, which was originally computed by the code signer. Failure to successfully decrypt this signed digest value indicates that the code signer's private key was not used to create the received signature. If this is the case, then that signature is a fraud and the code-signing verification process fails.
4. The recomputed digest of Step 2 is then compared to the received digest that was decrypted in Step 3. If these two values are not identical, then the code has subsequently been modified in some way and the code-signing verification process fails. If any such anomaly occurs, then the verification system alerts the recipient concerning the nature of the failure, indicating that the resulting code is suspect and should not be trusted. However, if the digests are identical, then the identity of the code signer is established.
5. If establishment occurs, then the code signer's certificate is copied from the signature block and presented to the recipient. The recipient then has the option to indicate whether or not he or she trusts the code signer. If so, then the code is executed. If not, then it is not executed.
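The five verification steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the behavior of any particular verification tool: the function name verify_signed_code, the dictionary layout of the signature block (which here carries the digest algorithm name itself), the user_trusts callback standing in for the prompt shown to the recipient, and the toy unpadded RSA key are all hypothetical. With textbook RSA there is no separate "decryption failure", so a signature made with the wrong key surfaces as a digest mismatch in Step 4.

```python
import hashlib

SUPPORTED_DIGESTS = {"sha256", "sha512"}


def well_formed(cert) -> bool:
    """Step 1: is this recognizable as a certificate (in our assumed layout)?"""
    return (isinstance(cert, dict)
            and isinstance(cert.get("subject"), str)
            and isinstance(cert.get("public_key"), (tuple, list))
            and len(cert["public_key"]) == 2)


def verify_signed_code(code: bytes, block: dict, user_trusts):
    """Return (accepted, reason) for a piece of signed code."""
    cert = block.get("certificate")
    if not well_formed(cert):
        return False, "malformed certificate"
    # Step 2: recompute the digest with the algorithm the signer used.
    algorithm = block.get("digest_algorithm")
    if algorithm not in SUPPORTED_DIGESTS:
        return False, "unsupported digest algorithm"
    recomputed = int.from_bytes(hashlib.new(algorithm, code).digest(), "big")
    # Step 3: recover the signer's digest with the signer's public key.
    signed_digest = block.get("signed_digest")
    if not isinstance(signed_digest, int):
        return False, "missing signed digest"
    e, n = cert["public_key"]
    recovered = pow(signed_digest, e, n)
    # Step 4: compare the two digests.
    if recovered != recomputed:
        return False, "digest mismatch"
    # Step 5: present the signer to the recipient, who decides on trust.
    if not user_trusts(cert["subject"]):
        return False, "signer not trusted by recipient"
    return True, "verified"


# Demonstration with a toy key pair (INSECURE: known Mersenne prime factors).
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

code = b"print('hello')"
digest = int.from_bytes(hashlib.sha256(code).digest(), "big")
block = {
    "digest_algorithm": "sha256",
    "signed_digest": pow(digest, D, N),
    "certificate": {"subject": "Example Developer", "public_key": (E, N)},
}
print(verify_signed_code(code, block, lambda subject: True))
print(verify_signed_code(code + b" # patched", block, lambda subject: True))
```

Even a successful result establishes only who signed the code and that it is unmodified; the final decision to run it still rests on whether the recipient trusts that signer.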
Types of Code Signing
Code signing is a mechanism to sign executable content. The term executable content refers to presenting executable programs in a manner so that they could be run locally—regardless of whether the executable file originated locally or remotely. Code signing is commonly used to identify authorship within several distinct usage scenarios:
- Applications can be code signed to identify their ownership within comparatively anonymous software distribution mechanisms using the Web, the File Transfer Protocol (FTP), or e-mail. This type of code signing establishes the origin for downloadable JAR, tar, zip, or CAB file software distributions, for example.
- Device drivers can be code signed to inform an operating system of the authorship of that driver. For example, the device drivers for Windows 98, Windows ME, and Windows 2000 operating systems should preferentially be certified by Microsoft's device driver certification laboratory. The entity signs the device driver executable in order to certify that the device driver in question has indeed been successfully demonstrated by a Microsoft certification laboratory to correctly run on that operating system.
- A recent news report has stated that Microsoft will be using code signing as a security mechanism within its forthcoming Windows XP operating system. The article stated: "Microsoft is to incorporate a 'signed application' system in Whistler [that is, Windows XP], the intention being to furnish users with a super-secure mode of operation that just plain stops [unsigned] code executing on the machine."
Code Signing Does Not Provide Total Security
A fundamental problem with code signing is that it cannot provide any guarantee about the good intentions of the signer or the quality, intent, operations, or safety of the code. The VeriSign and Thawte CAs, for example, combat this limitation somewhat for executables signed by certificates they issue by requiring the entities receiving their certificates to sign a "software publisher's pledge" not to sign a piece of malicious software. If they subsequently learn of violations of this agreement, they ask the owner to correct the problem.
If the owner refuses, then they cancel the owner's digital certificate and potentially bring a lawsuit against the offender. The code-signing literature has documented that the latter has occurred at least once.
Another problem is that the digital signing by even a reputable entity can be forged if the private key of the signer becomes known. This forging can occur when the criminally minded exploit any of numerous potential vulnerabilities, including hacking into the key store on the signer's machine, carelessness on the part of the signer exposing this information, or an error in a CA PKI key distribution system.
Perhaps the best summary of these issues is provided by Schneier, who wrote:
"Code signing, as it is currently done, sucks. There are all sorts of problems. First, users have no idea how to decide if a particular signer is trusted or not. Second, just because a component is signed doesn't mean that it is safe. Third, just because two components are individually signed does not mean that using them together is safe; lots of accidental harmful interactions can be exploited. Fourth, 'safe' is not an all-or-nothing thing; there are degrees of safety. And fifth, the fact that the evidence of attack (the signature on the code) is stored on the computer under attack is mostly useless: The attacker could delete or modify the signature during the attack, or simply reformat the drive where the signature is stored." (Quoted from page 163 of Schneier's Secrets and Lies.)
Mobile Code Security
Mobile code security is a two-edged sword: it seeks to protect computer systems receiving potentially hostile mobile code and it also seeks to protect mobile code from potentially hostile users of those computer systems.
Code signing has emerged as a major adjunct to mobile code security. Because mobile code probably represents the dominant use of code signing that occurs today, this section examines how code signing assists mobile code security.
There is substantial and growing literature on mobile code security (for example, see  through ). The literature identifies four distinct approaches to mobile code security, together with a few hybrids that merge two or more methods. Each of the four approaches has an inherent trust model that identifies the assumptions upon which the approach is based. Rubin and Geer  list these four approaches as being:
- The sandbox approach, which restricts mobile code to a small set of safe operations. This is the historic approach used by Java applets. In the approach, each Java interpreter implementation attempts to adhere to a security policy, which explicitly describes the restrictions that should be placed on remote applets. "Assuming that the policy itself is not flawed or inconsistent, then any application that truly implements the policy is said to be secure. ...The biggest problem with the Java sandbox is that any error in any security component can lead to a violation of the security policy. ...Two types of applets cause most of the problems. Attack applets try to exploit software bugs in the client's virtual machine; they have been shown to successfully break the type safety of JDK 1.0 and to cause buffer overflows in HotJava. These are the most dangerous. Malicious applets are designed to monopolize resources, and cause inconvenience rather than actual loss." The trust model assumed by the sandbox approach is that the sandbox is trustworthy in its design and implementation but that mobile code is universally untrustworthy.
- In code signing, the client manages a list of entities that it trusts. When a mobile code executable is received, the client verifies that it was signed by an entity on this list. If so, then it is run; otherwise it does not run. This approach is most commonly associated with Microsoft's ActiveX technology. "Unfortunately, there is a class of attacks that render ActiveX useless. If an intruder can change the policy on a user's machine, usually stored in a user file, the intruder can then enable the acceptance of all ActiveX content. In fact, a legitimate ActiveX program can easily open the door for future illegitimate traffic, because once such a program is run, it has complete access to all of the user's files. Such attacks have been demonstrated in practice." The trust model for this approach assumes that it is possible to distinguish untrustworthy authors from trustworthy ones and that the code from trustworthy authors is dependable.
- The firewalling approach involves selectively choosing whether or not to run a program at the very point where it enters the client domain. "Research shows that it may not always be easy to block unwanted applets while allowing other applets ... to run. The firewalling approach assumes that applets can somehow be identified. ...This approach is fundamentally limited, however, by the halting problem, which states that there is no general-purpose algorithm that can determine the behavior of an arbitrary program."
A related and more viable alternative is the playground architecture that has been used to separate Java classes that prescribe graphics actions from all other actions. The former are loaded on the client, whereas the latter are loaded on a "sacrificial" playground machine for execution and then reporting of the results to the browser. Because this approach requires byte-code modification, it cannot be used in conjunction with the usual approach to code signing.
- The Proof-Carrying Code (PCC) technique is a theoretical approach that statically checks code to ensure that it does not violate safety policies. "PCC is an active area of research so its trust model may change. At present, the design and implementation of the verifier are considered trustworthy but mobile code is universally untrustworthy."
The most common hybrid approach occurs for Java's JDK 1.1 and Java 2. Each combines the sandbox approach, which was the security mechanism for JDK 1.0, with code signing. This hybrid originated from the realization that the inherent restrictions of the sandbox model kept applications from doing "interesting and useful things." Therefore, a mechanism for running applications outside of the sandbox, code signing, was devised to supplement the sandbox-based original. Specifically, in JDK 1.1 a signed applet enjoys unlimited access to system resources, just like local applications do, provided that the corresponding public key is trusted in the executing environment. This system evolved within Java 2 to optionally provide a consistent and flexible policy for applets and applications, determined by the policies established within a protection domain.
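The difference between the pure trust-list model and the JDK 1.1 hybrid can be made concrete with a short Python sketch. It models only the decision logic described above; the names Access, trust_list_allows, and jdk11_style_access are hypothetical, and signature_valid is assumed to be the outcome of a separate signature check. It does not reproduce the actual Java or ActiveX implementations.

```python
from enum import Enum


class Access(Enum):
    SANDBOX = "restricted to sandbox"
    FULL = "unrestricted"


def trust_list_allows(signer, signature_valid, trusted_signers):
    """Pure code-signing model: run only code validly signed by a listed entity."""
    return bool(signature_valid) and signer in trusted_signers


def jdk11_style_access(signer, signature_valid, trusted_signers):
    """Hybrid model: trusted signed code leaves the sandbox; all else stays in it."""
    if trust_list_allows(signer, signature_valid, trusted_signers):
        return Access.FULL
    return Access.SANDBOX


trusted = {"Example Developer"}  # hypothetical client trust list
print(jdk11_style_access("Example Developer", True, trusted))
print(jdk11_style_access(None, False, trusted))  # unsigned applet
```

The sketch also makes the weakness quoted earlier easy to see: anything that can add an entry to the trusted set changes the outcome for all future code from that signer.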
The literature is unanimous that the net result of this hybrid version "introduces the same security problems [as those] inherent in the ActiveX code-signing approach." For this reason, Bernard Cole has stated that "neither [the sandbox nor the code signing] model is appropriate to the new environment of small information appliances, connected embedded devices, numerous web-enabled wireless phones and set-top boxes." Indeed, several articles (for example, perhaps the best collection is contained in ) contain worrying descriptions of how to compromise specific sandbox and code-signing products.
The literature (see  through ) is also clear that despite the demonstrable weaknesses of both the sandbox and code-signing approaches as mechanisms for securing mobile code, they are the best practical alternatives available today. In the meantime, researchers are currently exploring enhanced mobile code security by making hybrids containing three—or all four—of the above mechanisms.
Researchers have also begun to investigate alternative techniques. For example, Zhao  reports that "Additional innovative authentication functions are needed for mobile code. One approach is to apply digital fingerprinting to authenticate mobile code. Analogous to 'biometric authentication' for access control, a digital fingerprint of mobile code is a unique authentication code that is an integral and intrinsic part of the thing being authenticated. It is placed into the mobile code during its development by using digital watermarking techniques."
Major Code-Signing Systems
Code-signing systems are often functions of specific applications. For example, Thawte  is a CA that provides the following certificate types:
- The Apple Developer Certificate is used by Apple MacOS-based application developers to sign software for electronic distribution.
- The JavaSoft Developer Certificate can be used with JavaSoft's JDK 1.3 and later to sign Web applets.
- A Marimba Channel Signing Certificate is used to sign Castanet channels on the Marimba platform.
- A Microsoft Authenticode Certificate is used with the Microsoft InetSDK developer tools to sign Web applets (for instance, ActiveX controls) as well as .CAB, .OCX, .CLASS, .EXE, .STL, and .DLL files, and other potentially harmful active content on Microsoft OS platforms. These Authenticode certificates work only with Microsoft IE 4.0 and later browsers.
- VBA Developer Certificates are identical to the Microsoft Authenticode certificates. They are used by developers to sign macros in Office 2000 and other VBA 6.0 environments.
- Netscape Code-Signing Certificates are used to sign Java applets, browser plug-ins, and other active content on the Netscape Communicator platform.
Despite this diversity, the clearly dominant code-signing systems today come from Microsoft, Netscape, and JavaSoft. Although these three systems generally adhere to the same set of standards, their approaches are highly diverse from each other. Each has its own certificate type. Each system approaches code signing with different orientations, goals, and expectations.
Although all code signing uses similar technology, interoperability problems currently impact code signing. These problems may originate from interoperability problems within the underlying PKI infrastructure, from certificate differences, or from different (vendor) approaches to code signing itself.
PKI Infrastructure Interoperability
The PKI Forum has identified ten impediments to the widespread adoption of PKI, the most significant being the "lack of interoperability" between PKI products. Because of this, the technical working group of the PKI Forum is currently concentrating on addressing PKI interoperability problems: "The Technical Working Group continues its focus on multi-vendor interoperability projects. Over the last six months, it has sponsored monthly interoperability 'bake-offs' based on the Certificate Management Protocol (CMP) standard, with participation from a growing number of vendors. In addition, two workshops have been held to date on application-level interoperability through the use of digital certificates, with remote testing ongoing. Looking forward, the Technical Working group plans to initiate two new interoperability projects in the areas of Smart Card/Token Portability and CA interoperability, and it will be defining a large-scale, multi-vendor interoperability project for public demonstration in the first quarter of 2001."
Numerous potential interoperability issues stem from the certificates themselves because certain certificates are themselves tied to specific types of applications.
However, not every certificate is a code-signing certificate. Rather, code-signing certificates are special certificates whose associated private keys are used to create digital signatures. In addition, the id-kp-codeSigning value within the extended key usage field of the certificate itself (see Section 4.2.1.13 of RFC 2459) needs to be set to indicate that the certificate can be used for code signing.
In any case, code-signing certificates must be packaged in the appropriate format (one of the Public-Key Cryptography Standards, PKCS), and the various code-signing approaches (for example, Microsoft, Netscape, JavaSoft) expect both the signing certificates and the code that is to be signed to conform to different file format requirements.
These differences between code-signing systems introduce opportunities for incompatibility, even if each approach otherwise rigorously adheres to the same basic certificate standards.
Not all certificates can be used to support all potential certificate uses, even if they originate from the same CA. For example, the Java Developer Certificates are not interoperable (exchangeable) with any other certificates at this time. Fortunately, it is possible to buy certificates that can be used for many (but not all) potential uses. For example, a single certificate can support Microsoft Authenticode, Microsoft Office 2000/VBA Macro Signing, Netscape Object Signing, Apple Code Signing, and Marimba Channel Signing.
Code Signing System Interoperability
Probably the least understood of the potential interoperability problems are those due to different vendor approaches to code signing itself. Perhaps McGraw and Felten have provided the best insight into code-signing system interoperability within Appendix A of their book Securing Java. Unfortunately, those insights were in regard to an earlier version of Java, which has evolved considerably since then.
Each of the three major code-signing systems (Microsoft, Netscape, JavaSoft) has its own certificates. Each provides its own certificate stores to house certificates within its system.
Each of the three systems supports mechanisms by which certificates may be exported from a given user's certificate store and imported into a different user's certificate store on the same or on a different machine. The Microsoft and Netscape systems also have provisions for importing certificates between code-signing systems.
Certificates are usually exported between PKI systems or certificate stores in the PKCS-12 format (.p12 files if Netscape or .pfx files if Microsoft Authenticode), which contains both certificate and key pair information within the same file. Certificates can also be exported in the PKCS-7 format (for example, .cer or .spc files).
The latter approach lacks information to permit the certificate to be used for code signing by the importing system unless the missing elements can be retrieved via other mechanisms.
The Netscape certificate utility (that is, signtool -L) indicates which of the certificates located within a certificate store can be used for code signing. By contrast, all certificates (except for those explicitly prohibited from doing code signing according to the provisions of RFC 2459 Section 4.2.1.13) within a Microsoft certificate store can be used for code signing within the Microsoft system. This means that a certificate that is unable to be used for code signing in a Netscape system can be imported into the Microsoft system and be successfully used for code signing there.
This difference stems from RFC 2459 Section 4.2.1.13, which deals with the extended key usage field. The relevant text of the standard is as follows:
"If the extension is flagged critical, then the certificate MUST be used only for one of the purposes indicated. If the extension is flagged non-critical, then it indicates the intended purpose or purposes of the key, and may be used in finding the correct key/certificate of an entity that has multiple keys/certificates. It is an advisory field and does not imply that usage of the key is restricted by the certification authority to the purpose indicated. Certificate using applications may nevertheless require that a particular purpose be indicated in order for the certificate to be acceptable to that application."
What has occurred is that Netscape has implemented its system such that certificates can be used only for the purposes specified in the extended usage field. Netscape does this for both critical and noncritical markings. Microsoft, by contrast, provides that restriction solely to certificates that have been marked "critical," permitting certificates without a critical marking to be used for any activity possible. Both approaches are legal, and both fully conform to the standard.
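The two readings of the extended key usage field can be contrasted in a few lines of Python. This is only a model of the behavior described in the preceding paragraph, not code from either vendor: the function names strict_policy and lenient_policy are hypothetical, and a certificate's extension is reduced to a set of purpose names plus a critical flag.

```python
CODE_SIGNING = "id-kp-codeSigning"


def strict_policy(purposes: set, critical: bool) -> bool:
    """Netscape-style reading: the purpose must be listed, whatever the marking."""
    return CODE_SIGNING in purposes


def lenient_policy(purposes: set, critical: bool) -> bool:
    """Microsoft-style reading: the list is enforced only when marked critical."""
    return (not critical) or CODE_SIGNING in purposes


# A hypothetical certificate whose non-critical extension lists only e-mail use.
email_only = {"id-kp-emailProtection"}
print(strict_policy(email_only, critical=False))
print(lenient_policy(email_only, critical=False))
```

The same certificate is rejected by the first function and accepted by the second, which is the import scenario described above: a certificate unusable for code signing in one system becomes usable in the other, with both behaviors conforming to the standard.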
Code Signing from an End User's Perspective
The results obtained when you try to execute signed code are a function of your underlying operating system, the browser you are using, and whether or not the executable is a Java applet. This should not be surprising, because similar differences also occur with unsigned code. For example, a Microsoft executable file will execute on a Microsoft Windows operating system but is unlikely to execute on operating systems that do not recognize that format. Similarly, a Java applet cannot be directly invoked on a Windows operating system, because that operating system does not recognize the .jar file extension. However, it will cleanly execute when accessed off of a Web page, regardless of the underlying operating system.
References
"A Closer Look at the E-signatures Law," by Linda Rosencrance, Computer World, October 5, 2000.
"Standards Issue Mars E-signature," by Jaikumar Vijayan and Kathleen Ohlson, Computer World, July 10, 2000.
"Mobile Code and Security," by Gary McGraw and Edward Felten, IEEE Internet Computing, Volume 2, Number 6, November/December 1998.
"Mobile Code Security," by Aviel Rubin and Daniel Geer, IEEE Internet Computing, Volume 2, Number 6, November/December 1998.
"Securing Systems Against External Programs," by Brant Hashii, Manoj Lal, Raju Pandey, and Steven Samorodin, IEEE Internet Computing, Volume 2, Number 6, November/December 1998.
"Secure Web Scripting," by Vinod Anupam and Alain Mayer, IEEE Internet Computing, Volume 2, Number 6, November/December 1998.
"Secure Java Class Loading," by Li Gong, IEEE Internet Computing, Volume 2, Number 6, November/December 1998.
"Mobile Code Security: Taking the Trojans out of the Trojan Horse," by Alan Muller, University of Cape Town, April 5, 2000.
"Understanding the keys to Java Security: The Sandbox and Authentication," by Gary McGraw and Edward Felten, JavaWorld Magazine, May 1997.
"Repair Program or Trojan Construction Kit?" by Greg Guerin, September 7, 1999.
"Security, Reliability Twin Concerns in Net Era," by Bernard Cole, Electrical Engineering Times, July 24, 2000.
"Java Security: From HotJava to Netscape and Beyond," by Drew Dean, Edward Felten, and Dan Wallach, Proceedings of 1996 IEEE Symposium on Security and Privacy, May 1996.
"Formal Aspects of Mobile Code Security," by Richard Drews Dean, PhD thesis, Princeton University, January 1999.
"A Flexible Security Model for Using Internet Content," by Nayeem Islam, Rangachari Anad, Trent Jaeger, and Josyula Rao, IBM Thomas J Watson Research Center, June 28, 1997.
Securing Java: Getting Down to Business with Mobile Code, by Gary McGraw and Edward Felten, ISBN 0-471-31952-X, John Wiley & Sons, 1999.
"Mobile Code: Emerging Cyberthreats and Protection Techniques," by Dr. Jian Zhao, Proceedings of the Workshop on Emerging Threats Assessment: Biological Terrorism, July 7-9, 2000, Dartmouth College, Hanover, NH.
Secrets and Lies: Digital Security in a Networked World, by Bruce Schneier, ISBN 0-471-25311-1, John Wiley and Sons, 2000.
Telephone conversation between Bob Moskowitz and Eric Fleischman on September 26, 2000.
E-mail correspondence between Joseph M. Reagle, Jr., of the W3C and Eric Fleischman on December 6, 2000.
[A longer version of this article can be obtained from the author.]
ERIC FLEISCHMAN has university degrees from Wheaton College (Illinois), the University of Texas at Arlington, and the University of California at Santa Cruz. He currently works in data communications security. He is employed as an Associate Technical Fellow by The Boeing Company. Eric was formerly employed by the Microsoft Corporation, AT&T Bell Laboratories, Digital Research, and Victor Technologies. He can be contacted at Eric.Fleischman@boeing.com