CS 2550 Foundations of Cybersecurity Cryptography The Science of Secrets Cryptography: the study of mathematical techniques to providing aspects of information security services Creating secrets Cryptanalysis: the study of mathematical techniques for attempting to defeat information security services Breaking secrets Cryptology: the study of cryptography and cryptanalysis Cryptographic Protocols Protocols that
Enable parties to Achieve goals to Overcome adversaries Need to understand Who are the parties and the context in which they act? What are the security goals of the protocols? What are the capabilities of the adversaries? Terminology Plaintext Ciphertext Plaintext Encryption
Decryption Encryption_Key Decryption_Key Parties The Good Guys The Bad Guys Alice Bob Carl Eve
Mallory Fundamental Goals Confidentiality no eavesdropping Integrity no unauthorized modifications Authenticity no spoofing or faking Non-repudiation no disclaiming of authorship Additional Goals of Modern Crypto Pseudo-random number generation Anonymity E-voting Secret sharing Zero-knowledge proof Secure multi-party computation Attacker Threat Model
1. Interaction with messages and the protocol Passive: only observes and attempts to decrypt messages Only threatens confidentiality Active: observes, modifies, or deletes messages Threatens confidentiality, integrity, and authenticity 2. Full knowledge of the chosen cryptographic algorithm Kerchhoffs Principle A cryptosystem should be secure even if everything about the system, except the key, is public knowledge Shannons Maxim The enemy knows the system No security through obscurity
Attacker Threat Model 3. Interaction with cipher algorithm Chosen-plaintext attack Attacker may choose a number of messages and obtain the ciphertexts for them Chosen-ciphertext attack Attacker may choose a number of ciphertexts and obtain the plaintexts Both attacks may be adaptive Choices may change based on results of previous requests 4. Computationally bounded Finite resources to calculate and store things No quantum computers Classical Cryptography
Approaches to Secure Communication Steganography covered writing hides the existence of a message depends on secrecy of method Cryptography hidden writing hide the meaning of a message depends on secrecy of a short key, not method Caesar Shift Simple symmetric substitution cipher Key is a number k To encrypt, shift each letter by k positions
To decrypt, shift each letter back by k positions HEY BRUTUS BRING A KNIFE TO THE PARTY K=3 KHB EUXWXV EULQJ D NQLIH WR WKH SDUWB Cryptanalysis of Shift Cipher Brute force: try all 25 possible keys (assuming English text) K = 0 and K = 26 dont make sense Lessons? Simple, exhaustive key search can be effective Key space needs to be large enough to prevent attack Monoalphabetic Substitution Cipher
Replace each letter X with (X) where is a ) where is a permutation In this cipher, the key is the permutation Key space is all possible permutations For English: 26! = 4*1026 HELLO WORLD A B C D E F G H I J K L M N O P Q R S T U V W = XC YA ZD O Z H W Y G B Q X L V T R N M S K J I P F E U YZXXT PTMXO Cryptanalysis of Monoalphabetic Substitution Dominates cryptography through the first millennium Exhaustive search is infeasible (26! = 4*1026 possible keys)
Frequency analysis Remember Al-Kindi from 800 AD? Frequency Analysis Human languages have patterns Frequency of letter usage Frequency of n-letter combinations These patterns survive substitution = B A D C Z H W Y G O Q X) where is a L V T R N M S K J I P F E U Cryptanalysis of Monoalphabetic Substitution
Dominates cryptography through the first millennium Exhaustive search is infeasible (26! = 4*1026 possible keys) Frequency analysis Remember Al-Kindi from 800 AD? Lessons? Use large blocks: instead of replacing ~6 bits at a time, replace 64 or 128 bits Leads to block ciphers like DES and AES Use different substitutions to prevent frequency analysis Leads to polyalphabetic substitution ciphers and stream ciphers Vigenre Cipher (1596) Main weakness of monoalphabetic substitution ciphers: Each letter in the ciphertext corresponds to only one letter in the plaintext Polyalphabetic substitution cipher
Given a key K = (k1, k2, , km) Shift each letter p in the plaintext by ki, where i is modulo m A B C D E F G H I J K 0 1 2 3 4 5 6 7 8 9 10 L M N O
P Q R S T U V W
X) where is a Y Z 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 Plaintext CRYPTOGRAPHY Key LUCKLUCKLUCK (Shift 11 20 2 10 11 20 2 11 ) Ciphertext NLAZEIIBLJJI Cryptanalysis of Vigenre Cipher Essentially a collection of shift ciphers One letter in ciphertext corresponds to multiple letters in plaintext Frustrates, but doesnt stop frequency analysis Cracking Vigenre (1854 or 1863) 1. Guess the key length x using Kasisky test or index of
coincidence 2. Divide the ciphertext into x shift cipher encryptions 3. Use frequency analysis on each shift cipher Kasisky Test Plaintext Key Ciphertext T H E S U N A N D T H E M A N I N T H E M O O N K I N G K I N G K I N G K I N G K I N G K I N G D P R Y E V N T N B U K W I A O X B U K W W B T Distance = 8
Repeating patterns (of length >2) in ciphertext are a tell Likely due to repeated plaintext encrypted under repeated key characters The distance is likely to be a multiple of the key length Cryptanalysis of Vigenre Cipher Cracking Vigenre (1854 or 1863) 1. Guess the key length x using Kasisky test of index of coincidence 2. Divide the ciphertext into x shift cipher encryptions 3. Use frequency analysis on each shift cipher Lessons? As key length increases, letter frequency becomes more random If key never repeated, Vigenre wouldnt be breakable One Time Pad (1920s)
Fix the vulnerability of the Vigenre cipher by using very long keys Key is a random string that is at least as long as the plaintext Similar encryption as with Vigenre (different shift per letter) Cryptanalysis of OTP Intuitively, the key is random, so ciphertext is also random OTP achieves Perfect Secrecy Shannon or Information Theoretic Security Basic idea: ciphertext reveals no information about plaintext An encryption over message space M is perfectly secure iff probability distribution over M message m M ciphertext c C for which P(c) > 0
we have P(PT=m | CT=c) = P(PT=m) Where PT is plaintext and CT is ciphertext In English The adversary believes the probability that the plaintext is m is P(PT=m) before seeing the ciphertext Maybe they are very sure, or maybe they have no idea The adversary believes the probability that the plaintext is m is P(PT=m | CT=c) after seeing that the ciphertext is c P(PT=m | CT=c) = P(PT = m) means that after knowing that the ciphertext is c, the adversarys belief does not change Intuitively, the adversary learned nothing from the ciphertext Put Another Way Imagine you have a ciphertext c where the length |c| = 1000
I can give you a key ki with |ki| = 1000 such that: The decrypted message mi is the first 1000 characters of Hamlet Or, I can give you a key kj with |kj| = 1000 such that: The decrypted message mj is the first 1000 characters of the US Constitution If an algorithm offers perfect secrecy then: For a given ciphertext of length n All possible corresponding plaintexts of length n are possible decryptions Cryptanalysis of OTP Intuitively, the key is random, so ciphertext is also random OTP achieves Perfect Secrecy Shannon or Information Theoretic Security Basic idea: ciphertext reveals no information about plaintext Caveats
If the length of the OTP key is less than the length of the message Its not a OTP anymore, not perfectly secret! If you reuse the OTP key Its not a OTP anymore, not perfectly secret! Major issue with OTP in practice? How to securely distribute the key books to both parties Symmetric Block Ciphers Symmetric Key Cryptography Algorithms that use a single key for encryption and decryption i.e. the algorithm is reversible k m Dk(Ek(m)) = m where m is a message, k is a key, and Dk and Ek are decryption and encryption using k
Historic examples: Caeser shift, mono and polyalphabetic substitution, OTP Modern examples: DES, 3DES, RC4, Blowfish, Twofish, AES Warning: many of these methods are known to be vulnerable Why Block Ciphers? One way to defeat frequency analysis Use different keys in different locations Examples: OTP, stream ciphers Another way Make the unit of transformation larger Instead of encrypting letter by letter (~6 bits), encrypt block by block (n bits) Data Encryption Standard (DES)
Designed by IBM, with modifications proposed by the NSA US national standard from 1977 to 2001 Block size is 64 bits Key size is 56 bits Has 16 rounds Designed mostly for fast implementation in hardware Software implementation is somewhat slow Considered insecure now Vulnerable to brute-force attacks, key too short Advanced Encryption Standard (AES) In 1997, NIST made a formal call for algorithms stipulating that the AES would specify an unclassified, publicly disclosed encryption algorithm, available royalty-free, worldwide Goal: replace DES for both government and private-sector encryption.
The algorithm must implement symmetric key cryptography as a block cipher and (at a minimum) support block sizes of 128-bits and key sizes of 128-, 192-, and 256-bits. In 1998, NIST selected 15 AES candidate algorithms. In 2000, NIST selected Rijndael (invented by Joan Daemen and Vincent Rijmen) as the AES AES Features Designed to be efficient in both hardware and software Block size: 128 bits Variable key size: 128, 192, or 256 bits No known weaknesses AES Example Alice Eavesdropper
KAES Bob KAES M EKAES(M) M EKAES(M) Need for Encryption Modes A block cipher encrypts only one block But a message may be longer than one block Need a way to extend the algorithm to encrypt arbitrarily long
messages Need to ensure that if block cipher is secure, then whole encryption is secure Unit test vs. integration test Whole operation should be IND-CPA secure if block cipher is IND-CPA secure ECB Encryption Mode Message is broken into independent blocks Electronic Code Book (ECB): each block is encrypted separately Key Plaintext 1 Plaintext 2 Block Cipher
Encryption Block Cipher Encryption Ciphertext 1 Key Ciphertext 2 Plaintext n Key
Block Cipher Encryption Ciphertext n Cryptanalysis of ECB Mode Deterministic The same data block always gets encrypted the same way Reveals patterns when data repeats! m encrypted with k always produces the same c This is the same problem we had with the Vigenre cipher Do not use ECB mode in practice
CBC Encryption Mode Cipher Block Chaining (CBC) Uses a random Initialization Vector (IV) Block i depends on block i-1 Plaintext 1 Plaintext 2 Block Cipher Encryption Block Cipher Encryption is exclusive bitwise OR (X) where is a OR)
Plaintext n IV K Ciphertext 0 Ciphertext 1 K Ciphertext 2
K Block Cipher Encryption Ciphertext n Cryptanalysis of CBC Mode CBC randomizes the encryption IV ensures initial block is randomized Dependency between blocks propagates randomness CBC is IND-CPA assuming Block cipher itself is secure IV is truly random IV is sufficiently large
Usage in practice: choose random IV and protect its integrity The IV is not secret (it becomes part of the ciphertext) Do not let the adversary control the IV Computational Security Towards Computational Security Perfect secrecy is too difficult to achieve in practice Imagine trying to do OTP encryption with every website that uses HTTPS Computational security uses two relaxations: 1. Security is preserved only against computationally bounded adversaries Limits on computational power and storage No quantum computers 2. Adversaries may successfully crack encryption with a very small probability So small that (we hope) it becomes negligible
IND-CPA Security Ciphertext Indistinguishability under a Chosen-Plaintext Attack Round 1: choose k and encryption algo k, Ek m0 , m1 M Round 2: choose two plaintext messages Round 3: choose a random binary number Round 4: encrypt the corresponding message b R {0, 1} c = Ek(mb)
b' {0, 1} Round 5: guess a the value of b Adversary wins if b = b Analyzing the IND-CPA Game If E is a perfectly secure algorithm, what is the probability that b = b? P(Mallory wins) = If E is a Caesar shift, what is the probability that b = b? k, Ek m0 , m1 M
b R {0, 1} c = Ek(mb) b' {0, 1} P(Mallory wins) = 1 Adversary wins if b = b Analyzing the IND-CPA Game If E is computationally secure algorithm, what is the probability that b = b? P(Mallory wins) = + negligible(|k|) k, Ek
m0 , m1 M b R {0, 1} c = Ek(mb) b' {0, 1} Adversary wins if b = b Intuition of IND-CPA Security Information theoretic (perfect) security means Given m0 and m1 P(CT=c | PT=m0) = P(CT=c | PT=m1) for any adversary Computational security means Given m0 and m1
P(CT=c | PT=m0) P(CT=c | PT=m1) for a computationally bounded adversary Computational security is the foundation of all modern cryptography Crash Course in Computational Complexity (define (add2 num) (+ 2 num)) O(1) Constant Time Assuming a list of length n O(n) Linear Time
(define (sum lon) (cond [(empty? lon) 0] [(cons? lon) (+ (first lon1) (sum (rest lon)))])) Crash Course in Computational Complexity Two lists of length n O(n2) Quadratic Time (define (make-list-of-sums lon1 lon2) (cond [(empty? lon1) ()] [(cons? lon1) (append (helper (first lon1) lon2) (make-list-of-sums (rest lon1) lon2))])) (define (helper num lon2) (cond [(empty? lon2) ()]
[(cons? lon2) (cons (+ num (first lon2)) (helper num (rest lon2)))])) Crash Course in Computational Complexity Quicksort O(n log n) O(2n) Exponential Time (define (fibonacci n) (if (<= n 1) n (+ (fibonacci (- n 2)) (fibonacci (- n 1))))) Applications to Cryptography O(1) < O() < O(n) < O(n log n) < O(n2) < O(n3) < < O(2n) < O(n!) < When we desire computational security, we want algorithms where:
Verifying a given solution can be done in polynomial time or faster Computing a novel solution takes O(2|k|) time or worse This is what we mean by a computationally bound adversary Given infinite computers, a O(2|k|) time problem can be solved Public Key Cryptography Weakness of Symmetric Key Crypto How do you securely exchange keys with someone? Easy(ish) to do if you can meet them in person However, the Internet is untrusted You cant exchange shared secrets over an untrusted medium Eavesdropper Alice
KAES KAES Bob Public Key Cryptography Public key cryptography, a.k.a. asymmetric cryptography Each principal has two keys: private (secret) and public A message encrypted with one key must be decrypted by the other Thus, the public key can be sent in-the-clear over the Internet Security is based on Very Hard Math Problems O(1) time to verify a given solution for a given instance Exponential time to check all possible solutions for a given instance Many different algorithms that offer different security properties
Diffie-Hellman, RSA, Goldwasser-Micali, ElGamal Forms the basis for most modern secure protocols IPsec, SSL, TLS, S/MIME, PGP/GPG, etc. Diffie-Hellman Really should be called Diffie-Hellman-Merkle Ralph Merkle developed the mathematical theories Whitfield Diffie and Martin Hellman developed the protocol Security is based on the discrete logarithm problem Compute k such that , where b, g, and k are all integers Possible that no solution exists given arbitrary b and g Best known algorithms are exponential time Diffie-Hellman Protocol Red = secret, blue = public
1. Alice and Bob agree on a prime p and a base g which is primitive root modulo p 2. Alice chooses a secret integer a; Bob chooses secret b 3. Alice Bob: ; Bob Alice: 4. Alice computes ; Bob computes 5. Alice and Bob now share secret key s Diffie-Hellman Example Eavesdropper Alice Knows p = 23, g = 5 aa == 6 6 B = 19
B = 19 Doesnt Know Knows p = 23, g = 5 b b == ?? Doesnt Know aa == ?, ?, b b == ?? A A == 8,
8, B B == 19 19 Calculating s requires solving for a or b, which is the discreet logarithm problem Bob Knows p = 23, g = 5 b b == 15 15 A=8 A=8
Doesnt Know aa == ?? More On Diffie-Hellman Diffie-Hellman can be used for public key cryptography Alices public key is <, , >, private key is a Diffie-Hellman is the basis for protocols with Perfect Forward Secrecy a and b must be thrown away at the end of the session Dont worry if you have never heard of PFS RSA Invented by Rivest, Shamir, and Adleman in 1978 Equivalent system invented by Clifford Cox in 1973, but GCHQ classified it RSA is the dominant public key cryptosystem today for historical reasons
Algorithm was commercialized by RSA Security RSA Security created a certificate authority that eventually became Verisign RSA Algorithm Security is based on the difficulty of factoring the product of primes Alice chooses two secret primes p and q, n = pq Choose e such that 1 < e < n (p + q 1), and gcd(e, n (p + q 1)) = 1
Compute ciphertext C = Me % n To decipher, compute Cd % n = (Me % n)d % n = Med % n = M RSA Examples p = 11, q = 7, n = pq = 77, n (p + q 1) = 60 e = 37, d = 13 (ed = 481, ed % 60 = 1) If M = 15 then C = Me % n = 1537 % 77 = 71 Cd % n = M = 7113 % 77 = 15 Attacks Against RSA The length of n=pq reflects the strength 700-bit n factored in 2007 768 bit factored in 2009 1024 bit for minimal level of security today Likely to be breakable in near future
2048 bits recommended for current usage RSA encryption/decryption speed is quadratic in key length Factoring is easy with a quantum computer Public Key Crypto Example Eavesdropper Alice Sa Pa KAES M Bob M
Pa E KAES(M) E Pa(KAES) E KAES(M) Brand new AES symmetric key KAES E KAES(M) E Pa(KAES) Why bother with the symmetric key? Why not just encrypt M with Pa? Performance
Asymmetric crypto is slow, symmetric is fast Use asymmetric for K (which is small) Use symmetric for M (which is large) Chosen Plaintext Attacks Problem: RSA and Diffie-Hellman are deterministic They do not include any randomness by default Given a message M and public key P, the encrypted message E P(M) will always be the same Attackers can decrypt messages using a chosen plaintext attack Attacker constructs a set of likely messages M For each , if E P(m) == E P(M) then m is the plaintext Practical implementations of RSA must carefully pad messages Adds randomness to M that prevents chosen plaintext attacks Also prevents other classes of attacks
Digital Signatures and Authentication Cryptographic Hash Functions Cryptographic hash function transform input data into scrambled output data Arbitrary length input fixed length output Deterministic: hash(A) = hash(A) High entropy: md5(security) = e91e6348157868de9dd8b25c81aebfb9 md5(security1) = 8632c375e9eba096df51844a5a43ae93 md5(Security) = 2fae32629d4ef4fc6341f1751b405e45 Collision resistant Locating A such that hash(A) = hash(A) takes a long time Example: 221 tries for md5
Well Known Hash Functions MD5 Outputs 128 bits Collision resistance totally broken in 2004 SHA1 Outputs 160 bits Partially broken: method exists to find collisions in 280 tries Deprecated SHA2 (SHA-224, SHA-256, SHA-384, SHA-512) SHA-224 matches the 112 bit key length of 3DES SHA-256, SHA-384, SHA-512 match the key lengths of AES (128, 192, 256 bits) Considered safe The Future: SHA3
2007: NIST opens competition for new hash functions 2008: Submission deadline, 64 entries, 51 make the cut 2009: 14 candidates move to round 2 2010: 5 candidates move to round 3 2011: final round of public comments 2012: NIST selects keccak (pronounced catch-ack) as SHA3 Created by Guido Bertoni, Joan Daemen, Gilles Van Assche, Michal Peeters Digital Signatures Bob Alice Sa Pa M M
H(M) M E Sa(H(M)) H(M) ?= H(M) What can you infer about a signed message? The holder of Sa must have produced the signature, since Pa decrypts the hash The message was not modified, otherwise the hash would not match Encryption vs. Signatures Public Key Encryption What does encryption give you? Confidentiality only the holder of the private key can read the message Integrity if the message is modified, it will no longer decrypt properly
What does encryption not give you? Authentication you have no idea who used your public key to encrypt the message Digital Signatures What do signatures give you? (Weak) Authentication only the holder of the private key could have signed the message Integrity if the message is modified, the signature will be invalid What do signatures not give you? Confidentiality the message is not
encrypted, its public Authentication Does public key cryptography provide authenticity guarantees? Yes if you obtain Alices public key through a secure, out-of-band exchange No if you obtain Alices key via an untrusted network The Man-in-the-Middle Attack Bob has no way of knowing that Pe does not belong to Alice Attacker Alice Sa
Pa Se Pe Bob M KAES EKAES (M) E Pe(KAES) E KAES(M) E KAES(M) KAES
KAES M M Alice has no idea message has been compromised EKAES (M) E Pa(KAES) Total compromise! The attacker can read, modify, or drop the message Signing Public Keys The only way to authenticate a public key is to rely on trust One or more third-party, trusted principals vouch for Alices key by signing it Bob can verify the signatures using the public keys of the trusted third-parties If Alices key is not signed, maybe Bob should not trust it
Question: who do you trust? 1. Web of trust: a social network of private individuals who sign each others keys OpenPGP keys 2. Certificate authorities: companies that verify individuals and sign public keys for a fee X) where is a .509 certificates (more on these later) PGP and GPG Pretty Good Privacy (PGP) Phil Zimmermann, 1991 Widely used open source encryption software Helped originate the OpenPGP standards for key exchange and digital signatures Supports RSA keypairs as well as many symmetric cypher suites Gnu Privacy Guard (GPG) 1999
Open source implementation of the OpenPGP standards Installed by default on most Linux/Unix/BSD systems Fred Web of Trust Pf Bob e Is this Freds key? No way to tell, none of my friends can vouch for it. Pb
Alice a Is this Erics key? Maybe. Dave is two hops away, but I dont know Fred. Eric Pa b Pe Dave
d d Pd Chris b c Is this Daves key? Bob and Chris say it is, and I trust them, so I trust this key. c e
Pc a d f Public Key Infrastructure (PKI) PVerisign The whole chain is valid because I trust Verisign. Alice Is this GoDaddys key?
PGoDaddy Verisign Is this BofAs key? PVerisign Your OS and web browser ship with ~200 trusted public keys by default. PBofA GoDaddy Applied Cryptography Public key crypto is a powerful tool But, understanding how it works and actually using it to build a secure system are two different things Example: designing the right padding approach to insert randomness
and prevent chosen plaintext attacks Incorrect implementations may create padding or format oracle attacks Transport Layer Security (TLS) and Revocation SSL/TLS Application-layer protocol for confidentiality, integrity, and authentication between clients and servers Introduced by Netscape in 1995 as the Secure Sockets Layer (SSL) Designed to encapsulate HTTP, hence HTTPS Transport Layer Security (TLS) is the upgraded standard Defined in an RFC in 1999 Supersedes SSL: SSL is known to be insecure and should not be used
Sits between transport and application layers Thus, applications must be TLS-aware Both client and server must have an asymmetric keypair X) where is a .509 certificates contain signed public keys PKI rooted in trusted (?) Certificate Authorities (CAs) Goals of TLS Confidentiality and integrity: use BofAs public key to negotiate a session key; encrypt all traffic Authentication: BofAs cert can be validating by checking Verisigns signature Verisign https ://ww
Trusted Key Store w.ba n kofam e SVerisign rica.c om Verisign BofA
Contains BofAs public key Signed by Verisign SBofA Lets Talk about Certificates Suppose you start a new website and you want TLS encryption You need a certificate. How do you get one? Option 1: generate a certificate yourself Use openssl to generate a new asymmetric keypair Use openssl to generate a certificate that includes your new public key Problem?
Your new cert is self-signed, i.e. not signed by a trusted CA Browsers cannot validate that the cert is trustworthy Users will be shown a scary security warning when they visit your site Option 2: Pay a well-known CA to sign your certificate Any browser that trusts the CA will also trust your new cert Certificate Authorities Certificate Authorities (CAs) are the roots of trust in the TLS PKI Symantec, Verisign, Thawte, Geotrust, Comodo, GlobalSign, Go Daddy, Digicert, Entrust, and hundreds of others Issue signed certs on behalf of third-parties
Any CA can issue a cert for any domain! How do you become a CA? The only thing that stops me from buying a cert for google.com is a 1. Create a self-signed root certificate verification process 2. Get all the major browser vendors to include yourmanual cert with their software 3. Keep your private key secret at all costs What is the key responsibility of being a CA? Verify that someone buying a cert for example.com actually controls
example.com Acquiring a Certificate 1. Generate a new keypair 2. Generate a Certificate Signing Request (CSR). Contains BofAs details, the DNS name for the cert, and PBofA 4.
Generate a new certificate using the data in the CSR, sign it with the CAs private key 3. Verify that the requestor owns the domain in the CSR SBofA PBofA Verisign SVerisign CSR bofa.com
PBofA BofA X.509 Certificate (Part 1) Certificate: Data: Issuer: who generated this Version: 3 (0x2) cert? (usually a CA) Serial Number: 0c:00:93:10:d2:06:db:e3:37:55:35:80:11:8d:dc:87 Signature Algorithm: sha256WithRSAEncryption Issuer: C=US, O=DigiCert Inc, OU=www.digicert.com, CN=DigiCert SHA2 Extended Validation Server CA Validity Used for revocation Not Before: Apr 8 00:00:00 2014 GMT
Certificates expire Not After : Apr 12 12:00:00 2016 GMT Subject: businessCategory=Private Organization/1.3.6.1.4.1.311.60.2.1.3=US/1.3.6.1.4.1.311.60.2.1.2=Delaware/serialNumber=5157550/street=548 4th Street/postalCode=94107, C=US, ST=California, L=San Francisco, O=GitHub, Inc., CN=github.com Subject Public Key Info: Public Key Algorithm: rsaEncryption Githubs public key Subject: who owns this cert? Public-Key: (2048 bit) This is Githubs certificate Modulus: Must be served from github.com 00:b1:d4:dc:3c:af:fd:f3:4e:ed:c1:67:ad:e6:cb: Additional DNS names that X.509 Certificate (Part 2)
may serve this cert X) where is a 509v3 extensions: X) where is a 509v3 Subject Alternative Name: DNS:github.com, DNS:www.github.com X) where is a 509v3 CRL Distribution Points: Full Name: URI:http://crl3.digicert.com/sha2-ev-server-g1.crl Full Name: URI:http://crl4.digicert.com/sha2-ev-server-g1.crl X) where is a 509v3 Certificate Policies: Policy: 2.16.840.1.114412.2.1 CPS: https://www.digicert.com/CPS Authority Information Access: OCSP - URI:http://ocsp.digicert.com If this cert is revoked, its serial will be in the lists at
these URLS Policy numbers are magic (more on this later) This certs revocation status may also be checked via OSCP TLS Connection Establishment SBofA BofA ClientHello(Version, Prefs, Noncec) ServerHello(Version, Prefs, Nonces) Certificates(CBofA, CVerisign) Both sides derive symmetric session key K
from the PreMasterKey Certificate chain ServerHelloDone ClientKeyExchange(E PBofA(PreMasterKey)) ChangeCipherSpec E K(Finished) ChangeCipherSpec E K(Finished) Encrypted using servers public key Encrypted using symmetric session key
TLS Authentication During the TLS handshake, the client receives a certificate chain Chain contains the servers cert, as well as the certs of the signing CA(s) The client must validate the certificate chain to establish trust i.e. is this chain authentic, correct, cryptographically sound, etc. Client-side validation checks Does the servers DNS name match the common name in the cert? E.g. example.com cannot serve a cert with common name google.com Are any certs in the chain expired?
Is the CAs signature cryptographically valid? Is the cert of the root CA in the chain present in the clients trusted key store? Is any cert in the chain revoked? (more on this later) Little Green Locks If the TLS handshake succeeds, and the servers certificate chain is valid, then the connection is authenticated and encrypted Green lock icon indicates secure TLS connection to the user Most certificates are Domain Validation (DV) certificates Some certs are Extended Validation (EV) certificates EV certs get the green bar Extended Validation Certificates What differs between a DV and an EV certs?
To get a DV cert, the CA verifies that you control the given common name To get an EV cert, the CA does a background check on you and your company EV certs cost a lot more than DV certs Other than the background check, EV certs offer the same security as DV certs How does your browser tell the difference between DV and EV certs? Remember the policy number in the X) where is a .509 certificate? Each CA designates certain magic policy numbers to indicate EV status Your browser contains a hard-coded list of magic policy numbers to identify EV certs :( Key Compromise CBofA is totally legit S*
SBofA BofA ClientHello BofA ClientHello *BofA BofA Secret key compromise leads to many devastating attacks Attacker can successfully MitM TLS connections (i.e. future connections) Attacker can decrypt historical TLS packets encrypted using the stolen key Changing to a new keypair/cert does not solve the problem! The old, stolen key is still valid! Attacker can still MitM connections!
BofA *BofA Expiration Certificate expiration is the simplest, most fundamental defense against secret key compromise All certificates have an expiration date A stolen key is only useful before it expires Ideally, all certs should have a short lifetime Months, weeks, or even days Problem: most certs have multi-year
lifetimes This gives an attacker plenty of time to abuse a stolen key X.509 Certificate Validity Not Before: Apr 8 00:00:00 2014 GMT Not After : Apr 12 12:00:00 2016 GMT Certificate Lifetimes Revocation Certificate revocations are another fundamental mechanism for mitigating secret key compromises After a secret key has been compromised, the owner is supposed to revoke the certificate CAs are responsible for hosting databases of revoked certificates that they issued
Clients are supposed to query the revocation status of all certificates they encounter during validation If a certificate is revoked, the client should never accept it Two revocation protocols for TLS certificates 1. Certificate Revocation Lists (CRLs) 2. Online Certificate Status Protocol (OCSP) Certificate Revocation Lists CRLs are the original mechanism for announcing and querying the revocation status of certificates CAs compile lists of serial numbers of revoked certificates URL for the list is included in each cert issued by the CA CRL is signed by the CA to protect integrity X.509 Certificates, Revisited If the cert is revoked, this serial
Certificate: number will appear in the CRL Data: Subject: businessCategory=Private Organization/1.3.6.1.4.1.311.60.2.1.3=US/1.3.6.1.4.1.311.60.2.1.2=Delaware/ serialNumber=5157550/street=548 4th Street/postalCode=94107, C=US, ST=California, L=San Francisco, O=GitHub, Inc., CN=github.com X) where is a 509v3 extensions: X) where is a 509v3 Subject Alternative Name: DNS:github.com, DNS:www.github.com X) where is a 509v3 CRL Distribution Points: URLs where clients can Full Name: find the CRLs for this cert URI:http://crl3.digicert.com/sha2-ev-server-g1.crl Full Name: URI:http://crl4.digicert.com/sha2-ev-server-g1.crl
Authority Information Access: OCSP - URI:http://ocsp.digicert.com CRL Example CRL Whoa, CBofA has been revoked! Ca Cb CBofA http://crl.verisign.com/master.crl Please revoke CBofA
BofA Weve been robbed! BofA *BofA SBofA S* BofA Problems with CRLs Clients should check the revocation status of every cert they encounter Leaf, intermediate, and root certs Problems Latency additional RTTs of latency are needed to download CRLs before a page will
load Size CRLs can grow to be quite large (~MBs), downloads may be slow MitM attackers can block access to the CRL/OCSP URLs Browsers default-accept certificates if the revocation status cannot be checked Known as soft-fail or fail-open security posture Does caching CRLs mitigate these performance problems? Yes, somewhat But caching CRLs for long periods is dangerous: they may be out of date Online Certificate Status Protocol OCSP is the modern replacement for CRLs API-style protocol that allows clients to query the revocation status of one or more certs No longer necessary to download the entire CRL CAs host an OCSP server that clients may query
OCSP URL included in OCSP-compliant certs Responses are signed by the CA to maintain integrity Responses also include an expiration date to prevent replay attacks X.509 Certificates, Revisited Query the serial number to see if Certificate: this cert has been revoked Data: Subject: businessCategory=Private Organization/1.3.6.1.4.1.311.60.2.1.3=US/1.3.6.1.4.1.311.60.2.1.2=Delaware/ serialNumber=5157550/street=548 4th Street/postalCode=94107, C=US, ST=California, L=San Francisco, O=GitHub, Inc., CN=github.com X) where is a 509v3 extensions: X) where is a 509v3 Subject Alternative Name: DNS:github.com, DNS:www.github.com X) where is a 509v3 CRL Distribution Points:
Full Name: URI:http://crl3.digicert.com/sha2-ev-server-g1.crl Full Name: URI:http://crl4.digicert.com/sha2-ev-server-g1.crl URLs where clients can find Authority Information Access: the OCSP server for this cert OCSP - URI:http://ocsp.digicert.com OCSP Example OCSP Database Ca Cb CBofA Yes it is.
http://ocsp.verisign.com Is CBofA revoked? Please revoke CBofA BofA Weve been robbed! Good Clients no longer need to download the entire CRL Bad Attackers can still block access to the OCSP server Bad OCSP check still adds latency to TLS connections
Bad OCSP potentially violates user privacy BofA *BofA SBofA S* BofA OCSP Database OCSP Must-Staple Client only accepts the cert if the OCSP response is stapled and valid
Ca Cb CBofA BofA com . n ig s i r e p.v
cs o / BofA p: / tt h Is CBofA revoked? OCSP response is stapled to the cert ocsp.v e risign
.com No, itsYes, not. it is. The good: Clients dont need to query revocation status at all Is CBofA revoked? Attacker cannot prevent clients from receiving revocation information No leakage of browsing history The bad: BofA OCSP Must-Staple is very new, not SBofA
supported by many browsers and certs Revocation in Practice Revocation is one of the most broken parts of the TLS ecosystem MitM attackers can block access to the CRL/OCSP URLs Browsers default-accept certificates if the revocation status cannot be checked Solved by OCSP Must-Staple, but this extension is not well deployed Many browsers no longer perform proper revocation checks Chrome only does CRL/OCSP checks on EV certs, and only on some platforms Windows Yes, Linux and Android No Chrome uses an alternative implementation called CRLset which is busted Firefox only supports OCSP But fewer than 5% of certificates use OCSP Mobile browsers never check for revocations
Adds additional latency to HTTPS connections onto already slow mobile networks Many administrators fail to revoke compromised certificates
The acquisition of grammatical categories : a state of the art
The acquisition of grammatical categories : a state of the art Marie Labelle Université du Québec à Montréalwe blog, we tweet, we e-newsletter*, we act
Get Whitelisted on Edelweissfor Macmillan E-Galleys!. Register for Edelweiss with your professional e-mail address.. Send Anne a message ([email protected]; subject line: Edelweiss) that includes the e-mail address you registered with, your full name and your current library.