Java or Python Programming Project.

Chapter 2: Set two: Encryption Part I. Deadline: Thursday October 7, 23:59 (11:59 PM) Please consult the ‘Practical assignments instructions and rules’ document before proceeding with the exercises. Exercise 1. [1.5 points] Purpose of this exercise: construct a tool to work with the Vernam cipher. Write a program that, given a key and input data, applies the key as a one time pad (Vernam cipher) to the input data to encrypt or decrypt it. The input of the program is given as follows: first a key consisting of n bytes, followed by the binary value 0xFF, followed by the input data consisting of n bytes as well. The output should be the encrypted/decrypted data, also in binary. Note: as this program requires you to read a binary file, functions designed for reading text files will not work. Exercise 2. [2 points] Purpose: learn to implement a stream cipher. Implement the RC4 stream cipher. The input for this program will be given in a similar format as the input for the Vernam cipher program earlier: first a key, followed by the binary value 0xFF, followed by the input data. The output should be the encrypted/decrypted data, also in binary. Please be aware that there is an attack possible against RC4. Read the RC4 section in the book Information Security: Principles and Practices (starting on page 55) on how to prevent that attack. This prevention is needed to pass the test cases on Themis. Note: as this program requires you to read a binary file, functions designed for reading text files will not work. Exercise 3. [2.5 points] Purpose: learn to work with a simple Feistel cipher. Define and implement a simple Feistel cipher in a little program. The input of this program is given in binary and formatted as follows: first the binary value of d (0x64) or e (0x65) specifying whether to decrypt or encrypt, followed by the binary value 0xFF, followed by a key, then another 0xFF, and finally the input data. The key size is a multiple of 4 bytes, while the input data is a multiple of 8 bytes. The key bytes are used in the key-schedule, described below. 1 Implement the feistel function as a function processing blocks of 8 bytes. Each block is split into a left half (LH) and right half (RH). As usual, RH becomes the LH of the next round, and LH xor f(key) becomes the RH of the next round. Although the ECB block cipher should be avoided, it is used in this exercise to encrypt subsequent blocks of plain text. This exercise concentrates on the Feistel method, and once that’s available it can be used in combination with other block cipher methods as well. Key Schedule To fully complete this method an additional manipulation would be used in which the RH and key are manipulated (see also 3DES). Eventually RH2 becomes LH1 ^ F(RH1, key) where the key schedule is a separate operation, and the RH may also be manipulated, combining them to F(RH, key). This part (the key schedule handling) is not further elaborated in this exercise: Instead, the function f (key) simply returns the key again. Exercise 4. [1.5 points] Purpose: learn to validate public and private keys of a superincreasing knapsack. Write a program that, given as input m, n, a private and a public key, checks if this private key is a valid key and whether the public key corresponds to it. The output of the program should be -1 if the private key is invalid, 0 if the public key is invalid and 1 if both the public and private key are valid. Keep in mind that the public key can never be valid if the private key is not. The input is structured as follows (all in text, not in binary): m n [ p r i v a t e key ] p u b l i c key Exercise 5. [2.5 points] Purpose: Implement encryption and decryption using a superincreasing knapsack. Write a program that, given as input the appropriate keys and the input data, encrypts or decrypts the input data and shows the encrypted/decrypted text. The input is formatted at follows (in text, not in binary): 2 For encryption: e [ p u b l i c key ] etc . For decryption: d [ s u p e r i n c r e a s i n g knapsack ] etc . 3 Information Security (INBSEC-08) Fatih Turkmen Office : should be somewhere in the Bernoulliborg Some slides are borrowed from Dr. Frank B. Brokken 1 Today • Topics 1st lecture on Encryption: • • • • • • More on Transposition ciphers Perfect Secrecy One-time Pad (Vernam cipher) Stream ciphers (RC4) Block ciphers (Feistel, DES, 3DES) Knapsack 2 Transposition • Is a form of diffusion • Smears the text in matrix M containing the plaintext row/column-wise over another matrix C containing the ciphertext: • • • • Use permutation matrices: P’P = I. fill a matrix M with the plaintext; pRemultiply by permutation matrix P1 to permute Rows: P1M; pOstmultiply P1M by permutation matrix P2 to permute cOlumns: C = P1 M P2 • Decrypting is easy: • P1’CP2′ = (P1′ P1) M (P2P2′) = M 3 Transposition • Transposition Example: permutes rows. E.g., row 1 to row 2 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 u a * s t i s t p i p i r o o h n a s n e g n i = c r a u t i s t s i p p r i o h o a n n e s n g c r i a u t i s t s i p p r i o h o a n n e s n 0 1 g 0 0 c * 0 0 r 0 0 i 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 = 0 0 n g c r i a u t i s a n n e s t s i p p r i o h o • Key: 2 1 4 5 3, 2 4 5 3 1 permutes cols. E.g., col 4 to col 3 4 Transposition • Transposition Example: • From: u a s t i s t p i p i r o o h n a s n e g n To: i c r n g c r i a u t i s a n n e s t s i p p r i o h o • To break: each row cq. each column shows the characters of a row cq. column of the original matrix. Once you know / guess, e.g, using, then the rest quickly follows. 5 Perfect Secrecy (i.e. Provably Secure) Assume that the key is chosen at random, and used only once When something is provably secure, it means “some statement” regarding it’s security can be (has been) proven. “Given the ciphertext, any ‘plaintext’ of the same length can be generated by a suitable choice of ‘key’ and all possible plaintexts are equally likely. So the ciphertext provides no meaningful information at all about the plaintext.” 6 Perfect Secrecy more formally • M: message space • K: key space An encryption system: – G: a probabilistic key generation algorithm • C: the set of all possible ciphertexts – Ek(m): Encryption algorithm (key k and message m as input, and ciphertext c as output) Perfect Secrecy is defined over probability distributions over M, K and C – Dk(c): Decryption algorithm (key k and ciphertext c as input, and message m as output) • For any key k ∈ K, Pr[K=k] denotes the probability that Gen’s output is equal to k. Conditional Probability in Encryption (Bayes’ Theorem) • Pr[M=m]: the message takes on the value m ∈ M • Pr[C=c]: Ek(m) results with c ∈ C, c ⟵ Ek(m): probabilistic, i.e. c is selected probabilistically from C c := Ek(m): deterministic • Pr[M=a | C=B]: What is the probability that the message a was encrypted given that we observe ciphertext B? Pr 𝐶 = 𝐵 𝑀 = 𝑎] . Pr[𝑀 = 𝑎] Pr 𝑀 = 𝑎 | 𝐶 = 𝐵 = Pr[𝐶 = 𝐵] 7 Perfect Secrecy more formally (cont.) Definition 1: An encryption scheme 𝐺, 𝐸, 𝐷 is perfectly secret with a message space M, if for every probability distribution for M, every message m ∈ M and every ciphertext c ∈ C (for which Pr[C=c] > 0) Pr[M=m | C=c] = Pr[M=m] Meaning: The a posteriori probability that some message m ∈ M was sent, conditioned on the ciphertext that was observed, should be no different from a priori probability that m would be sent. Plain English: An eavesdropper can observe the ciphertext. Observing the ciphertext should have no effect on the adversary’s knowledge regarding the actual message that was sent. 8 Perfect Secrecy more formally (cont.) Two more but equivalent definitions: • Definition 2 (Ciphertext Distribution): One that requires the distribution of the ciphertext does not depend on the plain text. That is, for every m,m’ ∈ M and every c ∈ C: Pr[Ek(m) =c] = Pr[Ek(m’)=c] • Definition 3 (Perfect (adversarial) indistinguishability): One, perhaps more interesting, that considers an experiment
where, given a ciphertext, the adversary (eavesdropper, denoted as 𝒜) tries to guess which of the two possible messages was encrypted. Denoted as 𝑷𝒓𝒊𝒗𝑲𝒆𝒂𝒗 𝓐,𝜫 • 𝒜 outputs a pair of messages m0, m1 ∈ M • A key k is generated using Gen and a uniform bit b ∈ {0,1} is chosen. c ⟵ Ek(mb) is computed and given to 𝒜. • 𝒜 outputs a bit b’. 1, 𝑏’ = 𝑏 $%& • The result of the experiment is 1 if b’ = b, otherwise 0. That is 𝑃𝑟𝑖𝑣𝐾𝒜,# = 4 0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒 )*+ We want Pr[ 𝑃𝑟𝑖𝑣𝐾𝒜,( =1] =1/2 9 One-time Pad • Provably secure, letters are used (‘a’ = 0) (a.k.a. Vernam cipher) To encrypt Ek(m) = c = (m + k) % 26 To decrypt In binary form: Add the plaintext and key bits modulo 2, i.e. XOR Dk(c) = m = (26 + c – k) % 26 • The key k is a random sequence with the size |m| • Any plaintext can be constructed from a given c 10 Common Operations • En/decryption: • Simple substitution ciphers use addition/subtraction: (m + shift) % 26 = c (26 + c – shift) % 26 = m • Better put: xor (⊕) for encryption and decryption for which we know a ⊕ 0 = a and a ⊕ a = 0 Xor truth table: rhsrhs ⊕⊕ 1 1 0 0 lhs lhs 1 1 v10 v21 0 0 v31 v40 Consequently: m ⊕ k = c ⊕values k =of(m ⊕ k) ⊕ k What arecthe v1,v2,v3,v4? = m ⊕ (k ⊕ k) = m ⊕ 0 = m 11 One-time Pad (cont.) • One-time pad: m1+ k1 = c so: m1 = c – k1 actually used encryption m2+ k2 = c so: m2 = c – k2 other possibility/ies for achieving c? (encrypt) (decrypt) • Same cipher text, only the keys differ. c does not reveal any info whether it is derived from m1 or m2. • How to obtain such key (k2): m1 + k1 – m2= c – m2 = k2 Note: any key is possible; there is no reason why key-1 should be the `correct key’, rather than key-2 12 One-time Pad (cont.) • Example • Real text(1) and used key: Plain text, m1: kill the president Key: qwertyuiiopasfga What is the ciphertext? aepcmfyxzshivjtt (m1+ k1 = c) • Let’s see how it is perfectly secure (Definition 1) • Chose an alternate plaintext, m2 : hello world crypto • First: decrypt cipher text with the alternate plain text Decrypt: AEPCMFYXZSHIVJTT with `key’ hello world crypto (k = c – m ) 2 2 plaintext/ciphertext: taeryjkgopfrxuaf (k2: the alternative key) (1)If you run this example remember to omit the blanks 13 One-time Pad (cont.) • Then: apply the alternate key to the ciphertext: Decrypt: AEPCMFYXZSHIVJTT with taeryjkgopfrxuaf Results in: helloworldcrypto What did we do? m1(kill the president) + k1(qwertyuiiopasfga) = m2(hello world crypto) + k2(taeryjkgopfrxuaf) Pr[M=“kill the president” | C=“aepcmfyxzshivjtt”] = Pr[M=“kill the president”] 14 One-time Pad (cont.) • One-time pad – disadvantages • Key size, distribution and usability (in depth vulnerability) • If padding is used then the “pad” information needs be securely communicated! • Key needs to be changed after every encryption (can only be used once) m1 + k = c1; m2 + k = c2 c1 – c2 = m1 – m2 (the key vanishes) (note: m1, m2: plain texts, not characters, k: the used key) 15 Stream Ciphers s0 s1 • General Idea • • • • En/decrypt one character at a time A key seeds the cipher: common secret Uses xor (⊕) to encrypt and decrypt. am e The same algorithm is used for tr s y encryption and decryption Ke s2 … m … m ⊕ p0 p1 p2 Encryption c0 c1 c2 … m … m ⊕ ExpandKey( k0 k1 k2 … n bits • Examples: RC4, A5/1 , N) s0 s1 nonce s2 Decryption p0 p1 p2 … m 16 Stream Ciphers Plain text: ”InfoSec” Or binary: 01001001 01101110 01100110 01101111 01010011 01100101 01100011 • Go to: • Write the RC4 encryption in binary to the chat 17 Stream Ciphers Stamp’s book mentions: • RC4: • Used in SSL communication. If used correctly: fine • Byte by byte encryption • Tailored to software • A5/1: • Tailored to hardware • Used in GSM encryption. Cracked. But how bad is that? • cf: cryptanalysis_o_1.html There are many others such as E0 (bluetooth). See the Wikipedia for a list. 18 RC4 • RC4 Stream cipher practice: $ rc4 ‘my RC4 key’ < plain > encrypted $ rc4 ‘my RC4 key’ < encrypted > orig $ diff plain orig $ • Streambyte implementation (C++): size_t RC4::keyStreamByte() { if (++d_left == 256) d_left = 0; d_right = (d_right + d_table[d_left]) % 256; swap(d_table[d_left], d_table[d_right]); return d_table[ (d_table[d_left] + d_table[d_right]) % 256 ]; } 19 Block Ciphers • Block ciphers: • • • • • • • Key Motivations: Efficiency and Security Widely used Encryption not byte-wise but block-wise (e.g. 8 byte blocks) Plaintext is mangled over several `rounds’ To decrypt: the rounds are played back. It is difficult to develop secure and efficient algorithms. Block ciphers are often designed as Feistel ciphers 20 Plaintext L0 R0 K1 ⊕ • Feistel ciphers: F L1 • Horst Feistel (1915-1990) • Encryption procedure: R1 K2 ⊕ Feistel • Split blocks in left/right halves M = (L, R) • The next round’s left half is the previous right half: Li = Ri-1 F • The next round’s right half: previous left xor-ed with the result of a round (key schedule) function F: … Ri = Li-1 ⊕ F(Ri-1, Ki) Ln+1 Rn+1 Ciphertext Note the final swap!! 21 Feistel (cont.) Ciphertext Ln+1 R L Rn+1 Kn ⊕ • Decrypting Feistel ciphers: F • Invert the process until M is reached: • Split the ciphertext in halves: C = (Li, Ri) L Ln R Rn Kn-1 ⊕ F • Compute previous right from current left. That is: since Li= Ri-1 (encryption) we get Ri-1= Li • Compute previous left: from encryption That is: since we have Ri= Li-1 ⊕ F(Ri-1, Ki-1) we get: Li-1= … L R L R [Encryption] Ri ⊕ F(Ri-1, Ki) Very similar to encryption but not exactly same. The subkeys used in encryption are used in the reverse order. Plaintext 22 Feistel (cont.) Plaintext L0 R0 K1 ⊕ F L1 Decrypting Feistel ciphers: R1 • Very similar to encryption but not exactly same. The subkeys used in encryption are used in the reverse order. Encryption K2 ⊕ Li = Ri-1 F … Ri = Li-1 ⊕ F(Ri-1, Ki) Decryption Li-1= Ri ⊕ F(Ri-1, Ki) Ln+1 Note that the round keys start from 1 in our notation Ri-1= Li Rn+1 Ciphertext 23 Feistel Implementation • Implementation: Feistel decryption uses encryption after swapping L and R and reversing the keys Initially: swap rule: (Ln, Rn) (R2, L2) Ln = Rp so: L2 = R1 Perform n rounds (R1, Finally: swap (L1, How to obtain a previous block (e.g. 1 from 2) rule: so: Rn = Lp ⊕ F(Rp, Kp) L1 = R2 ⊕ F(R1, K1) L1) a.k.a. L2 R1) 24 KAERB A DEEN I || I NEED A BREAK! 25 DES • Data Encryption Standard • Feistel cipher (with 16 rounds) • Key/block length: 64 bits (key used to be 128 in Lucifer) • Actual # of bits used for the key: 56 (8 bits are discarded) • Each round uses 48 bit subset of the 56 bit key • Think about brute force attacks…. • Lucifer had exhaustive key search space of 2127 • DES has exhaustive key search space of 255 3-4 minutes Assume that: – The effective key length of DES is 48 bits – We can test 108 keys per second – 2k is approximately 10k / 3 How many (roughly) months does it take to find a DES key? 26 DES 32 32 • One DES round: 28 28 48 48 • Expansion: 32 32 bits to 48. 28 • S-boxes (8 of them): 28 map 6 bits down to 4 provides non-linearity • Key schedule: • • • • (Feistel) Round function in DES F(Ri-1, Ki) = P-Box(S-Boxes(Expand(Ri-1) ⊕ Ki)) initially permute the master key, and split to two 28 bits shifts + compression next key: concatenates shifted (rotated) halves key schedule is fixed and public, only master key is secret The main vulnerability if the key size! 27 Triples DES (3DES) • 3DES: • DES disadvantage: small key. • Solution: use a 112 bit key, split in halves: • `2DES’ is insufficient: Using a table of 256 keys and a known plaintext 2DES can be broken. • 3DES is defined as: C = E( D( E(P,K1), K2), K1) Why EDE instead EEE? Backward compatibility! • Nice: for K1 == K2 this reduces to DES: C = E( D( E(P,K), K), K) • so: C = E( P , K) 28 Block Cipher Mod
es • How should multiple blocks be encrypted with a block cipher? • Highlights: • Padding • Avoid: Electronic Code Book (ECB) mode • Acceptable modes • CBC (cipher block chaining mode) • CTR • Initialization vectors • Information Leakage / Collisions • See also: modes_development.html 29 Block Cipher Modes • Padding • Padding is required, always at least 1 byte • Well-known methods (block size b): • Marker: add byte 128 and a remaining number of required 0-bytes (0..b-1) • Determine how many padding bytes are needed (n) and add that many n-value bytes. 30 Block Cipher Modes • Avoid: Electronic Code Book (ECB) CodeBlock = E(Key, PlainText) (for each block) • Same plain text results in identical code blocks • Think about padding, headers, footers • Highly susceptible to • Known plaintext attack • Plain text often has predictable characteristics • What is the key? • With ECB the key is always the same, opportunities for known plaintext attacks 31 Block Cipher Modes • OK mode: Cipher Block Chaining (CBC) • The plain text block is first xor-ed with the previous cipher text • Advantage: Identical plain text blocks no longer result in identical cipher blocks • Method: CodeBlock = E(Key, PlainText xor Previous Code Block) • Still to solve: what is the ‘previous encrypted block’ for the very first plain text block? 32 Block Cipher Modes • OK mode: Counter Mode (CTR) • A block cipher mode used as a stream cipher. • The algorithm generates a stream of keys • Method: Key2Use = E(Key, nonce || i) concatenation (for i = 1, … k) CipherText = PlainText xor Key2Use • 128 bit blocks, e.g. 128 bits • Nonce: e.g., 48 bit message number plus 16 bit additional data: must be unique or data will leak (cf. ECB) • 64 bit counter 33 Block Cipher Modes • Initialization vectors (IVs) • Are used to initialize block ciphers: • ‘Code block’ for the 1st plain text block (CBC) • Should not be fixed (or data will leak with different messages) • Counter IVs: small differences and (almost) identical plain text blocks may cause data to leak • Random IV: fine, but IV is sent as 1st block, enlarging the message by 1 block • Nonce: used to compute the IV/Key, is much smaller 34 Block Cipher Modes • Initialization vectors (IVs) • No need to keep secret. • Consider: CodeBlock = E(Key, PlainText xor Previous Code Block) • Plaintext is (i.e., remains) unknown; • Plaintext xor IV enters the key schedule, where it is modified: • the plaintext cannot be reconstructed from the CodeBlock alone. • Analogous considerations apply to CTR. 35 Block Cipher Modes • Information Leakage • ECB: always leaks (of course: same keys) • With CBC, when C blocks are equal, then: Ci E(K, Pi ^ Ci-1) Pi ^ Ci-1 Pi ^ Pj == == == == Cj E(K, Pj ^ Cj-1) Pj ^ Cj-1 Ci-1 ^ Cj-1 • previous C-blocks reveal info about P blocks (cf. one-time pad) • CTR does not have this property (which is nice) 36 Enter: Asymmetric Cryptography Knapsack Cryptosystem • Does not use a shared key • First example of public key cryptography • Proposed by Merkle and Hellman • Based on a paper by Diffie and Hellman • Uses a superincreasing knapsack 47 Knapsack Cryptosystem • Knapsack: i= {0…n} • The general knapsack problem: • Given a set of n weights, w0,..wn-1, find a series of (integral) values ai (∈ {0,1}) such that a given sum S is achieved: S = ∑$%& !”# 𝑎𝑖 𝑤𝑖 • Special case: superincreasing knapsack (SK): • Order the weights from least to greatest • Each next wi value exceeds the sum of the previous wj values: ∀wi: wi > ∑wj (0 ≤ j < i) • The problem is now easily solved: start with the largest weight, repeatedly subtract values until 0. 48 Knapsack Cryptosystem • Example SK: 2, 3, 7, 14, 30, 57, 120, 251 (∑wj: 2 5 12 26 56 113 233) • For S = 171, the ai are: 0, 0, 1, How do we solve this? 1, 1, 0, 1, 0 171-120=51 51-30=21 21-14=7 7-7=0 New target sum New target sum 49 Asymmetric Crypto • In public key crypto, you need: • a private key, kept secret, allowing us to decrypt information; • a public key, made public, allowing others to send us encrypted information • In practice, the private key cannot be obtained from the public key. 50 Constructing a Knapsack Cryptosystem • SK: 2, 3, 7, 14, 30, 57, 120, 251 (k values) (∑wj: 2 5 12 26 56 113 233) • To compute the private key: • Choose n (called modulus) • n > ∑wi (0 ≤ i < k), e.g. n = 491 (> 251 + 233) • Choose m such that m rp n (i.e. m and n are relatively prime) • m: a multiplication constant. E.g., m = 41 • Private key: the SK and the values m and n. 51 Constructing a Knapsack Cryptosystem • SK: 2, 3, 7, 14, 30, 57, 120, 251 (∑wj: 2 5 12 26 56 113 233) • The public key is the converted/generalized SK: • Computed from the SK: Public key: ki = m * wi % n, producing: Weights from SK 82,123,287,83,248,373,10,471 • Private key: m, n and the superincreasing knapsack • n > ∑wi, e.g. 491 • m rp n: e.g., 41 52 Constructing a Knapsack Cryptosystem • Knapsack: encrypting a message • If p is an ascii-character to encrypt (e.g., ‘p’), find its ascii value and convert it to binary: • ‘p’ = 0x70 = 01110000b NB: idx 0 note that the example uses ascii, but a knapsack using 64 bit values (or more) is OK too. 53 Constructing a Knapsack Cryptosystem • Knapsack: encrypting a message • If p is an ascii-character to encrypt (e.g., ‘p’), find its ascii value and convert it to binary: • ‘p’ = 0x70 = 01110000b • Use the bit-pattern as weight-vector and compute c = a’w using the published public key: Most significant to least 82,123,287,83,248,373,10,471 idx: 0, … 7 a: 0, 0, 0, 0, 1, 1, 1, 0 c = 248 + 373 + 10 = 631 54 Constructing a Knapsack Cryptosystem • Knapsack: decrypt a message: • Compute (once) m-1 % n = 41-1 % 491 = 12 • Intuition: the public key has values: ki = m * wi % n, so: m-1 * ki = m-1 * m * wi % n = wi % n = wi 55 Constructing a Knapsack Cryptosystem • Knapsack: decrypt a message: m-1 * ki = m-1 * m * wi % n = wi % n = wi • Encrypted values are handled likewise : c = ∑ki = ∑ m * wi so: = m ∑ wi m-1 * c = m-1 ∑ m * wi = m-1 * m ∑ wi = ∑ wi (% n) • Example: • c = 631, 631 * 12 (% 491) = 207 (= ∑ wi (% n)) 56 Constructing a Knapsack Cryptosystem • Knapsack: decrypting a message: • We just computed: m-1 n • C = 631, 631 * 12 % 491 = 207 = ∑ ai % n • Next solve 207 in the superincreasing knapsack: • 207 = 30 + 57 + 120 idx:0 2, 3, 7, 14, 30, 57, 120, 251 idx:7 • Interpret the values as 1-weights in the a vector and thus as character: 01110000b= 0x70 = ‘p’ 57 Take Home • A huge knapsack… (just for fun) • public key: • private key: 328D7C79E4FDBE27F1D46591602C408BCF5BFB7DB39A515518A1 819639474E581D95DD469AA423F9B9FED9EB296BD733436A96FC B9C44FFF07D92879ADD74236FCC71DFF4E151856A6ACC330E8BA B5740651905C54E59176EA9ACD5B4529101786466EA26FA8A080 31B30E1AE9EDDD7476DE05E8887BE78968862B91C96853BE996A 2D0EEA17CB85D0543344578251358019E144313A21F30008512C 2AF23628723DBC79A2FB5F1E8E374A0DB73EC62E29B75476F924 3F08D48DC39DF0BCB017FD760F8BD80089FFF3964011C40675A2 Multiplier: 0D19CC01EF6B6082EDF119437B modulo: E1C0B242BB64C30487F4A19C1039C25AE4F53DD01A4D467A0D0A F9CFEFE5E4F15F3C0C6151A0654C4281E61CBCB6C94EA851C9 01A3049CAAD6CBBB0B503B249D9A30E86E2351C294EE8DA620EE 039A82CA6C0BBF5831A0A1329DCD63B509C7ED034C913B5A8322 07168E4FD780F9A05E05B189B3C80C9D0E0E5B415D601C4F119A 0E41DE9C50EC4B29A65273E4D99F74E0D8AB2E13D3B6295D3CCC 1C1D9C0592BCC333AC4489C95DB67C0D89F375B5CB5F55B48BE0 385EDBA2475815EF3EC3AD9FCD57454DBC610F906480594178C8 70ED695B21FC6797CF7AA000FF9D0D698127650AE2441B76A218 58 What did we learn? • Some more on Transposition Cipher • Perfect Secrecy and One-time Pad • Symmetric Cryptography • Stream Ciphers • Block Ciphers: Feistel Cipher, DES • Introduction to Asymmetric Cryptography • Knapsack Problem 59 That’s All, Folks, for today. 61 INFORMATION SECURITY INFORMATION SECURITY Principles and Practice Second Edition Mark Stamp San Jose State University San Jose,
CA WILEY A JOHN WILEY & SONS, INC., PUBLICATION Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at Library of Congress Cataloging-in-Publication Data: Stamp, Mark. Information security: principles and practice / Mark Stamp. — 2nd ed. p. cm. Includes bibliographical references and index. ISBN 978-0-470-62639-9 (hardback) 1. Computer security. I. Title. QA76.9.A25S69 2011 005.8—dc22 2010045221 Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1 To Miles, Austin, and Melody, with love. Contents Preface xv About The Author xix Acknowledgments xxi 1 Introduction 1.1 The Cast of Characters 1.2 Alice’s Online Bank 1.2.1 Confidentiality, Integrity, and Availability 1.2.2 Beyond CIA 1.3 About This Book 1.3.1 Cryptography 1.3.2 Access Control 1.3.3 Protocols 1.3.4 Software 1.4 The People Problem 1.5 Principles and Practice 1.6 Problems 1 1 2 2 3 4 5 6 7 8 8 9 10 1 17 Crypto 2 Crypto Basics 2.1 Introduction 2.2 How to Speak Crypto 2.3 Classic Crypto 2.3.1 Simple Substitution Cipher 2.3.2 Cryptanalysis of a Simple Substitution 2.3.3 Definition of Secure 2.3.4 Double Transposition Cipher 2.3.5 One-Time Pad 2.3.6 Project VENONA 19 19 20 22 22 24 25 26 27 31 vii CONTENTS 2.4 2.5 2.6 2.7 2.8 3 2.3.7 Codebook Cipher 2.3.8 Ciphers of the Election of 1876 Modern Crypto History A Taxonomy of Cryptography A Taxonomy of Cryptanalysis Summary Problems Symmetric Key Crypto 3.1 Introduction 3.2 Stream Ciphers 3.2.1 A5/1 3.2.2 RC4 3.3 Block Ciphers 3.3.1 Feistel Cipher 3.3.2 DES 3.3.3 Triple DES 3.3.4 AES 3.3.5 Three More Block Ciphers 3.3.6 TEA 3.3.7 Block Cipher Modes 3.4 Integrity 3.5 Summary 3.6 Problems 4 Public Key Crypto 4.1 Introduction 4.2 Knapsack 4.3 RSA 4.3.1 Textbook RSA Example 4.3.2 Repeated Squaring 4.3.3 Speeding Up RSA 4.4 Diffie-Hellman 4.5 Elliptic Curve Cryptography 4.5.1 Elliptic Curve Math 4.5.2 ECC Diffie-Hellman 4.5.3 Realistic Elliptic Curve Example 4.6 Public Key Notation 4.7 Uses for Public Key Crypto 4.7.1 Confidentiality in the Real World 4.7.2 Signatures and Non-repudiation 4.7.3 Confidentiality and Non-repudiation 4.8 Public Key Infrastructure 32 35 37 40 41 42 43 51 51 52 53 55 57 57 58 65 67 69 70 72 76 78 79 89 89 91 95 97 98 99 100 102 103 105 106 107 107 108 108 109 112 CONTENTS ix 4.9 Summary 4.10 Problems 114 115 5 Hash Functions++ 5.1 Introduction 5.2 What is a Cryptographic Hash Function? 5.3 The Birthday Problem 5.4 A Birthday Attack 5.5 Non-Cryptographic Hashes 5.6 Tiger Hash 5.7 HMAC 5.8 Uses for Hash Functions 5.8.1 Online Bids 5.8.2 Spam Reduction 5.9 Miscellaneous Crypto-Related Topics 5.9.1 Secret Sharing 5.9.2 Random Numbers 5.9.3 Information Hiding 5.10 Summary 5.11 Problems 125 125 126 128 129 130 132 136 139 139 140 141 142 145 148 153 153 6 Advanced Cryptanalysis 6.1 Introduction 6.2 Enigma 6.2.1 Enigma Cipher Machine 6.2.2 Enigma Keyspace 6.2.3 Rotors 6.2.4 Enigma Attack 6.3 RC4 as Used in WEP 6.3.1 RC4 Algorithm 6.3.2 RC4 Cryptanalytic Attack 6.3.3 Preventing Attacks on RC4 6.4 Linear and Differential Cryptanalysis 6.4.1 Quick Review of DES 6.4.2 Overview of Differential Cryptanalysis 6.4.3 Overview of Linear Cryptanalysis 6.4.4 Tiny DES 6.4.5 Differential Cryptanalysis of TDES 6.4.6 Linear Cryptanalysis of TDES 6.4.7 Implications Block Cipher Design 6.5 Lattice Reduction and the Knapsack 6.6 RSA Timing Attacks 6.6.1 A Simple Timing Attack 167 167 169 169 172 174 176 179 180 181 185 186 186 187 190 192 194 199 202 203 210 211 CONTENTS x 6.7 6.8 II 6.6.2 Kocher’s Timing Attack Summary Problems A c c e s s Control 214 218 218 227 7 Authentication 7.1 Introduction 7.2 Authentication Methods 7.3 Passwords 7.3.1 Keys Versus Passwords 7.3.2 Choosing Passwords 7.3.3 Attacking Systems via Passwords 7.3.4 Password Verification 7.3.5 Math of Password Cracking 7.3.6 Other Password Issues 7.4 Biometrics 7.4.1 Types of Errors 7.4.2 Biometrie Examples 7.4.3 Biometrie Error Rates 7.4.4 Biometrie Conclusions 7.5 Something You Have 7.6 Two-Factor Authentication 7.7 Single Sign-On and Web Cookies 7.8 Summary 7.9 Problems 229 229 230 231 232 232 234 235 237 240 242 244 244 250 250 251 252 252 254 254 8 265 265 266 266 269 271 272 273 274 276 278 279 281 283 285 Authorization 8.1 Introduction 8.2 A Brief History of Authorization 8.2.1 The Orange Book 8.2.2 The Common Criteria 8.3 Access Control Matrix 8.3.1 ACLs and Capabilities 8.3.2 Confused Deputy 8.4 Multilevel Security Models 8.4.1 Bell-LaPadula 8.4.2 Biba’s Model 8.5 Compartments 8.6 Covert Channel 8.7 Inference Control 8.8 CAPTCHA CONTENTS xi 8.9 Firewalls 8.9.1 Packet Filter 8.9.2 Stateful Packet Filter 8.9.3 Application Proxy 8.9.4 Personal Firewall 8.9.5 Defense in Depth 8.10 Intrusion Detection Systems 8.10.1 Signature-Based IDS 8.10.2 Anomaly-Based IDS 8.11 Summary 8.12 Problems III 9 287 288 290 291 293 293 294 296 297 301 302 Protocols Simple Authentication Protocols 9.1 Introduction 9.2 Simple Security Protocols 9.3 Authentication Protocols 9.3.1 Authentication Using Symmetric Keys 9.3.2 Authentication Using Public Keys 9.3.3 Session Keys 9.3.4 Perfect Forward Secrecy 9.3.5 Mutual Authentication, Session Key, and PFS 9.3.6 Timestamps 9.4 Authentication and TCP 9.5 Zero Knowledge Proofs 9.6 The Best Authentication Protocol? 9.7 Summary 9.8 Problems 10 Real-World Security Protocols 10.1 Introduction 10.2 SSH 10.3 SSL 10.3.1 SSL and the Man-in-the-Middle 10.3.2 SSL Connections 10.3.3 SSL Versus IPSec 10.4 IPSec 10.4.1 IKE Phase 1: Digital Signature 10.4.2 IKE Phase 1: Symmetric Key 10.4.3 IKE Phase 1: Public Key Encryption 10.4.4 IPSec Cookies 311 …. 313 313 315 317 320 323 325 327 329 330 332 335 339 339 340 351 351 352 353 356 357 358 359 361 363 364 366 CONTENTS xii 10.5 10.6 10.7 10.8 10.9 IV 10.4.5 IKE Phase 1 Summary 10.4.6 IKE Phase 2 10.4.7 IPSec and IP Datagrams 10.4.8 Transport and Tunnel Modes 10.4.9 ESP and AH Kerberos 10.5.1 Kerberized Login 10.5.2 Kerberos Ticket 10.5.3 Kerberos Security WEP 10.6.1 WEP Authentication 10.6.2 WEP Encryption 10.6.3 WEP Non-Integrity 10.6.4 Other WEP Issues 10.6.5 WEP: The Bottom Line GSM
10.7.1 GSM Architecture 10.7.2 GSM Security Architecture 10.7.3 GSM Authentication Protocol 10.7.4 GSM Security Flaws 10.7.5 GSM Conclusions 10.7.6 3GPP Summary Problems Software 11 Software Flaws and Malware 11.1 Introduction 11.2 Software Flaws 11.2.1 Buffer Overflow 11.2.2 Incomplete Mediation 11.2.3 Race Conditions 11.3 Malware 11.3.1 Brain 11.3.2 Morris Worm 11.3.3 Code Red 11.3.4 SQL Slammer 11.3.5 Trojan Example 11.3.6 Malware Detection 11.3.7 The Future of Malware 11.3.8 Cyber Diseases Versus Biological Diseases 366 367 368 369 370 372 374 375 376 377 377 378 379 379 380 381 381 383 385 386 388 389 389 390 401 403 403 404 407 418 419 421 422 422 424 425 426 427 429 432 CONTENTS 11.4 Botnets 11.5 Miscellaneous Software-Based Attacks 11.5.1 Salami Attacks 11.5.2 Linearization Attacks 11.5.3 Time Bombs 11.5.4 Trusting Software 11.6 Summary 11.7 Problems xiii 433 433 434 434 436 436 437 438 12 Insecurity in Software 12.1 Introduction 12.2 Software Reverse Engineering 12.2.1 Reversing Java Bytecode 12.2.2 SRE Example 12.2.3 Anti-Disassembly Techniques 12.2.4 Anti-Debugging Techniques 12.2.5 Software Tamper Resistance 12.2.6 Metamorphism 2.0 12.3 Digital Rights Management 12.3.1 What is DRM? 12.3.2 A Real-World DRM System 12.3.3 DRM for Streaming Media 12.3.4 DRM for a P2P Application 12.3.5 Enterprise DRM 12.3.6 DRM Failures 12.3.7 DRM Conclusions 12.4 Software Development 12.4.1 Open Versus Closed Source Software 12.4.2 Finding Flaws 12.4.3 Other Software Development Issues 12.5 Summary 12.6 Problems 447 447 448 450 451 455 456 457 459 460 460 464 467 469 470 471 472 472 473 476 477 481 481 13 Operating Systems and Security 13.1 Introduction 13.2 OS Security Functions 13.2.1 Separation 13.2.2 Memory Protection 13.2.3 Access Control 13.3 Trusted Operating System 13.3.1 MAC, DAC, and More 13.3.2 Trusted Path 13.3.3 Trusted Computing Base 491 491 492 492 492 494 495 495 497 498 CONTENTS xiv 13.4 Next Generation Secure Computing Base 13.4.1 NGSCB Feature Groups 13.4.2 NGSCB Compelling Applications 13.4.3 Criticisms of NGSCB 13.5 Summary 13.6 Problems 500 502 503 504 506 506 Appendix A-l Network Security Basics A-l.l Introduction A-1.2 The Protocol Stack A-l.3 Application Layer A-l.4 Transport Layer A-l.5 Network Layer A-1.6 Link Layer A-l.7 Conclusions A-2 Math Essentials A-2.1 Introduction A-2.2 Modular Arithmetic A-2.3 Permutations A-2.4 Probability A-2.5 Linear Algebra A-2.6 Conclusions 511 511 511 512 514 516 519 521 523 523 523 524 526 526 527 529 Annotated Bibliography 531 Index 572 Preface Please sir or madam won’t you read my book? It took me years to write, won’t you take a look? — Lennon and McCartney I hate black boxes. One of my goals in writing this book was to illuminate some of those black boxes that are so popular in information security books today. On the other hand, I don’t want to bore you to death with trivial details (if that’s what you want, go read some RFCs). As a result, I often ignore details that I deem irrelevant to the topic at hand. You can judge whether I’ve struck the proper balance between these two competing goals. I’ve strived to keep the presentation moving along so as to cover a broad selection of topics. My goal is to cover each item in just enough detail so that you can appreciate the basic security issue at hand, while not getting bogged down in details. I’ve also attempted to regularly emphasize and reiterate the main points so that crucial information doesn’t slip by below the radar screen. Another goal of mine was to present the topic in a reasonably lively and interesting way. If any computing subject should be exciting and fun, it’s information security. Security is happening now and it’s in the news—it’s clearly alive and kicking. I’ve also tried to inject a little humor into the material. They say that humor is derived from pain, so judging by the quality of my jokes, I’d say that I’ve led a charmed life. In any case, most of the really bad jokes are in footnotes so they shouldn’t be too distracting. Some security textbooks offer a large dollop of dry useless theory. Reading one of those books is about as exciting as reading a calculus textbook. Other books offer a seemingly random collection of apparently unrelated facts, giving the impression that security is not really a coherent subject at all. Then there are books that present the topic as a collection of high-level managerial platitudes. Finally, some texts focus on the human factors in security. While all of these approaches have their place, I believe that, first and foremost, a xv PREFACE XVI security engineer must have a solid understanding of the inherent strengths and weaknesses of the underlying technology. Information security is a huge topic, and unlike more established fields, it’s not clear what material should be included in a book like this, or how best to organize it. I’ve chosen to organize this book around the following four major themes: • • • • Cryptography Access Control Protocols Software In my usage, these themes are fairly elastic. For example, under the heading of access control I’ve included the traditional topics of authentication and authorization, along with such nontraditional topics as firewalls and CAPTCHAs. The software theme is particularly flexible, and includes such diverse topics as secure software development, malware, software reverse engineering, and operating systems. Although this book is focused on practical issues, I’ve tried to cover enough of the fundamental principles so that you will be prepared for further study in the field. In addition, I’ve strived to minimize the background requirements as much as possible. In particular, the mathematical formalism has been kept to a bare minimum (the Appendix contains a review of all necessary math topics). Despite this self-imposed limitation, I believe this book contains more substantive cryptography than most security books out there. The required computer science background is also minimal—an introductory computer organization course (or comparable experience) is more than sufficient. Some programming experience is assumed and a rudimentary knowledge of assembly language would be helpful in a couple of sections, but it’s not mandatory. Networking basics arise in a few sections. The Appendix contains a brief overview of networking that provides more than sufficient background material. If you are an information technology professional who’s trying to learn more about security, I would suggest that you read the entire book. However, if you want to avoid the material that’s most likely to slow you down and is not critical to the overall flow of the book, you can safely skip Section 4.5, all of Chapter 6 (although Section 6.6 is highly recommended), and Section 8.4. If you are teaching a security class, you need to realize that this book has more material than can be covered in a one-semester course. The schedule that I generally follow in my undergraduate security class appears in Table 1. This schedule allows ample time to cover a few of the optional topics. If the syllabus in Table 1 is too busy, you could cut Section 8.9 of Chapter 8 and some of the topics in Chapters 12 and 13. Of course, many other variations on the syllabus are possible. PREFACE XVll Chapter 1. Introduction 2. Classic Cryptography 3. Symmetric Key Crypto 4. Public Key Crypto 5. Hash Functions++ Hours 1 3 4 4 3 6. Advanced Cryptanalysis 7. Authentication 8. Authorization 0 4 2 9. Authentication Protocols 10. Real-World Protocols 11. Software Flaws and Malware 12. Insecurity in Software 13. OS and Security Total 4 4 4 4 3 40 Comments All All Omit Section 3.3.5 Omit Section 4.5 Omit 5.6 Omit attack details in 5.7 Omit Section 5.9.1 Omit entire chapter All Omit 8.4.1 and 8.4.2 Omit 8.10 Omit 9.4 Omit either WEP or GSM All Omit 12.3 All, time permitting Table 1: Suggested Syllabus Security is not a spectator sport—doing a large number of homework problems is essential to learning the material in this book. Many topics are fleshed out in the problems and additional topics are often introduced. The bottom line is that the more problems you solve, the more you’ll le
arn. A security course based on this book is an ideal venue for individual or group projects. Chapter 6 is a good source for crypto projects, while the annotated bibliography provides a starting point to search for additional project topics. In addition, many homework problems lend themselves well to class discussions or in-class assignments (see, for example, Problem 19 in Chapter 10 or Problem 33 in Chapter 11). The textbook website is at http :// where you’ll find PowerPoint slides, all of the files mentioned in the homework problems, errata, and so on. If I were teaching this class for the first time, I would particularly appreciate the PowerPoint slides, which have been thoroughly “battle tested” and improved over several iterations. In addition, a solutions manual is available to instructors (sorry, students) from the publisher. It is also worth noting how the Appendices fit in. Appendix A-l, Network Security Basics, is relevant to Sections 8.9 and 8.10 of Chapter 8 and also for PREFACE XVlll all of Part III. Even if students have a solid foundation in networking, it’s probably worthwhile to review this material, since networking terminology is not always consistent and the focus here is on security. The Math Essentials of Appendix A-2 are assumed in various places throughout the text. Elementary modular arithmetic (Appendix A-2.2) arises in a few sections of Chapter 3 and Chapter 5, while some of the relatively advanced concepts are required in Chapter 4 and Section 9.5 of Chapter 9. I’ve found that the vast majority of my students need to brush up on modular arithmetic basics. It only takes about 20 to 30 minutes of class time to cover the material on modular arithmetic and that will be time well spent prior to diving into public key cryptography. Trust me. Permutations, which are briefly discussed in Appendix A-2.3, are most prominent in Chapter 3, while elementary discrete probability (Appendix A2.4) appears in several places. The elementary linear algebra in Appendix A2.5 is only required in Section 6.5. Just as any large and complex piece of software must have bugs, this book inevitably has errors. I would like to hear about any errors—large or small— that you find. I will maintain a reasonably up-to-date errata on the textbook website. Also, don’t hesitate to provide any suggestions you might have for future editions of this book. What’s New for the Second Edition? Cats right themseJves; books don’t. — John Ay cock One major change for this second edition is that the number and quality of the homework problems have both greatly increased. In addition to the newand-improved homework problems, new topics have been added, some new background material has been included, virtually all of the existing material has been updated and clarified, and all known errors have been corrected. Examples of new topics include a practical RS A timing attack, a discussion of botnets, and coverage of security certification. Examples of added background material include a section on the Enigma cipher and coverage of the classic “orange book” view of security. Information security is a rapidly evolving field and there have been some significant changes since the first edition of this book was published in 2005. Nevertheless, the basic structure of the book remains intact. I believe the organization and list of topics has held up well over the past five years. Consequently, the changes to the content for this second edition are more evolutionary than revolutionary. Mark Stamp San Jose State University About The Author I’ve got nearly 20 years of experience in information security, including extensive work in industry and government. My work experience includes more than seven years at the National Security Agency followed by two years at a Silicon Valley startup company. While I can’t say too much about my work at NSA, I can tell you that my job title was Cryptologie Mathematician. In industry I helped design and develop a digital rights management security product. This real-world work was sandwiched between academic jobs. While in academia, my research interests have included a wide variety of security topics. When I returned to academia in 2002, it seemed to me that none of the available security textbooks had much connection with the real world. I felt that I could write an information security book that would fill this gap, while also containing information that would be useful to working IT professionals. Based on the feedback I’ve received, the first edition was apparently a success. I believe that this second edition will prove even more valuable in its dual role as a textbook and as a resource for working professionals, but then I’m biased. I can say that many of my former students who are now at leading Silicon Valley technology companies tell me that the information they learned in my courses has been useful to them. And I certainly wish that a book like this had been available when I worked in industry, since my colleagues and I would have benefitted from it. I do have a life outside of information security.1 My family includes my wife, Melody, and two wonderful sons, Austin (whose initials are AES), and Miles (whose initials are not DES, thanks to Melody). We enjoy the outdoors, with regular local trips involving such activities as bicycling, hiking, camping, and fishing. I also spend way too much time working on my fixer-upper house in the Santa Cruz mountains. 1 Well, sort of… xix Acknowledgments My work in information security began when I was in graduate school. I want to thank my thesis advisor, Clyde F. Martin, for introducing me to this fascinating subject. In my seven years at NSA, I learned more about security than I could have learned in a lifetime anywhere else. From my time in industry, I want to thank Joe Pasqua and Paul Clarke for giving me the chance to work on a fascinating and challenging project. The following San Jose State University students helped greatly with the first edition: Fiona Wong, Martina Simova, Deepali Holankar, Xufen Gao, Subha Rajagopalan, Neerja Bhatnager, Amit Mathur, Ali Hushyar, Smita Thaker, Puneet Mishra, Jianning Yang, Konstantin Skachkov, Jian Dai, Thomas Niki, Ikai Lan, Thu Nguyen, Samuel Reed, Yue Wang, David Stillion, Edward Yin, and Randy Fort. Richard Low, a colleague here at SJSU, provided helpful feedback on an early version of the manuscript. David Blockus (God rest his soul) deserves special mention for providing detailed comments on each chapter at a particularly critical juncture in the writing of the first edition. For this second edition, many of my SJSU masters students “volunteered” to serve as proofreaders. The following students all contributed their time and energy to correct errors in the manuscript: Naidele Manjunath, Mausami Mungale, Deepti Kundu, Jianrui (Louis) Zhang, Abhishek Shah, Sushant Priyadarshi, Mahim Patel, Lin Huang, Eilbroun Benjamin, Neha Samant, Rashmi Muralidhar, Kenny Zhang, Jyotsna Krishnaswamy, Ronak Shah, Gauri Gokhale, Arnold Suvatne, Ashish Sharma, Ankit Patel, Annie Hii, Namrata Buddhadev, Sujandharan Venkatachalam, and Sathya Anandan. In addition, Piyush Upadhyay found several errors in the first edition. Many other people made helpful comments and suggestions. Here, I would like to specifically thank Bob Harris (Penn State University) for the visual crypto example and exercise, and a very special thanks goes to John Trono (Saint Michael’s College) for his many detailed comments and questions. Undoubtedly, errors remain. Of course, all remaining flaws are my responsibility alone. xxi Information Security: Principles and Practice, Second Edition by Mark Stamp Copyright © 2011 John Wiley & Sons, Inc. Chapter 1 Introduction “Begin a t t h e beginning, ” the King said, very “and go on till you come to the end: then — Lewis Carroll, Alice in 1.1 gravely, stop.” Wonderland The Cast of Characters Following tradition, Alice and Bob, who are pictured in Figure 1.1, are the good guys. Occasionally we’ll require additional good guys, such as Charlie and Dave. Alice Bob Figure 1.1: Alice and Bob. Trudy, pictured
in Figure 1.2, is a generic bad “guy” who is trying to attack the system in some way. Some authors employ a team of bad guys where the name implies the particular nefarious activity. In this usage, Trudy is an “intruder” and Eve is an “eavesdropper” and so on. To simplify things, we’ll use Trudy as our all-purpose bad guy.1 x You might be wondering why a picture of Tweedledee and Tweedledum is used to represent Trudy. After all, Trudy is typically a female name, so why two bad guys instead of one bad girl? One possible reason is that, occasionally, we need two bad guys, so it’s convenient to have both Tweedledee and Tweedledum available. Another plausible 1 INTRODUCTION 2 Figure 1.2: Trudy. Alice, Bob, Trudy, and the rest of the gang need not be humans. For example, one of many possible scenarios would have Alice as a laptop, Bob a server, and Trudy a human. 1.2 Alice’s Online Bank Suppose that Alice starts an online banking business, appropriately named Alice’s Online Bank,2 or AOB. What are Alice’s information security concerns? If Bob is Alice’s customer, what are his information security concerns? Are Bob’s concerns the same as Alice’s? If we look at AOB from Trudy’s perspective, what security vulnerabilities might we see? First, let’s consider the traditional triumvirate of confidentiality, integrity, and availability, or CIA,3 in the context of Alice’s Bank. Then we’ll point out some of the many other possible security concerns. 1.2.1 Confidentiality, Integrity, and Availability Confidentiality deals with preventing unauthorized reading of information. AOB probably wouldn’t care much about the confidentiality of the information it deals with, except for the fact that its customers certainly do. For example, Bob doesn’t want Trudy to know how much he has in his savings account. Alice’s Bank would also face legal problems if it failed to protect the confidentiality of such information. Integrity deals with preventing, or at least detecting, unauthorized “writing” (i.e., changes to data). Alice’s Bank must protect the integrity of account information to prevent Trudy from, say, increasing the balance in her account or changing the balance in Bob’s account. Note that confidentiality and integrity are not the same thing. For example, even if Trudy cannot read the data, she might be able to modify this unreadable data, which, if undetected, explanation is that you never know who might be acting as “Trudy.” While these would be good reasons for choosing the Tweedle brothers, the reality is that your easily amused author finds the picture, well, amusing. 2 Not to be confused with “Alice’s Restaurant” [135]. 3 No, not that CIA… 1.2 ALICE’S ONLINE BANK 3 would destroy its integrity. In this case, Trudy might not know what changes she had made to the data (since she can’t read it), but she might not care— sometimes just causing trouble is good enough. Denial of service, or DoS, attacks are a relatively recent concern. Such attacks try to reduce access to information. As a result of the rise in DoS attacks, data availability has become a fundamental issue in information security. Availability is an issue for both Alice’s Bank and Bob—if AOB’s website is unavailable, then Alice can’t make money from customer transactions and Bob can’t get his business done. Bob might then take his business elsewhere. If Trudy has a grudge against Alice, or if she just wants to be malicious, she might attempt a denial of service attack on Alice’s Online Bank. 1.2.2 Beyond CIA Confidentiality, integrity, and availability are only the beginning of the information security story. Beginning at the beginning, consider the situation when customer Bob logs on to his computer. How does Bob’s computer determine that “Bob” is really Bob and not Trudy? And when Bob logs into his account at Alice’s Online Bank, how does AOB know that “Bob” is really Bob, and not Trudy? Although these two authentication problems appear to be similar on the surface, under the covers they are actually completely different. Authentication on a standalone computer typically requires that Bob’s password be verified. To do so securely, some clever techniques from the field of cryptography are required. On the other hand, authentication over a network is open to many kinds of attacks that are not usually relevant on a standalone computer. Potentially, the messages sent over a network can be viewed by Trudy. To make matters worse, Trudy might be able to intercept messages, alter messages, and insert messages of her own making. If so, Trudy can simply replay Bob’s old messages in an effort to, say, convince AOB that she is really Bob. Since information security people are professional paranoids, 4 we always assume the worst. In any case, authentication over a network requires careful attention to protocol, that is, the composition and ordering of the exchanged messages. Cryptography also has an important role to play in security protocols. Once Bob has been authenticated by Alice’s Bank, then Alice must enforce restrictions on Bob’s actions. For example, Bob can’t look at Charlie’s account balance or install new accounting software on the AOB system. However, Sam, the AOB system administrator, can install new accounting software. Enforcing such restrictions goes by the name of authorization. Note that authorization places restrictions on the actions of authenticated users. 4 Rumor has it that the security people at Yahoo proudly carry the title of “Paranoids.” INTRODUCTION 4 Since authentication and authorization both deal with issues of access to resources, we’ll lump them together under the clever title of access control. All of the information security mechanisms discussed so far are implemented in software. And, if you think about it, other than the hardware, what isn’t software in a modern computing system? Today, software systems tend to be large, complex, and rife with bugs. A software bug is not just an annoyance, it is a potential security issue, since it may cause the system to misbehave. Of course, Trudy loves misbehavior. What software flaws are security issues, and how are they exploited? How can AOB be sure that its software is behaving correctly? How can AOB’s software developers reduce (or, ideally, eliminate) security flaws in their software? We’ll examine these software development related questions (and much more) in Chapter 11. Although bugs can (and do) give rise to security flaws, these problems are created unintentionally by well-meaning developers. On the other hand, some software is written with the intent of doing evil. Examples of such malicious software, or malware, includes the all-too-familiar computer viruses and worms that plague the Internet today. How do these nasty beasts do what they do, and what can Alice’s Online Bank do to limit their damage? What can Trudy do to increase the nastiness of such pests? We’ll also consider these and related questions in Chapter 11. Of course, Bob has many software concerns, too. For example, when Bob enters his password on his computer, how does he know that his password has not been captured and sent to Trudy? If Bob conducts a transaction at, how does he know that the transaction he sees on his screen is the same transaction that actually goes to the bank? That is, how can Bob be confident that his software is behaving as it should, instead of as Trudy would like it to behave? We’ll consider these questions as well. When discussing software and security, we’ll need to consider operating system, or OS, topics. Operating systems are themselves large and complex pieces of software and OSs are responsible for enforcing much of the security in any system. So, some basic knowledge of OSs is necessary to fully appreciate the challenges of information security. We’ll also briefly consider the concept of a trusted operating system, that is, an operating system that we can actually have reasonable confidence is doing the right thing. 1.3 A b o u t This Book Lampson [180] believes that real-world security boils down to the following. • Specification/policy — What is the system supposed to do? • Impleme
ntation/mechanism — How does it do it? 1.3 ABOUT THIS BOOK 5 • Correctness/assurance — Does it really work? Your humble author would humbly5 add a fourth category: • Human nature — Can the system survive “clever” users? The focus of this book is primarily on the implementation/mechanism front. Your fearless author believes this is appropriate, nay essential, for an introductory course, since the strengths, weaknesses, and inherent limitations of the mechanisms directly affect all other aspects of security. In other words, without a reasonable understanding of the mechanisms, it is not possible to have an informed discussion of other security issues. The material in this book is divided into four major parts. The first part deals with cryptography, while the next part covers access control. Part III is on protocols, while the final part deals with the vast and relatively illdefined topic of software. Hopefully, the previous discussion of Alice’s Online Bank6 has convinced you that these major topics are all relevant to real-world information security. In the remainder of this chapter, we’ll give a quick preview of each of these four major topics. Then the chapter concludes with a summary followed by some lovely homework problems. 1.3.1 Cryptography Cryptography or “secret codes” are a fundamental information security tool. Cryptography has many uses, including providing confidentiality and integrity, among other vital information security functions. We’ll discuss cryptography in detail, since this is essential background for any sensible discussion of information security. We’ll begin our coverage of cryptography with a look at a handful of classic cipher systems. In addition to their obvious historic and entertainment value, these classic ciphers illustrate the fundamental principles that are employed in modern digital cipher systems, but in a more user-friendly format. With this background, we’ll be prepared to study modern cryptography. Symmetric key cryptography and public key cryptography both play major roles in information security, and we’ll spend an entire chapter on each. We’ll then turn our attention to hash functions, which are another fundamental security tool. Hash functions are used in many different contexts in information security, some of which are surprising and not always intuitive. Then we’ll briefly consider a few special topics that are related to cryptography. For example, we’ll discuss information hiding, where the goal is for Alice and Bob to communicate without Trudy even knowing that any 5 6 This sentence is brought to you by the Department of Redundancy Department. You did read that, right? INTRODUCTION 6 information has been passed. This is closely related to the concept of digital watermarking, which we also cover briefly. The final chapter on cryptography deals with cryptanalysis, that is, the methods used to break cipher systems. Although this is relatively technical and specialized information, understanding these attack methods makes clear many of the design principles behind modern cryptographic systems. 1.3.2 A c c e s s Control As mentioned above, access control deals with authentication and authorization. In the area of authentication, we’ll consider the many issues related to passwords. Passwords are the most often used form of authentication today, but this is primarily because passwords are cheap, and definitely not because they are the most secure option.7 We’ll consider how to securely store passwords. Then we’ll delve into the issues surrounding secure password selection. Although it is possible to select reasonably strong passwords that are relatively easy to remember, it’s surprisingly difficult to enforce such policies on clever users. In any case, weak passwords present a major security vulnerability in most systems. The alternatives to passwords include biometrics and smartcards. We’ll consider some of the security benefits of these alternate forms of authentication. In particular, we’ll discuss the details of several biometrie authentication methods. Authorization deals with restrictions placed on authenticated users. Once Alice’s Bank is convinced that Bob is really Bob, it must enforce restrictions on Bob’s actions. The two classic methods for enforcing such restrictions are so-called access control lists 8 and capabilities. We’ll look at the plusses and minuses of each of these methods. Authorization leads naturally to a few relatively specialized topics. We’ll discuss multilevel security (and the related topic of compartments). For example, the United States government and military has TOP SECRET and SECRET information—some users can see both types of information, while other users can only see the SECRET information, and some can’t view either. If both types of information are stored on a single system, how can we enforce such restrictions? This is a thorny authorization issue that has potential implications beyond classified military systems. Multilevel security leads naturally into the rarified air of security modeling. The idea behind such modeling is to lay out the essential security requirements of a system. Ideally, by verifying a few simple properties we 7 If someone asks you why some weak security measure is used when better options are available, the correct answer is invariably “money.” 8 Access control list, or ACL, is one of many overloaded terms that arise in the field of information security. 1.3 ABOUT THIS BOOK 7 would know that a given system satisfies a particular security model. If so, the system would automatically inherit all of the security properties that are known to hold for such a model. We’ll only present two of the simplest security models, both of which arise in the context of multilevel security. Multilevel security also provides an opportunity to discuss covert channels and inference control. Covert channels are unintended channels of communication. Such channels are common in the real world and create potential security problems. Inference control, on the other hand, refers to attempts to limit the sensitive information that can unintentionally leak out of a database due to legitimate user queries. Both covert channels and inference control are difficult problems to deal with effectively in real-world systems. Since firewalls act as a form of access control for the network, we stretch the usual definition of access control to include firewalls. Regardless of the type of access control employed, attacks are bound to occur. An intrusion detection system (IDS) is designed to detect attacks in progress. So we include a brief discussion of IDS techniques after our discussion of firewalls. 1.3.3 Protocols We’ll then cover security protocols. First, we consider the general problem of authentication over a network. Many examples will be provided, each of which illustrates a particular security pitfall. For example, replay is a critical problem, and so we must consider effective ways to prevent such attacks. Cryptography will prove essential in authentication protocols. We’ll give example of protocols that use symmetric cryptography, as well as examples that rely on public key cryptography. Hash functions also have an important role to play in security protocols. Our study of simple authentication protocols will illustrate some of the subtleties that can arise in the field of security protocols. A seemingly insignificant change to a protocol can completely change its security. We’ll also highlight several specific techniques that are commonly used in real-world security protocols. Then we’ll move on to study several real-world security protocols. First, we look at the so-called Secure Shell, or SSH, which is a relatively simple example. Next, we consider the Secure Socket Layer, or SSL, which is used extensively to secure e-commerce on the Internet today. SSL is an elegant and efficient protocol. We’ll also discuss IPSec, which is another Internet security protocol. Conceptually, SSL and IPSec share many similarities, but the implementations differ greatly. In contrast to SSL, IPSec is complex and it’s often said to be over-engineered. Apparently d
ue to its complexity, some fairly significant security issues are present in IPSec—despite a lengthy and open development process. The contrast between SSL and IPSec illustrates some of the inherent INTRODUCTION 8 challenges and tradeoffs that arise when developing security protocols. Another real-world protocol that we’ll consider is Kerberos, which is an authentication system based on symmetric cryptography. Kerberos follows a much different approach than either SSL or IPSec. We’ll also discuss two wireless security protocols, WEP and GSM. Both of these protocols have many security flaws, including problems with the underlying cryptography and issues with the protocols themselves, which make them interesting case studies. 1.3.4 Software In the final part of the book, we’ll take a look at some aspects of security and software. This is a huge topic, and in three chapters we barely do more than scratch the surface. For starters, we’ll discuss security flaws and malware, which were mentioned above. We’ll also consider software reverse engineering, which illustrates how a dedicated attacker can deconstruct software, even without access to the source code. We then apply our newfound hacker’s knowledge to the problem of digital rights management, which provides a good example of the limits of security in software, particularly when that software executes in a hostile environment. Our final software-related topic is operating systems (OSs). The OS is the arbiter of many security operations, so it’s important to understand how the OS enforces security. We also consider the requirements of a so-called trusted OS, where “trusted” means that we can have confidence that the OS is performing properly, even when under attack. With this background in hand, we consider a recent attempt by Microsoft to develop a trusted OS for the PC platform. 1.4 The People Problem Users are surprisingly adept at damaging the best laid security plans. For example, suppose that Bob wants to purchase an item from amazon. com. Bob can use his Web browser to securely contact Amazon using the SSL protocol (discussed in Part III), which relies on various cryptographic techniques (see Part I). Access control issues arise in such a transaction (Part II), and all of these security mechanisms are enforced in software (Part IV). So far, so good. However, we’ll see in Chapter 10 that a practical attack on this transaction that will cause Bob’s Web browser to issue a warning. If Bob heeds the warning, no attack will occur. Unfortunately, if Bob is a typical user, he will ignore the warning, which has the effect of negating this sophisticated security scheme. That is, the security can be broken due to user error, despite the fact 1.5 PRINCIPLES AND PRACTICE 9 that the cryptography, protocols, access control, and software all performed flawlessly. To take just one more example, consider passwords. Users want to choose easy to remember passwords, but this also makes it easier for Trudy to guess passwords—as discussed in Chapter 7. A possible solution is to assign strong passwords to users. However, this is generally a bad idea since it is likely to result in passwords being written down and posted in prominent locations, likely making the system less secure than if users were allowed to choose their own (weaker) passwords. As mentioned above, the primary focus of this book is on understanding security mechanisms—the nuts and bolts of security. Yet in several places throughout the book, various “people problems” arise. It would be possible to write an entire volume on this single topic, but the bottom line is that, from a security perspective, the best solution is to remove the humans from the equation as much as possible. In fact, we will see some specific examples of this as well. For more information on the role that humans play in information security, a good source is Ross Anderson’s book [14]. Anderson’s book is filled with case studies of security failures, many of which have at least one of their roots somewhere in human nature. 1.5 Principles and Practice This book is not a theory book. While theory certainly has its place, in your opinionated author’s opinion, many aspects of information security are not yet ripe for a meaningful theoretical treatment. 9 Of course, some topics are inherently more theoretical than others. But even the more theoretical security topics can be understood without getting deeply into the theory. For example, cryptography can be (and often is) taught from a highly mathematical perspective. However, with rare exception, a little elementary math is all that is needed to understand important cryptographic principles. Your practical author has consciously tried to keep the focus on practical issues, but at a deep enough level to give the reader some understanding of— and appreciation for—the underlying concepts. The goal is to get into some depth without overwhelming the reader with trivial details. Admittedly, this is a delicate balancing act and, no doubt, many will disagree that a proper balance has been struck here or there. In any case, the book touches on a large number of security issues related to a wide variety of fundamental principles, 9 To take but one example, consider the infamous buffer overflow attack, which is certainly the most serious software security flaw of all time (see Section 11.2.1 of Chapter 11). What is the grand theory behind this particular exploit? There isn’t any—it’s simply due to a quirk in the way that memory is laid out in modern processors. INTRODUCTION 10 and this breadth necessarily comes at the expense of some rigor and detail. For those who yearn for a more theoretical treatment of the subject, Bishop’s book [34] is the obvious choice. 1.6 Problems The problem is not that there are problems. The problem is expecting otherwise and thinking that having problems is a problem. — Theodore I. Rubin 1. Among the fundamental challenges in information security are confidentiality, integrity, and availability, or CIA. a. Define each of these terms: confidentiality, integrity, availability. b. Give a concrete example where confidentiality is more important t h a n integrity. c. Give a concrete example where integrity is more important t h a n confidentiality. d. Give a concrete example where availability is the overriding concern. 2. From a bank’s perspective, which is usually more important, the integrity of its customer’s d a t a or the confidentiality of the data? From the perspective of the bank’s customers, which is more important? 3. Instead of an online bank, suppose t h a t Alice provides an online chess playing service known as Alice’s Online Chess (AOC). Players, who pay a monthly fee, log into AOC where they are matched with another player of comparable ability. a. Where (and why) is confidentiality important for AOC and its customers? b. Why is integrity necessary? c. Why is availability an important concern? 4. Instead of an online bank, suppose t h a t Alice provides an online chess playing service known as Alice’s Online Chess (AOC). Players, who pay a monthly fee, log into AOC where they are matched with another player of comparable ability. a. Where should cryptography be used in AOC? b. Where should access control used? 1.6 PROBLEMS 11 c. Where would security protocols be used? d. Is software security a concern for AOC? Why or why not? 5. Some authors distinguish between secrecy, privacy, and confidentiality. In this usage, secrecy is equivalent to our use of the term confidentiality, whereas privacy is secrecy applied to personal data, and confidentiality (in this misguided sense) refers to an obligation not to divulge certain information. a. Discuss a real-world situation where privacy is an important security issue. b. Discuss a real-world situation where confidentiality (in this incorrect sense) is a critical security issue. 6. RFID tags are extremely small devices capable of broadcasting a number over the air that can be read by a nearby sensor. RFID tags are used for tracking inventory, and they have many other potential uses. For example, RFID tags are used in passports and it has be
en suggested that they should be put into paper money to prevent counterfeiting. In the future, a person might be surrounded by a cloud of RFID numbers that would provide a great deal of information about the person. a. Discuss some privacy concerns related to the widespread use of RFID tags. b. Discuss security issues, other than privacy, that might arise due to the widespread use of RFID tags. 7. Cryptography is sometimes said to be brittle, in the sense that it can be very strong, but when it breaks, it (generally) completely shatters. In contrast, some security features can “bend” without breaking completely—security may be lost as a result of the bending, but some useful level of security remains. a. Other than cryptography, give an example where security is brittle. b. Provide an example where security is not brittle, that is, the security can bend without completely breaking. 8. Read Diffie and Hellman’s classic paper [90]. a. Briefly summarize the paper. b. Diffie and Hellman give a system for distributing keys over an insecure channel (see Section 3 of the paper). How does this system work? 12 INTRODUCTION c. Diffie and Hellman also conjecture that a “one way compiler” might be used to construct a public key cryptosystem. Do you believe this is a plausible approach? Why or why not? 9. The most famous World War II cipher machine was the German Enigma (see also Problem 10). a. Draw a diagram illustrating the inner workings of the Enigma. b. The Enigma was broken by the Allies and intelligence gained from Enigma intercepts was invaluable. Discuss a significant World War II event where broken Enigma messages played a major role. 10. The German Enigma is the most famous World War II cipher machine (see also Problem 9). The cipher was broken by the Allies and intelligence gained from Enigma messages proved invaluable. At first, the Allies were very careful when using the information gained from broken Enigma messages—sometimes the Allies did not use information that could have given them an advantage. Later in the war, however, the Allies (in particular, the Americans) were much less careful, and they tended to use virtually all information obtained from broken Enigma messages. a. The Allies were cautious about using information gained from broken Enigma messages for fear that the Germans would realize the cipher was broken. Discuss two different approaches that the Germans might have taken if they had realized that the Enigma was broken. b. At some point in the war, it should have become obvious to the Germans that the Enigma was broken, yet the Enigma was used until the end of the war. Why did the Nazis continue to use the Enigma? 11. When you want to authenticate yourself to your computer, most likely you type in your username and password. The username is considered public knowledge, so it is the password that authenticates you. Your password is something you know. a. It is also possible to authenticate based on something you are, that is, a physical characteristic. Such a characteristic is known as a biometrie. Give an example of biometric-based authentication. b. It is also possible to authenticate based on something you have, that is, something in your possession. Give an example of authentication based on something you have. 1.6 PROBLEMS 13 c. Two-factor authentication requires that two of the three authentication methods (something you know, something you have, something you are) be used. Give an example from everyday life where two-factor authentication is used. Which two of the three are used? 12. CAPTCHAs [319] are often used in an attempt to restrict access to humans (as opposed to automated processes). a. Give a real-world example where you were required to solve a CAPTCHA to gain access to some resource. What do you have to do to solve the CAPTCHA? b. Discuss various technical methods that might be used to break the CAPTCHA you described in part a. c. Outline a non-technical method that might be used to attack the CAPTCHA from part a. d. How effective is the CAPTCHA in part a? How user-friendly is the CAPTCHA? e. Why do you hate CAPTCHAs? 13. Suppose that a particular security protocol is well designed and secure. However, there is a fairly common situation where insufficient information is available to complete the security protocol. In such cases, the protocol fails and, ideally, a transaction between the participants, say, Alice and Bob, should not be allowed to occur. However, in the real world, protocol designers must decide how to handle cases where protocols fail. As a practical matter, both security and convenience must be considered. Comment on the relative merits of each of the following solutions to protocol failure. Be sure to consider both the relative security and user-friendliness of each. a. When the protocol fails, a brief warning is given to Alice and Bob, but the transaction continues as if the protocol had succeeded, without any intervention required from either Alice or Bob. b. When the protocol fails, a warning is given to Alice and she decides (by clicking a checkbox) whether the transaction should continue or not. c. When the protocol fails, a notification is given to Alice and Bob and the transaction terminates. d. When the protocol fails, the transaction terminates with no explanation given to Alice or Bob. 14. Automatic teller machines (ATMs) are an interesting case study in security. Anderson [14] claims that when ATMs were first developed, most INTRODUCTION 14 attention was paid to high-tech attacks. However, most real-world attacks on ATMs have been decidedly low tech. a. Examples of high-tech attacks on ATMs would be breaking the encryption or authentication protocol. If possible, find a real-world case where a high-tech attack on an ATM has actually occurred and provide the details. b. Shoulder surfing is an example of a low-tech attack. In this scenario, Trudy stands behind Alice in line and watches the numbers Alice presses when entering her PIN. Then Trudy bonks Alice in the head and takes her ATM card. Give another example of a low-tech attack on an ATM that has actually occurred. 15. Large and complex software systems invariably have a large number of bugs. a. For honest users, such as Alice and Bob, buggy software is certainly annoying but why is it a security issue? b. Why does Trudy love buggy software? c. In general terms, how might Trudy use bugs in software to break the security of a system? 16. Malware is software that is intentionally malicious, in the sense that it is designed to do damage or break the security of a system. Malware comes in many familiar varieties, including viruses, worms, and Trojans. a. Has your computer ever been infected with malware? If so, what did the malware do and how did you get rid of the problem? If not, why have you been so lucky? b. In the past, most malware was designed to annoy users. Today, it is often claimed that most malware is written for profit. How could malware possibly be profitable? 17. In the movie Office Space [223], software developers attempt to modify company software so that for each financial transaction, any leftover fraction of a cent goes to the developers, instead of going to the company. The idea is that for any particular transaction, nobody will notice the missing fraction of a cent, but over time the developers will accumulate a large sum of money. This type of attack is sometimes known as a salami attack. a. Find a real-world example of a salami attack. b. In the movie, the salami attack fails. Why? 1.6 PROBLEMS 15 18. Some commercial software is closed source, meaning that the source code is not available to users. On the other hand, some software is open source, meaning that the source code is available to users. a. Give an example of software that you use (or have used) that is closed source. b. Give an example of software that you use (or have used) that is open source. c. For open source software, what can Trudy do to search for security flaws in the software? d. For closed source software, what can Trudy do to search for security flaws in the software? e. For open source software, what can Alice do to ma
ke the software more secure? f. For closed source software, what can Alice do to make the software more secure? g. Which is inherently more secure, open source software or closed source software? Why? 19. It’s sometimes said that complexity is the enemy of security. a. Give an example of commercial software to which this statement applies, that is, find an example of software that is large and complex and has had significant security problems. b. Find an example of a security protocol to which this statement applies. 20. Suppose that this textbook was sold online (as a PDF) by your moneygrubbing author for, say, $5. Then the author would make more money off of each copy sold than he currently does 10 and people who purchase the book would save a lot of money. a. What are the security issues related to the sale of an online book? b. How could you make the selling of an online book more secure, from the copyright holder’s perspective? c. How secure is your approach in part b? What are some possible attacks on your proposed system? 21. The PowerPoint slides at [255] describe a security class project where students successfully hacked the Boston subway system. 10 Believe it or not. 16 INTRODUCTION a. Summarize each of the various attacks. What was the crucial vulnerability that enabled each attack to succeed? b. The students planned to give a presentation at the self-proclaimed “hacker’s convention,” Defcon 16 [80], where they would have presented the PowerPoint slides now available at [255]. At the request of the Boston transit authority, a judge issued a temporary restraining order (since lifted) that prevented the students from talking about their work. Do you think this was justified, based on the material in the slides? c. What are war dialing and war driving? What is war carting? d. Comment on the production quality of the “melodramatic video about the warcart” (a link to the video can be found at [16]). Information Security: Principles and Practice, Second Edition by Mark Stamp Copyright © 2011 John Wiley & Sons, Inc. Part I Crypto Information Security: Principles and Practice, Second Edition by Mark Stamp Copyright © 2011 John Wiley & Sons, Inc. Chapter 2 Crypto Basics MXDXBVTZWVMXNSPBQXLIMSCCSGXSCJXBOVQXCJZMOJZCVC TVWJCZAAXZBCSSCJXBQCJZCOJZCNSPOXBXSBTVWJC JZDXGXXMOZQMSCSCJXBOVQXCJZMOJZCNSPJZHGXXMOSPLH JZDXZAAXZBXHCSCJXTCSGXSCJXBOVQX — plaintext from Lewis Carroll, Alice in Wonderland The solution is by no means so difficult as you might be led to imagine from the first hasty inspection of the characters. These characters, as any one might readily guess, form a cipher—that is to say, they convey a meaning… — Edgar Allan Poe, The Gold Bug 2.1 Introduction In this chapter we’ll discuss some of the basic elements of cryptography. This discussion will lay the foundation for the remaining crypto chapters which, in turn, underpin much of the material throughout the book. We’ll avoid mathematical rigor as much as possible. Nevertheless, there is enough detail here so t h a t you will not only understand the “what” but you will also have some appreciation for the “why.” After this introductory chapter, the remaining crypto chapters focus on: • Symmetric key cryptography • Public key cryptography • Hash functions • Advanced cryptanalysis A handful of special topics are also covered. 19 CRYPTO 20 2.2 BASICS How t o Speak C r y p t o The basic terminology of crypto includes the following. • Cryptology — the art and science of making and breaking “secret codes.” • Cryptography — the making of “secret codes.” • Cryptanalysis — the breaking of “secret codes.” • Crypto — a synonym for any or all of the above (and more), where the precise meaning should be clear from context. A cipher or crypto system is used to encrypt data. The original unencrypted data is known as plaintext, and the result of encryption is ciphertext. We decrypt the ciphertext to recover the original plaintext. A key is used to configure a cryptosystem for encryption and decryption. In a symmetric cipher, the same key is used to encrypt and to decrypt, as illustrated by the black box cryptosystem in Figure 2.I. 1 There is also a concept of public key cryptography where the encryption and decryption keys are different. Since different keys are used, it’s possible to make the encryption key public—thus the name public key.2 In public key crypto, the encryption key is, appropriately, known as the public key, whereas the decryption key, which must remain secret, is the private key. In symmetric key crypto, the key is known as a symmetric key. We’ll avoid the ambiguous term secret key. key plaintext- key encrypt — / W W w W V — * decrypt ■ plaintext ciphertext Figure 2.1: Crypto as a Black Box For an ideal cipher, it is infeasible to recover the plaintext from the ciphertext without the key. That is, even if the attacker, Trudy, has complete knowledge of the algorithms used and lots of other information (to be made more precise later), she can’t recover the plaintext without the key. That’s the goal, although reality sometimes differs. 1 This is the only black box you’ll find in this book! Public key crypto is also known as asymmetric crypto, in reference to the fact that the encryption and decryption keys are different. 2 2.2 HOW TO SPEAK CRYPTO 21 A fundamental tenet of cryptography is that the inner workings of a cryptosystem are completely known to the attacker, Trudy, and the only secret is a key. This is known as Kerckhoffs’ Principle, which, believe it or not, was named after a guy named Kerckhoffs. In the year 1883, Kerckhoffs, a Dutch linguist and cryptographer, laid out six principles of cipher design and use [164]. The principle that now bears his name states that a cipher “must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience” [165], that is, the design of the cipher is not secret. What is the point of Kerckhoffs’ Principle? After all, it must certainly be more difficult for Trudy to attack a cryptosystem if she doesn’t know how the cipher works. So, why would we want to make Trudy’s life easier? There are (at least) a couple of problems with relying on a secret design for your security. For one, the details of “secret” cryptosystems seldom, if ever, remain secret for long. Reverse engineering can be used to recover algorithms from software, and even algorithms embedded in tamper-resistant hardware are sometimes subject to reverse engineering attacks and exposure. And, even more worrisome is the fact that secret crypto-algorithms have a long history of failing to be secure once the algorithms have been exposed to public scrutiny—see [29] for a relatively recent example where Microsoft violated Kerckhoffs’ Principle. Cryptographers will not deem a crypto-algorithm worthy of use until it has withstood extensive public analysis by many cryptographers over an extended period of time. The bottom line is that any cryptosystem that does not satisfy Kerckhoffs’ Principle is suspect. In other words, ciphers are presumed guilty until “proven” innocent. Kerckhoffs’ Principle is often extended to cover various aspects of security well beyond cryptography. In other contexts, this basic principle is usually taken to mean that the security design itself is open to public scrutiny. The belief is that “more eyeballs” are more likely to expose more security flaws and therefore ultimately result in a system that is more secure. Although Kerckhoffs’ Principle (in both its narrow crypto form and in a broader context) seems to be universally accepted in principle, there are many real-world temptations to violate this fundamental tenet, almost invariably with disastrous consequences. Throughout this book we’ll see several examples of security failures that were directly caused by a failure to heed the venerable Mr. Kerckhoffs. In the next section, we look briefly at a few classic cryptosystems. Although the history of crypto is a fascinating topic [159], the purpose of this material is to provide an elementary introduction to some of the crucial concepts that arise in modern cryptogra
phy. In other words, pay attention since we will see all of these concepts again in the next couple of chapters and in many cases, in later chapters as well. CRYPTO 22 2.3 BASICS Classic C r y p t o In this section, we examine four classic ciphers, each of which illustrates a feature that is relevant to modern cryptosystems. First on our agenda is the simple substitution, which is one of the oldest cipher systems—dating back at least 2,000 years—and one that is good for illustrating some basic attacks. We then turn our attention to a type of double transposition cipher, which includes important concepts that are used in modern ciphers. We also discuss classic codebooks, since many modern ciphers can be viewed as the “electronic” equivalent of codebooks. Finally, we consider the so-called onetime pad, a practical cipher that is provably secure. No other cipher in this book (or in common use) is provably secure. 2.3.1 Simple S u b s t i t u t i o n Cipher First, we consider a particularly simple implementation of a simple substitution cipher. In the simplest case, the message is encrypted by substituting the letter of the alphabet n places ahead of the current letter. For example, with n = 3, the substitution—which acts as the key—is plaintext: a b c d e f g h i j k l m n o p q r s t u v w x y z ciphertext: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C where we’ve followed the convention that the plaintext is lowercase, and the ciphertext is uppercase. In this example, the key could be given succinctly as “3” since the amount of the shift is, in effect, the key. Using the key 3, we can encrypt the plaintext message fourscoreandsevenyearsago (2-1) by looking up each plaintext letter in the table above and then substituting the corresponding letter in the ciphertext row, or by simply replacing each letter by the letter that is three positions ahead of it in the alphabet. For the particular plaintext in (2.1), the resulting ciphertext is IRXUVFRUHDAGVHYHABHDUVDIR. To decrypt this simple substitution, we look up the ciphertext letter in the ciphertext row and replace it with the corresponding letter in the plaintext row, or we can shift each ciphertext letter backward by three. The simple substitution with a shift of three is known as the Caesar’s cipher.3 3 Historians generally agree that the Caesar’s cipher was named after the Roman dictator, not the salad. 2.3 CLASSIC CRYPTO 23 There is nothing magical about a shift by three—any shift will do. If we limit the simple substitution to shifts of the alphabet, then the possible keys are n G {0,1,2,…, 25}. Suppose Trudy intercepts the ciphertext message CSYEVIXIVQMREXIH and she suspect that it was encrypted with a simple substitution cipher using a shift by n. Then she can try each of the 26 possible keys, “decrypting” the message with each putative key and checking whether the resulting putative plaintext makes sense. If the message really was encrypted via a shift by n, Trudy can expect to find the true plaintext—and thereby recover the key— after 13 tries, on average. This brute force attack is something that Trudy can always attempt. Provided that Trudy has enough time and resources, she will eventually stumble across the correct key and break the message. This most elementary of all crypto attacks is known as an exhaustive key search. Since this attack is always an option, it’s necessary (although far from sufficient) that the number of possible keys be too large for Trudy to simply try them all in any reasonable amount of time. How large of a keyspace is large enough? Suppose Trudy has a fast computer (or group of computers) that’s able to test 2 40 keys each second.4 Then a keyspace of size 2 56 can be exhausted in 2 16 seconds, or about 18 hours, whereas a keyspace of size 2 64 would take more than half a year for an exhaustive key search, and a keyspace of size 2 128 would require more than nine quintillion years. For modern symmetric ciphers, the key is typically 128 bits or more, giving a keyspace of size 2128 or more. Now, back to the simple substitution cipher. If we only allow shifts of the alphabet, then the number of possible keys is far too small, since Trudy can do an exhaustive key search very quickly. Is there any way that we can increase the number of keys? In fact, there is no need not to limit the simple substitution to a shifting by n, since any permutation of the 26 letters will serve as a key. For example, the following permutation, which is not a shift of the alphabet, gives us a key for a simple substitution cipher: plaintext: a b c d e f g h i j k l m n o p q r s t u v w x y z ciphertext: Z P B Y J R G K F L X Q N W V D H M S U T O I A E C In general, a simple substitution cipher can employ any permutation of the alphabet as a key, which implies that there are 26! « 2 88 possible keys. With 4 In 1998 the Electronic Frontier Foundation (EFF) built a special-purpose key cracking machine for attacking the Data Encryption Standard (DES, which we’ll discuss in the next chapter). This machine, which cost $220,000, consisted of about 43,200 processors, each of which ran at 40 MHz and, overall, it was capable of testing about 2.5 million keys per second [156]. Extrapolating this to a state-of-the-art PC with a single 4 GHz processor, Trudy could test fewer than 2 3 0 keys per second on one such machine. So, if she had access to 1000 such machines, she could test about 2 4 0 keys per second. CRYPTO 24 BASICS Trudy’s superfast computer that tests 2 40 keys per second, trying all possible keys for the simple substitution would take more than 8900 millennia. Of course, she would expect to find the correct key half that time, or just 4450 millennia. Since 2 88 keys is far more than Trudy can try in any reasonable amount of time, this cipher passes the crucial first requirement of any practical cipher, namely, the keyspace is big enough so that an exhaustive key search is infeasible. Does this mean that a simple substitution cipher is secure? The answer is a resounding no, as the attack described in the next section clearly illustrates. 2.3.2 Cryptanalysis of a Simple S u b s t i t u t i o n Suppose Trudy intercepts the following ciphertext, which she suspects was produced by a simple substitution cipher, where the key could be any permutation of the alphabet: PBFPVYFBQXZTYFPBFEQJHDXXQVAPTPQJKTOYQWIPBWLXTOXBTFXCÌWA XBVCXQWAXFQJVWLEQNTOZQGGQLFXQWAKVWLXQWAEBIPBFXFqVXGTVJV WLBTPQWAEBFPBFHCVLXBQUFEWLXGDPEQVPQGVPPBFTIXPFHXZHVFAG FOTHFEFBQUFTDHZBQPOTHXTYFTODXQHFTDPTOGHFqPBQWAqjJTODXqH FOQPWTBDHHIXqVAPBFZqHCFWPFHPBFIPBqWKFABVYYDZBOTHPBqPqjT qOTOGHFqAPBFEqjHDXXqVAVXEBqPEFZBVFOJIWFFACFCCFHQWAUVWFL qHGFXVAFXqHFUFHILTTAVWAFFAWTEVDITDHFHFqAITIXPFHXAFqHEFZ qWGFLVWPTOFFA , » ^ ‘ ‘ Since it’s too much work for Trudy to try all 2 8 8 possible keys, can she be more clever? Assuming the plaintext is English, Trudy can make use of the English letter frequency counts in Figure 2.2 together with the frequency counts for the ciphertext in (2.2), which appear in Figure 2.3. A B C D E F 6 H I J K L M N O P Q R S T U V W X Y Z Figure 2.2: English Letter Frequency Counts From the ciphertext frequency counts in Figure 2.3, we see that “F” is the most common letter in the encrypted message and, according to Figure 2.2, “E” is the most common letter in the English language. Trudy therefore 2.3 CLASSIC CRYPTO 25 A B C D E F e H I J K L M N O P Q R S T U V W X Y Z Figure 2.3: Ciphertext Frequency Counts surmises that it’s likely that “F” has been substituted for “E.” Continuing in this manner, Trudy can try likely substitutions until she recognizes words, at which point she can be confident in her guesses. Initially, the easiest word to determine might be the first word, since Trudy doesn’t know where inter-word spaces belong in the text. Since the third plaintext letter appears to be “e,” and given the high frequency counts of the first two letter, Trudy might reasonably guess (correctly, as it turns out) that the first word of the plaintext is “the.” Making these substitutions into the remaining ciphertext, she will be able to guess more letters and the puzzle will begin to unravel.
Trudy will likely make some missteps along the way, but with sensible use of the statistical information available, she will find the plaintext in considerably less time than 4450 millennia. This attack on the simple substitution shows that a large keyspace is not sufficient to ensure security. This attack also shows that cipher designers must guard against clever attacks. But how can we protect against all such attacks, since new attacks are developed all the time? The answer is that we can’t and, as a result, a cipher must be subjected to extensive analysis by skilled cryptographers before we can trust it—the more skilled cryptographers who have tried to break a cipher and failed, the more confidence we have in the system. 2.3.3 Definition of Secure There are several reasonable definitions of a secure cipher. Ideally, we would like to have a rigorous mathematical proof that there is no feasib…