By Kenton Varda - 01 May 2015
So here’s a fun problem. Let’s say that a web site wants to implement authentication based on public-key cryptography. That is, to log into your account, you need to prove that you possess a particular public/private key pair associated with the account. You can use the same key pair for several such sites, because none of the sites ever see the private key and therefore cannot impersonate you to each other. Way better than passwords!
Asymmetric encryption algorithms (ones with public/private key pairs) generally offer one or more of the following low-level operations:
- Encrypt: Anyone can use the public key to encrypt a message that only the holder of the private key can decrypt.
- Sign: The holder of the private key can sign a message, and anyone can use the public key to verify that only the private key holder could have produced that signature.
Let’s say your keys are RSA (probably the most common case today). To log in, you need to prove to the site that you have possession of your private key. How might we use the available operations (encrypt and sign) to do this? Keeping in mind that we must make sure someone snooping on a previous login cannot simply replay those past messages to log in again, we might propose:
- The encryption approach: the site encrypts a random value using your public key and sends you the ciphertext; you decrypt it and send back the value, which only the holder of the private key could do.
- The signature approach: the site sends you a random value; you sign it and send back the signature, which the site verifies against your public key.
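To make the signature approach concrete, here is a minimal sketch of my own (using RSA keys via the Python "cryptography" library; not any real site’s protocol):

```
# A minimal sketch (my own illustration, not a real login protocol) of the
# naive signature approach.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()   # the site has this on file

# Site: pick a fresh random challenge so that old logins can't be replayed.
challenge = os.urandom(32)

# You: prove possession of the private key by signing the site's challenge.
signature = private_key.sign(challenge, padding.PKCS1v15(), hashes.SHA256())

# Site: verify the signature; raises InvalidSignature if the proof is bad.
public_key.verify(signature, challenge, padding.PKCS1v15(), hashes.SHA256())
```

Notice that in this naive flow the site chooses the exact bytes you sign. That detail will matter shortly.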
Both basic approaches are commonly used for authentication. SSH, for example, used the encryption approach in v1, then switched to the signature approach in v2, in order to accommodate non-RSA key types. But the protocols used in practice are more complicated than described above, because…
There is a huge problem with both approaches! In both cases, the web site controls the “random value” that is signed or the message to be decrypted. You’ve effectively given the web site complete access to your key:
- With the encryption approach, the site can hand you any ciphertext encrypted to your public key – perhaps one it intercepted somewhere else – and you will happily decrypt it.
- With the signature approach, the site can hand you any message – perhaps one that means something entirely different in another context – and you will happily sign it.
(Of course, both of these problems can trivially be exploited to allow an evil web site operator to secretly log into some other web site as you, by relaying the login challenge from that other site.)
So what do we do about this?
Let’s focus on the signature case. Naively, you might say: We need to restrict the format of the message the site presents for signature, so that it cannot be misinterpreted to mean anything else. The message will need a random component, of course, in order to prevent replays, but we can delimit that randomness in ASCII text which states the purpose. (In particular, it’s important that the text specify exactly what web site we’re trying to log into.) So, now we have:
z.example.com login challenge:1Z5ns8ectRGTMNYz3NHdB 699a674d30fc3bd42ec3619cfab13deb
Here is a message that could not reasonably be interpreted as anything except “I want to log in to z.example.com”, right? It’s safe to sign, right?
I’m sorry to say, but there is still a problem. It turns out that message isn’t the ASCII text it appears to be:
$ echo -n 'z.example.com login challenge:1Z5ns8ectRGTMNYz3NHdB 699a674d30fc3bd42ec3619cfab13deb' \
| protoc --decode=Check check.proto
toAccount: "699a674d30fc3bd42ec3619cfab13deb"
amount: 100
memo: "example.com login challenge:1Z5ns8ectRGTMNYz3N"
Oops, it appears you just signed a bank check after all – and you were trying so hard not to! The check was encoded in Google’s binary encoding format Protocol Buffers – we were able to decode it using the protoc tool. Here is the Protobuf schema (the file check.proto that we passed to protoc):
message Check {
  required string toAccount = 8;   // 32-byte hex account number
  required uint32 amount = 9;      // Dollar amount to pay.
  optional string memo = 15;       // Just a comment, not important.
}
Thanks for your $100. Hope you have fun on z.example.com.
The trick here is that, using my knowledge of Protobuf encoding, I was able to carefully craft an ASCII message that happened to be a valid Protobuf message. I chose Protobufs because of my familiarity with them, but many binary encodings could have worked here. In particular, most binary formats include ways to insert bytes which will be ignored, giving us an opening to insert some human-readable ASCII text. Here I pulled the text into the memo field, but had I not defined that field, Protobuf would have ignored the text altogether with no error.
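To see concretely how the ASCII doubles as Protobuf wire format, here is a small sketch of my own (not part of the original post’s tooling) that decodes the first few tag bytes by hand, using the field numbers from the schema above:

```
# Each Protobuf tag byte packs (field_number << 3) | wire_type, so certain
# ASCII characters happen to be perfectly good tags.
challenge = (b"z.example.com login challenge:1Z5ns8ectRGTMNYz3NHdB "
             b"699a674d30fc3bd42ec3619cfab13deb")

def tag(byte):
    return byte >> 3, byte & 0x07   # (field number, wire type)

print(tag(challenge[0]))   # 'z' = 0x7A -> field 15 (memo), wire type 2 (length-delimited)
print(challenge[1])        # '.' = 0x2E -> length 46: the next 46 bytes become the memo
print(tag(challenge[48]))  # 'H' = 0x48 -> field 9 (amount), wire type 0 (varint)
print(challenge[49])       # 'd' = 0x64 -> varint value 100
print(tag(challenge[50]))  # 'B' = 0x42 -> field 8 (toAccount), wire type 2
print(challenge[51])       # ' ' = 0x20 -> length 32: the hex account number
```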
Is this attack realistic? If the person designing the Check protocol intended to make the attack possible, then absolutely! No one would ever notice that the field numbers had been carefully chosen to encode nicely as ASCII, especially if plausible-looking optional fields were defined to fill in the gaps between them. (Of course, a real attacker would have omitted the memo field.)
Even if the other format were not designed maliciously, it’s entirely possible for two protocols to assign different meanings to the same data by accident – such as an IRC server accepting input from an HTTP client. This kind of problem is known as inter-protocol exploitation. We’ve simply extended it to signatures.
Here’s the thing: When you create a key pair for signing, you need to decide exactly what format of data you plan to sign with it. Then, don’t allow signatures made with that key to be accepted for any other purpose.
So, if you have a key used for logging into web sites, it can only be used for logging into web sites – and all those web sites must use exactly the same format for authentication challenges.
Note that the key “format” can very well be “natural language messages encoded in UTF-8 to be read by humans”, and then you can use the key to sign or encrypt email and such. But, then the key cannot be used for any automated purpose (like login) unless the protocol specifies an unambiguous natural-language “envelope” for its payload. Most such protocols (such as SSH authentication) do no such thing. (This is why, even though SSH and PGP both typically use RSA, you should NOT convert your PGP key into an SSH key nor vice versa.)
Another valid way to allow one key to be used for multiple purposes is to make sure all signed messages begin with a “context string”, for example as recommended by Adam Langley for TLS. However, in this case, all applications which use the key for signing must include such a context string, and context strings must be chosen to avoid conflicts.
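As a rough illustration (my own sketch, not Langley’s exact construction), domain separation with a context string can be as simple as prefixing every message before signing and verifying; the context label below is hypothetical:

```
# A minimal sketch of context-string domain separation, assuming an RSA key
# loaded via the Python "cryptography" library.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

CONTEXT = b"example-webapp-login-v1\x00"   # hypothetical context label

def sign_for_login(private_key, challenge: bytes) -> bytes:
    # Every signature made with this key starts with the context string,
    # so a verifier for any other purpose will reject it.
    return private_key.sign(CONTEXT + challenge,
                            padding.PKCS1v15(), hashes.SHA256())

def verify_login(public_key, challenge: bytes, signature: bytes) -> None:
    # Raises InvalidSignature if the signature (or the context) doesn't match.
    public_key.verify(signature, CONTEXT + challenge,
                      padding.PKCS1v15(), hashes.SHA256())
```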
But probably better than either of the above is to use a different key for each purpose. So, your PGP master key should not be used for anything except signing subkeys. Unfortunately, PGP subkeys do not allow you to express much about the purpose of subkeys – you can only specify “encryption”, “signing”, “certification”, or “authentication”. This does not seem good enough – e.g. an “authentication” key still cannot safely be used for SSH login because a malicious SSH server could still trick you into signing an SSH login request that happens to be a valid authentication request for a completely different service in a different format. (Admittedly, this possibility seems unlikely, but tricking an HTTP client into talking to a server expecting a completely different protocol also seems unlikely, yet it works e.g. with IRC. We can’t really prove that no other protocol is designed exactly wrong such that it interprets SSH’s authentication signatures as something else.) Alas, in practice, SSH is exactly what the “authentication” subkey type is used for! So, perhaps SSH should be considered the de facto standard format for PGP’s “authentication” keys, and if you implement an authentication system that accepts PGP keys designated for authentication, you should be sure to exactly match SSH’s authentication signature format.
Similarly, X.509 certificates have the “extended key usage” extension to specify a key’s designated purpose, but the options here are similarly limited.
Another line of thought says simply: “Don’t use signatures for anything except signing certificates.” Indeed, signing email is arguably a bad idea: it means that not only can the recipient verify that it came from you, but the recipient can prove to other people that you wrote the message, which is perhaps not what you really wanted. Similarly, when you log into a server, you probably don’t want that server to be able to prove to the rest of the world that you logged in (as SSH servers currently can do!). What you probably really wanted is a zero-knowledge authentication system where only the recipient can know for sure that the message came from you (which can be built on, for example, key agreement protocols).
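For a flavor of what that might look like, here is a highly simplified sketch of my own (X25519 key agreement plus an HMAC over a challenge; a real protocol would add proper key derivation and transcript binding):

```
# A sketch of authentication built on key agreement instead of signatures.
import hashlib, hmac, os
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

client_key = X25519PrivateKey.generate()   # client's long-term identity key
server_key = X25519PrivateKey.generate()   # server's long-term identity key

# Both sides compute the same shared secret from their own private key and
# the other side's public key.
client_secret = client_key.exchange(server_key.public_key())
server_secret = server_key.exchange(client_key.public_key())

# Server sends a fresh challenge; client proves it knows the shared secret.
challenge = os.urandom(32)
proof = hmac.new(client_secret, challenge, hashlib.sha256).digest()

# The server checks the proof with its own copy of the secret.
expected = hmac.new(server_secret, challenge, hashlib.sha256).digest()
assert hmac.compare_digest(proof, expected)
```

Because the server could have computed the same proof itself, it learns that you logged in but cannot convince anyone else of it.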
In any case, when Sandstorm implements authentication based on public keys, it will probably not be based on signatures. But, we still have more research to do here.
Thanks to Ryan Sleevi, Tony Arcieri, and Glenn Willen for providing pointers and insights that went into writing this article. However, any misunderstandings of cryptography are strictly my own.
By Jade Wang - 17 Apr 2015
How does the technological development of a civilization relate to its culture?
At equilibrium, a society with safe, affordable, and widely accessible forms of birth control has different cultural values from one without. A society with affordable, ubiquitous, and open Internet access will have different cultural values, productivity, and global impact on other cultures than one without. And of course, a society with ubiquitous smartphone cameras will hold its police officers more accountable for human lives than one that relies on human eyewitness testimony.
Ubiquitous. Widely accessible. Democratized. It is not enough for these technologies to simply exist; there is a limit to the impact of a technology that only the wealthy few percent can afford, and a limit on the pressure exerted on that culture to adapt to its new environmental conditions. There is something amazing, akin to a phase transition, that happens when each technology gets democratized past a certain point. It’s like the difference between a society with access to measles and polio vaccines and one without. It’s the efficiency of trade in a developing nation with infrastructure to support basic, affordable SMS service versus one without.
Those of us working in technology today would do well to keep these things in mind: we don’t just serve deep pockets, we serve humanity. Democratizing access to existing technology is just as important as creating new technology.
The cultural effects of widespread and democratized access to personal servers are left as an exercise for the reader.
Defining “progress” is also left as an exercise for the reader.
By Jason Paryani - 13 Apr 2015
Last week, I packaged Let’s Chat, an open source team chat app, similar to Slack. You can try it on the Sandstorm demo right now, or install it from the app list.
As described by its developers:
Let’s Chat is a persistent messaging application that runs on Node.js and MongoDB. It’s designed to be easily deployable and fits well with small, intimate teams.
A few notes on the packaging process: This package doesn’t use the normal MongoDB, but instead Kenton’s fork. This allows the app to only use ~300KB of storage when each instance is created instead of the normal minimum of a few hundred MB for Mongo. Otherwise, the packaging process went very smoothly except for dealing with passport, a Node.js authentication library. I ended up forking passport-token to look at HTTP headers only (by default it will look at POST data as well, which would make it insecure inside Sandstorm). You can see the forked code here, and if there’s some interest, I can work to get it published on NPM.
You can see the full package source here, and if you want to use Let’s Chat, check it out on the app list. Feel free to let me know what you think on sandstorm-dev.
By Kenton Varda - 08 Apr 2015
A few months ago I discovered a security bug in the Darwin kernel used by most Apple products. The bug could allow an attacker to trivially remotely DoS a variety of network services and apps, from Node.js to Chrome. Today, Apple released a patch (look for CVE-2015-1105), so now I can tell you about it.
Now, just to be clear, I’m no Adam Langley. This bug is “just” a DoS, nothing like a Heartbleed or a Shellshock. The worst it can do is allow an attacker to cause a temporary service disruption. But I think all security bugs deserve a writeup so that we can learn from them, and Apple’s terse description of the problem doesn’t accomplish this. Also, it’s a fun story.
I discovered the problem while doing research on the different interfaces that various operating systems provide for doing event-driven I/O – that is, how you tell the platform: “Here are all my open connections; wake me up when one of them receives a message.” It turns out that every OS does this differently. Linux has epoll. BSD has kqueue. Windows has… well, about five different mechanisms that cover differing subsets of use cases, and you can only choose one. In any case, I was trying to build an abstraction layer over these for Cap’n Proto, so I wanted to make sure I understood them all.
I noticed a curious thing: some man pages discussed an event called “out-of-band data” while others didn’t. “Out-of-band data” (OOB), also known as “urgent data”, is a little-used feature of TCP connections that essentially allows you to send a byte that “jumps the queue” so that the receiving app can receive it before receiving other data sent before it. You probably didn’t know about this, because basically no one uses this feature – except for Telnet, which needs a way to signal that you pressed “ctrl+C” when the destination app is not otherwise processing input.
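For reference, sending urgent data requires nothing special on the client side – just a flag on send(). A quick sketch of my own (the host here is a placeholder):

```
# Send a single TCP "urgent" (out-of-band) byte.
import socket

sock = socket.create_connection(("example.com", 80))  # placeholder target
sock.send(b"!", socket.MSG_OOB)   # one byte that "jumps the queue"
```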
With almost all event notification APIs, regular data and OOB data raise different kinds of events. For example, poll() (and its successor on Linux, epoll) has POLLIN for regular data and POLLPRI for OOB. This way, if your app does not expect to receive OOB data, it simply doesn’t ask to be notified about that type of event, and the kernel happily discards it for you (or maybe inserts it into the regular stream, which is fine).
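In code, the typical registration looks something like this sketch (mine, with a placeholder peer):

```
# Register only for POLLIN; since POLLPRI isn't requested, OOB data never
# produces an event for this app.
import select, socket

sock = socket.create_connection(("example.com", 80))  # placeholder peer
poller = select.poll()
poller.register(sock, select.POLLIN)   # regular data only

for fd, events in poller.poll():
    if events & select.POLLIN:
        data = sock.recv(4096)         # there really is in-band data to read
```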
Curiously, the BSD kqueue docs are unclear on how OOB data is handled. FreeBSD’s kqueue makes no mention of it, and as far as I’ve been able to determine it simply doesn’t support notification of OOB events at all. DragonflyBSD’s kqueue defines an EVFILT_EXCEPT event type.
Darwin’s (OSX/iOS) kqueue also doesn’t mention OOB data, but some Googling revealed an undocumented “feature”: on OOB data, Darwin will raise a regular EVFILT_READ event (which normally indicates that regular in-band data was received) but set the special flag EV_OOBAND on the event structure.
Of course, if you aren’t expecting OOB data, you’re not going to check for that flag. So when you receive EVFILT_READ, you’re going to believe you’ve received regular data. And you’re going to do a recv() call to read that data, and there isn’t going to be any. And then you’re going to say “oh well” and return to the event loop. But if you are using kqueue() in level-triggered mode (as most people do, because it’s easier), then the operating system is going to see that the OOB data is still there, and is going to give you the exact same event again.
So you go into an infinite loop.
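Condensed into code, the stuck pattern looks roughly like this (a sketch of my own, not any particular app’s event loop; the address is a placeholder):

```
# Level-triggered kqueue plus a read that finds no in-band data = spin.
import errno, select, socket

conn = socket.create_connection(("127.0.0.1", 8080))  # placeholder peer
conn.setblocking(False)

kq = select.kqueue()
kq.control([select.kevent(conn.fileno(),
                          filter=select.KQ_FILTER_READ,
                          flags=select.KQ_EV_ADD)], 0)  # level-triggered by default

while True:
    kq.control(None, 1)          # kernel: "readable!" (EV_OOBAND set, but unchecked)
    try:
        data = conn.recv(4096)   # ...but the OOB byte isn't in-band data
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            continue             # "spurious wakeup, oh well" -> same event fires again
        raise
    # handle data ... (never reached while the unread OOB byte is pending)
```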
Wait, doesn’t that mean almost all event-driven OSX network apps will go into an infinite loop if they receive a single TCP packet with the urgent bit set?
I didn’t think that could possibly be true at first, so I fired up a Mac machine to try it. Sure enough:
When Google Chrome visited an HTTP server that sent back an OOB byte, the whole app (not just the tab, but everything) locked up and had to be force-quit. It turns out Chrome does all networking from the main process, so the per-tab process separation did not help. (Chromium issue 437642 – currently still locked down as a security issue)
When a Node.js server received an OOB byte from a client, the server would go into an infinite loop and stop handling other connections. (fixed in this commit)
On the other hand, my third test case – nginx – was not affected, because it uses kqueue in edge-triggered mode, and therefore it only receives the unexpected event when new data arrives rather than any time data is available – i.e. once rather than infinity times. But two of three is a pretty worrisome hit rate, especially when these are some very big names.
Arguably the worst / most interesting part of this problem is that it was inherent in the API. Technically it was not that the kernel was buggy, but that the interface was confusing (and underdocumented) in a way that caused the same bug to manifest in several different apps. Fixing the problem required either fixing every app (and being ever-vigilant in the future), or changing the API and breaking any existing app that depended on the behavior (of which there appear to be a few).
To Apple’s credit, they did what I think is the right thing: they changed the interface so that it no longer reports EVFILT_READ events on TCP OOB data. I do not quite understand their description of this problem as a “state inconsistency issue”, but my tests confirm that OOB data is now ignored.
The moral of the story? Confusing APIs are a security problem. If many users of your API get it wrong in a way that introduces a security bug, that’s a bug in your API, not their code.
By Asheesh Laroia - 06 Apr 2015
I’m hoping to see you in Montreal at PyCon 2015! I’m co-presenting a talk and a tutorial. I’ll also be eager to talk to people about Sandstorm and self-hosting servers.
Last year, I co-gave a talk about turning your computer into a server with Karen Rustad. Here’s what it looked like when one of my friends surprised me by changing the data we were showing during a live demo:
That Lol is supposed to say Django. Thanks, Luke. Check out my expression of amusement masking horror.
I got compromised because the Django sample app I demo’d has a default admin account bundled with it, as part of the demo. The good news is if you’re using Sandstorm, the platform handles authentication & authorization for apps, so this sort of thing won’t happen to you. Plus, as you would expect, Sandstorm ships with no default passwords.
This year I’m co-leading a tutorial called Getting Comfortable With Web Security where we discuss all sorts of common security issues with web applications; Jacky Chang and Nicole Zuckerman are my co-presenters. I’m also sharing a stage with Philip James to answer the question, Type python, press enter. What happens?
I hope to see you there! Doubly so if you’re interested in a Sandstorm & server self-hosting Birds of a Feather session. Send an email to asheesh@sandstorm.io!