Main

February 4, 2010

Why don't websites default to SSL/TLS?

When a client connects on TCP port 80 to a webserver, one of the first thing it sends is a line that tells the server what website it wants. This line looks like this:
HOST: www.csoandy.com
This tells the webserver which configuration to use, and how to present content to the end-user. This effectively abstracts TCP and IP issues, and lets websites and webservers interact at the discretion of the owner. The designed user of HTTP is the administrator, who may need to host dozens of websites on a smaller number of systems.

Secure Socket Layer (SSL) and its successor, Transport Layer Security (TLS), on the other hand, were designed for exactly the opposite user. The designed user of HTTPS is the paranoid security academic, who doesn't even want to tell a server the hostname it is looking for (the fact that you were willing to tell a DNS server is immaterial). In essence, SSL requires that any server IP address can have only one website on it. When a client connects to a webserver on port 443, the first thing it expects is for the server to provide a signed certificate, that matches the hostname that the client has not yet sent. So if you connect to www.csoandy.com via SSL, you'll note you get back a different certificate: one for a248.e.akamai.net. This is expected -- nothing on this hostname requires encryption. Similarly, for Akamai customers that are delivering whole-site content via Akamai on hostnames CNAMEd to Akamai's edgesuite.net domain, attempts to access these sites via SSL will result in a certificate for a248.e.akamai.net being returned. (Aside: customers use our SSL Object Caching service to deliver objects on the a248.e.akamai.net hostname; customers who want SSL on their own hostnames use our WAA and DSA-Secure services).

The designs of HTTP and HTTPS are diametrically opposed, and the SSL piece of the design creates horrendous scaling problems. The server you're reading this from serves over 150,000 different websites. Those sites are actually loadbalanced across around 1600 different clusters of servers. For each website to have an SSL certificate on this network, we'd need to consume around 250 million IP addresses - or 5.75% of the IPv4 space. That's a big chunk of the 9% left as of today. Note that there isn't a strong demand to put most sites on SSL; this is just elucidating why, even if there were demand, the sheer number of websites today makes this infeasible.

Fortunately, there are paths to a solution.

Wildcard certificates
For servers that only serve hostnames in one domain, a wildcard certificate can help. If, for instance, in addition to www.csoandy.com, I had store.csoandy.com, pictures.csoandy.com, and catalog.csoandy.com, instead of needing four certificates (across those 1800 locations!), I could use a certificate for *.csoandy.com, which would match all four domains, as well as future growth of hostnames.

You're still limited; a wildcard can only match one field, so a certificate for *.csoandy.com wouldn't work on a site named comments.apps.csoandy.com. Also, many security practitioners, operating from principles of least privilege, frown on wildcard certificates, as they can be used even for unanticipated sites in a domain.

Subject Alternate Name (SAN) certificates
A slightly more recent variety of certificate are the SAN certificates. In addition to the hostname listed in the certificate, an additional field lets you specify a list of valid hostnames for that certificate (If you look closely, the a248.e.akamai.net certificate on this host has a SAN field set, which includes both a248.e.akamai.net and *.akamaihd.net). This permits a server to have multiple, disparate hostnames on one interface.

On the downside, you still only have one certificate, which is going to get larger and larger the more hostnames you have (which hurts performance). It also ties all of those hostnames into one list, which may present brand and security issues to some enterprises.

Server Name Indication (SNI)
The long term solution is a feature for TLS called Server Name Indication (SNI). This extension calls for the client to, as part of the initial handshake, indicate to the server the name of the site it is looking for. This will permit a server to select the appropriate one from its set of SSL certificates, and present that.

Unfortunately, SNI only provides benefit when everyone supports it. Currently, a handful of systems don't support SNI, most notably Windows XP and IIS. And those two major components are significant: XP accounts for 55-60% of the browser market, and IIS looks to be around 24%. So it'll be a while until SNI is ready for primetime.

January 25, 2010

The Designed User

One of the things I've always liked about Apple technology is that every system feels like it was designed for a specific individual. The more you're like that individual, the more you like their technology. This isn't unique to Apple -- most technology capabilities are designed for a specific problem space -- Apple is just clearer about it. As a security professional, I like to understand who a specific technology is designed for ("The Designed User") as part of assessing risks involved.

As an example of designed users, take the new Twitter "retweet" functionality. (For those of you new to Twitter: Twitter permits people to post 140-character tweets. Interesting tweets are often "retweeted" by prepending "RT @username (original tweet here)", sometimes with some commentary appended. Twitter has another setting: whenever someone puts your username into a tweet, you see it.) The new retweet functionality, much maligned, allows a single click to retweet a tweet. The originating user does not see the new tweet.

The "old" retweet function -- really, a use created by users -- is perfect for the networking user. It often gets used to make a comment on someone else's tweet, while rebroadcasting it. I want to see every time someone retweets something I said (really, it doesn't happen that often). But I'm not the target of the new functionality: celebrities are. A large number of retweets are celebrity tweets being rebroadcast by their followers. If you're in that network, you want to minimize how many times you see the same retweet in your timeline. For those users, the new capability is easier, and far more preferred.

With any capability, we should always ask who the intended audience is as part of understanding the design space the developers were in. This may help us understand why certain security tradeoffs were chosen.

December 18, 2009

Modeling Imperfect Adversaries

An important piece of risk assessment is understanding your adversaries. Often, this can degenerate into an assumption of perfect adversaries. Yet when we think about risk, understanding that our adversaries have different capabilities is critical to formulating reasonable security frameworks. Nowhere is this more true than in the streaming media space.

Brian Sniffen (a colleague at Akamai) recently presented a paper at FAST exploring ways of considering different adversaries, especially in the context of different business models. He presents some interesting concepts worth exploring if you're in the space:

  • Defense in breadth: The concept of using different security techniques to protect distinct attack surfaces, specifically looking at defeating how a specific adversary type is prone to conduct attacks.
  • Tag-limited adversaries: An extension to the Dolev-Yao adversary (a perfect adversary who sits inline on a communication stream), the tag-limited adversary may only have knowledge, capabilities, or desire to conduct attacks within a limited vocabulary.

His paper is also a good primer on thinking about streaming threat models.

December 15, 2009

Virtual Patching

Virtual patching, for those new to the term, is the practice of adding a rule to a Web Application Firewall (WAF) to filter out traffic that could exploit a known vulnerability in a protected application. This has triggered debate in the security community -- is this a good thing? Why would a developer fix the vulnerability if it is mitigated?

bandaid.jpgFirst off, Virtual Patching is, in fact, a good thing. The development turnaround time for a WAF rule is almost certainly shorter than the development cycle for the backend application, so this shortens your mitigation window. That shouldn't really be a topic of debate.

The interesting debate, then, is how we manage the underlying vulnerability. One school of thought argues that the WAF is only a stopgap, and developers should fix the vulnerability because that's the right thing to do. Another school's argument is a bit more complex. Don't fix the vulnerability. Fix the entire class of vulnerabilities.

If you have a vulnerability in a webapp, odds are that it's a symptom of a greater category of vulnerability. That CSRF bug on one page likely reflects a design flaw in not making GET requests nullipotent, or not using some form of session request management. Given the urgency of an open vulnerability, the developers will likely focus on fixing the tactical issue. Virtually patch the tactical vulnerability, and focus on the flaw in the foundation. It'll take longer to fix, but it'll be worth it in the long run.

December 10, 2009

DDoS thoughts

We are used to measuring the efficiency of DDoS attacks in ratios of bits-per-second. An attacker wants to consume many bits-per-second for each of his own bits-per-second that he uses. He would rather send one packet to generate a storm than have to send the storm himself. We can extend this efficiency measurement to other attacks. Let's use the name "flits per second" (for fully-loaded bits) for this more general measurement of cost and pain: sending or receiving one bit-per-second costs one flit-per-second. Doing enough computation to have an opportunity cost of one bit-per-second has a cost of one flit-per-second. Enough disk access to forestall one bit-per-second costs one flit-per-second. Now we can talk about the flit capacity of the attacker and defender, and about the ratio between flits consumed on each side during an attack.

From a defensive efficiency perspective, we have two axes to play with: first, reducing the flit-to-bit ratio of an attack, by designing optimal ways of handling traffic; and second, increasing the relative client cost to produce an attack.

To reduce the flitcost, consider the history of SYN floods. SYN floods work by sending only the first packet in the three way TCP handshake; the victim computer keeps track of the half-open connection after sending its response, and waits for the attacker's followup. That doesn't come; for the cost of a single SYN packet, the attacker gets to consume a sparse resource (half-open connections) for some period of time. The total amount of traffic needed historically was pretty minimal, until SYN cookies came along. Now, instead of using the sparse resource, targets use a little bit of CPU to generate a cryptographic message, embed it in their response, and proceed apace. What was a very effective attack has become rather ineffective; against most systems, a SYN flood has a lower flit-to-bit ratio than more advanced application layer attacks.

The other axis is more interesting, and shows why SYN floods are still prevalent even today: they're cheap to produce. They don't consume a lot of cycles on the attacking systems, and don't require interesting logic or protocol awareness. The fewer resources an attacker can consume, the more likely their attack will go unnoticed by the owners of the compromised hosts used in the attack (Case in point: look at how fast Slammer remediation happened. Why? ISPs were knocked offline by having infected systems inside). Many attack programs effectively reduce to "while (1) {attack}". If the attack is making an HTTP request, filtering the request will often generate a higher rate of requests, without changing the attacker's costs. If this higher rate has the same effect on you that the lower rate did, you didn't buy anything in your remediation. You might have been better off responding more slowly, than not at all.

In the general case, this leads us to two solution sets. Traffic filtering is the set of technologies designed to make handling attacks more efficient; either by handling the attack further out in an infrastructure, classifying it as malicious for a cheaper cost than processing it, or making processing cheaper.

Capacity increases, on the other hand, are normally expensive, and they're a risky gamble. If you increase far in excess of attacks you ever see, you've wasted money. On the other hand, if you increase by not quite enough, you're still going to be impacted by an event (and now, of course, you'll be criticized for wasting money that didn't help). Obligatory vendor pitch: this is where a shared cloud infrastructure, like a CDN, comes into play. Infrastructures that measure normal usage in terabits per second have a significantly different tolerance for attack capacity planning than most normal users.

November 23, 2009

H1N1 and telework

The nervousness around H1N1 has pretty much permeated every aspect of our lives. Remember a year or two ago, the hysteria around hand sanitizers and alcohol poisoning? Gone; in its place, we have dispensers in buildings everywhere. That's the power of the fear of H1N1.

Another place is in schooling. Not too long ago, if your kid got sick, school policy was "keep them home if they have a fever or are vomiting." Sanely, this migrated to "keep them home for 24 hours after a fever." Now, however, it is "48 hours fever-free with no medications." Some schools/daycares have added "and no symptoms either," which is moderately impractical for the kids who get a three-week long lingering cough.

This affects us in the workplace. If an employee has a small child and they don't have a stay-at-home caregiver, expect that they're going to miss more time than in prior years; and that the employee actually will be stressed about this (heck, anyone trapped at home with a no-longer-sick child on a school-day is going to end up pretty stressed). Also, you may want to suggest that employees with sick children stay at home even if they aren't the primary caregiver, just to minimize workplace infections.

Key to this is a sane telework plan. Like most things, this comes down to People, Process, and Technology.

People: Do the employee and manager have a good rapport, such that working remotely does not lead to communications failures? Can the employee work without direct management? Can the employee balance the needs of daytime home-life with work?

Process: Do you have understood ways for the employee's status to be communicated? Do other employees know how to reach them? How many hours do you expect when an employee is "working from home"?

Technology: What telework capabilities do you have? (VOIP phones in the home? VTC setups?) What about remote collaboration? (A wiki, IM, ticketing system or just email?) Do your employees have enough bandwidth at home to telework? Do you have enough in your office to support them?

It's going to happen to you -- you just need a little prep. And most of that prep? You can typeset it to hand to your auditors; it's a big piece of your DRP.

November 8, 2009

Secure by design?

"How do we ensure people build secure systems?"

This was the question to the panel before mine at the Thayer School's Complex Systems Symposium. It's not a new question - it comes up every time anyone tries to tackle hard problems around internet security. But it's an unfair question, because we have never built anything securely.

The question was asked in a lecture hall. Every time the symposium took a break, the two aisles bottled up with side conversation, inhibiting the flow of people needing to exit/enter. There were several "captains of industry", extremely talented professors, and bright students in the room; yet a mob could have swooped in shouting at any minute or an attacker could have waltzed in unimpeded (I could go on and on with threat scenarios). Yet who is responsible for the poor security design of that lecture hall?

In reality, security is about making good risk decisions, and accepting that there are some attacks and adversaries that you will not defend against. For internet-connected systems, this tradeoff is harder, as the cost to your adversaries is usually small enough that attacks that are implausible in the physical world become economical (remember the half-penny skimmers?)

October 13, 2009

Compliance, Security, and the relations therein

Last week, Anton Chuvakin shared his latest in the "compliance is not security" discussion:

Blabbing "compliance does not equal security" is a secret rite of passage into the League of High Priests of the Arcane Art of Security today. Still, it is often quietly assumed that a well-managed, mature security program, backed up by astute technology purchases and their solid implementation will render compliance "easy" or at least "easier." One of my colleagues also calls it "compliance as a byproduct." So, I was shocked to find out that not everyone agrees...

I think there are two separate issues that Anton is exploring.

The first is that a well-designed security control should aid in compliance. As one of his commenters notes, a good security program considers the regulatory issues; or, more plainly, a good security control considers the compliance auditor as an adversary. If you do not design controls to be auditable, you are building risk into your system (sidebar: what security risks are worse than failing an audit?).

But the second point is more interesting. Most compliance frameworks are written to target the industry standard architectures and designs. What if you are doing something so different that a given control has no parallel in your environment? Example: You have no password authentication in your environment; what do you do about controls that require certain password settings? What if your auditor insists on viewing inapplicable settings?

Then, you have three options:

  1. Convince your auditor of the inapplicability of the controls.
  2. Create sham controls to satisfy the auditor
  3. Find another auditor

July 3, 2006

Security and Obscurity

Everyone has heard the mantra, "Security through obscurity is no security at all." I hope that people remember where it came from - when companies were announcing proprietary cryptographic algorithms, everyone pointed out that cryptanalysis is an almost impossible task to get right, so you couldn't really know how secure your algorithm was unless it had been peer-reviewed.

But this comes up every few days, when people discuss security systems and architectures. And there, I contend, obscurity is the single most important component of every security system. Because let's face it, there is no such thing as perfect security. So every architecture has its holes.

The job of a good security professional is to reduce those holes; to make exploiting the holes more expensive than the value of doing so, and to implement layered security systems so that attackers are unlikely to make it all the way through a system without tripping an alarm somewhere. Without obscurity, that's impossible.

Put another way, an attacker, or even a neutral party, has absolutely no need to know the details of your architecture.

June 30, 2006

Social Engineering Self-training

Most security systems have the annoying side effect that increasing attack volumes can degrade them, usually through tuning of defenses, or desensitization (Yes, this is a generalization). Social Engineering, on the other hand, has the nice feature that the more often someone tries to social engineer you, the less likely the next person is to succeed - even if the first attack is incompetent, and the second one highly competent.

That's because every failed attempt is a training exercise for the target.

June 8, 2006

Policy and Practice - a Talmudic distinction

It's hip, of course, to be able to use Talmudic in a description of regulatory environment - but this is actually going to use the Talmud as a source. Policy is what we write down; practice is what we do. The relationship between them is nicely covered in the first tractate of the Talmud.

Mishna. From what time can the Shma be recited in the evening? From the hour when the priests go in to eat their tithes until the end of the first watch - the words of Rabbi Eliezer. And the Sages say: Until midnight. Rabban Gamliel says: Until the break of day (Brokhos 2a).
There is a bunch of esoteric coverage about the start point - but what about the end point? Why are both midnight and daybreak listed?
Mishna. Whenever the Sages say "until midnight," the obligation extends until the break of day.... Then why did the Sages say "until midnight"? In order to keep people from transgressing (Brokhos 2a).
And that is the difference between policy and practice. A well written policy should never be broken - and one way to ensure that is to have practice be more stringent than the policy.

Note that I except from this rule CYA policies, of the sort lawyers tend to write to protect organizations from liability.

(Thanks to Born to Kvetch by Michael Wex for the inspiration).

May 31, 2006

The Perfect is the Enemy of the Good

My favorite Voltaire quote:

Le mieux est l'ennemi du bien.

So often in information security we are presented with fairly poor starting scenarios. And there are usually a number of options - raging from doing nothing, to modest improvements, to complete redesign. Purists, of course, tend to advocate a complete redesign from the ground up. Generally, that's a poor strategy, especially if you're an outside stakeholder.

If the responsible owner wasn't planning a redesign, you are basically advocating a complete reinvestment of all costs incurred to date, plus whatever is necessary to meet security needs, for no increased value (generally, revenue). Obviously, they aren't going to just comply.

Assuming you manage to comvince the owner that a complete redesign is worthwhile, you'll find that every other stakeholder jumps in. Second system Syndrome now kicks in - as long as such a large investment is being made, why not add in all these other, "minor" things people have been looking for?

Instead, find the little changes, that can get slipped in to improve the current state. Few systems are so bad that minor changes won't provide a world of good. But more importantly, this establishes security as a reasonable requirement, that the system owner is used to dealing with.

And when they decide to implement their next major change, odds are, they'll come find you.

May 21, 2006

Sledgehammers

How do you perfectly secure data on a system? The hard drive should be encrypted, of course. Logging onto the system should use a one time password, as well as an asymmetric identifier. You put the computer in a locked room. Make sure the computer isn't connected to the network, of course, and, for good measure, power it down. The door should have multiple locks, so that you can enforce two-person access controls, and each needs to prove their identity with a physical token, biometrics, and a PIN.

And, of course, the last thing you should do is take a sledgehammer to the computer before leaving the room.

Continue reading "Sledgehammers" »