September 11, 2012

Take Over, Bos'n!

Eleven years ago, Danny Lewin was murdered.

This is a story from before that -- and how Danny inspired me to change the web.

It starts about twelve years ago. Akamai had just launched EdgeSuite, our new, whole-site content delivery product. Instead of having to change the URLs on your objects to start with, you could just CNAME your whole domain over, and we'd deliver everything - even the HTML. It was revolutionary, and would power our move to application delivery.

But Danny wasn't satisfied with that (Danny was rarely satisfied with anything, actually). I'd just become Akamai's Chief Security Architect - mostly focusing on protecting our own infrastructure - and Danny came to me and said, "What will it take to convince banks to use EdgeSuite?"

I'll be honest, I laughed at him at first. We argued for weeks about how paranoid bank security teams were, and why they'd never let their SSL keys be held by someone else. We debated what security model would scale (we even considered having 24x7 security guards outside our datacenter racks). We talked about the scalability of IP space for SSL. Through all of that, Danny was insistent that, if we built it, the market would accept it - even need it. I didn't really believe him at the time, but it was an exciting challenge. We were designing a distributed, lights-out security model in a world with no good guidance on how to do it. And we did.

But I still didn't believe - not the way Danny did. Then came the phone call. I'd been up until 4 am working an incident, and my phone rings at 9 am. It's Danny. "Andy, I'm here with [large credit card company], and they want to understand how our SSL network works. Can you explain it to them?"

I begged for thirty seconds to switch to a landline (and toss cold water on my face), and off we go. We didn't actually have a pitch, so I was making it up on the fly, standing next to the bed in my basement apartment, without notes. I talked about the security model we'd built - and how putting datacenter security into the rack was the wave of the future. I talked about our access control model, the software audits we were building, and our automated installation system. I talked for forty-five minutes, and when I was done, I was convinced - we had a product that would sell, and sell well (it just took a few years for that latter half to come true).

When I got off the phone, I went to my desk, and turned that improvisational pitch into the core of the security story I still tell to this day. More importantly, I truly believed that our SSL capability would be used by those financial services customers. Like Danny, I was wrong by about a decade - but in the meantime, we enabled e-commerce, e-government, and business-to-business applications to work better.

Danny, thanks for that early morning phone call.

"When you're bossman," he added, "in command and responsible for the rest, you- you sure get to see things different, don't you?"

July 9, 2012

HITB Keynote

I recently keynoted at Hack in the Box 2012 Amsterdam. My topic was "Getting ahead of the Security Poverty Line", and the talk is below:

After giving the talk, I think I want to explore more about the set point theory of risk tolerance, and how to social engineer risk perception. Updated versions of this talk will appear at the ISSA conference in October, and at Security Zone in December.

July 7, 2012

Social Engineering Judo

(or, how good customer service and getting scammed can look alike)

On a business trip a few years ago, I found myself without a hotel room (the hotel at which Egencia asserted I had a reservation claimed to know nothing about me). I made a new reservation at a Marriott hotel, and then called to check in, since I had to head off to a customer event, and wouldn't get to the hotel until around midnight (and didn't want a repeat of having no hotel room). The desk clerk informed me that I couldn't check in yet, but she assured me that yes, I'd have a room, and it was horrible that the other hotel had left me without one. And yes, it would have a king-size bed.

When I arrived, it turned out they'd upgraded me to a penthouse suite for the night. Good customer service, right? (Yes, of course, but now I have to argue the downside.) The clerk didn't actually know if I'd had a problem earlier, so really, she let me socially engineer her (honestly, it wasn't intentional). I've been in the hospitality industry myself, and it's really hard to tell the difference between a customer with a problem whose day you can improve, and a con artist just looking to get by.

One hotel I worked for had a policy that you could never comp the meal or room a guest was complaining about (because too many people would complain just to see if they could get a free meal), but for folks with issues, you'd comp their next stay, or a meal the next night. This usually made guests happy, and con artists only got fifty percent off (until we discovered the "guest" who hadn't paid for their last ten stays by exercising this policy).

The trick here is to empower your customer service folks -- your front line against con artists and social engineers -- to have enough flexibility to make customers happy, while reducing how much they can cost you. A room upgrade has almost no marginal cost for a midnight check-in; but a free meal is a bit more expensive.

Since drafting this post, I've noticed what seems to be a disturbing trend in the hospitality industry: very few organizations can answer the question, "how will you reduce the likelihood of this happening again?" Instead, they focus merely on, "how can I make you stop complaining?" That's the best case, but it's only a first step.

December 13, 2011

Security Subsistence Syndrome

Wendy Nather, of The 451 Group, has recently discussed "Living Below the Security Poverty Line," which looks at what happens when your budget is below the cost to implement what common wisdom says are the "standard" security controls. I think that's just another, albeit crowd-sourced, compliance regime. A more important area to consider is the mindset of professionals who believe they live below the security poverty line:

Security[1] Subsistence Syndrome (SSS) is a mindset in an organization that believes it has no security choices, and is underfunded, so it minimally spends to meet perceived[2] statutory and regulatory requirements.

Note that I'm defining this mindset with attitude, not money. I think that's a key distinction - it's possible to have a lot of money and still be in a bad place, just as it's possible to operate a good security program on a shoestring budget. Security subsistence syndrome is about lowered expectations, and an attitude of doing "only what you have to." If an enterprise suffering from security subsistence syndrome can reasonably expect no one to audit their controls, then they are unlikely to invest in meeting security requirements. If they can do minimal security work and reasonably expect to pass an "audit[3]", they will do so.

The true danger of believing you live at (or below) the security poverty line isn't that you aren't investing enough; it's that because you are generally spending time and money on templatized controls without really understanding the benefit they might provide, you aren't generating security value, and you're probably letting down those who rely on you. When you don't suffer from security subsistence syndrome, you start to think with discretion, implementing controls that might be qualitatively better than the minimum - and sometimes come with lower long-term cost.

Security subsistence syndrome means you tend to be reactive to industry trends, rather than proactively solving problems specific to your business. As an example, within a few years, many workforces will likely be significantly tabletized (and by tablets, I mean iPads). Regulatory requirements around tablets are either non-existent, or impossible to satisfy; so in security subsistence syndrome, tablets are either banned, or ignored (or banned, and the ban is then ignored). That's a strategy that will wait to react to the existence of tablets and vendor-supplied industry "standards," rather than proactively moving the business into using them safely, and sanely.

Security awareness training is an example of a control which can reflect security subsistence syndrome. To satisfy the need for "annual security training", companies will often have a member of the security team stand up in front of employees with a canned presentation, and make them sign that they received the training. The signed pieces of paper go into someone's desk drawer, who hopes an auditor never asks to look at them. Perhaps the business uses an online computer-based training system, which uses a canned presentation, forcing users to click through some links. Those are both ineffective controls, and worse, inefficient (90 minutes per employee means that in a 1500 person company, you're spending over an FTE just to generate that piece of paper!).
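That parenthetical is worth spelling out (a quick sketch; the 90 minutes and 1,500 employees come from the paragraph above, and the 2,080-hour work year is my assumption):

```python
# Back-of-the-envelope cost of lecture-style annual awareness training.
# 90 minutes and 1,500 employees are from the text; the 2,080-hour
# work year (52 weeks x 40 hours) is an assumption.
minutes_per_employee = 90
employees = 1500
fte_hours_per_year = 2080

total_hours = minutes_per_employee * employees / 60   # 2,250 hours
ftes_consumed = total_hours / fte_hours_per_year      # ~1.1 FTEs
print(f"{total_hours:.0f} hours/year, or {ftes_consumed:.2f} FTEs")
```

More than one person's entire working year, spent producing a stack of signatures.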

Free of the subsistence mindset, companies get creative. Perhaps you put security awareness training on a single, click-through webpage (we do!). That lets you drop the time requirement down (communicating to employees that you value their time), and lets you focus on other awareness efforts - small fora, executive education, or targeted social engineering defense training. Likely, you'll spend less time and money on improving security awareness training, have a more effective program, and be able to demonstrate compliance trivially to an auditor.

Security subsistence syndrome is about your attitude, and the choices you make: at each step, do you choose to take the minimal, rote steps to satisfy your perceived viewers, or do you instead take strategic steps to improve your security? I'd argue that in many cases, the strategic steps are cheaper than the rote steps, and have a greater effect in the medium term.

[1] Nothing restricts this to security; likely, enterprise IT organizations can fall into the same trap.

[2] To the satisfaction of the reasonably expectable auditor, not the perfect auditor.

[3] I'm loosely defining audit here, to include any survey of a company's security practices; not just "a PCI audit."

December 7, 2011

Enterprise InfoSec Lessons from the TSA

The TSA and its security practices are fairly common targets for security commentary. You can find armchair critics in most every bar, living room, and, especially, information security team. But the TSA is a great analogue to the way enterprises tend to practice information security; so maybe we can learn a thing or three from them.

We can begin with the motherhood and apple pie of security: business alignment. The TSA has little to no incentive that lines up with its customers (or with their customers). The TSA's metric is, ostensibly, the number of successful airplane attacks. Being perceived to reduce that number is its only true metric. On any day where there is no known breach, it can claim success - just like an enterprise information security team. And it can also be criticized for being irrelevant - just like said enterprise information security team. The business, meanwhile (both airlines and passengers), is worried about other metrics: being on time, minimizing hassle, and costs. Almost any action the TSA undertakes in pursuit of its goals is going to have a harmful effect on everyone else's goals. This is a recipe for institutional failure: as the TSA (or infosec team) acknowledges that it can never make its constituents happy, it runs the risk of not even trying.

Consider the security checkpoint, the TSA equivalent to the enterprise firewall (if you consider airplanes as VPN tunnels, it's a remarkable parallel). The security checkpoint begins with a weak authentication check: you are required to present a ticket, and an ID that matches. Unfortunately, unless you are using a QR-coded smartphone ticket, the only validation of the ticket is that it appears - to a human eyeball - to be a ticket for this date and a gate behind this checkpoint. Tickets are trivially forgeable, and can be easily matched to whatever ID you present. The ID is casually validated, and goes unrecorded. This is akin to a sadly standard enterprise practice: logging minimal data about connections that cross the perimeter, and never comparing those connections to a list of expected traffic.

In parallel, we find the cameras. Mounted all through the security checkpoint, the cameras are a standard forensic tool - if you know what you're looking for, and when, they'll provide some evidence after the fact. But they aren't very helpful in stopping or identifying attacks in progress. Much like the voluminous logs many of our enterprises deploy: useful for forensics, useless for prevention.

Having entered the checkpoint, the TSA is going to split passengers from their bags (and their shoes, belts, jackets, ID, and, importantly, recording devices). Their possessions are going to be placed onto a conveyor belt, where they will undergo inspection via an X-ray machine. This is, historically, the biggest bottleneck for throughput, and a nice parallel to many application level security tools. Because we have to disassemble the possessions, and then inspect one at a time (or maybe two, or three, in a high-availability scenario), we slow everything down. And because the technology to look for problems is highly signature based, it's prone to significant false negatives. Consider the X-ray machine to be the anti-virus of the TSA.

The passengers now get directed to one of two technologies: the magnetometers, or the full body imagers. The magnetometers are an old, well-understood technology: they detect efforts to bring metal through, are useless for ceramics or explosives, and are relatively speedy. The imagers, on the other hand, are what every security team desires: the latest and greatest technology; thoroughly unproven in the field, with unknown side effects, and invasive (in a sense, they're like reading people's email: sure, you might find data exfiltration, but you're more likely to violate the person's privacy and learn about who they are dating). The body scanners are slow. Slower, even, than the X-ray machines for personal effects. Slow enough that, at most checkpoints, when under load, passengers are diverted to the magnetometers, either wholesale, or piecemeal (this leads to interesting timing attacks to get a passenger shifted into the magnetometer queue). The magnetometer is your old-school intrusion-detection system: good at detecting a known set of attacks, bad at new attacks, but highly optimized at its job. The imagers are that latest technology your preferred vendor just sold you: you don't really know if it works well, and you're exporting too much information to the vendor, and you're seeing things you shouldn't, and you have to fail around it too often for it to be useful; but at least you can claim you are doing something new.

If a passenger opts out of the imaging process, rather than pass them through the magnetometer, we subject them to a "pat-down". The pat-down is a punitive measure, enacted whenever someone questions the utility of the latest technology. It isn't very effective (if you'd like to smuggle a box cutter into an airport, and don't want to risk the X-ray machine detecting it, taping the razor blade to the bottom of your foot is probably going to work). But it does tend to discourage opt-out criticism.

Sadly, for all of the TSA's faults, in enterprise security, we tend to implement controls based on the same philosophy. Rather than focus on security techniques that enable the business while defending against a complex attacker ecosystem, we build rigid control frameworks, often explicitly designed to be able, on paper, to detect the most recent attack (often, in implementation, these fail, but we are reassured by having done something).

August 19, 2011

Password Weakness

Randall Munroe opines in xkcd on password strength, noting that we've trained people to "use passwords that are hard for humans to remember, but easy for computers to guess." He's both right and wrong.

First off, the security industry owes Randall a debt of gratitude for this comic; people who don't normally interact with security technologies (or only grudgingly) are discussing and debating the merits of various password algorithms, and whether "correct horse battery staple" is, in fact, more memorable and more secure than "Tr0ub4dor&3". That's an important conversation to have.

Is it more secure?

Randall plays a trick on the audience, by picking a single strawman password implementation, and showing how weak it is compared to a preferred model. He also limits himself to a specific attack vector (against an online oracle), which makes the difference between the two seem larger than it really is.

Consider the following (obvious) variants of the "weak" password algorithm presented in the comic: "Tr0ub4dor&3!", "Troubbador&3", "2roubador&3". None of these match the algorithm presented - so they don't fit into the 28 bits of entropy of concern. That doesn't make them perfect; I merely note that Randall arbitrarily drew his line around "likely passwords" where he wanted to. That's not necessarily unreasonable: for instance, if a password scheme requires 8 characters, including at least one upper case, one lower case, one number, and one symbol, assuming people will pick "upper case, five lower case with a number thrown in, symbol that is a shifted number" probably isn't a bad idea, and lets you ignore 99.9975% of possible 8-character passwords. But it is unreasonable if you're arguing that your specific model might be better.
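The comic's headline numbers are easy to reproduce (a sketch; the 2048-word list, the 28-bit estimate, and the 1,000-guesses-per-second online-oracle rate are all xkcd's figures, not mine):

```python
import math

# Passphrase: four words drawn uniformly from a 2048-word list.
passphrase_bits = math.log2(2048 ** 4)        # = 44 bits
# "Tr0ub4dor&3": the comic's estimate for the substitution scheme.
password_bits = 28

guesses_per_second = 1000                     # the comic's online-oracle rate
def time_to_search(bits):
    return 2 ** bits / guesses_per_second     # seconds to exhaust the space

password_days = time_to_search(password_bits) / 86400            # ~3 days
passphrase_years = time_to_search(passphrase_bits) / (86400 * 365)  # ~550 years
```

Note how the gap depends entirely on the assumed guess rate: against an offline attack at billions of guesses per second, both numbers collapse dramatically.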

Let's say that we give users the simple model proposed: pick four random words. People fail at picking random things; see Troy Hunt's analysis of the passwords revealed in the Sony Pictures breach. So if you let the user pick the word list, you'll end up with common phrases like "patriots football super bowl" or "monkey password access sucks", and adversaries will start there. Or, we can give users their passphrases, and probably discover later that there was a bug in the random number generator used to select words, and half of our users have the passphrase "caffeine programmer staccato novel".
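If you do hand users their passphrases, the word selection is exactly where that RNG bug bites, so draw from the OS CSPRNG rather than a seedable generator (a sketch; the toy word list and the `passphrase` function are illustrative, and a real deployment would use a large curated dictionary such as EFF's diceware lists):

```python
import secrets

# Stand-in word list; real deployments use thousands of words,
# since entropy per word is log2(len(WORDS)).
WORDS = ["caffeine", "programmer", "staccato", "novel",
         "horse", "battery", "staple", "correct"]

def passphrase(n_words=4):
    # secrets.choice uses the OS CSPRNG, sidestepping seeded-PRNG bugs
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))
```

With only 8 words, each passphrase here carries just 12 bits of entropy - a reminder that the list size, not the word count alone, does the work.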

Randall is correct that, when it comes to user-memorized secrets, longer is better. So is less predictability. Most password rules are designed to move from easy predictability (common words) to harder predictability (common words plus some interspersed silly keystrokes).

The real risk

Going back to Troy Hunt's analysis, the real risk isn't that someone will use an online oracle to brute force your password or passphrase. The real risk is that some password holder will be breached, and like 67% of users, you'll have used the same password on another site. Password strength doesn't help at all with that problem.

But which one?

The answer is neither. If you're using either password scheme demonstrated by Randall, change it (e.g., add some random symbols between your words), as it's now more likely to be an adversarial target. The real question is how do we get away from passwords? SSL certificates - for all their issues - are one option. One time passwords - generated either by a dedicated token or application, or out-of-band, via SMS - also are an interesting choice.

If the only threat you're worried about is online oracle attacks, you can defend against those by looking for them, and making them harder for adversaries to conduct. But that's a mostly losing battle in the long run.

January 27, 2011


I was reading Rafal Los over at the HP Following the White Rabbit blog discussing whether anonymous web browsing is even possible:

Can making anonymous surfing still sustain the "free web" concept? - Much of the content you surf today is free, meaning, you don't pay to go to the site and access it. Many of these sites offer feature-rich experiences, and lots of content, information and require lots of work and upkeep. It's no secret that these sites rely on advertising revenue at least partly (which relies on tracking you) to survive ...if this model goes away what happens to these types of sites? Does the idea of free Internet content go away? What would that model evolve to?

This is a great point, although insufficiently generic, and limited to the gratis view of the web -- the allegedly free content. While you can consider the web browsing experience to be strictly transactional, it can more readily be contemplated as an instantiation of a world of relationships. For instance, you read a lot of news; but reading a single news article is just a transaction in the relationships between you and news providers.

A purchase from an online retailer? Let's consider a few relations:

The buyer's relationship with:
  • the merchant
  • their credit card / alternative payment system
  • the receiver
The merchant's relationship with:
  • the payment gateway
  • the manufacturer
  • the shipping company
The payment card industry's interrelationships: payment gateways, acquiring banks, card brands, and card issuers all have entangled relationships.

The web is a world filled with fraud, and fraud lives in the gaps between these relationships (Often, relationships are only used one way: buyer gives credit card to merchant, who gives it to their gateway, who passes it into the banking system. If the buyer simply notified their bank of every transaction, fraud would be hard; the absence of that notification is a gap in the transaction). The more a merchant understands about their customer, the lower their cost can be.

Of course, this model is harder to perceive in the gratis environment, but is nonetheless present. First, let's remember:

If you're not paying for something, you're not the customer; you're the product being sold.

Often, the product is simply your eyeballs; but your eyeballs might have more value the more the merchant knows about you. (Consider the low value of the eyeballs of your average fantasy football manager. If the merchant knows from past history that those eyeballs are also attached to a person in market for a new car, they can sell more valuable ad space.) And here, the more value the merchant can capture, the better services they can provide to you.

A real and fair concern is whether the systemic risk added by the merchant in aggregating information about end users is worth the added reward received by the merchant and the user. Consider the risk of a new startup in the gratis world of location-based services. This startup may create a large database of the locations of its users over time (consider the surveillance possibilities!), which, if breached, might expose the privacy and safety of those individuals. Yet because that cost is not borne by the startup, they may perceive it as a reasonable risk to take for even a small return.

Gratis services - and even for-pay services - are subsidized by the exploitable value of the data collected. Whether or not the business is fully monetizing that data, it's still a fair question to ask whether the businesses can thrive without that revenue source.

December 16, 2010

Architecting for DDoS Defense

DDoS is back in the news again, given the recent post-Cyber Monday DDoS attacks and the Anonymous DDoS attacks targeted at various parties. This seems like a good time to remember the concepts you need in the front of your mind when you're designing to resist DDoS.

DDoS mitigation isn't a point solution; it's much more about understanding how your architecture might fail, and how efficient DDoS attacks can be. Sometimes, simply throwing capacity at the problem is good enough (in many cases, our customers just start with this approach, using our WAA, DSA, and EDNS solutions to provide that instant scalability), but how do you plan for when simple capacity might not be sufficient?

It starts with assuming failure: at some point, your delicate origin infrastructure is going to be overwhelmed. Given that, how can you begin pushing as much functionality as possible out to the edge? Do you have a set of pages that ought to be static, but are currently rendered dynamically? Make them cacheable, or set up a backup cacheable version. Put that version of your site into scalable cloud storage, so that it isn't relying on your infrastructure.

For even dynamic content, you'd be amazed at the power of short-term caching. A 2-second cache is all but unnoticeable to your users, but can offload significant attack traffic to your edge. Even a zero-second cache can be interesting; this lets your front end cache results, and serve them (stale) if it can't get a response from your application.
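The serve-stale trick can be sketched in a handful of lines (purely illustrative; `MicroCache` is a made-up name, and in practice you'd configure this behavior in your CDN or reverse proxy rather than hand-roll it in the application):

```python
import time

class MicroCache:
    """Tiny TTL cache that serves stale entries when the origin fails."""

    def __init__(self, ttl_seconds=2.0):
        self.ttl = ttl_seconds
        self.store = {}                 # key -> (value, fetched_at)

    def get(self, key, fetch_from_origin):
        now = time.monotonic()
        entry = self.store.get(key)
        if entry and now - entry[1] < self.ttl:
            return entry[0]             # fresh hit: origin never touched
        try:
            value = fetch_from_origin(key)
            self.store[key] = (value, now)
            return value
        except Exception:
            if entry:
                return entry[0]         # origin overwhelmed: serve stale
            raise
```

A `ttl_seconds=0` instance behaves like the zero-second cache described above: every request tries the origin, but a stale copy is always on hand if the origin buckles.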

After you think about disaster resilience, you should start planning for the future. How can you authenticate users early and prioritize the requests of your known good users? How much dynamic content assembly can you do without touching a database? Can you store-and-forward user generated content when you're under heavy load?

The important point is to forget much of what we've been taught about business continuity. The holy grail of "Recovery Time Objective" (how long you can be down) shouldn't be the target, since you don't want to go down. Instead, you need to design for your Minimum Uninterrupted Service Target - the basic capabilities and services you must always provide to your users and the public. It's harder to design for, but makes weathering DDoS attacks much more pleasant.

August 23, 2010

Awareness Training

Implementing a good security awareness program is not hard, if your company cares about security. If they don't, well, you've got a big problem.

It doesn't start with the auditable security program that most standards would have you set up. Quoting PCI-DSS testing procedures:

12.6.1.a Verify that the security awareness program provides multiple methods of communicating awareness and educating employees (for example, posters, letters, memos, web based training, meetings, and promotions).
12.6.1.b Verify that employees attend awareness training upon hire and at least annually.
12.6.2 Verify that the security awareness program requires employees to acknowledge (for example, in writing or electronically) at least annually that they have read and understand the company's information security policy.

For many awareness programs, this is their beginning and end. An annual opportunity to force everyone in the business to listen to us pontificate on the importance of information security, and make them read the same slides we've shown them every year. Or, if you've needed to gain cost efficiencies, you've bought a CBT program that is lightly tailored for your business (and as a side benefit, your employees can have races to see how quickly they can click through the program).

But at least it's auditor-friendly: you have a record that everyone attended, and you can make them acknowledge receipt of the policy that they are about to throw in the recycle bin. And you do have to have an auditor-friendly program; it just shouldn't be all that you do.

I can tell you that, for our baseline, auditor-friendly security awareness program, over 98% of our employee base have reviewed and certified the requisite courseware in the last year; and that of the people who haven't, the vast majority have either started work in the last two weeks (and thus are in a grace period), or are on an extended leave. It's an automated system, which takes them to a single page. At the bottom of the page is the button they need to click to satisfy the annual requirement. No gimmicks, no trapping the user in a maze of clicky links. But on that page is a lot of information: why security is important to us; what additional training is available; links to our security policy (2 pages) and our security program (nearly 80 pages); and an explanation of the annual requirement. And we find that a large majority of our users take the time to read the supplemental training material.

But much more importantly, we weave security awareness into a lot of activities. Listen to our quarterly investor calls, and you'll hear our executives mention the importance of security. Employees go to our all-hands meetings, and hear those same executives talk about security. The four adjectives we've often used to describe the company are "fast, reliable, scalable, and secure". Social engineering attempts get broadcast to a mailing list (very entertaining reading for everyone answering a published telephone number). And that doesn't count all of the organizations that interact with security as part of their routine.

And that's really what security awareness is about: are your employees thinking about security when it's actually relevant? If they are, you've succeeded. If they aren't, no amount of self-enclosed "awareness training" is going to fix it. Except, of course, to let you check the box for your auditors.

May 28, 2010

NSEC3: Is the glass half full or half empty?

NSEC3, or the "Hashed Authenticated Denial of Existence", is a DNSSEC specification to authenticate the NXDOMAIN response in DNS. To understand how we came to create it, and the secrecy issues around it, we have to understand why it was designed. As the industry moves to a rollout of DNSSEC, understanding the security goals of our various Designed Users helps us understand how we might improve on the security in the protocol through our own implementations.

About the Domain Name Service (DNS)

DNS is the protocol which converts mostly readable hostnames into IP addresses. At its heart, a client (your desktop) is asking a server to provide that conversion. There are a lot of possible positive answers, which hopefully result in your computer finding its destination. But there are also some negative answers. The interesting answer here is the NXDOMAIN response, which tells your client that the hostname does not exist.

Secrecy in DNS

DNS requests and replies, by design, have no confidentiality: anyone can see any request and response. Further, there is no client authentication: if an answer is available to one client, it is available to all clients. The contents of a zone file (the list of host names in a domain) are rarely publicized, but a DNS server acts as a public oracle for the zone file; anyone can make continuous requests for hostnames until they reverse engineer the contents of the zone file. With one caveat: the attacker will never know that they are done, as there might exist a hostname that they have not yet tried.

But that hasn't kept people from putting information that has some form of borderline secrecy into a zone file. Naming conventions in zone files might permit someone to easily map an intranet just by looking at the hostnames. Host names might contain names of individuals. So there is a desire to at least keep the zone files from being trivially readable.

DNSSEC and authenticated denials

DNSSEC adds in one bit of security: the response from the server to the client is signed. Since a zone file is (usually) finite, this signing can take place offline: you sign the contents of the zone file whenever you modify them, and then hand out static results. Negative answers are harder: you can't presign them all, and signing is expensive enough that letting an adversary make you do arbitrary signings can lead to DoS attacks. And you have to authenticate denials, or an adversary could poison lookups with long-lived denials.

Along came NSEC. NSEC permitted a denial response to cover an entire range of names (e.g., "there are no hosts between these two adjacent names in the zone"). Unfortunately, this made it trivial to gather the contents of a zone: after you receive one range, simply ask for a name just past its end, and the response reveals the next actual host. From a pre-computation standpoint, NSEC was great - there are the same number of NSEC signed responses in a zone as all other signatures - but from a secrecy standpoint, NSEC destroyed what little obscurity existed in DNS.
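That walk is easy to simulate end-to-end (a toy model: the zone contents, names, and `nsec_oracle` function are all invented, and a real walk issues actual DNS queries against wire-format names):

```python
import bisect

# Hypothetical zone contents, sorted as NSEC requires.
ZONE = sorted(["alpha", "bravo", "charlie", "mike", "zulu"])

def nsec_oracle(qname):
    """For a nonexistent qname, return the signed NSEC range (prev, next)
    proving nothing exists between those two real names."""
    i = bisect.bisect_left(ZONE, qname)
    prev_name = ZONE[i - 1] if i > 0 else ZONE[-1]    # wraps, like real NSEC
    next_name = ZONE[i] if i < len(ZONE) else ZONE[0]
    return prev_name, next_name

def walk_zone(start="!"):                 # "!" sorts before any letter
    """Enumerate the whole zone by repeatedly asking just past each range."""
    found, qname = [], start
    while True:
        _, nxt = nsec_oracle(qname)
        if nxt in found:                  # wrapped around: walk complete
            return found
        found.append(nxt)
        qname = nxt + "\x00"              # the name "just after" nxt
```

One query per record, and the attacker knows exactly when they're done - the property traditional DNS denied them.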


NSEC3 is the update to NSEC. Instead of providing a range in which there are no hostnames, a DNS server publishes a hashing function, and a signed range in which there are no valid hashes. This prevents an adversary from easily collecting the contents of the zone (as with NSEC), but does allow them to gather the size of the zone file (by making queries to find all of the unused hash ranges), and then conduct offline guessing at the contents of the zone file (as Dan Bernstein has been doing for a while). Enabling offline guessing makes a significant difference: with traditional DNS, an adversary must send an arbitrarily large number of queries (guesses) to a name server (making them possibly detectable); with NSEC, they must send as many queries as there are records; and with NSEC3, they must send the same number of requests as there are records (with some computation to make the right guesses), and can then conduct all of their guessing offline.
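A sketch of what that offline phase looks like. The hash here is a simplified stand-in for NSEC3 (real NSEC3 iterates SHA-1 over the wire-format owner name with a salt and base32-encodes the digest); the hostnames, salt, and iteration count are all invented for illustration:

```python
import hashlib

def nsec3_hash(name, salt, iterations):
    """Simplified NSEC3-style hash: salted, iterated SHA-1 of the name.
    (Real NSEC3 hashes the wire-format owner name and base32-encodes
    the result; only the salt-and-iterate structure is kept here.)"""
    digest = hashlib.sha1(name.lower().encode() + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return digest.hex()

# Online phase: walk the hash ranges, collecting every hash that exists
# in the zone. This takes about as many queries as there are records.
salt, iterations = b"\xab\xcd", 10
observed_hashes = {nsec3_hash(n, salt, iterations)
                   for n in ["www.example.com", "mail.example.com"]}

# Offline phase: hash candidate names from a dictionary and compare.
# No further queries to the name server are needed - or detectable.
guesses = ["www.example.com", "intranet.example.com", "mail.example.com"]
recovered = [g for g in guesses
             if nsec3_hash(g, salt, iterations) in observed_hashes]
```

The defender never sees the dictionary attack happen, which is exactly the difference from plain-DNS guessing.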

While NSEC3 is an improvement over NSEC, it still represents a small step down in zone file secrecy from unsigned DNS. This step is necessary from a defensive perspective, but it makes one wonder whether this is the best solution: why do we still have the concept of semi-secret public DNS names? If we have a zone file we want to keep secret, we should authenticate requests before answering. But until then, at least we can make it harder for an adversary to determine the contents of a public zone.

"Best" practices in zone secrecy

If you have a zone whose contents you want to keep obscure anyway, you should consider:
  • Limit access to the zone, likely by IP address.
  • Use randomly generated record names, to make offline attacks such as Dan Bernstein's more difficult.
  • Fill your zone with spurious answers, to send adversaries on wild goose chases.
  • Instrument your IDS to detect people trying to walk your zone file, and give them a different answer set than you give to legitimate users.

Jason Bau and John Mitchell, both of Stanford, have an even deeper dive into DNSSEC and NSEC3.

May 14, 2010

Skynet or The Calculor?

Technology-based companies can either be organized around silicon (computers), or carbon (people). Depending on which type of company you are, security practices are, by necessity, different. So why then do we delude ourselves into thinking that there is one perfect set of security practices?

In a silicon-based organization, business processes are predominantly driven by predefined rules, implemented in computing systems (think: call centers). Where humans serve a role, it is often as an advanced interactive voice recognition (IVR) system, or as an expert system to handle cases not yet planned for (and, potentially, to make your customers happier to talk to a human). Security systems in this type of a world can be designed to think adversarially about the employees -- after all, the employees are to be constrained to augment the computing systems which drive the bulk of the business.

In a carbon-based organization, business processes are organized around the human, and are typically more fluid (think: software engineering, marketing). The role of computers in a carbon-based organization is to augment the knowledge workers, not constrain them. The tasks and challenges of a security system are far different, and require greater flexibility -- after all, employees are to be supported when engaging in novel activities designed to grow the business.

Consider a specific problem: data leakage. In a silicon-based organization, this is a (relatively) simple problem. Users can be restricted in exporting data from their systems, consumer technology more advanced than pen and paper can be banned from the facility, and all work must be done in the facility. Layer on just-in-time access control (restricting a user to only access records that they have been assigned to), and you've limited the leakage ... to what a user can remember or write down. And that's still a problem: twenty years ago, I worked in a company that used social security numbers as the unique identifier for its employees. Two decades later, a handful of those numbers are still rattling around in my head, a deferred data leakage problem waiting to happen.

Now compare that simple scenario against a knowledge worker environment. People are very mobile, and need to access company data everywhere. Assignments show up by word of mouth, so users need quick access to sources of data they had not seen before. Users are on multiple platforms, even when performing the same job, so system and application management is a challenge. Interaction with customers and partners happens rapidly, with sensitive data flying across the wires. Trying to prevent data leakage in this world is a Herculean task. Likely, given the impossibility of the task, most security implementations here will reduce business flexibility, without significantly reducing business risk.

But what can the enterprising security manager do? First, understand your business. If it's a silicon-based organization, you're in luck. Most security vendors, consultants, and best practices guides are looking to help you out (and take some of your money while doing so). If, on the other hand, you're in a carbon-based business, you've got a much harder task ahead of you. Most solutions won't help a lot out of the box, and risk acceptance may be so endemic to your organizational mindset that changing it may well feel impossible. You'll need to focus on understanding and communicating risk, and designing novel solutions to problems unique to your business. Sounds hard, but take heart: it's not like the silicon-based security team is going to get it right, either.

May 6, 2010

Contracting the Common Cloud

After attending CSO Perspectives, Bill Brenner has some observations on contract negotiations with SaaS vendors. While his panel demonstrated a breadth of customer experience, it was, unfortunately, lacking in a critical perspective: that of a cloud provider.

Much of the point of SaaS, or any cloud service, is the economy of scale you get: not just in capacity, but also in features. You're selecting from the same set of features that every other customer is selecting from, and that's what makes it affordable. And that same set of features needs to extend up into the business relationship. As the panel noted, relationships around breach management, data portability, and transport encryption are all important, but if you find yourself arguing for a provider to do something it isn't already, you're likely fighting a Sisyphean battle.

But how did a customer get to that point? Enterprises typically generate their own control frameworks, in theory beginning from a standard (like ISO 27002), but then redacting out the inapplicable (to them), tailoring controls, and adding in new controls to cover incidents they've had in the past. And when they encounter a SaaS provider who isn't talking about security controls, the natural tendency is to convert their own control framework into contractual language. Which leads to the observations of the panel participants: it's like pulling teeth.

A common request I've seen is the customer who wants to attach their own security policy - often a thirty to ninety page document - as a contract addendum, and require the vendor to comply with it, "as may be amended from time to time by Customer". And while communicating desires to your vendors is how they'll decide to improve, no cloud vendor is going to be able to satisfy that contract.

Instead, vendors need to be providing a high-water mark of business and technology capabilities to their customer base. To do this, they should start back from those original control frameworks, and not only apply them to their own infrastructure, but evaluate the vendor-customer interface as well. Once implemented, these controls can then be packaged, both into the baseline (for controls with little or no variable cost), and into for-fee packages. Customers may not want (or want to pay for) all of them, but the vendors need to be ahead of their customers on satisfying needs. Because one-off contract requirements don't scale. But good security practices do.

March 10, 2010

The Adaptive Persistent Threat

Much ado has been made of the "Advanced Persistent Threat". Unfortunately, pundits look at the ease of some of the attacks, and get hung up on the keyword, "Advanced." How do we know the adversary is so advanced, if he can succeed using such trivial attacks? The relative skill of the adversary is actually uninteresting; what matters is his persistence.

Many threat models focus on vulnerability and exposures. This creates an assumption of an ephemeral adversary; one who has a limited toolbox, and will attack at random until he finds an ill-defended target. This leads to planning to simply not be the easiest target ("Do you have to be faster than a bear to outrun it? No, you just need to be faster than the other guy. Or tie his shoelaces together").

Unfortunately, in a world of persistent threats, this may leave you open to other attacks. Consider the difference between protecting yourself against a random mugging and dealing with a stalker. Even if a stalker is relatively primitive, they will adapt to your defenses, and present a very real threat.

So let's drop the "Advanced" and replace it with "Adaptive": the "Adaptive Persistent Threat". In the face of this APT, unless you know yourself to be invulnerable (implausible), you should worry also about secondary targets. Once the APT has penetrated your first line of defenses, what can they do to you? How do you defend yourself?

February 4, 2010

Why don't websites default to SSL/TLS?

When a client connects on TCP port 80 to a webserver, one of the first things it sends is a line that tells the server which website it wants. The request looks something like this (hostname illustrative):

    GET / HTTP/1.1
    Host: www.example.com

The Host header tells the webserver which configuration to use, and how to present content to the end-user. This effectively abstracts TCP and IP issues, and lets websites and webservers interact at the discretion of the owner. The designed user of HTTP is the administrator, who may need to host dozens of websites on a smaller number of systems.

Secure Socket Layer (SSL) and its successor, Transport Layer Security (TLS), on the other hand, were designed for exactly the opposite user. The designed user of HTTPS is the paranoid security academic, who doesn't even want to tell a server the hostname it is looking for (the fact that you were willing to tell a DNS server is immaterial). In essence, SSL requires that any server IP address can have only one website on it. When a client connects to a webserver on port 443, the first thing it expects is for the server to provide a signed certificate that matches the hostname that the client has not yet sent. So if you connect to this site via SSL, you'll note you get back a different certificate: one for Akamai's shared delivery domain. This is expected -- nothing on this hostname requires encryption. Similarly, for Akamai customers that are delivering whole-site content via Akamai on hostnames CNAMEd to Akamai's domain, attempts to access these sites via SSL will result in a certificate for that shared domain being returned. (Aside: customers use our SSL Object Caching service to deliver objects on the shared hostname; customers who want SSL on their own hostnames use our WAA and DSA-Secure services.)

The designs of HTTP and HTTPS are diametrically opposed, and the SSL piece of the design creates horrendous scaling problems. The server you're reading this from serves over 150,000 different websites. Those sites are actually loadbalanced across around 1600 different clusters of servers. For each website to have an SSL certificate on this network, we'd need to consume around 250 million IP addresses - or 5.75% of the IPv4 space. That's a big chunk of the 9% left as of today. Note that there isn't a strong demand to put most sites on SSL; this is just elucidating why, even if there were demand, the sheer number of websites today makes this infeasible.

Fortunately, there are paths to a solution.

Wildcard certificates
For servers that only serve hostnames in one domain, a wildcard certificate can help. If, for instance, in addition to www.example.com, I also had mail.example.com, blog.example.com, and lists.example.com, then instead of needing four certificates (across those 1800 locations!), I could use a certificate for *.example.com, which would match all four hostnames, as well as future growth of hostnames.

You're still limited; a wildcard can only match one field, so a certificate for *.example.com wouldn't work on a site named www.mail.example.com. Also, many security practitioners, operating from principles of least privilege, frown on wildcard certificates, as they can be used even for unanticipated sites in a domain.
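The one-field rule is easy to express in code. Here's a minimal sketch of certificate-style wildcard matching (the hostnames are illustrative, and real validation, per RFC 6125, has additional corner cases this skips):

```python
def wildcard_matches(pattern, hostname):
    """Check a certificate-style wildcard: '*' may stand in for exactly
    one DNS label, never for a dot-separated sequence of labels."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # '*' can't absorb extra (or missing) labels
    for p, h in zip(p_labels, h_labels):
        if p != "*" and p != h:
            return False
    return True
```

Because the label counts must line up, *.example.com covers www.example.com but neither www.mail.example.com nor the bare example.com.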

Subject Alternate Name (SAN) certificates
A slightly more recent variety of certificate is the SAN certificate. In addition to the hostname listed in the certificate, an additional field lets you specify a list of valid hostnames for that certificate (if you look closely, the certificate on this host has a SAN field set, which includes both the bare domain and a wildcard for it). This permits a server to have multiple, disparate hostnames on one interface.

On the downside, you still only have one certificate, which is going to get larger and larger the more hostnames you have (which hurts performance). It also ties all of those hostnames into one list, which may present brand and security issues to some enterprises.

Server Name Indication (SNI)
The long term solution is a feature for TLS called Server Name Indication (SNI). This extension calls for the client to, as part of the initial handshake, indicate to the server the name of the site it is looking for. This will permit a server to select the appropriate one from its set of SSL certificates, and present that.

Unfortunately, SNI only provides benefit when everyone supports it. Currently, a handful of systems don't support SNI, most notably Windows XP and IIS. And those two major components are significant: XP accounts for 55-60% of the browser market, and IIS looks to be around 24%. So it'll be a while until SNI is ready for primetime.

January 25, 2010

The Designed User

One of the things I've always liked about Apple technology is that every system feels like it was designed for a specific individual. The more you're like that individual, the more you like their technology. This isn't unique to Apple -- most technology capabilities are designed for a specific problem space -- Apple is just clearer about it. As a security professional, I like to understand who a specific technology is designed for ("The Designed User") as part of assessing risks involved.

As an example of designed users, take the new Twitter "retweet" functionality. (For those of you new to Twitter: Twitter permits people to post 140-character tweets. Interesting tweets are often "retweeted" by prepending "RT @username (original tweet here)", sometimes with some commentary appended. Twitter has another setting: whenever someone puts your username into a tweet, you see it.) The new retweet functionality, much maligned, allows a single click to retweet a tweet. The originating user does not see the new tweet.

The "old" retweet function -- really, a use created by users -- is perfect for the networking user. It often gets used to make a comment on someone else's tweet, while rebroadcasting it. I want to see every time someone retweets something I said (really, it doesn't happen that often). But I'm not the target of the new functionality: celebrities are. A large number of retweets are celebrity tweets being rebroadcast by their followers. If you're in that network, you want to minimize how many times you see the same retweet in your timeline. For those users, the new capability is easier, and far more preferred.

With any capability, we should always ask who the intended audience is as part of understanding the design space the developers were in. This may help us understand why certain security tradeoffs were chosen.

December 18, 2009

Modeling Imperfect Adversaries

An important piece of risk assessment is understanding your adversaries. Often, this can degenerate into an assumption of perfect adversaries. Yet when we think about risk, understanding that our adversaries have different capabilities is critical to formulating reasonable security frameworks. Nowhere is this more true than in the streaming media space.

Brian Sniffen (a colleague at Akamai) recently presented a paper at FAST exploring ways of considering different adversaries, especially in the context of different business models. He presents some interesting concepts worth exploring if you're in the space:

  • Defense in breadth: The concept of using different security techniques to protect distinct attack surfaces, specifically looking at defeating how a specific adversary type is prone to conduct attacks.
  • Tag-limited adversaries: An extension to the Dolev-Yao adversary (a perfect adversary who sits inline on a communication stream), the tag-limited adversary may only have knowledge, capabilities, or desire to conduct attacks within a limited vocabulary.

His paper is also a good primer on thinking about streaming threat models.

December 15, 2009

Virtual Patching

Virtual patching, for those new to the term, is the practice of adding a rule to a Web Application Firewall (WAF) to filter out traffic that could exploit a known vulnerability in a protected application. This has triggered debate in the security community -- is this a good thing? Why would a developer fix the vulnerability if it is mitigated?

First off, Virtual Patching is, in fact, a good thing. The development turnaround time for a WAF rule is almost certainly shorter than the development cycle for the backend application, so this shortens your mitigation window. That shouldn't really be a topic of debate.
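At its core, a virtual patch is just a request filter deployed in front of the vulnerable application. A sketch of the shape of one, blocking a known SQL-injection bug in a search parameter (the rule and parameter name are hypothetical, not taken from any particular WAF product):

```python
import re

# Hypothetical virtual patch: block requests exploiting a known
# SQL-injection bug in a 'search' parameter, buying time while the
# backend fix goes through its (slower) development cycle.
VIRTUAL_PATCHES = [
    re.compile(r"search=[^&]*('|--|\bunion\b)", re.IGNORECASE),
]

def waf_allows(request_line):
    """Return True if the request passes every virtual-patch rule."""
    return not any(rule.search(request_line) for rule in VIRTUAL_PATCHES)
```

The rule mitigates one tactical bug; the class of bug (unsanitized input reaching the database) still needs the real fix discussed below.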

The interesting debate, then, is how we manage the underlying vulnerability. One school of thought argues that the WAF is only a stopgap, and developers should fix the vulnerability because that's the right thing to do. Another school's argument is a bit more complex. Don't fix the vulnerability. Fix the entire class of vulnerabilities.

If you have a vulnerability in a webapp, odds are that it's a symptom of a greater category of vulnerability. That CSRF bug on one page likely reflects a design flaw in not making GET requests nullipotent, or not using some form of session request management. Given the urgency of an open vulnerability, the developers will likely focus on fixing the tactical issue. Virtually patch the tactical vulnerability, and focus on the flaw in the foundation. It'll take longer to fix, but it'll be worth it in the long run.

December 10, 2009

DDoS thoughts

We are used to measuring the efficiency of DDoS attacks in ratios of bits-per-second. An attacker wants to consume many bits-per-second for each of his own bits-per-second that he uses. He would rather send one packet to generate a storm than have to send the storm himself. We can extend this efficiency measurement to other attacks. Let's use the name "flits per second" (for fully-loaded bits) for this more general measurement of cost and pain: sending or receiving one bit-per-second costs one flit-per-second. Doing enough computation to have an opportunity cost of one bit-per-second has a cost of one flit-per-second. Enough disk access to forestall one bit-per-second costs one flit-per-second. Now we can talk about the flit capacity of the attacker and defender, and about the ratio between flits consumed on each side during an attack.

From a defensive efficiency perspective, we have two axes to play with: first, reducing the flit-to-bit ratio of an attack, by designing optimal ways of handling traffic; and second, increasing the relative client cost to produce an attack.

To reduce the flit cost, consider the history of SYN floods. SYN floods work by sending only the first packet in the three-way TCP handshake; the victim computer keeps track of the half-open connection after sending its response, and waits for the attacker's followup. That followup never comes; for the cost of a single SYN packet, the attacker gets to consume a scarce resource (half-open connections) for some period of time. The total amount of traffic needed historically was pretty minimal, until SYN cookies came along. Now, instead of consuming the scarce resource, targets use a little bit of CPU to generate a cryptographic message, embed it in their response, and proceed apace. What was a very effective attack has become rather ineffective; against most systems, a SYN flood has a lower flit-to-bit ratio than more advanced application layer attacks.
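The trick is that the server encodes the half-open connection state into the initial sequence number it sends back, so it stores nothing until the final ACK echoes the cookie. A simplified sketch (real SYN cookies also pack MSS bits into the value and use a different MAC construction; the secret and addresses here are invented):

```python
import hashlib
import hmac
import time

SECRET = b"rotate-me-periodically"  # illustrative server-side secret

def syn_cookie(src, sport, dst, dport, t=None):
    """Derive a 32-bit initial sequence number that commits to the
    connection 4-tuple and a coarse timestamp, instead of storing state."""
    t = int(time.time() // 64) if t is None else t  # 64-second time slot
    msg = f"{src}:{sport}>{dst}:{dport}@{t}".encode()
    mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")

def cookie_valid(isn, src, sport, dst, dport, t):
    """On the final ACK, recompute the cookie and compare; a match proves
    the SYN was real without ever having tracked a half-open connection."""
    return hmac.compare_digest(
        isn.to_bytes(4, "big"),
        syn_cookie(src, sport, dst, dport, t).to_bytes(4, "big"))
```

A spoofed SYN now costs the server one MAC computation and one reply packet - CPU flits - rather than a slot in a small half-open connection table.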

The other axis is more interesting, and shows why SYN floods are still prevalent even today: they're cheap to produce. They don't consume a lot of cycles on the attacking systems, and don't require interesting logic or protocol awareness. The fewer resources an attacker consumes, the more likely their attack will go unnoticed by the owners of the compromised hosts used in the attack (Case in point: look at how fast Slammer remediation happened. Why? ISPs were knocked offline by having infected systems inside). Many attack programs effectively reduce to "while (1) {attack}". If the attack is making an HTTP request, filtering the request will often generate a higher rate of requests, without changing the attacker's costs. If this higher rate has the same effect on you that the lower rate did, you didn't buy anything in your remediation. You might have been better off responding more slowly than not at all.

In the general case, this leads us to two solution sets. Traffic filtering is the set of technologies designed to make handling attacks more efficient; either by handling the attack further out in an infrastructure, classifying it as malicious for a cheaper cost than processing it, or making processing cheaper.

Capacity increases, on the other hand, are normally expensive, and they're a risky gamble. If you increase far in excess of attacks you ever see, you've wasted money. On the other hand, if you increase by not quite enough, you're still going to be impacted by an event (and now, of course, you'll be criticized for wasting money that didn't help). Obligatory vendor pitch: this is where a shared cloud infrastructure, like a CDN, comes into play. Infrastructures that measure normal usage in terabits per second have a significantly different tolerance for attack capacity planning than most normal users.

November 23, 2009

H1N1 and telework

The nervousness around H1N1 has pretty much permeated every aspect of our lives. Remember a year or two ago, the hysteria around hand sanitizers and alcohol poisoning? Gone; in its place, we have dispensers in buildings everywhere. That's the power of the fear of H1N1.

Another place is in schooling. Not too long ago, if your kid got sick, school policy was "keep them home if they have a fever or are vomiting." Sanely, this migrated to "keep them home for 24 hours after a fever." Now, however, it is "48 hours fever-free with no medications." Some schools/daycares have added "and no symptoms either," which is moderately impractical for the kids who get a three-week long lingering cough.

This affects us in the workplace. If an employee has a small child and they don't have a stay-at-home caregiver, expect that they're going to miss more time than in prior years; and that the employee actually will be stressed about this (heck, anyone trapped at home with a no-longer-sick child on a school-day is going to end up pretty stressed). Also, you may want to suggest that employees with sick children stay at home even if they aren't the primary caregiver, just to minimize workplace infections.

Key to this is a sane telework plan. Like most things, this comes down to People, Process, and Technology.

People: Do the employee and manager have a good rapport, such that working remotely does not lead to communications failures? Can the employee work without direct management? Can the employee balance the needs of daytime home-life with work?

Process: Do you have understood ways for the employee's status to be communicated? Do other employees know how to reach them? How many hours do you expect when an employee is "working from home"?

Technology: What telework capabilities do you have? (VOIP phones in the home? VTC setups?) What about remote collaboration? (A wiki, IM, ticketing system or just email?) Do your employees have enough bandwidth at home to telework? Do you have enough in your office to support them?

It's going to happen to you -- you just need a little prep. And most of that prep? You can typeset it to hand to your auditors; it's a big piece of your DRP.

November 8, 2009

Secure by design?

"How do we ensure people build secure systems?"

This was the question to the panel before mine at the Thayer School's Complex Systems Symposium. It's not a new question - it comes up every time anyone tries to tackle hard problems around internet security. But it's an unfair question, because we have never built anything securely.

The question was asked in a lecture hall. Every time the symposium took a break, the two aisles bottled up with side conversation, inhibiting the flow of people needing to exit/enter. There were several "captains of industry", extremely talented professors, and bright students in the room; yet a mob could have swooped in shouting at any minute or an attacker could have waltzed in unimpeded (I could go on and on with threat scenarios). Yet who is responsible for the poor security design of that lecture hall?

In reality, security is about making good risk decisions, and accepting that there are some attacks and adversaries that you will not defend against. For internet-connected systems, this tradeoff is harder, as the cost to your adversaries is usually small enough that attacks that are implausible in the physical world become economical (remember the half-penny skimmers?).

October 13, 2009

Compliance, Security, and the relations therein

Last week, Anton Chuvakin shared his latest in the "compliance is not security" discussion:

Blabbing "compliance does not equal security" is a secret rite of passage into the League of High Priests of the Arcane Art of Security today. Still, it is often quietly assumed that a well-managed, mature security program, backed up by astute technology purchases and their solid implementation will render compliance "easy" or at least "easier." One of my colleagues also calls it "compliance as a byproduct." So, I was shocked to find out that not everyone agrees...

I think there are two separate issues that Anton is exploring.

The first is that a well-designed security control should aid in compliance. As one of his commenters notes, a good security program considers the regulatory issues; or, more plainly, a good security control considers the compliance auditor as an adversary. If you do not design controls to be auditable, you are building risk into your system (sidebar: what security risks are worse than failing an audit?).

But the second point is more interesting. Most compliance frameworks are written to target the industry standard architectures and designs. What if you are doing something so different that a given control has no parallel in your environment? Example: You have no password authentication in your environment; what do you do about controls that require certain password settings? What if your auditor insists on viewing inapplicable settings?

Then, you have three options:

  1. Convince your auditor of the inapplicability of the controls.
  2. Create sham controls to satisfy the auditor.
  3. Find another auditor.

July 3, 2006

Security and Obscurity

Everyone has heard the mantra, "Security through obscurity is no security at all." I hope that people remember where it came from - when companies were announcing proprietary cryptographic algorithms, everyone pointed out that cryptographic algorithm design is almost impossible to get right, so you couldn't really know how secure your algorithm was unless it had been peer-reviewed.

But this comes up every few days, when people discuss security systems and architectures. And there, I contend, obscurity is the single most important component of every security system. Because let's face it, there is no such thing as perfect security. So every architecture has its holes.

The job of a good security professional is to reduce those holes; to make exploiting the holes more expensive than the value of doing so, and to implement layered security systems so that attackers are unlikely to make it all the way through a system without tripping an alarm somewhere. Without obscurity, that's impossible.

Put another way, an attacker, or even a neutral party, has absolutely no need to know the details of your architecture.

June 30, 2006

Social Engineering Self-training

Most security systems have the annoying side effect that increasing attack volumes can degrade them, usually through tuning of defenses, or desensitization (Yes, this is a generalization). Social Engineering, on the other hand, has the nice feature that the more often someone tries to social engineer you, the less likely the next person is to succeed - even if the first attack is incompetent, and the second one highly competent.

That's because every failed attempt is a training exercise for the target.

June 8, 2006

Policy and Practice - a Talmudic distinction

It's hip, of course, to be able to use Talmudic in a description of regulatory environment - but this is actually going to use the Talmud as a source. Policy is what we write down; practice is what we do. The relationship between them is nicely covered in the first tractate of the Talmud.

Mishna. From what time can the Shma be recited in the evening? From the hour when the priests go in to eat their tithes until the end of the first watch - the words of Rabbi Eliezer. And the Sages say: Until midnight. Rabban Gamliel says: Until the break of day (Brokhos 2a).
There is a bunch of esoteric coverage about the start point - but what about the end point? Why are both midnight and daybreak listed?
Mishna. Whenever the Sages say "until midnight," the obligation extends until the break of day.... Then why did the Sages say "until midnight"? In order to keep people from transgressing (Brokhos 2a).
And that is the difference between policy and practice. A well written policy should never be broken - and one way to ensure that is to have practice be more stringent than the policy.

Note that I except from this rule CYA policies, of the sort lawyers tend to write to protect organizations from liability.

(Thanks to Born to Kvetch by Michael Wex for the inspiration).

May 31, 2006

The Perfect is the Enemy of the Good

My favorite Voltaire quote:

Le mieux est l'ennemi du bien.

So often in information security we are presented with fairly poor starting scenarios. And there are usually a number of options - ranging from doing nothing, to modest improvements, to complete redesign. Purists, of course, tend to advocate a complete redesign from the ground up. Generally, that's a poor strategy, especially if you're an outside stakeholder.

If the responsible owner wasn't planning a redesign, you are basically advocating a complete reinvestment of all costs incurred to date, plus whatever is necessary to meet security needs, for no increased value (generally, revenue). Obviously, they aren't going to just comply.

Assuming you manage to convince the owner that a complete redesign is worthwhile, you'll find that every other stakeholder jumps in. Second System Syndrome now kicks in - as long as such a large investment is being made, why not add in all these other, "minor" things people have been looking for?

Instead, find the little changes that can get slipped in to improve the current state. Few systems are so bad that minor changes won't provide a world of good. But more importantly, this establishes security as a reasonable requirement that the system owner is used to dealing with.

And when they decide to implement their next major change, odds are, they'll come find you.

May 21, 2006

Sledgehammers

How do you perfectly secure data on a system? The hard drive should be encrypted, of course. Logging onto the system should use a one-time password, as well as an asymmetric identifier. You put the computer in a locked room. Make sure the computer isn't connected to the network, of course, and, for good measure, power it down. The door should have multiple locks, so that you can enforce two-person access controls, and each person needs to prove their identity with a physical token, biometrics, and a PIN.

And, of course, the last thing you should do is take a sledgehammer to the computer before leaving the room.
