bestkungfu weblog

Matt Presents: Escape from CAPTCHA

Filed in: accessibility, CSUN2004, tech, Sun, Mar 21 2004 02:15 PT

It’s really hard to blog your own presentation. I mean, there you are, saying something brilliant, and then you have to go sit down, remember what you said, and annoy all of the people listening by wasting their time as you select your LiveJournal mood and what’s playing in iTunes. So these are just fading memories of my fantastic presentation on the inaccessibility of CAPTCHA. (This is long. Sorry. It’s also kind of important.)

In the beginning, life was easy. Resources flowed freely across the Web. Then people started creating Web sites that offered resources people liked, and made people register for them. These were usually useful resources like, say, Web-based email, instant messaging, and so forth. Thing was, these accounts were valuable to people who wished to exploit them for this new “spam” thing that was going around. And then the cat-and-mouse game began: spammers started creating millions of Hotmail and AIM accounts just to steal their resources, and the goal for those sites became finding a way to keep the bad guys out, while letting the good guys in. The hunt for a better mousetrap led to a system called CAPTCHA, which shows distorted text in a bitmap image, and asks users to enter the text into a form.

The T in CAPTCHA stands for “Turing test.” (Actually, there are four T words in a row, but I’m pretty sure they only promoted these two.) A Turing test is a hypothesis put forward by famous mathematician Alan Turing in 1950. He said (and I paraphrase), put somebody in front of a terminal. Feed them input. It could be from a human or a computer. If the user can’t tell whether the input comes from a human or a computer, then call 60 Minutes, baby, the era of artificial intelligence has begun. Well, Morley Safer will tell you that they haven’t gotten that call, so the only use of a Turing test is to create something that will let humans pass, while totally hamstringing robots. (Poor things. I can see them wincing and grasping their legs right now.)

Only, it’s inaccessible by design. Assistive technology, which is, for all intents and purposes in this story, a robot, can’t read these images. (And if it could, so could all the bad robots going around stealing stuff.) There is no alt text available for most CAPTCHAs, which makes their host documents invalid in addition to inaccessible. This isn’t a test to prove you’re human. It’s a test to prove you’re human, have very good eyesight, and are not dyslexic. People are frequently stopped dead at a CAPTCHA.

More proof that this is a poor mousetrap: CAPTCHA is hackable. In addition to CAPTCHA crackers using optical character recognition to defeat the images (which is why they’re sometimes so distorted as to be unreadable even if you are human), there are some great social engineering exercises. Let’s say you’re a spammer (you filthy bastard). If your business comes from stealing these accounts, then it’s worth it to make sure a human can create these things as rapidly as possible. So you pay someone to code you a system, and you pay someone else minimum wage to sit there and help the robots by deciphering a thousand of these codes every hour. (Humans are crafty.)

Or you can do it for free. The first publicized hack of CAPTCHA consisted of a developer creating a porn site, then making its users enter the CAPTCHA codes in order to gain access to the pictures. (Humans are really, really crafty.) In other words, there are ways to ensure that an infinite number of users are willing to solve your CAPTCHA problem. (Though it’s probably best not to suggest this idea to blindness and dyslexia-oriented advocacy groups.)

So, CAPTCHA is broken. What to do? We wrote a paper, which I originally titled “On the Internet, Nobody’s Sure if You’re a Robot”, but is now something much uncooler. It says all the stuff I just said here (erm, formalized slightly), and then tries to get people thinking about CAPTCHA to step back and think about exactly what they’re planning to solve, so that they can see that no matter what it is, CAPTCHA ain’t the solution.

There are three models of user checking out there on the Web:


First, most sites don’t really care whether a user is a robot, as long as they’re not hammering the server. The accounts that are set up are more for tracking settings and gaining user data than anything else.

Second, some sites check whether someone is an actual-factual human. Humans buy more stuff. And humans may even be afforded the privilege of more than one account on the same system: some sites give one user seven email addresses, to be parceled out at the user’s choice.

Third is the one-person, one-vote system: passports, state identifications, driving licenses, bank cards. As social services move online, Americans will likely have to cough up their Social Security numbers in order to approximate this right. In the long term, hopefully, this will change.

With these in mind, we came up with several different approaches to enhanced security. While we don’t have the silver bullet, we can say that CAPTCHA isn’t any better, and many of our suggestions are more accessible.

Logic puzzles

Is there an elephant in the room? Humans can answer that correctly and uniformly. (Except for people who work with elephants, or have hangovers.) Computers can’t. So using logic-based questions would be sufficient – until the Semantic Web happens, perhaps. Cons: this is bad for users with cognitive disabilities, speakers of foreign languages, and people who can’t spell. And you’d need a zillion questions to keep robots from caching them and determining the answers.
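The caching problem is easy to see in a toy model. What follows is only a sketch under made-up assumptions: the three-question pool, the `CachingBot` class, and the idea that a spammer pays a human to answer each new question exactly once are all hypothetical, and none of it comes from the paper.

```python
import random

# Hypothetical question pool; a real site would need vastly more.
QUESTION_POOL = {
    "Is there an elephant in the room?": "no",
    "Is fire hot or cold?": "hot",
    "How many legs does a cat have?": "4",
}


class CachingBot:
    """Toy attacker: asks a human once per new question, then reuses the answer."""

    def __init__(self, ask_human):
        self.ask_human = ask_human
        self.cache = {}
        self.human_lookups = 0

    def answer(self, question):
        if question not in self.cache:
            self.human_lookups += 1
            self.cache[question] = self.ask_human(question)
        return self.cache[question]


def run_signups(bot, attempts, seed=0):
    """Count how many sign-up challenges the bot passes."""
    rng = random.Random(seed)
    questions = list(QUESTION_POOL)
    passed = 0
    for _ in range(attempts):
        question = rng.choice(questions)
        if bot.answer(question) == QUESTION_POOL[question]:
            passed += 1
    return passed


# QUESTION_POOL.get stands in for a paid human solver who always answers correctly.
bot = CachingBot(ask_human=QUESTION_POOL.get)
passed = run_signups(bot, attempts=1000)
print(passed, bot.human_lookups)
```

However many sign-ups the bot attempts, the human is consulted at most once per distinct question, so the cost of the attack is bounded by the size of the pool rather than by the number of accounts created.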


Sound output

What if they read the letters out to you instead? The advantage is that it would be usable in more than one modality. Cons: it’s hard to transcribe audio. (Trust me. I do it all the time.) You may need to listen to something like this several times before you actually manage to write it down. This is also vulnerable to voice-recognition systems, so these sound files are often distorted and hard to understand as well. And they’re no great bargain if you happen to be deaf and blind.

Credit-card validation

The first private identity system in the United States was the credit card network. Using a credit card number, you can match a person to a mailing address, which is good enough security for many companies. Cons: many users, including all under 18, do not have a credit card. This also costs money for companies to execute, and creates perception issues around security.

Live operators

Yahoo and AOL both offer live operators to allow users to bypass this system. Computers are really poor at conducting a phone conversation, so this would be satisfactory. Cons: running a 24-hour call center is something that’s so expensive that only the five richest kings of Europe can afford it. And it’s another separate-but-equal solution that violates the spirit of an accessible Web.

Limited use and usage tracking

Maybe your system doesn’t really need to provide infinite service to new customers. So you set a limit of ten outgoing mail messages per day, rather than locking down registration. But then maybe you’d have caused 10,000 times more registrations than before. Then comes what I describe as “post-hoc checking”: watch the usage patterns of certain accounts, and suspend access of users that match a given pattern. This can be successful, and pretty silent, if you can find artifacts of abusers. But it can also fail with certain users: Joi Ito, for example, was put in “Orkut jail,” a form of post-hoc check, because he had acquired too many new friends too quickly.
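Both halves of this idea fit in a few lines. This is a minimal sketch, not how any real mail or social service does it: the `SendQuota` class, the `flag_suspicious` function, the account names, and the threshold of 50 new friends per day are all invented for illustration. Only the ten-messages-per-day limit comes from the example above.

```python
from collections import defaultdict
from datetime import date

DAILY_SEND_LIMIT = 10  # the "ten outgoing mail messages per day" example


class SendQuota:
    """Per-account daily cap on outgoing messages."""

    def __init__(self, limit=DAILY_SEND_LIMIT):
        self.limit = limit
        self.sent = defaultdict(int)  # (account, day) -> messages sent so far

    def try_send(self, account, day):
        key = (account, day)
        if self.sent[key] >= self.limit:
            return False  # over quota: refuse, but leave the account open
        self.sent[key] += 1
        return True


def flag_suspicious(new_friends_per_day, threshold=50):
    """Post-hoc check: list accounts whose daily count exceeds the threshold."""
    return sorted(
        account
        for account, count in new_friends_per_day.items()
        if count > threshold
    )


quota = SendQuota()
today = date(2004, 3, 21)
results = [quota.try_send("alice", today) for _ in range(12)]
print(results.count(True))  # 10 sends allowed, the last 2 refused

activity = {"spambot": 400, "popular_human": 120, "casual_user": 3}
print(flag_suspicious(activity))  # ['popular_human', 'spambot']
```

The second call shows the weakness as well as the strength: a simple threshold cannot tell a spambot from a human who is merely popular, which is how someone ends up in jail for making friends.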

Identity systems and biometrics

Passport, Liberty Alliance, and public-key infrastructure solutions are all potential solutions to this (except, of course, for the fact that Passport uses CAPTCHA). So are biometrics, which are going to be built into the Longhorn version of Windows. But single sign-on systems have privacy issues, PKI has the problem that a solution based on it doesn’t exist, and biometrics are going to require hardware to be supported. We expect that the true solution to all of these levels of access will ultimately be found here.

In conclusion, think about what you’re trying to do. As you can tell, there is no easy solution, including CAPTCHA. You may get a quick reprieve from the hackers by implementing CAPTCHA, but that will certainly go away after a while. And in the meantime, you’re going to piss off a whole lot of users. So stop it, please. We need to move beyond the security model of the Club car lock, in which its presence on a steering wheel simply means that it’s not as easy to steal as the next car, which doesn’t have one. Eventually, the thieves realize that they can cut into your steering wheel, and then, you’re back where you started.

V2: Universal Remote Control

Filed in: accessibility, CSUN2004, tech, Fri, Mar 19 2004 18:30 PT

Gregg Vanderheiden, in another capacity chair of the W3C’s Web Content Accessibility Guidelines Working Group, led a presentation on the INCITS V2 working group. Their project is something that’s cool for everybody: a universal remote control on steroids.

The objective of the URC is to be able to control any device from anywhere, using whatever device is handy. This remote control has killer apps: it doesn’t matter what the interface is for the thing you’re approaching, you only do it one way. For example, if you want $60 from an ATM, or want to set an alarm for 6am (which is when I woke up this morning. blah.), you can do this easily. And if you speak Japanese, but the instructions for controlling the air conditioning in your hotel room are in Swahili, no problem. The control will do it for you. Any devices that are V2-enabled will interoperate in perpetuity, so when voice recognition improves, as it constantly does, your new remote will still work with the old stuff.

So, one might say: “I have a disability. What’s in it for me?” Well, if one is blind, or has motor-related disabilities, their V2-compatible device can operate the thermostat, the oven, the microwave, etc. Even better, since this is a universal design device, when V2 hits the market, users who, for example, become paralyzed won’t have to rebuy or retrofit every electronic device they own in order to use them.

The V2 working group is close to issuing a draft standard, which is the first step in really opening it up for business. It should be out six weeks from now (April or May 2004).

The protocol stack, presented by Gottfried Zimmermann, is pretty hard to explain without a chart. Suffice it to say, it has various frequencies and media on which V2-compatible devices can communicate (802.11b, Bluetooth, FireWire, Powerline, Ethernet, and more), a discovery protocol, and at the highest levels, XML-based data interchange and even interface guidance.

Chris Hofstader of Freedom Scientific says this is reasonably easy to implement on assistive technology like his company’s JAWS screen reader. He related a real-world example of why the URC is what he needs: at the hotel he’s staying in, he said, he has been having trouble with getting crushed by elevator doors. If elevators supported V2, he could make the elevator wait until he’s safely inside.

Al Gilman, my chair in the W3C/WAI Protocols and Formats Working Group, went next. He mentioned that device independence is important, and people are working on it: for example, the Device Independence Working Group at the W3C. It’s larger than just accessibility, though it does touch us in several areas.

Accessibility: what not to do

Filed in: accessibility, CSUN2004, Web, Thu, Mar 18 2004 21:25 PT

Jim Thatcher, lead author of Glasshaus’ “Constructing Accessible Web Sites”, presented case studies of sites that have technically passed the Section 508 standards, but in a completely unusable way for users with disabilities.

Long long descriptions. Showed the Department of Veterans Affairs site’s code, which has descriptions attached that basically explain what was already in the document, and may as well not have existed. Spacer GIFs that say “one pixel image, used for spacing.” Do not do this. He fixed the document, reducing the VA page’s reading time by 77%.

Skip-navigation links (see the nav of my page for an example) are good, in moderation. CNN has one, which skips hundreds of links. Jim’s site has two levels. ITTATC has four levels, which is getting a little burdensome. But HP’s site, archived in 2002, he says, was the worst example: it had eleven skip links, each with “you have just skipped…” after the fact. Super-complicated. And you could skip the skip links. Half the words on the HP page were skip links.

The worst of all is a current site: the National Archives and Records Administration. It’s a simple enough page, design-wise. But they do so many things wrong: images with tiny print for the nav, and the worst select menu ever: it has an onchange event attached, so interacting with the keyboard causes the page to reload as soon as you hit the down arrow. Do not do this.

Then, you roll over the tiny text images, and an image of explanatory text appears to the left. Totally unreadable by assistive technology, natch. There’s an image in the noscript area of the document that’s one by one pixel, doesn’t have any alt on it, and doesn’t appear to have a reason to exist. The alt text of another spacer informs users of screen readers of the contortions they have to go through to use the site. They take roughly four hours to get through, and tout how great it is that users can’t interact with their navigation. It also announces “This table is used for layout” in the summary attributes of their tables. Do not do this. Leave an empty summary if it’s for layout. He can cut 80% of the bad stuff out of this site for users.
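Most of these mistakes are mechanical enough that a script can catch them. Here is a rough sketch using Python’s standard `html.parser` module. The `OverexplainingAudit` class, its heuristics, and the sample markup are my own assumptions for illustration, not anything Thatcher showed, and a real page would need a human to look at whatever it flags.

```python
from html.parser import HTMLParser


class OverexplainingAudit(HTMLParser):
    """Collects the kinds of problems described above (heuristics only)."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            alt = attrs.get("alt")
            if alt is None:
                self.problems.append("img without alt attribute")
            elif self._is_tiny(attrs) and alt.strip():
                self.problems.append("spacer image with non-empty alt: %r" % alt)
        elif tag == "table":
            summary = (attrs.get("summary") or "").lower()
            if "layout" in summary:
                self.problems.append("layout table announces itself in summary")
        elif tag == "select" and "onchange" in attrs:
            self.problems.append("select with onchange handler")

    @staticmethod
    def _is_tiny(attrs):
        # Assumption: a 1x1 image is a spacer and should carry alt="".
        return attrs.get("width") == "1" and attrs.get("height") == "1"


# Made-up markup that repeats each mistake once, plus one correct spacer.
SAMPLE = """
<table summary="This table is used for layout"><tr><td>
<img src="spacer.gif" width="1" height="1" alt="one pixel image, used for spacing">
<img src="dot.gif" width="1" height="1">
<img src="spacer.gif" width="1" height="1" alt="">
<select onchange="location = this.value"><option>Home</option></select>
</td></tr></table>
"""

audit = OverexplainingAudit()
audit.feed(SAMPLE)
for problem in audit.problems:
    print(problem)
```

The third image, with its empty alt, passes quietly, which is the point: the fix for an overexplained spacer is to say nothing at all. The onchange check is deliberately crude. It only tells you where to go and try the down arrow yourself.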

The basic message of this is not to overload users of screen readers with remedial site usage details that they don’t really need. People who are just coding to the letter of Section 508 may have their hearts in the right place, but need to use their head, and need to read about what they don’t need to do, as much as what they do. Thatcher suggests hiring accessibility experts who, for example, know what the CSUN conference is, even if they’ve never been. (This ends up being shorthand for: if you don’t know any users with disabilities, or how assistive technology works, or the rules of applied Web accessibility, you are not a Web accessibility expert. Perhaps impolite, but otherwise accurate, in my opinion.)

Kynn Bartlett: blogs and accessibility

Filed in: accessibility, blogging, CSUN2004, 00:00 PT

Kynn's Obey UAAG logo

He knows I’m blogging this.

On Sunday, I was the accessibility guy talking to bloggers about accessibility. Today, Kynn Bartlett (Maccessibility, Shock and Awe) is the accessibility guy talking to the accessibility community about blogging. Or maybe he’s the blogger talking to the… or maybe I’m… okay. Now I’m confused. Anyway.

(I showed up late. Sorry. It was Microsoft’s fault. I didn’t even get a t-shirt. The end is always the best part, anyway.)

He smirked at the “crotchety old curmudgeons” who code their sites by hand. The point, he says, is to get your message out there, not how hardcore you are. From which, we get Movable Type. He gave a quick demo. The idea of blogs is easy content.

Someone asked about whether blogging makes for publishing, since many people would think of it as graffiti. Kynn said, essentially, that graffiti is still a form of expression (and the Los Angeles River its greatest content aggregator), and it’s good to have the ability to host that expression.

The benefits of blogging to accessibility: Content can be separated from presentation. You can have alternate interfaces: for example, the blind people group on LiveJournal. And RSS: despite the contention over the standard, it passes the just-works test for end users. Stuff like BlogLines for aggregation. (Aww. He featured my blog. Thanks, Kynn!) He actually distracted himself with someone else’s feed during his own preso. NADD much?

Anyone can publish. That includes users with disabilities. “Anything that makes it easier for anyone to publish makes it easier for people with disabilities to publish, because they’re just anyone.” And it doesn’t matter if you’re blind if all you want to do is hate Bush (like Kynn and me). But it’s personal enough that details tend to escape.

Accessibility challenges: “Do we really want people to create all this crap?” Now that the bar is lowered, newer bloggers aren’t learning things that are obvious to Web designers, like alt text. There are things like photoblogs and audioblogs (q.v.: my advice to audio and video bloggers) that aren’t accessible. And the tools themselves have no guarantee of accessibility.

How do you promote accessibility? Reach out to receptive bloggers. Look at and point to blogs on accessibility. “The Zeldman Effect,” as Kynn puts it, is that whatever Jeffrey thinks is cool is what others will think is cool. The cool part is that the people at the top of the design food chain actually think accessibility is cool. “And that’s starting to trickle down now.” And ensure that all voices are heard, including users with disabilities. The more you make that visible, the more it spreads.

He closed with Mark Siegel of the 19th Floor. (He likes Mark. We all like Mark.) No Pity is a LiveJournal community just to chat without looking for sympathy, etc. (My old-school friend Rachel created that. Rachel’s cool, and her fonts are really big.)

That’s it, he says, get to bloggin’. And go see Kynn’s blogging class.

Keynote: Vinton Cerf

Filed in: accessibility, CSUN2004, tech, Wed, Mar 17 2004 16:45 PT

Last year, Ray Kurzweil carried the audience fifty years forward at the CSUN keynote. (Which turned out to be about fifty years further than many apparently wanted to go.) So this time around, we have another guy who is deeply rooted in our technical history, and looking

Overheard during the intro: “Have you heard of this guy?” “No, who is he?” Argh.

He started by talking about MCI Mail. His boss told him: “To do the impossible, first you have to believe it isn’t.” Power corrupts; PowerPoint corrupts absolutely. (Woohoo! Tufte speech! References to McLuhan! I’m in heaven!) So, he knows PowerPoint isn’t the greatest thing for speeches, but he’s going to use it anyway. There are now 800 million to 1 billion Internet users, with the US and Canada, Europe and Asia sharing equally in numbers. Asia/Pacific is going to be a huge source of growth. He predicts up to 2.7 billion users by 2015. It tapers off slightly after 2006 because of the diminished capacity to fund the infrastructure in the third world.

He showed a photo of a coin-op laundry in Belize that has added Internet service. He found 36 Internet cafes in Ghana.

Cerf explained why IP is so cool: it doesn’t care what its medium is, or what it’s carrying. He showed his “IP on Everything” t-shirt. (I love that story.) Now, he wants a t-shirt that says “IP under everything.” There are Internet fridges and picture frames. (He said he thought that was “about as useful as an electric fork.” When he thought about it, though, he thought it was pretty cool: only two buttons on the thing, manages the things it does well, and you can now have stock quotes or scores, whatever. He now has a half-dozen of them spread around his family.) Their presence, he says, implies a lot about what is about to happen. The ability to bring in relays, speech to text to speech, is a part of Internet architecture. So we can see these applications as the “tip of an iceberg-sized collection of functionality.”

Video conferencing, which was once seen as the “holy grail”, could come to pass as a result of video games implementing that functionality cheaply in one package. He’s talking about radio frequency identifiers (RFIDs). Wal-Mart and Procter & Gamble are all about that because of stock maintenance. (I got what I think was my first RFID at SXSW this week.) He’s joking about Internet fridges using RFIDs to detect contents, and then using that data to determine what needs to be repurchased. (What’s frightening is that when I was working in online grocery, we were thinking about this stuff.) So he mentions that there are Japanese companies making Internet scales. He mused, what if the scale could talk to the fridge? (Big laughs.)

Next, the Internet wine cork. Imagine recording wine information on an RFID, and when the wine turns out not to have been good, you can ask what happened from the bottle, and the bottle can report back that back in 1985, it was in a room that was 104 degrees. (Maybe okay for madeira, but bad for other wines.) And Internet socks, so you can check the house for that one missing sock. Of course, there are security issues all over that, he says.

Now, he says, I’m not a designer of assistive technologies. He talks about Sigrid, who received a cochlear implant at age 50 after being deafened at age 3. There’s a computer that’s an important part of the implant. She said she turned into a “50-year-old teenager.” She even chatted up the AT&T operators, despite her husband (Cerf) being a vice president at MCI. Started listening to books on tape to learn how words she recognized in print were pronounced. She called the library to subscribe to books on tape. “They said, ‘You’re blind, aren’t you?’ And she said, ‘No, I’m deaf.'” Sigrid attaches a mic to people to talk with them, and she’s “not afraid to use this technology visibly.” She once took the mic off of Sam Donaldson when she wanted to talk to someone else. He says, don’t be ashamed to use technology visibly. Encourage people to be “fearless about using these auxiliary technologies.”

Naturally, Cerf wants to wire her implant device to the Internet, so she can hear data directly in her head. Though that is likely to be bad for cheating students, he supposes. It’s now possible to do various translations: amplification, sign, real-time captioning. “It’s that combinatorial power that I think is so exciting.” He wants to take these technologies from being assistive to being augmentative for people without disabilities. He wants that voice in his head, too. (And, really, accessibility has historically served this purpose: optical character recognition is just one example of systems that were widely used for accessibility purposes before they ever reached the broad market.) So “while you’re at it,” assistive technology developers, think about what other things you can do with the technology. Use standards to interoperate. (Yes! Listen and heed, assistive technology vendors.)

“I think there are people who abuse the intent of the ADA.” On the whole, though, the execution of ADA has been beneficial.

“Standards create interoperability.” He mentions the World Summit on the Information Society. He says he feels assistive technologies haven’t been adequately addressed, and he plans on raising that point at their next meeting in New York. But he shouldn’t be the only one. He wants people to get informed and involved in that, and make people creating Internet applications more aware of accessibility implications.

He says that devices “are just a receptacle for software” to enhance experience. Think about what around you can be fixed or made better by giving it more applications. (Provided they don’t just get 300 keys on all of them, I hope.)

Great plenary session. For a moment, I forgot I was stuck in Los Angeles.
