• Speculation about why the federal exchange is so slow and buggy

    This morning, I was discussing with a colleague Ezra Klein’s Wonkblog takedown of the federal exchange website. He points to a “darkly amusing” Reddit thread full of evidence of technical goofs by the site’s coders. My colleague and I speculated that government red tape and security regulations may have driven some design decisions in a way that makes the site less efficient. Yeah, maybe some of them look stupid from the outside, but perhaps some could not be avoided for compliance reasons. It’d take a good technical/journalistic investigation to know for sure.

    The strength of this theory is that there really is a lot of government red tape and a ton of security regulations. The weakness of this theory is that other, single-state sites are performing relatively well (e.g., Kentucky’s). Still, it could be a confluence of compliance-driven inefficient design and traffic in excess of server capacity that makes the difference. I’m only speculating.

    Still, some support of this regulatory compliance design constraint theory is found in Timothy Lee’s interview with Robert Moss. (Emphasis added.)

    Even though in this type of setting the development teams are using what you might call agile methods, there’s still a huge layer of requirements and review and sign-off. There’s lots of policy decisions that have to be made that shape every step of the way. There’s much more overhead involved in this sort of thing than if you’re trying to have a small set of people developing the Web site.

    The [bottom] layer is the Affordable Care Act, which laid out the parameters. Then on top of that are all the regulations that HHS issued over the course of two years. Then it goes to contractors who have to build it. If you look at the contract, there’s usually a prime contractor and subcontractors. And I think that just adds to the complexity and adds to the number of parties involved. The state governments had to comply with CMS mandates and then work with their contractors. So it’s a pretty complicated structure of trying to roll out. To design what you’re trying to build and build it at a time where the regulations were being written. […]

    [One reason] these projects [are] so big is that there is a very rigorous security oversight involved and layers of audit, layers of rules. The kind of thing that small start-up companies who are just winging it [don’t deal with]. The Centers for Medicare and Medicaid Services has been doing this for a long time in terms of the Medicare and Medicaid program. There’s already a lot of rigor, security audits that have to be passed. […]

    The Internal Revenue Service is involved for tax reasons. The Department of Homeland Security is involved because of immigration status. The Social Security Administration is involved. Lots of agencies are involved to confirm eligibility for coverage and subsidies. Part of the challenge has been building a data hub to connect data from all of these agencies.

    You can bet all these agencies have their own requirements that might trickle down to design constraints. Maybe it’d be more efficient to host or access data this way vs. that way but the agency in question can’t permit the efficient route because of regulation XYZ, blah blah blah. This would not surprise me in the least, but that does not mean I’m right. It also does not excuse an unworkable website.


    • What astounds me is that, given the human propensity to procrastinate, so many were compelled to sign up almost three months before the insurance will go into effect. In any case, assuming that the volume of traffic isn’t sabotage by you know who (sometimes paranoids have cause to be paranoid), would uniformity have made a difference? It’s not just different federal agencies that are involved, but thirty-something states, each with different requirements, different policy benefits, different costs, different everything.

    • It is a bit frustrating that taxpayers have to pay for a site that essentially helps for-profit insurance companies pull in more subsidized customers. Maybe WellPoint, Cigna, United and the others could help the rest of us pay for the website?

    • I started visiting and mocking healthcare.gov about two years ago. As an IT professional, my complaint was that it had an unnecessarily clunky user interface if you wanted to find health insurance policies. It seemed to be a health care information site that was coerced into competing with ehealthinsurance.com and the others. There is nothing that destroys usability faster than a site trying to serve two masters. My assumption was that it was a business HHS did not want to be in but they had to finish writing the back end specs before they could get out of the business. Once the specs were complete they could publish an API and let ehealthinsurance.com and the others finish the marketing job. At this time it is hard to see whether scalability or specs is the primary problem. Scalability is a much easier problem to fix than bad specs. If bad/untested specs are the primary problem, the bugs could be with us for several months and will make scalability issues worse than they should be. Oh well!

    • The federal exchange is slow because it was never meant to handle 34 states. Plenty of state exchanges are running OK, and most states could and should be running their own exchange.

      (Does it need to be improved? Yes)

    • I don’t think you should discount the sluggishness that HHS has displayed in issuing guidance and making decisions. I work with Kentucky’s exchange, and even though we were ready to be fully operational they issued regulations 8 days before our go live date declaring that we had to disable the part of our site that would accept payments until 11/1/2013…don’t want one state to be fully operational when the federal hub is in shambles.

      Medicaid expansion has been the same story. It seems like we have spent an inordinate amount of time waiting for HHS to publish rules they have had 3 years to develop.

    • Capers Jones, at spr.com, is the expert on software development efficiencies. In his studies he has found that the government is just about as efficient at developing software as private industry. (That is, except for the military.) The gov tends to take longer to prepare – writing more detailed specs – but the actual coding goes faster.

    • As a regular reader with a little bit of web development experience, I feel like I can contribute here.

      Suffice to say, there are *always* things you can do to improve a website, improve code, improve user interface, etc. Give any developer a web address and they’ll find something that seems wrong and/or problematic. Some of the comments/complaints about Healthcare.gov are like this…more nitpicking than anything else.

      I’ve reviewed the reddit thread you’ve linked to, checked out all the comments, and come to the following conclusion: the site wasn’t quite ready for prime-time. The developers failed to make some finishing touches on the “front end” of the site (the part that users see), and it seems clear that the “back end” of the site wasn’t tested as well as it should have been.

      As someone who has launched a few websites in my time, I’d say that this is not at all unusual. The difference between healthcare.gov and most other sites, however, is that you can “soft launch” most sites on a weekend or slow weekday, and then fix issues as they come up. It’s not ideal – you’d prefer not to have any problems – but it’s pretty standard.

      Suffice to say, the problems are overblown. Anyone who had any practical experience in web dev would know that this isn’t that big of a deal. Sites rarely work perfectly out of the box. Like most systems, they need ongoing refinement.

    • @Jason – I cannot say the problems are overblown. It is impossible to figure the magnitude of the problems until you can create a userid, and the current system failed this simple test. If this system was in the refinement mode then they would have delayed the launch date a week or a month to have the developers fix the problems. Launching the site on October 1 was a political decision and it implies that management has accepted a more chaotic future. Since the system has gone live I assume that the contractor has completed the customer acceptance test, the developers have been given a punch list, the system is now being transferred to the maintenance staff, and the developers are looking for their next job. That is the way government contracts work. If you go live, then you accept the system. Now we are stuck with fixing an incomplete system with a history of too many cooks in the kitchen.

      So the interesting question is how badly mismanaged was this project and why. @McDruid pointed out that there are good ways to develop systems but considering the number of red flags already flying by Capers Jones standards the management for this project was incompetent. In fact I suspect there are a lot of software engineers and MBA types gathering info to write a book on the lessons we deliberately ignored during this project. One of the red flags was @Robin’s comment that the specifications were still being finalized eight days before launch date. Good project management would have locked down the specifications 30, 60, or 90 days in advance so you can complete the testing before going live. Another red flag has to be the problems with the front end. This is the easy stuff and they screwed it up. As an example I tried to perform a simple check to see what insurance plans were available on healthcare.gov for my demographics. This is a query that I completed in 2011, 2012, and earlier this year without logging in. Now I cannot complete this simple query because I am required to log in to the system before proceeding. Requiring a customer to log in before proceeding is one of the cardinal sins of eCommerce design. My guess is that this was a deliberate design change to increase the number of people signing up on the exchanges and an unintended consequence was that it cratered the system. The number of people trying to apply seems to be the most important talking point on the Sunday morning talk shows. This amount of middle management meddling so late in the design is a red flag. So I am concerned: if they screw up the easy stuff, what does this say about the in-house testing? If the in-house testing was either ineffective or non-existent, then it looks like they have embraced maintenance mode and the idea of letting the customer find the problems. This is an unpleasant way to fix a system. Considering we have not even started testing the tough stuff, we have too many red flags.

      • Unfortunately, you can’t evaluate a system based on the fact that a supposedly “simple” thing like user registration isn’t working.

        First, the facts are that the system did register some users. Second, more than one person suspects that the site was under attack on launch day…bots could easily take down the registration system, especially in the first few hours. Third, the site is already working much better in the time since I commented.

        The reports of problems on Healthcare.gov have been overblown – plain and simple. Web development isn’t like hanging the Christmas lights…you can’t just plug things in and have them work.