Wednesday 23 October 2013

The largest software project in the history of the universe?

I recently read that Healthcare.gov consists of 500 million lines of code. As someone educated in software development, I found that this report got my attention. The sheer size of a project of that magnitude would easily make it the largest project in the history of computing. To put this in perspective, if you add up the major Microsoft OS releases between 1993 and 2003, you get around 149 million lines of code. The only projects that come close to 500 million are some open source projects, but these are exceptions because they are typically layers of code accumulated over decades, involving tens of thousands of individual programmers. The point of this article is to show how ridiculous this claim by the administration is, and the best way to do that is with math. Yay.

500,000,000 SLOC
/
1,000 SLOC one programmer produces per month (average)
=
500,000 programmer-months
/
36 months
=
13,888.889 programmers needed to complete the project
*
$70,000 web developer median annual wage
=
$972,222,222.2 labor cost per year
*
3 years
=
$2,916,666,666.7 total labor cost

Please note that this covers only the labor cost of writing the SLOC (source lines of code). It doesn't include the overall project cost, which would effectively be 33% more.
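For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same back-of-the-envelope estimate. The inputs are the figures quoted in this post (500 million SLOC, 1,000 SLOC per programmer per month, 36 months, a $70,000 median wage, 33% project overhead). The function name `estimate_labor` is made up for illustration, and the sketch assumes the $70,000 wage is an annual figure, so the total is scaled by the length of the project in years.

```python
# Back-of-the-envelope labor estimate using the figures quoted in this post.
# Assumption: the median wage is an annual figure.

TOTAL_SLOC = 500_000_000           # reported size of Healthcare.gov
SLOC_PER_PROGRAMMER_MONTH = 1_000  # average output of one programmer
PROJECT_MONTHS = 36                # assumed development period
MEDIAN_ANNUAL_WAGE = 70_000        # web developer median wage, USD
PROJECT_OVERHEAD = 0.33            # non-labor project cost on top of labor


def estimate_labor(total_sloc, sloc_per_month, months, annual_wage):
    """Return (programmer_months, programmers, yearly_cost, total_cost)."""
    programmer_months = total_sloc / sloc_per_month
    programmers = programmer_months / months
    yearly_cost = programmers * annual_wage
    total_cost = yearly_cost * months / 12  # scale annual cost to project length
    return programmer_months, programmers, yearly_cost, total_cost


pm, programmers, yearly, total = estimate_labor(
    TOTAL_SLOC, SLOC_PER_PROGRAMMER_MONTH, PROJECT_MONTHS, MEDIAN_ANNUAL_WAGE
)

print(f"Programmer-months:   {pm:,.0f}")
print(f"Programmers needed:  {programmers:,.0f}")
print(f"Labor cost per year: ${yearly:,.0f}")
print(f"Total labor cost:    ${total:,.0f}")
print(f"With 33% overhead:   ${total * (1 + PROJECT_OVERHEAD):,.0f}")
```

Changing any one input (say, a more generous SLOC-per-month rate) shows how sensitive the headcount is to the assumptions, but it takes a very large change to bring the result anywhere near a normal-sized team.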

Let us also consider that 13,888 web developers would be around 7% of the workforce. I was hoping to track median wage statistics over the years in which this project was in development, but unfortunately the BLS didn't track web developers specifically until 2012. Pulling that many people out of the available labor pool would have caused a significant supply reduction, which could have been verified independently, if only the BLS had been tracking those stats. I also searched all of the major industry publications for mentions of Obamacare, the ACA, or Healthcare.gov between Jan 2009 and Oct 2013. Nothing was mentioned. This has to be the most information-secure project ever. You don't simply hire and employ close to 14,000 people without anyone noticing.

Now all of this supposition is based on accepting that the stories are accurate as reported. Obviously this is not the case. There is no way that this website contains 500 million unique lines of source code. It may contain 500 million lines of reused code, which is much more conceivable. If that is the case, however, this project was run by an idiot who has no concept of software development or technology in general. I can just imagine that this was some haphazard attempt to hack several CMSs together into a system that somehow routed information between several databases. I haven't professionally programmed a back end in years, but I can tell you this much: there is no way this project should have cost $400+ million, nor should it have required 500 M-LOC. I can safely say this because the software requirements simply don't call for that much data management. Oh yes, I know there are several databases and so on, blah blah. Debian 5.0, the largest code base I could find, is 324 M-LOC. Debian is orders of magnitude greater in complexity than any website. I would say that Google search wouldn't contain much more than 1-5 million lines, and I doubt it contains even that much.

In conclusion, I have to say that this whole story about the amount of code involved is pure fiction. It also tells me that there were like 4 guys in a conference room for like 6 months merging two or three open source CMSs together, attempting to meet the system specifications. That is the real story of Healthcare.gov.


