May 26, 2011

President Obama's Long Form Birth Certificate

I am not a "birther". I’ve never been especially impressed with President Obama, but neither have I been impressed with many of his detractors. I began to look into the issue of his birth certificate very recently because a friend sent me a link to a YouTube video that caught my attention. I have decades of experience with digitally processed images, scanners, and things of that sort. It appeared to me that the author of the YouTube video had uncovered something interesting, so I decided to examine the PDF of the long form birth certificate myself, largely out of professional curiosity. I have political opinions on this issue, but I will reserve any comment on those opinions until the end. First, I will deal, to the best of my ability, in facts.

[Image: a portion of the long form birth certificate]

The image above is a portion of the document in question, and is provided only for reference. The PDF document is greater in extent and contains more information. Everything I care to say has to do with the region above, and not with the peculiarities of the stamps, the security paper, etc.

The actual PDF on the White House website is divided into a number of layers. There is a background layer, which includes the pattern of the green security paper, the lines of the form, and an odd scattering of seemingly random parts of the text, including parts of the typed and written text. There are several higher layers that include individual stamps. Between the background layer and the stamp layers is a single layer that contains most of the printed, typed, and written material. It is this layer I will chiefly focus on.

I cannot say with any certainty how the separate image layers were generated. As others have pointed out, such layering is not a normal artifact of the scanning process, and this was claimed to have been a document scanned from a copy. As has also been pointed out, PDF optimization can create layers but the layers this document is sorted into are not consistent with that process. I tend to agree, but for the purposes of my evidence it doesn’t really matter how the layers were originally generated. The primary text layer got there somehow, and I intend to show that it contains features that are almost certain evidence of tampering in and of themselves.

The primary text layer has to have been, at some point, reduced to a binary image. Let me try to explain this in layman’s terms. All digital images consist of a grid of individual picture elements – "pixels". In a color image, each pixel has a particular color. In a grayscale image, each pixel is a shade of grey. In a binary image, each pixel is normally either white or black. Binary images have an unmistakable "jagged" look. Images of text are commonly saved in binary form, because binary data saves file space. PDF optimization can create a layer of binary material out of things like text and form lines. This may or may not be how the primary text layer was created, but in any case it is binary, composed of pixels of only two types.

When a typed character is reduced to a binary image, it becomes a pattern of black and white squares. If you type a thousand letter "B’s" on an old manual typewriter, like the one a person would have used to fill in a 1961 birth certificate, no two letters will be exactly alike. The grain of the paper and the typewriter ribbon, the pressure and the speed with which one strikes the key, and other factors produce many variations on the same basic letter. When you scan such a typed document as an image, you create even more variations because of the physical properties of the scanner. While binary images of letters are less varied than color or grayscale ones of the same resolution, they are still so varied that at the 300 dpi resolution of the long form birth certificate PDF, it would be highly improbable to find two letters exactly alike. The President’s birth certificate contains at least two letters that are absolutely identical, pixel for pixel.

[Image: magnified comparison of the matching letters]

If a person has a layer of binary text he or she wants to make changes to, the easiest way to do it is to copy words or characters that already exist elsewhere in the text. One can just copy and paste letters to make new words, rather like an old kidnapper’s ransom note. As you can see in the illustration above (magnified for clarity), the "B" and the second "I" in "OBAMA, II" are identical to the same two letters in other places. It is likely they were copied to, and not from, this location for reasons I will outline later.

Modifying binary text of this kind is comparatively easy to do, because the white (or transparent) background behind the images eliminates the risk of creating artifacts in the process of moving letters around. It is also fairly easy to alter the copied letters to make them look authentic. That is, if one understands what one is doing and isn’t careless.

[Image: two examples of close mismatches, magnified]

To understand just how improbable it would be to find two identical letters in the President’s birth certificate, I have shown two examples of close mismatches (again magnified for clarity). The green and red letters were taken from different places in the document. The black letters in between show the pixels shared in common, while the red and green pixels around the edge of the black letters show the differences. The extent of the variation is bound to vary with the size and complexity of the letter, but both the "S" and the "B" vary by more than forty pixels. As I’ve said, these are close mismatches, and there are plenty of pairs in the document, struck with the same key, that vary even more. Still, let’s err on the side of caution and say that the average variation is about 40 pixels.

If the variation were much smaller, say, only one pixel, then there would only be two possible patterns for each letter. The odds of any two letters of the same type being identical would be 50%. If they varied by only two pixels, there would be four possible patterns – a 25% chance of an identical match. The odds decrease exponentially; 2 raised to the 40th power is a staggering number: 1,099,511,627,776. In other words, 40 randomly varying pixels make the odds of any two letters of the same type being identical about one in a trillion. Of course, there may be some patterns that are more likely than others; there are multiple capital "B’s" and "I’s" in the document, but even allowing this it is hard to imagine the odds getting any better than one in a billion. This is more than close enough for practical certainty. There may be more matching letters in the document for all I know. I only spent a couple of hours looking.
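The arithmetic in the paragraph above can be reproduced in a few lines of Python. This is only a sketch of the calculation as stated: it assumes, as the text does, that each varying pixel behaves like an independent 50/50 coin flip. The function name `match_probability` is made up for illustration.

```python
from fractions import Fraction

def match_probability(varying_pixels):
    """Chance that two letters agree on every varying pixel.

    Assumption (taken from the text): each varying pixel is an
    independent 50/50 coin flip.
    """
    return Fraction(1, 2) ** varying_pixels

# The three cases mentioned in the text: 1, 2 and 40 varying pixels.
for k in (1, 2, 40):
    p = match_probability(k)
    print(f"{k:>2} varying pixels: 1 in {p.denominator:,}")
```

If the pixels are not independent, or some patterns are more likely than others, the real figure would differ from this one.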

Now, consider where the identical matches occur. They occur in the President’s last name. The inference here is obvious. The likely purpose of the forgery was to hide the fact that the President was born out of wedlock, and that his last name on the original document was "Dunham", his mother’s maiden name.

[Image: the President’s mother’s signature]

As you can see above, the apparent manipulation of the President’s mother’s signature also adds weight to the theory that he was simply illegitimate. Her partial signature is the only signature material on the binary text layer. It is plausible that "Obama" was added and "…unham" was overwritten from the same source to make the alteration less conspicuous. Note, too, that if "OBAMA, II" replaced "DUNHAM" in the typed text of the President’s name, two of the needed letters, "A" and "M," were already at hand. I have searched in vain for the missing "O", second "A", the comma, and second "I". Perhaps they were modified from other letters. Someone with more time, better software, and a flair for statistical analysis may succeed where I have failed.

The matching letters, I believe, are conclusive evidence of tampering. The document does hold certain other mysteries, many of them no doubt wholly innocent. Finding letters moved from place to place makes the proposition that the birth certificate is a forgery from the ground up rather unlikely. It would have been far easier to have cleaned someone else’s form and simply typed in all the text. I believe the President was probably born when and where he said he was, and had the biological parents he has always claimed to have had.

Presuming my theory about the President’s illegitimacy is correct, there is something both ironic and sad about the whole affair. It hardly matters to me whether his parents were married or not. This is not 1961, and I doubt that it would matter to most people now. The people it would matter to probably despise Obama anyway. It is hard for me to believe that none of the experts among the president’s enemies have found the same evidence and come to the same conclusions I have. It seems at least plausible that they have made the political calculation that rousing suspicions of a "Manchurian candidate" is a profitable adventure, but exposing a person’s illegitimacy might only seem a callous, brutal exercise. Lacking either any special ill will or political calculation, I, however, can present my evidence with an entirely clean conscience. Truth cannot be negotiated.

7/4/2011 - Note:

This article has, not surprisingly, stirred up a considerable amount of anger from some, and a variety of criticism. Some of these criticisms may be worth looking into, and, having given the matter consideration, I would like to address two of them.

There has been a persistent criticism that there is no controlled study here to support my technical evidence. My initial view was that, given the nature of my argument, comparison with other documents hardly seemed necessary. The first person who raised this issue (Mr. Planck) demanded a double-blind study of Hawaiian birth certificates of the same age. Since this is obviously beyond my resources in time, money, and authority, this seemed to be just an expedient attempt to stifle the matter. Having given this further consideration, I realize there are other types of control studies that can be done. I don't have the time to do the work at the moment, but I welcome anyone else to conduct the following experiment.

One would need to type, on a manual typewriter, some large number of letter "B"s. The more the better, but the larger the number the more tedious the matching process is going to be. The sheet should be copied on a modern copier once, preferably but not necessarily onto security paper of the approximate type used for the president's birth certificate. The copy should be scanned at the same resolution one sees on the PDF. I don't recall what that resolution turned out to be, but the PDF can be used as a guide. The scan can be either a TIFF or can go directly to PDF if the scanner allows. (This may seem haphazard, but since I don't know the exact process used to scan and optimize the birth certificate, we can only be so precise.) Optimize the scanned image. Adobe Acrobat Professional is the software I would use, but you might be able to optimize in other Adobe applications as well. Extract the resultant binary text layer from the resultant PDF. (I would suggest using Adobe Illustrator for this process.) Now comes the fun part. Compare each "B" with every other "B" to see if any of them match. If you are proficient with Adobe Photoshop this is not as bad as it sounds. Make a duplicate layer and use the "Difference" setting. Matching "B"s will disappear, while non-matching ones will show the differences. Scoot the layer around in a systematic way, and you should be able to complete the comparisons in a few hours. Report the number of exact matches you find, if any. If the pixel dimensions of the characters are about the same, you do the process right, and you find a match in something less than a thousand characters, I will consider my argument refuted and publicly say so. I reserve the right to raise any doubts I might have about your process, but even if I doubt it I will still post a link to your trial on my original article -- as long as you've had the courtesy to be civil.
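For anyone who would rather script the comparison step than drag layers around in Photoshop, the following Python sketch may help. It assumes the "B"s have already been cropped out of the binary layer to the same pixel dimensions and aligned, which is the part the "scoot the layer around" step handles by hand. The function names and the toy 3x3 glyphs are invented for illustration, and the per-pixel XOR merely stands in for the "Difference" setting.

```python
from itertools import combinations

def parse_glyph(rows):
    """Turn rows of '#' (ink) and '.' (background) into tuples of 1/0."""
    return tuple(tuple(1 if ch == "#" else 0 for ch in row) for row in rows)

def differing_pixels(a, b):
    """Count the pixels where two same-sized binary glyphs disagree (XOR)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same pixel dimensions")
    return sum(pa ^ pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def exact_matches(glyphs):
    """Return the index pairs (i, j) whose glyphs match pixel for pixel."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(glyphs), 2)
        if differing_pixels(a, b) == 0
    ]

# Toy stand-ins for cropped letters; real glyphs would be far larger.
samples = [
    parse_glyph(["##.", "#.#", "##."]),
    parse_glyph(["##.", "#.#", "###"]),  # differs from the first by one pixel
    parse_glyph(["##.", "#.#", "##."]),  # identical to the first
]

print(exact_matches(samples))  # [(0, 2)]
```

Comparing every pair grows with the square of the number of letters, which is still quick for a few thousand small glyphs.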

Could I do this test myself? Yes, but I don't have either the manual typewriter or the time. It hardly matters who conducts the experiment.

My second proposal is to do something similar regarding my statistical analysis. If you can show that my math is wrong, I'll post a link to your evidence on my original article. If I understand your refutation and accept it as correct, I will acknowledge my error.

I check comments on my site regularly, and can be reached by commenting on this post.



  1. I'm still holding out for a controlled test. I think you can find Donald Trump's b/c on-line; poring over the PDF of that with the same criteria would be interesting.

    There is a possibility here that I could believe: that is, that Obama is illegitimate and doesn't know it. One can imagine this is the sort of thing his parents might have forged on their own, without telling him. That kind of stuff happens all the time.

    Of course, your theory has no political consequences, so it's simply going to be ignored. Who wants facts that don't help prove their case? :D

  2. Since all of the obvious alterations are digital, the notion that Obama's mother had it doctored up in 1961, or shortly thereafter, is nonsense. The idea that parents routinely break into hospital records and tinker with the documentation is also ridiculous. Who are you suggesting Obama's real father was, G. Gordon Liddy?

    I am hard pressed to understand what anyone would consider evidence if not a direct character match with such a low order of accidental probability. Short of a confession in front of the White House press corps, few things could be more conclusive.

    If someone, at the President's request, forged an official document it certainly COULD have political consequences. Politicians lie, but forgery is another matter. Further, there are probably legal ramifications to providing false documentation to election officials.

    Frankly, it changes nothing if Donald Trump's birth certificate turns out to be written in crayon.

    - e.m. cadwaladr

  3. I looked somewhat hastily at your analysis.

    You make a statement, for which you offer no support or evidence: "As others have pointed out, such layering is not a normal artifact of the scanning process, and this was claimed to have been a document scanned from a copy. As has also been pointed out, PDF optimization can create layers but the layers this document is sorted into are not consistent with that process." I suppose you are claiming expertise, but as far as I am concerned you are an "anonymous expert", which I cannot accept.

    The second problem you have is plausibility. The Hawaii Department of Health Obama FAQ states that they delivered the long form certificate to Obama who put it on his web site, and then links to the White House page where the certificate images are. The State is endorsing the image. The digital images were made by the White House, so if your theory is right, the document Obama received from Hawaii said "Dunham" and what they released said "Obama", and that is impossible because:

    1) The Hawaii Birth Index for 1961 says Obama
    2) Two newspaper notices from the Bureau of Vital Statistics in 1961 say Obama
    3) The Certification of Live Birth released by the Obama campaign in 2008 says Obama
    4) The Obama divorce decree puts their date of marriage prior to the President's birth

    So one can rule out any possibility that the President's birth certificate has, at any time since 1961, said anything other than Obama.

    I haven't had time to look at your "probability" calculation except to note that it's not right (I have a Master's Degree in Math). I'm not saying that the right answer (even under your assumptions) isn't a large number, just that you didn't do it right.

    The White House released photocopies of the originals to the Press, and high-resolution scans (at least 400 PPI) of those are available on the Internet. It would probably be worth your while to see if the identical letters you claim to find in the PDF also appear there. It is clear that the photocopies are not derivative from the PDF files because they show more detail -- another problem with the whole PDF tampering theory.

  4. Dr Conspiracy.

    I have no more reason to accept your anonymous math expertise than you have of accepting my graphics expertise.

    I admit I cannot account for "chain of evidence" issues. I don't know why, when or exactly how things were done. I do not claim to. You are more of an expert on the subject in general; you have a web site dedicated to this matter. I do know, however, a good deal about PDFs and at least the rudiments of probability. Some things are just too improbable to dismiss.

  5. This should settle my math expertise question.

    The problem is that I'm leaving town in the morning and don't have the time to do a thorough analysis of the problem, which would involve verifying your results to start with. If you can wait a week, I'll go through it.

    I would offer for your consideration the problem of comparing the probability of the Birth Certificate being a fake given the external evidence, and the chance of your engaging in a faulty analysis. Having seen many a faulty analysis made by math majors in class (including myself) on textbook probability problems, and the number of folks who have convinced themselves of false conclusions over various artifacts in documents, I consider such things not at all unlikely. In fact, it happens all the time.

    It's always useful to give an answer the sniff test. If the answer you get doesn't seem right, then it's time to check your work. And as Scientist (who really is a scientist) on my blog pointed out, you have no controls.

  6. I started the analysis and I'm immediately stuck. I used Adobe Acrobat 9.0 Standard to export the images from the White House PDF. It doesn't get all the layers, but it gets the bitmap layer you talked about. I zoomed in on the letter "B" in the two instances of "OBAMA" and they weren't at all alike.

    So you will have to explain your methodology. Here's what I got:

  7. Here's part of a post I found on Free Republic a while back that should end all speculation about the birth certificate being a forgery:

    "It's like accusing the President of counterfeiting money with collusion from the Treasury. You MIGHT be able to prove that there was a conspiracy with Treasury, but I don't see how you could convince me you could prove the bills printed were 'forgeries.'

    If the Treasury prints the bills they are genuine.

    If the Hawaii DOH prints the COLB it is genuine."

  8. Fraud "hardly matters" to you?

  9. Anonymous said...
    "Fraud "hardly matters" to you?"

    I would certainly prefer that public officials and their staffs refrained from lying. It does, however, concern me more that the Consumer Price Index and other key economic measures are being systematically "cooked" than it does that the president or members of his staff may have made minor alterations to his personal history. Obviously, the problems with the birth certificate do bother me enough for me to raise the issue. -e.m.c.

  10. I was reading this thread:
    there were many comments, I didn’t read them all.
    I confirmed the match by zooming with sumatraPDF
    (not an Adobe product)
    My first idea was, that the compression algorithm
    would compare letters or boxes of pixels and if one is sufficiently
    similar to a former one, it is replaced by the former almost-match so to
    reduce size, since now you just only need the address to the match.
    Was this mentioned/discussed ?
    I’m also interested to get exact bitmap exports of the layers.

    Dr. Conspiracy July 4, 2011 at 12:29 pm #
    The Bs are the same because the optimization severely reduces the resolution, removing the differences, and the viewer interpolated the display version when blowing it back up, making it appear that there is more resolution than is really there.
    Flate is a complex “algorithm of algorithms”, that breaks an image into zones and selects an approach (one of several compression algorithms) to each based on Flate’s own criteria. Flate may also decide to leave a zone, or by extension an entire object, uncompressed.
    When I export the images out of Acrobat Standard, it does not ask me for a resolution, but appears to use the internal resolution of the original. When I exported, I got three images and each at a different resolution. If you follow my link in the main article to my exported image, you can see the actual resolution. A next step for someone with a copy of Illustrator would be to export the object at exactly the same resolution and compare.

    1. I'm not really familiar with the intricacies of Flate, so I can't say your explanation is impossible. My assumption is that the extraction of the binary text layer in question is a simple threshold reduction from a higher color-depth version of that material (the scan). That would create a highly compressed file already -- why create an algorithm to compress that part of the image further, but not do a better job compressing the security paper? This does not mean I can say categorically that Flate doesn't work that way, but I doubt it. If it did, there should be letter matches all over.

      I stopped talking to the Obamaconspiracy people after the first few days, largely because I tired of being gratuitously insulted. At this point, I'm not even sure where my extracted files are. Though I probably could find them, I can't prove I haven't tampered with them. If you have the original White House PDF, you have my starting point. Believe it or not, I do not live and breathe this stuff. It depresses me that it trumps the rest of my posts in popularity.

      Good luck in your search for the truth. Let me know how it goes. If you can prove it either way, I'll be happy to post your results.

  11. thanks for replying. I see you are still there
    reading this. I'll post my final conclusion later.
    Your argument with the threshold reduction
    doesn't sound plausible to me, but maybe I don't
    understand it exactly.
    Deflate would only recognize matches that are less
    than 32768 bytes away and presumably the data was encoded line by line, not in blocks that contain whole letters (?) And the pdf encoded this text-layer
    in 67980 bytes while the same algo (gzip) now
    re-encodes it to 54500 bytes only while the
    equal-making of some letters could only improve this
    by ~<1% , I estimate. So there are problems with my explanation too.
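
    The re-encoding check described in this comment can be tried with Python's standard `zlib` module. This is a rough sketch, not the commenter's actual procedure: it assumes the text layer's stream has already been pulled out of the PDF as raw zlib-format bytes, and the `recompress` helper and the toy data are made up for illustration.

```python
import zlib

def recompress(stream: bytes, level: int = 9) -> bytes:
    """Inflate a zlib-format stream and deflate it again at the given level."""
    raw = zlib.decompress(stream)
    return zlib.compress(raw, level)

# Toy stand-in for an extracted stream: repetitive data packed at a low level.
raw_pixels = bytes([0, 255, 255, 0, 0, 0, 255, 0]) * 4096
original = zlib.compress(raw_pixels, 1)
repacked = recompress(original)

# DEFLATE can only refer back to data inside its sliding window.
window_bytes = 2 ** zlib.MAX_WBITS

print("window size:", window_bytes)
print("original stream:", len(original), "bytes")
print("re-encoded stream:", len(repacked), "bytes")
```

    Comparing the two sizes for the real stream would show how much room the original encoder left, which is the discrepancy the comment is puzzling over.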

  12. ahh, there are many other examples !
    I did a systematic computer search on the
    1407 connected components.
    two square boxes
    21 times "d" with 158 pixels ,
    all 21 identical

    that should confirm my theory, that in a first
    step similar letters were replaced by identical ones.
    (unless you assume it was faked)
    But despite that first step they got such a
    bad compression rate for that text-layer,
    I don't know why.
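
    A search of the kind described in this comment might look roughly like the Python sketch below. The commenter does not say how the components were found, so the 8-connectivity rule, the function names and the tiny test bitmap are all assumptions made for illustration. Each connected blob of ink is shifted to its own top-left corner, and blobs with exactly the same pixels are grouped.

```python
from collections import defaultdict, deque

def connected_components(bitmap):
    """Yield each 8-connected blob of '#' pixels as a list of (row, col)."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen = set()
    for r in range(height):
        for c in range(width):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            seen.add((r, c))
            queue = deque([(r, c)])
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and bitmap[ny][nx] == "#"
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            yield blob

def identical_groups(bitmap):
    """Group blobs that match pixel for pixel; report their top-left corners."""
    groups = defaultdict(list)
    for blob in connected_components(bitmap):
        top = min(y for y, _ in blob)
        left = min(x for _, x in blob)
        # Position-independent key: the blob's pixels relative to its corner.
        shape = frozenset((y - top, x - left) for y, x in blob)
        groups[shape].append((top, left))
    return [sorted(corners) for corners in groups.values() if len(corners) > 1]

# Toy page: the first two blobs are identical, the third is not.
page = [
    "##...##...#.",
    "#....#....##",
    "............",
]

print(identical_groups(page))  # [[(0, 0), (0, 5)]]
```

    On a real text layer, letters that touch each other or break into pieces would be counted as one or several components, so the component count need not equal the number of characters.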


    1. Again, keep at it if it interests you. I stand by my word that I will post your findings. A relatively concise explanation would be appreciated. If what you’ve sent in comments is what you’d like me to post, I will, but I assumed you would want to be a little clearer.

      I think there is a truth to be had somewhere regarding the oddities of the document. Unfortunately, from a political perspective, I don't think many people on either side believe in facts anymore. They believe in their respective narratives. It is hard to imagine a circumstance in which a clear, factual resolution of the birth certificate issue (in either direction) would sway opinion on either side. Still, I appreciate your attempt to find an explanation.