Redacting with confidence: some advice from the NSA
Some of the more dispiriting items I receive in my e-mail are Word document attachments. This would be fine if the sender wanted me to edit the document, but most senders are actually sharing finished documents. And so I get to read through tracked changes and comments in CVs that I know the authors would never have wanted me to see. It is sad that so many of these documents come from NHS staff. And that includes the guardians of UK NHS IT systems, the good people of Connecting for Health.
There are many ways to avoid divulging information in this way, the easiest of which is of course to use open source software like OpenOffice.org instead of Microsoft Office. If you do not want to take this route, however, the USA’s National Security Agency would like to help you. They published a nice guide to the alternatives (Redacting with Confidence PDF or see original NSA website). The guide finds “three common mistakes with MS Word and PDF that lead to most cases of unintentional exposure”:
1. Redaction of Text and Diagrams – Covering text, charts, tables, or diagrams with black rectangles, or highlighting text in black, is a common and effective means of redaction for hardcopy printed materials. It is not effective, in general, for computer documents distributed across computer networks (i.e. in “softcopy” format). The most common mistake is covering text with black.
2. Redaction of Images – Covering up parts of an image with separate graphics such as black rectangles, or making images ‘unreadable’ by reducing their size, has also been used for redaction of hardcopy printed materials. It is generally not effective for computer documents distributed in softcopy form.
3. Meta-data and Document Properties – In addition to the visible content of a document, most office tools, such as MS Word, contain substantial hidden information about the document. This information is often as sensitive as the original document, and its presence in downgraded or sanitized documents has historically led to compromise.
Note that many of these mistakes can also occur inadvertently in document composition. For example, sensitive information in an embedded image can be overlaid with another image during formatting. Such hidden data can be difficult to spot during manual review of the softcopy.
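If you want a quick check before hitting send, the third mistake is the easiest one to test for. The following is a minimal sketch rather than a proper sanitiser: it assumes the file is a modern .docx (a zip archive of XML parts, not the older binary .doc), that Word has used its usual `w:` namespace prefix, and the function name `find_hidden_data` is my own invention. It only counts leftover tracked changes and comments and reports the author fields. It does not remove anything.

```python
import re
import zipfile

# WordprocessingML elements that indicate leftover revision data.
# Assumption: the document uses Word's conventional "w:" prefix.
TRACKED_TAGS = {
    "insertions": "w:ins",
    "deletions": "w:del",
    "comment references": "w:commentReference",
}


def _read_part(archive, name):
    """Return one XML part of the archive as text."""
    return archive.read(name).decode("utf-8", errors="replace")


def find_hidden_data(docx_path):
    """Report tracked changes, comments and author properties in a .docx.

    docx_path may be a file name or a binary file-like object.
    Returns a dict; an empty dict means nothing was detected.
    """
    findings = {}
    with zipfile.ZipFile(docx_path) as archive:
        names = set(archive.namelist())

        if "word/document.xml" in names:
            body = _read_part(archive, "word/document.xml")
            for label, tag in TRACKED_TAGS.items():
                # Require whitespace, ">" or "/" after the tag name so that
                # w:del does not also match w:delText, for example.
                count = len(re.findall(r"<%s[\s>/]" % re.escape(tag), body))
                if count:
                    findings[label] = count

        if "word/comments.xml" in names:
            comments = _read_part(archive, "word/comments.xml")
            count = len(re.findall(r"<w:comment[\s>]", comments))
            if count:
                findings["comments"] = count

        if "docProps/core.xml" in names:
            core = _read_part(archive, "docProps/core.xml")
            for field in ("dc:creator", "cp:lastModifiedBy"):
                match = re.search(r"<%s>([^<]*)</%s>" % (field, field), core)
                if match and match.group(1).strip():
                    findings[field] = match.group(1).strip()

    return findings
```

Calling `find_hidden_data("cv.docx")` on a file of your own returns a dictionary of whatever was found. An empty result means only that this sketch found nothing. It does not look at embedded images, covered text or any of the other problems the guide describes.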
The document was originally published back in 2006 and I found it today while going through blog posts I have in draft mode. I guess I was worried about republishing NSA materials, but seeing as MI6 is checking my Gmail anyway, I thought I would push out the post.
The reason for the guide was to stop the US military from divulging information as it did in the case of the Italian intelligence officer killed in Iraq in 2005:
U.S. military commanders in Iraq released a long-awaited report of the American investigation into the fatal shooting of an Italian agent escorting a freed hostage through a security checkpoint. In order to give the classified report the widest possible distribution, officials posted the document on the military’s “Multinational Force-Iraq” Web site in Adobe’s portable document format, or PDF. The report was heavily redacted, with sections obscured by black boxes.
Within hours, however, readers in the blogosphere had discovered that the classified information would appear if the text was copied and pasted into Microsoft Word or any other word-processing program. Stars and Stripes, the Department of Defense newspaper, noted that the classified sections of the report covered “the securing of checkpoints, as well as specifics concerning how soldiers manned the checkpoint where the Italian intelligence officer was killed. In the past, Pentagon officials have repeatedly refused to discuss such details, citing security concerns.” Soon after, the report was removed from the Web site.
I hope clinicians learn from this before releasing their Word and PDF documents.