By Brianna Hughes
The harm from a breach of privacy is not limited to potential fines, litigation costs, loss of trade secrets, and reputational damage; a breach can also put the nation’s safety at risk.[1] An important tool for maintaining privacy and securing confidential information is redaction.[2] Redaction filters information out of documents to keep that information secret from unauthorized individuals.[3] In the past, redaction was performed manually, either by marking out the information with a black marker or by cutting it out with scissors.[4] These manual methods were costly and time-consuming.[5] As technology evolved, digital redaction techniques emerged, making it easier to filter out confidential information.[6] Many individuals, businesses, and governments rely on these digital techniques to keep their sensitive information confidential.[7] Although these digital techniques are easier, that does not mean the resulting redactions are secure.[8]
Those who use digital redaction typically use PDF redaction tools.[9] These tools place a black box over the sensitive information, which is supposed to remove the information behind the box.[10] When this technique fails, it is usually because the text data remained in the document.[11] Anyone who wants the information can then simply copy the text behind the black box and paste it into a new Word document, defeating the purpose of the redaction.[12] Some information can also be revealed when a redacted document is converted from Microsoft Word to PDF.[13] Additionally, the inclusion of enough surrounding details can allow individuals to decipher what the redactions were meant to hide.[14] When these failures occur, information that was supposed to be unknown to the public could be exploited through the press.[15] One example is the redacted court deposition of Ghislaine Maxwell, the partner of Jeffrey Epstein, which was published after journalists deciphered it.[16] The journalists were able to decipher many of the redacted names, many of which belonged to high-profile individuals.[17] Redaction failures do not happen only in court filings; anyone using digital redaction techniques can fall victim.[18] An unintentional exposure of private information through the press occurred when the New York Times published redacted information that fell victim to the copy-and-paste method.[19] This redacted information revealed CIA operations and the name of a program’s target; this information is a matter of national security and was not intended to be known by the public.[20] There have been multiple high-profile redaction failures that exposed information someone wanted to keep secret.[21]
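The copy-and-paste failure described above can be illustrated with a toy model. The sketch below is not a real PDF parser; the classes TextSpan and Box, the function names, and the sample line are all hypothetical and exist only to show the difference between drawing a box over text and actually excising the text from the document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSpan:
    """A run of text and its horizontal extent on the page (toy model)."""
    text: str
    x0: float
    x1: float


@dataclass(frozen=True)
class Box:
    """A black rectangle drawn over part of the line (toy model)."""
    x0: float
    x1: float


def overlaps(span: TextSpan, box: Box) -> bool:
    # Two intervals overlap when each one starts before the other ends.
    return span.x0 < box.x1 and box.x0 < span.x1


def extract_text(spans: list[TextSpan]) -> str:
    # What copy-and-paste sees: the text layer only, never the drawn boxes.
    return " ".join(s.text for s in spans)


def cover_only(spans: list[TextSpan], box: Box) -> list[TextSpan]:
    # Unsafe "redaction": the box is purely visual, so the text layer is unchanged.
    return list(spans)


def excise(spans: list[TextSpan], box: Box) -> list[TextSpan]:
    # Safer redaction: drop every span that the box touches.
    return [s for s in spans if not overlaps(s, box)]


# Hypothetical sample line with a box drawn over the name.
LINE = [TextSpan("Witness:", 0, 50), TextSpan("Jane", 55, 80), TextSpan("Doe", 85, 105)]
BOX = Box(55, 105)

covered_text = extract_text(cover_only(LINE, BOX))
excised_text = extract_text(excise(LINE, BOX))
print(covered_text)
print(excised_text)
```

In this model, the covered version still yields the full line when its text is extracted, while the excised version yields only the unredacted portion. A real document would need the same check performed with an actual PDF library, which this sketch deliberately avoids assuming.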
Ineffective redaction can be detected before information is leaked; however, if it is not detected, that information sits available to all.[22] Researchers have built a tool called Edact-Ray to identify, break, and fix information leaks.[23] The program focuses on the size of the characters and their positioning; it then compares the size of the redaction with a predefined “dictionary” of words to estimate what has been replaced.[24] This software can eliminate 80,000 estimates per second. When it detects a vulnerable PDF redaction, it removes the underlying text from the PDF.[25] The inventors of this tool intend to release parts of the program to help identify and repair non-excising redactions.[26] For individuals and entities that intend to use digital redaction in the future but do not intend to use Edact-Ray, changing the content of the original document before redacting can be one way to avoid failure.[27] While redaction will never be foolproof, understanding that it is not as secure as one thinks will help avoid careless mistakes.[28]
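The size-matching idea described above, in which the width of a redaction is compared against a dictionary of candidate words, can be sketched in a few lines. This is only an illustration of the general principle and is not Edact-Ray's implementation. The character widths, the default width, the tolerance, and the list of names are all invented for the example; a real attack would rely on the actual font metrics and glyph positions in the PDF.

```python
# Hypothetical per-character advance widths in arbitrary units (assumed, not real font metrics).
WIDTHS = {"i": 3, "l": 3, "t": 4, "a": 6, "e": 6, "n": 6, "o": 6, "m": 9, "w": 9}
DEFAULT_WIDTH = 6  # assumed width for any character not listed above


def text_width(word: str) -> int:
    """Total rendered width of a word under the assumed width table."""
    return sum(WIDTHS.get(ch, DEFAULT_WIDTH) for ch in word.lower())


def candidates(box_width: float, dictionary: list[str], tolerance: float = 0.5) -> list[str]:
    """Keep only the dictionary words whose width matches the redaction box."""
    return [w for w in dictionary if abs(text_width(w) - box_width) <= tolerance]


# Hypothetical dictionary of names that might sit under a redaction.
NAMES = ["Tim", "Ann", "Emma", "William", "Leo"]

unique_match = candidates(36, NAMES)            # only one name is 36 units wide
ambiguous = candidates(16, NAMES, tolerance=1)  # two names fall within one unit of 16
print(unique_match)
print(ambiguous)
```

Under these assumed widths, a 36-unit box leaves a single candidate, while a 16-unit box with a looser tolerance leaves two. This is why changing the content of a document before redacting it, so that the box width no longer corresponds to the original words, can blunt this kind of guessing.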
[1] See, e.g., Adam Pez, Digital Redaction Fails & Best Practices: How to Keep Your Sensitive Information Safe, Intralinks (Sept. 3, 2020), https://www.intralinks.com/blog/2020/09/digital-redaction-fails-best-practices-how-keep-your-sensitive-information-safe; Matt Burgess, Redacted Documents Are Not as Secure as You Think, Wired (Nov. 25, 2022), https://www.wired.com/story/redact-pdf-online-privacy/.
[2] See Pez, supra note 1.
[3] Id.
[4] See Maxwell Bland et al., Story Beyond the Eye: Glyph Positions Break PDF Text Redaction 1 (2022).
[5] See Pez, supra note 1.
[6] See Burgess, supra note 1.
[7] See id.
[8] See id.
[9] See Bland, supra note 4, at 1.
[10] Burgess, supra note 1.
[11] Lisa C. Wood & Marco J. Quina, Litigation Practice Notes from the Field: The Perils of Electronic Filing and Transmission of Documents, 22 Antitrust ABA 91, 91 (2008).
[12] See Bland, supra note 4, at 2.
[13] See Judge Herbert B. Dixon Jr., Embarrassing Redaction Failures, 58 The Judges' Journal 37, 38 (2019).
[14] See Burgess, supra note 1.
[15] See Wood, supra note 11, at 91.
[16] Burgess, supra note 1; Josh Levin et al., We Cracked the Redactions in the Ghislaine Maxwell Deposition, Slate (Oct. 22, 2020), https://slate.com/news-and-politics/2020/10/ghislaine-maxwell-deposition-redactions-epstein-how-to-crack.html.
[17] Levin, supra note 16.
[18] Id.
[19] See Dixon, supra note 13, at 39.
[20] Id.
[21] Burgess, supra note 1.
[22] Wood, supra note 11, at 92.
[23] Bland, supra note 4, at 2.
[24] Burgess, supra note 1.
[25] Id.
[26] Bland, supra note 4, at 18.
[27] Id. at 17.
[28] See Burgess, supra note 1.
Image Source: https://images.squarespace-cdn.com/content/v1/58090c87d1758ec5d1815f6f/1507577772082-FHNXIFFO7VSSSFCMEWOB/howto.PNG?format=1500w