Richmond Journal of Law and Technology

The first exclusively online law review.

Month: March 2014

Blog: Trapping "Trappy": The FAA's Attempt to Regulate Model Aircraft

by Laura Bedson, Associate Symposium and Survey Editor


Say it isn’t so!  The days of unregulated model airplane flying may well be behind us, particularly if the Federal Aviation Administration (FAA) has anything to say about it.  As the use of Unmanned Aircraft Systems (UAS), or drones, as they are more commonly known, has skyrocketed in recent years, the FAA has gone to work crafting regulations geared towards governing the use of these devices.

The potential loopholes in these new regulations were pointed out in a recent case out of Charlottesville, Virginia.  In October 2011, Raphael Pirker piloted a $130 foam glider above the University of Virginia’s Medical Center in the hopes of capturing aerial footage of the school for advertisements.[1]  Despite this seemingly innocent motive, the FAA came down hard on Mr. Pirker for operating a UAS, or drone, without obtaining prior authorization from the FAA.  As a result, the FAA imposed a $10,000 civil penalty on Mr. Pirker for operating this commercial drone.  The FAA’s complaint alleged that Mr. Pirker had carelessly operated the drone in a manner that potentially endangered life and property.

To someone such as myself, Mr. Pirker seems like an innocent model airplane enthusiast who got mixed up in this emerging area of law.  That is not the case.  After doing some research, I learned that Mr. Pirker, a/k/a “Trappy,” is a 29-year-old “aerial anarchist”[2] who has been on the FAA’s list of least favorite people for a few years now.  He has taken videos of the Statue of Liberty, the French Alps, and the Costa Concordia using small model aircraft.[3]  It wasn’t until he arrived on UVA’s campus to take videos of the Medical School, however, that the FAA was able to get its hands on him.

Leaping at the opportunity to make an example of “Trappy” and push through model aircraft regulation, the FAA pursued the case by arguing that model aircraft were covered under its regulations, and even suggested that model airplanes should be classified as drones.  Currently, the FAA defines a UAS as a device flown by a pilot “via a ground control system, or autonomously through use of an on-board computer.”[4]  Based on this basic definition, Mr. Pirker could be considered to have been operating a UAS, and thus to have violated FAA rules by failing to obtain prior authorization.

Despite the FAA’s arguments that Mr. Pirker was recklessly operating this commercial drone in a manner that endangered human life and property, a National Transportation Safety Board (NTSB) administrative law judge was not convinced.  In the first case of its kind, the judge dismissed the FAA’s case against Mr. Pirker.[5]  The judge held that Mr. Pirker was not operating a drone, or what the FAA traditionally considers to be a drone, but merely a model airplane, a device that is not subject to FAA regulation and enforcement.[6]

This decision and the arguments from both parties will likely prove to be more monumental than we may think, particularly because the holding coincides with newly publicized FAA restrictions on the commercial use of drones.  The FAA has banned the commercial use of model aircraft since 2007, and this decision ultimately makes that policy unenforceable.[7]  Unsurprisingly, the FAA has appealed the ruling, which means the case will now be brought before the full NTSB for a ruling.  While there is no guarantee as to how the full board will rule, there is no question that drones are here to stay, and cases such as this are just the beginning of a long race to regulate these aircraft.

[1] Mike M. Ahlers, Pilot wins case against FAA over commercial drone flight, CNN U.S. (Mar. 6, 2014, 10:07 PM),

[2] Jason Koebler, Drones Could Be Coming to American Skies Sooner Than You Think, Politico Magazine (Jan. 28, 2014),

[3] Id.

[4] Unmanned Aircraft (UAS) Questions and Answers, Federal Aviation Administration (July 26, 2013, 12:29 PM),

[5] Ahlers, supra note 1.

[6] Ahlers, supra note 1. 

[7] Ahlers, supra note 1.

Blog: Will “Smart Guns” Be Accepted as a Trailblazing Technology or Lead to Constitutional Issues?

by Taylor Linkous, Associate Technology and Public Relations Editor

Gun control is one of the most controversial and divisive issues in America, and with a slew of mass shootings in recent years, the debate seems only to have intensified.  To make things more interesting, over the past few years personalized guns, or “smart guns,” have also entered the conversation.  While at one point smart guns seemed to be out-of-reach concepts used in movies like Skyfall, the Oak Tree Gun Club in California recently became the first store in the nation to put a smart gun on sale.[1]

Smart guns were created with the aim of reducing gun misuse and gun accidents by using radio-frequency identification (“RFID”) chips, fingerprint recognition, or magnetic rings that allow only authorized persons to use the weapon.[2]  The thought is that this technology could prevent accidental shootings by children or even prevent violent gun crimes by barring anyone but an authorized user from firing the gun.[3]

As noted above, Oak Tree Gun Club, one of the largest gun stores in California, recently became the first in the nation to sell a smart gun.[4]  Oak Tree Gun Club is selling the Armatix iP1, which requires the user to be wearing a black waterproof watch in order to fire.[5]  The gun and the watch both contain electronic chips, and when the watch is within reach of the gun, a light on the grip turns green and the user is able to fire.[6]
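The proximity check described above can be sketched in a few lines of code. This is an illustrative model only: the Armatix iP1’s actual firmware, radio protocol, and range threshold are proprietary, so every name and value here (`Watch`, `SmartGun`, `RANGE_LIMIT_CM`) is a hypothetical stand-in.

```python
# Hypothetical sketch of watch-to-gun proximity authorization.
# The real iP1's protocol is proprietary; names and ranges are invented.

RANGE_LIMIT_CM = 25  # assumed "within reach" threshold


class Watch:
    def __init__(self, paired_id: str):
        self.paired_id = paired_id


class SmartGun:
    def __init__(self, authorized_id: str):
        self.authorized_id = authorized_id
        self.grip_light = "red"

    def poll_watch(self, watch: Watch, distance_cm: float) -> bool:
        """Enable firing only when a correctly paired watch is in range."""
        enabled = (watch.paired_id == self.authorized_id
                   and distance_cm <= RANGE_LIMIT_CM)
        self.grip_light = "green" if enabled else "red"
        return enabled


gun = SmartGun("owner-123")
gun.poll_watch(Watch("owner-123"), 10.0)   # paired watch in range: light turns green
gun.poll_watch(Watch("owner-123"), 90.0)   # watch too far away: light stays red
gun.poll_watch(Watch("stranger-1"), 10.0)  # unpaired watch: light stays red
```

The point of the sketch is that the gun itself enforces the authorization decision on every trigger poll, which is why opponents’ reliability worries (a dead watch battery, radio interference) translate directly into a disabled weapon.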

Proponents of smart guns insist this technology is revolutionary and a momentous step in the right direction when it comes to controlling the use of guns and reducing gun violence.[7]  However, opponents of the new technology are mostly gun rights advocates who worry about its reliability and fear the government will trample on their Second Amendment rights by eventually requiring all guns to have this technology.[8]

Other opponents include the Violence Policy Center, a strong advocate of reducing gun violence, which argues smart guns won’t reduce gun violence because most gun homicides occur between people who know each other, and the new technology would not prevent such crimes.[9]  Moreover, the Violence Policy Center argues smart guns could increase the number of people purchasing guns, because those once opposed to owning guns may change their minds if they think this technology makes guns safer.[10]

Some of the concerns about Oak Tree Gun Club’s sale of smart guns stem from a New Jersey law passed in 2002, which requires all handguns in the state to be personalized within three years of a smart gun being sold anywhere in the U.S.[11]  With the store selling the first smart gun in the country, the clock on the New Jersey law has arguably started running, putting many opponents of the technology in a panic.  The backlash against Oak Tree Gun Club has been so strong that the store has actually denied ever selling the gun, despite photos of the gun for sale at the facility.[12]

Just recently, Senator Edward Markey, a Democrat from Massachusetts, unveiled a gun control bill requiring all new guns to be personalized so they can only be fired by their owners.[13]  While the benefits of such a technology seem obvious, smart guns have already gained some strong enemies, and it is unlikely Congress will pass such a bold law with so much controversy and debate already surrounding it.






[5] Id.

[6] Id.

[7] Id.



[10] Id.




Blog: The Overbroad Computer Fraud and Abuse Act: Its Implications and Why Its Scope Should Be Narrowed

by Barry Gabay, Associate Staff

If you are at work and you are reading this, you may be subject to federal criminal sanctions.

The Computer Fraud and Abuse Act (CFAA), the federal government’s key anti-hacking law, was originally enacted in 1986 to deter hackers from wrongfully obtaining confidential governmental and financial information, or infecting “federal interest” computers with harmful viruses.  In passing the Act, Congress sought to regulate only those computer crimes that were interstate in nature, particularly those involving large financial institutions and governmental organizations.[1]  However, the statute has been amended several times, ultimately broadening the CFAA’s reach.  In the mid-90s, for example, Congress placed criminal misdemeanor liability upon individuals who acted merely “recklessly” in their computer use,[2] and later placed liability upon individuals who obtained and read “any information of any kind so long as the conduct involved an interstate or foreign communication.”[3]  But Congress went even further in 2008, when it most recently amended the CFAA.  For starters, Congress eliminated the $5,000 misappropriation threshold for CFAA liability.  Further, while previously a defendant must have stolen information through interstate commerce or foreign communication to be prosecuted under the CFAA, the statute was amended to encompass all information obtained “from any protected computer.”[4]

Today, liability under the CFAA can be proven by showing that a defendant (1) intentionally accessed a computer (2) without authorization or exceeding authorized access, and thereby (3) obtained information from a protected computer.[5]  The pertinent definition of “protected computer” is any computer “which is used in or affecting interstate or foreign commerce or communication.”[6]  Courts have found that the Internet is “an instrumentality and channel of interstate commerce,” thus within the realm of Congressional regulation, and for purposes of CFAA violations, the defining characteristic of a “protected computer.”[7]  To put it in perspective, this criminal statute was broadened from pertaining only to computers with direct “federal interest” to now any computer connected to the Internet.

Nevertheless, the main litigable issue has proven to be determining when an individual is “authorized” to use a computer.  Under the CFAA, “exceeds authorized access” means “to access a computer with authorization and to use such access to obtain or alter information in the computer that the accesser is not entitled to obtain or alter.”[8]  Whereas an employee who uses a computer “without authorization” has “no rights, limited or otherwise, to access the computer in question,” an employee who “exceeds authorized access” had initial authorization to use the computer “for certain purposes but goes beyond those limitations.”[9]  However, the phrase “without authorization” is not defined in the CFAA, and a circuit split has developed over its interpretation.

The majority broad view, adopted by the First, Fifth, Seventh, and Eleventh Circuits, holds that an employee’s computer authorization is terminated the moment the employee acts contrary to his employer’s interest.[10]  Under these circuits’ approach, any time an employee uses a company computer in a way that does not directly benefit his employer, the Department of Justice has jurisdiction to prosecute.  As Judge Floyd noted in the summer of 2012, “[s]uch a rule would mean that any employee who checked the latest Facebook posting or sporting event scores in contravention of his employer’s use policy would be subject to the instantaneous cessation of his agency and, as a result, would be left without any authorization to access his employer’s computer systems.”[11]

However, in the two most recent federal appellate cases on the issue, the Fourth and Ninth Circuits both adopted a narrow interpretation of the statute.  Those circuits held that an employee is “authorized” to use a company computer when the employer gives that employee permission to use it.  An employee’s subsequent misuse of an employer’s computer would not be subject to federal sanctions, as that employee was “authorized” to use that computer under the CFAA.[12]

While a broad interpretation of the CFAA may deter some individuals from using computers in ways not intended by their employers, that deterrence derives from ludicrous sentencing for comparatively innocuous criminal actions.  Aaron Swartz, the well-documented Internet activist who allegedly downloaded millions of articles from MIT’s online library, faced a maximum sentence of 35 years’ incarceration before the 26-year-old took his own life.[13]  In comparison, the maximum federal sentence for a first-time felon guilty of attempted murder who left the victim with life-threatening bodily injury is 24 years.  A first-time child pornographer who distributes images of a child under the age of 12 engaged in explicit sexual acts would receive a maximum federal sentence of 30 years’ imprisonment.  If an employee merely getting fired by her employer is not enough deterrence for misusing a company computer, then state criminal statutes and tort and contract law surely provide adequate deterrence.  Thus, in practice, the broad interpretation of the CFAA merely serves to turn ordinary working individuals, who, while perhaps distracted during the workday, possess no real criminal intent whatsoever, into federal criminals.

In the wake of Aaron Swartz’s suicide, the Justice Department and members of Congress have recently expressed their willingness to narrow the scope of the Computer Fraud and Abuse Act.[14]  The bipartisan Aaron’s Law was introduced in the House of Representatives to limit the scope of the CFAA.  That limitation is long overdue.  It is a well-established canon of statutory construction that courts must construe criminal statutes narrowly, so as to avoid over-criminalization.  But courts, obviously unable to define a crime, are relegated merely to the text, and hinge liability on the terms “without authorization” and “exceeding authorized access.”  With the firmly entrenched circuit split now in place, the Supreme Court may in the not too distant future weigh in on the issue if Congress does not first amend this overbroad statute.

[1] See Sarah A. Constant, The Computer Fraud and Abuse Act: A Prosecutor’s Dream and a Hacker’s Worst Nightmare—The Case Against Aaron Swartz and the Need to Reform the CFAA, 16 Tul. J. Tech. & Intell. Prop. 231, 233 (2013).

[2] Computer Abuse Amendments Act, Pub. L. No. 103-322, tit. XXIX, 108 Stat. 2097 (1994).

[3] Economic Espionage Act, Pub. L. No. 104-294, tit. II, 110 Stat. 3488, 3491 (1996). 

[4] 18 U.S.C. § 1030(a)(2)(C) (2008).

[5] Id.

[6] 18 U.S.C. § 1030(e)(2)(B). 

[7] United States v. Trotter, 478 F.3d 918, 920-21 (8th Cir. 2007) (internal citations omitted). 

[8] 18 U.S.C. § 1030(e)(6). 

[9] LVRC Holdings LLC v. Brekka, 581 F.3d 1127, 1133 (9th Cir. 2009).

[10] See E.F. Cultural Travel BV v. Explorica, 274 F.3d 577 (1st Cir. 2001); United States v. John, 597 F.3d 263 (5th Cir. 2010); Int’l Airport Ctrs., LLC v. Citrin, 440 F.3d 418, 420-21 (7th Cir. 2006); United States v. Rodriguez, 628 F.3d 1258 (11th Cir. 2010).

[11] WEC Carolina Energy Solutions LLC v. Miller, 687 F.3d 199, 206 (4th Cir. 2012).

[12] Brekka, 581 F.3d at 1133.

[13] See generally David Amsden, The Brilliant Life and Tragic Death of Aaron Swartz, Rolling Stone (2013), available at

[14] Brian Fung, The Justice Department Used This Law to Pursue Aaron Swartz. Now It’s Open to Reforming It., Wash. Post (Feb. 7, 2014, 4:03 PM),

Blog: How Will the Government Deal with Bitcoin?

by Taylor Linkous, Associate Technology and Public Relations Editor


As bitcoin continues to rise in popularity and value and steadily establishes itself in the mainstream economy, it has simultaneously revealed its flaws and weaknesses.  Will bitcoin successfully establish itself as a revolutionary technology that is here to stay, or will its faults cause it to eventually fade out?  Moreover, if it is here to stay, what should the government do with it?

First of all, bitcoin is a virtual currency that was created in 2009 by an anonymous programmer who goes by the code name “Satoshi Nakamoto.”[1]  Bitcoin is not backed by any government or bank and exists only online.[2]  What makes it attractive yet dicey is that transactions are made directly between users with no middlemen, and buyers and sellers remain anonymous.[3]  Bitcoins are created through “mining.”  Explained in the most basic terms, “mining” is when people solve very complex math puzzles on computers and are rewarded with bitcoins.[4]  Many businesses have already decided to accept bitcoins, including a Subway sandwich shop in Pennsylvania and even some law firms.[5]
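The “complex math puzzle” behind mining can be illustrated with a toy proof-of-work search. This sketch only conveys the idea: real bitcoin mining applies double SHA-256 to an 80-byte block header against a dynamically adjusted numeric target, not a hex-prefix check over a short string as here.

```python
# Toy proof-of-work: find a nonce so the hash of (block data + nonce)
# starts with a required number of zero hex digits. Bitcoin's actual
# puzzle is far harder and uses double SHA-256 over a block header.
import hashlib


def mine(block_data: str, difficulty: int) -> int:
    """Return the first nonce whose SHA-256 hash has `difficulty` leading zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce  # a valid "solution"; the miner earns the reward
        nonce += 1


nonce = mine("toy-block", 4)  # low difficulty so this finishes in a moment
digest = hashlib.sha256(f"toy-block{nonce}".encode()).hexdigest()
assert digest.startswith("0000")
```

Because the only way to find a valid nonce is brute-force guessing, the puzzle is expensive to solve but trivial for everyone else to verify, which is what lets the bitcoin network mint coins without any bank or government standing behind it.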

Bitcoin’s growing acceptance is evidenced by the steep increase in its value.  As of December 2013, bitcoins were worth $1,100 each.[6]  Further, the first Bitcoin ATM was installed in a coffee shop in Vancouver, Canada, and had 81 transactions on its first day.[7]  In fact, just recently, the first Bitcoin ATMs offering bitcoins for cash in the United States opened in Boston and Albuquerque, and the first Bitcoin ATM to dispense cash for bitcoins is set to open in a bar in downtown Austin this week.[8]  These Bitcoin ATMs are predicted to encourage people to use the virtual currency in their everyday lives, bringing it out of the deep, complex tech world and into the “real world.”[9]

However, despite bitcoin becoming increasingly recognized as legitimate, it is still unregulated, and there are lingering concerns about its prevalent use for criminal activities because of its anonymity.  Bitcoin was the currency used on Silk Road, a website that was shut down by the FBI last year for facilitating the purchase and sale of drugs and other illegal items such as guns and child pornography.[10]  Silk Road 2.0 was launched after the original website’s shutdown; however, a glitch in the website allowed hackers to steal $2.7 million from the customers who had money in Silk Road’s accounts.[11]  Thus, not only are there concerns about bitcoin’s use for criminal activity, but this incident with Silk Road 2.0 has called into question bitcoin’s credibility and hurt its chances of becoming more mainstream.[12]

Regardless, the growing popularity and value did spark a conversation in the Senate Committee on Homeland Security and Government Affairs this past November.[13]  At the hearing, law enforcement officials expressed concerns about anonymity and their need for help in catching people who are using bitcoin for criminal activity.[14]  On the other side, bitcoin users and The Bitcoin Foundation stated the government should leave the virtual currency unregulated and allow it to continue to grow and thrive on its own.[15]

Currently, even though users of bitcoin are not regulated, businesses acting as “money transmitters” are covered under current law, and other existing statutes, such as those covering mail and wire fraud, could be used to prosecute some misuse of bitcoin.[16]  Other than that, regulators are having a hard time deciding how to deal with it.  It is unclear whether bitcoin should even be treated as a legitimate currency.[17]  For example, Canada has taken the position that bitcoin is not a currency and will not regulate it.[18]  Germany is treating bitcoin like a foreign currency, and Brazil passed a law in October 2013 specifically dealing with electronic currencies such as bitcoin.[19]

It will be interesting to see whether the introduction of Bitcoin ATMs will help bitcoin secure its spot as a legitimate currency, or whether the Silk Road 2.0 incident will stunt bitcoin’s otherwise rising popularity.  Even more importantly, if bitcoin is here to stay, will the government allow it to remain unregulated, or recognize it as currency and step in to regulate it?




[4]  See id.








[12]  See id.


[14]  See id.

[15] See id.


[17] See id.


[19] See id.

Blog: Snapchat – Defeating an Authenticity Objection in Court

by Danielle Bringard, Associate Survey and Symposium Editor       

We’ve all done it.  We’ve all taken the “selfie.”  Facebook, Twitter, Tinder, Instagram, and Snapchat are just a few of the social media applications that allow users to transmit photos and videos to each other from various electronic devices.  However, unlike most social media sites, Snapchat has a unique feature: unless a user takes a “screenshot” of the photo or video while it is being received, the photo or video is deleted after up to 10 seconds.[1]

The only information you can obtain from Snapchat about a user is the user’s email, the user’s phone number, the username, a log of the last 200 snaps that have been sent and received, and the date the user created the account.[2]  The exception is when either the sender or the recipient downloaded the message and kept it saved, so that it can be retrieved from the actual device rather than from the Snapchat application.  The Richmond Journal of Law and Technology recently published an article exploring the dangers of Snapchat and sexting,[3] but what about the admissibility of a Snapchat history in a court of law?
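The retention model described above can be sketched as a simple capped log. The field names and the 200-snap limit come from the Law Enforcement Guide cited in the footnotes; the in-memory data model itself is invented purely for illustration.

```python
# Hypothetical sketch of Snapchat's retention policy: the service keeps
# account metadata and a rolling log of the last 200 snaps, never the
# photo or video content itself. The data model here is invented.
from collections import deque


class SnapchatAccount:
    def __init__(self, username: str, email: str, phone: str, created: str):
        self.username = username
        self.email = email
        self.phone = phone
        self.created = created
        self.snap_log = deque(maxlen=200)  # only the last 200 sent/received snaps

    def record_snap(self, meta: dict) -> None:
        """Log metadata (sender, recipient, timestamp), not the image."""
        self.snap_log.append(meta)


acct = SnapchatAccount("trappy", "user@example.com", "555-0100", "2013-01-01")
for i in range(250):
    acct.record_snap({"to": "friend", "ts": i})
assert len(acct.snap_log) == 200     # capped at the last 200 entries
assert acct.snap_log[0]["ts"] == 50  # older entries have rolled off
```

This capped-log structure is why a Snapchat Legal report can authenticate who sent what and when, yet can never produce the underlying photo or video unless one of the parties saved it to a device.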

While there have been no cases specifically dealing with Snapchat, two competing theories have evolved regarding the authenticity of social media evidence in general.[4]  First, in the Griffin case, the court required the testimony of the creator, documentation from the creator’s computer, or information obtained directly from the social media site that would tend to show that the evidence sought to be admitted was not falsified or created by another person.[5]  Under the Griffin test, Snapchat evidence could be authenticated with the report from Snapchat Legal, which gives the user’s email, phone number, and username, provided the party seeking admission could show a match with the purported author.  Second, under the Tienda case, the court will admit social media evidence on a ruling from the judge that a jury could reasonably find that the proffered evidence is authentic.[6]  Under this approach, authenticity could also easily be established with a report from Snapchat Legal.

While the Griffin test appears to be more stringent than the Tienda test, both courts examined the content and the context of the social media page at issue in making their rulings.  Given that requesting the proper authenticating information from Snapchat Legal is no series of hoops to jump through, it is likely that most courts will overrule any objection to the authenticity of a user’s history report from Snapchat Legal.


[1] Snapchat Law Enforcement Guide 3 (last updated Dec. 1, 2012) available at

[2] Id. at 4.

[3] Nicole A. Poltash, Snapchat and Sexting: A Snapshot of Baring Your Bare Essentials, 19 Rich. J. L. & Tech. 4 (2013) available at

[4] Tienda v. State, 358 S.W.3d 633 (Tex. Crim. App. 2012); Griffin v. State, 19 A.3d 415 (Md. 2011).

[5] Griffin, 19 A.3d at 427-428.

[6] Tienda, 358 S.W.3d at 638.

Getting Serious: Why Companies Must Adopt Information Governance Measures to Prepare for the Upcoming Changes to the Federal Rules of Civil Procedure


Cite as: Philip J. Favro, Getting Serious: Why Companies Must Adopt Information Governance Measures to Prepare for the Upcoming Changes to the Federal Rules of Civil Procedure, 20 Rich. J.L. & Tech. 5 (2014),

Philip J. Favro*

“[W]ithout a corresponding change in discovery culture by courts, counsel and clients alike, the proposed rules modifications will likely have little to no effect on the manner in which discovery is conducted today.”[1]



I.  Introduction

[1]        It has been over seven years now since the so-called e-Discovery amendments to the Federal Rules of Civil Procedure (“Federal Rules,” “Rules,” or individually, “Rule”) went into effect.[2]  When they were implemented, various commentators reasoned those amendments would facilitate a more efficient and cost-effective resolution of discovery issues.[3]  This, in turn, would free parties to focus on the merits of claims and defenses, “teeing matters up for disposition through settlement, summary judgment, or trial.”[4]  The reality, of course, is far from this Pollyannaish vision.  Instead of simplifying the process, the 2006 amendments seem to have generated more satellite litigation than ever before about preservation and production issues.[5]

[2]        Beyond the issues spawned by the 2006 amendments, the costs and complexity of discovery are increasing due to digital age advances that have caused information to proliferate exponentially.[6]  For example, mobile devices such as smartphones and tablet computers have provided users with new methods that facilitate a more rapid and user-friendly exchange of information.[7]  Users now share that information with increasing frequency through short message service and social networks.[8]  Because users do so in far greater quantities than they did with e-mail, the number of communications potentially subject to discovery has been substantially augmented.[9]  Moreover, users have an unlimited virtual warehouse in which to store those conversations due to the popularity of low cost cloud computing services.[10]

[3]        Given these factors and the challenges they present to the discovery process, there should be little doubt as to why the Judicial Conference Advisory Committee on the Civil Rules (“Committee”) has proposed another round of Rules amendments.[11]  The draft amendments are generally designed to streamline the federal discovery process, encourage cooperative advocacy among litigants, and eliminate gamesmanship.[12]  The proposed changes also tackle the continuing problems associated with the preservation of electronically stored information (“ESI”).[13]  As a result of its efforts, the Committee has produced a package of amendments that could affect many aspects of federal discovery practice.[14]

[4]        To date, most of the debate on the proposals has focused on the draft amendment to Rule 37(e).[15]  That amendment would raise the standard of culpability required to impose sanctions for any failure to preserve relevant information.[16]  Such attention is understandable given the proposal’s likely impact on organizations’ defensible deletion efforts.[17]  Nevertheless, there are several other noteworthy changes that are no less important for litigants and lawyers.[18]  Among these are the amendments that would usher in a new era of adversarial cooperation, proportionality standards, and active judicial case management.[19]  The collective impact of these proposals could result in decreased burdens and costs for courts, clients, and counsel alike.[20]

[5]        For organizations to meet the challenges these proposed changes pose, they will need to take actionable measures to satisfy those provisions.[21]  Such measures generally fall under the umbrella of an enterprise’s information governance plan.[22]  For many companies, information governance remains an elusive concept.[23]  Nevertheless, an intelligent information governance plan offers a more enlightened approach for companies to comply with the proposed Rules changes.[24]  Moreover, it is perhaps the only way for clients to realistically reduce the costs and burdens of discovery.[25]

[6]        In this Article, I will consider these subjects. In Part II, I provide an overview of the newly proposed amendments and discuss the impact the Rules proposals will likely have on organizations. In Part III, I offer five practical suggestions that, if followed, will help enterprises meet the information governance challenges posed by the proposed Rules amendments.


II.  The Newly Proposed Amendments

[7]        The overall thrust of the Committee’s proposed amendments is to facilitate the tripartite aims of Federal Rule 1 in the discovery process.[26] To carry out Rule 1’s lofty yet important mandate of securing “the just, speedy, and inexpensive determination” of litigation,[27] the Committee has proposed several modifications to advance the notions of cooperation and proportionality.[28]  Other changes focus on improving “early and effective judicial case management.”[29]  In addition, the Committee has proposed revising Federal Rule 37(e) in an attempt to create a uniform national standard for discovery sanctions stemming from failures to preserve evidence.[30]  The draft amendments that address these concepts are each considered in turn. I will then conclude this Part by generally discussing the effects the Rules changes will likely have on organizations.

A.  Cooperation—Rule 1

[8]        To better emphasize the need for adversarial cooperation in discovery, the Committee has recommended that Rule 1 be amended to specify that clients share the responsibility with the court for achieving the Rule’s objectives.[31]  The proposed revisions to the Rule (in italics with deletions in strikethrough) read in pertinent part as follows: “[These rules] should be construed, and administered, and employed by the court and the parties to secure the just, speedy, and inexpensive determination of every action and proceeding.”[32]

[9]        Even though this concept was already set forth in the Advisory Committee Notes to Rule 1, the Committee felt that an express reference in the Rule itself would prompt litigants and their lawyers to engage in more cooperative conduct.[33]  Perhaps more importantly, this mandate should also enable judges “to elicit better cooperation when the lawyers and parties fall short.”[34] Indeed, such a reference, when coupled with the “stop and think” certification requirement from Federal Rule 26(g), should give jurists more than enough procedural basis to remind counsel and clients of their duty to conduct discovery in a cooperative and cost effective manner.[35]

B.  Proportionality—Rules 26, 30, 31, 33, 34, 36

[10]      The logical corollary to cooperation in discovery is proportionality.[36]  Proportionality standards, which require that the benefits of discovery be commensurate with its burdens, have been extant in the Federal Rules since 1983.[37]  Nevertheless, they have been invoked too infrequently over the past thirty years to address the problems of over-discovery and gamesmanship that permeate the discovery process.[38]  In an effort to spotlight this “highly valued” yet “missing in action” doctrine,[39] the Committee has proposed numerous changes to the current Rules regime.[40]  The most significant changes are found in Rules 26(b)(1) and 34(b).[41]

1.  Rule 26(b)(1)—Tightening the Scope of Permissible Discovery

[11]      The Committee has proposed that the permissible scope of discovery under Rule 26(b)(1) be modified to spotlight the limitations proportionality imposes on discovery.[42]  Those limitations are presently found in Rule 26(b)(2)(C) and are not readily apparent to many lawyers or judges.[43]  Rule 26(b)(2)(C) provides that discovery must be limited where requests are unreasonably cumulative or duplicative, the discovery can be obtained from an alternative source that is less expensive or burdensome, or the burden or expense of the discovery outweighs its benefit.[44]  The proposed modification (in italics) would address this problem by moving those limitations into Rule 26(b)(1) and by more clearly conditioning the permissible scope of discovery on proportionality standards:

Parties may obtain discovery regarding any nonprivileged matter that is relevant to any party’s claim or defense and proportional to the needs of the case, considering the amount in controversy, the importance of the issues at stake in the action, the parties’ resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit.[45]

By moving the proportionality rule directly into the scope of discovery, counsel and the courts may gain a better understanding of the restraints this concept places on discovery.[46]

[12]      Rule 26(b)(1) has additionally been modified to reinforce the notion that discovery is confined to those matters that are relevant to the claims or defenses at issue in a particular case.[47]  Even though discovery has been limited in this regard for many years, the Committee felt this limitation was being swallowed by the “reasonably calculated” provision in Rule 26(b)(1).[48]  That provision currently provides for the discovery of relevant evidence that is inadmissible so long as it is “reasonably calculated to lead to the discovery of admissible evidence.”[49]  Despite the narrow purpose of this provision, the Committee found many judges and lawyers unwittingly extrapolated the “reasonably calculated” wording to broaden discovery beyond the benchmark of relevance.[50]  To disabuse courts and counsel of this practice, the “reasonably calculated” phrase has been removed and replaced with the following sentence: “Information within this scope of discovery need not be admissible in evidence to be discoverable.”[51]

[13]      Similarly, the Committee has recommended eliminating the provision in Rule 26(b)(1) which presently allows the court—on a showing of good cause—to order “discovery of any matter relevant to the subject matter involved in the action.”[52]  In its proposed “Committee Note,” the Committee justified this excision by reiterating its mantra about the proper scope of discovery: “Proportional discovery relevant to any party’s claim or defense suffices.”[53]

2.  Rule 34(b)—Eliminating Gamesmanship with Document Productions

[14]      The three key modifications the Committee has proposed for Rule 34 are designed to eliminate some of the gamesmanship associated with written discovery responses.[54]  The first change is a requirement in Rule 34(b)(2)(B) that any objection made in response to a document request must be stated “with specificity.”[55]  This recommended change is supposed to do away with the assertion of general objections.[56]  While such objections have almost universally been rejected in federal discovery practice, they still appear in Rule 34 responses.[57]  By including an explicit requirement for specific objections and coupling it with the threat of sanctions for non-compliance under Rule 26(g), the Committee may finally eradicate this practice from discovery.[58]

[15]      The second change is calculated to address another longstanding discovery dodge: making a party’s response “subject to” a particular set of objections.[59]  Whether those objections are specific or general, the Committee concluded that such a conditional response leaves the party who requested the materials unsure as to whether anything was withheld and, if so, on what grounds.[60]  To remedy this practice, the Committee added the following provision to Rule 34(b)(2)(C): “An objection must state whether any responsive materials are being withheld on the basis of that objection.”[61]  If enforced, such a requirement could make Rule 34 responses more straightforward and less evasive.[62]  This, in turn, would obviate needless meet-and-confer efforts and motion practice undertaken to ferret out such information.[63]

[16]      The third change is intended to clarify the uncertainty surrounding the responding party’s timeframe for producing documents.[64]  As it now stands, Rule 34 does not expressly mandate when the responding party must complete its production of documents.[65]  That omission has led to delayed and open-ended productions, which can lengthen the discovery process and increase litigation expenses.[66]  To correct this oversight, the Committee proposed that the responding party complete its production “no later than the time for inspection stated in the request or [at] a later reasonable time stated in the response.”[67]  For so-called “rolling productions,” the responding party “should specify the beginning and end dates of the production.”[68]  Such a provision should ultimately provide greater clarity surrounding productions of ESI.[69]

3.  Other Changes—Cost Shifting in Rule 26(c), Reductions in Discovery under Rules 30, 31, 33, 36

[17]      There were several additional changes the Committee recommended that are grounded in the concept of proportionality.  The new cost shifting provision in Rule 26(c) is particularly noteworthy.[70]  While several courts have held that implied cost-shifting authority presently exists in Rule 26(c) and have issued orders accordingly, the proposed changes would eliminate any ambiguity on this issue.[71]  Courts would be expressly authorized to allocate the expenses of discovery among the parties.[72]

[18]      The Committee has also suggested reductions in the number of depositions, interrogatories, and requests for admission.[73]  Under the draft amendments, the number of depositions would be reduced from ten to five.[74]  Oral deposition time would also be cut from seven hours to six.[75]  As for written discovery, the number of interrogatories would decrease from twenty-five to fifteen and a numerical limit of twenty-five would be introduced for requests for admission.[76]  That limit of twenty-five, however, would not apply to requests that seek to ascertain the genuineness of a particular document.[77]

C.  Case Management—Rules 4, 16, 26, 34

[19]      To better ensure that its objectives regarding cooperation and proportionality are achieved, the Committee has introduced several Rules changes that would augment the level of judicial involvement in case management.[78]  Most of these changes are designed to improve the effectiveness of the Rule 26(f) discovery conference, to encourage courts to provide input on key discovery issues at the outset of a case, and to expedite the commencement of discovery.[79]

1.  Rules 26 and 34—Improving the Effectiveness of the Rule 26(f) Discovery Conference

[20]      One way the Committee felt it could enable greater judicial involvement in case management was to require the parties to flesh out specific issues in the Rule 26(f) conference.[80]   The renewed emphasis on conducting a meaningful Rule 26(f) conference is significant as courts generally believe that a successful conference is the lynchpin for conducting discovery in a proportional manner.[81]

[21]      To enhance the usefulness of the conference, the Committee recommended amending Rule 26(f) to specifically require the parties to discuss any pertinent issues surrounding the preservation of ESI.[82]  This provision is calculated to get the parties thinking proactively about preservation problems that could arise later in discovery.[83]  It is also designed to work in conjunction with the proposed amendments to Rule 16(b)(3) and Rule 37(e).[84]  Changes to the former would expressly empower the court to issue a scheduling order addressing ESI preservation issues.[85]  Under the latter, the extent to which preservation issues were addressed at a discovery conference or in a scheduling order could very well affect any subsequent motion for sanctions for failure to preserve relevant ESI.[86]

[22]      Another amendment to Rule 26(f) would require the parties to discuss the need for a “clawback” order under Federal Rule of Evidence 502.[87]  Though underused, Rule 502(d) orders generally reduce the expense and hassle of litigating over the inadvertent disclosure of ESI protected by the attorney-client privilege.[88]  To ensure this overlooked provision receives attention from litigants, the Committee has drafted a corresponding amendment to Rule 16(b)(3) that would specifically enable the court to address Rule 502(d) matters in a scheduling order.[89]

[23]      The final step the Committee has proposed for increasing the effectiveness of the Rule 26(f) conference is to amend Rule 26(d) and Rule 34(b)(2) to enable parties to serve Rule 34 document requests prior to that conference.[90]  These “early” requests, which are not deemed served until the conference, are designed to “facilitate the conference by allowing consideration of actual requests, providing a focus for specific discussion.”[91] This, the Committee hopes, will enable the parties to subsequently prepare Rule 34 requests that are more targeted and proportional to the issues in play.[92]

2.  Rule 16—Greater Judicial Input on Key Discovery Issues

[24]      As mentioned above, the Committee has suggested adding provisions to Rule 16(b)(3) that track those in Rule 26(f) so as to provide the opportunity for greater judicial input on certain e-Discovery issues at the outset of a case.[93]  In addition to these changes, Rule 16(b)(3) would also allow a court to require that the parties caucus with the court before filing a discovery motion.[94]  The purpose of this provision is to encourage the disposition of these matters without the expense or delay of motion practice.[95]  According to the Committee, various courts have used similar arrangements under their local rules that have “prove[n] highly effective in reducing cost and delay.”[96]

3.  Rules 4 and 16—Expediting the Commencement of Discovery

[25]      The Committee has also recommended that the time for the commencement of discovery be shortened after the filing of the complaint so as to expedite the eventual disposition of a given case.[97]  In particular, Rule 4(m) would be revised to shorten the time to serve the summons and complaint from 120 days to sixty days.[98]  In addition, the proposed amendment to Rule 16(b)(2) would reduce by thirty days the time within which a court must issue a scheduling order.[99]

D.  Preservation and Sanctions under a Revised Federal Rule 37(e)

[26]      The Committee has separately considered issues regarding the over-preservation of evidence and the appropriate standard of culpability required to impose sanctions for any failures to preserve relevant information.[100]  Even though the current iteration of Rule 37(e) is supposed to provide guidance on these issues, amendments were deemed necessary given the inherent limitations with the Rule.[101]

[27]      As it now stands, Rule 37(e) is designed to protect litigants from court sanctions when the good faith, programmed operation of their computer systems automatically destroys ESI.[102]  Nevertheless, the Rule has largely proved ineffective as a national standard because it does not apply to pre-litigation information destruction activities.[103]  As a result, courts have often used their inherent authority to bypass the Rule’s protections and punish clients that negligently, though not nefariously, destroyed documents before a lawsuit was filed.[104]  Moreover, the Rule applies only to ESI and does not address issues surrounding the preservation of paper documents or other forms of evidence.[105]  All of this has caused confusion among parties over what must be maintained for litigation, resulting in the over-preservation of information.[106]

[28]      The amendments to Rule 37(e) are designed to address these issues by “provid[ing] a uniform standard in federal court for sanctions for failure to preserve.”[107]  They do so by removing the possibility that courts could impose the so-called doomsday sanctions from Rule 37(b)(2)(A) for either negligent or grossly negligent conduct in connection with preservation obligations.[108]  Instead, the proposal would shield the pre-litigation destruction of information from sanctions unless “the party’s actions” either “(i) caused substantial prejudice in the litigation and were willful or in bad faith; or (ii) irreparably deprived a party of any meaningful opportunity to present or defend against the claims in the litigation.”[109]

[29]      In making a determination on this issue, courts would no longer just rely on their inherent powers.[110]  Instead, they would employ a multifaceted analysis to examine the nature and motives underlying the party’s information retention decisions.[111]  Such factors include:

(A) the extent to which the party was on notice that litigation was likely and that the information would be discoverable;

(B) the reasonableness of the party’s efforts to preserve the information;

(C) whether the party received a request to preserve information, whether the request was clear and reasonable, and whether the person who made it and the party consulted in good faith about the scope of preservation;

(D) the proportionality of the preservation efforts to any anticipated or ongoing litigation; and

(E) whether the party timely sought the court’s guidance on any unresolved disputes about preserving discoverable information.[112]

[30]      By ensuring the analysis includes a broad range of considerations, the proposed Rule appears to delineate a balanced approach to preservation questions.[113]  Such an approach may very well benefit organizations, which could justify a reasonable document retention strategy on best corporate practices for defensible deletion.[114]  The Committee contemplates as much, observing that “[t]his subdivision [proposed Rule 37(e)(1)(B)(i)] protects a party that has made reasonable preservation decisions in light of the factors identified in Rule 37(e)(2), which emphasize both reasonableness and proportionality.”[115]

[31]      While the draft amendments to Rule 37(e) provide some key protections for enterprises, the proposed Rule also addresses some of the lingering concerns from the plaintiffs’ bar.[116]  For example, the Rule specifically empowers the court to order “additional discovery” or other “curative measures” when a litigant has destroyed information that it should have retained for litigation.[117] Under these provisions, an aggrieved party can ferret out the circumstances surrounding the destruction of that data.[118]  If the party uncovers evidence suggesting the destruction was sufficiently grievous, it could ultimately justify the imposition of sanctions under either of the above tests.[119]

E.  The Instant Rules Proposals Will Impact Organizations

[32]      To be sure, the amendments the Committee has proposed will have a direct impact on organizations.  For example, the draft revisions to Rule 37(e) clearly emphasize the need for companies to develop reasonable information retention policies, along with a workable litigation hold procedure.[120]  The enterprise that does so could simultaneously eliminate large amounts of information and reduce its discovery costs and legal exposure.[121]

[33]      Another effect of the proposed changes is that they will force companies to address discovery matters on an expedited timeframe.[122]  The truncated time periods for the service of a complaint and the issuance of a scheduling order mean parties would have less time to prepare for the commencement of discovery.[123]

[34]      In addition, the proposals spotlight the need for litigants to be prepared to address substantive discovery issues early in the case.  This is evidenced by the draft requirement that litigants discuss ESI preservation and Rule 502(d) orders at the Rule 26(f) conference and the Rule 16(b) scheduling conference.[124]  The proposed advent of early Rule 34 document requests also exemplifies this point, as it would require litigants to more thoroughly vet discovery issues at the Rule 26(f) conference.[125]  The elimination of open-ended, rolling document productions under a revised Rule 34(b)(2)(B) also underscores the need for better discovery preparations and expedited compliance.[126]

[35]      The proportionality changes to Rule 26(b)(1) will also impact organizations.[127]  Companies seeking to stave off overly broad requests will need to better understand the nature of their relevant data if they are to articulate with the necessary precision the burdens associated with production.[128]  Otherwise, disproportionate production orders will continue to be issued.[129]  In contrast, companies that have a grasp of their relevant information stand a greater chance of making the case to narrow the scope of the requests or having the costs of discovery shifted under the proposed amendment to Rule 26(c).[130]

[36]      In summary, there should be little dispute that the proposed amendments will affect litigants.  The question for organizations, however, is whether they will take the necessary measures to improve their information governance so they are prepared for the Rules changes once they are enacted.


III.  Practical Suggestions for Meeting the Information Governance Challenges Posed by the Draft Rules Changes

[37]      If enterprises expect to address the likely effects of the proposed Rules amendments, they will need to take proactive steps to ensure they can do so.[131]  While there are no quick or easy solutions to these problems, an increasingly popular method for effectively dealing with them is through an organizational strategy referred to as information governance.[132]  At its core, information governance is a comprehensive approach that companies adopt to satisfy the challenges associated with information retention, data security, privacy, and e-Discovery.[133]  Organizations that have done so have been successful in addressing the costs and risks associated with these formerly distinct disciplines.[134]

[38]      While there are many steps that enterprises can take to implement an effective information governance program, the five that I discuss in this Part are essential for those companies seeking to satisfy the draft Rules changes and thereby decrease the costs and delays associated with the discovery process.  They include developing reasonable information retention policies; preparing an effective litigation hold process; creating policies governing employee mobile device use; deploying technologies for ESI collection, search, and review; and developing a more coordinated and better managed relationship with outside counsel.  I consider each of these steps in turn.

A.  Develop Reasonable Information Retention Policies

[39]      If a company is really intent on obtaining more cost-effective results in discovery under the proposed Rules, it should examine its strategy for information retention.[135]  The time to conduct this examination is not in the crisis atmosphere of complex litigation.[136]  Instead, it should be part of the business plan for the organization.[137]  Effective information retention requires each business unit to identify the records that it creates, why it creates them, whether to retain them and for how long, who gets access to these records, and where the records are stored.[138]  The organization that can easily determine whether relevant records exist and where they should be located will clearly be ahead when litigation inevitably arises.[139]

[40]      This, in turn, should lead to the development of top-down information retention policies.[140]  Enterprises can hardly hope to decrease their discovery spending if their retention policies are antiquated, inadequate, or arbitrarily observed.[141]  Indeed, the casebooks are replete with examples of companies whose discovery costs skyrocketed because they failed to properly manage their data with reasonable retention protocols.[142]  The case of Northington v. H&M International is particularly instructive on this issue.[143]

[41]      In Northington, the court issued an adverse inference instruction to address the defendant company’s destruction of key e-mails and other ESI.[144]  The company failed to preserve those records because it did not think to implement a pre-litigation information retention strategy.[145]  For example, the company neglected to establish a formal document retention policy.[146]  Instead, “data retention . . . was evidently handled on an ad hoc, case-by-case basis.”[147]  This lack of organization eventually led to the loss of key data, costly motion practice, and the court’s sanctions award.[148]

[42]      To avoid these negative consequences, companies should insist that their in-house counsel work with IT professionals, records managers, and business units to jointly decide what data must be kept and for what length of time.[149]  By so doing, companies can spearhead the development of retention policies that are reasonable in relation to the enterprise’s business needs and its litigation profile.[150]  This should eventually lead to the systematic elimination of useless, superfluous, and/or harmful data in an organized and reasonable fashion.[151]  If performed in this manner, it is unlikely that such document destruction would be viewed as spoliation under the draft revisions to Rule 37(e) or much of the existing case law on this issue.[152]
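The interplay between a retention schedule and a preservation duty can be sketched, purely by way of illustration, in a few lines of code.  The record categories, retention periods, and litigation hold flag below are hypothetical, and any production-grade retention system would be far more involved:

```python
from datetime import date, timedelta

# Hypothetical retention schedule: record category -> retention period in days.
# In practice these periods would be set jointly by counsel, IT, records
# managers, and the business units, in light of the company's business needs
# and litigation profile.
RETENTION_SCHEDULE = {
    "invoice": 7 * 365,       # e.g., keep financial records seven years
    "marketing_email": 365,   # e.g., keep routine e-mail one year
}

def is_deletable(record, today):
    """A record may be deleted only if its retention period has run
    AND it is not subject to a litigation hold."""
    if record.get("on_litigation_hold"):
        return False  # a preservation duty overrides the schedule
    period = RETENTION_SCHEDULE.get(record["category"])
    if period is None:
        return False  # unknown categories are retained pending review
    return today - record["created"] > timedelta(days=period)

records = [
    {"id": 1, "category": "marketing_email", "created": date(2011, 1, 1),
     "on_litigation_hold": False},
    {"id": 2, "category": "marketing_email", "created": date(2011, 1, 1),
     "on_litigation_hold": True},   # held: must be preserved
    {"id": 3, "category": "invoice", "created": date(2013, 6, 1),
     "on_litigation_hold": False},  # retention period still running
]

today = date(2014, 3, 1)
deletable = [r["id"] for r in records if is_deletable(r, today)]
print(deletable)  # -> [1]; only record 1 qualifies for deletion
```

The design point the sketch captures is that the litigation hold always trumps the schedule; deletion that respects both constraints is the kind of organized, reasonable data disposition that is unlikely to be viewed as spoliation.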

B.  Prepare an Effective Litigation Hold Process

[43]      If information retention policies are to be effective for purposes of the draft revisions to Rule 37(e), they must be accompanied by a workable litigation hold process.[153]  Without a workable approach to litigation holds, the entire discovery process may very well collapse.[154]  For documents to be produced in litigation, they must first be preserved.[155]  Documents cannot be preserved if the key players or data source custodians are unaware that they must be retained.[156]  Indeed, employees and data sources may discard or overwrite ESI if they are oblivious to a preservation duty.[157]  This would leave organizations vulnerable to data loss and court sanctions, regardless of the proposed changes to Rule 37(e).[158]  No recent case is more instructive on this than E.I. du Pont de Nemours v. Kolon Industries.[159]

[44]      In Du Pont, the court issued a stiff rebuke against defendant Kolon Industries for failing to issue a timely and proper litigation hold.[160]  That rebuke came in the form of an instruction to the jury that Kolon executives and employees deleted key evidence after the company’s preservation duty was triggered.[161]  The jury responded by returning a $919 million verdict in favor of DuPont.[162]

[45]      The destruction at issue occurred when Kolon deleted e-mails and other records relevant to DuPont’s trade secret claims.[163]  After being apprised of the lawsuit and then receiving multiple litigation hold notices, various Kolon executives and employees met together and identified ESI that should be deleted.[164]  The ensuing data destruction was staggering: nearly 18,000 files and e-mails were destroyed.[165]  Furthermore, many of these materials went right to the heart of DuPont’s claim that key aspects of its Kevlar formula were allegedly misappropriated to improve Kolon’s competing product line.[166]

[46]      Surprisingly, however, the court did not blame Kolon’s employees as the principal culprits for the spoliation.[167]  Instead, the court criticized the company’s attorneys and executives, reasoning they could have prevented the destruction of information through an effective litigation hold process.[168]  This was because the three hold notices circulated to the key players and data sources were either too limited in their distribution, ineffective because they were prepared in English for Korean-speaking employees, or too late to prevent or otherwise alleviate the spoliation.[169]

[47]      The Du Pont case underscores the importance of developing a workable litigation hold process as part of the company’s overall information governance plan.[170]  As Du Pont teaches, organizations should identify what key players and data sources may have relevant information.[171]  Designated officials who are responsible for preparing the hold should then draft the hold instructions in an intelligible fashion.[172]  Finally, the hold should be circulated immediately to prevent data loss.[173]  It is only by following these suggestions that organizations can ensure that information subject to a preservation duty is actually retained and thereby avoid sanctions under the proposed amendments to Rule 37(e).[174]
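These steps can be reduced, again purely for illustration, to a simple hold-notice workflow.  The custodian names, languages, and notice text below are invented, and the actual transmission of the notice is omitted:

```python
from datetime import datetime

# Hypothetical custodian list; in practice this would be built from
# interviews with business units and IT to identify the key players and
# data sources holding potentially relevant information.
custodians = [
    {"name": "Kim", "language": "ko", "sources": ["email", "laptop"]},
    {"name": "Jones", "language": "en", "sources": ["email", "file_share"]},
]

# Hold instructions drafted intelligibly for each recipient; a notice the
# custodian cannot read (as in Du Pont) is no notice at all.
NOTICE = {
    "en": "Preserve all records relating to the dispute; do not delete.",
    "ko": "(Korean-language version of the hold instructions)",
}

def issue_hold(custodians, issued=None):
    """Circulate the hold to every custodian in a language each understands,
    and log who was notified, when, and for which data sources, so the
    process is defensible later."""
    issued = issued or datetime.now()
    log = []
    for c in custodians:
        text = NOTICE.get(c["language"], NOTICE["en"])
        # send_notice(c, text)  # e-mail or HR system integration omitted
        log.append({"custodian": c["name"], "sent": issued,
                    "sources": c["sources"], "acknowledged": False})
    return log

log = issue_hold(custodians)
print([entry["custodian"] for entry in log])  # -> ['Kim', 'Jones']
```

The logged distribution list, timestamps, and intelligible notice text address precisely the failings that doomed the holds in Du Pont: limited circulation, the wrong language, and late delivery.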

C.  Create Policies Governing Mobile Device Use

[48]      Another aspect of information governance that can help companies address the impact of the Rules proposals is the development of policies governing the use of mobile devices.[175]  These devices—especially smartphones and tablet computers—are at the forefront of digital age innovations affecting businesses today.[176]  While these mobile devices have revolutionized the way in which business is conducted, they have also introduced a myriad of security, privacy, and e-Discovery complications for enterprises.[177]

[49]      In particular, mobile device use lessens the extent of corporate control over confidential business information.[178]  Whether that information consists of trade secrets, proprietary financial data, or attorney-client privileged communications, mobile devices allow employees to more easily disclose and misappropriate that information than they otherwise could have with traditional computer hardware.[179]  With a single touch of a smartphone screen, an employee can direct sensitive company data to personal cloud providers, social networking sites, or Wikileaks pages.[180]  Any of these scenarios could prove disastrous for an organization.[181]

[50]      Furthermore, an enterprise has the challenge of preserving and producing information maintained on a mobile device.[182]  The logistical challenges of locating, retaining, and turning over that data—all while trying to observe employee privacy—present complications for satisfying the proposed Rules amendments, among many other things.[183]

[51]      To address these and other problems associated with these devices, organizations will need to develop workable use policies.[184]  Such policies will need to address how employees should handle company data on mobile devices, regardless of whether those devices are work-issued or whether they belong to the employee.[185]  They should also delineate the nature and extent of the enterprise’s right to access data on the employee device, particularly for discovery purposes.[186]  To address the inevitable privacy concerns that arise when searching an employee device for discoverable data, technologies could be installed on that device to segregate company information from personal materials and encrypt it.[187]  Such a measure would also help prevent an employee’s family or friends from accessing confidential ESI.[188]

[52]      Another best practice for enabling more rapid preservation and production of mobile device ESI is to eliminate any notion that the employee has a reasonable expectation of privacy in the device.[189]  While this can likely be done by policy for work-issued devices, it should probably be secured by separate agreement from an employee who is using a personal device under a “bring your own device” policy.[190]  The organization that has an unfettered right to obtain relevant ESI from a mobile device will more likely satisfy the preservation, proportionality, and accelerated compliance expectations of the proposed Rules amendments.[191]

D.  Deploy Technologies for ESI Collection, Search, and Review

[53]      Just as technology can facilitate compliance with company mobile device policies, ESI collection, search, and review technologies can help companies satisfy the expedited discovery objectives of the Rules proposals.[192]  This undoubtedly includes cutting-edge innovations such as predictive coding and visualization tools.[193]

[54]      Predictive coding employs machine-learning technology to more readily pinpoint relevant ESI than would be possible for human reviewers.[194]  If properly utilized, predictive coding can also reduce the staff required to conduct document reviews.[195]  Visualization tools, for their part, use analytics and machine learning to provide companies with a better understanding of the nature of their relevant information.[196]  This allows for the detection of trends, relationships, and patterns within the universe of that information, all of which can expedite the search and review process.[197]
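Commercial predictive coding platforms are proprietary and considerably more sophisticated, but the core idea of learning from an attorney-coded “seed set” to rank the unreviewed collection can be illustrated with a toy example.  The documents and scores below are invented:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector for a document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Attorney-reviewed seed set: documents already coded as relevant.
seeds = ["kevlar formula process", "kevlar fiber production process"]
centroid = Counter()
for doc in seeds:
    centroid.update(vectorize(doc))

# The unreviewed collection is ranked by similarity to the relevant
# centroid; human reviewers would then work from the top of the list down.
collection = {
    "doc_a": "meeting notes on the kevlar production process",
    "doc_b": "cafeteria lunch menu for next week",
}
ranked = sorted(collection,
                key=lambda d: cosine(vectorize(collection[d]), centroid),
                reverse=True)
print(ranked[0])  # -> doc_a, the document closest to the relevant seeds
```

Real systems refine the model iteratively as reviewers code more documents, which is what allows them to reduce the staffing needed for review.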

[55]      Enterprises would also be well served to familiarize themselves with traditional e-Discovery technology tools such as keyword search, concept search, email threading, and data clustering.[198]  With respect to keyword searches, there is significant confusion regarding their continued viability given some prominent court opinions frowning on so-called blind keyword searches.[199]  However, most e-Discovery jurisprudence and authoritative commentators confirm the effectiveness of certain keyword searches so long as they involve some combination of testing, sampling, and iterative feedback.[200]
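The testing, sampling, and iterative feedback that courts and commentators endorse can likewise be sketched as a simple loop: run the proposed terms, review a sample of the hits, measure precision, and refine the terms.  The corpus and the relevance calls below are, of course, hypothetical:

```python
import random

# Hypothetical corpus of (text, truly_relevant) pairs, standing in for
# documents whose relevance a reviewer can judge on inspection.
corpus = [
    ("patent license agreement draft", True),
    ("agreement on cafeteria vendors", False),
    ("license terms for the patent", True),
    ("holiday party planning", False),
]

def search(terms, docs):
    """Return documents containing every search term."""
    return [d for d in docs if all(t in d[0] for t in terms)]

def sampled_precision(hits, k=2, seed=0):
    """Review a random sample of the hits and report the fraction that are
    actually relevant -- the 'testing and sampling' step."""
    random.seed(seed)
    sample = random.sample(hits, min(k, len(hits)))
    return sum(1 for _, relevant in sample if relevant) / len(sample)

# Iteration 1: a single broad term sweeps in false positives.
broad = search(["agreement"], corpus)
print(sampled_precision(broad))    # -> 0.5; sampling reveals weak precision

# Iteration 2: feedback prompts narrower, conjunctive terms.
refined = search(["patent", "license"], corpus)
print(sampled_precision(refined))  # -> 1.0; the refined terms fare better
```

The point of the loop is that the search terms are validated against the data rather than run “blind,” which is exactly the practice the case law criticizes.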

[56]      Regardless of the tools that a litigant selects for collection, search, and review, some form of technology is ultimately necessary to meet the proposed Rules changes.  It is not difficult to envision the problems that companies will have litigating under the revised Rules without using some combination of these tools.[201]  For example, enterprises will find it difficult to intelligently discuss discovery matters at the Rule 26(f) conference or the Rule 16(b) scheduling conference.  Nor will they be able to establish—much less meet—good faith production deadlines required by proposed Rule 34(b)(2)(B).  While various other scenarios similar to these abound, it is sufficient to observe that e-Discovery in 2014 and beyond will require help from technology.[202]

E.  Better Management of Outside Counsel

[57]      A final measure that companies should consider is developing a more carefully managed relationship with their retained outside counsel.[203]  An outgrowth of information governance, such a well-managed relationship has the potential to keep client discovery costs more reasonable while guiding counsel to litigate within the bounds of the proposed Rules changes.[204]

[58]      The first step that companies can take in this regard is to state their expectations for how discovery should be conducted at the time of retention or at the commencement of a suit.[205]  A realistic budget and staffing plan, informed by those expectations, must also be addressed.[206]  Companies should also emphasize to their engaged lawyers the importance of satisfying the requirements of the proposed Rules, particularly proportionality standards.[207]  While these requirements may be overlooked or even unknown to many attorneys, clients are bound—under penalty of sanctions—to ensure that their discovery efforts meet these standards.[208]  Moreover, company efforts to insist on proportional discovery may be rewarded with decreased preservation and collection costs.[209]

[59]      It is also crucial that organizations communicate with their outside lawyers regarding pertinent aspects of their information governance plan.[210]  To decrease the possibility of misunderstandings, companies should give outside counsel ready access to appropriate information technology personnel and to the relevant business leaders who own the pertinent information.[211]  Outside counsel cannot be effective—and may inadvertently stumble into a costly e-Discovery sideshow—if they are unfamiliar with the company’s information governance and retention policies.[212]  In contrast, having such information will enable outside counsel to more easily negotiate key issues surrounding the discovery of ESI at the Rule 26(f) conference and Rule 16(b) scheduling conference.[213]  Moreover, open communication regarding this matter will facilitate strategy and logistics regarding the preservation and collection of relevant information.[214]

[60]      By taking these steps, organizations will increase their likelihood of compliance with the Rules proposals.  In addition, having such an organized strategy and partnership will reduce discovery delays and related legal fees that typically result from poor planning.[215]


IV.       Conclusion

[61]      Compliance with the proposed Rules amendments does not need to be an elusive concept.  Organizations can prepare for the Rules amendments by taking the initiative to implement or update their information governance strategy.  By following the suggestions that I delineate in this Article, along with other best practices, enterprises can satisfy the new requirements under the draft Rules revisions.  In so doing, they will likely reduce the costs and burdens associated with discovery—both now and in the future.

* Senior Discovery Counsel, Recommind, Inc.; J.D., Santa Clara University School of Law, 1999; B.A., Political Science, Brigham Young University, 1994.


[1] Mitchell Dembin & Philip Favro, Changing Discovery Culture One Step at a Time, Law Tech. News (Dec. 5, 2013), (describing the steps organizations can take to satisfy the provisions set forth in the newly proposed amendments to the Federal Rules of Civil Procedure).

[2] See U.S. Supreme Court Order Amending the Fed. R. Civ. P. at 3,  Apr. 12, 2006, available at; see also Philip J. Favro, A New Frontier in Electronic Discovery: Preserving and Obtaining Metadata, 13 B.U. J. Sci. & Tech. L. 1, 18 n.114 (2007).

[3] See Judicial Conference Comm. on Rules of Practice and Procedure, Summary of the Report of the Judicial Conference Comm. on Rules of Practice and Procedure 24 (Sept. 2005), available at; see also Jessica DeBono, Comment, Preventing and Reducing Costs and Burdens Associated with E-discovery: The 2006 Amendments to the Federal Rules of Civil Procedure, 59 Mercer L. Rev. 963, 964 (2008) (explaining that “the 2006 amendments are intended to help reduce the costs and burdens imposed by electronic discovery”).

[4] Philip J. Favro & Hon. Derek P. Pullan, New Utah Rule 26: A Blueprint for Proportionality under the Federal Rules of Civil Procedure, 2012 Mich. St. L. Rev. 933, 979 (2012); see also Milberg LLP & Hausfeld LLP, E-Discovery Today: The Fault Lies Not in Our Rules . . ., 4 Fed. Cts. L. Rev. 131, 142 (2011) (arguing that the 2006 Rules amendments “place a premium on a fair resolution on the merits” and deter lawyers from using discovery “as an opportunity to hide the ball until trial”).

[5] See Philip Favro & Tish Looper, The Rule 37(e) Safe Harbor: The Touchstone of Effective Information Management, Metropolitan Corp. Couns., December 2011, at 12;  Dan H. Willoughby, Jr. et al.,  Sanctions for E-Discovery Violations: By the Numbers, 60 Duke L.J. 789, 792-95 (2010) (observing that the “highest number of filed motions and awards relating to e-[D]iscovery sanctions in any single year prior to 2010 occurred in 2009, three years after the effective date of the 2006 amendments”).

[6] See Comm. on Rules of Practice and Procedure of the Judicial Conference of the U.S., 113th Cong., Preliminary Draft of Proposed Amendments to the Federal Rules of Bankruptcy and Civil Procedure 271 (Comm. Print 2013), available at [hereinafter Report] (observing that “[t]he amount and variety of digital information has expanded enormously in the last decade, and the costs and burdens of litigation holds have escalated as well”).

[7] See generally Tom Kaneshige, Infographic: BYOD’s Meteoric Rise, CIO (Jan. 16, 2013, 2:50 PM), (noting the substantial growth of personal mobile device use in the workplace).

[8] See Gabriella Khorasanee, The Growing Reach of e-Discovery: Text Messages, In-House (Oct. 14, 2013, 11:52 AM), (discussing survey results regarding cellphone use for text messaging, along with associated e-Discovery risks arising from text messaging).

[9] Cf. William D. Henderson, A Blueprint for Change, 40 Pepp. L. Rev. 461, 487 (2013) (observing that discovery burdens have increased due to the “massive explosion of digital data,” which includes “e[-]mails, text messages, internal knowledge management platforms designed to replace e[-]mail, and digitized voice mail”).

[10] See generally William Jeremy Robison, Note, Free at What Cost?: Cloud Computing Privacy Under the Stored Communications Act, 98 Geo. L.J. 1195, 1200 n.26, 1202-04 (2010) (defining cloud computing and describing its rapidly expanding usage).

[11] See generally Craig B. Shaffer & Ryan T. Shaffer, Looking Past The Debate: Proposed Revisions to the Federal Rules of Civil Procedure, 7 Fed. Cts. L. Rev. 178, 187-90 (2013) (describing generally the factors driving the demand for additional amendments to the Federal Rules); Report, supra note 6, at 259-339.

[12] See Report, supra note 6, at 1, 260, 270.

[13] See id. at 272, 274.

[14] See Shaffer & Shaffer, supra note 11, at 178-79.  See generally Report, supra note 6, at 259-339.

[15] See, e.g., Thomas Y. Allman, Rules Committee Adopts ‘Package’ of Discovery Amendments, 13 Digital Discovery and e-Evidence 200 (2013),

[16] See Report, supra note 6, at 272 (“[T]he amended rule [37(e)] makes it clear that—in all but very exceptional cases in which failure to preserve ‘irreparably deprived a party of any meaningful opportunity to present or defend against the claims in the litigation’—sanctions (as opposed to curative measures) could be employed only if the court finds that the failure to preserve was willful or in bad faith, and that it caused substantial prejudice in the litigation.” (quoting the proposed Rule 37(e)(1)(B)(ii))).

[17] See Michael Kozubek, Proposed Federal Rule Changes Would Limit the Scope of e-discovery, Inside Counsel (July 1, 2013),

[18] See Report, supra note 6, at 260.

[19] See id.

[20] See Alison Frankel, Debate Sharpens on Proposed Changes to Federal Rules on Discovery, Reuters (Nov. 6, 2013),

[21]  Cf. Hon. Patrick J. Walsh, Rethinking Civil Litigation in Federal District Court, 40 Litig. 6, 7 (2013) (urging lawyers to use “[twenty-first] century computer technology” to address digital age discovery issues instead of relying on legacy discovery technologies).

[22] See Dembin & Favro, supra note 1.

[23] See id.

[24] See id.

[25] See id.

[26] See Report, supra note 6, at 260-61, 264, 269-70.

[27] Fed. R. Civ. P. 1.

[28] See Report, supra note 6, at 260-61, 264, 269-70 (observing that “[p]roportionality in discovery, cooperation among lawyers, and early and active judicial case management are highly valued and, at times, missing in action,” and discussing how the proposed amendments would advance these notions).

[29] Id. at 260.

[30] See id. at 272 (“A central objective of the proposed new Rule 37(e) is to replace the disparate treatment of preservation/sanctions issues in different circuits by adopting a single standard.”).

[31] See id. at 270.

[32] Id. at 281.

[33] See Report, supra note 6, at 270, 281.

[34] Id. at 270.

[35] See Bottoms v. Liberty Life Assurance Co. of Bos., No. 11-cv-01606-PAB-CBS, 2011 U.S. Dist. LEXIS 143251, at *10-11 (D. Colo. Dec. 13, 2011) (spotlighting the importance of the Rule 26(g) certification requirement, along with sanctions for noncompliance, for curbing discovery abuses).

[36] See, e.g., Pippins v. KPMG LLP, No. 11 Civ. 0377(CM)(JLC), 2011 U.S. Dist. LEXIS 116427, at *23-27 (S.D.N.Y. Oct. 7, 2011), aff’d, 279 F.R.D. 245 (S.D.N.Y. 2012) (discussing generally why cooperation and proportionality are inextricably intertwined for purposes of discovery).

[37] See Report, supra note 6, at 264-65.

[38] Cf. Favro & Pullan, supra note 4, at 966-968 (proposing modest changes to the Federal Rules to better emphasize that proportionality standards are the touchstone of federal discovery).

[39] Report, supra note 6, at 260.

[40] See id. at 264-67, 269.

[41]  See id. at 264-67.

[42] See id. at 265, 296.

[43] See id. at 296; Favro & Pullan, supra note 4, at 966.

[44] Fed. R. Civ. P. 26(b)(2)(C).

[45] Report, supra note 6, at 289.

[46] See Favro & Pullan, supra note 4, at 966, 976.

[47] See Report, supra note 6, at 296-97.

[48] Id. at 266.

[49] Fed. R. Civ. P. 26(b)(1).

[50] See Report, supra note 6, at 266.

[51] Id. at 289-90.

[52] Id. at 265-66, 296-97.

[53] Id. at 296-97.

[54]  See id. at 269.

[55] Report, supra note 6, at 269, 307-08.

[56] See id. at 308.

[57] See, e.g., Mancia v. Mayflower Textile Servs. Co., 253 F.R.D. 354, 359 (D. Md. 2008).

[58] See Fed. R. Civ. P. 26(g)(3).

[59] See Report, supra note 6, at 269.

[60] See id. at 269, 309.

[61] Id. at 308.

[62] See id.  at 269, 309.

[63] See id.

[64] See Report, supra note 6, at 269.

[65] See id.

[66] See id.

[67] Id. at 269, 307.

[68] Id. at 269, 309.

[69] See Report, supra note 6, at 269.

[70] See generally id. at 266, 298.

[71] See id.

[72] See id.

[73]  See id. at 267-69.

[74] See Report, supra note 6, at 267.

[75] Id. at 301.

[76] See id. at 268-69, 305.

[77] See id. at 269.

[78] See id. at 260-61.

[79] See Report, supra note 6, at 261.

[80] See id. at 263.

[81] See, e.g., Seventh Circuit Elec. Discovery Comm., Principles Relating to the Discovery of Electronically Stored Information, at princ. 2.05-2.06 (2010), available at

[82] See Report, supra note 6, at 263, 295.

[83] See id. at 299.

[84] See id. at 263; accord id. at 287.

[85] See id. at 263.

[86] See id. at 299, 327-28.

[87] See Report, supra note 6, at 263, 296.

[88] See John M. Barkett, Evidence Rule 502: The Solution to the Privilege-Protection Puzzle in the Digital Era, 81 Fordham L. Rev. 1589, 1619-20 (2013) (discussing the importance of Federal Rule of Evidence 502(d) in reducing the costs and burdens associated with attorney-client privilege reviews in discovery).  See generally Richard Marcus, The Rulemakers’ Laments, 81 Fordham L. Rev. 1639 (2013) (describing the underuse of Federal Rule of Evidence 502(d)).

[89] See Report, supra note 6, at 263, 286.

[90] See id. at 263-64, 294, 298, 306, 308.

[91] Id. at 263-64.

[92] See id. at 264.

[93] See id. at 263.

[94] See Report, supra note 6, at 263, 288.

[95] See id. at 263, 288.

[96] Id. at 263.

[97] See id. at 261, 282, 284-85, 287.

[98] Id. at 261, 282.

[99] Report, supra note 6, at 261, 284-85.

[100] See id. at 271-72.

[101] See id. at 272, 274.

[102] Fed. R. Civ. P. 37(e).  See generally Philip J. Favro, Sea Change or Status Quo: Has the Rule 37(e) Safe Harbor Advanced Best Practices for Information Management?, 11 Minn. J.L. Sci. & Tech. 317 (2010) (discussing the background, purposes, and application of Rule 37(e)).

[103] See Paul W. Grimm et al., Proportionality in the Post-Hoc Analysis of Pre-Litigation Preservation Decisions, 37 U. Balt. L. Rev. 381, 398 (2008).

[104] See Report, supra note 6, at 272 (noting that the proposed amendments reject a standard that holds negligence to be sufficient for sanctions, such as the one used in Residential Funding Corp. v. DeGeorge Financial Corp., 306 F.3d 99 (2d Cir. 2002)).

[105] See id. at 274.

[106] See id. at 317-18.

[107] Id. at 321; see id. at 318.

[108] See id. at 272, 321.

[109] Report, supra note 6, at 315.

[110] See id. at 320.

[111] See id. at 325-28.

[112] Id. at 316-17.

[113] See id. at 325-28.

[114] Kozubek, supra note 17.

[115] Report, supra note 6, at 321.

[116] See id. at 314-15, 320-21.

[117] Id. at 314-15.

[118] See id. at 320-21.

[119] See id. at 320-23, 325-28.

[120] Cf. Dembin & Favro, supra note 1 (suggesting some steps that in-house lawyers can take on behalf of their organizational clients to change the manner in which discovery is conducted).

[121] See id.; see also supra Part II.D.

[122] Report, supra note 6, at 261 (“The case-management proposals reflect a perception that the early stages of litigation often take far too long. ‘Time is money.’ The longer it takes to litigate an action, the more it costs. And delay is itself undesirable.”).

[123] See supra Part II.C.3.

[124] See supra Part II.C.1-2.

[125] See supra Part II.C.1.

[126] See supra Part II.B.2.

[127] See supra Part II.B.1.

[128] See generally Pippins v. KPMG LLP, No. 11 Civ. 0377(CM)(JLC), 2011 U.S. Dist. LEXIS 116427, at *23-27 (S.D.N.Y. Oct. 7, 2011), aff’d, 279 F.R.D. 245 (S.D.N.Y. 2012) (discussing proportionality standards).

[129] See id.

[130] See supra Part II.B.3.  See generally Eisai Inc. v. Sanofi-Aventis U.S., LLC, No. 08-4168 (MLC), 2012 U.S. Dist. LEXIS 52885 (D.N.J. Apr. 16, 2012) (invoking proportionality standards to deny substantially all of the plaintiff’s document requests).

[131] See Charles R. Ragan, Information Governance: It’s a Duty and It’s Smart Business, 19 Rich. J.L. & Tech. 12, ¶ 9 (2013),; Dean Gonsowski, Inside Experts: Information Governance Takes the Stage in 2012, Inside Counsel (Jan. 27, 2012),

[132] See Ragan, supra note 131, at ¶¶ 30-33.

[133] See Gonsowski, supra note 131.

[134] See, e.g., E.I. du Pont De Nemours & Co. v. Kolon Indus., Inc., No. 3:09cv58, 2011 U.S. Dist. LEXIS 45888, at *46-48 (E.D. Va. Apr. 27, 2011) (holding that sanctions were not appropriate where emails were eliminated pursuant to a good faith information retention policy before a duty to preserve attached).

[135] See Anne Kershaw, Proposed New Federal Civil Rules—Part One (Data Disposition & Sanctions), Exchange (ARMA Metro NYC, New York, N.Y.), Nov.–Dec. 2013, at 10, 13, (opining that “organizations will have every reason to make sure that they routinely dispose of documents that do not need to be retained” if the proposed changes to Rule 37(e) are enacted).

[136] See Ragan, supra note 131, at ¶¶ 42-43.

[137] See id.

[138] See id.

[139] See Brigham Young Univ. v. Pfizer, Inc., 282 F.R.D. 566, 572-73 (D. Utah 2012) (denying plaintiffs’ fourth motion for doomsday sanctions since evidence was destroyed pursuant to defendants’ “good faith business procedures”).

[140] See Gonsowski, supra note 131.

[141] See Doe v. Norwalk Cmty. Coll., 248 F.R.D. 372, 378 (D. Conn. 2007) (denying defendants’ request to invoke the so-called “safe harbor” provision under Rule 37(e) where the defendants failed to observe their own document retention policies).

[142] See, e.g., United Med. Supply Co. v. United States, 77 Fed. Cl. 257, 274 (2007) (sanctioning defendant for allowing materials to be destroyed by its “antiquated” retention policies); Doe, 248 F.R.D. at 378.

[143] Northington v. H&M Int’l, No. 08-CV-6297, 2011 U.S. Dist. LEXIS 14366, at *43, *45-46 (N.D. Ill. Jan. 12, 2011).

[144] Id. at *58-61.

[145] See id. at *22-25.

[146] Id. at *21.

[147] Id.

[148] Northington, 2011 U.S. Dist. LEXIS 14366, at *16-19, *21.

[149] See Gonsowski, supra note 131.

[150] See id.

[151] See Micron Tech., Inc. v. Rambus Inc., 645 F.3d 1311, 1322 (Fed. Cir. 2011) (approving information retention policies that eliminate documents for “good housekeeping” purposes); Gonsowski, supra note 131.

[152] See, e.g., Viramontes v. U.S. Bancorp, No. 10 C 761, 2011 U.S. Dist. LEXIS 7850, at *8, *10-13 (N.D. Ill. Jan. 27, 2011) (citing Fed. R. Civ. P. 37(e)) (denying sanctions motion since the emails at issue were eliminated pursuant to a good faith retention policy before a duty to preserve was triggered).

[153] See, e.g., id. at *8-10, *12-13 (citing Fed. R. Civ. P. 37(e)).

[154] See, e.g., E.I. du Pont de Nemours & Co. v. Kolon Indus., Inc., 803 F. Supp. 2d 469, 509-10 (E.D. Va. 2011) (issuing an adverse inference jury instruction as a result of the defendant’s failure to distribute a timely and comprehensive litigation hold after its obligation ripened to retain relevant ESI).

[155] See, e.g., id. at 508-09.

[156] See, e.g., id. at 507-09.

[157] See Oleksy v. General Elec. Co., No. 06 C 1245, 2013 U.S. Dist. LEXIS 107638, at *33-35 (N.D. Ill. July 31, 2013) (ordering the production of defendant’s litigation hold instructions as a discovery sanction for failing to preserve relevant evidence that was purged from a database).

[158] See Micron Tech., Inc. v. Rambus Inc., 917 F. Supp. 2d 300, 316, 327 (D. Del. 2013) (declaring defendant’s patents unenforceable as a discovery sanction to address its failure to preserve email backup tapes, paper documents and other ESI).  But see Brigham Young Univ. v. Pfizer, Inc., 282 F.R.D. 566, 572-73 (D. Utah 2012) (denying plaintiffs’ fourth motion for doomsday sanctions since evidence was destroyed pursuant to defendants’ “good faith business procedures”).

[159] See Du Pont, 803 F. Supp. 2d at 510.

[160] Id. at 501-02, 509-10.

[161] Id. at 509-10.

[162] E.I. du Pont De Nemours & Co. v. Kolon Indus., Inc., 894 F. Supp. 2d 691, 721 (E.D. Va. 2012) (entering a 20-year product injunction against the defendant); Press Release, McGuire Woods, Jury Returns $919 Million for DuPont in Trade Secrets Theft Case (Sept. 15, 2011), available at$919-Million-for-DuPont-in-Trade-Secrets-Theft-Case.aspx.

[163] Du Pont, 803 F. Supp. 2d at 478-82.

[164] Id. at 478, 480-82, 501-05.

[165] Id. at 480.

[166] Id. at 480, 482, 489.

[167] Id. at 501.

[168] Du Pont, 803 F. Supp. 2d at 501 (holding that Kolon’s “counsel and executives should have affirmatively monitored compliance with the [litigation hold] orders.”).

[169] Id. at 479, 494.

[170] See generally id.

[171] See id. at 500.

[172] See id.

[173] See Du Pont, 803 F. Supp. 2d at 500.

[174] See, e.g., Viramontes v. U.S. Bancorp, No. 10 C 761, 2011 U.S. Dist. LEXIS 7850, at *12-13 (N.D. Ill. Jan. 27, 2011) (citing Fed. R. Civ. P. 37(a)(5)(B)) (denying sanctions motion since defendant issued a timely litigation hold to preserve relevant documents once a preservation duty attached).

[175] See Philip Berkowitz et al., Littler Report, The “Bring Your Own Device” to Work Movement: Engineering Practical Employment and Labor Law Compliance Solutions 1, 45 (2012), available at (detailing legal challenges regarding mobile device use such as implementing legal holds, protecting trade secrets, and proving misappropriation).

[176] See Greg Day, Overview from Greg Day On the Topic of Bring Your Own Device—The Challenges Facing Today and How This Trend Will Evolve in the Future, Symantec (Apr. 23, 2012), (describing the various challenges associated with mobile devices in the workplace).

[177] See Berkowitz, supra note 175, at 10.

[178] See Henry Z. Horbaczewski & Ronald I. Raether, BYOD:  Know the Privacy and Security Issues Before Inviting Employee-Owned Devices to the Party, ACC Docket, Apr. 2012, at 71, 72, available at (“Security starts with knowing what data resides where, and who has access to that data.  With employee-owned devices, the main unique issue from a security perspective is loss of control.”).

[179] See id.

[180]  See Lisa Milam-Perez, Littler Mendelson Attorney Warns of Pitfalls of “BYOD”, Wolters Kluwer (July 29, 2012), (describing best practices for workplace policies regarding mobile device use: “No use by friends and family members!  ‘I got the most guff for this one . . . and I imagine you probably will too.  I know your kid likes to play Angry Birds, and I know you bought it with your own money,’ but it’s an essential control”); Privacy Roundtable Highlights, Recorder (Mar. 5, 2013), (discussing the risk of misappropriation of company data by family members sharing devices that may also be used for work under an employer’s mobile device policy).

[181] See Milam-Perez, supra note 180 (discussing the “potential liability and other risks” of bring your own device policies).

[182] See Ragan, supra note 131, at ¶ 16 (noting that companies must keep certain information for various time periods and the effect of new technologies on information retention).

[183] See id; see also Greg Buckles, A Quick Forensics Lesson: The Smart Phone Is Much More Than Just a Hard Drive, Legal IT Profs. (July 17, 2012), (describing various challenges surrounding the preservation and collection of ESI from mobile devices).

[184] See Susan Ross, Unintended Consequences of Bring Your Own Device, Law Tech. News, Mar. 7, 2013, at 3, available at

[185] See Milam-Perez, supra note 180; Privacy Roundtable Highlights, supra note 180.

[186] See Day, supra note 176.

[187] See Philip J. Favro, Inviting Scrutiny: How Technologies are Eroding the Attorney-Client Privilege, 20 Rich. J.L. & Tech. 2, ¶ 158 (2013),

[188] Id.

[189] See, e.g., Michael Z. Green, Against Employer Dumpster-Diving for Email, 64 S.C. L. Rev. 323, 341 (2012).

[190] See id. at 341, 362-63.

[191] See generally Howard Hunter, Social Media and Discovery, 24 N.Y. St. B. Ass’n  Int’l L. Practicum 117, 117, 119-21 (2011) (describing the interplay between privacy strictures and discovery obligations).

[192] See Patrick J. Walsh, Rethinking Civil Litigation in Federal District Court, 40 Litig. 6, 6-7 (2013).

[193] See id. at 7 (“A better method for searching large databases is predictive coding.”).

[194] See Moore v. Publicis Groupe, 287 F.R.D. 182, 190 (S.D.N.Y. 2012) (detailing the cost and review benefits that predictive coding technologies may offer over traditional review methods).

[195] See id.

[196] Tal Z. Zarsky, “Mine Your Own Business!”: Making the Case for the Implications of the Data Mining of Personal Information in the Forum of Public Opinion, 5 Yale J. L. & Tech. 4, 9 n.27 (2003) (discussing the functions and ostensible benefits of visualization technologies).

[197] See Jacob Tingen, Technologies-That-Must-Not-Be-Named: Understanding and Implementing Advanced Search Technologies in E-Discovery, 19 Rich. J.L. & Tech. 2, ¶¶ 1-2, 43 (2012), (explaining the benefits of using visualization tools in discovery over traditional review methods).

[198] See Philip Favro, Mission Impossible? The eDiscovery Implications of the ABA’s New Ethics Rules, e-discovery 2.0 (Aug. 30, 2012), (describing the importance of using traditional and new technologies to satisfy discovery obligations).

[199] See, e.g., Moore, 287 F.R.D. at 190-91; William A. Gross Const. Assocs, Inc. v. Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009) (“This case is just the latest example of lawyers designing keyword searches in the dark, by the seat of the pants, without adequate (indeed, here, apparently without any) discussion with those who wrote the emails.”).

[200] See William A. Gross, 256 F.R.D. at 135-36; Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260-62 (D. Md. 2008) (“Selection of the appropriate search and information retrieval technique requires careful advance planning by persons qualified to design effective search methodology.  The implementation of the methodology selected should be tested for quality assurance; and the party selecting the methodology must be prepared to explain the rationale for the method chosen to the court, demonstrate that it is appropriate for the task, and show that it was properly implemented.”).

[201] See Walsh, supra note 192, at 7 (“The biggest problem I see with electronic discovery is that lawyers are using 20th-century technology-that is, obtaining all of the documents, organizing them in folders, and trying to read and digest them-to address 21st-century production.”).

[202] See id.

[203] See Shawn Cheadle & Philip J. Favro, Push or Pull: Deciding How Much Oversight is Required of In-house Counsel in eDiscovery, ACC Docket, May 2013, at 82, 89 (describing some of the ways that in-house counsel can obtain better advocacy from its retained outside counsel).

[204] See id. at 89-90.

[205] Id. at 89.

[206] Id.

[207] Id.

[208] Cheadle & Favro, supra note 203, at 89; see Fed. R. Civ. P. 26(g)(3).

[209] See generally Eisai Inc. v. Sanofi-Aventis U.S., LLC, No. 08-4168 (MLC), 2012 U.S. Dist. LEXIS 52887 (D.N.J. Apr. 16, 2012); Pippins v. KPMG LLP, No. 11 Civ. 0377(CM)(JLC), 2011 U.S. Dist. LEXIS 116427 (S.D.N.Y. Oct. 7, 2011), aff’d, 279 F.R.D. 245 (S.D.N.Y. 2012).

[210] See Kershaw, supra note 135, at 13 (noting that “lawyers will need to have a good understanding of their client’s records management and disposition policies”).

[211] See id.

[212] See id. at 11, 13.

[213] See id. at 13 (“[E]ngaging in early discussions with adversaries  . . . means we can finally replace preservation uncertainty—the reason why organizations save everything—with preservation certainty.”).

[214] See id.

[215] See Gonsowski, supra note 131.

Finding the Signal in the Noise: Information Governance, Analytics, and the Future of Legal Practice


Cite as: Bennett B. Borden & Jason R. Baron, Finding the Signal in the Noise: Information Governance, Analytics, and the Future of Legal Practice, 20 Rich. J.L. & Tech. 7 (2014),


Bennett B. Borden* and Jason R. Baron**



[1]        In the watershed year of 2012, the world of law witnessed the first concrete discussion of how predictive analytics may be used to make legal practice more efficient.  That the conversation about the use of predictive analytics has emerged out of the e-Discovery sector of the law is not all that surprising: in the last decade and with increasing force since 2006—with the passage of revised Federal Rules of Civil Procedure that expressly took into account the fact that lawyers must confront “electronically stored information” in all its varieties—there has been a growing recognition among courts and commentators that the practice of litigation is changing dramatically.  What now needs to be recognized, however, is that the rapidly evolving tools and techniques that have proven so helpful in responding efficiently to document requests in complex litigation may be applied in a variety of ways complementary to the discovery process itself.

[2]        This Article is informed by the authors’ strong views on the subject of using advanced technological strategies to be better at “information governance,” as defined herein.  If a certain evangelical strain appears to arise out of these pages, the authors willingly plead guilty.  One need not be an evangelist, however, but merely a realist to recognize that the legal world and the corporate world both are increasingly confronting the challenges and opportunities posed by “Big data.”[1]  This Article has a modest aim: to suggest certain paths forward where lawyers may add value in recommending to their clients greater use of advanced analytical techniques for the purpose of optimizing various aspects of information governance.  No attempt at comprehensiveness is made here; instead, the motivation behind writing this Article is simply to take stock of where the legal profession is, as reflected in the emerging case law on predictive coding exemplified by Da Silva Moore,[2] and to suggest that the expertise law firms have gained in this area may be applied in a variety of related contexts.

[3]        To accomplish what we are setting out to do, we will divide the discussion into the following parts: first, a synopsis of why and how predictive coding first emerged against the backdrop of e-Discovery.  This discussion will include a brief overview of predictive coding with references to the technical literature, as the subject has been recently covered exhaustively elsewhere.  Second, we will define what we mean by “Big data,” “analytics,” and “information governance,” for the purpose of providing a proper context for what follows.  Third, we will note those aspects of an information governance program that are most susceptible to the application of predictive coding and related analytical techniques.  Perhaps of most value, we wish to share a few “early” examples of where we as lawyers have brought advanced analytics, like predictive coding, to bear in non-litigation contexts and to assist our clients in creative new ways.  We fully expect that what we say here will be overrun with a multitude of real-life use cases soon to emerge in the legal space.  Armed with the knowledge that we are attempting to catch lightning in a bottle and that law reviews on subjects such as this one have ever-decreasing “shelf-lives”[3] in terms of the value proposition they provide, we proceed nonetheless.

A.  The Path to Da Silva Moore

[4]        The Law of Search and Retrieval.  In the beginning, there was manual review.  Any graduate of a law school during the latter part of the twentieth century who found herself or himself employed before the year 2000 at a law firm specializing in litigation and engaged in high-stakes discovery remembers well how document review was conducted: legions of lawyers with hundreds if not thousands of boxes in warehouses, reviewing folders and pages one-by-one in an effort to find the relevant needles in the haystack.[4]  (Some of us also remember “Shepardizing” a case to find subsequent citations to it, using red and yellow booklets, before automated key-citing came along.)  Although manual review remains a default practice in a variety of more modest engagements, it is increasingly the case that all of discovery involves “e-Discovery” of some sort—that the world is simply “awash in data”[5] (starting, but by no means ending, with email messages and other textual documents of all varieties), and that it will increasingly be the unusual case of any size where documents in paper form still loom large as the principal source of discovery.

[5]        At the turn of the century, the dawning awareness of the need to deal with a new realm of electronically stored information (“ESI”) led to burgeoning efforts on many fronts, including, for example, the creation of The Sedona Conference working group on electronic document retention and production, members of which drafted The Sedona Conference Principles: Addressing Electronic Document Production (2005; 2d ed. 2007) and its “prequel,” The Sedona Guidelines: Best Practice Guidelines and Commentary for Managing Records and Information in the Electronic Age (2005; 2d ed. 2007).  These early commentaries, along with a smattering of pre-2006 case law,[6] recognized that changes in legal practice were necessary to accommodate the big changes coming in the world of records and information management within the enterprise.  Subsequent developments would constitute various complementary threads leading to the greater use of analytics in the legal space.

[6]        First, part of that early recognition was that in an inflationary universe of rapidly expanding amounts of ESI, new tools and techniques would be necessary for the legal profession to adapt and keep up with the times.[7]  By the time of adoption of the revised Federal Rules of Civil Procedure in 2006, which expressly added the term “ESI” to supplement “documents” in the rule set applicable to discovery practice, the legal profession was well aware of the need to perform automated searches in the form of keyword searching within large data sets as the only realistically available means for sorting information into relevant and non-relevant evidence in particular engagements, be they litigation or investigations.  So too, it was recognized early on in commentaries[8] and subsequently in case law[9] that keyword searching, as good a tool as it was, had profound limitations that, in the end, did not scale well.  At the end of the day, even being able to limit or cull down a large data set to one percent of its original size through the use of keywords leaves the lawyer with the near-impossible task of manually reviewing a very large set of documents at great cost.[10]

[7]        Second, in evolving e-Discovery practice after 2006, a growing recognition also emerged that e-Discovery workflows are “industrial” processes in need of better metrics and measures for evaluating the quality of productions of large data sets.  As recognized in The Sedona Conference Commentary on Achieving Quality in E-discovery (Post-Public Comment Version 2013):

The legal profession has passed a crossroads: When faced with a choice between continuing to conduct discovery as it had “always been practiced” in a paper world—before the advent of computers, the Internet, and the exponential growth of electronically stored information (ESI)—or alternatively embracing new ways of thinking in today’s digital world, practitioners and parties acknowledged a new reality and chose progress.  But while the initial steps are completed, cost-conscious clients and over-burdened judges are increasingly demanding that parties find new approaches to solve litigation problems.[11]

 [8]        The Commentary goes on to suggest that the legal profession would benefit from greater

awareness about a variety of processes, tools, techniques, methods, and metrics that fall broadly under the umbrella term “quality measures” and that may be of assistance in handling ESI throughout the various phases of the discovery workflow process.  These include greater use of project management, sampling, machine learning, and other means to verify the accuracy and completeness of what constitutes the “output” of e-[D]iscovery.  Such collective measures, drawn from a wide variety of scientific and management disciplines, are intended only as an entry-point for further discussion, rather than an all-inclusive checklist or cookie-cutter solution to all e-[D]iscovery issues.[12]

 [9]        Indeed, more recent case law has recognized the need for quality control, including through the use of greater sampling, iterative methods, and phased productions in line with principles of proportionality.[13]  Still other case law has emphasized the need for cooperation among parties in litigation on technical subjects, especially at the margins of, or outside the range of, lawyer expertise if not basic competence.
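The role of sampling in such quality-control protocols can be made concrete with a minimal sketch.  All of the numbers below are hypothetical, and any real protocol would be negotiated by the parties; the sketch simply estimates the “elusion” rate (the fraction of relevant documents left behind in the culled set) from a simple random sample, with a normal-approximation confidence interval:

```python
import math

def elusion_estimate(sample_size, relevant_in_sample, z=1.96):
    """Estimate the elusion rate -- the fraction of relevant documents
    remaining in the set culled from production -- from a simple random
    sample, with a normal-approximation 95% confidence interval."""
    p = relevant_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical: a 1,500-document random sample drawn from the culled set
# turns up 12 documents that reviewers judge relevant after all.
point, low, high = elusion_estimate(1500, 12)
print(f"elusion ~ {point:.2%} (95% CI {low:.2%} to {high:.2%})")
```

A requesting and responding party might, for example, agree in advance that a production passes muster if the upper bound of such an interval falls below a negotiated threshold, which is one way the proportionality principles noted above can be given operational content.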

[10]      Active or supervised “machine learning,” as used here in the context of e-Discovery, refers to a set of analytical tools and techniques that go by a variety of names, such as “predictive coding,” “computer-assisted review,” and “technology-assisted review.”  As explained in one helpful recent monograph:

Predictive coding is the process of using a smaller set of manually reviewed and coded documents as examples to build a computer-generated mathematical model that is then used to predict the coding on a larger set of documents.  It is a specialized application of a class of techniques referred to as supervised machine learning in computer science.  Other technical terms often used to describe predictive coding include document (or text) “classification” and document (or text) “categorization.”[14]

 [11]      And as stated in The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (Post-Public Comment Version 2013):

Generally put, computer- or technology-assisted approaches are based on iterative processes where one (or more) attorneys or [Information Retrieval] experts train the software, using document exemplars, to differentiate between relevant and non-relevant documents.  In most cases, these technologies are combined with statistical and quality assurance features that assess the quality of the results.  The research . . . has demonstrated such techniques superior, in most cases, to traditional keyword based search, and, even, in some cases, to human review.

 The computer- or technology-assisted review paradigm is the joint product of human expertise (usually an attorney or IR expert working in concert with case attorneys) and technology.  The quality of the application’s output, which is an assessment or ranking of the relevance of each document in the collection, is highly dependent on the quality of the input, that is, the human training. Best practices focus on the utilization of informed, experienced, and reliable individuals training the system.  These individuals work in close consultation with the legal team handling the matter, for engineering the application. Similarly . . . the defensibility and usability of computer- or technology-assisted review tools require the application of statistically-valid approaches to selection of a “seed” or “training” set of documents, monitoring of the training process, sampling, and quantification and verification of the results.[15] 

A discussion of the mathematical algorithms that underlie predictive coding is beyond the intended scope of this Article, but the interested reader should refer to references cited at the margin to understand better what is “going on under the hood” with respect to the mathematics involved.[16]
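That said, the basic mechanics can be suggested with a toy sketch.  The following illustration, in which every document and label is invented, trains one simple member of the supervised-learning family (a multinomial Naive Bayes classifier) on a hand-coded “seed set” and uses it to rank unreviewed documents; commercial predictive-coding tools rely on considerably more sophisticated algorithms:

```python
import math
from collections import Counter

# Hypothetical seed set: documents hand-coded by reviewers as
# relevant (1) or non-relevant (0).
seed = [
    ("quarterly kickback payment wired to the supplier", 1),
    ("rebate scheme routed through the sales office", 1),
    ("lunch menu for the holiday party", 0),
    ("reminder to submit travel receipts", 0),
]

def train(seed_set):
    """Fit a multinomial Naive Bayes model with Laplace smoothing."""
    counts = {0: Counter(), 1: Counter()}
    docs = {0: 0, 1: 0}
    for text, label in seed_set:
        docs[label] += 1
        counts[label].update(text.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab

def relevance_score(model, text):
    """Log-odds that a document is relevant, given the trained model."""
    counts, docs, vocab = model
    log_odds = math.log(docs[1] / docs[0])
    for word in text.lower().split():
        p_rel = (counts[1][word] + 1) / (sum(counts[1].values()) + len(vocab))
        p_non = (counts[0][word] + 1) / (sum(counts[0].values()) + len(vocab))
        log_odds += math.log(p_rel / p_non)
    return log_odds

model = train(seed)
unreviewed = ["wire the kickback before the audit",
              "book the party room for lunch"]
# Rank the unreviewed documents from most to least likely relevant.
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, d),
                reverse=True)
```

In an iterative protocol of the kind at issue in Da Silva Moore, reviewers would confirm or correct the top-ranked predictions and feed the corrected labels into the next training round.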

[12]      The Da Silva Moore Precedent.  The various threads in search and retrieval law, including the need for advanced search methods applied to document review in a world of increasingly large data sets, were well known by 2012.  In February 2012, drawing on recent research and scholarship emanating out of the Text Retrieval Conference (TREC) Legal Track[17] and the 2007 public comment version of The Sedona Conference Search Commentary,[18] Judge Peck approached the Da Silva Moore case as an appropriate vehicle to provide a judicial blessing for the use of predictive coding in e-Discovery.  In doing so, however, Judge Peck’s opinion may also be viewed as setting the stage for greater use of analytics generally in the information governance practice area, beyond “mere” e-Discovery.

[13]      Plaintiffs in Da Silva Moore brought claims of gender discrimination against defendant advertising conglomerate Publicis Groupe and its United States public relations subsidiary, defendant MSL Group.[19]  Prior to the February 2012 opinion issued by Judge Peck, the parties had already agreed that defendant MSL would use predictive coding to review and produce relevant documents, but disagreed on methodology.[20]  Defendant MSL proposed starting with the manual review of a random sample of documents to create a “seed set” of documents that would be used to train the predictive coding software.[21]  Plaintiffs would participate in the creation of the “seed set” of documents by offering keywords.[22] All documents reviewed during the creation of the “seed set,” relevant or irrelevant, would be provided to plaintiffs.[23]

[14]      After creation of the seed set of documents, MSL proposed using a series of “iterative rounds” to test and stabilize the training software.[24]  The results of these iterative rounds would be provided to plaintiffs, who would be able to provide feedback to further refine the searches.[25]   Judge Peck accepted MSL’s proposal.[26]  Plaintiffs filed objections with the district judge on the grounds that Judge Peck’s approval of MSL’s protocol unlawfully disposed of MSL’s duty under Federal Rule of Civil Procedure 26(g) to certify the completeness of its document collection, and the methodology in MSL’s protocol was not sufficiently reliable to satisfy Federal Rule of Evidence 702 and Daubert.[27]

[15]      Judge Peck found the plaintiffs’ objections to be misplaced and irrelevant.[28]  With respect to Federal Rule of Civil Procedure 26(g), Judge Peck commented that no attorney could certify the completeness of a document production as large as MSL’s.  Moreover, Federal Rule of Civil Procedure 26(g) did not require the type of certification plaintiffs described.[29]  Further, Federal Rule of Evidence 702 and Daubert are applicable to expert methodology, not to methodologies used in electronic discovery.[30]  Judge Peck went on to note that the decision to allow computer-assisted review in this case was easy because the parties agreed to this method of document collection and review.[31]  While computer-assisted review may not be a perfect system, he found it to be more efficient and effective than using manual review and keyword searches to locate responsive documents.[32]  Use of predictive coding was appropriate in this case considering:

 (1) the parties’ agreement, (2) the vast amount of ESI to be reviewed (over three million documents), (3) the superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches), (4) the need for cost effectiveness and proportionality under Rule 26(b)(2)(C), and (5) the transparent process proposed by MSL.[33]

 [16]      In issuing this opinion, Judge Peck became the first judge to approve the use of computer-assisted review.[34]  He also stressed the limitations of his opinion, stating that computer-assisted review may not be appropriate in all cases, and his opinion was not intended to endorse any particular computer-assisted review method.[35]  However, Judge Peck encouraged the Bar to consider computer-assisted review as an available tool for “large-data-volume cases” where use of such methods could save significant amounts of legal fees.[36]  Judge Peck also stressed the importance of cooperation, or what he called “strategic proactive disclosure of information.”  If counsel is knowledgeable about the client’s key custodians and fully explains proposed search methods to opposing counsel and the court, those proposed search methods are more likely to be approved.  To sum up his opinion, Judge Peck noted that “[c]ounsel no longer have to worry about being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted review. . . . Computer-assisted review now can be considered judicially-approved for use in appropriate cases.”[37]  In the two years since Da Silva Moore, in addition to cases in which the parties have agreed upon a predictive coding methodology,[38] courts have confronted the issue of having to rule on either the requesting or responding party’s motion to compel a judicial “blessing” of the use of predictive coding (however termed).  
In Global Aerospace,[39] the responding party asked that the court approve its own use of such a technique; in Kleen Products, the requesting party made an ultimately unsuccessful demand for a “do-over” in discovery, where the responding party had used keyword search methods and the plaintiffs were demanding that more advanced methods be tried.[40]  In the EORHB case, the court sua sponte suggested that the parties consider using predictive coding, including using the same vendor.[41]  And in the In re Biomet case,[42] the court approved a predictive coding methodology over the objections of the requesting party.  These cases represent only some of the reported decisions to date, and we suspect that there will be dozens of reported cases and many more unreported ones in the near term.

[17]      As recognized in these cases (implicitly or explicitly), as well as in a growing number of commentaries,[43] predictive coding is an analytical technique holding the promise of achieving much greater efficiencies in the e-Discovery process.  Notwithstanding Da Silva Moore’s call to action, it needs to be conceded, however, that the research has not proven that active machine learning techniques will always achieve greater scores than keyword search or manual review.[44]  Additionally, we bow to the reality that in a large class of cases the use of predictive coding is currently infeasible or unwarranted, especially as a matter of cost.[45]

[18]      Nevertheless, it seems apparent that the legal profession finds itself in a new place—namely, in need of recognizing that artificial intelligence techniques are growing in strength from year to year—and thus it appears to be only a matter of time until a much greater percentage of complex cases involving a large magnitude of ESI will constitute good candidates for lawyers using predictive coding techniques, both as available currently and as improved with future technological progress.  As William Gibson once put it, “the future is here, it’s just not evenly distributed.”[46]

 B.  Information Governance and Analytics in the Era of Big Data

[19]      We are now in a post-Da Silva Moore, “Big data” era where lawyers are on constructive (if not actual) notice of a world of technology-assisted review techniques available at least in the sphere of e-Discovery.  The proposition being advanced is that the greater revelation of Da Silva Moore is how readily the techniques being put forward as best practices in e-Discovery fit a larger realm of issues familiar to lawyers, many of which fall within what is increasingly being recognized as “information governance” practice.  It is here where we can break new ground in our legal practice by recommending the use of these advanced techniques to solve real-world problems of our clients.  First, however, some definitions are in order to better frame the legal issues that will follow in Section C.

[20]      Big data.  It has been noted that “Big data is a loosely defined term used to describe data sets so large and complex that they become awkward to work with using standard statistical software.”[47]  Alternatively, “Big data” is a term that “describe[s] the technologies and techniques used to capture and utilize the exponentially increasing streams of data with the goal of bringing enterprise-wide visibility and insights to make rapid critical decisions.”[48]

[21]      The fact that the data encountered within the corporate enterprise increasingly is indeed “big” means, at least according to Gartner, that it not only has volume, but velocity and complexity as well.[49]  As Bill Franks has put it, “What this means is that you aren’t just getting a lot of data when you work with big data.  It’s also coming at you fast, it’s coming at you in complex formats, and it’s coming at you from a variety of sources.”[50]  These elements all significantly contribute to the challenge of finding signals in the noise.

[22]      These definitions seem to get us closer to what makes Big data a new and interesting phenomenon in the world: it is not its volume alone, but the fact that we are able to “mine” large data sets using new and advanced techniques to uncover unexpected relationships, patterns and categories within these data sets, that makes the field potentially exciting.  Indeed, “it is tempting to understand big data solely in terms of size. But that would be misleading. Big data is also characterized by the ability to render into data many aspects of the world that have never been quantified before; call it ‘datafication.’”[51]

[23]      Analytics.  Second, we need to place “predictive coding” as one form of active machine learning in the context of the broader realm of “analytics.”  In their book, Keeping Up With the Quants: Your Guide To Understanding and Using Analytics,[52] authors Thomas Davenport and Jinho Kim provide a useful construct in categorizing the newly emergent field of “analytics”: they define analytics to mean “the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and add value,” going on to say that “[a]nalytics is all about making sense of big data, and using it for competitive advantage.”  The authors divide the world of analytics into three categories:

(i) descriptive analytics – gathering, organizing, tabulating, and depicting data;

(ii) predictive analytics – using data to predict future courses of action; and

(iii) prescriptive analytics – recommendations on future courses of action.[53]

[24]      To the extent that “predictive coding” has been used to date to have machines “predict” relevancy in large ESI data sets, the term comfortably can be said to fall within category (ii).   But the world of analytics is a larger universe, encompassing a greater number of mathematical magic tricks,[54] and this should be kept in mind as we choose to limit our discussion here to a few examples of how predictive coding as one form of analytics may be usefully applied in non-traditional contexts.[55]

 [25]      Corporations (much ahead of the legal profession) have rushed headlong during the past half-decade to use a variety of analytics to understand the Big data they increasingly hold, to add value, and to improve the bottom line.[56]  A 2013 AIIM study indicates that corporations find analytics to be useful in a variety of settings.[57]

[26]      Information Governance.  “Information governance,” as defined in The Sedona Conference’s recently published Commentary on the subject, means:

 an organization’s coordinated, interdisciplinary approach to satisfying information legal and compliance requirements and managing information risks while optimizing information value.  As such, Information Governance encompasses and reconciles the various legal and compliance requirements and risks addressed by different information focused disciplines, such as records and information management (“RIM”), data privacy, information security, and e-[D]iscovery.[58]

Or, as highlighted by the seminal law review article devoted to information governance, written by Charles R. Ragan, who quotes Barclay Blair in defining information governance as a “new approach” that “builds upon and adapts disciplines like records management and retention, archiving, business analytics, and IT governance to create an integrated model for harnessing and controlling enterprise information . . . [I]t is an evolutionary model that requires organizations to make real changes.”[59]

[27]      As the Sedona IG Commentary highlights, “many organizations have traditionally used siloed approaches when managing information.”[60]  The “core shortcoming” of this approach is “that those within particular silos are constrained by the culture, knowledge, and short-term goals of their business unit, administrative function, or discipline.”[61]  This leads in turn to key actors within the organization having “no knowledge of gaps and overlaps in technology or information in relation to other silos. . . .”[62]  In such situations, “[t]here is no overall governance or coordination for managing information as an asset, and there is no roadmap for the current and future use of information technology.”[63]

[28]      The Sedona IG Commentary goes on to provide eleven principles of what constitutes good IG practices, of which Principle 10 is of special relevance to our discussion here: “An organization should consider leveraging the power of new technologies in its Information Governance program.”[64]  As stated therein,

Organizations should consider using advanced tools and technologies to perform various types of categorization and classification activities . . . such as machine learning, auto-categorization, and predictive analytics to perform multiple purposes, including (i) optimizing the governance of information for traditional RIM [records and information management]; (ii) providing more efficient and more efficacious means of accessing information for e-discovery, compliance, and open records laws; and (iii) advancing sophisticated business intelligence across the enterprise.[65]

 With respect to the latter category, the Commentary goes on to specifically identify areas where predictive analytics may be used in compliance programs “to predict and prevent wrongful or negligent conduct that might result in data breach or loss,” as a type of “early warning system.”[66] It is precisely this latter type of conduct that we wish to primarily explore in the next section, along with a few final words on using analytics with auto-categorization for the purpose of records classification and data remediation.

C.  Applying the Lessons of E-Discovery In Using Analytics for Optimal Information Governance: Some Examples

[29]      Advanced analytics are increasingly being used in the e-Discovery context because the legal profession has begun to realize the limitations of manual and keyword searching, while at the same time seeing how advanced techniques are at least as efficacious and far more efficient in a wide variety of substantial engagements.  But more efficient and at least as effective at doing what, precisely?  In e-Discovery, the primary information task involves separating relevant from non-relevant, and to a secondary degree, privileged from non-privileged information, in documents and ESI.  Indeed, lawyers are under a duty to make “reasonable”—not perfect—efforts to find all relevant documents within the scope of a given discovery request.[67]  The elusiveness of this quest in an exponentially expanding data universe is becoming increasingly apparent to many.[68]

[30]      Moreover, the degree of success in being able to either find or demand substantial amounts of relevant information is not (nor should it be) the fundamental goal or point of engaging in e-Discovery.[69]  Rather, the liberal discovery rules that at least U.S. lawyers operate within have as their underlying purpose the ferreting out of facts important and material to the case at hand.  The increasingly overwhelming nature of ESI poses clear technological obstacles for a lawyer seeking to efficiently develop facts from all those relevant documents to determine what happened and why.[70]  The promise of using an advanced analytical method such as predictive coding is its ability to quickly find and rank-order the most relevant documents for answering these questions.  For once we determine how something happened and why, it is relatively straightforward to figure out the parties’ respective rights, responsibilities, and even liability.  That is precisely the point of litigation, and the purpose of the Rules that govern it.[71]  And facts drive it all.

[31]      Given our increasing ability in litigation to find the most relevant needles (i.e., facts) in the Big data haystack, it is worth considering whether similar methods may be successfully applied in non-litigation contexts.  Somewhat paradoxically, however, experience indicates that there are advantages to dealing with larger volumes of data when applying analytical tools and methods to solve corporate legal issues.  That is, while a vast amount of data residing in corporate networks and repositories admittedly poses complex information governance challenges, the volume of Big data also may be a boon to the investigator simply trying to figure out what happened.  This is the case because there are simply many more data points from which to derive facts.  One can liken the phenomenon to the difference in quality between a one-megapixel and a ten-megapixel picture: the difference in the quality of the image is a function of the greater density of points of illumination.

[32]      Big data is more data, and more data means the potential for a more complete picture of what happened in a given situation of interest, assuming of course that the facts can be captured efficiently.  The problem is not one of volume, but of visibility.  In the era of Big data, the investigator with the more powerful analytical methods, who can search into vast repositories of ESI to draw out the facts that are critical to the question at hand, is king (or queen).  This is where the skillful application of advanced analytics to Big data can bring about some remarkable results.  The true strategic advantage of advanced analytics is the speed with which an accurate answer can be ascertained.[72]

[33]      True Life Example #1.[73]  A corporate client was sued by a former employee in a whistleblower qui tam action.[74]  Because of the False Claims Act allegations, the suit represented a significant threat to the company.  The corporation retained counsel to understand the client’s information systems as well as its key players, and to assist in the implementation of a litigation hold.  Counsel strategically targeted the data most likely to shed light on the facts.  The law firm’s Fact Development Team applied advanced analytics to 675,000 documents, and within four days knew enough to defend the client’s position that the allegations were indisputably baseless.  All of this was done before the answer to the Complaint was due.

[34]      Armed with this information, counsel for the corporation approached plaintiff’s counsel and asked to meet.  Prior to the meeting, the corporation voluntarily produced 12,500 documents that laid out the parties’ position precisely.  Counsel then met with plaintiff’s counsel and walked them through the evidence, laying out all the facts.  The case ended up being settled within days for what amounted to nuisance value based on a retaliation claim—without any discovery, and at a small fraction of the cost budgeted for the litigation.

[35]      This example indicates that the real power of advanced analytics is not merely in potentially reducing the cost of vexatious litigation, but rather the strategic advantage that comes with counsel getting to an answer quickly and accurately.  This precise strategic advantage has many applications outside of litigation, each of which involves an aspect of optimizing information governance.

[36]      Only a short step away from the direct litigation realm is using advanced analytics for investigations, either in response to a regulatory inquiry or for purely internal purposes.  As we have already seen, corporate clients are often faced with circumstances where determining whether an allegation is true, and the scope of the potential problem if it is, is critically important.  Often, management must wait, unsure of their company’s exposure and how to remediate it, while traditional investigation techniques crawl along.  However, with the skillful application of advanced analytics upon the right data set, accurate answers can be determined with remarkable speed.

[37]      True Life Example #2.  A highly regulated manufacturing client decided to outsource the function of safety testing some of its products.  A director of the department whose function was being outsourced was offered a generous severance package.  Late on a Friday afternoon, the soon-to-be former director sent an email to the company’s CEO demanding four times the severance amount and threatened to go to the company’s regulator with a list of ten supposed major violations that he described in the email if he did not receive what he was asking for.  He gave the company until the following Monday to respond.

[38]      The lawyers were called in.  They analyzed the list of allegations and determined which IT systems would most likely contain data that would prove their veracity and immediately pulled the data.  Applying advanced analytics, the law firm’s Fact Development Team analyzed on the order of 275,000 documents in thirty-six hours.  By that Monday morning, counsel was able to present a report to the company’s board indisputably proving that the allegations were unfounded.

[39]      True Life Example #3.  A major company received a whistleblower letter from a reputable third party alleging that several senior personnel were involved with an elaborate kickback scheme that also involved FCPA violations.  If true, the company would have faced serious regulatory and legal issues, as well as major internal difficulties.  Because of the extremely sensitive nature of the allegations, a traditional investigation was not possible; even knowing certain personnel were under investigation could have had immense consequences.

[40]      The lawyers were tasked with determining whether there was any information within the company’s possession that shed any light on the allegations.  If there were, the company would proceed to take whatever steps were required.  The investigation was of such a secret nature that no one was authorized to involve the internal IT staff.  Fortunately, counsel knew the company and its information systems well.  Over a weekend, they were able to pull 8.5 million documents from relevant systems using the law firm’s personnel.  This turned out to be a highly complex investigation involving a number of potential subjects, where the task involved tracking the subject’s travel, meetings with suppliers, subsequent sales orders and fulfillments, rebates and promotions, all across several years.

[41]      Again, applying advanced analytics, the law firm’s Fact Development Team analyzed the 8.5 million documents in ten days.  They were able to prove that the allegations were largely baseless, and to pinpoint precisely where there were potential areas of concern.  Counsel also was able to make clear recommendations for areas of further investigation and for modifying compliance tracking and programs.  The company was able to act quickly and with certainty.  These real-life use cases illustrate how the power of analytics enhances the ability of lawyers to provide legal advice under conditions of “certainty” previously unobtainable, at least in the past few decades of the digital era.  “Certainty” is a somewhat foreign concept in the law—lawyers tend to be a conservative and caveating bunch, largely because certainty has historically been hard to come by, or at least prohibitively expensive.  With advanced analytics and good lawyers who know how to use these new tools, that is no longer necessarily the case.  There is so much data that if one cannot, after a reasonable effort, find evidence of a fact in the vastness of a company’s electronic information (provided one has the right information), the fact most likely is not true.  As these examples illustrate, proving a negative is particularly useful in investigations.

[42]      Using advanced analytics (and good lawyering) for investigations is not that far removed from using it for litigation: one is still attempting to find the answer to the question of what happened and why. But there are many other questions that companies would like to ask of their data.  And indeed, both the analytics tools and the fact development techniques used in litigation and investigations can be “tuned” to solve a variety of novel issues facing our clients.

[43]      For example, analytics can be used to vet candidates for political appointments as well as candidates for senior leadership positions.  Because email is a candid medium, access to corporate email coupled with analytic capabilities allows an accurate picture to be drawn before a decision is made about making a candidate your next CEO or running mate.  Analytics can be used to analyze business divisions to identify good and bad leaders, how decisions are made, why one division is more successful than another, and many more similar applications.

[44]      Quite simply, a company’s data is the digital imprint of the actions and decisions of all of its managers and employees.  Having insight into those actions and decisions can be immensely valuable.  That value has lain largely fallow, hidden in plain sight because the valuable wheat could not effectively be sifted from the chaff.  With the proper application of advanced analytics, that is no longer the case.  The answers we can obtain are limited only by the creativity of management in asking the right questions.

[45]      True Life Example #4.  Advanced analytics were used in connection with a corporate client’s major acquisition of another company.  As with most acquisitions, the client undertook traditional due diligence, gathering information from the target regarding its financial performance, customers, market share, receivables, and potential liabilities, and came up with a valuation, an appropriate multiplier, and a final purchase price.  Also as is typical, the acquisition agreement contained a provision such that if the disclosures made by the target were found to be off by a certain margin within thirty days of the acquisition, the purchase price would be adjusted.

[46]      The moment the acquisition closed, the corporate client then owned all of the target’s information systems.  Having some concern about the bases for some of the target’s disclosures, at the client’s request counsel proceeded to use analytics on those newly acquired systems to determine what could be learned about those disclosures.  Preparing a company for sale is a complicated affair, with many people involved in gathering information to present to the acquirer to satisfy due diligence.  This gathering and presentation of information is done primarily through electronic means—and leaves a trail.

[47]      Using advanced analytics, the law firm’s Fact Development Team traced the compilation of the target’s due diligence information, including all of the discussion that went along with it.  They were able to understand the source of each disclosure, the reasonableness of its basis, and any weaknesses within it.  They uncovered disagreements within the target over such things as what the right numbers were, or how much of a liability to disclose.  Using this information, counsel prepared a claim in accord with the adjustment provision seeking twenty-five percent of the purchase price, totaling millions of dollars.  The claim was composed primarily of quotes from the target’s own documents.  It is difficult to argue with yourself.

[48]      As demonstrated, using advanced analytics in the form of predictive coding and similar technologies can accomplish some notable aims.  But each of the prior examples uses data to look back to determine what has already occurred: the descriptive use of analytics.[75]  This is extremely valuable.  But for many of a law firm’s clients, it would be even more useful to be able to catch bad actors while the misconduct was occurring, or even to predict misconduct before it happens.

[49]      Based on the anecdotal experience gathered from many past investigations, the authors believe that certain kinds of misconduct follow certain patterns, and that when bad actors are acting badly, they tend to undertake the same kinds of actions, or are experiencing similar circumstances.  For example, in our experience the primary factors that pertain to a person committing fraud are personal relationship problems, financial difficulties, drug or alcohol problems, gambling, a feeling of underappreciation at work, and unreasonable pressure to achieve a work outcome without a legitimate way to accomplish it (and so they resort to illegitimate ones).  These factors are often detectable in the electronic information the subject creates.  Similarly, a person who is harassing or discriminating against others also tends to undertake specific actions and use particular language in communications.  All of these indicia of misconduct are detectable using advanced analytics and skillful strategy.

[50]      Lawyers have gotten quite good at finding this information when looking back in time.  We thought, then, that it should not be too difficult to find this information while the misconduct is unfolding, or to identify warning signs that misconduct is likely to occur, and then to alleviate certain factors where possible or take corrective action when needed and as early as possible.  So, we put this to the test, developing Early Warning Systems (“EWS”) for some of our clients.

[51]      The idea for an EWS first occurred to one of the authors when working on a pro bono matter with the ACLU in a case against the Baltimore Police Department (“BPD”) alleging unconstitutional arrest practices in its Zero Tolerance Policing policies.[76]  As a result of the case, the BPD agreed to, among other things, implement a tracking system whereby certain data points were collected regarding police officer conduct and arrest practices that research had proven were warning signs of potential problem officers.[77]  The accumulation of certain data points with respect to an officer triggered a review of the officer’s conduct, with various remediation outcomes.[78]  We thought that a similar approach could be used for our clients.

[52]      An EWS is a tricky thing to implement, and requires careful consideration of many factors, with employee privacy at the forefront.  However, with careful planning, policy development, and training, an effective EWS can be designed and implemented.  Predictive analytics applications can be trained to search for indicia of the relevant conduct, language, or factors across information systems.  The specific systems to be targeted will vary depending on what is being sought and the systems most likely to contain it, and will vary greatly from company to company.  But, when properly trained and targeted, we have found these systems to be very effective in detecting and even preventing misconduct.  We believe that this use of predictive analytics will become one of the most powerful applications of this technology in the near future.
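By way of illustration only, a crude first pass at the detection step might score communications against indicator patterns of the kind described above.  The categories, phrases, and threshold below are invented for this sketch; a production EWS would rely on predictive models trained on exemplars, not a fixed pattern list:

```python
import re

# Hypothetical indicator phrases keyed to the risk factors discussed
# above (financial stress, undue pressure, concealment); illustrative only.
INDICATORS = {
    "financial_stress": [r"\bdrowning in debt\b", r"\bcan'?t make payments\b"],
    "pressure": [r"\bhit the number\b", r"\bno matter what\b"],
    "concealment": [r"\boff the books\b", r"\bdelete this (email|thread)\b"],
}

def score_message(text):
    """Return the set of indicator categories a message matches."""
    lowered = text.lower()
    return {cat for cat, patterns in INDICATORS.items()
            if any(re.search(p, lowered) for p in patterns)}

def flag_for_review(messages, threshold=2):
    """Flag messages matching at least `threshold` distinct categories,
    so a single stray phrase does not by itself trigger a review."""
    return [(msg, score_message(msg)) for msg in messages
            if len(score_message(msg)) >= threshold]
```

The threshold reflects the accumulation-of-data-points approach borrowed from the BPD tracking system: no single indicator triggers review, but a combination does.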

[53]      Moving from the business intelligence aspects of information governance to the arguably more prosaic field of records and information management (“RIM”), the authors also count themselves as true believers in the power of analytics to optimize traditional RIM functionality.  A full discussion of archival and records management practices in the digital age is beyond the scope of this Article, but the interested reader will find a wealth of scholarly literature in the leading journals discussing how the traditional practice of records management is being transformed in the digital age.  One of the authors has argued that predictive coding and like methods are the most promising way to open up “dark archives” in the public sector, such as digital collections of data appraised as permanent records (mostly consisting of White House email at this point), that for reasons of privacy or privilege will be otherwise inaccessible to the public for many decades to come.[79]

[54]      In the authors’ experience, email archiving using auto-categorization for recordkeeping purposes is available using existing software in the marketplace.  In such instances, email is populated in specific “buckets” in a repository depending on how it is characterized, based on the position of the creator or recipient of the email, the subject matter, or some other attribute appearing as metadata.[80]  In the most advanced versions of auto-categorization software, the system “learns” as it is trained using exemplars in a seed set selected by subject matter experts (i.e., records managers or expert end users), via a protocol highly reminiscent of the methods adopted by the parties in Da Silva Moore and similar cases.  It is only a matter of time before predictive analytics is more widely used to optimize auto-classification while reducing the burden on end users to perform manual records management functions.[81]
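The seed-set training loop described above can be sketched in miniature.  The bucket names, exemplars, and simple term-overlap scoring here are illustrative assumptions standing in for a commercial auto-categorization engine:

```python
from collections import Counter

def tokenize(text):
    """Lowercase and strip basic punctuation from each word."""
    return [w.strip(".,!?").lower() for w in text.split()]

def train(seed_set):
    """Build a term-frequency profile per records 'bucket' from
    exemplar emails labeled by subject-matter experts."""
    profiles = {}
    for label, docs in seed_set.items():
        counts = Counter()
        for doc in docs:
            counts.update(tokenize(doc))
        profiles[label] = counts
    return profiles

def categorize(profiles, email):
    """Assign the email to the bucket whose profile shares the most terms."""
    tokens = tokenize(email)
    def overlap(counts):
        return sum(counts[t] for t in tokens)  # Counter returns 0 for misses
    return max(profiles, key=lambda label: overlap(profiles[label]))
```

Commercial systems use far richer features and iterative expert feedback, but the shape is the same: experts label a seed set, the system generalizes from it.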

[55]      In similar fashion, the power of predictive analytics to reliably classify content after adequate training makes such tools optimal for data remediation efforts.  The problem of legacy data in corporations is well known, and only growing over time with the inflationary expansion of the ESI universe.[82]  By using advanced analytics to classify low-value data, the chaos that is the reality of most shared drives and other joint data repositories may be reduced by orders of magnitude.  The challenge of engaging in defensible deletion is one important aspect of optimizing information governance.[83]



[56]      As was made clear at the outset, it is the authors’ intent merely to scratch the surface of what is possible in the analytics space as applied to matters of importance for corporate information governance.  No one has a one hundred percent reliable crystal ball, but it seems evident that as computing power increases, those forms of artificial intelligence that we have referred to here as analytics will themselves only grow in importance in both our daily and professional lives.  By the end of this decade, we would be surprised if the following do not occur: pervasive use of business intelligence software; the use of more automated decision-making (also known as “operational business intelligence”); the use of alerts in the form of early warning systems including the type described above; and much greater use of text mining and predictive technologies across a variety of domains.[84]

[57]      All of these developments dovetail with the expected demand on the part of corporate clients for lawyers to be familiar with state of the art practices in the information governance space, as already anticipated by the type of technology that Da Silva Moore and related cases suggest.  As best said in The Sedona Commentary on Achieving Quality in E-Discovery, “[i]n the end, cost-conscious firms, organizations, and institutions of all types that are intent on best practices . . . will demand that parties undertake new ways of thinking about how to solve e-[D]iscovery problems. . . .” [85]  The same holds true for the greater playing field of information governance.  Lawyers who have embraced analytics will have a leg up on their competition in this brave new space.


* Mr. Borden is a partner in the Commercial Litigation section at Drinker Biddle & Reath, LLP, Washington, D.C., where he serves as Chair of the Information Governance and e-Discovery Group.  He is Co-Chair of the Cloud Computing Committee and Vice Chair of the e-Discovery and Digital Evidence Committee of the Science and Technology Law Section of the ABA.  He is also a founding member of the steering committee for the Electronic Discovery Section of the District of Columbia Bar.  B.A., with highest honors, George Mason University; J.D., cum laude, Georgetown University Law School.

** Mr. Baron serves as Of Counsel in the Information Governance and e-Discovery Group, Drinker Biddle & Reath, LLP, Washington, D.C., and is on the Adjunct Faculty at the University of Maryland.  He formerly served as Director of Litigation at the National Archives and Records Administration, and is a former steering committee Co-Chair of The Sedona Conference Working Group 1 on Electronic Document Retention and Production.  B.A., magna cum laude, Wesleyan University; J.D., Boston University School of Law.  The authors wish to thank Drinker Biddle & Reath associates Amy Frenzen and Nicholas Feltham for their assistance in the drafting of this article.  The views expressed are the authors’ own and do not necessarily reflect the views of any institution, public or private, that they are affiliated with.


[1] See infra text accompanying notes 47-49 for a definition.

[2] Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 192 (S.D.N.Y. 2012), aff’d sub nom. Moore v. Publicis Groupe SA, 2012 U.S. Dist LEXIS 58742 (S.D.N.Y. Apr. 26, 2012) (Carter, J.).

[3] We recognize the paradox of articles living “forever” on the Internet, especially when published in online journals such as this one, while at the same time ever more rapidly becoming obsolete and out of date. 

[4] See generally The Sedona Conference, The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery, 8 Sedona Conf. J. 189, 198 (2007) [hereinafter Sedona Search Commentary].

[5] Thomas H. Davenport & Jinho Kim, Keeping Up with the Quants: Your Guide To Understanding and Using Analytics 1-2 (2013).

[6] See Sedona Search Commentary, supra note 4, at 200-201 nn.16-19.

[7] See, e.g., George L. Paul & Jason R. Baron, Information Inflation: Can The Legal System Adapt?, 13 Rich. J.L. & Tech. 10, ¶ 2 (2007),

[8] Id.; see Sedona Search Commentary, supra note 4, at 201-202; Mia Mazza, Emmalena K. Quesada, & Ashley L. Stenberg, In Pursuit of FRCP1: Creative Approaches to Cutting and Shifting Costs of Discovery of Electronically Stored Information, 13 Rich. J.L. & Tech. 11, ¶ 46 (2007),

[9] See Victor Stanley v. Creative Pipe, 250 F.R.D. 251, 256-57 (D. Md. 2008); see also United States v. O’Keefe, 537 F. Supp. 2d 14, 23-24 (D.D.C. 2008); William A. Gross Const. Ass’n v Am. Mfrs. Mut. Ins. Co., 256 F.R.D. 134, 135 (S.D.N.Y. 2009); Equity Analytics, LLC v. Lundin, 248 F.R.D. 331, 333 (D.D.C. 2008); In re Seroquel Prod. Liab. Litig., 244 F.R.D. 650, 663 (M.D. Fla. 2007).  See generally Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search, 17 Rich. J.L. & Tech. 9, ¶ 11 n.38 (2011),

[10] See Paul & Baron, supra note 7, at ¶ 20; see also Bennett B. Borden, The Demise of Linear Review, Williams Mullen E-Discovery Alert, Oct. 2010, at 1,

[11] The Sedona Conference, The Sedona Conference Commentary on Achieving Quality in e-Discovery 1 (Post-Public Comment Version 2013), available at (for publication 15 Sedona Conf. J. ___ (2014) (forthcoming)).


[13] See, e.g., William A. Gross Constr., 256 F.R.D. at 136; Seroquel, 244 F.R.D. at 662.  See generally Bennett B. Borden et al., Four Years Later: How the 2006 Amendments to the Federal Rules Have Reshaped the E-Discovery Landscape and Are Revitalizing the Civil Justice System, 17 Rich. J.L. & Tech. 10, ¶¶ 30-37 (2011),; Ralph C. Losey, Predictive Coding and the Proportionality Doctrine: A Marriage Made in Big Data, 26 Regent U. L. Rev. 7, 53 n.189 (2013) (collecting cases on proportionality).

[14] Rajiv Maheshwari, Predictive Coding Guru’s Guide 21 (2013); see also Baron, supra note 9, at ¶ 32, n.124 (stating predictive coding and other like terminology as used by e-Discovery vendors); Maura R. Grossman & Gordon V. Cormack, The Grossman-Cormack Glossary of Technology-Assisted Review, 7 Fed. Cts. L. Rev. 1, 4 (2013),; Nicholas M. Pace & Laura Zakaras, Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery, RAND Institute for Civil Justice 59 (2012), available at (defining predictive coding).

[15] The Sedona Conference, The Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (Post-Public Comment Version 2013), available at (for publication in 15 Sedona Conf. J. ___ (2014)).  For an excellent, in-depth discussion of how a practitioner may use predictive coding in e-Discovery, with references to experiments by the author, see Losey, supra note 13, at 9.

[16] See, e.g., Sedona Search Commentary, supra note 4, at app. 217-223 (describing various search methods); Douglas W. Oard & William Webber, Information Retrieval for E-Discovery, 7 Foundations and Trends in Information Retrieval 100 (2013), available at; Jason R. Baron & Jesse B. Freeman, Cooperation, Transparency, and the Rise of Support Vector Machines in E-Discovery: Issues Raised By the Need to Classify Documents as Either Responsive or Nonresponsive (2013),  For good resources in the form of information retrieval textbooks, see Gary Miner, et al., Practical Text Mining and Statistical Structured Text Data Applications (Elsevier: Amsterdam) (2012); Christopher D. Manning, Prabhakar Raghavan, & Hinrich Schutze, Introduction to Information Retrieval  (2008).

[17] See TREC Legal Track, U. Md., (last visited Feb. 23, 2014) (collecting Overview reports from 2006-2011) (as explained on its home page, “[t]he goal of the Legal Track at the Text Retrieval Conference (TREC) [was] to assess the ability of information retrieval techniques to meet the needs of the legal profession for tools and methods capable of helping with the retrieval of electronic business records, principally for use as evidence in civil litigation.”); see also Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient than Exhaustive Manual Review, 17 Rich. J.L. & Tech. 11, ¶¶ 3-4 (2011); Patrick Oot, et al., Mandating Reasonableness in a Reasonable Inquiry, 87 Denv. U.L. Rev. 533, 558-559 (2010); Herbert Roitblat et al., Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review, 61 J. Am. Soc’y for Info. Sci. & Tech. 70, 77-79 (2010), available at; see generally Pace & Zakaras, supra note 14, at 77-80.

[18] Sedona Search Commentary, supra note 4, at 192-193.

[19] Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 183 (S.D.N.Y 2012), aff’d sub nom. Moore v. Publicis Groupe SA, 2012 U.S. Dist LEXIS 58742 (S.D.N.Y. Apr. 26, 2012) (Carter, J.).

[20] Id. at 184-87.

[21] Id. at 186-87.

[22] Id. at 187.

[23] Id.

[24] Da Silva Moore, 287 F.R.D. at 187.

[25] Id.

[26] Id.

[27] Id. at 188-89.

[28] Id.

[29] Da Silva Moore, 287 F.R.D. at 188.

[30] Id. at 188-89 (citing Daubert v. Merrell Dow Pharms., 509 U.S. 579, 585 (1993)).  But cf. David J. Waxse & Benda Yoakum-Kris, Experts on Computer-Assisted Review: Why Federal Rule of Evidence 702 Should Apply to Their Use, 52 Washburn L.J. 207, 219-23 (2013) (arguing that the Daubert standard should be applied to experts presenting evidence on ESI search and review methodologies).

[31] Id. at 189.

[32] Id. at 190-91; see Grossman & Cormack, supra note 17, at ¶ 61.

[33] Da Silva Moore, 287 F.R.D. at 192.

[34] Id. at 193.

[35] Id.

[36] Id.

[37] Id.

[38] See, e.g., In re Actos (Pioglitazone) Prods. Liab. Litig., No. 6:11-md-2299, 2012 U.S. Dist. LEXIS 187519, at *20 (W.D. La. July 27, 2012).

[39] Global Aero. Inc. v. Landow Aviation, No. CL 61040, 2012 Va. Cir. LEXIS 50, at *2 (Apr. 23, 2012).

[40] Kleen Products, LLC v. Packaging Corp., No. 10 C 5711, 2012 U.S. Dist. LEXIS 139632, at *61-63 (N.D. Ill. Sept. 28, 2012).

[41] EORHB v. HOA Holdings, Civ. Ac. No. 7409-VCL (Del. Ch. Oct. 15, 2012), 2012 WL 4896670, as amended in a subsequent order, 2013 WL 1960621 (Del. Ch. May 6, 2013).

[42] In re Biomet M2a Magnum Hip Implant Prods. Liab. Litg., No. 3:12-MD-2391, 2013 U.S. Dist. LEXIS 84440, at *5-6, *9-10 (N.D. Ind. Apr. 18, 2013).

[43] See, e.g., Nicholas Barry, Note, Man Versus Machine Review: The Showdown Between Hordes of Discovery Lawyers and a Computer-Utilizing Predictive Coding Technology, 15 Vand. J. Ent. & Tech. L. 343, 344-345 (2013); Harrison M. Brown, Comment, Searching for an Answer: Defensible E-Discovery Search Techniques in the Absence of Judicial Voice, 16 Chap. L. Rev. 407, 407-409 (2013); Jacob Tingen, Technologies-That-Must-Not-Be-Named: Understanding and Implementing Advanced Search Technologies in E-Discovery, 19 Rich. J.L. & Tech 2, ¶ 63 (2012),

[44] See Pace & Zakara, supra note 14, at 61-65.

[45] Cf. Losey, supra note 13, at 68.

[46] Pagan Kennedy, William Gibson’s Future is Now, N.Y. Times (Jan. 13, 2012),

[47] Chris Snijders, Uwe Matzat, & Ulf-Dietrich Reips, “Big Data”: Big Gaps of Knowledge in the Field of Internet Science, 7 Int’l J. Internet Sci. 1 (2012),

[48] Daniel Burrus, 25 Game Changing Trends That Will Create Disruption & Opportunity (Part I), Daniel Burrus, (last visited Feb. 24, 2014).

[49] Bill Franks, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics 5 (John Wiley & Sons, Inc. ed., 2012) (citing Stephen Prentice, CEO Advisory: ‘Big Data’ Equals Big Opportunity (2011)).

[50] Id. at 5.

[51] Kenneth Neil Cukier & Viktor Mayer-Schoenberger, The Rise of Big Data: How It’s Changing the Way We Think About the World, Council on Foreign Relations (Apr. 3, 2013),

[52] Davenport & Kim, supra note 5.

[53] Id. at 3.

[54] See id. at 4-5 (providing a listing of various fields of research that make up a part of and comfortably fit within the broader term “Analytics,” including statistics, forecasting, data mining, text mining, optimization and experimental design).

[55] For additional titles in the popular literature, see Thomas H. Davenport & Jeanne G. Harris, Competing on Analytics: The New Science of Winning (2007); Franks, supra note 49; Thornton May, The New Know: Innovation Powered by Analytics (John Wiley & Sons, Inc. ed., 2009); Michael Minelli, Michele Chambers & Ambiga Dhiraj, Big Data Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses (John Wiley & Sons, Inc. ed., 2013); Eric Siegel, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (John Wiley & Sons, Inc. ed., 2013).

[56] See Davenport & Kim, supra note 5.

[57] See AIIM, Big Data and Content Analytics: measuring the ROI 9 (2013), available at  In a questionnaire asking “What type of analysis would you like to do/already do on unstructured/semi-structured data?”, respondents identified over a dozen uses for analytics which they would consider of high value to their corporation, including: Metadata creation; Content deletion/retention/duplication; Trends/pattern analysis; Compliance breach, illegality; Fraud detection/prevention; Security re-classification/PII (personally identifiable information) detection; Predictive analysis/modeling; Data visualization; Cross relation with demographics; Incident prediction; Geo-correlation; Brand conformance; Sentiment analysis; Image/video recognition; and Diagnostic/medical.  Id.

[58] The Sedona Conference, The Sedona Conference Commentary on Information Governance 2 (2013), available at [hereinafter Sedona IG Commentary].

[59] Charles R. Ragan, Information Governance: It’s a Duty and It’s Smart Business, 19 Rich. J.L. & Tech. 12, ¶ 32 (2013), (internal quotation marks omitted) (quoting Barclay T. Blair, Why Information Governance, in Information Governance Executive Briefing Book, 7 (2011), available at  For additional useful definitions of what constitutes information governance, see The Generally Accepted Recordkeeping Principles, ARMA Int’l, (last visited Feb. 24, 2014) (setting out eight principles of IG, under the headings Accountability, Integrity, Protection, Compliance, Availability, Retention, Disposition and Transparency); Debra Logan, What is Information Governance? And Why is it So Hard?, Gartner (Jan. 11, 2010), (defining IG on behalf of Gartner to be “the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.”).

[60] Sedona IG Commentary, supra note 58, at 5.

[61] Id.

[62] Id.

[63] Id.

[64] Id. at 25.

[65] Sedona IG Commentary, supra note 58, at 25.

[66] Id. at 27.

[67] See Pension Comm. of Univ. of Montreal Pension Plan v. Banc of Am. Sec., LLC, 685 F. Supp. 2d 456, 461 (S.D.N.Y. 2010).  The information task in e-Discovery is therefore very unlike the user experience with the leading, well-known commercial search engines on the Web in, for example, finding a place for dinner in a strange city.  For the latter project, few individuals religiously scour hundreds of pages of listings even if thousands of “hits” are obtained in response to a select set of keywords; instead they browse only from the first few pages of listings.  Yet the lawyer is tasked with making reasonable efforts to credibly retrieve “the long tail” represented by “any and all” documents in response to document requests so phrased under Federal Rule of Civil Procedure 34.

[68] See, e.g., Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 191 (S.D.N.Y. 2012), aff’d sub nom. Moore v. Publicis Groupe SA, 2012 U.S. Dist LEXIS 58742 (Apr. 26, 2012) (Carter, J.); Pension Comm., 685 F. Supp. 2d at 461.

[69] See Bennett B. Borden et al., Why Document Review Is Broken, EDIG: E-Discovery and Information Governance, May 2011, at 1, available at

[70] An Insider’s Look at Reducing ESI Volumes Before E-Discovery Collection, EXTERRO, (last visited Feb. 24, 2014); Andrew Bartholomew, An Insider’s Perspective on Intelligent E-Discovery, E-Discovery Beat (Sept. 11, 2013),

[71] See Fed. R. Civ. P. 1 (“These rules . . . should be construed and administered to secure the just, speedy, and inexpensive determination of every action and proceeding.”) (emphasis added).

[72] Borden et al., supra note 69, at 3.

[73] All of the “True Life Examples” referred to in this article are “ripped from” the pages of the author’s legal experience, without embellishment.

[74] A qui tam suit is a lawsuit brought by a “private citizen (popularly called a ‘whistle blower’) against a person or company who is believed to have violated the law in the performance of a contract with the government or in violation of a government regulation, when there is a statute which provides for a penalty for such violations.”  Qui Tam Action, The Free Dictionary, (last visited Feb. 24, 2014); see also United States ex rel. Eisenstein v. City of New York, 556 U.S. 928, 932 (2009) (defining a qui tam action as a lawsuit brought by a private party alleging fraud on behalf of the government) (internal citations omitted).

[75] See Davenport & Kim, supra note 5, at 3.

[76] See Amended Complaint and Demand for Jury Trial, NAACP v. Balt. City Police Dep’t, No. 06-1863 (D. Md. Dec. 18, 2007), available at

[77] See Charles F. Wellford, Justice Assessment and Evaluation Services, First Status Report for the Audit of the Stipulation of Settlement Between the Maryland State Conference of NAACP Branches, et al. and the Baltimore City Police Department, et al. 2 (2012), available at; see also Plaintiffs Win Justice in Illegal Arrests Lawsuit Settlement with the Baltimore City Police Department, ACLU (June 23, 2010),

[78] See Wellford, supra note 77, at 2, 14.

[79] See Jason R. Baron & Simon J. Attfield, Where Light in Darkness Lies: Preservation, Access and Sensemaking Strategies for the Modern Digital Archive, in The Memory of the World in the Digital Age Conference: Digitalization and Preservation 580-595 (2012),

[80] See id. at 587.

[81] See id. at 588; see also Ragan, supra note 59, at ¶ 6.

[82] See, e.g., The Sedona Conference, The Sedona Conference Commentary on Inactive Information Sources 2, 5 (2009), available at

[83] See Sedona IG Commentary, supra note 58, at 20-22.

[84] See Davenport & Harris, supra note 55, at 176-78.

[85] The Sedona Conference, The Sedona Conference Commentary on Achieving Quality in the E-Discovery Process, 10 Sedona Conf. J. 299, 325 (2009).


Defensible Data Deletion: A Practical Approach to Reducing Cost and Managing Risk Associated with Expanding Enterprise Data


Cite as: Dennis R. Kiker, Defensible Data Deletion: A Practical Approach to Reducing Cost and Managing Risk Associated with Expanding Enterprise Data, 20 Rich. J.L. & Tech. 6 (2014),


Dennis R. Kiker*


I.  Introduction

[1]        Modern businesses host steadily increasing volumes of data, creating significant cost and risk while potentially compromising the current and future performance and stability of the information systems in which the data reside.  To mitigate these costs and risks, many companies are considering initiatives to identify and eliminate information that is not needed for any business or legal purpose (a process referred to herein as “data remediation”).  There are several challenges for any such initiative, the most significant of which may be the fear that information subject to a legal preservation obligation might be destroyed.  Given the volumes of information and the practical limitations of search technology, it is simply impossible to eliminate all risk that such information might be overlooked during the identification or remediation process.  However, the law does not require that corporations eliminate “all risk.”  The law requires that corporations act reasonably and in good faith,[1] and it is entirely possible to design and execute a data remediation program that demonstrates both.  Moreover, executing a reasonable data remediation program yields more than just economic and operational benefits.  Eliminating information that has no legal or business value enables more effective and efficient identification, preservation, and production of information requested in discovery.[2]

[2]        This Article will review the legal requirements governing data preservation in the litigation context, and will demonstrate that a company can conduct data remediation programs while complying with those legal requirements.  First, we will examine the magnitude of the information management challenge faced by companies today.  Then we will outline the legal principles associated with the preservation and disposition of information.  Finally, with that background, we will propose a framework for an effective data remediation program that demonstrates reasonableness and good faith while achieving the important business objectives of lowering cost and risk.


II.  The Problem: More Data Than We Want or Need

[3]        Companies generate an enormous amount of information in the ordinary course of business.  More than a decade ago, researchers at the University of California at Berkeley School of Information Management and Systems undertook a study to estimate the amount of new information generated each year.[3]  Even ten years ago, the results were nearly beyond comprehension.  The study estimated that the worldwide production of original information as of 2002 was roughly five exabytes of data, and that the storage of new information was growing at a rate of up to 30% per year.[4]  Put in perspective, the same study estimates that five exabytes is approximately equal to all of the words ever spoken by human beings.[5]  Regardless of the precision of the study, there is little question that the volume of information, particularly electronically stored information (“ESI”), is enormous and growing at a frantic pace.  Much of that information is created by and resides in the computer and storage systems of companies.  And the timeworn adage that “storage is cheap” is simply not true when applied to large volumes of information.  Indeed, the cost of storage can be great and come from a number of different sources.

[4]        First, there is the cost of the storage media and infrastructure itself, as well as the personnel required to maintain them.  Analysts estimate the total cost to store one petabyte of information to be almost five million dollars per year.[6]  The significance of these costs is even greater when one realizes that the vast majority of the storage for which companies are currently paying is not being used for any productive purpose.  At least one survey indicates that companies could defensibly dispose of up to 70% of the electronic data currently retained.[7]

[5]        Second, there is a cost associated with keeping information that currently serves no productive business purpose.  The existence of large volumes of valueless information makes it more difficult to find information that is of use.  Numerous analysts and experts have recognized the tremendous challenge of identifying, preserving, and producing relevant information in large, unorganized data stores.[8]  As data stores increase in size, identifying particular records relevant to a specific issue becomes progressively more challenging.  One of the best things a company can do to improve its ability to preserve potentially relevant information, while also conserving corporate resources, is to eliminate information from its data stores that has no business value and is not subject to a current preservation obligation.

[6]        Eliminating information can be extremely challenging, however, due to the potential cost and complexity associated with identifying information that must be preserved to comply with existing legal obligations.  When dealing with large volumes of information, manual, item-by-item review by humans is both impractical and ineffective.  From the practical perspective, large volumes of information simply cannot be reviewed in a timely fashion with reasonable cost.  For example, consider an enterprise system containing 500 million items.  Even assuming a very aggressive review rate of 100 documents per hour, 500 million items would require five million man-hours to review.  At any hourly rate, the cost associated with such a review would be prohibitive.

[7]        Even when leveraging commonly used methods of data culling to reduce the volume required for review, such as deduplication, date culling, and key word filtering, the anticipated volume would still be unwieldy: even a 90% reduction in volume would leave 50 million items to review.  Moreover, studies have long demonstrated that human reviewers are often quite inconsistent with respect to identifying “relevant” information, even when assisted by key word searches.[9]
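The review-burden arithmetic in the two paragraphs above is easy to reproduce.  The item count, review rate, and culling percentage come from the text; the hourly rate is an illustrative assumption added here:

```python
ITEMS = 500_000_000      # items in the hypothetical enterprise system
DOCS_PER_HOUR = 100      # aggressive per-reviewer rate assumed in the text
HOURLY_RATE = 50         # assumed blended reviewer rate (illustrative only)

hours_full = ITEMS / DOCS_PER_HOUR      # 5,000,000 man-hours
cost_full = hours_full * HOURLY_RATE    # prohibitive at any hourly rate

# Even a 90% cull via deduplication, date ranges, and key word
# filtering leaves an unwieldy review population.
culled_items = ITEMS // 10              # 50,000,000 items remain
hours_culled = culled_items / DOCS_PER_HOUR
```

At 2,000 review hours per attorney per year, even the culled population would occupy hundreds of reviewers for a year, which is why the Article turns next to technology-assisted alternatives.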

[8]        Current scholarship also shows that human reviewers do not consistently apply the concept of relevance and that the overlap, or the measure of the percentage of agreement on the relevancy of a particular document between reviewers, can be extremely low.[10]  Counter-intuitively, the result is the same even when more “senior” review attorneys set the “gold standard” for determining relevance.[11]  Recent studies comparing technology-assisted processes with traditional human review conclude that the former can and will yield better results.  Technology can achieve both higher recall (the percentage of the total number of relevant documents in the general population that are retrieved through search) and higher precision (the percentage of retrieved documents that are, in fact, relevant) than humans can achieve using traditional methods.[12]
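The two measures defined parenthetically above can be written out directly.  The document counts in the comment are hypothetical, chosen only to show how each metric moves:

```python
def recall(retrieved_relevant, total_relevant):
    """Share of all relevant documents that the review process retrieved."""
    return retrieved_relevant / total_relevant

def precision(retrieved_relevant, total_retrieved):
    """Share of retrieved documents that are, in fact, relevant."""
    return retrieved_relevant / total_retrieved

# Hypothetically: a review that finds 600 of 1,000 relevant documents has
# recall of 0.60; if it retrieved 2,000 documents to find them, its
# precision is 0.30 -- the review pulled mostly irrelevant material.
```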

[9]        There is also growing judicial acceptance of parties’ use of technology to help reduce the substantial burdens and costs associated with identifying, collecting, and reviewing ESI.  Recently, the U.S. District Court for the Southern District of New York affirmed Magistrate Judge Andrew Peck’s order approving the parties’ agreement to use “predictive coding,” a method of using specialized software to identify potentially relevant information.[13]

[10]      Likewise, a Loudoun County, Virginia Circuit Court judge recently granted a defendant’s motion for protective order allowing the use of predictive coding for document review.[14]  The defendant had a data population of 250 GB of reviewable ESI comprising as many as two million documents, which, it argued, would require 20,000 man-hours to review using traditional human review.[15]  The defendant explained that traditional methods of linear human review likely “misses on average 40% of the relevant documents, and the documents pulled by human reviewers are nearly 70% irrelevant.”[16]

[11]      Similarly, the commentary accompanying Rule 502 of the Federal Rules of Evidence indicates that using computer-assisted tools may demonstrate reasonableness in the context of privilege review: “Depending on the circumstances, a party that uses advanced analytical software applications and linguistic tools in screening for privilege may be found to have taken ‘reasonable steps’ to prevent inadvertent disclosure.”[17]

[12]      Simply put, dealing with the volume of information in most business information systems is beyond what would be humanly possible without leveraging technology.  Because such systems contain hundreds of millions of records, companies effectively have three choices for searching for data subject to a preservation obligation: they can rely on the search capabilities of the application or native operating system, they can invest in and employ third-party technology to index and search the data in its native environment, or they can export all of the data to a third-party application for processing and analysis.


III.  The Solution: Defensible Data Remediation

[13]      Simply adding storage and retaining the ever-increasing volume of information is not a tenable option for businesses given the cost and risk involved.  However, there are risks associated with data disposition as well, specifically that information necessary to the business or required for legal or regulatory reasons will be destroyed.  Thus, the first stage of a defensible data remediation program requires an understanding of the business and legal retention requirements applicable to the data in question.  Once these are understood, it is possible to construct a remediation framework appropriate to the repository that reflects those requirements.

 A.  Retention and Preservation Requirements

 [14]      The U.S. Supreme Court has recognized that “‘[d]ocument retention policies,’ which are created in part to keep certain information from getting into the hands of others, including the Government, are common in business.”[18]  The Court noted that compliance with a valid document retention policy is not wrongful under ordinary circumstances.[19]  Document retention policies are intended to facilitate retention of information that companies need for ongoing or historical business purposes, or as mandated by some regulatory or similar legal requirement.  Before attempting remediation of a data repository, the company must first understand and document the applicable retention and preservation requirements.

[15]      It is beyond the scope of this Article to outline all of the potential business and regulatory retention requirements.[20]  Ideally, these would be reflected in the company’s record retention schedules.  However, even when a company does not have up-to-date retention schedules, embarking on a data remediation exercise affords the opportunity to develop or update such schedules in the context of a specific data repository.  Most data repositories contain limited types of data.  For example, an order processing system would not contain engineering documents.  Thus, a company is generally focused on a limited number of retention requirements for any given repository.  There are exceptions to this rule, such as with e-mail systems, shared-use repositories (e.g., Microsoft SharePoint), and shared network drives.  Even then, focusing on the specific repository will enable the company to likewise focus on some limited subset of its overall record retention requirements.  Once a company has identified the business and regulatory retention requirements applicable to a given data repository, information in the repository that is not subject to those requirements is eligible for deletion unless it is subject to the duty to preserve evidence.

[16]      The modern duty to preserve derives from the common law duty to preserve evidence and is not explicitly addressed in the Federal Rules of Civil Procedure.[21]  The duty does not arise until litigation is “reasonably anticipated.”[22]  Litigation is “reasonably anticipated” when a party “knows” or “should have known” that the evidence may be relevant to current or future litigation.[23] Once litigation is reasonably anticipated, a company should establish and follow a reasonable preservation plan.[24]  Although there are no specific court-sanctioned processes for complying with the preservation duty, courts generally measure the parties’ conduct in a given case against the standards of reasonableness and good faith.[25]  In this context, a “defined policy and memorialized evidence of compliance should provide strong support if the organization is called upon to prove the reasonableness of the decision-making process.”[26]

[17]      The duty to preserve is not without limits: “[e]lectronic discovery burdens should be proportional to the amount in controversy and the nature of the case” so the high cost of electronic discovery does not “overwhelm the ability to resolve disputes fairly in litigation.”[27] Moreover, courts do not equate reasonableness with “perfection.”[28] Nor does the law require a party to take “extraordinary” measures to preserve “every e-mail” even if it is technically feasible to do so.[29]  “Rather, in accordance with existing records and information management principles, it is more rational to establish a procedure by which selected items of value can be identified and maintained as necessary to meet the organization’s legal and business needs[.]”[30]

[18]      Critical tasks in a preservation plan are the identification and documentation of key custodians and other sources of potentially relevant information.[31] Custodians identified as having potentially relevant information should generally receive a written litigation hold notice.[32]  The notice should be sent by someone occupying a position of authority within the organization to increase the likelihood of compliance.[33] The Sedona Guidelines also suggest that a hold notice is most effective when it:

 1)      Identifies the persons likely to have relevant information and communicates a preservation notice to those persons;

2)      Communicates the preservation notice in a manner that ensures the recipients will receive actual, comprehensible and effective notice of the requirement to preserve information;

3)      Is in written form;

4)      Clearly defines what information is to be preserved and how the preservation is to be undertaken; and

5)      Is regularly reviewed and reissued in either its original form or an amended form when necessary.[34]

 [19]      The legal hold should also include a mechanism for confirming that recipients received and understood the notice, for following up with custodians who do not acknowledge receipt, and for escalating the issue until it is resolved.[35]  To be effective, the legal hold should be periodically reissued to remind custodians of their obligation and to apprise them of changes required by the facts and circumstances in the litigation.[36]

[20]      Experience has also shown that legal holds that are not properly managed and ultimately released are less likely to receive the appropriate level of attention by employees. Thus, the legal hold process should also include a means for determining when litigation is no longer reasonably anticipated and the hold can be released, while ensuring that information relevant to another active matter is preserved.[37]

 B.  The Remediation Framework

 [21]      Against this backdrop, it is possible to outline a framework for data remediation that is compliant with legal preservation requirements.  The following describes a high-level data remediation process that can be applied to virtually any data environment and any risk tolerance profile.  The general process is described in Figure 1 below:


Figure 1: Data Remediation Framework

1.  Assemble the Team

[22]      A successful data remediation project depends on invested participation by at least three constituents in the organization: legal, information technology (“IT”), and records and information management (“RIM”).  In addition, the project may require support from experts in information search and retrieval and in statistical analysis.  In-house and/or outside counsel provides legal oversight and risk assessment for the project team, as well as guidance on legal preservation obligations.  IT provides the technological expertise necessary to understand the structure and capabilities of the target data repository.  RIM professionals provide guidance on business and regulatory retention obligations.  The need for information search and retrieval experts and statisticians depends on the complexity of the data remediation effort as described below.  Finally, it may be necessary to include business users of the information to fully document retention requirements for a particular repository when those requirements are not adequately captured in the organization’s document retention policy and schedule.

 2.  Select Target Data Repository

 [23]      Selecting the target data repository requires consideration of the costs and benefits of the data remediation exercise.  Each type of repository presents unique opportunities and challenges.  For example, e-mail systems, whether traditional or archived, are notorious for containing vast amounts of information that is not needed for any business or legal purpose.  Similarly, shared network drives tend to contain large volumes of unused and unneeded information.  Backup tapes, legacy systems, and even structured databases are other possible targets.  IT and RIM resources are invaluable in identifying a suitable target repository.  For example, IT can often run reports identifying directories and files that have not been accessed recently.

 3.  Document Retention and Preservation Obligations

[24]      As discussed above, it is critical to understand the retention and preservation obligations that are applicable to the data contained in the target repository.  Retention obligations include the business information needs as well as any regulatory requirements mandating the preservation of data.  Ideally, these are incorporated into the document retention policy and schedule for the organization.  If not, it will be important to document those requirements applicable to the target repository.

[25]      Preservation obligations are driven by existing and reasonably anticipated litigation.[38]  In some cases this may be the most challenging part of the project, particularly for highly litigious companies, because, unlike business needs and regulatory requirements, preservation obligations are constantly changing as new matters arise and circumstances evolve in existing matters.  Successful completion of the remediation project will require a detailed understanding of, and constant attention to, the preservation obligations applicable to the target repository.  As discussed below, some of the risk associated with this aspect of the project can be ameliorated through selection of the appropriate repository and culling criteria.  Nevertheless, the scope and timing of the project will be driven in large part by the preservation obligations applicable to the target repository.

4.  Inventory Target Data Repository

[26]      After selecting the target data repository, the team must inventory the information within that repository.  This does not involve creating an exhaustive list or catalog of every item within the repository.  Rather, inventorying the repository involves developing a good understanding of the types of information that are contained there, the date ranges of the information, and other criteria that will enable identifying information that must be retained and that which can be deleted.  The details of the inventory will vary by data repository.  For example, for an e-mail server, the pertinent criteria may include only date ranges and custodians, whereas for a shared network drive, the pertinent criteria may include departments and individuals with access, date ranges, and file types.

 5.  Gross Culling

 [27]      The next step is to determine the “gross culling” criteria for the data repository.  In this context, “gross culling” refers to an initial phase of data culling based on broad criteria as opposed to fine or detailed culling criteria that may be used in a later phase of the exercise.[39]  The nature of the information contained within the repository will determine the specific criteria to be used, but the objective is to locate the “low-hanging fruit,” the items within the repository that can be readily identified as not falling within any retention or preservation obligation. These are black-and-white decisions where the remediation team can definitively determine without further analysis that the items identified can be deleted.

[28]      For example, in most cases, dates are effective gross culling criteria.  Quite often, large volumes of e-mail and loose files (data retained in shared network drives or other unstructured storage) predate any existing retention or preservation obligation for such items.  Similarly, in repositories that are subject to short or no retention guidelines, the business need for the data can be evaluated in terms of the date last accessed.  In the case of shared network drives, for example, it is not uncommon to find large volumes of information that has not been accessed by any user in many years.[40]  Such information can be disposed of with very little risk.
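A date-based gross cull of a shared drive can be sketched in a few lines; the sketch below assumes last-access timestamps on the share are reliable, and the root path and five-year cutoff are purely illustrative. Candidates are only listed for the team's review, never deleted automatically:

```python
import os
import time

CUTOFF_YEARS = 5
cutoff = time.time() - CUTOFF_YEARS * 365 * 24 * 3600

def stale_files(root):
    """Yield paths under root whose last access predates the cutoff."""
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    yield path
            except OSError:
                continue  # skip unreadable entries rather than guessing

# Log deletion candidates for the remediation team's review.
for path in stale_files("/shares/finance"):   # illustrative share
    print(path)
```

Note that many modern systems disable access-time updates for performance, so the team should confirm with IT which timestamp (access, modification, or creation) is trustworthy before relying on it.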

 6.  Fine Culling

[29]      Sometimes, the process need go no further than the gross culling stage.  Depending on the volume of data deleted and the volume and nature of the data remaining, the remediation team may determine that the benefit of further culling does not justify the additional cost and risk.  In some cases, however, gross culling techniques will not identify sufficient volumes of unneeded data and more sophisticated culling strategies must be employed.

[30]      The precise culling technique and strategy will depend on the specific data repository, its native search capabilities, and the availability of other search tools.  For example, many modern e-mail archiving systems have fairly sophisticated native search capabilities that can locate with a high degree of accuracy content pertinent to selected criteria.  Other systems will require the use of third-party technology.  In either case, the fine culling process will require selection of culling criteria that will uniquely identify items not subject to a retention or preservation obligation and be susceptible to verification.  Depending on the nature of the data and the complexity of the necessary search criteria, the remediation team may need to engage an expert in information search and retrieval.
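As a toy illustration of fine culling, the sketch below flags items matching hypothetical legal-hold terms and treats everything else as deletable; a real pass would run such criteria through the repository's native search or a third-party indexing tool, and the terms and items here are invented:

```python
# Hypothetical hold criteria; in practice these come from counsel's
# documented retention and preservation requirements.
HOLD_TERMS = {"acme litigation", "project falcon"}

def must_retain(text):
    """True if any hold term appears in the item's text (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in HOLD_TERMS)

items = [
    "Minutes re Project Falcon budget review",
    "2003 cafeteria menu",
]
deletable = [item for item in items if not must_retain(item)]
print(deletable)   # ['2003 cafeteria menu']
```

The essential property is the same one the text describes: the criteria must uniquely identify items not subject to any obligation, and the output must be susceptible to verification by sampling.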

 7.  Sampling and Statistical Analysis

[31]      Regardless of the specific fine culling strategy employed, the remediation team should validate the results by sampling and analysis to ensure defensibility.  Generally, it will be advisable to engage a statistician to direct the sampling effort and perform the analysis because both can be quite complex and rife with opportunity for error.[41]  Moreover, in the event that the company’s process is ever challenged, validation by an independent expert is compelling evidence of good faith.  It is important to realize that the statistical analysis cannot demonstrate that no items subject to a preservation obligation are included in the data to be destroyed.  It can only identify the probability that this is the case, but it can do so with remarkable precision when properly performed.[42]
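The sample sizes involved are modest; the figure cited in note 41 follows from the standard large-population formula for estimating a proportion (the finite-population correction is negligible for a population of 300 million):

```python
# n = z^2 * p * (1 - p) / e^2, the usual large-population sample size.
# z = 1.96 for 95% confidence; p = 0.5 is the most conservative choice.
z, e, p = 1.96, 0.05, 0.5

n = z ** 2 * p * (1 - p) / e ** 2
print(round(n))   # 384
```

Validating the result set therefore requires reviewing only a few hundred randomly selected items, regardless of whether the population holds thousands of documents or hundreds of millions; the statistician's role is to ensure the sample is truly random and the resulting confidence statement is stated correctly.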

 8.  Iteration

[32]      Fine culling and validation should continue until the remediation team achieves results that meet its expectations regarding the volume of data identified for deletion and the probability that only data not subject to a preservation obligation are included in the result set.


 IV.  Conclusion

[33]      The enormity of the challenge that expanding volumes of unneeded information creates for businesses is difficult to overstate.  Companies spend millions of dollars annually to store and maintain information that serves no useful purpose, funds that could be directed to productive uses such as hiring, research, and investment.  Addressing the problem, however, is a challenge in its own right, perhaps due more to the fear of adverse consequences in litigation than to any other factor.  Yet it is possible to develop a defensible data remediation process that enables a company to demonstrate good faith and reasonableness while eliminating the cost, waste, and risk of this unnecessary data.


* Dennis Kiker has been a partner in a national law firm, director of professional services at a major e-Discovery company, and a founding shareholder of his own law firm. He has served as national discovery counsel for one of the largest manufacturing companies in the country, and counseled many others on discovery and information governance-related issues. He is a Martindale-Hubbell AV-rated attorney admitted at various times to practice in Virginia, Arizona and Florida, and holds a J.D., magna cum laude & Order of the Coif from the University of Michigan Law School.  Dennis is currently a consultant at Granite Legal Systems, Inc. in Houston, Texas.


[1] See The Sedona Conference, The Sedona Principles: Second Edition Best Practices Recommendations & Principles for Addressing Electronic Document Production  28 (Jonathan M. Redgrave et al. eds., 2007) [hereinafter “The Sedona Principles”], available at (last visited Jan. 30, 2014); see also Louis R. Pepe & Jared Cohane, Document Retention, Electronic Discovery, E-Discovery Cost Allocation, and Spoliation Evidence: The Four Horsemen of the Apocalypse of Litigation Today, 80 Conn. B. J. 331, 348 (2006) (explaining how proposed Rule 37(f) addresses the routine alteration and deletion of electronically stored information during ordinary use).

[2] See The Sedona Principles, supra note 1, at 12.

[3] See Peter Lyman & Hal R. Varian, How Much Information 2003?, (last visited Feb. 9, 2014).

[4] Id.

[5] See id.

[6] Jake Frazier, Hoarders: The Corporate Edition, Business Computing World  (Sept. 25, 2013),

[7] Id.

[8] See James Dertouzos et al., Rand Inst. for Civil Justice, The Legal and Economic Implications of E-Discovery: Options for Future Research ix (2008), available at; see also Robert Blumberg & Shaku Atre, The Problem with Unstructured Data, Info. Mgmt. (Feb. 1, 2003, 1:00 AM),; The Radicati Group, Taming the Growth of Email: An ROI Analysis 3-4 (2005), available at

[9] See David C. Blair & M.E. Maron, An Evaluation of Retrieval Effectiveness for a Full-Text Document Retrieval System, Comm. ACM, March 1985, at 289-90, 295-96.

[10] See Ellen M. Voorhees, Variations in Relevance Judgments and the Measurement of Retrieval Effectiveness, 36 Info. Processing & Mgmt. 697, 701 (2000), available at http://courses/cs430/2006fa/cache/Trec_8.pdf (finding that relevance is not a consistently applied concept between independent reviewers).  See generally Herbert L. Roitblat et al., Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review, 61 J. Am. Soc’y for Info. Sci. & Tech. 70, 77 (2010).

[11] See Voorhees, supra note 10, at 701 (finding that the “overlap” between even senior reviewers shows that they disagree as often as they agree on relevance).

[12]  See generally Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, 17 Rich. J.L. & Tech. 11 ¶ 2 (2011), http://jolt.richmond.edu/v17i3/article11.pdf (analyzing data from the TREC 2009 Legal Track Interactive Task Initiative).

[13] See Moore v. Publicis Groupe SA, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 1446534, at *1-3 (S.D.N.Y. Apr. 26, 2012).

[14] See Global Aerospace, Inc. v. Landow Aviation, L.P., No. CL 61040, 2012 Va. Cir. LEXIS 50, at *2 (Va. Cir. Ct. Apr. 23, 2012).

[15] See Mem. in Supp. of Mot. for Protective Order Approving the Use of Predictive Coding at 4-5, Global Aerospace, Inc. v. Landow Aviation, L.P., No. CL 61040, 2012 Va. Cir. LEXIS 50 (Va. Cir. Ct. Apr. 9, 2012).

[16] Id. at 6-7.

[17] Fed. R. Evid. 502(b) Advisory Committee’s Notes, Subdivision (b) (2007).

[18] Arthur Andersen LLP v. United States, 544 U.S. 696, 704 (2005).

[19] Id.; see Managed Care Solutions, Inc. v. Essent Healthcare, 736 F. Supp. 2d 1317, 1326 (S.D. Fla. 2010) (rejecting plaintiffs’ argument that a company policy that e-mail data be deleted after 13 months was unreasonable) (citing Wilson v. Wal-Mart Stores, Inc., No. 5:07-cv-394-Oc-10GRJ, 2008 WL 4642596, at *2 (M.D. Fla. Oct. 17, 2008); Floeter v. City of Orlando, No. 6:05-CV-400-Orl-22KRS, 2007 WL 486633, at *7 (M.D. Fla. Feb. 9, 2007)).  But see Day v. LSI Corp., No. CIV 11–186–TUC–CKJ, 2012 WL 6674434, at *16 (D. Ariz. Dec. 20, 2012) (finding evidence of defendant’s failure to follow its own document policy was a factor in entering default judgment sanction for spoliation).

[20] For purposes of this Article, such laws and regulations are treated as retention requirements with which a business must comply in the ordinary course of business.  This Article focuses on the requirement to exempt records from “ordinary course” retention requirements due to a duty to preserve the records when litigation is reasonably anticipated.  In short, this Article relies on the distinction between retention of information and preservation of information, focusing on the latter.  See infra text accompanying note 23.

[21] See Silvestri v. Gen. Motors Corp., 271 F.3d 583, 590 (4th Cir. 2001); see also Victor Stanley, Inc. v. Creative Pipe, Inc., 269 F.R.D. 497, 519 (D. Md. 2010).

[22] See Cache la Poudre Feeds v. Land O’Lakes, 244 F.R.D. 614, 621, 623 (D. Colo. 2007); see also The Sedona Principles, supra note 1, at 14.

[23] See Pension Comm. of the Univ. of Montreal Pension Plan v. Banc of Am. Sec., LLC, 685 F. Supp. 2d 456, 466 (S.D.N.Y. Jan. 15, 2010 as amended May 28, 2010); Rimkus Consulting Grp., Inc. v. Cammarata, 688 F. Supp. 2d 598, 612-13 (S.D. Tex. 2010);  Zubulake v. UBS Warburg LLC, 220 F.R.D. 212, 216 (S.D.N.Y. 2003) (Zubulake IV); see also The Sedona Conference, Commentary on Legal Holds: The Trigger & The Process, 11 Sedona Conf. J. 265, 269 (2010) [hereinafter “Commentary on Legal Holds”].

[24] Commentary on Legal Holds, supra note 23, at 269 (“Adopting and consistently following a policy or practice governing an organization’s preservation obligations are factors that may demonstrate reasonableness and good faith.”); see The Sedona Principles, supra note 1, at 12.

[25] Commentary on Legal Holds, supra note 23, at 270 (evaluating an organization’s preservation decisions should be based on good faith and reasonable evaluation of relevant facts and circumstances).

[26] Id. at 274.

[27] Rimkus Consulting, 688 F. Supp. 2d at 613 n.8 (quoting The Sedona Principles, supra note 1, at 17); see also Victor Stanley, Inc. v. Creative Pipe, Inc., 269 F.R.D. 497, 523 (D. Md. 2010); Commentary on Legal Holds, supra note 23, at 270.

[28] Pension Comm., 685 F. Supp. 2d at 461 (“Courts cannot and do not expect that any party can meet a standard of perfection.”).

[29] See The Sedona Principles, supra note 1, at 28, 30 (citing Concord Boat Corp. v. Brunswick Corp., No. LR-C-95-781, 1997 WL 33352759, at *4 (E.D. Ark. Aug. 29, 1997)).

[30] The Sedona Principles, supra note 1, at 15.

[31] See Commentary on Legal Holds, supra note 23, at 270; id. at 28.

[32] See Pension Comm. 685 F. Supp. 2d at 465; see also Commentary on Legal Holds, supra note 23, at 270.

[33] The Sedona Principles, supra note 1, at 32.

[34] Commentary on Legal Holds, supra note 23, at 270.

[35] Id. at 283-85.

[36] See id. at 285.

[37] Id. at 287.

[38] See supra ¶ 16.

[39] See Alex Vorro, How to Reduce Worthless Data, InsideCounsel (Mar. 1, 2012),

[40] See, e.g., Anne Kershaw, Hoarding Data Wastes Money, Baseline (Apr. 16, 2012), (80% of the data on shared network and local hard drives has not been accessed in three to five years).

[41] Statistical sampling results can be as valid for a small random sample as for a larger one because, in a simple random sample of any size, every item in the population has an equal probability of being selected.  In fact, to achieve a 95% confidence level with a margin of error of 5%, a sample of 384 items is sufficient for a population of 300 million.  See Sample Size Table, Research Advisors, (last visited Jan. 12, 2014) (citing Robert V. Krejcie & Daryle W. Morgan, Determining Sample Size for Research Activities, 30 Educ. & Psychol. Measurement 607, 607-10 (1970)).  However, samples remain vulnerable to “sampling error” because the randomness of the selection may produce a sample that does not reflect the makeup of the overall population.  For instance, a simple random sample of ten messages drawn from a population evenly split between messages with and without attachments will on average contain five of each, but any given sample may over-represent one type (e.g., those with attachments) and under-represent the other (e.g., those without).

[42] See, e.g., Statistics, Wikipedia, (last visited Feb. 9, 2014).
