Richmond Journal of Law and Technology

The first exclusively online law review.

Month: November 2012

A “Pinteresting” Question: Is Pinterest Here to Stay? A Study in How IP Can Help Pinterest Lead a Revolution

by Stephanie Chau*

I.  Introduction

[1]        Bulletin boards and pushpins are archaic.  Yet, each day represents a new paradigm for the technologically savvy.  Innovators pair old concepts with new functionalities and technology, often achieving groundbreaking results.  Digital counterparts for Post-It notes emerged for computers and other wireless devices.[1]  Other examples abound.[2]  Thus, it is no surprise that pins and boards also have new meaning in the digital age.  Credit is due to the founders of Pinterest, a nascent social networking site with a devoted following, for modernizing the pin.[3]  As a newer social networking site, Pinterest has experienced unparalleled growth since its inception only a few years ago.[4]  Like all social networking sites, Pinterest thrives in the presence of societal materialism, narcissism, and consumerism.  Its subsequent exponential growth is the byproduct of a high-functioning network effect.[5]  However, Pinterest has yet to monetize[6] and currently operates in a contentious area of the law.  As a result, Pinterest appears to be on the cusp of requiring internal analysis of intellectual property law and its own business model.[7]  The case study of Pinterest illustrates the growing pains of a budding start-up company operating, and perhaps profiting, where Internet copyright law lags.  It is particularly apposite that Pinterest recently hired Michael Yang, Google’s former deputy general counsel, as head of its legal department.[8]  This Article argues that Pinterest’s continued ascent and perhaps even its continued existence hinge on Michael Yang, who must judiciously contour Pinterest’s intellectual property strategy.[9]  Pinterest should not only address impending trademark and copyright infringement issues, but also secure additional intellectual property protections and diversify its business.

[2]        This Article explores the possible legal issues Yang will face in his new capacity.  Part II introduces the concept behind Pinterest, Part III comments on the legal hazards currently looming given Pinterest’s business strategy, and Part IV offers insight on how Pinterest can leverage its intellectual property rights.


II.  What is Pinterest and Why Do We Have It?

[3]        Successful Internet companies bring greater public access to previously limited resources, including libraries, archives, government records, goods, and knowledge.[10]  Google-branded services permeate the web, yet freshman companies also compete for a share in the market.[11]  Pinterest is one such company in the social networking arena.  Pinterest is valued at an estimated $7.7 billion,[12] joining the ranks of competitors such as Facebook at $104 billion,[13] Twitter at $8 billion,[14] LinkedIn at $10 billion,[15] and Instagram at $1 billion.[16]  Led by Facebook, social networking accounted for 16.6 percent of all time spent online in the United States in 2011 and will likely become the top online activity in 2012.[17]  Pinterest’s traffic is steadily increasing, positioning the site as the third most popular media platform in the United States.[18]  Its appeal to college-educated females between the ages of twenty-five and forty-four offers advertisers a demographic that not only logs an average of eighty-nine minutes a month on the site, far exceeding that of Twitter’s demographic, but is also ready and willing to spend.[19]  In fact, ComScore’s “State of the U.S. Internet” report found that Pinterest users spend more money, buy more items, and conduct more transactions than any other demographic in the social network market.[20]

[4]        Launched in March 2010, “Pinterest is a virtual pin board;” it is a pin board-style photo-sharing social website that allows users to create and manage theme-based image collections of events, interests, and hobbies.[21]  Users can browse not only their friends’ pin boards for inspiration, but also those of complete strangers.[22]  Users can upload images known as ‘pins,’ save favorites, make comments, and ‘share’ or ‘like’ photos.[23]  Pinterest encourages users to “plan their weddings, decorate their homes, and share their favorite recipes.”[24]  For that reason, it comes as no surprise that Pinterest is a female-centric site.[25]  As of 2012, eighty-three percent of its United States users were women.[26]  Nevertheless, just as Facebook moved beyond its college student niche, Pinterest may soon find a new, more balanced group of users once it fine-tunes its strategic approach.

[5]        Social networking sites flourish in the face of a narcissistic society.  The self-admiration movement gained traction in the 1970s, followed by an explosion in the 1980s and 1990s[27] when unbridled self-indulgence and self-expression grew into “a more extraverted, shallow, and materialistic form of narcissism.”[28]  Trends toward self-presentation evolved with cultural norms and new technology, propagated by narcissistic behavior amidst the masses, the presence of Internet social networking sites, and celebrity culture.[29]  Scholars attribute the epidemic to indulgent parenting, excessive praise, obsession with rampant celebrity narcissism, and the pursuit of fame.[30]  Although narcissists and social networking groups are not coextensive, the two are not mutually exclusive.  Social networking sites revel in these narcissistic tendencies and parlay the user’s desire for validation and admiration into an entire platform.  A vicious cycle ensues whereby these sites reinforce narcissistic behavior by rewarding the user with more connections or comments, and as those narcissists connect with other narcissists, the behavior mushrooms quid pro quo.[31]

[6]        The United States is also founded on the principles of materialism and consumerism, the natural descendants of free market capitalism, which encourage individuals to endeavor to increase their wealth.[32]  As a result, Americans have an appetite for purchasing goods that they desire but do not need.[33]  Retailers, particularly in the luxury goods market, and social media companies are eager to exploit this consumer extravagance, a lavishness that might otherwise be characterized as an Achilles’ heel.  Much like on any other social networking service, brands already have a presence on Pinterest because consumers post and share photos of the product, the logo, or other marketing manifestations.  However, controlling consumer sentiment can be a challenge as cease-and-desist letters are virtually ineffective and can cast a negative light on the company.[34]  On the other hand, given that retailers are a welcome presence on Pinterest, Pinterest has a distinct competitive advantage over Facebook and Twitter.[35]


III.  Musings on the Pinterest Business Model

[7]        Pinterest’s success depends largely on what scholars call “the network effect.”  The network effect is the phenomenon by which the utility of a good or service increases as the number of consumers multiplies, and as a result, the monetary value of participation on a network grows exponentially as well.[36]  At one end of the network continuum are actual networks, which are communication systems, such as telephones and fax machines, “whose entire value lies in facilitating interactions” between the customer and the product owners.[37]  Pinterest is an actual network, and if it can overcome diseconomies of scale and manage tipping,[38] its network will become a tremendous asset.  Given the immense growth of Pinterest’s network, Pinterest is now at a crossroads.  There are two areas where its business model is deficient and in need of attention by Pinterest’s top executives and general counsel: monetization strategy and copyright issues.  Two additional contentious areas, trademark[39] and privacy[40] issues, are beyond the scope of this Article.

A.  Pinterest Has No Monetization Strategy

[8]        Initially, Pinterest must solidify a monetization strategy before it can grow.  Although Pinterest’s monetization potential is promising, the company has no formalized strategy as of yet.[41]  On its website, Pinterest flatly asserts “making money isn’t our top priority right now.”[42]  Pinterest is currently “focused on growing Pinterest and making it more valuable.”[43]  While these objectives are laudable, Pinterest already has an extensive following.[44]  Dependence on outside investment from entrepreneurs and venture capitalists is not sustainable in the long term and Pinterest must cultivate ways to bring in revenue.

[9]        Pinterest’s swift ascent as a major social networking site suggests that Pinterest is unique and has greater potential than even the likes of Facebook and Twitter.  A plethora of theories might explain Pinterest’s success.  Conceivably, Pinterest’s business model is more adaptable to growth, Pinterest is on a quicker path to monetization than its predecessors, and Pinterest’s entry into the market perfectly coincided with consumer interests.  Such rapid growth allows, or possibly even demands, that Pinterest monetize sooner.  Regardless, Pinterest will require resources to stay in vogue and remain competitive with social networking sites that have already monetized their businesses.[45]

[10]      As consumer attention shifts away from television, radio, magazines, and newspapers, $200 billion in advertising revenues will shift online.[46]  Pinterest figures to fight for these advertising revenues with the allure of its primary demographic and its pool of user data.  To illustrate, women spend more time social networking than men worldwide.[47]  Pinterest users spend more time on the site per visit (16 minutes) than Facebook users spend on Facebook (12 minutes),[48] and Pinterest users spend more money than others.[49]  Accordingly, Pinterest must carefully construct a strategy to leverage its time with this captive audience.  In recognition of Pinterest’s reach, retailer Bergdorf Goodman, the online crafts marketplace Etsy, and others have already incorporated Pinterest into their marketing strategies.[50]

[11]      Google monetizes its services using behavioral advertising, analyzing user profiles to present users with advertisements tailored to the words they search.[51]  On Facebook, users volunteer vast amounts of personal data and generate data through their own behavior that, in turn, is used to track and tailor advertisements.[52]  Harvesting data on users’ interests and online activities is tantamount to “an Internet genome project,” which is invaluable to advertisers.[53]  However, Google and Facebook serve different purposes and, as a result, there exists a demarcation in their approaches to online advertising: Google operates under the theory that human behavior can be reduced to an algorithmic equation whereas Facebook relies on social play and the power of sharing.[54]  Pinterest treats the approaches as symbiotic rather than mutually exclusive.

[12]      Like Google and Facebook, Pinterest has access to vast amounts of consumer information.[55]  However, Pinterest focuses less on offering a breadth of services because the entire premise of Pinterest is to share images that reflect a user’s specific interests.  Thus, activity on the site arguably reflects user and purchaser preferences far better than activity on either Google or Facebook, which are used to accomplish diverse objectives.  This detail, coupled with the fact that retailers can advertise on Pinterest with minimal consumer disgruntlement, positions Pinterest well within retailers’ strategies.

[13]      In addition to advertisements, affiliate links can drive monetization.  Affiliate links are a form of affiliate marketing, which rewards affiliates for the quantity of web traffic attributed to their marketing efforts.[56]  Although many observe that Pinterest is the perfect forum for an affiliate marketer given Pinterest’s high conversion rates,[57] the stigma of affiliate links makes this option less than ideal.  Recently, Pinterest quietly monetized its pinners with affiliate links through a partnership with Skimlinks[58] whereby Pinterest received a percentage of the revenue every time a pinner clicked to make a purchase from an e-commerce site.[59]  The response from the public was not favorable and Pinterest came under heavy fire for engaging in a covert scheme, which led to the decision to quickly end the relationship.[60]

[14]      Alternatively, Pinterest executives could decide to forgo innovation[61] in a monetization strategy completely and instead sell the business to the highest bidder.  To achieve the best valuation, Pinterest should focus on network growth and improved functionality.  The fruits of Pinterest’s focused efforts would then translate to a competitive advantage for the purchasing company.  In doing so, Pinterest could avoid the copyright issue, although that course of action might appear cowardly.  Pursuant to this hypothetical, corporate executives and lawyers in the industry ought to take notice and engage in their own monetization and copyright analyses of Pinterest.

[15]      Pinterest is currently not making money and thus the first priority is to implement a sound strategy that will enable the company to grow.  Not only will operating at a profit enable Pinterest to further develop the site and new functionalities, but Pinterest may be able to fend off competitors as well.  Although advertisements may displease users, albeit with perhaps less severity than on other sites, affiliate marketing, if not disclosed, intimates secrecy.  Pinterest may also elect to charge users for each click or sell memberships to exclusive material.  Indeed, Pinterest executives have the opportunity to formulate an innovative monetization strategy.  Although abstract business strategies are generally not patentable,[62] Pinterest can develop processes, systems, and methodologies to analyze and synthesize the data it collects, which can be protected as patentable subject matter or as a trade secret.  Part IV will discuss a few additional monetization strategies.

B.  Pinterest Leaves Users (and Itself) Exposed to Copyright Liability

  1.  User Liability

[16]      Images are most likely copyrightable subject matter under 17 U.S.C. § 102(a)(5) as pictorial, graphic, or sculptural works.[63]  In order for a plaintiff to establish ownership of a valid copyright, he or she must show an original work of authorship fixed in a tangible medium of expression.[64]

[17]      First, the standard of originality is quite low, and courts interpret the statute to require only a minimal degree of creativity original to the author.[65]  Copyright also protects photographs, which enjoy thin protection.[66]  Pursuant to this low threshold, a court will likely consider many of the images sufficiently original to the author, who may or may not be the pinner.  The second prong demands fixation, which requires that the matter be “sufficiently permanent or stable to permit it to be perceived, reproduced, or otherwise communicated.”[67]  Given that the image must already exist in digital form for the Pinterest user to upload, pin, or repin the image, it seems unlikely that the fixation hurdle will be significant.  Moreover, Pinterest pins remain on boards indefinitely unless the user removes them or the link is broken.

[18]      When Pinterest users upload an image or pin an existing image, there is a definite probability that they do not own the copyright to that image.  Suffice it to say that, under the current business model, even absent mens rea, Pinterest users are arguably engaging in a form of copyright infringement for which they might be liable for violating the copyright holder’s reproduction, derivative work, and distribution rights, or perhaps even public display rights.[68]  Whereas most of the content on Facebook and Twitter is user-generated, a substantial portion of Pinterest content comes from external links.

[19]      Many copyright holders may not realize that they own a copyright and may have no intention of enforcing their rights.  Nevertheless, Pinterest must address the concerns of users who do not want to face liability for using the site.  Moreover, business leaders should be proactive, not reactive and it would behoove Pinterest to think accordingly.  Pinterest ought to preemptively plan for a potential copyright infringement claim.

[20]      Recently, Pinterest took active steps to alleviate scrutiny on the copyright front.  For example, Pinterest began marshalling attribution and direct links to sources such as Flickr in May 2012.[69]  Although these enhancements may dissuade copyright holders from pursuing litigation, they do not weaken the infringement argument.  Although attribution is not currently part of copyright law, perhaps courts should consider attribution in their fair use analyses.  With attribution, copyright holders get free publicity and they arguably have no incentive to bring copyright infringement suits.  However, attribution may not suffice as fair use given that it does not protect the earning potential of the image.

[21]      Pinterest also provides advice for website owners who do not want their website content pinned.  Websites can insert a small piece of code into the head of any page on their site so that when a user tries to pin from the site, a customizable error message will appear.[70]  Nevertheless, Pinterest users can maneuver around this code in a multitude of ways.  Even absent technical savvy, users can save the images to their own library, by screen capturing the image for instance, or use a non-Pinterest website as an unsuspecting intermediary between the original site and Pinterest.  Website owners who use this code represent a population that does in fact care about copyright liability.  Pinterest should therefore be attuned to their concerns as they can wreak havoc on Pinterest users by bringing infringement claims or by raising negative publicity for Pinterest.
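
The opt-out mechanism described above can be illustrated with a short sketch.  The Article does not reproduce the code Pinterest supplies, so the tag assumed here (a meta element named "pinterest" with the content value "nopin" and an optional description attribute carrying the error message) follows the form commonly published in Pinterest’s help materials and should be treated as an assumption.  The Python names NoPinDetector and pinning_blocked are hypothetical and chosen only for illustration.  The sketch merely checks whether a page has opted out; it does not model the workarounds noted above, which is why such a tag may offer only limited practical protection.

```python
from html.parser import HTMLParser


class NoPinDetector(HTMLParser):
    """Scan a page for an opt-out <meta> tag.

    Assumption: the opt-out takes the form
    <meta name="pinterest" content="nopin" description="...">.
    The Article itself does not quote the tag.
    """

    def __init__(self):
        super().__init__()
        self.blocked = False   # True once an opt-out tag is seen
        self.message = None    # optional custom error message

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None
        attributes = {name: (value or "") for name, value in attrs}
        if (attributes.get("name", "").lower() == "pinterest"
                and attributes.get("content", "").lower() == "nopin"):
            self.blocked = True
            self.message = attributes.get("description") or None


def pinning_blocked(html_text):
    """Return (blocked, message) for the given HTML source (hypothetical helper)."""
    detector = NoPinDetector()
    detector.feed(html_text)
    return detector.blocked, detector.message


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<meta name="pinterest" content="nopin" '
        'description="No pinning, please." />'
        '</head><body><img src="photo.jpg"></body></html>'
    )
    print(pinning_blocked(page))
```

The check operates only on the page that carries the tag.  An image that has been saved, screen captured, or rehosted on an intermediary site arrives without the tag, so a check of this kind would report that pinning is not blocked.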

[22]      The fair use defense might be applicable, but courts often apply the defense on a case-by-case basis, evaluating “(1) the purpose and character of the use . . . (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.”[71]  Pinterest boards may qualify under fair use because they are likely transformative: the collections of images combined with captions serve a different purpose than the original, even where they contain entire images.[72]  Factors weighing against Pinterest’s fair use include: the high quality of the images, varying levels of commercial use (e.g., corporate pages compared with bloggers), and potential cannibalization of the market for the copyrighted images.[73]

[23]      However, courts’ current fair use analyses are blind to the copyright holder’s involvement as well as attribution by the infringer, as mentioned earlier.  Notably, numerous sites, including profit-driven news sites and retailers, encourage visitors to pin their copyrighted images via a “Pin It” button, which invites readers to pin the site’s work onto Pinterest.[74]  Where the website recommends that a visitor share the image on Pinterest, and assuming the website is in fact the copyright holder, the visitor is arguably reasonable in believing the copyright owner will not sue for copyright infringement over use of that image.  Applying public policy considerations, a new fair use element might be appropriate: if a copyright owner provides the “Pin It” button, either himself or through his agents, he should be barred from subsequently bringing an infringement suit for use of that image.

[24]      Analysis of the foregoing (the “Pin It” button or any comparable form of encouragement by the copyright holder) also bears on the other four elements.  Like attribution, the “Pin It” button does not protect the earning potential of the image, but it may boost the earning potential of the underlying product itself.  Instead of cannibalizing the market, the infringement may effectively improve sales of the product.  For instance, a pinned photograph of an article of clothing might help a retailer-copyright holder garner sales of that article, whereas an image of décor or food might have no impact on marketability to the copyright holder.  Use of the image on Pinterest can therefore affect two separate markets: the market for the image and the market for the product.  Current fair use analysis only considers the market for the image.  Although potentially too speculative, perhaps courts could also consider the effect on the market for the product.  However, any consideration of the market for the product may undermine Pinterest’s “no self-promotion” policy as well as divert attention from the predominant purpose of copyright law, which is to encourage the creation of original works.  In the alternative, if an image pinned through the “Pin It” button is not fair use, arguably the courts should find that a copyright holder who puts a “Pin It” button on his website grants the pinner either a limited form of consent or a nonexclusive license, which need not be in writing.[75]  By finding consent or a license, a court could avoid upsetting the law on copyright infringement and fair use, at least with respect to the “Pin It” button.

[25]      Countervailing public policy might also support a Pinterest user’s First Amendment argument.[76]  Although Pinterest users may elect to draw upon their freedom of speech guarantees as a policy matter, the First Amendment will not likely be a persuasive argument under case law.  After Eldred v. Ashcroft[77] and Harper & Row, Publishers, Inc. v. Nation Enterprises,[78] the traditional contours of copyright protection are constitutional because proper limits, such as the fair use doctrine and idea-expression dichotomy, are already in place.

[26]      Finally, as always, the relevant statute of limitations is an important consideration.  Copyright holders will have a claim until the statute of limitations lapses.  That period will depend on the facts of an individual case.  Perhaps there is an argument for shortening the statute of limitations with respect to claiming copyright infringement on social networking sites.  Public policies of judicial efficiency and certainty favor such a change, and given the pervasiveness of Pinterest-type file sharing, court dockets could become congested if a Pinterest user loses.  It is within the province of the legislature to consider such changes, but in the meantime, Pinterest users are not immune to allegations of copyright infringement.  Whether a court will rule in their favor remains to be seen.

[27]      Many commentators posit that Pinterest’s hiring of Michael Yang as head of its legal department signals just how critical resolving the copyright question is to the company’s longevity.[79]  In the wake of Yang’s hire, Pinterest will likely discuss the ramifications of its current business model for the user and the company itself, perhaps resulting in the modification of its cornerstone image-sharing business.

  2.  Pinterest Liability

[28]      In addition to claims against its users for direct infringement, copyright holders may also assert a secondary liability claim against Pinterest.  Contributory liability applies to “one who, with knowledge of the infringing activity, induces or causes or materially contributes” to another’s infringement.[80]  In Sony Corporation of America v. Universal City Studios, Inc., the Court held a product “need merely be capable of substantial noninfringing uses” to avoid contributory liability.[81]  Given that Pinterest is capable of substantial noninfringing uses, showing that Pinterest had knowledge is not a foregone conclusion.  For example, Pinterest users may legally own the content they are uploading, or may have permission from the owner to use the content.

[29]      A new inducement doctrine may be more helpful.  In MGM Studios Inc. v. Grokster, Ltd., the Court held that “one who distributes a device with the object of promoting its use to infringe copyright, as shown by clear expression or other affirmative steps taken to foster infringement, is liable for the resulting acts of infringement by third parties.”[82]  Given that Pinterest does not ask users to consider permissions before each pin, its business model is distinguishable from that of Facebook.  Moreover, Facebook encourages sharing personal experiences and photos whereas Pinterest encourages sharing content created by others.  Just as Grokster distributed free software that allowed users to share electronic files through peer-to-peer networks, Pinterest provides an interface for users to freely share images.[83]  Although there are provisions protecting the online service provider (“OSP”),[84] Pinterest must tread carefully to avoid crossing the inducement threshold.

[30]      The Digital Millennium Copyright Act (“DMCA”) broadened copyright owners’ rights beyond the Sony holding.[85]  Title II of the DMCA contains the Online Service Provider Safe Harbor provisions.[86]  These provisions insulate OSPs from liability for transmitting, routing, storing, caching, or linking to unauthorized content if the OSP meets specific conditions.[87]  The OSP must also meet the following threshold conditions: (1) adopting, implementing, and informing subscribers of a policy providing for termination of users who are repeat copyright infringers; (2) adopting standard technical measures used by copyright owners to identify and protect copyrighted works; and (3) designating an agent to receive notification of claimed infringement from copyright owners and registering that agent with the Copyright Office.[88]

[31]      Pinterest explicitly addresses the DMCA on its website and describes how Pinterest complies with these provisions.[89]  With a prominent Copyright and Trademark Section, Pinterest seeks not only to educate users but also to protect itself in recognition of the potential for copyright infringement.  So long as Pinterest meets the DMCA requirements, Pinterest is only vulnerable to consumer and industry backlash.  Conversely, Pinterest’s policy leaves users largely unprotected.


IV.  Shaping Pinterest’s Future

A.  If Copyright Holders Win a Case Against a Pinterest User or Pinterest

[32]      If Pinterest or its users prevail against a copyright holder in a lawsuit, Pinterest will likely continue business as usual, but copyright law as we know it may forever change.  This is especially true if a court finds no infringement or holds that pins fall within fair use.  Such a decision may significantly weaken copyright law, but given the pervasiveness of pinning and file sharing on the Internet, the law may in fact head in that direction.  For this Article, however, only the alternative is considered.

[33]      If a Pinterest user loses a case against a copyright holder, the floodgates will open for other copyright holders to pursue similar claims.  Disgruntled copyright holders will feel betrayed by the company.  Users may eschew the system.  It will take a complete restructuring of the core business for Pinterest to recover.  Timing compounds the pressure on Yang.  If a court rules before Pinterest can monetize, Pinterest may lose the opportunity to capitalize on the network it so famously achieved in the last two years.

[34]      One incremental change that Pinterest could implement is to start asking the user to consider permissions before each pin.  Facebook employs such a method and users can still pin at their own peril.[90]  A warning would put users on notice and hopefully deter copyright infringement, but alone it will not be enough to save Pinterest or its users if a court does indeed rule that pinning constitutes infringement.  A more drastic measure would be to ask the source for permission to pin or repin.  However, this solution would likely be overly burdensome and curb use of the site.  Another resolution would be to contract with certain sites to enable Pinterest users to pin from their sites wholly, not just for one pin.  However, website owners might ask for remuneration in exchange for allowing their content on the site.  This solution also comes with certain caveats.  For example, limiting the available sites undermines the purpose of Pinterest.  The ability to share essentially everything is paramount to Pinterest’s appeal.  Moreover, Pinterest would need to modify its policy against self-promotion.

[35]      There are also endless ways in which Pinterest could completely revamp or reinvent the business.  For instance, Pinterest could become a consumer research company.  Pinterest could collect, synthesize, and sell data on the newest trends to companies without cluttering its website with the types of advertisements and affiliate links that consumers disdain.  Given that Pinterest already has vast amounts of data at its disposal, all Pinterest would need to do is modify the user agreement.  Educating users would be crucial to this strategy, and users will be less hesitant only if they know Pinterest is not farming out individual information.  This approach exploits the “Internet genome project” as referenced earlier in the Article.  Still, entering the arena of consumer research presents challenges for Pinterest.  Pinterest would be subject to privacy concerns and public backlash if Pinterest sells data exceeding the threshold granularity that consumers are reasonably willing to provide.

[36]      If Pinterest loses against a copyright holder for secondary liability, it should pursue the same course as it would in the case where a user loses.  The impact of such a ruling on Pinterest again intensifies if Pinterest does not monetize quickly enough.  If Pinterest remains indebted to venture capitalists and a court orders Pinterest to pay a judgment, Pinterest will be in danger of losing those backers and the company will have an even bleaker balance sheet.  Additionally, if Pinterest has an interest in going public, regardless of its future business strategy, underwriters, the financial community, and the public will expect Pinterest to possess substantial tangible assets as well as intangible assets, such as intellectual property and goodwill.  However, these individuals will find a highly leveraged company in or anticipating litigation even less attractive.

[37]      If Pinterest loses a case for violating the DMCA, Pinterest can cure the situation more easily.  Put simply, Pinterest would need to adhere to its policies set forth in its Trademark & Copyright Section.[91]  If a court finds Pinterest in violation of the DMCA and nothing else, although a strategic change might bolster success in the future, Pinterest should maintain current operations and begin to pursue other markets and consumer bases as suggested in the remainder of this Article.

B.  Brand Management

[38]      Pinterest has a registered service mark.[92]  Therefore, Pinterest should manage its brand.  Pinterest’s brand strategy should include comprehensive analysis of its graphics, marketing efforts, and consumer satisfaction index.  Using its official badge and Bello Script font logo developed in 2011, Pinterest could market and sell logo products, thereby gaining additional protection under trademark, trade dress, and design patent law.[93]  As evidenced by Pinterest’s office, Pinterest already produces a fair amount of swag.[94]  Branding the office space can help corral support and interest in the product, but there is also a market for branded products.  Pursuant to this strategy, Pinterest should consider additional trademarks for related words such as “Pinterested” and “Pinteresting.”  Although branded products should probably be a secondary source of income, Pinterest should at least consider the production costs and benefits.

[39]      With proper brand management, Pinterest could seamlessly enter new markets, including e-commerce.  For instance, Pinterest can develop a Pinterest store and sell special edition items for its top pinners.  Although Pinterest discourages self-promotion, Pinterest itself can promote by building partnerships and its brand.  In exchange for Pinterest’s endorsement, companies or users can compensate Pinterest.  In contrast to affiliate links and advertising, Pinterest’s endorsement would be direct and effectively a seal of approval.  However, as a technology company, Pinterest would need to develop the requisite credibility and competency.  There would also be an inherent risk of pinners manipulating the pins to become a top pinner.

[40]      Once Pinterest enters into e-commerce, the natural progression is to invade mobile commerce.  During the three-month period ending in April 2012, almost nineteen million Americans, or 17.4% of American[95] smartphone owners, used their smartphones to purchase consumer goods.  Of these, almost one in three purchasers used their device to buy clothing or accessories, 23.5% purchased tickets, 22.3% bought electronics or household appliances, 21.2% purchased meals, and 20.2% bought daily deals or discount coupons.[96]  On August 14, 2012, Pinterest rolled out new applications for Android smartphones and the iPad in addition to updating its year-old iPhone application with improved functionality.[97]  Indeed, Pinterest should continue to stay abreast of mobile commerce trends because mobile commerce provides yet another medium to reach users.  Correspondingly, Pinterest should promptly file patent applications and claim patent or other applicable intellectual property protection for any new inventions or improvements.

C.  International Growth

[41]      Finally, although the United States still represents enormous potential for Pinterest,[98] Pinterest will eventually reach a saturation point domestically.  Therefore, Pinterest should continue its growth across the globe.  Between May 2011 and January 2012, Germany drew 67,000 individual visitors, an increase of over 2,956 percent.[99]  Spain came in a close second with 62,000 individuals, for an increase of 1,348 percent.[100]  The United Kingdom had the largest market with 245,000 unique visitors in January 2012.[101]  In addition to English, Pinterest has Spanish and Portuguese language sites.[102]  On August 24, 2012, Pinterest rolled out its German and Dutch sites.[103]  Aptly, Pinterest seems attuned to consumer needs.  The integration of these two new languages is consistent with the consumer trends in European usage reported earlier in the year.[104]  With some tweaks for cultural differences, Internet companies often find additional success in international markets,[105] and many of the same domestic growth strategies can translate to gains in an international arena.


V.  Conclusion

[42]      Pinterest capitalizes on societal narcissism, materialism, and consumerism, but its strategy has not paid dividends yet.  Pinterest must act quickly to monetize its growing network, not only to become self-reliant and fund R&D for future growth, but also to preempt future lawsuits that could spell the end for Pinterest.  Although Pinterest faces potential trademark and privacy concerns, its greatest challenge is in the area of copyright.  Executives and lawyers across the industry ought to consider Pinterest a “game changer” in this respect.  The courts have not yet ruled on whether the type of file sharing Pinterest users engage in will incur liability.  If the courts find no liability, a new paradigm in copyright liability will emerge and Pinterest will be there to lead the charge.  However, if copyright holders are successful in showing either direct infringement by Pinterest users or secondary liability by Pinterest, recovery will be extremely costly, not to mention difficult from a public relations perspective.  Not only will Pinterest need to revamp its business model, but its debt will only rise.  Regardless, Pinterest should profit from the assets it does have, namely its brand and its network.  Pinterest should obtain intellectual property protection for these assets to help penetrate other markets, such as commerce and research, both domestically and internationally.

[43]      With bated breath, the world waits.  Will the copyright landscape change?  Will Pinterest reinvent the pin again?  Will Pinterest become a stalwart by resolving the copyright question or will it fade into obscurity like so many other Internet start-ups?  We already know Pinterest is no ordinary start-up.  Indeed, Pinterest has experienced unprecedented growth, yet its impact on society remains to be seen.  While Ben Silbermann is Pinterest’s Chief Executive Officer and will determine the company’s strategic direction,[106] it will be Yang who advises Pinterest as the company struggles to overcome its legal woes.  Batter’s up, Yang, and the clock is ticking.

*J.D. Candidate 2013, University of San Diego School of Law, and Lead Articles Editor for the San Diego International Law Journal.  She would like to thank Professor Ted Sichelman and the Richmond Journal of Law and Technology editors for their guidance on this Article.  A special thank you goes to her family for their continued support.


[1] See Keith Dsouza, 11 Free and Useful Sticky Notes (Sticky Pad) Software for Your Desktop, Techie Buzz (May 20, 2008),

[2] See Leander Kahney, Apple Lets Cat Out of the Bag, Wired (June 29, 2004), (discussing the anticipated release of the Mac OS X 10.4 version, and highlighting that “the system will summon a range of accessory apps — including weather, world clocks, a calendar and sticky notes — and dismiss them with a single keyboard button”).

 [3] See Team, Pinterest,  (last visited Oct. 5, 2012).

 [4] See Reb Carlson, Why Pinterest Is Like No Other Social Network, 360i Blog (Dec. 5, 2011), (“[U]nique visitors increased from 418,000 in May [2011] to 3.3 million in October [2011], meaning traffic increased for this site sevenfold in five months alone.”); see also Careers, Pinterest, (“Pinterest is one of the fastest growing social services in the world.”) (last visited Sept. 12, 2012).

 [5] See Carlson, supra note 4 (“Beyond its impressive growth over a year, Pinterest has also emerged as a very new type of platform when it comes to the way people engage with content in its community.”).

[6] See Owen Thomas, Uh Oh! Amazon Researchers Say Pinterest Doesn’t Generate A Lot Of Sales, Bus. Insider (Aug. 28, 2012), (indicating that this lack of monetization “is a big problem for Pinterest, because the whole idea of the site is that it’s supposed to be better at monetizing social activity than Twitter or Facebook.”); see also Carlson, supra note 4 (“There is no paid advertising or media (yet), so all engagement has been purely organic.”).

[7] See Copyright & Trademark, Pinterest, (last visited Sept. 29, 2012).

 [8] Nicholas Carlson, Pinterest Just Hired a Big Name Lawyer From Google to Deal With One of Its Biggest Threats, Bus. Insider (June 8, 2012), Yang’s prior experience at Google includes dealing with several high profile controversies.  Id.  In particular, Yang handled issues with Google Chrome’s terms of service in 2008 and Buzz in 2010, and further acted as Google’s spokesperson, defending a new privacy policy in Washington, D.C. in 2012.  Id.

 [9] See Matt McGee, Pinterest Hires Away A Google Attorney To Start Its Own Legal Department, Marketing Land (June 8, 2012 7:04 P.M.),; Carlson, supra note 4 (“Pinterest faces a serious legal challenge in its future: a huge portion of the content hosted on the site is copyrighted content, posted without the consent or even knowlege [sic] of the copyright owners.”).

 [10] Siva Vaidhyanathan, The Googlization of Everything (And Why We Should Worry) 2 (2011) (“Google puts previously unimaginable sources at our fingertips – huge libraries, archives, warehouses of government records, troves of goods, the coming and goings of whole swaths of humanity.”).

 [11] Id.

 [12] Debra Borchardt, Pinterest is a $7.7 Billion Company, Forbes (Apr. 16, 2012),

[13] See Bruce Watson, Facebook IPO Valuation Sets Record: Is It Really Worth $104 Billion?, Daily Fin. (May 17, 2012), Underwriters set a price that valued Facebook at $104 billion, but that valuation has come under fire since Facebook’s Initial Public Offering on May 17, 2012.  Lee Spears & Sarah Frier, Facebook Told Regulators IPO Range Was Near Fair Value, Bloomberg (June 15, 2012),

 [14] Borchardt, supra note 12.

 [15] Id.

 [16] Id.

 [17] Facebook-Led Social Media Market is Redefining Communication in the Digital and Physical Worlds, ComScore Data Mine (Feb. 20, 2012),

[18] Todd Wasserman, Pinterest is Now the No. 3 Social Network in the U.S., (Apr. 6, 2012),

[19] Borchardt, supra note 12.

[20] Salvatore Rodriguez, Pinterest grew more than 4,000% in one year, report says, Los Angeles Times (June 15, 2012),

 [21] What is Pinterest?, Pinterest, (last visited July 8, 2012).

 [22] Getting Started, Pinterest, (last visited July 8, 2012).

 [23] Pinning 101, Pinterest, (last visited Oct. 5, 2012).

 [24] Id.

 [25] Umika Pidaparthy, Pinterest not manly enough for you? Try these sites, bro, (June 22, 2012),

 [26] Id.

 [27] See Daniel Altman, United States of Narcissism, The Daily Beast (July 17, 2011),

 [28] Jean Twenge, The Narcissism Epidemic: Living in the Age of Entitlement 4, 67-68 (2009).

 [29] See id. at 38.

 [30] See generally id. at 73-108.

 [31] See id. at 111.

 [32] Cf. Brooke Overby, Contract, in the Age of Sustainable Consumption, 27 J. Corp. L. 603, 626-27 (2002).

 [33] See Alex Woolf, Consumerism 10 (2004).

 [34] See David Kirkpatrick, The Facebook Effect: The Inside Story of the Company That is Connecting the World 263-64 (2010).

 [35] See Borchardt, supra note 12.

 [36] See Mark Lemley & David McGowan, Legal Implications of Network Economic Effects, 86 Calif. L. Rev. 479, 483, 594–98 (1998) (distinguishing network effects from general positive externalities (e.g., where education confers social benefits all those in the classroom), economies of scale (where the effect is on the supply rather than the demand side), path dependence (the tendency of history to influence present decision-making), models of collective behavior, and otherwise exclusionary behavior).

 [37] See id. at 488, 491–94 (noting that lesser forms of network effects include virtual networks (which provide value that increases with the number of additional users of identical and/or interoperable goods) and positive feedback effects (where value increases even where the goods are not themselves connections to a network and do not interoperate with like goods)).

 [38] Id. at 496 (defining “‘[t]ipping’ . . . [as] the tendency of one system to pull away from its rivals in popularity once it has gained an initial edge.”).

 [39] Given that Pinterest has a registered trademark, the mark has protection under the Lanham Act.  On August 31, 2012, Pinterest filed a complaint against a Chinese trademark applicant Qian Jin for trademark infringement; Jin filed applications in March 2012 for the use of “‘Pinterest’ and ‘Pinterests’ for hotel and food services and for advertising and marketing.”  Victoria Slind-Flor, Google, Louboutin, Pinterest, Cengage: Intellectual Property, (Sept. 6, 2012),  In addition to trademark infringement, Pinterest alleges cyberpiracy, trademark dilution and cybersquatting, that is, “bad-faith registration and use of numerous domain names containing, or confusingly similar to, Pinterest’s famous and federally registered PINTEREST trademark.”  Dara Kerr, Pinterest Gives Legal Punch to ‘Serial Cybersquatter,’ cnet (Sept. 5, 2012, 7:26 PM),  In addition to protecting its mark, Pinterest must not facilitate the infringement of other marks.  Pinterest’s trademark policy pointedly states that “[a]ccounts with usernames, Pin Board names, or any other content that misleads others or violates another’s trademark may be updated, transferred, or permanently suspended.”  Copyright & Trademark, supra note 7.  Although Pinterest acknowledges potential for infringement, dilution, or false advertising, its policy leaves users with little guidance on what might lead to violation of one’s trademark.  One report “estimates that brand squatting on Pinterest affects 90 percent of top brands,” such as FedEx, Coca Cola, McDonald’s and Dell PC.  Lauren Rae Orsini, Pinterest’s New Problem: Brand Squatters Are Screwing Over Big Companies, Bus. Insider (Mar. 16, 2012),  Whereas Twitter and Facebook have already addressed “trademark infringement and impersonation [of] brands and celebrities,” Pinterest has yet “to regulate who has the rights to a username.”  Id.  Pinterest may also be liable for indirect or contributory infringement.

 [40] Although internet companies may claim to give users substantial control over how their actions and preferences are collected and used, the reality is that users are at the mercy of these companies, and user choices mean very little; given the amount of surveillance and tracking companies engage in, the privacy policies and infrastructure stack the odds against an unsuspecting or even educated user.  See, e.g., Vaidhyanathan, supra note 10, at 84 (examining Google’s privacy policy, or lack thereof).

 [41] Sarah E. Needleman & Pui-Wing Tam, Pinterest’s Rite of Passage–Huge Traffic, No Revenue, Wall St. J. (Feb. 16, 2012),

 [42] Pinning 101, Pinterest, (last visited Sept. 30, 2012).

 [43] Id.

[44] Jeremy Cabalona, Interest in Pinterest Reaches a Fever Pitch [INFOGRAPHIC], Mashable (Apr. 29, 2012),

[45] See Bill Gurley, How to Monetize a Social Network: MySpace and Facebook Should Follow TenCent, (Mar. 9, 2009),

 [46] Kirkpatrick, supra note 34, at 272.

 [47] Women Spend More Time Social Networking than Men Worldwide, comScore Data Mine (Dec. 22, 2011),

 [48] See 12 Things You Need to Know About Pinterest for Business, (Sept. 4, 2012),

 [49]  Kristin Piombino, Pinterest users spend more money than Facebook, Twitter users, ragan (Sept. 13, 2012),

 [50] Needleman & Tam, supra note 41.

[51] Michelle Singletary, New Privacy Policy Lets Google Watch You – Everywhere, Wash. Post (Mar. 3, 2012),

 [52] Kirkpatrick, supra note 34, at 266.

 [53] Id. at 267.

[54] Peter Pham, Pinners and Losers, The Motley Fool (July 11, 2012),

[55] Scott Brave, Pinterest, We’ve Got a Business Model for You, GigaOm (Mar. 24, 2012),

 [56] Jennifer Heidt White, Safe Haven No More: How Online Affiliate Marketing Programs Can Minimize New State Sales Tax Liability, 5 Shidler J. L. Com. & Tech. 21, 1 (2009) (describing affiliate market models).

[57] Conversion rate measures the share of users who made retail purchases after clicking on Pinterest’s link.  Although Pinterest’s traffic does not convert as well (1.02%) as Google search traffic (1.62%) or Facebook (1.13%), its potential is huge.  Nicholas Carlson, This Stat Reveals the Incredible Potential of Pinterest – and Why Amazon, Google, or Facebook Should Buy It, Bus. Insider (July 11, 2012),

 [58] Laurie Segall, Pinterest Quietly Profits Off Its Users Links, CNN Money (Feb. 10, 2012),

 [59] Samantha Murphy, Pinterest Partner: Yes, They’re Making Money from Pins, Mashable (Feb. 8, 2012),

 [60] Matt McGee, Pinterest Drops Skimlinks, Might Try Ads; Says Copyright Issues Not a Significant Issue Yet, Marketing Land (Feb. 16, 2012),

 [61] Theodore Levitt, Innovative Imitation, Harv. Bus. Rev., at 63 (Sept. 1966), available at (noting that innovation furnishes “(1) newness in the sense that something has never been done before, and (2) newness in that it has not been done before by the industry or by the company now doing it”).

 [62] See, e.g., Bilski v. Kappos, 130 S. Ct. 3218, 3228, 3231 (2010) (holding that a method for hedging risk in the commodities market was not patentable subject matter; still, a patent-eligible process may include some methods of doing business).

 [63] 17 U.S.C. § 102(a)(5) (2006).

 [64] 17 U.S.C. § 102(a) (2006).  Although relevant, the statute does not require formalities today.  See id.

[65] See Feist Publ’n, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 345-48 (1991) (holding that a directory that contains absolutely no protectable written expression, only facts, meets the constitutional minimum for copyright protection if it features an original selection or arrangement).

[66] See, e.g., Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 55 (1884) (holding that a photograph by Sarony deserved copyright protection given that Sarony arranged the subject matter, lighting and composition).

 [67] 17 U.S.C. § 101 (2006).

 [68] Cf. Michelle Sherman, To Pin or Not To Pin: How Businesses Can Use Pinterest and Reduce Their Legal Risks of Copyright Infringement, 17 Cyberspace L. 3, 5 (Apr. 2012) (noting that Pinterest user instructions “do not say anything about paying a royalty to the original creator or getting their permission in advance to use their image on Pinterest”).

 [69] More Attribution and Inline Play, Pinterest (June 20, 2012), (“In May, we announced that content from some of the Web’s biggest creative communities—Flickr, YouTube, Behance, and Vimeo—will be clearly, consistently and automatically credited when pinned to Pinterest. Today, we’re thrilled to announce that photos from 500px, handcrafted and vintage items from Etsy, videos from Kickstarter, presentations from SlideShare, and sounds from SoundCloud will all show the same attribution.”).

 [70] Aaron, Comment to Instant Answers, Pinterest (Mar. 6, 2012, 9:41 AM),

 [71] 17 U.S.C. § 107 (2006).

[72] Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1169 (9th Cir. 2007) (holding that Google’s use of thumbnails was transformative fair use because it improved access to information on the Internet versus artistic expression); Bill Graham Archives v. Dorling Kindersley Ltd., 448 F.3d 605, 615 (2d Cir. 2006) (holding that even though the images were reproduced in their entirety, the images were displayed in a reduced size and scattered among many other images and text so as to ensure the reader’s identification of the posters as historical artifacts; notwithstanding the creative nature of the posters, publishers’ use of the posters in biographical book was transformative).

 [73] 17 U.S.C. § 107 (2006); see, e.g., Harper & Row, Publishers, Inc. v. Nation Enter., 471 U.S. 539, 560-68 (1985).

 [74] See generally Goodies, Pinterest, (last visited Sept. 28, 2012) (explaining the “Pin It” button).

[75] Asset Mktg. Sys., Inc. v. Gagnon, 542 F.3d 748, 754 (9th Cir. 2008); cf. 17 U.S.C. § 204 (2006) (requiring an exclusive license to be in writing).

[76] Cf. U.S. Const. amend. I (describing the freedom of speech guarantees granted by the First Amendment).

 [77] Eldred v. Ashcroft, 537 U.S. 186, 221 (2003) (stating that “[t]o the extent such assertions raise First Amendment concerns, copyright’s built-in free speech safeguards are generally adequate to address them”).

 [78] Harper & Row, Publishers, Inc. v. Nation Enter., 471 U.S. 539, 560, 595 (1985) (holding that “First Amendment protections [are] already embodied in the Copyright Act’s distinction between copyrightable expression and uncopyrightable facts and ideas, and the latitude for scholarship and comment traditionally afforded by fair use,” while refusing to expand the doctrine of fair use to create what amounts to a public figure exception to copyright).

 [79] See Carlson, supra note 4.

[80] Gershwin Publ’g Corp. v. Columbia Artists Mgmt., Inc., 443 F.2d 1159, 1162 (2d Cir. 1971).

 [81] Sony Corp. of Am. v. Universal City Studios, Inc., 464 U.S. 417, 442 (1984).

 [82] Metro-Goldwyn-Mayer Studios, Inc. v. Grokster Ltd., 545 U.S. 913, 919 (2005).

 [83] Id. at 913; see Jonathon Bailey, The Great Pinterest Divide: To Opt Out or Not, Plagiarism Today (Feb. 23, 2012), (likening the file sharing in Grokster to the image sharing on Pinterest and noting that “since Pinterest is actively encouraging pinning of images found on ‘any’ website and their business model relies on widespread infringement, it could be liable” under the Grokster test as an inducer of copyright infringement).

 [84] See generally 17 U.S.C. § 512 (2006).

 [85] See generally id. § 1201 (2006); Sony Corp. of Am. v. Universal City Studios, Inc., 464 U.S. 417 (1984).

[86]  17 U.S.C. § 512 (2006).

[87] Id. § 512(c)(2).

[88] Id. § 512(i)(1)(A)-(B), (c)(2).

 [89] Copyright & Trademark, supra note 7.

[90] See Privacy Settings, Facebook, (describing a privacy setting that requires the content owner’s permission when other users attempt to tag them or their content) (last visited Oct. 1, 2012); see also Statement of Rights and Responsibilities, Facebook, (requiring users to agree not to “post content or take any action on Facebook that infringes or violates someone else’s rights,” permitting Facebook to remove any infringing content without notice, and providing a mechanism whereby users can report intellectual property infringement) (last visited Oct. 1, 2012).

[91] Copyright & Trademark, supra note 7.

[92] PINTEREST, Registration No. 4,145,087.

[93] See 15 U.S.C. § 1051 (2006); see also id. § 1125; 35 U.S.C. § 171 (2006).

[94] The Daily Muse, Check Out Pinterest’s Palo Alto Office, Where Employees Get Weekly Happy Hours, Hackathons, and Fan Swag, Bus. Insider (June 5, 2012),

[95] Smartphone Shoppers: Top 10 On Device Purchase Categories, ComScore Data Mine (June 4, 2012),

 [96] Id.

[97] Introducing Pinterest for Android, iPad and iPhone, Pinterest (Aug. 14, 2012),

 [98] Pinterest announced open registration on August 8, 2012.  As a result, new users do not need to rely on their Facebook or Twitter connections to send them an invite; they need only provide an email address to register.  Open Registration!, Pinterest (Aug. 8, 2012),

 [99] Is Pinterest the Next Big Social Network in Europe?, ComScore Data Mine (Feb. 22, 2012),

 [100] Id.

 [101] Id.

 [102] Pinterest ya está disponible en más idiomas, Pinterest (June 28, 2012),

 [103] Pinterest in German and Dutch!, Pinterest (Aug. 24, 2012),

 [104] See Is Pinterest the Next Big Social Network in Europe?, supra note 99.

 [105] Cf. Emily Lambert, When a Faux Pas Can Really Cost You, Forbes (Aug. 12, 2004),

[106] Heather Kelly, Pinterest releases new iPad and Android apps, CNNTech (Aug. 15, 2012), (quoting Silbermann who said “‘our goal is to actually get you offline, to get you to go out’”).

Technologies-That-Must-Not-Be-Named: Understanding and Implementing Advanced Search Technologies in E-Discovery


 by Jacob Tingen

I. Introduction

[1] The Federal Rules of Civil Procedure were created to promote the “just, speedy, and inexpensive determination of every action and proceeding.”[1] Unfortunately, in the world of e-discovery, case determinations are often anything but speedy and inexpensive.[2] The manual review process is notoriously one of the most expensive parts of litigation.[3] Beyond expense, the time and effort required to carry out large-scale manual review places an immense burden on parties, nearly destroying the possibility of assessing the merits of early settlement before expensive review has already been carried out.[4] Due to the difficulty inherent in the manual review process and the potential for human error, courts have become tired of seeing what they view as incompetence among attorneys.[5] All acknowledge that technology is the main culprit—e-mail alone produces 100 billion new messages daily.[6] At the same time, technology may in fact provide the solution to the e-discovery problem.[7]

[2] In response to the e-discovery challenge, courts and commentators have begun to refer to “new” and “emerging” search technologies.[8] Some tout them as the holy grail of e-discovery, while others dismiss the new technologies as unfit for the task or unable to compete with the raw capability of hundreds of attorneys reviewing documents for hours on end.[9] Even now, doubts exist as to whether new technologies really can help resolve the difficulties experienced by attorneys tasked with increasingly demanding discovery requests.[10]

[3] Even for those who are aware of the existence of advanced search and review tactics beyond keyword search, many questions remain for attorneys and judges alike. First, what are the new and emerging technologies? While courts and commentators mention the existence of the technologies, there is not much guidance with regard to what the new technologies are and what they accomplish.[11] Second, are the new technologies superior to the manual review process? Understandably, attorneys are hesitant to use an unfamiliar e-discovery product that may not work better than the e-discovery process to which they are already accustomed.[12] Third, if attorneys do use a new search and review process, what standards of accuracy or defensibility is a court likely to impose? When managing the discovery process, attorneys want to be sure that the method of production satisfies the expectations of the court.[13]

[4] This article answers those questions. It demonstrates that attorneys have a legal duty to understand and use advanced conceptual search and review technologies as part of an e-discovery review process when dealing with large amounts of information. It then briefly explains how those technologies actually work, why they are superior in both accuracy and efficiency to traditional manual review, and how one can defend use of these new technologies in court.

[5] Part II of this article discusses the need for lawyers to reconsider which search methodologies to use in the e-discovery review process. It reveals that lawyers currently have a duty to understand technology to competently represent their clients and argues that this duty should extend to a cursory understanding of e-discovery search tactics. It discusses the reluctance of the legal community to adopt new search technologies for a variety of reasons, including economic concerns and lack of experience with technology. It briefly explains recent judicial decisions advocating the use of advanced search and review technologies.

[6] Part III provides a background of advanced search technologies, some explanation of what they are, and information on their use in the e-discovery context. It analyzes recent research finding that advanced search and review methodologies are more effective than a keyword process followed by extensive manual review. Furthermore, it discusses steps to ensure that counsel’s implementation of advanced search technologies will be defensible in court.

[7] Finally, Part IV addresses some concluding issues. It identifies when advanced search technologies should be used as opposed to other search and review methods. It discusses the issue of attorney-client privilege and argues that courts should be lenient when evaluating whether privilege has been waived by inadvertently produced documents after an advanced search of millions of documents. It argues that when practitioners properly implement advanced search technologies, they meet their legal duty and help further the goals of the Federal Rules of Civil Procedure by making e-discovery more efficient, more accurate, and less expensive.

II. Adoption of New Search Technologies

[8] “Lawyers need to rethink how they perform ‘searches.’”[14] Familiar with keyword and Boolean search operations from widespread experience with popular legal research services, attorneys tend to apply the same skill set when they approach e-discovery.[15] Unfortunately, simple keyword searches followed by extensive manual review have proven inadequate when it comes to finding the responsive documents necessary to litigate a case on its merits.[16] Overcoming the shortcomings of keyword search and the high expense of complete manual review has become an important goal in e-discovery practice.[17] Courts and commentators have pointed to emerging search and review technologies as the answer to the manual review problem.[18] In effect, attorneys must have a basic understanding of e-discovery and the available search technologies to competently represent their clients.[19]

A. A Legal Duty To Use Advanced Search Technologies?

[9] Requiring attorneys to have a foundational understanding of technology is not without precedent.[20] In the seminal Zubulake cases, Judge Scheindlin went so far as to delineate a new legal duty, requiring attorneys to understand their client’s technology infrastructure.[21] Zubulake, while instructive mainly from the context of determining when the duty to preserve is triggered, also provides helpful background in examining whether attorneys have a duty to cultivate an understanding of technology.[22]

[10] The plaintiff in Zubulake leveled charges of gender discrimination against her former employer in August of 2001.[23] While the factual and procedural background makes for an interesting read for anyone interested in e-discovery, the primary thrust of the e-discovery problems in the case arose from the plaintiff’s request for certain e-mails which the defendant repeatedly failed to produce.[24] The 2006 Amendments to the Federal Rules of Civil Procedure were unavailable to the parties involved, and as a result, Judge Scheindlin’s commentary throughout the entire series of Zubulake cases in some way set the stage for the new rules and continues to prove influential in modern e-discovery practice and discussion.[25] In particular, Judge Scheindlin held that “counsel must become fully familiar with her client’s document retention policies, as well as the client’s data retention architecture.”[26] That’s legalese for saying lawyers must learn to speak tech.[27]

[11] In the future, lawyers must become competent when dealing with and talking about technology.[28] Judge Scheindlin clarified this expectation in 2004 when she said that during the discovery process, attorneys must speak with their client’s information technology personnel to learn about their client’s system-wide information storage procedures and policies.[29] In short, attorneys have a legal duty to understand technology.[30] This article argues that this duty should also extend to understanding and implementing “emerging” search technologies.

[12] In the years since Zubulake, the field of e-discovery has experienced further advances in research and sophistication, including the 2006 Amendments to the Federal Rules of Civil Procedure,[31] guidelines and standards developed by the Sedona Conference,[32] a rising level of education among the bench,[33] and the development of new technologies to assist in searches.[34]

[13] The Sedona Conference Cooperation Proclamation (“Proclamation”) suggests a more collaborative approach to e-discovery in litigation.[35] Endorsed by judges across many jurisdictions, the Proclamation promotes education in e-discovery technology to ensure the “just, speedy, and inexpensive determination of every action.”[36] In particular, the Proclamation identifies the need for cooperation and understanding between not only plaintiff and defense counsel, but also among technology professionals.[37] Furthermore, it advocates educating attorneys about the tools available through law school programs and classes to help new lawyers understand the technical, legal, and cooperative aspects of e-discovery, as well as programs to help businesses understand how to manage their electronic records.[38] The need for training with regard to e-discovery strategies and technologies is widely expressed, and endorsed, by the judiciary in many states.[39]

[14] Indeed, some believe that the “legal profession is at a crossroads: the choice is between continuing to conduct discovery as it has ‘always been practiced’ . . . or, alternatively, embracing new ways of thinking in today’s digital world.”[40] Clients can no longer bear the mounting costs of e-discovery, and overburdened judges are beginning to recommend newer search and review methodologies to attorneys.[41] Extensive manual review of every document in litigation is already impossible in many cases and manual review guided by keyword search alone has proven ineffective in others.[42] The attorney of the future must embrace new technologies or face being drowned in an overwhelming sea of data.[43]

B. Resistance To New Search Technologies

[15] Despite the need for adoption of better technologies, some attorneys assert that keyword and Boolean searches are the industry standard and that newer technologies are cost-prohibitive and less accurate.[44] This assertion is incorrect.[45] In the face of a growing amount of evidence showing that new search technologies can make the e-discovery process easier and more efficient,[46] the legal community tends to push back against newer search technologies for a variety of reasons.

[16] First, the manual review process is notorious for being the most expensive piece of an e-discovery request.[47] With upwards of fifty percent of e-discovery costs attributed to the manual review process, an attorney’s potential earnings can be tough to ignore.[48] The conflict between the legal industry’s self-interest and the just, speedy, and “inexpensive determination” of a case creates serious ethical concerns.[49] Typical manual review costs can range anywhere from two hundred and fifty dollars to five hundred dollars per hour to scan through mountains of documents, a process which can take months.[50] In many situations, law firms charge a premium to boost profits. For example, one contract attorney recently learned that her firm billed its client two hundred and fifty dollars per hour during a manual review while only paying her thirty-five dollars per hour.[51] Firms clearly have an economic incentive to continue using a manual review process that has a potential for huge profits. Acknowledging that new search technologies are more effective than manual review may mean giving up revenues the legal industry is accustomed to receiving.[52]

[17] Other attorneys may not like the idea of learning a new set of technologies. In general, lawyers are not known for being tech savvy.[53] Some commentators have mentioned their dismay with the legal profession’s inability to keep up with the technology industry.[54] Perhaps in e-discovery, this failure to keep up with newer technologies results from over-familiarity with keyword search.[55] Many attorneys are of the opinion that keyword search is the industry standard and that it effectively finds the majority of relevant documents in a given data set.[56] Recognizing that a better method exists may amount to a significant investment of time, classes, and hardware in order to understand and implement new technologies.[57]

[18] Even though more e-discovery resources are available today than ever before, some attorney behavior demonstrates a lack of understanding of how to meet a client’s e-discovery needs.[58] In many cases, counsel’s “apparent lack of savvy” is to blame for overbroad, expensive, or poorly implemented discovery.[59] For example, in a 2010 case, it seemed that both the court and counsel involved were unaware of the possibility of using alternate search methodologies to assist in a more accurate or expedited review.[60]

[19] In fact, with merely two exceptions, there were no judicial opinions prior to 2012 that even mentioned the use of alternative search methods to expedite document review, much less explain what those search methods might be or provide guidance on how to implement them.[61] Only very recently has counsel received explicit judicial approval of the use of advanced search methodologies in e-discovery, as evidenced by Judge Peck’s groundbreaking opinion in Moore v. Publicis Groupe.[62]

C. Judicial Approval of Advanced Search Technologies

[20] The first two opinions to broach the subject of the potential of advanced search technologies address the issue only anecdotally.[63] In Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, an advocacy group brought disability discrimination claims against the transit authority.[64] During discovery, the plaintiff requested information that could only be found on backup tapes because the original e-mails in question had been destroyed.[65] The court ordered restoration and search of the backup tapes.[66] In its order, the court requested that the parties consider how the information on the backup tapes would be searched and directed the parties to recent scholarship arguing that conceptual search technologies could provide more efficient, comprehensive, and accurate results than a keyword search process.[67]

[21] The only other case to recommend alternative or advanced search methodologies was Judge Grimm’s decision in Victor Stanley, Inc. v. Creative Pipe, Inc.[68] In criticizing the plaintiff’s discovery efforts, Judge Grimm repeatedly discussed the lack of qualification of the members in the plaintiff’s party to build a targeted search.[69] This language highlights the expectation that attorneys be competent or seek competent help in conducting e-discovery. Furthermore, Judge Grimm cited the potential shortcomings of keyword search and mentioned other options that counsel could use in the e-discovery process.[70] In a footnote, the opinion explains some of the potential search alternatives currently available.[71]

[22] In contrast, Judge Peck’s decision in Moore is the first judicial opinion to approve a document review process that leverages advanced search and review technologies.[72] The basic facts of the case along with a summary of Judge Peck’s discussion of the application of advanced search and review technologies are outlined below. Because his opinion provides guidance as to how counsel should proceed when using a technology-assisted review process, it is addressed further throughout this article.[73]

[23] In Moore, plaintiffs claimed that the defendant violated numerous gender discrimination laws.[74] As part of their discovery effort, plaintiffs sought numerous e-mails and other electronically stored information to prove the gender bias.[75] During the parties’ discussion as to how the requested information should be reviewed, plaintiffs raised objections to the defendant’s proposed use of technology-assisted document review in the form of predictive coding.[76] The court took an active role in the discovery dispute, pointing out that advanced search and review technologies often lead to better results than traditional keyword search and document review, and encouraged the parties to continue to work out an acceptable discovery plan.[77] During various discovery conferences, the parties and the court discussed how to proceed with discovery issues, such as the number of custodians and other sources of electronically stored information (“ESI”), the number of phases in which to review documents, the predictive coding or technology-assisted review process, and the level of transparency in the review process.[78] At various points in the opinion, the court emphasized that advanced search and review technologies typically produce more accurate results than keyword search and manual review.[79] Finally, the court ordered the parties to go forward with their agreed upon technology-assisted review process, becoming the first court to judicially approve the use of advanced search and review technologies.[80]

[24] Judge Peck’s order has recently come under intense scrutiny by both the plaintiff’s attorneys in Moore and the legal community at large.[81] Even though U.S. District Judge Andrew Carter initially confirmed the order, Judge Peck granted a motion to stay discovery after the plaintiff’s continued calls for his recusal and for revision of the e-discovery protocol.[82] Despite the predictable pushback by the plaintiffs in this case, attorneys should recognize that widespread use of advanced search technologies in e-discovery will one day be the standard;[83] it simply makes more sense to use a specialized machine to find a needle in a haystack as opposed to manually searching through each individual piece of hay.[84]

[25] Much of the commentary already examined, as well as the opinions coming from the bench, provides the clear message that, “[l]awyers [still] need to rethink how they perform ‘searches.’”[85] Even with this clear instruction to use new technology, practitioners have important questions about how to use it. It is essential to find answers about what the new search technologies are, how they work, and how to defend their use in court. Part III of this article provides those answers.


III. Examination of Search Technologies in E-Discovery

[26] Part II considered the current climate of search technologies in e-discovery and an attorney’s duty to understand those technologies. A legal duty to understand technology is not without precedent. Court opinions and commentary lead to the conclusion that some e-discovery technologies, like keyword search, may be insufficient and therefore attorneys should educate themselves about alternate search technologies.

[27] Given the scarcity of information regarding advanced search technology and how it operates, Part III begins by providing a lay-lawyer description of conceptual search technologies and how they are employed in e-discovery. It analyzes recent research proving that advanced search technologies lead to a more complete, accurate, and cost-efficient e-discovery process. Furthermore, it provides practical guidance to ensure that an attorney’s use of conceptual search technologies is defensible in court.

A. An E-Discovery Search Vocabulary

[28] The purpose of this article is not to provide an in-depth technical examination of search methodologies or to advocate the use of a particular e-discovery vendor or product. Its purpose is merely to present in ordinary language current search methodologies that are now available and that may help counsel and clients throughout the American justice system to better coordinate e-discovery efforts. This paper accomplishes this task by discussing advanced search technologies in lay-lawyer terms that any member of the bar practicing in the twenty-first century should be capable of understanding. The rationale behind a lay-lawyer explanation of advanced search technologies is to use the vocabulary framework that has been developed through commentary in the Sedona Conference and other articles[86] that technical consultants will also understand[87] and upon which further commentators can build and provide new insight as technology advances.[88] To this aim, the following technologies are defined and some of their uses are outlined to a limited extent to help readers understand and apply, or at least defend themselves when discussing, modern e-discovery search methodologies.

[29] To begin, it is also important to recognize that this is more than a theoretical discussion. Courts and commentators have at times referred to the following technologies using vague terms such as “emerging” search methods.[89] However, since it is clear that the technologies exist, they have officially “emerged.”[90] Clearly identifying these search methods should help practitioners overcome any fear of dealing with Technologies-That-Must-Not-Be-Named.[91] As a group, the technologies should perhaps be acknowledged as “advanced” search technologies or often as “concept” or “conceptual” searches, though never referred to as “new” or “emerging.”[92] No one should suggest that the technology is unavailable, untried, or not yet suited to the e-discovery task.

[30] Furthermore, given the rapidly evolving state of technology, this should not be considered a comprehensive list requiring no further learning on the part of the practitioner or judge.[93] Rather, the explanations that follow should be considered a starting point, allowing lawyers to quickly gain a basic understanding of some of the overarching search technologies and concepts currently in use in e-discovery practice.

1. Keyword Search

[31] Most practicing attorneys are already familiar with keyword searches due to their experience with popular legal research services like Westlaw and Lexis, as well as society’s general experience with web search engines like Google and Yahoo!.[94] Given the legal industry’s general familiarity with keyword and Boolean search technology, more time will be given to explaining the more advanced conceptual search technologies. Of course, keyword searches will continue to play a part in e-discovery. The simplicity in its use makes it possible to immediately sift through a data set and gain some general ideas about the use of certain keywords.[95]

[32] However, the main problem with keyword search is the very simplicity that has given it widespread use.[96] In its most basic form, keyword search can only find documents with the exact keyword searched for.[97] This means that potentially relevant documents that do not contain any of the keywords searched for will not be found, notwithstanding the expertise in choosing the keywords.[98] At the same time, the use of a keyword in a document does not guarantee relevance.[99]

[33] Two variants on keyword search attempt to overcome this problem: Boolean operators and fuzzy search technologies.

a. Boolean Operators

[34] Boolean operators may help resolve the shortcomings of keyword search to some degree, thereby allowing a user to request documents with multiple keywords, find specific phrases, or even find keywords within a specified proximity to each other.[100] One advanced Boolean tactic allows the use of wildcard operators, a practice known as truncation or stemming, to find keywords that use the same word root.[101] For example, a search for “read*” would find documents containing the words “reads,” “reader,” and “reading.”[102] In this way, Boolean operators extend a keyword search, making it more likely to find relevant documents by combining keywords.[103] However, as an extension of keyword search, a Boolean search suffers from the same weakness: it is still guesswork.[104]
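
The difference between an exact keyword search, a truncation search, and a simple Boolean combination can be sketched in a few lines of Python. The three sample documents and the function names below are hypothetical, invented purely for illustration; commercial review platforms implement these operators with far more sophistication.

```python
import re

# Hypothetical three-document collection, invented for illustration only.
DOCUMENTS = {
    "doc1": "She reads the contract every morning.",
    "doc2": "The reader flagged two clauses.",
    "doc3": "Nobody had read the memo.",
}

def tokenize(text):
    """Lowercase the text and split it into individual words."""
    return re.findall(r"[a-z]+", text.lower())

def keyword_search(term, documents):
    """Return ids of documents that contain the exact word `term`."""
    return sorted(d for d, text in documents.items() if term.lower() in tokenize(text))

def truncation_search(pattern, documents):
    """Return ids of documents with a word beginning with the root before `*`."""
    root = pattern.lower().rstrip("*")
    return sorted(
        d for d, text in documents.items()
        if any(word.startswith(root) for word in tokenize(text))
    )

def boolean_and(terms, documents):
    """Return ids of documents that contain every one of the exact `terms`."""
    return sorted(
        d for d, text in documents.items()
        if all(t.lower() in tokenize(text) for t in terms)
    )

print(keyword_search("read", DOCUMENTS))               # ['doc3']
print(truncation_search("read*", DOCUMENTS))           # ['doc1', 'doc2', 'doc3']
print(boolean_and(["reader", "clauses"], DOCUMENTS))   # ['doc2']
```

The exact search for “read” misses the documents that say “reads” and “reader,” while the truncated query recovers them. Neither query finds a document that discusses the same subject in different words, which is the guesswork problem described above.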

b. Fuzzy Search

[35] As another attempt at overcoming keyword’s simplicity, fuzzy search assists parties in finding misspelled keywords.[105] As mentioned, keyword search is strict in the sense that it can only find documents with the exact keywords used, misspellings notwithstanding.[106] Fuzzy search overcomes the occasional typo by giving more weight to words whose middle letters match since English usage tends to have more word variants or misspellings at the beginning and end of words.[107] This would, for example, recall alternate spellings like “theatre” and “theater,” as well as find words that are simply mistyped.[108] Despite fuzzy search’s utility, it only presents part of the picture in overcoming limitations associated with keyword search.[109]
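
As a rough illustration of fuzzy matching, the sketch below uses the general-purpose character similarity score from Python's standard `difflib` module. This is a substitute technique chosen for brevity, not the middle-letter weighting described above, and the 0.8 threshold and function names are assumptions made for the example.

```python
import re
from difflib import SequenceMatcher

def fuzzy_match(keyword, word, threshold=0.8):
    """True when `word` is similar enough to `keyword`.

    The similarity score runs from 0.0 (nothing in common) to 1.0
    (identical). The 0.8 cutoff is an arbitrary, assumed value.
    """
    score = SequenceMatcher(None, keyword.lower(), word.lower()).ratio()
    return score >= threshold

def fuzzy_search(keyword, text, threshold=0.8):
    """Return the words in `text` that approximately match `keyword`."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if fuzzy_match(keyword, w, threshold)]

# "theatre" is caught even though the query was spelled "theater".
print(fuzzy_search("theater", "The theatre lease was signed late."))
```

Lowering the threshold catches more misspellings but also returns more unrelated words, so the setting involves the same trade-off between completeness and noise that runs through the rest of this discussion.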

[36] Even with the help of Boolean and fuzzy technology, keyword search is still guesswork.[110] Parties may make educated guesses about which keywords may have been used in a universe of documents, but the problem remains that keyword search can only find documents containing the keywords searched.[111] Unfamiliarity with case- and industry-specific language or slang used by the key parties may lead to the inability to form an adequate search, which in turn leads to disappointing search results.[112] In effect, keyword search is an attempt at divination; it is a gamble that hopes to find a majority of relevant documents based on informed guesswork about an industry in a particular set of documents.[113] This helps explain why one study found that keyword searching reveals only one in five relevant documents.[114] The Moore decision previously discussed lamented this limitation of keyword search technology and cited this weakness in justifying the use of advanced search and review technology.[115] Given the lack of crystal balls in e-discovery, attorneys must instead turn to advanced conceptual searches.

2. Conceptual Search Technologies

[37] Conceptual search technologies overcome the weakness of keyword search by recalling more than just documents containing the exact words in the search query. Instead, conceptual searches find documents based on their relevance or similarity to the ideas expressed in the search query.[116] These advanced technologies can take words, phrases, or even a “training set” of documents as an input query, as the parties did in the Moore decision,[117] and then recall material that is conceptually related to the search query. A very basic overview of how these technologies find conceptually related material is provided below. Again, the following is not an exhaustive discussion of available concept search and review technologies, but is provided to give practicing attorneys a general idea of the kinds of technologies that are available and a quick view of how they work.

a. Ontology and Taxonomy

[38] Perhaps the simplest way to think of taxonomy and ontology search technologies is to consider them from the perspective of a thesaurus.[118] Again, the problem with keyword search is that if the exact word searched for is not present in the document, the document is not included in the realm of potentially relevant documents.[119] Taxonomy and ontology tools overcome this problem by automatically searching for synonyms of keywords.[120] However, taxonomy is more than just finding synonyms; it is finding relationships, which is the science of classification.[121] For example, a search for “shoes” using taxonomy technology might find boots, slippers, loafers, heels, and many other variations. Ontologies tend to be more generic, leading away from mere shoe types to other topics that are related to shoes.[122] For example, a search for “shoes” using ontology technology might find podiatrists or shoe manufacturers.
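
The thesaurus analogy can be made concrete with a small query-expansion sketch. The two hand-built vocabularies below simply restate the “shoes” examples from the text; the documents, names, and the crude substring matching are assumptions made for illustration and do not reflect how any particular product builds or applies its taxonomy.

```python
# Hypothetical vocabularies restating the "shoes" examples above.
TAXONOMY = {"shoes": ["boots", "slippers", "loafers", "heels"]}
ONTOLOGY = {"shoes": ["podiatrist", "shoe manufacturer"]}

# Hypothetical documents, invented for illustration only.
DOCUMENTS = {
    "d1": "The invoice lists forty pairs of loafers.",
    "d2": "Forward the complaint to the podiatrist.",
    "d3": "Quarterly budget attached.",
}

def expand_query(term, *vocabularies):
    """Return the term plus every related term found in the vocabularies."""
    expanded = [term]
    for vocabulary in vocabularies:
        for related in vocabulary.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded

def concept_search(term, documents, *vocabularies):
    """Return ids of documents mentioning the term or any related term.

    Plain substring matching is used to keep the sketch short.
    """
    terms = expand_query(term, *vocabularies)
    return sorted(
        d for d, text in documents.items()
        if any(t in text.lower() for t in terms)
    )

print(concept_search("shoes", DOCUMENTS))                      # []
print(concept_search("shoes", DOCUMENTS, TAXONOMY))            # ['d1']
print(concept_search("shoes", DOCUMENTS, TAXONOMY, ONTOLOGY))  # ['d1', 'd2']
```

None of the sample documents contains the word “shoes,” so the unexpanded search returns nothing. Each added vocabulary widens the net, first to a type of shoe and then to a related topic.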

[39] A taxonomy is generally represented in graphic form as a tree with a root word and branches to other related words.[123] As provided by example in this article’s appendix, another conceptual way to consider taxonomy and word relationships is to imagine a web of interrelating words.[124]

[40] Viewing and analyzing words in the context of the relationships between them can also be helpful in the e-discovery context.[125] Not only is it important to find relevant material that uses words synonymous to the main keywords selected, but determining the documents’ relationship to each other helps attorneys to determine where it will be most useful to concentrate one’s e-discovery efforts.[126]

b. Document Clustering

[41] Clustering tools use statistical methods to automatically group documents with similar content.[127] Similarity of content can be defined a number of ways, but a typical way is to automatically group documents by the number of words that overlap from one document to another.[128] The more words that a document has in common with another document, the greater the likelihood that the documents are related.[129]

[42] There are a number of parameters that users can control when using a document-clustering tool.[130] For example, a fixed number of possible clusters can be set and topics for the clusters can even be identified.[131] One effective way of guiding the clustering process is to choose certain documents, analyze them manually, and arrange them as “seed” documents.[132] Subsequently, when the clustering engine is run, it will base its document clusters off those seed documents and parameters placed by the user.[133]
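
A minimal sketch of seed-guided grouping by word overlap follows. It measures overlap as the share of distinct words two texts have in common (often called Jaccard similarity), which is one simple choice among many; the seed texts, documents, and function names are hypothetical, and real clustering engines rely on considerably richer statistics.

```python
import re

def word_set(text):
    """Return the set of distinct lowercase words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap(text_a, text_b):
    """Shared words divided by all distinct words across both texts."""
    a, b = word_set(text_a), word_set(text_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_seeds(seeds, documents):
    """Assign each document to the seed it overlaps with most.

    Documents sharing no words with any seed go to "unclustered".
    """
    clusters = {label: [] for label in seeds}
    clusters["unclustered"] = []
    for doc_id, text in documents.items():
        scores = {label: overlap(seed, text) for label, seed in seeds.items()}
        best = max(scores, key=scores.get)
        target = best if scores[best] > 0 else "unclustered"
        clusters[target].append(doc_id)
    return clusters

# Hypothetical seed texts and documents, invented for illustration only.
SEEDS = {
    "pricing": "price quote discount invoice",
    "hiring": "interview candidate offer salary",
}
DOCUMENTS = {
    "a": "Please send the invoice with the discount price.",
    "b": "The candidate accepted the salary offer.",
    "c": "Lunch menu for Friday.",
}

print(cluster_by_seeds(SEEDS, DOCUMENTS))
# {'pricing': ['a'], 'hiring': ['b'], 'unclustered': ['c']}
```

Because every shared word counts equally here, a common word like “the” would pull documents together just as strongly as a distinctive one if it appeared in a seed. That limitation is what the weighted approaches discussed next attempt to address.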

[43] In e-discovery, document clustering can provide a quick snapshot of the data and how all of the documents are related.[134] Many e-discovery vendors boast early case assessment technologies (“ECA”).[135] It is likely that some of these ECA tools include some form of document clustering capability to group similar documents.[136] Document clustering could provide insight into a case by identifying additional key players, creating estimates of the potential number of documents that may eventually need to be produced, and laying the groundwork for deciding which keywords might be important in further identifying relevant documents.[137] Combined with powerful visualization tools that showcase data on graphs and charts that are more easily read than a host of documents, counsel can establish the merits of a given claim and make educated decisions about a case from the very beginning instead of waiting until the end of a long and expensive manual review process.[138]

c. Bayesian Classifiers

[44] In contrast to statistics-based clustering tools that look at the number of common words between documents, Bayesian technologies are based on probability algorithms that determine the likelihood that a document is relevant by placing a value on words, their relationships to each other, and their proximity and frequency in comparison with other documents.[139] In clustering tools, all overlapping words between documents may hold the same value.[140] While that method may be useful to provide a quick comparison, Bayesian systems go the extra mile by setting up a formula that weighs and ranks words and their relationships.[141] One can customize how a Bayesian system ranks words and documents per implementation. However, Bayesian systems typically weigh factors, such as the frequency of certain words in the document, the location of keywords in the document, and the proximity of certain words to other important words.[142] Bayesian systems are also informed by feedback on the relevance of documents and therefore learn during the review process.[143] Before a Bayesian system is even implemented, a set of documents is typically reviewed in order to “train” the system to identify which kinds of documents are relevant or irrelevant.[144]

[45] A complete explanation of Bayesian technology is well outside the scope of this paper. However, Bayesian technology’s application to e-discovery can be informed from other disciplines. To provide two examples, Bayesian technology has been employed in e-mail spam filtering[145] and facial recognition software.[146]

[46] Bayesian technology has been used to filter spam e-mails since the late 1990s.[147] A Bayesian spam filter has one job, which is to determine whether a message is junk.[148] The filter works by comparing new e-mail messages with current messages that have already been organized into junk and non-junk folders.[149] For example, when a new message is received, the spam filter will automatically compare the words in the recent message against the messages in the junk folder.[150] It will compare the frequency of certain words like “Nigerian Prince” and “wire transfer.”[151] The filter might compare the proximity of those keywords to each other.[152] It might also consider where the junk-implicating words are located, whether in the subject line, or the body of the e-mail.[153] Further parameters can also be programmed, such as whether the user has previously received a message from the sender of the e-mail.[154] Ultimately, any e-mail containing the words “Nigerian Prince,” “wire transfer,” and “bank routing number” should end up in the junk folder.[155]
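
The word-frequency portion of such a filter can be sketched as a small “naive Bayes” classifier, a standard textbook form of Bayesian classification. This is an assumption-laden illustration: it considers only how often words appear in previously sorted messages, ignores the proximity, location, and sender factors mentioned above, and uses invented training messages and names.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into individual words."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayesFilter:
    """Toy two-folder filter; assumes both folders receive training messages."""

    def __init__(self):
        self.word_counts = {"junk": Counter(), "ok": Counter()}
        self.message_counts = {"junk": 0, "ok": 0}

    def train(self, text, label):
        """Record the words of a message already sorted into a folder."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def junk_probability(self, text):
        """Estimate the probability that a new message is junk."""
        vocabulary = set(self.word_counts["junk"]) | set(self.word_counts["ok"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("junk", "ok"):
            total_words = sum(self.word_counts[label].values())
            # Start from how common the folder is, then weigh each word.
            score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            log_scores[label] = score
        # Convert the two log scores into a probability between 0 and 1.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["junk"] / (weights["junk"] + weights["ok"])

# Hypothetical training messages, invented for illustration only.
spam_filter = NaiveBayesFilter()
spam_filter.train("Nigerian prince requests urgent wire transfer", "junk")
spam_filter.train("Send bank routing number for wire transfer", "junk")
spam_filter.train("Deposition scheduled for Tuesday morning", "ok")
spam_filter.train("Please review the attached draft contract", "ok")

print(spam_filter.junk_probability("Prince needs a wire transfer"))
print(spam_filter.junk_probability("Review the draft deposition schedule"))
```

The first message scores well above one-half and the second well below it. Every additional message sorted into a folder updates the counts, which is the sense in which such a system “learns” from feedback.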

[47] The utility of Apple iPhoto’s “Faces” capability also demonstrates how a Bayesian search and review process might work.[156] In iPhoto, a user can categorize photos by face.[157] To streamline the process, iPhoto allows a user to identify the faces in a photo.[158] After a face has been identified, iPhoto searches through all the photos in the application for a face with matching characteristics.[159] When it begins, iPhoto may draw a large number of false positives. However, after a number of iterations, iPhoto “learns” which photos match or do not match the first “training set” of faces identified.[160] Eventually, the program does so with a high level of accuracy.[161]

[48] In a similar fashion, Bayesian technology in the e-discovery context allows a user to begin by identifying certain documents as relevant, irrelevant, privileged, or not privileged.[162] Instead of having these decisions made by an army of low-level legal associates or contract attorneys, sensitive relevance determinations can be made by senior attorneys familiar with the case in order to produce a high-quality “training set” of documents.[163] This “training set” of documents is then used to search through the universe of documents and ask the user if the next set of documents is relevant to the litigation and whether or not it is privileged.[164] Over time, the computer can learn which documents are relevant with a high level of accuracy.[165] This technology allows attorneys to review wide swaths of documents in short periods of time for both relevance and privilege.[166]

3. Maintaining Quality and the Role of Sampling

[49] Regardless of the search process used, courts may expect attorneys to use safeguards to help ensure the quality of the document review.[167] Sampling is a quality control method urged by courts that consists of manually sampling files identified as relevant or privileged to test whether or not the review process was accurate.[168] However, that explanation might be overly simplistic since courts have also stated that expert assistance may be required to develop an effective sampling protocol.[169] Sampling can serve as a check on advanced search methodologies, thereby helping technology-skeptic attorneys to rest easy by ensuring that machine-assisted search and review results are as accurate and complete as possible.[170] Wise practitioners can leverage sampling techniques to improve overall search and review, informing their process through sampling relevant documents that were not identified, and modifying search processes to increase accuracy.[171] This sampling process, used at various phases of the search and review, can then be used to explain the efficacy of the search tools used.[172]
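
A very simple form of sampling can be sketched as follows: draw a random sample of the documents a process has coded, have a reviewer check each one, and estimate how often the process was right. The function names, the sample size, and the use of a normal approximation with a 1.96 multiplier (roughly a 95% confidence level) are assumptions made for this illustration; as noted above, designing a sampling protocol for an actual matter may require expert assistance.

```python
import math
import random

def draw_sample(doc_ids, sample_size, seed=0):
    """Pick a simple random sample of document ids without replacement.

    A fixed seed makes the draw repeatable so it can be documented.
    """
    rng = random.Random(seed)
    pool = list(doc_ids)
    return rng.sample(pool, min(sample_size, len(pool)))

def estimate_agreement(check_results, z=1.96):
    """Estimate how often the manual check agreed with the original coding.

    `check_results` holds one True/False per sampled document.
    Returns the observed agreement rate and an approximate margin of error.
    """
    n = len(check_results)
    rate = sum(check_results) / n
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate, margin

# Hypothetical review: 1,000 documents coded relevant, 50 sampled,
# and the reviewer agrees with the coding on 45 of the 50.
sample = draw_sample(range(1000), 50)
rate, margin = estimate_agreement([True] * 45 + [False] * 5)
print(len(sample), round(rate, 2), round(margin, 3))  # 50 0.9 0.083
```

The same approach can be pointed at the documents a process set aside as irrelevant, to estimate how many relevant documents were missed.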

B. Do Conceptual Search and Review Technologies Work Better than Keyword Search and Manual Review?

[50] In Part III Section A, we discussed the importance of a common language to discuss e-discovery technologies and reviewed current conceptual search technologies using lay-lawyer terms. The examples provided were intended to give practitioners insight into the complexity of conceptual search and a framework for understanding its use. Part III Section B goes one step further by answering a question every astute reader should be asking: do the technologies work better than the status quo?

[51] Understanding the technologies and putting them into practice is not enough.[173] The true measure of potentially helpful conceptual search technologies is whether they actually do a better job than the traditional keyword search followed by a manual review of every single responsive document.[174] Until recently, a comparative test and analysis of the two methods had been lacking.[175] However, this was changed in 2009 when the Text Retrieval Conference (“TREC”) conducted a study comparing traditional search and review methods to advanced search technologies.[176]

1. Factors for TREC Analysis

[52] The analysis focused on three key indicators to determine whether the groups using conceptual search technologies actually performed better during the e-discovery process.[177] The first important factor was recall, which is the percentage of relevant documents a group finds out of the total number of relevant documents in a data set.[178] Thus, if there are 1,000 relevant documents in a universe of 10,000, and an e-discovery process finds 200 of the relevant documents, then its recall is 20% because it only found 200 of the possible 1,000 documents.[179] As will be discussed later, 20% recall is about par for the course with keyword search alone.[180]

[53] The second factor, precision, measures how well the process retrieved only the relevant documents.[181] Using the same example of 10,000 documents with 1,000 relevant documents, assume an e-discovery process identified 400 documents, but only 200 of those documents were relevant while the remaining 200 documents were irrelevant to the litigation. The resulting precision calculation would be 50%.[182]

[54] The third factor utilized in the study to determine the quality of an e-discovery process is entitled F1, which is derivative of the first and second factors of recall and precision.[183] Using the same example with a 20% recall rate and a 50% precision rate, the resulting F1, or harmonic mean,[184] would fall somewhere in between the two numbers at about 28.57%.[185] This third factor is the most important as it measures a balance between the two important factors involved in determining the quality of the document search and review process.[186] The higher the F1, the more complete and more accurate the review process is.[187]
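
The arithmetic behind the three factors can be checked with a short sketch using the running example of 1,000 relevant documents, 400 documents retrieved, and 200 relevant documents among those retrieved. The function names are chosen for this illustration only.

```python
def recall(relevant_found, relevant_total):
    """Share of all relevant documents that the process found."""
    return relevant_found / relevant_total

def precision(relevant_found, retrieved_total):
    """Share of the retrieved documents that were actually relevant."""
    return relevant_found / retrieved_total

def f1(recall_value, precision_value):
    """Harmonic mean of recall and precision."""
    total = recall_value + precision_value
    return 2 * recall_value * precision_value / total if total else 0.0

r = recall(200, 1000)    # 0.2
p = precision(200, 400)  # 0.5
print(f"recall={r:.2%} precision={p:.2%} F1={f1(r, p):.2%}")
# recall=20.00% precision=50.00% F1=28.57%
```

Because the harmonic mean is pulled toward the lower of the two numbers, a process cannot earn a high F1 by excelling at only one of recall or precision.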

2. Advanced Technology Used

[55] With regard to the actual search and review technologies employed in the TREC study, half of the groups used a manual review process and the other half used custom search technologies developed by the parties themselves.[188] Of the advanced technology groups, one group described its technology as “deterministic,” beginning the review by tailoring a highly detailed definition of relevance.[189] Then, each document could easily be compared against the relevance parameters to determine whether it was responsive.[190] The intent was to bring a high level of precision to the review process, rejecting the practice of using broad keyword searches and later narrowing down the data set.[191]

[56] Another advanced technology group used a computer assisted learning approach that estimated the probability that a document was relevant.[192] The system used had previously been developed to assist in spam filtering.[193] As the technology “learned” from new documents, so did the reviewers, adjusting the search and judging system to improve the review throughout the process.[194]

[57] In the real world, e-discovery vendors may not describe their systems as “conceptual search.” However, the vocabulary framework of Part III Section A should help attorneys identify and understand how a given technology might work. Even though a “training set” of documents may not have been used, detailed relevance parameters and computer learning systems helped parties in the TREC study identify and group responsive documents.[195] These strategies seem similar to a Bayesian classification system.[196]

3. Study Findings

[58] Each group involved in the 2009 TREC study was assigned a topic and requested to sift through the data provided to build a “case” as if for litigation.[197] The results of the varying review processes, manual review versus technology assisted review, were compared and analyzed.[198] Across the wide majority of the topics tested, the groups using advanced search technologies outperformed the groups who used traditional review methods by a statistically significant margin.[199] The average recall and precision for the traditional review groups were 59.3% and 31.7% respectively, while the recall and precision for the concept search and review were 76.7% and 84.7% respectively.[200] On average, the F1 for the traditional review groups was 36%.[201] Among those who used advanced search technologies, the F1 was 80%.[202]

[59] Clearly, the advantages of conceptual search technologies can be understood on a superficial level. After discussing the available search technologies and their possible uses in the e-discovery context, a number of strategies can be imagined to automatically organize documents in a data set, see relationships among the information, search more accurately and widely to find the relevant documents, and then use automated learning tools to speed up the review process, all more accurately than with manual review.[203] This is not the future of search technology; this is now.[204]

C. Defending E-Discovery Process Through Conceptual Search Technologies

[60] Even though a practitioner may now be aware of conceptual search technologies after reading Part III Section A, and understand that conceptual search and review produces better results after reviewing Part III Section B, opposing counsel and courts may still need some convincing. Part III Section C discusses how to defend conceptual search in court. Regardless of the e-discovery search methodology employed, whether keyword or conceptual, the parties must be able to defend the methodology used before a judge.[205] While using a defensible process throughout the entirety of a discovery request is beyond the scope of this paper, this discussion would be incomplete without reviewing aspects of defensibility applicable to advanced search methods.

1. Accuracy

[61] First, a defensible search methodology is not a perfect search methodology.[206] In the Sedona Conference Best Practices Commentary on the Use of Search, the authors discuss a 1985 study by David Blair and M.E. Maron.[207] The study examined litigation over an unfortunate train accident in the San Francisco area that resulted in an e-discovery workload of 40,000 documents and some 350,000 pages.[208] After a thorough review of the documents, presumably based on some form of keyword search to identify potentially relevant documents, attorneys in the case estimated they had found seventy-five percent of all relevant documents.[209] However, a detailed analysis of the documents involved revealed that attorneys on the case had identified only about twenty percent of the relevant documents.[210] The study attributes this lack of accuracy to the ambiguity inherent in word usage, giving more weight to the idea that the assistance of search experts may become necessary, as Judge Grimm implied in Victor Stanley.[211]

[62] Regardless, the key point is that even though keyword and Boolean search has been the “state-of-the-art” in terms of e-discovery search for many years, keyword search has never led to perfection in the e-discovery process.[212] It follows, then, that any search performed using conceptual searches must merely meet the low threshold of a keyword search process.[213] As previously discussed in Part III Section B, technology assisted search and review has proven to be more accurate.

2. Efficiency

[63] With regard to efficiency, the more quickly the universe of documents can be culled and reviewed, the better.[214] However, efficiency and accuracy do not exist on a sliding scale where accuracy can be sacrificed. Quickly reviewing a universe of 350,000 documents without finding a single responsive file would not be defensible.[215] Certain standards of accuracy must be met while using efficient and cost-effective means at the same time.[216] Some practitioners are experiencing success with conceptual search, achieving a much quicker review period with appropriate levels of accuracy.[217] One report stated that a body of 20,933 documents, first reviewed using traditional review methods, took 180 hours to review.[218] Afterward, the same documents were loaded into a system that learned as a separate review progressed and grouped documents according to topic.[219] This second review took 18.5 hours, roughly one-tenth of the manual review time,[220] a speedy determination indeed.[221]
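The arithmetic behind that comparison can be checked with a short sketch. The figures are those reported above (20,933 documents, 180 hours, and 18.5 hours); the function name and the derived measures, such as documents reviewed per hour, are illustrative assumptions and do not come from the report itself.

```python
def review_comparison(documents: int, manual_hours: float, assisted_hours: float) -> dict:
    """Compare two reviews of the same document set by elapsed review time."""
    return {
        "time_ratio": assisted_hours / manual_hours,      # assisted time as a share of manual time
        "speedup": manual_hours / assisted_hours,         # how many times faster the assisted review was
        "manual_docs_per_hour": documents / manual_hours,
        "assisted_docs_per_hour": documents / assisted_hours,
    }


stats = review_comparison(documents=20_933, manual_hours=180.0, assisted_hours=18.5)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

On these figures the second review took a little over ten percent of the manual review time, which works out to a review that was between nine and ten times faster.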

3. Transparency

[64] This aspect of defending the use of advanced search technology, transparency, goes not to the efficacy of the technology itself, but to the cooperation of the parties with regard to its use.[222] The Moore Court went so far as to state that the defendant’s willingness to be transparent in its implementation of advanced technologies made it possible for the court to approve the use of the technology-assisted review process.[223] In Moore, the defendant agreed to provide plaintiffs with a complete copy of all “seed” documents it had reviewed, except for privileged documents, which it then used to “train” the computer in the review process.[224] Because the defendant provided the 2,399 documents in the “seed set,” plaintiffs and the court would be able to plainly evaluate and provide guidance in setting the parameters of the advanced review.[225] Arguably, this level of transparency also gives unprecedented power to plaintiffs, who can effectively provide input to the decisions in the defendant’s review process.[226] While this level of transparency may not be required in every situation using advanced technologies, it may help opposing counsel and the court to feel more at ease with technologies that are admittedly difficult to understand.[227]

4. Other Factors of Defensibility

[65] The Sedona Commentary also mentions defensibility guidelines, such as cost effectiveness and a showing of fairness and good faith.[228] While no one factor seems to predominate, “the just, speedy, and inexpensive determination of every action and proceeding” appears to be the underlying factor of defensible e-discovery.[229] Counsel should be prepared to articulate how the search methods employed helped meet the ends of the Federal Rules of Civil Procedure.[230] Additionally, evidence regarding the efficacy of a search methodology must be introduced through experts.[231] Attorneys and experts should be prepared to explain that a well-implemented conceptual search speeds up the review process and leads to more accurate results.[232] Saved costs are the logical byproduct and should be included as part of any defense concerning the efficacy of advanced search technologies.[233] In the end, attorneys defend their use of advanced conceptual search by demonstrating that it is more just, speedier, and less expensive than keyword search followed by manual review.


IV. Required Use of Advanced Search Technologies

[66] In Part III, we defined some of the advanced search technologies currently employed in e-discovery, determined that they are indeed more accurate than keyword search alone followed by manual review, and also considered how to defend and explain their use for courts and opposing counsel. Part IV provides guidance on when the use of advanced search technologies should be required. It also discusses how courts should deal with inadvertent production of privileged documents after using an advanced e-discovery process. In addition, Part IV provides guidance to practitioners regarding what technology-related legal duties have formed around the e-discovery process and how those duties help fulfill the purposes of the Federal Rules of Civil Procedure.

A. Requiring Conceptual Search

[67] Cases that should require the use of advanced search technologies involve millions of documents.[234] These cases involve situations where a keyword search followed by manual review would be truly unfeasible and overly expensive.[235] Knowing that one of the underlying purposes behind the duty to use conceptual search technologies is to save money, and recognizing that the biggest expense in e-discovery is manual review, an understanding of advanced search technologies should help courts and counsel draw the conclusion that an advanced process saves both time and money.[236] Although some pushback from counsel is to be expected, court enforcement of a duty to use advanced search technologies, accompanied by further research and learning about conceptual search technology, should help allay any concerns about the efficacy of conceptual search.[237]

[68] Should the use of concept search technologies be required in all cases? No. Given the discussion of the technologies above, there is clearly some level of preparation and analysis required before conceptual search and review is initiated.[238] In some cases, it may be more effective to formulate a basic keyword strategy, especially when dealing with smaller data sets where manual review is feasible and less expensive than employing the services of an e-discovery vendor.[239] Using advanced technology in those situations would be akin to cutting the Thanksgiving turkey with a chainsaw—simply overkill.

B. Dealing with Privileged Documents

[69] When courts evaluate discovery productions that result from a conceptual search and review process, they should keep in mind the impossibility of manually reviewing millions of documents.[240] Given that review of documents does not necessarily equal viewing documents, courts should respect clawback agreements between parties and be hesitant to find waiver of privilege from documents that were inadvertently produced. Despite an attorney’s best efforts, it is possible and even likely that after a review of millions of documents, some privileged material will be produced to opposing counsel.[241] Courts and counsel, in the interest of the speedy and just determination of a case, should expect a certain level of inaccuracy involved with sifting through large quantities of documents, regardless of the search and review methodologies employed.[242]

[70] In fact, the 2006 Amendments to the Federal Rules of Civil Procedure anticipated a margin of error when dealing with a large universe of documents, by purposefully crafting the ability to institute clawback agreements between parties.[243] Clawback agreements should be discussed as part of any electronic discovery plan, and when disputes over privileged documents occur, judges should use an “appropriate mathematical yardstick” when determining whether privilege has been waived.[244]

[71] Compare, for example, the application of certain factors in the Mt. Hawley case with the Victor Stanley case.[245] In Victor Stanley, the court found that privilege had been waived on 165 inadvertently produced documents out of a universe of 9,000 documents.[246] In contrast, the Mt. Hawley Court found that privilege had been waived on 377 documents out of a universe of five million documents.[247] The court’s conclusion that privilege had been waived is not necessarily the problem; the reasoning behind the Mt. Hawley holding with regard to those 377 documents is.[248] In stating that privilege had been waived on the inadvertently produced documents, the court relied in part on the Victor Stanley holding, concluding that 377 documents was more than double the number of documents at issue in the Victor Stanley case.[249] However, the raw number of documents that were inadvertently produced provides a poor comparison.[250] When the number is expressed instead as a proportion or a percentage of the possible privileged documents that could have been inadvertently produced, the parties in the Mt. Hawley case did much better.[251] For this reason, courts should approach inadvertent disclosure problems with a relative mindset instead of thinking in terms of bright-line non-proportional rules.[252]
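A short sketch illustrates the kind of relative comparison urged here. It uses only the figures given above (165 of 9,000 documents and 377 of five million documents) and assumes, for illustration, that the size of each document universe is an acceptable denominator; a court could reasonably choose a different yardstick, such as the number of privileged documents in the collection, which the discussion above does not supply. The function name is hypothetical.

```python
def inadvertent_rate(produced: int, universe: int) -> float:
    """Share of the document universe that was inadvertently produced."""
    if universe <= 0:
        raise ValueError("universe must be a positive number of documents")
    return produced / universe


victor_stanley = inadvertent_rate(produced=165, universe=9_000)
mt_hawley = inadvertent_rate(produced=377, universe=5_000_000)

print(f"Victor Stanley: {victor_stanley:.4%} of the universe")
print(f"Mt. Hawley:     {mt_hawley:.4%} of the universe")
print(f"Victor Stanley's rate is about {victor_stanley / mt_hawley:.0f} times higher")
```

Measured this way, the Victor Stanley production erred on roughly 1.8% of its universe, while the Mt. Hawley production erred on well under one-hundredth of one percent, even though the raw count in Mt. Hawley was more than twice as large.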

[72] Ironically, the Mt. Hawley decision highlights much of the information that supports a conclusion that data should be evaluated on a relative basis. In its analysis, the court examines a five-factor test for determining whether privilege has been waived, the parties’ own clawback agreement, the Federal Rules of Evidence and Procedure that authorize clawback agreements, and finally, the Advisory Committee Notes discussing how to evaluate clawback agreements.[253] Specifically, the Advisory Committee states that:

[o]ther considerations bearing on the reasonableness of a producing party’s efforts include the number of documents to be reviewed and the time constraints for production. Depending on the circumstances, a party that uses advanced analytical software applications and linguistic tools in screening for privilege and work product may be found to have taken “reasonable steps” to prevent inadvertent disclosure.[254]

Essentially, the Advisory Committee was aware of the potential need to review huge amounts of data and that perfection in the review process would be impossible.[255] Courts and practitioners alike should be prepared for a margin of error in the discovery process and should be flexible enough to work out and enforce clawback agreements that preserve privilege while speeding along the review process.[256] Some practitioners have begun to claim that if the advanced review is carried out properly, privilege should not be a worry because the same systems that help quickly and efficiently identify relevance can also make privilege determinations with a high level of accuracy.[257] It may be that in the future, privilege stops being a concern for parties who use advanced search technology. Until then, courts should always consider a party whose review was augmented by advanced conceptual searches to have taken “reasonable steps” to preserve privilege within the meaning of the Federal Rules.[258]

C. Legal Duties

[73] The purpose of the Federal Rules of Civil Procedure is to secure “the just, speedy, and inexpensive determination of every action and proceeding.”[259] In the past, speed and expense were sacrificed in the name of justice, giving time to long-term manual review projects to ensure that the most accurate and complete set of information was discovered and produced.[260] Today, a more accurate, complete, and just process exists through conceptual search tools.[261] The fact that the same tools also lead to a speedier and less expensive determination is a bonus.[262]

[74] Judges who have recommended advanced search technologies in the past may require parties to use them in the near future, especially in litigation with large data sets.[263] It would not be the first time a judge has required counsel to learn about and become familiar with technology.[264] Legal duties in terms of the e-discovery process will continue to emerge and become more defined.[265] Just as Federal Rule of Civil Procedure 26(f) requires parties to discuss an in-depth discovery plan—including discovery subjects, production format, and privilege issues—future evolutions of the rule may require discussion of, and plans to use, advanced search technologies in order to secure the just, speedy, and inexpensive determination of the case.[266]

[75] Requiring the attorneys in a case to use advanced search technologies may raise competency concerns.[267] Furthermore, any mandate to use conceptual search should have at its root the purpose of helping to resolve cases on their merits rather than e-discovery issues.[268]

1. Competency

[76] The legal community should embrace new conceptual search technologies.[269] Where expertise is lacking, attorneys should not hesitate before seeking help with managing a large database of electronic information.[270] A defensible e-discovery strategy for large data sets should employ the review of documents through a variety of search tactics, including document clustering and keyword search assisted by Bayesian and ontology search mechanisms.[271] Since attorneys do not typically have access to those tools on their desktop computers, in some cases, attorneys should be required to seek help from either a firm or a vendor that specializes in e-discovery.[272] In the past, courts have sanctioned parties for botching e-discovery requests and requirements, and as a result, the legal community should consider itself “on notice” with regard to its competency qualifications, or lack thereof, in the e-discovery context.[273]

2. Resolve Cases on their Merits

[77] One of the best-named tools in opposing counsel’s arsenal is the Weapon of Mass Discovery.[274] Counsel sometimes make overbroad discovery requests, hoping for settlement from larger defendants because it would be more cost-effective for the defendant to settle than to try the case on its merits.[275] This is due in part to the impossibility of manually reviewing millions of documents.[276] The capacity of advanced search technologies to conceptually organize a universe of documents should help larger defendants avoid this threat by analyzing the merits of a claim from day one.[277] Some work and preparation for litigation will be required on the part of the defendant to effectively use this strategy, but in the long term, a strategy that includes preparation and use of conceptual search will help cases resolve on their merits rather than on the difficulty of the e-discovery process.[278] This purpose should be at the core of any mandate to implement advanced search technology.


V. Conclusion

[78] Courts have been developing a legal duty to understand and implement advanced search technologies in the e-discovery process.[279] This duty is informed by scholarship demonstrating the efficacy of advanced search technologies and their advantage over the status quo of a keyword search method followed by an extensive manual review. To meet the needs of clients, practitioners must strive to gain some technical knowledge regarding available search and review methods. Given that manual review is the most expensive piece of the e-discovery process and that using conceptual search inevitably erases much of the manual review process along with its accompanying high cost, attorneys should implement conceptual search technologies as often as possible. The understanding that conceptual searches are more effective and efficient should help attorneys defend an advanced search process in court. Finally, as the review process is shortened considerably and the burden of review is lifted from the shoulders of counsel and courts, cases can again be resolved on their merits instead of diving down the rabbit hole of e-discovery disputes.



* Jacob Tingen is a licensed Virginia attorney and a graduate of the University of Richmond School of Law. In the summer of 2011 he interned with Vault26, an e-discovery startup, where he consulted on current e-discovery practices. Living on the cutting edge of technology, Jacob maintains a home on the web at He would like to thank Professor James Gibson for his guidance and help in preparing this article.


Appendix A


[1] Fed. R. Civ. P. 1.

[2] David Degnan, Accounting for the Costs of Electronic Discovery, 12 Minn. J.L. Sci. & Tech. 151, 152 (2011).

[3] See George L. Paul & Jason R. Baron, Information Inflation: Can the Legal System Adapt?, 13 Rich. J.L. & Tech. 10, ¶ 4 (2007) (noting that manual review is too time-consuming and expensive).

[4] See, e.g., id. at ¶ 20 (providing an example showing the time it takes for manual review of one billion e-mail records).

[5] See Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search, 17 Rich. J.L. & Tech. 9, ¶ 13 (2011).

[6] See Paul & Baron, supra note 3, at ¶ 12.

[7] See, e.g., H. Christopher Boehning & Daniel J. Toal, Assessing Alternative Search Methodologies, N.Y. L.J. Tech. Today, Apr. 22, 2008, at 5.

[8] See discussion infra Part II.

[9] See Boehning & Toal, supra note 7, at 6 (comparing classic Boolean keyword searching with new technological approaches to e-discovery).

[10] See Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, 17 Rich. J.L. & Tech. 11, ¶ 1 (2011) (stating that there has been little scientific evidence proving whether advanced search and review tactics are more effective than keyword search and manual review).

[11] See Baron, supra note 5, at ¶ 34 (noting that only two cases even mention the existence of conceptual search technologies). Since the publication of Jason R. Baron’s article in 2011, two additional cases have spoken in more detail regarding the use of advanced search technologies. See Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *1, *12 (S.D.N.Y. Feb. 24, 2012) (approving the use of predictive coding in e-discovery for the first time); Case Management Order: Protocol Relating To Production of Electronically Stored Information at 1-26 Actos (Pioglitazone) Products Liability Litigation, No. 6:11-md-2299 (W.D. La. July 27, 2012) [hereinafter Actos Order] (emphasizing the importance of collaboration when using an advanced e-discovery process for all pending and future related litigation involving Actos Products).

[12] See Grossman & Cormack, supra note 10, at ¶ 1.

[13] See Baron, supra note 5, at ¶ 37.

[14] Paul & Baron, supra note 3, at ¶ 36.

[15] Id. at ¶ 37.

[16] See id. at ¶ 40.

[17] The Sedona Conference, Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery, 8 Sedona Conf. J. 189, 194 (2007) [hereinafter Best Practices].

[18] See Paul & Baron, supra note 3, at ¶¶ 36-37.

[19] See Monica Bay, Georgetown E-Discovery Conference Opens With Case Law Update, Law Tech. News (Nov. 18, 2011) (quoting U.S. District Court Judge James Francis: “I don’t see how you can provide competent representation if you don’t have some basic understanding of e-discovery.”).

[20] See, e.g., Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 432 (S.D.N.Y. 2004).

[21] Id.

[22] Id. at 441.

[23] Id. at 425.

[24] Id. at 425-29.

[25] See Ralph C. Losey, Introduction to e-Discovery: New Cases, Ideas, and Techniques 441-42 (2009) [hereinafter Losey, Introduction to e-Discovery].

[26] Zubulake, 229 F.R.D. at 432.

[27] See Ralph C. Losey, e-Discovery Current Trends and Cases 56 (2008) [hereinafter Losey, Current Trends].

[28] See Zubulake, 229 F.R.D. at 440.

[29] Id. at 432.

[30] Id.

[31] See generally Fed. R. Civ. P. (2006); see also Losey, Current Trends, supra note 27, at 241-63 (explaining the 2006 amendments to the Federal Rules of Civil Procedure).

[32] About The Sedona Conference, The Sedona Conference (last visited Oct. 27, 2012) [hereinafter About The Sedona Conference].

[33] See The Sedona Conference, The Sedona Conference Cooperation Proclamation 4-11 (2008) [hereinafter Cooperation Proclamation], available at (listing judicial endorsements of the Cooperation Proclamation).

[34] See Paul & Baron, supra note 3, at ¶ 66.

[35] See Cooperation Proclamation, supra note 33, at 1-3.

[36] Id. at 3 (quoting Fed. R. Civ. P. 1).

[37] See id.

[38] See id.

[39] Id. at 4-11 (providing a detailed list of judicial endorsements).

[40] The Sedona Conference, Commentary on Achieving Quality in the E-Discovery Process 1 (2009), available at [hereinafter Achieving Quality].

[41] See id.

[42] See Paul & Baron, supra note 3, at ¶¶ 4, 39-40.

[43] See id. at ¶ 36.

[44] Cf. Boehning & Toal, supra note 7.

[45] See Grossman & Cormack, supra note 10, at ¶ 52; see also discussion infra Part III.

[46] See Grossman & Cormack, supra note 10, at ¶ 52; see also discussion infra Part III.

[47] See Degnan, supra note 2, at 161.

[48] See id.

[49] Fed. R. Civ. P. 1.

[50] See Degnan, supra note 2, at 160.

[51] See, e.g., Kashmir Hill & David Lat, Top Lawyers, Washingtonian (Sept. 6, 2012, 1:27 PM).

[52] See Justin Scheck, Tech Firms Pitch Tools For Sifting Legal Records, Wall St. J. (Sept. 6, 2012, 1:35 PM).

[53] See Losey, Introduction to e-Discovery, supra note 25, at 72.

[54] See, e.g., id.

[55] See Paul & Baron, supra note 3, at ¶ 37.

[56] See id. at ¶¶ 37, 40.

[57] See Cooperation Proclamation, supra note 33, at 3.

[58] See Baron, supra note 5, at ¶ 13.

[59] Id.

[60] See id. at ¶ 14 (citing Helmert v. Butterball, LLC, No. 4:08CV00342 JLH, 2010 WL 2179180, at *1-5 (E.D. Ark. May 27, 2010)).

[61] See id. at ¶ 34.

[62] Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *1, *12 (S.D.N.Y. Feb. 24, 2012). The parties in Moore have hotly contested the judicial order in the case, and even though predictive coding met with Judge Peck’s approval, it is now uncertain whether the parties will even use an advanced search and review methodology. Since this article’s writing, another case has emerged where the court approved the use of predictive coding. See Actos Order, supra note 11. In Actos, the order emphasizes the collaboration between the parties that made the use of predictive coding in the e-discovery process possible. Id.

[63] See Baron, supra note 5, at ¶¶ 34-35 (discussing Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 (D.D.C. 2007) and Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008)).

[64] Disability Rights Council, 242 F.R.D. at 141.

[65] Id. at 145-46.

[66] Id. at 148.

[67] Id.

[68] See generally Victor Stanley, Inc., 250 F.R.D. 251.

[69] See id. at 256.

[70] Id. at 259-60.

[71] Id. at 259 n.9.

[72] See Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *1 (S.D.N.Y. Feb. 24, 2012).

[73] See id. at *8-12.

[74] Id. at *1.

[75] See generally id. at *4-5.

[76] Id. at *3-6.

[77] See Moore, 2012 WL 607412, at *3.

[78] See id. at *3-6.

[79] See, e.g., id. at *10-11.

[80] See id. at *12.

[81] See, e.g., Alison Frankel, That federal court e-discovery breakthrough? Not so fast…, Thomson Reuters (May 15, 2012).

[82] Id. (noting that Judge Peck “issued an order staying MSL’s discovery of electronically stored information until there’s a ruling on whether the case can be certified as a collective action”).

[83] See Andrew Peck, Search, Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?, Law Technology News (Oct. 1, 2011) (stating that more attorneys are using advanced search technology as the technology and methods improve).

[84] See Mythbusters: Exploding House, Episode 23 (Discovery television broadcast Nov. 16, 2004) (showing that a needle can literally be found in a haystack, but only by using a specialized machine or process).

[85] Paul & Baron, supra note 3, at ¶ 36.

[86] E.g., About The Sedona Conference, supra note 32; see, e.g., The Sedona Conference Glossary: Commonly Used Terms for E-Discovery and Digital Information Management (3d ed.), The Sedona Conference (Oct. 2010).

[87] See Jonathan Jaffe, Comment to Hash, e-Discovery Team (Dec. 2, 2009, 9:03 AM), (describing a language inconsistency between the legal and technology worlds manifested in the actual blog post’s discussion regarding hashing algorithms). In order for attorneys and technology consultants to work together in a multidisciplinary field like e-discovery, they must both learn to speak the same language.

[88] Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 440 (S.D.N.Y. 2004) (“The subject of the discovery of electronically stored information is rapidly evolving.”).

[89] See, e.g., Paul & Baron, supra note 3, at ¶ 43.

[90] Best Practices, supra note 17, at 204.

[91] Cf. The Harry Potter Lexicon (last visited Nov. 22, 2011) (discussing the added fear inherent in not naming Voldemort, or, He-Who-Must-Not-Be-Named).

[92] See Paul & Baron, supra note 3, at ¶ 43.

[93] Best Practices, supra note 17, at 217.

[94] See Paul & Baron, supra note 3, at ¶ 37.

[95] See id.

[96] See id. at ¶¶ 4, 6.

[97] Id. at ¶ 37 n. 92 (citing J.C. Smith, Machine Intelligence and Legal Reasoning, 73 Chi.-Kent L. Rev. 277, 334-35 (1998)).

[98] See Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 257 (D. Md. 2008).

[99] Id.

[100] Best Practices, supra note 17, at 217.

[101] See id. at 218.

[102] See id.

[103] See id.

[104] See Paul & Baron, supra note 3, at ¶¶ 37-40.

[105] See Best Practices, supra note 17, at 219.

[106] Susan W. Brenner & Barbara A. Frederiksen, Computer Searches and Seizures: Some Unresolved Issues, 8 Mich. Telecomm. & Tech. L. Rev. 39, 60-61 (2002).

[107] Best Practices, supra note 17, at 219.

[108] Id.

[109] Id. at 202.

[110] Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *10 (S.D.N.Y. Feb. 24, 2012).

[111] Id.

[112] See Paul & Baron, supra note 3, at ¶ 38.

[113] Cf. id. at ¶ 39 (stating that searching via keyword is “fraught with technological difficulties”).

[114] See id. at ¶ 40 (citing a study where a keyword search only revealed 20% of the relevant documents in the litigation).

[115] Moore, 2012 WL 607412, at *10-12.

[116] See Best Practices, supra note 17, at 202.

[117] Moore, 2012 WL 607412, at *5.

[118] See Best Practices, supra note 17, at 221.

[119] See Paul & Baron, supra note 3, at ¶ 37.

[120] Best Practices, supra note 17, at 221.

[121] Id.

[122] See id. at 222.

[123] See, e.g., id. at 221.

[124] Id. at 221-22. An online tool may be helpful in visualizing what a taxonomy looks like. A screenshot of a taxonomy example taken from an online visual thesaurus is provided in the Appendix. Thinkmap Visual Thesaurus (last visited Sept. 5, 2012).

[125] See Best Practices, supra note 17, at 222.

[126] See id.

[127] Id. at 219.

[128] Id.

[129] Id.

[130] Best Practices, supra note 17, at 219.

[131] Id.

[132] Id.

[133] See id.

[134] See id.

[135] See, e.g., Early Case Assessment, Clearwell Systems, http://www.clearwell (last visited Sept. 7, 2012).

[136] Cf. Best Practices, supra note 17, at 219.

[137] See id. at 203.

[138] See id. at 222.

[139] Id. at 218.

[140] See id. at 219.

[141] Best Practices, supra note 17, at 218.

[142] Id.

[143] See id. at 218-19.

[144] See id. at 218.

[145] See generally Mehran Sahami et al., A Bayesian Approach to Filtering Junk E-Mail, in Learning for Text Categorization 55 (1998) available at

[146] See generally Baback Moghaddam et al., Bayesian Face Recognition, 33 Pattern Recognition 1771 (2000).

[147] See, e.g., Sahami, supra note 145, at 56.

[148] See id.

[149] See, e.g., id.

[150] See, e.g., id.

[151] See id.

[152] See Sahami, supra note 145.

[153] Id.

[154] See id.

[155] See id.

[156] See Wilson Rothman, What to Know About iPhoto ‘09 Face Detection and Recognition, Gizmodo (Jan. 29, 2009, 8:00 AM). The author is unaware whether iPhoto uses Bayesian classifiers as part of its facial recognition software; however, Bayesian technology has been employed in facial recognition software, and iPhoto provides a popular example of technology that is at least similar to how a Bayesian search and review system might work.

[157] Id.

[158] Id.

[159] Id.

[160] Id.

[161] Rothman, supra note 156.

[162] See Best Practices, supra note 17, at 218.

[163] This appears to be the approach in the now influential Moore decision. Arguably, by having senior attorneys carefully review a smaller “seed” or “training” set of documents, the overall document review process is honed and more attuned to the issues being litigated, leading to a more complete, accurate, and efficient review. See Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *5 (S.D.N.Y. Feb. 24, 2012).

[164] See id. at *5-6.

[165] See id.

[166] See, e.g., id.

[167] Achieving Quality, supra note 40, at 11.

[168] Id. at 9.

[169] See Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260 n.10 (D. Md. 2008).

[170] See Achieving Quality, supra note 40, at 11.

[171] See id.

[172] See id. at 11-12.

[173] See Boehning & Toal, supra note 7, at 2.

[174] See Grossman & Cormack, supra note 10, at ¶ 6.

[175] See id. at ¶ 27.

[176] See id. at ¶¶ 44-46.

[177] See id. at ¶ 34.

[178] See id. at ¶ 7.

[179] The numbers used here are merely provided as an example to help explain the concepts of recall and precision. When determining the total number of potentially relevant documents in the TREC study, a series of mathematical formulas was applied to the data resulting from the various reviews. Four calculations were applied to each group’s review to determine: (1) the proportion of relevant documents within the group of documents reviewed; (2) the number of relevant documents within the group of documents reviewed; (3) the estimate of variance within the produced documents; and (4) the estimate of variance within all the documents reviewed by the given groups. Because of time and resource constraints, the groups were only able to review portions of the full document collection available for the study. Further estimates of the total number of relevant documents were determined as well as the variance calculation on the full document collection. Using information from the review itself, and applying the formulas mentioned above, the TREC study was able to determine a probability estimate of the total number of relevant documents in each sample tested in addition to the full document collection. More information regarding estimating the denominator of potentially relevant documents, including an in-depth analysis and the specific formulas used, can be found online in the TREC 2008 report. See Douglas W. Oard et al., Overview of the TREC 2008 Legal Track: Estimation of Metrics—Interactive Task, at 21-23, 40-44 (2008), available at

[180] See Best Practices, supra note 17, at 206.

[181] See Grossman & Cormack, supra note 10, at ¶ 7.

[182] See id.

[183] See id. at ¶ 9.

[184] See id. at ¶ 9 n.30.

[185] See id.

[186] See Grossman & Cormack, supra note 10, at ¶ 9.

[187] See id.

[188] See id. at ¶ 37 tbl.7.

[189] Id. at ¶ 39.

[190] See id. at ¶¶ 38-39.

[191] See Grossman & Cormack, supra note 10, at ¶¶ 38-39.

[192] See id. at ¶¶ 40-41.

[193] See id.

[194] See id.

[195] See id. at ¶¶ 38-41.

[196] See discussion supra Part III.A.

[197] See Grossman & Cormack, supra note 10, at ¶¶ 30-32.

[198] Id. at ¶ 45.

[199] See id. at ¶ 45 tbl.7.

[200] Id.

[201] Id.

[202] See Grossman & Cormack, supra note 10, at ¶ 45 tbl.7.

[203] See discussion supra Part II.A.

[204] See Grossman & Cormack, supra note 10, at ¶ 45 (proving that a technology assisted review process using advanced search and review methods is more effective than manual review).

[205] See Best Practices, supra note 17, at 214.

[206] See Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *11 (S.D.N.Y. Feb. 24, 2012) (“While this Court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection.”); see also Best Practices, supra note 94, at 206.

[207] See Best Practices, supra note 17, at 206.

[208] See id.

[209] See Grossman & Cormack, supra note 10, at ¶ 45.

[210] Best Practices, supra note 17, at 206.

[211] See Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260 (D. Md. 2008) (“[The] proper selection and implementation [of keywords] obviously involves technical, if not scientific knowledge.”).

[212] See id.

[213] Cf. Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *10-11 (S.D.N.Y. Feb. 24, 2012) (pointing out the low accuracy threshold of keyword searches).

[214] See Achieving Quality, supra note 40, at 5.

[215] See id. at 1-3.

[216] See id.

[217] See generally Bennett B. Borden et al., Why Document Review is Broken, The Williams Mullen Edge (May 2011),

[218] See id. at 2 (explaining the amount of time taken to complete a review of 20,933 documents using the traditional method).

[219] See id.

[220] See id. (explaining the amount of time taken to complete a review of 20,933 documents was ten times faster using linear review than the traditional method).

[221] See Fed. R. Civ. P. 1 (“They shall be construed and administered to secure the just, speedy, and inexpensive determination of every action”).

[222] See Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *11 (S.D.N.Y. Feb. 24, 2012) (“Electronic discovery requires cooperation between opposing counsel and transparency in all aspects of preservation and production of ESI.”).

[223] See id.

[224] See id. at *5.

[225] See id. at *3.

[226] See, e.g., id. at *5.

[227] See Moore, 2012 WL 607412, at *11.

[228] See Best Practices, supra note 17, at 195.

[229] Fed. R. Civ. P. 1.

[230] See supra Part II.A-B.

[231] See United States v. O’Keefe, 537 F. Supp. 2d 14, 24 (D.D.C. 2008) (“This topic is clearly beyond the ken of a layman and requires that any such conclusion be based on evidence that, for example, meets the criteria of Rule 702 of the Federal Rules of Evidence [regarding the introduction of evidence via experts].”).

[232] See discussion supra Part II.B.

[233] See Borden, supra note 217, at 4.

[234] See Best Practices, supra note 17, at 194.

[235] See id.

[236] See Achieving Quality, supra note 40, at 309.

[237] See Boehning & Toal, supra note 7, at 2; see also discussion infra Part III.B.

[238] See Best Practices, supra note 17, at 194 (“[A]ny automated search method or technology will be enhanced by a well-thought out process with substantial human input on the front end.”).

[239] See id. at 209-10. If substantial human input is required to initiate an advanced search methodology, then smaller data sets that take less time to review manually than it would to create the right search environment should not use advanced search and review methods.

[240] See id. at 194.

[241] See Achieving Quality, supra note 40, at 320-21.

[242] See id. at 321.

[243] See Fed. R. Civ. P. 26(b)(5)(B); Fed. R. Evid. 502(b)(1).

[244] See Baron, supra note 5, at 40.

[245] Compare Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 257 (D. Md. 2008) (waiving privilege on 165 documents out of a universe of 9000 documents), with Mt. Hawley Ins. Co. v. Felman Prod., Inc., 271 F.R.D. 125, 136, 139 (S.D. W. Va. 2010) (waiving privilege on 377 documents out of a universe of millions of documents because 377 was double the number of privileged documents produced in the Victor Stanley case).

[246] See Victor Stanley, 250 F.R.D. at 257.

[247] See Mt. Hawley, 271 F.R.D. at 138-39; Baron, supra note 5, at 40 (citing Ralph Losey, The Good, the Bad, and the Ugly: “Mt. Hawley Ins. Co. v. Felman Production, Inc.”, e-Discovery Team (June 10, 2010, 7:11 AM)).

[248] See Mt. Hawley, 271 F.R.D. at 136, 139.

[249] See id.

[250] See Ralph Losey, The Good, the Bad, and the Ugly: “Mt. Hawley Ins. Co. v. Felman Production, Inc.”, e-Discovery Team® (June 10, 2010, 7:11 AM).

[251] See Baron, supra note 5, at ¶ 40.

[252] See id.

[253] See Mt. Hawley, 271 F.R.D. at 133-34.

[254] Fed. R. Evid. 502 advisory committee’s note (emphasis added).

[255] Cf. Best Practices, supra note 17, at 194.

[256] See Baron, supra note 5, at ¶ 40.

[257] See Borden, supra note 217, at 3.

[258] See, e.g., Moore v. Publicis Groupe, No. 11 Civ. 1279(ALC)(AJP), 2012 WL 607412, at *11 (S.D.N.Y. Feb. 24, 2012) (“[T]he Federal Rules of Civil Procedure do not require perfection.”).

[259] Fed. R. Civ. P. 1.

[260] Cf. Best Practices, supra note 17, at 192.

[261] See discussion supra Part III.B.

[262] See Achieving Quality, supra note 40, at 1.

[263] Cf. Best Practices, supra note 17, at 194.

[264] See Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 432 (S.D.N.Y. 2004).

[265] See Baron, supra note 5, at ¶ 37.

[266] See Fed. R. Civ. P. 26(f).

[267] See Baron, supra note 5, at ¶ 13.

[268] Cf. Paul & Baron, supra note 3, at ¶ 28.

[269] See Achieving Quality, supra note 40, at 17.

[270] See Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260 n.10 (D. Md. 2008).

[271] See Best Practices, supra note 17, at 194.

[272] See Victor Stanley, Inc., 250 F.R.D. at 260 n.10.

[273] Cf. Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 440 (S.D.N.Y. 2004).

[274] See Losey, Current Trends, supra note 27, at 41.

[275] See id. at 29.

[276] See Best Practices, supra note 17, at 194.

[277] See discussion supra Part II.B.

[278] See Cooperation Proclamation, supra note 33, at 331.

[279] See Bennett B. Borden et al., Four Years Later: How The 2006 Amendments To The Federal Rules Have Reshaped The E-Discovery Landscape And Are Revitalizing The Civil Justice System, 17 Rich. J.L. & Tech. 10, ¶ 36 (2011).


Who will take care of your social media accounts?

Blog: My Executor Has Never Used the Internet: Estate Planning and Digital Property

By Associate Editor Kevin McCann

In 2007, a devoted World of Warcraft player decided it was time to put down his virtual crossbow and axe and sell his player account. Given the amount of time he had put into leveling up the character’s abilities and gear, the account was in high demand and sold for 7,000 euros (approximately $9,000). But what if the player had died before deciding to sell? Most likely his last will and testament would have contained no provision stating what to do with this asset, the account would have been deleted, and the potential money would have been lost.

While this is an extreme example of protecting a digital asset, estate planners and lawyers indicate that few people consider digital assets and online accounts when drafting their wills. There is a range of issues to contemplate involving electronically stored items, from preserving online photos, projects, and personal records to deciding how you would want your family to manage your social media accounts. A survey by McAfee revealed that U.S. consumers value their digital assets, on average, at nearly $55,000, with approximately $19,000 attributed to personal memories (photographs and videos) alone. A living person would certainly want to determine the distribution of these electronically stored personal memories just as if they were photos in an attic.

In addition, social media websites such as Facebook and Twitter now have deceased user policies. Both policies allow interested parties to select one of two options: either delete the user account entirely or save the account in order to memorialize the deceased and allow others to interact with his or her preserved account. One could imagine a situation where a person would want his account deleted to save his family embarrassment, or the opposite situation where a person would want his family to continue to interact with his account through the grieving process after his death. This is another consideration to contemplate when drafting a will.

Several states have enacted legislation that pertains to post-death access to digital accounts. For instance, a New Jersey bill was introduced in June of this year that would grant the executor or administrator of an estate the power to take control of any account of the deceased person for social networking, blogging, or e-mail service websites. However, many of these state laws specify that the deceased must have designated the representative in writing prior to death. The U.S. General Services Administration recommends that people set up a “social-media will,” and even go as far as naming a separate “digital executor” who is more up to speed on technology innovations and is more qualified to oversee the administration of the deceased’s digital assets. In addition, estate planners advise that the probate process would take considerably less time if the testator were to include in his will a list of all accounts, passwords, and security question answers. Otherwise, the executor would have to go through the process of submitting death certificates and relationship authentication to each of the websites.

The internet has changed the way society communicates and expresses itself, and various legal issues have arisen with this modernization. The protection of online assets at death is now a growing concern, with states just beginning to recognize the need for legislation. As the internet continues to reinvent itself with new services to better connect the world, so too must the estate planning process strive to keep up with these innovations.


Additional Resources:

Wall Street Journal article on the issue

Chicago Tribune article on the issue

List of online services that are designed to help someone plan for probate process of digital assets.

Sedona Conference to use 4 JOLT Articles for Cooperation Proclamation: Resources for the Judiciary

The Sedona Conference,® a nonprofit, 501(c)(3) research and educational institute dedicated to the advanced study of law and policy in the areas of antitrust law, complex litigation, and intellectual property rights, will use JOLT articles in its Cooperation Proclamation, a resource for members of the Judiciary.  The Sedona Conference’s mission is to drive the reasoned and just advancement of law and policy by stimulating ongoing dialogue amongst leaders of the bench and bar to achieve consensus on critical issues.

The conference will use the following four JOLT articles:

The Proclamation aims to “reverse the legal culture of adversarial discovery that is driving up costs and delaying justice; to help create “toolkits” of model case management techniques and resources for the Bench, inside counsel, and outside counsel to facilitate proportionality and cooperation in discovery; and to help create a network of trained electronic discovery mediators available to parties in state and federal courts nationwide, regardless of technical sophistication, [or] financial resources.”

For more information on the Sedona Conference or the Cooperation Proclamation, please visit
