Category Archives: eDiscovery

Substantial Reduction in Review Effort Required to Demonstrate Adequate Recall

Measuring the recall achieved to within +/- 5% to demonstrate that a production is defensible can require reviewing a substantial number of random documents.  For a case of modest size, the amount of review required to measure recall can be larger than the amount of review required to actually find the responsive documents with predictive coding.  This article describes a new method requiring much less document review to demonstrate that adequate recall has been achieved.  This is a brief overview of a more detailed paper I’ll be presenting at the DESI VII Workshop on June 12th.

The proportion of a population having some property can be estimated to within +/- 5% by measuring the proportion on a random sample of 400 documents (you’ll also see the number 385 being used, but using 400 will make it easier to follow the examples).  To measure recall we need to know what proportion of responsive documents are produced, so we need a sample of 400 random responsive documents.  Since we don’t know which documents in the population are responsive, we have to select documents randomly and review them until 400 responsive ones are found.  If prevalence is 10% (10% of the population is responsive), that means reviewing roughly 4,000 documents to find 400 that are responsive so that recall can be estimated.  If prevalence is 1%, it means reviewing roughly 40,000 random documents to measure recall.  This can be quite a burden.
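To make the arithmetic concrete, here is a minimal Python sketch (mine, not from the paper) of the expected review burden for the direct recall measurement described above:

    # Expected number of random documents reviewed to find a fixed number of
    # responsive ones: roughly (responsive sample size) / prevalence.
    def expected_review_effort(responsive_needed, prevalence):
        return responsive_needed / prevalence

    for prev in (0.10, 0.01):
        print(f"prevalence {prev:.0%}: ~{expected_review_effort(400, prev):,.0f} documents reviewed")
    # prevalence 10%: ~4,000 documents reviewed
    # prevalence 1%: ~40,000 documents reviewed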

Once recall is measured, a decision must be made about whether it is high enough.  Suppose you decide that if at least 300 of the 400 random responsive documents were produced (75%) the production is acceptable.  For any actual level of recall, the probability of accepting the production can be computed (see figure to right).  The probability of accepting a production where the actual recall is less than 70% will be very low, and the probability of rejecting a production where the actual recall is greater than 80% will also be low — this comes from the fact that a sample of 400 responsive documents is sufficient to measure recall to within +/- 5%.
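The acceptance curve in the figure can be approximated with a simple binomial model.  The sketch below (my own illustration, ignoring finite-population corrections) computes the probability of accepting the production under the "at least 300 of 400 produced" rule for several values of actual recall:

    from math import comb

    def acceptance_probability(actual_recall, n=400, threshold=300):
        # P(X >= threshold) with X ~ Binomial(n, actual_recall): the chance that
        # at least `threshold` of the n sampled responsive documents were produced.
        return sum(comb(n, k) * actual_recall**k * (1 - actual_recall)**(n - k)
                   for k in range(threshold, n + 1))

    for r in (0.70, 0.75, 0.80, 0.85):
        print(f"actual recall {r:.0%}: P(accept) = {acceptance_probability(r):.3f}")
    # Acceptance probability is close to 0 when actual recall is below 70% and
    # close to 1 when it is above 80%, consistent with the +/- 5% sample width.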

The idea behind the new method is to achieve the same probability profile for accepting/rejecting a production using a multi-stage acceptance test.  The multi-stage test gives the possibility of stopping the process and declaring the production accepted/rejected long before reviewing 400 random responsive documents.  The procedure is shown in the flowchart to the right (click to enlarge).  A decision may be reached after reviewing enough documents to find just 25 random documents that are responsive.  If a decision isn’t made after reviewing 25 responsive documents, review continues until 50 responsive documents are found and another test is applied.  At worst, documents will be reviewed until 400 responsive documents are found (the same as the traditional direct recall estimation method).
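To illustrate the structure (and only the structure) of such a test, here is a sketch in Python.  The checkpoint sizes (25, 50, and so on up to 400 responsive documents) follow the flowchart described above, but the accept/reject counts are hypothetical placeholders, not the boundaries from the paper, which are given in its tables:

    # Each stage: (responsive docs reviewed, accept if produced >= A, reject if produced <= R).
    # The A and R values below are HYPOTHETICAL placeholders, not the paper's boundaries.
    STAGES = [
        (25,  24,  12),
        (50,  45,  29),
        (100, 84,  64),
        (200, 159, 140),
        (400, 300, 299),   # final stage forces a decision (the single-stage rule)
    ]

    def multistage_decision(produced_at_checkpoint):
        # produced_at_checkpoint[i] = produced documents among the first
        # STAGES[i][0] random responsive documents reviewed.
        reviewed = 0
        for (n, accept_at, reject_at), produced in zip(STAGES, produced_at_checkpoint):
            reviewed = n
            if produced >= accept_at:
                return "accept", reviewed   # stop early: recall is high enough
            if produced <= reject_at:
                return "reject", reviewed   # stop early: recall is too low
        return "continue", reviewed         # keep reviewing toward the next checkpoint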

The figure to the right shows six examples of the multi-stage acceptance test being applied when the actual recall is 85%.  Since 85% is well above the 80% upper bound of the 75% +/- 5% range, we expect this production to virtually always be accepted.  The figure shows that acceptance can occur long before reviewing a full 400 random responsive documents.  The number of random responsive documents reviewed is shown on the vertical axis.  Toward the bottom of the graph the sample is very small and the percentage of the sample that has been produced may deviate greatly from the right answer of 85%.  As you go up the sample gets larger and the proportion of the sample that is produced is expected to get closer to 85%.  When a green decision boundary is touched, causing the production to be accepted as having sufficiently high recall, the color of the remainder of the path is changed to yellow — the yellow part represents the document review that is avoided by using the multi-stage acceptance method (since the traditional direct recall measurement would involve going all the way to 400 responsive documents).  As you can see, when the actual recall is 85% the number of random responsive documents that must be reviewed is often 50 or 100, not 400.

The figure to the right shows the average number of documents that must be reviewed using the multi-stage acceptance procedure from the earlier flowchart.  The amount of review required can be much less than 400 random responsive documents.  In fact, the further above/below the 75% target (called the “splitting recall” in the paper) the actual recall is, the less document review is required (on average) to come to a conclusion about whether the production’s recall is high enough.  This creates an incentive for the producing party to aim for recall that is well above the minimum acceptable level since it will be rewarded with a reduced amount of document review to confirm the result is adequate.
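A quick Monte Carlo sketch (again using my hypothetical boundaries rather than the paper's) shows the qualitative behavior: the farther the actual recall is from the splitting recall, the sooner the test tends to stop:

    import random

    # Hypothetical stage boundaries, as in the earlier sketch (not the paper's values).
    STAGES = [(25, 24, 12), (50, 45, 29), (100, 84, 64), (200, 159, 140), (400, 300, 299)]

    def average_effort(actual_recall, trials=10_000):
        # Average number of random responsive documents reviewed before the
        # multi-stage test reaches an accept/reject decision.
        total = 0
        for _ in range(trials):
            produced = reviewed = 0
            for n, accept_at, reject_at in STAGES:
                # review random responsive documents up to the next checkpoint
                produced += sum(random.random() < actual_recall for _ in range(n - reviewed))
                reviewed = n
                if produced >= accept_at or produced <= reject_at:
                    break
            total += reviewed
        return total / trials

    for r in (0.60, 0.75, 0.85, 0.95):
        print(f"actual recall {r:.0%}: average responsive docs reviewed ≈ {average_effort(r):.0f}")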

It is important to note that the multi-stage procedure provides an accept/reject result, not a recall estimate.  If you follow the procedure until an accept/reject boundary is hit and then use the proportion of the sample that was produced as a recall estimate, that estimate will be biased (the use of “unbiased” in the paper title refers to the sampling being done on the full population, not on a subset [such as the discard set] that would cause a bias due to inconsistency in review of different subsets).

You may want to use a splitting recall other than 75% for the accept/reject decision — the full paper provides tables of values necessary for doing that.

Highlights from the South Central IG Retreat 2017

The 2017 South Central Information Governance Retreat was the first retreat in the Ing3nious series held in Texas at the La Cantera Resort & Spa.  The retreat featured two simultaneous sessions throughout the day.  My notes below provide some highlights from the sessions I was able to attend.

The day started with roundtable discussions that were kicked off by a speaker who talked about the early days of the Internet.  He made the point that new lawyers may know less about how computers actually work even though they were born in an era when computers are more pervasive.  He mentioned that one of the first keyword searches he performs when he receives a production is for “f*ck.”  If a company was having problems with a product and there isn’t a single email using that word, something was surely withheld from the production.  He also argued that expert systems intended to replace lawyers must be based on how the experts (lawyers) actually think.  How do you identify the 50 documents that will actually be used at trial?

Borrowing Agile Development Concepts To Jump-Start Your Information Governance Program
I couldn’t attend this one.

Your Duty To Preserve: Avoiding Traps In Troubled Times
When storing data in the cloud, what is actually retained?  How can you get the data out?  Google Vault only indexes newly added emails, not old ones.  The company may not have the right to access employee data in the cloud.  One panelist commented that collection is preferred to preservation in place.

Enhancing eDiscovery With Next Generation Litigation Management Software
I couldn’t attend this one.

Leveraging The Cloud & Technology To Accelerate Your eDiscovery Process
Cloud computing seems to have reached an inflection point.  A company cannot put the resources into security and data protection that Amazon can.  The ability to scale up/down is good for litigation that comes and goes.  Employees can jump into cloud services without the preparation that was required for doing things on site.  Getting data out can be hard.  Office 365 download speed can be a problem (2-3 GB/hr) — reduce data as much as possible.

Strategies For Effectively Managing Your eDiscovery Spend
I couldn’t attend this one.

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

Achieving GDPR Compliance For Unstructured Content
I couldn’t attend this one.

Zen & The Art Of Multi-Language Discovery: Risks, Review & Translation
The translation company should be brought in when the team is formed (it often isn’t done until later).  Help may be needed from a translator or localization expert to come up with search terms.  For example, there are 20 ways to say “CEO” in Korean.  Translation must be done by an expert to be certified.  When using TAR, do review in the native language and translate the result before presenting to the legal team.  Translation is much slower than review.  Machine translation has improved over the last 2 years, but it’s not good enough to rely on for anything important.  A translator leaked Toyota’s data to the press — keep the risk in mind and make sure you are informed about the environment where the work is being done (screenshots should be prohibited).

Beyond The Firewall: Cybersecurity & The Human Factor
I couldn’t attend this one.

Ethical Obligations Relating To Metadata
Nineteen states have enacted ethical rules on metadata.  Sometimes, metadata is enough to tell the whole story.  John McAfee was found and arrested because of GPS coordinates embedded in a photo of him.  Metadata showed that a terminated whistleblower’s employee review was written 3 months after termination.  Forensic collection is important to avoid spoiling the metadata.  Ethical obligations of attorneys are broader than attorney-client privilege.  Should attorneys be encrypting email?  Make the client aware of metadata and how it can be viewed.  The attorney must understand metadata and scrub it as necessary (e.g., Track Changes in Word).  In e-discovery metadata is treated like other ESI.  Think about metadata when creating a protective order.  What are the ethical restrictions on viewing and mining metadata received through discovery?  Whether you need to disclose receipt of confidential or privileged metadata depends on the jurisdiction.

Legal Risks Associated With Failing To Have A Cyber Incident Response Plan
I couldn’t attend this one.

“Defensible Deletion” Is The Wrong Frame
Defensible deletion started with an IBM survey that found that on average 69% of corporate data has no value, 6% is subject to litigation hold, and 25% is useful.  IBM started offering to remove 45% of data without doing any harm to a company (otherwise, you don’t have to pay).  Purging requires effort, so make deletion the default.  Statistical sampling can be used to confirm that retention rules won’t cause harm.  After a company said that requested data wasn’t available because it had been deleted in accordance with the retention policy, an employee who was being deposed said he had copied everything to 35 CDs — it can be hard to ensure that everything is gone even if you have the right policy.

Highlights from Ipro Innovations 2017

The 16th annual Ipro Innovations conference was held at the Talking Stick Resort.  It was a well-organized conference with over 500 attendees, lots of good food and swag, and over two days’ worth of content.  Sometimes, everyone attended the same presentation in a large hall.  Other times, there were seven simultaneous breakout sessions.  My notes below cover only the small subset of the presentations that I was able to attend.  I visited the Ipro office on the final day.  It’s an impressive, modern office with lots of character.  If you are wondering whether the Ipro people have a sense of humor, you need look no farther than the signs for the restrooms.

The conference started with a summary of recent changes to the Ipro software line-up, how it enables a much smaller team to manage large projects, and stats on the growing customer base.  They announced that Clustify will soon replace Content Analyst as their analytics engine.  In the first phase, both engines will be available and will be implemented similarly, so the user can choose which one to use.  Later phases will make more of Clustify’s unique functionality available.  They announced an investment by ParkerGale Capital.  Operations will largely remain unchanged, but there may be some acquisitions.  The first evening ended with a party at Top Golf.

Ari Kaplan gave a presentation entitled “The Opportunity Maker,” where he told numerous entertaining stories about business problems and how to find opportunities.  He explained that doing things that nobody else does can create opportunities.  He contacts strangers from his law school on LinkedIn and asks them to meet for coffee when he travels to their town — many accept because “nobody does that.”  He sends postcards to his clients when traveling, and they actually keep them.  To illustrate the value of putting yourself into the path of opportunity, he described how he got to see the Mets in the World Series.  He mentioned HelpAReporter.com as a way to get exposure for yourself as an expert.

One of the tracks during the breakout sessions was run by The Sedona Conference and offered CLE credits.  One of the TSC presentations was “Understanding the Science & Math Behind TAR” by Maura Grossman.  She covered the basics like TAR 1.0 vs. 2.0, human review achieving roughly 70% recall due to mistakes, and how TAR performs compared to keyword search.  She mentioned that control sets can become stale because the reviewer’s concept of relevance may shift during the review.  People tend to get pickier about relevance as the review progresses, so an estimate of the number of relevant docs taken on a control set at the beginning may be too high.  She also warned that making multiple measurements against the control set can give a biased estimate about when a certain level of performance is achieved (sidenote: this is because people watch for a measure like F1 to cross a threshold to determine training completeness, which is not the best way to use a control set).  She mentioned that she and Cormack have a new paper coming out that compares human review to TAR using better-reviewed data (Tim Kaine’s emails) that addresses some criticisms of their earlier JOLT study.

There were also breakout sessions where attendees could use the Ipro software with guidance from the staff in a room full of computers.  I attended a session on ECA/EDA.  One interesting feature that was demonstrated was checking the number of documents matching a keyword search that did not match any of the other searches performed — if the number is large, it may not be a very good search query.
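That check is easy to picture as a set operation.  The sketch below is purely illustrative (the search names and document IDs are made up, and this is not Ipro’s actual interface or API): for each search, count the hits that no other search matched:

    # Hypothetical hit lists: search name -> set of matching document IDs.
    search_hits = {
        "contract terms":   {101, 102, 103, 104},
        "project codename": {103, 104, 105},
        "bare wildcard":    set(range(100, 5100)),   # suspiciously broad query
    }

    for name, hits in search_hits.items():
        other_hits = set().union(*(h for n, h in search_hits.items() if n != name))
        unique = hits - other_hits
        print(f"{name}: {len(hits)} hits, {len(unique)} matched by no other search")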

Another TSC session I attended was by Brady, Grossman, and Shonka on responding to government and internal investigations.  Often (maybe 20% of the time) the government is inquiring because you are a source of information, not the target of the investigation, so it may be unwise to raise suspicion by resisting the request.  There is nothing similar to the Federal Rules of Civil Procedure for investigations.  The scope of an investigation can be much broader than civil discovery.  There is nothing like rule 502 (protecting privilege) for investigations.  The federal government is pretty open to the use of TAR (don’t want to receive a document dump), though the DOJ may want transparency.  There may be questions about how some data types (like text messages) were handled.  State agencies can be more difficult.

The last session I attended was the analytics roundtable, where Ipro employees asked the audience questions about how they were using the software and solicited suggestions for how it could be improved.  The day ended with the Salsa Challenge (as in food, not dancing) and dinner.  I wasn’t able to attend the presentations on the final day, but the schedule looked interesting.

Webinar: 10 Years Forward and Back: Automation in eDiscovery

George Socha, Doug Austin, David Horrigan, Bill Dimm, and Bill Speros will give presentations in this webinar on the history and future of e-discovery, moderated by Mary Mack, on December 1, 2016.  Bill Dimm will talk about the evolution of predictive coding technologies and our understanding of best practices, including recall estimation, the evil F1 score, research efforts, pre-culling, and the TAR 1.0, 2.0, and 3.0 workflows.  CLICK HERE FOR RECORDING OF WEBINAR, SLIDES, AND LINKS TO RELATED RESOURCES.

Highlights from the Northeast eDiscovery & IG Retreat 2016

The 2016 Northeast eDiscovery & IG Retreat was held at the Ocean Edge Resort & Golf Club.  It was the third annual Ing3nious retreat held on Cape Cod.  The retreat featured two simultaneous sessions throughout the day in a beautiful location.  My notes below provide some highlights from the sessions I was able to attend.  You can find additional photos here.

Peer-to-Peer Roundtables
The retreat started with peer-to-peer round tables where each table was tasked with answering the question: Why does e-discovery suck (gripes, pet peeves, issues, etc.) and how can it be improved?  Responses included:

  • How to drive innovation?  New technologies need to be intuitive and simple to get client adoption.
  • Why are e-discovery tools only for e-discovery?  Should be using predictive coding for records management.
  • Need alignment between legal and IT.  Need ongoing collaboration.
  • Handling costs.  Cost models and comparing service providers are complicated.
  • Info governance plans for defensible destruction.
  • Failure to plan and strategize e-discovery.
  • Communication and strategy.  It is important to get the right people together.
  • Why not more cooperation at meet-and-confer?  Attorneys that are not comfortable with technology are reluctant to talk about it.  Asymmetric knowledge about e-discovery causes problems–people that don’t know what they are doing ask for crazy things.

Catching Up on the Implementation of the Amended Federal Rules
I couldn’t attend this one.

Predictive Coding and Other Document Review Technologies–Where Are We Now?
It is important to validate the process as you go along, for any technology.  It is important to understand the client’s documents.  Pandora is more like TAR 2.0 than TAR 1.0, because it starts giving recommendations based on your feedback right away.  The 2012 RAND study found this e-discovery cost breakdown: 73% document review, 8% collection, and 19% processing.  A question from the audience about pre-culling with keyword search before applying predictive coding spurred some debate.  Although it wasn’t mentioned during the panel, I’ll point out William Webber’s analysis of the Biomet case, which shows pre-culling discarded roughly 40% of the relevant documents before predictive coding was applied.  There are many different ways of charging for predictive coding: amount of data, number of users, hose (total data flowing through) or bucket (max amount of data allowed at one time).  Another barrier to use of predictive coding is lack of senior attorney time (e.g., to review documents for training).  Factors that will aid in overcoming barriers: improving technologies, Sherpas to guide lawyers through the process, court rulings, influence from general counsel.  Need to admit that predictive coding doesn’t work for everything, e.g., calendar entries.  New technologies include anonymization tools and technology to reduce the size of collections.  Existing technologies that are useful: entity extraction, email threading, facial recognition, and audio to text.  Predictive coding is used in maybe less than 1% of cases, but email threading is used in 99%.

It’s All Greek To Me: Multi-Language Discovery Best Practices
Native speakers are important.  An understanding of relevant industry terminology is important, too.  The ALTA fluency test is poor–the test is written in English and then translated to other languages, so it’s not great for testing ability to comprehend text that originated in another language.  Hot documents may be translated for presentation.  This is done with a secure platform that prohibits the translator from downloading the documents.  Privacy laws make it best to review in-country if possible.  There are only 5 really good legal translation companies–check with large firms to see who they use.  Throughput can be an issue.  Most can do 20,000 words in 3 days.  What if you need to do 200,000 in 3 days?  Companies do share translators, but there’s no reason for good translators to work for low-tier companies–good translators are in high demand.  QC foreign review to identify bad reviewers (need proficient managers).  May need to use machine translation (MT) if there are millions of documents.  QC the MT result and make sure it is actually useful–in 85% of cases it is not good enough.  For CJK (Chinese, Japanese, Korean), MT is terrible.  The translation industry is $40 billion.  Google invested a lot in MT but it didn’t help much.  One technology that is useful is translation memory, where repeated chunks of text are translated just once.  People performing review in Japanese must understand the subtlety of the American legal system.

Top Trends in Discovery for 2016
I couldn’t attend this one.

Measure Twice, Discover Once
Why measure in e-discovery?  So you can explain what happened and why, for defensibility.  Also important for cost management.  The board of directors may want reports.  When asked for more custodians you can show the cost and expected number of relevant documents that will be added by analyzing the number of keyword search hits.  Everything gets an ID number for tracking and analysis (USB drives, batches of documents, etc.).  Types of metrics ordered from most helpful to most harmful: useful, no metric, not useful, and misleading.  A simple metric used often in document review is documents per hour per reviewer.  What about document complexity, content complexity, number and type of issue codes, review complexity, risk tolerance instructions, number of “defect opportunities,” and number coded correctly?  Many 6-sigma ideas from manufacturing are not applicable due to the subjectivity that is present in document review.

Information Governance and Data Privacy: A World of Risk
I couldn’t attend this one.

The Importance of a Litigation Hold Policy
I couldn’t attend this one.

Alone Together: Where Have All The Model TAR Protocols Gone?
If you are disclosing details, there are two types: inputs (search terms used to train, shared review of training docs) and outputs (target recall or disclosure of recall).  Don’t agree to a specific level of recall before looking at the data–if prevalence is low it may be hard.  Plaintiff might argue for TAR as a way to overcome cost objections from the defendant.  There is concern about lack of sophistication from judges–there is “stunning” variation in expertise among federal judges.  An attorney involved with the Rio Tinto case recommends against agreeing on seed sets because it is painful and focuses on the wrong thing.  Sometimes there isn’t time to put eyes on all documents that will be produced.  Does the TAR protocol need to address dupes, near-dupes, email threading, etc.?

Information Governance: Who Owns the Information, the Risk and the Responsibility?
I couldn’t attend this one.

Bringing eDiscovery In-House — Savings and Advantages
I was on this panel, so I didn’t take notes.

Webinar: How Automation is Revolutionizing eDiscovery

Doug Austin, Bill Dimm, and Bill Speros will give presentations in this webinar moderated by Mary Mack on August 10, 2016.  In addition to broad topics on automation in e-discovery, expect a fair amount on technology-assisted review, including a description of TAR 1.0, 2.0, and 3.0, comparison to human review, and controversial thoughts on judicial acceptance.  CLICK HERE FOR RECORDED WEBINAR

Highlights from the NorCal eDiscovery & IG Retreat 2016

The 2016 NorCal Retreat was held at the Ritz-Carlton at Half Moon Bay, marking the fifth anniversary of the Ing3nious retreat series (originally under the name Carmel Valley eDiscovery Retreats).  As always, the location was beautiful and the talks were informative.  Two of the speakers had books available: Christopher Surdak’s Jerk: Twelve Steps to Rule the World and Michael Quartararo’s Project Management in Electronic Discovery: An Introduction to Core Principles of Legal Project Management and Leadership In eDiscovery.  My notes below provide some highlights from the sessions I was able to attend (there were two simultaneous sessions most of the day).  You can find more photos here.

The Changes, Opportunities and Challenges of the Next Four Years
The keynote by Christopher Surdak covered topics from his new book, Jerk (jerk is the rate of change of acceleration, i.e., the third derivative of position with respect to time).  After surveying the audience and finding there was nobody in the room that didn’t have a smartphone, he listed the six challenges of the new normal: quality (consumers expect perfection), ubiquity (anything anywhere anytime), immediacy (there’s an app for that, instantly), disengagement (people buy the result–they don’t care where it came from), intimacy (customers want connectedness and sense of community), and purpose (support customers’ need to feel a sense of purpose, like paying a high price to be “green”).  He then described the four Trinities of Power that we’ve gone through over history: tools, dirt (land), analog (capital), and digital (information).  Information is now taking over from capital–the largest companies are Apple, Google, and Microsoft.  Much of the global economy is experiencing negative interest rates–the power of capital is going away.  He then described the twelve behaviors of Jerks, the disruptive companies that come out of nowhere and take off:

  1. Use other people’s capital – Airbnb uses your home; Uber uses your car
  2. Replace capital with information – Amazon is spending money to create retail stores to learn why you go there.
  3. Focus on context, not content
  4. Eliminate friction
  5. Create value webs, not value chains – supply chains slow you down when you have to wait for a step to complete.  What someone values will change tomorrow, so don’t get locked into a contract/process.
  6. Invert economies of scale and scope – concierge healthcare and doctor on demand are responses to unsatisfying healthcare system
  7. Sell with and through, not to
  8. Print your own money – Hilton points, etc.
  9. Flout the rules – rules are about controlling capital.  FanDuel (fantasy sports) refuses cease-and-desist demands because there is more money in continuing to operate even after legal costs.  Tesla sells directly (no dealerships).  Uber has a non-compliance department (not sure if he meant that literally).
  10. Hightail it – people with unmet needs (tail of distribution) are willing to pay the most
  11. Do then learn, not learn then do – learning first is driven by not wanting to waste capital
  12. Look forward, not back – business intelligence is about looking back (where is my capital?)

Dubai is legally obligating its government to open data to everyone.  They want to become the central data clearinghouse.  You can become an e-resident of Dubai (no reporting back to the U.S. government).

How About Some Truly Defensible QC in eDiscovery? Applying Statistical Sampling to Corporate eDiscovery Processes
I was on this panel, so I didn’t take notes.

Analytics & eDiscovery: Employing Analytics for Better, More Efficient, and Cost-Effective eDiscovery
I wasn’t able to attend this one.

Can DO-IT-YOURSELF eDiscovery Actually Deliver?
This was a software demo by Ipro.  Automation that reduces human touching of data improves quality and speed.  They will be adding ECA over the next month.

Behind the eDiscovery Ethics Wheel: Cool, Calm, and Competent
I wasn’t able to attend this one.

“Shift Left”: A New Age of eDiscovery – Analytics and ECA
I wasn’t able to attend this one.

When the Government Comes Knocking; Effective eDiscovery Management During Federal Investigations
Talk to custodians–they can provide useful input to the TAR process or help you learn what relevant documents are expected to look like.  Do keyword search over all emails and use relevant documents found to identify important custodians.  Strategy is determined by time frame, volume, and budget.  Don’t tell the government how you did the production–more detail tends to lead to more complications.  Expectations depend on the agency.  Sophistication varies among state AGs.  Different prosecutors/regions have different expectations and differing trust.  The attorney should talk to technologists about documenting the process to avoid scrutiny later on.  Having good processes lined up early demonstrates that you are on top of things.  Be prepared to explain what body of data you plan to search.  Only disclose details if it is necessary for trust.  Describe results rather than methods.  The FTC, DOJ, and SEC will ask up front if you are using keywords or predictive coding.  If you use keywords, they will require disclosure of the words.  When dealing with proprietary databases, negotiate to produce only a small subset.  Government uses generic template requests–negotiate to reduce effort in responding.  In-place holds can over-preserve email (can’t delete email about kids’ soccer practice).  Be aware of privacy laws when dealing with international data.

The Next Generation of Litigation Technology: Looking Beyond eDiscovery
I wasn’t able to attend this one.

Top Trends in Discovery for 2016
Gartner says 50% of employers will require employees to BYOD by 2017.  Very few in the audience had signed a BYOD policy.  Very few had had a litigation hold on their personal phones.  Text messages are often included in a discovery request, but it is burdensome to collect them.  Wickr is a text messaging app that encrypts data in transit and the message self-destructs (much more secure than Snapchat).  A BYOD policy should address security (what is permitted, what must be password protected, what to do if a device is lost or stolen–remote wipe won’t be possible if you have the carrier disable the phone), bans on jailbreaking, what happens when the employee leaves the company, prohibitions on saving data to the cloud, and a requirement that iOS users enable “find my phone.”  Another trend is the change to rule 37(e).  There is now a higher bar to get sanctions for failure to preserve.  If an employee won’t turn over data, you can fire them but that won’t help in satisfying the requesting party.  It is too soon to tell if the changes to the FRCP will really change things.  With such large data volumes, law firms are starting to cooperate (e.g., telling producing party when they produced something they shouldn’t have).  Cybersecurity is another trend.  Small service providers may not be able to afford cybersecurity audits.  The final trend is the EU/US privacy shield.  The new agreement will go back to the courts since the U.S. is still doing mass surveillance.  Model contract clauses are not the way to go, either (being challenged in Ireland).

Through the Looking Glass: On the Other Side of the FRCP Amendments
I wasn’t able to attend this one.

Best Practices for eDiscovery in Patent Litigation
For preservation, you should know which products and engineers are involved.  It is wise to over-preserve.  You may collect less than you preserve.  There is more data in patent litigation.  Ask about notebooks (scientists like paper).  Also, look for prototypes shown at trade shows, user manuals, testing documents, and customer communications.  Tech support logs/tickets may show inducement of infringement.  Be aware of relevant accounting data.  50-60% of new filings are from NPEs (non-practicing entities), so discovery is asymmetric.  In 2015 the cost of e-discovery was $200k to $1 million for a $1 million patent case.  In the rule 26(f) meeting, you should set the agenda to control costs.  Get agreement about what to collect in writing.  Don’t trust the person suing you to not use your confidential info–mark as “for attorney’s eyes only,” require encryption and an access log.  It is difficult to educate a judge about your product.  Proportionality will only change if courts start looking at the product earlier.

Everything is the Same/Nothing Has Changed
What has really changed in the FRCP?  Changes to 26(b)(1) clarify, but proportionality is not really new.  The changes did send a message to take proportionality and limiting of discovery seriously.  The “reasonably calculated to lead to the discovery of admissible evidence” part was removed.  The judiciary should have more involvement to push proportionality.  The responding party has a better basis to say “no.”  NPEs push for broad discovery to get a settlement.  There is now more premium on being armed and prepared for the meet and confer.  Need to be persuasive–simply saying “it is burdensome” is not enough–you need to explain why (no boilerplate).  Offer an alternative (e.g., fewer custodians).  One panelist said it was too early to say if proportionality has improved, but another said there has been a sea change already (limiting discovery to fewer custodians).  The lack of adoption of TAR is not due to the rules–the normal starting point for negotiation is keywords.  Proportionality may reduce the use of predictive coding because we are looking at fewer custodians/documents.

It’s a Social World After All
I wasn’t able to attend this one.

The day ended with an outdoor reception.