June | 2017 | Clustify Blog – eDiscovery, Document Clustering, Technology-Assisted Review (Predictive Coding), Information Retrieval, and Software Development

The 2017 Northeast Information Governance Retreat was held at the Salamander Resort & Spa in Middleburg, Virginia. After round table discussions, the retreat featured two simultaneous sessions throughout the day. My notes below provide some highlights from the sessions I was able to attend.

Enhancing eDiscovery With Next Generation Litigation Management Software
I couldn’t attend this.

Legal Tech and AI – Inventing The Future
Machines are currently only good at routine tasks. Interactions with machines should allow humans and machines to each do what they do best. Some areas where AI can aid lawyers: determining how long litigation will take, suggesting cases you should reference, telling how often the opposition has won in the past, determining appropriate prices for fixed fee arrangements, recruiting, or determining which industry to focus on. AI promises to help with managing data (e.g., targeted deletion), not just e-discovery. Facial recognition may replace plane tickets someday.

Zen & The Art Of Multi-Language Discovery: Risks, Review & Translation
I couldn’t attend this.

NexLP Demo
The NexLP tool emphasizes feature extraction and use of domain knowledge from external sources to figure out the story behind the data. It can generate alerts based on changes in employee behavior over time. A company should have a policy allowing the scanning of emails to detect bad behavior. It was claimed that using AI on emails is better for privacy than having a human review random emails, since it keeps human eyes away from emails that are not relevant.

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

Are Managed Services Manageable?
I couldn’t attend this.

Cyber And Data Security For The GC: How To Stay Out Of Headlines And Crosshairs
I couldn’t attend this.

The Office Is Out: Preservation And Collection In The Merry Old Land Of Office 365
Enterprise 5 (E5) has advanced analytics from Equivio. E3 and E1 can do legal hold but don’t have advanced analytics. There are options available that are not on the website, and there are different builds, so people are not all using the same thing. Search functionality works on limited file types (e.g., Microsoft products). Email attachments are OK if they are from Microsoft products. It will not OCR PDFs that lack embedded text. What about emails attached to emails? Previously, it only went one layer deep on attachments. The latest versions say they are “relaxing” that, but it is unclear what that means (how deep?). The user controls sync, so are we really searching everything? Make sure you involve IT, privacy, info governance, etc. if considering a transition to 365. Be aware of data that is already on hold if you migrate to 365. Start by migrating a small group of people who are not often subject to litigation. Test each data type after conversion.

How To Make Sense Of Information Governance Rules For Contractors When The Government Itself Can’t?
I couldn’t attend this.

Judges, The Law And Guidance: Does ‘Reasonableness’ Provide Clarity?
This was primarily about the impact of the new Federal Rules of Civil Procedure. Clients are finally giving up on putting everything on hold. Tie document retention to business needs, and you shouldn’t have to worry about sanctions. Document everything (e.g., why you chose specific custodians to hold). Accidentally missing one custodian out of a hundred is now OK. Some judges acknowledge the new rules but then ignore them. Boilerplate objections to discovery requests need to stop; keep notes on why you made each objection.

Beyond The Firewall: Cybersecurity & The Human Factor
I couldn’t attend this.

The Theory of Relativity: Is There A Black Hole In Electronic Discovery?
The good about Relativity: everyone knows it, it has plug-ins, and moving from document to document is fast compared to previous tools. The bad: TAR 1.0 (federal judiciary prefers CAL). An audience member expressed concern that as Relativity gets close to having a monopoly we should expect high prices and a lack of innovation. RelativityOne puts kCura in competition with service providers.

The day ended with a wine social.

Measuring the recall achieved to within +/- 5% to demonstrate that a production is defensible can require reviewing a substantial number of random documents. For a case of modest size, the amount of review required to measure recall can be larger than the amount of review required to actually find the responsive documents with predictive coding. This article describes a new method requiring much less document review to demonstrate that adequate recall has been achieved. This is a brief overview of a more detailed paper I’ll be presenting at the DESI VII Workshop on June 12th (slides available here).

The proportion of a population having some property can be estimated to within +/- 5% by measuring the proportion on a random sample of 400 documents (you’ll also see the number 385 being used, but using 400 will make it easier to follow the examples). To measure recall we need to know what proportion of responsive documents are produced, so we need a sample of 400 random responsive documents. Since we don’t know which documents in the population are responsive, we have to select documents randomly and review them until 400 responsive ones are found. If prevalence is 10% (10% of the population is responsive), that means reviewing roughly 4,000 documents to find 400 that are responsive so that recall can be estimated. If prevalence is 1%, it means reviewing roughly 40,000 random documents to measure recall. This can be quite a burden.
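
For readers who want to check the arithmetic, here is a minimal Python sketch. It assumes the standard normal-approximation sample size formula at 95% confidence (z = 1.96) with the worst-case proportion of 0.5, which is where the 385 figure usually comes from; the function names are made up for illustration.

```python
import math

def sample_size(margin=0.05, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion to within +/- margin.

    Assumes the usual normal approximation: n = z^2 * p * (1 - p) / margin^2.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def expected_review(responsive_needed, prevalence):
    """Rough number of random documents to review to find the needed responsive ones."""
    return responsive_needed / prevalence

print(sample_size())                      # 385, the number often quoted instead of 400
print(round(expected_review(400, 0.10)))  # roughly 4,000 documents at 10% prevalence
print(round(expected_review(400, 0.01)))  # roughly 40,000 documents at 1% prevalence
```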

Once recall is measured, a decision must be made about whether it is high enough. Suppose you decide that if at least 300 of the 400 random responsive documents were produced (75%) the production is acceptable. For any actual level of recall, the probability of accepting the production can be computed (see figure to right). The probability of accepting a production where the actual recall is less than 70% will be very low, and the probability of rejecting a production where the actual recall is greater than 80% will also be low — this comes from the fact that a sample of 400 responsive documents is sufficient to measure recall to within +/- 5%.
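
The acceptance probability for the single-stage test can be sketched with an exact binomial sum. This assumes each of the 400 random responsive documents is independently in the production with probability equal to the actual recall; the function name is hypothetical.

```python
from math import comb

def accept_probability(recall, n=400, threshold=300):
    """Probability that at least `threshold` of `n` sampled responsive
    documents were produced, given the actual recall (binomial model)."""
    return sum(
        comb(n, k) * recall ** k * (1 - recall) ** (n - k)
        for k in range(threshold, n + 1)
    )

for r in (0.70, 0.75, 0.80):
    print(f"actual recall {r:.0%}: P(accept) = {accept_probability(r):.3f}")
```

Under this model the acceptance probability is only a few percent when the actual recall is 70%, and it is close to 100% when the actual recall is 80%.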

The idea behind the new method is to achieve the same probability profile for accepting/rejecting a production using a multi-stage acceptance test. The multi-stage test gives the possibility of stopping the process and declaring the production accepted/rejected long before reviewing 400 random responsive documents. The procedure is shown in the flowchart to the right. A decision may be reached after reviewing enough documents to find just 25 random documents that are responsive. If a decision isn’t made after reviewing 25 responsive documents, review continues until 50 responsive documents are found and another test is applied. At worst, documents will be reviewed until 400 responsive documents are found (the same as the traditional direct recall estimation method).
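
The control flow of such a test can be sketched in a few lines of Python. Important caveat: the stage sizes of 100 and 200 and all of the accept/reject thresholds below are illustrative placeholders, not the values from the flowchart or the paper, so this shows only the shape of the procedure. The only numbers taken from the text are the 25 and 50 document stages and the final 300-of-400 test.

```python
import itertools

# (responsive docs reviewed, reject if produced <= this, accept if produced >= this)
# PLACEHOLDER thresholds for illustration only; use the paper's tables in practice.
STAGES = [
    (25, 13, 24),
    (50, 30, 44),
    (100, 66, 83),
    (200, 140, 159),
    (400, 299, 300),  # final stage always decides: 300 of 400 is the 75% target
]

def multi_stage_test(was_produced, stages=STAGES):
    """Run a multi-stage acceptance test.

    `was_produced` yields True/False for each random responsive document,
    in review order. Returns (decision, responsive_docs_reviewed).
    """
    stream = iter(was_produced)
    produced = reviewed = 0
    for n, reject_max, accept_min in stages:
        while reviewed < n:
            produced += 1 if next(stream) else 0
            reviewed += 1
        if produced >= accept_min:
            return "accept", reviewed
        if produced <= reject_max:
            return "reject", reviewed
    raise ValueError("the final stage must always reach a decision")

# A production that contains every sampled document is accepted at the first stage.
print(multi_stage_test(itertools.repeat(True)))  # ('accept', 25)
```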

The figure to the right shows six examples of the multi-stage acceptance test being applied when the actual recall is 85%. Since 85% is well above the 80% upper bound of the 75% +/- 5% range, we expect this production to virtually always be accepted. The figure shows that acceptance can occur long before reviewing a full 400 random responsive documents. The number of random responsive documents reviewed is shown on the vertical axis. Toward the bottom of the graph the sample is very small and the percentage of the sample that has been produced may deviate greatly from the right answer of 85%. As you go up the sample gets larger and the proportion of the sample that is produced is expected to get closer to 85%. When a green decision boundary is touched, causing the production to be accepted as having sufficiently high recall, the color of the remainder of the path is changed to yellow — the yellow part represents the document review that is avoided by using the multi-stage acceptance method (since the traditional direct recall measurement would involve going all the way to 400 responsive documents). As you can see, when the actual recall is 85% the number of random responsive documents that must be reviewed is often 50 or 100, not 400.

The figure to the right shows the average number of documents that must be reviewed using the multi-stage acceptance procedure from the earlier flowchart. The amount of review required can be much less than 400 random responsive documents. In fact, the further above/below the 75% target (called the “splitting recall” in the paper) the actual recall is, the less document review is required (on average) to come to a conclusion about whether the production’s recall is high enough. This creates an incentive for the producing party to aim for recall that is well above the minimum acceptable level since it will be rewarded with a reduced amount of document review to confirm the result is adequate.

It is important to note that the multi-stage procedure provides an accept/reject result, not a recall estimate. If you follow the procedure until an accept/reject boundary is hit and then use the proportion of the sample that was produced as a recall estimate, that estimate will be biased (the use of “unbiased” in the paper title refers to the sampling being done on the full population, not on a subset [such as the discard set] that would cause a bias due to inconsistency in review of different subsets).
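
To see why stopping at a boundary biases the naive estimate, consider a deliberately tiny toy rule (not the procedure from the paper): review 2 responsive documents, stop if both or neither were produced, otherwise continue to 4. The sketch below enumerates every outcome to compute the exact expected value of the proportion produced at the stopping point.

```python
from itertools import product

def expected_naive_estimate(recall):
    """Exact expected value of produced/reviewed under a toy two-stage rule.

    Toy rule (hypothetical, for illustration): stop after 2 documents if
    0 or 2 were produced; otherwise review 4 documents in total.
    """
    total = 0.0
    for seq in product([1, 0], repeat=4):
        prob = 1.0
        for x in seq:
            prob *= recall if x else 1 - recall
        first_two = seq[0] + seq[1]
        if first_two in (0, 2):
            estimate = first_two / 2   # stopped early at the boundary
        else:
            estimate = sum(seq) / 4    # continued to the full sample
        total += prob * estimate
    return total

print(expected_naive_estimate(0.85))  # about 0.895, above the actual recall of 0.85
```

Even in this toy case the average of the naive estimate lands above the actual recall of 85%, which is the kind of bias the accept/reject framing avoids claiming anything about.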

You may want to use a splitting recall other than 75% for the accept/reject decision — the full paper provides tables of values necessary for doing that.


Thoughts on e-discovery, computers, and software development.

Monthly Archives: June 2017

Highlights from the Northeast IG Retreat 2017

Substantial Reduction in Review Effort Required to Demonstrate Adequate Recall