The IG3 West conference was held by Ing3nious at the Paséa Hotel & Spa in Huntington Beach, California. This conference differed from other recent Ing3nious events in several ways. It was two days of presentations instead of one. There were three simultaneous panels instead of two. Between panels there were sometimes three simultaneous vendor technology demos. There was an exhibit hall with over forty vendor tables. Due to the different format, I was only able to attend about a third of the presentations. My notes are below. You can find my full set of photos here.
Stop Chasing Horses, Start Building Fences: How Real-Time Technologies Change the Game of Compliance and Governance
Chris Surdak, the author of Jerk: Twelve Steps to Rule the World, talked about changing technology and the value of information, claiming that information is the new wealth. Facebook, Amazon, Apple, Netflix, and Google together are worth more than France [apparently he means the sum of their market capitalizations is greater than the GDP of France, though that is a rather apples-to-oranges comparison since GDP is an annualized number]. We are exposed to persistent ambient surveillance (Alexa, Siri, Progressive Snapshot, etc.). It is possible to detect whether someone is lying by using video to detect blood flow to their face. Car companies monetized data about passengers’ weight (measured due to air bags). Sentiment analysis has a hard time with sarcasm. You can’t find emails about fraud by searching for “fraud” — discussions about fraudulent activity may be disguised as weirdly specific conversations about lunch. The problem with graph analysis is that a large volume of talk about something doesn’t mean that it’s important. The most important thing may be what’s missing. When RadioShack went bankrupt, its remaining value was in its customer data — remember them asking for your contact info when you bought batteries? A one-word change to FRCP 37(e) should have changed corporate retention policies, but nobody changed. The EU’s right to be forgotten is virtually impossible to implement in reality (how to deal with backup tapes?) and almost nobody does it. Campbell’s has people shipping their DNA to them so they can make diet recommendations to them. With the GDPR, consent nullifies the protections, so it doesn’t really protect your privacy.
AI and the Corporate Law Department of the Future
Gartner says AI is at the peak of inflated expectations and a trough of disillusionment will follow. Expect to be able to buy autonomous vehicles by 2023. The economic downturn of 2008 caused law firms to start using metrics. Legal will take a long time to adopt AI — managing partners still have assistants print stuff out. Embracing AI puts a firm ahead of its competitors. Ethical obligations are also an impediment to adoption of technology, since lawyers are concerned about understanding the result.
Advanced TAR Considerations: A 500 Level Crash Course
Continuous Active Learning (CAL), also called TAR 2.0, can adapt to shifts in the concept of relevance that may occur during the review. There doesn’t seem to be much difference in the efficiency of SVM vs logistic regression when they are applied to the same task. There can be a big efficiency difference between different tasks. TAR 1.0 requires a subject-matter expert for training, but senior attorneys are not always readily available. With TAR 1.0 you may be concerned that you will be required to disclose the training set (including non-responsive documents), but with TAR 2.0 there is case law that supports that being unnecessary [I’ve seen the argument that the production itself is the training set, but that neglects the non-responsive documents that were reviewed (and used for training) but not produced. On the other hand, if you are talking about disclosing just the seed set that was used to start the process, that can be a single document and it has very little impact on the result]. Case law can be found at predictivecoding.com, which is updated at the end of each year. TAR needs text, not image data. Sometimes keywords are good enough. When it comes to government investigations, many agencies (FTC, DOJ) use/accept TAR. It really depends on the individual investigator, though, and you can’t fight their decision (the investigator is the judge). Don’t use TAR for government investigations without disclosing that you are doing so. TAR can have trouble if there are documents having high conceptual similarity where some are relevant and some aren’t. Should you tell opposing counsel that you’re using TAR? Usually, but it depends on the situation. When the situation is symmetrical, both sides tend to be reasonable.
When it is asymmetrical, the side with very little data may try to make things expensive for the other side, so say something like “both sides may use advanced technology to produce documents” and don’t give more detail than that (e.g., how TAR will be trained, who will do the training, etc.) or you may invite problems. Disclosing the use of TAR up front and getting agreement may avoid problems later. Be careful about “untrainable documents” (documents containing too little text): separate them out, and maybe use metadata or file type to help analyze them. Elusion testing can be used to make sure too many relevant documents weren’t missed. One panelist said 384 documents could be sampled from the elusion set, though that may sometimes not be enough. [I have to eat some crow here. I raised my hand and pointed out that the margin of error for the elusion has to be divided by the prevalence to get the margin of error for the recall, which is correct. I went on to say that with a sample of 384 giving ±5% for the elusion you would have ±50% for the recall if prevalence was 10%, making the measurement worthless. The mistake is that while a sample of 384 technically implies a worst case of ±5% for the margin of error for elusion, it’s not realistic for the margin of error to be that bad. A margin of ±5% would occur only if elusion were near 50%, but elusion is typically very small (smaller than the prevalence), so the margin of error for the elusion is significantly less than ±5%. The correct margin of error for the recall from an elusion sample of 384 documents would be ±13% if the prevalence is 10%, and ±40% if the prevalence is 1%. So, if prevalence is around 10% an elusion sample of 384 isn’t completely worthless (though it is much worse than the ±5% we usually aim for), but if prevalence is much lower than that it would be.]
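The arithmetic in the bracketed note can be sketched in a few lines of Python. This is only a rough illustration: it assumes the simple normal approximation to the binomial for the margin of error, and the function names and the 2.5% elusion value are made up for the example. The ±13% and ±40% figures quoted above may rest on a different interval method or a different assumed elusion, so this sketch should not be expected to reproduce them exactly.

```python
import math


def elusion_margin(elusion, sample_size, z=1.96):
    """Approximate 95% margin of error for an elusion estimate.

    Assumption: normal approximation to the binomial, which is crude
    when elusion is very close to zero.
    """
    return z * math.sqrt(elusion * (1.0 - elusion) / sample_size)


def recall_margin(elusion, prevalence, sample_size, z=1.96):
    """Margin of error for recall: the elusion margin divided by prevalence."""
    return elusion_margin(elusion, sample_size, z) / prevalence


if __name__ == "__main__":
    n = 384
    # Worst case: elusion near 50% gives roughly +/-5% for elusion,
    # which becomes roughly +/-50% for recall at 10% prevalence.
    print("worst-case elusion margin:", round(elusion_margin(0.5, n), 3))
    print("worst-case recall margin: ", round(recall_margin(0.5, 0.10, n), 3))
    # More realistic case: elusion well below prevalence
    # (2.5% is a hypothetical value chosen for illustration).
    print("realistic recall margin:  ", round(recall_margin(0.025, 0.10, n), 3))
```

The point of the comparison is that the worst-case figure overstates the uncertainty because elusion is normally far below 50%, but dividing by a small prevalence still inflates whatever margin remains.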
40 Years in 30 Minutes: The Background to Some of the Interesting Issues we Face
Steven Brower talked about the early days of the Internet and the current state of technology. Early on, a user ID was used to tell who you were, not to keep you out. Technology was elitist, and user-friendly was not a goal. Now, so much is locked down for security reasons that things become unusable. Law firms that prohibit access to social media force lawyers onto “secret” computers when a client needs something taken down from YouTube. Emails about laws against certain things can be blocked due to keyword hits for the illegal things being described. We don’t have real AI yet. The next generation beyond predictive coding will be able to identify the 50 key documents for the case. During e-discovery, try searching for obscenities to find things like: “I don’t give a f*** what the contract says.” Autonomous vehicles won’t come as soon as people are predicting. Snow is a problem for them. We may get vehicles that drive autonomously from one parking lot to another, so the route is well known. When there are a bunch of inebriated people in the car, who should it take commands from? GDPR is silly since email bounces from computer to computer around the world. The Starwood breach does not mean you need to get a new passport — your passport number was already out there. To improve your security, don’t try to educate everyone about cybersecurity — you can eliminate half the risk by getting payroll to stop responding to emails asking for W2 data that appear to come from the CEO. Scammers use the W2 data to file tax returns to get the refunds. This is so common the IRS won’t even accept reports on it anymore. You will still get your refund if it happens to you, but it’s a hassle.
Digging Into TAR
I moderated this panel, so I didn’t take notes. We did the TAR vs. Keyword Search Challenge again. The results are available here.
After the Incident: Investigating and Responding to a Data Breach
Plan in advance, and remember that you may not have access to the laptop containing the plan when there is a breach. Get a PR firm that handles crises in advance. You need to be ready for the negative comments on Twitter and Facebook. Have the right SMEs for the incident on the team. Assume that everything is discoverable — attorney-client privilege won’t save you if you ask the attorney for business (rather than legal) advice. Notification laws vary from state to state. An investigation by law enforcement may require not notifying the public for some period of time. You should do an annual review of your cyber insurance since things are changing rapidly. Such policies are industry specific.
Employing Technology/Next-Gen Tools to Reduce eDiscovery Spend
Have a process, but also think about what you are doing and the specifics of the case. Restrict the date range if possible. Reuse the results when you have overlapping cases (e.g., privilege review). Don’t just look at docs/hour when monitoring the review. Look at accuracy and get feedback about what the reviewers are finding. CAL tends to result in doing too much document review (want to stop at 75% recall but end up hitting 89%). Using a tool to do redactions will give false positives, so you need manual QC of the result. When replacing a patient ID with a consistent anonymized identifier, you can’t just transform the ID because that could be inverted, resulting in a HIPAA violation.
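One way to get a consistent identifier that cannot be computed backward from the output is to hand out random tokens and keep the lookup table private. The sketch below is a minimal illustration of that idea, not something the panel presented; the class name `Pseudonymizer` and the `PT-` prefix are hypothetical, and it makes no claim about what HIPAA requires beyond the point made above.

```python
import secrets


class Pseudonymizer:
    """Assigns each patient ID a random token and reuses it on later lookups.

    The token is drawn at random rather than derived from the ID, so there is
    no formula to invert. The mapping table itself must be kept secret.
    """

    def __init__(self):
        self._mapping = {}

    def pseudonym(self, patient_id):
        # Create a token the first time an ID is seen, then reuse it so the
        # same patient always gets the same anonymized identifier.
        if patient_id not in self._mapping:
            self._mapping[patient_id] = "PT-" + secrets.token_hex(8)
        return self._mapping[patient_id]


if __name__ == "__main__":
    p = Pseudonymizer()
    first = p.pseudonym("MRN-000123")
    again = p.pseudonym("MRN-000123")
    other = p.pseudonym("MRN-000124")
    print(first == again)  # same patient, same token
    print(first != other)  # different patients, different tokens
```

By contrast, a plain unsalted hash or an arithmetic transform of the ID can be reversed by anyone who can enumerate the possible IDs, which is the risk the panel was warning about.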
eDiscovery for the Rest of us
What are ediscovery considerations for relatively small data sets? During meet and confer, try to cooperate. Judges hate ediscovery disputes. Let the paralegals hash out the details — attorneys don’t really care about the details as long as it works. Remote collection can avoid travel costs and hourly fees while keeping strangers out of the client’s office. The biggest thing they look for from vendors is cost. Need a certain volume of data for TAR to be practical. Email threading can be used at any size.
Does Compliance Stifle or Spark Innovation?
Startups tend to be full of people fleeing big corporations to get away from compliance requirements. If you do compliance well, that can be an advantage over competitors. Look at it as protecting the longevity of the business (protecting reputation, etc.). At the DoD, compliance stifles innovation, but it creates a barrier against bad guys. They have thousands of attacks per day and are about 8 years behind normal innovation. Gray crimes are an area for innovation; examples include manipulation (influencing elections) and tanking a stock IPO by faking a poisoning. Hospitals and law firms tend to pay, so they are prime targets for ransomware.
Panels That I Couldn’t Attend:
California and EU Privacy Compliance
What it all Comes Down to – Enterprise Cybersecurity Governance
Selecting eDiscovery Platforms and Vendors
Defensible Disposition of Data
Biometrics and the Evolving Legal Landscape
Storytelling in the Age of eDiscovery
Technology Solution Update From Corporate, Law Firm and Service Provider Perspective
The Internet of Things and Everything as a Service – the Convergence of Security, Privacy and Product Liability
Similarities and Differences Between the GDPR and the New California Consumer Privacy Act – Similar Enough?
The Impact of the Internet of Things on eDiscovery
Escalating Cyber Risk From the IT Department to the Boardroom
So you Weren’t Quite Ready for GDPR?
Security vs. Compliance and Why Legal Frameworks Fall Short to Improve Information Security
How to Clean up Files for Governance and GDPR
Deception, Active Defense and Offensive Security…How to Fight Back Without Breaking the Law?
Information Governance – Separating the “Junk” from the “Jewels”
What are Big Law Firms Saying About Their LegalTech Adoption Opportunities and Challenges?
Cyber and Data Security for the GC: How to Stay out of Headlines and Crosshairs