Tag Archives: information governance

Highlights from the NorCal eDiscovery & IG Retreat 2018

The 2018 NorCal eDiscovery & IG Retreat was held at the Carmel Valley Ranch, location of the first Ing3nious retreat in 2011 (though the company wasn’t called Ing3nious at the time).  It was a full day of talks with a parallel set of talks on Cybersecurity, Privacy, and Data Protection in the adjacent room.  Attendees could attend talks from either track.  Below are my notes (certainly not exhaustive) from the eDiscovery and IG sessions.  My full set of photos is available here.

Digging Into TAR
I moderated this panel, so I didn’t take notes.  We challenged the audience to create a keyword search that would work better than TAR.  Results are posted here.

Information Governance In The Age Of Encryption And Ephemeral Communications
Facebook Messenger has an ephemeral mode, though it is currently only available to Facebook executives.  You can be forced to decrypt data (despite the 5th Amendment) if it can be proven that you have the password.  Ephemeral communication is a replacement for in-person communication, but it can look bad (like you have something to hide).  53% of email is read on mobile devices, but personal devices often aren’t collected.  Slack is useful for passing institutional knowledge along to new employees, but general counsel wants things deleted after 30 days.  Some ephemeral communication tools have archiving options.  You may want to record some conversations in email–you may need them as evidence in the future.  Are there unencrypted copies of encrypted data in some locations?

Blowing The Whistle
eDiscovery can be used as a weapon to drive up costs for an adversary.  The plaintiff should be skeptical about claims of burden–has appropriate culling been performed? Do a meet and confer as early as possible.  Examine data for a few custodians and see if more are needed. A data dump is when a lot of non-relevant docs are produced (e.g., due to a broad search or a search that matches an email signature).  Do sampling to test search terms.  Be explicit about what production formatting you want (e.g., searchable PDF, color, metadata).
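
The sampling advice above can be made concrete. This is a minimal sketch, assuming a reviewer has coded a random sample of the documents hit by a search term as relevant or not; the function names and the normal-approximation interval are illustrative choices, not anything the panel prescribed.

```python
import math
import random

def sample_hits(hit_ids, sample_size, seed=0):
    """Draw a simple random sample (without replacement) of documents matching a term."""
    rng = random.Random(seed)
    hit_ids = list(hit_ids)
    return rng.sample(hit_ids, min(sample_size, len(hit_ids)))

def precision_estimate(relevant_flags, z=1.96):
    """Estimate the fraction of hits that are relevant, with an approximate 95% interval.

    relevant_flags holds one True/False reviewer decision per sampled document.
    Uses the normal approximation, which is rough for small samples or extreme rates.
    """
    n = len(relevant_flags)
    p = sum(relevant_flags) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)
```

If 30 of 100 sampled hits turn out to be relevant, the estimate is 30% with an interval of roughly 21% to 39%, which suggests the term is pulling in a lot of non-relevant documents (e.g., by matching an email signature).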

Emerging Technology And The Impact On eDiscovery
There may be a lack of policy for new data sources.  Text messages and social media are becoming relevant for more cases.  Your Facebook info can be accessed through your friends.  Fitbit may show whether the person could have committed the murder. IP addresses can reveal whether email was sent from home or work. The change to the Twitter character limit may break some collection tools–QC early on to detect such problems.  Vendors should have multiple tools.  Communicate about what tech is involved and what you need to collect.

Technology Solution Update From Corporate, Law Firm And Service Provider Perspective
Cloud computing (infrastructure, storage, productivity, and web apps) will cause conflict between EU privacy law and US discovery.  AWS provides lots of security options, but it can be difficult to get right (must be configured correctly).  Startups aim to build fast and don’t think enough about how to get the data out.  Are law firm clients looking at cloud agreements and how to export data?  Free services (Facebook, Gmail, etc.) spy on users, which makes them inappropriate for corporate use where privacy is needed.  Slack output is one long conversation.  What about tools that provide a visualization?  You may need the data, not just a screenshot.  Understand the limit of repositories–Office 365 limits to 10GB of PST at a time.  What about versioning storage?  It is becoming more common as storage prices decline.  Do you need to collect all versions of a document?  “Computer ate my homework” excuses don’t fare well in court (e.g., production of privileged docs due to a bad mouse click, or missing docs matching a keyword search because they weren’t OCRed).  GDPR requires knowing where the users are (not where the data is stored).  Employees don’t want their private phones collected, so sandbox work stuff.

Employing Intelligence – Both Human And Artificial (AI) – To Reduce Overall eDiscovery Costs
You need to talk to custodians–the org chart doesn’t really tell you what you need to know.  Search can show who communicates with whom about a topic.  To discover a custodian whose involvement is not known to the attorney, look at the data and interview the ground troops.  Look for a period when there is a lack of communication.  Use sentiment analysis (including emojis).  Watch for strange bytes in the review tool–they may be emojis that can only be viewed in the original app.  Automate legal holds as much as possible.  Escalate to a manager if the employee doesn’t respond to the hold in a timely manner.  Filter on metadata to reduce the amount that goes into the load file.  Sometimes things go wrong with the software (trained on biased data, not finding relevant spreadsheets, etc.).  QC to ensure the human element doesn’t fail.  Use phonetic search on audio files instead of transcribing before search.  Analyze data as it comes in–you may spot months of missing email.  Do proof of concept when selecting tools.
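
Spotting months of missing email as data arrives is easy to automate. A minimal sketch, assuming the sent dates have already been extracted from a custodian's messages (the function name and input shape are hypothetical):

```python
from datetime import date

def missing_months(sent_dates):
    """Return (year, month) pairs with no email between the earliest and latest date."""
    seen = {(d.year, d.month) for d in sent_dates}
    if not seen:
        return []
    year, month = min(seen)
    last = max(seen)
    gaps = []
    while (year, month) <= last:
        if (year, month) not in seen:
            gaps.append((year, month))
        # Step to the next calendar month, rolling over at year end.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps
```

A custodian with email in November 2017, December 2017, and March 2018 would be flagged for January and February 2018, which is worth a follow-up question before review begins.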

Practical Discussion: eDiscovery Process With Law Firms, In-House And Vendor
Stick with a single vendor so you know it is done the same way every time.  Figure out what your data sources are.  Get social media data into the review platform in a usable form (e.g., Skype).  Finding the existence of cloud data stores requires effort.  How long is the cloud data being held (Twitter only holds the last 100 direct messages)?  The company needs to provide the needed apps so employees aren’t tempted to go outside to get what they need.

Highlights from the SoCal eDiscovery & IG Retreat 2017

The 2017 SoCal eDiscovery & IG Retreat was held at the Pelican Hill Resort in Newport Coast, California.   The format was somewhat different from other recent Ing3nious retreats, having a single session at a time instead of two sessions in parallel.  My notes below provide some highlights.  I’ve posted my full set of photos from the conference and nearby Crystal Cove here.

How Well Can Your Organization Protect Against Encrypted Traffic Threats?
Companies should be concerned about encrypted traffic, because they don’t know what is leaving their network.  Get encryption keys for cloud services the company uses so you can check outgoing content and block all other encrypted traffic — if something legitimate breaks, employees will let you know.  It is important to distinguish personal Dropbox use from corporate use.  Make sure you have a policy that says all corporate devices are subject to inspection and monitoring.  The CSO should report to the CEO rather than the CIO or too much ends up being spent on technology with too little spent on risk reduction.  Security tech must be kept up to date.  Some security vendors are using artificial intelligence.  The board of directors needs to be educated about their fiduciary duty to provide oversight for security, as established in a 1996 case in Delaware (see this article).  In what country is the backup of your cloud data stored?  That could be important during litigation.  The amount of unstructured data companies have can be surprising, and represents additional risk.  When the CSO reports to the board, he/she should speak in terms of risk (don’t use tech speak).  Build in security from the beginning when starting new projects.  GDPR violations could bring penalties of up to 4% of revenue. Guidance papers on GDPR are all over 40 pages long.  “Internet of Things” devices (e.g., refrigerators) are typically not secure.  Use DNS to detect attempts by IoT devices to call out.  IoT is collecting data about you to sell.  The book Future Crimes by Marc Goodman was recommended.

Using Technology To Reduce eDiscovery Spend
Artificial intelligence (AI) can be used before collection to reduce data volume.  Have a conversation about what’s really needed and use ECA to cull by date, topic, etc.  Process data from key players first.  It is important for project managers to know the data.  Parse out domain names, see who is talking to whom, see which folders people really have access to, and get rid of bad file types.  Image the machine of the person who will be leaving, then tell them you will be imaging the machine in the near future and see what they delete.  Use sentiment analysis and see if sentiment changes over time.  Use clustering to identify stuff that can be culled (e.g., stuff about the NFL).  Use clustering, rather than random sampling, to see what the data looks like.  Redaction of things like social security numbers can be automated.
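
Parsing out domain names to see who is talking to whom needs very little code. A minimal sketch using only the Python standard library, assuming the From and To header values have already been extracted as strings (the function name and input shape are hypothetical):

```python
from collections import Counter
from email.utils import getaddresses

def domain_pairs(messages):
    """Count (sender domain, recipient domain) pairs.

    messages is an iterable of (from_header, to_header) strings.
    """
    counts = Counter()
    for from_header, to_header in messages:
        senders = getaddresses([from_header])
        recipients = getaddresses([to_header])
        for _, sender in senders:
            for _, recipient in recipients:
                # Skip anything that did not parse into a usable address.
                if "@" in sender and "@" in recipient:
                    s = sender.rsplit("@", 1)[1].lower()
                    r = recipient.rsplit("@", 1)[1].lower()
                    counts[(s, r)] += 1
    return counts
```

Sorting the result with `counts.most_common()` gives a quick view of which outside domains dominate the traffic, which can inform culling before anything is loaded for review.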

It’s All Greek To Me: Multi-Language Document Review from Shakespeare To FCPA
Examples were given of Craigslist ads seeking temporary people for foreign language document review, showing that companies performing such reviews may not have capable people on staff.  Law firms are relying on external providers to manage reviews in languages in which they are not fluent. English in Singapore is not the same as English in the U.S. (different expressions) — cultural context is important.  There are 6,900 languages around the world.  Law firms must do diligence to ensure a language expert is trustworthy.  Law firms don’t like being beta testers for technologies like TAR and machine translation.  Communications in Asia are often not in text file format (e.g., chat applications) and can involve hundreds of thousands of non-standard emojis (how to even render them?).  Facebook got a Palestinian man arrested by mistranslating his “good morning” to “attack them” (see this article).  One speaker suggested Googling “fraudulent foreign language reviewers” (the top match is here).  There was skepticism about the ALTA language proficiency test.

Artificial Intelligence – Facial Expression Analytics As A Competitive Advantage In Risk Mitigation
Monitoring emotional response can provide an advantage at trial.  Universal emotions: joy, sadness, surprise, fear, anger, disgust, and contempt.  The lawyer should avoid causing sadness since that is detrimental to being liked — let the witness do it.  Emotional response can depend on demographics.  For example, the contempt response depends on age, and women tend to show a larger fear response.  Software can now detect emotion from facial photos very quickly.  One panelist warned against using the iPhone X’s authentication via face recognition because Apple has software for detecting emotion and could monitor your mood.  80% of what a jury picks up on is non-verbal.  Analyze video of depositions to look for ways to improve.  Senior people at companies refuse to believe they don’t come across well, but they often show signs of contempt at questions they deem to be petty.  There is no facial expression for deception — look for a shift in expression.  Realize that software may not be making decisions in the same way as a human would.  For example, a neural network that did a good job of distinguishing wolves from dogs was actually making the decision based on the presence or absence of snow in the background.

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

Bridging The Gap Between Inside And Outside Counsel: Next Generation Strategies For Collaborating On Complex Litigation Matters
Communicate about what you actually need or they may collect everything regardless of date or custodian, resulting in high costs for hosting.  Insourcing is a trend — the company keeps the data in house (reduce cost and risk) and provides outside counsel with access.  This means imposing technology on the outside counsel.  One benefit of insourcing is that in house counsel learns about the data, which may help with future cases.  Another trend is disaggregation, where legal tasks are split up among different law firms instead of using a single firm for everything.  It is important to ensure that technologies being used are understood by all parties from the start to avoid big problems later.  Paralegals can be good at keeping communication flowing between the outside attorney and the client.  Tech companies that want people to adopt their products need to help outside counsel explain the benefits to clients.

Cyber And Data Security For The GC: How To Stay Out Of Headlines And Crosshairs
I couldn’t attend this panel because I had to catch my flight.

 

Highlights from the NorCal IG Retreat 2017

The 2017 NorCal Information Governance Retreat was held by Ing3nious at the Quail Lodge & Golf Club in Carmel Valley, California.  After round table discussions, the retreat featured two simultaneous sessions throughout the day. My notes below provide some highlights from the sessions I was able to attend.  I’ve posted additional photos here.

The intro to the round table discussions included some comments on the evolution of the Internet, the importance of searching for obscenities to find critical documents or to identify data that has been scrubbed (it is implausible that there are no emails containing obscenities for a failing project), the difficulty of searching for “IT” (meaning information technology rather than the pronoun), and the inability of many tools to search for emojis.
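
The “IT” problem arises when a search tool ignores case. Where case-sensitive matching is available, a whole-word pattern separates the acronym from the pronoun. A minimal sketch with Python's `re` module (the function name is hypothetical, and text written in all caps will still produce false hits):

```python
import re

# Case-sensitive, whole-word match: finds "IT" but not "it", "It", or "ITEM".
IT_PATTERN = re.compile(r"\bIT\b")

def mentions_it_department(text):
    """Return True if the text contains the upper-case acronym IT as a whole word."""
    return bool(IT_PATTERN.search(text))
```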

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

How Well Can Your Organization Protect Against Encrypted Traffic Threats?
I couldn’t attend this

IG Analytics And Infonomics: The Future Is Now
I couldn’t attend this

Breaches Happen. Going On The Cyber Offense With Deception
Breach stories that were mentioned included Equifax, Target, an employee that built their own (insecure) tunnel to get data out to their home, and an employee that carried data out on a microSD card.  In the RSA / Lockheed Martin breach, a Lockheed contractor was fooled by a phishing email, illustrating how hard it is to keep attackers out.  Email is a very common source of breaches.  A big mistake is not knowing that you’ve been breached.  People put honeypots outside the firewall to detect attacks. It’s better to use deception technology, which puts decoys inside the firewall.

Social Media And Website Information Governance
There has been some regulation of social media, especially for certain industries.  The SEC in 2012 required financial institutions to archive it.  The FTC has been enforcing paid endorsement disclosure guidelines (e.g., Kim Kardashian’s endorsement of a morning sickness drug).  Collecting evidence from social media is tricky.  A screenshot could be photoshopped, so how to prove it is legitimate?  You should collect a screenshot, source code, metadata, and a digital signature with time stamp.  Corporate policy on social media use will depend on the kind of company and the industry it is in.  There should also be a policy on monitoring employees’ social media use.  Companies using an internal social media system are asking for problems.  How will they police/discipline improper usage?  If an employee posts “Why haven’t I seen John lately?” and another replies that John has cancer, you have a problem.  Does a company social media system really improve productivity?  Can you find out who posted something anonymously on public social media?  If they posted from Starbucks or a library, probably not (finding the IP address won’t reveal the person’s identity).  This strategy worked for a bad review of a doctor that was thought to be from another doctor: 1) file in Federal court and get a court order to get the user’s IP address from the social media website, 2) go back to the judge and get a court order to get the ISP to give the identity of the person using that IP address at that time, 3) there is a motion to quash, which confirms that the right person was found (otherwise wouldn’t bother to fight it).

Bridging The Gap Between Inside And Outside Counsel: Next Generation Strategies For Collaborating On Complex Litigation Matters
I couldn’t attend this

Preventing Inadvertent Disclosure In A Multi-Language World
Start by identifying teams and process.  Be aware of cultural differences.  Be aware of technological issues — there are 2 or 3 alternatives to MS Word that you might encounter for documents in Korean.  Be aware of laws against removing certain documents from the country.  There was disagreement among panel members about whether review quality of foreign documents was better in the U.S. due to reviewers better understanding U.S. law.  Viewing a document in the U.S. that is stored on a server in the E.U. is not a valid work-around for restrictions on exporting the documents.  Review in the U.S. is much cheaper than reviewing overseas (about 1/5 to 1/10 of the cost).  Violation of GDPR puts 4% of revenue at risk, but a U.S. judge may not care.  Take only what you need out of the country.  Many tools work best when they are analyzing documents in a single language, so use language identification and separate documents by language before analysis.  TAR may not work as well for non-English documents, but it does work.

What’s Your Trust Point?
I couldn’t attend this

Legal Tech And AI – Inventing The Future
Humans are better than computers at handling low-probability outlier events, because there is a lack of training data to teach machines to handle such things.  It is important for the technology to be easy for the user to interact with.  Legal clients are very cost averse, so a free trial of new tech is attractive.

The Cloud, New Technologies And Other Developments In Trade Secret Theft
I couldn’t attend this

Are You Prepared For The Impact Of Changing EU Data Privacy On U.S. Litigation?
I couldn’t attend this

IG Policy Pain Points In E-Discovery
Deletion of data that is not on hold 60 days after an employee leaves the company may not get everything since other custodians may have copies.  You may find that employees have archived their emails on a local hard drive.  Be clear about data ownership — wiping the phone of an employee that left the company may hit their personal data.  The general counsel is often not involved in decisions like BYOD (treated as an IT decision), but they should be.  Realize that having more data about employee behavior (e.g., GPS tracking) makes the company more responsible.  You rarely need the employee’s phone since there is little data cached there (data is on mail servers, etc.).  You should do info governance compliance testing to ensure that employees are following the procedures.  Policies must be realistic — there won’t be perfect separation of work and personal activity.  Flouted rules may be worse than no rules.  Keep personal data separate (personal folder, personal email address, use phone for accessing Facebook).  When doing an annual cleanup, what about the data from the employee who left the company?  A study showed that 85% of stored data is rot.  Have a checklist that you follow when an employee leaves — don’t wipe the computer without copying stuff you may need.

Highlights from the Northeast IG Retreat 2017

The 2017 Northeast Information Governance Retreat was held at the Salamander Resort & Spa in Middleburg, Virginia.  After round table discussions, the retreat featured two simultaneous sessions throughout the day. My notes below provide some highlights from the sessions I was able to attend.

Enhancing eDiscovery With Next Generation Litigation Management Software
I couldn’t attend this

Legal Tech and AI – Inventing The Future
Machines are currently only good at routine tasks.  Interactions with machines should allow humans and machines to do what they do best.  Some areas where AI can aid lawyers: determining how long litigation will take, suggesting cases you should reference, telling how often the opposition has won in the past, determining appropriate prices for fixed fee arrangements, recruiting, or determining which industry to focus on.  AI promises to help with managing data (e.g., targeted deletion), not just e-discovery.  Facial recognition may replace plane tickets someday.

Zen & The Art Of Multi-Language Discovery: Risks, Review & Translation
I couldn’t attend this

NexLP Demo
The NexLP tool emphasizes feature extraction and use of domain knowledge from external sources to figure out the story behind the data.  It can generate alerts based on changes in employee behavior over time.  Company should have a policy allowing the scanning of emails to detect bad behavior.  It was claimed that using AI on emails is better for privacy than having a human review random emails since it keeps human eyes away from emails that are not relevant.

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

Are Managed Services Manageable?
I couldn’t attend this

Cyber And Data Security For The GC: How To Stay Out Of Headlines And Crosshairs
I couldn’t attend this

The Office Is Out: Preservation And Collection In The Merry Old Land Of Office 365
Enterprise 5 (E5) has advanced analytics from Equivio.  E3 and E1 can do legal hold but don’t have advanced analytics.  There are options available that are not on the website, and there are different builds — people are not all using the same thing.  Search functionality works on limited file types (e.g., Microsoft products).  Email attachments are OK if they are from Microsoft products.  It will not OCR PDFs that lack embedded text.  What about emails attached to emails?  Previously, it only went one layer deep on attachments.  Latest versions say they are “relaxing” that, but it is unclear what that means (how deep?).  User controls sync — are we really searching everything?  Make sure you involve IT, privacy, info governance, etc. if considering transition to 365.  Be aware of data that is already on hold if you migrate to 365.  Start by migrating a small group of people that are not often subject to litigation.  Test each data type after conversion.

How To Make Sense Of Information Governance Rules For Contractors When The Government Itself Can’t?
I couldn’t attend this

Judges, The Law And Guidance: Does ‘Reasonableness’ Provide Clarity?
This was primarily about the impact of the new Federal rules of civil procedure.  Clients are finally giving up on putting everything on hold.  Tie document retention to business needs — shouldn’t have to worry about sanctions.  Document everything (e.g., why you chose specific custodians to hold).  Accidentally missing one custodian out of a hundred is now OK.  Some judges acknowledge the new rules but then ignore them.  Boilerplate objections to discovery requests need to stop — keep notes on why you made each objection.

Beyond The Firewall: Cybersecurity & The Human Factor
I couldn’t attend this

The Theory of Relativity: Is There A Black Hole In Electronic Discovery?
The good about Relativity: everyone knows it, it has plug-ins, and moving from document to document is fast compared to previous tools.  The bad: TAR 1.0 (federal judiciary prefers CAL).  An audience member expressed concern that as Relativity gets close to having a monopoly we should expect high prices and a lack of innovation.  Relativity One puts kCura in competition with service providers.

The day ended with a wine social.

Highlights from the South Central IG Retreat 2017

The 2017 South Central Information Governance Retreat was the first retreat in the Ing3nious series held in Texas at the La Cantera Resort & Spa.  The retreat featured two simultaneous sessions throughout the day.  My notes below provide some highlights from the sessions I was able to attend.

The day started with roundtable discussions that were kicked off by a speaker who talked about the early days of the Internet.  He made the point that new lawyers may know less about how computers actually work even though they were born in an era when they are more pervasive.  He mentioned that one of the first keyword searches he performs when he receives a production is for “f*ck.”  If a company was having problems with a product and there isn’t a single email using that word, something was surely withheld from the production.  He made the point that expert systems that are intended to replace lawyers must be based on how the experts (lawyers) actually think.  How do you identify the 50 documents that will actually be used in trial?

Borrowing Agile Development Concepts To Jump-Start Your Information Governance Program
I couldn’t attend this

Your Duty To Preserve: Avoiding Traps In Troubled Times
When storing data in the cloud, what is actually retained?  How can you get the data out?  Google Vault only indexes newly added emails, not old ones.  The company may not have the right to access employee data in the cloud.  One panelist commented that collection is preferred to preservation in place.

Enhancing eDiscovery With Next Generation Litigation Management Software
I couldn’t attend this one.

Leveraging The Cloud & Technology To Accelerate Your eDiscovery Process
Cloud computing seems to have reached an inflection point.  A company cannot put the resources into security and data protection that Amazon can.  The ability to scale up/down is good for litigation that comes and goes.  Employees can jump into cloud services without the preparation that was required for doing things on site.  Getting data out can be hard.  Office 365 download speed can be a problem (2-3 GB/hr) — reduce data as much as possible.
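
The quoted download rate turns into calendar time quickly. A minimal sketch of the arithmetic, where the 500 GB collection size in the note below is a hypothetical figure rather than one from the panel:

```python
def download_hours(size_gb, rate_gb_per_hour):
    """Hours needed to pull size_gb at a steady rate."""
    return size_gb / rate_gb_per_hour

def download_window(size_gb, slow=2.0, fast=3.0):
    """Return (best case, worst case) hours for the 2-3 GB/hr range quoted above."""
    return download_hours(size_gb, fast), download_hours(size_gb, slow)
```

For a 500 GB collection, `download_window(500)` gives roughly 167 to 250 hours, or about 7 to 10 days of continuous transfer, which is why reducing the data before export matters.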

Strategies For Effectively Managing Your eDiscovery Spend
I couldn’t attend this one.

TAR: What Have We Learned?
I moderated this panel, so I didn’t take notes.

Achieving GDPR Compliance For Unstructured Content
I couldn’t attend this one.

Zen & The Art Of Multi-Language Discovery: Risks, Review & Translation
The translation company should be brought in when the team is formed (it often isn’t done until later).  Help may be needed from translator / localization expert to come up with search terms.  For example, there are 20 ways to say “CEO” in Korean.  Translation must be done by an expert to be certified.  When using TAR, do review in the native language and translate the result before presenting to the legal team.  Translation is much slower than review.  Machine translation has improved over the last 2 years, but it’s not good enough to rely on for anything important.  A translator leaked Toyota’s data to the press — keep the risk in mind and make sure you are informed about the environment where the work is being done (screenshots should be prohibited).

Beyond The Firewall: Cybersecurity & The Human Factor
I couldn’t attend this one.

Ethical Obligations Relating To Metadata
Nineteen states have enacted ethical rules on metadata.  Sometimes, metadata is enough to tell the whole story.  John McAfee was found and arrested because of GPS coordinates embedded in a photo of him.  Metadata showed that a terminated whistleblower’s employee review was written 3 months after termination.  Forensic collection is important to avoid spoiling the metadata.  Ethical obligations of attorneys are broader than attorney-client privilege.  Should attorneys be encrypting email?  Make the client aware of metadata and how it can be viewed.  The attorney must understand metadata and scrub it as necessary (e.g., change tracking in Word).  In e-discovery metadata is treated like other ESI.  Think about metadata when creating a protective order.  What are the ethical restrictions on viewing and mining metadata received through discovery?  Whether you need to disclose receipt of confidential or privileged metadata depends on the jurisdiction.

Legal Risks Associated With Failing To Have A Cyber Incident Response Plan
I couldn’t attend this one.

“Defensible Deletion” Is The Wrong Frame
Defensible deletion started with an IBM survey that found that on average 69% of corporate data has no value, 6% is subject to litigation hold, and 25% is useful.  IBM started offering to remove 45% of data without doing any harm to a company (otherwise, you don’t have to pay).  Purging requires effort, so make deletion the default.  Statistical sampling can be used to confirm that retention rules won’t cause harm.  After a company said that requested data wasn’t available because it had been deleted in accordance with the retention policy, an employee who was being deposed said he had copied everything to 35 CDs — it can be hard to ensure that everything is gone even if you have the right policy.
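
One way statistical sampling can support a retention rule is to review a random sample of the documents slated for deletion. If none of them is needed, the sample size puts an upper bound on how much needed material the rule could be discarding. A minimal sketch of that calculation, assuming simple random sampling from a large population (the function names are hypothetical, and this is one common approach rather than the method described by the speaker):

```python
import math

def upper_bound_zero_hits(sample_size, confidence=0.95):
    """Upper bound on the true rate of needed documents when a random sample finds none."""
    return 1 - (1 - confidence) ** (1 / sample_size)

def sample_size_for_bound(max_rate, confidence=0.95):
    """Smallest all-clear sample that supports a claimed rate below max_rate."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_rate))
```

Under these assumptions, a clean sample of 299 documents supports a claim, at 95% confidence, that fewer than 1% of the documents covered by the rule are ones the company would need. A single needed document in the sample means the rule should be revisited rather than the sample enlarged.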

 

Highlights from the Northeast eDiscovery & IG Retreat 2016

The 2016 Northeast eDiscovery & IG Retreat was held at the Ocean Edge Resort & Golf Club.  It was the third annual Ing3nious retreat held in Cape Cod.  The retreat featured two simultaneous sessions throughout the day in a beautiful location.  My notes below provide some highlights from the sessions I was able to attend.  You can find additional photos here.

Peer-to-Peer Roundtables
The retreat started with peer-to-peer round tables where each table was tasked with answering the question: Why does e-discovery suck (gripes, pet peeves, issues, etc.) and how can it be improved?  Responses included:

  • How to drive innovation?  New technologies need to be intuitive and simple to get client adoption.
  • Why are e-discovery tools only for e-discovery?  Should be using predictive coding for records management.
  • Need alignment between legal and IT.  Need ongoing collaboration.
  • Handling costs.  Cost models and comparing service providers are complicated.
  • Info governance plans for defensible destruction.
  • Failure to plan and strategize e-discovery.
  • Communication and strategy.  It is important to get the right people together.
  • Why not more cooperation at meet-and-confer?  Attorneys that are not comfortable with technology are reluctant to talk about it.  Asymmetric knowledge about e-discovery causes problems–people that don’t know what they are doing ask for crazy things.

Catching Up on the Implementation of the Amended Federal Rules
I couldn’t attend this one.

Predictive Coding and Other Document Review Technologies–Where Are We Now?
It is important to validate the process as you go along, for any technology.  It is important to understand the client’s documents.  Pandora is more like TAR 2.0 than TAR 1.0, because it starts giving recommendations based on your feedback right away.  The 2012 Rand Study found this e-discovery cost breakdown: 73% document review, 8% collection, and 19% processing.  A question from the audience about pre-culling with keyword search before applying predictive coding spurred some debate.  Although it wasn’t mentioned during the panel, I’ll point out William Webber’s analysis of the Biomet case, which shows pre-culling discarded roughly 40% of the relevant documents before predictive coding was applied.  There are many different ways of charging for predictive coding: amount of data, number of users, hose (total data flowing through) or bucket (max amount of data allowed at one time).  Another barrier to use of predictive coding is lack of senior attorney time (e.g., to review documents for training).  Factors that will aid in overcoming barriers: improving technologies, Sherpas to guide lawyers through the process, court rulings, influence from general counsel.  Need to admit that predictive coding doesn’t work for everything, e.g., calendar entries.  New technologies include anonymization tools and technology to reduce the size of collections.  Existing technologies that are useful: entity extraction, email threading, facial recognition, and audio to text.  Predictive coding is used in maybe less than 1% of cases, but email threading is used in 99%.

It’s All Greek To Me: Multi-Language Discovery Best Practices
Native speakers are important.  An understanding of relevant industry terminology is important, too.  The ALTA fluency test is poor–the test is written in English and then translated to other languages, so it’s not great for testing ability to comprehend text that originated in another language.  Hot documents may be translated for presentation.  This is done with a secure platform that prohibits the translator from downloading the documents.  Privacy laws make it best to review in-country if possible.  There are only 5 really good legal translation companies–check with large firms to see who they use.  Throughput can be an issue.  Most can do 20,000 words in 3 days.  What if you need to do 200,000 in 3 days?  Companies do share translators, but there’s no reason for good translators to work for low-tier companies–good translators are in high demand.  QC foreign review to identify bad reviewers (need proficient managers).  May need to use machine translation (MT) if there are millions of documents.  QC the MT result and make sure it is actually useful–in 85% of cases it is not good enough.  For CJK (Chinese, Japanese, Korean), MT is terrible.  The translation industry is $40 billion.  Google invested a lot in MT but it didn’t help much.  One technology that is useful is translation memory, where repeated chunks of text are translated just once.  People performing review in Japanese must understand the subtlety of the American legal system.
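
The translation memory idea can be sketched in a few lines.  This is only an exact-match illustration under my own assumptions: the function name, the whitespace normalization, and the stand-in `fake_translate` are hypothetical, and real translation memory tools typically also handle near matches.

```python
from typing import Callable, Dict, List


def translate_with_memory(segments: List[str],
                          translate: Callable[[str], str],
                          memory: Dict[str, str]) -> List[str]:
    """Translate each segment, reusing earlier translations of identical text."""
    results = []
    for segment in segments:
        key = " ".join(segment.split())  # collapse whitespace differences
        if key not in memory:
            # Only text not seen before is sent to the (costly) translator.
            memory[key] = translate(key)
        results.append(memory[key])
    return results


# Stand-in for a human or machine translator; it records each request.
requests = []

def fake_translate(text: str) -> str:
    requests.append(text)
    return f"[EN] {text}"


segments = ["Vertraulich", "Siehe Anhang.", "Vertraulich", "Siehe  Anhang."]
memory: Dict[str, str] = {}
translated = translate_with_memory(segments, fake_translate, memory)
print(len(segments), "segments,", len(requests), "translation requests")
```

Boilerplate such as email footers and confidentiality notices repeats heavily across a collection, which is where this kind of reuse saves translator time.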

Top Trends in Discovery for 2016
I couldn’t attend this one.

Measure Twice, Discover Once
Why measure in e-discovery?  So you can explain what happened and why, for defensibility.  Also important for cost management.  The board of directors may want reports.  When asked for more custodians you can show the cost and expected number of relevant documents that will be added by analyzing the number of keyword search hits.  Everything gets an ID number for tracking and analysis (USB drives, batches of documents, etc.).  Types of metrics ordered from most helpful to most harmful: useful, no metric, not useful, and misleading.  A simple metric used often in document review is documents per hour per reviewer.  What about document complexity, content complexity, number and type of issue codes, review complexity, risk tolerance instructions, number of “defect opportunities,” and number coded correctly?  Many 6-sigma ideas from manufacturing are not applicable due to the subjectivity that is present in document review.
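
As a rough sketch of the custodian estimate described above (my own illustration, not a method presented by the panel): review a random sample of the new custodian’s keyword hits, then scale up.  Every number, name, and parameter below is hypothetical.

```python
from typing import Tuple


def estimate_custodian_burden(keyword_hits: int,
                              sample_size: int,
                              sample_relevant: int,
                              docs_per_hour: float,
                              hourly_rate: float) -> Tuple[float, float]:
    """Return (expected relevant documents, review cost) for one custodian."""
    if sample_size <= 0 or docs_per_hour <= 0:
        raise ValueError("sample_size and docs_per_hour must be positive")
    # Share of sampled keyword hits that reviewers judged relevant.
    precision = sample_relevant / sample_size
    expected_relevant = keyword_hits * precision
    # Assumes every keyword hit is reviewed at the same average rate.
    review_cost = keyword_hits / docs_per_hour * hourly_rate
    return expected_relevant, review_cost


# Hypothetical custodian: 12,000 keyword hits; 20 of 400 sampled hits relevant;
# reviewers average 50 documents per hour at $60 per hour.
relevant, cost = estimate_custodian_burden(12_000, 400, 20, 50, 60.0)
print(f"Expected relevant documents: {relevant:.0f}")
print(f"Estimated review cost: ${cost:,.0f}")
print(f"Cost per relevant document: ${cost / relevant:,.2f}")
```

Note that this leans on documents per hour, the simple metric mentioned above, so it inherits that metric’s blind spots (document complexity, number of issue codes, and so on).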

Information Governance and Data Privacy: A World of Risk
I couldn’t attend this one.

The Importance of a Litigation Hold Policy
I couldn’t attend this one.

Alone Together: Where Have All The Model TAR Protocols Gone?
If you are disclosing details, there are two types: inputs (search terms used to train, shared review of training docs) and outputs (target recall or disclosure of recall).  Don’t agree to a specific level of recall before looking at the data–if prevalence is low it may be hard.  Plaintiff might argue for TAR as a way to overcome cost objections from the defendant.  There is concern about lack of sophistication from judges–there is “stunning” variation in expertise among federal judges.  An attorney involved with the Rio Tinto case recommends against agreeing on seed sets because it is painful and focuses on the wrong thing.  Sometimes there isn’t time to put eyes on all documents that will be produced.  Does the TAR protocol need to address dupes, near-dupes, email threading, etc.?
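
One reason low prevalence makes a recall commitment hard is the cost of measuring it.  The sketch below is my own illustration, not something from the panel.  It assumes a simple normal-approximation sample size with the worst case p = 0.5, and assumes recall is measured on relevant documents found by random sampling from the whole collection.  The function names are hypothetical.

```python
import math


def relevant_docs_needed(margin: float, z: float = 1.96) -> int:
    """Relevant documents needed to estimate recall within +/- margin
    (normal approximation, worst case p = 0.5; z = 1.96 is about 95% confidence)."""
    return math.ceil(z * z * 0.25 / (margin * margin))


def random_docs_to_review(margin: float, prevalence: float) -> int:
    """Expected number of randomly sampled documents that must be reviewed
    to turn up that many relevant ones."""
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    return math.ceil(relevant_docs_needed(margin) / prevalence)


for prevalence in (0.10, 0.01, 0.001):
    n = random_docs_to_review(0.05, prevalence)
    print(f"prevalence {prevalence:.1%}: review about {n:,} random documents")
```

Each tenfold drop in prevalence multiplies the validation review by roughly ten, which is worth knowing before a target recall goes into a protocol.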

Information Governance: Who Owns the Information, the Risk and the Responsibility?
I couldn’t attend this one.

Bringing eDiscovery In-House — Savings and Advantages
I was on this panel, so I didn’t take notes.

Highlights from the Southeast eDiscovery & IG Retreat 2016

This retreat was the first one held by Ing3nious in the Southeast.  It was at the Chateau Elan Winery & Resort in Braselton, Georgia.  Like all of the e-discovery retreats organized by Chris LaCour, it featured informative panels in a beautiful setting.  My notes below offer a few highlights from the sessions I attended.  There were often two sessions occurring simultaneously, so I couldn’t attend everything.

Peer-to-Peer Roundtables
My table discussed challenges people were facing.  These included NSF files (Lotus Notes), weird native file formats, and 40-year-old documents that had to be scanned and OCRed. Companies having a “retain everything” culture are problematic (e.g., 25,000 backup tapes).  One company had a policy of giving each employee a DVD containing all of their emails when they left the company.  When they got sued they had to hunt down those DVDs to retrieve emails they no longer had.  If a problem (information governance) is too big, nothing will be done at all.  In Canada there are virtually never sanctions, so there is always a fight about handing anything over.

Proactive Steps to Cut E-Discovery Costs
I couldn’t attend this one.

The Intersection of Legal and Technical Issues in Litigation Readiness Planning
It is important to establish who you should go to.  Many companies don’t have a plan (figure it out as you go), but it is a growing trend to have one due to data security and litigation risk.  Having an IT / legal liaison is becoming more common.  For litigation readiness, have providers selected in advance.  To get people on board with IG, emphasize cost (dollars) vs. benefit (risk).  Should have an IG policy about mobile devices, but they are still challenging.  Worry about data disposition by a third party provider when the case is over.  Educate people about company policies.

Examining Your Tools & Leveraging Them for Proactive Information Governance Strategy
I couldn’t attend this one.

Got Data? Analytics to the Rescue
Only 56% of in-house counsel use analytics, but 93% think it would be useful.  Use foreign language identification at start to know what you are dealing with.  Be careful about coded language (e.g., language about fantasy sports that really means something else) — don’t cull it!  Graph who is speaking to whom.  Who are emails being forwarded to?  Use clustering to find themes.  Use assisted redaction of PII, but humans should validate the result (this approach gives a 33% reduction in time).  Re-OCR after redaction to make sure it is really gone.  Alex Ponce de Leon from Google said they apply predictive coding immediately as early-case assessment and know the situation and critical documents before hiring outside counsel (many corporate attorneys in the audience turned green with envy).  Predictive coding is also useful when you are the requesting party.  Use email threading to identify related emails.  The requesting party may agree to receive just the last email in the thread.  Use analytics and sampling to show the judge the burden of adding custodians and the number of relevant documents expected — this is much better than just throwing around cost numbers.  Use analytics for QC and reviewer analysis.  Is someone reviewing too slow/fast (keep in mind that document type matters, e.g. spreadsheets) or marking too many docs as privileged?
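
The “who is speaking to whom” graph mentioned above boils down to counting sender/recipient pairs.  Here is a minimal sketch; the field names (`from`, `to`) and the sample addresses are hypothetical, and a real tool would read them from parsed email metadata.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple, Union

Email = Dict[str, Union[str, List[str]]]


def communication_edges(emails: Iterable[Email]) -> Counter:
    """Count messages for each (sender, recipient) pair."""
    edges: Counter = Counter()
    for email in emails:
        for recipient in email["to"]:
            edges[(email["from"], recipient)] += 1
    return edges


# Hypothetical metadata for three emails.
emails = [
    {"from": "alice@example.com", "to": ["bob@example.com", "carol@example.com"]},
    {"from": "alice@example.com", "to": ["bob@example.com"]},
    {"from": "bob@example.com", "to": ["alice@example.com"]},
]

# List the busiest sender/recipient pairs first.
for (sender, recipient), count in communication_edges(emails).most_common():
    print(f"{sender} -> {recipient}: {count}")
```

The same edge counts can be fed into a graph library for visualization, or filtered by date range to see how communication patterns shift around key events.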

The Power of Analytics: Strategies for Investigations and Beyond
Focus on the story (fact development), not just producing documents.  Context is very important for analyzing instant messages.  Keywords often don’t work for IMs due to misspellings.  Analytics can show patterns and help detect coded language.  Communicate about how emails are being handled: are you producing threads or everything, and are you logging threads or everything (producing and logging may be different)?  Regarding transparency, are the seed set and workflow work product?  When working with the DOJ, one panelist showed them results for different bands of predictive coding results and they were satisfied with that.  Nobody likes the idea of doing a clawback agreement and skipping privilege review.

Freedom of Speech Isn’t Free…of Consequences
The 1st Amendment prohibits Congress from passing laws restricting speech, but that doesn’t keep companies from putting restrictions on employees.  With social media, cameras everywhere, and the ability of things to go viral (the grape lady was mentioned), companies are concerned about how their reputations could be damaged by employees’ actions, even outside the workplace.  A doctor and a Taco Bell executive were fired due to videos of them attacking Uber drivers.  Employers creating policies curbing employee behavior must be careful about Sec. 8 of the National Labor Relations Act, which prohibits employers from interfering with employees’ Sec. 7 rights to self-organize or join/form a labor organization.  Taken broadly, employers cannot prohibit employees from complaining about working conditions since that could be seen as a step toward organizing.  Employers have to be careful about social media policies or prohibiting employees from talking to the media because of this.  Even a statement in the employee handbook saying employees should be respectful could be problematic because requiring them to be respectful toward their boss could be a violation.  The BYOD policy should not prohibit accessing Facebook (even during work) because Facebook could be used to organize.  On the other hand, employers could face charges of negligent retention/hiring if they don’t police social media.

Generating a Competitive Advantage Through Information Governance: Lessons from the Field
I couldn’t attend this one.

Destruction Zone
The government is getting more sophisticated in its investigations; it is important to give them good productions and avoid losing important data.  Check to see if there is a legal hold before discarding old computer systems and when employees leave the company.  It is important to know who the experts are in the company and ensure communication across functions.  Information governance is about maximizing value of information while minimizing risks.  The government is starting to ask for text messages.  Things you might have to preserve in the future include text messages, social media, videos, and virtual reality.  It’s important to note the difference between preserving the text messages by conversation and by custodian (where things would have to be stitched back together to make any sense of the conversation).  Many companies don’t turn on recording of IMs, viewing them as conversational.
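
To illustrate what “stitching back together” involves when text messages were preserved by custodian, here is a minimal sketch.  The record layout (`id`, `conversation_id`, `sent`, `text`) is hypothetical, and it assumes all timestamps are ISO 8601 strings in the same time zone so that they sort correctly as text.

```python
from collections import defaultdict
from typing import Dict, List

Message = Dict[str, str]


def stitch_conversations(exports: Dict[str, List[Message]]) -> Dict[str, List[Message]]:
    """Merge per-custodian exports into conversations in chronological order."""
    seen_ids = set()
    conversations: Dict[str, List[Message]] = defaultdict(list)
    for messages in exports.values():
        for msg in messages:
            # The same message appears in each participant's export; keep one copy.
            if msg["id"] in seen_ids:
                continue
            seen_ids.add(msg["id"])
            conversations[msg["conversation_id"]].append(msg)
    for msgs in conversations.values():
        msgs.sort(key=lambda m: m["sent"])
    return dict(conversations)


# Hypothetical exports from two custodians who texted each other.
exports = {
    "custodian_a": [
        {"id": "m1", "conversation_id": "c1", "sent": "2016-05-01T09:00:00", "text": "Call me."},
        {"id": "m2", "conversation_id": "c1", "sent": "2016-05-01T09:05:00", "text": "In a meeting."},
    ],
    "custodian_b": [
        {"id": "m2", "conversation_id": "c1", "sent": "2016-05-01T09:05:00", "text": "In a meeting."},
        {"id": "m3", "conversation_id": "c1", "sent": "2016-05-01T09:02:00", "text": "Now?"},
    ],
}

for msg in stitch_conversations(exports)["c1"]:
    print(msg["sent"], msg["text"])
```

This only works if the export carries a reliable conversation identifier and message ID, which is one reason preserving by conversation in the first place is easier.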

Managing E-Discovery as a Small Firm or Solo Practitioner
I couldn’t attend this one.

Overcoming the Objections to Utilizing TAR
I was on this panel, so I didn’t take notes.

Max Schrems, Edward Snowden and the Apple iPhone: Cross-Border Discovery and Information Management Times Are A-Changing
I couldn’t attend this one.