How to Pick a Lousy Keyword

This is the title I wish Brian Larsen had given his article. Instead, he went with the tamer title “Filtering Responsive Data in EDD,” in which he provides some excellent guidelines for the filtering phase of an e-discovery project.

The second paragraph gets right to the point:

“…the attorneys requested production of all documents containing the word “buy.” Despite being cautioned against this broad search, they were reluctant to heed the warnings, and many unrelated documents were incorrectly deemed responsive.”

Data Funnel

One of the most difficult aspects of an e-discovery project is crafting an effective culling or filtering plan for large volumes of ESI. From a legal perspective, there is always an overwhelming fear that “we’ll miss something” if we don’t cast as broad a net as possible over all the data. Therefore, we come up with some “lowest common denominator” words or phrases that will grab everything remotely related to a litigation matter.

Brian provides an example in his article where an attorney wanted to use the keyword “gas,” presumably for a litigation matter involving something like a gas pipeline. The only problem was that one of the parties was an oil and gas company, which meant that the “gas” keyword returned every e-mail message, since the word appeared in every e-mail signature.
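To make the “gas” problem concrete, here is a rough sketch of how a consultant might sanity-check a proposed keyword before running it across a full data set. Everything in it is hypothetical: the sample messages, the sender and company names, and the assumption that signatures follow a "-- " delimiter line (a common plain-text e-mail convention, but by no means universal). It is an illustration of the idea, not a description of how any particular e-discovery tool works.

```python
import re

def hit_rate(documents, keyword):
    """Fraction of documents containing keyword as a whole word (case-insensitive)."""
    if not documents:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = sum(1 for text in documents if pattern.search(text))
    return hits / len(documents)

def strip_signature(text, delimiter="\n-- \n"):
    """Drop everything after the signature delimiter (assumed convention)."""
    return text.split(delimiter, 1)[0]

# Hypothetical signature and messages, invented for illustration only.
SIGNATURE = "\n-- \nJane Doe\nAcme Oil and Gas Company"
emails = [
    "Lunch on Friday?" + SIGNATURE,
    "The gas pipeline inspection is scheduled for Monday." + SIGNATURE,
    "Please send the Q3 budget." + SIGNATURE,
    "Reminder: parking garage closed." + SIGNATURE,
]

raw_rate = hit_rate(emails, "gas")
body_rate = hit_rate([strip_signature(e) for e in emails], "gas")
print(f"raw: {raw_rate:.0%}, body only: {body_rate:.0%}")
# prints "raw: 100%, body only: 25%"
```

A keyword that hits close to 100% of a sample is doing no filtering at all, which is a useful number to bring to the conversation with the attorneys before the full search is run.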

In my opinion, the best solution to this problem is to communicate with the key custodians involved in a litigation matter. Brian addresses this about midway through the article:

“Custodians will often know more about the situation than almost anyone else involved. Attorneys should take advantage of custodians, if possible, as a source of ideas for keywords since they can positively identify specific terms, documents and files related to the matter. Custodians may also offer some insight into the unique vocabulary of the company, industry or subject matter, abbreviations or slang used as a reference to common organizational or project terminology.”

And while it makes total sense to talk to the people closest to the matter, I don’t find a lot of attorneys willing to take the time to do this. Is it because they’re afraid to bother the custodians? Do they not trust the custodians to give them accurate information? Do attorneys believe they can glean more information about the matter by reading and reviewing the documents themselves?

It comes down to communication. As consultants, we need to better communicate why a keyword like “gas” might be ridiculously broad, and be better prepared to offer alternative methods for retrieving accurate results. Attorneys also must understand that filtering ESI is not as simple as running a Google or Lexis search – it takes thoughtful conversations with key custodians and experienced vendors to effectively narrow down the body of data while still ensuring that no relevant results get lost in the shuffle.

Brian Larsen’s article further provides some good tips on excluding items like file types and file locations.

Link to article.