The opinion of the court was delivered by Andrew J. Peck, United States Magistrate Judge:
In my article Search, Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?, I wrote:
To my knowledge, no reported case (federal or state) has ruled on the use of computer-assisted coding. While anecdotally it appears that some lawyers are using predictive coding technology, it also appears that many lawyers (and their clients) are waiting for a judicial decision approving of computer-assisted review.
Perhaps they are looking for an opinion concluding that: "It is the opinion of this court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure, and furthermore that the software provided for this purpose by [insert name of your favorite vendor] is the software of choice in this court." If so, it will be a long wait.
Until there is a judicial opinion approving (or even critiquing) the use of predictive coding, counsel will just have to rely on this article as a sign of judicial approval. In my opinion, computer-assisted coding should be used in those cases where it will help "secure the just, speedy, and inexpensive" (Fed. R. Civ. P. 1) determination of cases in our e-discovery world.
Andrew Peck, Search, Forward, L. Tech. News, Oct. 2011, at 25, 29. This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.*fn1
In this action, five female named plaintiffs are suing defendant Publicis Groupe, "one of the world's 'big four' advertising conglomerates," and its United States public relations subsidiary, defendant MSL Group. (See Dkt. No. 4: Am. Compl. ¶¶ 1, 5, 26-32.) Plaintiffs allege that defendants have a "glass ceiling" that limits women to entry level positions, and that there is "systemic, company-wide gender discrimination against female PR employees like Plaintiffs." (Am. Compl. ¶¶ 4-6, 8.) Plaintiffs allege that the gender discrimination includes (a) paying Plaintiffs and other female PR employees less than similarly-situated male employees; (b) failing to promote or advance Plaintiffs and other female PR employees at the same rate as similarly-situated male employees; and (c) carrying out discriminatory terminations, demotions and/or job reassignments of female PR employees when the company reorganized its PR practice beginning in 2008 . . . . (Am. Compl. ¶ 8.)
Plaintiffs assert claims for gender discrimination under Title VII (and under similar New York State and New York City laws) (Am. Compl. ¶¶ 204-25), pregnancy discrimination under Title VII and related violations of the Family and Medical Leave Act (Am. Compl. ¶¶ 239-71), as well as violations of the Equal Pay Act and Fair Labor Standards Act (and the similar New York Labor Law) (Am. Compl. ¶¶ 226-38).
The complaint seeks to bring the Equal Pay Act/FLSA claims as a "collective action" (i.e., opt-in) on behalf of all "current, former, and future female PR employees" employed by defendants in the United States "at any time during the applicable liability period" (Am. Compl. ¶¶ 179-80, 190-203), and as a class action on the gender and pregnancy discrimination claims and on the New York Labor Law pay claim (Am. Compl. ¶¶ 171-98). Plaintiffs, however, have not yet moved for collective action or class certification.
Defendant MSL denies the allegations in the complaint and has asserted various affirmative defenses. (See generally Dkt. No. 19: MSL Answer.) Defendant Publicis is challenging the Court's jurisdiction over it, and the parties have until March 12, 2012 to conduct jurisdictional discovery. (See Dkt. No. 44: 10/12/11 Order.)
COMPUTER-ASSISTED REVIEW EXPLAINED
My Search, Forward article explained my understanding of computer-assisted review, as follows:
By computer-assisted coding, I mean tools (different vendors use different names) that use sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e., training by) a human reviewer.
Unlike manual review, where the review is done by the most junior staff, computer-assisted coding involves a senior partner (or [small] team) who review and code a "seed set" of documents. The computer identifies properties of those documents that it uses to code other documents. As the senior reviewer continues to code more sample documents, the computer predicts the reviewer's coding. (Or, the computer codes some documents and asks the senior reviewer for feedback.)
When the system's predictions and the reviewer's coding sufficiently coincide, the system has learned enough to make confident predictions for the remaining documents. Typically, the senior lawyer (or team) needs to review only a few thousand documents to train the computer.
Some systems produce a simple yes/no as to relevance, while others give a relevance score (say, on a 0 to 100 basis) that counsel can use to prioritize review. For example, a score above 50 may produce 97% of the relevant documents, but constitutes only 20% of the entire document set.
Counsel may decide, after sampling and quality control tests, that documents with a score of below 15 are so highly likely to be irrelevant that no further human review is necessary. Counsel can also decide the cost-benefit of manual review of the documents with scores of 15-50.
Andrew Peck, Search, Forward, L. Tech. News, Oct. 2011, at 25, 29.*fn2
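By way of illustration only, the score-based triage described in the quoted passage can be sketched in a few lines of Python. The cutoffs of 15 and 50 are the article's own examples. The function name, bucket labels, document identifiers, and sample scores are hypothetical and are not drawn from any vendor's software or from the record in this case. In practice, counsel would set the cutoffs after sampling and quality control testing.

```python
from collections import Counter

# Illustrative cutoffs from the article's example. Real cutoffs would be
# chosen by counsel after sampling and quality control testing.
LOW_CUTOFF = 15
HIGH_CUTOFF = 50


def triage(score):
    """Assign a document to a review bucket from its 0-100 relevance score."""
    if score < LOW_CUTOFF:
        # Scores below 15: so likely irrelevant that no further human review is done.
        return "no_further_review"
    if score <= HIGH_CUTOFF:
        # Scores of 15-50: counsel weighs the cost and benefit of manual review.
        return "cost_benefit_decision"
    # Scores above 50: prioritized for review.
    return "prioritized_review"


# Hypothetical documents and scores, for illustration only.
sample_scores = {
    "DOC-001": 92,
    "DOC-002": 8,
    "DOC-003": 37,
    "DOC-004": 50,
    "DOC-005": 15,
}

buckets = {doc_id: triage(score) for doc_id, score in sample_scores.items()}
counts = Counter(buckets.values())

for doc_id, bucket in buckets.items():
    print(doc_id, sample_scores[doc_id], bucket)
print(dict(counts))
```

The sketch treats a score of exactly 15 or exactly 50 as falling in the middle band, consistent with the article's references to scores "below 15," "15-50," and "above 50."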
My article further explained my belief that Daubert would not apply to the results of using predictive coding, but that in any challenge to its use, this Judge would be interested in both the process used and the results:
[I]f the use of predictive coding is challenged in a case before me, I will want to know what was done and why that produced defensible results. I may be less interested in the science behind the "black box" of the vendor's software than in whether it produced responsive documents with reasonably high recall and high precision.
That may mean allowing the requesting party to see the documents that were used to train the computer-assisted coding system. (Counsel would not be required to explain why they coded documents as responsive or non-responsive, just what the coding was.) Proof of a valid "process," including quality control testing, also will be important.
Of course, the best approach to the use of computer-assisted coding is to follow the Sedona Cooperation Proclamation model. Advise opposing counsel that you plan to use computer-assisted coding and seek agreement; if you cannot, consider whether to abandon predictive coding for that case or go to the court for advance approval.
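For readers unfamiliar with the terms, recall is the fraction of the truly responsive documents that the process found, and precision is the fraction of the documents the process flagged that are in fact responsive. A minimal sketch of the arithmetic follows, assuming a hand-coded validation sample is available for comparison. The function name and the document identifiers are hypothetical and the figures are invented for illustration.

```python
def recall_precision(predicted_responsive, truly_responsive):
    """Return (recall, precision) for a set of predictions against human coding."""
    predicted = set(predicted_responsive)
    actual = set(truly_responsive)
    true_positives = len(predicted & actual)
    # Recall: share of the truly responsive documents that were found.
    recall = true_positives / len(actual) if actual else 0.0
    # Precision: share of the flagged documents that are truly responsive.
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical validation sample: the system flagged four documents, and
# human reviewers coded five documents as responsive.
flagged_by_system = {"A", "B", "C", "D"}
coded_by_reviewers = {"A", "B", "C", "E", "F"}

recall, precision = recall_precision(flagged_by_system, coded_by_reviewers)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this invented sample, three of the five responsive documents were found (recall of 0.60), and three of the four flagged documents were responsive (precision of 0.75).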
THE ESI DISPUTES IN THIS CASE AND THEIR RESOLUTION
After several discovery conferences and rulings by Judge Sullivan (the then-assigned District Judge), he referred the case to me for general pretrial supervision. (Dkt. No. 48: 11/28/11 Referral Order.) At my first discovery conference with the parties, both parties' counsel mentioned that they had been discussing an "electronic discovery protocol," and MSL's counsel stated that an open issue was "plaintiff's reluctance to utilize predictive coding to try to cull down the" approximately three million electronic documents from the agreed-upon custodians. (Dkt. No. 51: 12/2/11 Conf. Tr. at 7-8.)*fn3 Plaintiffs' counsel clarified that MSL had "over simplified [plaintiffs'] stance on predictive coding," i.e., ...