Predictive coding uses supervised machine learning (a type of artificial intelligence) to assist an attorney in the review and classification of electronically stored information. Predictive coding software analyzes whole documents in a dataset, not just keywords, and uses advanced mathematics, including very high-dimensional vector space probability analysis and logistic regression algorithms, to compare, order, and rank the documents.
In predictive coding-driven CARs (computer-assisted reviews), attorneys train a computer to find documents that the attorney has identified as a target, typically documents relevant to a particular lawsuit or belonging to some other classification, such as privileged.
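To make the idea of training and ranking more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code of any actual predictive coding product: the four training documents, their relevance labels, and the function names (`tokenize`, `vectorize`, `train`, `rank`) are all hypothetical, and simple word counts stand in for the far richer document features that commercial software uses. The attorney's coding decisions are the labels, each whole document becomes a vector, and logistic regression turns those vectors into a probability of relevance that is used to rank the unreviewed documents.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text, vocab):
    """Represent a whole document as word counts over the vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(docs, labels, epochs=500, lr=0.1):
    """Fit a logistic regression model with simple gradient descent."""
    vocab = sorted({word for doc in docs for word in tokenize(doc)})
    vectors = [vectorize(doc, vocab) for doc in docs]
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # positive if the model overestimates relevance
            bias -= lr * error
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return vocab, weights, bias


def rank(unreviewed, model):
    """Score unreviewed documents and sort them, most likely relevant first."""
    vocab, weights, bias = model
    scored = []
    for doc in unreviewed:
        x = vectorize(doc, vocab)
        p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
        scored.append((p, doc))
    return sorted(scored, reverse=True)


# Hypothetical attorney-coded training documents: 1 = relevant, 0 = irrelevant.
training_docs = [
    "please shred the contract before the audit",
    "delete the audit files about the contract",
    "lunch menu for the office party",
    "office party moved to friday",
]
training_labels = [1, 1, 0, 0]

model = train(training_docs, training_labels)
ranked = rank(
    ["friday lunch at the office", "the contract audit is next week"],
    model,
)
for probability, doc in ranked:
    print(f"{probability:.2f}  {doc}")
```

In this toy example the document about the contract audit is ranked above the one about lunch, because the words it shares with the attorney's relevant examples carry positive weights. In a real project the ranking is only as good as the training, which is why the workflow returns to attorney review again and again.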
Below is the diagram of the latest Predictive Coding 4.0 workflow for use in a typical CAR project.
For a full description of the eight steps, see Parts Six and Seven of Predictive Coding 4.0 – Nine Key Points of Legal Document Review and an Updated Statement of Our Workflow. The complete article on Predictive Coding 4.0, all seven parts, can be found here. There is also a 97-page PDF version of this article (it does not include the ten videos) that can be found here.
Please remember that before you carry out a predictive coding project as described, you need to plan for it. Planning is critical to the success of the project. For a complete checklist, we suggest you consult this detailed outline of a Form Plan for a Predictive Coding Project.
The use of multimodal judgmental sampling in steps two, four and six to find documents for training follows the consensus view of information scientists specializing in information retrieval, but it is not followed by several prominent predictive coding software vendors in e-discovery. They instead rely entirely on machine-selected documents for training or, even worse, rely entirely on randomly selected documents to train the software. See Part One of Predictive Coding 3.0, where some of the errors in Predictive Coding 1.0 and 2.0 are described. Also see Predictive Coding 4.0, Part Two, where the first of the nine insights, Active Machine Learning, is explained, including the method of double-loop learning. In Part Three of Predictive Coding 4.0 we explain what is meant by the Balanced Hybrid approach, where both Man and Machine are relied upon.
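The difference between relying on a single selection method and drawing training documents from several sources can be sketched as follows. This is a simplified illustration under our own assumptions, not a statement of how any vendor's software or the full workflow operates: the function `select_training_batch`, the three modes shown, and the sample scores are all hypothetical. Here the next training batch combines documents the attorney found by judgment (for example, through a keyword search), the machine's highest-ranked documents, and the documents the machine is least sure about.

```python
def select_training_batch(scores, attorney_picks, per_mode=2):
    """Choose unreviewed documents for the next round of attorney coding.

    scores: dict of document id -> machine-predicted probability of relevance
    attorney_picks: document ids found by attorney judgment (hypothetical
        example: hits from a keyword search the attorney designed)
    per_mode: how many new documents each selection mode may contribute
    """
    chosen = []

    def take(candidates, limit):
        # Add up to `limit` candidates that have not been chosen already.
        added = 0
        for doc_id in candidates:
            if added == limit:
                break
            if doc_id not in chosen:
                chosen.append(doc_id)
                added += 1

    # Mode 1 (Man): documents selected by attorney judgment.
    take(attorney_picks, per_mode)

    # Mode 2 (Machine): the highest-ranked documents.
    take(sorted(scores, key=scores.get, reverse=True), per_mode)

    # Mode 3 (Machine): the most uncertain documents, closest to 0.5.
    take(sorted(scores, key=lambda d: abs(scores[d] - 0.5)), per_mode)

    return chosen


# Hypothetical relevance scores from the current state of the model.
scores = {
    "doc1": 0.95, "doc2": 0.88, "doc3": 0.81, "doc4": 0.52,
    "doc5": 0.47, "doc6": 0.30, "doc7": 0.10, "doc8": 0.03,
}
batch = select_training_batch(scores, attorney_picks=["doc7", "doc2"])
print(batch)  # ['doc7', 'doc2', 'doc1', 'doc3', 'doc4', 'doc5']
```

Note that the attorney's picks include doc7, which the machine scored at only 0.10. A purely machine-selected batch might not surface that document for a long time, which is one reason to keep human judgment in the selection loop.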