How Automated Contract Provision Extraction Systems Find Relevant Provisions, And Why "How" Matters

Written by: Noah Waisberg


People who review contracts generally need to do this work accurately. In fact, one of the principal problems with the current way contracts get reviewed without the help of technology (at law firms and elsewhere) is that people make mistakes in this work. Accurate automated contract review systems can help. We, for example, have found that our system helps a user generate more accurate contract summaries in less time than traditional contract review (stay tuned for details). Automated contract provision extraction systems make their most powerful impact when they are accurate. The trouble is that building accurate provision models is hard, a lot harder than it first seems. And the details of which technology is used to build provision models, and who is involved in building them, really matter.

Knowing something of how the underlying extraction technology works is key to understanding automated contract review software. So, since this is the Contract Review Software Buyer’s Guide, the next several posts will cover important details on where provision models come from. This post is part III of the Contract Review Software Buyer’s Guide. Earlier posts introduce the series, then cover what’s wrong with the traditional approach to contract review, what contract review software does, and why it’s worthwhile.


Since a pair of terms, “accuracy” and “unfamiliar documents”, will recur throughout this system-mechanics part of the Contract Review Software Buyer’s Guide, here are definitions:


Accuracy

In search, accuracy is usually measured on the basis of two metrics: precision and recall. As we cover in more detail in an earlier post on our system’s accuracy in finding contract provisions:

  • Recall measures the extent to which relevant results are found. For example, in a set of documents with 50 assignment provisions, if a search system found 48 of the 50, recall would be 96%.

  • Precision measures the extent to which only relevant results are found; irrelevant results hurt precision. Effectively, precision shows what share of the returned hits are relevant rather than junk. In the recall example above, if the system presented 300 provisions as assignment hits to turn up 48 actual assignment provisions, precision would be 16%.

    There is generally a tradeoff between recall and precision. In our problem of trying to find contract provisions, one way to get perfect recall would be to return entire contracts as results. We would never miss a provision following this strategy. Alternatively, we could make extra-sure the system only ever showed users relevant results, but this could come at the cost of missing some (perhaps atypical) provisions.

So “accuracy” is a function of both finding relevant contract provisions and only presenting relevant results. Our view is that automated contract review systems need high recall more than precision. Users are more hurt by missed provisions than the odd false positive result. That said, results cannot be hugely unbalanced; results with too low precision are not useful.
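
The arithmetic behind the two metrics can be sketched in a few lines of Python. This is only an illustration using the figures from the assignment example above; the function and parameter names are made up for this post and are not taken from any particular product.

```python
def precision_recall(true_positives: int, returned: int, relevant: int) -> tuple[float, float]:
    """Return (precision, recall) as fractions between 0 and 1.

    true_positives: returned hits that really are the provision sought
    returned:       total hits the system presented to the user
    relevant:       provisions of that type actually present in the documents
    """
    precision = true_positives / returned if returned else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# Figures from the example above: the documents contain 50 assignment
# provisions, the system presents 300 hits, and 48 of those hits are real.
precision, recall = precision_recall(true_positives=48, returned=300, relevant=50)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
# prints "precision = 16%, recall = 96%"
```

A result like this one (high recall, low precision) is the unbalanced kind: almost nothing is missed, but the reviewer has to wade through 252 junk hits to find the 48 real ones.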

Unfamiliar Documents

Sometimes contracts to be reviewed are simply executed versions of form agreement(s), and reviewers have the form in advance. Imagine a large technology company considering all of its executed system integrator agreements, an insurance company needing to extract data from its own policies, a landlord looking at leases it executed over a short time period in a given building, someone needing to get through a pile of ISDAs or review NDAs prepared using Koncision’s drafting tool. All of these situations might feature “known documents”. Automatically extracting data from known documents is easy. Since provisions are known in advance, searchers can write rules or train models to closely fit the form.

But diversity reigns in typical contract review! The form and wording of agreements to be reviewed in any given contract review are typically not known in advance. This is especially so in M&A due diligence review of a target company’s contracts, but is also regularly the case in contract management database population work; even the biggest companies execute agreements on others' paper. These unknown-in-advance agreements are “unfamiliar documents”. Performance on them is a big dividing line in automated contract review software systems.

If all you seek is data from known documents, you can skip the next technology-centric posts. Pretty much any system should be able to meet your technical needs, and you can decide more on other user experience factors. If, however, you need to review unfamiliar documents, the system-mechanics details in the next several posts will really matter for you. Accurately extracting data from unfamiliar documents is hard, and not every system will perform the same here.

“Automated Contract Provision Extraction” And Its Equivalents

Finally, while we’re at definitions, note that “automated contract provision extraction” = “automated contract review” = “automatic contract abstraction” = “automated contract abstraction” = “contractual metadata extraction”. Different vendors, different words, same idea. (Though significantly different implementations, including at least one offering that is only partially automated.) These are subsets of “contract review software”; as discussed in part II of the Contract Review Software Buyer’s Guide, nearly all contract review software systems now on the market include an automated provision extraction feature.

How Automated Contract Provision Extraction Systems Actually Work And Why This Matters

How the DiligenceEngine automatic contract provision extraction system works.

Contract provision extraction systems are pretty simple in concept. Most systems have contract provision models either pre-loaded or available on request/with customization.* Documents being reviewed are converted to machine readable text (if needed).** This text is then scanned by the system’s provision models and hits are extracted. The chart on the left shows how this works in our automated contract provision extraction system. More details on how ours works are available here.
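
The flow described above (text in, provision models scan it, hits come out) can be sketched roughly as follows. This is a conceptual toy, not how our system or any other vendor's system is implemented. Every name in it (`Hit`, `ProvisionModel`, `toy_assignment_model`, `extract`) is hypothetical, the "model" is a trivial stand-in, and the documents are assumed to have already been converted to machine-readable text.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: a provision model is any function that takes contract
# text and returns the passages it believes are hits.
ProvisionModel = Callable[[str], list[str]]


@dataclass
class Hit:
    document: str   # which contract the passage came from
    provision: str  # which provision model flagged it
    text: str       # the extracted passage


def split_paragraphs(text: str) -> list[str]:
    """Split contract text into non-empty paragraphs on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def toy_assignment_model(text: str) -> list[str]:
    """Stand-in model for illustration only: flags paragraphs mentioning 'assign'."""
    return [p for p in split_paragraphs(text) if "assign" in p.lower()]


def extract(documents: dict[str, str], models: dict[str, ProvisionModel]) -> list[Hit]:
    """Run every provision model over every document and collect the hits."""
    hits = []
    for name, text in documents.items():  # text is assumed to be post-OCR
        for provision, model in models.items():
            for passage in model(text):
                hits.append(Hit(name, provision, passage))
    return hits


documents = {
    "supply_agreement.txt": (
        "Neither party may assign this Agreement without consent.\n\n"
        "This Agreement is governed by the laws of New York."
    )
}
hits = extract(documents, {"Assignment": toy_assignment_model})
for hit in hits:
    print(f"{hit.document} [{hit.provision}]: {hit.text}")
```

The loop itself is the easy part. What separates systems is what sits behind each model function, which is the subject of the rest of this post and the next several.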

A key issue with automated contract review systems is where their provision models come from. This has two critical subparts:

  • what technology is used to build models, and
  • who is involved in instructing the system what correct hits look like.

The quick and easy way to generate provision models is to write manual rules (such as boolean search strings) describing provisions. While this approach is reasonably simple to set up and easy to understand, it has drawbacks. Most notably, the resulting models are unlikely to be especially accurate on unfamiliar documents. And it will also be hard to measure system accuracy on unfamiliar documents. Comparison-based and machine learning-based technologies underlie alternatives that are more difficult to build. With significant work, machine learning-based technology can give accurate provision models on unfamiliar agreements.
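
To make the manual-rule approach concrete, here is a minimal sketch of a boolean-style rule written in Python. The rule, the function name `assignment_rule`, and the three sample sentences are all invented for illustration; real rule sets are more elaborate, but they tend to fail in the same two ways.

```python
import re


def assignment_rule(paragraph: str) -> bool:
    """Hypothetical boolean rule: (assign*) AND (consent OR transfer)."""
    text = paragraph.lower()
    has_assign = re.search(r"\bassign(ment|s|ed)?\b", text) is not None
    has_qualifier = re.search(r"\b(consent|transfer)\b", text) is not None
    return has_assign and has_qualifier


# Typical wording: the rule finds it.
typical = (
    "Neither party may assign this Agreement without the prior written "
    "consent of the other party."
)
# Atypical wording with a similar effect: the rule misses it (hurts recall).
atypical = (
    "This Agreement is personal to the Supplier, who shall not delegate or "
    "novate its rights or obligations hereunder."
)
# Unrelated sentence that happens to contain the keywords: a false
# positive (hurts precision).
irrelevant = "The Buyer may assign a project manager with the Supplier's consent."

for label, sentence in [("typical", typical), ("atypical", atypical), ("irrelevant", irrelevant)]:
    print(f"{label}: {assignment_rule(sentence)}")
```

The rule catches the wording its author had in mind, misses a differently drafted restriction, and flags a sentence that has nothing to do with assignment of the agreement. On known documents the rule can be tuned until it fits the form; on unfamiliar documents there is no way to anticipate every drafting variation in advance.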

The next several posts will give details on how the three technological alternatives (manual rules, comparison-based methods, and machine learning) are used to build provision models, and note advantages and disadvantages of each. Then, the series will discuss how who is involved in this process matters.

* Comparison-type contract metadata extraction systems work slightly differently. A later instalment of the series will give information on these.

** Contract review systems convert agreements that are not already in text format to machine-readable text using optical character recognition (OCR) software. Some or all automated contract provision extraction software providers integrate third-party OCR software for this function; we are not aware of any providers that build their own.

