Coverage vs. Accuracy
  • 15 Mar 2022


The tradeoffs between accuracy and coverage and how to optimally control results

Adding more data to a query does not necessarily produce better entity resolution (ER) results. Although extra information about an entity can enrich the query results, in many cases it has the opposite effect.

Certain data types can increase accuracy, provided the data is current. Inaccurate or outdated values in a query can reduce the accuracy of the results and, in turn, reduce coverage. A query that omits unnecessary or inaccurate values returns accurate information about more companies. Addresses are a typical case: if a company has moved, including its original address may cause the model to conclude that two entities are different when, in fact, they are the same. As previously mentioned, it is helpful to include unique entity identifiers in the query. (The preceding section, ‘Ensuring reliable ER results,’ describes the practical implications of these tradeoffs.)
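The address case can be illustrated with a small toy matcher. This is only a sketch under stated assumptions, not the ER model described in this article: the Record fields, the match_score rule (the share of supplied query fields that agree with a record), the 0.75 threshold, and the coverage helper (the share of queries that return any match) are all hypothetical and chosen purely to show how one stale value can turn a match into a miss.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical reference record; field names are illustrative only."""
    entity_id: str
    name: str
    address: Optional[str] = None
    registration_id: Optional[str] = None


FIELDS = ("name", "address", "registration_id")


def match_score(query: dict, record: Record) -> float:
    """Toy rule: fraction of the supplied query fields that agree with the record."""
    supplied = [f for f in FIELDS if query.get(f)]
    if not supplied:
        return 0.0
    agree = sum(
        1
        for f in supplied
        if query[f].strip().lower() == (getattr(record, f) or "").strip().lower()
    )
    return agree / len(supplied)


def resolve(query: dict, records: list, threshold: float = 0.75) -> Optional[str]:
    """Return the best-scoring entity_id, or None if no record reaches the threshold."""
    best = max(records, key=lambda r: match_score(query, r))
    return best.entity_id if match_score(query, best) >= threshold else None


def coverage(queries: list, records: list) -> float:
    """Assumed definition: share of queries for which some match is returned."""
    return sum(resolve(q, records) is not None for q in queries) / len(queries)


records = [
    Record("E1", "Acme Ltd", "12 New Street", "REG-001"),
    Record("E2", "Globex Inc", "9 Harbour Road", "REG-002"),
]

# Minimal query: name plus a unique identifier.
lean_query = {"name": "Acme Ltd", "registration_id": "REG-001"}

# Same query plus the address the company used before it moved.
stale_query = {**lean_query, "address": "1 Old Street"}

print(resolve(lean_query, records))
print(resolve(stale_query, records))
print(coverage([lean_query, stale_query], records))
```

In this sketch the lean query agrees on every supplied field and resolves to the intended record, while the extra, outdated address pulls the score of the same query below the threshold, so it returns no match and coverage over the two queries drops. A production ER model weighs fields very differently; only the direction of the effect is meant to carry over.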

The following diagrams illustrate this tradeoff between coverage and accuracy. The top graph shows the coverage comparison, and the bottom graph shows the precision comparison when more input is added:



