Trade-offs of Data-Dependence

When do data-dependent algorithms offer improved error rates?

The figures below show the error rates of data-dependent algoritms at a low scale (1000) and a high scale (10,000,000). For a given algorithm (on the x-axis) each dots represents mean error of the algorithm on one dataset. Horizontal lines represent the best and worst performing data-independent algorithms (HB and IDENTITY, respectively).

A circle for a dataset is colored blue when the algorithm achieves lower error than the best data-independent method on that dataset. Similarly, circles are colored red when the algorithm achieves higher error than the worst data-independent method. Grey dots are between these two extremes.

Settings
  • At low scale many data-dependent algorithms outperform the best data-independent algorithm on at least some datasets (blue circles); some algorithms like DAWA and MWEM uniformly outperform the best data-independent algorithm.
  • At high scale most data-dependent algorithms underperform even the worst data-independent algorithm, Identity, as shown by red circles.