Trade-offs of Data-Dependence

When do data-dependent algorithms offer improved error rates?

Finding: Data-dependence can offer significant improvements in error, at smaller scales or lower epsilon values, but some data-dependent algorithms fail to offer benefits at larger scales or higher epsilons.

The figures below show the error rates of data-dependent algoritms at a low scale (1000) and a high scale (10,000,000). For a given algorithm (on the x-axis) each dots represents mean error of the algorithm on one dataset. Horizontal lines represent the best and worst performing data-independent algorithms (HB and IDENTITY, respectively).

A circle for a dataset is colored blue when the algorithm achieves lower error than the best data-independent method on that dataset. Similarly, circles are colored red when the algorithm achieves higher error than the worst data-independent method. Grey dots are between these two extremes.

At low scale many data-dependent algorithms outperform the best data-independent algorithm on at least some datasets (blue circles); some algorithms like DAWA and MWEM uniformly outperform the best data-independent algorithm.
At high scale most data-dependent algorithms underperform even the worst data-independent algorithm, Identity, as shown by red circles.