Title: Using caution to explain and improve collective classification
Year of Publication: 2009
Authors: McDowell, L. K., Gupta, K., Aha, D. W.
Institution: Naval Research Laboratory, Navy Center for Applied Research in Artificial Intelligence
Type: NCARAI Technical Note
Keywords: collective classification, machine learning
Many algorithms for collective classification (CC) have been shown to increase accuracy when instances are interrelated. Such algorithms must be carefully applied, however, since CC’s use of estimated labels can in some cases decrease accuracy. Thus, a deeper understanding of algorithmic performance on data sets with different characteristics is needed. Although prior work has begun to study and compare such algorithms, many important questions remain unanswered. To address these limitations, we extend the recently introduced notion of caution in CC algorithms to predict which CC algorithms and training techniques will outperform others, and to identify the data characteristics for which such performance differences will be substantial. Using the theme of caution and our experimental results, we demonstrate the close relationship between two very different algorithms (Gibbs sampling and Gradual Commit), show when they outperform less cautious algorithms, and explain multiple conflicting results from prior CC research.
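To make the idea of caution concrete, the following is a minimal sketch of a Gradual-Commit-style cautious collective classifier: each round, only the single most confident prediction is committed, and the remaining nodes are then re-estimated using the labels of committed neighbors. All names, the toy graph, and the homophily-based update rule here are illustrative assumptions, not the paper's actual algorithms.

```python
def gradual_commit(graph, local_conf):
    """Cautious collective classification sketch (Gradual Commit style).

    graph: dict mapping node -> list of neighbor nodes.
    local_conf: dict mapping node -> (label, confidence) from a
        hypothetical local (non-relational) classifier.
    Returns a dict mapping node -> committed label.
    """
    committed = {}
    pending = dict(local_conf)
    while pending:
        # Cautious step: commit only the most confident remaining prediction.
        node = max(pending, key=lambda n: pending[n][1])
        committed[node] = pending.pop(node)[0]
        # Re-estimate uncommitted nodes using committed neighbor labels
        # (illustrative homophily assumption: adopt the majority label
        # among already-committed neighbors, keeping local confidence).
        for n in list(pending):
            nbr_labels = [committed[v] for v in graph[n] if v in committed]
            if nbr_labels:
                majority = max(set(nbr_labels), key=nbr_labels.count)
                _, conf = pending[n]
                pending[n] = (majority, conf)
    return committed
```

For example, on a 4-node chain where node 1's local classifier is unconfident and wrong, the committed labels of its neighbors override its weak local guess, which is exactly the behavior that makes cautious use of estimated labels safer than committing everything at once.
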
NRL Publication Release Number: