Contributor(s)
The Pennsylvania State University CiteSeerX Archives
Keywords
algorithm selection
algorithm portfolios
online learning
life-long learning
bandit problem
expert advice
Online Access
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.296.5713
http://www.cs.ubc.ca/~hutter/EARG.shtml/earg/papers08/IDSIA-02-07.pdf
Abstract
Algorithm selection can be performed using a model of runtime distribution, learned during a preliminary training phase. There is a trade-off between the performance of model-based algorithm selection and the cost of learning the model. In this paper, we treat this trade-off in the context of bandit problems. We propose a fully dynamic and online algorithm selection technique, with no separate training phase: all candidate algorithms are run in parallel, while a model incrementally learns their runtime distributions. A redundant set of time allocators uses the partially trained model to propose machine time shares for the algorithms. A bandit problem solver mixes the model-based shares with a uniform share, gradually increasing the impact of the best time allocators as the model improves. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark, and with a set of solvers for the Auction Winner Determination problem.
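The interleaving described in the abstract can be illustrated with a minimal, self-contained Python sketch. This is not the authors' implementation: the make_solver generators, uniform_share, model_share, the epsilon-greedy bandit step, and the reward definition are illustrative assumptions only (the paper learns full runtime distributions and uses a dedicated bandit problem solver over a redundant set of time allocators).

import math
import random

# Hypothetical candidate "solvers": each is a generator that does one unit of
# work per step and eventually reports success. These are stand-ins for the
# real SAT / winner-determination solvers used in the paper's experiments.
def make_solver(mean_steps):
    def run():
        steps = max(1, int(random.expovariate(1.0 / mean_steps)))
        for _ in range(steps - 1):
            yield False   # still running
        yield True        # solved
    return run

CANDIDATES = {"solver_a": make_solver(30), "solver_b": make_solver(80)}

def uniform_share(model):
    # Uniform time allocator: ignores the runtime model entirely.
    return {name: 1.0 / len(CANDIDATES) for name in CANDIDATES}

def model_share(model):
    # Model-based time allocator: favors the algorithm with the lowest mean
    # observed runtime so far (the paper learns full runtime distributions).
    means = {n: sum(ts) / len(ts) for n, ts in model.items() if ts}
    if len(means) < len(CANDIDATES):
        return uniform_share(model)
    best = min(means, key=means.get)
    rest = 0.1 / (len(CANDIDATES) - 1)
    return {n: (0.9 if n == best else rest) for n in CANDIDATES}

ALLOCATORS = [uniform_share, model_share]

def solve_benchmark(n_instances=50, slice_per_round=10, epsilon=0.2):
    model = {name: [] for name in CANDIDATES}   # incremental runtime model
    reward = [0.0] * len(ALLOCATORS)            # bandit statistics per allocator
    pulls = [0] * len(ALLOCATORS)
    total_time = 0

    for _ in range(n_instances):
        runs = {name: CANDIDATES[name]() for name in CANDIDATES}
        spent = {name: 0 for name in CANDIDATES}
        solved = False
        while not solved:
            # Bandit step (epsilon-greedy here): choose which time allocator
            # to trust for this round of the current instance.
            if random.random() < epsilon or 0 in pulls:
                k = random.randrange(len(ALLOCATORS))
            else:
                k = max(range(len(ALLOCATORS)), key=lambda i: reward[i] / pulls[i])
            pulls[k] += 1
            shares = ALLOCATORS[k](model)

            # Run all candidates "in parallel" according to the proposed shares.
            for name, share in shares.items():
                for _ in range(int(round(share * slice_per_round))):
                    spent[name] += 1
                    total_time += 1
                    if next(runs[name]):
                        model[name].append(spent[name])  # update runtime model
                        reward[k] += 1.0 / spent[name]   # faster solve, higher reward
                        solved = True
                        break
                if solved:
                    break
    return total_time

random.seed(1)
print("total time units over the benchmark:", solve_benchmark())

Even in this toy form, the uniform share keeps every candidate making progress while the model is still poor, and the bandit shifts machine time toward the model-based allocator only as its proposals start paying off, which is the trade-off the abstract describes.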
Date
2013-08-13
Type
text
Identifier
oai:CiteSeerX.psu:10.1.1.296.5713
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.296.5713