The K-armed Dueling Bandits Problem

Author(s)
Yisong Yue
Josef Broder
Robert Kleinberg
Thorsten Joachims
Contributor(s)
The Pennsylvania State University CiteSeerX Archives

URI
http://hdl.handle.net/20.500.12424/801834
Online Access
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.251
http://www.yisongyue.com/publications/colt2009_dueling_bandit.pdf
Abstract
We study a partial-information online-learning problem where actions are restricted to noisy comparisons between pairs of strategies (also known as bandits). In contrast to conventional approaches that require the absolute reward of the chosen strategy to be quantifiable and observable, our setting assumes only that (noisy) binary feedback about the relative reward of two chosen strategies is available. This type of relative feedback is particularly appropriate in applications where absolute rewards have no natural scale or are difficult to measure (e.g., user-perceived quality of a set of retrieval results, taste of food, product attractiveness), but where pairwise comparisons are easy to make. We propose a novel regret formulation in this setting and present an algorithm that achieves (almost) information-theoretically optimal regret bounds (up to a constant factor).
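To make the feedback model in the abstract concrete, the following is a minimal sketch of noisy pairwise comparisons between strategies. It is not the algorithm from the paper. The preference matrix `PREF`, its values, and the helper names `duel` and `estimate_win_rate` are hypothetical and introduced only for illustration; the abstract states only that the learner observes noisy binary feedback about the relative reward of two chosen strategies.

```python
import random

# Hypothetical preference matrix (an assumption, not taken from the paper):
# PREF[i][j] is the probability that strategy i wins one comparison against j.
PREF = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.6],
    [0.1, 0.4, 0.5],
]


def duel(i, j, rng):
    """Return True if strategy i wins a single noisy comparison against j."""
    return rng.random() < PREF[i][j]


def estimate_win_rate(i, j, trials, rng):
    """Estimate P(i beats j) using only binary comparison outcomes."""
    wins = sum(duel(i, j, rng) for _ in range(trials))
    return wins / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
rate = estimate_win_rate(0, 2, 5000, rng)
print(f"estimated P(strategy 0 beats strategy 2) = {rate:.3f}")
```

The learner never sees an absolute reward here, only which of the two strategies won each comparison, which is the restriction the abstract describes.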
Date
2010-12-30
Type
text
Identifier
oai:CiteSeerX.psu:10.1.1.180.251
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.251
Copyright/License
Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Collections
OAI Harvested Content
