
Importance sampling for reinforcement learning with multiple objectives

Author(s)
Shelton, Christian R. (Christian Robert), 1975-
Contributor(s)
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Tomaso Poggio.
Keywords
Electrical Engineering and Computer Science.

URI
http://hdl.handle.net/20.500.12424/1003790
Online Access
http://hdl.handle.net/1721.1/86774
Abstract
This thesis considers three complications that arise when reinforcement learning is applied to a real-world problem. In the process of using reinforcement learning to build an adaptive electronic market-maker, we find that the sparsity of data, the partial observability of the domain, and the multiple objectives of the agent cause serious problems for existing reinforcement learning algorithms. We employ importance sampling (likelihood ratios) to achieve good performance in partially observable Markov decision processes with few data. Our importance sampling estimator requires no knowledge about the environment and places few restrictions on the method of collecting data. It can be used efficiently with reactive controllers, finite-state controllers, or policies with function approximation. We present theoretical analyses of the estimator and incorporate it into a reinforcement learning algorithm. Additionally, this method provides a complete return surface which can be used to balance multiple objectives dynamically. We demonstrate the need for multiple goals in a variety of applications and present natural solutions based on our sampling method. The thesis concludes with example results from applying our algorithm to the domain of automated electronic market-making.
by Christian Robert Shelton.
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.
Includes bibliographical references (p. 115-118).
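
The core technique named in the abstract can be stated compactly: weight each observed return by the ratio of the probability the evaluated policy assigns to the observed actions to the probability under the policy that collected the data; the unknown environment dynamics cancel out of that ratio, which is why no model of the environment is needed. The sketch below illustrates that general idea only and is not the thesis's algorithm; the tabular policy representation, the self-normalized weighting, and all names are assumptions introduced for illustration.

```python
def importance_sampled_return(trajectories, target_policy, behavior_policy):
    """Estimate the expected return of target_policy from trajectories
    collected under behavior_policy, via per-trajectory likelihood ratios.

    trajectories: list of trajectories, each a list of
        (state, action, reward) triples.
    target_policy, behavior_policy: dicts mapping a state to a dict of
        action probabilities (a tabular stand-in; the thesis also treats
        finite-state controllers and function approximation).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for traj in trajectories:
        ratio = 1.0   # likelihood ratio for the whole trajectory
        ret = 0.0     # undiscounted return of the trajectory
        for state, action, reward in traj:
            # Only action probabilities enter the ratio; the environment's
            # transition probabilities cancel, so no model is required.
            ratio *= target_policy[state][action] / behavior_policy[state][action]
            ret += reward
        weighted_sum += ratio * ret
        weight_total += ratio
    # Self-normalized (weighted) importance sampling: slightly biased,
    # but usually much lower variance than dividing by len(trajectories).
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# Hypothetical two-action check: data gathered under a uniform behavior
# policy, evaluated for a policy that prefers action "a".
behavior = {"s": {"a": 0.5, "b": 0.5}}
target = {"s": {"a": 0.9, "b": 0.1}}
data = [[("s", "a", 1.0)], [("s", "b", 0.0)], [("s", "a", 1.0)]]
print(importance_sampled_return(data, target, behavior))  # ~0.95; true value 0.9
```

Because the estimator yields a value for any candidate policy from the same fixed data, it traces out the "complete return surface" the abstract mentions, which is what allows multiple objectives to be re-balanced without collecting new data.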
Date
2014-05-07
Type
Thesis
Identifier
oai:dspace.mit.edu:1721.1/86774
http://hdl.handle.net/1721.1/86774
49837623
Copyright/License
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.
Collections
OAI Harvested Content
