Font Size: a A A

Thompson Sampling for the Control of a Queue with Demand Uncertaint

Posted on:2018-11-13Degree:M.A.SType:Thesis
University:University of Toronto (Canada)Candidate:Gimelfarb, MichaelFull Text:PDF
GTID:2478390020957288Subject:Operations Research
Abstract/Summary:PDF Full Text Request
We study an admission control problem in which the customer arrival rate is unknown and needs to be learned from data using Bayesian inference. Two key defining features of this model are that: (1) when the arrival rate is known, the DP equations can be solved explicitly to obtain the optimal policy over the infinite horizon, and (2) uninformative actions are unavoidable and occur infinitely often.;We extend the standard proof techniques for Thompson sampling to admission control, in which uninformative actions occur infinitely often, and show that asymptotically optimal convergence rates of the posterior error and worst-case average regret are achieved. Finally, we show that under simple assumptions, our techniques generalize to a broader class of policies, which we call Generalized Thompson sampling. We show that this class of policies achieves asymptotically optimal convergence rates and can outperform standard Thompson sampling in numerical simulation.
Keywords/Search Tags:Thompson sampling
PDF Full Text Request
Related items