
The Erlang(k) Bandit Sampling Process

Posted on: 2007-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liang
Full Text: PDF
GTID: 2190360215986462
Subject: Probability theory and mathematical statistics
Abstract/Summary:
This thesis considers Bandit sampling processes, which comprise two basic models: Bandit reward processes and Bandit target processes. Gittins studied Bandit sampling processes based on common distributions such as the Bernoulli distribution and the negative exponential distribution. Building on Gittins' results, more widely applicable Bandit sampling processes are investigated by means of dynamic-programming backward induction and the Bayesian approach. The main contributions of this thesis are as follows.

Firstly, important properties of several special Bandit sampling processes are examined. The posterior distribution of the unknown parameter, the conditional distribution of the sampling values, and the sampling-reward function are derived, and their monotonicity is discussed.

Secondly, the optimal decision problem of the Erlang(k) Bandit reward process, whose sampling values follow an Erlang(k) distribution, is discussed. An algorithm is constructed to calculate the sequence of break-even values characterizing the optimal selection of arms, and the asymptotic properties of the Gittins index and of the break-even values are studied. The optimal decision problem of the Erlang(k) Bandit reward process can therefore be solved effectively. Replacing the negative exponential distribution with an Erlang(k) distribution is more practical and extends the Bandit reward process.

Thirdly, the optimal decision problem of the Erlang(2) Bandit target process, whose sampling values follow an Erlang(2) distribution, is studied, and an algorithm is presented to calculate the sequence of break-even values characterizing the optimal selection of arms. Replacing the negative exponential distribution with an Erlang(2) distribution is more practical and extends the Bandit target process.

Lastly, random sampling times are incorporated into the models. To date, almost all Bandit processes have assumed uniform or geometric discounting rather than real-time discounting, which is more appropriate in some practical settings. A special Bandit reward process with unrestricted switching times is examined, in which the random sampling times follow a negative exponential distribution and the sampling values follow an Erlang(2) distribution. The monotonicity of the Gittins index of this process is discussed, and an algorithm is constructed to compute its optimal stopping times. Considering random sampling times (real-time discounting) is more practical, and using an Erlang(2) distribution supplements and extends Bandit reward processes.
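A minimal sketch of the Bayesian updating behind the first part, under an assumed conjugate setup (not necessarily the exact formulation of the thesis): if the unknown rate theta of the Erlang(k) sampling values is given a Gamma(a, b) prior (shape a, rate b), the posterior after n observations is Gamma(a + n*k, b + sum of observations), and the posterior mean of one sampling reward E[k/theta] equals k*b'/(a'-1) whenever a' > 1. The Python code below is only an illustration; the function names and the prior parameters are hypothetical.

def erlang_posterior(a, b, k, samples):
    # Gamma(a, b) prior on the unknown Erlang(k) rate theta; the likelihood of
    # n samples is proportional to theta**(n*k) * exp(-theta * sum(samples)),
    # so the posterior is again a gamma distribution.
    return a + k * len(samples), b + sum(samples)

def expected_sampling_reward(a, b, k):
    # Posterior mean of one sampling value, E[k/theta] = k*b/(a-1); finite only for a > 1.
    if a <= 1:
        raise ValueError("posterior shape must exceed 1 for a finite mean")
    return k * b / (a - 1)

# Illustrative use for an Erlang(2) arm with a hypothetical Gamma(3, 2) prior.
a_post, b_post = erlang_posterior(3.0, 2.0, k=2, samples=[1.4, 0.9])
print(a_post, b_post, expected_sampling_reward(a_post, b_post, k=2))

Each new observation increases the shape parameter by k and the rate parameter by the observed value, which is the kind of monotone posterior evolution the first part of the thesis studies.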
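As a small illustration of the real-time discounting in the last part (a sketch under stated assumptions, not the thesis's own construction): when the random sampling time T of a pull is exponentially distributed with rate mu and rewards are discounted continuously at rate c, the expected per-sample discount factor is E[exp(-c*T)] = mu/(mu + c). The Monte Carlo check below, with hypothetical parameter values, verifies this identity numerically.

import math
import random

def mc_discount_factor(mu, c, n=200000, seed=0):
    # Monte Carlo estimate of E[exp(-c*T)] for T ~ Exp(mu).
    rng = random.Random(seed)
    return sum(math.exp(-c * rng.expovariate(mu)) for _ in range(n)) / n

mu, c = 2.0, 0.5
print(mc_discount_factor(mu, c), mu / (mu + c))  # the two numbers should agree closely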
Keywords/Search Tags: Bandit sampling process, Gittins index, break-even values, Bayesian approach, Erlang(k) distribution