Modelling And Statistical Inference For Seasonal Integer-valued Time Series

Posted on: 2020-10-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S Q Tian  Full Text: PDF
GTID: 1360330575478808  Subject: Probability theory and mathematical statistics
Abstract/Summary:
In real-life applications, seasonal integer-valued time series data are encountered in many fields, and they often exhibit similarities between seasonal periods. Examples include the number of hospital emergency service arrivals caused by diseases with seasonal behaviour, the monthly number of disability benefit claims made by injured workers in the logging industry, the number of traffic accidents, some criminal data, and annual relative sunspot numbers. These seasonal phenomena may stem from factors such as weather or from the intrinsic nature of the events. To model this kind of data, some researchers use covariates or let the model parameters change periodically, which makes their models non-stationary and overly complex in structure. Based on the binomial thinning operator, other researchers proposed a seasonal INAR process with Poisson marginal distribution. However, these models cannot effectively handle productive data generating schemes or over-dispersed seasonal data. Moreover, the study of this kind of data has not received much attention in the literature so far. In this thesis, we study modelling and statistical inference for seasonal integer-valued time series data.

Firstly, we introduce two popular thinning operators used to model integer-valued time series under the thinning scheme: the binomial thinning operator and the negative binomial thinning operator. We assume $X$ is a non-negative integer-valued random variable, and let $\alpha, \beta$
$\in [0,1)$. Then the binomial thinning operator, denoted by "$\circ$", is defined by $\alpha \circ X = \sum_{i=1}^{X} V_i$, where $\{V_i\}$ is a sequence of i.i.d. Bernoulli random variables with success probability $P(V_i = 1) = \alpha$. The negative binomial thinning operator, denoted by "$*$", is defined as $\beta * X = \sum_{i=1}^{X} W_i$, where $\{W_i\}$ is a sequence of i.i.d. geometric random variables with probability mass function $P(W_i = m) = \beta^m/(1+\beta)^{m+1}$, $m \in \mathbb{N}_0$. In the following, we introduce the main results of this thesis.

1. Modelling and statistical inference for the seasonal geometric integer-valued autoregressive process (SGINAR(s))

To model over-dispersed seasonal integer-valued time series data with a productive data generating scheme, we propose, based on the negative binomial thinning operator, a seasonal integer-valued autoregressive process with geometric marginal distribution, denoted by SGINAR(s). The definition is as follows.

Definition 1 A non-negative integer-valued process $\{X_t\}_{t \in \mathbb{N}}$ given by
$X_t = \alpha * X_{t-s} + \varepsilon_t$, (1)
is said to be a seasonal geometric integer-valued autoregressive process (SGINAR(s)) if the following conditions are satisfied:
(i) $\alpha \in [0,1)$, $s \in \mathbb{N}$ denotes the seasonal period, and we let $s \geq 2$;
(ii) $*$ is the negative binomial thinning operator defined previously, and all thinning operations are performed mutually independently;
(iii) $\{X_t\}_{t \in \mathbb{N}}$ is a sequence of geometrically distributed random variables with expectation $\mu$;
(iv) $\{\varepsilon_t\}$ is a sequence of i.i.d. non-negative integer-valued random variables, independent of $X_{t-l}$ ($l \geq$
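$1$) and of the random variable sequence $\{W_i\}$ in the thinning operator;
(v) the geometric random variables $\{W_i\}$ in the thinning term $\alpha * X_t$ are independent of $X_{t-1}, X_{t-2}, \dots$.

Proposition 1 states the probability mass function of the innovation $\varepsilon_t$.

Proposition 1 The random variable $\varepsilon_t$ is a mixture of two random variables with geometric($\mu/(1+\mu)$) and geometric($\alpha/(1+\alpha)$) distributions. Its probability mass function is given by
$P(\varepsilon_t = l) = \left(1 - \frac{\alpha\mu}{\mu-\alpha}\right)\frac{\mu^l}{(1+\mu)^{l+1}} + \frac{\alpha\mu}{\mu-\alpha}\cdot\frac{\alpha^l}{(1+\alpha)^{l+1}}$, $l \in \mathbb{N}_0$,
where $\alpha \in [0, \mu/(1+\mu)]$.

Proposition 2 states some important probabilistic properties of the SGINAR(s) process.

Proposition 2 Suppose $\{X_t\}$ is an SGINAR(s) process, and let $X_{ks+j}$ and $X_{(k+h)s+i}$ be any two random variables from the process, written as $X_k^{(j)} := X_{ks+j}$ and $X_{k+h}^{(i)} := X_{(k+h)s+i}$, with $h \in \mathbb{N}$, $k \in \mathbb{N}_0$, $i, j \in \{1, 2, \dots, s\}$. Then the following results hold:
(i) The conditional expectation of $X_{k+h}^{(i)}$ given $X_k^{(j)}$ has a closed form (the displayed formula is missing from this text), and as $h \to \infty$, $E(X_{k+h}^{(i)} \mid X_k^{(j)}) \to \mu$, which is the unconditional mean.
(ii) The conditional variance of $X_{k+h}^{(i)}$ given $X_k^{(j)}$ has a closed form (the displayed formula is missing from this text) involving $\sigma_\varepsilon^2 = (1+\alpha)\mu[(1+\mu)(1-\alpha) - \alpha]$. As $h \to \infty$, $\mathrm{Var}(X_{k+h}^{(i)} \mid X_k^{(j)}) \to \mu(1+\mu)$, which is the unconditional variance.
(iii) The covariance of $X_k^{(j)}$ and $X_{k+h}^{(i)}$ has a closed form (the displayed formula is missing from this text). When $i = j$, the autocorrelation function $\rho(hs) = \alpha^h$ decays exponentially as $h \to \infty$.

The following theorem states that the SGINAR(s) process is stationary and ergodic.

Theorem 1 If $\{X_t\}$ is an SGINAR(s) process, then it is an irreducible, aperiodic and positive recurrent (and hence ergodic) $s$-step Markov chain. Moreover, the unique stationary marginal distribution can be expressed in terms of the innovation process $\{\varepsilon_t\}$ as an infinite series (the displayed formula and the definition of $\alpha^{k}$ for $k \geq 1$ are missing from this text), with the convention $\alpha^{0} * \varepsilon_t := \varepsilon_t$. For all $t \in$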
$\mathbb{N}$, the infinite series is understood as the limit in probability of the finite sums.

Next, we discuss parameter estimation for the SGINAR(s) process by three methods: conditional least squares (CLS) estimation, Yule-Walker (YW) estimation and conditional maximum likelihood (CML) estimation. Moreover, we prove that the CLS and YW estimators are asymptotically normal, and that the CML estimators are consistent under assumptions (C1) and (C2).

Theorem 2 Let $\hat{\theta}_{cls} = (\hat{\alpha}_{cls}, \hat{\mu}_{cls})^T$ be the conditional least squares estimators. Then they are asymptotically normal.

Theorem 3 Let $\hat{\theta}_{cls} = (\hat{\alpha}_{cls}, \hat{\mu}_{cls})^T$ and $\hat{\theta}_{yw} = (\hat{\alpha}_{yw}, \hat{\mu}_{yw})^T$ denote the conditional least squares estimators and the Yule-Walker estimators, respectively. Then they have the same asymptotic distribution, i.e., they are asymptotically equivalent.

(C1) The observed process $\{X_t\}_{t=1}^{n}$ is generated from the SGINAR(s) process with true parameter $\theta_0 \in \Theta$, where the parameter space $\Theta \subset (0,1) \times \mathbb{R}^+$ is a compact subset of $\mathbb{R}^2$.
(C2) The SGINAR(s) model is identifiable, i.e., $p_\theta \neq p_{\theta_0}$ if $\theta \neq \theta_0$, where $p_\theta$ denotes the conditional distribution of $X_t$ with parameter $\theta$.

Theorem 4 Under assumptions (C1)-(C2), the conditional maximum likelihood estimators $\hat{\theta}_{cml}$ are consistent.

We also consider the forecasting problem. Based on the observations $\{X_1, X_2, \dots, X_n\}$ from the SGINAR(s) process, the $h$-step-ahead forecast is given by $\hat{X}_{n+h} = \hat{\alpha}^q (X_{n+h-qs} - \hat{\mu}) + \hat{\mu}$, where $\hat{\alpha}, \hat{\mu}$
are estimators of $\alpha$ and $\mu$, respectively.

The following proposition formalizes some properties of the $h$-step-ahead conditional expectation and conditional variance.

Proposition 3 Let $\{X_t\}$ be an SGINAR(s) process. Then
(i) $E(X_{t+h} \mid \mathcal{F}_t) = \alpha^q (X_{t+h-qs} - \mu) + \mu$,
(ii) $\mathrm{Var}(X_{t+h} \mid \mathcal{F}_t) = \frac{1-\alpha^{2q}}{1-\alpha^2}\sigma_\varepsilon^2 + \frac{(1+\alpha)\alpha^q(1-\alpha^q)}{1-\alpha} X_{t+h-qs} + \alpha(1+\alpha)\left[\frac{1-\alpha^{2(q-1)}}{1-\alpha^2} - \frac{\alpha^{q-1}(1-\alpha^{q-1})}{1-\alpha}\right]\mu$,
(iii) $\lim_{h \to \infty} E(X_{t+h} \mid \mathcal{F}_t) = \mu$,
(iv) $\lim_{h \to \infty} \mathrm{Var}(X_{t+h} \mid \mathcal{F}_t) = \mu(1+\mu)$,
where $h \in \mathbb{N}$ and $q = [h/s]$.

We also examine sample paths and ACF plots of the SGINAR(s) process and compare the performance of the three estimation methods. The results show that CML is superior to the other two methods. Finally, we apply the new model to two real data sets: U.S. polio counts and annual relative sunspot numbers from 1936 to 1972. Comparing the goodness of fit of all candidate models, the SGINAR(s) model shows the best performance for this kind of count time series.

2. Modelling and statistical inference for the combined seasonal geometric integer-valued autoregressive process (CSGINAR$_{1,s}$)

In real-life situations, over-dispersed seasonal integer-valued time series data often present serial dependence (lag-1 dependence) as well. Motivated by this, we propose a seasonal geometric integer-valued autoregressive process with a combined structure, defined as follows.

Definition 2 A non-negative integer-valued process $\{X_t\}_{t \in \mathbb{N}}$ given by its defining equation (missing from this text) is said to be a combined seasonal geometric integer-valued autoregressive (CSGINAR$_{1,s}$) process if the following conditions are met:
(i) $p \in [0,1]$, $\alpha, \beta \in [0,1)$, $s \in \mathbb{N}$ denotes the seasonal period, and $s \geq$
$2$;
(ii) $*$ is the negative binomial thinning operator defined previously, and all thinnings are performed independently of each other;
(iii) $\{X_t\}_{t \in \mathbb{N}}$ are geometrically distributed with expectation $\mu$ ($\mu > 0$);
(iv) $\{\varepsilon_t\}$ is a sequence of i.i.d. non-negative integer-valued random variables, independent of all $X_m$, $\alpha * X_m$, $\beta * X_m$ ($m < t$);
(v) the geometrically distributed sequences $\{W_i^{(\alpha)}\}$ and $\{W_i^{(\beta)}\}$ in the thinnings of $X_t$ are independent of the random variables $X_{t-1}, X_{t-2}, \dots$;
(vi) the conditional probabilities $P(\alpha * X_t = i_1, \beta * X_t = i_2 \mid \mathcal{F}_t)$ and $P(\alpha * X_t = i_1, \beta * X_t = i_2 \mid X_t = x)$ are equal, where $\mathcal{F}_t$ is the history of all random variables $X_m$, $\alpha * X_m$ and $\beta * X_m$ ($m < t$).

In the following, we study some important properties of the CSGINAR$_{1,s}$ process. Proposition 4 states the probability mass function of the innovation $\varepsilon_t$ in the CSGINAR$_{1,s}$ process.

Proposition 4 The random variable $\varepsilon_t$ can be represented as a mixture of three geometrically distributed random variables (the displayed mixture is missing from this text), where $\alpha, \beta \in [0, \mu/(1+\mu)]$, and $a$, $b$ are the solutions of the equation $x^2 - \{\alpha[1 + (1-p)\mu] + \beta(1 + p\mu)\}x + \alpha\beta(1+\mu) = 0$ with $a < b$.

The next proposition gives the autocorrelation function of the process.

Proposition 5 The autocorrelation function of the CSGINAR$_{1,s}$ process is given by $\rho(k) = \alpha p\,\rho(|k-1|) + \beta(1-p)\,\rho(|k-s|)$, $k \in \mathbb{N}$. Furthermore, $\rho(k)$ decreases to 0 at an exponential rate as $k$ tends to infinity.

Proposition 6 formalizes the one-step-ahead conditional probability generating function, conditional expectation and conditional variance of the CSGINAR$_{1,s}$ model.

Proposition 6 Let $\{X_t\}$ be a CSGINAR$_{1,s}$ process, $\phi_1 = \alpha p$, $\phi_2 = \beta(1-p)$, and $\phi = \phi_1 + \phi_2$. Then
(i) the one-step-ahead conditional probability generating function (cpgf) $E(s^{X_{t+1}} \mid \mathcal{F}_t)$ has a closed form (the displayed formula is missing from this text);
(ii) the one-step-ahead conditional expectation is $E(X_{t+1} \mid \mathcal{F}_t) = \phi_1 X_t + \phi_2 X_{t-s+1} + (1-\phi)\mu$;
(iii) the one-step-ahead conditional variance is
$\mathrm{Var}(X_{t+1} \mid \mathcal{F}_t) = \mathrm{Var}(\varepsilon_t) + p(1-p)(\alpha X_t - \beta X_{t-s+1})^2 + \alpha p(1+\alpha)X_t + \beta(1-p)(1+\beta)X_{t-s+1}$
$= (1+\phi^2)\mu^2 + (1-\phi)\mu - 2\mu(1+\mu)\left(\frac{\phi_1^2}{p} + \frac{\phi_2^2}{1-p}\right) + p(1-p)(\alpha X_t - \beta X_{t-s+1})^2 + \phi_1(1+\alpha)X_t + \phi_2(1+\beta)X_{t-s+1}$,
where $p_1, p_2, p_3$ and $a, b$ are the same as in
Proposition 4.

The following theorem states that the CSGINAR$_{1,s}$ process is strictly stationary and ergodic.

Theorem 5 The process $\{X_t\}$ given by CSGINAR$_{1,s}$ is strictly stationary and ergodic.

We then study parameter estimation by three methods: two-step conditional least squares estimation, modified two-step Yule-Walker estimation and conditional maximum likelihood estimation. Under assumptions (C3) and (C4), we prove that the conditional maximum likelihood estimators are consistent.

(C3) The observed process $\{X_t\}_{t=1}^{n}$ is generated from the CSGINAR$_{1,s}$ model with true parameter vector $\theta_0 \in \Theta$, where the parameter space $\Theta \subset \mathbb{R}^+ \times (0,1) \times (0,1) \times (0,1)$ is a compact subset of $\mathbb{R}^4$.
(C4) The CSGINAR$_{1,s}$ model is identifiable, i.e., $p_\theta \neq p_{\theta_0}$ if $\theta \neq \theta_0$, where $p_\theta$ denotes the conditional distribution of $X_t$ with parameter $\theta$.

Theorem 6 Under assumptions (C3)-(C4), the conditional maximum likelihood estimators are consistent.

We also examine sample paths and ACF plots of the CSGINAR$_{1,s}$ process and compare the performance of the three estimation methods. The results show that conditional maximum likelihood estimation performs better than the other two methods. Finally, we apply the new model to the U.S. polio counts. Comparing the goodness of fit of all candidate models, the CSGINAR$_{1,s}$ model shows the best performance for this data set.

3. Modelling and statistical inference for the r-states random seasonal environment integer-valued autoregressive process (RrSINAR(s))

As the research deepened, we found that the seasonal environmental conditions under which the counted objects exist may vary over time, which could significantly affect the frequency of the recorded random events. To model this kind of data, we propose an r-states random seasonal environment integer-valued autoregressive process, denoted by RrSINAR(s). Firstly, we define the random
seasonal environment variables as follows.

Definition 3 Suppose $\{Z_k\}$, $k \in \{1, 2, \dots, s\}$, is a sequence of random variables. If $\{Z_k\}$ is a Markov chain taking values in $E_r = \{1, 2, \dots, r\}$, $r \in \{1, 2, \dots, s\}$, then $\{Z_k\}$ is called the r-states random seasonal environment variables.

Next, we give the definition of the RrSINAR(s) process.

Definition 4 Let $M = \{\mu_1, \mu_2, \dots, \mu_r\}$, where for each $i \in E_r$ we have $\mu_i > 0$ and $\alpha_i \in (0,1)$. If a sequence of non-negative integer-valued random variables $\{X_t(Z_k)\}_{t \in \mathbb{N}}$ satisfies the defining equation (missing from this text), then $\{X_t(Z_k)\}_{t \in \mathbb{N}}$ is called the r-states random seasonal environment integer-valued autoregressive process, denoted by RrSINAR(s), where
(i) $s \in \mathbb{N}$ denotes the seasonal period, and let $s > 2$;
(ii) $\{Z_k\}$ is the r-states random seasonal environment variables, with $t = ms + k$, $k \in \{1, 2, \dots, s\}$, $m \in \mathbb{N}_0$;
(iii) $\{W_i(\alpha_{Z_k})\}$ is a sequence of i.i.d. random variables, the counting sequence in the thinning operator, with expectation $\alpha_{Z_k}$ and independent of all the other random variables;
(iv) the expectation of $X_t(Z_k)$ is $\mu_{Z_k}$;
(v) for $z \in E_r$, $\{\varepsilon_t(z)\}$ is a sequence of i.i.d. non-negative integer-valued random variables, independent of $\{Z_k\}$; moreover, for $t > m$ and $l \in E_r$, $\varepsilon_t(z)$ is independent of $X_m(l)$.

To model seasonal time series data with different dispersion and generating schemes more effectively, we construct two specific random seasonal environment models based on the RrSINAR(s) process, defined as follows.

Definition 5 If a non-negative integer-valued process $\{X_t(z_k)\}_{t \in \mathbb{N}}$, $k \in \{1, 2, \dots, s\}$, satisfies
$X_t(z_k) = \alpha_{z_k} * X_{t-s}(z_k) + \varepsilon_t(z_k)$, (4)
then $\{X_t(z_k)\}_{t \in \mathbb{N}}$ is said to be an r-states random seasonal environment geometric INAR(s) process based on the negative binomial thinning operator (RrSGINAR(s)), where
(i) $s \in \mathbb{N}$ denotes the seasonal period, and let $s \geq$
$2$;
(ii) $z_k$ is the observation of the $k$-th random seasonal environment variable, $z_k \in E_r$, and $t = ms + k$, $k \in \{1, 2, \dots, s\}$, $m \in \mathbb{N}_0$;
(iii) $\alpha_{z_k} *$ is the negative binomial thinning operator defined previously, whose counting sequence has expectation $\alpha_{z_k}$ and is independent of the other variables;
(iv) $\{X_t(z_k)\}_{t \in \mathbb{N}}$ is geometrically distributed with expectation $\mu_{z_k}$;
(v) $\{\varepsilon_t(z_k)\}$ is a sequence of i.i.d. non-negative integer-valued random variables, independent of $X_{t-l}(z_j)$ ($l \geq 1$), where $t - l = m_0 s + j$, $j \in \{1, 2, \dots, s\}$, $m_0 \in \mathbb{N}_0$.

Definition 6 If a non-negative integer-valued process $\{X_t(z_k)\}_{t \in \mathbb{N}}$, $k \in \{1, 2, \dots, s\}$, satisfies
$X_t(z_k) = \alpha_{z_k} \circ X_{t-s}(z_k) + \varepsilon_t(z_k)$, (5)
then $\{X_t(z_k)\}_{t \in \mathbb{N}}$ is said to be an r-states random seasonal environment Poisson INAR(s) process based on the binomial thinning operator (RrSPINAR(s)), where
(i) $s \in \mathbb{N}$ denotes the seasonal period, and let $s \geq 2$;
(ii) $z_k$ is the observation of the $k$-th random seasonal environment variable, $z_k \in E_r$, and $t = ms + k$, $k \in \{1, 2, \dots, s\}$, $m \in \mathbb{N}_0$;
(iii) $\alpha_{z_k} \circ$ is the binomial thinning operator defined previously, whose counting sequence has expectation $\alpha_{z_k}$ and is independent of the other variables;
(iv) $\{X_t(z_k)\}_{t \in \mathbb{N}}$ is Poisson distributed with expectation $\mu_{z_k}$;
(v) $\{\varepsilon_t(z_k)\}$ is a sequence of i.i.d. non-negative integer-valued random variables, independent of $X_{t-l}(z_j)$ ($l \geq 1$), where $t - l = m_0 s + j$, $j \in \{1, 2, \dots, s\}$, $m_0 \in \mathbb{N}_0$.

The following two propositions present the distribution of the innovation $\varepsilon_t(z_k)$ of the RrSGINAR(s) and RrSPINAR(s) processes.

Proposition 7 The random variable $\varepsilon_t(z_k)$ in the RrSGINAR(s) process is a mixture of two geometric random variables whose expectations are $\mu_{z_k}$ and $\alpha_{z_k}$, respectively. Its probability mass function is given by
$P(\varepsilon_t(z_k) = m) = \left(1 - \frac{\alpha_{z_k}\mu_{z_k}}{\mu_{z_k} - \alpha_{z_k}}\right)\frac{\mu_{z_k}^m}{(1+\mu_{z_k})^{m+1}} + \frac{\alpha_{z_k}\mu_{z_k}}{\mu_{z_k} - \alpha_{z_k}}\cdot\frac{\alpha_{z_k}^m}{(1+\alpha_{z_k})^{m+1}}$,
where $\alpha_{z_k} \in [0, \mu_{z_k}/(1+\mu_{z_k})]$ and $t = m_0 s + k$, $k \in \{1, 2, \dots, s\}$, $m, m_0 \in$
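$\mathbb{N}_0$.

Proposition 8 The random variable $\varepsilon_t(z_k)$ in the RrSPINAR(s) process is Poisson distributed with expectation $(1-\alpha_{z_k})\mu_{z_k}$. Its probability mass function is given by
$P(\varepsilon_t(z_k) = m) = \frac{[(1-\alpha_{z_k})\mu_{z_k}]^m}{m!} e^{-(1-\alpha_{z_k})\mu_{z_k}}$,
where $t = m_0 s + k$, $k \in \{1, 2, \dots, s\}$, $m, m_0 \in \mathbb{N}_0$.

The next two propositions state some important probabilistic properties of the RrSGINAR(s) and RrSPINAR(s) processes.

Proposition 9 Let $\{X_t(z_k)\}$ be generated from the RrSGINAR(s) process, and denote any two random variables by $X_m^{(i)}(z_i) = X_{ms+i}(z_i)$ and $X_{m+h}^{(j)}(z_j) = X_{(m+h)s+j}(z_j)$, with $h \in \mathbb{N}$, $m \in \mathbb{N}_0$, $i, j \in \{1, 2, \dots, s\}$. Then
(i) given $X_m^{(i)}(z_i)$, the conditional expectation of $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text); as $h \to \infty$, $E(X_{m+h}^{(j)}(z_j) \mid X_m^{(i)}(z_i)) \to \mu_{z_i}$.
(ii) Given $X_m^{(i)}(z_i)$, the conditional variance of $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text) involving $\sigma_\varepsilon^2 = (1+\alpha_{z_i})\mu_{z_i}[(1+\mu_{z_i})(1-\alpha_{z_i}) - \alpha_{z_i}]$; as $h \to \infty$, $\mathrm{Var}(X_{m+h}^{(j)}(z_j) \mid X_m^{(i)}(z_i)) \to \mu_{z_i}(1+\mu_{z_i})$.
(iii) The covariance of $X_m^{(i)}(z_i)$ and $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text); when $i = j$, the autocorrelation function $\rho(hs) = \alpha_{z_i}^h$ decays exponentially to zero as $h \to \infty$.

Proposition 10 Let $\{X_t(z_k)\}$ be generated from the RrSPINAR(s) process, and denote any two random variables by $X_m^{(i)}(z_i)$ and $X_{m+h}^{(j)}(z_j)$, with $h \in \mathbb{N}$, $m \in \mathbb{N}_0$, $i, j \in \{1, 2, \dots, s\}$. Then
(i) given $X_m^{(i)}(z_i)$, the conditional expectation of $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text); as $h \to \infty$, $E(X_{m+h}^{(j)}(z_j) \mid X_m^{(i)}(z_i)) \to \mu_{z_i}$.
(ii) Given $X_m^{(i)}(z_i)$, the conditional variance of $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text); as $h \to \infty$, $\mathrm{Var}(X_{m+h}^{(j)}(z_j) \mid X_m^{(i)}(z_i)) \to \mu_{z_i}$.
(iii) The covariance of $X_m^{(i)}(z_i)$ and $X_{m+h}^{(j)}(z_j)$ has a closed form (the displayed formula is missing from this text), and the autocorrelation function decays exponentially to zero.

We also discuss parameter estimation for the RrSPINAR(s) and RrSGINAR(s) processes. Firstly, using K-means clustering, we partition the observations into $r$ clusters such that all observations within a cluster are under the same seasonal environment. Then we estimate the parameters by conditional least squares and Yule-Walker estimation within each sub-sample. We also use conditional maximum likelihood estimation. Numerical simulations show that conditional maximum likelihood estimation is superior to the other two methods. Finally, we apply our new model to the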
Wolfer sunspot numbers from 1770 to 1868. Comparing the goodness of fit of all candidate models, the RrSGINAR(s) model shows the best performance in this case.
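
To make Definition 1 and Proposition 1 concrete, the following is a minimal simulation sketch of the SGINAR(s) recursion in Python (standard library only). It is an illustration under stated assumptions, not the thesis's own code: the function names (`rgeom`, `nb_thin`, `r_innovation`, `simulate_sginar`, `cls_estimates`) are hypothetical, the first s values are drawn independently from the geometric marginal, and the estimation step is written as an ordinary least squares regression of X_t on X_{t-s}, which follows from E(X_t | X_{t-s}) = alpha X_{t-s} + (1 - alpha) mu but may differ in detail from the CLS estimators studied in the thesis.

```python
import math
import random


def rgeom(mean, rng):
    """Draw from the geometric law on {0, 1, 2, ...} with pmf mean^m / (1 + mean)^(m + 1)."""
    if mean <= 0:
        return 0
    q = mean / (1.0 + mean)        # q = P(W >= 1), so P(W >= m) = q^m
    u = 1.0 - rng.random()         # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(q))  # inversion of the survival function


def nb_thin(alpha, x, rng):
    """Negative binomial thinning alpha * x: sum of x i.i.d. geometric counts with mean alpha."""
    return sum(rgeom(alpha, rng) for _ in range(x))


def r_innovation(alpha, mu, rng):
    """Innovation drawn from the two-component geometric mixture of Proposition 1."""
    w = alpha * mu / (mu - alpha)  # weight of the geometric component with mean alpha
    return rgeom(alpha, rng) if rng.random() < w else rgeom(mu, rng)


def simulate_sginar(n, s, alpha, mu, seed=0):
    """Simulate n values of X_t = alpha * X_{t-s} + eps_t (assumed initialisation: marginal law)."""
    if mu <= 0 or not (0 <= alpha <= mu / (1.0 + mu)):
        raise ValueError("need mu > 0 and 0 <= alpha <= mu / (1 + mu)")
    rng = random.Random(seed)
    x = [rgeom(mu, rng) for _ in range(s)]  # one start value per seasonal position
    for t in range(s, n):
        x.append(nb_thin(alpha, x[t - s], rng) + r_innovation(alpha, mu, rng))
    return x


def cls_estimates(x, s):
    """Least squares fit of X_t on X_{t-s}: slope estimates alpha, intercept / (1 - slope) estimates mu."""
    y, z = x[s:], x[:-s]
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    sxy = sum((a - mz) * (b - my) for a, b in zip(z, y))
    sxx = sum((a - mz) ** 2 for a in z)
    alpha_hat = sxy / sxx
    mu_hat = (my - alpha_hat * mz) / (1.0 - alpha_hat)
    return alpha_hat, mu_hat
```

For example, `simulate_sginar(20000, 4, 0.5, 2.0, seed=1)` produces a path with seasonal period 4, and `cls_estimates` applied to that path should return values near the chosen alpha = 0.5 and mu = 2.0. The constraint alpha <= mu / (1 + mu) from Proposition 1 is checked explicitly because it is what keeps the mixture weight in [0, 1].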
Keywords/Search Tags:Seasonality, thinning operator, integer-valued time series, autoregressive process, random seasonal environment model