Test of fit and model selection based on likelihood function

Abdolreza, Sayyareh (2007) Test of fit and model selection based on likelihood function. PhD thesis Statistique, ISPED, Unité 875 Biostatistique, Université de Bordeaux 2, AgroParistech 2007AGPT0020 p.205.

Full text available as:

- These-Reza-avril2007Ber.pdf ( 1880 Kb )
Licence: Copyright

Abstract

Our work concerns inference about the AIC of Akaike (1973), a special case of penalized likelihood, which, as an estimator of the Kullback-Leibler divergence, is closely related to the maximum likelihood estimator. Within inferential statistics, in the context of hypothesis testing, the Kullback-Leibler divergence and the Neyman-Pearson lemma are two fundamental concepts. Both concern likelihood ratios: the Neyman-Pearson lemma deals with the error rate of the likelihood ratio test, and the Kullback-Leibler divergence is the expectation of the log-likelihood ratio.
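The two quantities named in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the choice of two univariate normal densities, and the sample size are all assumptions made for illustration. It estimates the Kullback-Leibler divergence as the average log-likelihood ratio under the first density, compares that with the known closed form for normal densities, and writes out the usual AIC formula.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    """Log-density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def kl_normal_closed_form(mean_f, sd_f, mean_g, sd_g):
    """Closed-form KL(f || g) for two univariate normal densities."""
    return (math.log(sd_g / sd_f)
            + (sd_f ** 2 + (mean_f - mean_g) ** 2) / (2.0 * sd_g ** 2)
            - 0.5)


def kl_monte_carlo(mean_f, sd_f, mean_g, sd_g, n=50_000, seed=0):
    """Estimate KL(f || g) as the sample mean of log f(X) - log g(X), X ~ f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mean_f, sd_f)
        total += (log_normal_pdf(x, mean_f, sd_f)
                  - log_normal_pdf(x, mean_g, sd_g))
    return total / n


def aic(max_log_likelihood, n_params):
    """AIC = -2 * maximized log-likelihood + 2 * number of free parameters."""
    return -2.0 * max_log_likelihood + 2.0 * n_params


if __name__ == "__main__":
    # Illustrative (assumed) densities: f = N(0, 1) and g = N(1, 1).
    print("closed form :", kl_normal_closed_form(0.0, 1.0, 1.0, 1.0))
    print("Monte Carlo :", kl_monte_carlo(0.0, 1.0, 1.0, 1.0))
    print("AIC example :", aic(-10.0, 2))
```

For these two densities the closed form is exactly 0.5, and the Monte Carlo estimate should fall close to it. The gap between the two is sampling error of the same kind that any likelihood-based estimate of the divergence carries.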

Item Type:PhD Thesis (PhD)
Thesis Supervisor:Bar-hen, Avner
Date:22 June 2007
Board of examiners:Biernacki, Christophe and Saracco, Jérôme and Daudin, Jean Jacques and Commenges, Daniel
Doctoral School:ED 435 AGRICULTURE, ALIMENTATION, BIOLOGIE, ENVIRONNEMENTS ET SANTE
Discipline:Statistics
Collection (Fonds):AgroParistech
Institution:AgroParistech
Department:ISPED, Unité 875 Biostatistique, Université de Bordeaux 2
Subjects:1. Mathematics and Applications
Uncontrolled Keywords:Akaike criterion, Confidence interval, Kullback-Leibler, Logistic regression, Model selection, Multiple regression, Variable selection

References

Akaike, H. (1973) Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory, Akadémiai Kiadó, 267-281.

Atkinson, A.C. (1970) A method for discriminating between models. Journal of the Royal Statistical Society B 32, 323-344.

Berk, R.H. and Jones, D.H. (1979) Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 47, 47-59.

Biernacki, C. (2004) Testing for a Global Maximum of the Likelihood. Journal of Computational and Graphical Statistics, 14, 3, 657-674.

Bozdogan, H. (2000) Akaike’s information criterion and recent developments in information complexity. Journal of Mathematical Psychology, 44, 62-91.

Chernoff, H. and Lehmann, E.L. (1954) The use of maximum likelihood estimates in χ² tests of goodness of fit. Ann. Math. Statist. 25, 579-586.

Cochran, W.G. (1952) The χ² test of goodness of fit. Ann. Math. Statist. 23, 315-345.

Commenges, D., Joly, P., Gegout-Petit, A. and Liquet, B. (2007) Choice between semi-parametric estimators of Markov and non-Markov multi-state models from generally coarsened observations. Scandinavian Journal of Statistics, in press.

Cox, D.R. (1961) Tests of separate families of hypotheses. Proceedings of the 4th Berkeley Symposium, Vol. 1 (University of California Press, Berkeley), 105-123.

Cox, D.R. (1962) Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society B 24, 406-424.

Dastoor, N.K. (1983) Some aspects of testing non-nested hypotheses. Journal of Econometrics 21, 213-228.

Davidson, R. and MacKinnon, J.G. (1981) Several tests for model specification in the presence of alternative hypotheses. Econometrica 49, 781-793.

Fisher, G.R. and McAleer, M. (1981) Alternative procedures and associated tests of significance for non-nested hypotheses. Journal of Econometrics, 16, 103-119.

Hurvich, C.M. and Tsai, C.L. (1989) Regression and time series model selection in small samples. Biometrika 76, 297-307.

Ishiguro, M., Sakamoto, Y. and Kitagawa, G. (1997) Bootstrapping log likelihood and EIC, an extension of AIC. Annals of the Institute of Statistical Mathematics, 49, 411-434.

Jager, L. and Wellner, J.A. (2005) A new goodness of fit test: the reversed Berk-Jones statistic. http://bayes.stat.washington.edu/www/research/reports/2004/tr443.pdf.

Knight, K. (1999) Mathematical Statistics. Chapman and Hall.

Konishi, S. and Kitagawa, G. (1996) Generalized Information Criteria in Model Selection. Biometrika 83, 4, 575-590.

Lehmann, E.L. (1998) Elements of Large-Sample Theory. Springer-Verlag, New York.

Lehmann, E.L. (1986) Testing Statistical Hypotheses. Wiley, New York.

Linhart, H. and Zucchini, W. (1986) Model Selection. Wiley, New York.

Mallows, C.L. (1973) Some comments on Cp. Technometrics, 15, 661-675.

McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models. Chapman and Hall.

Myung, I.J. (2000) The importance of complexity in model selection. Journal of Mathematical Psychology 44, 190-204.

Pesaran, M.H. (1974) On the general problem of model selection. Review of Economic Studies 41, 153-171.

Pesaran, M.H. and Deaton, A.S. (1978) Testing non-nested nonlinear regression models. Econometrica 46, 667-694.

Shapiro, S.S. and Wilk, M.B. (1965) An analysis of variance test for normality. Biometrika 52, 591-611.

Shapiro, S.S., Wilk, M.B. and Chen, H.J. (1968) A comparative study of various tests for normality. J. Amer. Statist. Ass. 63, 1343-1372.

Schwarz, G. (1978) Estimating the dimension of a model. Annals of Statistics, 6, 461-464.

Shimodaira, H. (1998) An application of multiple comparison techniques to model selection. Annals of the Institute of Statistical Mathematics 50, No. 1, 1-13.

Shimodaira, H. (2001) Multiple comparisons of log-likelihoods and combining non-nested models with application to phylogenetic tree selection. Communications in Statistics 30, 1751-1772.

Stephens, M.A. (1986) editors. Goodness-of-Fit Techniques. Marcel Dekker, New York.

Van der Vaart, A.W. (1998) Asymptotic Statistics. Cambridge University Press.

Vuong, Quang H. (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57, No. 2, 307-333.

White, H. (1982) Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50, No. 1, 1-26.

White, H. (1994) Estimation, Inference and Specification Analysis. Cambridge University Press.

Weisberg, S. and Bingham, C. (1975) An approximate analysis of variance test for non-normality suitable for machine calculation. Technometrics 17, 133-134.

Yanagihara, H. and Ohmoto, C. (2005) On distribution of AIC in linear regression models. Journal of Statistical Planning and Inference 133, 417-433.

Table of contents

1 Introduction

1.1 Our Objective

1.2 Plan of Thesis

2 Reminders about models and some asymptotic results

2.1 Models

2.2 Model Selection

2.3 Goal of Model Selection and its means

2.4 Nested and Non-Nested Models

2.5 Probability Metrics

2.6 Akaike framework and his Theorem

2.7 Complexity in model selection

2.8 Asymptotic theory

2.9 Goodness of Fit Test and Classical Hypothesis Testing

2.10 Reminder on Theorems and Lemmas

3 Reminder on Goodness of Fit Tests

3.1 Testing fit to a fixed distribution

3.1.1 Basic Goodness of Fit Test

3.1.2 Tests on the basis of Functional Distance

3.2 Adaptation of tests coming from the fixed-distribution

3.3 Tests on the basis of Correlation and Regression

3.4 Tests on the basis of Likelihood Functions

3.4.1 Berk-Jones’s statistics

3.4.2 Generalized Linear Models (GLMs) and Deviance

4 Motivation to Model Selection Tests

4.1 Introduction

4.2 Assumptions

4.3 Likelihood Function and Maximum Likelihood Estimator

4.3.1 Correctly Specified and Mis-Specified models

4.4 Metrics on spaces of probability

4.4.1 Kullback-Leibler Discrepancy (divergence)

4.5 Consistency of Maximum Likelihood Estimator

4.6 Akaike Information Criterion (AIC)

4.7 Distribution of Maximum Likelihood Estimator

5 Proposed test for Goodness of Fit Test: A test based on empirical likelihood ratio

5.1 Introduction

5.2 Our objective

5.3 Union-Intersection Test

5.4 Proposed test based on empirical likelihood ratio

5.4.1 Level of test

5.4.2 Comparison with Berk-Jones’s test

5.4.3 Bahadur efficiency of proposed test

6 Proposed Model selection tests based on likelihood and AIC

6.1 Introduction

6.2 Known parameters case

6.3 Unknown parameters case

6.4 Test function

6.5 Variance estimation

6.6 Distribution of Tn under H1 and Power of Test

6.6.1 Distribution of Test Statistic Tn under H1

6.6.2 Power of Test

6.7 Consistency of Test

6.7.1 Power computation

6.7.2 Invariance

7 Test For Model Selection based on difference of AIC’s: application to tracking interval for DEKL

7.1 Introduction

7.2 Objective

7.3 Non-Nested Models comparison

7.3.1 Motivation to Confidence Interval construction

7.3.2 Confidence Interval for DEKL

7.4 Logistic Regression

8 Conclusion and perspective

9 Bibliography

10 Appendix

I APPENDIX A

10.1 Introduction

10.2 Expected Kullback-Leibler Criteria and AIC

10.3 Hypothesis Testing

10.4 Simulation

10.4.1 Exploration of our result

10.4.2 Application to The Multiple Regression Model

II APPENDIX B

10.5 Introduction

10.6 Theory about inference of differences of AIC criteria

10.6.1 Estimating a difference of Kullback-Leibler divergences

10.6.2 Tracking interval for a difference of Kullback-Leibler divergences

10.6.3 Extension to regression models

10.7 Application to logistic regression: a simulation study

10.8 Choice of the best coding of age in a study of depression

10.8.1 The Paquid study

10.9 Discussion

ID Code:3400
Deposited By:Nadine Pontal
Deposited On:12 February 2008
