Generalized Estimating Equations, Second Edition, by James W. Hardin and Joseph M. Hilbe, updates the best-selling previous edition, which has been the standard text on the subject since it was published a decade ago. Combining theory and application, the text provides readers with a comprehensive discussion of GEE and related models. Numerous examples are employed throughout the text, along with the software code used to create, run, and evaluate the models being examined.
Stata is used as the primary software for running and displaying modeling output; associated R code is also given to allow R users to replicate Stata examples. This second edition incorporates comments and suggestions from a variety of sources, including the Statistics. Other enhancements include an examination of GEE marginal effects; a more thorough presentation of hypothesis testing and diagnostics, covering competing hierarchical models; and a more detailed examination of previously discussed subjects.
Along with doubling the number of end-of-chapter exercises, this edition expands discussion of various models associated with GEE, such as penalized GEE, cumulative and multinomial GEE, survey GEE, and quasi-least squares regression. It also offers a thoroughly new presentation of model selection procedures, including the introduction of an extension to the QIC measure that is applicable for choosing among working correlation structures.
He is also an affiliated faculty member in the Institute for Families in Society. He has also co-authored with P. Shults, and R for Stata Users with R. Both the theory and practical aspects of constructing and analysing such models are covered. Inclusion of code for many of the analyses is an excellent feature.
Also, the number of exercises increased significantly ….
Let $\tilde{\beta}$ be the regression parameters resulting from solving the GEE under the restricted model $L'\beta = 0$, and let $S(\tilde{\beta})$ be the generalized estimating equation values at $\tilde{\beta}$. The argument variable is a variable name that defines the "block effects" between clusters. The number of observations in the ZDATA data set is $n_{\max}(n_{\max}-1)/2$, where $n_{\max}$ is the size of a complete cluster (a cluster with no missing observations).
For those who want to use this book in the classroom, including me, having extra exercise sets is certainly a welcome addition. It can serve as supplemental reading in longitudinal data analysis classes as well. Praise for the First Edition: The book contains challenging problems in exercises and is suitable to be a textbook in a graduate-level course on estimating functions.
The references are up-to-date and exhaustive. I find it to be a good reference text for anyone using generalized linear models (GLIM).
The authors do a good job of not only presenting the general theory of GEE models, but also giving explicit examples of various correlation structures, link functions and a comparison between population-averaged and subject-specific models. Furthermore, there are sections on the analysis of residuals, deletion diagnostics, goodness-of-fit criteria, and hypothesis testing. Good data-driven examples that give comparisons between different GEE models are provided throughout the book.
Perhaps the greatest strength of this book is its completeness.
Let $y_{ij}$, $j = 1, \ldots, n_i$, $i = 1, \ldots, K$, represent the $j$th measurement on the $i$th subject. There are $n_i$ measurements on subject $i$ and $\sum_{i=1}^{K} n_i$ total measurements. Correlated data are modeled using the same link function and linear predictor setup (systematic component) as the independence case. The random component is described by the same variance functions as in the independence case, but the covariance structure of the correlated measurements must also be modeled.
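Let the vector of measurements on the $i$th subject be $Y_i = [y_{i1}, \ldots, y_{in_i}]'$ with corresponding vector of means $\mu_i = [\mu_{i1}, \ldots, \mu_{in_i}]'$, and let $V_i$ be the covariance matrix of $Y_i$. Let the vector of independent, or explanatory, variables for the $j$th measurement on the $i$th subject be $x_{ij} = [x_{ij1}, \ldots, x_{ijp}]'$. The generalized estimating equation of Liang and Zeger for estimating the $p \times 1$ vector of regression parameters $\beta$ is an extension of the independence estimating equation to correlated data and is given by $S(\beta) = \sum_{i=1}^{K} D_i' V_i^{-1} (Y_i - \mu_i(\beta)) = 0$, where $D_i = \partial \mu_i / \partial \beta'$ is the $n_i \times p$ matrix of derivatives of the means with respect to the regression parameters. Let $R_i(\alpha)$ be an $n_i \times n_i$ "working" correlation matrix that is fully specified by the vector of parameters $\alpha$.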
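The covariance matrix of $Y_i$ is modeled as $V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$, where $A_i$ is an $n_i \times n_i$ diagonal matrix with the variance function value $v(\mu_{ij})$ as the $j$th diagonal element and $\phi$ is the dispersion parameter. If $R_i(\alpha)$ is the true correlation matrix of $Y_i$, then $V_i$ is the true covariance matrix of $Y_i$. The working correlation matrix is usually unknown and must be estimated.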
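As an illustration of the covariance model above, the following minimal Python sketch builds $V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$ for one subject. The function names, the exchangeable structure (a single common correlation), and the binomial variance function $v(\mu) = \mu(1-\mu)$ are assumptions chosen for the example, not something the text prescribes.

```python
import numpy as np

def exchangeable_corr(n, alpha):
    """n x n exchangeable working correlation: 1 on the diagonal, alpha elsewhere."""
    return (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))

def working_covariance(mu, corr, phi=1.0, variance=lambda m: m * (1.0 - m)):
    """Return V = phi * A^{1/2} R A^{1/2}, where A = diag(v(mu_j)).

    The default binomial variance function is an assumption of this sketch.
    """
    sd = np.sqrt(variance(np.asarray(mu, dtype=float)))
    # Scaling row j and column k of R by sd[j] * sd[k] equals A^{1/2} R A^{1/2}.
    return phi * np.outer(sd, sd) * corr

mu = np.array([0.2, 0.5, 0.7])
V = working_covariance(mu, exchangeable_corr(3, 0.3))
```

With an identity correlation matrix the result is simply the diagonal matrix of variances, which is the independence case.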
It is estimated in the iterative fitting process by using the current value of the parameter vector $\beta$ to compute appropriate functions of the Pearson residual $e_{ij} = (y_{ij} - \mu_{ij}) / \sqrt{v(\mu_{ij})}$. If you specify the working correlation as $R_0 = I$, which is the identity matrix, the GEE reduces to the independence estimating equation. Following are the structures of the working correlation supported by the GENMOD procedure and the estimators used to estimate the working correlations.
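The dispersion parameter $\phi$ is estimated by $\hat{\phi} = \frac{1}{N - p} \sum_{i=1}^{K} \sum_{j=1}^{n_i} e_{ij}^2$, where $e_{ij}$ is the Pearson residual, $N = \sum_{i=1}^{K} n_i$ is the total number of measurements, and $p$ is the number of regression parameters. The following is an algorithm for fitting the specified model by using GEEs. Note that this is not in general a likelihood-based method of estimation, so that inferences based on likelihoods are not possible for GEE methods.
1. Compute an initial estimate of $\beta$ with an ordinary generalized linear model assuming independence.
2. Compute the working correlations $R$ based on the standardized residuals, the current $\beta$, and the assumed structure of $R$.
3. Compute an estimate of the covariance $V_i$ from the working correlations.
4. Update $\beta$: $\beta_{r+1} = \beta_r + \left[ \sum_{i=1}^{K} D_i' V_i^{-1} D_i \right]^{-1} \left[ \sum_{i=1}^{K} D_i' V_i^{-1} (Y_i - \mu_i) \right]$, where $D_i = \partial \mu_i / \partial \beta'$.
5. Repeat steps 2 through 4 until convergence.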
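The fitting loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a logit link with binomial variance, an exchangeable working correlation estimated by a simple moment formula, and a hypothetical function name `fit_gee_logit_exch`. It is not the GENMOD implementation, and the simulated data exist only to exercise the code.

```python
import numpy as np

def fit_gee_logit_exch(X, y, cluster, max_iter=50, tol=1e-8):
    """GEE sketch: logit link, binomial variance, exchangeable working correlation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    groups = [np.flatnonzero(cluster == g) for g in np.unique(cluster)]
    N, p = X.shape

    # Initial beta from an ordinary logistic fit that assumes independence.
    beta = np.zeros(p)
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    alpha, phi = 0.0, 1.0
    for _ in range(max_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        v = mu * (1.0 - mu)
        e = (y - mu) / np.sqrt(v)  # Pearson residuals

        # Moment estimates of the dispersion and the common correlation.
        phi = np.sum(e ** 2) / (N - p)
        cross, n_pairs = 0.0, 0
        for idx in groups:
            cross += (e[idx].sum() ** 2 - np.sum(e[idx] ** 2)) / 2.0  # sum over j < k
            n_pairs += len(idx) * (len(idx) - 1) // 2
        alpha = cross / ((n_pairs - p) * phi)

        # Build each V_i and accumulate the terms of the beta update.
        H = np.zeros((p, p))
        g = np.zeros(p)
        for idx in groups:
            n = len(idx)
            R = (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))
            sd = np.sqrt(v[idx])
            V = phi * np.outer(sd, sd) * R
            D = v[idx][:, None] * X[idx]  # d mu / d beta' for the logit link
            Vinv_D = np.linalg.solve(V, D)
            H += D.T @ Vinv_D
            g += Vinv_D.T @ (y[idx] - mu[idx])
        step = np.linalg.solve(H, g)
        beta = beta + step
        if np.max(np.abs(step)) < tol:  # stop once the update is negligible
            break
    return beta, alpha, phi

# Simulated clustered binary data (illustrative only).
rng = np.random.default_rng(0)
cluster = np.repeat(np.arange(50), 4)
x = rng.normal(size=200)
u = np.repeat(rng.normal(scale=0.8, size=50), 4)  # shared cluster effect
prob = 1.0 / (1.0 + np.exp(-(-0.5 + 0.7 * x + u)))
y = (rng.random(200) < prob).astype(float)
X = np.column_stack([np.ones(200), x])
beta, alpha, phi = fit_gee_logit_exch(X, y, cluster)
```

A production implementation would also guard against correlation estimates that make the working matrix singular and would report the covariance of the estimates; both are omitted here to keep the loop readable.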
See Diggle, Liang, and Zeger, Chapter 11, for a discussion of missing values in longitudinal data. Suppose that you intend to take measurements $Y_{i1}, \ldots, Y_{in}$ for the $i$th unit. Missing values for which $Y_{ij}$ are missing whenever $Y_{ik}$ is missing for all $j \ge k$ are called dropouts.
Otherwise, missing values that occur intermixed with nonmissing values are intermittent missing values. The GENMOD procedure can estimate the working correlation from data containing both types of missing values by using the all available pairs method, in which all nonmissing pairs of data are used in the moment estimators of the working correlation parameters defined previously.
The resulting covariances and standard errors are valid under the missing completely at random (MCAR) assumption. Estimates of the parameters for other working correlation types are computed in a similar manner, using available nonmissing pairs in the appropriate moment estimators.
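A small Python sketch of the all available pairs idea for one structure follows. It assumes an exchangeable working correlation, Pearson residuals stored in a units-by-occasions array with NaN marking missing measurements, and a moment estimator that divides the summed residual cross products by (number of available pairs minus $p$) times $\phi$. The function name and data layout are hypothetical.

```python
import numpy as np

def exchangeable_alpha_available_pairs(resid, phi, p):
    """Moment estimate of a common correlation using every nonmissing pair.

    resid: 2-D array of Pearson residuals, one row per unit, NaN = missing.
    """
    cross, n_pairs = 0.0, 0
    for row in np.asarray(resid, dtype=float):
        e = row[~np.isnan(row)]  # keep only the observed measurements
        cross += (e.sum() ** 2 - np.sum(e ** 2)) / 2.0  # sum of e_j * e_k over j < k
        n_pairs += e.size * (e.size - 1) // 2
    return cross / ((n_pairs - p) * phi)

resid = np.array([
    [1.0, np.nan, -1.0],    # intermittent missing value
    [0.5, 0.5, 0.5],        # complete unit
    [np.nan, np.nan, 2.0],  # a single observation contributes no pairs
])
alpha_hat = exchangeable_alpha_available_pairs(resid, phi=1.0, p=1)
```

Units with a single observed measurement contribute nothing to the estimate, which is why heavy dropout reduces the information available for the correlation parameters.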
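The contribution of the $i$th unit to the parameter update equation is computed by omitting the elements of $(Y_i - \mu_i)$, the columns of $D_i' = \partial \mu_i' / \partial \beta$, and the rows and columns of $V_i$ corresponding to missing measurements. The model-based estimator of $\mathrm{Cov}(\hat{\beta})$ is given by $\Sigma_m(\hat{\beta}) = I_0^{-1}$, where $I_0 = \sum_{i=1}^{K} D_i' V_i^{-1} D_i$. This is the GEE equivalent of the inverse of the Fisher information matrix that is often used in generalized linear models as an estimator of the covariance of the maximum likelihood estimator of $\beta$.
It is a consistent estimator of the covariance matrix of $\hat{\beta}$ if the mean model and the working correlation matrix are correctly specified. The empirical, or robust, estimator of the covariance matrix of $\hat{\beta}$ is $\Sigma_e = I_0^{-1} I_1 I_0^{-1}$, where $I_1 = \sum_{i=1}^{K} D_i' V_i^{-1} \mathrm{Cov}(Y_i) V_i^{-1} D_i$. It has the property of being a consistent estimator of the covariance matrix of $\hat{\beta}$, even if the working correlation matrix is misspecified, that is, if $\mathrm{Cov}(Y_i) \ne V_i$. See Zeger, Liang, and Albert, Royall, and White for further information about the robust variance estimate.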
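To make the two covariance estimators concrete, here is a minimal Python sketch. It assumes the per-unit derivative matrices $D_i$, working covariances $V_i$, and residual vectors $Y_i - \mu_i(\hat{\beta})$ are already available, and it estimates $\mathrm{Cov}(Y_i)$ by the outer product of the residual vector, the usual plug-in choice for the robust ("sandwich") form. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def gee_covariances(D_list, V_list, resid_list):
    """Return (model-based, empirical) covariance estimates for beta-hat."""
    p = D_list[0].shape[1]
    I0 = np.zeros((p, p))
    I1 = np.zeros((p, p))
    for D, V, r in zip(D_list, V_list, resid_list):
        Vinv_D = np.linalg.solve(V, D)  # V^{-1} D (V is symmetric)
        I0 += D.T @ Vinv_D              # D' V^{-1} D
        s = Vinv_D.T @ r                # D' V^{-1} (y - mu)
        I1 += np.outer(s, s)            # Cov(Y_i) replaced by r r'
    I0_inv = np.linalg.inv(I0)
    return I0_inv, I0_inv @ I1 @ I0_inv

# Two units, two measurements each, one regression parameter.
D = [np.ones((2, 1)), np.ones((2, 1))]
V = [np.eye(2), np.eye(2)]
r = [np.array([1.0, 1.0]), np.array([-1.0, 0.0])]
model_based, empirical = gee_covariances(D, V, r)
```

In this toy case the model-based estimate is 1/4 and the empirical estimate is 5/16; the two differ because the residual cross products do not match the identity working covariance.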
In computing the robust (empirical) covariance estimator $\Sigma_e$, $\beta$ and $\phi$ are replaced by estimates, and $\mathrm{Cov}(Y_i)$ is replaced by the estimate $(Y_i - \mu_i(\hat{\beta}))(Y_i - \mu_i(\hat{\beta}))'$. If the responses are binary (that is, they take only two values), then there is an alternative method to account for the association among the measurements. The alternating logistic regressions (ALR) algorithm of Carey, Zeger, and Diggle models the association between pairs of responses with log odds ratios, instead of with correlations, as ordinary GEEs do.
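For binary data, the correlation between the $j$th and $k$th response is, by definition, $\mathrm{Corr}(Y_{ij}, Y_{ik}) = \frac{\Pr(Y_{ij}=1, Y_{ik}=1) - \mu_{ij}\mu_{ik}}{\sqrt{\mu_{ij}(1-\mu_{ij})\mu_{ik}(1-\mu_{ik})}}$. The joint probability in the numerator satisfies the following bounds, by elementary properties of probability, since $\mu_{ij} = \Pr(Y_{ij}=1)$: $\max(0, \mu_{ij} + \mu_{ik} - 1) \le \Pr(Y_{ij}=1, Y_{ik}=1) \le \min(\mu_{ij}, \mu_{ik})$. The correlation, therefore, is constrained to be within limits that depend in a complicated way on the means of the data. The ALR algorithm seeks to model the logarithm of the odds ratio, $\gamma_{ijk} = \log(\mathrm{OR}(Y_{ij}, Y_{ik}))$, where $\mathrm{OR}(Y_{ij}, Y_{ik}) = \frac{\Pr(Y_{ij}=1, Y_{ik}=1)\Pr(Y_{ij}=0, Y_{ik}=0)}{\Pr(Y_{ij}=1, Y_{ik}=0)\Pr(Y_{ij}=0, Y_{ik}=1)}$, as $\gamma_{ijk} = z_{ijk}'\alpha$, where $\alpha$ is a vector of regression parameters and $z_{ijk}$ is a fixed, specified vector of coefficients.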
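The constraint on the correlation can be checked numerically. The short Python sketch below plugs the lower and upper bounds on the joint probability into the definition of the correlation; the function name is hypothetical, and both means are assumed to lie strictly between 0 and 1.

```python
import math

def binary_corr_bounds(mu_j, mu_k):
    """Smallest and largest correlation attainable by two binary responses
    with means mu_j and mu_k (both strictly between 0 and 1)."""
    lo_joint = max(0.0, mu_j + mu_k - 1.0)  # lower bound on Pr(Y_j = 1, Y_k = 1)
    hi_joint = min(mu_j, mu_k)              # upper bound on Pr(Y_j = 1, Y_k = 1)
    denom = math.sqrt(mu_j * (1.0 - mu_j) * mu_k * (1.0 - mu_k))
    return ((lo_joint - mu_j * mu_k) / denom,
            (hi_joint - mu_j * mu_k) / denom)

lo, hi = binary_corr_bounds(0.1, 0.5)      # unequal means: (-1/3, 1/3)
lo_eq, hi_eq = binary_corr_bounds(0.5, 0.5)  # equal means of 0.5: (-1, 1)
```

With means 0.1 and 0.5 the correlation cannot exceed 1/3 in absolute value, so a working correlation outside that range is incompatible with the means.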
The parameter $\gamma_{ijk}$ can take any value in $(-\infty, \infty)$, with $\gamma_{ijk} = 0$ corresponding to no association. The log odds ratio, when modeled in this way with a regression model, can take different values in subgroups defined by $z_{ijk}$. For example, $z_{ijk}$ can define subgroups within clusters, or it can define "block effects" between clusters.
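A brief Python sketch shows why the log odds ratio is a convenient association measure: it is zero under independence regardless of the means. The function name is hypothetical; the joint probability `p11` is assumed to lie strictly inside its feasible range so that every cell probability is positive.

```python
import math

def log_odds_ratio(p11, mu_j, mu_k):
    """Log odds ratio of two binary responses, given Pr(Y_j = 1, Y_k = 1) = p11
    and the marginal means mu_j and mu_k."""
    p10 = mu_j - p11                # Pr(Y_j = 1, Y_k = 0)
    p01 = mu_k - p11                # Pr(Y_j = 0, Y_k = 1)
    p00 = 1.0 - mu_j - mu_k + p11   # Pr(Y_j = 0, Y_k = 0)
    return math.log((p11 * p00) / (p10 * p01))

gamma_indep = log_odds_ratio(0.05, 0.1, 0.5)  # p11 = mu_j * mu_k, so no association
gamma_pos = log_odds_ratio(0.09, 0.1, 0.5)    # joint probability near its upper bound
```

As `p11` approaches either bound the log odds ratio grows without limit in magnitude, whereas the correlation stays inside a range that depends on the means.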
You specify a GEE model for binary data that uses log odds ratios by specifying a model for the mean, as in ordinary GEEs, and a model for the log odds ratios. You can use any of the link functions appropriate for binary data in the model for the mean, such as logistic, probit, or complementary log-log. The ALR algorithm alternates between a GEE step to update the model for the mean and a logistic regression step to update the log odds ratio model. Upon convergence, the ALR algorithm provides estimates of the regression parameters for the mean, $\beta$, the regression parameters for the log odds ratios, $\alpha$, their standard errors, and their covariances.
Specifying a regression model for the log odds ratio requires you to specify rows of the z-matrix $z_{ijk}$ for each cluster $i$ and each unique within-cluster pair $(j, k)$.
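The bookkeeping behind the z-matrix can be sketched in Python. The example enumerates each unique pair $(j, k)$ with $j < k$ in one cluster and attaches a row of coefficients. The two-column design (an intercept plus a same-subgroup indicator) and the function name are hypothetical choices for illustration, not a layout required by any particular software.

```python
from itertools import combinations

def z_rows_for_cluster(subgroup):
    """One z-row per unique within-cluster pair (j, k), j < k.

    subgroup: sequence giving a subgroup label for each cluster member.
    Column 0 is an intercept; column 1 is 1.0 when both members of the
    pair share a subgroup (a hypothetical design for this sketch).
    """
    rows = []
    for j, k in combinations(range(len(subgroup)), 2):
        same = 1.0 if subgroup[j] == subgroup[k] else 0.0
        rows.append((j, k, [1.0, same]))
    return rows

rows = z_rows_for_cluster([0, 0, 1, 1])  # a cluster of four in two subgroups
```

A cluster of size $n$ always yields $n(n-1)/2$ rows, six in this case, and the log odds ratio for a pair is the inner product of its row with $\alpha$.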