|Deviance is a measure of how well data
are fitted by a statistical model. Data typically contain
variation from two sources: randomness (the random component)
and factors (or covariates) that affect the mean, or average,
of the data (the systematic component). Factors can include
age, gender, radiation exposure, diet, type of occupation,
and so on. An analysis of data is based on a model that encompasses
both components. The random component is captured through
a statistical distribution. The systematic component is handled
through a mathematical function of the factors and effect
parameters. The parameters are estimated by finding the model
that fits the data best--hence the use of deviance. After
fitting a model, each observed data point has an associated "fitted
value" or estimate of the mean derived from the fitted model.
The deviance measures the discrepancy between observed data
and fitted values in light of the random variation described
by the statistical distribution. If the model explains the
variation in the data well, the discrepancy between data
and fitted values will be small and the model will be accepted.
On the other hand, if the discrepancy is large, the model
must be revised.
Most analyses of atomic-bomb survivor data at the Radiation Effects Research
Foundation are based on models developed using deviance. Reports typically
mention "deviance" as a short-hand notation for "deviance
difference." The difference in deviance between two models for the
systematic component is particularly useful when one model is "nested"
within the other. A nested model can be obtained by fixing as constant
some parameters in the larger model. According to statistical theory, the
difference in deviances for two such models has approximately a chi-square
distribution with degrees of freedom equal to the number of fixed parameters.
This is the basis of testing whether the constrained model (the model with
fixed parameters) fits the data as well as the larger model, or whether
those parameters are needed in the model. For example, when testing whether
there is a gender difference in sensitivity to radiation, one could include
a parameter describing interaction between gender and dose response; if
the difference in deviance is not statistically significant between that
model and the constrained model (where the interaction parameter is set
equal to zero), then one would conclude that there is no interaction.
Mathematically, the deviance is the logarithm of the likelihood ratio statistic
comparing a particular fitted model to the so-called "full model."
The "full model" uses the observed data as fitted values and
therefore ascribes all of the variation to the systematic component with
none to the random component. At the other extreme is the so-called null
model, where a common mean is fit to all of the data. The null model does
not incorporate any factors, so all of the variation is ascribed to the
random component and none to the systematic component. The goal of a statistical
analysis is to find a model that describes the data better than the null
model (if such a model exists) without merely repeating the data in the
model as the "full model" does. This is accomplished by assessing
the difference in deviance among several nested models and choosing the
model that describes the data best with the fewest possible parameters.
This is because the investigator wants to know how, or if, the data depend
on various factors.