
Regression analysis
Simple regression analysis is a statistical procedure used
to summarize a trend or relationship between two variables.
The result is a line or curve on a graph, representing a model (a
mathematical function), which describes the general relationship
in the data.
The simplest model is a straight line (linear model),
Y = a + bX. The variable Y is the response,
or outcome variable, which is plotted on the ordinate
(vertical axis) of a graph. The variable X is
the predictor, or explanatory variable,
which is plotted on the abscissa (horizontal axis) of
a graph. The other values in the model are called parameters; a is
the intercept (the value of Y where the line crosses the Y axis, at X = 0)
and b is the slope (the change in Y per unit increase in X). See, for example,
the plot of erythrocyte mutant
fraction versus radiation dose produced by radiobiologists
at RERF, where Y is mutant fraction and X is
radiation dose.
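As a minimal sketch of how the two parameters are read, the short Python example below evaluates the straight-line model. The function name linear_model and the numerical values are hypothetical, chosen only for illustration; they are not taken from the RERF data.

```python
def linear_model(x, a, b):
    """Return the value of Y predicted by the straight line Y = a + bX."""
    return a + b * x


# Hypothetical parameter values, for illustration only.
intercept = 2.0  # a: predicted Y when X = 0
slope = 0.5      # b: change in predicted Y per unit increase in X

# At X = 0 the prediction equals the intercept.
print(linear_model(0.0, intercept, slope))

# Raising X by one unit changes the prediction by exactly the slope.
print(linear_model(3.0, intercept, slope) - linear_model(2.0, intercept, slope))
```

The same reading applies whatever the units of X and Y happen to be.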
In regression analysis, the values of the parameters, a and b,
are estimated using methods that seek the best fit of the
model to the data. Different methods are used depending
on the type of data. Common methods include simple linear
regression (least squares) for continuous data (such
as height, weight, or blood pressure), Poisson regression for
data that are counts (e.g., number of persons with leukemia
in a population), logistic regression for binary
data (a yes/no outcome, such as having a certain symptom
or not), and Cox regression for event times (such
as how long a patient treated for cancer remains free of
disease before suffering a relapse following therapy).
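For the least squares case, the estimates of a and b have a simple closed form, sketched below in Python. The function name fit_line and the example data are hypothetical and serve only to illustrate the calculation. Poisson, logistic, and Cox regression are usually fitted by likelihood-based methods instead and are not shown here.

```python
def fit_line(xs, ys):
    """Least squares estimates (a, b) for the model Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X, and of cross-products of X and Y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


# Hypothetical observations, for illustration only.
x_values = [0.0, 1.0, 2.0, 3.0, 4.0]
y_values = [1.1, 2.9, 5.2, 6.8, 9.1]
a_hat, b_hat = fit_line(x_values, y_values)
print(a_hat, b_hat)
```

These estimates minimize the sum of squared vertical distances between the observed Y values and the fitted line, which is the sense of "best fit" used by least squares.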
Many types of model are possible. Sometimes the model
is used to describe (illustrate simply) the relationship
between X and Y; such models are called descriptive
models. The linear model is typically used this way.
A linear-quadratic model, Y = a + bX + cX^{2},
can be used to describe data that display curvature (the
parameter c is called the curvature). Sometimes
the model is based on biological or physiological assumptions
about the mechanism of how the explanatory variable X affects,
or causes, the outcome Y; such models are called mechanistic
models. With mechanistic models the mathematical function
can be quite complicated, but the parameters have meaning
in terms of biological or physiological quantities. It
is also possible to include several explanatory variables in one model,
which is necessary when related variables are each associated
with Y and would otherwise distort one another's apparent effects (confounding). Sometimes the joint
effects of two or more explanatory variables include mechanistic
interaction, where one explanatory variable modifies the
effect of another (effect modification).
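As an illustration of the descriptive model with curvature, Y = a + bX + cX^{2}, the Python sketch below estimates a, b, and c by least squares using the normal equations. This assumes continuous data; the function names and the example values are hypothetical, and in practice a statistical package would normally be used.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's lists are left unchanged.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; need at least 3 distinct X values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_linear_quadratic(xs, ys):
    """Least squares estimates (a, b, c) for Y = a + bX + cX^2."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three paired observations")
    # Each design row holds the terms multiplying a, b, and c.
    design = [[1.0, x, x * x] for x in xs]
    # Normal equations: (D^T D) params = D^T y.
    gram = [[sum(row[i] * row[j] for row in design) for j in range(3)]
            for i in range(3)]
    moment = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(3)]
    a, b, c = solve_linear_system(gram, moment)
    return a, b, c


# Hypothetical observations, for illustration only.
x_values = [0.0, 1.0, 2.0, 3.0, 4.0]
y_values = [1.2, 3.4, 7.1, 11.3, 17.2]
print(fit_linear_quadratic(x_values, y_values))
```

A fitted c close to zero suggests that the straight-line model may describe the data about as well as the curved one.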
See also Dose response, Linear dose response, and Linear-quadratic dose response.



