Linear Regression Calculator
Calculates the simple linear regression, i.e. the straight line that predicts the points of a data set with two variables as well as possible. If you have two related quantifiable characteristics, such as the height and weight of people, and plot many value pairs in a diagram, the result is a point cloud in which the points are not randomly distributed but follow a direction. This direction can be represented by a straight line.
Please enter the values of the two characteristics separately. For each characteristic, separate the values with a space or a line break. Both characteristics must have the same number of values: the n-th value of the first characteristic is paired with the n-th value of the second.
The example uses the area (in 1000 km²) and population (in millions) of some European countries.
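The input rules above can be sketched in Python. This is only an illustration of the described format, not the calculator's actual code; the function names parse_values and pair_values are made up for this example.

```python
def parse_values(text: str) -> list[float]:
    """Turn one input field into a list of numbers.

    str.split() without arguments splits on any run of whitespace,
    so spaces and line breaks are both accepted as separators.
    """
    return [float(token) for token in text.split()]


def pair_values(first_text: str, second_text: str) -> list[tuple[float, float]]:
    """Pair the n-th value of the first field with the n-th of the second."""
    xs = parse_values(first_text)
    ys = parse_values(second_text)
    if len(xs) != len(ys):
        # Both characteristics must have the same number of values.
        raise ValueError(
            f"value counts differ: {len(xs)} versus {len(ys)}"
        )
    return list(zip(xs, ys))
```

For instance, pair_values("1 2\n3", "4\n5 6") yields the pairs (1, 4), (2, 5) and (3, 6), while fields of different lengths raise a ValueError.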
The formulas are:
n: number of value pairs, Σ: sum over i = 1 to n
x̄: mean of all xᵢ, ȳ: mean of all yᵢ
β₁ = Σ[(xᵢ − x̄) · (yᵢ − ȳ)] / Σ(xᵢ − x̄)²
β₀ = ȳ − β₁ · x̄
f(xᵢ) = β₀ + β₁ · xᵢ
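As a minimal sketch, the formulas above can be written out in Python as follows. The function name linear_regression and the sample numbers are invented for illustration; they are not the country data from the example, and this is not necessarily how the calculator itself is implemented.

```python
def linear_regression(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (beta0, beta1) of the line f(x) = beta0 + beta1 * x."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same number of values")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two value pairs are needed")

    # Means of all x values and all y values.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Numerator: sum of (x_i - mean_x) * (y_i - mean_y).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: sum of (x_i - mean_x) squared.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        # All x values are identical, so no slope can be determined.
        raise ValueError("all x values are equal")

    beta1 = s_xy / s_xx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Made-up sample values, chosen only to keep the arithmetic easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
beta0, beta1 = linear_regression(xs, ys)
print(f"f(x) = {beta0:.2f} + {beta1:.2f} * x")
```

With these sample values both means are whole numbers (3 and 4), the two sums are 6 and 10, and the slope is therefore 0.6 with an intercept of about 2.2.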
Linear regression is the simplest special case of regression analysis, a branch of statistics. It assumes that there is a linear relationship between two variables. In some contexts this is true, or at least a useful simplification, as in the examples of people's height and weight or a country's area and population. Other cases cannot be represented this way, and a curve has to be used instead of the regression line, for example a parabola for quadratic relationships or an exponential function. These cases are mathematically more complex than the linear one. This calculator cannot decide whether a linear regression is appropriate and useful; this must be checked for each case.
Like correlation, regression says nothing about causality, i.e. what causes what.