## Linear regression

This document describes a flexible, efficient and robust linear regression package that I developed. Please download it for evaluation and send me your comments.

### Notation and conventions

We assume that a dependent quantity y can be expressed as a linear function of a number of independent quantities x_{j}, i.e.
y = ∑_{j} α_{j}x_{j}.
Note that this expression has no constant term.
This is no loss of generality, however: a constant term can be emulated by setting
one of the independent quantities, say x_{0}, equal to 1 for every point
in your dataset.
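
As an illustration of this convention, here is a minimal C++ sketch. The function names `evaluate_model` and `with_constant_term` are hypothetical and are not part of linmodel.dll; they only show how a constant term can be carried by a column of ones.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Evaluate y = sum_j alpha[j] * x[j]; the model has no separate constant term.
double evaluate_model(const std::vector<double>& alpha,
                      const std::vector<double>& x) {
    assert(alpha.size() == x.size());
    double y = 0.0;
    for (std::size_t j = 0; j < alpha.size(); ++j) {
        y += alpha[j] * x[j];
    }
    return y;
}

// Prepend x_0 = 1 to a datapoint so that alpha[0] acts as the constant term.
std::vector<double> with_constant_term(const std::vector<double>& x) {
    std::vector<double> row;
    row.reserve(x.size() + 1);
    row.push_back(1.0);
    row.insert(row.end(), x.begin(), x.end());
    return row;
}
```

With α = (2, 3) and a single measured quantity equal to 4, the extended datapoint is (1, 4) and the model evaluates to 2·1 + 3·4 = 14, i.e. the line y = 2 + 3x.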

### Linear regression algorithm

The algorithm used is the well-known least squares method.
It determines the values of the model parameters α_{j} by minimising the quantity
∑_{i} w_{i}(y_{i} - ∑_{j} α_{j}x_{ij})^{2}
where the index i runs over all points in the dataset, x_{ij} is the value of the j-th independent quantity at point i,
and w_{i} is a weight factor for each datapoint.
A special feature of the algorithm in linmodel.dll is that it allows linear
constraints to be imposed on the model parameters α_{j}:
∑_{j} c_{j}α_{j} = d.
The algorithm places no restrictions on the number of datapoints, parameters or constraints.
It has been implemented in such a way that it does not need any external routines, not even standard mathematical functions like "square root". It is written entirely in
terms of basic operations: addition, subtraction, multiplication and division, nothing else!
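
To make the problem statement concrete, the sketch below solves the same constrained weighted least squares problem in a textbook way. It is not the code in linmodel.dll, whose internals and interface are not described here. It assumes the standard Lagrange multiplier formulation: setting the derivatives of the weighted sum of squares plus multiplier terms to zero gives the linear system (XᵀWX)α + Cᵀλ = XᵀWy together with Cα = d, which is then solved by Gaussian elimination with partial pivoting. All names (`Vec`, `Mat`, `solve_linear_system`, `constrained_wls`) are hypothetical. Like the package, the sketch uses only addition, subtraction, multiplication, division and comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Solve M z = rhs by Gaussian elimination with partial pivoting.
// Assumes M is square and non-singular. No <cmath> functions are used.
Vec solve_linear_system(Mat M, Vec rhs) {
    const std::size_t n = M.size();
    assert(rhs.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        // Choose the row with the largest absolute value in column k.
        std::size_t p = k;
        double best = M[k][k] < 0 ? -M[k][k] : M[k][k];
        for (std::size_t r = k + 1; r < n; ++r) {
            const double a = M[r][k] < 0 ? -M[r][k] : M[r][k];
            if (a > best) { best = a; p = r; }
        }
        assert(best > 0.0);  // a zero pivot means the system is singular
        std::swap(M[k], M[p]);
        std::swap(rhs[k], rhs[p]);
        for (std::size_t r = k + 1; r < n; ++r) {
            const double f = M[r][k] / M[k][k];
            for (std::size_t c = k; c < n; ++c) M[r][c] -= f * M[k][c];
            rhs[r] -= f * rhs[k];
        }
    }
    // Back substitution on the upper triangular system.
    Vec z(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = rhs[k];
        for (std::size_t c = k + 1; c < n; ++c) s -= M[k][c] * z[c];
        z[k] = s / M[k][k];
    }
    return z;
}

// Minimise sum_i w[i] * (y[i] - sum_j alpha[j] * X[i][j])^2
// subject to C alpha = d (one row of C and one entry of d per constraint).
Vec constrained_wls(const Mat& X, const Vec& y, const Vec& w,
                    const Mat& C, const Vec& d) {
    const std::size_t n = X.size();
    assert(n > 0 && y.size() == n && w.size() == n && C.size() == d.size());
    const std::size_t p = X[0].size();  // number of parameters
    const std::size_t m = C.size();     // number of constraints
    Mat K(p + m, Vec(p + m, 0.0));
    Vec rhs(p + m, 0.0);
    // Upper-left block: X^T W X; first p entries of rhs: X^T W y.
    for (std::size_t i = 0; i < n; ++i) {
        assert(X[i].size() == p);
        for (std::size_t j = 0; j < p; ++j) {
            rhs[j] += w[i] * X[i][j] * y[i];
            for (std::size_t l = 0; l < p; ++l) K[j][l] += w[i] * X[i][j] * X[i][l];
        }
    }
    // Border blocks: C and its transpose; last m entries of rhs: d.
    for (std::size_t k = 0; k < m; ++k) {
        assert(C[k].size() == p);
        for (std::size_t j = 0; j < p; ++j) {
            K[p + k][j] = C[k][j];
            K[j][p + k] = C[k][j];
        }
        rhs[p + k] = d[k];
    }
    Vec z = solve_linear_system(K, rhs);
    z.resize(p);  // keep alpha, drop the Lagrange multipliers
    return z;
}
```

For the three points (0, 1), (1, 3), (2, 5) with a column of ones as x_{0} and unit weights, the unconstrained fit gives α = (1, 2); adding the constraint α_{1} = 0 forces a horizontal line and gives α = (3, 0), the mean of y. Note that forming XᵀWX explicitly squares the condition number of the problem, so this sketch is suitable for illustration but may lose accuracy on ill-conditioned datasets.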

### Download

The regression package can be downloaded, for evaluation purposes only, by clicking this link.
Documentation on how to use it in your own programs is included.
The package is written in C++ and was compiled and linked under Windows XP. It should work in any 32-bit version of MS Windows (Windows 95 or later).

### Testing

I have tested the algorithm on a few standard datasets from
the NIST website.
These datasets are included in a format that the sample program lintest.cpp can read.
In all cases the certified outputs are reproduced to at least 10 significant digits, which convinces me that the code in linmodel.dll is correct (and
also robust and efficient).

N.B. You may notice differences in some of the statistical measures: this is because in lintest.cpp I implemented only
the definitions for the case without a constant term.
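
If you want to repeat this kind of comparison yourself, a check along the following lines may be useful. It is only a sketch: the helper `agrees_to_digits` is hypothetical and is not taken from lintest.cpp, and it interprets "agreement to N significant digits" as a relative difference of at most 10^{-N}, which is one common convention among several.

```cpp
#include <cassert>

// Return true if `computed` matches `certified` to roughly `digits`
// significant digits, judged by the relative difference.
// Uses only basic arithmetic and comparisons.
bool agrees_to_digits(double computed, double certified, int digits) {
    assert(digits > 0);
    double tol = 1.0;
    for (int k = 0; k < digits; ++k) tol /= 10.0;  // tol = 10^-digits
    double diff = computed - certified;
    if (diff < 0) diff = -diff;
    double scale = certified < 0 ? -certified : certified;
    if (scale == 0.0) return diff <= tol;  // absolute check when the reference is zero
    return diff <= tol * scale;
}
```

Each fitted parameter would be passed as `computed` and the corresponding certified value from the dataset file as `certified`, with `digits` set to 10.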

### Disclaimer

I will not accept any liability for damages that you may incur, either directly or indirectly, by using this software.
