Linear regression with confidence intervals.

| categories: data analysis | View Comments

Linear regression with confidence intervals.

Linear regression with confidence intervals.

Fit a fourth order polynomial to this data and determine the confidence interval for each parameter. Data from example 5-1 in Fogler, Elements of Chemical Reaction Engineering

Contents

data to fit

note the transpose so these are column vectors

t = [0 50 100 150 200 250 300]';
Ca = [50 38 30.6 25.6 22.2 19.5 17.4]'*1e-3;

We want the equation $Ca(t) = b0 + b1*t + b2*t^2 + b3*t^3 + b4*t^4$ fit to the data in the least squares sense. We can write this in a linear algebra form as: T*b = Ca where T is a matrix of columns [1 t t^2 t^3 t^4], and b is a column vector of [b0 b1 b2 b3 b4]'. We want to solve for the b vector.

T = [ones(size(t)) t t.^2 t.^3 t.^4];

% we use the :command:`regress` to solve for b
alpha = 0.05 % confidence interval is 100*(1 - alpha)
[b,bint] = regress(Ca,X, alpha)
% b is the solution to our equation
% bint is the 95% confidence interval for each parameter

% the numbers are a little small to see with the default format
alpha =

    0.0500


b =

    0.0500
   -0.0003
    0.0000
   -0.0000
    0.0000


bint =

    0.0497    0.0503
   -0.0003   -0.0003
    0.0000    0.0000
   -0.0000   -0.0000
    0.0000    0.0000

change the format to see the numbers

format long
b
bint
% this lets you see 15 decimal places, but alot of them are zeros.
b =

   0.049990259740260
  -0.000297846320346
   0.000001343484848
  -0.000000003484848
   0.000000000003697


bint =

   0.049680246760215   0.050300272720305
  -0.000315461979157  -0.000280230661536
   0.000001071494761   0.000001615474936
  -0.000000004903197  -0.000000002066500
   0.000000000001350   0.000000000006044

engineering format

format shorte
b
bint
b =

  4.9990e-002
 -2.9785e-004
  1.3435e-006
 -3.4848e-009
  3.6970e-012


bint =

  4.9680e-002  5.0300e-002
 -3.1546e-004 -2.8023e-004
  1.0715e-006  1.6155e-006
 -4.9032e-009 -2.0665e-009
  1.3501e-012  6.0439e-012

compute R^2

http://en.wikipedia.org/wiki/Coefficient_of_determination

format % set format back to normal
SS_tot = sum((Ca - mean(Ca)).^2);
SS_err = sum((X*b - Ca).^2);

Rsq = 1 - SS_err/SS_tot
Rsq =

    1.0000

Plot the data and the fit

hold on
plot(t,Ca,'ko ','DisplayName','raw data')
plot(t,X*b,'b ','DisplayName','fit')
legend show
xlabel('Time (s)')
ylabel('Ca (mol/L)')

text(10,0.025,sprintf('R^2 = %1.2f',Rsq))
text(10,0.017,sprintf('Ca(t) = %1.2g + %1.2g t + %1.2g t^2 + %1.2g t^3 + %1.2g t^4',b))

Summary

a fourth order polynomial fits the data well, with a good R^2 value. All of the parameters appear to be significant, i.e. zero is not included in any of the parameter confidence intervals. This does not mean this is the best model for the data, just that the model fits well.

'done'

% categories: Data analysis
ans =

done

blog comments powered by Disqus