what_do_you_want.m

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



What would you like to see on the Matlab blog?

Hi everyone! We recently passed the 100-post mark on the blog! What would you like to see as future blog posts? Leave a comment here!

% categories: miscellaneous

VN:F [1.9.20_1166]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Peak finding in Raman spectroscopy

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Peak finding in Raman spectroscopy

John Kitchin

Raman spectroscopy is a vibrational spectroscopy. The data typically comes as intensity vs. wavenumber, and it is discrete. Sometimes it is necessary to identify the precise location of a peak. In this post, we will use spline smoothing to construct an interpolating function of the data, and then use fminbnd to identify peak positions.

clear all; close all

Read and plot the raw data

[w,i] = textread('raman.txt');

plot(w,i)
xlabel('Raman shift (cm^{-1})')
ylabel('Intensity (counts)')


Narrow to the region of interest

We will focus on the peaks between 1340 and 1360 cm^-1.

ind = w > 1340 & w < 1360;
w1 = w(ind);
i1 = i(ind);
figure(2)
plot(w1, i1,'b. ')
xlabel('Raman shift (cm^{-1})')
ylabel('Intensity (counts)')


Generate the spline

The last parameter in csaps determines the amount of smoothing. Closer to 1.0 is less smoothing.

pp = csaps(w1, i1,.99)
hold all
plot(w1, ppval(pp,w1))
pp = 

      form: 'pp'
    breaks: [1x51 double]
     coefs: [50x4 double]
    pieces: 50
     order: 4
       dim: 1


Find all the zeros in the first derivative

We will brute force this by looking for sign changes in the derivative vector evaluated at all the points of the original data

d1 = fnder(pp, 1); % first derivative
s = ppval(d1, w1);
s(s >= 0) = 1;
s(s < 0) = 0;

% where diff = -1 indicates a transition from positive slope to
% negative slope, which is a maximum. These are approximate locations
% of the zeros
z = find(diff(s) == -1);

Make an interpolating function to find the minima

to use fminbnd we need a function. we make a function handle that is the negative of the main function, so that the peak locations are located at minima. interpolating function handle

f = @(xx) -ppval(pp,xx);

% now loop through each value found in z to find the minimum over the
% interval around each zero.
x = zeros(size(z));
for i=1:length(x)
    lowerbound = w1(z(i)-5);
    upperbound = w1(z(i)+5);
    x(i) = fminbnd(f,lowerbound,upperbound);
end

Plot the derivatives and peaks found

figure
hold all
plot(w1, ppval(d1,w1))
plot(w1(z), ppval(d1, x),'ro ')

xlabel('Raman shift (cm^{-1})')
legend('1^{st} derivative', 'zeros')


Label the peaks in the original data

figure(2)
plot(x, ppval(pp,x), 'ro','markerfacecolor','r')

% print some offset labels
for i=1:length(x)
    sprintf('%1.1f cm^{-1}',x(i))
end
ans =

1346.5 cm^{-1}


ans =

1348.1 cm^{-1}


Discussion

In the end, we have illustrated how to construct a spline smoothing interpolation function and to find maxima in the function, including generating some initial guesses. There is more art to this than you might like, since you have to judge how much smoothing is enough or too much. With too much, you may smooth peaks out. With too little, noise may be mistaken for peaks.

done

% categories: data analysis
% post_id = 2007; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/08/27/peak-finding-in-raman-spectroscopy/;

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

curve fitting to get overlapping peak areas

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

curve fitting to get overlapping peak areas

John Kitchin

Today we examine an approach to fitting curves to overlapping peaks to deconvolute them so we can estimate the area under each curve. You will need to download gc-data-2.txt for this example. This file contains data from a gas chromatograph with two peaks that overlap. We want the area under each peak to estimate the gas composition. You will see how to read the text file in, parse it to get the data for plotting and analysis, and then how to fit it.

function main

read in the data file

The data file is all text, and we have to read it in, and find lines that match certain patterns to identify the regions that are data, then we read in the data lines.

clear all; close all

datafile = 'gc-data-2.txt';

fid = fopen(datafile);

first we get the number of data points, and read up to the data

i = 0;
while 1
    line = fgetl(fid);
    sm = strmatch('# of Points',line);
    if ~isempty(sm)
        regex = '# of Points(.*)';
        [match tokens] = regexp(line,regex,'match','tokens');
        npoints = str2num(tokens{1}{1});
    elseif strcmp(line,'R.Time	Intensity')
        break
    i = i + 1
    end
end

initialize the data vectors

t = zeros(1,npoints);
intensity = zeros(1,npoints);

now read in the data

for j=1:npoints
    line = fgetl(fid);
    [a] = textscan(line,'%f%d');
    t(j) = a{1};
    intensity(j) = a{2};
end

fclose(fid);

Plot the data

plot(t,intensity)
xlabel('Time (s)')
ylabel('Intensity (arb. units)')
xlim([4 6])


correct for non-zero baseline

intensity = intensity + 352;
plot(t,intensity)
xlabel('Time (s)')
ylabel('Intensity (arb. units)')
xlim([4 6])


a fitting function for one peak

The peaks are asymmetric, decaying gaussian functions.

    function f = asym_peak(pars,t)
        % from Anal. Chem. 1994, 66, 1294-1301
        a0 = pars(1);  % peak area
        a1 = pars(2);  % elution time
        a2 = pars(3);  % width of gaussian
        a3 = pars(4);  % exponential damping term
        f = a0/2/a3*exp(a2^2/2/a3^2 + (a1 - t)/a3)...
            .*(erf((t-a1)/(sqrt(2)*a2) - a2/sqrt(2)/a3) + 1);
    end

a fitting function for two peaks

to get two peaks, we simply add two peaks together.

    function f = two_peaks(pars, t)
        a10 = pars(1);  % peak area
        a11 = pars(2);  % elution time
        a12 = pars(3);  % width of gaussian
        a13 = pars(4);  % exponential damping term
        a20 = pars(5);  % peak area
        a21 = pars(6);  % elution time
        a22 = pars(7);  % width of gaussian
        a23 = pars(8);  % exponential damping term

        p1 = asym_peak([a10 a11 a12 a13],t);
        p2 = asym_peak([a20 a21 a22 a23],t);

        f = p1 + p2;
    end

Plot fitting function with an initial guess for each parameter

the fit is not very good.

hold all;
parguess = [1500,4.85,0.05,0.05,5000,5.1,0.05,0.1];
plot (t,two_peaks(parguess,t),'g-')
legend 'raw data' 'initial guess'


nonlinear fitting

now we use nonlinear fitting to get the parameters that best fit our data, and plot the fit on the graph.

pars = nlinfit(t,intensity, @two_peaks, parguess)
plot(t,two_peaks(pars, t),'r-')
legend 'raw data' 'initial guess' 'total fit'
pars =

   1.0e+03 *

    1.3052    0.0049    0.0001    0.0000    5.3162    0.0051    0.0000    0.0001

the fits are not perfect. The small peak is pretty good, but there is an unphysical tail on the larger peak, and a small mismatch at the peak. There is not much to do about that, it means the model peak we are using is not a good model for the peak. We will still integrate the areas though.

now extract out the two peaks and integrate the areas

pars1 = pars(1:4)
pars2 = pars(5:8)

peak1 = asym_peak(pars1, t);
peak2 = asym_peak(pars2, t);

plot(t,peak1)
plot(t,peak2)
legend 'raw data' 'initial guess' 'total fit' 'peak 1' 'peak 2';

area1 = trapz(t, peak1)
area2 = trapz(t, peak2)
pars1 =

   1.0e+03 *

    1.3052    0.0049    0.0001    0.0000


pars2 =

   1.0e+03 *

    5.3162    0.0051    0.0000    0.0001


area1 =

   1.3052e+03


area2 =

   5.3162e+03


Compute relative amounts

percent1 = area1/(area1 + area2)
percent2 = area2/(area1 + area2)
percent1 =

    0.1971


percent2 =

    0.8029

This sample was air, and the first peak is oxygen, and the second peak is nitrogen. we come pretty close to the actual composition of air, although it is low on the oxygen content. To do better, one would have to use a calibration curve.

end

% categories: data analysis

% post_id = 1994; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/06/22/curve-fitting-to-get-overlapping-peak-areas/;

VN:F [1.9.20_1166]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Colors, 3D Plotting, and Data Manipulation

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Colors, 3D Plotting, and Data Manipulation

Guest authors: Harrison Rose and Breanna Stillo

In this post, Harrison and Breanna present three-dimensional experimental data, and show how to plot the data, fit curves through the data, and plot surfaces. You will need to download mycmap.mat

function main
close all
clear all
clc

In order to generate plots with a consistent and hopefully attractive color scheme, we will generate these color presets. Matlab normalizes colors to [R G B] arrays with values between 0 and 1. Since it is easier to find RGB values from 0 to 255, however, we will simply normalize them ourselves. A good colorpicker is http://www.colorpicker.com.

pcol = [255,0,0]/255; % red
lcol = [135,14,179]/255; % a pinkish color

Raw Data

This is raw data from an experiment for three trials (a and b and c). The X and Y are independent variables, and we measured Z.

X_a = [8.3 8.3 8.3 8.3 8.3 8.3 8.3];
X_b = [11 11 11 11 11 11 11];
X_c = [14 14 14 14 14 14 14];

Y_a = [34 59 64 39 35 36 49];
Y_b = [39 32 27 61 52 57 65];
Y_c = [63 33 38 50 54 68 22];

Z_a = [-3.59833 7.62 0 4.233333333 -2.54 -0.635 7.209];
Z_b = [16.51 10.16 6.77 5.08 15.24 13.7 3.048];
Z_c = [36 20 28 37 40 32 10];

Plotting the raw data for all three trials:

As you can see, the raw data does not look like very much, and it is pretty hard to interperet what it could mean.

We do see, however, that since X_a is all of one value, X_b is all of another value, and X_c is all of a third, that the data lies entirely on three separate planes.

figure
hold on     % Use this to plot multiple series on a single figure
plot3(X_a,Y_a,Z_a,'.','Color',pcol)
plot3(X_b,Y_b,Z_b,'.','Color',pcol)
plot3(X_c,Y_c,Z_c,'.','Color',pcol)
hold off    % Use this to make sure the next plot command will not
% try to plot on this figure as well.
title('Raw Experimental Data for Trials A, B, and C')
xlabel('x Data')
ylabel('y Data')
zlabel('z Data')
grid on     % 3D data is easier to visualize with the grid. Normally
% the grid defaults to 'on' but using the 'hold on'
% command as we did above causes the grid to default to
% 'off'


A note on the view:

The command

 view(Az,El)

lets you view a 3D plot from the same viewpoint each time you run the code. To determine the best viewpoint to use, use the click the ‘Rotate 3D’ icon in the figure toolbar (it looks like a box with a counterclockwise arrow around it), and drag your plot around to view it from different angles. You will notice the text “Az: ## El: ##” appear in the lower left corner of the figure window. This stands for Azimuth and Elevation which represent a point in spherical coordinates from which to view the plot (the radius is fixed by the axes sizes). The command used here will always display the plot from azimuth = -39, and elevation = 10.

view(-39,10)


A closer look at the raw data:

figure
hold on
plot(Y_a,Z_a,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Raw Data for Trial A, x = 8.3')
xlabel('y')
ylabel('z')
hold off

figure
hold on
plot(Y_b,Z_b,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Raw Data for Trial B, x = 11')
xlabel('y')
ylabel('z')
hold off

figure
hold on
plot(Y_c,Z_c,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Raw Data for Trial C, x = 14')
xlabel('y')
ylabel('z')
hold off


Fitting the raw data:

In this case, we expect the data to fit the shape of a binomial distribution, so we use the following fit function with three parameters:

    function v = mygauss(par, t)
        A  = par(1);
        mu = par(2);
        s  = par(3);

        v=A*exp(-(t-mu).^2./(2*s.^2));
    end

Fitting the data

res = 20;
Yfit=linspace(20,70,res);

% Dataset A
guesses=[20, 40, 20];
[pars residuals J]=nlinfit(Y_a, Z_a-min(Z_a), @mygauss, guesses);

A1=pars(1);
mu1=pars(2);
s1=pars(3);

Zfitfun_a=@(y) A1*exp(-(y-mu1).^2./(2*s1.^2))+min(Z_a);
% Note: We have to shift the dataset up to zero because a gaussian
% typically cannot go below the horizontal axis

% Generate a fit-line through the data

Zfit_a=Zfitfun_a(Yfit);

% Dataset B
guesses=[10, 25, 20];
[pars residuals J]=nlinfit(Y_b, Z_b, @mygauss, guesses);

A2=pars(1);
mu2=pars(2);
s2=pars(3);

Zfitfun_b=@(y) A2*exp(-(y-mu2).^2./(2*s2.^2));

% Generate a fit-line through the data
Zfit_b=Zfitfun_b(Yfit);

% Dataset c
guesses=[20, 60, 20];
[pars residuals J]=nlinfit(Y_c, Z_c, @mygauss, guesses);

A3=pars(1);
mu3=pars(2);
s3=pars(3);

Zfitfun_c=@(y) A3*exp(-(y-mu3).^2./(2*s3.^2));

% Generate a fit-line through the data
Zfit_c=Zfitfun_c(Yfit);

Plotting the fitted data:

% Trial A
figure
hold on
plot(Y_a,Z_a,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Fitted Data for Trial A, x = 8.3')
xlabel('y')
ylabel('z')
plot(Yfit,Zfit_a,'Color',lcol,'LineWidth',2)
hold off

% Trial B
figure
hold on
plot(Y_b,Z_b,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Fitted Data for Trial B, x = 11')
xlabel('y')
ylabel('z')
plot(Yfit, Zfit_b,'Color',lcol,'LineWidth',2)
hold off

% Trial C
figure
hold on
plot(Y_c,Z_c,'o','MarkerFaceColor',pcol,'MarkerEdgeColor','none')
title('Fitted Data for Trial C, x = 14')
xlabel('y')
ylabel('z')
plot(Yfit,Zfit_c,'Color',lcol,'LineWidth',2)
hold off


Generate a surface plot:

For every point along the fit-line for dataset A, connect it to the corresponding point along the fit-line for dataset B using a straight line. This linear interpolation is done automatically by the surf command. If more datasets were available, we could use a nonlinear fit to produce a more accurate surface plot, but for now, we will assume that the experiment is well-behaved, and that 11 and 8.3 are close enough together that we can use linear interpolation between them.

Interpolate along the fit lines to produce YZ points for each data series (X):

Fits through the fits

Yspace = linspace(25,65,res);
Xs = [8.3; 11; 14];
Xspan1 = linspace(8.3,14,res);
Xspan2 = linspace(8,14.3,res);
q = 0;
r = 0;
for i = 1:res
    Zs = [Zfitfun_a(Yspace(i)); Zfitfun_b(Yspace(i)); Zfitfun_c(Yspace(i))];
    Yspan = linspace(Yspace(i),Yspace(i),res);
    p(:,i) = polyfit(Xs,Zs,2);
    for j = 1:res
        Zfit_fit(i,j) = polyval(p(:,i),Xspan1(j));
    end
end

Plot a surface through the fits through the fits

figure
hold all
surf(Xspan1,Yspace,Zfit_fit,'EdgeColor','black');   % Generate the surface
%surf(Xsurf,Ysurf,Zsurf,'EdgeColor','black');   % Generate the surface

% plot leading and lagging lines (just for show!)
for i = 1:res
    Zs = [Zfitfun_a(Yspace(i)); Zfitfun_b(Yspace(i)); Zfitfun_c(Yspace(i))];
    Yspan = linspace(Yspace(i),Yspace(i),res);
    plot3(Xspan2,Yspan,polyval(p(:,i),Xspan2),'Color','black')
end

%Plot the raw data:
plot3(X_a,Y_a,Z_a,'o','markersize',5,'MarkerFaceColor',pcol,'MarkerEdgeColor','none')
plot3(X_b,Y_b,Z_b,'o','markersize',5,'MarkerFaceColor',pcol,'MarkerEdgeColor','none')
plot3(X_c,Y_c,Z_c,'o','markersize',5,'MarkerFaceColor',pcol,'MarkerEdgeColor','none')

%Plot the fit lines:
plot3(linspace(8.3,8.3,res),Yfit,Zfit_a,'Color',lcol,'LineWidth',2)
plot3(linspace(11,11,res),Yfit,Zfit_b,'Color',lcol,'LineWidth',2)
plot3(linspace(14,14,res),Yfit,Zfit_c,'Color',lcol,'LineWidth',2)
view(-39,10)
alpha(.2)   % Make the plot transparent
title('z vs. x vs. y')
xlabel('x')
ylabel('y')
zlabel('z')
grid on


Add a line through the max

We must minimize the negative of each of our initial three fits to find the maximum of the fit.

f1 = @(y) -Zfitfun_a(y);
f2 = @(y) -Zfitfun_b(y);
f3 = @(y) -Zfitfun_c(y);

Ystar_a = fminbnd(f1,0,100);
Ystar_b = fminbnd(f2,0,100);
Ystar_c = fminbnd(f3,0,100);

Ystars = [Ystar_a; Ystar_b; Ystar_c];
Zstars = [Zfitfun_a(Ystar_a); Zfitfun_b(Ystar_b); Zfitfun_c(Ystar_c)];

hold on
plot3(Xs,Ystars,Zstars,'o','markersize',7,'MarkerFaceColor','white')

xy = polyfit(Xs,Ystars,2);
xz = polyfit(Xs,Zstars,2);

plot3(Xspan1,polyval(xy,Xspan1),polyval(xz,Xspan1),'Color','yellow','LineWidth',2)


A note on colormaps

To edit the color of the surface, you can apply a colormap. Matlab has several built-in colormaps (to see them, type ‘doc colormap’ in the Command Window. As you see here, however, we are using our own colormap, which has been stored in the file ‘mycmap.mat’

To modify a colormap, type

 cmapeditor

in the Command Window or click ‘Edit’ ‘Colormap…’ in the figure window. For instructions on how to use the colormap editor, type ‘doc colormapeditor’ in the Command Window. If you have ‘Immediate apply’ checked, or you click ‘Apply’ the colormap will load onto the figure. To save a colormap, type the following into the Command Window:

 mycmap = get(gcf,'Colormap')
 save('mycmap')
s = load('mycmap.mat');
newmap = s.mycmap;
set(gcf,'Colormap',newmap)  % See corresponding 'Note on colormaps')

end

Summary

Matlab offers a lot of capability to analyze and present data in 3-dimensions.

% categories: plotting, data analysis

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Some of this, sum of that

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Some of this, sum of that

John Kitchin 5/29/2012

Matlab provides a sum function to compute the sum of a vector. However, the sum function does not work on a cell array of numbers, and it certainly does not work on nested cell arrays. We will solve this problem with recursion.

function main

Basic sum of a vector

v = [1 2 3 4 5 6 7 8 9]
sum(v)
v =

     1     2     3     4     5     6     7     8     9


ans =

    45

sum does not work for a cell

v = {1 2 3 4 5 6 7 8 9}  % note {} is a cell array
try
    sum(v)
catch
    'An error was caught!'
end
v = 

    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]


ans =

An error was caught!

Compute sum of a cell array with a loop

In this case, it is not hard to write a loop that computes the sum. Note how the indexing is done. v{i} is the contents of the ith element in the cell array. v(i) is a new cell array containing the contents of the ith element.

s = 0;
for i=1:length(v)
    s = s + v{i};
end
sprintf('The sum of the cell array v = %d', s)
ans =

The sum of the cell array v = 45

Nested cell arrays

Suppose now we have nested cell arrays. This kind of structured data might come up if you had grouped several things together. For example, suppose we have 5 departments, with 1, 5, 15, 7 and 17 people in them, and in each department they are divided into groups.

Department 1: 1 person
Department 2: group of 2 and group of 3
Department 3: group of 4 and 11, with a subgroups of 5 and 6 making
              up the group of 11.
Department 4: 7 people
Department 5: one group of 8 and one group of 9.

We can represent that data as nested cell arrays like this.

v = {1,
    {2,3}
    {4,{5,6}},
    7,
    {8,9}}
v = 

    [       1]
    {1x2 cell}
    {1x2 cell}
    [       7]
    {1x2 cell}

Sum of nested cell arrays

Now, say we want to know the sum of the people in all departments. We can not write a loop to do this because of the nesting. Lets see what a loop would do.

for i=1:length(v)
    v{i}, class(v{i})
end
ans =

     1


ans =

double


ans = 

    [2]    [3]


ans =

cell


ans = 

    [4]    {1x2 cell}


ans =

cell


ans =

     7


ans =

double


ans = 

    [8]    [9]


ans =

cell

You can see the output of v{i} is variable. Sometimes it is a number, sometimes it is a new cell array, and sometimes a new nested cell array. There is no set of loops we can construct to add all the numbers. Instead, we have to construct a recursive function that goes into each nested cell array to the end, gets the numbers and adds them up.

a recursive sum function

For a recursive function we need to identify the situation that is terminal, work towards the terminal condition, and then call the function again. Here, as we traverse the cell array, when we get a number we end the recursive call, and when we get another cell array we recurse. So, we will iterate over the cell array, and everytime we find a number we add it to the sum, and otherwise call the function again

    function s = recursive_sum(v)
        % v is a (possibly nested) cell array
        s=0; % initial value of sum
        for i=1:length(v)
            if isnumeric(v{i})
                % this is the terminal step
                v{i}  % show the order of traversal
                s = s + v{i};
            else
                % we got another cell, so we recurse
                s = s + recursive_sum(v{i});
            end
        end
    end

Test the recursive function

We should get the same answer as before. and we do. You can also see that the numbers are added up in the order you go through the cell array.

recursive_sum(v)
ans =

     1


ans =

     2


ans =

     3


ans =

     4


ans =

     5


ans =

     6


ans =

     7


ans =

     8


ans =

     9


ans =

    45

check on a non-nested cell array

recursive_sum({1,2,3,4,5,6,7,8,9})
ans =

     1


ans =

     2


ans =

     3


ans =

     4


ans =

     5


ans =

     6


ans =

     7


ans =

     8


ans =

     9


ans =

    45

check on a non-cell array argument

try
    recursive_sum([1,2,3,4,5,6,7,8,9])
catch
    disp('our function does not work if the argument is not a cell array!')
end
our function does not work if the argument is not a cell array!

Conclusions

In Post 1970 we examined recursive functions that could be replaced by loops. Here we examine a function that can only work with recursion because the nature of the nested data structure is arbitrary. There are arbitary branches and depth in the data structure. Recursion is nice because you don’t have to define that structure in advance.

end

% categories: Miscellaneous
% tags: recursive

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Lather, rinse and repeat

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Lather, rinse and repeat

John Kitchin 5/28/2012

Recursive functions are functions that call themselves repeatedly until some exit condition is met. Today we look at a classic example of recursive function for computing a factorial. The factorial of a non-negative integer n is denoted n!, and is defined as the product of all positive integers less than or equal to n.

function main

recursive function definition

The key ideas in defining a recursive function is that there needs to be some logic to identify when to terminate the function. Then, you need logic that calls the function again, but with a smaller part of the problem. Here we recursively call the function with n-1 until it gets called with n=0. 0! is defined to be 1.

    function f = recursive_factorial(n)
        % compute the factorial recursively. Note if you put a negative
        % number in, this function will never end. We also do not check if
        % n is an integer.
        if n == 0
            % this is the terminating case
            f = 1
        else
            % otherwise we call the function again, this time with n-1
            f = n * recursive_factorial(n-1)
        end
    end
f =

     1


f =

     1


f =

     2


f =

     6


f =

    24


f =

   120


ans =

   120

Matlab factorial function

factorial(5)
ans =

   120

Now our function

recursive_factorial(5)

Compare to a loop solution

This example can also be solved by a loop. This loop is easier to read and understand than the recursive function. Note the recursive nature of defining the variable as itself times a number.

n = 5;
factorial_loop = 1;
for i=1:n
    factorial_loop = factorial_loop*i
end
factorial_loop =

     1


factorial_loop =

     2


factorial_loop =

     6


factorial_loop =

    24


factorial_loop =

   120

Conclusions

Recursive functions have a special niche in mathematical programming. There is often another way to accomplish the same goal. That isn’t always true though, and in a future post we will examine cases where recursion is the only way to solve a problem.

end

% categories: math
% tags: recursive

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Enhancements to Matlab wordpress

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Enhancements to Matlab wordpress

John Kitchin May 7, 2012

I modified some of my wordpress matlab code to make some enhancements in the publishing process. Mostly I wanted some markup to make the blog posts a little nicer. For example, a markup to indicate a command, or to make sure a file can be downloaded, and to make tooltips on text.


Contents

a better command markup

I wanted command references in the posts to have a link to some documentation of the command. there doesn’t seem to be a systematic way to get the url to Matlab documentation, so I settled for a search on Matlab’s site. See: fsolve command! Click on the link to search Matlab’s website for it. All that from:

:command:`fsolve`

Download links

sometimes examples need a file to work. you can download test.m

:download:`test.m`

tooltips


. Here is the markup for a tooltip. At publish time I add a little javascript to the html to make this happen.

:tooltip:`<Surprise! you just found a tooltip!> Hover on me!`

Random plot

I have to upload the graphics to wordpress, then change the url in the img src attribute.

plot(1:5,5:9)


Warnings

Sometimes warning text should be obvious!

WARNING: Something can fail badly if you do that.

Notes

I also like to separate out notes about some things.
Note: You can also try this variation!

where is the code?

You can see the code at https://github.com/johnkitchin/matlab-wordpress.

% categories: miscellaneous
% tags: blog, publish


% post_id = 1937; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/05/08/enhancements-to-matlab-wordpress-2/;

Brief intro to regular expressions

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Brief intro to regular expressions

John Kitchin 5/6/2012

This example shows how to use a regular expression to find strings matching the pattern :cmd:`datastring`. We want to find these strings, and then replace them with something that depends on what cmd is, and what datastring is.

function main
clear all;

    function html = cmd(datastring)
        % replace :cmd:`datastring` with html code with light gray
        % background
        s = '<FONT style="BACKGROUND-COLOR: LightGray">%s</FONT>';
        html = sprintf(s,datastring);
    end

    function html = red(datastring)
        % replace :red:`datastring` with html code to make datastring
        % in red font
        html = sprintf('<font color=red>%s</font>',datastring)
    end

Define a multiline string

text = ['Here is some text. use the :cmd:`open` to get the text into\n'...
    ' a variable. It might also be possible to get a multiline :red:`line\n' ...
    ' one line 2` directive.'];
sprintf(text)
ans =

Here is some text. use the :cmd:`open` to get the text into
 a variable. It might also be possible to get a multiline :red:`line
 one line 2` directive.

find all instances of :*:`*`

regular expressions are hard. there are whole books on them. The point of this post is to alert you to the possibilities. I will break this regexp down as follows. 1. we want everything between :*: as the directive. ([^:]*) matches everything not a :. :([^:]*): matches the stuff between two :. 2. then we want everything between `*`. ([^`]*) matches everything not a `. 3. The () makes a group that matlab stores as a token, so we can refer to the found results later.

regex = ':([^:]*):`([^`]*)`';
[tokens matches] = regexp(text,regex, 'tokens','match');

for i = 1:length(tokens)
    directive = tokens{i}{1};
    datastring = tokens{i}{2};
    sprintf('directive = %s', directive)
    sprintf('datastring = %s', datastring)

    % construct string of command to evaluate directive(datastring)
    runcmd = sprintf('%s(''%s'')', directive, datastring)
    html = eval(runcmd)

    % now replace the matched text with the html output
    text = strrep(text, matches{i}, html);
    % now
end
ans =

directive = cmd


ans =

datastring = open


runcmd =

cmd('open')


html =

<FONT style="BACKGROUND-COLOR: LightGray">open</FONT>


ans =

directive = red


ans =

datastring = line\n one line 2


runcmd =

red('line\n one line 2')


html =

<font color=red>line\n one line 2</font>


html =

<font color=red>line\n one line 2</font>

See modified text

sprintf(text)
ans =

Here is some text. use the <FONT style="BACKGROUND-COLOR: LightGray">open</FONT> to get the text into
 a variable. It might also be possible to get a multiline <font color=red>line
 one line 2</font> directive.

this shows the actual html, rendered to show the changes.

web(sprintf('text://%s', text))
end

% categories: miscellaneous
% tags: regular expression

% post_id = 1701; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/05/07/1701/;

Issues with nested functions

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Issues with nested functions

John Kitchin

Nested functions are great for passing lots of information between functions conveniently. But, there are some gotchas to be wary of!

function main

a typical nested function use

It is common to define several parameters, and then to use them inside a nested function

a = 4;
b = 3;

    function y = f1(x)
        y = a + b*x;
    end

f1(2)
ans =

    10

classic example of a potential problem

Nested functions share the same namespace as the main function, which means you can change the value of a variable in the main function in the nested function. This does not happen with subfunctions.

i = 5 % initial value of i
    function y = f2(x)
        % nested function that uses a and b from above, and uses an
        % internal loop to compute a value of y.
        y = 0;
        for i=1:10  % this is going to change the value of i outside this function
            y = y + a*b + i;
        end
    end

f2(2)
i % note i = 10 now
i =

     5


ans =

   175


i =

    10

This kind of behavior can be very difficult to debug or even know is a problem! you have to be aware that it is an issue when using nested functions.

subfunctions

subfunctions do not share namespaces, so you can reuse variable names. This can prevent accidental variable value changes, but, you have to put the variable information into the subfunction

i = 5
f3(2)
i % note that the value of i has not been changed here.
i =

     5


ans =

   175


i =

     5

end

function y = f3(x)
% subfunction that uses a and b from above, and uses an internal loop
% to compute a value of y.
a = 4;
b = 3;
y = 0;
for i=1:10  % this is going to change the value of i outside this function
    y = y + a*b + i;
end
end

% categories: basic

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)

Better interpolate than never

PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">



Contents

Better interpolate than never

John Kitchin

close all; clear all

We often have some data that we have obtained in the lab, and we want to solve some problem using the data. For example, suppose we have this data that describes the value of f at time t.

t = [0.5 1 3 6]
f = [0.6065    0.3679    0.0498    0.0025]
plot(t,f)
xlabel('t')
ylabel('f(t)')
t =

    0.5000    1.0000    3.0000    6.0000


f =

    0.6065    0.3679    0.0498    0.0025


Estimate the value of f at t=2.

This is a simple interpolation problem.

interp1(t,f,2)
ans =

    0.2089

The function we sample above is actually f(t) = exp(-t). The linearly interpolated example isn’t too accurate.

exp(-2)
ans =

    0.1353

improved interpolation

we can tell Matlab to use a better interpolation scheme: cubic polynomial splines like this. For this example, it only slightly improves the interpolated answer. That is because you are trying to estimate the value of an exponential function with a cubic polynomial, it fits better than a line, but it can’t accurately reproduce the exponential function.

interp1(t,f,2,'cubic')
ans =

    0.1502

The inverse question

It is easy to interpolate a new value of f given a value of t. What if we want to know the time that f=0.2? We can approach this a few ways.

method 1

we setup an equation that interpolates f for us. Then, we setup another function that we can use fsolve on. The second function will be equal to zero at the time. The second function will look like 0 = 6.25 – f(t). The answer for 0.2=exp(-t) is t = 1.6094. Since we use interpolation here, we will get an approximate answer. The first method is not to bad.

func = @(time) interp1(t,f,time);
g = @(time) 0.2 - func(time);

initial_guess = 2;
ans = fsolve(g, initial_guess)
Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.




ans =

    2.0556

method 2: switch the interpolation order

We can switch the order of the interpolation to solve this problem.

interp1(f,t,0.2)
interp1(f,t,0.2,'cubic')
ans =

    2.0556


ans =

    1.5980

Note we get the same answer with linear interpolation, but a different answer with the cubic interpolation. Here the cubic interpolation does much better job of fitting. You can see why in this figure. The cubic interpolation is much closer to the true function over most of the range. Although, note that it overestimates in some regions and underestimates in other regions.

figure
hold all
x = linspace(0.25, 6);
plot(t,f,'ko ')
plot(x, interp1(t,f,x),'r-')
plot(x, interp1(t,f,x,'cubic'),'b-')
plot(x,exp(-x),'k--')
xlabel('t')
ylabel('f(t)')
legend('raw data','linear interpolation','cubic interpolation','exp(-t)')


A harder problem

Suppose we want to know at what time is 1/f = 100? Now we have to decide what do we interpolate: f(t) or 1/f(t). Let’s look at both ways and decide what is best. The answer to $1/exp(-t) = 100$ is 4.6052

interpolate on f(t) then invert the interpolated number

g2 = @(time) 100 - 1./func(time);
initial_guess = 4.5;
a1 = fsolve(g2, initial_guess)
sprintf('%%error = %1.2f',((a1 - 4.6052)/2.5*100))
Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.




a1 =

    5.5243


ans =

%error = 36.76

invert f(t) then interpolate on 1/f

ifunc2 = @(time) interp1(t,1./f,time);
g5 = @(time) 100 - ifunc2(time);
a2 = fsolve(g5, initial_guess)
sprintf('%%error = %1.2f',((a2 - 4.6052)/2.5*100))
Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the default value of the function tolerance, and
the problem appears regular as measured by the gradient.




a2 =

    3.6311


ans =

%error = -38.96

discussion

In this case you get different errors, one overestimates and one underestimates the answer, and by a lot: ~plus or minus 37% . Let’s look at what is happening.

plotting 1/f

figure
hold all
x = linspace(0.25, 6);
plot(t,1./f,'ko ')
plot(x, 1./interp1(t,f,x),'r-')
plot(x, interp1(t,1./f,x,'cubic'),'b-')
plot(x,1./exp(-x),'k--')
xlabel('t')
ylabel('1/f(t)')
legend('raw data','linear interpolation','cubic interpolation','1/exp(-t)')

You can see that the cubic interpolation is overestimating the true function between 3 and 6, while the linear interpolation underestimates it. This is an example of where you clearly need more data in that range to make good estimates. Neither interpolation method is doing a great job. The trouble in reality is that you often do not know the real function to do this analysis. Here you can say the time is probably between 3.6 and 5.5 where 1/f(t) = 100, but you can’t read much more than that into it. If you need a more precise answer, you need better data, or you need to use an approach other than interpolation. For example, you could fit an exponential function to the data and use that to estimate values at other times.

'done'

% categories: basic matlab
% tags: interpolation
ans =

done

VN:F [1.9.20_1166]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.20_1166]
Rating: 0 (from 0 votes)