Brief intro to regular expressions

John Kitchin 5/6/2012

This example shows how to use a regular expression to find strings matching the pattern :cmd:`datastring`. We want to find these strings and then replace each one with something that depends on what cmd is and what datastring is.

function main
clear all;

    function html = cmd(datastring)
        % replace :cmd:`datastring` with html code with light gray
        % background
        s = '<FONT style="BACKGROUND-COLOR: LightGray">%s</FONT>';
        html = sprintf(s,datastring);
    end

    function html = red(datastring)
        % replace :red:`datastring` with html code to make datastring
        % in red font
        html = sprintf('<font color=red>%s</font>',datastring)
    end

Define a multiline string

text = ['Here is some text. use the :cmd:`open` to get the text into\n'...
    ' a variable. It might also be possible to get a multiline :red:`line\n' ...
    ' one line 2` directive.'];
sprintf(text)
ans =

Here is some text. use the :cmd:`open` to get the text into
 a variable. It might also be possible to get a multiline :red:`line
 one line 2` directive.
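
One detail worth keeping in mind for what follows: the variable text holds the two characters \ and n, not an actual newline; it is sprintf that expands them. A minimal sketch of the difference (the names raw and expanded are illustrative and not part of the original example):

```matlab
% Illustrative names; not part of the original example.
raw = 'line\n one';          % backslash and n are two separate characters
expanded = sprintf(raw);     % sprintf converts the \n escape into a newline

fprintf('raw has %d characters\n', numel(raw));
fprintf('expanded has %d characters\n', numel(expanded));
```

Because the regular expression is applied to text itself rather than to the sprintf output, the tokens it returns still contain the literal \n.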

Find all instances of :*:`*`

Regular expressions are hard, and there are whole books on them. The point of this post is to alert you to the possibilities. The regexp used here breaks down as follows:

1. We want everything between the two colons in :*: as the directive. [^:]* matches any run of characters that are not a colon, so :([^:]*): captures the text between two colons.
2. Then we want everything between the backticks in `*`. [^`]* matches any run of characters that are not a backtick, so `([^`]*)` captures the text between two backticks.
3. The parentheses () make a group that MATLAB stores as a token, so we can refer to the found results later.
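
To see the pieces of the pattern in isolation, here is a small sketch that tries each part on a short test string before combining them. The string sample and the variable names d, s, tok, and mat are assumptions made for illustration only.

```matlab
sample = 'use :cmd:`open` here';   % hypothetical test string

% Part 1 alone: capture the directive between two colons
d = regexp(sample, ':([^:]*):', 'tokens');

% Part 2 alone: capture the datastring between two backticks
s = regexp(sample, '`([^`]*)`', 'tokens');

% Both parts together, returning the tokens and the full matched text
[tok, mat] = regexp(sample, ':([^:]*):`([^`]*)`', 'tokens', 'match');

disp(d{1}{1})      % the directive
disp(s{1}{1})      % the datastring
disp(mat{1})       % the whole directive as it appears in the string
```

Each match contributes one cell to tok, and that cell holds one entry per parenthesized group, in the order the groups appear in the pattern.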

regex = ':([^:]*):`([^`]*)`';
[tokens, matches] = regexp(text, regex, 'tokens', 'match');

for i = 1:length(tokens)
    directive = tokens{i}{1};
    datastring = tokens{i}{2};
    sprintf('directive = %s', directive)
    sprintf('datastring = %s', datastring)

    % construct string of command to evaluate directive(datastring)
    runcmd = sprintf('%s(''%s'')', directive, datastring)
    html = eval(runcmd)

    % now replace the matched text with the html output
    text = strrep(text, matches{i}, html);
end
ans =

directive = cmd


ans =

datastring = open


runcmd =

cmd('open')


html =

<FONT style="BACKGROUND-COLOR: LightGray">open</FONT>


ans =

directive = red


ans =

datastring = line\n one line 2


runcmd =

red('line\n one line 2')


html =

<font color=red>line\n one line 2</font>


html =

<font color=red>line\n one line 2</font>
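
The loop above builds a command string and hands it to eval. That works for this text, but a datastring containing a single quote would produce an invalid command. One possible alternative, sketched here under assumptions, is to look the directive up in a table of function handles. The containers.Map named handlers, the anonymous functions, and the string sample are hypothetical stand-ins for the nested functions and text used in this post.

```matlab
% Hypothetical dispatch table: directive name -> function handle.
handlers = containers.Map();
handlers('cmd') = @(s) sprintf('<FONT style="BACKGROUND-COLOR: LightGray">%s</FONT>', s);
handlers('red') = @(s) sprintf('<font color=red>%s</font>', s);

% Hypothetical input; note the apostrophe inside the second datastring.
sample = 'Use :cmd:`open`, then :red:`it''s done`.';
[tokens, matches] = regexp(sample, ':([^:]*):`([^`]*)`', 'tokens', 'match');

result = sample;
for i = 1:length(tokens)
    directive = tokens{i}{1};
    datastring = tokens{i}{2};
    if isKey(handlers, directive)
        f = handlers(directive);     % look up the handler; no eval needed
        result = strrep(result, matches{i}, f(datastring));
    end
end
disp(result)
```

Directives that are not in the table are left untouched, so unknown names in the text do not cause an error.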

See modified text

sprintf(text)
ans =

Here is some text. use the <FONT style="BACKGROUND-COLOR: LightGray">open</FONT> to get the text into
 a variable. It might also be possible to get a multiline <font color=red>line
 one line 2</font> directive.

This shows the actual HTML, rendered to show the changes.

web(sprintf('text://%s', text))
end

% categories: miscellaneous
% tags: regular expression

% post_id = 1701; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/05/07/1701/;