Brief intro to regular expressions
May 07, 2012 at 03:34 PM | categories: miscellaneous
John Kitchin 5/6/2012
This example shows how to use a regular expression to find strings matching the pattern :cmd:`datastring`. We want to find these strings, and then replace them with something that depends on what cmd is, and what datastring is.
function main
clear all;

function html = cmd(datastring)
    % replace :cmd:`datastring` with html code with light gray
    % background
    s = '<FONT style="BACKGROUND-COLOR: LightGray">%s</FONT>';
    html = sprintf(s,datastring);
end

function html = red(datastring)
    % replace :red:`datastring` with html code to make datastring
    % in red font
    html = sprintf('<font color=red>%s</font>',datastring)
end
Define a multiline string
text = ['Here is some text. use the :cmd:`open` to get the text into\n'...
        ' a variable. It might also be possible to get a multiline :red:`line\n' ...
        ' one line 2` directive.'];
sprintf(text)
ans =
Here is some text. use the :cmd:`open` to get the text into
 a variable. It might also be possible to get a multiline :red:`line
 one line 2` directive.
Find all instances of :*:`*`
Regular expressions are hard; there are whole books on them. The point of this post is to alert you to the possibilities. I will break this regexp down as follows:
1. We want everything between the two colons in :*: as the directive. [^:]* matches any run of characters that are not a colon, so :([^:]*): matches the text between two colons and captures it.
2. Then we want everything between the two backticks in `*`. ([^`]*) matches and captures any run of characters that are not a backtick.
3. The () makes a group that MATLAB stores as a token, so we can refer to the found results later.
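To see the tokens in isolation, here is a minimal sketch that applies the same pattern to a short one-line string. The string `sample` and the output names `tok` and `mat` are made up for this illustration; they are not part of the original script.

```matlab
% Hypothetical sample string, used only for illustration
sample = 'press :cmd:`save` now';

% 'tokens' returns the text captured by each () group,
% 'match' returns the full text matched by the pattern
[tok, mat] = regexp(sample, ':([^:]*):`([^`]*)`', 'tokens', 'match');

directive  = tok{1}{1};   % captured by the first group, between the colons
datastring = tok{1}{2};   % captured by the second group, between the backticks

fprintf('match = %s, directive = %s, datastring = %s\n', ...
        mat{1}, directive, datastring);
% prints: match = :cmd:`save`, directive = cmd, datastring = save
```

Each element of `tok` is itself a cell array with one entry per group, which is why the tokens are indexed as `tok{i}{1}` and `tok{i}{2}`.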
regex = ':([^:]*):`([^`]*)`';
[tokens, matches] = regexp(text,regex, 'tokens','match');

for i = 1:length(tokens)
    directive = tokens{i}{1};
    datastring = tokens{i}{2};
    sprintf('directive = %s', directive)
    sprintf('datastring = %s', datastring)

    % construct string of command to evaluate directive(datastring)
    runcmd = sprintf('%s(''%s'')', directive, datastring)
    html = eval(runcmd)

    % now replace the matched text with the html output
    text = strrep(text, matches{i}, html);
end
ans =
directive = cmd
ans =
datastring = open
runcmd =
cmd('open')
html =
<FONT style="BACKGROUND-COLOR: LightGray">open</FONT>
ans =
directive = red
ans =
datastring = line\n one line 2
runcmd =
red('line\n one line 2')
html =
<font color=red>line\n one line 2</font>
html =
<font color=red>line\n one line 2</font>
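One caveat with the eval approach: the command string breaks if datastring contains a single quote, and any directive name found in the text gets executed. A possible alternative, sketched here under my own assumptions, is to look the directive up in a containers.Map of function handles. This is a different technique from the one used in this post; the names `handlers` and `txt` and the sample string are hypothetical.

```matlab
% Assumed lookup table: directive name -> formatting function
handlers = containers.Map();
handlers('cmd') = @(s) sprintf('<FONT style="BACKGROUND-COLOR: LightGray">%s</FONT>', s);
handlers('red') = @(s) sprintf('<font color=red>%s</font>', s);

% Hypothetical sample text; note the single quote inside the :red: directive
txt = 'use :cmd:`open`, but :red:`don''t` delete it';

[tokens, matches] = regexp(txt, ':([^:]*):`([^`]*)`', 'tokens', 'match');

for i = 1:length(tokens)
    directive  = tokens{i}{1};
    datastring = tokens{i}{2};
    % only known directives are replaced; unknown ones are left untouched
    if isKey(handlers, directive)
        f = handlers(directive);
        txt = strrep(txt, matches{i}, f(datastring));
    end
end

disp(txt)
```

Because datastring is passed as an argument rather than pasted into a command string, quotes in the text need no escaping.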
See modified text
sprintf(text)
ans =
Here is some text. use the <FONT style="BACKGROUND-COLOR: LightGray">open</FONT> to get the text into
 a variable. It might also be possible to get a multiline <font color=red>line
 one line 2</font> directive.
This shows the actual HTML, rendered to show the changes.
web(sprintf('text://%s', text))
end

% categories: miscellaneous
% tags: regular expression
% post_id = 1701; %delete this line to force new post;
% permaLink = http://matlab.cheme.cmu.edu/2012/05/07/1701/;