<html>
<head>
<title>Regular Expression Performance Comparison</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
<meta name="Template" content="C:\PROGRAM FILES\MICROSOFT OFFICE\OFFICE\html.dot">
<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
</head>
<body bgcolor="#ffffff" link="#0000ff" vlink="#800080">
<h2>Regular Expression Performance Comparison</h2>
<p>
The following tables compare these regular expression libraries:</p>
<p><a href="http://research.microsoft.com/projects/greta">GRETA</a>.</p>
<p><a href="http://www.boost.org/">The Boost regex library</a>.</p>
<p><a href="http://arglist.com/regex/">Henry Spencer's regular expression library</a>,
provided for comparison as a typical non-backtracking implementation.</p>
<P>Philip Hazel's <A href="http://www.pcre.org">PCRE</A> library.</P>
<H3>Details</H3>
<P>Machine: Intel Pentium 4 2.8GHz PC.</P>
<P>Compiler: %compiler%.</P>
<P>C++ Standard Library: %library%.</P>
<P>OS: %os%.</P>
<P>Boost version: %boost%.</P>
<P>PCRE version: %pcre%.</P>
<P>
As ever, care should be taken in interpreting the results. Only sensible regular
expressions (rather than pathological cases) are given; most are taken from the
Boost regex examples or from the <a href="http://www.regxlib.com/">Library of
Regular Expressions</a>. In addition, some variation in the relative
performance of these libraries can be expected on other machines, as memory
access and processor caching effects can be quite large for most finite state
machine algorithms.</P>
<H3>Averages</H3>
<P>The following are the average relative scores for all the tests. A perfect
regular expression library would score 1; in practice, anything less than 2
is pretty good.</P>
<P>%averages%</P>
<h3>Comparison 1: Long Search</h3>
<p>For each of the following regular expressions the time taken to find all
occurrences of the expression within a long English language text was measured
(<a href="ftp://ibiblio.org/pub/docs/books/gutenberg/etext02/mtent12.zip">mtent12.txt</a>
from <a href="http://promo.net/pg/">Project Gutenberg</a>, 19 MB). </p>
<P>%long_twain_search%</P>
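<P>A "find all occurrences" measurement of this kind can be sketched as follows.
This is a minimal illustration, not the harness used to produce the tables: it
uses <code>std::regex</code> from the C++ standard library as a stand-in for the
libraries compared here, and the names <code>search_timing</code> and
<code>time_find_all</code> are hypothetical.</P>

```cpp
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical result type for one timed search.
struct search_timing {
    std::size_t matches;  // number of non-overlapping occurrences found
    double seconds;       // wall-clock time for the whole scan
};

// Counts every occurrence of `expression` in `text` and times the scan.
// Assumption: std::regex stands in for the libraries compared on this page.
search_timing time_find_all(const std::string& expression,
                            const std::string& text)
{
    const std::regex re(expression);  // compiled outside the timed region

    const auto start = std::chrono::steady_clock::now();
    // Walking the iterator range performs one search per occurrence.
    const auto count = std::distance(
        std::sregex_iterator(text.begin(), text.end(), re),
        std::sregex_iterator());
    const auto stop = std::chrono::steady_clock::now();

    return { static_cast<std::size_t>(count),
             std::chrono::duration<double>(stop - start).count() };
}
```

<P>A single run is easily disturbed by caching effects, so a practical harness
would usually repeat the scan several times before reporting a figure.</P>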
<h3>Comparison 2: Medium Sized Search</h3>
<p>For each of the following regular expressions the time taken to find all
occurrences of the expression within a medium-sized English language text was
measured (the first 50K from mtent12.txt). </p>
<P>%short_twain_search%</P>
<H3>Comparison 3: C++ Code Search</H3>
<P>For each of the following regular expressions the time taken to find all
occurrences of the expression within the C++ source file
<A href="../../../boost/crc.hpp">boost/crc.hpp</A> was measured. </P>
<P>%code_search%</P>
<H3>Comparison 4: HTML Document Search</H3>
<P>For each of the following regular expressions the time taken to find all
occurrences of the expression within the HTML file <A href="../../libraries.htm">libs/libraries.htm</A>
was measured. </P>
<P>%html_search%</P>
<H3>Comparison 5: Simple Matches</H3>
<p>
For each of the following regular expressions the time taken to match against
the text indicated was measured. </p>
<P>%short_matches%</P>
<hr>
<p>Copyright John Maddock April 2003, all rights reserved.</p>
</body>
</html>