I also ran some timing tests on the BSD regexec(). I wrote some code to convert the common date formats in internet headers to an ISO 8601-style string (yyyymmddThhmmss) and tested four methods against the 746044-line file above. regexec() with full substring matches took 484 seconds. regexec() with one group match took 383 seconds. Compiling the pattern with REG_NOSUB and using the regex only to verify there was a match somewhere took 71 seconds. Using a custom function to scan for the pattern directly averaged 9 seconds! (Practically all of that is I/O for loading 182 MB of data from the disk.) If performance is an issue at all, either write your pattern matcher by hand or commit to a compile-time lexer generator like Flex or ANTLR.
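
To make the two fastest approaches concrete, here is a rough sketch in C. It is not the code I timed. The function names (compile_date_regex, regex_has_date, scan_date), the regex pattern, and the single "dd Mon yyyy hh:mm:ss" input format are all assumptions chosen for illustration; real headers come in more variants, and the time zone is ignored here.

```c
#include <ctype.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* REG_NOSUB variant: compile once, then only ask "is there a match?".
 * The pattern is a hypothetical one for "dd Mon yyyy hh:mm:ss". */
int compile_date_regex(regex_t *re)
{
    return regcomp(re,
        "[0-9]{1,2} [A-Z][a-z]{2} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}",
        REG_EXTENDED | REG_NOSUB);
}

/* Returns 1 if the line contains something date-shaped, else 0. */
int regex_has_date(const regex_t *re, const char *line)
{
    return regexec(re, line, 0, NULL, 0) == 0;
}

/* Read between min_digits and max_digits decimal digits at *p.
 * Advances *p and returns the value, or returns -1 on failure. */
static int take_number(const char **p, int min_digits, int max_digits)
{
    int v = 0, n = 0;
    while (n < max_digits && isdigit((unsigned char)(*p)[n])) {
        v = v * 10 + ((*p)[n] - '0');
        n++;
    }
    if (n < min_digits)
        return -1;
    *p += n;
    return v;
}

/* Map a three-letter English month abbreviation to 1..12, or 0 if unknown. */
static int month_number(const char *s)
{
    static const char names[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    for (int m = 0; m < 12; m++)
        if (strncmp(s, names + 3 * m, 3) == 0)
            return m + 1;
    return 0;
}

/* Hand-written scanner: converts "[Day, ]dd Mon yyyy hh:mm:ss" to
 * "yyyymmddThhmmss". out must hold 16 bytes. Returns 0 on success,
 * -1 if the input does not have the expected shape. */
int scan_date(const char *s, char out[16])
{
    const char *p = strchr(s, ',');   /* skip everything up to "Tue," */
    p = p ? p + 1 : s;
    while (*p == ' ')
        p++;

    int day = take_number(&p, 1, 2);
    if (day < 1 || day > 31 || *p++ != ' ') return -1;
    int mon = month_number(p);
    if (mon == 0) return -1;
    p += 3;
    if (*p++ != ' ') return -1;
    int year = take_number(&p, 4, 4);
    if (year < 0 || *p++ != ' ') return -1;
    int hh = take_number(&p, 2, 2);
    if (hh < 0 || hh > 23 || *p++ != ':') return -1;
    int mi = take_number(&p, 2, 2);
    if (mi < 0 || mi > 59 || *p++ != ':') return -1;
    int ss = take_number(&p, 2, 2);
    if (ss < 0 || ss > 60) return -1;   /* 60 allows a leap second */

    snprintf(out, 16, "%04d%02d%02dT%02d%02d%02d",
             year, mon, day, hh, mi, ss);
    return 0;
}
```

The scanner makes a single left-to-right pass with no backtracking and no match bookkeeping, which is plausibly where most of the difference comes from. The regex version should be compiled once outside the read loop and released with regfree() when done.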