Wednesday, April 6, 2011

How can I remove linebreaks with a Perl regex?

When I run:

perl -e '$x="abc\nxyz\n123"; $x =~ s/\n.*/... multiline.../; printf("str %s\n", $x);'

I expect the result to be:

str abc... multiline...

Instead I get:

str abc... multiline...
123

Where am I going wrong?

From Stack Overflow
  • $x =~ s/\n.*/... multiline.../s
    

    The /s modifier tells Perl to treat the string as a single line, which lets . match newlines. Ordinarily . matches any character except a newline, so .* stops at the end of "xyz" and the trailing "\n123" is left in place, which is the behavior you observed.

    obecalp : AFAIK, perl doesn't use PCRE. PCRE is a library to mimic perl regex syntax.
    chaos : My intended meaning was PCRE, not as a specific library, but as the general set of Perl-compatible regular expression engines, including Perl's. Perhaps my usage is inappropriate.
    Alan Moore : Then how are we supposed to tag questions that relate to the PCRE library? Anyway, not all "Perl-compatible" regex flavors use /s (or /m, /i, etc.). Languages without regex literals, like Python, Java and .NET, use compilation flags like "DOTALL" or "SingleLine" instead.
    chaos : Okeydokey. Edited.
  • You need to add the /s modifier to your regex so that the dot (.) also matches newlines, letting .* run to the end of the string. So this:

    $x =~ s/\n.*/... multiline.../;
    

    Becomes this:

    $x =~ s/\n.*/... multiline.../s;
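
To see the difference side by side, here is a minimal sketch that applies both substitutions to the string from the question. The variable names ($text, $without_s, $with_s) are only illustrative and do not come from the original post.

```perl
use strict;
use warnings;

# Same test string as in the question.
my $text = "abc\nxyz\n123";

# Without /s, "." does not match a newline, so only "\nxyz" is replaced
# and the trailing "\n123" survives.
(my $without_s = $text) =~ s/\n.*/... multiline.../;

# With /s, "." also matches newlines, so everything after "abc" is replaced.
(my $with_s = $text) =~ s/\n.*/... multiline.../s;

printf("without /s: %s\n", $without_s);
printf("with /s:    %s\n", $with_s);
```

The copy-then-substitute form, (my $copy = $text) =~ s/.../.../, is used so the original string stays unchanged and both results can be compared.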
    
