Perl Regular Expression Traps and How to Disarm Them, Part 2
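To make the pattern "." match newline characters as well, you use the match modifier "s". That modifier does not change the end-of-line anchor "$", however: unless the "m" modifier is also given, "$" matches only at the end of the string (or just before a trailing newline). A pattern that properly recognizes the end of every line (just before a newline, or at the end of the string) looks like this: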
(?:(?:$)|(?=[\n]))
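As a minimal sketch of how this pattern behaves, the following compares it with a plain "$" under the "m" modifier, which also matches before each newline. The subroutine names and the sample input are illustrative assumptions, not part of the original article.

```perl
use strict;
use warnings;

# Replace "EFG" at the end of any line using the lookahead-based
# pattern from the text (no /m modifier involved).
sub replace_with_lookahead {
    my ($text) = @_;
    $text =~ s/EFG(?:(?:$)|(?=[\n]))/efg/gs;
    return $text;
}

# The same replacement using the /m modifier, under which "$" matches
# before every newline as well as at the end of the string.
sub replace_with_m_modifier {
    my ($text) = @_;
    $text =~ s/EFG$/efg/gm;
    return $text;
}

my $input = "12345\nABCDEFG\n12345\n";
print replace_with_lookahead($input);
print replace_with_m_modifier($input);
```

On this input both subroutines should change only the second line, to "ABCDefg". The lookahead form is useful mainly when the "m" modifier cannot be applied to the whole pattern.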
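An example program using patterns that correctly recognize the start and the end of each line. The line-start counterpart used here is (?:(?:^)|(?<=[\n])), which matches at the beginning of the string or just after a newline.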
# (1) Line start, target on the second line:
#     a plain ^ does not match; the lookbehind pattern does.
$text = "12345\nABCDEFG\n12345\n";
print '=' x 10 . '(1-0)' . '=' x 10 . "\n";
print $text;
$text =~ s/^ABCDE/x/gs;
print '=' x 10 . '(1-1)' . '=' x 10 . "\n";
print $text;
$text =~ s/(?:(?:^)|(?<=[\n]))ABCDE/abcde/gs;
print '=' x 10 . '(1-2)' . '=' x 10 . "\n";
print $text;

# (2) Line start, target on the first and third lines:
#     a plain ^ changes only the first line.
$text = "12345\nABCDEFG\n12345\n";
print '=' x 10 . '(2-0)' . '=' x 10 . "\n";
print $text;
$text =~ s/^1234/6789/gs;
print '=' x 10 . '(2-1)' . '=' x 10 . "\n";
print $text;
$text =~ s/(?:(?:^)|(?<=[\n]))1234/6789/gs;
print '=' x 10 . '(2-2)' . '=' x 10 . "\n";
print $text;
$text =~ s/(?:(?:^)|(?<=[\n]))6789/1234/gs;
print '=' x 10 . '(2-3)' . '=' x 10 . "\n";
print $text;

# (3) Line end, target on the second line:
#     a plain $ does not match; the lookahead pattern does.
$text = "12345\nABCDEFG\n12345\n";
print '=' x 10 . '(3-0)' . '=' x 10 . "\n";
print $text;
$text =~ s/EFG$/x/gs;
print '=' x 10 . '(3-1)' . '=' x 10 . "\n";
print $text;
$text =~ s/EFG(?:(?:$)|(?=[\n]))/efg/gs;
print '=' x 10 . '(3-2)' . '=' x 10 . "\n";
print $text;

# (4) Line end, target on the first and third lines:
#     a plain $ changes only the last line.
$text = "12345\nABCDEFG\n12345\n";
print '=' x 10 . '(4-0)' . '=' x 10 . "\n";
print $text;
$text =~ s/2345$/7890/gs;
print '=' x 10 . '(4-1)' . '=' x 10 . "\n";
print $text;
$text =~ s/2345(?:(?:$)|(?=[\n]))/7890/gs;
print '=' x 10 . '(4-2)' . '=' x 10 . "\n";
print $text;
$text =~ s/7890(?:(?:$)|(?=[\n]))/2345/gs;
print '=' x 10 . '(4-3)' . '=' x 10 . "\n";
print $text;
Output
==========(1-0)==========
12345
ABCDEFG
12345
==========(1-1)==========
12345
ABCDEFG
12345
==========(1-2)==========
12345
abcdeFG
12345
==========(2-0)==========
12345
ABCDEFG
12345
==========(2-1)==========
67895
ABCDEFG
12345
==========(2-2)==========
67895
ABCDEFG
67895
==========(2-3)==========
12345
ABCDEFG
12345
==========(3-0)==========
12345
ABCDEFG
12345
==========(3-1)==========
12345
ABCDEFG
12345
==========(3-2)==========
12345
ABCDefg
12345
==========(4-0)==========
12345
ABCDEFG
12345
==========(4-1)==========
12345
ABCDEFG
17890
==========(4-2)==========
17890
ABCDEFG
17890
==========(4-3)==========
12345
ABCDEFG
12345
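The two anchor patterns are long enough that repeating them in every substitution invites typos. One possible way to package them, shown here only as a sketch, is to compile each once with qr// and interpolate it where needed. The names $LINE_START, $LINE_END, replace_at_line_start and replace_at_line_end are hypothetical and do not appear in the original program.

```perl
use strict;
use warnings;

# The two patterns are copied verbatim from the program above.
my $LINE_START = qr/(?:(?:^)|(?<=[\n]))/;
my $LINE_END   = qr/(?:(?:$)|(?=[\n]))/;

# Replace the literal string $from wherever it begins a line.
sub replace_at_line_start {
    my ($text, $from, $to) = @_;
    $text =~ s/$LINE_START\Q$from\E/$to/g;   # \Q...\E quotes metacharacters in $from
    return $text;
}

# Replace the literal string $from wherever it ends a line.
sub replace_at_line_end {
    my ($text, $from, $to) = @_;
    $text =~ s/\Q$from\E$LINE_END/$to/g;
    return $text;
}

my $text = "12345\nABCDEFG\n12345\n";
print replace_at_line_start($text, '1234', '6789');
print replace_at_line_end($text, '2345', '7890');
```

With the sample text, the first call should produce the same text as step (2-2) above and the second the same text as step (4-2).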