Regexp to extract the value of the src attribute of an img tag. It handles single-quoted values, double-quoted values, and unquoted values (which must not contain whitespace or quotes).
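# Single-quoted string, e.g. 'foo bar'
my $reSqString = qr{
    \'
    [^\']*
    \'
}x;
# Double-quoted string, e.g. "foo bar"
my $reDqString = qr{
    \"
    [^\"]*
    \"
}x;
# Any attribute value: quoted, or an unquoted run without whitespace or quotes
my $reAttrValue = qr{
    (?: $reSqString | $reDqString | [^\'\"\s]+ )
}x;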
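# img tag with a src attribute. The src value is captured in $1 (single-quoted),
# $2 (double-quoted), or $3 (unquoted). Attribute names are matched with \w+,
# so names containing hyphens (e.g. data-id) are not recognized.
my $reImgSrc =
qr{
    <[iI][mM][gG]
    \s+
    (?: \w+ \s*=\s* $reAttrValue \s+ )*    # attributes before src
    [sS][rR][cC] \s*=\s*
    (?: (?:\'([^\']+)\') | (?:\"([^\"]+)\") | ([^\'\"\s]+) )
    (?: \s+ \w+ \s*=\s* $reAttrValue )*    # attributes after src
    \s*/?>
}x;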
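
The regexp above leaves the src value in one of three capture groups, depending on how the value was quoted. The following is a minimal sketch of how it might be applied, not part of the original snippet: the helper name extract_img_srcs and the sample HTML are made up for illustration, the patterns are repeated only so that the sketch runs on its own, and Perl 5.10 or later is assumed for the // operator.

```perl
use strict;
use warnings;

# Patterns repeated from above so that this sketch is self-contained.
my $reSqString  = qr{ \' [^\']* \' }x;
my $reDqString  = qr{ \" [^\"]* \" }x;
my $reAttrValue = qr{ (?: $reSqString | $reDqString | [^\'\"\s]+ ) }x;
my $reImgSrc    = qr{
    <[iI][mM][gG]
    \s+
    (?: \w+ \s*=\s* $reAttrValue \s+ )*
    [sS][rR][cC] \s*=\s*
    (?: (?:\'([^\']+)\') | (?:\"([^\"]+)\") | ([^\'\"\s]+) )
    (?: \s+ \w+ \s*=\s* $reAttrValue )*
    \s*/?>
}x;

# Hypothetical helper: returns the src value of every img tag matched in $html.
sub extract_img_srcs {
    my ($html) = @_;
    my @srcs;
    while ($html =~ /$reImgSrc/g) {
        # Only one of the three groups is defined per match. Use // rather
        # than || so that a value such as "0" is not skipped.
        push @srcs, $1 // $2 // $3;
    }
    return @srcs;
}

# Made-up sample input covering all three quoting styles.
my @srcs = extract_img_srcs(
    q{<img alt='logo' src="a.png"> <IMG SRC='b.gif' width=10/> <img src=c.jpg >}
);
print "$_\n" for @srcs;
```

For the sample input this prints a.png, b.gif, and c.jpg, one per line. Tags with attributes the pattern does not cover (for example attributes without a value) are simply not matched, so a real HTML parser is likely the safer choice for arbitrary input.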
- testhtmlregexp: Script to test a regexp for extracting an attribute value from an HTML tag