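Using regular expressions:
1. First extract from src up to the nearest double quote (a single quote works the same way)
2. Then extract from http up to the nearest image extension
Note: the extracted text includes the terminating match itself, e.g. the image extension in step 2!
Method: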
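import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// `log` is assumed to be a logger field of the enclosing class.
private List<String> getIMG(String detail) {
    // Step 1: lazily capture the value of each src="..." attribute.
    String regex = "src=\"(.*?)\"";
    // Step 2: from http up to the first image extension.
    // The dot is escaped so it matches a literal '.', not any character.
    String regex1 = "http(.*?)\\.(jpg|png)";
    List<String> list = new ArrayList<>();
    Pattern pa = Pattern.compile(regex, Pattern.DOTALL);
    Pattern pa1 = Pattern.compile(regex1, Pattern.DOTALL);
    Matcher ma = pa.matcher(detail);
    while (ma.find()) {
        String src = ma.group(1); // attribute value without the quotes
        Matcher ma1 = pa1.matcher(src);
        while (ma1.find()) {
            list.add(ma1.group()); // whole match, extension included
        }
    }
    for (String str : list) {
        log.info("Parsed image: " + str);
    }
    return list;
}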
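The two steps can also be folded into a single pattern that accepts either quote style. The following is only a minimal sketch, not the method above: the class name `ImgSrcExtractor`, the method name `extract`, the case-insensitive flag, and the `https?://` prefix are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: one-pass variant of the two-step extraction.
class ImgSrcExtractor {
    // src= followed by a double or single quote, then a lazily matched URL
    // that stops at the first .jpg or .png (the extension is part of group 1).
    // [^"'] keeps the match inside the attribute value.
    private static final Pattern IMG_SRC = Pattern.compile(
            "src\\s*=\\s*[\"'](https?://[^\"']*?\\.(?:jpg|png))",
            Pattern.CASE_INSENSITIVE);

    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1)); // URL only, query string cut off
        }
        return urls;
    }
}
```

Calling `ImgSrcExtractor.extract(detail)` on the HTML string returns the image URLs in document order. Like the original method, it treats any `src=` attribute as a candidate, so a real HTML parser may be the safer choice for untrusted or irregular markup.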
HTML example:
<p style="text-align:center;"><img src="http://haitao.nos.netease.com/7d4278ff684845568b7c60d42c33d7cf1555558034978jum339k210560.jpg?imageView&quality=98&crop=0_13500_750_500" /></p>
<p style="text-align:center;"><img src="http://haitao.nos.netease.com/b4c188cd362f407b88eb9ee85fac0be51572419265474k2cxty9u12781.jpg?imageView&quality=98&crop=0_0_750_161" /></p>
<p style="text-align:center;"><img src="http://haitao.nos.netease.com/38ff93e978f04bf2abc1240544fe02241572419265578k2cxtycq12782.jpg?imageView&quality=98&crop=0_0_750_500" /></p>
<p style="text-align:center;"><img src="http://haitao.nos.netease.com/38ff93e978f04bf2abc1240544fe02241572419265578k2cxtycq12782.jpg?imageView&quality=98&crop=0_500_750_267" /></p>
After parsing:
http://haitao.nos.netease.com/7d4278ff684845568b7c60d42c33d7cf1555558034978jum339k210560.jpg
http://haitao.nos.netease.com/b4c188cd362f407b88eb9ee85fac0be51572419265474k2cxty9u12781.jpg
http://haitao.nos.netease.com/38ff93e978f04bf2abc1240544fe02241572419265578k2cxtycq12782.jpg
http://haitao.nos.netease.com/38ff93e978f04bf2abc1240544fe02241572419265578k2cxtycq12782.jpg
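The last two results are identical, because the corresponding img tags differ only in the crop query parameter, which is cut off during extraction. If duplicates are not wanted, one option is to pass the list through a `LinkedHashSet`. This is a small sketch; the class name `ImgDedup` and the method name `distinctInOrder` are hypothetical, and it assumes the first-seen order should be preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper: drop repeated URLs while keeping first-seen order.
class ImgDedup {
    static List<String> distinctInOrder(List<String> urls) {
        // LinkedHashSet ignores repeated elements and remembers insertion order.
        return new ArrayList<>(new LinkedHashSet<>(urls));
    }
}
```

Applied to the four URLs above, this would leave three entries.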