How do I preserve line breaks when using jsoup to convert html to plain text?

Question:

I have the following code:

 import org.jsoup.Jsoup;

 public class NewClass {
     // Parses the HTML and returns its text content; text() normalizes whitespace.
     public String noTags(String str) {
         return Jsoup.parse(str).text();
     }

     public static void main(String[] args) {
         String strings = "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN \">" +
             "<HTML> <HEAD> <TITLE></TITLE> <style>body{ font-size: 12px;font-family: verdana, arial, helvetica, sans-serif;}</style> </HEAD> <BODY><p><b>hello world</b></p><p><br><b>yo</b> <a href=\"http://google.com\">googlez</a></p></BODY> </HTML> ";

         NewClass text = new NewClass();
         System.out.println(text.noTags(strings));
     }
 }

And I get this result:

hello world yo googlez

But I want the line break preserved:

hello world
yo googlez

I have looked at jsoup's TextNode#getWholeText() but I can't figure out how to use it.

If there's a <br> in the markup I parse, how can I get a line break in my resulting output?


Solution:
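
One possible approach, offered as a sketch rather than the accepted answer: Jsoup.parse(str).text() normalizes all whitespace, so instead of calling it you can walk the parsed body yourself and emit a newline for every <br> and at the end of every <p>. The class name LineBreakText and the helpers toPlainText and appendText are made up for this illustration. Collapsing consecutive line breaks into one is an assumption, chosen to match the output asked for in the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class LineBreakText {

    // Hypothetical helper: converts HTML to plain text, keeping a line
    // break for each <br> and after each <p>.
    public static String toPlainText(String html) {
        StringBuilder out = new StringBuilder();
        // Only the body is walked, so <style> and <title> content is skipped.
        appendText(Jsoup.parse(html).body(), out);
        return out.toString()
                .replaceAll("[ \\t]*\\n[ \\t]*", "\n") // drop spaces around breaks
                .replaceAll("\\n{2,}", "\n")           // assumption: collapse repeated breaks
                .trim();
    }

    // Recursively appends the text of all descendants of the given node.
    private static void appendText(Node node, StringBuilder out) {
        for (Node child : node.childNodes()) {
            if (child instanceof TextNode) {
                // text() returns the node's text with whitespace normalized.
                out.append(((TextNode) child).text());
            } else if (child instanceof Element) {
                String tag = ((Element) child).tagName();
                if (tag.equals("br")) {
                    out.append('\n');
                } else {
                    appendText(child, out);
                    if (tag.equals("p")) {
                        out.append('\n');
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        String html = "<p><b>hello world</b></p>"
                + "<p><br><b>yo</b> <a href=\"http://google.com\">googlez</a></p>";
        System.out.println(toPlainText(html));
    }
}
```

If blank lines between paragraphs matter to you, removing the second replaceAll keeps consecutive breaks instead of collapsing them. Other block-level tags (div, li, headings) could be handled the same way as p.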

Reference: https://stackoom.com/en/question/NfJ8