Page analysis: using Java to fetch nationwide tender and bid data from the National Public Resources Trading Platform (全国公共资源交易平台)

Task: fetch construction tender and bid data for the whole country, then open each detail page, capture its HTML, and save it locally.

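  1. Open the site and analyze the page.

2. Work out the province/city/district cascading menu. The browser console showed no request to the backend for this data, so it is presumably hard-coded in the JavaScript.

 

The page source does contain the province-level menu: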

<select id="provinceId">
    <option value="0">不限</option>
    <option value="110000">北京</option>
    <option value="120000">天津</option>
    <option value="130000">河北</option>
    <option value="140000">山西</option>
    <option value="150000">内蒙古</option>
    <option value="210000">辽宁</option>
    <option value="220000">吉林</option>
    <option value="230000">黑龙江</option>
    <option value="310000">上海</option>
    <option value="320000">江苏</option>
    <option value="330000">浙江</option>
    <option value="340000">安徽</option>
    <option value="350000">福建</option>
    <option value="360000">江西</option>
    <option value="370000">山东</option>
    <option value="410000">河南</option>
    <option value="420000">湖北</option>
    <option value="430000">湖南</option>
    <option value="440000">广东</option>
    <option value="450000">广西</option>
    <option value="460000">海南</option>
    <option value="500000">重庆</option>
    <option value="510000">四川</option>
    <option value="520000">贵州</option>
    <option value="530000">云南</option>
    <option value="540000">西藏</option>
    <option value="610000">陕西</option>
    <option value="620000">甘肃</option>
    <option value="630000">青海</option>
    <option value="640000">宁夏</option>
    <option value="650000">新疆</option>
    <option value="660000">兵团</option>
</select>

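Searching the JavaScript for provinceId returns too many matches to be useful. Searching for Beijing's code, 110000, instead turns up the full list of cities hard-coded in the JS, which we copy out.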
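Once the table has been copied out, looking up the cities of one province is straightforward. The sketch below assumes the copied data has the shape `{"<province code>":[{"id":"...","name":"..."}, ...]}`; `RegionTable` and `citiesOf` are hypothetical names used only for illustration. It uses a regular expression purely to stay dependency-free, and a real JSON parser would be more robust.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original crawler.
final class RegionTable {

    // Matches one {"id":"370100","name":"..."} entry.
    private static final Pattern ENTRY =
            Pattern.compile("\\{\"id\":\"(\\d+)\",\"name\":\"([^\"]+)\"\\}");

    /** Returns id -> name for every entry listed under the given province code. */
    static Map<String, String> citiesOf(String json, String provinceCode) {
        Map<String, String> cities = new LinkedHashMap<>();
        // The province key is followed by ":[", which tells it apart from an "id" value.
        String key = "\"" + provinceCode + "\":[";
        int start = json.indexOf(key);
        if (start < 0) {
            return cities;
        }
        int end = json.indexOf(']', start);
        if (end < 0) {
            return cities;
        }
        Matcher m = ENTRY.matcher(json.substring(start + key.length(), end));
        while (m.find()) {
            cities.put(m.group(1), m.group(2));
        }
        return cities;
    }
}
```

Calling `RegionTable.citiesOf(table, "370000")` on the copied table would return the Shandong entries in their original order; an unknown province code yields an empty map.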

 

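 3. Taking Shandong as the example, fetch its construction tender and bid data.

  Analyzing the page shows that the list is loaded from this backend URL: http://deal.ggzy.gov.cn/ds/deal/dealList_find.jsp.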
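To get a feel for what a form POST to that URL carries, here is a minimal sketch that builds an `application/x-www-form-urlencoded` body using only the JDK. `FormBody` and `encode` are hypothetical names, and the three parameters in the usage note are just a subset picked for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper, not part of the original crawler.
final class FormBody {

    /** Joins the parameters as key=value pairs separated by '&', percent-encoding both sides as UTF-8. */
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is guaranteed to exist on every JVM, so this should never happen.
            throw new IllegalStateException(ex);
        }
    }
}
```

With a `LinkedHashMap` holding `DEAL_PROVINCE=370000`, `PAGENUMBER=1` and an empty `FINDTXT`, the result is `DEAL_PROVINCE=370000&PAGENUMBER=1&FINDTXT=`.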

 

static String sendPost(String url,String area,String page) throws UnsupportedEncodingException {
        try {
            // Sleep between calls so the site does not block us for requesting too fast
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        httpPost.addHeader("Accept","application/json");
        httpPost.addHeader("Content-Type","application/x-www-form-urlencoded; charset=UTF-8");
        httpPost.addHeader("User-Agent","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        Date currentDate = new Date();
        Calendar c = Calendar.getInstance();
        c.setTime(currentDate);
        c.add(Calendar.DATE, - 9);
        Date d = c.getTime();
        String day = format.format(d);

        // Request parameters worked out so far (only part of the full set)
        nvps.add(new BasicNameValuePair("TIMEBEGIN_SHOW",day));
        nvps.add(new BasicNameValuePair("TIMEEND_SHOW",new SimpleDateFormat("yyyy-MM-dd").format(new Date())));

        // Inspecting the front-end requests shows TIMEBEGIN and TIMEEND are 10 days apart counting both ends, hence the -9 above
        nvps.add(new BasicNameValuePair("TIMEBEGIN",day));
        nvps.add(new BasicNameValuePair("TIMEEND",new SimpleDateFormat("yyyy-MM-dd").format(currentDate)));

        nvps.add(new BasicNameValuePair("SOURCE_TYPE","1"));
        nvps.add(new BasicNameValuePair("DEAL_TIME","01"));
        nvps.add(new BasicNameValuePair("DEAL_CLASSIFY","01"));
        nvps.add(new BasicNameValuePair("DEAL_STAGE","0101"));

        // Province code for Shandong
        nvps.add(new BasicNameValuePair("DEAL_PROVINCE","370000"));

        // City code
        nvps.add(new BasicNameValuePair("DEAL_CITY",area));

        nvps.add(new BasicNameValuePair("DEAL_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("BID_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("DEAL_TRADE","0"));
        nvps.add(new BasicNameValuePair("isShowAll","1"));
        nvps.add(new BasicNameValuePair("PAGENUMBER",page));
        nvps.add(new BasicNameValuePair("FINDTXT",""));
        httpPost.setEntity(new UrlEncodedFormEntity(nvps, "utf-8"));
        String res = "";
        HttpResponse response = null;
        try {
            response = httpClient.execute(httpPost);
            res = EntityUtils.toString(response.getEntity(), "utf-8");
        } catch (Exception e) {
            e.printStackTrace();
        }
       return res;
    }
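The date arithmetic in sendPost can also be written with `java.time`, which avoids the mutable `Calendar` and `SimpleDateFormat` objects. This is only a sketch, assuming Java 8 or later; `DateWindow` and `lastTenDays` are names made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the original crawler.
final class DateWindow {

    /**
     * Returns {begin, end} formatted as yyyy-MM-dd, where end is the given day
     * and begin is 9 days earlier, i.e. a window of 10 calendar days.
     */
    static String[] lastTenDays(LocalDate today) {
        DateTimeFormatter fmt = DateTimeFormatter.ISO_LOCAL_DATE;
        return new String[] { today.minusDays(9).format(fmt), today.format(fmt) };
    }
}
```

Passing `LocalDate.now()` would give the values for TIMEBEGIN and TIMEEND; taking the date as a parameter keeps the method easy to test.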

4. Process the response and handle the click on the detail page.

   The endpoint returns JSON; the url field of each record is the detail page to open.

        {
            "classify":"01",
            "title":"王马社区南片老旧小区综合改造提升工程设计-采购-施工(EPC)总承包",
            "timeShow":"2020-06-29",
            "stageName":"信息类型",
            "platformName":"杭州市电子招投标平台",
            "classifyShow":"工程建设",
            "tradeShow":"",
            "districtShow":"浙江",
            "url":"http://www.ggzy.gov.cn/information/html/a/330000/0102/202005/28/0033b5bf6b41dcd34309ae1ed58280fa6244.shtml",
            "stageShow":"开标记录",
            "titleShow":"王马社区南片老旧小区综合改造提升工程设计-采购-施工(EPC)总承包"
        }

 Opening that URL lands on the bid-opening record (开标记录) tab, but what we want is the tender announcement, so we have to trigger a click first.

private void addTargetUrl(Spider spider) {


        System.setProperty("webdriver.chrome.driver",
                "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe");


        // A browser window opens by default; run Chrome headless to suppress it
        ChromeOptions chromeOptions=new ChromeOptions();
        chromeOptions.addArguments("--headless");
        WebDriver driver = new ChromeDriver(chromeOptions);

        driver.manage().window().maximize();

        // tenderInfos holds every detail-page URL collected from the list endpoint
        for(TenderInfo tenderInfo: tenderInfos){
            try {
                String url = tenderInfo.getUrl();
                driver.get(tenderInfo.getUrl());
                // Locate the "tender / prequalification announcement" tab
                WebElement element = driver.findElement(By.xpath("//li[@id='t_0101']"));
                // Click it
                element.click();
                // Read the announcement URL from the iframe inside the tab's panel
                WebElement element1 = driver.findElement(By.xpath("//div[@id='show0101']"));
                String substring = url.substring( url.lastIndexOf("/")+1,url.length()-6);
                String targetUrl = element1.findElement(By.xpath("//iframe[@id='iframe0101']")).getAttribute("src");

                // Append the list record's id to the announcement URL so the two can be matched up later
                targetUrl+="?url="+substring;
                spider.addUrl(targetUrl);
                // Sleep 3 seconds to avoid being blocked
                Thread.sleep(3000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }
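The id handling in addTargetUrl is easy to get wrong, so here it is in isolation. This is a sketch with hypothetical names (`DetailUrlTag`, `listId`, `tag`, `recover`); it strips the extension by looking for the last dot rather than assuming a fixed six-character `.shtml` suffix.

```java
// Hypothetical helper, not part of the original crawler.
final class DetailUrlTag {

    /** Returns the last path segment of a detail URL without its extension. */
    static String listId(String detailUrl) {
        String file = detailUrl.substring(detailUrl.lastIndexOf('/') + 1);
        int dot = file.lastIndexOf('.');
        return dot < 0 ? file : file.substring(0, dot);
    }

    /** Appends the list id as a query parameter so the page processor can see it later. */
    static String tag(String announcementUrl, String listId) {
        return announcementUrl + "?url=" + listId;
    }

    /** Reads the id back from a tagged URL: everything after the last '='. */
    static String recover(String taggedUrl) {
        return taggedUrl.substring(taggedUrl.lastIndexOf('=') + 1);
    }
}
```

For the sample record shown earlier, `listId` gives `0033b5bf6b41dcd34309ae1ed58280fa6244`, and `recover(tag(u, id))` returns the same id for any announcement URL `u`, which is what lets the saved file be matched back to its list entry.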

 5. With the final URLs in hand, use webmagic to open each one, scrape it, and save the result locally.

 

String pageUrl = page.getUrl().toString();
        Html pageHtml = page.getHtml();
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        String substring = pageUrl.substring( pageUrl.lastIndexOf("=")+1,pageUrl.length());
        String xpath = pageHtml.xpath("//div[@id=\"mycontent\"]").toString();
        String html = "<!DOCTYPE html>\n" +
                "<html lang=\"en\">\n" +
                "<head>\n" +
                "    <meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=0\">\n"+"<script src=\"jquery-1.6.4.min.js\"></script>"+
                "    <meta\n" +
                "      http-equiv=\"X-UA-Compatible\"\n" +
                "      content=\"IE=edge,chrome=1\"\n" +
                "      charset=\"utf-8\"\n" +
                "    />\n" +
                "</head>\n" +
                "<body>";
        html+=xpath;

        html+="</body>\n" +
                "</html>";


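        File fp=new File("F:\\zfpackage\\"+substring+".html");
        // try-with-resources always closes the writer and avoids a NullPointerException when the file cannot be opened;
        // writing as UTF-8 matches the charset declared in the <meta> tag above
        try (PrintWriter pfp = new PrintWriter(fp, "UTF-8")) {
            pfp.print(html);
        } catch (IOException e) {
            e.printStackTrace();
        }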
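The same save step can be sketched with `java.nio.file`, which creates the target directory when it is missing and writes the bytes in one call. `HtmlSaver`, `wrap` and `save` are hypothetical names, and the wrapper markup is a trimmed-down stand-in for the template above rather than a copy of it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original crawler.
final class HtmlSaver {

    /** Wraps a scraped fragment in a minimal standalone HTML document. */
    static String wrap(String fragment) {
        return "<!DOCTYPE html>\n<html>\n<head>\n    <meta charset=\"utf-8\">\n</head>\n<body>\n"
                + fragment + "\n</body>\n</html>\n";
    }

    /** Writes <dir>/<id>.html as UTF-8 and returns the path of the written file. */
    static Path save(Path dir, String id, String fragment) {
        try {
            Files.createDirectories(dir);
            Path file = dir.resolve(id + ".html");
            Files.write(file, wrap(fragment).getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Encoding the string explicitly means the file content does not depend on the platform default charset of the machine running the crawler.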

 

 

Final code:




import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.magic.demo.ConsolePipeline;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Spider;
import us.codecraft.webmagic.downloader.selenium.SeleniumDownloader;
import us.codecraft.webmagic.processor.PageProcessor;
import us.codecraft.webmagic.selector.Html;

import java.io.*;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;

/**
 * Fetches the list of all project files.
 */
public class TenderInfoWebmagic implements PageProcessor {


    // webmagic Site configuration
        private Site site = Site
                .me()
                .setSleepTime(30000)
                // .setCycleRetryTimes(5) would enable retries on failure
                .setUserAgent(
                        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");

    // List endpoint for tender and bid records
    private static String BIAO_URL = "http://deal.ggzy.gov.cn/ds/deal/dealList_find.jsp";

    // Detail-page URLs of every tender record collected
    private static List<TenderInfo> tenderInfos = new ArrayList<TenderInfo>();


    private static String provinceInfo = "";

    // Second-level (city/district) data, keyed by province code
    private static String addressInfo = "{\"120000\":[{\"id\":\"120000\",\"name\":\"省本级\"},{\"id\":\"120101\",\"name\":\"和平区\"},{\"id\":\"120102\",\"name\":\"河东区\"},{\"id\":\"120103\",\"name\":\"河西区\"},{\"id\":\"120104\",\"name\":\"南开区\"},{\"id\":\"120105\",\"name\":\"河北区\"},{\"id\":\"120106\",\"name\":\"红桥区\"},{\"id\":\"120110\",\"name\":\"东丽区\"},{\"id\":\"120111\",\"name\":\"西青区\"},{\"id\":\"120112\",\"name\":\"津南区\"},{\"id\":\"120113\",\"name\":\"北辰区\"},{\"id\":\"120114\",\"name\":\"武清区\"},{\"id\":\"120115\",\"name\":\"宝坻区\"},{\"id\":\"120116\",\"name\":\"滨海新区\"},{\"id\":\"120117\",\"name\":\"宁河区\"},{\"id\":\"120118\",\"name\":\"静海区\"},{\"id\":\"120119\",\"name\":\"蓟州区\"}],\"450000\":[{\"id\":\"450000\",\"name\":\"省本级\"},{\"id\":\"450100\",\"name\":\"南宁市\"},{\"id\":\"450200\",\"name\":\"柳州市\"},{\"id\":\"450300\",\"name\":\"桂林市\"},{\"id\":\"450400\",\"name\":\"梧州市\"},{\"id\":\"450500\",\"name\":\"北海市\"},{\"id\":\"450600\",\"name\":\"防城港市\"},{\"id\":\"450700\",\"name\":\"钦州市\"},{\"id\":\"450800\",\"name\":\"贵港市\"},{\"id\":\"450900\",\"name\":\"玉林市\"},{\"id\":\"451000\",\"name\":\"百色市\"},{\"id\":\"451100\",\"name\":\"贺州市\"},{\"id\":\"451200\",\"name\":\"河池市\"},{\"id\":\"451300\",\"name\":\"来宾市\"},{\"id\":\"451400\",\"name\":\"崇左市\"}],\"140000\":[{\"id\":\"140000\",\"name\":\"省本级\"},{\"id\":\"140100\",\"name\":\"太原市\"},{\"id\":\"140200\",\"name\":\"大同市\"},{\"id\":\"140300\",\"name\":\"阳泉市\"},{\"id\":\"140400\",\"name\":\"长治市\"},{\"id\":\"140500\",\"name\":\"晋城市\"},{\"id\":\"140600\",\"name\":\"朔州市\"},{\"id\":\"140700\",\"name\":\"晋中市\"},{\"id\":\"140800\",\"name\":\"运城市\"},{\"id\":\"140900\",\"name\":\"忻州市\"},{\"id\":\"141000\",\"name\":\"临汾市\"},{\"id\":\"141100\",\"name\":\"吕梁市\"}],\"630000\":[{\"id\":\"630000\",\"name\":\"省本级\"},{\"id\":\"630100\",\"name\":\"西宁市\"},{\"id\":\"630200\",\"name\":\"海东市\"},{\"id\":\"632200\",\"name\":\"海北藏族自治州\"},{\"id\":\"632300\",\"name\":\"黄南藏族自治州\"},{\"id\":\"632500\",\"name\":\"海南藏族自治州\"},{\"id\":\"632600\",\"name\":\"果洛藏族自治州\"},{\
"id\":\"632700\",\"name\":\"玉树藏族自治州\"},{\"id\":\"632800\",\"name\":\"海西蒙古族藏族自治州\"}],\"440000\":[{\"id\":\"440000\",\"name\":\"省本级\"},{\"id\":\"440100\",\"name\":\"广州市\"},{\"id\":\"440200\",\"name\":\"韶关市\"},{\"id\":\"440300\",\"name\":\"深圳市\"},{\"id\":\"440400\",\"name\":\"珠海市\"},{\"id\":\"440500\",\"name\":\"汕头市\"},{\"id\":\"440600\",\"name\":\"佛山市\"},{\"id\":\"440700\",\"name\":\"江门市\"},{\"id\":\"440800\",\"name\":\"湛江市\"},{\"id\":\"440900\",\"name\":\"茂名市\"},{\"id\":\"441200\",\"name\":\"肇庆市\"},{\"id\":\"441300\",\"name\":\"惠州市\"},{\"id\":\"441400\",\"name\":\"梅州市\"},{\"id\":\"441500\",\"name\":\"汕尾市\"},{\"id\":\"441600\",\"name\":\"河源市\"},{\"id\":\"441700\",\"name\":\"阳江市\"},{\"id\":\"441800\",\"name\":\"清远市\"},{\"id\":\"441900\",\"name\":\"东莞市\"},{\"id\":\"442000\",\"name\":\"中山市\"},{\"id\":\"445100\",\"name\":\"潮州市\"},{\"id\":\"445200\",\"name\":\"揭阳市\"},{\"id\":\"445300\",\"name\":\"云浮市\"}],\"430000\":[{\"id\":\"430000\",\"name\":\"省本级\"},{\"id\":\"430100\",\"name\":\"长沙市\"},{\"id\":\"430200\",\"name\":\"株洲市\"},{\"id\":\"430300\",\"name\":\"湘潭市\"},{\"id\":\"430400\",\"name\":\"衡阳市\"},{\"id\":\"430500\",\"name\":\"邵阳市\"},{\"id\":\"430600\",\"name\":\"岳阳市\"},{\"id\":\"430700\",\"name\":\"常德市\"},{\"id\":\"430800\",\"name\":\"张家界市\"},{\"id\":\"430900\",\"name\":\"益阳市\"},{\"id\":\"431000\",\"name\":\"郴州市\"},{\"id\":\"431100\",\"name\":\"永州市\"},{\"id\":\"431200\",\"name\":\"怀化市\"},{\"id\":\"431300\",\"name\":\"娄底市\"},{\"id\":\"433100\",\"name\":\"湘西土家族苗族自治州\"}],\"620000\":[{\"id\":\"620000\",\"name\":\"省本级\"},{\"id\":\"620100\",\"name\":\"兰州市\"},{\"id\":\"620200\",\"name\":\"嘉峪关市\"},{\"id\":\"620300\",\"name\":\"金昌市\"},{\"id\":\"620400\",\"name\":\"白银市\"},{\"id\":\"620500\",\"name\":\"天水市\"},{\"id\":\"620600\",\"name\":\"武威市\"},{\"id\":\"620700\",\"name\":\"张掖市\"},{\"id\":\"620800\",\"name\":\"平凉市\"},{\"id\":\"620900\",\"name\":\"酒泉市\"},{\"id\":\"621000\",\"name\":\"庆阳市\"},{\"id\":\"621100\",\"name\":\"定西市\"},{\"id\":\"621200\",\"name\":\"陇南市\"},{\"id\":\"622900\",
\"name\":\"临夏回族自治州\"},{\"id\":\"623000\",\"name\":\"甘南藏族自治州\"}],\"640000\":[{\"id\":\"640000\",\"name\":\"省本级\"},{\"id\":\"640100\",\"name\":\"银川市\"},{\"id\":\"640200\",\"name\":\"石嘴山市\"},{\"id\":\"640300\",\"name\":\"吴忠市\"},{\"id\":\"640400\",\"name\":\"固原市\"},{\"id\":\"640500\",\"name\":\"中卫市\"}],\"230000\":[{\"id\":\"230000\",\"name\":\"省本级\"},{\"id\":\"230100\",\"name\":\"哈尔滨市\"},{\"id\":\"230200\",\"name\":\"齐齐哈尔市\"},{\"id\":\"230300\",\"name\":\"鸡西市\"},{\"id\":\"230400\",\"name\":\"鹤岗市\"},{\"id\":\"230500\",\"name\":\"双鸭山市\"},{\"id\":\"230600\",\"name\":\"大庆市\"},{\"id\":\"230700\",\"name\":\"伊春市\"},{\"id\":\"230800\",\"name\":\"佳木斯市\"},{\"id\":\"230900\",\"name\":\"七台河市\"},{\"id\":\"231000\",\"name\":\"牡丹江市\"},{\"id\":\"231100\",\"name\":\"黑河市\"},{\"id\":\"231200\",\"name\":\"绥化市\"},{\"id\":\"232700\",\"name\":\"大兴安岭地区\"}],\"410000\":[{\"id\":\"410000\",\"name\":\"省本级\"},{\"id\":\"410100\",\"name\":\"郑州市\"},{\"id\":\"410200\",\"name\":\"开封市\"},{\"id\":\"410300\",\"name\":\"洛阳市\"},{\"id\":\"410400\",\"name\":\"平顶山市\"},{\"id\":\"410500\",\"name\":\"安阳市\"},{\"id\":\"410600\",\"name\":\"鹤壁市\"},{\"id\":\"410700\",\"name\":\"新乡市\"},{\"id\":\"410800\",\"name\":\"焦作市\"},{\"id\":\"410900\",\"name\":\"濮阳市\"},{\"id\":\"411000\",\"name\":\"许昌市\"},{\"id\":\"411100\",\"name\":\"漯河市\"},{\"id\":\"411200\",\"name\":\"三门峡市\"},{\"id\":\"411300\",\"name\":\"南阳市\"},{\"id\":\"411400\",\"name\":\"商丘市\"},{\"id\":\"411500\",\"name\":\"信阳市\"},{\"id\":\"411600\",\"name\":\"周口市\"},{\"id\":\"411700\",\"name\":\"驻马店市\"},{\"id\":\"419001\",\"name\":\"济源市\"}],\"330000\":[{\"id\":\"330000\",\"name\":\"省本级\"},{\"id\":\"330100\",\"name\":\"杭州市\"},{\"id\":\"330200\",\"name\":\"宁波市\"},{\"id\":\"330300\",\"name\":\"温州市\"},{\"id\":\"330400\",\"name\":\"嘉兴市\"},{\"id\":\"330500\",\"name\":\"湖州市\"},{\"id\":\"330600\",\"name\":\"绍兴市\"},{\"id\":\"330700\",\"name\":\"金华市\"},{\"id\":\"330800\",\"name\":\"衢州市\"},{\"id\":\"330900\",\"name\":\"舟山市\"},{\"id\":\"331000\",\"name\":\"台州市\"},{\"id\":\"331100\",\"
name\":\"丽水市\"}],\"510000\":[{\"id\":\"510000\",\"name\":\"省本级\"},{\"id\":\"510100\",\"name\":\"成都市\"},{\"id\":\"510300\",\"name\":\"自贡市\"},{\"id\":\"510400\",\"name\":\"攀枝花市\"},{\"id\":\"510500\",\"name\":\"泸州市\"},{\"id\":\"510600\",\"name\":\"德阳市\"},{\"id\":\"510700\",\"name\":\"绵阳市\"},{\"id\":\"510800\",\"name\":\"广元市\"},{\"id\":\"510900\",\"name\":\"遂宁市\"},{\"id\":\"511000\",\"name\":\"内江市\"},{\"id\":\"511100\",\"name\":\"乐山市\"},{\"id\":\"511300\",\"name\":\"南充市\"},{\"id\":\"511400\",\"name\":\"眉山市\"},{\"id\":\"511500\",\"name\":\"宜宾市\"},{\"id\":\"511600\",\"name\":\"广安市\"},{\"id\":\"511700\",\"name\":\"达州市\"},{\"id\":\"511800\",\"name\":\"雅安市\"},{\"id\":\"511900\",\"name\":\"巴中市\"},{\"id\":\"512000\",\"name\":\"资阳市\"},{\"id\":\"513200\",\"name\":\"阿坝藏族羌族自治州\"},{\"id\":\"513300\",\"name\":\"甘孜藏族自治州\"},{\"id\":\"513400\",\"name\":\"凉山彝族自治州\"}],\"210000\":[{\"id\":\"210000\",\"name\":\"省本级\"},{\"id\":\"210100\",\"name\":\"沈阳市\"},{\"id\":\"210200\",\"name\":\"大连市\"},{\"id\":\"210300\",\"name\":\"鞍山市\"},{\"id\":\"210400\",\"name\":\"抚顺市\"},{\"id\":\"210500\",\"name\":\"本溪市\"},{\"id\":\"210600\",\"name\":\"丹东市\"},{\"id\":\"210700\",\"name\":\"锦州市\"},{\"id\":\"210800\",\"name\":\"营口市\"},{\"id\":\"210900\",\"name\":\"阜新市\"},{\"id\":\"211000\",\"name\":\"辽阳市\"},{\"id\":\"211100\",\"name\":\"盘锦市\"},{\"id\":\"211200\",\"name\":\"铁岭市\"},{\"id\":\"211300\",\"name\":\"朝阳市\"},{\"id\":\"211400\",\"name\":\"葫芦岛市\"}],\"530000\":[{\"id\":\"530000\",\"name\":\"省本级\"},{\"id\":\"530100\",\"name\":\"昆明市\"},{\"id\":\"530300\",\"name\":\"曲靖市\"},{\"id\":\"530400\",\"name\":\"玉溪市\"},{\"id\":\"530500\",\"name\":\"保山市\"},{\"id\":\"530600\",\"name\":\"昭通市\"},{\"id\":\"530700\",\"name\":\"丽江市\"},{\"id\":\"530800\",\"name\":\"普洱市\"},{\"id\":\"530900\",\"name\":\"临沧市\"},{\"id\":\"532300\",\"name\":\"楚雄彝族自治州\"},{\"id\":\"532500\",\"name\":\"红河哈尼族彝族自治州\"},{\"id\":\"532600\",\"name\":\"文山壮族苗族自治州\"},{\"id\":\"532800\",\"name\":\"西双版纳傣族自治州\"},{\"id\":\"532900\",\"name\":\"大理白族自治州\"},{\"id\":\"53310
0\",\"name\":\"德宏傣族景颇族自治州\"},{\"id\":\"533300\",\"name\":\"怒江傈僳族自治州\"},{\"id\":\"533400\",\"name\":\"迪庆藏族自治州\"}],\"130000\":[{\"id\":\"130000\",\"name\":\"省本级\"},{\"id\":\"130100\",\"name\":\"石家庄市\"},{\"id\":\"130200\",\"name\":\"唐山市\"},{\"id\":\"130300\",\"name\":\"秦皇岛市\"},{\"id\":\"130400\",\"name\":\"邯郸市\"},{\"id\":\"130500\",\"name\":\"邢台市\"},{\"id\":\"130600\",\"name\":\"保定市\"},{\"id\":\"130700\",\"name\":\"张家口市\"},{\"id\":\"130800\",\"name\":\"承德市\"},{\"id\":\"130900\",\"name\":\"沧州市\"},{\"id\":\"131000\",\"name\":\"廊坊市\"},{\"id\":\"131100\",\"name\":\"衡水市\"}],\"340000\":[{\"id\":\"340000\",\"name\":\"省本级\"},{\"id\":\"340100\",\"name\":\"合肥市\"},{\"id\":\"340200\",\"name\":\"芜湖市\"},{\"id\":\"340300\",\"name\":\"蚌埠市\"},{\"id\":\"340400\",\"name\":\"淮南市\"},{\"id\":\"340500\",\"name\":\"马鞍山市\"},{\"id\":\"340600\",\"name\":\"淮北市\"},{\"id\":\"340700\",\"name\":\"铜陵市\"},{\"id\":\"340800\",\"name\":\"安庆市\"},{\"id\":\"341000\",\"name\":\"黄山市\"},{\"id\":\"341100\",\"name\":\"滁州市\"},{\"id\":\"341200\",\"name\":\"阜阳市\"},{\"id\":\"341300\",\"name\":\"宿州市\"},{\"id\":\"341500\",\"name\":\"六安市\"},{\"id\":\"341600\",\"name\":\"亳州市\"},{\"id\":\"341700\",\"name\":\"池州市\"},{\"id\":\"341800\",\"name\":\"宣城市\"}],\"500000\":[{\"id\":\"500000\",\"name\":\"省本级\"},{\"id\":\"500101\",\"name\":\"万州区\"},{\"id\":\"500102\",\"name\":\"涪陵区\"},{\"id\":\"500103\",\"name\":\"渝中区\"},{\"id\":\"500104\",\"name\":\"大渡口区\"},{\"id\":\"500105\",\"name\":\"江北区\"},{\"id\":\"500106\",\"name\":\"沙坪坝区\"},{\"id\":\"500107\",\"name\":\"九龙坡区\"},{\"id\":\"500108\",\"name\":\"南岸区\"},{\"id\":\"500109\",\"name\":\"北碚区\"},{\"id\":\"500110\",\"name\":\"綦江区\"},{\"id\":\"500111\",\"name\":\"大足区\"},{\"id\":\"500112\",\"name\":\"渝北区\"},{\"id\":\"500113\",\"name\":\"巴南区\"},{\"id\":\"500114\",\"name\":\"黔江区\"},{\"id\":\"500115\",\"name\":\"长寿区\"},{\"id\":\"500116\",\"name\":\"江津区\"},{\"id\":\"500117\",\"name\":\"合川区\"},{\"id\":\"500118\",\"name\":\"永川区\"},{\"id\":\"500119\",\"name\":\"南川区\"},{\"id\":\"500120\",\"name\":\
"璧山区\"},{\"id\":\"500151\",\"name\":\"铜梁区\"},{\"id\":\"500152\",\"name\":\"潼南区\"},{\"id\":\"500153\",\"name\":\"荣昌区\"},{\"id\":\"500154\",\"name\":\"开州区\"},{\"id\":\"500155\",\"name\":\"梁平区\"},{\"id\":\"500156\",\"name\":\"武隆区\"},{\"id\":\"500229\",\"name\":\"城口县\"},{\"id\":\"500230\",\"name\":\"丰都县\"},{\"id\":\"500231\",\"name\":\"垫江县\"},{\"id\":\"500233\",\"name\":\"忠县\"},{\"id\":\"500235\",\"name\":\"云阳县\"},{\"id\":\"500236\",\"name\":\"奉节县\"},{\"id\":\"500237\",\"name\":\"巫山县\"},{\"id\":\"500238\",\"name\":\"巫溪县\"},{\"id\":\"500240\",\"name\":\"石柱土家族自治县\"},{\"id\":\"500241\",\"name\":\"秀山土家族苗族自治县\"},{\"id\":\"500242\",\"name\":\"酉阳土家族苗族自治县\"},{\"id\":\"500243\",\"name\":\"彭水苗族土家族自治县\"}],\"350000\":[{\"id\":\"350000\",\"name\":\"省本级\"},{\"id\":\"350100\",\"name\":\"福州市\"},{\"id\":\"350200\",\"name\":\"厦门市\"},{\"id\":\"350300\",\"name\":\"莆田市\"},{\"id\":\"350400\",\"name\":\"三明市\"},{\"id\":\"350500\",\"name\":\"泉州市\"},{\"id\":\"350600\",\"name\":\"漳州市\"},{\"id\":\"350700\",\"name\":\"南平市\"},{\"id\":\"350800\",\"name\":\"龙岩市\"},{\"id\":\"350900\",\"name\":\"宁德市\"}],\"320000\":[{\"id\":\"320000\",\"name\":\"省本级\"},{\"id\":\"320100\",\"name\":\"南京市\"},{\"id\":\"320200\",\"name\":\"无锡市\"},{\"id\":\"320300\",\"name\":\"徐州市\"},{\"id\":\"320400\",\"name\":\"常州市\"},{\"id\":\"320500\",\"name\":\"苏州市\"},{\"id\":\"320600\",\"name\":\"南通市\"},{\"id\":\"320700\",\"name\":\"连云港市\"},{\"id\":\"320800\",\"name\":\"淮安市\"},{\"id\":\"320900\",\"name\":\"盐城市\"},{\"id\":\"321000\",\"name\":\"扬州市\"},{\"id\":\"321100\",\"name\":\"镇江市\"},{\"id\":\"321200\",\"name\":\"泰州市\"},{\"id\":\"321300\",\"name\":\"宿迁市\"}],\"220000\":[{\"id\":\"220000\",\"name\":\"省本级\"},{\"id\":\"220100\",\"name\":\"长春市\"},{\"id\":\"220200\",\"name\":\"吉林市\"},{\"id\":\"220300\",\"name\":\"四平市\"},{\"id\":\"220400\",\"name\":\"辽源市\"},{\"id\":\"220500\",\"name\":\"通化市\"},{\"id\":\"220600\",\"name\":\"白山市\"},{\"id\":\"220700\",\"name\":\"松原市\"},{\"id\":\"220800\",\"name\":\"白城市\"},{\"id\":\"222400\",\"name\":\"延边朝鲜族自治州\"
}],\"310000\":[{\"id\":\"310000\",\"name\":\"省本级\"},{\"id\":\"310101\",\"name\":\"黄浦区\"},{\"id\":\"310104\",\"name\":\"徐汇区\"},{\"id\":\"310105\",\"name\":\"长宁区\"},{\"id\":\"310106\",\"name\":\"静安区\"},{\"id\":\"310107\",\"name\":\"普陀区\"},{\"id\":\"310109\",\"name\":\"虹口区\"},{\"id\":\"310110\",\"name\":\"杨浦区\"},{\"id\":\"310112\",\"name\":\"闵行区\"},{\"id\":\"310113\",\"name\":\"宝山区\"},{\"id\":\"310114\",\"name\":\"嘉定区\"},{\"id\":\"310115\",\"name\":\"浦东新区\"},{\"id\":\"310116\",\"name\":\"金山区\"},{\"id\":\"310117\",\"name\":\"松江区\"},{\"id\":\"310118\",\"name\":\"青浦区\"},{\"id\":\"310120\",\"name\":\"奉贤区\"},{\"id\":\"310151\",\"name\":\"崇明区\"}],\"650000\":[{\"id\":\"650000\",\"name\":\"省本级\"},{\"id\":\"650100\",\"name\":\"乌鲁木齐市\"},{\"id\":\"650200\",\"name\":\"克拉玛依市\"},{\"id\":\"652100\",\"name\":\"吐鲁番市\"},{\"id\":\"652200\",\"name\":\"哈密市\"},{\"id\":\"652300\",\"name\":\"昌吉回族自治州\"},{\"id\":\"652700\",\"name\":\"博尔塔拉蒙古自治州\"},{\"id\":\"652800\",\"name\":\"巴音郭楞蒙古自治州\"},{\"id\":\"652900\",\"name\":\"阿克苏地区\"},{\"id\":\"653000\",\"name\":\"克孜勒苏柯尔克孜自治州\"},{\"id\":\"653100\",\"name\":\"喀什地区\"},{\"id\":\"653200\",\"name\":\"和田地区\"},{\"id\":\"654000\",\"name\":\"伊犁哈萨克自治州\"},{\"id\":\"654200\",\"name\":\"塔城地区\"},{\"id\":\"654300\",\"name\":\"阿勒泰地区\"},{\"id\":\"659001\",\"name\":\"石河子市\"},{\"id\":\"659002\",\"name\":\"阿拉尔市\"},{\"id\":\"659003\",\"name\":\"图木舒克市\"},{\"id\":\"659004\",\"name\":\"五家渠市\"}],\"150000\":[{\"id\":\"150000\",\"name\":\"省本级\"},{\"id\":\"150100\",\"name\":\"呼和浩特市\"},{\"id\":\"150200\",\"name\":\"包头市\"},{\"id\":\"150300\",\"name\":\"乌海市\"},{\"id\":\"150400\",\"name\":\"赤峰市\"},{\"id\":\"150500\",\"name\":\"通辽市\"},{\"id\":\"150600\",\"name\":\"鄂尔多斯市\"},{\"id\":\"150700\",\"name\":\"呼伦贝尔市\"},{\"id\":\"150800\",\"name\":\"巴彦淖尔市\"},{\"id\":\"150900\",\"name\":\"乌兰察布市\"},{\"id\":\"152200\",\"name\":\"兴安盟\"},{\"id\":\"152500\",\"name\":\"锡林郭勒盟\"},{\"id\":\"152900\",\"name\":\"阿拉善盟\"}],\"610000\":[{\"id\":\"610000\",\"name\":\"省本级\"},{\"id\":\"610100\",\"name\":\"西安市\"}
,{\"id\":\"610200\",\"name\":\"铜川市\"},{\"id\":\"610300\",\"name\":\"宝鸡市\"},{\"id\":\"610400\",\"name\":\"咸阳市\"},{\"id\":\"610500\",\"name\":\"渭南市\"},{\"id\":\"610600\",\"name\":\"延安市\"},{\"id\":\"610700\",\"name\":\"汉中市\"},{\"id\":\"610800\",\"name\":\"榆林市\"},{\"id\":\"610900\",\"name\":\"安康市\"},{\"id\":\"611000\",\"name\":\"商洛市\"}],\"540000\":[{\"id\":\"540000\",\"name\":\"省本级\"},{\"id\":\"540100\",\"name\":\"拉萨市\"},{\"id\":\"542100\",\"name\":\"昌都市\"},{\"id\":\"542200\",\"name\":\"山南市\"},{\"id\":\"542300\",\"name\":\"日喀则市\"},{\"id\":\"542400\",\"name\":\"那曲市\"},{\"id\":\"542500\",\"name\":\"阿里地区\"},{\"id\":\"542600\",\"name\":\"林芝市\"}],\"360000\":[{\"id\":\"360000\",\"name\":\"省本级\"},{\"id\":\"360100\",\"name\":\"南昌市\"},{\"id\":\"360200\",\"name\":\"景德镇市\"},{\"id\":\"360300\",\"name\":\"萍乡市\"},{\"id\":\"360400\",\"name\":\"九江市\"},{\"id\":\"360500\",\"name\":\"新余市\"},{\"id\":\"360600\",\"name\":\"鹰潭市\"},{\"id\":\"360700\",\"name\":\"赣州市\"},{\"id\":\"360800\",\"name\":\"吉安市\"},{\"id\":\"360900\",\"name\":\"宜春市\"},{\"id\":\"361000\",\"name\":\"抚州市\"},{\"id\":\"361100\",\"name\":\"上饶市\"}],\"420000\":[{\"id\":\"420000\",\"name\":\"省本级\"},{\"id\":\"420100\",\"name\":\"武汉市\"},{\"id\":\"420200\",\"name\":\"黄石市\"},{\"id\":\"420300\",\"name\":\"十堰市\"},{\"id\":\"420500\",\"name\":\"宜昌市\"},{\"id\":\"420600\",\"name\":\"襄阳市\"},{\"id\":\"420700\",\"name\":\"鄂州市\"},{\"id\":\"420800\",\"name\":\"荆门市\"},{\"id\":\"420900\",\"name\":\"孝感市\"},{\"id\":\"421000\",\"name\":\"荆州市\"},{\"id\":\"421100\",\"name\":\"黄冈市\"},{\"id\":\"421200\",\"name\":\"咸宁市\"},{\"id\":\"421300\",\"name\":\"随州市\"},{\"id\":\"422800\",\"name\":\"恩施土家族苗族自治州\"},{\"id\":\"429004\",\"name\":\"仙桃市\"},{\"id\":\"429005\",\"name\":\"潜江市\"},{\"id\":\"429006\",\"name\":\"天门市\"},{\"id\":\"429021\",\"name\":\"神农架林区\"}],\"520000\":[{\"id\":\"520000\",\"name\":\"省本级\"},{\"id\":\"520100\",\"name\":\"贵阳市\"},{\"id\":\"520200\",\"name\":\"六盘水市\"},{\"id\":\"520300\",\"name\":\"遵义市\"},{\"id\":\"520400\",\"name\":\"安顺市\"},{\"id\":\"
520500\",\"name\":\"毕节市\"},{\"id\":\"520600\",\"name\":\"铜仁市\"},{\"id\":\"522300\",\"name\":\"黔西南布依族苗族自治州\"},{\"id\":\"522600\",\"name\":\"黔东南苗族侗族自治州\"},{\"id\":\"522700\",\"name\":\"黔南布依族苗族自治州\"}],\"370000\":[{\"id\":\"370000\",\"name\":\"省本级\"},{\"id\":\"370100\",\"name\":\"济南市\"},{\"id\":\"370200\",\"name\":\"青岛市\"},{\"id\":\"370300\",\"name\":\"淄博市\"},{\"id\":\"370400\",\"name\":\"枣庄市\"},{\"id\":\"370500\",\"name\":\"东营市\"},{\"id\":\"370600\",\"name\":\"烟台市\"},{\"id\":\"370700\",\"name\":\"潍坊市\"},{\"id\":\"370800\",\"name\":\"济宁市\"},{\"id\":\"370900\",\"name\":\"泰安市\"},{\"id\":\"371000\",\"name\":\"威海市\"},{\"id\":\"371100\",\"name\":\"日照市\"},{\"id\":\"371300\",\"name\":\"临沂市\"},{\"id\":\"371400\",\"name\":\"德州市\"},{\"id\":\"371500\",\"name\":\"聊城市\"},{\"id\":\"371600\",\"name\":\"滨州市\"},{\"id\":\"371700\",\"name\":\"菏泽市\"}],\"110000\":[{\"id\":\"110000\",\"name\":\"省本级\"},{\"id\":\"110101\",\"name\":\"东城区\"},{\"id\":\"110102\",\"name\":\"西城区\"},{\"id\":\"110105\",\"name\":\"朝阳区\"},{\"id\":\"110106\",\"name\":\"丰台区\"},{\"id\":\"110107\",\"name\":\"石景山区\"},{\"id\":\"110108\",\"name\":\"海淀区\"},{\"id\":\"110109\",\"name\":\"门头沟区\"},{\"id\":\"110111\",\"name\":\"房山区\"},{\"id\":\"110112\",\"name\":\"通州区\"},{\"id\":\"110113\",\"name\":\"顺义区\"},{\"id\":\"110114\",\"name\":\"昌平区\"},{\"id\":\"110115\",\"name\":\"大兴区\"},{\"id\":\"110116\",\"name\":\"怀柔区\"},{\"id\":\"110117\",\"name\":\"平谷区\"},{\"id\":\"110118\",\"name\":\"密云区\"},{\"id\":\"110119\",\"name\":\"延庆区\"}],\"460000\":[{\"id\":\"460000\",\"name\":\"省本级\"},{\"id\":\"460100\",\"name\":\"海口市\"},{\"id\":\"460200\",\"name\":\"三亚市\"},{\"id\":\"460300\",\"name\":\"三沙市\"},{\"id\":\"469001\",\"name\":\"五指山市\"},{\"id\":\"469002\",\"name\":\"琼海市\"},{\"id\":\"469003\",\"name\":\"儋州市\"},{\"id\":\"469005\",\"name\":\"文昌市\"},{\"id\":\"469006\",\"name\":\"万宁市\"},{\"id\":\"469007\",\"name\":\"东方市\"},{\"id\":\"469021\",\"name\":\"定安县\"},{\"id\":\"469022\",\"name\":\"屯昌县\"},{\"id\":\"469023\",\"name\":\"澄迈县\"},{\"id\":\"469024\",\"n
ame\":\"临高县\"},{\"id\":\"469025\",\"name\":\"白沙黎族自治县\"},{\"id\":\"469026\",\"name\":\"昌江黎族自治县\"},{\"id\":\"469027\",\"name\":\"乐东黎族自治县\"},{\"id\":\"469028\",\"name\":\"陵水黎族自治县\"},{\"id\":\"469029\",\"name\":\"保亭黎族苗族自治县\"},{\"id\":\"469030\",\"name\":\"琼中黎族苗族自治县\"}]}";

    public static void main(String[] args) throws UnsupportedEncodingException, InterruptedException {


        // Initialize the city data; here we take every city in Shandong
        List<AddressInfo> addressInfos = JSONObject.parseObject(addressInfo).getJSONArray("370000").toJavaList(AddressInfo.class);


        for(AddressInfo addressInfo:addressInfos){

            //开始爬取第一个地区
            System.out.println(addressInfo.getName()+"开始");

            //获取地区招投标信息
            String tenderAdderssInfo = sendPost(BIAO_URL,addressInfo.getId(),"1");
            DeaListResponse deaListResponse = JSONObject.parseObject(tenderAdderssInfo, DeaListResponse.class);
            //无数据
            if(deaListResponse==null || deaListResponse.getTtlrow()==0){
                System.out.println(addressInfo.getName()+"无数据");
                continue;
            }
            List<ToubiaoList> data = deaListResponse.getData();
            //这里我处理一下数据  准备保存至我们的数据库
            handlerZhaobiao(data,addressInfo.getName());

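            //More than one page: fetch pages 2 through the total page count
            if(deaListResponse.getTtlpage()>1){
                for(int i=2;i<=deaListResponse.getTtlpage();i++){
                    System.out.println(addressInfo.getName()+" page "+i);
                    tenderAdderssInfo = sendPost(BIAO_URL,addressInfo.getId(),i+"");
                    //Parse into a separate variable so the page count read from page 1 is not overwritten
                    DeaListResponse pageResponse = JSONObject.parseObject(tenderAdderssInfo, DeaListResponse.class);
                    if(pageResponse==null || pageResponse.getData()==null){
                        continue;
                    }
                    //Handle the rows of the page just fetched, not the rows of page 1
                    handlerZhaobiao(pageResponse.getData(),addressInfo.getName());
                }
            }

            System.out.println(addressInfo.getName()+" finished");
        }

        System.out.println("All tender list data: "+JSONObject.toJSONString(tenderInfos));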


        // Use WebMagic to crawl the detail pages; initialize the spider
        TenderInfoWebmagic tenderInfoWebmagic = new TenderInfoWebmagic();
        Spider spider = Spider.create(tenderInfoWebmagic)
                .addPipeline(new ConsolePipeline())
                .setDownloader(new SeleniumDownloader("C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe"));

        tenderInfoWebmagic.addTargetUrl(spider);
        //spider.addUrl("https://www.baidu.com/");
        spider.run();
    }

    private void addTargetUrl(Spider spider) {

        System.setProperty("webdriver.chrome.driver",
                "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\Application\\75.0.3770.100\\chromedriver_win32\\chromedriver.exe");

        //Run Chrome headless so that no browser window is opened
        ChromeOptions chromeOptions=new ChromeOptions();
        chromeOptions.addArguments("--headless");
        WebDriver driver = new ChromeDriver(chromeOptions);

        driver.manage().window().maximize();

        try {
            //tenderInfos holds every detail-page url collected from the list API
            for(TenderInfo tenderInfo: tenderInfos){
                try {
                    String url = tenderInfo.getUrl();
                    driver.get(url);
                    //Locate the "tender / prequalification announcement" tab
                    WebElement element = driver.findElement(By.xpath("//li[@id='t_0101']"));
                    //Click it
                    element.click();
                    //Read the announcement url from the iframe inside the tab content
                    WebElement element1 = driver.findElement(By.xpath("//div[@id='show0101']"));
                    String substring = url.substring( url.lastIndexOf("/")+1,url.length()-6);
                    String targetUrl = element1.findElement(By.xpath("//iframe[@id='iframe0101']")).getAttribute("src");

                    //Append the list item's id to the announcement url so the two can be matched later
                    targetUrl+="?url="+substring;
                    spider.addUrl(targetUrl);
                    //Sleep 3 seconds to avoid being blocked
                    Thread.sleep(3000);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        } finally {
            //Always shut the browser down, even if a page fails
            driver.quit();
        }

    }

    static String sendPost(String url,String area,String page) throws UnsupportedEncodingException {
        try {
            //Sleep before each call so the requests are not frequent enough to get blocked
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        httpPost.addHeader("Accept","application/json");
        httpPost.addHeader("Content-Type","application/x-www-form-urlencoded; charset=UTF-8");
        httpPost.addHeader("User-Agent","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        Date currentDate = new Date();
        Calendar c = Calendar.getInstance();
        c.setTime(currentDate);
        c.add(Calendar.DATE, - 9);
        Date d = c.getTime();
        String day = format.format(d);

        //Request parameters identified by inspecting the site's own requests
        nvps.add(new BasicNameValuePair("TIMEBEGIN_SHOW",day));
        nvps.add(new BasicNameValuePair("TIMEEND_SHOW",new SimpleDateFormat("yyyy-MM-dd").format(new Date())));

        //The front end sends a 10-day window; here TIMEBEGIN is today minus 9 days and TIMEEND is today
        nvps.add(new BasicNameValuePair("TIMEBEGIN",day));
        nvps.add(new BasicNameValuePair("TIMEEND",new SimpleDateFormat("yyyy-MM-dd").format(currentDate)));

        nvps.add(new BasicNameValuePair("SOURCE_TYPE","1"));
        nvps.add(new BasicNameValuePair("DEAL_TIME","01"));
        nvps.add(new BasicNameValuePair("DEAL_CLASSIFY","01"));
        nvps.add(new BasicNameValuePair("DEAL_STAGE","0101"));

        //Province code for Shandong (hard-coded)
        nvps.add(new BasicNameValuePair("DEAL_PROVINCE","370000"));

        //City code
        nvps.add(new BasicNameValuePair("DEAL_CITY",area));

        nvps.add(new BasicNameValuePair("DEAL_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("BID_PLATFORM","0"));
        nvps.add(new BasicNameValuePair("DEAL_TRADE","0"));
        nvps.add(new BasicNameValuePair("isShowAll","1"));
        nvps.add(new BasicNameValuePair("PAGENUMBER",page));
        nvps.add(new BasicNameValuePair("FINDTXT",""));
        httpPost.setEntity(new UrlEncodedFormEntity(nvps, "utf-8"));
        String res = "";
        try {
            HttpResponse response = httpClient.execute(httpPost);
            res = EntityUtils.toString(response.getEntity(), "utf-8");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            //Release the connections held by this client
            httpClient.getConnectionManager().shutdown();
        }
        return res;
    }


    public static void handlerZhaobiao(List<ToubiaoList> data,String cityName){

        System.out.println(cityName+" data: "+JSONObject.toJSONString(data));
        //Nothing to do when the response carried no rows
        if(data==null){
            return;
        }
        Date currentDate = new Date();
        for(ToubiaoList toubiaoList : data){
            //Detail-page url
            String url = toubiaoList.getUrl();
            TenderInfo tenderInfo = new TenderInfo();
            tenderInfo.setUrl(url);
            //Address is stored as districtShow + "省" + city name
            tenderInfo.setAddress(toubiaoList.getDistrictShow()+"省"+cityName);
            //Id is the last path segment of the url without its final 6 characters
            tenderInfo.setId(url.substring( url.lastIndexOf("/")+1,url.length()-6));
            //Content (the listing title)
            tenderInfo.setContent(toubiaoList.getTitle());
            //tenderInfo.setContract_type("工程招标");
            //tenderInfo.setAnnouncement("招标公告");
            //Release time (set to the crawl date here)
            tenderInfo.setRelease_time(new SimpleDateFormat("yyyy-MM-dd").format(currentDate));
            //tenderInfo.setCreator("f8133aa5ac26efe3da55e6a6882688d7");
            String format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(currentDate);
            //tenderInfo.setCreate_time(format);
            tenderInfos.add(tenderInfo);
        }
        System.out.println(JSONObject.toJSONString(tenderInfos));
    }

    @Override
    public void process(Page page) {
        String pageUrl = page.getUrl().toString();
        Html pageHtml = page.getHtml();
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        String substring = pageUrl.substring( pageUrl.lastIndexOf("=")+1,pageUrl.length());
        String xpath = pageHtml.xpath("//div[@id=\"mycontent\"]").toString();
        String html = "<!DOCTYPE html>\n" +
                "<html lang=\"en\">\n" +
                "<head>\n" +
                "    <meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=0\">\n"+"<script src=\"jquery-1.6.4.min.js\"></script>"+
                "    <meta\n" +
                "      http-equiv=\"X-UA-Compatible\"\n" +
                "      content=\"IE=edge,chrome=1\"\n" +
                "      charset=\"utf-8\"\n" +
                "    />\n" +
                "</head>\n" +
                "<body>";
        html+=xpath;

        html+="</body>\n" +
                "</html>";


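        //Write the page as UTF-8 so the bytes match the charset declared in the <meta> tag;
        //try-with-resources closes the writer and avoids a NullPointerException when the file cannot be opened
        File fp=new File("F:\\zfpackage\\"+substring+".html");
        try (PrintWriter pfp = new PrintWriter(fp, "UTF-8")) {
            pfp.print(html);
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            e.printStackTrace();
        }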
    }

    @Override
    public Site getSite() {
        return site;
    }
}
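
The paging logic in `main` can be looked at separately from the HTTP call, which makes it easier to check that every page from 2 up to the total page count is requested and that each page contributes its own rows. The sketch below is only an illustration: `PagingSketch`, `PageResult`, `PageFetcher` and `fetchAll` are hypothetical names that do not appear in the code above, and a stub stands in for `sendPost`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PagingSketch {

    /** Hypothetical stand-in for one parsed list response: total page count plus rows. */
    static final class PageResult {
        final int totalPages;
        final List<String> rows;

        PageResult(int totalPages, List<String> rows) {
            this.totalPages = totalPages;
            this.rows = rows;
        }
    }

    /** Hypothetical abstraction over the HTTP call; page numbers are 1-based. */
    interface PageFetcher {
        PageResult fetch(int pageNumber);
    }

    /** Fetches page 1, reads the total page count from it, then walks pages 2..total. */
    static List<String> fetchAll(PageFetcher fetcher) {
        List<String> all = new ArrayList<>();
        PageResult first = fetcher.fetch(1);
        if (first == null || first.rows.isEmpty()) {
            return all; // no data for this area
        }
        all.addAll(first.rows);
        for (int page = 2; page <= first.totalPages; page++) {
            PageResult next = fetcher.fetch(page);
            if (next != null) {
                all.addAll(next.rows); // rows of the page just fetched, not of page 1
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Stub fetcher: three pages with one row each.
        PageFetcher stub = page -> new PageResult(3, Arrays.asList("row-" + page));
        System.out.println(fetchAll(stub));
    }
}
```

Keeping the total page count from the first response in its own variable means later responses cannot change the loop bound.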







public class ToubiaoList {

    private String classify;

    private String title;

    private String timeShow;

    private String stageName;

    private String  platformName;

    private String classifyShow;

    private String tradeShow;

    private String districtShow;

    private String url;

    private String stageShow;

    private String titleShow;

    public String getClassify() {
        return classify;
    }

    public void setClassify(String classify) {
        this.classify = classify;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTimeShow() {
        return timeShow;
    }

    public void setTimeShow(String timeShow) {
        this.timeShow = timeShow;
    }

    public String getStageName() {
        return stageName;
    }

    public void setStageName(String stageName) {
        this.stageName = stageName;
    }

    public String getPlatformName() {
        return platformName;
    }

    public void setPlatformName(String platformName) {
        this.platformName = platformName;
    }

    public String getClassifyShow() {
        return classifyShow;
    }

    public void setClassifyShow(String classifyShow) {
        this.classifyShow = classifyShow;
    }

    public String getTradeShow() {
        return tradeShow;
    }

    public void setTradeShow(String tradeShow) {
        this.tradeShow = tradeShow;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getStageShow() {
        return stageShow;
    }

    public void setStageShow(String stageShow) {
        this.stageShow = stageShow;
    }

    public String getTitleShow() {
        return titleShow;
    }

    public void setTitleShow(String titleShow) {
        this.titleShow = titleShow;
    }

    public String getDistrictShow() {
        return districtShow;
    }

    public void setDistrictShow(String districtShow) {
        this.districtShow = districtShow;
    }
}
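
The `url` field of `ToubiaoList` is also where the record id comes from: the crawler takes the last path segment and drops its final 6 characters, which only works while every detail URL ends in a 6-character suffix. A slightly more defensive version might look like the sketch below. `DetailIdSketch` and `extractId` are hypothetical names, and the URL in `main` is a made-up example, not a real address from the site.

```java
final class DetailIdSketch {

    /**
     * Returns the last path segment of the url without its file extension.
     * Any query string or fragment is ignored.
     */
    static String extractId(String url) {
        if (url == null || url.isEmpty()) {
            throw new IllegalArgumentException("url must not be empty");
        }
        // Cut off "?query" and "#fragment" if present.
        int cut = url.length();
        int query = url.indexOf('?');
        if (query >= 0) {
            cut = Math.min(cut, query);
        }
        int fragment = url.indexOf('#');
        if (fragment >= 0) {
            cut = Math.min(cut, fragment);
        }
        String path = url.substring(0, cut);

        // Keep only what follows the last slash.
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);

        // Drop the extension, whatever its length.
        int dot = lastSegment.lastIndexOf('.');
        return dot > 0 ? lastSegment.substring(0, dot) : lastSegment;
    }

    public static void main(String[] args) {
        // Made-up URL used purely for illustration.
        System.out.println(extractId("http://example.invalid/a/b/0012345.shtml"));
    }
}
```

Because the extension is found by its dot rather than by a fixed length, the same helper would keep working if a URL carried a different suffix or a query string.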



public class TenderInfo {


    private static final long serialVersionUID = 1L;
    private String id;
    private String content;
    private String contract_type;
    private String announcement;
    private String release_time;
    private String address;
    private String creator;
    private String create_time;
    private String modified;
    private String modify_time;
    private String version;
    private String url;

    public static long getSerialVersionUID() {
        return serialVersionUID;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public String getContract_type() {
        return contract_type;
    }

    public void setContract_type(String contract_type) {
        this.contract_type = contract_type;
    }

    public String getAnnouncement() {
        return announcement;
    }

    public void setAnnouncement(String announcement) {
        this.announcement = announcement;
    }

    public String getRelease_time() {
        return release_time;
    }

    public void setRelease_time(String release_time) {
        this.release_time = release_time;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    public String getCreator() {
        return creator;
    }

    public void setCreator(String creator) {
        this.creator = creator;
    }

    public String getCreate_time() {
        return create_time;
    }

    public void setCreate_time(String create_time) {
        this.create_time = create_time;
    }

    public String getModified() {
        return modified;
    }

    public void setModified(String modified) {
        this.modified = modified;
    }

    public String getModify_time() {
        return modify_time;
    }

    public void setModify_time(String modify_time) {
        this.modify_time = modify_time;
    }

    public String getVersion() {
        return version;
    }

    public void setVersion(String version) {
        this.version = version;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}
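
As a closing note on the request parameters: `sendPost` builds the `TIMEBEGIN` / `TIMEEND` pair with `Calendar` and several `SimpleDateFormat` instances. On Java 8 or later the same 10-day window can be computed with `java.time`, as in the sketch below. `DateWindowSketch` and `window` are hypothetical names, and treating the window as inclusive of both ends is an assumption taken from the "today minus 9 days" arithmetic in the code above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class DateWindowSketch {

    // ISO_LOCAL_DATE prints dates as yyyy-MM-dd, the format used in the request.
    private static final DateTimeFormatter FMT = DateTimeFormatter.ISO_LOCAL_DATE;

    /**
     * Returns {begin, end} as yyyy-MM-dd strings for a window of the given
     * number of days that ends on 'today' and includes both end points.
     */
    static String[] window(LocalDate today, int days) {
        if (days < 1) {
            throw new IllegalArgumentException("days must be at least 1");
        }
        LocalDate begin = today.minusDays(days - 1L);
        return new String[] { FMT.format(begin), FMT.format(today) };
    }

    public static void main(String[] args) {
        String[] w = window(LocalDate.now(), 10);
        System.out.println("TIMEBEGIN=" + w[0] + " TIMEEND=" + w[1]);
    }
}
```

Passing `today` in as a parameter keeps the calculation deterministic, so it can be checked with a fixed date instead of the system clock.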


 
