A Front-End/Back-End Separated WebGIS Project (Part 1)
Brief overview: the backend is developed with Django + Python and deployed with Apache + WSGI + Python; the frontend uses Vue + Leaflet. The project displays the distribution of China's 5A-rated scenic spots. The data was scraped from qunar.com and stored in a MySQL database; the scraped data is incomplete, and since this is a personal practice project I did no further cleaning. Backend IDE: PyCharm; frontend IDE: WebStorm.
1. Backend: Django + Python development, Apache + WSGI + Python deployment
I won't go into setting up the development environment here; there are plenty of tutorials online for installing Python and PyCharm.
- Django + Python development
Versions: Python 3.7.0, Django 2.0
First open PyCharm and click File > New Project, then choose Pure Python. Don't create a Django project directly: that automatically installs the latest Django, which makes it hard to pick a specific Django version afterwards.
Then, in the Terminal at the bottom left of PyCharm, install Django and create the project and app (all commands below are run in the Terminal).
Create the project with the following command:
django-admin startproject myproject
Then open the newly created project; it looks like this:
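For Django 2.0, the generated project layout looks like this:

```
myproject/
├── manage.py
└── myproject/
    ├── __init__.py
    ├── settings.py
    ├── urls.py
    └── wsgi.py
```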
Create the app with the following command:
python manage.py startapp prjApp
Install the specified Django version (note that Django must already be installed for the `django-admin` command above to be available, so do this first if it isn't):
pip install django==2.0
Initialize the default SQLite database:
python manage.py makemigrations
python manage.py migrate
This project uses MySQL, so a driver package must be installed and the MySQL connection configured. First, add the following to the `__init__.py` under the project package:
import pymysql
pymysql.install_as_MySQLdb()
Then install the MySQL driver package with the following command:
pip install PyMySQL
Configure the MySQL database in the project package's settings.py, under the DATABASES setting, as follows.
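The DATABASES setting might look like the following sketch; the database name, user, and password here match the credentials used later in the crawler script, and should be adjusted for your own environment:

```python
# settings.py — MySQL connection (values are examples; adjust to your setup)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'test',        # database name, as used by the crawler below
        'USER': 'root',
        'PASSWORD': 'root',
        'HOST': 'localhost',
        'PORT': '3306',
    }
}
```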
Create a class in the app's models.py to hold the scraped data.
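A sketch of what the model class might look like, with field names inferred from the crawler and view code later in this post (the `max_length` values are guesses; adjust as needed):

```python
from django.db import models


class Spot(models.Model):
    name = models.CharField(max_length=100)
    districts = models.CharField(max_length=100)
    point = models.CharField(max_length=50)   # "lng,lat" string
    address = models.CharField(max_length=200)
    data_id = models.CharField(max_length=20)
    level = models.CharField(max_length=20)
    product_star_level = models.CharField(max_length=20)
    intro = models.TextField(blank=True)
    type = models.CharField(max_length=50)

    class Meta:
        # Pin the table name so it matches the crawler's
        # "insert into webmap_spot(...)" statement below.
        db_table = 'webmap_spot'
```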
Apply the classes created in models.py to the database (generating the tables) with the following commands:
python manage.py makemigrations
python manage.py migrate
Create a new xxx.py file under the app to scrape the data. The crawler is adapted from another author's code; I modified it to scrape only 5A scenic spots and to include spot images. Scraping is not fast, only part of the data is fetched for each spot category, and some spots belong to multiple categories. The code is as follows:
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
import random
from time import sleep
import urllib.request
import pymysql

a = 0
# Directory for saving images; without this they end up in the script's working directory.
path = r'D:\WatchFileTest\images'  # the r prefix keeps the string raw, i.e. backslashes are not treated as escapes
User_Agent = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"]
HEADERS = {
    'User-Agent': User_Agent[random.randint(0, 4)],
    # 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/201002201 Firefox/55.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Cookie': '',
    'Connection': 'keep-alive',
    'Pragma': 'no-cache',
    'Cache-Control': 'no-cache'
}


def download_page(url):  # fetch a page
    try:
        data = requests.get(url, headers=HEADERS, allow_redirects=True).content  # request the page and return its raw content
        return data
    except:
        pass


# Fetch a page; if it cannot be fetched, wait one second and retry.
def download_soup_waitting(url):
    try:
        response = requests.get(url, headers=HEADERS, allow_redirects=False, timeout=5)
        if response.status_code == 200:
            html = response.content
            html = html.decode("utf-8")
            soup = BeautifulSoup(html, "html.parser")
            return soup
        else:
            sleep(1)
            print("waiting...")
            return download_soup_waitting(url)
    except:
        return ""


def getTypes():
    types = ["故居", "宗教", "文化古蹟", "自然風光", "公園", "古建築", "寺廟", "遺址", "古鎮", "陵墓陵園"]  # the site has more categories than these; extend as needed
    for type in types:
        url = "http://piao.qunar.com/ticket/list.htm?keyword=%E7%83%AD%E9%97%A8%E6%99%AF%E7%82%B9&region=&from=mpl_search_suggest&subject=" + type + "&page=1"
        getType(type, url)


def getType(type, url):
    db = pymysql.connect(host='localhost', user='root', passwd='root', db='test')  # connect to the database (host, user, password, database name)
    cur = db.cursor()  # get a cursor
    global a
    soup = download_soup_waitting(url)
    search_list = soup.find('div', attrs={'id': 'search-list'})
    sight_items = search_list.findAll('div', attrs={'class': 'sight_item'})
    for sight_item in sight_items:
        a5spot = '5A'
        level = sight_item.find('span', attrs={'class': 'level'})
        if level:
            if a5spot in level.text:  # keep only 5A scenic spots
                name = sight_item['data-sight-name']
                districts = sight_item['data-districts']
                point = sight_item['data-point']
                address = sight_item['data-address']
                data_id = sight_item['data-id']
                level = sight_item.find('span', attrs={'class': 'level'})
                if level:
                    level = level.text
                else:
                    level = ""
                product_star_level = sight_item.find('span', attrs={'class': 'product_star_level'})
                if product_star_level:
                    product_star_level = product_star_level.text
                else:
                    product_star_level = ""
                intro = sight_item.find('div', attrs={'class': 'intro'})
                if intro:
                    intro = intro['title']
                else:
                    intro = ""
                link = sight_item['data-sight-img-u-r-l']
                # Save the image, named by data_id to avoid collisions;
                # urllib.request.urlretrieve downloads the remote file to the local path.
                urllib.request.urlretrieve(link, path + '\%s.jpg' % data_id)
                print(name, districts, point, address, data_id, level, product_star_level, intro, type, link)
                # Insert the row into MySQL (parameterized, so quoting and injection are handled by the driver)
                sql = "insert into webmap_spot(name, districts, point, address, data_id, level, product_star_level, " \
                      "intro, type) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)"
                cur.execute(sql, (name, districts, point, address, data_id, level, product_star_level, intro, type))
                db.commit()  # the insert only takes effect after commit
                print('Inserted', cur.rowcount, 'row(s)')
                a = a + 1
                print(a)
    cur.close()
    db.close()
    next = soup.find('a', attrs={'class': 'next'})
    if next:
        next_url = "http://piao.qunar.com" + next['href']
        getType(type, next_url)


if __name__ == '__main__':
    getTypes()
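As a side note, the `data-point` attribute scraped above is a single "lng,lat" string. A small helper like the following (hypothetical, not part of the original script) splits it into floats, which is handy later because Leaflet expects coordinates in [lat, lng] order:

```python
def parse_point(point):
    """Convert a 'lng,lat' string into a (lat, lng) tuple of floats."""
    lng, lat = point.split(',')
    return float(lat), float(lng)

# Example: the point string scraped for a spot
lat, lng = parse_point("116.397029,39.917839")
```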
Partial results of the scraped data:
Write the method that returns the data in the app's views.py:
from django.shortcuts import render
from django.http import HttpResponse
from django.core import serializers
from .models import Spot
import json


def get_spot_data(request):
    if request.method == 'GET':
        allData = Spot.objects.all()
        allData = allData.values('name', 'districts', 'point', 'address', 'data_id', 'level', 'product_star_level', 'intro', 'type').distinct()  # drop duplicate rows
        print(json.dumps(list(allData)))  # values() returns a ValuesQuerySet rather than a plain QuerySet, so convert it to a list first and then serialize with json.dumps
        return HttpResponse(json.dumps(list(allData)), content_type="application/json")
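One detail worth knowing about `json.dumps`: by default it escapes non-ASCII characters, so Chinese spot names come out as `\uXXXX` sequences in the response body. Passing `ensure_ascii=False` keeps them readable; the rows below are illustrative, not real database contents:

```python
import json

# Sample rows shaped like the values() queryset in the view above
spots = [
    {"name": "故宮博物院", "point": "116.397,39.917", "level": "5A景区"},
    {"name": "頤和園", "point": "116.275,39.999", "level": "5A景区"},
]

# ensure_ascii=False keeps the Chinese names as-is instead of \uXXXX escapes
payload = json.dumps(spots, ensure_ascii=False)
```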
In the app's urls.py, wire the route to the view; urls.py acts as the router:
from django.contrib import admin
from django.urls import path
# import views from the app
from prjApp import views

urlpatterns = [
    path('', admin.site.urls),
    path('spot/virusdata', views.get_spot_data)  # route to the view method we just wrote
]
Start the project and open http://127.0.0.1:8000/spot/virusdata in a browser (spot/virusdata is the path we just defined) to get the JSON result.
- Apache + WSGI + Python deployment
Version: Apache 2.4.41
I followed another author's very detailed guide, so I won't repeat it here. Note that the project's dependencies live in a virtual environment, so they also need to be installed in the deployment environment. I deployed on multiple ports, and additionally mapped the folder of scraped images so that the frontend can load the pictures easily later on.
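For reference, a minimal sketch of what the Apache virtual host might look like, assuming mod_wsgi is already loaded; the port, paths, and project name are placeholders, and the Alias maps the scraped-image folder mentioned above:

```apache
# Sketch only — adjust port, paths and project name to your deployment
Listen 8080
WSGIPythonPath D:/myproject

<VirtualHost *:8080>
    WSGIScriptAlias / D:/myproject/myproject/wsgi.py

    # Map the scraped image folder so the frontend can request
    # pictures directly, e.g. /images/<data_id>.jpg
    Alias /images D:/WatchFileTest/images
    <Directory D:/WatchFileTest/images>
        Require all granted
    </Directory>

    <Directory D:/myproject/myproject>
        <Files wsgi.py>
            Require all granted
        </Files>
    </Directory>
</VirtualHost>
```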