Installing the popular script editor Jupyter Notebook: step by step

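Jupyter Notebook is a popular, lightweight, browser-based code editor that supports dozens of programming languages.
It is also feature-rich: writing documentation, doing data-science analysis, and running computations are all convenient in it.
Jupyter Notebook is available on both Windows and Linux. Installing it on Windows is very simple, but installing it on Linux is more involved. I spent quite a lot of time getting it installed, so I am sharing the steps here.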

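1, Install Python 2.7 from the source package

(Python 3.x has some incompatibilities here, so 2.7 is used.)

Download the Python 2.7.6 source package from the official site.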

https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz

Extract it:

xz -d Python-2.7.6.tar.xz

tar -xvf Python-2.7.6.tar

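After extracting, enter the Python-2.7.6 directory and run the following:

sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make && sudo make altinstall

If no errors are reported, Python 2.7 is now installed on your server.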

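2, Change which Python version the server's python command uses by default (usually 2.6 or lower)

Typing python in a terminal shows that the system still uses its original 2.6, not our 2.7.

At this point you can run which python to see which file the python command resolves to; usually it is /usr/bin/python.

That python, however, is the 2.6 that ships with the system, while the python2.7 we installed is at /usr/local/bin/python2.7 (the leading part of the path depends on your install prefix).

We only need to move /usr/bin/python out of the way (rename it rather than rm it, because the 2.6 interpreter is still needed later) and then create a symlink: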

sudo mv /usr/bin/python /usr/bin/python2.6
sudo ln -s /usr/local/bin/python2.7 /usr/bin/python

Now type python in the terminal once more.

Bingo! It is 2.7.
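To confirm which interpreter the python command now starts, a short script can report it. This is only a sketch; describe_interpreter is a name made up for this example, and the wide-unicode check assumes you want to verify the --enable-unicode=ucs4 option used above.

```python
import sys


def describe_interpreter():
    """Return basic facts about the interpreter that is running this code."""
    return {
        "version": "{0}.{1}.{2}".format(*sys.version_info[:3]),
        "executable": sys.executable,
        # sys.maxunicode is 0x10FFFF on a wide (UCS-4) build and 0xFFFF on a
        # narrow Python 2 build.
        "wide_unicode": sys.maxunicode == 0x10FFFF,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_interpreter().items()):
        print("{0}: {1}".format(key, value))
```

Saved to a file and run as python followed by the file name, it should report a 2.7.x version if the symlink is in effect.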

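3, yum no longer works

If you now run yum install xxxx, it complains that the yum module cannot be found.

yum depends on Python. After we changed the default python, yum is run by our 2.7, which has no yum module, so it fails.

Use which yum (or whereis yum) to locate the yum script, open it in an editor,

whereis yum

sudo vi /usr/bin/yum

and change its first line from #!/usr/bin/python to #!/usr/bin/python2.6

(/usr/bin/python2.6 still exists.) With that change yum works again.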
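If you would rather script the shebang change than do it in vi, the edit itself is tiny. The following is a minimal sketch that works on the text of a script; rewrite_shebang is a hypothetical helper, and reading and writing /usr/bin/yum (as root, after making a backup) is left to you.

```python
def rewrite_shebang(script_text, new_interpreter):
    """Return script_text with its first-line shebang pointing at new_interpreter.

    Text that does not start with '#!' is returned unchanged. Any interpreter
    arguments on the original shebang line are dropped.
    """
    if not script_text.startswith("#!"):
        return script_text
    _first_line, newline, rest = script_text.partition("\n")
    return "#!" + new_interpreter + newline + rest


if __name__ == "__main__":
    sample = "#!/usr/bin/python\nimport yum\n"
    print(rewrite_shebang(sample, "/usr/bin/python2.6"))
```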

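4, Install setuptools and pip

As everyone knows, pip is a very handy tool for working with Python, and it depends on setuptools, so setuptools has to be installed first. (I downloaded the setuptools and pip packages directly from the official sites.)

(1) Install setuptools

The installation failed with an error saying that Python's zlib module could not be found:

Download zlib and zlib-dev

Extract the archive, enter the directory, and run:

sudo ./configure --prefix=/usr/local/zlib-1.2.11
sudo make
sudo make install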




sudo yum -y install python-setuptools

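(2) Install pip's dependencies, openssl and openssl-devel

Installing pip failed again, this time because the HTTPSHandler module could not be loaded.

After searching online, the cause turned out to be that openssl and openssl-devel were not installed on the system. On my system only openssl-devel was missing, so I installed it.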

sudo yum -y install openssl
sudo yum -y install openssl-devel

Then recompile and install Python 2.7 with the same commands as before:

sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make && sudo make altinstall

Download pip

wget https://bootstrap.pypa.io/get-pip.py --no-check-certificate

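Install pip

sudo python get-pip.py

Check the pip version

pip -V

After installing pip it is a good idea to install py4j, because the pyspark environment needs this module.

pip install py4j --upgrade

After installing, recompile: run make for Python again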

Install jupyter

sudo pip install jupyter

Check the jupyter version:

jupyter --version

Start jupyter

jupyter notebook

This reports an error:

Traceback (most recent call last):
  File "/usr/local/bin/jupyter-notebook", line 7, in <module>
    from notebook.notebookapp import main
  File "/usr/local/lib/python2.7/site-packages/notebook/notebookapp.py", line 79, in <module>
    from .services.sessions.sessionmanager import SessionManager
  File "/usr/local/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 13, in <module>
    from pysqlite2 import dbapi2 as sqlite3
ImportError: No module named 'pysqlite2'

This indicates that the SQLite development files (sqlite3-dev) are missing. Download them:

sudo wget http://www.sqlite.org/2014/sqlite-autoconf-3080500.tar.gz

Or:

sudo wget http://sqlite.org/2013/sqlite-autoconf-3080100.tar.gz

Install sqlite3-dev:

tar xvfz sqlite-autoconf-3080100.tar.gz

cd sqlite-autoconf-3080100

sudo ./configure

sudo make

sudo make install

Note: you may also need libsqlite3-0-32bit-3.8.10.2-10.1.x86_64.rpm; search for it on Baidu, then download and install it.

Then recompile Python:

sudo ./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
sudo make && sudo make altinstall
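Each rebuild above was triggered by a standard-library module that had been silently skipped at compile time (zlib, then the HTTPS support that needs ssl, then sqlite3). A quick way to see what is still missing, before starting jupyter again, is to try importing them. This is a small sketch; missing_modules is a name chosen for this example.

```python
import importlib


def missing_modules(names):
    """Return the subset of names that this interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The three modules this guide had to rebuild Python for.
    print(missing_modules(["zlib", "ssl", "sqlite3"]))
```

An empty list means the rebuilt interpreter picked up all three libraries; anything listed still needs its development package installed before the next make.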

Configure jupyter

Edit /etc/profile

# jupyter------------------------

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

source /etc/profile

Then run it again:

jupyter notebook

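If the problem persists, the build may be incomplete because of permission problems. Run:

make clean

then compile again; repeat a few times if necessary.

Change jupyter's default port:

sudo vi ~/.jupyter/jupyter_notebook_config.py

If jupyter_notebook_config.py does not exist yet, generate it: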

jupyter notebook --generate-config

Generate a password hash: start ipython from the shell

ipython

and you will see:

In [1]: from notebook.auth import passwd

In [2]: passwd()

Enter password: 

Verify password: 

Out[2]: 'sha1:ce23d945972f:34769685a7ccd3d08c84a18c63968a41f1140274'

Copy the generated hash ('sha1:ce…').
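The string that passwd() returns has three colon-separated parts: the hash algorithm, a random salt, and a hex digest. The sketch below shows how such a string can be built and checked. It assumes the digest is taken over the passphrase followed by the salt, which is my understanding of the classic notebook's behaviour and not something this post confirms; make_hash and check_hash are illustrative names, not part of notebook.auth.

```python
import hashlib


def make_hash(passphrase, salt, algorithm="sha1"):
    """Build an 'algorithm:salt:digest' string from a passphrase and a salt.

    Assumption: the digest covers the passphrase bytes followed by the salt.
    """
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def check_hash(hashed, passphrase):
    """Return True if passphrase reproduces the digest stored in hashed."""
    parts = hashed.split(":", 2)
    if len(parts) != 3:
        return False
    algorithm, salt, _digest = parts
    return make_hash(passphrase, salt, algorithm) == hashed
```

For real use, keep calling passwd() as shown above; it also takes care of generating the random salt.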

Edit the defaults in jupyter_notebook_config.py:

c.NotebookApp.ip = '*'

c.NotebookApp.password = u'sha1:ce…the hash you just copied'

c.NotebookApp.open_browser = False

c.NotebookApp.port = 8888  # any free port will do

Then change the default port from 8888 to 8990.

c.NotebookApp.port = 8990
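Before settling on a port number it helps to know whether something else on the server is already using it. The following is a minimal sketch that tries to bind the port on the loopback interface; port_is_free is a hypothetical helper, and it only checks the address you pass in.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        # Typically "address already in use" or "permission denied".
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    for candidate in (8888, 8990):
        print("{0}: {1}".format(candidate, port_is_free(candidate)))
```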

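Start jupyter notebook again

If you cannot log in, the server's firewall settings may be the cause. The simplest workaround is to open an ssh tunnel from your local machine.

In a local terminal, enter (replace 8888 with the port jupyter is actually listening on, 8990 with the configuration above):

ssh username@address_of_remote -L127.0.0.1:1234:127.0.0.1:8888

You can then reach the remote jupyter directly at localhost:1234.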
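The -L option reads as local address, local port, target address, target port, where the target is resolved from the remote host's point of view. A small helper makes the pieces explicit. This is only a sketch; tunnel_command is a made-up name and it builds the argument list without running anything.

```python
def tunnel_command(user, remote_host, local_port, remote_port):
    """Build the ssh argument list for a local port forward.

    ssh listens on 127.0.0.1:local_port on this machine and forwards each
    connection to 127.0.0.1:remote_port as seen from remote_host.
    """
    forward = "127.0.0.1:{0}:127.0.0.1:{1}".format(local_port, remote_port)
    return ["ssh", "{0}@{1}".format(user, remote_host), "-L", forward]


if __name__ == "__main__":
    print(" ".join(tunnel_command("username", "address_of_remote", 1234, 8990)))
```

The list can be passed to subprocess.call to open the tunnel from a script.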

Finally, jupyter can be opened in a browser at http://10.0.0.120:8990

Create a folder to hold the scripts written in the jupyter editor:

mkdir ~/jupyter_script

chmod -R 777 ~/jupyter_script

Click New at the top right of the page and choose Python. An error appears:

Permission denied: Untitled.ipynb

Run the following to change the permissions on some of Jupyter's files (restart jupyter afterwards):

sudo chmod 777 ~/.local/share/jupyter/
cd ~/.local/share/jupyter/
ls
sudo chmod 777 runtime/
cd runtime/
ls

Reference: http://www.cnblogs.com/uestc-mm/p/7168550.html

Installing the Spark editor

Download and install toree (for Spark 2.1.0 and above with Scala 2.11 and above)

Toree 0.2.0 download URL: https://dist.apache.org/repos/dist/dev/incubator/toree/

Preferably use the toree-0.2.0.tar.gz release

pip install -i https://pypi.anaconda.org/hyoon/simple toree

Or download toree-0.2.0.dev1.tar.gz for offline use and install it:

pip install toree-0.2.0.dev1.tar.gz

Or:

pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz

An error appears:

Processing ./toree-0.2.0.dev1.tar.gz
Requirement already satisfied: jupyter_core<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Collecting jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x27af450>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x27af250>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6f410>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6f110>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d6fd10>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/jupyter-client/
  Could not find a version that satisfies the requirement jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1) (from versions: )
No matching distribution found for jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)

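This is probably caused by the GFW. The fix is to add a couple of Google DNS servers (tested and working).

References: https://github.com/moby/moby/issues/30757

https://stackoverflow.com/questions/28668180/cant-install-pip-packages-inside-a-docker-container-with-ubuntu

(1) Add the following name servers to /etc/resolv.conf: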

nameserver 8.8.8.8
nameserver 8.8.4.4

However, this change is not permanent. To make it permanent, edit the dhclient configuration:

sudo nano /etc/dhcp/dhclient.conf

Uncomment the line starting with prepend domain-name-servers and edit it to read:

prepend domain-name-servers 8.8.8.8, 8.8.4.4;

Then restart dhclient:

sudo dhclient
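To check whether name resolution works after changing the DNS servers, and before retrying pip, you can ask the system resolver directly. This is a minimal sketch; can_resolve is a hypothetical helper, and which host name to test depends on the package index your pip is configured to use.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Calling can_resolve with the index host (pypi.org is an assumption here) should return True once the new name servers are in effect. A True result only says DNS works; the connection itself can still be blocked.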

(2) Install:

pip install toree-0.2.0.dev1.tar.gz

Result:

Processing ./toree-0.2.0.dev1.tar.gz
Requirement already satisfied: jupyter_core<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Collecting jupyter_client<5.0,>=4.0 (from toree==0.2.0.dev1)
/usr/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading jupyter_client-4.4.0-py2.py3-none-any.whl (76kB)

    100% |████████████████████████████████| 81kB 194kB/s

Requirement already satisfied: traitlets<5.0,>=4.0 in ./lib/python2.7/site-packages (from toree==0.2.0.dev1)
Requirement already satisfied: pyzmq>=13 in ./lib/python2.7/site-packages (from jupyter_client<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: decorator in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: ipython-genutils in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: enum34; python_version == "2.7" in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Requirement already satisfied: six in ./lib/python2.7/site-packages (from traitlets<5.0,>=4.0->toree==0.2.0.dev1)
Building wheels for collected packages: toree
  Running setup.py bdist_wheel for toree ... done
  Stored in directory: /home/infosouth/.cache/pip/wheels/05/8c/59/313ad78c88005d86c240c7891a8fde548f29f0d64203a9bc07
Successfully built toree
Installing collected packages: jupyter-client, toree
  Found existing installation: jupyter-client 5.1.0

    Uninstalling jupyter-client-5.1.0:
      Successfully uninstalled jupyter-client-5.1.0

Successfully installed jupyter-client-4.4.0 toree-0.2.0.dev1

Reference: https://github.com/apache/incubator-toree

For Spark 1.5.0 and above with Scala 2.10, download and install toree 0.1.0 instead:

pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.1.0/snapshots/toree-0.1.0.dev8.tar.gz
jupyter toree install

Or download toree 0.1.0 from the Apache Toree website

Install the Spark kernel and Scala (script):

#!/bin/bash
jupyter toree install --spark_home=$SPARK_HOME --user #will install scala + spark kernel
jupyter toree install --spark_home=$SPARK_HOME --interpreters=PySpark --user
jupyter kernelspec list
jupyter notebook #launch jupyter notebook
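jupyter kernelspec list prints the kernels it finds; each one is simply a directory containing a kernel.json file. If the Toree kernels do not show up in the notebook's New menu, it can help to look at those files directly. This is a sketch under the assumption that the --user install placed them in the per-user kernels directory on Linux; read_kernelspecs is a name chosen for this example.

```python
import json
import os


def read_kernelspecs(kernels_dir):
    """Map kernel name -> display name for each <kernels_dir>/<name>/kernel.json."""
    found = {}
    if not os.path.isdir(kernels_dir):
        return found
    for name in sorted(os.listdir(kernels_dir)):
        spec_path = os.path.join(kernels_dir, name, "kernel.json")
        if os.path.isfile(spec_path):
            with open(spec_path) as handle:
                spec = json.load(handle)
            found[name] = spec.get("display_name", name)
    return found


if __name__ == "__main__":
    # Assumed per-user kernels directory on Linux.
    user_dir = os.path.expanduser("~/.local/share/jupyter/kernels")
    for name, display in sorted(read_kernelspecs(user_dir).items()):
        print("{0}: {1}".format(name, display))
```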

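Change jupyter's default working directory:

sudo vi ~/.jupyter/jupyter_notebook_config.py

Find c.NotebookApp.notebook_dir and set it to your own directory:

c.NotebookApp.notebook_dir = '/path/to/your/workspace'

Configuring cross-origin access to jupyter notebook. The errors that appear:

Refused to display

Content Security Policy directive: "frame-ancestors 'self'"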

sudo vi ~/.jupyter/jupyter_notebook_config.py

c.NotebookApp.allow_origin = '*'
c.NotebookApp.trust_xheaders = True

c.NotebookApp.disable_check_xsrf = True
c.NotebookApp.tornado_settings = {
    'headers': {
            'Content-Security-Policy': ""
    }
}

Reference: http://www.ruanyifeng.com/blog/2016/09/csp.html

Also add:

cd ~/.jupyter

mkdir custom

chmod -R 755 custom

cd custom

sudo vi custom.js

Add the following content (it makes every link open in the same page instead of a new one):

define(['base/js/namespace'], function(Jupyter){
Jupyter._target = '_self';
});

Running small programs

Running Python code (example):

pip install matplotlib

If this fails, the GFW is the cause. Download the following 4 packages for offline installation:

numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl

matplotlib-2.1.0-cp27-cp27mu-manylinux1_x86_64.whl

six-1.11.0-py2.py3-none-any.whl

python_dateutil-2.6.0-py2.py3-none-any.whl

Install them one by one
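When picking wheels by hand, the file name tells you whether a wheel matches your interpreter: it encodes the package name, version, Python tag, ABI tag, and platform. For example, cp27mu means CPython 2.7 built with wide unicode, which matches the --enable-unicode=ucs4 build from earlier. The following is a small sketch of how to split a name; parse_wheel_name is a hypothetical helper, not a pip API.

```python
def parse_wheel_name(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename: " + filename)
    keys = ("name", "version", "python", "abi", "platform")
    return dict(zip(keys, parts))


if __name__ == "__main__":
    print(parse_wheel_name("numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl"))
```

Wheels tagged py2.py3-none-any, such as six, are pure Python and install on any platform.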

Test program:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(20)
y = x**2
plt.plot(x, y)

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

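# Set the global tick-label font size for both axes via rcParams
mpl.rcParams['xtick.labelsize'] = 24
mpl.rcParams['ytick.labelsize'] = 24

np.random.seed(42)

# Sample points along the x axis
x = np.linspace(0, 5, 100)

# The data is generated by adding noise to the curve below, so y itself serves as the fitted model
y = 2*np.sin(x) + 0.3*x**2
y_data = y + np.random.normal(scale=0.3, size=100)

# figure() selects a figure by name
plt.figure('data')

# '.' draws a scatter plot in which every point is a small dot
plt.plot(x, y_data, '.')

# Plot the model; plot() draws a connected line by default
plt.figure('model')
plt.plot(x, y)

# Draw both in one figure
plt.figure('data & model')

# 'k' sets the line color (black) and lw sets the line width
# Besides the color, the third argument can also set the line style, e.g. 'r--' is a red dashed line
# More properties are listed on the official site: http://matplotlib.org/api/pyplot_api.html
plt.plot(x, y, 'k', lw=3)

# scatter makes it easier to produce a scatter plot
plt.scatter(x, y_data)

# Save the current figure to the file result.png
plt.savefig('result.png')

# This call is needed for the finished figures to be shown on screen
plt.show()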

Running Scala code (example):

var name="jupyter"
println(f"hello $name")

sc.parallelize(1 to 100).reduce(_+_)

sc.parallelize(1 to 100).mean()
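To know what the two Spark expressions above should return, the same quantities can be computed in plain Python (no Spark involved). This is just a cross-check on the expected values.

```python
values = range(1, 101)  # the same elements as Scala's 1 to 100
total = sum(values)
mean = total / float(len(values))
print("{0} {1}".format(total, mean))
```

The reduce call should give 5050 and mean() should give 50.5.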

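import com.google.gson.Gson

// Toree already provides a SparkContext named sc, so none is created here.
// User was not defined in the original post; this minimal class is an assumption.
// Gson fills the fields by reflection, and its jar must be on the kernel's classpath.
class User { var id: Int = 0; var name: String = _; var pwd: String = _; var sex: Int = 0 }

val datas: Array[String] = Array(
"{'id':1,'name':'xl1','pwd':'xl123','sex':2}",
"{'id':2,'name':'xl2','pwd':'xl123','sex':1}",
"{'id':3,'name':'xl3','pwd':'xl123','sex':2}")

val users = sc.parallelize(datas).map(v => new Gson().fromJson(v, classOf[User]))
users.foreach(user => println("id: " + user.id + " name: " + user.name + " pwd: " + user.pwd + " sex: " + user.sex))

If these examples run, the installation was successful.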
