Data Mining Tool NumPy (Part 1): NumPy Basics

Original article

TFATS

2020-05-21 04:36

I. Advantages of NumPy

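An ndarray object consists of a contiguous one-dimensional region of computer memory, together with an indexing scheme that maps each element to a position in that memory block. The block stores its elements in either row-major order (C style) or column-major order (Fortran or MATLAB style).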
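To make the two layouts concrete, here is a minimal sketch (the array values and the int32 dtype are chosen only for illustration) that stores the same 2x3 data in both orders and inspects the strides, that is, the number of bytes to step in order to move one index along each axis.

```python
import numpy as np

# A small 2x3 array stored row by row (C order, NumPy's default).
c_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order="C")
# The same values stored column by column (Fortran order).
f_arr = np.asfortranarray(c_arr)

# strides: bytes to step to reach the next index along each axis.
print(c_arr.strides)  # (12, 4): the next row is 3 items * 4 bytes away
print(f_arr.strides)  # (4, 8): the next row is 1 item away, the next column 2 items

# order="K" reads the elements in the order they sit in memory.
print(c_arr.ravel(order="K"))  # [1 2 3 4 5 6]
print(f_arr.ravel(order="K"))  # [1 4 2 5 3 6]
```

Both arrays compare equal element by element; only the arrangement of the underlying memory block differs.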

1. Advantages of NumPy

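NumPy's main advantage is speed. It is built for numerical data and is mostly used to perform numerical operations on large, multidimensional arrays.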

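NumPy stores array data in one contiguous block of memory. Two layouts are available: C order (row-major) and Fortran order (column-major).
NumPy's core is implemented in C, and its linear algebra routines call into a BLAS library. Whether those routines run on multiple threads depends on the BLAS build that NumPy is linked against; with a multi-threaded BLAS, operations such as matrix multiplication can use several cores.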

2. Code example

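The following example shows NumPy's speed advantage: in the run recorded below, np.sum was roughly 10 times faster than Python's built-in sum over 100,000 floats.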

import timeit
import random
import numpy as np

# Build a Python list of 100,000 random floats and an ndarray copy of it
a = []
for i in range(100000):
    a.append(random.random())
b = np.array(a)

def normal_add():
    # Sum the list with Python's built-in sum
    return sum(a)

def numpy_add():
    # Sum the array with NumPy
    return np.sum(b)

# Time 1000 calls of each function
timer = timeit.Timer(normal_add)
print("%s: %f seconds" % (normal_add, timer.timeit(number=1000)))
timer = timeit.Timer(numpy_add)
print("%s: %f seconds" % (numpy_add, timer.timeit(number=1000)))

# -----------output-----------------
<function normal_add at 0x000001CADA40C1E0>: 0.504544 seconds
<function numpy_add at 0x000001CAE3276510>: 0.047483 seconds
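A single timing run can be noisy, so the comparison above can also be sketched with timeit.repeat, keeping the fastest of several runs. The seeded generator, the variable names and the repeat counts here are assumptions made for illustration; the measured times will differ from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)      # seeded generator, only for repeatability
values = rng.random(100_000)        # NumPy array of floats
values_list = values.tolist()       # the same numbers as a plain Python list

# Both approaches should agree on the result up to floating-point rounding.
assert np.isclose(sum(values_list), np.sum(values))

# repeat() measures several times; the minimum is the least disturbed run.
list_time = min(timeit.repeat(lambda: sum(values_list), number=100, repeat=5))
numpy_time = min(timeit.repeat(lambda: np.sum(values), number=100, repeat=5))
print(f"list sum:  {list_time:.6f} s")
print(f"numpy sum: {numpy_time:.6f} s")
```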

II. NumPy array attributes:

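The most important feature of NumPy is its N-dimensional array object, the ndarray, a fast and flexible container for large data sets. It lets you apply mathematical operations to whole blocks of data at once.
An ndarray is a general-purpose multidimensional container for homogeneous data: all of its elements must be of the same type. Every array has a shape (a tuple giving the size of each dimension) and a dtype (an object describing the data type of the array):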
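As a small illustration of these two points (the sample values are arbitrary), the sketch below applies arithmetic to a whole array at once and shows that mixing integers and floats still produces a single common element type.

```python
import numpy as np

data = np.array([[1.5, 2.0, 3.5], [4.0, 5.5, 6.0]])
print(data.shape)   # (2, 3)
print(data.dtype)   # float64

# Arithmetic applies to every element at once, with no explicit loop.
scaled = data * 10
total = data + data

# Mixing ints and floats still yields one common type for the whole array.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64
```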

1. The array's class

temp1 = np.arange(12)
print(temp1,type(temp1))

# -------------output---------------------
[ 0  1  2  3  4  5  6  7  8  9 10 11] <class 'numpy.ndarray'>

2. Data type (the type of the elements stored in the array)

temp3 = np.array(range(1,6))
print(temp3.dtype)

# -------------output---------------------
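int32     # the default integer type depends on the platform and NumPy version (this run gave int32; 64-bit Linux and macOS typically give int64)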
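Because the inferred integer type varies between platforms, it can help to request a dtype explicitly or to check only the kind of type. A minimal sketch, with variable names chosen for illustration:

```python
import numpy as np

ints = np.array(range(1, 6))                       # integer width is inferred
floats = np.array([1.0, 2.0])                      # Python floats become float64
explicit = np.array(range(1, 6), dtype=np.int64)   # width requested explicitly

print(ints.dtype)       # platform dependent: int32 or int64
print(floats.dtype)     # float64
print(explicit.dtype)   # int64

# dtype.kind is a portable way to ask "is this some signed integer type?"
print(ints.dtype.kind)  # i
```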

3. Changing the data type

temp2 = np.array([[1,2,3,4],[3,4,5,6],[7,8,9,0]],dtype='i4')
print(temp2.dtype)
temp2 = temp2.astype('i8')
print(temp2.dtype)

# -------------output---------------------
int32
int64
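Two details of astype are worth keeping in mind: it returns a new array by default rather than changing the original, and converting floats to integers drops the fractional part. The values below are arbitrary and serve only as a sketch.

```python
import numpy as np

prices = np.array([1.9, 2.5, -3.7])
as_int = prices.astype(np.int64)   # fractional part is dropped (truncation toward zero)
print(as_int)                      # [ 1  2 -3]

# astype returns a new array by default; the original keeps its type.
print(prices.dtype)                # float64

# Narrowing the type saves memory per element.
small = np.array([1, 2, 3], dtype=np.int64).astype(np.int8)
print(small.itemsize)              # 1
```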

4. Limiting the number of decimal places in a float array

temp4 = np.array([random.random() for i in range(10)])
print(temp4,temp4.dtype)
temp4 = np.round(temp4,2)
print(temp4,temp4.dtype)

# -------------output---------------------
[0.03505807 0.30070143 0.81331086 0.80476998 0.88999505 0.59220155
 0.6514705  0.11714838 0.53510445 0.09625571] float64
[0.04 0.3  0.81 0.8  0.89 0.59 0.65 0.12 0.54 0.1 ] float64

# Limiting the decimal places of a float in plain Python
import random
a = random.random()
print(a)
a = round(a,2)
print(a)

# -------------output---------------------
0.2379073047892517
0.24
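Rounding changes the stored values. If only the printed form should be shorter, formatting the array as text leaves the data untouched. A brief sketch of the difference, reusing three values from the output above:

```python
import numpy as np

values = np.array([0.03505807, 0.30070143, 0.81331086])

# np.round returns a new array whose values are actually changed.
rounded = np.round(values, 2)
print(rounded)                  # [0.04 0.3  0.81]

# array2string only affects the text representation.
text = np.array2string(values, precision=2)
print(text)
print(values[0])                # 0.03505807 (the data is unchanged)
```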

5. Number of dimensions

temp2 = np.array([[1,2,3,4],[3,4,5,6],[7,8,9,0]])
print(temp2.ndim)

# -------------output---------------------
2     # the number of dimensions of the array
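For comparison, this short sketch (the example arrays are arbitrary) prints ndim for arrays of zero to three dimensions and checks that it always matches the length of the shape tuple.

```python
import numpy as np

scalar = np.array(7)          # 0-D array
vector = np.arange(4)         # 1-D array
matrix = np.zeros((3, 4))     # 2-D array
cube = np.zeros((2, 3, 4))    # 3-D array

for arr in (scalar, vector, matrix, cube):
    # ndim always equals the length of the shape tuple
    print(arr.ndim, arr.shape, arr.ndim == len(arr.shape))
```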

6. Element count and memory usage

import numpy as np
temp = np.array([[1,2,3],[4,5,6],[7,8,9],[5,6,7]])
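# Shape of the array (the size of each dimension)
print(temp.shape)
# Total number of elements in the array
print(temp.size)
# Memory used by a single element, in bytes
print(temp.itemsize)
# Memory used by all elements, in bytes
print(temp.nbytes)
# Memory layout information for the array
print(temp.flags)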

# -------------output---------------------
(4, 3)
12
4
48
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
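These attributes are related: size is the product of the shape, and nbytes is size times itemsize. The sketch below fixes the dtype to int32 (an assumption, so that the numbers do not depend on the platform) and checks both relations.

```python
import numpy as np

temp = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [5, 6, 7]], dtype=np.int32)

rows, cols = temp.shape
print(temp.size == rows * cols)                   # True
print(temp.itemsize)                              # 4
print(temp.nbytes == temp.size * temp.itemsize)   # True

# Doubling the element width doubles the memory footprint.
print(temp.astype(np.int64).nbytes)               # 96
```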

7. Array shape (shape, reshape)

import numpy as np

# Two ways to create an array with np: np.arange and np.array
temp1 = np.arange(12)
temp2 = np.array([[1,2,3,4],[3,4,5,6],[7,8,9,0]],dtype='i4')

# shape reports the current shape; reshape returns the data in a new shape
rst = temp2.shape
rst2 = temp1.reshape((2,2,3))
print(rst,rst2.shape)

# -------------output---------------------
(3, 4) (2, 2, 3)
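Two further points about reshape, sketched here on the same 12-element array: one dimension may be given as -1 so that NumPy infers it, and a target shape whose element count does not match raises a ValueError.

```python
import numpy as np

temp1 = np.arange(12)

# -1 lets NumPy infer the missing dimension: 12 / 3 = 4 columns.
auto = temp1.reshape((3, -1))
print(auto.shape)               # (3, 4)

# reshape does not modify the original array.
print(temp1.shape)              # (12,)

try:
    temp1.reshape((5, 3))       # 15 elements requested, only 12 available
except ValueError as exc:
    print("reshape failed:", exc)
```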

8. Two ways to convert an array to one dimension

temp2 = np.array([[1,2,3,4],[3,4,5,6],[7,8,9,0]],dtype='i4')
print(temp2.shape,temp2.ndim)

# 1. flatten returns a one-dimensional copy of the array
flat1 = temp2.flatten()
print(flat1.shape,flat1.ndim)
# 2. reshape can also produce a one-dimensional array
flat2 = temp2.reshape(12,)
print(flat2.shape,flat2.ndim)

# -------------output---------------------
(3, 4) 2
(12,) 1
(12,) 1
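The two methods differ in one practical respect: flatten always copies the data, while reshape returns a view of the same memory when it can, as it does for a contiguous array like this one. A minimal sketch of the consequence:

```python
import numpy as np

temp2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [7, 8, 9, 0]], dtype="i4")

flat_copy = temp2.flatten()     # always an independent copy
flat_view = temp2.reshape(-1)   # a view here, because temp2 is contiguous

print(np.shares_memory(temp2, flat_copy))  # False
print(np.shares_memory(temp2, flat_view))  # True

# Writing through the view changes the original; the copy is unaffected.
flat_view[0] = 100
print(temp2[0, 0])      # 100
print(flat_copy[0])     # 1
```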

9. Extension: custom data structures

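A NumPy array normally stores data of a single type.
However, np.dtype can also define a structured data type, so that each element is a record made up of several named fields.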

import numpy as np

mytype = np.dtype([("name", "S10"), ('height', np.float64)])  # "S10" is a 10-byte string; np.string_ was removed in NumPy 2.0
print(mytype)

arr = np.array([("Sarsh",(8.3)),("John",(6.345))],dtype=mytype)
print(arr)
print(arr[0]["name"])
# For data with more complex relationships, pandas is usually the more convenient tool

# -------------output---------------------
[('name', 'S10'), ('height', '<f8')]
[(b'Sarsh', 8.3  ) (b'John', 6.345)]
b'Sarsh'
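Once the fields are named, a whole column can be selected by field name, and the records can be filtered or sorted on it. The sketch below rebuilds the same two records; the threshold of 7.0 is an arbitrary value chosen for illustration.

```python
import numpy as np

mytype = np.dtype([("name", "S10"), ("height", np.float64)])
arr = np.array([(b"Sarsh", 8.3), (b"John", 6.345)], dtype=mytype)

# Selecting a field returns all values of that field as an ordinary array.
heights = arr["height"]
print(heights.mean())

# Boolean masks work on a field just as on a plain array.
tall = arr[arr["height"] > 7.0]
print(tall["name"])             # [b'Sarsh']

# np.sort can order the records by a named field.
by_height = np.sort(arr, order="height")
print(by_height["name"])        # [b'John' b'Sarsh']
```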
