class pyspark.ml.linalg.Vector
Methods
toArray(): converts the vector to a numpy.ndarray
class pyspark.ml.linalg.DenseVector(ar)
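from pyspark.ml.linalg import Vectors
v = Vectors.dense([1.0, 2.0])
u = Vectors.dense([3.0, 4.0])
# Element-wise addition, subtraction, multiplication, and division are supported
v + u  # DenseVector([4.0, 6.0])
v * u  # DenseVector([3.0, 8.0])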
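Methods
dot(other): computes the dot product of two vectors; other may be a NumPy array, list, SparseVector, or SciPy sparse matrix
norm(p): computes the p-norm of the DenseVector
numNonzeros(): number of nonzero elements
size: number of elements in the vector
squared_distance(other): squared distance to another vector (DenseVector, SparseVector, or NumPy array)
toArray(): returns the vector's values as a numpy.ndarray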
class pyspark.ml.linalg.Vectors
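Factory methods for creating vectors
Methods
dense(*elements): creates a dense vector from a list of values or from individual numeric arguments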
Vectors.dense([1, 2, 3]) #DenseVector([1.0, 2.0, 3.0])
Vectors.dense(1.0, 2.0) #DenseVector([1.0, 2.0])
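norm(vector, p): computes the p-norm of the given vector
sparse(size, *args): creates a sparse vector from a dictionary, a list of (index, value) pairs, or two separate arrays of indices and values (sorted by index)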
Vectors.sparse(4, {1: 1.0, 3: 5.5}) # SparseVector(4, {1: 1.0, 3: 5.5})
Vectors.sparse(4, [(1, 1.0), (3, 5.5)]) #SparseVector(4, {1: 1.0, 3: 5.5})
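squared_distance(v1, v2): squared distance between v1 and v2, each of which may be a SparseVector, DenseVector, np.array, or array.array
zeros(size): creates a dense vector of the given size filled with zeros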
class pyspark.ml.linalg.DenseMatrix(numRows, numCols, values, isTransposed=False)
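Methods
toArray(): returns the matrix as a numpy.ndarray
toSparse(): converts the matrix to a SparseMatrix
from pyspark.ml.linalg import DenseMatrix
m = DenseMatrix(2, 2, range(4))  # values are read in column-major order
m.toArray()
# array([[ 0., 2.],
#        [ 1., 3.]])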
class pyspark.ml.linalg.SparseMatrix(numRows, numCols, colPtrs, rowIndices, values, isTransposed=False)
Methods
toArray(): returns the matrix as a numpy.ndarray
toDense(): converts the matrix to a DenseMatrix