Notes on a PySpark Error Pitfall
A PySpark job I was running recently failed with: AttributeError: 'NoneType' object has no attribute '_jvm'. The offending code:
prefer_lst = []                                  # accumulate interval endpoints as strings
if prefer != ['-911']:                           # skip the sentinel value
    for prefer_i in prefer:
        prefer_l = prefer_i.split(',')[0][1:]    # left endpoint, strip leading '('
        prefer_r = prefer_i.split(',')[1][:-1]   # right endpoint, strip trailing ')'
        if prefer_r == 'infinite':
            prefer_r = '99999999999'             # treat 'infinite' as a large finite bound
        prefer_lst.append(prefer_l)
        prefer_lst.append(prefer_r)
    prefer_lst = list(map(int, prefer_lst))      # list() so both min() and max() can consume it
    min_prefer = min(prefer_lst)                 # <- the AttributeError was raised here
    max_prefer = max(prefer_lst)
    prefer_real = range(min_prefer, max_prefer)
    prefer_real = list(map(str, prefer_real))
    x = prefer_real
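For context, here is what a run looks like on a hypothetical input, assuming each item is an interval string of the form '(left,right)':

prefer = ['(3,10)', '(20,infinite)']
# After the snippet above runs:
#   prefer_lst == [3, 10, 20, 99999999999]
#   min_prefer == 3
#   max_prefer == 99999999999
#   x          == ['3', '4', ..., '99999999998']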
The error came from the min(prefer_lst) line (marked above). Going through the code, I found that Python's built-in min and max had been shadowed by the same-named functions from pyspark.sql.functions. When working with Spark functions, avoid wildcard imports like:
from pyspark.sql.functions import *
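To see why the shadowing raises exactly this error, here is a minimal sketch, assuming no SparkContext is active at the point of the call: pyspark.sql.functions.min resolves the JVM function through SparkContext._active_spark_context, which is None here.

from pyspark.sql.functions import *   # silently replaces built-in min/max

# min is now pyspark.sql.functions.min; it tries to reach the JVM via
# SparkContext._active_spark_context, which is None, so this raises:
#   AttributeError: 'NoneType' object has no attribute '_jvm'
min_prefer = min([3, 10, 20])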
Instead, import only what you need, e.g.:
from pyspark.sql.functions import udf
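A safer habit still is to import the module under an alias, so Spark functions never collide with Python builtins. A minimal sketch (the DataFrame df is hypothetical):

from pyspark.sql import functions as F

values = [3, 10, 20]
min_prefer = min(values)               # Python's built-in min, untouched
# Spark-side aggregations stay explicit and unambiguous:
# df.agg(F.min('value'), F.max('value'))

And if import * has already shadowed the builtins, the originals remain reachable through the builtins module: import builtins; builtins.min(values).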