PySpark SQL Data Types

1. PySpark data types

DataType, NullType, StringType, BinaryType, BooleanType, DateType,
TimestampType, DecimalType, DoubleType, FloatType, ByteType, IntegerType,
LongType, ShortType, ArrayType, MapType, StructField, StructType
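
These are the classes exported by pyspark.sql.types. As a minimal sketch (assuming a standard pyspark installation): scalar types are instantiated with no arguments, while container types wrap other types.

from pyspark.sql.types import (
    StringType, IntegerType, ArrayType, MapType
)

# Scalar types take no constructor arguments.
s = StringType()

# Container types wrap element / key / value types.
arr = ArrayType(IntegerType())             # array<int>
m = MapType(StringType(), IntegerType())   # map<string,int>

print(arr.simpleString())   # array<int>
print(m.simpleString())     # map<string,int>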

2. Example: StructField

class StructField(DataType):
    """A field in :class:`StructType`.

    :param name: string, name of the field.
    :param dataType: :class:`DataType` of the field.
    :param nullable: boolean, whether the field can be null (None) or not.
    :param metadata: a dict from string to simple type that can be converted to JSON automatically
    """

    def __init__(self, name, dataType, nullable=True, metadata=None):
        """
        >>> (StructField("f1", StringType(), True)
        ...      == StructField("f1", StringType(), True))
        True
        >>> (StructField("f1", StringType(), True)
        ...      == StructField("f2", StringType(), True))
        False
        """
        assert isinstance(dataType, DataType), "dataType should be DataType"
        assert isinstance(name, str), "field name should be string"
        self.name = name
        self.dataType = dataType
        self.nullable = nullable
        self.metadata = metadata or {}
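
The behaviour described in the doctest above can be reproduced with the public API; a small usage sketch (the field names here are illustrative):

from pyspark.sql.types import StructField, StringType

f1 = StructField("f1", StringType(), True)
f2 = StructField("f1", StringType(), True)
f3 = StructField("f2", StringType(), True)

print(f1 == f2)               # True: equality compares name, dataType and nullable
print(f1 == f3)               # False: different field name
print(f1.name, f1.nullable)   # f1 True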

3. Specifying a schema for a DataFrame

The following Scala example explicitly specifies the data type of each column when building a DataFrame.

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = StructType(
  List(
    StructField("id", IntegerType, true),
    StructField("name", StringType, true),
    StructField("age", IntegerType, true)
  )
)
// Map the raw RDD (assumed to be an RDD of string fields) to an RDD[Row]
val rowRDD = personRDD.map(p => Row(p(0).toInt, p(1).trim, p(2).toInt))
// Apply the schema to the row RDD
val personDataFrame = sqlContext.createDataFrame(rowRDD, schema)
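
Since this article is about PySpark, a rough PySpark equivalent of the Scala snippet may help; person_rdd and its sample data are assumptions for illustration:

from pyspark.sql import Row, SparkSession
from pyspark.sql.types import (
    StructType, StructField, IntegerType, StringType
)

spark = SparkSession.builder.appName("schema-example").getOrCreate()

schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True),
])

# person_rdd is assumed to hold comma-separated lines like "1,Alice,30".
person_rdd = spark.sparkContext.parallelize(["1,Alice,30", "2,Bob,25"]) \
    .map(lambda line: line.split(","))

# Map the raw RDD to an RDD of Rows matching the schema.
row_rdd = person_rdd.map(lambda p: Row(int(p[0]), p[1].strip(), int(p[2])))

# Apply the schema to the row RDD.
person_df = spark.createDataFrame(row_rdd, schema)
person_df.printSchema()
person_df.show()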

References:

  1. Source code for pyspark.sql.types