How to create a DataFrame from Scala’s List of Iterables?

Asked on December 24, 2018 in Apache-spark.


  • 3 Answer(s)

    The best approach is to first convert the List[Iterable[Any]] to a List[Row], then put the rows into an RDD and prepare a schema for the Spark DataFrame.

    To convert List[Iterable[Any]] to List[Row], the code below can be used. Row takes its values as varargs, so each Iterable is turned into a Seq first:

    val rows = values.map(x => Row(x.toSeq: _*))
    
    

    Once the schema is defined, the RDD can be created from the rows:

    val rdd = sparkContext.makeRDD[Row](rows)
    
    

    Finally, the Spark DataFrame is created from the RDD and the schema:

    val df = sqlContext.createDataFrame(rdd, schema)
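
    The three steps above can be put together as a minimal sketch. It assumes Spark 2.x, so a SparkSession stands in for the sparkContext and sqlContext used above. The object name, the helper toDataFrame, the sample records and the column names "name" and "age" are hypothetical and only there for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ListOfIterablesExample {

  // Hypothetical helper: converts each Iterable to a Row, then builds the DataFrame.
  def toDataFrame(spark: SparkSession,
                  values: List[Iterable[Any]],
                  schema: StructType): DataFrame = {
    // Row.apply takes varargs, so each Iterable is converted to a Seq first.
    val rows: List[Row] = values.map(x => Row(x.toSeq: _*))
    val rdd = spark.sparkContext.makeRDD(rows)
    spark.createDataFrame(rdd, schema)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("list-of-iterables-to-df")
      .getOrCreate()

    // Hypothetical input: each inner Iterable is one record (name, age).
    val values: List[Iterable[Any]] = List(List("alice", 30), Vector("bob", 25))

    // The schema must match the order and runtime types of the values in each Row.
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = false),
      StructField("age", IntegerType, nullable = false)
    ))

    val df = toDataFrame(spark, values, schema)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

    Because the elements are typed as Any, Spark cannot infer the schema here, which is why it has to be written out by hand.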
    
    
    Answered on December 25, 2018.

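    An alternative, simpler approach is the code below. It wraps each element in a Tuple1 so that createDataFrame can derive the schema itself, which works when the elements have a type Spark supports, such as String or Int: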

    val newList = yourList.map(Tuple1(_))
    val df = spark.createDataFrame(newList).toDF("stuff")
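
    A self-contained sketch of this approach, assuming Spark 2.x and a list of strings. The object name, the helper toSingleColumnDF and the sample data are hypothetical; the column name "stuff" is taken from the snippet above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object Tuple1Example {

  // Wraps each element in Tuple1 so that createDataFrame receives a Seq of Products
  // and can derive a one-column schema from the element type.
  def toSingleColumnDF(spark: SparkSession, items: List[String]): DataFrame =
    spark.createDataFrame(items.map(Tuple1(_))).toDF("stuff")

  def main(args: Array[String]): Unit = {
    // Local session for illustration; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("tuple1-example")
      .getOrCreate()

    val yourList = List("a", "b", "c") // hypothetical sample data
    toSingleColumnDF(spark, yourList).show()

    spark.stop()
  }
}
```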
    

     

    Answered on December 25, 2018.

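    In Spark 2.x a Dataset can be used instead: the toDS API converts a list directly to a Dataset. It requires the session's implicits to be in scope (import spark.implicits._, where spark is the SparkSession).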

    val ds = list.flatMap(_.split(",")).toDS() // Records split by comma
    

    Another method, without splitting the records:

    val ds = list.toDS()
    

    This is simpler than going through an RDD or a DataFrame.
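
    A minimal sketch of the toDS approach, assuming Spark 2.x and a list of comma-separated strings. The object name, the helper splitToDS and the sample data are hypothetical.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object ToDSExample {

  // Splits each record on commas and converts the flattened list to a Dataset[String].
  def splitToDS(spark: SparkSession, list: List[String]): Dataset[String] = {
    import spark.implicits._ // brings toDS() and the String encoder into scope
    list.flatMap(_.split(",")).toDS()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("to-ds-example")
      .getOrCreate()

    val list = List("a,b", "c") // hypothetical sample data
    splitToDS(spark, list).show()

    spark.stop()
  }
}
```

    A Dataset[String] can be turned into a DataFrame with toDF() if the untyped API is needed later.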

    Answered on December 25, 2018.

