How to parse nested JSON objects in Spark SQL?

Asked on January 11, 2019 in Apache-spark.


  • 3 Answer(s)

    Let's start by reading the JSON file into a DataFrame and printing its schema:

    // Read the JSON file into a DataFrame (Spark 1.x Java API);
    // read().json() already returns a DataFrame, so toDF() is not needed
    DataFrame df = sqlContext.read().json("/path/to/file");
    df.registerTempTable("df");
    df.printSchema();
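
    For context, an input file producing the kind of nested schema used below might look like this. The field names (`app`, `element`, `appName`, `Ratings`) are assumptions taken from the later examples, not from an actual file:

    {
      "app": [
        { "element": { "appName": "calculator", "Ratings": 4 } },
        { "element": { "appName": "notes", "Ratings": 5 } }
      ]
    }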
    

    The nested objects appear as struct (and array) columns, which can be selected directly:

    // Pull the nested "app" column into its own DataFrame
    DataFrame app = df.select("app");
    app.registerTempTable("app");
    app.printSchema();
    app.show();

    // Drill further into the nested appName field of each element
    DataFrame appName = app.select("element.appName");
    appName.registerTempTable("appName");
    appName.printSchema();
    appName.show();
    
    Answered on January 11, 2019.

    Alternatively, query the nested fields directly with Spark SQL:

    // address is a struct column, so its fields are reachable with dot notation
    val nameAndAddress = sqlContext.sql("""
        SELECT name, address.city, address.state
        FROM people
    """)
    nameAndAddress.collect().foreach(println)
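
    For reference, the `people` table in this query would come from JSON records shaped roughly like this (a hypothetical sketch; only a `name` field and a nested `address` struct are assumed):

    {"name": "Alice", "address": {"city": "Columbus", "state": "Ohio"}}
    {"name": "Bob", "address": {"city": "Austin", "state": "Texas"}}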
    
    Answered on January 11, 2019.

    Another option is to reach the nested field straight from the SQL query:

    SELECT apps.element.Ratings FROM yourTableName

    This returns an array per row, whose elements can then be accessed individually. An online JSON viewer can also help when inspecting large JSON structures.
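
    Since dotting through an array column yields an array per row, a common follow-up is to flatten it first with `explode`. A sketch using the standard `LATERAL VIEW` syntax (the `apps` column and `yourTableName` table names are the hypothetical ones from above):

    SELECT app.Ratings
    FROM yourTableName
    LATERAL VIEW explode(apps) exploded AS app

    Each array element becomes its own row, so `app.Ratings` is then a plain scalar value rather than an array.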

    Answered on January 11, 2019.

