How to parse nested JSON objects in Spark SQL?
Let's start by reading a JSON file and printing its schema:
DataFrame df = sqlContext.read().json("/path/to/file").toDF();
df.registerTempTable("df");
df.printSchema();
In the printed schema, nested objects appear as struct types. Their fields can be selected with dot notation:
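DataFrame app = df.select("app");
app.registerTempTable("app");
app.printSchema();
app.show();

// "element" is only the label printSchema shows for array items;
// the nested field is addressed through the column name instead.
DataFrame appName = app.select("app.appName");
appName.registerTempTable("appName");
appName.printSchema();
appName.show();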
Alternatively, query the nested fields with SQL (shown here in Scala):
val nameAndAddress = sqlContext.sql("""
  SELECT name, address.city, address.state
  FROM people
""")
nameAndAddress.collect.foreach(println)
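To try the dot-notation query end to end, here is a minimal self-contained sketch. It assumes Spark 2.2 or later (so it uses `SparkSession` rather than `sqlContext`), and the sample record, the object name `NestedStructQuery`, and the `local[*]` master are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object NestedStructQuery {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.2+; local[*] is only for trying this on one machine.
    val spark = SparkSession.builder()
      .appName("nested-struct-query")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample record: "address" is a nested object, so Spark
    // infers it as a struct with fields "city" and "state".
    val json = Seq(
      """{"name":"Ann","address":{"city":"Austin","state":"TX"}}"""
    ).toDS()

    val people = spark.read.json(json)
    people.printSchema()
    people.createOrReplaceTempView("people")

    // Dot notation reaches into the struct column.
    val nameAndAddress = spark.sql(
      "SELECT name, address.city, address.state FROM people")
    nameAndAddress.show()

    spark.stop()
  }
}
```

On Spark 1.x the same idea applies with `sqlContext.read.json(...)` and `registerTempTable`, as in the snippets above.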
Another option is to reference the nested field directly in the SQL query:
SELECT apps.element.Ratings FROM yourTableName
This returns an array, and its elements can then be accessed individually. When working with large JSON structures, an online JSON viewer also helps in finding the path to a nested field.
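The array result can be illustrated with a short sketch, again assuming Spark 2.2 or later. The sample data, the column name `apps`, and the object name `NestedArrayQuery` are hypothetical. Note that this sketch addresses the field as `apps.Ratings` instead of going through `element`, which is my assumption about the intended path, and that it uses `explode` as one possible way to get at the individual elements.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

object NestedArrayQuery {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.2+; local[*] is only for trying this on one machine.
    val spark = SparkSession.builder()
      .appName("nested-array-query")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample record: "apps" is an array of objects,
    // which Spark infers as array<struct<Ratings, appName>>.
    val json = Seq(
      """{"apps":[{"appName":"a","Ratings":4.5},{"appName":"b","Ratings":3.0}]}"""
    ).toDS()

    val df = spark.read.json(json)
    df.printSchema()

    // Selecting a struct field through an array column yields
    // an array holding that field from every element.
    val ratings = df.select(col("apps.Ratings"))
    ratings.show()

    // explode produces one row per array element, so each struct
    // can then be queried with ordinary dot notation.
    val perApp = df
      .select(explode(col("apps")).as("app"))
      .select(col("app.appName"), col("app.Ratings"))
    perApp.show()

    spark.stop()
  }
}
```

Exploding is convenient when each element should become its own row. If only one position is needed, indexing into the array column is a lighter alternative.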