Why does Apache Spark read unnecessary Parquet columns within nested structures?

Asked on January 12, 2019 in Apache-spark.


  • 2 Answer(s)

    The Spark query engine is currently limited here: Spark only handles predicate pushdown for simple, top-level column types in Parquet, not for fields nested inside StructTypes. A filter on a nested field is therefore not passed down to the Parquet reader, so Spark cannot skip that data and ends up reading the entire nested column.

    The relevant JIRA ticket is SPARK-17636: https://issues.apache.org/jira/browse/SPARK-17636
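
    As a minimal sketch of the behavior (the file path, schema, and class names below are hypothetical, assuming a local Spark 2.x setup), a filter on a nested struct field does not show up under PushedFilters in the physical plan:

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical nested schema: a struct column "address" with a "city" field.
    case class Address(city: String, zip: String)
    case class Person(name: String, address: Address)

    object NestedPushdownDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("NestedPushdownDemo")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Write a small Parquet file containing a nested StructType column.
        Seq(Person("Ann", Address("Oslo", "0150")))
          .toDS()
          .write.mode("overwrite").parquet("/tmp/people_parquet")

        // Filter on a nested field. Because pushdown only covers top-level
        // simple columns, this predicate is evaluated inside Spark after the
        // Parquet reader has already materialized the whole struct.
        val filtered = spark.read.parquet("/tmp/people_parquet")
          .filter($"address.city" === "Oslo")

        // In the printed physical plan, the PushedFilters list will not
        // mention address.city, confirming the filter stayed in Spark.
        filtered.explain(true)

        spark.stop()
      }
    }
    ```

    By contrast, the same filter on a top-level primitive column (for example name) would appear in PushedFilters and let the Parquet reader skip row groups.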

    Answered on January 12, 2019.

