How to check if a Spark DataFrame is empty


Asked on November 19, 2018 in Apache-spark.


  • 3 Answer(s)

    In Spark 2.1.0, the best recommendation is to use head(n: Int) or take(n: Int) together with isEmpty, whichever reads most clearly to you.

    df.head(1).isEmpty
    df.take(1).isEmpty
    

    with the Python equivalents:

    len(df.head(1)) == 0  # True if empty; bool(df.head(1)) is True if non-empty
    len(df.take(1)) == 0  # True if empty; bool(df.take(1)) is True if non-empty
    

    Note that df.first() and df.head() both throw a java.util.NoSuchElementException if the DataFrame is empty. first() calls head() directly, which in turn calls head(1).head.

    def first(): T = head()
    def head(): T = head(1).head
    

    Here head(1) returns an Array, so calling head on that Array is what throws the java.util.NoSuchElementException when the DataFrame is empty.

    def head(n: Int): Array[T] = withAction("head", limit(n).queryExecution)(collectFromPlan)
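
    For illustration, here is a minimal sketch (the local session setup and the column name are just assumptions for the example) showing the difference in behaviour on an empty DataFrame:

    import scala.util.Try
    import org.apache.spark.sql.SparkSession

    // Hypothetical local session purely for the example
    val spark = SparkSession.builder().appName("empty-check").master("local[*]").getOrCreate()
    import spark.implicits._

    // An empty DataFrame with a single illustrative column
    val emptyDf = Seq.empty[String].toDF("value")

    // head() and first() throw java.util.NoSuchElementException on an empty DataFrame
    println(Try(emptyDf.head()).isFailure)   // true
    println(Try(emptyDf.first()).isFailure)  // true

    // head(1) just returns an empty Array, so isEmpty is safe
    println(emptyDf.head(1).isEmpty)         // true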
    

    So rather than calling head(), call head(1) directly to get the array, and then use isEmpty on it.

    take(n) is also equivalent to head(n).

    And limit(1).collect() is identical to head(1) (notice limit(n).queryExecution in the head(n: Int) method above), so the following are all equivalent, at least as far as I can tell, and none of them throws a java.util.NoSuchElementException when the DataFrame is empty.

    df.head(1).isEmpty
    df.take(1).isEmpty
    df.limit(1).collect().isEmpty
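
    As a quick sanity check, continuing the hypothetical sketch above (same SparkSession and implicits, plus a non-empty DataFrame for contrast), all three forms agree:

    val nonEmptyDf = Seq(1, 2, 3).toDF("n")

    // true for the empty DataFrame from the earlier sketch...
    println(emptyDf.head(1).isEmpty)               // true
    println(emptyDf.limit(1).collect().isEmpty)    // true

    // ...and false once there is at least one row
    println(nonEmptyDf.take(1).isEmpty)            // false
    println(nonEmptyDf.limit(1).collect().isEmpty) // false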
    

    This should also be helpful for users on newer versions of Spark.

    Answered on November 19, 2018.

    Another suggestion is to simply grab the underlying RDD. In Scala:

    df.rdd.isEmpty
    
    

    in Python:

    df.rdd.isEmpty()
    
    

    Note that all this does under the hood is call take(1).length on the RDD.
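
    A minimal sketch of how this reads in practice (assuming an existing SparkSession named spark; the column names are just placeholders):

    import spark.implicits._

    val df = Seq.empty[(Int, String)].toDF("id", "name")

    // Grab the underlying RDD and ask whether it is empty
    if (df.rdd.isEmpty) {
      println("DataFrame is empty")
    } else {
      println("DataFrame has rows")
    }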

    Answered on November 19, 2018.

    Another approach is to use the head() (or first()) function to check whether the DataFrame has at least one row. If it does, it is not empty.
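
    A hedged sketch of that idea (again assuming an existing SparkSession named spark): wrapping first() in Try gives a boolean check without having to catch the exception by hand:

    import scala.util.Try
    import spark.implicits._

    val df = Seq("a", "b").toDF("letter")

    // first() succeeds exactly when the DataFrame has at least one row
    val nonEmpty = Try(df.first()).isSuccess
    println(if (nonEmpty) "not empty" else "empty")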

     

    Answered on November 19, 2018.

