Pyspark: multiple conditions in when clause

Asked on January 12, 2019 in Apache-spark.

  • 2 Answer(s)

    A SyntaxError is raised here because Python has no && operator. It has and and &; for building boolean expressions on Column, the latter is the correct choice (together with | for logical disjunction and ~ for logical negation).
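
The reason & works where and does not is that Python lets a class overload &, | and ~, while and, or and not always reduce their operands to a plain bool first. Below is a minimal sketch of that mechanism. The Expr class is a hypothetical stand-in for a Column, written only for illustration; it is not PySpark's implementation.

```python
class Expr:
    """Hypothetical stand-in for a Column: it only builds an expression string."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return Expr(f"({self.text} = {other})")

    def __and__(self, other):
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Expr(f"({self.text} OR {other.text})")

    def __invert__(self):
        return Expr(f"(NOT {self.text})")

    def __bool__(self):
        # `and`, `or` and `not` end up here; an expression has no single truth value
        raise ValueError("cannot convert an expression into a bool")


age, survived = Expr("Age"), Expr("Survived")

combined = (age == "") & (survived == "0")   # overloaded: builds a new Expr
negated = ~(age == "")                       # overloaded: builds a new Expr

try:
    (age == "") and (survived == "0")        # calls __bool__ on the left operand
    and_error = None
except ValueError as exc:
    and_error = str(exc)

print(combined.text)
print(negated.text)
print(and_error)
```

PySpark's Column behaves in a comparable way: using and on two Column conditions raises a ValueError instead of building a combined condition.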

    The condition you created is also invalid because it does not take operator precedence into account. In Python, & has higher precedence than ==, so each comparison has to be parenthesized.

    (col("Age") == "") & (col("Survived") == "0")
    ## Column<b'((Age = ) AND (Survived = 0))'>
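
The precedence rule can be checked without Spark by asking Python how it parses each form. This is a small sketch using the standard ast module; the names a and b are placeholders and are never evaluated.

```python
import ast

# Without parentheses, `&` binds tighter than `==`, so Python groups the
# expression as the chained comparison  a == ("" & b) == "0".
unparenthesized = ast.parse('a == "" & b == "0"', mode="eval").body

# With parentheses, the top-level operation is `&` applied to two comparisons.
parenthesized = ast.parse('(a == "") & (b == "0")', mode="eval").body

print(type(unparenthesized).__name__)                 # Compare
print(type(unparenthesized.comparators[0]).__name__)  # BinOp, i.e. "" & b
print(type(parenthesized).__name__)                   # BinOp
```

Only the parenthesized form combines two complete comparisons, which is what a Column condition needs.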

    Note that the when function is equivalent to a CASE expression, not a WHERE clause. The same rules still apply to conditions passed to where. A conjunction:

    df.where((col("foo") > 0) & (col("bar") < 0))

    A disjunction:

    df.where((col("foo") > 0) | (col("bar") < 0))

    To avoid parentheses, you can define the conditions separately:

    cond1 = col("Age") == ""
    cond2 = col("Survived") == "0"
    cond1 & cond2
    Answered on January 12, 2019.

    I would like to modify the cell values of a dataframe column (Age) where currently it is blank and I would only do it if another column (Survived) has the value 0 for the corresponding row where it is blank for Age. If it is 1 in the Survived column but blank in Age column then I will keep it as null.

    I tried to use the && operator but it didn’t work. Here is my code:

    tdata.withColumn("Age",  when((tdata.Age == "" && tdata.Survived == "0"), mean_age_0).otherwise(tdata.Age)).show()

    Any suggestions how to handle that? Thanks.

    Error Message:

    SyntaxError: invalid syntax
      File "<ipython-input-33-3e691784411c>", line 1
        tdata.withColumn("Age",  when((tdata.Age == "" && tdata.Survived == "0"), mean_age_0).otherwise(tdata.Age)).show()


    Answered on January 13, 2019.