Ambiguity in Pandas Dataframe / Numpy Array “axis” definition


Asked on December 21, 2018 in Pandas.


  • 3 Answer(s)

    Remember these two points:

    • Use axis=0 to apply a method down each column, or to the row labels (the index).
    • Use axis=1 to apply a method across each row, or to the column labels.

    In simple terms: 0 = down and 1 = across.

    Also remember that Pandas uses the word axis the same way NumPy does. See the documentation for more detail.

    Axes are defined for arrays with more than one dimension. A 2D array has two axes: axis 0 runs vertically downwards across the rows, and axis 1 runs horizontally across the columns.
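As a small sketch of the two axes on a plain NumPy array (the array values and the variable names are made up for illustration):

```python
import numpy as np

# A 2D array with 3 rows and 4 columns (illustrative values only).
arr = np.array([[1, 1, 1, 1],
                [2, 2, 2, 2],
                [3, 3, 3, 3]])

# axis=0 runs down the rows, so the rows are collapsed: one result per column.
col_sums = arr.sum(axis=0)

# axis=1 runs across the columns, so the columns are collapsed: one result per row.
row_sums = arr.sum(axis=1)

print(arr.shape)  # (3, 4)
print(col_sums)   # [6 6 6 6]
print(row_sums)   # [ 4  8 12]
```

The axis you pass is the one that disappears from the shape: (3, 4) becomes (4,) for axis=0 and (3,) for axis=1.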

    Therefore, df.mean(axis=1) takes the mean horizontally across the columns (one value per row), and df.mean(axis=0) takes the mean vertically down the rows (one value per column).
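A minimal sketch of that difference, assuming a toy DataFrame whose labels (r1, r2, a, b, c) are invented for this example:

```python
import pandas as pd

# Toy frame: 2 rows, 3 columns (labels are hypothetical).
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["r1", "r2"],
    columns=["a", "b", "c"],
)

# axis=0: move down the rows, giving one mean per column.
col_means = df.mean(axis=0)  # a: 2.5, b: 3.5, c: 4.5

# axis=1: move across the columns, giving one mean per row.
row_means = df.mean(axis=1)  # r1: 2.0, r2: 5.0

print(col_means)
print(row_means)
```

The result of axis=0 is indexed by the column labels, and the result of axis=1 is indexed by the row labels.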

     

    Answered on December 21, 2018.

    Refer to the code below:

    # Not realistic, but ideal for understanding the axis parameter
    import pandas as pd

    df = pd.DataFrame(
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
        columns=["idx1", "idx2", "idx3", "idx4"],
        index=["idx1", "idx2", "idx3"],
    )
     
    ---------------------------------------1
    |        idx1 idx2 idx3 idx4
    | idx1    1    1    1    1
    | idx2    2    2    2    2
    | idx3    3    3    3    3
    0
    

    Use df.drop:

    A: I want to remove idx3.
    B: Which one? // typing while waiting for a response: df.drop("idx3",
    A: The one on axis 1.
    B: OK, then it is >> df.drop("idx3", axis=1)
     
    // Result
    ---------------------------------------1
    |        idx1 idx2 idx4
    | idx1     1    1    1
    | idx2     2    2    2
    | idx3     3    3    3
    0
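The dialogue above can be reproduced as a runnable sketch. It also shows the other answer A could have given (axis=0), plus the columns= and index= keywords of df.drop, which may read less ambiguously than an axis number:

```python
import pandas as pd

# Same frame as above: "idx3" is both a row label and a column label.
df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["idx1", "idx2", "idx3", "idx4"],
    index=["idx1", "idx2", "idx3"],
)

# axis=1: look for "idx3" among the column labels and drop that column.
dropped_col = df.drop("idx3", axis=1)

# axis=0: look for "idx3" among the row labels and drop that row.
dropped_row = df.drop("idx3", axis=0)

# Keyword forms that name the target directly instead of using an axis number.
same_col = df.drop(columns="idx3")
same_row = df.drop(index="idx3")

print(dropped_col)
print(dropped_row)
```

df.drop returns a new DataFrame here, so df itself still has all three rows and four columns afterwards.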
    

    And df.apply:

    A: I want to apply sum.
    B: In which direction? // typing while waiting for a response: df.apply(lambda x: x.sum(),
    A: The one parallel to axis 0.
    B: OK, then it is >> df.apply(lambda x: x.sum(), axis=0)
     
    // Result
    idx1  6
    idx2  6
    idx3  6
    idx4  6
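For comparison, here is a sketch of the same apply call in both directions on the same frame. The variable names down and across are made up for this example:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["idx1", "idx2", "idx3", "idx4"],
    index=["idx1", "idx2", "idx3"],
)

# axis=0: the lambda receives one column at a time (values run down axis 0).
down = df.apply(lambda x: x.sum(), axis=0)    # one sum per column: 6, 6, 6, 6

# axis=1: the lambda receives one row at a time (values run across axis 1).
across = df.apply(lambda x: x.sum(), axis=1)  # one sum per row: 4, 8, 12

print(down)
print(across)
```

With axis=0 the result has four entries labelled by the columns; with axis=1 it has three entries labelled by the rows.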
    
    Answered on December 21, 2018.

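    With more than two dimensions, it helps to read the axis parameter as the dimension that gets changed:

    Consider a data structure with three dimensions a x b x c (a Pandas DataFrame itself has only two axes, so picture a three-dimensional array and treat the calls below as an analogy):

    • df.mean(axis=1) returns a result with dimensions a x 1 x c: axis 1 is collapsed.
    • df.drop("col4", axis=1) returns a result with dimensions a x (b-1) x c: one entry is removed along axis 1.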
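Since a DataFrame has only two axes, the following sketch checks the shape rule on a three-dimensional NumPy array instead. The sizes a, b, c are arbitrary, and np.delete stands in for df.drop as an assumption of this example:

```python
import numpy as np

a, b, c = 2, 3, 4  # arbitrary sizes chosen for illustration
arr = np.arange(a * b * c).reshape(a, b, c)

# Mean along axis 1; keepdims=True keeps the collapsed axis with length 1.
mean_kept = arr.mean(axis=1, keepdims=True)

# Without keepdims, the collapsed axis is removed from the shape entirely.
mean_collapsed = arr.mean(axis=1)

# Removing one entry (position 0) along axis 1 shrinks only that axis.
trimmed = np.delete(arr, 0, axis=1)

print(arr.shape)             # (2, 3, 4)
print(mean_kept.shape)       # (2, 1, 4)
print(mean_collapsed.shape)  # (2, 4)
print(trimmed.shape)         # (2, 2, 4)
```

In every case only the dimension named by axis changes; the other two stay at a and c.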
    Answered on December 21, 2018.

