Ambiguity in Pandas DataFrame / NumPy Array “axis” definition
Remember these two points:
- Use axis=0 to apply a method down each column, or to the row labels (the index).
- Use axis=1 to apply a method across each row, or to the column labels.
In short: 0 = down and 1 = across.
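A minimal sketch of both points on a small DataFrame. The frame, its labels (r1, col1, and so on), and the variable names are made up for illustration:

```python
import pandas as pd

# Hypothetical 3 x 2 frame used only for this illustration
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [10, 20, 30]},
    index=["r1", "r2", "r3"],
)

# axis=0: the method runs down each column, giving one value per column
col_means = df.mean(axis=0)   # col1 -> 2.0, col2 -> 20.0

# axis=1: the method runs across each row, giving one value per row
row_means = df.mean(axis=1)   # r1 -> 5.5, r2 -> 11.0, r3 -> 16.5

# For label-based methods, axis=0 targets row labels and axis=1 column labels
without_r1 = df.drop("r1", axis=0)
without_col2 = df.drop("col2", axis=1)
```

Note that drop returns a new DataFrame here; df itself is unchanged.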
Also remember that Pandas follows NumPy's use of the word axis. See the NumPy documentation for more details.
Axes are defined for arrays with more than one dimension. A 2D array has two axes: axis 0 runs vertically downward across the rows, and axis 1 runs horizontally across the columns.
Therefore, df.mean(axis=1) computes the mean horizontally across the columns, giving one value per row, and df.mean(axis=0) computes it vertically down the rows, giving one value per column.
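The same convention can be checked directly on a plain NumPy array. This is a small sketch with made-up values:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])   # shape (2, 3): 2 rows, 3 columns

# axis=0 moves down the rows, so the rows are collapsed: one result per column
down = arr.sum(axis=0)        # [5, 7, 9]

# axis=1 moves across the columns, so the columns are collapsed: one result per row
across = arr.sum(axis=1)      # [6, 15]
```

The axis you pass is the one that disappears from the result's shape.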
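With more than two dimensions, the axis parameter selects which dimension is affected in the same way. A Pandas DataFrame is always two-dimensional, so consider a NumPy array arr with shape a x b x c:
- arr.mean(axis=1) returns an array with shape a x c (or a x 1 x c with keepdims=True).
- np.delete(arr, 0, axis=1) returns an array with shape a x (b-1) x c, which is the array analogue of df.drop("col4", axis=1).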
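A minimal sketch of the three-dimensional case, using a NumPy array because a DataFrame itself has only two dimensions. The sizes a, b, c are arbitrary, and np.delete stands in for drop as an assumption about what the array equivalent would be:

```python
import numpy as np

a, b, c = 2, 3, 4                                # arbitrary sizes for illustration
arr = np.arange(a * b * c).reshape(a, b, c)      # shape (2, 3, 4)

# Reducing over axis=1 removes that dimension from the result
reduced = arr.mean(axis=1)                       # shape (a, c)

# keepdims=True keeps the reduced axis with length 1
kept = arr.mean(axis=1, keepdims=True)           # shape (a, 1, c)

# Removing one slice along axis=1 shrinks only that dimension
trimmed = np.delete(arr, 0, axis=1)              # shape (a, b - 1, c)
```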