Why should I make a copy of a data frame in pandas


Asked on December 20, 2018 in Pandas.


  • 2 Answer(s)

    Slicing a DataFrame can return a view of the original DataFrame rather than an independent copy. If you then change the subset, the original DataFrame can change as well. Therefore, make a copy whenever the original DataFrame must not be changed.

    import pandas as pd

    df = pd.DataFrame({'x': [1, 2]})
    df_sub = df[0:1]   # row slice of df, which may be a view
    df_sub.x = -1      # assign to column x of the subset
    print(df)
    

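    On the pandas versions current when this was written, that prints the following (recent versions with Copy-on-Write enabled leave df unchanged):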

       x
    0 -1
    1  2
    

    To leave df unmodified, create a copy of the subset before changing it:

    df_sub_copy = df[0:1].copy()
    df_sub_copy.x = -1
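As a quick check of the claim above, here is a minimal self-contained sketch. It reuses the names from this answer and uses bracket assignment instead of attribute assignment, which is only a stylistic choice here.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2]})

# .copy() gives the subset its own data, independent of df
df_sub_copy = df[0:1].copy()
df_sub_copy['x'] = -1

print(df['x'].tolist())           # original values
print(df_sub_copy['x'].tolist())  # modified copy
```

Because the subset owns its data after .copy(), the first line printed still shows the original values of df.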
    
    Answered on December 20, 2018.

    How you index plays a major role in whether a view or a copy is returned.

    There is a difference between returning a view and returning a copy.

    NumPy determines when a view of the data can be returned. If an array of labels or a boolean vector is involved in the indexing, the result is a copy. For example, df.ix[3:6] or df.ix[:, 'A'] typically returns a view, since these involve only slicing or a single label/scalar. (.ix was removed in pandas 1.0; .loc and .iloc are its replacements.)
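One way to see this distinction is to ask NumPy whether a selection shares memory with the original data. The sketch below is only an illustration: the frame and the variable names are made up for this example, and it uses .loc/.iloc rather than the removed .ix. The result for the plain slice can depend on the pandas version and on the column dtypes, so no output is asserted for it.

```python
import numpy as np
import pandas as pd

# Hypothetical single-dtype frame used only for this illustration
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

by_mask = df[df['A'] > 2]    # boolean vector: a copy
by_labels = df.loc[[0, 1]]   # list of labels: a copy
by_slice = df.iloc[1:3]      # plain slice: may be a view

base = df.to_numpy()
mask_shares = np.shares_memory(base, by_mask.to_numpy())
labels_shares = np.shares_memory(base, by_labels.to_numpy())
slice_shares = np.shares_memory(base, by_slice.to_numpy())

print("boolean mask shares memory:", mask_shares)
print("label list shares memory:", labels_shares)
print("slice shares memory:", slice_shares)
```

The boolean-mask and label-list selections report no shared memory, which matches the statement that they produce copies.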

    Answered on December 20, 2018.

