Saving Data#

When using any of the previous pandas examples none of the changes have been saved to disk as pandas is running in memory. If one needs a temporary saving then use pickle, all data types are retained together with cleaning actions, by contrast csv files will lose the data types:

df3.to_pickle("my_data.pkl")

Then in a subsequent session:

import pandas as pd

df3= pd.read_pickle("my_data.pkl")
df3
      Product  Price
0      Tablet    250
1     Printer    105
2      Laptop   1200
3  UHD Screen    400

Note

The old index is reused

The most used format for loading and saving is probably csv:

df3.to_csv('file1.csv', index=False)

Results in the following:

Product,    Price
Tablet,     250
Printer,    105
Laptop,     1200
UHD Screen, 400

If required the header and index need not be saved:

df3.to_csv('file2.csv', header=False, index=False)

More importantly special separators may be required:

df3.to_csv('file3.csv', sep=";")

Results in the csv:

;Product;Price
0;Tablet;250
1;Printer;105
2;Laptop;1200
3;UHD Screen;400

When reloaded:

df3 = pd.read_csv("file3.csv")
df3
 ;Product;Price
0;Tablet;250
1;Printer;105
2;Laptop;1200
3;UHD Screen;400

The output is squashed together, so it is not a well behaving dataframe, import again this time stating the separator:

df3 = pd.read_csv("file3.csv", sep=";")
df3
   Unnamed: 0     Product  Price
0           0      Tablet    250
1           1     Printer    105
2           2      Laptop   1200
3           3  UHD Screen    400

Oops now there are 2 indexes. Remove the old index:

del df3['Unnamed: 0']
df
    Product  Price
0      Tablet    250
1     Printer    105
2      Laptop   1200
3  UHD Screen    400

Alternatively we could have dropped a column by position:

df3.drop(df.columns[[0]], axis=1)

When saving to csv remember to disable the index:

df3.to_csv("file3.csv", sep=";", index=False)

A new index is produced automatically when loading/reloading csv files.

Saving Data#

This Page