2021-12-03

Reducing dtypes to save memory

To reduce the amount of memory a DataFrame takes, I have written the following function, which converts each numeric column to the smallest int/float dtype that fits its values.

from pandas.api.types import is_numeric_dtype

def chng_dtypes(df):
    """Downcast each numeric column of df in place to a narrower int/float dtype."""
    for col in df.columns:
        if is_numeric_dtype(df[col]):
            col_min = df[col].min()
            col_max = df[col].max()
            # Pick the narrowest signed width (in bits) whose range holds the column.
            bits = 64
            if col_min >= -2147483648 and col_max <= 2147483647:
                bits = 32
            if col_min >= -32768 and col_max <= 32767:
                bits = 16
            if col_min >= -128 and col_max <= 127:
                bits = 8
            if (df[col] % 1 != 0).any():
                # Fractional values need a float dtype; there is no float8.
                # Note: the width is chosen from the value range, not the precision.
                if bits == 8:
                    bits = 16
                dtype = 'float' + str(bits)
            else:
                dtype = 'int' + str(bits)
            df[col] = df[col].astype(dtype)
    return df

Is there a more efficient way to do this?
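
For comparison, here is a minimal sketch of one possible alternative, assuming pandas' built-in pd.to_numeric with its downcast argument is acceptable for the job. The function name downcast_numeric and the sample frame are made up for illustration. Note that it behaves differently from the function above in two ways: it never goes below float32 for floats, and a float column that happens to hold only whole numbers stays float rather than becoming int.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

def downcast_numeric(df):
    """Return a copy of df with int and float columns downcast (hypothetical helper)."""
    out = df.copy()
    for col in out.columns:
        if is_integer_dtype(out[col]):
            # Smallest signed integer dtype that holds every value.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif is_float_dtype(out[col]):
            # float32 when the values survive the conversion, else left as float64.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out

# Made-up sample data, only to show the call.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5], "c": ["x", "y", "z"]})
small = downcast_numeric(df)
print(small.dtypes)
print(df.memory_usage(deep=True).sum(), small.memory_usage(deep=True).sum())
```

Whether this is actually faster than the hand-written range checks would need to be timed on the real data; the main gain is that pandas does the range and precision checks itself.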



from Recent Questions - Stack Overflow https://ift.tt/3ddLKj9
https://ift.tt/eA8V8J
