Theprogrammersfirst: Handling duplicates in a Pandas dataframe

2022-01-01

Handling duplicates in a Pandas dataframe

I'm pulling cryptocurrency data from binance API and storing it in an database, its working, but how do I work around duplicates in the database? If I run the for loop a second time, it raises

ValueError: Table "'BTCUSDT' already exists."

How do stop it from creating a new table and just inserting a value if it's not already there using a timestamp as the key?

engine = create_engine('sqlite:///Cryptoprices3.db')

def getminutedata(symbol, lookback):
    frame = pd.DataFrame(client.get_historical_klines(symbol,'1m',lookback + ' days ago UTC' ))

    frame = frame.iloc[:,:6]
    frame.columns = ['Time','Open','High','Low','Close','Volume']
    frame[['Open','High','Low','Close','Volume']] = frame[['Open','High','Low','Close','Volume']].astype(float)
    frame.Time = pd.to_datetime(frame.Time, unit='ms')
    return frame


for coin in coins:
    getminutedata(coin,'2').to_sql(coin, engine, index=False)

from Recent Questions - Stack Overflow https://ift.tt/3qA1vqV
https://ift.tt/eA8V8J

Theprogrammersfirst

2022-01-01

Handling duplicates in a Pandas dataframe

No comments:

Post a Comment