Generating one NumPy array for each DataFrame row

By Ritesh Sahu - December 02, 2021

I'm attempting to plot stock market trades against a plot of the particular stock using mplfinance.plot(). I keep record of all my trades using jstock which uses as CSV file:

"Code","Symbol","Date","Units","Purchase Price","Current Price","Purchase Value","Current Value","Gain/Loss Price","Gain/Loss Value","Gain/Loss %","Broker","Clearing Fee","Stamp Duty","Net Purchase Value","Net Gain/Loss Value","Net Gain/Loss %","Comment"
"ASO","Academy Sports and Outdoors, Inc.","Sep 13, 2021","25.0","45.85","46.62","1146.25","1165.5","0.769999999999996","19.25","1.6793893129770994","0.0","0.0","0.0","1146.25","19.25","1.6793893129770994",""
"ASO","Academy Sports and Outdoors, Inc.","Aug 26, 2021","15.0","41.3","46.62","619.5","699.3","5.32","79.79999999999995","12.881355932203384","0.0","0.0","0.0","619.5","79.79999999999995","12.881355932203384",""
"ASO","Academy Sports and Outdoors, Inc.","Jun 3, 2021","10.0","37.48","46.62","374.79999999999995","466.2","9.14","91.40000000000003","24.386339381003214","0.0","0.0","0.0","374.79999999999995","91.40000000000003","24.386339381003214",""
"RMBS","Rambus Inc.","Nov 24, 2021","2.0","26.99","26.99","53.98","53.98","0.0","0.0","0.0","0.0","0.0","0.0","53.98","0.0","0.0",""

I can get this data easily enough using

myportfolio = pd.read_csv(PORTFOLIO_LOCATION, parse_dates=[2])

But I need to create individual lists for each trade that match the day-by-day stock price:

Date,High,Low,Open,Close,Volume,Adj Close
2020-12-01,17.020000457763672,16.5,16.799999237060547,16.8799991607666,990900,16.8799991607666
2020-12-02,17.31999969482422,16.290000915527344,16.65999984741211,16.40999984741211,1200500,16.40999984741211

and I have a normal DataFrame containing this. So far this is what I have:

for i in myportfolio.groupby("Code"):
    (code, j) = i
    if code == "ASO": # just testing it against one stock
        simp = pd.DataFrame(columns=["Date", "Units", "Price"],
            data=j[["Date", "Units", "Purchase Price"]].values, index=j[["Date"]])
        df = pd.read_csv("ASO-2020-12-01-2021-12-01.csv", index_col=0, parse_dates=True)
        # df.lookup(simp["Date"])
        df.insert(0, 'row_num', range(0,len(df)))
        k = df.loc[simp["Date"]]['row_num']
        trades = []
        for index, m in k.iteritems():
            t = np.zeros((df.shape[0], 1))
            t.fill(np.nan)
            t[m] = simp[index]["Price"]
            trades.append(t.to_list())

But I receive a KeyError: Timestamp('2021-09-17 00:00:00')

Any ideas of how to fix this?

Addendum 1:

import pandas as pd
trade_data = [['ASO', '5/5/21', 10], ['ASO', '5/6/21', 12], ['RBLX', '5/7/21', 15]]
trade_df = pd.DataFrame(trade_data, columns = ['Code', 'Date', 'Price'])
trade_df['Date'] = pd.to_datetime(trade_df['Date'])
trade_df

    Code    Date    Price
0   ASO 2021-05-05  10
1   ASO 2021-05-07  12
2   RBLX    2021-05-07  15

aso_data = [['5/5/21', 12, 5, 10, 7], ['5/6/21', 15, 7, 13, 8], ['5/9/21', 17, 10, 15, 11]]
aso_df = pd.DataFrame(aso_data, columns = ['Date', 'High', 'Low', 'Open', 'Close'])
aso_df['Date'] = pd.to_datetime(aso_df['Date'])
aso_df

    Date    High    Low Open    Close
0   2021-05-05  12  5   10  7
1   2021-05-06  15  7   13  8
2   2021-05-09  17  10  15  11

So I want to create two NumPy arrays for ASO {one for each trade) and one for the RBLX trade. For the example, consider only ASO. I should have a NumPy array that looks like:

0 10
1 NaN
2 12

the values on the left aren't there really it just looks like [10, Nan, 12].

from Recent Questions - Stack Overflow https://ift.tt/31cPEGT
https://ift.tt/eA8V8J

Search This Blog

Theprogrammersfirst | A technical portal.

Generating one NumPy array for each DataFrame row

Comments

Post a Comment

Popular posts from this blog

Today Walkin 14th-Sept

Spring Elasticsearch Operations

Hibernate Search - Elasticsearch with JSON manipulation