Python Pandas Dataframe - Nothing being returned from my function

By Ritesh Sahu - March 31, 2021

I have two dataframes:

energy_calculated (the time_stamp columns are printed with 3 decimal places to confirm there are no hidden fractional values disrupting the simple math):

    fl_key    min_time_stamp    max_time_stamp  energy
0    10051 1614556800019.000 1614556807979.000   0.352
1    10051 1614556808019.000 1614556815979.000   0.275
2    10051 1614556816019.000 1614556823979.000   0.429
3    10051 1614556824019.000 1614556831979.000   0.406
4    10051 1614556832019.000 1614556839979.000   0.444
5    10051 1614556840019.000 1614556847979.000   0.348
6    10051 1614556848019.000 1614556855979.000   0.381
7    10051 1614556856019.000 1614556863979.000   0.456
8    10051 1614556864019.000 1614556871979.000   0.362
9    10051 1614556872019.000 1614556879979.000   0.465
10   10051 1614556880019.000 1614556887979.000   0.577
11   10051 1614556888019.000 1614556895979.000   0.305
12   10051 1614556896019.000 1614556903979.000   0.347
13   10051 1614556904019.000 1614556911979.000   0.246
14   10051 1614556912019.000 1614556919939.000   0.340

df_test:

      fl_Key  time_stamp        energy       install_prediction
1007   10051  1614556840299      -1                  -1
491    10051  1614556819659      -1                  -1
1944   10051  1614556877779      -1                  -1
2227   10051  1614556889099      -1                  -1
677    10051  1614556827099      -1                  -1
2944   10051  1614556917779      -1                  -1
799    10051  1614556831979      -1                  -1
2378   10051  1614556895139      -1                  -1
1877   10051  1614556875099      -1                  -1
487    10051  1614556819499      -1                  -1

For each row of df_test, I am trying to look up the "energy" value in the energy_calculated dataframe using that row's fl_Key and time_stamp. The fl_Key column must match fl_key exactly, and time_stamp must fall between min_time_stamp and max_time_stamp (inclusive).

The names fl_Key and fl_key differ deliberately so I can track which dataframe each column comes from.
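
For reference, the lookup described above is an exact match on the key plus an interval match on the time stamp. A minimal sketch of one way to express it is below, assuming the time stamp columns are integers in both frames and the windows for a given key do not overlap. It uses pd.merge_asof rather than the row-by-row method that follows; the small sample frames, the orig_index name, and the extra row with index 5000 (which falls in a gap between windows) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data modelled on the frames above (integer time stamps).
energy_calculated = pd.DataFrame({
    "fl_key": [10051, 10051, 10051],
    "min_time_stamp": [1614556816019, 1614556824019, 1614556840019],
    "max_time_stamp": [1614556823979, 1614556831979, 1614556847979],
    "energy": [0.429, 0.406, 0.348],
})
df_test = pd.DataFrame(
    {
        "fl_Key": [10051, 10051, 10051, 10051],
        "time_stamp": [1614556840299, 1614556819659, 1614556831979, 1614556823999],
    },
    index=[1007, 491, 799, 5000],
)

# merge_asof needs both sides sorted on the time columns; keep the original
# index in a column so the result can be put back in df_test's order.
left = df_test.rename_axis("orig_index").reset_index().sort_values("time_stamp")
right = energy_calculated.sort_values("min_time_stamp")

# For each test row, take the latest window that starts at or before time_stamp.
merged = pd.merge_asof(
    left,
    right,
    left_on="time_stamp",
    right_on="min_time_stamp",
    left_by="fl_Key",
    right_by="fl_key",
    direction="backward",
)

# Discard matches where the time stamp lies past the end of that window.
inside = merged["time_stamp"] <= merged["max_time_stamp"]
merged["energy"] = merged["energy"].where(inside)

result = merged.set_index("orig_index")["energy"].reindex(df_test.index)
print(result)
```

Rows that fall outside every window end up as NaN instead of raising an exception, which is a different design choice from the checks in the method below.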

I have a simple method (I added the raise statements just to make sure it always finds exactly one match):

def integrateEnergyCalculationData(row, energy_calculations):
  # Keep only the rows with a matching key whose time window contains the time stamp.
  energy_calculations = energy_calculations[
    (energy_calculations['fl_key'] == row.fl_Key)
    & (energy_calculations['min_time_stamp'] <= row.time_stamp)
    & (energy_calculations['max_time_stamp'] >= row.time_stamp)
  ]

  if (len(energy_calculations) == 0):
    raise Exception("No energy data for: " + str(row.fl_Key) + ", " + str(row.time_stamp))
  elif (len(energy_calculations) >= 2):
    raise Exception("Too much energy data for: " + str(row.fl_Key) + ", " + str(row.time_stamp))

  return energy_calculations['energy']

I tie it all together using apply():

df_test['energy'] = df_test[['time_stamp','fl_Key']].apply(integrateEnergyCalculationData, 1, args=(energy_calculated, ))

What ends up happening is that the mapping is made for some of the rows, but not all of them.

My resulting df_test dataframe is shown below. The full df_test is much bigger; I randomly sampled 10 rows from it to demonstrate the issue, which is why the index numbers are not sequential:

       fl_Key    time_stamp            energy     install_prediction
1007    10051    1614556840299                          -1
491     10051    1614556819659    0.4291915384067029    -1
1944    10051    1614556877779                          -1
2227    10051    1614556889099                          -1
677     10051    1614556827099                          -1
2944    10051    1614556917779                          -1
799     10051    1614556831979                          -1
2378    10051    1614556895139                          -1
1877    10051    1614556875099                          -1
487     10051    1614556819499    0.4291915384067029    -1

What am I missing? Thanks.
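
One possible explanation, offered as a sketch and not a confirmed diagnosis: the method returns energy_calculations['energy'], which is a one-element Series, not a number. When the function passed to apply() with axis 1 returns a Series, pandas builds a DataFrame whose columns are the index labels of the matched energy_calculated rows, so the result cannot line up with the single column df_test['energy'] for every row. The helper names below (matching_rows, lookup_as_series, lookup_as_scalar) and the three-row sample frames are hypothetical.

```python
import pandas as pd

# Hypothetical sample data modelled on the frames above.
energy_calculated = pd.DataFrame({
    "fl_key": [10051, 10051, 10051],
    "min_time_stamp": [1614556816019, 1614556824019, 1614556840019],
    "max_time_stamp": [1614556823979, 1614556831979, 1614556847979],
    "energy": [0.429, 0.406, 0.348],
})
df_test = pd.DataFrame(
    {
        "fl_Key": [10051, 10051, 10051],
        "time_stamp": [1614556840299, 1614556819659, 1614556827099],
    },
    index=[1007, 491, 677],
)


def matching_rows(row, energy_calculations):
    # Same filter as the method in the question.
    return energy_calculations[
        (energy_calculations["fl_key"] == row.fl_Key)
        & (energy_calculations["min_time_stamp"] <= row.time_stamp)
        & (energy_calculations["max_time_stamp"] >= row.time_stamp)
    ]


def lookup_as_series(row, energy_calculations):
    # Returns a one-element Series, as the method in the question does.
    return matching_rows(row, energy_calculations)["energy"]


def lookup_as_scalar(row, energy_calculations):
    # Returns the single matching value itself.
    return matching_rows(row, energy_calculations)["energy"].iloc[0]


cols = df_test[["time_stamp", "fl_Key"]]
as_series = cols.apply(lookup_as_series, axis=1, args=(energy_calculated,))
as_scalar = cols.apply(lookup_as_scalar, axis=1, args=(energy_calculated,))

print(type(as_series).__name__)  # a DataFrame with one column per matched row label
print(type(as_scalar).__name__)  # a Series aligned with df_test's index

df_test["energy"] = as_scalar
print(df_test)
```

If this is the cause, returning energy_calculations['energy'].iloc[0] from the original method should be enough. How pandas handles assigning a multi-column DataFrame to a single column may differ between versions, so the exact symptom can vary.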

from Recent Questions - Stack Overflow https://ift.tt/3frBZAa
