2023-03-26

Stack and explode columns in pandas

I have a DataFrame on which I want to apply explode and stack at the same time: explode the 'Attendees' column and assign each value to the correct course. For example, in session1, Course 1 ('intro to') had 24 attendees and Course 2 ('Computer skill') had 46. In addition, I want all the course names in one column.

import pandas as pd
import numpy as np

# Each session has up to two courses; 'Attendees' holds one count per course, joined by ' & '
Course_df = pd.DataFrame({'Session': ['session1', 'session2', 'session3'],
                          'Course 1': ['intro to', 'advanced', 'Cv'],
                          'Course 2': ['Computer skill', np.nan, 'Write cover letter'],
                          'Attendees': ['24 & 46', '23', '30']})

If I apply the explode function to 'Attendees' I get the result

Course_df = Course_df.assign(Attendees=Course_df['Attendees'].str.split(' & ')).explode('Attendees')

    Session   Course 1        Course 2  Attendees
0  session1   intro to  Computer skill         24
0  session1   intro to  Computer skill         46
1  session2   advanced             NaN         23

and when I apply the stack function

Course_df = (Course_df.set_index(['Session','Attendees']).stack().reset_index().rename({0:'Courses'}, axis = 1))

This is the result I get

  Session     level_1             Courses      Attendees
0  session1  Course 1            intro to        24
1  session1  Course 2      Computer skill        46
2  session2  Course 1            advanced        23
3  session3  Course 1                  Cv        30

Whereas the result I want is

   Session     level_1             Courses      Attendees
0  session1  Course 1            intro to        24
1  session1  Course 2      Computer skill        46
2  session2  Course 1            advanced        23
3  session3  Course 1                  Cv        30
4  session3  Course 2   Write cover letter        30
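
One possible way to get this shape is sketched below. It is not the only approach, and it swaps explode and stack for melt plus a positional lookup: the course columns are melted into long format first, and each row then picks the attendee count at its position within the session. The names long_df, counts, pos and result are made up for this example. It also assumes that when a session lists fewer counts than courses (session3 here), the last count is reused, which is what the desired output above implies.

```python
import numpy as np
import pandas as pd

Course_df = pd.DataFrame({'Session': ['session1', 'session2', 'session3'],
                          'Course 1': ['intro to', 'advanced', 'Cv'],
                          'Course 2': ['Computer skill', np.nan, 'Write cover letter'],
                          'Attendees': ['24 & 46', '23', '30']})

# Long format: one row per (Session, course column), empty course slots dropped
long_df = (Course_df
           .melt(id_vars=['Session', 'Attendees'],
                 var_name='level_1', value_name='Courses')
           .dropna(subset=['Courses'])
           .sort_values(['Session', 'level_1'])
           .reset_index(drop=True))

# Split '24 & 46' into ['24', '46'] and number the courses within each session
counts = long_df['Attendees'].str.split(' & ')
pos = long_df.groupby('Session').cumcount()

# Pick the count matching the course position.
# Assumption: if there are fewer counts than courses, reuse the last count.
long_df['Attendees'] = [int(c[min(i, len(c) - 1)]) for c, i in zip(counts, pos)]

result = long_df[['Session', 'level_1', 'Courses', 'Attendees']]
print(result)
```

Melting before splitting avoids the row duplication that explode causes, because every course already has its own row by the time the counts are assigned.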

