2022-06-29

How to create nested training and testing sets?

I'm working with the ChickWeight data set in R. I want to create multiple models, each trained on an individual chick. To do that, I am nesting the data so that each chick gets its own data frame, stored in a list column.

Here is the start:

library(tidyverse)
library(datasets)
data("ChickWeight")

ChickWeightNest <- ChickWeight %>% 
  group_by(Chick) %>% 
  nest()

From here, fitting a linear regression model to every data frame at once is easy: write the model as a function, then use mutate() to add a new column and map the function over the list column. However, a more sophisticated model (e.g. xgboost) requires first splitting the data into training and testing sets. How can I split all of my nested data frames at once into training and testing sets so that I can train multiple models simultaneously?
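For illustration, here is a minimal sketch of one way the per-chick split could look. It is not a definitive approach. The helper split_rows(), the 80/20 proportion, the seed, and the column names train, test, and model are all assumptions made for this example, and lm() only stands in for whatever model would actually be trained on each training set.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(123)  # assumed seed, only to make the random split reproducible

# Hypothetical helper: randomly assign a proportion of rows to training,
# the remainder to testing. A logical mask avoids the empty-index pitfall
# of negative subsetting.
split_rows <- function(df, prop = 0.8) {
  n <- nrow(df)
  train_idx <- sample.int(n, size = floor(prop * n))
  is_train <- seq_len(n) %in% train_idx
  list(train = df[is_train, ], test = df[!is_train, ])
}

chick_nested <- as_tibble(ChickWeight) %>%
  group_by(Chick) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    split = map(data, split_rows, prop = 0.8),         # one split per chick
    train = map(split, "train"),                       # extract training rows
    test  = map(split, "test"),                        # extract testing rows
    model = map(train, ~ lm(weight ~ Time, data = .x)) # placeholder model
  ) %>%
  select(-split)
```

Each row of chick_nested then holds one chick's full data, its training and testing sets, and a model fitted only on the training rows. Note that a random row split ignores the time ordering of the measurements, and chicks with very few observations end up with tiny training sets, so a different splitting rule may suit this data better.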

As a side note, information on training and tuning multiple models seems relatively sparse in my research. Any related resources or past Stack Overflow questions would be much appreciated.


