2022-04-18

Look up values within a date interval in a second data.table

I am a beginner to R and I am looking for help with a function/loop.

I have this data.table, "by_newborn":

id_mom  |date      |id_newborn|week|weight|conception_date|pregnancy_interval            |one_year_before_pregnany|one_year_before_interval      |first_trimester
1.21e+12|01/05/2020|1234      |18  |2     |2019-12-27     |2019-12-27 UTC--2020-05-01 UTC|2018-12-26              |2018-12-26 UTC--2019-12-27 UTC|2020-04-02
1.21e+12|01/05/2020|5489      |18  |2     |2019-12-27     |2019-12-27 UTC--2020-05-01 UTC|2018-12-26              |2018-12-26 UTC--2019-12-27 UTC|2020-04-02

by_newborn structure:

structure(list(ן..ID = c(2602035392, 2602035392, 4104232942), 
    date_of_birth = structure(c(1L, 1L, 2L), .Label = c("01/05/2020", 
    "02/05/2018", "03/05/2020", "04/05/2020", "05/05/2020", "06/05/2020", 
    "07/05/2020", "08/05/2020"), class = "factor"), week = c(38L, 
    38L, 36L), conception_date = structure(c(18117, 18117, 17401
    ), class = "Date"), pregnancy_interval = new("Interval", 
        .Data = c(22982400, 22982400, 21772800), start = structure(c(1565308800, 
        1565308800, 1503446400, 1503446400, 1563062400, 1563062400, 
        1564358400, 1564358400, 1563840000, 1563840000, 1563926400, 
        1564617600, 1567728000), tzone = "UTC", class = c("POSIXct", 
        "POSIXt")), tzone = "UTC")), row.names = c(NA, 
-3L), class = c("data.table", "data.frame"))

I created the intervals using data.table and lubridate:

library(data.table)
library(lubridate)
# Conception date = birth date minus the gestational age in weeks
by_newborn[, conception_date := dmy(date) - weeks(week)]
# Pregnancy interval runs from the conception date to the birth date
by_newborn[, pregnancy_interval := interval(conception_date, dmy(date))]

I have a second table, TSH_results, with the test-result history for each id_mom:

id_mom  |date      |tsh_level|Units
1.21e+12|01/02/2020|0.5      |ng/dl
1.21e+12|05/02/2020|0.5      |ng/dl
1.21e+12|03/05/2015|1.8      |ng/dl
1.21e+12|09/05/2015|1.8      |ng/dl

TSH_results structure:

structure(list(ן..id_mom = c(1.21e+12, 1.21e+12, 1.21e+12, 1.21e+12, 
1.21e+12), date = c("01/02/2020", "01/02/2020", "01/02/2020", 
"01/02/2020", "01/02/2020"), TSH_level = c("0.5", "0.5", "0.5", 
"0.5", "0.5"), measur = c("ng/dl", "ng/dl", "ng/dl", "ng/dl", 
"ng/dl")), row.names = c(NA, -5L), class = c("data.table", "data.frame"
),

I would like help writing code that, for each id_mom, finds the results in TSH_results whose date falls within an interval (or between two dates) and returns the TSH level in a new column of by_newborn.

I have tried this, but it seems I might need a loop or another way:

by_newborn[id_mom == TSH_results$id_mom & (dmy(TSH_results$date) %within%
pregnancy_interval), preg_results := TSH_results$result]

Many thanks in advance!
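
One possible approach, sketched here under assumptions rather than given as a definitive answer, is a data.table non-equi join on plain Date columns instead of a loop over lubridate intervals. The toy rows below are modelled on the tables above. The columns birth_date and test_date, and the rule of keeping the most recent test when several fall inside the pregnancy window, are assumptions added for illustration. Dates are parsed with base as.Date so the sketch needs only data.table.

```r
library(data.table)

# Toy data modelled on the tables in the question (values are illustrative).
by_newborn <- data.table(
  id_mom     = c(1.21e12, 1.21e12),
  id_newborn = c(1234, 5489),
  date       = c("01/05/2020", "01/05/2020"),
  week       = c(18L, 18L)
)
TSH_results <- data.table(
  id_mom    = rep(1.21e12, 4),
  date      = c("01/02/2020", "05/02/2020", "03/05/2015", "09/05/2015"),
  tsh_level = c(0.5, 0.5, 1.8, 1.8)
)

# Plain Date columns for both ends of the pregnancy window
# (birth_date and test_date are hypothetical helper columns).
by_newborn[, birth_date := as.Date(date, format = "%d/%m/%Y")]
by_newborn[, conception_date := birth_date - 7L * week]
TSH_results[, test_date := as.Date(date, format = "%d/%m/%Y")]

# Non-equi join: same mother, and test date between conception and birth.
# x.test_date is requested explicitly because the join columns in the
# result would otherwise carry the window bounds, not the test date.
matches <- TSH_results[
  by_newborn,
  .(id_newborn = i.id_newborn, test_date = x.test_date, tsh_level = x.tsh_level),
  on = .(id_mom, test_date >= conception_date, test_date <= birth_date),
  nomatch = NULL
]

# Assumption: when several tests fall inside the window, keep the latest one.
latest <- matches[order(test_date), .SD[.N], by = id_newborn]

# Update join: write the matched level back into by_newborn.
by_newborn[latest, preg_results := i.tsh_level, on = "id_newborn"]
```

Newborns without any test inside the window would keep NA in preg_results. If every matching test is needed rather than only the latest, matches already holds one row per newborn and test.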


