Look up values within a date interval in a second data.table
I am a beginner in R and I am looking for help with a function/loop.
I have this data.table, "by_newborn":
id_mom | date | id_newborn | week | weight | conception_date | pregnancy_interval | one_year_before_pregnany | one_year_before_interval | first_trimester |
---|---|---|---|---|---|---|---|---|---|
1.21e+12 | 01/05/2020 | 1234 | 18 | 2 | 2019-12-27 | 2019-12-27 UTC--2020-05-01 UTC | 2018-12-26 | 2018-12-26 UTC--2019-12-27 UTC | 2020-04-02 |
1.21e+12 | 01/05/2020 | 5489 | 18 | 2 | 2019-12-27 | 2019-12-27 UTC--2020-05-01 UTC | 2018-12-26 | 2018-12-26 UTC--2019-12-27 UTC | 2020-04-02 |
Structure of by_newborn:
structure(list(ן..ID = c(2602035392, 2602035392, 4104232942),
date_of_birth = structure(c(1L, 1L, 2L), .Label = c("01/05/2020",
"02/05/2018", "03/05/2020", "04/05/2020", "05/05/2020", "06/05/2020",
"07/05/2020", "08/05/2020"), class = "factor"), week = c(38L,
38L, 36L), conception_date = structure(c(18117, 18117, 17401
), class = "Date"), pregnancy_interval = new("Interval",
.Data = c(22982400, 22982400, 21772800), start = structure(c(1565308800,
1565308800, 1503446400, 1503446400, 1563062400, 1563062400,
1564358400, 1564358400, 1563840000, 1563840000, 1563926400,
1564617600, 1567728000), tzone = "UTC", class = c("POSIXct",
"POSIXt")), tzone = "UTC")), row.names = c(NA,
-3L), class = c("data.table", "data.frame"))
I created the intervals using data.table and lubridate:
library(data.table)
library(lubridate)
# Conception date = birth date minus gestational age in weeks
by_newborn[, conception_date := dmy(date) - weeks(week)]
# Pregnancy interval runs from conception to birth
by_newborn[, pregnancy_interval := interval(conception_date, dmy(date))]
I have a second table, TSH_results, with the test result history for each id_mom:
id_mom | date | tsh_level | Units |
---|---|---|---|
1.21e+12 | 01/02/2020 | 0.5 | ng/dl |
1.21e+12 | 05/02/2020 | 0.5 | ng/dl |
1.21e+12 | 03/05/2015 | 1.8 | ng/dl |
1.21e+12 | 09/05/2015 | 1.8 | ng/dl |
Structure of TSH_results:
structure(list(ן..id_mom = c(1.21e+12, 1.21e+12, 1.21e+12, 1.21e+12,
1.21e+12), date = c("01/02/2020", "01/02/2020", "01/02/2020",
"01/02/2020", "01/02/2020"), TSH_level = c("0.5", "0.5", "0.5",
"0.5", "0.5"), measur = c("ng/dl", "ng/dl", "ng/dl", "ng/dl",
"ng/dl")), row.names = c(NA, -5L), class = c("data.table", "data.frame"
),
I would like help writing code that, for each id_mom, looks in TSH_results for a result whose date falls within an interval (or between two dates) and returns the TSH level in a new column of by_newborn.
I have tried this, but it seems I might need a loop or another approach:
by_newborn[id_mom == TSH_results$id_mom & (dmy(TSH_results$date) %within%
pregnancy_interval), preg_results := TSH_results$result]
Many thanks in advance!
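For reference, the kind of lookup described above is often written in data.table as a non-equi join rather than a loop. The following is only a minimal sketch on toy data, not a tested solution for the real tables: the names birth_date, test_date, matches and per_newborn are hypothetical, the toy values are made up, and averaging several results per pregnancy is an arbitrary assumption.

```r
library(data.table)

# Toy data modelled on the tables above (values are illustrative only)
by_newborn <- data.table(
  id_mom     = c(1, 1, 2),
  id_newborn = c(1234, 5489, 7777),
  date       = c("01/05/2020", "01/05/2020", "02/05/2018"),
  week       = c(18L, 18L, 36L)
)
TSH_results <- data.table(
  id_mom    = c(1, 1, 1, 2),
  date      = c("01/02/2020", "05/02/2020", "03/05/2015", "01/01/2018"),
  tsh_level = c(0.5, 1.5, 1.8, 2.1)
)

# Parse the day/month/year strings once; keep the interval bounds as plain Dates
by_newborn[, birth_date := as.Date(date, format = "%d/%m/%Y")]
by_newborn[, conception_date := birth_date - 7L * week]
TSH_results[, test_date := as.Date(date, format = "%d/%m/%Y")]

# Non-equi join: for every newborn, find the tests of the same mother
# dated between conception and birth (newborns without a match get NA)
matches <- TSH_results[
  by_newborn,
  on = .(id_mom, test_date >= conception_date, test_date <= birth_date),
  j = .(id_newborn = i.id_newborn, tsh_level = x.tsh_level),
  allow.cartesian = TRUE
]

# Collapse to one value per newborn (the mean is an assumption made for this sketch)
per_newborn <- matches[, .(preg_results = mean(tsh_level)), by = id_newborn]

# Write the result back into by_newborn by reference
by_newborn[per_newborn, on = "id_newborn", preg_results := i.preg_results]
```

If every individual result is needed rather than a single summary value, the intermediate matches table already holds one row per newborn and test. The same pattern could be repeated with other bounds, such as a one-year-before-pregnancy window, by swapping the two date columns used in on.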