Writing single SQL query satisfying two cases
After the comments received, I am rephrasing this question with required data.
Reference: SQL query to exclude some records from the output
Vertica analytical functions: https://www.vertica.com/blog/analytic-queries-vertica/
Table 1:
create table etl_group_membership
(
group_item_id int not null,
member_item_id int not null
);
INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 117722);
INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 104151);
INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 5316);
Table 2:
create table v_poll_item
(
device_item_id int not null,
item_id int not null
);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 273215);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 117936);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 117873);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 123305);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 240006);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 240005);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 239415);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 239414);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 118310);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 130627);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 298564);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 118311);
Table 3: Note that im_utilization can be NULL as well
create table nrm_cpustats_rate
(
item_id int not null,
tstamp datetime not null,
im_utilization float,
);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-28 19:55:00.000000', 2);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-27 23:35:00.000000', 24);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-26 14:05:00.000000', 27);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-25 09:05:00.000000', 29);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:30:00.000000', 17);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:15:00.000000', 35);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:05:00.000000', 50);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-27 05:45:00.000000', 89);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 09:20:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 09:10:00.000000', 51);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 08:50:00.000000', 90);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-23 04:10:00.000000', 51);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-23 03:15:00.000000', 48);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-22 22:20:00.000000', 19);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 17:10:00.000000', 11);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 16:30:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 16:35:00.000000', 38);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:45:00.000000', 74);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:48:00.000000', 76);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:50:00.000000', 77);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-28 00:40:00.000000', 29);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-23 22:30:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-23 22:25:00.000000', 92);
To get the device items ids in a group:
SELECT member_item_id FROM etl_group_membership WHERE group_item_id = 335640;
From the list of device item ids retrieved, to get the list of item_ids:
SELECT item_id FROM v_poll_item WHERE device_item_id IN (<devices retrieved from previous query>);
Inputs:
- Two time ranges: yesterday until 7 days back (
AND tstamp > '2021-06-22 00:00:00.000000'AND tstamp <= '2021-06-28 23:59:59.000000'
) - Group id: 335640
- breach_threshold: 25
- Minimum number of breaches each day: 2
Expected output:
breached means im_utilization is >= 25
Pick only those records where count_of_breached in a given day >= 2 That is, records with item_id 273215 are excluded because even though the number of breaches (>= 25) are 2, there's only one each day
device_item_id | item_id | count_of_breach | date_when_breached | max_utilization | max_utilization_tstamp
=====================================================================================================================
117722 | 117936 | 2 | 2021-06-28 | 90 | 2021-06-25 08:50:00.000000
117722 | 117936 | 3 | 2021-06-25 | 90 | 2021-06-25 08:50:00.000000
5316 | 118310 | 2 | 2021-06-23 | 51 | 2021-06-23 04:10:00.000000
5316 | 118311 | 2 | 2021-06-23 | 92 | 2021-06-23 22:25:00.000000
104151 | 239414 | 2 | 2021-06-22 | 77 | 2021-06-28 18:50:00.000000
104151 | 239414 | 3 | 2021-06-28 | 77 | 2021-06-28 18:50:00.000000
Even if a single SQL query cannot be written to produce this output, can two optimized queries be suggested. @marcothesane pointed that even the query to get daily breaches can be written in a better way.
from Recent Questions - Stack Overflow https://ift.tt/3jmg9QK
https://ift.tt/eA8V8J
Comments
Post a Comment