2021-06-30

Writing single SQL query satisfying two cases

After the comments received, I am rephrasing this question with required data.

Reference: SQL query to exclude some records from the output

Vertica analytical functions: https://www.vertica.com/blog/analytic-queries-vertica/

Table 1:

create table etl_group_membership
(
    group_item_id int not null,
    member_item_id int not null
);

INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 117722);
INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 104151);
INSERT INTO etl_group_membership (group_item_id, member_item_id) VALUES (335640, 5316);

Table 2:

create table v_poll_item
(
    device_item_id int not null,
    item_id int not null
    
);

INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 273215);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 117936);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 117873);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (117722, 123305);

INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 240006);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 240005);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 239415);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (104151, 239414);

INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 118310);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 130627);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 298564);
INSERT INTO v_poll_item (device_item_id, item_id) VALUES (5316, 118311);

Table 3: Note that im_utilization can be NULL as well

create table nrm_cpustats_rate
(
    item_id int not null,
    tstamp datetime not null,
    im_utilization float,
);

INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-28 19:55:00.000000', 2);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-27 23:35:00.000000', 24);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-26 14:05:00.000000', 27);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (273215, '2021-06-25 09:05:00.000000', 29);

INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:30:00.000000', 17);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:15:00.000000', 35);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-28 19:05:00.000000', 50);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-27 05:45:00.000000', 89);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 09:20:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 09:10:00.000000', 51);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (117936, '2021-06-25 08:50:00.000000', 90);

INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-23 04:10:00.000000', 51);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-23 03:15:00.000000', 48);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118310, '2021-06-22 22:20:00.000000', 19);

INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 17:10:00.000000', 11);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 16:30:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-22 16:35:00.000000', 38);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:45:00.000000', 74);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:48:00.000000', 76);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (239414, '2021-06-28 18:50:00.000000', 77);

INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-28 00:40:00.000000', 29);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-23 22:30:00.000000', 37);
INSERT INTO nrm_cpustats_rate (item_id, tstamp, im_utilization) VALUES (118311, '2021-06-23 22:25:00.000000', 92);

To get the device items ids in a group:

SELECT member_item_id FROM etl_group_membership WHERE group_item_id = 335640;

From the list of device item ids retrieved, to get the list of item_ids:

SELECT item_id FROM v_poll_item WHERE device_item_id IN (<devices retrieved from previous query>);

Inputs:

  1. Two time ranges: yesterday until 7 days back (AND tstamp > '2021-06-22 00:00:00.000000'AND tstamp <= '2021-06-28 23:59:59.000000')
  2. Group id: 335640
  3. breach_threshold: 25
  4. Minimum number of breaches each day: 2

Expected output:

breached means im_utilization is >= 25

Pick only those records where count_of_breached in a given day >= 2 That is, records with item_id 273215 are excluded because even though the number of breaches (>= 25) are 2, there's only one each day

device_item_id  |  item_id  |  count_of_breach  |  date_when_breached  |  max_utilization  |  max_utilization_tstamp
=====================================================================================================================
117722          |  117936   |  2                |  2021-06-28          | 90                |  2021-06-25 08:50:00.000000
117722          |  117936   |  3                |  2021-06-25          | 90                |  2021-06-25 08:50:00.000000
5316            |  118310   |  2                |  2021-06-23          | 51                |  2021-06-23 04:10:00.000000
5316            |  118311   |  2                |  2021-06-23          | 92                |  2021-06-23 22:25:00.000000
104151          |  239414   |  2                |  2021-06-22          | 77                |  2021-06-28 18:50:00.000000
104151          |  239414   |  3                |  2021-06-28          | 77                |  2021-06-28 18:50:00.000000

Even if a single SQL query cannot be written to produce this output, can two optimized queries be suggested. @marcothesane pointed that even the query to get daily breaches can be written in a better way.



from Recent Questions - Stack Overflow https://ift.tt/3jmg9QK
https://ift.tt/eA8V8J

No comments:

Post a Comment