Bigquery SQL: Calculate the average recency and latency of customer transactions by month

I'm trying to calculate monthly averages of - customer recency (days since last order) and customer latency (days between last order and previous order).

Calculating recency on the last day of each reporting month.

Including customers in the monthly calculation where:

  • max(order_date) is BETWEEN reporting mth_end_date (i.e. not in future) and 12 months prior.
  • latency for the max(order_date) IS NOT NULL (i.e. it has a prior order)

It feels like this is reasonably simple (when I says it at least!), but I can't crack it, and feel I am missing something obvious.

I'd appreciate any support you can offer.

Below is my progress to date using sample data.

With input_data AS (

SELECT '#1238' as order_id, DATE('2021-12-15') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1201' as order_id, DATE('2021-10-10') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1198' as order_id, DATE('2021-07-05') as order_date, 'c12345' as cust_id, 20 as order_value
UNION ALL SELECT '#1134' as order_id, DATE('2020-10-15') as order_date, 'c12345' as cust_id, 10 as order_value
UNION ALL SELECT '#1112' as order_id, DATE('2019-08-10') as order_date, 'c12345' as cust_id, 5 as order_value
UNION ALL SELECT '#1234' as order_id, DATE('2021-07-05') as order_date, 'c11111' as cust_id, 118 as order_value
UNION ALL SELECT '#1294' as order_id, DATE('2021-01-05') as order_date, 'c11111' as cust_id, 68 as order_value
UNION ALL SELECT '#1290' as order_id, DATE('2021-01-01') as order_date, 'c11111' as cust_id, 82 as order_value
UNION ALL SELECT '#1284' as order_id, DATE('2020-01-15') as order_date, 'c22222' as cust_id, 98 as order_value),

enhanced_data AS (
SELECT 
*
,DATE_DIFF(order_date, LAG(order_date) OVER (PARTITION BY cust_id ORDER BY order_date), day) AS order_latency_days,
from input_data
),

dates AS (
  SELECT
   LAST_DAY(array_date, month) AS mth_end_date
  FROM 
    UNNEST(GENERATE_DATE_ARRAY( 
      (SELECT MIN(order_date) FROM enhanced_data), 
      (SELECT MAX(order_date) FROM enhanced_data), INTERVAL 1 MONTH)) as array_date)

Select 
 d.mth_end_date
  , avg(e.order_latency_days) as avg_latency
 from dates d
left join enhanced_data e 
ON Date_trunc(d.mth_end_date, Month) >  Date_trunc(e.order_date, Month)
WHERE e.order_latency_days IS NOT NULL
group by 1
order by 1

I would expect the values for Dec-2021 to be (showing workings):

enter image description here



from Recent Questions - Stack Overflow https://ift.tt/nSCMQB2
https://ift.tt/tEicGFR

Comments

Popular posts from this blog

Spring Elasticsearch Operations

Network Error and Timeout on Authorize.net JS

Object oriented programming concepts (OOPs)