2022-01-15

How to parse an array of dicts and extract key columns from a BigQuery external table

I have a giant array of dicts loaded from JSON into a date-partitioned BigQuery external table, with the table structure below:

Field name   Type      Mode
meta         Record    Nullable
Messages     String    Repeated
date         Integer   Nullable

Every "Messages" field is in its own row/record in my BigQuery table (newline-delimited JSON).

I am trying to parse the "Messages" field/column to extract the fields Key1 and Key2, which are inside an array (of dicts). For simplicity, below is a snippet of the JSON; "Messages" is the field I am trying to unnest/explode.

 [
  {
    "meta": {
      "table": "FEED",
      "source": "CP1"
    },
    "Messages": [
      "{
      "Key1":"2022-01-10",
      "Key2":"H21257061"
      }"
       ],
    "date": "20220110"
  },
  {
    "meta": {
      "table": "FEED",
      "source": "CP1"
    },
    "Messages": [
      "{
      "Key1":"2022-01-11",
      "Key2":"H21257062"
      }"
       ],
    "date": "20220111"
  }
]

Schema representation: [image]

So far I have tried the query below, but the SQL output for key1 and key2 is NULL:

    WITH table AS (SELECT Messages AS array_column FROM `project.dataset.table`)
    SELECT
        json_extract_scalar(flattened_array, '$.Messages.key1') AS key1,
        json_extract_scalar(flattened_array, '$.Messages.key2') AS key2
    FROM table t
    CROSS JOIN UNNEST(t.array_column) AS flattened_array
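
One possible cause of the NULLs, offered as a sketch rather than a confirmed diagnosis: after UNNEST, each element is already a single message string, so the JSON path probably should not include `$.Messages`, and JSON keys are case-sensitive (`Key1`, not `key1`). The query below assumes each element of "Messages" is a valid JSON object string. The `sample` CTE and the `message` alias are hypothetical stand-ins for `project.dataset.table`, so the query can be run without the real table.

```sql
-- Hypothetical inline sample standing in for `project.dataset.table`.
-- Assumes each element of Messages is a valid JSON object string.
WITH sample AS (
  SELECT ['{"Key1":"2022-01-10","Key2":"H21257061"}'] AS Messages
  UNION ALL
  SELECT ['{"Key1":"2022-01-11","Key2":"H21257062"}']
)
SELECT
  -- The path starts at the root of each unnested string,
  -- and the key names match the case used in the JSON.
  JSON_EXTRACT_SCALAR(message, '$.Key1') AS key1,
  JSON_EXTRACT_SCALAR(message, '$.Key2') AS key2
FROM sample AS s
CROSS JOIN UNNEST(s.Messages) AS message
```

If this returns values for the inline sample but not for the real table, the stored strings may not be valid JSON (for example, unescaped inner quotes), which would be worth checking separately.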


from Recent Questions - Stack Overflow https://ift.tt/3nsizys
https://ift.tt/3A3WwTY
