当前位置: 首页 > 工具软件 > BigQuery > 使用案例 >

BigQuery查询

钦海荣
2023-12-01

1.导入BigQuery后数据表的结构

1.1 数据集

针对每个启用了 BigQuery 集成的 Firebase 应用,系统都会添加一个以软件包名称或软件包ID为基础来命名的数据集。

1.2 表

在每个数据集中,系统会为每天导出的数据导入一个表。此类表的名称格式为“app_events_YYYYMMDD”。此外,系统还会为当天收到的事件导入一个表。此表的名称格式为“app_events_intraday_YYYYMMDD”,是在收集事件期间实时填充的。

1.3 行

表中的每一行都与 Firebase SDK 上传的一批事件相对应。

1.4 列

1.4.1 user_dim

字段名称数据类型
user_dimRECORD
user_dim.user_idSTRING
user_dim.first_open_timestamp_microsINTEGER
user_dim.user_propertiesRECORD
user_dim.user_properties.keySTRING
user_dim.user_properties.valueRECORD
user_dim.user_properties.value.valueRECORD
user_dim.user_properties.value.value.string_valueSTRING
user_dim.user_properties.value.value.int_valueINTEGER
user_dim.user_properties.value.value.double_valueFLOAT
user_dim.user_properties.value.set_timestamp_usecINTEGER
user_dim.user_properties.value.indexINTEGER
user_dim.device_infoRECORD
user_dim.device_info.device_categorySTRING
user_dim.device_info.mobile_brand_nameSTRING
user_dim.device_info.mobile_model_nameSTRING
user_dim.device_info.mobile_marketing_nameSTRING
user_dim.device_info.device_modelSTRING
user_dim.device_info.platform_versionSTRING
user_dim.device_info.device_idSTRING
user_dim.device_info.resettable_device_idSTRING
user_dim.device_info.user_default_languageSTRING
user_dim.device_info.device_time_zone_offset_secondsINTEGER
user_dim.device_info.limited_ad_trackingBOOLEAN
user_dim.geo_infoRECORD
user_dim.geo_info.continentSTRING
user_dim.geo_info.countrySTRING
user_dim.geo_info.regionSTRING
user_dim.geo_info.citySTRING
user_dim.app_infoRECORD
user_dim.app_info.app_versionSTRING
user_dim.app_info.app_instance_idSTRING
user_dim.app_info.app_storeSTRING
user_dim.app_info.app_platformSTRING
user_dim.traffic_sourceRECORD
user_dim.traffic_source.user_acquired_campaignSTRING
user_dim.traffic_source.user_acquired_sourceSTRING
user_dim.traffic_source.user_acquired_campaignSTRING
user_dim.bundle_infoRECORD
user_dim.bundle_info.bundle_sequence_idINTEGER
user_dim.ltv_infoRECORD
user_dim.ltv_info.revenueFLOAT
user_dim.ltv_info.currencySTRING

1.4.2 event_dim

字段名称数据类型
event_dimRECORD
event_dim.dateSTRING
event_dim.nameSTRING
event_dim.paramsRECORD
event_dim.params.keySTRING
event_dim.params.valueRECORD
event_dim.params.value.string_valueSTRING
event_dim.params.value.int_valueINTEGER
event_dim.params.value.double_valueFLOAT
event_dim.timestamp_microsINTEGER
event_dim.previous_timestamp_microsINTEGER

1.5 数据格式

BigQuery用户事件字段

{
    "user_dim": {
      "user_id": null,
      "first_open_timestamp_micros": "1501051290221000",
      "user_properties": [
        {
          "key": "first_open_time",
          "value": {
            "value": {
              "string_value": null,
              "int_value": "1501052400000",
              "float_value": null,
              "double_value": null
            },
            "set_timestamp_usec": "1501051290221000",
            "index": null
          }
        }
      ],
      "device_info": {
        "device_category": "mobile",
        "mobile_brand_name": "Tecno",
        "mobile_model_name": "C9",
        "mobile_marketing_name": "C9",
        "device_model": "TECNO-C9",
        "platform_version": "6.0",
        "device_id": null,
        "resettable_device_id": "b095b68b-d977-43b0-92ad-c1ec93e06f37",
        "user_default_language": "en-us",
        "device_time_zone_offset_seconds": "3600",
        "limited_ad_tracking": "false"
      },
      "geo_info": {
        "continent": "Africa",
        "country": "Cameroon",
        "region": "Centre",
        "city": null
      },
      "app_info": {
        "app_version": "V1.0.22",
        "app_instance_id": "3f5b47995065c6a295a10bfc6add5790",
        "app_store": "com.android.vending",
        "app_platform": "ANDROID",
        "app_id": "com.hatsune.eagleee"
      },
      "traffic_source": null,
      "bundle_info": {
        "bundle_sequence_id": "4",
        "server_timestamp_offset_micros": "2026647"
      },
      "ltv_info": null
    },
    "event_dim": [
      {
        "date": "20170824",
        "name": "user_engagement",
        "params": [
          {
            "key": "firebase_screen_class",
            "value": {
              "string_value": "WelcomeActivity",
              "int_value": null,
              "float_value": null,
              "double_value": null
            }
          },
          {
            "key": "firebase_event_origin",
            "value": {
              "string_value": "auto",
              "int_value": null,
              "float_value": null,
              "double_value": null
            }
          },
          {
            "key": "firebase_screen_id",
            "value": {
              "string_value": null,
              "int_value": "-5619035179041357381",
              "float_value": null,
              "double_value": null
            }
          },
          {
            "key": "engagement_time_msec",
            "value": {
              "string_value": null,
              "int_value": "1017",
              "float_value": null,
              "double_value": null
            }
          }
        ],
        "timestamp_micros": "1503562955447000",
        "previous_timestamp_micros": "1501195415642000",
        "value_in_usd": null
      }
    ]
  }
···
## 2.BigQuery查询指令
查询自定义事件参数:https://cloud.google.com/solutions/mobile/mobile-firebase-analytics-big-query

### 2.1 示例1
手机游戏开发商想知道哪个级别的游戏触发游戏中最多的购买流程。为此,她将以下自定义事件调用添加到玩家触发游戏内购买流程时运行的游戏。
在这个例子中,自定义事件的名称trigger_purchase有一个键level,它指定一个整数值,表示玩家在触发购买时所处的水平。


BigQuery中针对此自定义事件的结果条目如下:
领域	值
`event_dim.name`	trigger_purchase
`event_dim.params.key`	水平
`event_dim.params.value.int_value`	3
为了确定每个级别的购买次数,她写了一个查询,报告“trigger_in_app_purchase”事件计数的分布,按用户当前的级别分组。此查询在1/1/2015至12/31/2016的日期范围内运行。
```sql
SELECT COUNT(event_dim.name) as NumberOfPurchases,
event_dim.params.value.int_value as level
FROM
TABLE_DATE_RANGE(com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'trigger_purchase'
GROUP BY level

2.2 示例2

在同一个游戏中工作的第二个开发者想要确定哪些角色类打败了最多的对手。为此,当玩家击败对手时,他将以下自定义事件记录添加到游戏中。
在这个例子中,自定义事件的名称是defeat_opponent,它有一个键character,它指定一个字符串值,指定对手的名字。
BigTC中针对此自定义事件的结果条目如下:
领域 值
event_dim.name defeat_opponent
event_dim.params.key 字符
event_dim.params.value.string_value 大力神
为了找到最成功的角色类,他写了一个查询,计算一个角色类参与失败对手的自定义事件的次数。

SELECT COUNT(event_dim.name) as DefeatEvents,
event_dim.params.value.string_value as CharacterClass
FROM
TABLE_DATE_RANGE( com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'defeat_opponent'
AND event_dim.params.key = 'character'
GROUP BY CharacterClass
ORDER BY DefeatEvents desc

2.3 示例3

应用程式开发人员希望使用 Firebase Cloud Messaging 向最近未使用该应用程式的使用者传送重新参与式推送通知。要构建通知的联系人列表,开发人员使用以下查询来查找过去一年中没有在过去两周内访问过该应用程序的用户。此查询返回iOS和Android用户以及用户最初打开应用的日期。
在下面的查询中,该user_dim.user_id字段包含开发人员通过调用setUserId API 在应用程序中设置的用户标识 。设置此值后,Firebase Analytics将保留该值并将其包含在与该用户关联的所有将来的行中。该值不会追溯到过去的行。
如果开发人员未设置user_dim.user_id 标识符,则可以使用app_info.app_instance_id包含Firebase生成的默认标识的字段来代替 user_dim.user_id。请注意,Firebase app_info.app_instance_id每次卸载Firebase应用程序并重新安装在同一台设备上时都会生成新的 。
导出到iOS应用程序的BigQuery的数据表是 com_retail_example_IOS.app_events_<date>和Android的相应的数据表com_retail_example_ANDROID.app_events_<date>。

SELECT
  userId,
  DATE(MIN( firstOpenTime )) firstOpenTime
FROM (
  SELECT
    user_dim.user_id AS userId,
    user_dim.first_open_timestamp_micros AS firstOpenTime
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())) )
WHERE
  userId NOT IN (
  SELECT
    user_dim.user_id AS userId
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())) )
GROUP BY
  userId

3. Firebase相关常用查询

3.1 查询从不同平台上下载应用的次数

SELECT user_dim.app_info.app_store, count(distinct(user_dim.app_info.app_instance_id)) as count  FROM [news-246a4:com_hatsune_eagleee_ANDROID.app_events_20171227] group by user_dim.app_info.app_store LIMIT 1000

3.2 新增人数(按appid来计算)

SELECT DATE(TIMESTAMP_MICROS(user_dim.first_open_timestamp_micros)) AS FirstDate, 
COUNT(DISTINCT(user_dim.app_info.app_instance_id)) AS UserCount 
FROM `com_hatsune_eagleee_ANDROID.app_events_*` 
WHERE _TABLE_SUFFIX BETWEEN '20170820' AND '20170826' 
GROUP BY FirstDate 
ORDER BY FirstDate DESC 
LIMIT 200;

3.3 用户互动

SELECT
  event_dim.name,
  event_dim.params.key,
  cast (event_dim.params.value.int_value/1000 AS integer) AS seconds
FROM
  [news-246a4:com_hatsune_eagleee_ANDROID.app_events_20170827]
WHERE
  event_dim.name='user_engagement'
  AND event_dim.params.key='engagement_time_msec'
GROUP BY
  event_dim.name,
  event_dim.params.key,
  seconds
ORDER BY
  seconds DESC
LIMIT
  1000;

3.4 会话时长分布

SELECT
  dt,
  EXACT_COUNT_DISTINCT(id) AS ids,
  SUM(dtm)/1000 AS tdtm
FROM (
  SELECT
    event_dim.date AS dt,
    user_dim.app_info.app_instance_id AS id,
    CONCAT(event_dim.date,'-',user_dim.app_info.app_instance_id) AS ky,
    SUM( event_dim.params.value.int_value) AS dtm
  FROM
    FLATTEN( (
      SELECT
        *
      FROM
        TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_2018010*")')), event_dim.name)
  WHERE
    event_dim.name='user_engagement'
    AND event_dim.params.key='engagement_time_msec'
  GROUP BY
    dt,
    id,
    ky )
WHERE
  ky IN (
  SELECT
    CONCAT(event_dim.date,'-', user_dim.app_info.app_instance_id) AS sessionky
  FROM
    FLATTEN( (
      SELECT
        *
      FROM
        TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_2018010*")')), event_dim.name)
  WHERE
    event_dim.name='session_start' )
GROUP BY
  dt;

3.5 每个国家、每个版本的人均互动时长

user_engagement事件中的计时参数相加得出总时长,与去重后的人数作比值

SELECT
  event_dim.date AS dt,
  user_dim.app_info.app_version AS ver,
  user_dim.geo_info.country AS coun,
  EXACT_COUNT_DISTINCT(user_dim.app_info.app_instance_id) AS ids,
  SUM(event_dim.params.value.int_value)/1000 AS ts
FROM
  FLATTEN( (
    SELECT
      *
    FROM
      TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_201((712*)|(8*))")')), event_dim.params)
WHERE
  event_dim.name='user_engagement'
  AND event_dim.params.key = 'engagement_time_msec'
  AND event_dim.date >= '20171201'
GROUP BY dt, ver, coun
ORDER BY dt ASC;

3.6 各个版本的卸载时间

app_remove的时间与first_open的时间差

SELECT
  version,
  duration,
  EXACT_COUNT_DISTINCT(app_id) AS usercnt
FROM (
  SELECT
    user_dim.app_info.app_instance_id AS app_id,
    user_dim.app_info.app_version AS version,
    USEC_TO_TIMESTAMP(event_dim.timestamp_micros) AS remove_time,
    USEC_TO_TIMESTAMP(user_dim.first_open_timestamp_micros) AS new_time,
    DATEDIFF(event_dim.timestamp_micros, user_dim.first_open_timestamp_micros) AS duration
  FROM
    TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_201((712*)|(801*))")')
  WHERE
    event_dim.name='app_remove'
    AND REGEXP_MATCH(user_dim.app_info.app_version, '^V3.*$' ) ) AS a
GROUP BY
  version,
  duration;

3.7 所有埋点事件及其参数

取一天内所有的事件及参数,去重即可

SELECT
  event_dim.name,
  event_dim.params.key
FROM
  [news-246a4:com_hatsune_eagleee_ANDROID.app_events_20170827]
GROUP BY
  event_dim.name,
  event_dim.params.key;

3.8 各版本某事件的触发人数

取一天内所有的事件及参数,去重即可

SELECT
  user_dim.device_info.platform_version AS osver,
  event_dim.name AS eve,
  EXACT_COUNT_DISTINCT(user_dim.app_info.app_instance_id) AS ids
FROM
  FLATTEN( (
    SELECT
      *
    FROM
      TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_201*")')), event_dim.params)
WHERE
  event_dim.name='first_open'
  OR event_dim.name='app_remove'
GROUP BY
  osver,
  eve;

3.9 会话时长

取开始会话的用户,算互动时长

SELECT
  dt,
  EXACT_COUNT_DISTINCT(id) AS ids,
  SUM(dtm)/1000 AS tdtm
FROM (
  SELECT
    event_dim.date AS dt,
    user_dim.app_info.app_instance_id AS id,
    CONCAT(event_dim.date,'-',user_dim.app_info.app_instance_id) AS ky,
    SUM( event_dim.params.value.int_value) AS dtm
  FROM
    FLATTEN( (
      SELECT
        *
      FROM
        TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_2018010*")')), event_dim.name)
  WHERE
    event_dim.name='user_engagement'
    AND event_dim.params.key='engagement_time_msec'
  GROUP BY
    dt,
    id,
    ky )
WHERE
  ky IN (
  SELECT
    CONCAT(event_dim.date,'-', user_dim.app_info.app_instance_id) AS sessionky
  FROM
    FLATTEN( (
      SELECT
        *
      FROM
        TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_2018010*")')), event_dim.name)
  WHERE
    event_dim.name='session_start' )
GROUP BY
  dt;

3.10 存量用户维度

取所有用户的渠道、版本、机型、品牌、语言、国家,除去己卸载的用户;剩余每个gadid取最晚的一次记录中的维度;最后,按各维度对gadid进行计数

SELECT
  country,
  version,
  language,
  brand,
  model,
  channel,
  EXACT_COUNT_DISTINCT(gadid) AS usercnt
FROM (
  SELECT
    user_dim.device_info.resettable_device_id AS gadid,
    user_dim.geo_info.country AS country,
    user_dim.app_info.app_version AS version,
    user_dim.device_info.user_default_language AS language,
    user_dim.device_info.mobile_brand_name AS brand,
    user_dim.device_info.mobile_model_name AS model,
    user_dim.user_properties.value.value.string_value AS channel
  FROM (
    SELECT
      *
    FROM
      TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_*")'))
  WHERE
    user_dim.user_properties.key='Channel'
    AND user_dim.app_info.app_instance_id NOT IN (
    SELECT
      user_dim.app_info.app_instance_id
    FROM (
      SELECT
        *
      FROM
        TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_*")'))
    WHERE
      event_dim.name='app_remove' )
  GROUP BY
    gadid,
    country,
    version,
    language,
    brand,
    model,
    channel ) a
GROUP BY
  country,
  version,
  language,
  brand,
  model,
  channel;

3.11 会话开始次数24小时分布

取session_start事件,计算各日期各小时的人数次数

SELECT
  STRFTIME_UTC_USEC(event_dim.timestamp_micros, "%Y-%m-%d") AS startdate,
  STRFTIME_UTC_USEC(event_dim.timestamp_micros, "%H") AS starthour,
  COUNT(user_dim.app_info.app_instance_id) AS timescnt,
  EXACT_COUNT_DISTINCT(user_dim.app_info.app_instance_id) AS usercnt
FROM
  FLATTEN( (
    SELECT
      *
    FROM
      TABLE_QUERY([news-246a4:com_hatsune_eagleee_ANDROID], 'REGEXP_MATCH(table_id, r"^app_events_201804(1[5-9],2)*")')), event_dim.params)
WHERE
  event_dim.name='session_start'
GROUP BY
  startdate,
  starthour
ORDER BY
  startdate ASC,
  starthour ASC;
 类似资料: