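Big Data in Practice (39): E-commerce Data Warehouse (32), User Behavior Data Warehouse (18): Cumulative Visit Count per User
0 Cumulative visit count per user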
The expected result looks like this:
User  Date        Subtotal  Total
mid1  2019-12-14  10        10
mid1  2019-02-11  12        22
mid2  2019-12-14  15        15
mid2  2019-02-11  12        27
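The Total column is the running sum of Subtotal within each user, taken in row order. A minimal sketch of that arithmetic on the four sample rows, using awk rather than Hive (the function name `running_total` is hypothetical and exists only for this illustration):

```shell
#!/bin/bash
# running_total: hypothetical helper, not part of the original scripts.
# Reads "user date subtotal" rows and appends a per-user running total.
running_total() {
    awk '{ total[$1] += $3; print $1, $2, $3, total[$1] }'
}

running_total <<'EOF'
mid1 2019-12-14 10
mid1 2019-02-11 12
mid2 2019-12-14 15
mid2 2019-02-11 12
EOF
```

Run as written, this prints the same four rows as the table, with the fourth field matching the Total column.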
1 DWS layer
1.1 Table creation statement
hive (gmall)>
drop table if exists dws_user_total_count_day;
create external table dws_user_total_count_day(
    `mid_id` string COMMENT 'device id',
    `subtotal` bigint COMMENT 'daily login subtotal'
)
partitioned by(`dt` string)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/dws/dws_user_total_count_day';
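1.2 Loading data
-----------------------------Requirement 9: cumulative visit count per user-----------------------
Insert data into dws_user_total_count_day
-----------------------------Related tables---------------------
dwd_start_log (start log table)
-----------------------------Approach-----------------------
Each time a user opens the app, one start log record is produced.
Query the start log table for the given day, group by user (mid_id), and count the start log records each user produced.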
-----------------------------SQL------------------------
insert overwrite table dws_user_total_count_day PARTITION(dt='2020-02-18')
SELECT
mid_id,
count(*) subtotal
FROM dwd_start_log
where dt='2020-02-18'
GROUP by mid_id;
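To sanity-check the aggregation without a Hive cluster, the same count-per-user logic can be sketched with awk over a tiny sample. The input layout (mid_id and dt as two whitespace-separated fields) and the function name `count_starts` are assumptions made for illustration; the real columns of dwd_start_log are not reproduced here.

```shell
#!/bin/bash
# count_starts: hypothetical stand-in for
#   SELECT mid_id, count(*) FROM dwd_start_log WHERE dt=<day> GROUP BY mid_id
# Input rows: "<mid_id> <dt>", one per app start (assumed layout).
# Usage: count_starts <day>
count_starts() {
    awk -v day="$1" '
        $2 == day { n[$1]++ }                      # keep only the requested day
        END { for (m in n) print m "\t" n[m] }     # one line per user
    ' | sort
}

count_starts 2020-02-18 <<'EOF'
mid1 2020-02-18
mid2 2020-02-18
mid1 2020-02-18
mid1 2020-02-17
EOF
```

The row dated 2020-02-17 is filtered out, so mid1 is counted twice and mid2 once for 2020-02-18.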
1.3 Data load script
dws_user_total_count_day.sh
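#!/bin/bash
# Use the date passed as the first argument; otherwise default to yesterday.
if [ -n "$1" ]
then
    do_date=$1
else
    do_date=$(date -d yesterday +%F)
fi
echo "===Log date is $do_date==="
# The tables were created in the gmall database, so switch to it first.
sql="
use gmall;
insert overwrite table dws_user_total_count_day PARTITION(dt='$do_date')
SELECT
    mid_id,
    count(*) subtotal
FROM dwd_start_log
where dt='$do_date'
GROUP by mid_id;
"
hive -e "$sql"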
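The date handling at the top of the script can be checked on its own before any Hive job is run. A minimal sketch, assuming GNU date is available (the function name `resolve_do_date` is hypothetical and does not appear in the original script):

```shell
#!/bin/bash
# resolve_do_date: hypothetical helper that isolates the script's date logic.
# Prints its first argument if non-empty, otherwise yesterday's date (YYYY-MM-DD).
# Note: `date -d` is GNU date syntax; BSD/macOS date needs a different form.
resolve_do_date() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        date -d yesterday +%F
    fi
}

resolve_do_date 2020-02-18   # explicit date is passed through unchanged
resolve_do_date              # no argument: falls back to yesterday
```

Passing the date explicitly is what makes it possible to re-run the load for an earlier day, while the default covers a job scheduled once per day.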
2 ADS layer
2.1 Table creation statement
drop table if exists ads_user_total_count;
create external table ads_user_total_count(
    `mid_id` string COMMENT 'device id',
    `subtotal` bigint COMMENT 'daily login subtotal',
    `total` bigint COMMENT 'cumulative login total'
)
partitioned by(`dt` string)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/ads/ads_user_total_count';
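2.2 Loading data
-----------------------------Requirement: cumulative visit count per user at the ADS layer-----------------------
-----------------------------Related tables---------------------
dws_user_total_count_day
-----------------------------Approach-----------------------
From dws_user_total_count_day, take each user's login count for the current day (t1),
then take the sum of that user's daily login counts over all days up to and including the current day (t2), and join the two on mid_id.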
-----------------------------SQL------------------------
insert overwrite table ads_user_total_count PARTITION(dt='2020-02-18')
SELECT
t1.mid_id,
t1.subtotal,
t2.total
from
(select mid_id,subtotal
from dws_user_total_count_day
where dt='2020-02-18') t1
JOIN
(select mid_id,sum(subtotal) total
FROM dws_user_total_count_day
where dt<='2020-02-18'
GROUP by mid_id) t2
on t1.mid_id=t2.mid_id;
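One consequence of the inner join is worth seeing on data: a user with no row on the current day does not appear in that day's ADS partition, even if earlier days contributed to their total. The sketch below emulates the query with awk on made-up rows; the function name `ads_rows` and the sample values are hypothetical, and the date comparison relies on YYYY-MM-DD strings sorting in chronological order.

```shell
#!/bin/bash
# ads_rows: hypothetical awk emulation of the t1 JOIN t2 query.
# Input rows: "<mid_id> <dt> <subtotal>" (one row per user per day, as in DWS).
# Usage: ads_rows <do_date>
ads_rows() {
    awk -v day="$1" '
        $2 <= day { total[$1] += $3 }    # t2: sum of subtotals up to and including day
        $2 == day { sub_day[$1] = $3 }   # t1: the subtotal on day itself
        END {
            # inner join: only users that have a row on day are emitted
            for (m in sub_day) print m "\t" sub_day[m] "\t" total[m]
        }' | sort
}

ads_rows 2020-02-18 <<'EOF'
mid1 2020-02-17 10
mid1 2020-02-18 12
mid2 2020-02-17 15
mid3 2020-02-18 4
EOF
```

Here mid1 gets subtotal 12 and total 22, mid3 gets 4 and 4, and mid2 is dropped because it has no row on 2020-02-18.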
2.3 Data load script
ads_user_total_count.sh
#!/bin/bash
# Use the date passed as the first argument; otherwise default to yesterday.
if [ -n "$1" ]
then
    do_date=$1
else
    do_date=$(date -d yesterday +%F)
fi
echo "===Log date is $do_date==="
sql="
use gmall;
insert overwrite table ads_user_total_count PARTITION(dt='$do_date')
SELECT
    t1.mid_id,
    t1.subtotal,
    t2.total
from
    (select mid_id,subtotal
     from dws_user_total_count_day
     where dt='$do_date') t1
JOIN
    (select mid_id,sum(subtotal) total
     FROM dws_user_total_count_day
     where dt<='$do_date'
     GROUP by mid_id) t2
on t1.mid_id=t2.mid_id;
"
hive -e "$sql"