Big Data in Practice (39): E-commerce Data Warehouse (32): User Behavior Data Warehouse (18): Cumulative Visit Count per User

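0 Cumulative visit count per user

The expected result looks like this. Subtotal is the user's visit count on that date, and Total is the user's running sum of subtotals up to and including that date.

User    Date          Subtotal    Total
mid1    2019-12-14    10          10
mid1    2019-02-11    12          22
mid2    2019-12-14    15          15
mid2    2019-02-11    12          27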

1 DWS

1.1 Table creation statement

use gmall;
drop table if exists dws_user_total_count_day;
create external table dws_user_total_count_day(
    `mid_id` string COMMENT 'device id',
    `subtotal` bigint COMMENT 'daily login subtotal'
)
partitioned by(`dt` string)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/dws/dws_user_total_count_day';

1.2 Load data

-----------------------------Requirement 9: cumulative visit count per user-----------------------
-- Insert data into dws_user_total_count_day
-----------------------------Source table---------------------
-- dwd_start_log (start-up log table)
-----------------------------Approach-----------------------
-- Each time a user opens the app, one start-up log record is produced.
-- Query the start-up log table for the given day, group by user (mid_id),
-- and count the start-up log records of each user (count).
-----------------------------SQL------------------------
insert overwrite table dws_user_total_count_day PARTITION(dt='2020-02-18')
SELECT
    mid_id,
    count(*) subtotal
FROM dwd_start_log
where dt='2020-02-18'
GROUP by mid_id;
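
The grouping logic above can be checked outside Hive on a handful of rows. The following is a minimal sketch, not part of the original pipeline: `count_starts_per_user` is a hypothetical helper, and the input is made-up tab-separated data with only two columns (mid_id, dt), whereas the real dwd_start_log has more fields.

```shell
#!/bin/sh
# Hypothetical helper: counts start-up log rows per mid_id for one day.
# $1    = target date (YYYY-MM-DD)
# stdin = tab-separated rows "mid_id<TAB>dt" (assumed, simplified layout)
count_starts_per_user() {
    awk -F'\t' -v d="$1" '
        $2 == d { n[$1]++ }                              # where dt = d ... group by mid_id
        END { for (m in n) printf "%s\t%d\n", m, n[m] }  # mid_id, count(*)
    ' | sort
}

# Made-up sample: mid1 starts the app twice on 2020-02-18, mid2 once.
printf 'mid1\t2020-02-18\nmid2\t2020-02-18\nmid1\t2020-02-18\nmid1\t2020-02-17\n' \
    | count_starts_per_user 2020-02-18
# prints:
# mid1	2
# mid2	1
```

The row dated 2020-02-17 is ignored, which mirrors the partition filter in the where clause.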

1.3 Data load script

dws_user_total_count_day.sh

#!/bin/bash
# Use the date passed as the first argument; default to yesterday.
if [ -n "$1" ]
then
    do_date=$1
else
    do_date=$(date -d yesterday +%F)
fi

echo "===Log date: $do_date==="

sql="
use gmall;
insert overwrite table dws_user_total_count_day PARTITION(dt='$do_date')
SELECT
    mid_id,
    count(*) subtotal
FROM dwd_start_log
where dt='$do_date'
GROUP by mid_id;
"
hive -e "$sql"
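
Because the script pastes its first argument straight into the SQL string, it may be worth validating that argument before use. The following is a minimal sketch under stated assumptions: `resolve_do_date` is a hypothetical helper that does not appear in the original script, and it relies on GNU date, which the script already requires for `date -d yesterday`.

```shell
#!/bin/sh
# Hypothetical helper: prints the date to process, or fails on bad input.
resolve_do_date() {
    arg="$1"
    if [ -z "$arg" ]; then
        # No argument: fall back to yesterday, as the original script does.
        date -d yesterday +%F
        return 0
    fi
    case "$arg" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            # Shape is YYYY-MM-DD; let GNU date reject impossible dates.
            if date -d "$arg" +%F >/dev/null 2>&1; then
                echo "$arg"
                return 0
            fi
            ;;
    esac
    echo "invalid date: $arg" >&2
    return 1
}

do_date=$(resolve_do_date "2020-02-18") && echo "===Log date: $do_date==="
```

In the load script, the if/else block could be replaced by `do_date=$(resolve_do_date "$1") || exit 1`, so that a malformed argument stops the run before any partition is overwritten.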

2 ADS

2.1 Table creation statement

drop table if exists ads_user_total_count;
create external table ads_user_total_count(
    `mid_id` string COMMENT 'device id',
    `subtotal` bigint COMMENT 'daily login subtotal',
    `total` bigint COMMENT 'cumulative login total'
)
partitioned by(`dt` string)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/ads/ads_user_total_count';

2.2 Load data

-----------------------------Requirement: cumulative visit count per user at the ADS layer-----------------------
-----------------------------Source table---------------------
-- dws_user_total_count_day
-----------------------------Approach-----------------------
-- From dws_user_total_count_day, take each user's login count for the current day (t1),
-- then sum each user's daily login counts over all partitions up to and including that day (t2).
-----------------------------SQL------------------------
insert overwrite table ads_user_total_count PARTITION(dt='2020-02-18')
SELECT
    t1.mid_id,
    t1.subtotal,
    t2.total
from
(select mid_id,subtotal
from dws_user_total_count_day
where dt='2020-02-18') t1
JOIN
(select mid_id,sum(subtotal) total
FROM dws_user_total_count_day
where dt<='2020-02-18'
GROUP by mid_id) t2
on t1.mid_id=t2.mid_id;
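
To make the t1/t2 join concrete, the same arithmetic can be reproduced on a few rows with awk. This is only an illustrative sketch: `ads_rows` is a hypothetical helper, and the input rows are made up, laid out as the three tab-separated values mid_id, dt, subtotal.

```shell
#!/bin/sh
# Hypothetical helper: emulates the ADS query on tab-separated rows.
# $1    = target date (YYYY-MM-DD)
# stdin = rows "mid_id<TAB>dt<TAB>subtotal" (assumed layout)
ads_rows() {
    awk -F'\t' -v d="$1" '
        $2 <= d { total[$1] += $3 }   # t2: sum(subtotal) where dt <= d, per mid_id
        $2 == d { day[$1] = $3 }      # t1: subtotal where dt = d
        END {
            # Inner join: only users that have a row on the target date.
            for (m in day) printf "%s\t%d\t%d\n", m, day[m], total[m]
        }
    ' | sort
}

# Made-up sample: mid3 has no row on 2020-02-18, so the join drops it.
printf 'mid1\t2020-02-17\t10\nmid1\t2020-02-18\t12\nmid2\t2020-02-17\t15\nmid2\t2020-02-18\t12\nmid3\t2020-02-17\t5\n' \
    | ads_rows 2020-02-18
# prints:
# mid1	12	22
# mid2	12	27
```

The date comparison works as a plain string comparison because the dt values use the fixed-width YYYY-MM-DD format. The sample also shows a consequence of the inner join: a user with no start-up on the target day gets no row in that day's ADS partition.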

2.3 Data load script

ads_user_total_count.sh

#!/bin/bash
# Use the date passed as the first argument; default to yesterday.
if [ -n "$1" ]
then
    do_date=$1
else
    do_date=$(date -d yesterday +%F)
fi

echo "===Log date: $do_date==="

sql="
use gmall;
insert overwrite table ads_user_total_count PARTITION(dt='$do_date')
SELECT
    t1.mid_id,
    t1.subtotal,
    t2.total
from
(select mid_id,subtotal
from dws_user_total_count_day
where dt='$do_date') t1
JOIN
(select mid_id,sum(subtotal) total
FROM dws_user_total_count_day
where dt<='$do_date'
GROUP by mid_id) t2
on t1.mid_id=t2.mid_id;
"
hive -e "$sql"
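
Since the ADS total for a day sums every DWS partition up to that day, backfilling a range of dates means running the DWS script before the ADS script for each date, in ascending order. The following sketch only prints the commands it would run. `date_range` is a hypothetical helper, GNU date is assumed, and the script paths are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints every date from $1 to $2 inclusive, one per line.
# Uses UTC so that daylight-saving changes cannot skip or repeat a day.
date_range() {
    cur="$1"
    while [ "$(date -u -d "$cur" +%s)" -le "$(date -u -d "$2" +%s)" ]; do
        echo "$cur"
        cur=$(date -u -d "$cur + 1 day" +%F)
    done
}

# Dry run: DWS first, then ADS, for each day in the range.
for day in $(date_range 2020-02-16 2020-02-18); do
    echo "would run: dws_user_total_count_day.sh $day && ads_user_total_count.sh $day"
done
```

Replacing the `echo` with the two actual script invocations would turn the dry run into a real backfill.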