還沒等老板開口,我已經(jīng)把 ADS 層建好了
這次老板沒催我,我提前就把ADS層給建設(shè)好了。ADS(Application Data Store)層是數(shù)據(jù)倉庫的最頂層,直接面向業(yè)務(wù)應(yīng)用,為數(shù)據(jù)分析、報表展示和業(yè)務(wù)決策提供數(shù)據(jù)支持。它是數(shù)據(jù)倉庫與業(yè)務(wù)系統(tǒng)之間的橋梁,將經(jīng)過清洗、轉(zhuǎn)換和匯總的數(shù)據(jù)以業(yè)務(wù)友好的方式呈現(xiàn)給最終用戶。
今天以我們項目中的實際案例為例進行ADS層建設(shè)思路介紹。數(shù)倉代碼可訪問:這次老板沒催我,我提前就把ADS層給建設(shè)好了。ADS(Application Data Store)層是數(shù)據(jù)倉庫的最頂層,直接面向業(yè)務(wù)應(yīng)用,為數(shù)據(jù)分析、報表展示和業(yè)務(wù)決策提供數(shù)據(jù)支持。它是數(shù)據(jù)倉庫與業(yè)務(wù)系統(tǒng)之間的橋梁,將經(jīng)過清洗、轉(zhuǎn)換和匯總的數(shù)據(jù)以業(yè)務(wù)友好的方式呈現(xiàn)給最終用戶。今天以我們項目中的實際案例為例進行ADS層建設(shè)思路介紹。
數(shù)倉代碼可訪問:
- github:https://github.com/Mrkuhuo/data-warehouse-learning
- gitee:https://gitee.com/wzylzjtn/data-warehouse-learning
一、ADS層建設(shè)思路
1. 設(shè)計原則
- 業(yè)務(wù)導(dǎo)向:ADS層設(shè)計應(yīng)以業(yè)務(wù)需求為核心,確保數(shù)據(jù)能夠直接支持業(yè)務(wù)決策和運營分析。
- 性能優(yōu)化:針對高頻查詢場景進行優(yōu)化,包括合理的分區(qū)策略、索引設(shè)計和物化視圖。
- 數(shù)據(jù)一致性:確保數(shù)據(jù)口徑統(tǒng)一,避免不同報表之間的數(shù)據(jù)不一致。
- 可擴展性:設(shè)計時考慮未來業(yè)務(wù)擴展需求,預(yù)留足夠的擴展空間。
- 易用性:提供簡單直觀的數(shù)據(jù)結(jié)構(gòu),降低業(yè)務(wù)人員使用門檻。
2. 數(shù)據(jù)模型設(shè)計
ADS層通常采用星型模型或雪花模型,主要包含以下幾類表:
- 匯總事實表:按不同維度(時間、地區(qū)、產(chǎn)品等)匯總的事實數(shù)據(jù)。
- 維度表:包含業(yè)務(wù)實體的屬性信息,如客戶、產(chǎn)品、地區(qū)等。
- 指標(biāo)表:存儲預(yù)計算的業(yè)務(wù)指標(biāo),如轉(zhuǎn)化率、留存率等。
- 報表表:直接面向報表展示的寬表,包含多個維度的指標(biāo)。
3. 數(shù)據(jù)更新策略
- 增量更新:對于大表,采用增量更新策略,只處理新增或變更的數(shù)據(jù)。
- 全量刷新:對于小表或需要保證數(shù)據(jù)一致性的場景,采用全量刷新策略。
- 定時調(diào)度:根據(jù)業(yè)務(wù)需求設(shè)置合理的調(diào)度周期,如每日、每周或每月。
二、ADS層應(yīng)用場景
- 業(yè)務(wù)報表:為管理層提供決策支持的各類報表。
- 運營分析:支持運營人員進行用戶行為分析和營銷效果評估。
- 風(fēng)險控制:提供風(fēng)險監(jiān)控和預(yù)警數(shù)據(jù)。
- 客戶服務(wù):支持客服人員進行客戶畫像分析和精準(zhǔn)服務(wù)。
- 產(chǎn)品優(yōu)化:為產(chǎn)品團隊提供用戶反饋和使用數(shù)據(jù),指導(dǎo)產(chǎn)品迭代。
三、實戰(zhàn)案例:用戶價值分析報表
1. 業(yè)務(wù)背景
電商平臺需要對用戶進行價值分層,以便進行精準(zhǔn)營銷和個性化服務(wù)?;赗FM模型(Recency、Frequency、Monetary)對用戶進行價值評估,并計算用戶的生命周期價值,為運營決策提供數(shù)據(jù)支持。 02
2. 數(shù)據(jù)來源
- DWS層:用戶交易寬表(dws_trade_user_order_td)和用戶登錄寬表(dws_user_user_login_td)
- ADS層:歷史用戶價值分析數(shù)據(jù)(用于計算價值發(fā)展趨勢)
3. 實現(xiàn)方案
(1) 建表(ads.ads_user_value_analysis)
-- 用戶價值分析表
CREATE TABLE IF NOT EXISTS ads.ads_user_value_analysis
(
dt DATE COMMENT '統(tǒng)計日期',
user_id BIGINT COMMENT '用戶ID',
order_count_td BIGINT COMMENT '累計下單次數(shù)',
order_amount_td DECIMAL(20,2) COMMENT '累計下單金額',
order_last_date DATE COMMENT '最近下單日期',
order_first_date DATE COMMENT '首次下單日期',
login_count_td BIGINT COMMENT '累計登錄次數(shù)',
login_last_date DATE COMMENT '最近登錄日期',
average_order_amount DECIMAL(20,2) COMMENT '平均客單價',
purchase_cycle_days INT COMMENT '平均購買周期(天)',
account_days INT COMMENT '賬號存續(xù)天數(shù)',
life_time_value DECIMAL(20,2) COMMENT '生命周期價值(LTV)',
recency_score INT COMMENT '最近活躍度評分(R)',
frequency_score INT COMMENT '活動頻次評分(F)',
monetary_score INT COMMENT '消費金額評分(M)',
rfm_score INT COMMENT 'RFM總分',
user_value_level STRING COMMENT '用戶價值分層',
active_status STRING COMMENT '活躍狀態(tài)',
life_cycle_status STRING COMMENT '生命周期狀態(tài)',
shopping_preference STRING COMMENT '購物偏好',
growth_trend STRING COMMENT '價值發(fā)展趨勢'
)
COMMENT '用戶價值分層分析報表,基于RFM模型計算用戶價值'
PARTITIONED BY (dt STRING)
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');
(2) 核心指標(biāo)計算
RFM模型評分:
- Recency(最近購買時間):根據(jù)最近一次購買距今的天數(shù)評分(1-5分)
- Frequency(購買頻率):根據(jù)累計購買次數(shù)評分(1-5分)
- Monetary(消費金額):根據(jù)累計消費金額評分(1-5分)
用戶價值分層:
- 高價值:RFM總分≥13分
- 中高價值:RFM總分10-12分
- 中價值:RFM總分7-9分
- 低價值:RFM總分4-6分 流失風(fēng)險:RFM總分≤3分
生命周期價值(LTV) :
- 計算公式:平均客單價 × 年購買頻率 × 預(yù)期客戶生命周期(年)
- 年購買頻率:總購買次數(shù)×365/賬號存續(xù)天數(shù)
用戶狀態(tài)分類:
- 活躍狀態(tài):基于最近交易和登錄時間
- 生命周期狀態(tài):基于交易行為和訂單歷史
- 購物偏好:基于購買頻率和金額
- 價值發(fā)展趨勢:比較當(dāng)前RFM評分與30天前的評分
(3) 具體邏輯實現(xiàn)
INSERT INTO ads.ads_user_value_analysis
(dt, user_id, order_count_td, order_amount_td, order_last_date, order_first_date,
login_count_td, login_last_date, average_order_amount, purchase_cycle_days, account_days,
life_time_value, recency_score, frequency_score, monetary_score, rfm_score,
user_value_level, active_status, life_cycle_status, shopping_preference, growth_trend)
SELECT
-- 基礎(chǔ)日期統(tǒng)計
date('${pdate}') AS dt, -- 統(tǒng)計日期,使用調(diào)度日期參數(shù)
t1.user_id AS user_id, -- 用戶ID
t1.order_count_td, -- 累計下單次數(shù),來自交易寬表
t1.total_amount_td, -- 累計下單金額,來自交易寬表
-- 格式化日期為yyyy-MM-dd格式
date_format(t1.order_last_date, '%Y-%m-%d') AS order_last_date, -- 最近下單日期
date_format(t1.order_first_date, '%Y-%m-%d') AS order_first_date, -- 首次下單日期
t2.login_count_td, -- 累計登錄次數(shù),來自用戶登錄寬表
date_format(t2.login_last_date, '%Y-%m-%d') AS login_last_date, -- 最近登錄日期
-- 計算衍生指標(biāo)
-- 計算平均客單價 = 總金額/訂單數(shù)
CASE WHEN t1.order_count_td > 0 THEN t1.total_amount_td/t1.order_count_td ELSE 0 END AS average_order_amount,
-- 計算平均購買周期(天) = (最后訂單日期-首次訂單日期)/(訂單數(shù)-1)
CASE WHEN t1.order_count_td > 1
THEN datediff(t1.order_last_date, t1.order_first_date)/(t1.order_count_td-1)
ELSE NULL END AS purchase_cycle_days,
-- 計算賬號存續(xù)天數(shù) = 當(dāng)前日期-注冊日期
datediff(current_date(), t2.register_date) AS account_days,
-- 計算生命周期價值(LTV) = 平均客單價 * 年購買頻率 * 預(yù)期客戶生命周期(年)
-- 年購買頻率計算方式: 總購買次數(shù)*365/賬號存續(xù)天數(shù),即年化購買頻率
-- 預(yù)期客戶生命周期取3年作為默認(rèn)預(yù)估
CASE WHEN t1.order_count_td > 0 AND datediff(current_date(), t2.register_date) > 0
THEN (t1.total_amount_td/t1.order_count_td) * (t1.order_count_td*365/datediff(current_date(), t2.register_date)) * 3
ELSE 0 END AS life_time_value,
-- RFM模型計算 - 為每個維度打分(1-5分)
-- Recency(最近購買時間)評分: 越近分?jǐn)?shù)越高
CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5 -- 30天內(nèi)
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4 -- 31-60天
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3 -- 61-90天
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2 -- 91-180天
ELSE 1 -- 180天以上
END AS recency_score,
-- Frequency(購買頻率)評分: 購買次數(shù)越多分?jǐn)?shù)越高
CASE
WHEN t1.order_count_td >= 20 THEN 5 -- 20次及以上
WHEN t1.order_count_td >= 10 THEN 4 -- 10-19次
WHEN t1.order_count_td >= 5 THEN 3 -- 5-9次
WHEN t1.order_count_td >= 2 THEN 2 -- 2-4次
ELSE 1 -- 1次
END AS frequency_score,
-- Monetary(消費金額)評分: 總消費金額越高分?jǐn)?shù)越高
CASE
WHEN t1.total_amount_td >= 10000 THEN 5 -- 1萬元及以上
WHEN t1.total_amount_td >= 5000 THEN 4 -- 5千-1萬元
WHEN t1.total_amount_td >= 2000 THEN 3 -- 2千-5千元
WHEN t1.total_amount_td >= 500 THEN 2 -- 500-2千元
ELSE 1 -- 500元以下
END AS monetary_score,
-- 計算RFM總分 = R分 + F分 + M分
(CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END) +
(CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END) +
(CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) AS rfm_score,
-- 用戶價值分層: 根據(jù)RFM總分(3-15分)進行分層
CASE
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 13 THEN '高價值' -- 13-15分
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 10 THEN '中高價值' -- 10-12分
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 7 THEN '中價值' -- 7-9分
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 4 THEN '低價值' -- 4-6分
ELSE '流失風(fēng)險' -- 3分
END AS user_value_level,
-- 活躍狀態(tài): 基于最近交易和登錄時間
CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 OR datediff(current_date(), t2.login_last_date) <= 7 THEN '活躍' -- 30天內(nèi)有交易或7天內(nèi)有登錄
WHEN datediff(current_date(), t1.order_last_date) <= 90 OR datediff(current_date(), t2.login_last_date) <= 30 THEN '沉默' -- 90天內(nèi)有交易或30天內(nèi)有登錄
ELSE '流失' -- 超過90天未交易且超過30天未登錄
END AS active_status,
-- 生命周期狀態(tài): 基于交易行為和訂單歷史
CASE
WHEN datediff(current_date(), t1.order_first_date) <= 30 AND t1.order_count_td <= 2 THEN '新用戶' -- 30天內(nèi)首次購買且購買次數(shù)<=2次
WHEN t1.order_count_td >= 3 AND (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END) >= 4 THEN '成長期' -- 購買>=3次且近期活躍(60天內(nèi))
WHEN t1.order_count_td >= 5 AND (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END) >= 3 THEN '成熟期' -- 購買>=5次且90天內(nèi)有購買
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END) <= 2 AND (CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END) >= 3 THEN '衰退期' -- 購買次數(shù)>=5但超過90天未購買
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END) >= 3 AND datediff(t1.order_last_date, t0.lag_order_date) > 90 THEN '回流' -- 最近90天內(nèi)有購買但之前超過90天未購買
ELSE '新用戶' -- 默認(rèn)為新用戶
END AS life_cycle_status,
-- 購物偏好: 基于購買頻率和金額
CASE
WHEN (CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END) >= 4 AND (CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) <= 3 THEN '高頻低額' -- 高頻率低金額: 購買頻繁但單價較低
WHEN (CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END) <= 3 AND (CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 4 THEN '低頻高額' -- 低頻率高金額: 購買較少但大額消費
WHEN (CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END) >= 4 AND (CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) >= 4 THEN '高頻高額' -- 高頻率高金額: 高價值客戶,頻繁且大額
ELSE '低頻低額' -- 低頻率低金額: 低價值客戶
END AS shopping_preference,
-- 價值發(fā)展趨勢: 比較當(dāng)前RFM評分與30天前的評分
CASE
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) > COALESCE(t3.previous_rfm_score, 0) THEN '上升' -- 當(dāng)前分?jǐn)?shù)高于30天前,趨勢上升
WHEN (CASE
WHEN datediff(current_date(), t1.order_last_date) <= 30 THEN 5
WHEN datediff(current_date(), t1.order_last_date) <= 60 THEN 4
WHEN datediff(current_date(), t1.order_last_date) <= 90 THEN 3
WHEN datediff(current_date(), t1.order_last_date) <= 180 THEN 2
ELSE 1
END +
CASE
WHEN t1.order_count_td >= 20 THEN 5
WHEN t1.order_count_td >= 10 THEN 4
WHEN t1.order_count_td >= 5 THEN 3
WHEN t1.order_count_td >= 2 THEN 2
ELSE 1
END +
CASE
WHEN t1.total_amount_td >= 10000 THEN 5
WHEN t1.total_amount_td >= 5000 THEN 4
WHEN t1.total_amount_td >= 2000 THEN 3
WHEN t1.total_amount_td >= 500 THEN 2
ELSE 1
END) < COALESCE(t3.previous_rfm_score, 0) THEN '下降' -- 當(dāng)前分?jǐn)?shù)低于30天前,趨勢下降
ELSE '穩(wěn)定' -- 分?jǐn)?shù)相等,趨勢穩(wěn)定
END AS growth_trend
FROM
(
-- 訂單數(shù)據(jù): 獲取用戶交易相關(guān)信息
SELECT
user_id,
k1,
order_date_last,
LAG(order_date_last, 1, NULL) OVER(PARTITION BY user_id ORDER BY k1) AS lag_order_date -- 獲取上一次最近下單日期,用于計算回流狀態(tài)
FROM dws.dws_trade_user_order_td
WHERE k1 = date('${pdate}') -- 取當(dāng)天分區(qū)數(shù)據(jù)
) t0
JOIN
(
-- 訂單數(shù)據(jù): 獲取用戶交易相關(guān)信息
SELECT
user_id,
SUM(order_count_td) AS order_count_td, -- 累計下單次數(shù)
SUM(total_amount_td) AS total_amount_td, -- 累計下單金額
MAX(order_date_last) AS order_last_date, -- 最近下單日期
MIN(order_date_first) AS order_first_date -- 首次下單日期
FROM dws.dws_trade_user_order_td
WHERE k1 = date('${pdate}') -- 取當(dāng)天分區(qū)數(shù)據(jù)
GROUP BY user_id
) t1
ON t0.user_id = t1.user_id
JOIN
(
-- 登錄數(shù)據(jù): 獲取用戶登錄相關(guān)信息
SELECT
user_id,
SUM(login_count_td) AS login_count_td, -- 累計登錄次數(shù)
MAX(login_date_last) AS login_last_date, -- 最近登錄日期
date('2020-01-01') AS register_date -- 注冊日期,使用默認(rèn)值
FROM dws.dws_user_user_login_td
WHERE k1 = date('${pdate}') -- 取當(dāng)天分區(qū)數(shù)據(jù)
GROUP BY user_id
) t2
ON t1.user_id = t2.user_id
LEFT JOIN
(
-- 上月RFM評分?jǐn)?shù)據(jù): 用于計算價值發(fā)展趨勢
SELECT
user_id,
recency_score + frequency_score + monetary_score AS previous_rfm_score -- 30天前的RFM總分
FROM ads.ads_user_value_analysis
WHERE dt = date_sub(date('${pdate}'), 30) -- 取30天前的數(shù)據(jù)
) t3
ON t1.user_id = t3.user_id;
ADS(Application Data Store)層是數(shù)據(jù)倉庫的最頂層,直接面向業(yè)務(wù)應(yīng)用。其建設(shè)核心是以業(yè)務(wù)需求為導(dǎo)向,將經(jīng)過清洗、轉(zhuǎn)換和匯總的數(shù)據(jù)以業(yè)務(wù)友好的方式呈現(xiàn)。
ADS層設(shè)計應(yīng)遵循業(yè)務(wù)導(dǎo)向、性能優(yōu)化、數(shù)據(jù)一致性、可擴展性和易用性五大原則。主要包含匯總事實表、維度表、指標(biāo)表和報表表等數(shù)據(jù)模型。數(shù)據(jù)更新策略包括增量更新和全量刷新,需根據(jù)業(yè)務(wù)場景選擇。ADS層實現(xiàn)需要高性能存儲引擎(如Doris)和計算框架,并建立完善的數(shù)據(jù)質(zhì)量控制機制。通過ADS層,企業(yè)可以實現(xiàn)精準(zhǔn)營銷、會員運營、風(fēng)險控制、客戶服務(wù)和產(chǎn)品優(yōu)化等業(yè)務(wù)目標(biāo),為決策提供數(shù)據(jù)支持。