Uncovering the Secrets of [Chicken Dinner] Through Data Visualization
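Winner winner, chicken dinner! I played a few rounds of PUBG with friends today and died in all sorts of ways. They even teased me about "the 100 ways a girl can die in PUBG": beaten to death with fists, parachuting onto the edge of a rooftop and falling to my death, turning the game into a racing game and getting killed by someone's driving stunts, or being burned alive by a teammate's Molotov cocktail. For me, this is a game that keeps showing me ways to die I never knew existed. But fun is fun, and I still have to pretend I am addicted to studying, so today I will use real match data to look at how to improve your odds of getting that chicken dinner.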
So let's use Python and R to answer the following soul-searching questions.
First, a look at the data:
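1. Where is it dangerous to land?
As a conscientious player who prefers to lie low, and after countless painful experiences of dying the moment I hit the ground, I flatly refuse to drop into densely built towns like Pochinki (P City). Being poor is fine; staying alive comes first. So which places are the most likely to get you killed on landing? We filter for the locations of players who died within the first 100 seconds and visualize them. On Miramar, the power station, Pecado, the villa area, and El Pozo are the most dangerous, while the train station and the thermal power plant are relatively safe. On Erangel, Pochinki, the military base, the school, the hospital, the nuclear power plant, and the air-raid shelter are all clear danger zones. Surprisingly, loot-rich Georgopol (G Port) is relatively safe.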
- import numpy as np
- import matplotlib.pyplot as plt
- import pandas as pd
- import seaborn as sns
- import matplotlib.cm as cm
- # Note: scipy.misc.imread is no longer available in current SciPy, so plt.imread is used below
- # Load part of the data
- deaths1 = pd.read_csv("deaths/kill_match_stats_final_0.csv")
- deaths2 = pd.read_csv("deaths/kill_match_stats_final_1.csv")
- deaths = pd.concat([deaths1, deaths2], ignore_index=True)
- # Print the first 5 rows to get a feel for the variables
- print(deaths.head(), '\n', len(deaths))
- # The two maps (copied so that the rescaling below does not modify `deaths`)
- miramar = deaths[deaths["map"] == "MIRAMAR"].copy()
- erangel = deaths[deaths["map"] == "ERANGEL"].copy()
- # Heatmap of deaths in the first 100 seconds
- position_data = ["killer_position_x","killer_position_y","victim_position_x","victim_position_y"]
- # Rescale raw coordinates to the pixel size of each map image, then drop rows with a 0 coordinate
- miramar[position_data] = miramar[position_data] * 1000 / 800000
- miramar = miramar[(miramar[position_data] != 0).all(axis=1)]
- erangel[position_data] = erangel[position_data] * 4096 / 800000
- erangel = erangel[(erangel[position_data] != 0).all(axis=1)]
- n = 50000
- mira_early = miramar[miramar["time"] < 100]
- eran_early = erangel[erangel["time"] < 100]
- # Sample at most n rows so that .sample() does not fail on a smaller subset
- mira_sample = mira_early.sample(min(n, len(mira_early)))
- eran_sample = eran_early.sample(min(n, len(eran_early)))
- # Miramar heatmap (keyword arguments as used by seaborn 0.11 and later)
- bg = plt.imread("miramar.jpg")
- fig, ax = plt.subplots(1, 1, figsize=(15, 15))
- ax.imshow(bg)
- sns.kdeplot(x=mira_sample["victim_position_x"], y=mira_sample["victim_position_y"], levels=100, cmap=cm.Reds, alpha=0.9, ax=ax)
- # Erangel heatmap
- bg = plt.imread("erangel.jpg")
- fig, ax = plt.subplots(1, 1, figsize=(15, 15))
- ax.imshow(bg)
- sns.kdeplot(x=eran_sample["victim_position_x"], y=eran_sample["victim_position_y"], levels=100, cmap=cm.Reds, alpha=0.9, ax=ax)
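The rescaling step in the code above maps raw game coordinates onto the pixels of the background image. Here is a minimal plain-Python sketch of that mapping, assuming (as the code above does) that raw coordinates run from 0 to 800000; the helper name `to_pixels` and the sample point are made up for illustration.

```python
WORLD_SIZE = 800_000  # raw coordinate range assumed by the code above


def to_pixels(x, y, image_size):
    """Map a raw (x, y) position onto an image of image_size x image_size pixels."""
    scale = image_size / WORLD_SIZE
    return x * scale, y * scale


# The same raw point lands on different pixels depending on the image resolution.
on_erangel_image = to_pixels(400_000, 200_000, image_size=4096)
on_miramar_image = to_pixels(400_000, 200_000, image_size=1000)
print(on_erangel_image, on_miramar_image)
```

The two image sizes (4096 and 1000) are the ones used in the article's code; if your map images have a different resolution, the scale factor has to change with them.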
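2. Camp, or go out and fight?
Should I hole up in a building or go out and take the enemy head-on? Because matches differ in size, I only use matches with more than 90 players, then filter for team_placement equal to 1, i.e. the teams that actually won. (1) I first computed the average number of kills per winning team. Four-player squads were excluded here, because with that many members the kill counts within a team vary so widely that the average stops being meaningful. (2) I then considered grouping by team and counting the kills of the member who survived to the very end. However, the survival-time variable turns out to record the team's final survival time, so this idea failed. (3) Finally, I took the highest individual kill count within each winning team. Solo matches were excluded here, because in solo mode the player's own kill count is by definition the team maximum. Surprisingly, some kill counts reach 60, which makes me suspect cheating. If you want to win, you still have to go out and practice your aim; camping alone won't do it.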
- library(dplyr)
- library(tidyverse)
- library(data.table)
- library(ggplot2)
- pubg_full <- fread("../agg_match_stats.csv")
- # Average number of kills per winning team (squads of 4 excluded, matches with more than 90 players)
- pubg_winner <- pubg_full %>% filter(team_placement == 1 & party_size < 4 & game_size > 90)
- team_killed <- aggregate(pubg_winner$player_kills, by = list(pubg_winner$match_id, pubg_winner$team_id), FUN = "mean")
- team_killed$death_num <- ceiling(team_killed$x)
- ggplot(data = team_killed) + geom_bar(mapping = aes(x = death_num, y = ..count..), color = "steelblue") +
-   xlim(0, 70) + labs(title = "Number of Death that PUBG Winner team Killed", x = "Number of death")
- # Kills of the last surviving player of each winning team (the abandoned approach described above)
- pubg_winner <- pubg_full %>% filter(team_placement == 1) %>% group_by(match_id, team_id)
- team_leader <- aggregate(player_survive_time ~ player_kills, data = pubg_winner, FUN = "max")
- # Highest individual kill count within each winning team (solo matches excluded)
- pubg_winner <- pubg_full %>% filter(team_placement == 1 & party_size > 1)
- team_leader <- aggregate(pubg_winner$player_kills, by = list(pubg_winner$match_id, pubg_winner$team_id), FUN = "max")
- ggplot(data = team_leader) + geom_bar(mapping = aes(x = x, y = ..count..), color = "steelblue") +
-   xlim(0, 70) + labs(title = "Number of Death that PUBG Winner Killed", x = "Number of death")
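For readers following along in Python, the two aggregations described above (average kills per winning team, and the top killer's count per winning team) can be sketched with pandas `groupby`. This is a pandas stand-in for the R `aggregate` calls rather than the author's code, and the rows below are invented for illustration; only the column names come from the article.

```python
import math

import pandas as pd

# Invented rows; column names follow agg_match_stats as used in the article
toy = pd.DataFrame({
    "match_id":       ["m1", "m1", "m1", "m2", "m2"],
    "team_id":        [7, 7, 9, 3, 3],
    "team_placement": [1, 1, 2, 1, 1],
    "player_kills":   [5, 2, 4, 0, 3],
})

winners = toy[toy["team_placement"] == 1]
per_team = winners.groupby(["match_id", "team_id"])["player_kills"]

mean_kills = per_team.mean().apply(math.ceil)  # rounded up, like ceiling() in the R code
max_kills = per_team.max()                     # kills of each winning team's top killer

print(mean_kills)
print(max_kills)
```

The contrast between the two results is the point the text makes: the mean hides a strong individual performance, while the maximum shows it.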
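3. Which weapon kills the most players?
When you are lucky enough to have a choice of good weapons, do you hesitate over which one to take? Judging from the charts, the M416 and the SCAR are good weapons and also relatively easy to find. The Kar98k is widely regarded as a great one-shot-kill rifle; it ranks lower because it is hard to come by in a match, and landing that one shot takes real skill. A player like me, who picked up a 98k, fitted an 8x scope, and lost it before it had even warmed up for a minute, doesn't deserve it. (facepalm)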
- # Ranking of kill causes (weapons)
- death_causes = deaths['killed_by'].value_counts()
- death_rates = death_causes / death_causes.sum()
- sns.set_context('talk')
- fig = plt.figure(figsize=(30, 10))
- ax = sns.barplot(x=death_rates.index, y=death_rates.values)
- ax.set_title('Rate of Death Causes')
- ax.set_xticklabels(death_rates.index, rotation=90)
- # Top 20 weapons
- rank = 20
- fig = plt.figure(figsize=(20, 10))
- ax = sns.barplot(x=death_rates[:rank].index, y=death_rates[:rank].values)
- ax.set_title('Rate of Death Causes')
- # Label only the bars that are plotted (the top `rank` entries)
- ax.set_xticklabels(death_rates[:rank].index, rotation=90)
- # The two maps separately
- f, axes = plt.subplots(1, 2, figsize=(30, 10))
- axes[0].set_title('Death Causes Rate: Erangel (Top {})'.format(rank))
- axes[1].set_title('Death Causes Rate: Miramar (Top {})'.format(rank))
- counts_er = erangel['killed_by'].value_counts()
- counts_mr = miramar['killed_by'].value_counts()
- rates_er = (counts_er / counts_er.sum())[:rank]
- rates_mr = (counts_mr / counts_mr.sum())[:rank]
- sns.barplot(x=rates_er.index, y=rates_er.values, ax=axes[0])
- sns.barplot(x=rates_mr.index, y=rates_mr.values, ax=axes[1])
- axes[0].set_ylim((0, 0.20))
- axes[0].set_xticklabels(rates_er.index, rotation=90)
- axes[1].set_ylim((0, 0.20))
- axes[1].set_xticklabels(rates_mr.index, rotation=90)
- # Relationship between winning and weapons: kills made by players who placed first
- win = deaths[deaths["killer_placement"] == 1.0]
- win_causes = win['killed_by'].value_counts()
- win_rates = (win_causes / win_causes.sum())[:20]
- sns.set_context('talk')
- fig = plt.figure(figsize=(20, 10))
- ax = sns.barplot(x=win_rates.index, y=win_rates.values)
- ax.set_title('Rate of Death Causes of Win')
- ax.set_xticklabels(win_rates.index, rotation=90)
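The share computation used throughout this section, each weapon's kill count divided by the total, is what pandas `value_counts(normalize=True)` returns directly. A small sketch on invented `killed_by` values (the weapon names are only sample data):

```python
import pandas as pd

# Invented kill log; each entry is the cause of one death
killed_by = pd.Series(
    ["M416", "SCAR-L", "M416", "Kar98k", "M416", "SCAR-L", "Punch", "M416"],
    name="killed_by",
)

shares = killed_by.value_counts(normalize=True)  # sorted from most to least frequent
top2 = shares.head(2)
print(top2)
```

Because the shares are taken over all deaths before slicing, the top-N bars keep the same heights they would have in the full chart.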
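4. Do my teammates' assists help me win?
Sometimes I get knocked down in a moment of carelessness; luckily I crawl fast enough for a teammate to revive me. Looking only at the teams that won, members who ended up being helped once (a player_assists value of 1) account for 29%, so a teammate's assist really does matter. (Stop calling me a useless teammate; I can also choose not to revive you.) There are even players who had teammates rescue them 9 times. You are quite a talent. (smirk)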
- library(dplyr)
- library(tidyverse)
- library(data.table)
- library(ggplot2)
- pubg_full <- fread("E:/aggregate/agg_match_stats_0.csv")
- pubg_winner <- pubg_full %>% filter(team_placement == 1)
- # Number of winning players by assist count
- ggplot(data = pubg_winner) + geom_bar(mapping = aes(x = player_assists, y = ..count..), fill = "#E69F00") +
-   xlim(0, 10) + labs(title = "Number of Player assisted", x = "Number of assists")
- # Same chart as proportions; group = 1 makes ..prop.. relative to all winners instead of to each bar
- ggplot(data = pubg_winner) + geom_bar(mapping = aes(x = player_assists, y = ..prop.., group = 1), fill = "#56B4E9") +
-   xlim(0, 10) + labs(title = "Number of Player assisted", x = "Number of assists")
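5. The closer the enemy, the greater the danger?
Computing the Euclidean distance between the killer_position and victim_position variables and looking at how that straight-line distance is distributed across kills gives a clearly right-skewed distribution. So keep a constant eye on enemies nearby, or you may be eliminated without ever knowing where they were.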
- # Python code: relationship between kills and distance
- import numpy as np
- def get_dist(df):  # distance function
-     # Dividing by 100 gives metres only for raw coordinates; if the frames were
-     # rescaled to image pixels earlier, rebuild them from `deaths` first.
-     dx = df["killer_position_x"] - df["victim_position_x"]
-     dy = df["killer_position_y"] - df["victim_position_y"]
-     # Missing positions become 0, as in the original row-by-row loop
-     return (np.hypot(dx, dy) / 100).fillna(0)
- erangel_dist = erangel.assign(**{'dist(m)': get_dist(erangel)})
- miramar_dist = miramar.assign(**{'dist(m)': get_dist(miramar)})
- f, axes = plt.subplots(1, 2, figsize=(30, 10))
- plot_dist = 150
- axes[0].set_title('Engagement Dist.: Erangel')
- axes[1].set_title('Engagement Dist.: Miramar')
- plot_dist_er = erangel_dist[erangel_dist['dist(m)'] <= plot_dist]
- plot_dist_mr = miramar_dist[miramar_dist['dist(m)'] <= plot_dist]
- # sns.distplot is deprecated in recent seaborn; histplot with a KDE overlay is the closest replacement
- sns.histplot(plot_dist_er['dist(m)'], kde=True, stat="density", ax=axes[0])
- sns.histplot(plot_dist_mr['dist(m)'], kde=True, stat="density", ax=axes[1])
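As a quick sanity check of the distance calculation, here is a single worked case in plain Python. It assumes raw coordinates are in centimetres, which is what the division by 100 in the code above implies; the function name and the two positions are invented.

```python
import math

CM_PER_M = 100  # assumption: raw coordinates are centimetres, as the /100 above implies


def engagement_distance_m(killer_xy, victim_xy):
    """Straight-line (Euclidean) distance between two raw positions, in metres."""
    dx = killer_xy[0] - victim_xy[0]
    dy = killer_xy[1] - victim_xy[1]
    return math.hypot(dx, dy) / CM_PER_M


# Offsets of 30000 and 40000 raw units form a 3-4-5 triangle, i.e. 500 m
d = engagement_distance_m((130_000, 240_000), (100_000, 200_000))
print(d)
```

A kill at this range would fall outside the 150 m window plotted above, which is why the histograms only show the close-range part of the distribution.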
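6. The bigger the team, the longer I live?
Survival analysis on the party_size variable shows that, at the same survival rate, four-player squads survive longer than duos, and duos longer than solo players, so "strength in numbers" is not without merit.
7. Does riding in a vehicle help you live longer?
The cause-of-death analysis shows that quite a few players die to the Bluezone; people naively assume that picking up bandages is enough to outrun it. Survival analysis on the player_dist_ride variable shows that, at the same survival rate, players who drove at some point survive longer than players who only walked. You can't outrun the zone on foot.
8. The more people on the island, the longer I live?
Survival analysis on the game_size variable shows that smaller matches are still the easier ones to survive.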
- # The R code is as follows:
- library(magrittr)
- library(dplyr)
- library(survival)
- library(tidyverse)
- library(data.table)
- library(ggplot2)
- library(survminer)
- pubg_full <- fread("../agg_match_stats.csv")
- # Preprocessing: turn continuous variables into categorical ones
- pubg_sub <- pubg_full %>%
-   filter(player_survive_time < 2100) %>%
-   mutate(drive = ifelse(player_dist_ride > 0, 1, 0)) %>%
-   mutate(size = ifelse(game_size < 33, 1, ifelse(game_size >= 33 & game_size < 66, 2, 3)))
- # Create the survival object (no event indicator, so every time counts as an observed death)
- surv_object <- Surv(time = pubg_sub$player_survive_time)
- fit1 <- survfit(surv_object ~ party_size, data = pubg_sub)
- # Visualize the survival rates
- ggsurvplot(fit1, data = pubg_sub, pval = TRUE, xlab = "Playing time [s]", surv.median.line = "hv",
-            legend.labs = c("SOLO", "DUO", "SQUAD"), ggtheme = theme_light(), risk.table = "percentage")
- fit2 <- survfit(surv_object ~ drive, data = pubg_sub)
- ggsurvplot(fit2, data = pubg_sub, pval = TRUE, xlab = "Playing time [s]", surv.median.line = "hv",
-            legend.labs = c("walk", "walk&drive"), ggtheme = theme_light(), risk.table = "percentage")
- fit3 <- survfit(surv_object ~ size, data = pubg_sub)
- ggsurvplot(fit3, data = pubg_sub, pval = TRUE, xlab = "Playing time [s]", surv.median.line = "hv",
-            legend.labs = c("small", "medium", "big"), ggtheme = theme_light(), risk.table = "percentage")
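Because `Surv()` above is given only a time and no event indicator, every player is treated as an observed death, and the Kaplan-Meier curve reduces to the share of players still alive after each time. The plain-Python sketch below illustrates that idea on invented survival times; `survival_curve` is a hypothetical helper written for this example, not a function from the survival package.

```python
from collections import Counter


def survival_curve(times):
    """Kaplan-Meier estimate when every time is an observed event (no censoring).

    Returns a list of (time, fraction still alive just after that time).
    """
    n_at_risk = len(times)
    surv = 1.0
    curve = []
    for t, deaths in sorted(Counter(times).items()):
        surv *= 1 - deaths / n_at_risk  # conditional chance of surviving past t
        n_at_risk -= deaths
        curve.append((t, surv))
    return curve


# Invented survival times in seconds for two groups
solo = [100, 200, 200, 400]
squad = [300, 600, 900, 1200]
solo_curve = survival_curve(solo)
squad_curve = survival_curve(squad)
print(solo_curve)
print(squad_curve)
```

Comparing groups "at the same survival rate", as the text does, means reading off the time at which each curve crosses a given height; in this toy data the solo curve drops to one half much earlier than the squad curve.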
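9. Where is the final circle likely to be?
For someone like me, skilled enough at hiding to make it to the very end, how can the location of the final circle be predicted? From the agg_match_stats table, take the first-placed teams, group them by match_id, and find the maximum player_survive_time within each group. Then match those matches against the kill_match_stats_final table and take the position where the second-placed player died. The plots show that the final circles on Miramar are clearly more concentrated, most likely around Pecado, San Martín, and the villa area. Erangel is more random, but the military base and the mountainous areas still look more likely to host the final circle.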
- # Final circle location
- import matplotlib.pyplot as plt
- import pandas as pd
- import seaborn as sns
- import matplotlib.cm as cm
- # Note: scipy.misc.imread is no longer available in current SciPy, so plt.imread is used below
- # Load part of the data
- deaths = pd.read_csv("deaths/kill_match_stats_final_0.csv")
- # Load the aggregate data
- aggregate = pd.read_csv("aggregate/agg_match_stats_0.csv")
- print(aggregate.head())
- # Find where the last players died
- team_win = aggregate[aggregate["team_placement"] == 1]  # first-placed teams
- # For each match, the longest-surviving player(s) of the winning team
- longest = team_win.groupby("match_id")["player_survive_time"].transform("max")
- grouped = team_win[team_win["player_survive_time"] == longest]
- deaths_solo = deaths[deaths['match_id'].isin(grouped['match_id'].values)]
- deaths_solo_er = deaths_solo[deaths_solo['map'] == 'ERANGEL']
- deaths_solo_mr = deaths_solo[deaths_solo['map'] == 'MIRAMAR']
- # Death records of the second-placed players (copied so the rescaling below is safe)
- df_second_er = deaths_solo_er[deaths_solo_er['victim_placement'] == 2].dropna().copy()
- df_second_mr = deaths_solo_mr[deaths_solo_mr['victim_placement'] == 2].dropna().copy()
- print(df_second_er)
- position_data = ["killer_position_x","killer_position_y","victim_position_x","victim_position_y"]
- # Rescale raw coordinates to the pixel size of each map image, then drop rows with a 0 coordinate
- df_second_mr[position_data] = df_second_mr[position_data] * 1000 / 800000
- df_second_mr = df_second_mr[(df_second_mr[position_data] != 0).all(axis=1)]
- df_second_er[position_data] = df_second_er[position_data] * 4096 / 800000
- df_second_er = df_second_er[(df_second_er[position_data] != 0).all(axis=1)]
- # Erangel heatmap (keyword arguments as used by seaborn 0.11 and later)
- sns.set_context('talk')
- bg = plt.imread("erangel.jpg")
- fig, ax = plt.subplots(1, 1, figsize=(15, 15))
- ax.imshow(bg)
- sns.kdeplot(x=df_second_er["victim_position_x"], y=df_second_er["victim_position_y"], cmap=cm.Blues, alpha=0.7, fill=True, ax=ax)
- # Miramar heatmap
- bg = plt.imread("miramar.jpg")
- fig, ax = plt.subplots(1, 1, figsize=(15, 15))
- ax.imshow(bg)
- sns.kdeplot(x=df_second_mr["victim_position_x"], y=df_second_mr["victim_position_y"], cmap=cm.Blues, alpha=0.8, fill=True, ax=ax)
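The selection logic above (matches that have a winning team, then the death record of the second-placed player, then rows with usable coordinates) can be checked on a few invented rows. This is a simplified sketch of the filtering only, under the assumption that the column names match those used in the article; the data values are made up.

```python
import pandas as pd

# Invented stand-ins for agg_match_stats and kill_match_stats_final
aggregate = pd.DataFrame({
    "match_id":            ["m1", "m1", "m2", "m2"],
    "team_placement":      [1, 2, 1, 2],
    "player_survive_time": [1800.0, 1750.0, 1900.0, 1850.0],
})
deaths = pd.DataFrame({
    "match_id":          ["m1", "m1", "m2", "m3"],
    "victim_placement":  [2.0, 5.0, 2.0, 2.0],
    "victim_position_x": [400_000.0, 100_000.0, 0.0, 300_000.0],
    "victim_position_y": [400_000.0, 100_000.0, 250_000.0, 300_000.0],
})

# Matches for which the aggregate table contains a first-placed team
winning_matches = aggregate.loc[aggregate["team_placement"] == 1, "match_id"].unique()

# Death records of second-placed players in those matches
second = deaths[deaths["match_id"].isin(winning_matches) & (deaths["victim_placement"] == 2)]

# Drop rows whose position was recorded as 0, as the article's code does
second = second[(second["victim_position_x"] != 0) & (second["victim_position_y"] != 0)]
print(second)
```

Only the `m1` row survives here: `m3` has no entry in the aggregate table, and the `m2` record has a zero coordinate. The remaining positions are what the heatmaps use as a proxy for the final circle.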