Another Data-Processing Powerhouse: Accelerating Pandas with the GPU!

Author: 郭小喵玩AI 2023-12-13 13:23:21

This article demonstrates how to use cuDF by running all of the code on Python 3.10 with an Nvidia T4 GPU.

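NVIDIA's RAPIDS cuDF is a Python GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF is built on libcudf, a highly efficient C++/CUDA dataframe library that uses the Apache Arrow columnar format, and it exposes a GPU-accelerated Pandas API. It relies on NVIDIA CUDA for low-level compute optimization, so it can take full advantage of GPU parallelism and high-bandwidth memory, as shown in the figure below.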

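cuDF also includes a "zero code change" Pandas accelerator (cudf.pandas) that runs Pandas code on the GPU. It supports the Pandas API and automatically falls back to pandas on the CPU for operations it cannot run on the GPU.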
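To make the fallback idea concrete, here is a minimal, purely conceptual sketch of a "try the fast path, fall back to the slow path" dispatcher. It is not how cudf.pandas is implemented; `FallbackDispatcher`, `fast_sum`, and `slow_sum` are hypothetical names invented for this illustration, and the "fast" function only stands in for a GPU code path.

```python
from typing import Any, Callable, List


class FallbackDispatcher:
    """Call a fast implementation first; use a slow one if the fast one is unsupported."""

    def __init__(self, fast: Callable[..., Any], slow: Callable[..., Any]) -> None:
        self.fast = fast
        self.slow = slow
        self.log: List[str] = []  # records which path handled each call

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        try:
            result = self.fast(*args, **kwargs)
        except NotImplementedError:
            # The fast path declined the input, so run the slow path instead.
            self.log.append("slow")
            return self.slow(*args, **kwargs)
        self.log.append("fast")
        return result


def fast_sum(values):
    # Hypothetical stand-in for a GPU path that only handles numeric input.
    if not all(isinstance(v, (int, float)) for v in values):
        raise NotImplementedError("non-numeric input is not supported on the fast path")
    return sum(values)


def slow_sum(values):
    # Hypothetical stand-in for the CPU path: slower, but accepts numeric strings too.
    return sum(float(v) for v in values)


dispatch_sum = FallbackDispatcher(fast_sum, slow_sum)
numeric_total = dispatch_sum([1, 2, 3])   # handled by the fast path
string_total = dispatch_sum(["1", "2"])   # falls back to the slow path
print(dispatch_sum.log)
```

The caller never has to know which path ran, which is the same convenience the article attributes to cudf.pandas.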

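In short, a good use case for cuDF is as a "substitute for parallelization": when Pandas is too slow, switching to cuDF spares you from writing tedious parallel code.

The following sample code uses cuDF to accelerate Pandas API data-processing operations.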

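# Enable GPU acceleration for the Pandas API.
# (The comment sits on its own line because a line magic treats everything after its name as arguments.)
%load_ext cudf.pandas

import pandas as pd

# Column arithmetic, a grouped mean, and a rolling sum, all accelerated on the GPU
df = pd.read_csv("/path/to/file")
df["col_a"] = df["col_b"] * 100
df.groupby("col_a").mean()
df.rolling(window=3).sum()

# This operation is not supported by cuDF, so it automatically falls back to the CPU
df.apply(set, axis=1)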

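Next, we walk through cuDF by running all of the code on Python 3.10 with an Nvidia T4 GPU.

Environment Setup

CUDA 11.2+
NVIDIA driver 450.80.02+
Pascal architecture or newer (compute capability >= 6.0)

Verify the Setup

First, verify that an NVIDIA GPU is actually in use.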

!nvidia-smi
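The requirements listed above can also be checked programmatically. The sketch below parses one CSV line of nvidia-smi output and compares it with the driver and compute-capability minimums from this article. It assumes your nvidia-smi supports the `compute_cap` query field, which older drivers may not; the helper names `parse_version`, `meets_requirements`, and `query_gpus` are made up for this example.

```python
import subprocess

# Minimums taken from the requirements list in this article.
MIN_DRIVER = (450, 80, 2)
MIN_COMPUTE_CAP = (6, 0)


def parse_version(text):
    """Turn a dotted version such as '450.80.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirements(csv_line):
    """Check one 'name, compute_cap, driver_version' line against the minimums."""
    # Split from the right so a comma inside the GPU name cannot shift the fields.
    _name, compute_cap, driver = [f.strip() for f in csv_line.rsplit(",", 2)]
    return (
        parse_version(compute_cap) >= MIN_COMPUTE_CAP
        and parse_version(driver) >= MIN_DRIVER
    )


def query_gpus():
    """Return one CSV line per GPU (assumes nvidia-smi is installed and on PATH)."""
    completed = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,compute_cap,driver_version",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in completed.stdout.splitlines() if line.strip()]


# Example with hand-written sample lines, so it runs without a GPU:
t4_ok = meets_requirements("Tesla T4, 7.5, 525.105.17")
k80_ok = meets_requirements("Tesla K80, 3.7, 470.82.01")
print(t4_ok, k80_ok)
```

On a machine with a GPU, `all(meets_requirements(line) for line in query_gpus())` would apply the same check to the real hardware.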

Install the cuDF Library

!pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com

Import the Library

import cudf
cudf.__version__

Download the Dataset

!wget https://data.rapids.ai/datasets/nyc_parking/nyc_parking_violations_2022.parquet

Data Analysis with Standard Pandas

import pandas as pd

# Read only the five columns we need
df = pd.read_parquet(
    "nyc_parking_violations_2022.parquet",
    columns=["Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number"]
)

# Display a random sample of 10 rows
df.sample(10)

Now add execution-time measurement to the code cells.

%%time
# %%time measures the execution time of this cell; it must be the first line,
# and nothing (not even a comment) may follow it on the same line.

# Read only the five columns we need
df = pd.read_parquet(
    "nyc_parking_violations_2022.parquet",
    columns=["Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number"]
)

# Count each ("Registration State", "Violation Description") pair,
# group the counts by "Registration State",
# keep the most frequent "Violation Description" in each group,
# then sort the result and reset the index.
(df[["Registration State", "Violation Description"]]
 .value_counts()
 .groupby("Registration State")
 .head(1)
 .sort_index()
 .reset_index()
)
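To see what the chained expression above actually produces, here is the same pipeline on a tiny hand-made frame. The toy data is invented for illustration and is not taken from the parking-violations dataset. The pipeline relies on `value_counts()` returning counts in descending order, so `head(1)` picks the most frequent description per state.

```python
import pandas as pd

# Invented toy data: NY has "A" twice and "B" once; NJ has "D" twice and "C" once.
toy = pd.DataFrame({
    "Registration State": ["NY", "NY", "NY", "NJ", "NJ", "NJ"],
    "Violation Description": ["A", "A", "B", "C", "D", "D"],
})

top_violation = (
    toy[["Registration State", "Violation Description"]]
    .value_counts()                  # counts per (state, description), sorted descending
    .groupby("Registration State")   # group by the index level named "Registration State"
    .head(1)                         # first row per group = most frequent description
    .sort_index()                    # order by state, then description
    .reset_index()                   # turn the index levels back into columns
)
print(top_violation)
```

The result has one row per state: NJ paired with "D" and NY paired with "A", each with a count of 2. The name of the count column depends on the pandas version.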

%%time
# Measure the execution time of this cell

# Group by "Vehicle Body Type",
# count the "Summons Number" values with agg,
# rename the count column to "Count",
# then sort by the count in descending order.
(df
 .groupby(["Vehicle Body Type"])
 .agg({"Summons Number": "count"})
 .rename(columns={"Summons Number": "Count"})
 .sort_values(["Count"], ascending=False)
)
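The `%%time` magic only works inside IPython or Jupyter. When the same code runs as a plain script, a small timer built on `time.perf_counter` can play a similar role. This is a minimal sketch; the context manager `timed` and the dictionary `timings` are names invented for this example, and the workload is a stand-in rather than the article's DataFrame code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed time even if the block raises.
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum of squares", timings):
    # Stand-in workload; in practice this would be the DataFrame pipeline.
    total = sum(i * i for i in range(100_000))

print(f"sum of squares took {timings['sum of squares']:.4f} s")
```

Collecting the durations in a dictionary makes it easy to compare the same label across a CPU run and a GPU run afterwards.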

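Data Analysis with cudf.pandas

Next, rerun the earlier Pandas code with the cudf.pandas extension. In a notebook, the cudf.pandas extension should normally be loaded before pandas is imported, so restarting the kernel is recommended to reproduce that order.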

get_ipython().kernel.do_shutdown(restart=True)

%load_ext cudf.pandas
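Outside a notebook there is no `%load_ext`. The cuDF documentation describes two alternatives: running a script with `python -m cudf.pandas script.py`, or calling `cudf.pandas.install()` before pandas is imported. The sketch below wraps the second option so that a script still runs on machines without cuDF; the wrapper `enable_gpu_pandas` is a hypothetical helper name, not part of cuDF.

```python
def enable_gpu_pandas() -> bool:
    """Try to activate cudf.pandas; return False if cuDF is not installed."""
    try:
        import cudf.pandas
    except ImportError:
        # cuDF is not available, so plain pandas will be used on the CPU.
        return False
    cudf.pandas.install()
    return True


# Call this before the first `import pandas` anywhere in the program,
# mirroring the "load the extension first" rule for notebooks.
gpu_enabled = enable_gpu_pandas()
print("cudf.pandas active:", gpu_enabled)
```

Only `ImportError` is caught here; a machine that has cuDF installed but no usable GPU may fail differently, and this sketch does not handle that case.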

%%time

import pandas as pd

df = pd.read_parquet(
    "nyc_parking_violations_2022.parquet",
    columns=["Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number"]
)

(df[["Registration State", "Violation Description"]]
 .value_counts()
 .groupby("Registration State")
 .head(1)
 .sort_index()
 .reset_index()
)

As the cell timings show, the same operation runs noticeably faster with cudf.pandas!

%%time

(df
 .groupby(["Vehicle Body Type"])
 .agg({"Summons Number": "count"})
 .rename(columns={"Summons Number": "Count"})
 .sort_values(["Count"], ascending=False)
)
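The group-and-count pattern used in this cell is easier to follow on a few rows. The frame below is invented for illustration (the body-type codes and summons numbers are made up), and the code is ordinary pandas, so it runs the same way with or without cudf.pandas.

```python
import pandas as pd

# Invented toy data: three SUBN rows, two SDN rows, one VAN row.
toy = pd.DataFrame({
    "Vehicle Body Type": ["SUBN", "SDN", "SUBN", "VAN", "SUBN", "SDN"],
    "Summons Number": [101, 102, 103, 104, 105, 106],
})

body_type_counts = (
    toy
    .groupby(["Vehicle Body Type"])
    .agg({"Summons Number": "count"})             # number of summonses per body type
    .rename(columns={"Summons Number": "Count"})  # give the column a clearer name
    .sort_values(["Count"], ascending=False)      # most common body type first
)
print(body_type_counts)
```

The result is indexed by body type with a single "Count" column: SUBN 3, SDN 2, VAN 1.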

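Profiling

Profiling is a way to evaluate how efficiently a program runs. By examining execution time, resource usage, and bottlenecks, it helps developers understand and optimize performance. cudf.pandas ships with a profiler that shows which parts of the code ran on the GPU and which ran on the CPU, so that GPU acceleration can be put to better use.

Note: when running on Google Colab, the first run of the profiler may take more than 10 seconds, because Colab's debugger has to interact with sys.settrace, the built-in Python function used for profiling. Running the cell again resolves the issue.

The following code uses the %%cudf.pandas.profile magic to submit the cell to the cudf.pandas profiler, which analyzes how the code executes on the GPU and helps identify bottlenecks and room for optimization.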

%%cudf.pandas.profile

# Create the DataFrame small_df
small_df = pd.DataFrame({'a': [0, 1, 2], 'b': ["x", "y", "z"]})
# Concatenate it with itself
small_df = pd.concat([small_df, small_df])

axis = 0
# Compute the minimum of small_df, switching the axis inside the loop
for i in range(0, 2):
    small_df.min(axis=axis)
    axis = 1
# Group small_df by column "a" and count the values in column "b"
counts = small_df.groupby("a").b.count()
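The note above mentions that the profiler relies on `sys.settrace`. As a rough illustration of what such a hook can observe, the sketch below counts Python-level function calls with `sys.settrace`. It is not the cudf.pandas profiler and makes no claim about its internals; `count_calls`, `helper`, and `workload` are hypothetical names for this example.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func under a trace hook and count calls to Python functions by name."""
    calls = Counter()

    def tracer(frame, event, arg):
        if event == "call":
            calls[frame.f_code.co_name] += 1
        return None  # skip per-line tracing inside the called frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever hook was active before
    return result, calls


def helper(x):
    return x + 1


def workload():
    total = 0
    for i in range(3):
        total += helper(i)
    return total


result, calls = count_calls(workload)
print(result, dict(calls))
```

Because the hook fires on every Python function call, tracing adds overhead, which is consistent with the article's remark that profiled runs can be slower, especially when a debugger is attached as well.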

Editor: 趙寧寧  Source: 郭小喵玩AI
