Goodbye Excel, Hello Pandas!

Author: 數據分析入門詳解 2021-01-13 11:13:46

When writing data, choose the method that matches the file format: to_csv for CSV files and to_excel for Excel files. Remember to specify the encoding as well.

Writing data:

Each file format has its own writer method: to_csv writes a CSV file and to_excel writes an Excel file, and in both cases the encoding should be set explicitly. The code below builds a table and writes it to CSV:

from pandas import Series, DataFrame

# Build the table from a dictionary of Series that share one index
index_list = ['001', '002', '003', '004', '005', '006', '007', '008', '009', '010']
name_list = ['李白', '王昭君', '諸葛亮', '狄仁杰', '孫尚香', '妲己', '周瑜', '張飛', '王昭君', '大喬']
age_list = [25, 28, 27, 25, 30, 29, 25, 32, 28, 26]
salary_list = ['10k', '12.5k', '20k', '14k', '12k', '17k', '18k', '21k', '22k', '21.5k']
marital_list = ['NO', 'NO', 'YES', 'YES', 'NO', 'NO', 'NO', 'YES', 'NO', 'YES']
dic = {
    '姓名': Series(data=name_list, index=index_list),          # name
    '年齡': Series(data=age_list, index=index_list),           # age
    '薪資': Series(data=salary_list, index=index_list),        # salary
    '婚姻狀況': Series(data=marital_list, index=index_list),   # marital status
}

df = DataFrame(dic)
# Write to CSV; path_or_buf is the path of the output text file
df.to_csv(path_or_buf='./People_Information.csv', encoding='utf_8_sig', index=False)
print('end')

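The to_csv call writes the data to the given path. The encoding parameter sets the file encoding; utf_8_sig writes UTF-8 with a byte order mark, which makes garbled Chinese text less likely when the file is opened in Excel. Passing index=False drops the row index; without it, the index labels ('001', '002', and so on) would be written into the file as an extra first column.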
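To see what index and encoding actually change, here is a minimal sketch. The two-row table and the file name demo.csv are made up for illustration and are not part of the article's data; the CSV text is written to in-memory buffers so the difference can be printed directly.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical two-row table; the string index mirrors the '001', '002', ... labels above
df = pd.DataFrame({"name": ["A", "B"], "age": [25, 28]}, index=["001", "002"])

# Default behaviour (index=True): the index labels become an unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
csv_with_index = with_index.getvalue()

# index=False: only the data columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)
csv_without_index = without_index.getvalue()

print(csv_with_index)
print(csv_without_index)

# encoding='utf_8_sig' puts a UTF-8 byte order mark at the start of the file
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.csv")  # hypothetical file name
    df.to_csv(path, encoding="utf_8_sig", index=False)
    with open(path, "rb") as f:
        first_bytes = f.read(3)

print(first_bytes)
```

The first printed table starts with an empty header cell followed by the index labels; the second contains only the name and age columns.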

Reading data:

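Reading also uses a different method for each file format: read_csv for CSV files and read_excel for Excel files, each taking the path of the file to read. A single Excel workbook can hold several sheets, each storing different data, and such files are very common. A CSV file, by contrast, has no notion of multiple sheets.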

For example:

import pandas as pd

# sheet_name selects which sheet of the workbook to read (by sheet name)
sheet1 = pd.read_excel('./data/sheet.xlsx', sheet_name='sheet1')
print(sheet1.head())
sheet2 = pd.read_excel('./data/sheet.xlsx', sheet_name='sheet2')
print(sheet2.head())

When the first row of a CSV or Excel file is junk data, the header parameter of read_csv() or read_excel() selects which row to use as the column labels. For example:

import pandas as pd

# header=1 (rows are counted from 0) takes the second row as the column labels,
# so the first row is ignored
people = pd.read_csv('./data/People1.csv', header=1)
print(people.head())

If no row is suitable as a header, set header to None. The columns then receive the default integer labels 0, 1, 2, 3, and so on, and you can assign your own names afterwards.

When header is set to a row number, that row supplies the column labels and the data is read from there downward; every row above it is ignored.
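The effect of header can be checked without any file on disk. The sketch below is only an illustration: the CSV text, its junk first line, and the replacement column names are invented, and io.StringIO stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV text whose first line is junk rather than column names
raw = "junk,junk\nname,age\nA,25\nB,28\n"

# header=1: the second line supplies the column labels, the junk line is dropped
people = pd.read_csv(io.StringIO(raw), header=1)
print(people)

# header=None: every line is treated as data and the columns are labelled 0, 1, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)
default_labels = list(no_header.columns)
print(default_labels)

# Assign custom column names afterwards (these names are placeholders)
no_header.columns = ["col_a", "col_b"]
print(no_header)
```

With header=1 the result has two data rows under name and age; with header=None all four lines, including the junk line, remain as data rows.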

Editor: 未麗燕  Source: 今日頭條
