一種基于學習的電池壽命預測(Python)
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pandas as pd
import glob
import time
from qore_sdk.client import WebQoreClient
from tqdm import tqdm_notebook as tqdm
from ipywidgets import IntProgress
df_35 = pd.read_csv('CS2_35.csv',index_col=0).dropna()
df_36 = pd.read_csv('CS2_36.csv',index_col=0).dropna()
df_37 = pd.read_csv('CS2_37.csv',index_col=0).dropna()
df_38 = pd.read_csv('CS2_38.csv',index_col=0).dropna()
Plot Parameters
Only discharge capacity is used for prediction
fig = plt.figure(figsize=(9,8),dpi=150)
names = ['capacity','resistance','CCCT','CVCT']
titles = ['Discharge Capacity (mAh)', 'Internal Resistance (Ohm)',
'Constant Current Charging Time (s)','Constant Voltage Charging Time (s)']
plt.subplots_adjust(hspace=0.25)
for i in range(4):
plt.subplot(2,2,i+1)
plt.plot(df_35[names[i]],'o',ms=2,label='#35')
plt.plot(df_36[names[i]],'o',ms=2,label='#36')
plt.plot(df_37[names[i]],'o',ms=2,label='#37')
plt.plot(df_38[names[i]],'o',ms=2,label='#38')
plt.title(titles[i],fontsize=14)
plt.legend(loc='upper right')
if(i == 3):
plt.ylim(1000,5000)
plt.xlim(-20,1100)
X_train,y_train = ConvertData([df_35,df_37,df_38],50)
X_test,y_test = ConvertData([df_36],50)
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
(2588, 50) (873, 50) (2588,) (873,)
idx = np.arange(0,X_train.shape[0],1)
idx = np.random.permutation(idx)
idx_lim = idx[:500]
X_train = X_train[idx_lim]
y_train = y_train[idx_lim]
X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
(500, 50, 1) (873, 50, 1) (500,) (873,)
%%time
client = WebQoreClient(username, password, endpoint=endpoint)
time_ = client.regression_train(X_train, y_train)
print('Time:', time_['train_time'], 's')
Time: 1.8583974838256836 s
Wall time: 2.7 s
res = client.regression_predict(X_train)
fig=plt.figure(figsize=(12, 4),dpi=150)
plt.plot(res['Y'],alpha=0.7,label='Prediction')
plt.plot(y_train,alpha=0.7,label='True')
plt.legend(loc='upper right',fnotallow=12)
res = client.regression_predict(X_test[300:800])
x_range = np.linspace(301,800,500)
fig=plt.figure(figsize=(12, 4),dpi=150)
plt.plot(x_range,y_test[300:800],label='true',c='gray')
plt.plot(x_range,res['Y'],label='Qore',c='red',alpha=0.8)
plt.xlabel('Number of Cycle',fnotallow=13)
plt.ylabel('DIscharge Capacity (Ah)',fnotallow=13)
plt.title('Prediction of Discharge Capacity of Test Data (CS2-36)',fnotallow=14)
plt.legend(loc='upper right',fnotallow=12)
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test[300:800], res['Y'])
initial = X_test[500]
results = []
for i in tqdm(range(50)):
if(i == 0):
res = client.regression_predict([initial])['Y']
results.append(res[0])
time.sleep(1)
else:
initial = np.vstack((initial[1:],np.array(res)))
res = client.regression_predict([initial])['Y']
results.append(res[0])
time.sleep(1)
HBox(children=(IntProgress(value=0, max=50), HTML(value='')))
fig=plt.figure(figsize=(12,4),dpi=150)
plt.plot(np.linspace(501,550,50),results,'o-',ms=4,lw=1,label='predict')
plt.plot(np.linspace(401,550,150),y_test[400:550],'o-',lw=1,ms=4,label='true')
plt.legend(loc='upper right',fnotallow=12)
plt.xlabel('Number of Cycle',fnotallow=13)
plt.ylabel('Discharge Capacity (Ah)',fnotallow=13)
知乎學術咨詢:
https://www.zhihu.com/consult/people/792359672131756032?isMe=1
擔任《Mechanical System and Signal Processing》《中國電機工程學報》等期刊審稿專家,擅長領域:信號濾波/降噪,機器學習/深度學習,時間序列預分析/預測,設備故障診斷/缺陷檢測/異常檢測。
分割線分割線分割線
渦扇發(fā)動機全稱為渦輪風扇發(fā)動機,是一種先進的空中引擎,由渦輪噴氣發(fā)動機發(fā)展而來。渦扇發(fā)動機主要特點是首級壓縮機的面積比渦輪噴氣發(fā)動機大。同時,空氣螺旋槳(扇)將部分吸入的空氣從噴射引擎噴射出來,并將其向后推動,以達到推進的目的。
渦扇發(fā)動機結構是一種特殊的引擎結構,它在渦輪噴氣發(fā)動機的基礎上再增加了 1-2 級低壓(低速)渦輪,這些渦輪可以驅動一定數(shù)量的風扇,消耗掉一部分渦噴發(fā)動機(核心機)的燃氣排氣動能,從而進一步降低燃氣排出速度,提高引擎的性能。從前至后,渦扇發(fā)動機由風扇,壓氣機,燃燒室,導向葉片,高壓低壓渦輪,加力燃燒室和尾噴管組成。
目前大多數(shù)論文的實驗中使用的數(shù)據(jù)集是美國國家航空航天局的 C-MAPSS 數(shù)據(jù)集。C-MAPSS 數(shù)據(jù)集是由模擬航空發(fā)動機的模擬軟件生成。模擬發(fā)動機的結構圖如下:
主要部件有低壓渦輪(low pressure turbine,LPT)、高壓渦輪(high pressure turbine,HPT)、高壓壓氣機(high pressure compressor,HPC)、低壓壓氣機(low pressure compressor,LPC)、風扇(fan)、噴嘴(nozzle)、燃燒室(combustor)、低壓轉子轉速(N1)和高壓轉子轉速(N2)等。風扇在正常飛行條件下工作,將空氣分流至內、外涵道。LPC 和 HPC 通過增溫增壓技術將高溫、壓力混合氣體輸送到 combustor。LPT 可以有效減少空氣的流速,大大提高飛機煤油的化學能轉換效率。而 HPT 則是利用高溫、高壓氣體對渦輪葉片施加做功,從而產(chǎn)生機械能。N1、N2 和 nozzle 的應用大大提升了發(fā)動機的燃燒效率。
監(jiān)控渦扇發(fā)動機狀況為21個傳感器。由于傳感器的單位不同,傳感器記錄的數(shù)值的量級也有所差異,位于10的-2次方到10的3次方之間。例如,燃燒室油氣比數(shù)值的量級是10的-2次方;低壓渦輪冷氣流量數(shù)值的量級是10的1次方。表2-2描述了NASA的C-MAPSS數(shù)據(jù)集。由于不同的操作條件和故障模式,數(shù)據(jù)集可以分為四個子數(shù)據(jù)集,依次是FD001、FD002、FD003和FD004。每個子數(shù)據(jù)分為訓練集和測試集,記錄了發(fā)動機的3種操作設置和21個傳感器數(shù)據(jù)。每個子數(shù)據(jù)集通過.txt文件單獨保存。在.txt文件中,每一行記錄了一個引擎某個時間刻的3種操作設置和21個傳感器數(shù)據(jù)。關于故障模式和操作條件方面,F(xiàn)D001和FD002子數(shù)據(jù)集包含一種故障模式(高壓壓氣機退化),F(xiàn)D003和FD004包含兩種故障模式(高壓壓氣機退化和風扇退化);FD001和FD003只有一種操作條件,F(xiàn)D002和FD004有六種操作條件。由于FD002和FD004子數(shù)據(jù)集引擎的操作環(huán)境復雜多變,F(xiàn)D002和FD004子數(shù)據(jù)集中RUL的預測更加困難。
可見,渦扇發(fā)動機數(shù)據(jù)集包含大量的按時間順序采集的傳感器數(shù)據(jù),這些數(shù)據(jù)包括發(fā)動機溫度、壓力、振動等多個方面的指標。并且渦扇發(fā)動機可能會出現(xiàn)多種故障模式。這適合用時間序列模型去對渦扇發(fā)動機的剩余使用壽命做預測。
鑒于此,采用機器學習和深度學習對C-MAPSS渦扇發(fā)動機進行剩余使用壽命RUL預測,Python代碼,Jupyter Notebook環(huán)境,方法如下:
(1)VAR with LSTM
(2)VAR with Logistic Regression
(3)LSTM (Lookback=20,10,5,1)
LSTM Lookback=10的結果如下:
final_pred = []
count = 0
for i in range(100):
j = max_cycles[i]
temp = pred[count+j-1]
count=count+j
final_pred.append(int(temp))
print(final_pred)
fig = plt.figure(figsize=(18,10))
plt.plot(final_pred,color='red', label='prediction')
plt.plot(y_test,color='blue', label='y_test')
plt.legend(loc='upper left')
plt.grid()
plt.show()
print("mean_squared_error >> ", mean_squared_error(y_test,final_pred))
print("root_mean_squared_error >> ", math.sqrt(mean_squared_error(y_test,final_pred)))
print("mean_absolute_error >>",mean_absolute_error(y_test,final_pred))
VAR with Logistic Regression結果:
fig = plt.figure(figsize=(18,10))
plt.plot(logistic_pred,color='red', label='prediction')
plt.plot(y_test,color='blue', label='y_test')
plt.legend(loc='upper left')
plt.grid()
plt.show()
VAR with LSTM結果:
fig = plt.figure(figsize=(18,10))
plt.plot(lstm_pred,color='red', label='prediction')
plt.plot(y_test,color='blue', label='y_test')
plt.legend(loc='upper left')
plt.grid()
plt.show()
完整代碼可通過知乎學術咨詢獲得:
https://www.zhihu.com/consult/people/792359672131756032?isMe=1
基于機器學習和深度學習的NASA渦扇發(fā)動機剩余使用壽命預測(C-MAPSS數(shù)據(jù)集,Python代碼,ipynb 文件)
以美國航空航天局提供的航空渦扇發(fā)動機退化數(shù)據(jù)集為研究對象,該數(shù)據(jù)集包含多臺發(fā)動機從啟動到失效期間多個運行周期的多源傳感器時序狀態(tài)監(jiān)測數(shù)據(jù),它們共同表征了發(fā)動機的性能退化情況。為減小計算成本,需要對原始多源傳感器監(jiān)測數(shù)據(jù)進行數(shù)據(jù)篩選,剔除與發(fā)動機性能退化情況無關的傳感器數(shù)據(jù)項,保留有用數(shù)據(jù),為對多源傳感器數(shù)據(jù)進行有效甄別,考慮綜合多種數(shù)據(jù)篩選方式,以保證篩選結果的準確性。
主要內容如下:
Data Visualization:
- Maximum life chart and engine life distribution chart for each unit.
- Correlation coefficient chart between sensors and RUL.
- Line chart showing the relationship between sensors and RUL for each engine.
- Value distribution chart for each sensor.
Feature Engineering:
- Based on the line chart showing the relationship between sensors and engine RUL, sensors 1, 5, 10, 16, 18, and 19 are found to be constant. Hence, these features are removed. Finally, the data is normalized.
Machine Learning Model:
- "Rolling mean feature" is added to the data, representing the average value of features over 10 time periods.
- Seven models are built: Linear regression, Light GBM, Random Forest, KNN, XGBoost, SVR, and Extra Tree.
- MAE, RMSE, and R2 are used as evaluation metrics. SVR performs the best with an R2 of 0.61 and RMSE = 25.7.
Deep Learning Model:
- The time window length is set to 30, and the shift length is set to 1. The training and test data are processed to be in a three-dimensional format for input to the models.
- Six deep learning models are built: CNN, LSTM, Stacked LSTM, Bi-LSTM, GRU, and a hybrid model combining CNN and LSTM.
- Convergence charts and evaluation of test data predictions are plotted. Each model has an R2 higher than 0.85, with Bi-LSTM achieving an R2 of 0.89 and RMSE of 13.5.?
機器學習模型所用模塊:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random
import warnings
warnings.filterwarnings('ignore')
from sklearn.metrics import mean_squared_error, r2_score,mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor,ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
結果如下:
深度學習所用模塊:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random
import time
import warnings
warnings.filterwarnings('ignore')
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler,MinMaxScaler
#from google.colab import drive
#drive.mount('/content/drive')
# model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Conv1D
from tensorflow.keras.layers import BatchNormalization, Dropout
from tensorflow.keras.layers import TimeDistributed, Flatten
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau,EarlyStopping
完整代碼可通過知乎學術咨詢獲得:
https://www.zhihu.com/consult/people/792359672131756032?isMe=1
Python環(huán)境下基于機器學習的NASA渦輪風扇發(fā)動機剩余使用壽命RUL預測
本例所用的數(shù)據(jù)集為C-MAPSS數(shù)據(jù)集,C-MAPSS數(shù)據(jù)集是美國NASA發(fā)布的渦輪風扇發(fā)動機數(shù)據(jù)集,其中包含不同工作條件和故障模式下渦輪風扇發(fā)動機多源性能的退化數(shù)據(jù),共有 4 個子數(shù)據(jù)集,每個子集又可分為訓練集、 測試集和RUL標簽。其中,訓練集包含航空發(fā)動機從開始運行到發(fā)生故障的所有狀態(tài)參數(shù);測試集包含一定數(shù)量發(fā)動機從開始運行到發(fā)生故障前某一時間點的全部狀態(tài)參數(shù);RUL標簽記錄測試集中發(fā)動機的 RUL 值,可用于評估模 型的RUL預測能力。C-MAPSS數(shù)據(jù)集包含的基本信息如下:
本例只采用FD001子數(shù)據(jù)集:
關于python的集成環(huán)境,我一般Anaconda 和 winpython 都用,windows下主要用Winpython,IDE為spyder(類MATLAB界面)。
正如peng wang老師所說
winpython, anaconda 哪個更好?- peng wang的回答 - 知乎 https://www.zhihu.com/question/27615938/answer/71207511
winpython脫胎于pythonxy,面向科學計算,兼顧數(shù)據(jù)分析與挖掘;Anaconda主要面向數(shù)據(jù)分析與挖掘方面,在大數(shù)據(jù)處理方面有自己特色的一些包;winpython強調便攜性,被做成綠色軟件,不寫入注冊表,安裝其實就是解壓到某個文件夾,移動文件夾甚至放到U盤里在其他電腦上也能用;Anaconda則算是傳統(tǒng)的軟件模式。winpython是由個人維護;Anaconda由數(shù)據(jù)分析服務公司維護,意味著Winpython在很多方面都從簡,而Anaconda會提供一些人性化設置。Winpython 只能在windows上用,Anaconda則有l(wèi)inux的版本。
拋開軟件包的差異,我個人也推薦初學者用winpython,正因為其簡單,問題也少點,由于便攜性的特點系統(tǒng)壞了,重裝后也能直接用。
請直接安裝、使用winPython:WinPython download因為很多模塊以及集成的模塊
可以選擇版本,不一定要用最新版本,否則可能出現(xiàn)不兼容問題。
下載、解壓后如下
打開spyder就可以用了。
采用8種機器學習方法對NASA渦輪風扇發(fā)動機進行剩余使用壽命RUL預測,8種方法分別為:Linear Regression,SVM regression,Decision Tree regression,KNN model,Random Forest,Gradient Boosting Regressor,Voting Regressor,ANN Model。
首先導入相關模塊
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
import tensorflow as tf
from tensorflow.keras.layers import Dense
版本如下:
tensorflow=2.8.0
keras=2.8.0
sklearn=1.0.2
導入數(shù)據(jù)
path = ''
# define column names
col_names=["unit_nb","time_cycle"]+["set_1","set_2","set_3"] + [f's_{i}' for i in range(1,22)]
# read data
df_train = train_data = pd.read_csv(path+"train_FD001.txt", index_col=False,
sep= "\s+", header = None,nam
es=col_names )
df_test和y_test同理導入,看一下訓練樣本
df_train.head()
進行探索性數(shù)據(jù)分析
df_train[col_names[1:]].describe().T
數(shù)據(jù)可視化分析:
sns.set_style("darkgrid")
plt.figure(figsize=(16,10))
k = 1
for col in col_names[2:] :
plt.subplot(6,4,k)
sns.histplot(df_train[col],color='Green')
k+=1
plt.tight_layout()
plt.show()
def remaining_useful_life(df):
# Get the total number of cycles for each unit
grouped_by_unit = df.groupby(by="unit_nb")
max_cycle = grouped_by_unit["time_cycle"].max()
# Merge the max cycle back into the original frame
result_frame = df.merge(max_cycle.to_frame(name='max_cycle'), left_notallow='unit_nb', right_index=True)
# Calculate remaining useful life for each row
remaining_useful_life = result_frame["max_cycle"] - result_frame["time_cycle"]
result_frame["RUL"] = remaining_useful_life
# drop max_cycle as it's no longer needed
result_frame = result_frame.drop("max_cycle", axis=1)
return result_frame
df_train = remaining_useful_life(df_train)
df_train.head()
繪制最大RUL的直方圖分布
plt.figure(figsize=(10,5))
sns.histplot(max_ruls.RUL, color='r')
plt.xlabel('RUL')
plt.ylabel('Frequency')
plt.axvline(x=max_ruls.RUL.mean(), ls='--',color='k',label=f'mean={max_ruls.RUL.mean()}')
plt.axvline(x=max_ruls.RUL.median(),color='b',label=f'median={max_ruls.RUL.median()}')
plt.legend()
plt.show()
plt.figure(figsize=(20, 8))
cor_matrix = df_train.corr()
heatmap = sns.heatmap(cor_matrix, vmin=-1, vmax=1, annot=True)
heatmap.set_title('Correlation Heatmap', fnotallow={'fontsize':12}, pad=10);
添加圖片注釋,不超過 140 字(可選)
col = df_train.describe().columns
#we drop colummns with standard deviation is less than 0.0001
sensors_to_drop = list(col[df_train.describe().loc['std']<0.001]) + ['s_14']
print(sensors_to_drop)
#
df_train.drop(sensors_to_drop,axis=1,inplace=True)
df_test.drop(sensors_to_drop,axis=1,inplace=True)
sns.set_style("darkgrid")
fig, axs = plt.subplots(4,4, figsize=(25, 18), facecolor='w', edgecolor='k')
fig.subplots_adjust(hspace = .22, wspace=.2)
i=0
axs = axs.ravel()
index = list(df_train.unit_nb.unique())
for sensor in df_train.columns[1:-1]:
for idx in index[1:-1:15]:
axs[i].plot('RUL', sensor,data=df_train[df_train.unit_nb==idx])
axs[i].set_xlim(350,0)
axs[i].set(xticks=np.arange(0, 350, 25))
axs[i].set_ylabel(sensor)
axs[i].set_xlabel('Remaining Use Life')
i=i+1
X_train = df_train[df_train.columns[3:-1]]
y_train = df_train.RUL
X_test = df_test.groupby('unit_nb').last().reset_index()[df_train.columns[3:-1]]
y_train = y_train.clip(upper=155)
# create evalute function for train and test data
def evaluate(y_true, y_hat):
RMSE = np.sqrt(mean_squared_error(y_true, y_hat))
R2_score = r2_score(y_true, y_hat)
return [RMSE,R2_score];
#Make Dataframe which will contain results
Results = pd.DataFrame(columns=['RMSE-Train','R2-Train','RMSE-Test','R
2-Test','time-train (s)'])
訓練線性回歸模型
import time
Sc = StandardScaler()
X_train1 = Sc.fit_transform(X_train)
X_test1 = Sc.transform(X_test)
# create and fit model
start = time.time()
lm = LinearRegression()
lm.fit(X_train1, y_train)
end_fit = time.time()- start
# predict and evaluate
y_pred_train = lm.predict(X_train1)
y_pred_test = lm.predict(X_test1)
Results.loc['LR']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
def plot_prediction(y_test,y_pred_test,score):
plt.style.use("ggplot")
fig, ax = plt.subplots(1, 2, figsize=(17, 4), gridspec_kw={'width_ratios': [1.2, 3]})
fig.subplots_adjust(wspace=.12)
ax[0].plot([min(y_test),max(y_test)],
[min(y_test),max(y_test)],lw=3,c='r')
ax[0].scatter(y_test,y_pred_test,lw=3,c='g')
ax[0].annotate(text=('RMSE: ' + "{:.2f}".format(score[0]) +'\n' +
'R2: ' + "{:.2%}".format(score[1])), xy=(0,140), size='large');
ax[0].set_title('Actual vs predicted RUL')
ax[0].set_xlabel('Actual')
ax[0].set_ylabel('Predicted');
ax[1].plot(range(0,100),y_test,lw=2,c='r',label = 'actual')
ax[1].plot(range(0,100),y_pred_test,lw=1,ls='--', c='b',label = 'prediction')
ax[1].legend()
ax[1].set_title('Actual vs predicted RUL')
ax[1].set_xlabel('Engine num')
ax[1].set_ylabel('RUL');
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練支持向量機模型
# create and fit model
start = time.time()
svr = SVR(kernel="rbf", gamma=0.25, epsilnotallow=0.05)
svr.fit(X_train1, y_train)
end_fit = time.time()-start
# predict and evaluate
y_pred_train = svr.predict(X_train1)
y_pred_test = svr.predict(X_test1)
Results.loc['SVM']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練決策樹模型
start=time.time()
dtr = DecisionTreeRegressor(random_state=42, max_features='sqrt', max_depth=10, min_samples_split=10)
dtr.fit(X_train1, y_train)
end_fit =time.time()-start
# predict and evaluate
y_pred_train = dtr.predict(X_train1)
y_pred_test = dtr.predict(X_test1)
Results.loc['DTree']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test
))
訓練KNN模型
from sklearn.neighbors import KNeighborsRegressor
# Evaluating on Train Data Set
start = time.time()
Kneigh = KNeighborsRegressor(n_neighbors=7)
Kneigh.fit(X_train1, y_train)
end_fit =time.time()-start
# predict and evaluate
y_pred_train = Kneigh.predict(X_train1)
y_pred_test = Kneigh.predict(X_test1)
Results.loc['KNeigh']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練隨機森林模型
start = time.time()
rf = RandomForestRegressor(n_jobs=-1, n_estimators=130,max_features='sqrt', min_samples_split= 2, max_depth=10, random_state=42)
rf.fit(X_train1, y_train)
y_hat_train1 = rf.predict(X_train1)
end_fit = time.time()-start
# predict and evaluate
y_pred_train = rf.predict(X_train1)
y_pred_test = rf.predict(X_test1)
Results.loc['RF']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練Gradient Boosting Regressor模型
from sklearn.ensemble import GradientBoostingRegressor
# Evaluating on Train Data Set
start = time.time()
xgb_r = GradientBoostingRegressor(n_estimators=45, max_depth=10, min_samples_leaf=7, max_features='sqrt', random_state=42,learning_rate=0.11)
xgb_r.fit(X_train1, y_train)
end_fit =time.time()-start
# predict and evaluate
y_pred_train = xgb_r.predict(X_train1)
y_pred_test = xgb_r.predict(X_test1)
Results.loc['XGboost']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練Voting Regressor模型
from sklearn.ensemble import VotingRegressor
start=time.time()
Vot_R = VotingRegressor([("rf", rf), ("xgb", xgb_r)],weights=[1.5,1],n_jobs=-1)
Vot_R.fit(X_train1, y_train)
end_fit =time.time()-start
# predict and evaluate
y_pred_train = Vot_R.predict(X_train1)
y_pred_test = Vot_R.predict(X_test1)
Results.loc['VotingR']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
plot_prediction(y_test.RUL,y_pred_test,evaluate(y_test, y_pred_test))
訓練ANN模型
star=time.time()
model = tf.keras.models.Sequential()
model.add(Dense(32, activatinotallow='relu'))
model.add(Dense(64, activatinotallow='relu'))
model.add(Dense(128, activatinotallow='relu'))
model.add(Dense(128, activatinotallow='relu'))
model.add(Dense(1, activatinotallow='linear'))
model.compile(loss= 'msle', optimizer='adam', metrics=['msle'])
history = model.fit(x=X_train1,y=y_train, epochs = 40, batch_size = 64)
end_fit = time.time()-star
# predict and evaluate
y_pred_train = model.predict(X_train1)
y_pred_test = model.predict(X_test1)
Results.loc['ANN']=evaluate(y_train, y_pred_train)+evaluate(y_test, y_pred_test)+[end_fit]
Results
本文轉載自??高斯的手稿??
