import pandas as pd
import numpy as np
from patsy import dmatrices
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
import scipy.stats as stats
from sklearn.metrics import mean_squared_error
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import scipy.io
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Load the data
# #ccpp = pd.read_excel( 'CCPP.xlsx')
# #ccpp.describe()
# data = scipy.io.loadmat('ENCDATA-2hp.mat') # read the .mat file
# # path = scio.loadmat('FFT-2hp.mat')['FFT-2hp']
# print(data)
# train1=data['train3hp']
# test1=data['test3hp']
# train_y=data['train_y3hp']
# test_y=data['test_y3hp']
# data1=train1[:,1]
# print("train1",train1.shape)
# print("data",data1.shape)
# # sns.pairplot(data)
# # plt.show()
# #y, X = dmatrices( data1, data = train1, return_type= 'dataframe')
# fit2 = sm.formula.ols( data1,data = train1).fit()
# print("fit2",fit2)
# fit2.summary()
# pred2 = fit2.predict()
# print("pred2",pred2)
#
# np.sqrt(mean_squared_error(train1.PE, pred2))
# resid = fit2.resid
# plt.scatter(fit2.predict(), (fit2.resid-fit2.resid.mean())/fit2.resid.std())
# plt.xlabel( 'Predicted value')
# plt.ylabel( 'Standardized residual')
#
# # Add a horizontal reference line
#
# plt.axhline(y = 0, color = 'r', linewidth = 2)
# plt.show()
ccpp = pd.read_excel('CCPP.xlsx')
print(ccpp.describe())
sns.pairplot(ccpp)
plt.show()

# Correlation between power output (PE) and each variable
print(ccpp.corrwith(ccpp.PE))
y, X = dmatrices('PE~AT+V+AP', data=ccpp, return_type='dataframe')
# Build an empty DataFrame and fill it with the VIF of each design-matrix column
vif = pd.DataFrame()
vif["VIF Factor"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
vif["features"] = X.columns
print(vif)
# Fit a linear model of PE on AT, V and AP
fit = sm.formula.ols('PE~AT+V+AP', data=ccpp).fit()
print(fit.summary())
# Compute the model's RMSE
pred = fit.predict()
print("RMSE:", np.sqrt(mean_squared_error(ccpp.PE, pred)))
# Outlier diagnostics
outliers = fit.get_influence()

# High-leverage points (diagonal of the hat matrix)
leverage = outliers.hat_matrix_diag

# DFFITS values
dffits = outliers.dffits[0]

# Externally studentized residuals
resid_stu = outliers.resid_studentized_external

# Cook's distance
cook = outliers.cooks_distance[0]

# Covariance ratio (covratio) values
covratio = outliers.cov_ratio
# Combine the outlier diagnostics above with the original dataset
contat1 = pd.concat([pd.Series(leverage, name='leverage'),
                     pd.Series(dffits, name='dffits'),
                     pd.Series(resid_stu, name='resid_stu'),
                     pd.Series(cook, name='cook'),
                     pd.Series(covratio, name='covratio')], axis=1)
ccpp_outliers = pd.concat([ccpp, contat1], axis=1)
print(ccpp_outliers.head())
print("contat1", contat1)
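
# Note: the refit below uses every row of ccpp_outliers, so it reproduces the
# first model unless some observations are dropped beforehand. A minimal
# sketch of one common rule of thumb follows. It is an assumption, not
# something the original script states: keep only the rows whose externally
# studentized residual has absolute value at most 2. The helper name
# drop_large_resid and the default threshold are illustrative only.
import pandas as pd


def drop_large_resid(df, col='resid_stu', threshold=2.0):
    """Return a copy of the rows of df where |df[col]| <= threshold."""
    return df.loc[df[col].abs() <= threshold].copy()


# Possible usage before refitting (left commented out on purpose):
# ccpp_outliers = drop_large_resid(ccpp_outliers)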

# Refit the model
fit2 = sm.formula.ols('PE~AT+V+AP', data=ccpp_outliers).fit()
print(fit2.summary())

# Compute the model's RMSE
pred2 = fit2.predict()
print("RMSE:", np.sqrt(mean_squared_error(ccpp_outliers.PE, pred2)))
resid = fit2.resid

# Scatter plot of standardized residuals against predicted values
plt.scatter(fit2.predict(), (fit2.resid - fit2.resid.mean()) / fit2.resid.std())
plt.xlabel('Predicted value', fontdict={'family': 'sans-serif', 'size': 20})
plt.ylabel('Standardized residual', fontdict={'family': 'sans-serif', 'size': 20})

# Add a horizontal reference line
plt.axhline(y=0, color='r', linewidth=2)
plt.show()
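
# The scatter plot above is only a visual check for heteroscedasticity. As a
# hedged complement (not part of the original script), a formal check can be
# sketched with the Breusch-Pagan test from statsmodels. The helper name
# breusch_pagan is hypothetical; it assumes a fitted OLS results object such
# as fit2.
from statsmodels.stats.diagnostic import het_breuschpagan


def breusch_pagan(results):
    """Run the Breusch-Pagan test on a fitted statsmodels OLS results object.

    Returns a dict with the LM statistic, its p-value, the F statistic and
    its p-value. A small p-value suggests non-constant residual variance.
    """
    lm, lm_pvalue, fvalue, f_pvalue = het_breuschpagan(results.resid,
                                                       results.model.exog)
    return {'lm': lm, 'lm_pvalue': lm_pvalue,
            'fvalue': fvalue, 'f_pvalue': f_pvalue}


# Possible usage (left commented out on purpose):
# print(breusch_pagan(fit2))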