Magnetic Core Loss Prediction Models Based on Three Machine Learning Methods


      Hi everyone, I'm 带我去滑雪 (Take Me Skiing)!

      Magnetic components play a key role in power conversion and transmission systems, and core loss is a major factor in their efficiency. As industrial and energy demand grows, and especially with the rapid development of renewable energy and smart grids, the need for efficient, compact, low-loss magnetic components is increasingly pressing. Core loss depends on many factors, including operating frequency, flux density, excitation waveform, operating temperature and material properties, and the nonlinearity and interdependence of these factors make loss prediction difficult. The raw data were first preprocessed: the columns that are not part of the flux-density waveform (temperature, frequency and core loss) were set aside, and the data were checked for missing values. With the completeness and relevance of the data ensured, time-domain and frequency-domain analysis, combined with statistical analysis, numerical computation and signal processing, were used to extract the key features: maximum, minimum, peak-to-peak value, mean, standard deviation, skewness, kurtosis, total harmonic distortion, wavelet energy levels, slope, zero-crossing count and form factor, for a total of 16 feature variables.

      All feature variables were then collected, and the categorical variables (core material type and excitation waveform) were converted from text to numbers with one-hot encoding. Three core-loss prediction models were built: a BP neural network, random forest regression, and LightGBM (light gradient boosting). Their MSE, RMSE, MAE and R² on the validation set were compared, and the best model was used to predict the test set. The results show that the LightGBM model gives the best predictions. In addition, to examine generalization ability, the models were evaluated with k-fold cross-validation. Finally, recommendations were given based on the evaluation results. The modeling workflow of this post is as follows:

    Now let's get into the code:

Contents

(1) Feature Extraction

(2) Module Imports and Data Preview

(3) Splitting the Training and Validation Sets

(4) Correlation Heatmap of the Variables

(5) Building the Three Machine Learning Models

(6) Cross-Validation


(1) Feature Extraction

import pandas as pd
import numpy as np
import scipy.fftpack as fftpack
import pywt
file_path = r"E:\工作\硕士\问题四\附件三测试集特征提取.xlsx"
df = pd.read_excel(file_path, header=None)
flux_density = df.values.flatten()
segment_length = 1024
all_features = []
# Slice the flux-density sequence into fixed-length segments and extract features from each one.
for i in range(0, len(flux_density), segment_length):
    segment = flux_density[i:i+segment_length]
    if len(segment) < segment_length:
        break
    num_samples = len(segment)
    t = np.linspace(0, 1, num_samples)

    # Time-domain statistics
    max_value = np.max(segment)
    min_value = np.min(segment)
    mean_value = np.mean(segment)
    std_deviation = np.std(segment)
    skewness = np.mean((segment - mean_value)**3) / (std_deviation**3)
    kurtosis = np.mean((segment - mean_value)**4) / (std_deviation**4)
    peak_to_peak = max_value - min_value

    # Frequency-domain features via the FFT
    fft_values = fftpack.fft(segment)
    fft_freqs = fftpack.fftfreq(num_samples, d=t[1] - t[0])
    fft_magnitude = np.abs(fft_values)
    dominant_frequency = fft_freqs[np.argmax(fft_magnitude[:len(fft_magnitude)//2])]
    harmonics = np.sqrt(np.sum(fft_magnitude[2:]**2))
    fundamental = fft_magnitude[1]
    THD = harmonics / fundamental if fundamental != 0 else 0

    # Trend, zero crossings and form factor
    time_points = np.arange(len(segment))
    slope = np.polyfit(time_points, segment, 1)[0]
    zero_crossings = np.sum(np.diff(np.sign(segment)) != 0)
    rms_value = np.sqrt(np.mean(segment ** 2))
    mean_abs_value = np.mean(np.abs(segment))
    form_factor = rms_value / mean_abs_value if mean_abs_value != 0 else 0

    # Wavelet energy per decomposition level (db4, level=4 gives 5 coefficient arrays)
    wavelet = 'db4'
    coeffs = pywt.wavedec(segment, wavelet, level=4)
    wavelet_coeffs_energy = [np.sum(c**2) for c in coeffs]

    features = {
        '最大值': max_value,
        '最小值': min_value,
        '峰峰值': peak_to_peak,
        '均值': mean_value,
        '标准差': std_deviation,
        '偏度': skewness,
        '峰度': kurtosis,
        '总谐波失真': THD,
        '小波能级1': wavelet_coeffs_energy[0],
        '小波能级2': wavelet_coeffs_energy[1],
        '小波能级3': wavelet_coeffs_energy[2],
        '小波能级4': wavelet_coeffs_energy[3],
        '小波能级5': wavelet_coeffs_energy[4],
        '斜率': slope,
        '过零率': zero_crossings,
        '波形因子': form_factor
    }
    all_features.append(features)
features_df = pd.DataFrame(all_features)
output_file_path = r"E:\工作\硕士\问题四\附件三测试集特征提取结果.xlsx"
features_df.to_excel(output_file_path, index=False)
print(f"特征已提取并保存至 {output_file_path}")

(2) Module Imports and Data Preview

import os
import math
import time
import datetime
import random as rn
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = 'SimHei'
plt.rcParams['axes.unicode_minus'] = False
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler,StandardScaler
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error,r2_score
import tensorflow as tf
import keras
from keras.layers import Layer
import keras.backend as K
from keras.models import Model, Sequential
from keras.layers import GRU, Dense,Conv1D, MaxPooling1D,GlobalMaxPooling1D,Embedding,Dropout,Flatten,SimpleRNN,LSTM,Bidirectional
from keras.callbacks import EarlyStopping
from tensorflow.keras import optimizers

data0 = pd.read_csv(r'E:\工作\硕士\科研竞赛\第四问模型训练数据.csv', encoding="ANSI",
                    parse_dates=['序号']).set_index('序号')[[
    '温度', '频率', '最大值', '最小值', '峰峰值', '均值', '标准差', '偏度', '峰度', '总谐波失真',
    '小波能级1', '小波能级2', '小波能级3', '小波能级4', '小波能级5', '磁芯材料类型',
    '斜率', '过零率', '波形因子', '励磁波形', '磁芯损耗']].sort_index()
data0 = data0.astype(float)
data0.head()

      Data preview:
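
The training CSV loaded above already stores 磁芯材料类型 and 励磁波形 as numeric codes; the one-hot encoding step described earlier is not shown in the post. Below is a minimal sketch of that step using pandas on a toy DataFrame (the category values are made up for illustration, and the exact encoding the author used may differ):

import pandas as pd

# Toy illustration of one-hot encoding the two categorical columns.
# The category values are placeholders, not the actual dataset.
toy = pd.DataFrame({'磁芯材料类型': ['材料1', '材料2', '材料1'],
                    '励磁波形': ['正弦波', '三角波', '梯形波']})
encoded = pd.get_dummies(toy, columns=['磁芯材料类型', '励磁波形'], dtype=float)
print(encoded)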

(3) Splitting the Training and Validation Sets

from matplotlib.ticker import MaxNLocator
def build_sequences(text, window_size=24):
    # Slide a window of length `window_size` over the data; each window predicts the next row.
    x, y = [], []
    for i in range(len(text) - window_size):
        sequence = text[i:i+window_size]
        target = text[i+window_size]
        x.append(sequence)
        y.append(target)
    return np.array(x), np.array(y)

def get_traintest(data, train_ratio=0.8, window_size=24):
    # Chronological split: the first `train_ratio` of the rows for training, the rest for validation.
    train_size = int(len(data) * train_ratio)
    train = data[:train_size]
    test = data[train_size-window_size:]
    X_train, y_train = build_sequences(train, window_size=window_size)
    X_test, y_test = build_sequences(test, window_size=window_size)
    # Each target row still contains all columns; keep only the last one (the scaled core loss).
    return X_train, y_train[:, -1], X_test, y_test[:, -1]
data=data0.to_numpy()
scaler = MinMaxScaler() 
scaler = scaler.fit(data[:,:-1])
X=scaler.transform(data[:,:-1])   
y_scaler = MinMaxScaler() 
y_scaler = y_scaler.fit(data[:,-1].reshape(-1,1))
y = y_scaler.transform(data[:,-1].reshape(-1,1))
train_ratio = 0.8
window_size = 5
X_train,y_train,X_test,y_test=get_traintest(np.c_[X,y],window_size=window_size,train_ratio=train_ratio)
print(X_train.shape,y_train.shape,X_test.shape,y_test.shape)
y_test1 = y_scaler.inverse_transform(y_test.reshape(-1,1))
test_size=int(len(data)*(1-train_ratio))
plt.figure(figsize=(10, 5), dpi=256)
plt.plot(data0.index[:-test_size], data0.iloc[:, -1].iloc[:-test_size], label='训练集', color='#FA9905')
plt.plot(data0.index[-test_size:], data0.iloc[:, -1].iloc[-test_size:], label='验证集', color='#FB8498', linestyle='dashed')
ax = plt.gca()
ax.xaxis.set_major_locator(MaxNLocator(10)) 
plt.legend()
plt.ylabel('磁芯损耗(每立方米瓦特)', fontsize=14)
plt.xlabel('序号', fontsize=14)
plt.savefig(r"E:\工作\硕士\科研竞赛\问题四\训练集与验证集比例.png",bbox_inches="tight",pad_inches=1,transparent=True,facecolor="w",edgecolor='w',dpi=300,orientation='landscape')

    Visualization:

       In total there are 12,400 samples. The first 80% of the data (9,920 samples) forms the training set, and the remaining 20% (2,480 samples) forms the validation set. The core-loss series is plotted with Python: the yellow solid line is the training data and the pink dashed line is the validation data.
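
To make the sliding-window construction above concrete, here is a toy run of build_sequences on a small made-up array (purely illustrative):

import numpy as np

# Pretend this single column is one scaled feature: 10 rows, window of 5.
demo = np.arange(10, dtype=float).reshape(-1, 1)
X_demo, y_demo = build_sequences(demo, window_size=5)
print(X_demo.shape, y_demo.shape)   # (5, 5, 1) and (5, 1): each window predicts the row right after it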

(4) Correlation Heatmap of the Variables

        Combining the extracted features, the model in this section uses 20 feature variables: temperature, frequency, maximum, minimum, peak-to-peak value, mean, standard deviation, skewness, kurtosis, total harmonic distortion, wavelet energy levels 1 through 5, core material type, slope, zero-crossing count, form factor and excitation waveform; the response variable is core loss. Correlation coefficients between every pair of variables are computed (the code below uses the Spearman rank correlation) and drawn as a heatmap. To examine how the features affect core loss, a random forest feature-importance ranking was also plotted; a sketch of that step is given at the end of this section.

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns 
import datetime
%matplotlib inline
plt.rcParams['font.sans-serif'] = ['KaiTi']
plt.rcParams['axes.unicode_minus'] = False

data0 = pd.read_csv(r'E:\工作\硕士\科研竞赛\问题四\第四问模型训练数据.csv', encoding="ANSI",
                    parse_dates=['序号']).set_index('序号')[[
    '温度', '频率', '最大值', '最小值', '峰峰值', '均值', '标准差', '偏度', '峰度', '总谐波失真',
    '小波能级1', '小波能级2', '小波能级3', '小波能级4', '小波能级5', '磁芯材料类型',
    '斜率', '过零率', '波形因子', '励磁波形', '磁芯损耗']].sort_index()
data = data0.drop(columns=['磁芯损耗'])
y= data0['磁芯损耗']
data.shape, y.shape

fig, ax = plt.subplots(figsize=(18, 16), dpi=300)
corr = sns.heatmap(data.assign(Y=y).corr(method='spearman'), annot=True, square=True)
plt.savefig(r"E:\工作\硕士\科研竞赛\问题四\热力图.png",bbox_inches="tight",pad_inches=1,transparent=True,facecolor="w",edgecolor='w',dpi=300,orientation='landscape')

    Output:
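
The feature-importance ranking mentioned at the start of this section is not shown above, so here is a minimal sketch of how it could be produced with scikit-learn; the hyperparameters are placeholders, and data / y are the objects defined in the block above:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor

# Illustrative sketch: rank the 20 features by random-forest importance.
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(data, y)
importances = pd.Series(rf.feature_importances_, index=data.columns).sort_values()
importances.plot.barh(figsize=(8, 10))
plt.xlabel('特征重要性')
plt.tight_layout()
plt.show()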

(5) Building the Three Machine Learning Models

from sklearn.model_selection import train_test_split
X_train,X_val,y_train,y_val=train_test_split(data,y,test_size=0.2,random_state=0)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
from xgboost.sklearn import XGBRegressor
from lightgbm import LGBMRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
model5= RandomForestRegressor(n_estimators=500,  max_features=int(X_train.shape[1]/3) , random_state=0)
model7 =  XGBRegressor(objective='reg:squarederror', n_estimators=1000, random_state=0)
model8 = LGBMRegressor(n_estimators=1000,objective='regression',random_state=0)
model10 = MLPRegressor(hidden_layer_sizes=(16,8), random_state=77, max_iter=10000)
# Only the three models actually compared in this post: random forest, LightGBM and the BP (MLP) network.
model_list = [model5, model8, model10]
model_name = ['随机森林', '轻量梯度提升', 'BP神经网络']
for i in range(3):
    model_C = model_list[i]
    name = model_name[i]
    model_C.fit(X_train_s, y_train)
    s = model_C.score(X_val_s, y_val)   # .score() returns R² for regressors
    print(name + '方法在验证集的准确率为:' + str(s))

     Output:
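
The write-up compares the models by MSE, RMSE, MAE and R² on the validation set, while the loop above only prints the R² score. Here is an illustrative sketch of how those metrics could be tabulated for the three fitted models (the DataFrame layout is my own, not the author's):

import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Illustrative sketch: validation-set metrics for the three models fitted above.
rows = []
for model_C, name in zip(model_list, model_name):
    pred = model_C.predict(X_val_s)
    mse = mean_squared_error(y_val, pred)
    rows.append({'模型': name, 'MSE': mse, 'RMSE': np.sqrt(mse),
                 'MAE': mean_absolute_error(y_val, pred), 'R2': r2_score(y_val, pred)})
print(pd.DataFrame(rows).set_index('模型'))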

(6) Cross-Validation

from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error,r2_score
from sklearn.model_selection import KFold

def evaluation(y_test, y_predict):
    mae = mean_absolute_error(y_test, y_predict)
    mse = mean_squared_error(y_test, y_predict)
    rmse = np.sqrt(mean_squared_error(y_test, y_predict))
    mape = (abs(y_predict - y_test) / y_test).mean()
    r_2 = r2_score(y_test, y_predict)
    return mae, rmse, mape

def evaluation2(lis):
    array = np.array(lis)
    return array.mean(), array.std()

def cross_val(model=None, X=None, Y=None, K=5, repeated=1):
    df_mean = pd.DataFrame(columns=['R2', 'MAE', 'RMSE', 'MAPE'])
    df_std = pd.DataFrame(columns=['R2', 'MAE', 'RMSE', 'MAPE'])
    for n in range(repeated):
        print(f'正在进行第{n+1}次重复K折.....随机数种子为{n}\n')
        kf = KFold(n_splits=K, shuffle=True, random_state=n)
        R2, MAE, RMSE, MAPE = [], [], [], []
        print(f"    开始本次在{K}折数据上的交叉验证.......\n")
        i = 1
        for train_index, test_index in kf.split(X):
            print(f'        正在进行第{i}折的计算')
            X_train = X.values[train_index]
            y_train = Y.values[train_index]   # use the Y argument rather than the global y
            X_test = X.values[test_index]
            y_test = Y.values[test_index]
            model.fit(X_train, y_train)
            score = model.score(X_test, y_test)
            R2.append(score)
            pred = model.predict(X_test)
            mae, rmse, mape = evaluation(y_test, pred)
            MAE.append(mae)
            RMSE.append(rmse)
            MAPE.append(mape)
            print(f'        第{i}折的拟合优度为:{round(score,4)},MAE为{round(mae,4)},RMSE为{round(rmse,4)},MAPE为{round(mape,4)}')
            i += 1
        print(f'    ———————————————完成本次的{K}折交叉验证———————————————————\n')
        R2_mean, R2_std = evaluation2(R2)
        MAE_mean, MAE_std = evaluation2(MAE)
        RMSE_mean, RMSE_std = evaluation2(RMSE)
        MAPE_mean, MAPE_std = evaluation2(MAPE)
        print(f'第{n+1}次重复K折,本次{K}折交叉验证的总体拟合优度均值为{R2_mean},方差为{R2_std}')
        print(f'                               总体MAE均值为{MAE_mean},方差为{MAE_std}')
        print(f'                               总体RMSE均值为{RMSE_mean},方差为{RMSE_std}')
        print(f'                               总体MAPE均值为{MAPE_mean},方差为{MAPE_std}')
        print("\n====================================================================================================================\n")
        df1 = pd.DataFrame(dict(zip(['R2', 'MAE', 'RMSE', 'MAPE'],
                                    [R2_mean, MAE_mean, RMSE_mean, MAPE_mean])), index=[n])
        df_mean = pd.concat([df_mean, df1])
        df2 = pd.DataFrame(dict(zip(['R2', 'MAE', 'RMSE', 'MAPE'],
                                    [R2_std, MAE_std, RMSE_std, MAPE_std])), index=[n])
        df_std = pd.concat([df_std, df2])
    return df_mean, df_std

model = LGBMRegressor(n_estimators=1000, objective='regression', random_state=0)
lgb_crosseval, lgb_crosseval2 = cross_val(model=model, X=data, Y=y, K=3, repeated=5)
lgb_crosseval

     Partial output:
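
The discussion below reports cross-validation results for all three models, but the code above only runs LightGBM. A sketch of the same evaluation for the other two models (reusing the configurations from section (5)) might look like this; note that the MLP in particular can be slow here:

from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

# Illustrative sketch: repeat the same repeated K-fold evaluation for the other two models.
rf_model = RandomForestRegressor(n_estimators=500, max_features=int(data.shape[1]/3), random_state=0)
rf_crosseval, rf_crosseval2 = cross_val(model=rf_model, X=data, Y=y, K=3, repeated=5)

mlp_model = MLPRegressor(hidden_layer_sizes=(16, 8), random_state=77, max_iter=10000)
mlp_crosseval, mlp_crosseval2 = cross_val(model=mlp_model, X=data, Y=y, K=3, repeated=5)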

        To check the generalization ability of each model, repeated K-fold cross-validation is used (the code above runs five repetitions of 3-fold CV for LightGBM; the other two models can be evaluated the same way, as sketched above). The resulting tables report, for the three models, the mean of the overall goodness of fit (R²) across the folds as well as its spread (the code reports the standard deviation).


More quality content is on the way; please visit my homepage for more.

If you have questions, you can reach me by email: 1736732074@qq.com

My WeChat: TCB1736732074

   Like and follow so you won't lose your way next time!


