04_Machine Learning Competition_Mining Happiness Together

server/2024/10/19 13:34:27/


1. Importing libraries

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error
import lightgbm as lgb
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import KFold, RepeatedKFold
from scipy import sparse
# Show all columns
pd.set_option('display.max_columns', None)
# Show all rows
pd.set_option('display.max_rows', None)
from datetime import datetime

2. Loading the data

# Load the data (abbreviated and complete versions of the train/test sets, plus the submission template)
train_abbr = pd.read_csv("./data/happiness_train_abbr.csv", encoding='ISO-8859-1')
train = pd.read_csv("./data/happiness_train_complete.csv", encoding='ISO-8859-1')
test_abbr = pd.read_csv("./data/happiness_test_abbr.csv", encoding='ISO-8859-1')
test = pd.read_csv("./data/happiness_test_complete.csv", encoding='ISO-8859-1')
test_sub = pd.read_csv("./data/happiness_submit.csv", encoding='ISO-8859-1')

3. Inspecting the data

# Check the size of each dataset
test.shape
(2968, 139)
test_sub.shape
(2968, 2)
train.shape
(8000, 140)
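
The shapes above show that train has one more column than test (140 vs. 139). A quick way to see which columns differ is sketched below. The helper name `column_difference` and the toy frames are hypothetical stand-ins, and it is only an assumption that the extra column in the real data is the target `happiness`.

```python
import pandas as pd


def column_difference(train_df, test_df):
    """Return the columns that appear in only one of the two frames."""
    only_in_train = [c for c in train_df.columns if c not in test_df.columns]
    only_in_test = [c for c in test_df.columns if c not in train_df.columns]
    return only_in_train, only_in_test


# Tiny stand-in frames (not the competition data)
toy_train = pd.DataFrame({"id": [1, 2], "happiness": [4, 5], "gender": [1, 2]})
toy_test = pd.DataFrame({"id": [3], "gender": [1]})

only_train, only_test = column_difference(toy_train, toy_test)
print("only in train:", only_train)
print("only in test:", only_test)
```

On the real data the same call would be `column_difference(train, test)`.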
# Take a quick look at the data
train.head()
(Output: the first five rows of train, 5 rows x 140 columns, from id, happiness, survey_type through public_service_9. The column names are listed in full in the train.info() output.)
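
The survey_time column holds strings such as 2015/8/4 14:18, so pandas reads it as an object column. If it is needed as a timestamp later, one possible conversion is sketched below. The format string is inferred from the sample rows, and the toy frame is a hypothetical stand-in for train.

```python
import pandas as pd

# Stand-in frame using two survey_time strings in the same layout as the sample rows
toy = pd.DataFrame({"survey_time": ["2015/8/4 14:18", "2015/8/10 9:50"]})

# Parse the strings into datetime64 values; the format is an assumption
# based on the rows shown by train.head()
parsed = pd.to_datetime(toy["survey_time"], format="%Y/%m/%d %H:%M")

# Derive simple calendar features from the parsed timestamps
toy["survey_year"] = parsed.dt.year
toy["survey_month"] = parsed.dt.month
toy["survey_hour"] = parsed.dt.hour
print(toy)
```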
# Check for missing values
train.info(verbose=True,show_counts=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8000 entries, 0 to 7999
Data columns (total 140 columns):
 #    Column                Non-Null Count  Dtype  
---   ------                --------------  -----  
 0    id                    8000 non-null   int64  
 1    happiness             8000 non-null   int64  
 2    survey_type           8000 non-null   int64  
 3    province              8000 non-null   int64  
 4    city                  8000 non-null   int64  
 5    county                8000 non-null   int64  
 6    survey_time           8000 non-null   object 
 7    gender                8000 non-null   int64  
 8    birth                 8000 non-null   int64  
 9    nationality           8000 non-null   int64  
 10   religion              8000 non-null   int64  
 11   religion_freq         8000 non-null   int64  
 12   edu                   8000 non-null   int64  
 13   edu_other             3 non-null      object 
 14   edu_status            6880 non-null   float64
 15   edu_yr                6028 non-null   float64
 16   income                8000 non-null   int64  
 17   political             8000 non-null   int64  
 18   join_party            824 non-null    float64
 19   floor_area            8000 non-null   float64
 20   property_0            8000 non-null   int64  
 21   property_1            8000 non-null   int64  
 22   property_2            8000 non-null   int64  
 23   property_3            8000 non-null   int64  
 24   property_4            8000 non-null   int64  
 25   property_5            8000 non-null   int64  
 26   property_6            8000 non-null   int64  
 27   property_7            8000 non-null   int64  
 28   property_8            8000 non-null   int64  
 29   property_other        66 non-null     object 
 30   height_cm             8000 non-null   int64  
 31   weight_jin            8000 non-null   int64  
 32   health                8000 non-null   int64  
 33   health_problem        8000 non-null   int64  
 34   depression            8000 non-null   int64  
 35   hukou                 8000 non-null   int64  
 36   hukou_loc             7996 non-null   float64
 37   media_1               8000 non-null   int64  
 38   media_2               8000 non-null   int64  
 39   media_3               8000 non-null   int64  
 40   media_4               8000 non-null   int64  
 41   media_5               8000 non-null   int64  
 42   media_6               8000 non-null   int64  
 43   leisure_1             8000 non-null   int64  
 44   leisure_2             8000 non-null   int64  
 45   leisure_3             8000 non-null   int64  
 46   leisure_4             8000 non-null   int64  
 47   leisure_5             8000 non-null   int64  
 48   leisure_6             8000 non-null   int64  
 49   leisure_7             8000 non-null   int64  
 50   leisure_8             8000 non-null   int64  
 51   leisure_9             8000 non-null   int64  
 52   leisure_10            8000 non-null   int64  
 53   leisure_11            8000 non-null   int64  
 54   leisure_12            8000 non-null   int64  
 55   socialize             8000 non-null   int64  
 56   relax                 8000 non-null   int64  
 57   learn                 8000 non-null   int64  
 58   social_neighbor       7204 non-null   float64
 59   social_friend         7204 non-null   float64
 60   socia_outing          8000 non-null   int64  
 61   equity                8000 non-null   int64  
 62   class                 8000 non-null   int64  
 63   class_10_before       8000 non-null   int64  
 64   class_10_after        8000 non-null   int64  
 65   class_14              8000 non-null   int64  
 66   work_exper            8000 non-null   int64  
 67   work_status           2951 non-null   float64
 68   work_yr               2951 non-null   float64
 69   work_type             2951 non-null   float64
 70   work_manage           2951 non-null   float64
 71   insur_1               8000 non-null   int64  
 72   insur_2               8000 non-null   int64  
 73   insur_3               8000 non-null   int64  
 74   insur_4               8000 non-null   int64  
 75   family_income         7999 non-null   float64
 76   family_m              8000 non-null   int64  
 77   family_status         8000 non-null   int64  
 78   house                 8000 non-null   int64  
 79   car                   8000 non-null   int64  
 80   invest_0              8000 non-null   int64  
 81   invest_1              8000 non-null   int64  
 82   invest_2              8000 non-null   int64  
 83   invest_3              8000 non-null   int64  
 84   invest_4              8000 non-null   int64  
 85   invest_5              8000 non-null   int64  
 86   invest_6              8000 non-null   int64  
 87   invest_7              8000 non-null   int64  
 88   invest_8              8000 non-null   int64  
 89   invest_other          29 non-null     object 
 90   son                   8000 non-null   int64  
 91   daughter              8000 non-null   int64  
 92   minor_child           6934 non-null   float64
 93   marital               8000 non-null   int64  
 94   marital_1st           7172 non-null   float64
 95   s_birth               6282 non-null   float64
 96   marital_now           6230 non-null   float64
 97   s_edu                 6282 non-null   float64
 98   s_political           6282 non-null   float64
 99   s_hukou               6282 non-null   float64
 100  s_income              6282 non-null   float64
 101  s_work_exper          6282 non-null   float64
 102  s_work_status         2565 non-null   float64
 103  s_work_type           2565 non-null   float64
 104  f_birth               8000 non-null   int64  
 105  f_edu                 8000 non-null   int64  
 106  f_political           8000 non-null   int64  
 107  f_work_14             8000 non-null   int64  
 108  m_birth               8000 non-null   int64  
 109  m_edu                 8000 non-null   int64  
 110  m_political           8000 non-null   int64  
 111  m_work_14             8000 non-null   int64  
 112  status_peer           8000 non-null   int64  
 113  status_3_before       8000 non-null   int64  
 114  view                  8000 non-null   int64  
 115  inc_ability           8000 non-null   int64  
 116  inc_exp               8000 non-null   float64
 117  trust_1               8000 non-null   int64  
 118  trust_2               8000 non-null   int64  
 119  trust_3               8000 non-null   int64  
 120  trust_4               8000 non-null   int64  
 121  trust_5               8000 non-null   int64  
 122  trust_6               8000 non-null   int64  
 123  trust_7               8000 non-null   int64  
 124  trust_8               8000 non-null   int64  
 125  trust_9               8000 non-null   int64  
 126  trust_10              8000 non-null   int64  
 127  trust_11              8000 non-null   int64  
 128  trust_12              8000 non-null   int64  
 129  trust_13              8
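
In the train.info() output, several columns have non-null counts well below 8000 (for example join_party with 824 and work_status with 2951). A compact way to rank columns by how much is missing is sketched below. The helper name `missing_summary` and the toy frame are hypothetical, and this is one possible summary rather than a required step.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """List columns that contain missing values, most affected first."""
    n_missing = df.isnull().sum()
    summary = pd.DataFrame({
        "n_missing": n_missing,
        "pct_missing": (n_missing / len(df) * 100).round(2),
    })
    # Keep only columns that actually have gaps
    summary = summary[summary["n_missing"] > 0]
    return summary.sort_values("n_missing", ascending=False)


# Tiny stand-in frame (not the competition data)
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "edu_yr": [2013.0, np.nan, np.nan, 1959.0],
    "join_party": [np.nan, np.nan, np.nan, 1984.0],
})

report = missing_summary(toy)
print(report)
```

On the real data the call would be `missing_summary(train)`; the resulting table can help decide which columns to fill and which to drop.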

http://www.ppmy.cn/server/28636.html
