Fetching and Processing the Iris Dataset with NumPy

news/2024/11/28 2:52:45/

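This is a repost.

Iris Data Operations with NumPy

I. Lab Preparation

1.1 Lab Overview

This lab uses Python and applies what we have learned about NumPy. By working through it, we learn how to process real data with NumPy and deepen our understanding of the library.

NumPy:

NumPy (short for Numerical Python) is the most important foundational package for numerical computing in Python. Most packages that provide scientific computing functionality use NumPy arrays as their building blocks.

Some of NumPy's features: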

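1. ndarray, a fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting capabilities.
2. Standard mathematical functions for fast operations on entire arrays of data, without writing loops.
3. Tools for reading and writing array data to disk and for working with memory-mapped files.
4. Linear algebra, random number generation, and Fourier transform capabilities.
5. A C API for integrating code written in C, C++, Fortran, and other languages.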
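To make the first two items concrete, here is a minimal sketch of vectorized arithmetic and broadcasting. The small `measurements` array is made up for illustration and is not taken from the lab data.

```python
import numpy as np

# Hypothetical sample: three flowers, two measurements each (length, width in cm).
measurements = np.array([[5.1, 3.5],
                         [4.9, 3.0],
                         [4.7, 3.2]])

# Vectorized arithmetic: one expression converts every element to millimetres.
in_mm = measurements * 10

# Broadcasting: the (2,) vector of column means is stretched across all 3 rows.
col_means = measurements.mean(axis=0)
centered = measurements - col_means

print(in_mm.shape, centered.shape)
print(np.allclose(centered.mean(axis=0), 0.0))
```

No Python loop appears anywhere: both the scaling and the centering are applied to the whole array at once.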

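Because NumPy provides a simple, easy-to-use C API, it is straightforward to pass data to external libraries written in a low-level language, and for those libraries to return data to Python as NumPy arrays. This has made Python a language of choice for wrapping legacy C/C++/Fortran codebases and giving them a dynamic, easy-to-use interface.

NumPy itself does not provide much high-level data analysis functionality, but understanding NumPy arrays and array-oriented computing will help you use tools such as pandas far more effectively. Because NumPy is a large topic, I will cover more advanced NumPy features, such as broadcasting, in Appendix A.

For most data analysis applications, the functionality we care about most is: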

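1. Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and so on.
2. Common array algorithms such as sorting, unique, and set operations.
3. Efficient descriptive statistics and data aggregation/summarization.
4. Data alignment and relational data manipulations for merging and joining heterogeneous datasets.
5. Expressing conditional logic as array expressions instead of loops with if-elif-else branches.
6. Group-wise data manipulations (aggregation, transformation, function application).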
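Several of these items can be sketched in a few lines. The arrays below are hypothetical values chosen for illustration; the calls used (`np.where`, `np.sort`, `np.unique`, boolean indexing) are standard NumPy functions.

```python
import numpy as np

# Hypothetical petal lengths (cm) and species labels, made up for illustration.
petal_length = np.array([1.4, 4.7, 6.0, 1.3, 5.1, 4.5])
species = np.array(['setosa', 'versicolor', 'virginica',
                    'setosa', 'virginica', 'versicolor'])

# Conditional logic as an array expression instead of an if/else loop.
size_label = np.where(petal_length < 3.0, 'short', 'long')

# Sorting and uniqueness, with a count per distinct label.
sorted_lengths = np.sort(petal_length)
labels, counts = np.unique(species, return_counts=True)

# Boolean filtering: keep only the values that belong to one species.
virginica_lengths = petal_length[species == 'virginica']

print(size_label)
print(sorted_lengths)
print(labels, counts)
print(virginica_lengths.mean())
```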

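1.2 Lab Objectives

  • Become familiar with various kinds of data files
  • Master flexible use of NumPy's methods
  • Learn how to process real data with NumPy
  • Learn the workflow for processing real data with NumPy

1.3 Lab Environment

Environment: Python 3.6 or later, NumPy, Jupyter Notebook, and Google Chrome or IE.

II. Lab Steps

2.1 Reading the Data

NumPy can read and write text or binary data on disk. In this lab we load the Iris dataset directly. It contains the sepal length and width, the petal length and width, and the species of each iris flower. We first import the data straight from the web (this can be slow; afterwards we use a downloaded copy) and extract one column from the resulting array of tuples.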

    import numpy as np

    url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
    iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)
    print(iris_1d)

    # Extract one column
    species = np.array([row[4] for row in iris_1d])
    print(species[:5])

Partial output:

[(5.1, 3.5, 1.4, 0.2, b'Iris-setosa') (4.9, 3. , 1.4, 0.2, b'Iris-setosa')
 (4.7, 3.2, 1.3, 0.2, b'Iris-setosa') (4.6, 3.1, 1.5, 0.2, b'Iris-setosa')
 (5. , 3.6, 1.4, 0.2, b'Iris-setosa') (5.4, 3.9, 1.7, 0.4, b'Iris-setosa')
 (142 further records omitted here; the array holds 150 records in total)
 (6.2, 3.4, 5.4, 2.3, b'Iris-virginica')
 (5.9, 3. , 5.1, 1.8, b'Iris-virginica')]
[b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'
 b'Iris-setosa']
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:4: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
  after removing the cwd from sys.path.
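The deprecation warning in the output asks for an explicit `encoding` argument. A minimal sketch of that, assuming an inline three-row stand-in for the file (the `sample` text is hypothetical and only mimics the layout of iris.data): on NumPy versions that emit the warning, passing `encoding` should silence it, and the species column then comes back as `str` rather than `bytes`.

```python
import io
import numpy as np

# A few rows in the same comma-separated layout as iris.data (inline stand-in for the file).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# Passing encoding explicitly yields str values instead of bytes for the text column.
iris = np.genfromtxt(sample, delimiter=',', dtype=None, encoding='utf-8')

# With dtype=None the result is a 1-D structured array; fields are named f0..f4 by default.
species = iris['f4']
print(iris.shape)
print(species)
```

Selecting the field by name (`iris['f4']`) is an alternative to the list comprehension over rows used in the listing above.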

Convert the 1-D array of tuples into a 2-D NumPy array

    import numpy as np

    iris_1d = np.genfromtxt('iris.data', delimiter=',', dtype=None)

    # Method 1: convert each row to a list and keep the first 4 items
    iris_2d = np.array([row.tolist()[:4] for row in iris_1d])
    # Print the first 5 rows of the resulting 2-D NumPy array
    print(iris_2d[:5])

    # Method 2: import only the first 4 columns from the source
    iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
    # Print the first 5 rows of the resulting 2-D NumPy array
    print(iris_2d[:5])

Output:

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
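The conversion is needed because `dtype=None` produces one record per row, so the array is 1-D. The sketch below shows the shapes involved and a third possible route via the record field names. It assumes an inline three-row stand-in for iris.data (the `text` string is hypothetical) and is not one of the two methods in the listing above.

```python
import io
import numpy as np

# Inline stand-in for iris.data (three rows only).
text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)

# dtype=None: one record per row, so the array is 1-D with named fields.
records = np.genfromtxt(io.StringIO(text), delimiter=',', dtype=None, encoding='utf-8')

# Stack the four numeric fields side by side to obtain a plain 2-D float array.
numeric_fields = records.dtype.names[:4]
iris_2d = np.column_stack([records[name] for name in numeric_fields])

print(records.shape, iris_2d.shape)
print(iris_2d.dtype)
```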

Compute the mean, median, and standard deviation of the iris sepal length (column 1)

    import numpy as np

    # First extract the column to compute on
    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)
    print(mu, med, sd)

Output:

5.843333333333334 5.8 0.8253012917851409
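Note that `np.std` divides by N by default (`ddof=0`), so the value printed above is the population standard deviation. The sketch below contrasts it with the sample standard deviation (`ddof=1`) on a small hypothetical array `x`, and checks both against their definitions.

```python
import numpy as np

# Hypothetical sample of sepal lengths (cm), not the full dataset.
x = np.array([5.1, 4.9, 4.7, 4.6, 5.0])

mean = np.mean(x)
median = np.median(x)

# np.std divides by N by default (ddof=0, population standard deviation).
population_sd = np.std(x)
# ddof=1 divides by N - 1 (sample standard deviation).
sample_sd = np.std(x, ddof=1)

# The same quantities written out from their definitions.
manual_population_sd = np.sqrt(np.mean((x - mean) ** 2))
manual_sample_sd = np.sqrt(np.sum((x - mean) ** 2) / (len(x) - 1))

print(mean, median, population_sd, sample_sd)
```

Which divisor is appropriate depends on whether the data are treated as the whole population or as a sample from it.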

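2.2 Normalizing an Array

Can we normalize an array in NumPy so that its values lie exactly between 0 and 1? The answer is of course yes.

    import numpy as np

    # Build a normalized version of the iris sepal length whose values lie between 0 and 1,
    # so that the minimum maps to 0 and the maximum maps to 1
    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    Smax, Smin = sepallength.max(), sepallength.min()
    S = (sepallength - Smin) / (Smax - Smin)
    print(S)

    # Alternatively, np.ptp() returns max - min
    # (the ndarray.ptp() method was removed in NumPy 2.0, so the function form is used here)
    S = (sepallength - Smin) / np.ptp(sepallength)
    print(S)

Output (values shown rounded to two decimal places; both prints give the same array):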

[0.22 0.17 0.11 0.08 0.19 0.31 0.08 0.19 0.03 0.17 0.31 0.14 0.14 0. 0.42 0.39 0.31 0.22 0.39 0.22 0.31 0.22 0.08 0.22 0.14 0.19 0.19 0.25 0.25 0.11 0.14 0.31 0.25 0.33 0.17 0.19 0.33 0.17 0.03 0.22 0.19 0.06 0.03 0.19 0.22 0.14 0.22 0.08 0.28 0.19 0.75 0.58 0.72 0.33 0.61 0.39 0.56 0.17 0.64 0.25 0.19 0.44 0.47 0.5 0.36 0.67 0.36 0.42 0.53 0.36 0.44 0.5 0.56 0.5 0.58 0.64 0.69 0.67 0.47 0.39 0.33 0.33 0.42 0.47 0.31 0.47 0.67 0.56 0.36 0.33 0.33 0.5 0.42 0.19 0.36 0.39 0.39 0.53 0.22 0.39 0.56 0.42 0.78 0.56 0.61 0.92 0.17 0.83 0.67 0.81 0.61 0.58 0.69 0.39 0.42 0.58 0.61 0.94 0.94 0.47 0.72 0.36 0.94 0.56 0.67 0.81 0.53 0.5 0.58 0.81 0.86 1. 0.58 0.56 0.5 0.94 0.56 0.58 0.47 0.72 0.67 0.72 0.42 0.69 0.67 0.67 0.56 0.61 0.53 0.44] [0.22 0.17 0.11 0.08 0.19 0.31 0.08 0.19 0.03 0.17 0.31 0.14 0.14 0. 0.42 0.39 0.31 0.22 0.39 0.22 0.31 0.22 0.08 0.22 0.14 0.19 0.19 0.25 0.25 0.11 0.14 0.31 0.25 0.33 0.17 0.19 0.33 0.17 0.03 0.22 0.19 0.06 0.03 0.19 0.22 0.14 0.22 0.08 0.28 0.19 0.75 0.58 0.72 0.33 0.61 0.39 0.56 0.17 0.64 0.25 0.19 0.44 0.47 0.5 0.36 0.67 0.36 0.42 0.53 0.36 0.44 0.5 0.56 0.5 0.58 0.64 0.69 0.67 0.47 0.39 0.33 0.33 0.42 0.47 0.31 0.47 0.67 0.56 0.36 0.33 0.33 0.5 0.42 0.19 0.36 0.39 0.39 0.53 0.22 0.39 0.56 0.42 0.78 0.56 0.61 0.92 0.17 0.83 0.67 0.81 0.61 0.58 0.69 0.39 0.42 0.58 0.61 0.94 0.94 0.47 0.72 0.36 0.94 0.56 0.67 0.81 0.53 0.5 0.58 0.81 0.86 1. 0.58 0.56 0.5 0.94 0.56 0.58 0.47 0.72 0.67 0.72 0.42 0.69 0.67 0.67 0.56 0.61 0.53 0.44]
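The min-max formula divides by the range, which is zero when every value is equal. A minimal sketch of a reusable helper that guards against that case follows; the function name `min_max_normalize` and the choice to return zeros for constant input are assumptions made for illustration, not part of the original lab.

```python
import numpy as np

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    value_range = np.ptp(values)  # max - min
    if value_range == 0:
        # Constant input: the formula would divide by zero, so return zeros (an assumed convention).
        return np.zeros_like(values)
    return (values - values.min()) / value_range

# Hypothetical sample values for illustration.
scaled = min_max_normalize([4.3, 5.8, 7.9])
print(scaled)
print(min_max_normalize([2.0, 2.0, 2.0]))
```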

Find percentiles of a NumPy array

    import numpy as np

    sepallength = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0])
    print(np.percentile(sepallength, q=[5, 95]))
    # [4.6   7.255]

Output:

[4.6   7.255]
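The 95th percentile is not one of the measured values because `np.percentile` interpolates linearly by default: with 150 values, the 95th percentile sits at fractional index 0.95 × 149 = 141.55 of the sorted data. The sketch below spells out that interpolation in a hypothetical helper, `percentile_linear`, and compares it with `np.percentile` on made-up sample values.

```python
import numpy as np

def percentile_linear(values, q):
    """Percentile by linear interpolation between the two nearest sorted values."""
    ordered = np.sort(np.asarray(values, dtype=float))
    position = q / 100 * (len(ordered) - 1)   # fractional index into the sorted data
    lower = int(np.floor(position))
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

# Hypothetical sample values for illustration.
data = [4.4, 5.0, 5.8, 6.4, 7.9]
for q in (5, 50, 95):
    print(q, percentile_linear(data, q), np.percentile(data, q))
```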
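To see what `np.percentile` does with its default linear interpolation, the sketch below recomputes two percentiles by hand. The array `data` and the helper `percentile_linear` are made up for illustration and are not part of the lab.

```python
import numpy as np

def percentile_linear(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = np.sort(np.asarray(values, dtype=float))
    # Fractional position of the percentile in the sorted array.
    pos = q / 100 * (len(ordered) - 1)
    lower = int(np.floor(pos))
    upper = int(np.ceil(pos))
    # Interpolate between the two neighbouring sorted values.
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

data = np.array([4.3, 4.9, 5.0, 5.8, 6.4, 7.9])  # made-up values
manual = [percentile_linear(data, q) for q in (5, 95)]
builtin = np.percentile(data, q=[5, 95])
print(manual, builtin)
```

The two results agree because the default method places the q-th percentile at the fractional index q/100 × (n − 1) of the sorted data.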

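Insert values at random positions in an array

import numpy as np

# Insert np.nan at 20 random positions of the iris_2d dataset
iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='object')

# Method 1
np.random.seed(100)
# i, j hold the row and column indices of every truthy element of iris_2d
# (here every cell, including the species column)
i, j = np.where(iris_2d)
iris_2d[np.random.choice(i, 20), np.random.choice(j, 20)] = np.nan
print(iris_2d[:10])

# Method 2: applied on top of Method 1, restricted to the four numeric columns
np.random.seed(100)
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
print(iris_2d[:10])

Output: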

[[b'5.1' b'3.5' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.0' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']
 [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']
 [b'5.0' b'3.6' b'1.4' b'0.2' b'Iris-setosa']
 [b'5.4' b'3.9' b'1.7' b'0.4' b'Iris-setosa']
 [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']
 [b'5.0' b'3.4' b'1.5' b'0.2' b'Iris-setosa']
 [b'4.4' b'2.9' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]
[[b'5.1' b'3.5' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.0' b'1.4' b'0.2' b'Iris-setosa']
 [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']
 [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']
 [b'5.0' b'3.6' b'1.4' b'0.2' b'Iris-setosa']
 [b'5.4' b'3.9' b'1.7' b'0.4' b'Iris-setosa']
 [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']
 [b'5.0' b'3.4' b'1.5' b'0.2' b'Iris-setosa']
 [b'4.4' nan b'1.4' b'0.2' b'Iris-setosa']
 [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]
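Both methods draw the row and column indices independently and with replacement, so the same cell can be picked twice and fewer than 20 cells may end up as NaN. A minimal sketch that guarantees exactly 20 distinct cells follows. It assumes a synthetic 150×4 float array in place of the file and uses NumPy's `default_rng` generator, neither of which appears in the original lab.

```python
import numpy as np

rng = np.random.default_rng(100)
# Stand-in for the four numeric iris columns (synthetic, not the real data).
data = rng.uniform(0.1, 8.0, size=(150, 4))

# Sample 20 distinct flat indices, then convert them to (row, col) pairs.
flat_idx = rng.choice(data.size, size=20, replace=False)
rows, cols = np.unravel_index(flat_idx, data.shape)
data[rows, cols] = np.nan

print(np.isnan(data).sum())  # 20
```

Sampling flat indices without replacement is what rules out duplicates; `np.unravel_index` only translates them back into 2-D coordinates.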

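NumPy can also locate the missing values in an array

import numpy as np

# Count and locate the missing values in the sepal length column (column 1) of iris_2d
iris_2d = np.genfromtxt('iris.data', delimiter=',', usecols=(0, 1, 2, 3), dtype=float)
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
print("Number of missing values: \n", np.isnan(iris_2d[:, 0]).sum())
print("Positions of missing values: \n", np.where(np.isnan(iris_2d[:, 0])))

Output (no seed is set, so the count and positions change from run to run):

Number of missing values: 2
Positions of missing values: (array([36, 56]),)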
A second run of the same code printed:
Number of missing values: 5
Positions of missing values: (array([ 38,  80, 106, 113, 121]),)
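The same `np.isnan` mask extends to the whole array. As an illustration on a small hand-made array (not the iris data), summing along `axis=0` counts the missing values per column, and `np.argwhere` lists every (row, column) position.

```python
import numpy as np

a = np.array([[5.1, np.nan, 1.4, 0.2],
              [np.nan, 3.0, 1.4, 0.2],
              [4.7, 3.2, np.nan, np.nan]])

mask = np.isnan(a)
per_column = mask.sum(axis=0)   # missing values in each column
positions = np.argwhere(mask)   # one (row, col) pair per missing value

print(per_column)          # [1 1 1 1]
print(positions.tolist())  # [[0, 1], [1, 0], [2, 2], [2, 3]]
```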

Remove rows that contain missing values from a NumPy array

import numpy as np

iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

# Method 1: ~ negates the result, so the mask is True for rows without any NaN
any_nan_in_row = np.array([~np.any(np.isnan(row)) for row in iris_2d])
# iris_2d[any_nan_in_row] is the array with the NaN-containing rows dropped
print(iris_2d[any_nan_in_row][:5])

# Method 2: np.isnan(iris_2d) is a boolean array; a row sum of 0 means no NaN in that row
print(iris_2d[np.sum(np.isnan(iris_2d), axis=1) == 0][:5])

Output:

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]]
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]]
A second run, with different random positions, printed:
[[4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [4.6 3.4 1.4 0.3]]
[[4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [4.6 3.4 1.4 0.3]]
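Method 1 loops over the rows in Python; the same mask can be built in one vectorized call with `any(axis=1)`. A small sketch on a hand-made array (illustrative data only):

```python
import numpy as np

a = np.array([[5.1, 3.5, 1.4, 0.2],
              [4.9, np.nan, 1.4, 0.2],
              [4.7, 3.2, 1.3, 0.2],
              [np.nan, 3.1, 1.5, np.nan]])

# True for rows that contain no NaN at all.
complete_rows = ~np.isnan(a).any(axis=1)
cleaned = a[complete_rows]

print(complete_rows)  # [ True False  True False]
print(cleaned.shape)  # (2, 4)
```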

Find the correlation between two columns of a NumPy array

import numpy as np
from scipy.stats import pearsonr

# Correlation between SepalLength (column 1) and PetalLength (column 3) of iris_2d
iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])

# Solution 1
print(np.corrcoef(iris_2d[:, 0], iris_2d[:, 2])[0, 1])
# 0.8717541573048718

# Solution 2
corr, p_value = pearsonr(iris_2d[:, 0], iris_2d[:, 2])
print(corr)
# 0.8717541573048713

Output:

0.8717541573048718
0.8717541573048714
A second run printed slightly different trailing digits:
0.871754157304871
0.8717541573048713
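Both solutions return the Pearson correlation coefficient: the sum of the products of the centred values, divided by the square root of the product of their sums of squares. The sketch below checks that definition against `np.corrcoef` on two made-up vectors (the values are illustrative, not taken from the iris file).

```python
import numpy as np

x = np.array([5.1, 4.9, 6.3, 5.8, 7.1])  # made-up sepal lengths
y = np.array([1.4, 1.4, 4.9, 4.0, 5.9])  # made-up petal lengths

# Centre both vectors, then divide the cross-product by the product of norms.
dx = x - x.mean()
dy = y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)
```

The two values should agree up to floating-point rounding, which is also why the last digits of the results above are not identical.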

Check whether a given array contains any missing values

import numpy as np

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
print(np.isnan(iris_2d).any())
# False

Output:

False
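`np.isnan` is needed here because NaN never compares equal to anything, itself included, so an `==` test would miss every missing value. A short illustration on a made-up array:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])

print((a == np.nan).any())  # False: equality never matches NaN
print(np.isnan(a).any())    # True
```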

Count the unique values in a NumPy array

import numpy as np

# Find the distinct iris species and how many times each occurs
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
species = np.array([row.tolist()[4] for row in iris])
print(species)
u, counts = np.unique(species, return_counts=True)
print(u, counts)

Output:

[b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-setosa' b'Iris-setosa'b'Iris-setosa' b'Iris-setosa' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-versicolor' b'Iris-versicolor' b'Iris-versicolor'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' 
b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica' b'Iris-virginica' b'Iris-virginica'b'Iris-virginica' b'Iris-virginica']
[b'Iris-setosa' b'Iris-versicolor' b'Iris-virginica'] [50 50 50]
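
The same call can be tried without a network connection. The sketch below is only an illustration: the `labels` array is made up and is not taken from the iris file.

```python
import numpy as np

# Hypothetical toy labels standing in for the species column
labels = np.array(['setosa', 'virginica', 'setosa',
                   'versicolor', 'setosa', 'virginica'])

# The unique values come back sorted; counts[i] is how often values[i] occurs
values, counts = np.unique(labels, return_counts=True)

for value, count in zip(values, counts):
    print(value, count)
```

Because the two returned arrays are aligned, `zip(values, counts)` pairs each label with its frequency.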

Converting numeric values to a categorical (text) array

import numpy as np

# Bin the petal length (the 3rd column of iris) into a text array:
#   less than 3 --> 'small'
#   3 to 5      --> 'medium'
#   5 or more   --> 'large'
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# The bins [0, 3, 5, 10] define the intervals [0, 3), [3, 5), [5, 10) and [10, +inf);
# np.digitize returns, for each element, the index (1 to 4) of the interval it falls in
petallength = np.digitize(iris[:, 2].astype('float'), [0, 3, 5, 10])
# print(petallength)
label_map = {1: 'small', 2: 'medium', 3: 'large', 4: np.nan}
petallength2 = [label_map[x] for x in petallength]
print(petallength2)

The output is as follows:

['small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'small', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'medium', 'large', 'large', 'medium', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'medium', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large', 'large']
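
To see how values that sit exactly on a bin edge are handled, here is a small sketch with made-up lengths (the `lengths` and `names` arrays are assumptions for illustration). It also swaps the dictionary lookup for NumPy fancy indexing, which is one possible alternative rather than the method used above.

```python
import numpy as np

# Hypothetical petal lengths chosen to hit every bin, including the edges
lengths = np.array([1.4, 2.9, 3.0, 4.7, 5.0, 6.9])

# With the default right=False, np.digitize returns i such that
# bins[i-1] <= x < bins[i], so an edge value belongs to the bin on its right
bin_index = np.digitize(lengths, [0, 3, 5, 10])

# Index 0 (values below 0) is unused here; indices 1..3 map to the labels
names = np.array(['', 'small', 'medium', 'large'])
labels = names[bin_index]

print(bin_index)
print(labels)
```

Note that 3.0 lands in 'medium' and 5.0 in 'large', matching the half-open intervals described in the comments.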

Sorting a 2D array by a column

import numpy as np

# Sort the dataset by the sepallength column (the 1st column)
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
print(iris[iris[:, 0].argsort()])

This prints the whole array with its rows reordered so that the first column (sepal length) is in ascending order.
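
The row-reordering idiom is easier to follow on a tiny numeric table. The sketch below uses a made-up `table` array (an assumption, not iris data). Since the iris array holds byte strings, its sort is lexicographic; for single-digit values such as b'4.3' to b'7.9' that coincides with numeric order, but converting the key column with `astype(float)` is the safer habit in general.

```python
import numpy as np

# Hypothetical 4x3 table; column 0 is the sort key
table = np.array([[5.1, 3.5, 1.4],
                  [4.9, 3.0, 1.4],
                  [6.3, 3.3, 6.0],
                  [4.6, 3.1, 1.5]])

# argsort returns the row order that sorts column 0 ascending;
# indexing with it reorders whole rows, keeping each row intact
order = table[:, 0].argsort()
sorted_table = table[order]

print(order)
print(sorted_table)
```

For a descending sort, the same index array can be reversed with `order[::-1]`.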

Finding the most frequent value in a NumPy array

import numpy as np

# Find the most frequent petal length (the 3rd column) in the iris dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
vals, counts = np.unique(iris[:, 2], return_counts=True)
# print(np.argmax(counts))  # index of the largest count
print(vals[np.argmax(counts)])

The output is as follows:

b'1.5'
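
A minimal sketch of the same unique-then-argmax pattern on made-up numbers (the array `x` is an assumption for illustration):

```python
import numpy as np

# Hypothetical measurements; 1.5 occurs most often
x = np.array([1.4, 1.5, 1.3, 1.5, 1.4, 1.5, 4.7])

vals, counts = np.unique(x, return_counts=True)

# argmax gives the index of the largest count; on a tie it returns the
# first such index, which is the smallest tied value because vals is sorted
mode = vals[np.argmax(counts)]
print(mode)
```

The tie-breaking rule matters when two values are equally common: only the smaller one is reported.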

Finding the position of the first value greater than a given value

import numpy as np

# Find the position of the first value greater than 1.0 in the 4th column (petalwidth).
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
print(np.argwhere(iris[:, 3].astype(float) > 1.0)[0])
# print(np.argwhere(iris[:, 3].astype(float) > 1.0))  # returns an (N, 1) array of indices
# print(np.where(iris[:, 3].astype(float) > 1.0))  # returns a tuple containing one index array

The output is as follows:

[50]
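
The difference between the return shapes of `np.argwhere` and `np.where` can be checked on a short made-up array (`widths` is an assumption for illustration). The sketch also shows `np.argmax` on the boolean mask as a third option, with the caveat noted in the comments.

```python
import numpy as np

# Hypothetical petal widths
widths = np.array([0.2, 0.4, 1.0, 1.4, 0.3, 1.8])
mask = widths > 1.0

# np.argwhere returns an (N, 1) array of indices for 1D input
first_argwhere = np.argwhere(mask)[0, 0]

# np.where(mask) returns a tuple holding one index array
first_where = np.where(mask)[0][0]

# np.argmax on a boolean mask returns the first True, but it also returns 0
# when nothing matches, so check mask.any() before trusting the result
first_argmax = int(np.argmax(mask)) if mask.any() else None

print(first_argwhere, first_where, first_argmax)
```

All three approaches agree whenever at least one element satisfies the condition.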

Replacing all values beyond a given cutoff with that cutoff

import numpy as np

# In array a, replace every element greater than 30 with 30 and every element less than 10 with 10.
np.set_printoptions(precision=2)
np.random.seed(100)
# Generate 20 random floats uniformly distributed in [1, 50)
a = np.random.uniform(1, 50, 20)
print(a)
# [27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
#  10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]

# Solution 1: Using np.clip
print(np.clip(a, a_min=10, a_max=30))
# [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
#  10.08 10.   11.77 30.   30.   10.   30.   14.43]

# Solution 2: Using np.where
print(np.where(a < 10, 10, np.where(a > 30, 30, a)))
# [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
#  10.08 10.   11.77 30.   30.   10.   30.   14.43]

The three printed arrays are as follows:

[27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
 10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
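
For comparison, the sketch below applies the same cutoffs to a short made-up array (`a` here is an assumption, not the seeded random array) and adds two further equivalent forms: nested `np.minimum`/`np.maximum`, and in-place boolean-mask assignment on a copy. These are offered as alternatives, not as part of the original solutions.

```python
import numpy as np

# Hypothetical values straddling both cutoffs
a = np.array([27.6, 42.4, 1.2, 10.0, 30.0, 9.4])

# Clamp everything into the closed interval [10, 30]
clipped = np.clip(a, 10, 30)

# Equivalent form: cap from above with np.minimum, then floor with np.maximum
capped = np.maximum(np.minimum(a, 30), 10)

# In-place variant using boolean masks, applied to a copy so a is unchanged
b = a.copy()
b[b < 10] = 10
b[b > 30] = 30

print(clipped)
```

Values already equal to a cutoff (10.0 and 30.0 here) pass through unchanged in all three forms.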
In [16]:
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># Write your code here</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#008000"><strong>import</strong></span> <span style="color:#000000">numpy</span> <span style="color:#008000"><strong>as</strong></span> <span style="color:#000000">np</span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># In array a, replace every element greater than 30 with 30 and every element less than 10 with 10.</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#000000">np</span>.set_printoptions(<span style="color:#000000">precision</span><span style="color:#aa22ff"><strong>=</strong></span><span style="color:#008800">2</span>)</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#000000">np</span>.random.seed(<span style="color:#008800">100</span>)</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># Generate 20 random floats uniformly distributed in [1, 50)</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#000000">a</span> <span style="color:#aa22ff"><strong>=</strong></span> <span style="color:#000000">np</span>.random.uniform(<span style="color:#008800">1</span>,<span style="color:#008800">50</span>, <span style="color:#008800">20</span>)</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#008000">print</span>(<span style="color:#000000">a</span>)</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># [27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em>#  10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># Solution 1: Using np.clip</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#008000">print</span>(<span style="color:#000000">np</span>.clip(<span style="color:#000000">a</span>, <span style="color:#000000">a_min</span><span style="color:#aa22ff"><strong>=</strong></span><span style="color:#008800">10</span>, <span style="color:#000000">a_max</span><span style="color:#aa22ff"><strong>=</strong></span><span style="color:#008800">30</span>))</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># [27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em>#  10.08 10.   11.77 30.   30.   10.   30.   14.43]</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># Solution 2: Using np.where</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#008000">print</span>(<span style="color:#000000">np</span>.where(<span style="color:#000000">a</span> <span style="color:#aa22ff"><strong><</strong></span> <span style="color:#008800">10</span>, <span style="color:#008800">10</span>, <span style="color:#000000">np</span>.where(<span style="color:#000000">a</span> <span style="color:#aa22ff"><strong>></strong></span> <span style="color:#008800">30</span>, <span style="color:#008800">30</span>, <span style="color:#000000">a</span>)))</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em># </em></span><span style="color:#00bb00"><em>[</em></span><span style="color:#408080"><em>27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit"><span style="color:#408080"><em>#  10.08 10.   11.77 30.   30.   10.   30.   14.43</em></span><span style="color:#00bb00"><em>]</em></span></span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
<span style="background-color:#ffffff"><span style="color:#000000"><span style="background-color:#f7f7f7"><span style="color:black"><span style="color:inherit">​</span></span></span></span></span>
[27.63 14.64 21.8  42.39  1.23  6.96 33.87 41.47  7.7  29.18 44.67 11.25
 10.08  6.31 11.77 48.95 40.77  9.43 41.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
[27.63 14.64 21.8  30.   10.   10.   30.   30.   10.   29.18 30.   11.25
 10.08 10.   11.77 30.   30.   10.   30.   14.43]
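
As a quick check that the two solutions are interchangeable, the sketch below compares `np.clip` with the nested `np.where` on a seeded array. The seed and the names `clipped` and `where_based` are assumptions made for illustration; they are not part of the original exercise.

```python
import numpy as np

np.random.seed(0)  # assumed seed, used only so the run is repeatable
a = np.random.uniform(1, 50, 20)

# Solution 1: clamp every value into the interval [10, 30]
clipped = np.clip(a, 10, 30)

# Solution 2: values below 10 become 10, values above 30 become 30
where_based = np.where(a < 10, 10, np.where(a > 30, 30, a))

# Both approaches should produce exactly the same array
same = np.array_equal(clipped, where_based)
print(same)
```

`np.clip` states the intent in one call, while the nested `np.where` generalizes more easily when the replacement values differ from the thresholds.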

Create group IDs from a given categorical variable

import numpy as np

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=[4])
np.random.seed(100)  # fix the random seed
species_small = np.sort(np.random.choice(species, size=20))  # draw 20 labels, then sort

# Method 1: list comprehension
# output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

# Method 2: explicit loops
output = []
uniqs = np.unique(species_small)
for val in uniqs:  # each unique value (group)
    for s in species_small[species_small == val]:  # each element in that group
        groupid = np.argwhere(uniqs == s).tolist()[0][0]  # ID of the group
        output.append(groupid)
print(output)

Output:

[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
In [17]:
# Write your code here
import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=[4])
np.random.seed(100)  # fix the random seed
species_small = np.sort(np.random.choice(species, size=20))  # draw 20 labels, then sort

# Method 1: list comprehension
# output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

# Method 2: explicit loops
output = []
uniqs = np.unique(species_small)
for val in uniqs:  # each unique value (group)
    for s in species_small[species_small == val]:  # each element in that group
        groupid = np.argwhere(uniqs == s).tolist()[0][0]  # ID of the group
        output.append(groupid)
print(output)

[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
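
The same group IDs can also be obtained without Python loops. The sketch below uses the `return_inverse` option of `np.unique`; it is an alternative technique, not the method used in the exercise, and the five-element `species_small` array is a made-up stand-in for the sampled data.

```python
import numpy as np

# Hypothetical sorted sample standing in for species_small
species_small = np.array(['Iris-setosa', 'Iris-setosa', 'Iris-versicolor',
                          'Iris-virginica', 'Iris-virginica'])

# return_inverse yields, for each element, the index of its value in uniqs
uniqs, group_ids = np.unique(species_small, return_inverse=True)
print(uniqs)
print(group_ids.tolist())
```

Because `species_small` is sorted, the IDs come out in the same order as in the loop version. For unsorted input, `return_inverse` keeps the original element order, whereas the nested loops emit the IDs grouped by unique value.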

III. Experiment Summary

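This experiment had two goals. The first was to become more fluent with NumPy. The second, and more important, was to get a first sense of what data processing involves by applying NumPy to the iris dataset in a variety of ways. Many of the methods used here will come up again and again in later data analysis work.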

IV. Questions and Exercises

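In this experiment we used NumPy to process the iris data, covering correlation, normalization, missing-value handling, and more. Can you think of any other processing methods? Try them out and verify the results.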

Summary:

1. Learned what the NumPy library provides

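1. ndarray, a fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting.
2. Standard mathematical functions for fast operations on entire arrays, with no need to write loops.
3. Tools for reading and writing data on disk and for working with memory-mapped files.
4. Linear algebra, random number generation, and Fourier transform capabilities.
5. A C API for integrating code written in C, C++, Fortran, and other languages.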

2. Learned how to load and process data

Core function 1: np.genfromtxt reads data directly from a URL.

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

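Core function 2: np.array() creates an array. The code extracts the fifth column (index 4, the species label) from iris_1d and prints the first five values.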

species = np.array([row[4] for row in iris_1d])
print(species[:5])
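
To try core functions 1 and 2 without a network connection, the sketch below feeds `np.genfromtxt` an in-memory text buffer instead of the URL. The three rows in `text` are a small stand-in in the iris.data layout, and passing `encoding='utf-8'` is an assumption made so that the label column is read as ordinary strings.

```python
import io
import numpy as np

# Three sample rows in the iris.data layout (stand-in for the download)
text = ("5.1,3.5,1.4,0.2,Iris-setosa\n"
        "7.0,3.2,4.7,1.4,Iris-versicolor\n"
        "6.3,3.3,6.0,2.5,Iris-virginica\n")

# dtype=None lets genfromtxt infer a type per column,
# which yields a 1D structured array with one record per row
iris_1d = np.genfromtxt(io.StringIO(text), delimiter=',',
                        dtype=None, encoding='utf-8')
print(iris_1d.shape)

# Pull the fifth field (index 4) out of every record
species = np.array([row[4] for row in iris_1d])
print(species)
```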

Core function 3: row.tolist()[:4] takes the first four columns of each row.

iris_2d = np.array([row.tolist()[:4] for row in iris_1d])

Core function 4: usecols=[0, 1, 2, 3] reads only the first four columns from the data source.

iris_2d = np.genfromtxt('iris.data', delimiter=',', dtype='float', usecols=[0, 1, 2, 3])
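
As a small illustration of `usecols`, the sketch below reads two sample rows from an in-memory buffer rather than from the `iris.data` file. The buffer contents are a stand-in for the real file.

```python
import io
import numpy as np

# Two sample rows in the iris.data layout (stand-in for the file)
text = ("5.1,3.5,1.4,0.2,Iris-setosa\n"
        "7.0,3.2,4.7,1.4,Iris-versicolor\n")

# Skip the text label in column 4 and read the numeric columns as floats
iris_2d = np.genfromtxt(io.StringIO(text), delimiter=',',
                        dtype='float', usecols=[0, 1, 2, 3])
print(iris_2d.shape)
print(iris_2d)
```

Selecting columns at load time avoids building the structured array first and converting it row by row afterwards.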

Core function 5: np.mean(), np.median(), and np.std() compute the mean, median, and standard deviation of a given variable.

mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)
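
A minimal sketch of these three statistics on a handful of values follows. The five numbers in `sepallength` are assumed sample values, and `sd_sample` is an extra name introduced here to show the `ddof` option.

```python
import numpy as np

sepallength = np.array([5.1, 4.9, 4.7, 4.6, 5.0])  # assumed sample values

mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)

# np.std uses the population formula by default (ddof=0);
# ddof=1 gives the sample standard deviation instead
sd_sample = np.std(sepallength, ddof=1)
print(mu, med, sd, sd_sample)
```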

Core function 6: .max() and .min() return the maximum and minimum of the data, which can be used to normalize an array.

Smax, Smin = sepallength.max(), sepallength.min()
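
One common use of the minimum and maximum is min-max normalization, which rescales the values to the range 0 to 1. The sketch below assumes a small sample array; `S` is the rescaled result.

```python
import numpy as np

sepallength = np.array([5.1, 4.9, 4.7, 4.6, 5.0])  # assumed sample values

Smax, Smin = sepallength.max(), sepallength.min()

# Min-max normalization: the smallest value maps to 0, the largest to 1
S = (sepallength - Smin) / (Smax - Smin)
print(S)
```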

Core function 7: np.percentile(sepallength, q=...) finds the requested percentiles of a variable.

print(np.percentile(sepallength, q=[5, 95]))
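
To make the percentile call concrete, the sketch below applies it to the integers 1 through 100, an assumed stand-in for the measurement column. With the default linear interpolation the results fall between neighboring values.

```python
import numpy as np

values = np.arange(1, 101)  # assumed stand-in data: 1, 2, ..., 100

# Passing a list to q returns one percentile per entry
p5, p95 = np.percentile(values, q=[5, 95])
print(p5, p95)
```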

Core function 8: np.random.choice() randomly draws 20 values from a variable.

np.random.choice((i), 20)
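
A short sketch of the sampling call follows. The three-label `species` array and the seed are assumptions for illustration.

```python
import numpy as np

np.random.seed(100)  # assumed seed, used only so the run is repeatable
species = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])

# Draws 20 values with replacement (the default), so labels repeat
sample = np.random.choice(species, 20)
print(sample.shape)
```

Passing `replace=False` would draw without replacement, which requires the source to hold at least as many elements as are requested.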

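Core function 9: np.random.randint() picks random row and column indices, and np.nan is written at those positions to simulate missing values.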

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan
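
The sketch below repeats this step on placeholder data and counts how many cells actually became missing. The all-ones array and the names `rows`, `cols`, and `n_missing` are assumptions introduced for the illustration.

```python
import numpy as np

np.random.seed(100)  # assumed seed, used only so the run is repeatable
iris_2d = np.ones((150, 4))  # placeholder data with the iris shape

# Pair 20 random row indices with 20 random column indices
rows = np.random.randint(150, size=20)
cols = np.random.randint(4, size=20)
iris_2d[rows, cols] = np.nan

# The same (row, column) pair can be drawn twice, so the count may be below 20
n_missing = int(np.isnan(iris_2d).sum())
print(n_missing)
```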

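Core function 10: ~np.any(np.isnan(row)) flags each row. Because of the ~ negation, True marks rows that contain no missing values, despite the variable name any_nan_in_row.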

any_nan_in_row = np.array([~np.any(np.isnan(row)) for row in iris_2d])
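
The same mask can be built without a Python-level loop by reducing along the row axis. The sketch below uses a three-row array with one missing value; the data and the name `complete` are assumptions.

```python
import numpy as np

iris_2d = np.array([[5.1, 3.5, 1.4, 0.2],
                    [4.9, np.nan, 1.4, 0.2],
                    [4.7, 3.2, 1.3, 0.2]])  # assumed sample rows

# True for rows in which no column is NaN
complete = ~np.isnan(iris_2d).any(axis=1)
print(complete)

# Boolean indexing keeps only the complete rows
print(iris_2d[complete])
```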

Core function 11: np.corrcoef() computes the correlation between two columns.

print(np.corrcoef(iris_2d[:, 0], iris_2d[:, 2])[0, 1])
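
`np.corrcoef` returns a 2x2 matrix for two inputs, and the `[0, 1]` entry is the Pearson correlation between them. The sketch below checks this on made-up data where the relationship is exactly linear.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # assumed sample values
y = 2 * x + 1   # rises exactly linearly with x
z = -x          # falls exactly linearly with x

r_xy = np.corrcoef(x, y)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
print(r_xy, r_xz)
```

A perfectly increasing linear relationship gives a coefficient of 1 and a perfectly decreasing one gives -1, up to floating-point rounding.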

Core function 12: np.isnan(...).any() checks whether the array contains any missing values.

print(np.isnan(iris_2d).any())

Core function 13: np.unique() with return_counts=True returns the unique values together with how often each occurs.

u, counts = np.unique(species, return_counts=True)
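
A brief sketch of the counting call on a made-up label array follows. Pairing the two returned arrays into a dictionary is a convenience added here, not something the original code does.

```python
import numpy as np

species = np.array(['Iris-setosa', 'Iris-virginica', 'Iris-setosa',
                    'Iris-versicolor', 'Iris-setosa'])  # assumed sample labels

# u holds the sorted unique labels, counts the occurrences of each
u, counts = np.unique(species, return_counts=True)
freq = dict(zip(u.tolist(), counts.tolist()))
print(freq)
```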

Core function 14: np.digitize() assigns each number to a bin index, which can then be mapped to a text label.

petallength = np.digitize(iris[:, 2].astype('float'), [0, 3, 5, 10])
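
The sketch below shows both steps, binning and then labeling, on three assumed petal lengths. The label names in `label_map` are hypothetical and are chosen only to illustrate the mapping.

```python
import numpy as np

petal = np.array([1.4, 4.7, 6.0])  # assumed sample petal lengths

# Bin i covers edges[i-1] <= x < edges[i], so [0, 3) is 1, [3, 5) is 2, [5, 10) is 3
bin_ids = np.digitize(petal, [0, 3, 5, 10])

# Hypothetical label names for the three bins
label_map = {1: 'small', 2: 'medium', 3: 'large'}
labels = [label_map[b] for b in bin_ids.tolist()]
print(bin_ids.tolist(), labels)
```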

Core function 15: argsort() on the first column gives the row order used to sort the 2D iris array by that column.

print(iris[iris[:, 0].argsort()])
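
To see why indexing with `argsort()` sorts whole rows, the sketch below applies it to a three-row array. The values and the names `order` and `sorted_iris` are assumptions.

```python
import numpy as np

iris = np.array([[5.1, 3.5],
                 [4.7, 3.2],
                 [4.9, 3.0]])  # assumed sample rows

# argsort returns the row indices that would sort column 0 ascending
order = iris[:, 0].argsort()

# Indexing with those indices reorders complete rows, keeping columns aligned
sorted_iris = iris[order]
print(order.tolist())
print(sorted_iris)
```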

Core function 16: np.argwhere() finds the position of the first value in a column that exceeds a given threshold.

print(np.argwhere(iris[:, 3].astype(float) > 1.0)[0])
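
A minimal sketch of the lookup on a short assumed column follows. Note that indexing with `[0]` raises an `IndexError` when no value exceeds the threshold, so real code may want to check the result length first.

```python
import numpy as np

petalwidth = np.array([0.2, 0.4, 1.3, 0.3, 1.5])  # assumed sample values

# argwhere lists the positions of all matches; [0] takes the first one
matches = np.argwhere(petalwidth > 1.0)
first = matches[0]
print(first)
```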

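Core function 17: np.clip() limits values to a given range, replacing anything below the minimum or above the maximum with the corresponding bound.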

print(np.clip(a, a_min=10, a_max=30))

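Core function 18: creating group IDs.

output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

In summary, repeated practice in writing this kind of code will improve your ability to load and process data with the NumPy library.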

