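Example 1:
# make_blobs example
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs  # sklearn.datasets.samples_generator was removed in newer scikit-learn releases

X, y = make_blobs(n_samples=10, centers=3, n_features=2, random_state=0)
# Take a look at the dataset
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="rainbow");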
Example 2:
X, y = make_blobs(n_samples=[3, 3, 4], centers=None, n_features=2, random_state=0)
# Take a look at the dataset
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="rainbow");
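As the plots show, n_samples=10 with centers=3 and n_samples=[3, 3, 4] with centers=None both produce 10 samples in 3 clusters, which matches the documented defaults: with an integer n_samples, centers=None means 3 centers, and with a list n_samples, centers=None means one center per list entry. The two calls are not strictly identical, since an integer n_samples is split as evenly as possible across the centers, so the per-cluster counts can be assigned differently.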
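One way to check this claim is to count the samples per cluster instead of relying on the plots alone. The snippet below is a minimal sketch, assuming a scikit-learn version in which make_blobs is imported from sklearn.datasets and accepts a list for n_samples (0.20 or later); the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Integer n_samples: 10 points split as evenly as possible over 3 centers
X_int, y_int = make_blobs(n_samples=10, centers=3, n_features=2, random_state=0)

# List n_samples: one cluster per entry; centers=None lets make_blobs draw the centers
X_list, y_list = make_blobs(n_samples=[3, 3, 4], centers=None, n_features=2, random_state=0)

# Count how many samples carry each label
counts_int = np.bincount(y_int)
counts_list = np.bincount(y_list)
print("int n_samples :", counts_int)
print("list n_samples:", counts_list)
```

Both calls should report three clusters and ten samples in total; only the assignment of sizes to labels may differ.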
Example 3:
# Create the dataset
class_1_ = 7
class_2_ = 4
centers_ = [[0.0, 0.0], [1, 1]]
clusters_std = [0.5, 1]
X_, y_ = make_blobs(n_samples=[class_1_, class_2_], centers=centers_, cluster_std=clusters_std, random_state=0, shuffle=False)
# or, equivalently:
X_, y_ = make_blobs(n_samples=[7, 4], centers=[[0.0, 0.0], [1, 1]], cluster_std=[0.5, 1], random_state=0, shuffle=False)  # n_features defaults to 2
# Plot:
plt.scatter(X_[:, 0], X_[:, 1], c=y_, cmap="rainbow", s=100);  # s controls the marker size
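Because Example 3 passes shuffle=False, the samples should come back grouped by cluster, in the order of the centers. The sketch below checks that and prints a per-cluster summary; it assumes the same parameters as Example 3. With only 7 and 4 points, the empirical standard deviations will only loosely track cluster_std.

```python
import numpy as np
from sklearn.datasets import make_blobs

X_, y_ = make_blobs(n_samples=[7, 4], centers=[[0.0, 0.0], [1, 1]],
                    cluster_std=[0.5, 1], random_state=0, shuffle=False)

# Labels appear in generation order when shuffle=False
print(y_)

# Size, mean, and spread of each cluster
for label in np.unique(y_):
    pts = X_[y_ == label]
    print(label, len(pts), pts.mean(axis=0), pts.std(axis=0))
```

Setting shuffle=True (the default) would keep the same points but return them in a random order.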