Python Cluster Plots in Practice: From Beginner to Advanced

Cluster Plot Techniques for Data Visualization in Python

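A cluster plot is a visualization that shows how data points fall into groups. It is commonly used in cluster analysis, a form of unsupervised learning. Python offers several libraries for drawing cluster plots, such as scikit-learn, seaborn, and matplotlib. The sections below cover concrete implementations and the key technical points.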

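Data Preparation and Preprocessing

Cluster analysis usually requires standardized or normalized data, because distance-based algorithms are otherwise dominated by the features with the largest scales. Libraries such as scipy and scikit-learn provide preprocessing tools; the example here uses scikit-learn's StandardScaler: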

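from sklearn.preprocessing import StandardScaler
import pandas as pd

# 'your_data.csv' is a placeholder; the file is assumed to hold only numeric feature columns
data = pd.read_csv('your_data.csv')
# Rescale every column to zero mean and unit variance
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)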
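
Real CSV files often mix numeric and text columns, and StandardScaler only accepts numeric input. The following is a minimal sketch of one way to handle that, assuming a small hypothetical DataFrame (the column names `income`, `age`, and `city` are invented for illustration and stand in for `your_data.csv`). It also checks what standardization actually does to each column.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for 'your_data.csv': two numeric features on very
# different scales plus a text column that cannot be scaled.
df = pd.DataFrame({
    "income": [32000.0, 58000.0, 41000.0, 75000.0, 50000.0],
    "age": [23.0, 45.0, 31.0, 52.0, 38.0],
    "city": ["A", "B", "A", "C", "B"],
})

# Keep only the numeric columns before scaling.
numeric = df.select_dtypes(include="number")

scaler = StandardScaler()
scaled = scaler.fit_transform(numeric)

# After scaling, each column has mean 0 and (population) standard deviation 1.
col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)
print(np.round(col_means, 6), np.round(col_stds, 6))
```

Dropping the text column is only one option; whether to encode it instead depends on the data and is outside the scope of this sketch.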

Choosing a Clustering Algorithm

Common clustering algorithms include K-Means, hierarchical clustering, and DBSCAN. Taking K-Means as an example:

from sklearn.cluster import KMeans

# n_init=10 restarts from 10 different initializations and keeps the lowest-inertia run
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
clusters = kmeans.fit_predict(scaled_data)
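
The other two algorithms named above follow the same fit_predict pattern in scikit-learn. Here is a rough sketch on synthetic data (the blob centers, `eps=0.5`, and `min_samples=5` are illustrative assumptions, not recommended settings). The main practical difference is that agglomerative clustering takes the number of clusters as input, while DBSCAN infers it from density and marks outliers with the label -1.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): three well-separated blobs.
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.6,
    random_state=42,
)
X = StandardScaler().fit_transform(X)

# Hierarchical (agglomerative) clustering: the cluster count is given up front.
agg_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# DBSCAN: no cluster count; eps and min_samples are illustrative values
# that usually need tuning per dataset. Label -1 marks noise points.
db_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
n_db_clusters = len(set(db_labels)) - (1 if -1 in db_labels else 0)

print("Agglomerative clusters:", len(set(agg_labels)))
print("DBSCAN clusters (excluding noise):", n_db_clusters)
```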

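Drawing the Cluster Plot

Plot the clustering results with matplotlib and seaborn. Two-dimensional data can be shown directly as a scatter plot (the code plots the first two columns of the scaled data):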

import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(10, 6))
sns.scatterplot(x=scaled_data[:, 0], y=scaled_data[:, 1], hue=clusters, palette='viridis')
plt.title('K-Means Clustering Results')
plt.show()
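
For K-Means it often helps to overlay the fitted centroids on the scatter plot. The following sketch uses plain matplotlib and synthetic two-dimensional data; the helper name `plot_clusters_with_centers` is hypothetical and not part of any library.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def plot_clusters_with_centers(X, labels, centers):
    """Scatter the first two columns of X colored by label, plus centroids."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=30, alpha=0.7)
    ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=200,
               label="Centroids")
    ax.set_title("K-Means Clustering Results with Centroids")
    ax.legend()
    return ax


# Synthetic two-dimensional data, used here only for illustration.
X, _ = make_blobs(n_samples=200, centers=3, random_state=42)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
ax = plot_clusters_with_centers(X, kmeans.labels_, kmeans.cluster_centers_)
# Call plt.show() to display the figure, or ax.figure.savefig(...) to save it.
```

Note that `cluster_centers_` lives in the same space the model was fitted on, so the centroids only line up with the points if both are plotted in that space.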

Hierarchical Clustering Plot (Dendrogram)

Hierarchical clustering can be displayed as a dendrogram. Use scipy's dendrogram function:

from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt

linked = linkage(scaled_data, method='ward')
plt.figure(figsize=(10, 6))
dendrogram(linked, orientation='top')
plt.title('Hierarchical Clustering Dendrogram')
plt.show()
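
A dendrogram shows the merge history but does not by itself assign samples to clusters. One common way to get flat labels is to cut the tree with scipy's fcluster. The sketch below assumes two tight synthetic groups so the result is easy to check by eye.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Two tight synthetic groups (illustrative data): one near (0, 0), one near (5, 5).
group_a = rng.normal(loc=0.0, scale=0.1, size=(10, 2))
group_b = rng.normal(loc=5.0, scale=0.1, size=(10, 2))
X = np.vstack([group_a, group_b])

# Same linkage call as for the dendrogram.
linked = linkage(X, method="ward")

# Cut the tree so that at most 2 flat clusters remain.
# fcluster labels start at 1, not 0.
labels = fcluster(linked, t=2, criterion="maxclust")
print(labels)
```

Using `criterion="distance"` with a height threshold is an alternative when the cut should follow a level read off the dendrogram rather than a fixed cluster count.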

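Visualizing High-Dimensional Data

For high-dimensional data, reduce the dimensionality first (for example with PCA or t-SNE) and then plot: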

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
pca_result = pca.fit_transform(scaled_data)

sns.scatterplot(x=pca_result[:, 0], y=pca_result[:, 1], hue=clusters, palette='Set2')
plt.title('PCA + K-Means Clustering')
plt.show()
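
Two follow-ups are worth sketching here: checking how much variance the two principal components retain, and swapping in t-SNE, which the text mentions as an alternative. The example below uses the iris data as a stand-in dataset, and the t-SNE settings (`perplexity=30.0`, `init="pca"`) are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Stand-in dataset: 150 samples, 4 features.
X = StandardScaler().fit_transform(load_iris().data)

# PCA: a linear projection. The explained variance ratio shows how much of
# the original spread survives in the 2-D picture.
pca = PCA(n_components=2)
pca_result = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()
print(f"Variance kept by 2 components: {explained:.1%}")

# t-SNE: a nonlinear embedding that preserves local neighborhoods.
# Distances between far-apart groups in the output are not meaningful.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=42)
tsne_result = tsne.fit_transform(X)
```

Either `pca_result` or `tsne_result` can be passed to the same scatter plot call. A low explained variance is a hint that the PCA picture may hide structure that exists in the full feature space.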

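Evaluating and Tuning Clusters

Use the silhouette score or the elbow method to evaluate clustering quality. The silhouette score ranges from -1 to 1, and higher values indicate better-separated clusters: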

from sklearn.metrics import silhouette_score

score = silhouette_score(scaled_data, clusters)
print(f'Silhouette Score: {score:.2f}')
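
The elbow method is mentioned above without code. A minimal sketch, assuming synthetic data with three well-separated blobs: fit K-Means for a range of k, record the inertia (within-cluster sum of squares) for the elbow curve, and record the silhouette score alongside it.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data (illustrative): three well-separated blobs.
X, _ = make_blobs(
    n_samples=300,
    centers=[[-6, -6], [0, 6], [6, -6]],
    cluster_std=0.8,
    random_state=42,
)

inertias = {}
silhouettes = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_  # elbow method: look for the bend in this curve
    silhouettes[k] = silhouette_score(X, model.labels_)

# Pick the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
for k in inertias:
    print(f"k={k}  inertia={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")
print("Best k by silhouette:", best_k)
```

Plotting `inertias` against k gives the usual elbow curve. On real data the bend is often less clear than on blobs like these, which is why the two criteria are typically read together.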

Case Study: The Iris Dataset

The iris dataset serves as an example of the complete workflow:

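import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
data = iris.data
target = iris.target  # true species labels, not used for clustering

# K-Means clustering (fixed seed so the labels are reproducible)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
clusters = kmeans.fit_predict(data)

# Project onto two principal components for plotting
pca = PCA(n_components=2)
pca_result = pca.fit_transform(data)

sns.scatterplot(x=pca_result[:, 0], y=pca_result[:, 1], hue=clusters, palette='deep')
plt.title('Iris Dataset Clustering')
plt.show()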
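
Because the iris dataset ships with true species labels, the clusters can be checked against them. This is a sketch of one way to do so, using a contingency table and the adjusted Rand index; it is only possible here because labels exist, which is not the case in most clustering problems.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

iris = load_iris()
clusters = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(iris.data)

# Adjusted Rand index: 1.0 means perfect agreement with the species labels,
# values near 0 mean agreement no better than chance.
ari = adjusted_rand_score(iris.target, clusters)

# Rows are true species, columns are cluster ids (cluster numbering is arbitrary).
table = pd.crosstab(
    pd.Series(iris.target_names[iris.target], name="species"),
    pd.Series(clusters, name="cluster"),
)
print(table)
print(f"Adjusted Rand index: {ari:.2f}")
```

The table typically shows setosa isolated in its own cluster, with versicolor and virginica partly mixed, which matches the overlap visible in the PCA scatter plot.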

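Notes

  • Clustering results can differ depending on the initial centroids, so run the algorithm several times and keep the best solution (for K-Means, the run with the lowest inertia; the n_init parameter does this automatically).
  • High-dimensional data should be combined with dimensionality reduction to avoid the "curse of dimensionality".
  • Dendrograms suit small datasets (roughly fewer than 1000 samples); with more samples the plot becomes hard to read.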
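
For the last point, scipy's dendrogram can collapse the lower levels of the tree so that a larger dataset still yields a readable plot. The sketch below assumes 1500 random samples and `p=12` as illustrative values; `no_plot=True` is used so the structure can be inspected without drawing anything.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(42)

# Illustrative dataset larger than what a full dendrogram shows comfortably.
X = rng.normal(size=(1500, 4))
linked = linkage(X, method="ward")

# truncate_mode="lastp" keeps only the top of the tree and contracts
# everything below into p leaf nodes. no_plot=True returns the layout
# dictionary without drawing.
info = dendrogram(linked, truncate_mode="lastp", p=12, no_plot=True)

# Labels of contracted leaves show the number of samples they contain, e.g. "(130)".
print(info["ivl"])
```

To draw the truncated tree, drop `no_plot=True` and call `plt.show()` afterwards; adding `show_contracted=True` marks where the contracted merges occurred.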

With the methods above, cluster analysis and visualization can be adapted to a range of scenarios.
