Implementing K-means in JavaScript

2026-04-06 05:43:51 JavaScript

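Introduction to the K-means Algorithm

K-means is an unsupervised clustering algorithm. It repeatedly assigns each data point to the nearest cluster center (centroid) and then recomputes each centroid's position, until the centroids stop moving. It is suited to tasks such as grouping data and image segmentation.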
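For reference, the quantity this iteration tries to reduce is usually written as the within-cluster sum of squares. The notation is an assumption for illustration; the symbols $C_j$, $\mu_j$, and $x$ do not appear in the original text.

```latex
J = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2
```

Here $C_j$ is the set of points assigned to cluster $j$ and $\mu_j$ is its centroid. Neither the assignment step nor the update step increases $J$, which is why the iteration settles, though possibly at a local minimum rather than the best possible grouping.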

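JavaScript Implementation Steps

Initializing the Centroids

Randomly pick k data points as the initial centroids, or use a smarter initialization method such as K-means++.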

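function initCentroids(data, k) {
  // Without this guard the loop below never ends when k > data.length.
  if (!Number.isInteger(k) || k < 1 || k > data.length) {
    throw new RangeError("k must be an integer between 1 and data.length");
  }
  const centroids = [];
  const indices = new Set();
  // Draw random indices until k distinct points have been chosen.
  while (indices.size < k) {
    const idx = Math.floor(Math.random() * data.length);
    if (!indices.has(idx)) {
      centroids.push([...data[idx]]); // copy so the input is never mutated
      indices.add(idx);
    }
  }
  return centroids;
}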
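
The text mentions K-means++ as a smarter initialization. A minimal sketch of that idea follows: the first centroid is drawn uniformly, and each later one is drawn with probability proportional to its squared distance from the nearest centroid already chosen. The names initCentroidsPlusPlus and squaredDistance, and the optional random parameter, are assumptions for illustration and are not part of the original code.

```javascript
// Hypothetical helper: squared Euclidean distance (no square root needed for weighting).
function squaredDistance(a, b) {
  return a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0);
}

// Hypothetical K-means++ style seeding. `random` defaults to Math.random and
// can be replaced with a deterministic source for testing.
function initCentroidsPlusPlus(data, k, random = Math.random) {
  if (!Number.isInteger(k) || k < 1 || k > data.length) {
    throw new RangeError("k must be an integer between 1 and data.length");
  }
  // First centroid: chosen uniformly at random.
  const centroids = [[...data[Math.floor(random() * data.length)]]];
  // minDist[i] = squared distance from data[i] to its nearest chosen centroid.
  const minDist = data.map(p => squaredDistance(p, centroids[0]));

  while (centroids.length < k) {
    const total = minDist.reduce((sum, d) => sum + d, 0);
    let idx;
    if (total === 0) {
      // Every point coincides with a chosen centroid: fall back to a uniform draw.
      idx = Math.floor(random() * data.length);
    } else {
      // Weighted draw: walk the cumulative weights until they pass r.
      const r = random() * total;
      let cumulative = 0;
      idx = data.length - 1; // fallback in case of floating-point rounding
      for (let i = 0; i < data.length; i++) {
        cumulative += minDist[i];
        if (r < cumulative) { idx = i; break; }
      }
    }
    const next = [...data[idx]];
    centroids.push(next);
    // Refresh each point's distance to its nearest chosen centroid.
    data.forEach((p, i) => {
      minDist[i] = Math.min(minDist[i], squaredDistance(p, next));
    });
  }
  return centroids;
}

const data = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]];
console.log(initCentroidsPlusPlus(data, 2)); // output varies from run to run
```

Points that are already centroids have weight zero, so they are not drawn again as long as other candidates remain. The result could be used in place of initCentroids(data, k) inside kmeans.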

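Computing the Euclidean Distance

Euclidean distance measures how far a data point is from a centroid; the smaller the distance, the better the match.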

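function euclideanDistance(a, b) {
  // Square root of the sum of squared per-dimension differences.
  // Assumes a and b have the same number of dimensions.
  return Math.sqrt(
    a.reduce((sum, val, i) => sum + Math.pow(val - b[i], 2), 0)
  );
}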
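
A quick check of the distance function on a 3-4-5 triangle, together with a variant that may be useful: when distances are only compared, as in the assignment step, the square root can be skipped because it does not change the ordering. squaredDistance is a hypothetical helper and is not part of the original code.

```javascript
// Same function as in the article.
function euclideanDistance(a, b) {
  return Math.sqrt(
    a.reduce((sum, val, i) => sum + Math.pow(val - b[i], 2), 0)
  );
}

// Hypothetical helper: squared distance, skipping the square root.
function squaredDistance(a, b) {
  return a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0);
}

console.log(euclideanDistance([0, 0], [3, 4])); // 5
console.log(squaredDistance([0, 0], [3, 4]));   // 25

// Ordering is preserved: the nearer point is nearer under both measures.
const p = [1, 2];
console.log(euclideanDistance(p, [1, 4]) < euclideanDistance(p, [10, 2])); // true
console.log(squaredDistance(p, [1, 4]) < squaredDistance(p, [10, 2]));     // true
```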

Assigning Data Points to the Nearest Centroid

Loop over all data points and assign each one to the cluster of its nearest centroid.

function assignClusters(data, centroids) {
  // clusters[i] holds the index of the centroid nearest to data[i].
  const clusters = new Array(data.length).fill(-1);
  data.forEach((point, i) => {
    let minDist = Infinity;
    centroids.forEach((centroid, j) => {
      const dist = euclideanDistance(point, centroid);
      // Strict comparison: on a tie, the lower-indexed centroid wins.
      if (dist < minDist) {
        minDist = dist;
        clusters[i] = j;
      }
    });
  });
  return clusters;
}
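
The returned array holds one cluster index per data point, in input order. To inspect a result it can help to group the points by that index. The sketch below uses a hypothetical helper, groupByCluster, which is not part of the original code, and a hand-written assignment array standing in for the output of assignClusters.

```javascript
// Hypothetical helper: turn per-point cluster indices into k lists of points.
function groupByCluster(data, clusters, k) {
  const groups = Array.from({ length: k }, () => []);
  clusters.forEach((cluster, i) => {
    groups[cluster].push(data[i]);
  });
  return groups;
}

const data = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]];
// The assignment assignClusters gives for centroids [1, 2] and [10, 2].
const clusters = [0, 0, 0, 1, 1, 1];

const groups = groupByCluster(data, clusters, 2);
console.log(groups[0]); // [ [ 1, 2 ], [ 1, 4 ], [ 1, 0 ] ]
console.log(groups[1]); // [ [ 10, 2 ], [ 10, 4 ], [ 10, 0 ] ]
```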

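Updating the Centroid Positions

Recompute each centroid from the current cluster assignment by taking the mean of the points in that cluster.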

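function updateCentroids(data, clusters, k, prevCentroids = null) {
  // Per-cluster coordinate sums and point counts.
  const sums = Array.from({ length: k }, () => new Array(data[0].length).fill(0));
  const counts = new Array(k).fill(0);

  data.forEach((point, i) => {
    const cluster = clusters[i];
    point.forEach((val, dim) => {
      sums[cluster][dim] += val;
    });
    counts[cluster]++;
  });

  // Divide each sum by its count to get the mean. An empty cluster keeps its
  // previous centroid when the optional prevCentroids is supplied; without it,
  // the centroid falls back to the origin.
  return sums.map((sum, i) => {
    if (counts[i] === 0) return prevCentroids ? [...prevCentroids[i]] : sum;
    return sum.map(val => val / counts[i]);
  });
}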
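
As a worked instance of the update rule, take a cluster holding the points [1, 2], [1, 4], and [1, 0]. The notation is an assumption for illustration: $C_j$ is the set of points in cluster $j$ and $\mu_j$ is its new centroid.

```latex
\mu_j = \frac{1}{|C_j|} \sum_{x \in C_j} x
\qquad\Rightarrow\qquad
\mu = \frac{1}{3}\bigl((1,2) + (1,4) + (1,0)\bigr) = \frac{1}{3}(3,\,6) = (1,\,2)
```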

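Iterating Until Convergence

Repeat the assignment and update steps until the total distance moved by the centroids falls below a threshold or the maximum number of iterations is reached.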

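function kmeans(data, k, maxIterations = 100, threshold = 0.001) {
  let centroids = initCentroids(data, k);

  for (let iteration = 0; iteration < maxIterations; iteration++) {
    const clusters = assignClusters(data, centroids);
    // updateCentroids returns fresh arrays, so no copy is needed here.
    const prevCentroids = centroids;
    // Pass the current centroids too, so an updateCentroids that accepts
    // them can keep an empty cluster in place.
    centroids = updateCentroids(data, clusters, k, prevCentroids);

    // Total distance moved by all centroids in this iteration.
    const delta = centroids.reduce(
      (sum, centroid, i) => sum + euclideanDistance(centroid, prevCentroids[i]), 0
    );
    if (delta < threshold) break;
  }

  // Final assignment against the converged centroids.
  return { centroids, clusters: assignClusters(data, centroids) };
}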

Usage Example

const data = [
  [1, 2], [1, 4], [1, 0],
  [10, 2], [10, 4], [10, 0]
];
const k = 2;
const result = kmeans(data, k);
console.log("Centroids:", result.centroids);
console.log("Cluster assignments:", result.clusters);
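
Because the initial centroids are random, this example does not always print the same result. If both starting centroids happen to be [1, 2] and [1, 4], for instance, the iteration settles on centroids [5.5, 1] and [5.5, 4], splitting the data by y instead of by x. One common remedy is to run the algorithm several times and keep the run with the lowest within-cluster sum of squared distances. The sketch below assumes two hypothetical helpers, inertia and pickBest, which are not part of the original code, and compares the two outcomes written out by hand.

```javascript
// Hypothetical helper: within-cluster sum of squared distances (lower is tighter).
function inertia(data, centroids, clusters) {
  return data.reduce((total, point, i) => {
    const centroid = centroids[clusters[i]];
    return total + point.reduce((s, val, d) => s + (val - centroid[d]) ** 2, 0);
  }, 0);
}

// Hypothetical helper: pick the result with the lowest inertia.
// `results` is a non-empty array of { centroids, clusters } objects.
function pickBest(data, results) {
  return results.reduce((best, candidate) =>
    inertia(data, candidate.centroids, candidate.clusters) <
    inertia(data, best.centroids, best.clusters) ? candidate : best
  );
}

const data = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]];

// Two outcomes the iteration can converge to on this data, written out by hand.
const splitByX = { centroids: [[1, 2], [10, 2]], clusters: [0, 0, 0, 1, 1, 1] };
const splitByY = { centroids: [[5.5, 1], [5.5, 4]], clusters: [0, 1, 0, 0, 1, 0] };

console.log(inertia(data, splitByX.centroids, splitByX.clusters)); // 16
console.log(inertia(data, splitByY.centroids, splitByY.clusters)); // 125.5
console.log(pickBest(data, [splitByY, splitByX]) === splitByX);    // true
```

With the kmeans function from this article, the candidates could come from repeated runs, for example pickBest(data, Array.from({ length: 10 }, () => kmeans(data, 2))).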