Problem with a dataset generator

Posted 2024-10-17 21:43:06


I am trying to write a function that generates a 2D dataset. The steps I want the code to perform are:

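  • Build a list of 100 lists, each holding a point's coordinates and class 0 (not initialised)
  • Generate n random coordinates, assign each one a class, and update the list
  • For each point of class 0, compute its distance to every point that already has a class. Sum 1/distance over the class-1 points, sum 1/distance over the class-2 points, and so on. The point is assigned the class with the largest sum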
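
Written as a formula, the rule in the last step might look like the sketch below. The symbols are illustrative notation, not part of the original question: L_c is assumed to be the set of points already labelled with class c, and d(p, q) the Euclidean distance between two points. Note that the code in the question sums 1/d² rather than 1/d.

```latex
s_c(p) = \sum_{q \in L_c} \frac{1}{d(p, q)}, \qquad
\operatorname{class}(p) = \arg\max_{c \in \{1, \dots, n\}} s_c(p)
```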

However, the code below does not work: when I print the list, almost all points still have class 0.

import random

def dataset_maker(n): # 2D plane with 0 through 9 for x and y axis
    points = []
    for i in range(0,10):
        for j in range(0,10):
            points.append([i, j, "0"]) # 0 = class not initialised, 1 = class 1, 2 = class 2 etc

    centriods = []
    for i in range(n):
        centriods.append((random.randint(0,9), random.randint(0,9)))
    for i in range(0,len(points)):
        for j in range(0,len(centriods)):
            if points[i][0] == centriods[j][0] and points[i][1] == centriods[j][1]:
                points[i][2] = str(j+1)

    # All neighbours NN to determine classes of other points
    distances = [0] * n
    for i in range(0,len(points)):
        if points[i][2] == "0":
            for j in range(0,len(points)):
                if points[j][2] != "0":
                    if((points[i][0] - points[j][0]) ** 2 + (points[i][1] - points[j][1]) ** 2) != 0: # prevent division of 0
                        tmp = 1 / ((points[i][0] - points[j][0]) ** 2 + (points[i][1] - points[j][1]) ** 2)
                        tmp2 = int(points[j][2]) - 1
                        distances[tmp2] += tmp

                        tmp3 = distances[0]
                        for i in range(0,len(distances)):
                            if distances[i] > tmp3:
                                tmp3 = distances[i] 
                        points[i][2] = str(distances.index(tmp3) + 1)
                        distances = [0] * n
    print(points)

dataset_maker(5)
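
One likely cause of the symptom: the innermost `for i in range(0,len(distances))` loop rebinds `i`, so `points[i][2] = ...` writes to `points[n-1]` instead of the point being classified. In addition, the class is chosen and `distances` is reset inside the `j` loop, that is, after the first labelled neighbour rather than after all of them. The following is a minimal sketch of one possible fix, not a definitive implementation. It makes several assumptions that are not in the question: unlabelled points are scored against the seed points only (not against points labelled earlier in the same pass), the seeds are drawn without replacement so every class appears, the score uses 1/d² as in the original code, and the `size` and `seed` parameters are hypothetical additions for reproducibility.

```python
import random


def dataset_maker(n, size=10, seed=None):
    """Return a list of [x, y, label] for a size x size grid; labels are strings."""
    rng = random.Random(seed)  # 'seed' is a hypothetical parameter, for repeatable runs

    # Every grid point starts as class "0" (not initialised).
    points = [[x, y, "0"] for x in range(size) for y in range(size)]

    # Assumption: pick n distinct cells so that classes 1..n each get one seed point.
    seed_indices = rng.sample(range(len(points)), n)
    centroids = []
    for label, idx in enumerate(seed_indices, start=1):
        points[idx][2] = str(label)
        centroids.append((points[idx][0], points[idx][1], label))

    for p in points:
        if p[2] != "0":
            continue  # seed points keep their class
        scores = [0.0] * n  # fresh accumulator for each unlabelled point
        for cx, cy, label in centroids:
            # Squared distance is never 0 here because p is not a seed point.
            d2 = (p[0] - cx) ** 2 + (p[1] - cy) ** 2
            scores[label - 1] += 1 / d2
        # Decide only after every seed has contributed; ties go to the lowest class.
        p[2] = str(scores.index(max(scores)) + 1)
    return points


if __name__ == "__main__":
    print(dataset_maker(5))
```

Because each class has a single seed in this sketch, the rule reduces to "nearest seed wins". If labelled points should also influence later points, as the original loop over all of `points` suggests, the inner loop would need to iterate over every labelled point instead of `centroids`.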
