Getting bin coordinates with hexbin in matplotlib

Posted 2024-05-29 11:04:58

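I use matplotlib's hexbin method to compute 2D histograms of my data. But I would like to get the coordinates of the hexagon centers so that I can process the results further.

I got the values by calling get_array() on the result, but I cannot figure out how to get the bin coordinates.

I tried to compute them from the number of bins and the extent of my data, but I don't know the exact number of bins in each direction. gridsize=(10,2) should do the trick, but it does not seem to work.

Any ideas?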


3 Answers

I think this works.

import numpy as np
import matplotlib.pyplot as plt

def generate_data(n):
    """Make random, correlated x & y arrays"""
    # This covariance matrix is not positive semidefinite, so NumPy emits a
    # RuntimeWarning here, but it still returns samples
    points = np.random.multivariate_normal(mean=(0,0),
        cov=[[0.4,9],[9,10]],size=int(n))
    return points

if __name__ =='__main__':

    color_map = plt.cm.Spectral_r
    n = 1e4
    points = generate_data(n)

    xbnds = np.array([-20.0,20.0])
    ybnds = np.array([-20.0,20.0])
    extent = [xbnds[0],xbnds[1],ybnds[0],ybnds[1]]

    fig=plt.figure(figsize=(10,9))
    ax = fig.add_subplot(111)
    x, y = points.T
    # Set gridsize just to make them visually large
    image = plt.hexbin(x,y,cmap=color_map,gridsize=20,extent=extent,mincnt=1,bins='log')
    # mincnt=1 keeps only the hexagons that contain at least one point
    counts = image.get_array()
    ncnts = np.count_nonzero(counts)  # number of non-empty hexagons
    # get_offsets() gives one (x, y) center per hexagon, in the same order as counts
    verts = image.get_offsets()
    for offc in range(verts.shape[0]):
        binx,biny = verts[offc][0],verts[offc][1]
        if counts[offc]:
            plt.plot(binx,biny,'k.',zorder=100)
    ax.set_xlim(xbnds)
    ax.set_ylim(ybnds)
    plt.grid(True)
    cb = plt.colorbar(image,spacing='uniform',extend='max')
    plt.show()
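
As a compact illustration of the approach above, the sketch below pairs each center from get_offsets() with its count from get_array(), without any plotting of the centers. The random data, the extent and the variable names are placeholders of mine. The cell-count formula is my reading of how matplotlib lays out the grid (two interleaved lattices), not something stated in this thread, and the sketch assumes a matplotlib version in which get_offsets() is populated.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

fig, ax = plt.subplots()
# Default mincnt (None) keeps every cell, including the empty ones
image = ax.hexbin(x, y, gridsize=(10, 2), extent=[-4, 4, -4, 4])

centers = np.asarray(image.get_offsets())  # one (x, y) center per hexagon
counts = np.asarray(image.get_array())     # one count per hexagon, same order

# One row per hexagon: center x, center y, count
table = np.column_stack([centers, counts])

# Assumption: for gridsize=(nx, ny) the grid holds (nx+1)*(ny+1) + nx*ny cells
nx, ny = 10, 2
expected_cells = (nx + 1) * (ny + 1) + nx * ny

print(table.shape, expected_cells)
plt.close(fig)
```

If that reading is right, it would also explain why gridsize=(10,2) does not give a plain 10 by 2 grid of bins: the second lattice is shifted by half a cell in both directions.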

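I would love to confirm that the code by Hooked that uses get_offsets() works, but I tried several iterations of the code above to retrieve the center positions and, as Dave mentioned, get_offsets() remains empty. The workaround I found is to use image.get_paths(), which is not empty. My code averages the vertices to find each center, which makes it slightly longer, but it does work.

get_paths() returns a set of paths, each holding the x,y coordinates of one hexagon's vertices. You can loop over them and average the vertices to get the center position of each hexagon.

My code is as follows: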

counts=image.get_array() #counts in each hexagon, works great
verts=image.get_offsets() #empty, don't use this
b=image.get_paths()   #this does work, gives Path([[]][]) which can be plotted

for x in range(len(b)):
    xav=np.mean(b[x].vertices[0:6,0]) #center in x (RA)
    yav=np.mean(b[x].vertices[0:6,1]) #center in y (DEC)
    plt.plot(xav,yav,'k.',zorder=100)
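
Whether get_offsets() or get_paths() carries the per-hexagon information seems to depend on the matplotlib version, which would explain why the answers here disagree. Below is a minimal sketch of a helper that tries both. The name hexagon_centers and the sample data are my own, and the rule it uses (trust the offsets only when there is one per count) is an assumption rather than documented behavior.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def hexagon_centers(image):
    """Return an (n, 2) array of hexagon centers for a hexbin result."""
    counts = np.asarray(image.get_array())
    offsets = np.asarray(image.get_offsets())
    if len(offsets) == len(counts):
        # One offset per hexagon: treat the offsets as the centers
        return offsets
    # Otherwise average the six corners of each path, as in the loop above
    return np.array([p.vertices[:6].mean(axis=0) for p in image.get_paths()])

rng = np.random.default_rng(1)
x, y = rng.normal(size=(2, 500))

fig, ax = plt.subplots()
image = ax.hexbin(x, y, gridsize=15, mincnt=1)
centers = hexagon_centers(image)
print(centers.shape)
plt.close(fig)
```

Either branch returns the centers in the same order as get_array(), so centers[i] belongs to counts[i].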

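I had this same problem. I think what needs to be developed is a framework with a hexagonal grid object that can then be applied to many different data sets (and it would be awesome to do it for N dimensions). This is possible, and it surprises me that neither SciPy nor NumPy has anything for it (furthermore, there seems to be nothing else like it, except perhaps binify).

That said, I assume you want to use hexbinning to compare multiple binned data sets. This requires some common base. I got it to work with matplotlib's hexbin in the following way: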

import numpy as np
import matplotlib.pyplot as plt

def get_data(mean,cov,n=1e3):
    """
    Quick fake data builder
    """
    np.random.seed(101)
    points = np.random.multivariate_normal(mean=mean,cov=cov,size=int(n))
    x, y = points.T
    return x,y

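def get_centers(hexbin_output):
    """
    About 40% faster than the previous answer, because the min/max are
    not recalculated for every hexagon
    """
    # Matplotlib versions that fill get_offsets() store one center per
    # hexagon there, in the same order as get_array()
    offsets = np.asarray(hexbin_output.get_offsets())
    if len(offsets) == len(hexbin_output.get_array()):
        return offsets

    # Otherwise assume one path per hexagon and read the centers from the paths
    paths = hexbin_output.get_paths()
    v = paths[0].vertices[:-1] # drop the extra value [0,0] added at the end
    vx,vy = v.T

    idx = [3,0,5,2] # index for [xmin,xmax,ymin,ymax]
    xmin,xmax,ymin,ymax = vx[idx[0]],vx[idx[1]],vy[idx[2]],vy[idx[3]]

    half_width_x = abs(xmax-xmin)/2.0
    half_width_y = abs(ymax-ymin)/2.0

    centers = []
    for i in range(len(paths)):
        cx = paths[i].vertices[idx[0],0]+half_width_x
        cy = paths[i].vertices[idx[2],1]+half_width_y
        centers.append((cx,cy))

    return np.asarray(centers)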


# important parts ==>

class Hexagonal2DGrid:
    """
    Used to fix the gridsize, extent, and bins
    """
    def __init__(self,gridsize,extent,bins=None):
        self.gridsize = gridsize
        self.extent = extent
        self.bins = bins

def hexbin(x,y,hexgrid):
    """
    To hexagonally bin the data in 2 dimensions
    """
    fig = plt.figure()
    ax = fig.add_subplot(111)

    # Note mincnt=0 so that it will return a value for every cell in the
    # hexgrid, not just those with at least one point

    # Basically you fix the gridsize, extent, and bins to keep them the same
    # then the resulting count array has the same layout for every data set
    image = ax.hexbin(x,y, mincnt=0,
                      gridsize=hexgrid.gridsize,
                      extent=hexgrid.extent,
                      bins=hexgrid.bins)
    # you could close the figure if you don't want it
    # plt.close(fig.number)

    # "image" avoids shadowing this function's own name
    counts = image.get_array().copy()
    return counts, image

# Example ===>
if __name__ == "__main__":
    hexgrid = Hexagonal2DGrid((21,5),[-70,70,-20,20])
    x_data,y_data = get_data((0,0),[[-40,95],[90,10]])
    x_model,y_model = get_data((0,10),[[100,30],[3,30]])

    counts_data, hexbin_data = hexbin(x_data,y_data,hexgrid)
    counts_model, hexbin_model = hexbin(x_model,y_model,hexgrid)

    # if you want the centers, they will be the same for both 
    centers = get_centers(hexbin_data) 

    # if you want to ignore the cells with zeros then use the following mask.
    # But if you want zeros for some bins and not others, I'm not sure of an
    # elegant way to do this without using the centers
    nonzero = counts_data != 0

    # now you can compare the two data sets
    variance_data = counts_data[nonzero]
    square_diffs = (counts_data[nonzero]-counts_model[nonzero])**2
    chi2 = np.sum(square_diffs/variance_data)
    print(" chi2={}".format(chi2))
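
The comparison at the end of the example can be pulled out into a small function that needs only NumPy. This is a sketch under my reading of the code above: each non-empty data bin uses its own count as the variance, and bins where the data count is zero are skipped. The name chi2_counts and the toy numbers are mine.

```python
import numpy as np

def chi2_counts(counts_data, counts_model):
    """Chi-square between two count arrays binned on the same hexgrid."""
    data = np.asarray(counts_data, dtype=float)
    model = np.asarray(counts_model, dtype=float)
    nonzero = data != 0          # skip bins with no data, as in the example
    variance = data[nonzero]     # the data count is used as the variance
    return np.sum((data[nonzero] - model[nonzero]) ** 2 / variance)

# Toy check: only the first bin contributes, (4 - 2)**2 / 4
print(chi2_counts([4, 0, 1], [2, 5, 1]))  # → 1.0
```

Both arrays must come from hexbin calls that share the same gridsize and extent, otherwise the bins do not line up element by element.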
