Sliding window - how to get the window location on an image



1 answer

Anonymous, 1 day ago


If you use <code>flatten=False</code> to create a "grid" of windows over the image:

<pre><code>import numpy as np
from scipy.misc import lena  # removed from recent SciPy releases; any 512x512 grayscale array works here
from matplotlib import pyplot as plt

img = lena()
print(img.shape)
# (512, 512)

# make a 64x64 pixel sliding window on img
win = sliding_window(img, (64, 64), shiftSize=None, flatten=False)
print(win.shape)
# (8, 8, 64, 64)
# i.e. (img_height / win_height, img_width / win_width, win_height, win_width)

plt.imshow(win[4, 4, ...])
plt.show()
# grid position [4, 4] contains Lena's eye and nose
</code></pre>

To get the corresponding pixel coordinates, you can use a small helper, <code>get_win_pixel_coords()</code>, that converts a grid position into the top, bottom, left and right pixel bounds of that window. (The code block that defined it is missing from this copy of the answer.)

With <code>flatten=True</code>, the 8x8 grid of 64x64 pixel windows is flattened into a 64-long vector of 64x64 pixel windows. In that case you can use something like <code>np.unravel_index</code> to convert the 1D vector index into a tuple of grid indices, then use those to get the pixel coordinates as above:

<pre><code>win = sliding_window(img, (64, 64), flatten=True)

grid_pos = np.unravel_index(12, (8, 8))
t, b, l, r = get_win_pixel_coords(grid_pos, (64, 64))

print(np.all(img[t:b, l:r] == win[12]))
# True
</code></pre>

<hr/>

OK, I'll try to address some of the questions you raised in the comments.

<blockquote> I want the pixel location of the window relative to the actual pixel dimensions original image. </blockquote>

Perhaps I wasn't clear enough: you can already do this with something like my <code>get_win_pixel_coords()</code> function, which gives the top, bottom, left and right coordinates of the window relative to the image. For example:

<pre><code>win = sliding_window(img, (64, 64), shiftSize=None, flatten=False)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(win[4, 4])
ax1.plot(8, 9, 'oy')          # position of Lena's eye, relative to this window

t, b, l, r = get_win_pixel_coords((4, 4), (64, 64))

ax2.imshow(img)
# plot() takes (x, y), so add the left offset to x and the top offset to y
ax2.plot(l + 8, t + 9, 'oy')  # position of Lena's eye, relative to whole image

plt.show()
</code></pre>

Note also that I've updated <code>get_win_pixel_coords()</code> to handle cases where <code>shiftSize</code> is not <code>None</code> (i.e. the windows don't perfectly tile the image with no overlap).

<blockquote> So I'm guessing that in that case, I should just make the grid be equal to the original image's dimensions, is that right? (instead of using 8x8). </blockquote>

No. If the windows tile the image without overlap (i.e. <code>shiftSize=None</code>, which is what I've assumed so far), then making the grid dimensions equal to the pixel dimensions of the image would leave every window containing just a single pixel!

<blockquote> So in my case, for an image of width: 360 and height: 240, would that mean I use this line: <code>grid_pos = np.unravel_index(*12*, (240, 360))</code>. Also, what does 12 refer to in this line? </blockquote>

As I said, making the "grid size" equal to the image dimensions would be pointless, since every window would contain only a single pixel (at least, assuming the windows are non-overlapping). The 12 is an index into the flattened grid of windows, e.g.:

<pre><code>x = np.arange(25).reshape(5, 5)  # 5x5 grid containing numbers from 0 ... 24
x_flat = x.ravel()               # flatten it into a 25-long vector
print(x_flat[12])                # the element at index 12 in the flattened vector
# 12

row, col = np.unravel_index(12, (5, 5))  # corresponding row/col index in x
print(x[row, col])
# 12
</code></pre>

<blockquote> I am shifting 10 pixels with each window, and the first sliding window starts from coordinates 0x0 on the image, and the second starts from 10x10, etc, then I want it the program to return not just the window contents but the coordinates corresponding to each window, i.e. 0,0, and then 10,10, etc </blockquote>

As I said, you can already get the position of the window relative to the image from the top, bottom, left and right coordinates returned by <code>get_win_pixel_coords()</code>. If you really want to, you can wrap this up in a single function:

<pre><code>def get_pixels_and_coords(win_grid, grid_pos):
    pix = win_grid[grid_pos]  # grid_pos must be a tuple, e.g. (3, 4)
    # as called here, this assumes non-overlapping windows (shiftSize=None)
    tblr = get_win_pixel_coords(grid_pos, pix.shape)
    return pix, tblr

# e.g.:
pix, tblr = get_pixels_and_coords(win, (3, 4))
</code></pre>

If you want the coordinates of every pixel in the window, relative to the image, another trick is to construct arrays containing the row and column indices of every pixel in the image, then apply the sliding window to these:

<pre><code>ridx, cidx = np.indices(img.shape)
r_win = sliding_window(ridx, (64, 64), shiftSize=None, flatten=False)
c_win = sliding_window(cidx, (64, 64), shiftSize=None, flatten=False)

pix = win[3, 4]   # pixel values
r = r_win[3, 4]   # row index of every pixel in the window
c = c_win[3, 4]   # column index of every pixel in the window
</code></pre>
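The definition of <code>get_win_pixel_coords()</code> does not appear in this copy of the answer, so here is a minimal sketch of what such a helper might look like. It is not the author's code. It assumes that the window at grid position (i, j) starts at pixel (i * shift_rows, j * shift_cols), and that a shift of <code>None</code> means non-overlapping tiling. The parameter name <code>shift_size</code> is hypothetical, and NumPy's <code>sliding_window_view</code> is used only as a stand-in for the question's <code>sliding_window()</code>, with a synthetic array in place of a real image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def get_win_pixel_coords(grid_pos, win_shape, shift_size=None):
    """Return (top, bottom, left, right) pixel bounds of one window.

    Hypothetical helper: assumes window (i, j) starts at pixel
    (i * shift_rows, j * shift_cols). shift_size=None means the
    windows tile the image without overlap.
    """
    if shift_size is None:
        shift_size = win_shape
    grid_row, grid_col = grid_pos
    shift_rows, shift_cols = shift_size
    win_rows, win_cols = win_shape
    top = grid_row * shift_rows
    left = grid_col * shift_cols
    return top, top + win_rows, left, left + win_cols


# Stand-in for sliding_window(): 64x64 windows taken every 10 pixels
# from a synthetic 240x360 "image" (the sizes mentioned in the comments).
img = np.arange(240 * 360).reshape(240, 360)
win = sliding_window_view(img, (64, 64))[::10, ::10]
print(win.shape)
# (18, 30, 64, 64)

t, b, l, r = get_win_pixel_coords((3, 4), (64, 64), shift_size=(10, 10))
print((t, b, l, r))
# (30, 94, 40, 104)

# The slice of the image given by those bounds matches the window contents.
print(np.array_equal(img[t:b, l:r], win[3, 4]))
# True
```

With a 10-pixel shift, the window at grid position (0, 0) starts at pixel (0, 0), the one at (1, 1) starts at (10, 10), and so on, which is the behaviour asked for in the last comment.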
