How do I select a random sequence of consecutive rows from a DataFrame?

Posted 2024-10-01 02:27:59


My data is:

         dOpen     dHigh      dLow    dClose   dVolume  day_of_week_0  day_of_week_1  ...  month_6  month_7  month_8  month_9  month_10  month_11  month_12
0     0.000000  0.000000  0.000000  0.000000  0.000000              0              0  ...        0        0        0        0         0         0         0
1     0.000000  0.006397  0.005000  0.007112  0.007111              1              0  ...        0        0        0        0         0         0         0
2     0.005686  0.002825  0.003554  0.002119  0.002119              0              1  ...        0        0        0        0         0         0         0
3     0.004240  0.010563  0.005666  0.010571  0.010571              0              0  ...        0        0        0        0         0         0         0
4     0.012667  0.005575  0.002113  0.004184  0.004184              0              0  ...        0        0        0        0         0         0         0
...        ...       ...       ...       ...       ...            ...            ...  ...      ...      ...      ...      ...       ...       ...       ...
6787 -0.002750  0.001527  0.002214  0.006877  0.006877              1              0  ...        0        0        0        0         0         0         0
6788  0.003309  0.002012  0.002823 -0.001525 -0.001525              0              1  ...        0        0        0        0         0         0         0
6789 -0.000366  0.001217  0.001285  0.002260  0.002260              0              0  ...        0        0        0        0         0         0         0
6790  0.007179  0.005775  0.006692  0.008318  0.008318              0              0  ...        0        0        0        0         0         0         0
6791  0.006066  0.003808  0.004249  0.003113  0.003113              0              0  ...        0        0        0        0         0         0         0

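I want to select 5 consecutive rows, starting at a random position. I tried .sample, but it only returns n random rows, and those rows are not consecutive.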


3 answers

You can also use np.random.choice on df.index, then use df.index.get_loc to get the position, and slice with df.iloc[]:

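import numpy as np
s = np.random.choice(df.index[:-4], 1)  # [:-4] keeps the last full 5-row window eligible
pos = df.index.get_loc(s[0])  # label -> integer position (assumes unique index labels)
df.iloc[pos:pos + 5]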

Here is one approach using random.randint:

import random

nrows = range(df.shape[0])
ix = random.randint(nrows.start, nrows.stop-5)
df.iloc[ix:ix+5, :]

 dOpen     dHigh      dLow    dClose   dVolume  day_of_week_0  \
4      4  0.012667  0.005575  0.002113  0.004184       0.004184   
5   6787 -0.002750  0.001527  0.002214  0.006877       0.006877   
6   6788  0.003309  0.002012  0.002823 -0.001525      -0.001525   
7   6789 -0.000366  0.001217  0.001285  0.002260       0.002260   
8   6790  0.007179  0.005775  0.006692  0.008318       0.008318   
9   6791  0.006066  0.003808  0.004249  0.003113       0.003113   

   day_of_week_1  ...  month_6  month_7  month_8  month_9  month_10  month_11  \
4              0    0        0        0        0        0         0         0   
5              1    0        0        0        0        0         0         0   
6              0    1        0        0        0        0         0         0   
7              0    0        0        0        0        0         0         0   
8              0    0        0        0        0        0         0         0   
9              0    0        0        0        0        0         0         0   

   month_12  
4         0  
5         0  
6         0  
7         0  
8         0  
9         0  

Pick a random starting row n, then take the five rows from n up to (but not including) n+5:

import random

n = random.randint(0, len(dataframe) - 5)  # randint includes both endpoints

five_random_consecutive_rows = dataframe.iloc[n:n+5]
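
The answers above all follow the same pattern: pick a random start position, then slice a fixed-length window with iloc. A minimal sketch that wraps this in a reusable function might look like the following. The function name random_consecutive_rows, the seed parameter, and the toy demo frame are assumptions made for illustration and do not come from the original answers.

```python
import numpy as np
import pandas as pd


def random_consecutive_rows(df, k=5, seed=None):
    """Return k consecutive rows of df, starting at a random position."""
    if k > len(df):
        raise ValueError("k cannot exceed the number of rows")
    rng = np.random.default_rng(seed)
    # integers() excludes the upper bound, so the last valid start is len(df) - k
    start = rng.integers(0, len(df) - k + 1)
    return df.iloc[start:start + k]


# Toy frame standing in for the question's data (hypothetical values)
demo = pd.DataFrame({"dOpen": np.arange(20) / 100.0})
window = random_consecutive_rows(demo, k=5, seed=0)
print(window)
```

Because the window is taken by position with iloc, this works regardless of what the index labels are. Passing a seed makes the choice reproducible, which can help when debugging.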
