In pandas, is there an equivalent of read_csv()'s "nrows" for read_excel()?

User

Answer 1 · edited 2024-05-08 20:05:59

As noted in the documentation, as of pandas version 0.23 this is a built-in option, and it works almost exactly as the OP described.

Code:

data = pd.read_excel(filepath, header=0, skiprows=4, nrows=20, usecols="A:D")

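This now reads the Excel file, takes data from the first sheet (the default), skips 4 rows, uses the next row (the fifth row of the sheet) as the header, reads the following 20 rows of data into the DataFrame (rows 6-25), and uses only columns A:D. Note that usecols is now the option to use, because parse_cols has been deprecated.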
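The row arithmetic in this answer can be checked with a small sketch. The helper rows_read below is hypothetical (it is not part of pandas and does not call it); it only models the description above for the header=0 case, returning the 1-based worksheet row used as the header and the first and last worksheet rows read as data.

```python
def rows_read(skiprows, nrows):
    """Model which 1-based worksheet rows are used when header=0.

    Hypothetical helper for illustration only; it does not call pandas.
    """
    header_row = skiprows + 1            # first row after the skipped ones
    first_data_row = header_row + 1      # data starts right below the header
    last_data_row = header_row + nrows   # nrows counts data rows, not the header
    return header_row, first_data_row, last_data_row


header_row, first, last = rows_read(skiprows=4, nrows=20)
print(header_row, first, last)  # 5 6 25
```

With skiprows=4 and nrows=20 this gives header row 5 and data rows 6 through 25, matching the description above.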

User

Answer 2 · edited 2024-05-08 20:05:59

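If you know the number of rows in your Excel sheet, you can use the skip_footer parameter to read the first n - skip_footer rows of the file, where n is the total number of rows. (Newer pandas versions spell this parameter skipfooter, and parse_cols has been replaced by usecols.)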

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html

Usage:

data = pd.read_excel(filepath, header=0, parse_cols="A:D", skip_footer=80)

Assuming your Excel sheet has 100 rows, this line parses the first 20 rows (including the header row).

User

Answer 3 · edited 2024-05-08 20:05:59

I'd like to extend @Erol's answer to make it a bit more flexible.

Suppose we don't know the total number of rows in the Excel sheet:

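import pandas as pd

xl = pd.ExcelFile(filepath)

# count the rows in the first (index: 0) sheet
# note: xl.book is the underlying workbook; sheet_by_index is the xlrd API,
# so this assumes the xlrd engine is in use
total_rows = xl.book.sheet_by_index(0).nrows

skiprows = 4
nrows = 20

# calculate the number of footer rows to skip
# (-1) accounts for the header row
skipfooter = total_rows - nrows - skiprows - 1

# usecols replaces the deprecated parse_cols
df = xl.parse(0, skiprows=skiprows, skipfooter=skipfooter, usecols="A:D") \
       .dropna(axis=1, how='all')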

.dropna(axis=1, how='all') drops all columns that contain only NaN values.
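The skipfooter calculation above can also be wrapped in a small function. This is only a minimal sketch: footer_rows_to_skip and its header_rows parameter are hypothetical names, not part of pandas, and clamping the result to zero for sheets shorter than requested is an assumption about what is convenient rather than anything the answer specifies.

```python
def footer_rows_to_skip(total_rows, skiprows, nrows, header_rows=1):
    """Return how many trailing rows to skip so that at most `nrows`
    data rows remain after `skiprows` leading rows and the header.

    Hypothetical helper; mirrors total_rows - nrows - skiprows - 1.
    """
    skipfooter = total_rows - skiprows - header_rows - nrows
    # Assumption: a sheet shorter than requested has no footer to skip.
    return max(skipfooter, 0)


print(footer_rows_to_skip(total_rows=100, skiprows=4, nrows=20))  # 75
```

The returned value can then be passed as skipfooter in place of the inline arithmetic.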
