<p>I am trying to scale the data in a .csv table to the range 0 to 1. I have repeatedly gotten an error saying the input data contains NaN, infinity, or a value too large:</p>
<blockquote>
<p>"ValueError: Input contains NaN, infinity or a value too large for dtype('float64')."</p>
</blockquote>
<p>So far I have always been able to track down the source of the error, for example an empty cell, stray whitespace in the table, or characters that are not valid UTF-8. Until now I have always managed to get it working.</p>
<p>This time I am getting the error again, but I cannot find the cause. Is there a way to find out which data point is "NaN, infinity or a value too large"? I have far too many data points to check them by hand. I would be very grateful for any suggestion, even if it is just a trick for finding the offending value in <em>Excel</em>. Below you can find my code and the error. Unfortunately, I cannot share the dataset because it contains confidential information.</p>
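<p>To show the kind of lookup I am after, here is a sketch of how one might locate the offending cells with pandas/NumPy. The DataFrame here is a small synthetic stand-in (the real data is confidential), and the idea is to coerce every column to numeric so that unparseable cells surface as NaN:</p>
<pre><code>import numpy as np
import pandas as pd

# Synthetic frame standing in for the CSV (the real data is confidential)
df = pd.DataFrame({"a": ["1.0", "2.5", " "], "b": ["3.0", "inf", "4.0"]})

# Coerce to numeric; unparseable cells (blanks, stray characters) become NaN
numeric = df.apply(pd.to_numeric, errors="coerce")

# Flag cells that are NaN or infinite -- exactly the values MinMaxScaler rejects
bad = numeric.isna() | np.isinf(numeric)

# Report the row position, column name, and raw content of each offending cell
for row, col in zip(*np.where(bad.to_numpy())):
    print(f"row {row}, column {df.columns[col]!r}: {df.iat[row, col]!r}")
</code></pre>
<p>With the real files one would load the CSV with <code>pd.read_csv</code> and run the same check, but I am not sure this catches every case.</p>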
<p>Code:</p>
<pre><code>import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# Load training data set from CSV file
training_data_df = pd.read_csv("mtth_train.csv")
# Load testing data set from CSV file
test_data_df = pd.read_csv("mtth_test.csv")
# Data needs to be scaled to a small range like 0 to 1
scaler = MinMaxScaler(feature_range=(0, 1))
# Scale both the training inputs and outputs
scaled_training = scaler.fit_transform(training_data_df)
scaled_testing = scaler.transform(test_data_df)
# Print out the adjustment that the scaler applied to the total_earnings column of data
print("Note: Parameters were scaled by multiplying by {:.10f} and adding {:.6f}".format(scaler.scale_[8], scaler.min_[8]))
# Create new pandas DataFrame objects from the scaled data
scaled_training_df = pd.DataFrame(scaled_training, columns=training_data_df.columns.values)
scaled_testing_df = pd.DataFrame(scaled_testing, columns=test_data_df.columns.values)
# Save scaled data dataframes to new CSV files
scaled_training_df.to_csv("mtth_train_scaled.csv", index=False)
scaled_testing_df.to_csv("mtth_test_scaled.csv", index=False)
</code></pre>
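<p>One thing I considered is adding a guard before <code>fit_transform</code> that reports every non-finite cell. This is only a sketch with toy data (the function name and demo frame are my own, not from my actual script):</p>
<pre><code>import numpy as np
import pandas as pd

def report_nonfinite(df: pd.DataFrame, name: str) -> None:
    """Print each non-finite cell (NaN or +/-inf) with its row label and column."""
    values = df.to_numpy(dtype=float)   # raises if a column is not numeric at all
    mask = ~np.isfinite(values)
    for r, c in zip(*np.where(mask)):
        print(f"{name}: row {df.index[r]}, column {df.columns[c]!r} = {values[r, c]}")

# Toy example; with the real files I would pass the loaded DataFrames instead
demo = pd.DataFrame({"x": [0.5, np.inf], "y": [np.nan, 2.0]})
report_nonfinite(demo, "demo")
</code></pre>
<p>But I do not know whether this is the idiomatic way to do it, or whether it covers the "value too large for dtype('float64')" case as well.</p>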
<p>Error:</p>
<blockquote>
<p>"ValueError: Input contains NaN, infinity or a value too large for dtype('float64')."</p>
</blockquote>