How do I replace values in Pandas?


I am trying to group the 23 different labels in the last column of <a href="https://www.unb.ca/cic/datasets/nsl.html" rel="nofollow noreferrer">"KDDTest+.csv"</a> into four groups. Note that I deleted the last column of the csv before doing this.

I read the .csv file using

<pre><code>df = pd.read_csv('KDDTrain+.csv', header=None, names = col_names)
</code></pre>

where

<pre><code>col_names = ["duration","protocol_type","service","flag","src_bytes",
    "dst_bytes","land","wrong_fragment","urgent","hot","num_failed_logins",
    "logged_in","num_compromised","root_shell","su_attempted","num_root",
    "num_file_creations","num_shells","num_access_files","num_outbound_cmds",
    "is_host_login","is_guest_login","count","srv_count","serror_rate",
    "srv_serror_rate","rerror_rate","srv_rerror_rate","same_srv_rate",
    "diff_srv_rate","srv_diff_host_rate","dst_host_count","dst_host_srv_count",
    "dst_host_same_srv_rate","dst_host_diff_srv_rate","dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate","dst_host_serror_rate","dst_host_srv_serror_rate",
    "dst_host_rerror_rate","dst_host_srv_rerror_rate","label"]
</code></pre>

If I print the first 5 rows of the dataframe with <code>print(df.head(5))</code>, this is the output (note the "label" column):

<pre><code>   duration protocol_type  ...  dst_host_srv_rerror_rate    label
0         0           tcp  ...                      0.00   normal
1         0           udp  ...                      0.00   normal
2         0           tcp  ...                      0.00  neptune
3         0           tcp  ...                      0.01   normal
4         0           tcp  ...                      0.00   normal
</code></pre>

I have tried the following two approaches to do the grouping, based on what I found online.

Method 1:

<pre><code>df.replace(to_replace = ['ipsweep.', 'portsweep.', 'nmap.', 'satan.'], value = 'probe', inplace = True)
df.replace(to_replace = ['ftp_write.', 'guess_passwd.', 'imap.', 'multihop.', 'phf.', 'spy.', 'warezclient.', 'warezmaster.'], value = 'r2l', inplace = True)
df.replace(to_replace = ['buffer_overflow.', 'loadmodule.', 'perl.', 'rootkit.'], value = 'u2r', inplace = True)
df.replace(to_replace = ['back.', 'land.' , 'neptune.', 'pod.', 'smurf.', 'teardrop.'], value = 'dos', inplace = True)
</code></pre>

Method 2:

<pre><code>df['label'] = df['label'].replace(['ipsweep.', 'portsweep.', 'nmap.', 'satan.'], 'probe',regex=True)
df['label'] = df['label'].replace(['ftp_write.', 'guess_passwd.', 'imap.', 'multihop.', 'phf.', 'spy.', 'warezclient.', 'warezmaster.'], 'r2l',regex=True)
df['label'] = df['label'].replace(['buffer_overflow.', 'loadmodule.', 'perl.', 'rootkit.'], 'u2r',regex=True)
df['label'] = df['label'].replace(['back.', 'land.' , 'neptune.', 'pod.', 'smurf.', 'teardrop.'], 'dos',regex=True)
</code></pre>

However, printing the first 5 rows of the dataframe still gives this output:

<pre><code>After replacing, first 5 rows of df:
   duration protocol_type  ...  dst_host_srv_rerror_rate    label
0         0           tcp  ...                      0.00   normal
1         0           udp  ...                      0.00   normal
2         0           tcp  ...                      0.00  neptune
3         0           tcp  ...                      0.01   normal
4         0           tcp  ...                      0.00   normal
</code></pre>

I expected the label column in row 2 to show "dos" instead of "neptune", but it does not. What am I doing wrong? Any help is appreciated.
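A possible explanation, assuming the labels are stored exactly as they appear in the printed output: the frame contains <code>neptune</code>, while both methods search for <code>neptune.</code> with a trailing period. Method 1 looks for an exact match and so never finds it. In Method 2, with <code>regex=True</code>, the period means "any one character", so the pattern needs at least one character after <code>neptune</code> and also fails to match. Running <code>df['label'].unique()</code> should show what the stored values really look like. The sketch below uses a small stand-in dataframe rather than the real file, and the names <code>groups</code> and <code>label_to_group</code> are illustrative only.

```python
import pandas as pd

# Stand-in for the real data: only the 'label' column, with values written
# the way they appear in the printed output (no trailing period).
df = pd.DataFrame({"label": ["normal", "normal", "neptune", "normal", "ipsweep"]})

# Group name -> attack labels, written WITHOUT the trailing '.'.
groups = {
    "probe": ["ipsweep", "portsweep", "nmap", "satan"],
    "r2l": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "u2r": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
    "dos": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
}

# Invert to a flat lookup: attack label -> group name.
label_to_group = {name: group for group, names in groups.items() for name in names}

# Defensive cleanup in case some rows do carry whitespace or a trailing '.'.
cleaned = df["label"].str.strip().str.rstrip(".")

# Exact-match replacement; labels not in the lookup (e.g. 'normal') are kept.
df["label"] = cleaned.replace(label_to_group)

print(df["label"].tolist())  # ['normal', 'normal', 'dos', 'normal', 'probe']
```

Restricting the replacement to <code>df['label']</code> instead of calling <code>df.replace</code> on the whole frame also avoids touching the <code>land</code> feature column or any other column that happens to contain a matching string.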
