Converting lists to a DataFrame does not work

Posted 2024-09-27 21:32:25


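After a loop, I am having trouble converting my lists to a DataFrame. I get results for the first 3 rows, but the rest of the output is NaN. Here is my code. Any help is much appreciated. Thank you.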

    for i in range(0,5000):
        data=data_phished["url"][i]
        if not urlparse(data).scheme:
            url = 'https://' + data
        print(i),print(url)

    urlRequest.append(fe.urlRequest(url,soup,hostname))
    urlAnchor.append(fe.urlAnchor(url,soup,hostname))
    linksTags.append(fe.linksTags(url))
    sfh.append(fe.sfh(url,soup,hostname))
    emailSubmit.append(fe.emailSubmit(url))
    urlAbnormal.append(fe.urlAbnormal(url,hostname))

    #Storing extracted features in a list
    label = []
    for i in range(0,5000):
        label.append(1)

    #Converting the list to dataframe

    feat_col = {'request_url':pd.Series(urlRequest), 'anchor_url':pd.Series(urlAnchor),'links_in_tags':pd.Series(linksTags),'server_from_handler':pd.Series(sfh),'submit_info_email':pd.Series(emailSubmit),'abnormal_url':pd.Series(urlAbnormal),'class':pd.Series(label)}

    abn = pd.DataFrame(feat_col)
    abn

This is the output I get (attached): [1]: https://i.stack.imgur.com/uMWwB.png


2 Answers

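First, you don't actually need pd.Series in the feat_col dict:

    feat_col = {'request_url':urlRequest, 'anchor_url':urlAnchor,'links_in_tags':linksTags,'server_from_handler':sfh,'submit_info_email':emailSubmit,'abnormal_url':urlAbnormal,'class':label}

should be enough. I also think the feature-collection part should be indented so that it runs inside the for loop rather than after it: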

    for i in range(0,5000):
        data=data_phished["url"][i]
        # Use the URL unchanged when it already has a scheme.
        url = data if urlparse(data).scheme else 'https://' + data
        print(i, url)

        # These lines should be indented to be run in the for loop.
        urlRequest.append(fe.urlRequest(url,soup,hostname))
        urlAnchor.append(fe.urlAnchor(url,soup,hostname))
        linksTags.append(fe.linksTags(url))
        sfh.append(fe.sfh(url,soup,hostname))
        emailSubmit.append(fe.emailSubmit(url))
        urlAbnormal.append(fe.urlAbnormal(url,hostname))
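
The NaN rows in the question can be reproduced without any of the feature-extraction code. A minimal sketch, assuming (as the misplaced indentation suggests) that the feature lists ended up much shorter than `label`; the list contents below are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-ins: a feature list that only received 3 values,
# and a label list that received 5.
url_request = [0, 1, 0]
label = [1, 1, 1, 1, 1]

# pd.Series objects are aligned on their index, so the shorter column
# is padded with NaN instead of raising an error.
with_series = pd.DataFrame({
    "request_url": pd.Series(url_request),
    "class": pd.Series(label),
})
print(with_series)

# Plain lists are not aligned; pandas rejects the length mismatch.
try:
    pd.DataFrame({"request_url": url_request, "class": label})
except ValueError as exc:
    print("ValueError:", exc)
```

Passing plain lists therefore turns a silent NaN problem into an immediate error, which is usually easier to debug.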

Edit: final code


    label = []
    for i in range(0,5000):
        data=data_phished["url"][i]
        # Use the URL unchanged when it already has a scheme.
        url = data if urlparse(data).scheme else 'https://' + data
        print(i, url)

        try:
            # These lines should be indented to be run in the for loop.
            urlRequest.append(fe.urlRequest(url,soup,hostname))
            urlAnchor.append(fe.urlAnchor(url,soup,hostname))
            linksTags.append(fe.linksTags(url))
            sfh.append(fe.sfh(url,soup,hostname))
            emailSubmit.append(fe.emailSubmit(url))
            urlAbnormal.append(fe.urlAbnormal(url,hostname))
            label.append(1)
        except Exception as e:
            print("Some error:", e)

    feat_col = {'request_url':urlRequest, 'anchor_url':urlAnchor,'links_in_tags':linksTags,'server_from_handler':sfh,'submit_info_email':emailSubmit,'abnormal_url':urlAbnormal,'class':label}
    abn = pd.DataFrame(feat_col)
    abn
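
One caveat with appending inside `try`: if a later extractor raises after earlier ones have already appended, the lists end up with different lengths again. Below is a sketch of one way around that, collecting one complete row per URL before storing it. `extract_features` is a hypothetical stand-in for the `fe.*` calls, and the sample URLs are invented:

```python
from urllib.parse import urlparse

import pandas as pd


def extract_features(url):
    """Hypothetical stand-in for the fe.* extractors in the question."""
    if "bad" in url:
        raise ValueError("could not fetch page")
    return {"url_length": len(url), "has_dash": int("-" in url)}


urls = ["example.com", "https://my-site.example", "bad.example"]

rows = []
for raw in urls:
    # Prepend a scheme only when the URL does not already have one.
    url = raw if urlparse(raw).scheme else "https://" + raw
    try:
        row = extract_features(url)
    except Exception as exc:
        print(f"Skipping {url}: {exc}")
        continue
    # The row is stored only after every feature was computed.
    row["class"] = 1
    rows.append(row)

abn = pd.DataFrame(rows)
print(abn)
```

Because a failed URL contributes no row at all, every column has the same length by construction.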

I believe you are trying to convert a dict to a DataFrame. For that, you can use:

    pd.DataFrame.from_dict(feat_col)
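
For what it's worth, with its default `orient='columns'`, `from_dict` builds the same frame as the plain constructor, so it would not by itself remove the NaN values. A small check with made-up data:

```python
import pandas as pd

# Made-up feature columns of equal length.
feat_col = {"request_url": [0, 1, 0], "class": [1, 1, 1]}

from_constructor = pd.DataFrame(feat_col)
from_classmethod = pd.DataFrame.from_dict(feat_col)  # orient='columns' by default

# Both calls produce identical frames.
print(from_constructor.equals(from_classmethod))
```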
