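I am working with the UK 2017 general election data. I have it both as a CSV file and as an Elasticsearch index. Here is a sample for the Chichester constituency from the Elasticsearch index: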
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 8.03183,
"hits" : [
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "eCtGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "EMERSON",
"first_name" : "Andrew",
"party" : "Patria",
"Party Identifer" : "Patria",
"votes" : "84"
}
},
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "eStGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "MONCREIFF",
"first_name" : "Andrew Malcolm",
"party" : "UK Independence Party (UKIP)",
"Party Identifer" : "UKIP",
"votes" : "1650"
}
},
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "eitGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "BARRIE",
"first_name" : "Heather Margaret",
"party" : "Green Party",
"Party Identifer" : "Green Party",
"votes" : "1992"
}
},
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "eytGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "BROWN",
"first_name" : "Jonathan",
"party" : "Liberal Democrats",
"Party Identifer" : "Liberal Democrats",
"votes" : "6749"
}
},
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "fCtGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "FARWELL",
"first_name" : "Mark Andrew",
"party" : "Labour Party",
"Party Identifer" : "Labour",
"votes" : "13411"
}
},
{
"_index" : "ge",
"_type" : "_doc",
"_id" : "fStGCG4BaIAfLxq_V2By",
"_score" : 8.03183,
"_source" : {
"code" : "E14000633",
"PANO" : "145",
"constituency" : "Chichester",
"last_name" : "KEEGAN",
"first_name" : "Gillian",
"party" : "The Conservative Party Candidate",
"Party Identifer" : "Conservative",
"votes" : "36032"
}
}
]
}
}
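I want to create a new "column" called "rank", then go through each distinct constituency and assign the appropriate number to each of its candidates. So in the example above, the Conservative candidate would get rank 1, the Labour candidate rank 2, and so on.
The number of candidates is not the same in every constituency.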
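To make the ranking idea concrete, here is a minimal sketch in plain Python. It assumes the rows have already been loaded as dictionaries (for example with `csv.DictReader`, or from the `_source` objects above); the function name `add_ranks` is made up for illustration. Votes are stored as strings in the index, so they are converted to integers before sorting. Ties are not treated specially in this sketch.

```python
from collections import defaultdict

# Rows copied from the Chichester sample above (votes are strings in the index).
rows = [
    {"constituency": "Chichester", "last_name": "EMERSON", "Party Identifer": "Patria", "votes": "84"},
    {"constituency": "Chichester", "last_name": "MONCREIFF", "Party Identifer": "UKIP", "votes": "1650"},
    {"constituency": "Chichester", "last_name": "BARRIE", "Party Identifer": "Green Party", "votes": "1992"},
    {"constituency": "Chichester", "last_name": "BROWN", "Party Identifer": "Liberal Democrats", "votes": "6749"},
    {"constituency": "Chichester", "last_name": "FARWELL", "Party Identifer": "Labour", "votes": "13411"},
    {"constituency": "Chichester", "last_name": "KEEGAN", "Party Identifer": "Conservative", "votes": "36032"},
]


def add_ranks(rows):
    """Return copies of the rows with an integer 'rank' per constituency (1 = most votes)."""
    by_constituency = defaultdict(list)
    for row in rows:
        by_constituency[row["constituency"]].append(row)

    ranked = []
    for candidates in by_constituency.values():
        # Sort numerically, not as strings, so "84" does not outrank "36032".
        ordered = sorted(candidates, key=lambda r: int(r["votes"]), reverse=True)
        for position, row in enumerate(ordered, start=1):
            ranked.append({**row, "rank": position})
    return ranked


ranked = add_ranks(rows)
for row in ranked:
    print(row["rank"], row["Party Identifer"], row["votes"])
```

Because the grouping is done per constituency, it does not matter that constituencies have different numbers of candidates.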
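Some of the end goals are: 1) count the seats won by each party, grouped by party; 2) select the constituencies with the smallest majorities and sort them; 3) write an algorithm that suggests what a tactical voter should do (depending, of course, on the outcome they want).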
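Goals 1 and 2 could be sketched along the following lines, again in plain Python and under stated assumptions: "majority" is taken to mean the winner's votes minus the runner-up's votes, the function name `summarise` is hypothetical, and the constituency "Exampleton" with its vote counts is invented purely so that there are two seats to compare. Only the Chichester figures come from the sample above.

```python
from collections import Counter, defaultdict


def summarise(rows):
    """Return (seats per party, list of (constituency, majority) sorted ascending)."""
    by_constituency = defaultdict(list)
    for row in rows:
        by_constituency[row["constituency"]].append(row)

    seats = Counter()
    majorities = []
    for name, candidates in by_constituency.items():
        ordered = sorted(candidates, key=lambda r: int(r["votes"]), reverse=True)
        winner = ordered[0]
        seats[winner["Party Identifer"]] += 1
        # An uncontested seat has no runner-up, so treat their votes as 0.
        runner_up_votes = int(ordered[1]["votes"]) if len(ordered) > 1 else 0
        majorities.append((name, int(winner["votes"]) - runner_up_votes))

    majorities.sort(key=lambda pair: pair[1])
    return seats, majorities


rows = [
    # Real figures from the Chichester sample.
    {"constituency": "Chichester", "Party Identifer": "Conservative", "votes": "36032"},
    {"constituency": "Chichester", "Party Identifer": "Labour", "votes": "13411"},
    # Invented constituency and figures, for illustration only.
    {"constituency": "Exampleton", "Party Identifer": "Labour", "votes": "20000"},
    {"constituency": "Exampleton", "Party Identifer": "Conservative", "votes": "19900"},
]

seats, majorities = summarise(rows)
print(dict(seats))
print(majorities)
```

The sorted list of majorities would also be a natural starting point for goal 3, since the most marginal seats are where a tactical vote is most likely to matter.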
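I don't know how to go about this (other than updating the original spreadsheet by hand).
Should I do it programmatically by sending cURL commands straight to the cluster? Or process the CSV file with a Python script?
Could someone please suggest the best approach and provide a code sample?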
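My first idea was to sort the returned documents for each distinct constituency, loop over the data using the total hit count, and update the rank field accordingly. This is as far as I have got: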
curl -X POST "localhost:9200/ge/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query" : {
"term" : { "Constituency" : "Aldershot" }
},
"sort" : [
{"votes.keyword" : {"order" : "desc"}}
]
}'
This returns an empty result set, so I am stuck. Any help is appreciated.
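One possible explanation, not verified against this index's actual mapping: the documents store the field as `constituency` (lowercase), while the query asks for `Constituency`, and field names are case-sensitive. In addition, if the index uses the default dynamic mapping, string fields are analyzed `text` fields with a `.keyword` sub-field, so an exact `term` match on "Aldershot" would need `constituency.keyword`. Sorting on `votes.keyword` would also order the values as strings rather than numbers. The sketch below builds an adjusted request body and sorts the hits numerically on the client side; the helper names are made up for illustration.

```python
import json


def build_constituency_query(name, size=100):
    """Build a search body matching one constituency exactly.

    Assumes the default dynamic mapping, i.e. a 'constituency.keyword' sub-field exists.
    """
    return {
        "query": {"term": {"constituency.keyword": name}},
        "size": size,
    }


def sort_hits_by_votes(response):
    """Return the _source objects from a search response, highest vote count first."""
    sources = [hit["_source"] for hit in response["hits"]["hits"]]
    # votes are stored as strings, so convert before comparing.
    return sorted(sources, key=lambda s: int(s["votes"]), reverse=True)


body = build_constituency_query("Aldershot")
print(json.dumps(body, indent=2))

# A cut-down response in the same shape as the Chichester sample above.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"last_name": "EMERSON", "votes": "84"}},
            {"_source": {"last_name": "KEEGAN", "votes": "36032"}},
            {"_source": {"last_name": "BROWN", "votes": "6749"}},
        ]
    }
}
ordered = sort_hits_by_votes(sample_response)
print([s["last_name"] for s in ordered])
```

The printed body can be passed to the same `_search` endpoint used in the curl command. Reindexing `votes` as an integer field would allow the sort to happen server-side instead.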