Get latitude and longitude from an address with geopandas

Posted 2024-09-30 18:17:47


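I have a CSV with about 100 million log entries. One of the columns is an address, and I am trying to get the latitude and longitude for each address. I wanted to try the approach mentioned in an existing solution, but that solution uses ArcGIS, which is a commercial tool. I did try the Google API, but it is limited to 2000 entries.

What is the next best option for getting the lat and long of an address in a large dataset?

Input: the column site holds addresses in the city of Paris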

start_time,stop_time,duration,input_octets,output_octets,os,browser,device,langue,site
2016-08-27T16:15:00+05:30,2016-08-27T16:28:00+05:30,721.0,69979.0,48638.0,iOS,CFNetwork,iOS-Device,zh_CN,NULL
2016-08-27T16:16:00+05:30,2016-08-27T16:30:00+05:30,835.0,2528858.0,247541.0,iOS,Mobile Safari UIWebView,iPhone,en_GB,Berges de Seine Rive Gauche - Gros Caillou
2016-08-27T16:16:00+05:30,2016-08-27T16:47:00+05:30,1805.0,133303549.0,4304680.0,Android,Android,Samsung GT-N7100,fr_FR,Centre d'Accueil Kellermann
2016-08-27T16:17:00+05:30,,2702.0,32499482.0,7396904.0,Other,Apache-HttpClient,Other,NULL,Bibliothèque Saint Fargeau
2016-08-27T16:17:00+05:30,2016-08-27T17:07:00+05:30,2966.0,39208187.0,1856761.0,iOS,Mobile Safari UIWebView,iPad,fr_FR,NULL
2016-08-27T16:18:00+05:30,,2400.0,1505716.0,342726.0,NULL,NULL,NULL,NULL,NULL
2016-08-27T16:18:00+05:30,,302.0,3424123.0,208827.0,Android,Chrome Mobile,Samsung SGH-I337M,fr_CA,Square Jean Xxiii
2016-08-27T16:19:00+05:30,,1500.0,35035181.0,1913667.0,iOS,Mobile Safari UIWebView,iPhone,fr_FR,Parc Monceau 1 (Entrée)
2016-08-27T16:19:00+05:30,,6301.0,9227174.0,5681273.0,Mac OS X,AppleMail,Other,fr_FR,Bibliothèque Parmentier

Empty addresses can be ignored or dropped from the output.

The output should have the following columns

^{pr2}$

Any help is appreciated. Thanks in advance!


1 answer
User
#1 · Posted 2024-09-30 18:17:47
import csv
from geopy.exc import GeopyError
from geopy.geocoders import Nominatim

# Nominatim needs a user_agent that identifies your application (the name here is a placeholder)
geolocator = Nominatim(user_agent="site-geocoder")

# Python 3: open CSV files in text mode with newline='' and an explicit encoding
with open('c:/temp/input.csv', newline='', encoding='utf-8') as csvinput:
    with open('c:/temp/output.csv', 'w', newline='', encoding='utf-8') as csvoutput:
        output_fieldnames = ['Site', 'Address_found', 'Latitude', 'Longitude']
        writer = csv.DictWriter(csvoutput, delimiter=';', fieldnames=output_fieldnames)
        writer.writeheader()
        reader = csv.DictReader(csvinput)
        for row in reader:
            site = row['site']
            if site != "NULL":
                try:
                    # if your sites are located in France only, country_codes restricts the search
                    # (older geopy releases used a country_bias constructor argument instead)
                    location = geolocator.geocode(site, country_codes="fr")
                except GeopyError:
                    location = None
                if location is not None:
                    address = location.address
                    latitude = location.latitude
                    longitude = location.longitude
                else:
                    # geocode() returns None when nothing matches
                    address = 'Not found'
                    latitude = 'N/A'
                    longitude = 'N/A'
            else:
                address = 'N/A'
                latitude = 'N/A'
                longitude = 'N/A'

            # here is the writing section
            output_row = {}
            output_row['Site'] = row['site']
            output_row['Address_found'] = address
            output_row['Latitude'] = latitude
            output_row['Longitude'] = longitude
            writer.writerow(output_row)
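
With around 100 million rows, calling the geocoder once per row is unlikely to be practical, and the public Nominatim service asks for at most one request per second. Since the site column appears to hold a limited set of place names, one possible approach is to geocode each distinct name once and reuse the result. The sketch below assumes that; the helper names `geocode_unique` and `make_nominatim_geocode`, the `NOT_FOUND` marker and the "site-geocoder" user agent are made up for illustration and are not part of geopy.

```python
import time

# Placeholder result for names the geocoder could not resolve (hypothetical marker)
NOT_FOUND = ("Not found", "N/A", "N/A")


def geocode_unique(sites, geocode, min_delay_seconds=1.0, sleep=time.sleep):
    """Geocode each distinct site name once.

    sites   -- iterable of site names (may contain repeats, "" and "NULL")
    geocode -- callable taking a name and returning an object with
               .address, .latitude, .longitude, or None if nothing matched
    Returns a dict {site: (address, latitude, longitude)}.
    """
    cache = {}
    for site in sites:
        # Skip empty values and names that were already looked up
        if not site or site == "NULL" or site in cache:
            continue
        if cache:
            # Pause between consecutive requests to respect the rate limit
            sleep(min_delay_seconds)
        try:
            location = geocode(site)
        except Exception:
            # Broad on purpose because the callable is injected;
            # with geopy, geopy.exc.GeopyError would be the narrower choice
            location = None
        if location is None:
            cache[site] = NOT_FOUND
        else:
            cache[site] = (location.address, location.latitude, location.longitude)
    return cache


def make_nominatim_geocode(user_agent="site-geocoder"):
    """Build a geocode callable backed by geopy's Nominatim (needs geopy and network access)."""
    from geopy.geocoders import Nominatim  # imported lazily so the helper above has no hard dependency

    geolocator = Nominatim(user_agent=user_agent)
    return lambda site: geolocator.geocode(site, country_codes="fr")
```

A typical use would be to collect the site values in a first pass over the CSV, call `geocode_unique(sites, make_nominatim_geocode())`, and then look up each row's site in the returned dict while writing the output. geopy also ships `geopy.extra.rate_limiter.RateLimiter`, which can wrap `geolocator.geocode` to enforce the delay instead of the manual sleep used here.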
