<p>There are several problems here.</p>
<ol>
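<li>The file is not a simple CSV, and the presumed <code>data = pd.read_csv('input.csv')</code> does not parse it correctly.</li>
<li>The "Coordinates" field appears to be a <code>json</code> string.</li>
<li>There are also <code>NaN</code> values in that same field.</li>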
</ol>
<p>This is what I have so far. You will need to do some work of your own to parse the file more appropriately.</p>
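<pre><code>import ast
import pandas as pd

df1 = pd.read_csv('./Turkey_28.csv')

# Take the Coordinates column, indexed by tweetID
coords = df1[['tweetID', 'Coordinates']].set_index('tweetID')['Coordinates']
# Drop the NaN rows, then parse each string into a Python object
# (ast.literal_eval is swapped in here as a safer stand-in for eval)
coords = coords.dropna().apply(ast.literal_eval)
# Keep only the rows that parsed to a dict
coords = coords[coords.apply(type) == dict]

def get_coords(x):
    return pd.Series(x['coordinates'], index=['Coordinate_one', 'Coordinate_two'])

coords = coords.apply(get_coords)
df2 = pd.concat([coords, df1.set_index('tweetID').reindex(coords.index)], axis=1)
print(df2.head(2).T)
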
tweetID          714602054988275712
Coordinate_one   23.2745
Coordinate_two   56.6165
tweetText        I'm at MK Appartaments in Dobele https://t.co/...
tweetRetweetCt   0
tweetFavoriteCt  0
tweetSource      Foursquare
tweetCreated     2016-03-28 23:56:21
userID           782541481
userScreen       MartinsKnops
userName         Martins Knops
userCreateDt     2012-08-26 14:24:29
userDesc         I See Them Try But They Can't Do What I Do. Be...
userFollowerCt   137
userFriendsCt    164
userLocation     DOB Till I Die
userTimezone     Casablanca
Coordinates      {u'type': u'Point', u'coordinates': [23.274462...
GeoEnabled       True
Language         en
</code></pre>
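<p>If the "Coordinates" column turns out to be messier than expected, a more defensive parser may help. The following is only a sketch, assuming Python 3: <code>parse_point</code> and the <code>sample</code> rows are hypothetical and stand in for the real file. It skips <code>NaN</code> cells and strings that do not parse to a dict with a two-element <code>coordinates</code> entry, instead of raising on them.</p>

```python
import ast

import pandas as pd


def parse_point(value):
    """Return the two coordinates from a cell as a list, or None if unusable."""
    # NaN cells arrive as floats, so anything that is not a string is skipped.
    if not isinstance(value, str):
        return None
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return None
    if not isinstance(parsed, dict):
        return None
    point = parsed.get('coordinates')
    if not isinstance(point, (list, tuple)) or len(point) != 2:
        return None
    return list(point)


# Hypothetical sample rows, made up for illustration only
sample = pd.DataFrame({
    'tweetID': [1, 2, 3],
    'Coordinates': [
        "{u'type': u'Point', u'coordinates': [23.27, 56.61]}",
        float('nan'),
        "not a dict",
    ],
})

# Parse every cell, then drop the rows that could not be used
points = sample.set_index('tweetID')['Coordinates'].apply(parse_point).dropna()

# Expand the surviving lists into two named columns
coords = pd.DataFrame(points.tolist(), index=points.index,
                      columns=['Coordinate_one', 'Coordinate_two'])
```

<p>The resulting <code>coords</code> frame is indexed by <code>tweetID</code>, so it can be combined with the rest of the data using the same <code>pd.concat</code> and <code>reindex</code> step as above.</p>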