I recently had to geocode a huge number of addresses for a project I'm working on. Here is a simple Python script that reads a list of addresses from a text file and uses the Google Maps API to geocode them (that is, look up their latitude and longitude).
The address file should contain one address per line, roughly like this:
123 mystreet, Beverly Hills, CA, 90210
456 mystreet, Beverly Hills, CA, 90210
789 mystreet, Beverly Hills, CA, 90210
.
.
.
The Google geocoder is quite forgiving about address formats, though, so you can probably get away with something like:
CN Tower Toronto Ontario
Bloor Street Toronto Ontario
.
.
.
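Since the script treats every line of the file as one address, it can help to tidy the lines up before sending them off. Here is a minimal sketch of that, written in Python 3 syntax (the script in this post is Python 2). The read_addresses helper is a made-up name for illustration and is not part of the script:

```python
import io

def read_addresses(lines):
    """Return the non-empty lines of an address file, stripped of whitespace."""
    addresses = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            addresses.append(line)
    return addresses

# Any iterable of lines works: an open file, or an in-memory buffer as here.
sample = io.StringIO(
    "123 mystreet, Beverly Hills, CA, 90210\n"
    "\n"
    "CN Tower Toronto Ontario\n"
)
addresses = read_addresses(sample)
```

With a real file you would pass the open file object instead of the in-memory sample.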
You will also need your own Google Maps API key, which you can get from Google. Paste it into the gkey variable near the top of the script.
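The key travels with every request as the key query parameter, next to the address (q) and the output format (output). Here is a rough sketch of how that URL gets put together, in Python 3 syntax (the script in this post uses the Python 2 urllib instead). The build_url name is made up for illustration:

```python
from urllib.parse import urlencode

# Endpoint used by the script in this post
ROOT_URL = "http://maps.google.com/maps/geo?"

def build_url(addr, key, out_fmt="csv"):
    """Build the geocoding request URL for one address."""
    # A list of pairs keeps the parameter order fixed.
    params = [("q", addr), ("output", out_fmt), ("key", key)]
    return ROOT_URL + urlencode(params)

url = build_url("CN Tower Toronto Ontario", "YourGoogleKeyGoesHere")
```

urlencode takes care of escaping spaces and commas, so addresses can be passed in exactly as they appear in the file.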
import urllib, urllib2, time

# input file (one address per line) and output files
addr_file = 'addresses.csv'
out_file = 'addresses_geocoded.csv'
out_file_failed = 'failed.csv'

sleep_time = 2  # seconds to wait between requests
root_url = "http://maps.google.com/maps/geo?"
gkey = "YourGoogleKeyGoesHere"

# status codes returned by the geocoder
return_codes = {'200':'SUCCESS',
                '400':'BAD REQUEST',
                '500':'SERVER ERROR',
                '601':'MISSING QUERY',
                '602':'UNKNOWN ADDRESS',
                '603':'UNAVAILABLE ADDRESS',
                '604':'UNKNOWN DIRECTIONS',
                '610':'BAD KEY',
                '620':'TOO MANY QUERIES'
               }

def geocode(addr, out_fmt='csv'):

    #encode our dictionary of url parameters
    values = {'q': addr, 'output': out_fmt, 'key': gkey}
    data = urllib.urlencode(values)

    #set up our request
    url = root_url + data
    req = urllib2.Request(url)

    #make request and read response
    response = urllib2.urlopen(req)
    geodat = response.read().split(',')
    response.close()

    #handle the data returned from google
    code = return_codes[geodat[0]]
    if code == 'SUCCESS':
        code, precision, lat, lng = geodat
        return {'code': code, 'precision': precision, 'lat': lat, 'lng': lng}
    else:
        return {'code': code}

def main():

    #open our i/o files
    outf = open(out_file, 'w')
    outf_failed = open(out_file_failed, 'w')
    inf = open(addr_file, 'r')

    for address in inf:

        #drop the trailing newline and skip blank lines
        address = address.strip()
        if not address:
            continue

        #get latitude and longitude of address
        data = geocode(address)

        #output results and log to file
        if len(data) > 1:
            print "Latitude and Longitude of " + address + ":"
            print "\tLatitude:", data['lat']
            print "\tLongitude:", data['lng']
            outf.write(address + ',' + data['lat'] + ',' + data['lng'] + '\n')
            outf.flush()
        else:
            print "Geocoding of '" + address + "' failed with error code " + data['code']
            outf_failed.write(address + '\n')
            outf_failed.flush()

        #play nice and don't just pound the server with requests
        time.sleep(sleep_time)

    #clean up
    inf.close()
    outf.close()
    outf_failed.close()

if __name__ == "__main__":
    main()
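The response handling in geocode() is the part most worth trying out without touching the network. With output=csv the script expects four comma-separated fields on success (status code, precision, latitude, longitude), and the first field is always treated as the status code. Below is a small sketch of that step in Python 3 syntax. The parse_csv_response name, the 'UNRECOGNISED CODE' label and the sample response strings are all made up for illustration, and only a few of the status codes are repeated:

```python
# A subset of the status codes from the script
RETURN_CODES = {
    "200": "SUCCESS",
    "602": "UNKNOWN ADDRESS",
    "620": "TOO MANY QUERIES",
}

def parse_csv_response(body):
    """Turn a 'code,precision,lat,lng' response body into a dict."""
    fields = body.strip().split(",")
    status = RETURN_CODES.get(fields[0], "UNRECOGNISED CODE")
    if status == "SUCCESS" and len(fields) == 4:
        code, precision, lat, lng = fields
        # Unlike the script, convert the coordinates to floats here.
        return {"code": code, "precision": precision,
                "lat": float(lat), "lng": float(lng)}
    return {"code": status}

# Hypothetical response bodies, not captured from the real service
ok = parse_csv_response("200,8,43.6426,-79.3871\n")
failed = parse_csv_response("602,0,0,0")
```

Keeping the parsing separate from the HTTP request like this makes it easy to check the success and failure branches by hand.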
Nice and easy. You may also want to check out the geopy package, which looks very nice and includes support for distance calculations.
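To give a feel for what a distance calculation on the geocoded output involves, here is a plain haversine (great-circle) sketch in Python 3 syntax. This is not geopy's API, just the standard formula with an assumed mean Earth radius of 6371 km, so treat the results as approximate:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is not a perfect sphere

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two lat/lng points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# One degree of longitude along the equator
d = haversine_km(0.0, 0.0, 0.0, 1.0)
```

One degree of longitude at the equator comes out at roughly 111 km. The latitude and longitude strings written by the script would need a float() conversion before being passed in.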
Tags: geocode, geopy, google maps, Python, urllib
This entry was posted on Friday, December 5th, 2008 at 4:21 pm and is filed under Programming, Python.

