I recently had to geocode a huge number of addresses for a project I'm working on. Here is a simple Python script that reads addresses from a text file and uses the Google Maps API to geocode them (that is, look up their latitude and longitude).

The address file should have one address per line, roughly like this:

CODE:
  123 mystreet, Beverly Hills, CA, 90210
  456 mystreet, Beverly Hills, CA, 90210
  789 mystreet, Beverly Hills, CA, 90210
  .
  .
  .

The Google geocoder is quite forgiving about address formats, though, so you can probably get away with something like:

CODE:
  CN Tower Toronto Ontario
  Bloor Street Toronto Ontario
  .
  .
  .
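
Since the file is just one address per line, cleaning it up before geocoding is straightforward. Here is a minimal sketch of that step; `read_addresses` is a hypothetical helper of mine, not part of the script below, and it only trims whitespace and skips blank lines.

```python
def read_addresses(lines):
    """Return the non-empty addresses from an iterable of text lines.

    Hypothetical helper: strips surrounding whitespace (including the
    trailing newline) and drops lines that are blank.
    """
    addresses = []
    for line in lines:
        address = line.strip()
        if address:
            addresses.append(address)
    return addresses


# Possible usage with the address file described above:
#     with open('addresses.csv') as inf:
#         addresses = read_addresses(inf)
```

Passing any iterable of lines (an open file, a list) keeps the helper easy to try out without touching the disk.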

You will also need your own Google Maps API key, which you can get from Google.
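
If you would rather not paste the key into the script itself, one option is to read it from an environment variable. This is only a sketch: the variable name `GOOGLE_MAPS_API_KEY` and the helper `load_api_key` are my own assumptions, not anything Google requires.

```python
import os


def load_api_key(env=None, var_name="GOOGLE_MAPS_API_KEY"):
    """Fetch the API key from the environment.

    `var_name` is an assumed name; pick whatever suits your setup.
    `env` defaults to os.environ and can be replaced by a plain dict.
    """
    if env is None:
        env = os.environ
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError("Set %s to your Google Maps API key" % var_name)
    return key
```

The result could then be assigned to the key variable in the script instead of a hard-coded string.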

PYTHON:
  import urllib, urllib2, time

  addr_file = 'addresses.csv'
  out_file = 'addresses_geocoded.csv'
  out_file_failed = 'failed.csv'

  sleep_time = 2  #seconds to wait between requests
  root_url = "http://maps.google.com/maps/geo?"
  gkey = "YourGoogleKeyGoesHere"

  return_codes = {'200': 'SUCCESS',
                  '400': 'BAD REQUEST',
                  '500': 'SERVER ERROR',
                  '601': 'MISSING QUERY',
                  '602': 'UNKNOWN ADDRESS',
                  '603': 'UNAVAILABLE ADDRESS',
                  '604': 'UNKNOWN DIRECTIONS',
                  '610': 'BAD KEY',
                  '620': 'TOO MANY QUERIES'
                  }

  def geocode(addr, out_fmt='csv'):

      #encode our dictionary of url parameters
      values = {'q': addr, 'output': out_fmt, 'key': gkey}
      data = urllib.urlencode(values)

      #set up our request
      url = root_url + data
      req = urllib2.Request(url)

      #make request and read response
      response = urllib2.urlopen(req)
      geodat = response.read().strip().split(',')
      response.close()

      #handle the data returned from google
      #(a status code missing from the table is reported as a failure, not a crash)
      code = return_codes.get(geodat[0], 'UNKNOWN CODE ' + geodat[0])
      if code == 'SUCCESS':
          code, precision, lat, lng = geodat
          return {'code': code, 'precision': precision, 'lat': lat, 'lng': lng}
      else:
          return {'code': code}

  def main():

      #open our i/o files
      outf = open(out_file, 'w')
      outf_failed = open(out_file_failed, 'w')
      inf = open(addr_file, 'r')

      for address in inf:

          #drop the trailing newline and skip blank lines
          address = address.strip()
          if not address:
              continue

          #get latitude and longitude of address
          data = geocode(address)

          #output results and log to file
          #(a successful lookup returns more than just the 'code' entry)
          if len(data) > 1:
              print "Latitude and Longitude of " + address + ":"
              print "\tLatitude:", data['lat']
              print "\tLongitude:", data['lng']
              outf.write(address + ',' + data['lat'] + ',' + data['lng'] + '\n')
              outf.flush()
          else:
              print "Geocoding of '" + address + "' failed with error code " + data['code']
              outf_failed.write(address + '\n')
              outf_failed.flush()

          #play nice and don't just pound the server with requests
          time.sleep(sleep_time)

      #clean up
      inf.close()
      outf.close()
      outf_failed.close()

  if __name__ == "__main__":
      main()
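
One caveat about the output file: the addresses themselves usually contain commas, so a line of "address, latitude, longitude" joined by commas is awkward to read back. A possible fix is to let Python's `csv` module quote the address field. The sketch below is written for Python 3 (the script above is Python 2), and `format_row` is a hypothetical helper name.

```python
import csv
import io


def format_row(address, lat, lng):
    """Return one CSV line, quoting the address if it contains commas.

    Hypothetical helper; lat and lng are passed through unchanged.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([address.strip(), lat, lng])
    return buf.getvalue()


# format_row("123 mystreet, Beverly Hills, CA, 90210", "34.1", "-118.4")
# returns '"123 mystreet, Beverly Hills, CA, 90210",34.1,-118.4\n'
```

A file written this way can be read back with `csv.reader`, which puts the whole address in the first column.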

Nice and easy. You may also want to check out the geopy package, which looks very nice and includes support for distance calculations.
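
To give a feel for what a distance calculation on geocoded points involves, here is a rough sketch using the haversine (great-circle) formula in plain Python. It is not geopy's API; `haversine_km` is my own hypothetical helper, and it assumes a spherical Earth with a mean radius of 6371 km, so expect approximate results.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is not a perfect sphere


def haversine_km(lat1, lng1, lat2, lng2):
    """Approximate great-circle distance in km between two points.

    Inputs are in decimal degrees and may be floats or numeric strings,
    such as the lat/lng values the geocoder returns.
    """
    lat1, lng1, lat2, lng2 = map(float, (lat1, lng1, lat2, lng2))
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # clamp against tiny floating-point overshoot before taking asin
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

For anything where accuracy matters, a library that models the Earth as an ellipsoid is the safer choice.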
