I've written some code that includes a nested loop where the inner loop is executed about 1.5 million times. I have a function in this loop that I'm trying to optimize. I've done some work, and got some results, but I need a little input to check if what I'm doing is sensible.

Some background:

I have two collections of geographic points (latitude, longitude), one relatively small collection and one relatively huge collection. For every point in the small collection, I need to find the closest point in the large collection.

The obvious way to do this would be to use the haversine formula. The benefit here is that the distances are definitely accurate.

```
from math import radians, sin, cos, asin, sqrt
def haversine(point1, point2):
"""Gives the distance between two points on earth.
"""
earth_radius_miles = 3956
lat1, lon1 = (radians(coord) for coord in point1)
lat2, lon2 = (radians(coord) for coord in point2)
dlat, dlon = (lat2 - lat1, lon2 - lon1)
a = sin(dlat/2.0)**2 + cos(lat1) * cos(lat2) * sin(dlon/2.0)**2
great_circle_distance = 2 * asin(min(1,sqrt(a)))
d = earth_radius_miles * great_circle_distance
return d
```

However, running this 1.5 million times takes about 9 seconds on my machine (according to timeit). Since having an accurate distance is unimportant, rather I only need to find the closest point, I decided to try some other functions.

A simple implementation of the pythagorean theorem gives me a speedup of about 30%. Thinking that I can do better, I wrote the following:

```
def dumb(point1, point2):
lat1, lon1 = point1
lat2, lon2 = point2
d = abs((lat2 - lat1) + (lon2 - lon1))
```

which gives me a factor of 10 improvement. However, now I'm worried that this will not preserve the triangle inequality.

So, my final question is two fold: I'd like to have a function that runs as fast as `dumb`

but still be correct. Will `dumb`

work? If not, any suggestions on how to improve my haversine function?