
MySQL GIS – Part 6


Is MySQL’s GIS really worth using?

Is GIS worth using in MySQL? In the past few posts, I have explored what GIS is and how it is used. GIS-encoded data is wonderful and can help with all kinds of cool queries. I'm late getting this article written, so let's get right to it.

The most common geographical query asks for all the points within some distance of a given point. I'll focus on ways to answer this type of query. Accuracy of the answer is always important, so think carefully about your query. Do you want every pizza place within a radius of a point, or within a square mile? Or do you really want every one within a mile's walking distance?

I'm using the common city_lookup table for these tests. Here is the schema.

CREATE TABLE `city_lookup` (
`city_id` INT(7) NOT NULL DEFAULT '0',
`feature` VARCHAR(20) NULL DEFAULT NULL,
`name` VARCHAR(50) NULL DEFAULT NULL,
`pop_2000` INT(10) NULL DEFAULT NULL,
`fips_55` VARCHAR(7) NULL DEFAULT NULL,
`county` VARCHAR(50) NULL DEFAULT NULL,
`fips` VARCHAR(7) NULL DEFAULT NULL,
`state` CHAR(3) NULL DEFAULT NULL,
`state_fips` CHAR(3) NULL DEFAULT NULL,
`display` TINYINT(3) NULL DEFAULT NULL,
`lat` DOUBLE NULL DEFAULT NULL,
`lon` DOUBLE NULL DEFAULT NULL,
 PRIMARY KEY (`city_id`),
  KEY `lat` (`lat`),
  KEY `lon` (`lon`)
)ENGINE=MyISAM

This table uses simple numeric columns to store latitude (lat) and longitude (lon). There are 35,432 records in the city_lookup table. I’ll flush the cache before each query.
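
One way to clear the cache between runs, assuming the cache in question is the MySQL query cache (present in the 5.x servers of this era, removed in 8.0), is sketched below. These statements are an illustration, not necessarily the ones used for the timings in this post.

```sql
-- Assumption: "the cache" means the query cache. Requires the RELOAD privilege.
RESET QUERY CACHE;

-- Alternative: closes all open tables and, on servers with a query cache, empties it too.
FLUSH TABLES;
```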

I added a geometry point to this table by converting latitude and longitude with these commands.

ALTER TABLE city_lookup ADD location GEOMETRY NOT NULL AFTER lon;
UPDATE city_lookup SET location = POINT(lon, lat);
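
A quick sanity check on the conversion can be sketched as follows. POINT() takes X first and Y second, so longitude goes in as X and latitude as Y. X() and Y() are the accessor names in the MySQL 5.x series (later versions call them ST_X() and ST_Y()); the LIMIT and the column aliases are just illustrative choices.

```sql
-- Spot-check a few rows: X should echo lon and Y should echo lat.
SELECT city_id, name, lat, lon, X(location) AS x_lon, Y(location) AS y_lat
  FROM city_lookup
 LIMIT 5;

-- Count rows where the point disagrees with the numeric columns
-- (0 is the hoped-for answer).
SELECT COUNT(*) AS mismatches
  FROM city_lookup
 WHERE X(location) <> lon OR Y(location) <> lat;
```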

Simple Queries

Now to search for data. There are lots of formulas and calculations for finding geographic distances and for navigation. Most of these are highly accurate. I have chosen a "POW()" formula because it requires the least work for MySQL. (Best guess.)

SELECT NAME,lat,lon,ASTEXT(location) FROM city_lookup
  WHERE (POW(lat - 35.5,2) + POW(lon - -97.6,2)) < .02
NAME           lat      lon       ASTEXT(location)
Oklahoma City  35.4676  -97.5164  POINT(-97.5164 35.4676)
Woodlawn Park  35.5114  -97.65    POINT(-97.65 35.5114)
Bethany        35.5187  -97.6323  POINT(-97.6323 35.5187)
Warr Acres     35.5226  -97.6189  POINT(-97.6189 35.5226)
Nichols Hills  35.5509  -97.5489  POINT(-97.5489 35.5509)
The Village    35.5609  -97.5514  POINT(-97.5514 35.5609)

Timestamp Duration Message Line Position
10/27/2010 3:37:10 PM 0:00:00.891 Query OK
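
The POW() test compares squared differences in degrees, so .02 corresponds to a radius of SQRT(.02), about 0.141 degrees. A degree of longitude is shorter than a degree of latitude at this latitude, so the region is an ellipse on the ground rather than a true circle. To see the real distances, a haversine expression can be added to the same query. This is a sketch for comparison, not the formula timed in this post; the 3959-mile Earth radius is an assumed round figure.

```sql
-- Same filter as above, plus a great-circle (haversine) distance in miles.
-- Assumption: Earth radius of 3959 miles.
SELECT name, lat, lon,
       3959 * 2 * ASIN(SQRT(
           POW(SIN(RADIANS(lat - 35.5) / 2), 2) +
           COS(RADIANS(35.5)) * COS(RADIANS(lat)) *
           POW(SIN(RADIANS(lon - -97.6) / 2), 2)
       )) AS miles
  FROM city_lookup
 WHERE (POW(lat - 35.5, 2) + POW(lon - -97.6, 2)) < .02
 ORDER BY miles;
```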

Now let's do almost the same query using GIS functions and a bounding box created with two points.  Now we are searching a square, not a circle.

SELECT name,lat,lon,AsText(location) FROM city_lookup
  WHERE MBRContains(GeomFromText('LineString(-98.7 35.6, -97.5 35.4)'),location) ;
name           lat      lon       AsText(location)
Oklahoma City  35.4676  -97.5164  POINT(-97.5164 35.4676)
Woodlawn Park  35.5114  -97.65    POINT(-97.65 35.5114)
Bethany        35.5187  -97.6323  POINT(-97.6323 35.5187)
Warr Acres     35.5226  -97.6189  POINT(-97.6189 35.5226)
Nichols Hills  35.5509  -97.5489  POINT(-97.5489 35.5509)
The Village    35.5609  -97.5514  POINT(-97.5514 35.5609)

Timestamp Duration Message Line Position
10/27/2010 3:38:02 PM 0:00:02.420 Query OK

Well, that’s not any faster. But we did get the results we expected. I ran a hundred of each query on a quiet system. The POW() query takes 0.157 seconds and the MBRContains() query takes 0.171 seconds on average.

Is it the Math?

Maybe the math used in the queries is having an effect. I’ll use BENCHMARK() to test the basic functions. This will not be completely fair. To make this work, I had to add a POINT() function to the MBRContains() call so I can run the MBRContains “calculation” in BENCHMARK().

select benchmark (10000000, (POW(35.6 - 35.5,2) + POW(-97.7 - -97.6,2)) < .02 ) ;

This runs in 3.354 seconds.

select benchmark (10000000, MBRContains(GeomFromText('LineString(-97.7 35.6, -97.5 35.4)'),POINT(-97.6,35.5)) ) ;

This runs in 5.460 seconds. Now I’ll try to subtract the time taken by the POINT() function.

select benchmark (10000000, POINT(-97.6,35.5));

This ran in only 0.967 seconds. So the MBRContains() function runs in 4.493 seconds after removing the time POINT() takes. Still, the POW() formula looks better: it runs in about three quarters of the time of the MBRContains() function.
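
The same subtraction could in principle be applied to the GeomFromText() call, which also appears inside the MBRContains() benchmark. The statement below is only a sketch of that extra measurement; no timing for it is reported here.

```sql
-- Time only the parsing of the bounding-box text, ten million times.
SELECT BENCHMARK(10000000, GeomFromText('LineString(-97.7 35.6, -97.5 35.4)'));
```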

Indexing?

EXPLAIN shows neither query is using an index. In a working application, both queries would contain variables in place of the latitude and longitude numbers (35.5 and -97.6). Because the POW() query wraps the lat and lon columns in a calculation inside the WHERE clause, it is not able to use either the lat or lon index.
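
One common workaround, sketched here and not timed in this post, is to add plain range tests on lat and lon so that the optimizer has bare columns to match against an index, and keep the POW() test as the final filter. The 0.1415 margin is an assumed value, chosen to be slightly larger than SQRT(.02) so the range tests do not drop any row the POW() test would keep.

```sql
EXPLAIN
SELECT name, lat, lon FROM city_lookup
 WHERE lat BETWEEN 35.5 - 0.1415 AND 35.5 + 0.1415       -- bare column: can use KEY lat
   AND lon BETWEEN -97.6 - 0.1415 AND -97.6 + 0.1415     -- bare column: can use KEY lon
   AND (POW(lat - 35.5, 2) + POW(lon - -97.6, 2)) < .02; -- trims the square to the circle
```

Whether the optimizer actually picks one of the indexes depends on how selective it judges the range to be, so the key column of the EXPLAIN output is the thing to check.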

So far both the POW() and GIS queries are searching through the entire table and taking about the same time. (I saw that coming.)

Next I created an index for the location column and tried the query again.

ALTER TABLE city_lookup ADD SPATIAL INDEX `location` (`location`) ;
SELECT name,lat,lon,AsText(location) FROM city_lookup
  WHERE MBRContains(GeomFromText('LineString(-98.7 35.6, -97.5 35.4)'),location) ;

Now the average time for the GIS query is .00162 seconds. Against the 0.171 seconds measured before the index, that’s roughly a hundred times faster!
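
To confirm where the speedup comes from, EXPLAIN can be run on the same statement; the key column should name the new spatial index if it is being used. The second statement is a sketch of combining both approaches: the indexed bounding box narrows the candidates and the POW() test trims the square back to the original circle. The box corners are assumed values, taken as the center point plus or minus 0.1415 degrees.

```sql
-- Check whether the spatial index is chosen for the bounding-box query.
EXPLAIN
SELECT name, lat, lon, AsText(location) FROM city_lookup
 WHERE MBRContains(GeomFromText('LineString(-98.7 35.6, -97.5 35.4)'), location);

-- Indexed square first, then the circle test on the surviving rows.
SELECT name, lat, lon FROM city_lookup
 WHERE MBRContains(GeomFromText('LineString(-97.7415 35.6415, -97.4585 35.3585)'), location)
   AND (POW(lat - 35.5, 2) + POW(lon - -97.6, 2)) < .02;
```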

Conclusion

You should be using GIS functions, but be aware of the limitations.

  1. MySQL only uses bounding boxes. A complex shape will NOT exclude records that fall inside its bounding box but outside your polygon.
  2. MBRContains() is NOT a distance function. If you are starting with a point and a distance, you will need to calculate the difference in latitude and longitude to create the bounding box points.  (1 deg of latitude ~= 69 miles and 1 deg of longitude ~= cos(latitude)*69 miles)
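
The conversion in point 2 can be sketched with user variables. The center point, the 10-mile distance, and the variable names are all illustrative assumptions; the 69-miles-per-degree figure is the approximation given above.

```sql
SET @lat = 35.5, @lon = -97.6, @miles = 10;

-- Half-height and half-width of the box, in degrees.
-- 69e0 is a floating-point literal, so the division is not rounded to four decimals.
SET @dlat = @miles / 69e0;
SET @dlon = @miles / (69e0 * COS(RADIANS(@lat)));

-- Two opposite corners define the bounding box, as in the queries above.
SET @box = CONCAT('LineString(',
                  @lon - @dlon, ' ', @lat + @dlat, ', ',
                  @lon + @dlon, ' ', @lat - @dlat, ')');

SELECT name, lat, lon, AsText(location) FROM city_lookup
 WHERE MBRContains(GeomFromText(@box), location);
```

This still returns a square, so a distance test is needed afterwards if a true radius matters.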

For my next post:

  • How to collect your own GIS data.
  • Good and bad examples of searching GIS data.
  • Data sources shared by users.

Mark Grennan

    References:

    This is a really great talk on GEO searches with MySQL by Alexander Rubin.  http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL

    Wikipedia articles on latitude (http://en.wikipedia.org/wiki/Latitude), longitude (http://en.wikipedia.org/wiki/Longitude), and geographical distance (http://en.wikipedia.org/wiki/Geographical_distance)
