<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6346091698278358988</id><updated>2011-09-19T11:41:14.028-07:00</updated><category term='gis'/><category term='mysql'/><title type='text'>Information is Contagious</title><subtitle type='html'>MySQL and more</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>14</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-7759076096099818178</id><published>2010-04-05T16:13:00.000-07:00</published><updated>2010-04-07T14:39:43.918-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>IPv6 Functions for MySQL</title><content type='html'>MySQL has inet_aton and inet_ntoa functions for converting IPv4 addresses to an unsigned 32 bit integer value, and back into dotted decimal format.  Since IPv6 is becoming more popular, I decided to create a couple functions for handling IPv6 addresses as well.  In this case, we want to convert the friendly ::0:53D6:0001 style addresses to a 16 byte binary value that is easier to store.  But we also want to extract them in a human readable format as well.  To do that &lt;a href="https://launchpad.net/mysql-ipv6-udf"&gt;we now have inet_aton6() and inet_ntoa6()&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Examples:&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;mysql&gt; SELECT inet_aton6('0:0:0:0:0:0:af00:b001');&lt;br /&gt;+------------------------------------------+&lt;br /&gt;|      inet_aton6('0:0:0:0:0:0:af00:b001') |&lt;br /&gt;+------------------------------------------+&lt;br /&gt;| 000000000000000000000000AF00B001         |&lt;br /&gt;+------------------------------------------+&lt;br /&gt;&lt;br /&gt;mysql&gt; SELECT inet_ntoa6(unhex('000000000000000000000000AF00B001'));&lt;br /&gt;+-------------------------------------------------------+&lt;br /&gt;| inet_ntoa6(unhex('000000000000000000000000AF00B001')) |&lt;br /&gt;+-------------------------------------------------------+&lt;br /&gt;| 0:0:0:0:0:0:af00:b001                                 |&lt;br /&gt;+-------------------------------------------------------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;When storing in a table, it's recommended to use BINARY(16), and use UNHEX() to convert the hexidecimal string representation into a binary string.  inet_aton6 should do that itself, but returning binary strings from a udf causes errors.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;CREATE TABLE `address` (&lt;br /&gt;  `ipv6` binary(16) DEFAULT NULL&lt;br /&gt;) ENGINE=MyISAM DEFAULT CHARSET=latin1&lt;br /&gt;&lt;br /&gt;mysql&gt; INSERT INTO address (ipv6) VALUES (unhex(inet_aton6('0:0:0:0:0:0:af00:b001')));&lt;br /&gt;&lt;br /&gt;mysql&gt; SELECT inet_ntoa6(ipv6) FROM address;&lt;br /&gt;+-----------------------+&lt;br /&gt;| inet_ntoa6(ipv6)      |&lt;br /&gt;+-----------------------+&lt;br /&gt;| 0:0:0:0:0:0:af00:b001 |&lt;br /&gt;+-----------------------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Enjoy!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-7759076096099818178?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/7759076096099818178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=7759076096099818178' title='31 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/7759076096099818178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/7759076096099818178'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2010/04/ipv6-functions-for-mysql.html' title='IPv6 Functions for MySQL'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>31</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-8845960757125035403</id><published>2010-03-08T12:48:00.000-08:00</published><updated>2010-03-08T13:21:00.759-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Qsh.pl: distributed query tool</title><content type='html'>I've written quite a few tools over time to connect to many mysql servers and run queries.  Most of these have been pretty specific to a small set of tasks such as running an alter across many servers.  Any sysadmin that is in charge of many servers is probably familiar with dsh, and as I was using recently I realized how all those specific tools I've written for mysql could be generalized into a dsh like tool.  Thus, Qsh.pl was born! &lt;a href="https://launchpad.net/qsh"&gt;(download at launchpad)&lt;br /&gt;&lt;/a&gt;&lt;br /&gt;Usage should be familiar to anyone who has used dsh before, it even will read group files made for dsh in /etc/dsh/group/or /usr/local/etc/group/.&lt;br /&gt;&lt;br /&gt;Here's an example where this tool was quite useful.  I was getting a query error for SHOW GLOBAL STATUS.  This was a curious result since we're running mysql 5.0 everywhere.  So what better way to find out which machines are complaining than just run it everywhere:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# qsh.pl -Mcg all_servers --user root --ask-pass --db=test -e 'SHOW GLOBAL STATUS'  2&gt;error.log&lt;br /&gt;{snip ... lots of output}&lt;br /&gt;Done. Total time 2.919&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;My group file for all_servers includes 120 mysql servers, executing that query on all of them took a total of 2.9 seconds, not bad. I also redirected stderr to a file, so any query errors are easy to find:&lt;br /&gt;&lt;pre&gt;cat error.log&lt;br /&gt;myserver1: Query Error (1064) You have an error in your SQL syntax ...&lt;br /&gt;myserver2: Query Error (1064) You have an error in your SQL syntax ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Ok, we found all the servers that return an error, why do they complain?&lt;br /&gt;&lt;pre&gt;# qsh.pl -Mcm myserver1,myserver2,myserver3 --user root --ask-pass --db=test -e 'SELECT VERSION()'&lt;br /&gt;Password:&lt;br /&gt;myserver1: +-----------------+&lt;br /&gt;myserver1: |       VERSION() |&lt;br /&gt;myserver1: +-----------------+&lt;br /&gt;myserver1: | 4.1.21-standard |&lt;br /&gt;myserver1: +-----------------+&lt;br /&gt;myserver3: +----------------------------+&lt;br /&gt;myserver3: |                  VERSION() |&lt;br /&gt;myserver3: +----------------------------+&lt;br /&gt;myserver3: | 5.0.66a-enterprise-gpl-log |&lt;br /&gt;myserver3: +----------------------------+&lt;br /&gt;myserver2: +-----------------+&lt;br /&gt;myserver2: |       VERSION() |&lt;br /&gt;myserver2: +-----------------+&lt;br /&gt;myserver2: | 4.1.21-standard |&lt;br /&gt;myserver2: +-----------------+&lt;br /&gt;Done. Total time 0.063&lt;br /&gt;&lt;/pre&gt;Ooops! That's right, still a few old versions for legacy reasons.&lt;br /&gt;&lt;br /&gt;That's just one example of how I used it.  There are probably lots of use cases out there, but since it's new I'm still learning to rely on it.  It certainly makes things faster when I can think about querying many servers at once, and is a more efficient way to work when dealing with many machines.  It might be useful for:&lt;br /&gt;&lt;br /&gt;+ comparing explain plan between many machines&lt;br /&gt;+ altering large tables across many slaves, before promoting one to master.&lt;br /&gt;+ grabbing status output from many machines to feed into awk or sed&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-8845960757125035403?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/8845960757125035403/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=8845960757125035403' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8845960757125035403'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8845960757125035403'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2010/03/qshpl-distributed-query-tool.html' title='Qsh.pl: distributed query tool'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-5273960196016670484</id><published>2010-02-01T12:00:00.000-08:00</published><updated>2010-02-01T11:27:50.489-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Three Types of Sets: Performance Comparrisson</title><content type='html'>In a previous post I talked about how to take a classic SQL anti-pattern (storing lists as CSV in a single field), normalize it, and how to write queries for it.&lt;br /&gt;&lt;br /&gt;There are two other ways to handle sets of values:  The built in SET column type, and using a INT/BIGINT column as a bitfield.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html"&gt;mysql set data type&lt;/a&gt; provides a convenient way of working with a list of values, where each field could represent multiple simultaneous values.  It's really just a wrapper around bitmask operations -- it lest you work with them using text as well as numerical values.&lt;br /&gt;&lt;br /&gt;But how do they perform?  My quick test showed that aside from a couple edge cases, they all preformed about the same speed.  In fact it was surprising to me to see that the bitmask type queries were really not that much faster than the more normalised table structure with multiple rows.  I would have guessed that the extra grouping operations would have added significant overhead, but they didn't.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="5" cellspacing="0"&gt;&lt;br /&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th width="75%" align="left"&gt;Query&lt;/th&gt;&lt;th align="right"&gt;Avg. Time&lt;/th&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;SELECT * from bitmask WHERE val&amp;amp;3 LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.0023512&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;SELECT * from bitmask WHERE val&amp;amp;3=3 LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.0149182&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;select * from settable where s&amp;amp;3=3 LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.0155686&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;select id FROM normalset&lt;br /&gt; group by id &lt;br /&gt; HAVING sum(val='yellow') and sum(val='cyan')&lt;br /&gt; LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.016244&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;select id, &lt;br /&gt;  BIT_OR(CASE val WHEN 'cyan' THEN 1 WHEN 'yellow' THEN 2 ELSE 0 END) as bitvals,&lt;br /&gt;  count(*)&lt;br /&gt; FROM normalset&lt;br /&gt;   WHERE val in ('cyan','yellow')&lt;br /&gt; GROUP BY id&lt;br /&gt; HAVING bitvals=3&lt;br /&gt; LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.0163958&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;select * from settable &lt;br /&gt; where s like '%cyan%yellow%' LIMIT 1000&lt;/td&gt;&lt;td align="right"&gt;0.0333536&lt;/td&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Notice that that the top query in the list is an order of magnitude faster than the others, however it doesn't actually answer the same question as the other queries.  I included it anyway because it was a curious result -- apparently bit manipulation operations alone are very very fast in mysql, but just adding an equality test slows the query down considerably.  The cost of the equality test is much much higher than I would have anticipated.  It would be an interesting project to track down exactly why that is.&lt;br /&gt;&lt;br /&gt;You can see that working with the SET datatype using string values has the worst performance.  When you use it like an ordinary bitmask, then the performance is identical to the other, faster methods (no surprise there.)  Working with it using text values involves some internal conversions.  IMO this means ordinary bitmasks are the clear winner over the SET datatype -- the SET datatype requires an ALTER TABLE to add new values, and the the convenience of using text values should be avoided because they're slow.  If you need maximum flexibility, the more normalized structure seems to preform just as well.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;To generate the test data, I created about 200k unique items and randomly assigned any of a set of 4 values to each item.  This ended up with about 400k associations, so each item had on average two values out of the set.  Statements follow:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- creating SET table, generating values&lt;br /&gt;create table settable (id serial, s set('cyan','yellow','magenta','black'), key (s) );&lt;br /&gt;insert into settable (s) values ( 'cyan,yellow' ), ( 'yellow,magenta'), ( 'cyan,black,yellow'), ( 'magenta' );&lt;br /&gt;&lt;br /&gt;-- repeat until desired # of rows reached:&lt;br /&gt;insert into settable (s) select truncate(rand()*15+1,0) from settable;&lt;br /&gt;&lt;br /&gt;-- creating normalized set, copy values from settable&lt;br /&gt;create table normalset (id int not null default 0, val char(8), key(id) );&lt;br /&gt;insert into normalset SELECT id, 'cyan' FROM settable WHERE s like '%cyan%';&lt;br /&gt;insert into normalset SELECT id, 'yellow' FROM settable WHERE s like '%yellow%';&lt;br /&gt;insert into normalset SELECT id, 'magenta' FROM settable WHERE s like '%magenta%';&lt;br /&gt;insert into normalset SELECT id, 'black' FROM settable WHERE s like '%black%';&lt;br /&gt;&lt;br /&gt;-- create bitmask, copy values from settable&lt;br /&gt;CREATE TABLE bitmask ( id serial, val int unsigned NOT NULL default '0', UNIQUE KEY (id), KEY (val) );&lt;br /&gt;insert into bitmask SELECT id, s+0 FROM settable;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-5273960196016670484?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/5273960196016670484/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=5273960196016670484' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/5273960196016670484'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/5273960196016670484'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/01/three-types-of-sets-performance.html' title='Three Types of Sets: Performance Comparrisson'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-8973046685250970369</id><published>2009-12-29T10:41:00.000-08:00</published><updated>2010-01-04T16:18:14.301-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>How to Fix the Comma Separated List of Doom</title><content type='html'>This is a solution that I've seen a few people have trouble with now, so I want to make more people aware of it:&lt;br /&gt;&lt;br /&gt;Say you have some items that can have any of a set of associated values.  These could be a list flags, it could be keywords or tags.  This is often seen as the anti-pattern known as "Comma Separated List of Doom" (CSLoD). Naive developers will often create something like the following:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;create table cslod ( id serial, vals varchar(255) );&lt;br /&gt;insert into cslod values (NULL, "cyan,yellow" );&lt;br /&gt;select * from cslod;&lt;br /&gt;+----+-------------+&lt;br /&gt;| id | vals        |&lt;br /&gt;+----+-------------+&lt;br /&gt;|  1 | cyan,yellow |&lt;br /&gt;+----+-------------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I know that most people reading this already understand this is bad but let me re-iterate: &lt;span style="font-weight: bold;"&gt;Please please please don't do this&lt;/span&gt;! It will be very very difficult to write meaningful queries later.&lt;br /&gt;&lt;br /&gt;The natural solution is to create separate rows for each value in the list:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;create table normalset (id int unsigned not null default 0, val varchar(12) not null default '', primary key (id, val) );&lt;br /&gt;insert into normalset VALUES (1, "cyan"), (1, "yellow" );&lt;br /&gt;select * from normalset;&lt;br /&gt;+----+--------+&lt;br /&gt;| id | val    |&lt;br /&gt;+----+--------+&lt;br /&gt;|  1 | cyan   |&lt;br /&gt;|  1 | yellow |&lt;br /&gt;+----+--------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Ok, to be truly normalised, val should be an id with a foreign key to another unique list of values, but it's easier to illustrate with this not-quite-perfect structure.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now, using the improved structure, how would we write a query that retrieves all ids that have *both* cyan and yellow as values?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;insert into normalset VALUES (2, "cyan"), (2, "black" ), (3, "magenta"), (4,"cyan"), (4, "magenta"), (4, "yellow");&lt;br /&gt;&lt;br /&gt;select * from normalset;&lt;br /&gt;+----+---------+&lt;br /&gt;| id | val     |&lt;br /&gt;+----+---------+&lt;br /&gt;|  1 | cyan    |&lt;br /&gt;|  1 | yellow  |&lt;br /&gt;|  2 | cyan    |&lt;br /&gt;|  2 | black   |&lt;br /&gt;|  3 | magenta |&lt;br /&gt;|  4 | cyan    |&lt;br /&gt;|  4 | magenta |&lt;br /&gt;|  4 | yellow  |&lt;br /&gt;+----+---------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We want to return 1 and 4:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;SELECT id FROM normalset GROUP BY id HAVING SUM(val='cyan') AND SUM(val='yellow');&lt;br /&gt;+----+&lt;br /&gt;| id |&lt;br /&gt;+----+&lt;br /&gt;|  1 |&lt;br /&gt;|  4 |&lt;br /&gt;+----+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Simple!  The trick is to use GROUP BY, and then you can write conditions based on aggregates and filter them in the HAVING clause.  There are a few other ways to obtain the same result:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;SELECT id, GROUP_CONCAT(val order by val) as theset, count(*) FROM normalset GROUP BY id HAVING theset like '%cyan%yellow%'&lt;br /&gt;&lt;br /&gt;SELECT id, BIT_OR(CASE val WHEN 'cyan' THEN 1 WHEN 'yellow' THEN 2 ELSE 0 END) as bitvals, count(*) FROM normalset GROUP BY id HAVING bitvals=3&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first form can be easily used to find other conditions, such as all rows that have cyan as a value but NOT yellow:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;SELECT id FROM normalset GROUP BY id HAVING SUM(val='cyan') AND NOT SUM(val='yellow');&lt;br /&gt;+----+&lt;br /&gt;| id |&lt;br /&gt;+----+&lt;br /&gt;|  2 |&lt;br /&gt;+----+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I have also been comparing these methods to using the SET datatype, and plain ol' bitmasks.  I'll add a performance comparison of various methods in another post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-8973046685250970369?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/8973046685250970369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=8973046685250970369' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8973046685250970369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8973046685250970369'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/12/how-to-fix-comma-separated-list-of-doom.html' title='How to Fix the Comma Separated List of Doom'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-2206985194279618156</id><published>2009-11-24T18:10:00.001-08:00</published><updated>2009-11-24T18:23:11.276-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Restore a Single Table From mysqldump</title><content type='html'>Ok, this is inspired by &lt;a href="http://everythingmysql.ning.com/profiles/blogs/restore-one-table-from-an-all"&gt;another post on the same subject&lt;/a&gt;.  I would have just commented, but there was a required signup. Ew.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;While the method there with juggling grants works fine, I'd like to highlight another way that I think is much simpler: use grep &amp;amp; sed to create a file which contains only the table you want to restore.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;mysql&gt; show tables;&lt;br /&gt;+------------------+&lt;br /&gt;| Tables_in_gtowey |&lt;br /&gt;+------------------+&lt;br /&gt;| t1               |&lt;br /&gt;| t2               |&lt;br /&gt;| t3               |&lt;br /&gt;+------------------+&lt;br /&gt;3 rows in set (0.00 sec)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;# mysqldump gtowey &gt; dump.sql&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Given 3 tables in a mysqldump file, lets restore just t2&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# grep -n 'Table structure' dump.sql&lt;br /&gt;19:-- Table structure for table `t1`&lt;br /&gt;40:-- Table structure for table `t2`&lt;br /&gt;61:-- Table structure for table `t3`&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Now just use the line numbers to extract the table with sed:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;sed -n '40,61 p' dump.sql  &gt; t2.sql&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;That's it, you have everything you need with no cleanup of mysql grants to do afterward.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-2206985194279618156?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/2206985194279618156/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=2206985194279618156' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2206985194279618156'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2206985194279618156'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/11/restore-single-table-from-mysqldump.html' title='Restore a Single Table From mysqldump'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-3938383408282907319</id><published>2009-10-05T17:06:00.000-07:00</published><updated>2010-01-05T16:55:32.334-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Simple Test Case Techiques</title><content type='html'>Ok, I know that most of you reading this would think of this as painfully obvious, but it's also clear that a lot of beginning mysql users are timid about actually diving in and discovering answers.  In an effort to answer fewer simple questions, I'm hoping to get this out to more people who may be beginners with mysql.&lt;br /&gt;&lt;br /&gt;The most common barrier seems to be "I don't have a test environment!"  This is easily remedied:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1. Even if your production enviroment is linux, you can still install mysql on your windows-based workstation to play around with.   &lt;/span&gt;The installation on any environment can be done in minutes, so lack of SQL is not a good excuse.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;2.  The perfect companion for testing mysql is &lt;/span&gt;&lt;a style="font-weight: bold;" href="https://launchpad.net/mysql-sandbox"&gt;MySQL Sandbox&lt;/a&gt;&lt;span style="font-weight: bold;"&gt;.  You can even test replication setups!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3. Even if you just have a production environment, small test cases can be done inside a different database and are unlikely to affect production traffic.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Techniques for Test Cases&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;Say you have a question about valid dates, lets say you're unsure about how NULL works with DATE&lt;/span&gt;TIME types, and you don't want to run an update on your production data to test it.&lt;br /&gt;&lt;br /&gt;Remember that you can pare down a question like this to the most basic elements. You just need three things:&lt;br /&gt;&lt;br /&gt;1. A table with a date column&lt;br /&gt;2. A row with a valid date&lt;br /&gt;3. An update statement&lt;br /&gt;&lt;br /&gt;1. &lt;code&gt;CREATE TABLE d ( a DATETIME);&lt;/code&gt;&lt;br /&gt;2. &lt;code&gt;INSERT INTO d VALUES (NOW());&lt;/code&gt;&lt;br /&gt;3. &lt;code&gt;SELECT * FROM d; UPDATE d SET a=NULL; SELECT * FROM d;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;There!  The trick is to remember how simple the create table syntax is, and how easy it is to put data into it.  All kinds of extremely simple tables can be created, with no fuss at the command line.  I think many beginners get used to seeing more complex CREATE TABLE statements full of lots of columns and options, and don't realize just how much is optional.&lt;br /&gt;&lt;br /&gt;There are a few things I tend do do as a matter of habit:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;1. Keep all identifiers down to one or two letters &lt;/span&gt;&lt;span&gt;-- reduced typing makes it easier to write these cases.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;2. Columns are usually: a,b,c,d  or x,y,z  &lt;/span&gt;&lt;span&gt;-- again, don't worry about names, keep them simple!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;CREATE TABLE t1 (a INT, b VARCHAR(32), c DATE );&lt;/code&gt; -- very simple with 3 different data types to play with.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;3. SERIAL is a great shortcut  &lt;/span&gt;-- although I tell everyone that BIGINT is a terrible default primary key column type, the SERIAL column type is great for quickly creating a table with an auto incrementing primary key&lt;br /&gt;&lt;br /&gt;&lt;code&gt;CREATE TABLE t2 ( a SERIAL, b INT)&lt;/code&gt;# SERIAL maps to  &lt;code&gt;bigint(20) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;4. MySQL has functions which help populate data for tests,&lt;/span&gt;&lt;span&gt; use NOW() / CURDATE() for dates, MD5(now()) for strings, and RAND() for numbers (just examples)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt; INSERT INTO t1 VALUES (RAND()*&lt;/code&gt;&lt;code&gt;POW(2,32)&lt;/code&gt;&lt;code&gt;, MD5(NOW()), NOW());&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;5. Randomized  dates are easy.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt; SELECT NOW() - INTERVAL TRUNCATE(RAND()*48,0) HOUR &lt;/code&gt; # generates a random date in the last 48 hours&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;6. Use the power of INSERT ... SELECT to generate large sets of data quickly:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;INSERT INTO t1 SELECT ( RAND()*10000, MD5(RAND()*10000), CURDATE() - INTERVAL RAND(360) DAY );&lt;/code&gt; # every time you run this, you will double the amount of data in your table!  You can quickly "make" millions of rows this way.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;7. Test functions without even using tables&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;SELECT DATE_FORMAT(CURDATE()," %d, %m %Y");&lt;/code&gt;  # This format string isn't too useful, but now you can play with it until you get the output you want.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;8. Test updates as SELECT statements.&lt;/span&gt;  If you want to know if the statement &lt;code&gt;UPDATE client SET balance=IF(client_type='premium' &amp;amp;&amp;amp; balance, balance+1000, balance) AND join_date &gt;= CURDATE()-INTERVAL 90 DAY  &lt;/code&gt; will do the right thing before you run it against your database, re-write it as a select, and move the IF conditions  to be part of your result set (or other conditions)&lt;br /&gt;&lt;br /&gt;&lt;code&gt;SELECT client_type, balance, IF(client_type='premium' &amp;amp;&amp;amp; balance, balance+1000, balance) FROM client WHERE &lt;/code&gt;&lt;code&gt;join_date &gt;= CURDATE()-INTERVAL 90 DAY &lt;/code&gt;# now you can see exactly what the new value will be for each row, without making any changes&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;9. Remember the &lt;/span&gt;&lt;a style="font-weight: bold;" href="http://dev.mysql.com/doc/refman/5.1/en/index.html"&gt;MySQL manual  &lt;/a&gt;&lt;span style="font-weight: bold;"&gt;is a great resource!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Enjoy, the more you're able to answer your own questions, the more powerful you become!  Please feel free to post your own favorite techniques.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-3938383408282907319?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/3938383408282907319/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=3938383408282907319' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/3938383408282907319'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/3938383408282907319'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/10/simple-test-case-techiques.html' title='Simple Test Case Techiques'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-2429595810493732549</id><published>2009-08-05T12:30:00.000-07:00</published><updated>2009-08-05T16:04:27.032-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>How to Select 'this Wednesday' or Other Relative Dates</title><content type='html'>I have some bad news and good news.  The bad:  there is no built in MySQL function for finding a relative date.  The good: it's still pretty trivial to get MySQL to calculate it.&lt;br /&gt;&lt;br /&gt;The trick is that you need to start with a known date,  such as:&lt;br /&gt;&lt;pre&gt;mysql&gt; SELECT DATE_FORMAT(CURDATE() - INTERVAL DAYOFWEEK(CURDATE()) DAY, '%W, %M %D, %Y') AS d;&lt;br /&gt;+----------------------------+&lt;br /&gt;| d                          |&lt;br /&gt;+----------------------------+&lt;br /&gt;| Saturday, August 1st, 2009 |&lt;br /&gt;+----------------------------+&lt;br /&gt;&lt;/pre&gt;That gets you the date for the Saturday that ends the previous week.  Then "this Wednesday" or "last Thursday" or almost any other relative date is simple to calculate, provided one caveat. You must do one additional check to see if were already passed the target day of the week.&lt;br /&gt;&lt;br /&gt;If today is Tuesday, then "this Wednesday" is 4 days from our reference date above.  However if it's already Thursday of this week, then we have to add 7+4 days instead.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; set @dayofweek=4;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; SELECT DATE_FORMAT(CURDATE() - INTERVAL DAYOFWEEK(CURDATE()) DAY + INTERVAL (DAYOFWEEK(CURDATE())&gt;=@dayofweek)*7+@dayofweek DAY, '%W, %M %D, %Y') AS d;&lt;br /&gt;+-----------------------------+&lt;br /&gt;| d                           |&lt;br /&gt;+-----------------------------+&lt;br /&gt;| Wednesday, August 5th, 2009 |&lt;br /&gt;+-----------------------------+&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;For "next Wednesday," you only have to add another week.&lt;br /&gt;&lt;br /&gt;For dates such as "last Monday" the process is similar:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; set @dayofweek=2;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; SELECT DATE_FORMAT(CURDATE() - INTERVAL DAYOFWEEK(CURDATE()) DAY + INTERVAL (DAYOFWEEK(CURDATE())&lt;=@dayofweek)*-7 + @dayofweek DAY, '%W, %M %D, %Y') AS d;&lt;br /&gt;+--------------------------+&lt;br /&gt;| d                        |&lt;br /&gt;+--------------------------+&lt;br /&gt;| Monday, August 3rd, 2009 |&lt;br /&gt;+--------------------------+&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;Note that it's not necessary to do this using SET @dayofweek=N;  That is simply to make the query a little more readable in this case.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-2429595810493732549?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/2429595810493732549/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=2429595810493732549' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2429595810493732549'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2429595810493732549'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/04/how-to-select-this-wednesday-or-other.html' title='How to Select &apos;this Wednesday&apos; or Other Relative Dates'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-2080440379204254043</id><published>2009-07-31T11:08:00.001-07:00</published><updated>2009-08-03T11:49:28.762-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>MySQL Midpoint Stored Function</title><content type='html'>MySQL GIS functions don't seem to include a Midpoint function; it's trivial to write one, but I thought posting this might save someone a few minutes.  Enjoy!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;DROP FUNCTION IF EXISTS midpoint;&lt;br /&gt;DELIMITER //&lt;br /&gt;CREATE FUNCTION midpoint(line GEOMETRY)&lt;br /&gt;RETURNS POINT DETERMINISTIC&lt;br /&gt;BEGIN&lt;br /&gt;&lt;br /&gt;IF NumPoints(line) != 2 THEN&lt;br /&gt; RETURN NULL;&lt;br /&gt;END IF;&lt;br /&gt;&lt;br /&gt;RETURN GeomFromText(CONCAT('POINT(',(X(StartPoint(line)) +X(EndPoint(line))) / 2,' ',(Y(StartPoint(line))+Y(EndPoint(line))) / 2,')'));&lt;br /&gt;END //&lt;br /&gt;DELIMITER ;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-2080440379204254043?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/2080440379204254043/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=2080440379204254043' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2080440379204254043'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2080440379204254043'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/07/mysql-midpoint-stored-function.html' title='MySQL Midpoint Stored Function'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-2423046943367093957</id><published>2009-07-30T13:35:00.000-07:00</published><updated>2009-08-03T11:50:07.312-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Decimal Math Precision</title><content type='html'>I ran into this little problem today;  when dividing two whole number the result MySQL gave me was nowhere near as precise as I needed it to be.  The values were truncated which caused me some errors in my application.&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; select 1*0.00001;&lt;br /&gt;+-----------+&lt;br /&gt;| 1*0.00001 |&lt;br /&gt;+-----------+&lt;br /&gt;|   0.00001 |&lt;br /&gt;+-----------+&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;Ok, that looks fine. Let's do the same thing using division:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;mysql&gt; select 1/100000;&lt;br /&gt;+----------+&lt;br /&gt;| 1/100000 |&lt;br /&gt;+----------+&lt;br /&gt;|   0.0000 |&lt;br /&gt;+----------+&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;Oops!  What happened?&lt;br /&gt;&lt;br /&gt;The issue is that the maximum precision of the result value depends on the number of decimal places in the arguments.  Since the second version uses two whole numbers, the result uses the default number of decimal places, which is 4.  So you really have to take care to make sure you're getting the precision you want out of your math operations!&lt;br /&gt;&lt;br /&gt;To MySQL's credit, &lt;/span&gt;&lt;a style="font-family: georgia;" href="http://dev.mysql.com/doc/refman/5.0/en/arithmetic-functions.html"&gt;the behavior is clearly documented &lt;/a&gt;&lt;span style="font-family:georgia;"&gt;(however, I maintain it's still a bit troublesome because how often would you expect the division operator to have a list of instructions and caveats?)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;There are at least a couple simple solutions:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; SELECT CAST(1/100000 AS DECIMAL(8,5) );&lt;br /&gt;+---------------------------------+&lt;br /&gt;| CAST(1/100000 AS DECIMAL(8,5) ) |&lt;br /&gt;+---------------------------------+&lt;br /&gt;|                         0.00001 |&lt;br /&gt;+---------------------------------+&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;Or&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;mysql&gt; SET div_precision_increment=5;&lt;br /&gt;mysql&gt; SELECT 1/100000;&lt;br /&gt;+----------+&lt;br /&gt;| 1/100000 |&lt;br /&gt;+----------+&lt;br /&gt;|  0.00001 |&lt;br /&gt;+----------+&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="font-family:georgia;"&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="font-family:georgia;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-2423046943367093957?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/2423046943367093957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=2423046943367093957' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2423046943367093957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2423046943367093957'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/07/decimal-math-precision.html' title='Decimal Math Precision'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-6313359718176855897</id><published>2009-05-18T10:42:00.000-07:00</published><updated>2009-12-16T16:57:47.456-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='gis'/><title type='text'>Reverse Geocoding using MySQL GIS</title><content type='html'>Normal geocoding is the process of taking an address and converting it into latitude and longitude.  Reverse geocoding goes the other direction; given a latitude and longitude we want to know the closest street address.&lt;br /&gt;&lt;br /&gt;Google Maps API, Yahoo Maps, Microsoft, &lt;a href="http://www.geonames.org/"&gt; GeoNames.Org&lt;/a&gt;  and others can provide both these services, but they either cost, or have restrictive terms of service that makes them unavailable for high volume applications. See &lt;a href="http://www.geonames.org/products/reverse-geocoding.html%20GeoNames%20Reverse%20Geocoding%5D%20for%20example.%20%20Forward%20geocoding%20can%20be%20accomplished%20easily%20through%20a%20PERL%20module%20%3CA%20HREF=" org="" sderle="" us="" pm=""&gt; Geo::Coder::US&lt;/a&gt;  (for US addresses only).  This uses a local data source, and requires you to download the data from the US Census manually.  It can import a format known as &lt;a href="http://www.census.gov/geo/www/tiger/"&gt; Tiger/Line&lt;/a&gt; into a local &lt;a href="http://search.cpan.org/%7Epmqs/BerkeleyDB-0.38/BerkeleyDB.pod"&gt; BerkeleyDB&lt;/a&gt; file, then uses that to calculate lat/lng values from street addresses.  However, it provides no method for reverse lookups.&lt;br /&gt;&lt;br /&gt;This is because the parsed data is simply a tree of values with the following form:&lt;br /&gt;&lt;pre&gt; /zip/street name/type//&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Each key contains a list of numbers which are actually the spatial representation of the street, broken down into segments for each block.  It is a series of line segments, which are called "shape files," because they describe the streets in spatial terms. For example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;/94702/Ashby/Ave// :  $VAR1 = [&lt;br /&gt;# record 1&lt;br /&gt;      37848907,   # start lat (from) $value/1_000_000 = the decimal form.&lt;br /&gt;      122297583,  # start lng&lt;br /&gt;      400,  # even num start (range)&lt;br /&gt;      598,  # even num end&lt;br /&gt;      0,  # zero street marker&lt;br /&gt;      401,  # odd num start&lt;br /&gt;      599,  # odd num end&lt;br /&gt;#record 2 .. N&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#   ends with final point&lt;br /&gt;      37853426,&lt;br /&gt;      122279480&lt;br /&gt;    ];&lt;br /&gt;] )&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;For forward geocoding, you simply start with a known address, then use that as a key to find the appropriate spatial data for that street.  The lat/lon is an interpolated value.&lt;br /&gt;&lt;br /&gt;For reverse geocoding, it's impossible to start with a lat/lng and return the street without iterating for each and every street and doing comparisons to find the "closest."  To be able to do this efficiently, we need a spatial index on this data.  CPAN has a module for &lt;a href="http://search.cpan.org/%7Eajolma/Tree-R-0.05/lib/Tree/R.pm"&gt; R-Tree&lt;/a&gt; indexes, but mysql also provides a convenient way to store all this shape data, and do spatial queries.&lt;br /&gt;&lt;br /&gt;This example uses a modified version of &lt;a href="http://forge.mysql.com/wiki/GIS_Functions"&gt; MySQL, which includes improved GIS support&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Building A Reverse Geocoding DB&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Get the Data&lt;/span&gt;&lt;br /&gt;Here's a quick PERL script to spider the Census Bureau site and download all the data.  It's split by State and county so there are a lot of files:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;use HTML::Parser;&lt;br /&gt;my $base_url = 'http://www2.census.gov/geo/tiger/tigerua';&lt;br /&gt;&lt;br /&gt;package IdentityParse;&lt;br /&gt;use base "HTML::Parser";&lt;br /&gt;use Data::Dumper;&lt;br /&gt;use LWP::Simple;&lt;br /&gt;&lt;br /&gt;my @urls;&lt;br /&gt;my $burl;&lt;br /&gt;&lt;br /&gt;sub start {&lt;br /&gt;    my ($self, $tag ,$attr, $attrseq, $origtext) = @_;&lt;br /&gt;    unless ($tag =~ /^a$/) {&lt;br /&gt;            return;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    if (defined $attr-&gt;{'href'}) {&lt;br /&gt;            if ($attr-&gt;{'href'} =~ /[A-Z]\/$/) { # directories: add to list&lt;br /&gt;                    push @urls, "$burl/$attr-&gt;{'href'}";&lt;br /&gt;            }&lt;br /&gt;            if ($attr-&gt;{'href'} =~ /\.ZIP$/i ) { # zip file, download it&lt;br /&gt;                    if (! -e $attr-&gt;{'href'}) {&lt;br /&gt;                            # we don't have it yet!&lt;br /&gt;                            `wget '$burl/$attr-&gt;{href}'`;&lt;br /&gt;                    }&lt;br /&gt;            }&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;my $p = new IdentityParse;&lt;br /&gt;&lt;br /&gt;push @urls, $base_url;&lt;br /&gt;while (@urls) {&lt;br /&gt;my $url = shift @urls;&lt;br /&gt;my $content = get $url;&lt;br /&gt;$burl = $url;&lt;br /&gt;$p-&gt;parse($content);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Convert Using Geo::Coder::US&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;* Install Geo::Coder::US&lt;br /&gt;&lt;br /&gt;This comes with a series of support scripts, among these is import_tiger_zip.pl  which will import all the files into a BerkeleyDB file.&lt;br /&gt;&lt;pre&gt; find /path/to/tigerline/data/ -name "*ZIP"  \&lt;br /&gt;-exec ./import_tiger_zip.pl geocoder.db {} \;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The result is:&lt;br /&gt;&lt;pre&gt; -rw-rw-r-- 1 root root 724M Apr 24 12:15 geocoder.db&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Import into MySQL&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Once you have the geocoder.db output, this script can be used to generate sql statements for inserting this into a spatially aware mysql database.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;for i in `cat z2`; do perl ./bdb_mysql_export.pl \&lt;br /&gt;geocoder.db $i &gt;&gt; geocoder.sql; done&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;#!/usr/bin/perl -w&lt;br /&gt;&lt;br /&gt;use Geo::Coder::US;&lt;br /&gt;use DB_File;&lt;br /&gt;use Data::Dumper;&lt;br /&gt;use strict;&lt;br /&gt;use vars qw(%db *db);&lt;br /&gt;&lt;br /&gt;my $filename = shift @ARGV or die "Usage: $0 \n";&lt;br /&gt;&lt;br /&gt;Geo::Coder::US-&amp;gt;set_db($filename);&lt;br /&gt;my @streets;&lt;br /&gt;my %streets;&lt;br /&gt;for my $arg (@ARGV) {&lt;br /&gt; my $val;&lt;br /&gt; my $path = "/$arg/";&lt;br /&gt; my $zip  = $arg;&lt;br /&gt; my ( $key, $value );&lt;br /&gt; $Geo::Coder::US::DBO-&amp;gt;seq( $key = $path, $value, R_CURSOR );&lt;br /&gt; while ( $key and $value and $key =~ /^$path/i ) {&lt;br /&gt;     $streets{$key} =&lt;br /&gt;       [ split( ", ", join( ", ", unpack( "w*", $value ) ) ) ];&lt;br /&gt;     $Geo::Coder::US::DBO-&amp;gt;seq( $key, $value, R_NEXT );&lt;br /&gt; }&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;my $DEBUG = 0;&lt;br /&gt;&lt;br /&gt;#print Dumper \%streets;&lt;br /&gt;#my @data = values %streets;&lt;br /&gt;foreach my $k (keys %streets) {&lt;br /&gt; my @data = @{ $streets{$k} } ;&lt;br /&gt;# if (lc($k) =~ /fairoaks/) { $DEBUG = 1; } else { $DEBUG = 0; };&lt;br /&gt;# print "Doing street: $k\n" if ($DEBUG);&lt;br /&gt;# print Dumper(\@data) if ($DEBUG);&lt;br /&gt;# next if (!$DEBUG);&lt;br /&gt;my $first = 1;&lt;br /&gt; my (@from, @to, @range, @best, $matched, @range2);&lt;br /&gt;    shift @data;&lt;br /&gt;    while (@data) {&lt;br /&gt;            @from = splice( @data, 0, 2 ) if $data[0] &amp;gt; 1_000_000;&lt;br /&gt;            while (@data and $data[0] &amp;lt; 1_000_000) {&lt;br /&gt;                shift @data if not $data[0]; # skip street-side zero marker&lt;br /&gt;                @range = splice( @data, 0, 2 );&lt;br /&gt;                shift @data while @data and $data[0] &amp;lt; 1_000_000;&lt;br /&gt;            }&lt;br /&gt;            last unless @data;&lt;br /&gt;            @to = splice( @data, 0, 2 );&lt;br /&gt;&lt;br /&gt;           print "From  : " . Dumper(\@from)."\n"  if ($DEBUG);;&lt;br /&gt;           print "To    : ". Dumper(\@to)."\n"  if ($DEBUG);;&lt;br /&gt;           print "Range1: ". Dumper(\@range). "\n"  if ($DEBUG);;&lt;br /&gt;           print "Range2: ". Dumper(\@range2) ."\n"  if ($DEBUG);;&lt;br /&gt;           my %found;&lt;br /&gt;     @found{qw{ zip street type prefix suffix }}  = split "/", substr($k, 1), 5;&lt;br /&gt;           @found{qw{ frlat frlong tolat tolong }}      = map( $_ / 1_000_000, @from, @to );&lt;br /&gt;           @found{qw{ fradd toadd }}          = @range;&lt;br /&gt;           $found{$_} *= -1 for qw/frlong tolong/;&lt;br /&gt;        print "Found :". Dumper(\%found)."\n"  if ($DEBUG);&lt;br /&gt;        if ( defined($found{fradd})) {&lt;br /&gt;          $found{street} =~ s/'/''/g;&lt;br /&gt;        next if (!defined($found{tolong}));&lt;br /&gt;&lt;br /&gt;        my $q =  "INSERT INTO reverse_geocode (zip, street, type, prefix, from_addr, to_addr, line_segment) VALUES ('$found{zip}','$found{street}','$found{type}', '$found{prefix}',$found{fradd}, $found{toadd}, GeomFromText('LineString($found{frlat} $found{frlong}, $found{tolat} $found{tolong})') );";&lt;br /&gt;        print $q ."\n"; # if ($DEBUG);&lt;br /&gt;        }&lt;br /&gt;        if (defined($data[0]) &amp;&amp; $data[0] &amp;gt; 1_000_000) {&lt;br /&gt;                @to = splice(@data, 0,2 );&lt;br /&gt;        }&lt;br /&gt;        }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here's the basic table structure:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;CREATE TABLE `reverse_geocode` (&lt;br /&gt;`zip` varchar(5) NOT NULL DEFAULT '0',&lt;br /&gt;`street` varchar(64) NOT NULL DEFAULT '',&lt;br /&gt;`type` varchar(8) NOT NULL DEFAULT '',&lt;br /&gt;`from_addr` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;`to_addr` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;`line_segment` geometry NOT NULL,&lt;br /&gt;`prefix` varchar(4) NOT NULL DEFAULT '',&lt;br /&gt;SPATIAL KEY `line_segment` (`line_segment`)&lt;br /&gt;) ENGINE=MyISAM DEFAULT CHARSET=latin1;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;And the .sql file can be imported with something like:&lt;br /&gt;&lt;pre&gt; mysql -u&lt;user&gt; -p geocode &lt;&gt;&lt;/user&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The result is&lt;br /&gt;&lt;pre&gt;mysql&gt; show table status like 'reverse_geocode' \G&lt;br /&gt;*************************** 1. row ***************************&lt;br /&gt;       Name: reverse_geocode&lt;br /&gt;     Engine: MyISAM&lt;br /&gt;    Version: 10&lt;br /&gt; Row_format: Dynamic&lt;br /&gt;       Rows: 16005729&lt;br /&gt;Avg_row_length: 81&lt;br /&gt;Data_length: 1306599484&lt;br /&gt;Max_data_length: 281474976710655&lt;br /&gt;Index_length: 1021292544&lt;br /&gt;  Data_free: 0&lt;br /&gt;Auto_increment: NULL&lt;br /&gt;Create_time: 2009-04-27 13:08:31&lt;br /&gt;Update_time: 2009-04-27 14:34:36&lt;br /&gt; Check_time: NULL&lt;br /&gt;  Collation: latin1_swedish_ci&lt;br /&gt;   Checksum: NULL&lt;br /&gt;Create_options:&lt;br /&gt;    Comment:&lt;br /&gt;1 row in set (0.00 sec)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Query the Data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Simply use the MBRContains fuction, and pass it a bounding box to cull results too far away.  Sort by distance and take the closest result.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;mysql&gt; SET @center = GeomFromText('POINT(37.372241 -122.021671)');&lt;br /&gt;Query OK, 0 rows affected (0.00 sec)&lt;br /&gt;&lt;br /&gt;mysql&gt; SET @radius = 0.005;&lt;br /&gt;Query OK, 0 rows affected (0.00 sec)&lt;br /&gt;&lt;br /&gt;mysql&gt; SET @bbox = GeomFromText(CONCAT('POLYGON((',&lt;br /&gt;-&gt; X(@center) - @radius, ' ', Y(@center) - @radius, ',',&lt;br /&gt;-&gt; X(@center) + @radius, ' ', Y(@center) - @radius, ',',&lt;br /&gt;-&gt; X(@center) + @radius, ' ', Y(@center) + @radius, ',',&lt;br /&gt;-&gt; X(@center) - @radius, ' ', Y(@center) + @radius, ',',&lt;br /&gt;-&gt; X(@center) - @radius, ' ', Y(@center) - @radius, '))')&lt;br /&gt;-&gt; );&lt;br /&gt;Query OK, 0 rows affected (0.00 sec)&lt;br /&gt;&lt;br /&gt;mysql&gt; select zip, prefix, street, type, from_addr, to_addr, astext(line_segment), Distance(@center,line_segment) as dist&lt;br /&gt;FROM reverse_geocode where MBRContains(@bbox, line_segment) order by dist limit 1;&lt;br /&gt;+-------+--------+----------+------+-----------+---------+---------------------------------------------------------+------+&lt;br /&gt;| zip   | prefix | street   | type | from_addr | to_addr | astext(line_segment)                                    | dist |&lt;br /&gt;+-------+--------+----------+------+-----------+---------+---------------------------------------------------------+------+&lt;br /&gt;| 94086 | S      | Fairoaks | Ave  |       323 |     331 | LINESTRING(37.375141 -122.020671,37.372241 -122.021671) |    0 |&lt;br /&gt;+-------+--------+----------+------+-----------+---------+---------------------------------------------------------+------+&lt;br /&gt;1 row in set (0.00 sec)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Now for my project, this was good enough; knowing the closest block of the street produced the results I wanted.&lt;br /&gt;&lt;br /&gt;For more accuracy, there are still a few issue that need to be solved:&lt;br /&gt;&lt;br /&gt;1. For the even/odd side of each block the same line segment is used.  In this case I only keep one side of the street, however the extraction script would have to be modified a little bit if you wanted both sides.  It's probably most effective to add two more from/to address columns in the table to store the even/odd numbers.&lt;br /&gt;&lt;br /&gt;2. You would have to use some basic geometry to figure out which side of the street the point is on, then use that to choose the even or odd numbers.&lt;br /&gt;&lt;br /&gt;3. Using some more geometry, you could also find an approximate address along the line.  Essentially you would interpolate a value, which may or may not be a valid address.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-6313359718176855897?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/6313359718176855897/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=6313359718176855897' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/6313359718176855897'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/6313359718176855897'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/05/reverse-geocoding-using-mysql-gis.html' title='Reverse Geocoding using MySQL GIS'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-8008743237880145059</id><published>2009-03-12T16:54:00.000-07:00</published><updated>2009-03-12T22:43:32.542-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>LOAD DATA INFILE Gotcha</title><content type='html'>I was recently asked about errors in a LOAD DATA INFILE statement.  Everything seemed straightforward enough except it was throwing warnings for some columns and not loading the correct values.&lt;br /&gt;&lt;br /&gt;Given a simple set of data like this in a text file:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;\N    \N&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;18000    \N&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;18900    18900&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;546680    546680&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;and a table schema like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;CREATE TABLE numbers (a int , b int );      &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Why would you get a result like this?&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; &lt;/span&gt;&lt;span style="font-family:courier new;"&gt;LOAD DATA INFILE 'c:\\test.txt' &lt;/span&gt;&lt;span style="font-family:courier new;"&gt;INTO TABLE&lt;/span&gt;&lt;span style="font-family:courier new;"&gt; numbers&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;Query OK, 4 rows affected, 2 warnings (0.00 sec)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;Records: 4  Deleted: 0  Skipped: 0  Warnings: 0&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; show warnings;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+---------+------+-------------------------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;| Level   | Code | Message                                               |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+---------+------+-------------------------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;' for column 'b' at row 1 |  integer value: 'N&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;' for column 'b' at row 2 |  integer value: 'N&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+---------+------+-------------------------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;2 rows in set (0.00 sec)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;mysql&gt; SELECT * FROM numbers;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+--------+--------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;| a      | b      |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+--------+--------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;|   NULL |      0 |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;|  18000 |      0 |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;|  18900 |  18900 |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;| 546680 | 546680 |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;+--------+--------+&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;4 rows in set (0.00 sec)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Notice it only freaked out on the NULL values and the end of the line!&lt;br /&gt;&lt;br /&gt;The answer is one that's come up before, but for some reason I had to break out the hex editor before I remembered:  The file was created on a windows machine, and had \r\n at the end of each line instead of just \n.&lt;br /&gt;&lt;br /&gt;The solution is simple:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;LOAD DATA INFILE 'c:\\test.txt' &lt;/span&gt;&lt;span style="font-family:courier new;"&gt;INTO TABLE&lt;/span&gt;&lt;span style="font-family:courier new;"&gt; numbers &lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;LINES TERMINATED BY '\r\n'&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-8008743237880145059?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/8008743237880145059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=8008743237880145059' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8008743237880145059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/8008743237880145059'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/03/load-data-infile-gotcha.html' title='LOAD DATA INFILE Gotcha'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-2129529406459995247</id><published>2009-03-10T18:14:00.001-07:00</published><updated>2009-03-10T18:25:45.569-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Why can't you use subqueries on temp tables?</title><content type='html'>So recently I advised someone to do a subquery like:&lt;br /&gt;&lt;br /&gt;SELECT distance FROM table WHERE distance &lt; ( SELECT MAX(distance)/2 FROM table)&lt;br /&gt;&lt;br /&gt;The key part was the subquery.  They came back and said it returned an error:&lt;br /&gt;&lt;br /&gt;"Can't reopen table"&lt;br /&gt;&lt;br /&gt;The problem is this was on a TEMPORARY table.  The manual clearly states this as a restriction:&lt;br /&gt;&lt;br /&gt;You cannot refer to a &lt;code class="literal"&gt;TEMPORARY&lt;/code&gt; table               more than once in the same query. (http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html)&lt;br /&gt;&lt;br /&gt;However I decided to do a bit of digging to find out why that's the case.  So I found a nicely commented answer, straight from the source's mouth:&lt;br /&gt;&lt;br /&gt;/*&lt;br /&gt;  We're trying to use the same temporary table twice in a query.&lt;br /&gt;  Right now we don't support this because a temporary table&lt;br /&gt;  is always represented by only one TABLE object in THD, and&lt;br /&gt;  it can not be cloned.  Emit an error for an unsupported behavior&lt;br /&gt;*/   from sql/sql_base.cc open_table&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-2129529406459995247?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/2129529406459995247/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=2129529406459995247' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2129529406459995247'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/2129529406459995247'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/03/why-cant-you-use-subqueries-on-temp.html' title='Why can&apos;t you use subqueries on temp tables?'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-9134601819767767396</id><published>2009-01-28T16:43:00.000-08:00</published><updated>2009-01-28T17:42:35.715-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>MySQL AES Encryption Compatability</title><content type='html'>So what happens if you want to use the mysql functions AES_ENCRYPT / AES_DECRYPT to handle data encryption, but you might also need to write a tool that takes the encrypted data and accesses or updates it outside of the database?  This operation can be hit or miss depending on your library's implementation of how it handles the encryption key passed to the actual encryption function.  There is no standard on how this is handled.&lt;br /&gt;&lt;br /&gt;AES only defines that it accepts an encryption key of AES_KEY_LENGTH/8 bytes.  So for 128 bit encyption you need to give it a 16 byte key.&lt;br /&gt;&lt;br /&gt;In mysql you can do:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;AES_ENCRYPT('mykey','here is a string to encrypt');&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:georgia;"&gt;But we just passed it a 5 character string, where does the 16 byte key come from?  This is the part that can differ between libraries.  MySQL takes your string and converts it like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;static int my_aes_create_key(KEYINSTANCE *aes_key,&lt;br /&gt;                          enum encrypt_dir direction, const char *key,&lt;br /&gt;                          int key_length)&lt;br /&gt;{&lt;br /&gt;uint8 rkey[AES_KEY_LENGTH/8];  /* The real key to be used for encryption */&lt;br /&gt;uint8 *rkey_end=rkey+AES_KEY_LENGTH/8; /* Real key boundary */&lt;br /&gt;uint8 *ptr;                   /* Start of the real key*/&lt;br /&gt;const char *sptr;                     /* Start of the working key */&lt;br /&gt;const char *key_end=key+key_length;   /* Working key boundary*/&lt;br /&gt;&lt;br /&gt;bzero((char*) rkey,AES_KEY_LENGTH/8);      /* Set initial key  */&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;  for (ptr= rkey, sptr= key; sptr &lt;&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;  {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;    if (ptr == rkey_end)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;      ptr= rkey;  /*  Just loop over tmp_key until we used all key */&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;    *ptr^= (uint8) *sptr;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 102, 0);"&gt;  }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;snip&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;if (direction == AES_DECRYPT)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;    aes_key-&gt;nr = rijndaelKeySetupDec(aes_key-&gt;rk, rkey, AES_KEY_LENGTH);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; else&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;    aes_key-&gt;nr = rijndaelKeySetupEnc(aes_key-&gt;rk, rkey, AES_KEY_LENGTH);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; return 0;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;The important bit is in green.  That's the code from the mysql source, in mysys/my_aes.c . The algorithm just creates a 16 byte buffer set to all zero, then loops through all the characters of the string you provide  and does an assignment with bitwise OR between the two values.  If we iterate until we hit the end of the 16 byte buffer, we just start over from the beginning doing ^=.  For strings shorter than 16 characters, we stop at the end of the string.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;So for our sample password above, the resulting key would be this:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt; select hex('mykey\0\0\0\0\0\0\0\0\0\0\0');&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;+------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;| hex('mykey\0\0\0\0\0\0\0\0\0\0\0') |&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;+------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;| 6D796B65790000000000000000000000   |&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;+------------------------------------+&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;There are two important aspects of this method:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;1. For any password you supply that is less than 16 bytes, the resulting AES key will simply be that value padded to 16 bytes with null zeros&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;2. For any password that is exactly 16 bytes, the resulting AES key is exactly the same as the supplied password.&lt;br /&gt;&lt;br /&gt;*WHEW* That was quite a bit, but now lets see how that applies to our attempt to descrypt mysql's values from outside the database or create encrypted strings that mysql can decrypt.  For this example lets look at java:&lt;br /&gt;&lt;br /&gt;From: &lt;a href="http://java.sun.com/developer/technicalArticles/Security/AES/AES_v1.html"&gt;http://java.sun.com/developer/technicalArticles/Security/AES/AES_v1.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/snip&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;pre style="color: rgb(0, 0, 0);"&gt;&lt;code&gt;       // Get the KeyGenerator&lt;strong&gt;&lt;br /&gt;   KeyGenerator kgen = KeyGenerator.getInstance("AES");&lt;br /&gt;   kgen.init(128); // 192 and 256 bits may not be available&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;   // Generate the secret key specs.&lt;br /&gt;   SecretKey skey = kgen.generateKey();&lt;br /&gt;   byte[] raw = skey.getEncoded();&lt;strong&gt;&lt;br /&gt;   SecretKeySpec skeySpec = new SecretKeySpec(raw, "AES");&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;   // Instantiate the cipher&lt;strong&gt;&lt;br /&gt;   Cipher cipher = Cipher.getInstance("AES");&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;   cipher.init(Cipher.ENCRYPT_MODE, skeySpec);&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;But wait, where is our password?  Well this example they give you has you use KeyGenerator to create a key randomly.  That's not very usefull if you ever hope to decrypt your data later.   More commonly for password based encryption we would&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-family:courier new;" &gt;&lt;span style="font-family:georgia;"&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="font-family:georgia;"&gt; do something like:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="font-family:georgia;"&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;&lt;br /&gt;  &lt;span style="color: rgb(255, 102, 0);"&gt;byte[] keyBytes = new byte[] { 0x6D, 0x79, 0x6B, 0x65, 0x79, 0x00, 0x00, 0x00, 0x00, 0x00,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 102, 0);"&gt;       0x00, 0x00, 0x00, 0x00, 0x00, 0x00  };&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 102, 0);"&gt;   SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 102, 0);"&gt;   // Instantiate the cipher&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 102, 0);"&gt;   Cipher cipher = Cipher.getInstance("AES");&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 102, 0);"&gt;  cipher.init(Cipher.ENCRYPT_MODE, key);&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;In this keyBytes is simply a byte array defining our 16 byte encryption key.  Notice the values, they're the same as our password above padded out with null zeros!  In this example the encrypted output would exactly match what mysql would produce.  Yay, compatable encryption&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);font-family:georgia;" &gt;  I'll leave it as an exercise to the reader to create a function in their language of choice to take a string as input and produce a 16 byte array that matches the result of the mysql source above.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-9134601819767767396?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/9134601819767767396/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=9134601819767767396' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/9134601819767767396'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/9134601819767767396'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2009/01/mysql-aes-encryption-compatability.html' title='MySQL AES Encryption Compatability'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6346091698278358988.post-7737223120505617054</id><published>2008-11-26T19:19:00.000-08:00</published><updated>2008-11-26T19:31:02.735-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Find out what process is connected to mysql</title><content type='html'>&lt;p&gt;This works when you have access to the connected machine -- I've done this on Redhat/CentOs, but don't know about other linux distros.&lt;br /&gt;&lt;/p&gt; &lt;ul&gt;&lt;li&gt; In the mysql processlist, you'll see the host/port connected: &lt;/li&gt;&lt;/ul&gt; &lt;pre&gt;192.168.10.100:60786&lt;br /&gt;&lt;/pre&gt; &lt;ul&gt;&lt;li&gt;Connect to the machine shown and run:&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt; &lt;pre&gt;&gt; socklist | grep 60786&lt;br /&gt;tcp   60786   51371094       0  17518   27  httpd&lt;br /&gt;&lt;br /&gt;&gt; find /proc/ -lname "*51371094*" -ls&lt;br /&gt;&lt;/pre&gt; &lt;ul&gt;&lt;li&gt; this will give you the pid of the process that has the socket open to mysql by searching the file descriptor symlinks for the inode that coresponds to the socket on that port. &lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6346091698278358988-7737223120505617054?l=gtowey.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gtowey.blogspot.com/feeds/7737223120505617054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6346091698278358988&amp;postID=7737223120505617054' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/7737223120505617054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6346091698278358988/posts/default/7737223120505617054'/><link rel='alternate' type='text/html' href='http://gtowey.blogspot.com/2008/11/find-out-what-process-is-connected-to.html' title='Find out what process is connected to mysql'/><author><name>gtowey</name><uri>http://www.blogger.com/profile/11697214385828484372</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
