worldgonemad.com

If the world was perfect, it wouldn't be. – Yogi Berra

My daughter and I (and everyone else in the subway car) got to enjoy the constant chatter of an older Brooklynite the full way from Grand Army Plaza to Hoyt Street this morning:

[parrot-like]

Happy Wednesday! Peace and love!

Happy Wednesday! Peace and love!

Adam had a party on the fourth day! What did he call it?

[singing]

Adam had a par-ty!

Adam had a par-ty!

[repeat from beginning]

Try it a couple times. It will stick in your head.

My 7-year-old turns to me after we get off and says “It’s not polite to make so much noise on the subway.” And then giggles.

Me: But he was pretty funny.

Sophia: Yeah.

Adam had a par-ty!

If you are fortunate enough to be in Beijing this week for Oracle Openworld, there are several sessions planned on MySQL to choose from. The English site is somewhat hard to navigate, so I humbly submit a tiny URL that will point you to the appropriate session search results.

Few act from pure motives, and I’m shamelessly plugging my own talks: Improving the Scalability of Web Applications with MySQL Replication, Performance and Scalability Enhancements in MySQL Release 5.5, and MySQL Strategy: What’s Next? (that last I’m merely technical backup for Richard Mason).

There are several other outstanding talks on MySQL migrations, tuning and Cluster as well, and of course an amazing amount of technical talent in one building. It doesn’t hurt to be walking distance from the Water Cube (now a public pool, hoping to get in a swim there) and the Bird’s Nest.

If you do find yourself at OOW Beijing, be sure to visit the MySQL pod as well in the exhibitor area. It’s not always a cakewalk manning a booth.

I was reading another post comparing the different forks of MySQL (disclaimer: my employer), and again it seemed to me the term “fork” is somewhat imprecise. I agree with Morgan Tocker that “delta” does not capture these other creatures either – after all, isn’t a delta what makes a fork not a copy?

Wikipedia cites Eric Raymond’s definition that “The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community”, but also notes “However, this is not common present usage”. Kind of a shame – at least esr drew a hard line. The definition we’re left with could include any copy of MySQL with a patch or even a UDF attached to it.

“Fork” implies a complete departure. Drizzle is no doubt a fork of MySQL. They took the source, dramatically reworked it, and made something brand new. Can’t wait to see where it goes. Tracking MySQL development and maintaining a set of additions and enhancements seems to be a different concept.

“Distribution” is the OS-derived term that some people prefer, but I’d like to propose a different one: MySQL now has several spins. Quoth Fedora: “spins are alternate version (sic) of [the software], tailored for various types of users via hand-picked application set or customizations.”

What’s the difference? Drizzle has a limited ability to incorporate what the MySQL development team comes up with next, and vice versa. There is much compatibility, but it is likely only going to become less over time, not more. If you are using a “drop-in replacement branch of the MySQL Database Server“, or a binary that “adds enhancements to the MySQL server code“, I should think you are counting on it not being a fork. So what seems like an issue of semantics is really an attempt to give end users a sense of current & future compatibility.

Terminology is important. Comments are welcome.

Shouldn’t really be that hard, should it? Ubuntu (current LTS is fine) on a mini-tower, with the hardware certified and supported.

Dell has a lot to say about linux, but pretty much on the server. You get to the laptops & workstations, and Linux is up there with FreeDOS. I’m scared to even ask for a quote (which you have to do by email) from Penguin Computing, because they are really geared toward the power workstation.

I don’t think I’m so unique: I’ve had a linux desktop for 10+ years, my employer pays for my main work machine, and I want a second box for home stuff. ‘Zat so hard?

If you have done a good job of building your rails models, you may find that they are helpful for your non-rails system maintenance and such. They may even be necessary to reuse if you follow the rails model of using activerecord validations (rather than database RI) to preserve the integrity of your data.

Or you may just find yourself rewriting the same code again and again, and want all that good railsiness to make it easier to write and maintain. Personally I find myself in some instance of ./script/console as often as irb just so I can get the activesupport helper methods ( 4.days.from_now and such) that many rails developers are surprised to find are not actually a standard part of ruby.

So, the good news is it is easy to reuse rails code outside of rails.

Let’s say you want to do some data manipulation (reporting, loading, scrubbing, etc) in your rails db, and want to use your models to do it. A few imports in your ruby script get the necessary environment in place:

require 'rubygems'
require 'yaml'
require 'active_record'
require 'logger'

and a few more will load up your models (note: they’re probably not in the same location as mine, unless you are also working on an app called ‘seweb’ in your home dir):

PROJECT_HOME = "#{ENV['HOME']}/seweb/"
require "#{PROJECT_HOME}/app/models/sales_rep.rb"
require "#{PROJECT_HOME}/app/models/organization.rb"
require "#{PROJECT_HOME}/app/models/team.rb"

Then connect to the appropriate database (note I’m connecting to the development environment – can you guess how I’d connect to ‘test’ or ‘production’?), with rails logging enabled:

ActiveRecord::Base.logger = Logger.new( STDERR )
db_config = YAML::load( File.open("#{PROJECT_HOME}/config/database.yml"))
ActiveRecord::Base.establish_connection( db_config["development"])

And you are good! If you are using a transactional database (such as my personal favorite, MySQL with InnoDB), you can make nice transaction wrappers for your work thusly:

ActiveRecord::Base.transaction do

        begin
            rep = SalesRep.find_or_initialize_by_name( 'Kyllin D. Quota' )
            # create the component parts
            if( rep.changed? )
                rep.organization = Organization.find_or_create_by_name 'APAC'
                rep.team = Team.find_or_create_by_name 'Enterprise'
                rep.save!
            end

        rescue Exception
            # ActiveRecord::Rollback undoes the transaction and is not re-raised outside the block
            raise ActiveRecord::Rollback, "Invalid record for #{rep.name}"
        end

end

Pow. You get your rails sugar, rails validations, rails logging. Are you happy? Why yes, yes you are.

I was warned by my brother a while ago that should I start tweeting, he would stage an intervention. I had already confessed to accounts with facebook, multiply, myspace, and several others (disclaimer/explanation: all of those sites are customers of my longtime employer). Twitter, to the uninitiated, looks like the crack cocaine of social networking that turns the weekend photo-poster into a hardcore jittering lifecaster. Nobody wants to see their family member come to that, right? But follow along, twitter has purpose. Or just skip to the bottom.

 I was never an active friendster user. The first site I used regularly was the more inward-facing multiply.com – and then mostly because it was an easy way to foist photos of my daughter on my extended family. Multiply is more of a community-based, relationship-savvy site than a place to find online friends. If I am cousin-of-John, it makes the assumption (with my consent) that I am interested in content by wife-of-John, brother-of-John, etc. Combined with who I have directly connected to, Multiply can quickly become a nice "walled garden" of family and friends content and connections. Easy media uploading, a slider I can set to "show me what the people close to me are up to", and that’s really about all I needed.

 But then, like everyone else with a facebook account, sometime in the last 18 months a bizarre game of "this is your life" began. Invitations to connect came in from grade school friends, distant family, former co-workers, babysitters and fellow inmates. Scratch that last one.

 I’m in (facebook-only, mostly) contact with literally dozens of people that I hadn’t talked to in 5-25 years, that I seriously doubt I would have ever heard from or about. It’s really interesting, and I enjoy seeing the 2-3 things a week they note about their lives, the occasional photo or link to what they’re doing. Unlike Multiply, facebook defaults to "everyone is an acquaintance" and gives you one firehose of updates sorted by when they were posted. It’s fun to gaze over when I have a few minutes online and see who is up to what.

 Tweets then started to bleed in to the status update page. I knew of, and had zero interest in, this "text the world" service called Twitter. Why would I even use an online outlet to send a message to a friend of mine? I have an email address, cell phone number and at least one IM handle for most of my friends and family… can’t I connect with them easily enough? So why on earth are there facebook updates like "hanging out with @tomjefferson in #philly #constitution #usa"? What’s with all the @ signs and hashes?

Well, it was low-effort enough (no "friending", much less relationship definition, required for most posts) to review the updates for an individual person on twitter. Good friends who lived far away, co-workers involved in a crucial event, things like that (the celebs like Lance came later). To streamline the process of viewing those, I got an account.

 Twitter is the most stalker-friendly social networking site. Twitter by default does not even ask if you know or like anyone, only if you want to "follow" them. Seems kinda creepy for those unfamiliar with the Information Age. And it really is like a big water cooler on the internet. It’s hard to resist joining in after a bit.

The photos of the kids still go to multiply, and I still watch facebook to keep up with a more extended group. But the most inane of updates and commentary are best put out to twitter, and here is why. My frustrated tweet when a desktop social networking client crashed on me:

pantoniades: #gwibber unstable on fedora 11. Need a new desktop Twitter client.

Inane, right? On multiply I’d confuse more people than inform. I promise you most of my family doesn’t understand the context for half of the words above. On facebook, that’s just clutter. But on twitter, here’s what happens next:

pauljakma: @pantoniades grab #gwibber 1.2.0 (e.g. from #Fedora rawhide) – works great

I don’t know Paul Jakma. I’m guessing he’s a gwibber developer or enthusiast. But more importantly, he’s right. Yes, I could likely have found that tip in an IRC channel, bugzilla note or through some google search, but I was really not invested in this client. I got the fix, he kept somebody on his project, and neither one of us invested much effort (presumably he has a tickler on "#gwibber").

I haven’t found any long-lost friends on twitter, and I’m not putting the photos of my kids goofing around in the bathtub on facebook. Perhaps there is one uber-site to rule them all, but I’m also quite happy with the three I’ve got. Provided I can dodge the van my brother sends to take me off to deprogramming.

Once again, I was unable to attend all of the sessions I wanted to at this year’s User Conference, but I was happy to make it to Bob Burgess’ talk on bash scripting with mysql. The slides and examples aren’t up yet, but when they are (which may be as you read this, check the last link), they would probably also be a great tutorial.

So, I got bore^D^D^D^D inspired later that day to put some of the practices into use, and worked up a script to run mysqlslap in various ways against a server, and then added a couple functions to try it out on each storage engine. The script is below in its entirety – bash scripters, please be kind in your comments. No, I didn’t write all this just for the pun in the subject. But I’m not above that.

The result?

Why don’t I use more BLACKHOLE tables? They are blazing fast!

 My results (on my lenovo T61, Fedora 10):

SLAP Base values:
 50 simultaneous connections ||  10 runs through
 Writes : 1000, 500 unique (Commit every 500) || Queries: 1000, 200 unique
 Schema: 4 character columns, 8 numeric with auto-increment PK and 10 secondary indexes
For InnoDB: 0.389 Average, 0.299 Min, 0.651 Max
For MyISAM: 0.364 Average, 0.355 Min, 0.377 Max
For BLACKHOLE: 0.137 Average, 0.124 Min, 0.147 Max
For CSV: n/a Average, n/a Min, n/a Max
For MEMORY: 0.375 Average, 0.363 Min, 0.444 Max
For ARCHIVE: n/a Average, n/a Min, n/a Max
For MRG_MYISAM: n/a Average, n/a Min, n/a Max

The "n/a" ones are tables that, generally for obvious reasons, couldn’t do the slap. My error handling needs work.
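As for that error handling, here is one possible direction, offered as a sketch rather than what the script below actually does. It assumes mysqlslap's usual "Average/Minimum/Maximum number of seconds to run all queries: N seconds" lines, and matches on those labels instead of cutting fixed character columns. The function name parse_timings is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script below): read mysqlslap-style
# output on stdin and print "average minimum maximum" on one line, or
# "n/a n/a n/a" when any of the three timings is missing.
parse_timings() {
    awk -F': ' '
        /Average number of seconds/ { avg = $2 }
        /Minimum number of seconds/ { min = $2 }
        /Maximum number of seconds/ { max = $2 }
        END {
            if (avg == "" || min == "" || max == "") {
                print "n/a n/a n/a"
                exit
            }
            # strip the trailing " seconds" so only the number is left
            sub(/ seconds.*/, "", avg)
            sub(/ seconds.*/, "", min)
            sub(/ seconds.*/, "", max)
            print avg, min, max
        }'
}

# Demo with canned text standing in for real mysqlslap output
printf '\tAverage number of seconds to run all queries: 0.389 seconds\n\tMinimum number of seconds to run all queries: 0.299 seconds\n\tMaximum number of seconds to run all queries: 0.651 seconds\n' | parse_timings
```

The demo prints "0.389 0.299 0.651". An engine that cannot run the test produces no timing lines, so the same pipeline prints "n/a n/a n/a" and the caller always gets exactly three fields to read.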

There are some expected trends that are good to validate – InnoDB improves with more concurrency (in a relative sense), MEMORY has remarkably little fluctuation in response time, things like that. But the marketing guys really have to capitalize on those BLACKHOLE numbers :-)

#!/bin/bash
shopt -s -o nounset
printf "Enter root pwd: "
read -s PASSWORD
# get the list of active engines from MySQL
ENGINES=`mysql -uroot -p$PASSWORD -B -N -e "SELECT ENGINE from ENGINES WHERE SUPPORT<>'NO'" INFORMATION_SCHEMA`
#for e in $ENGINES; do
#    printf "\nFound engines: %s" $e
#done
printf "\nStarting test at %s \n" `date +%H:%M:%S`
# default initial settings
ITERATIONS=10
CONCURRENCY=50
COMMIT=500
WRITES=1000
let "UNIQUE_WRITES=$WRITES/2"
QUERIES=1000
let "UNIQUE_QUERIES=$QUERIES/5"
LOAD_TYPE=mixed
CHARS=4
INTS=8
INDX=10
SLAP="mysqlslap -u root -p$PASSWORD -h 127.0.0.1 -a -c $CONCURRENCY -i $ITERATIONS --auto-generate-sql-add-autoincrement --auto-generate-sql-secondary-indexes=$INDX --auto-generate-sql-write-number=$WRITES --auto-generate-sql-unique-write-number=$UNIQUE_WRITES --auto-generate-sql-unique-query-number=$UNIQUE_QUERIES -x $CHARS  -y $INTS --number-of-queries=$QUERIES --commit=$COMMIT --auto-generate-sql-load-type=$LOAD_TYPE "
function parse_slap {
    if [ $# -lt 1 ]; then
        AVERAGE="n/a"
        MINIMUM="n/a"
        MAXIMUM="n/a"
    else
        AVERAGE=$1
        MINIMUM=$2
        MAXIMUM=$3
    fi    
}
function run_slap {
        printf "%s\n" "SLAP $1:"
        printf "%s\n" " $CONCURRENCY simultaneous connections ||  $ITERATIONS runs through "
        printf "%s\n" " Writes : $WRITES, $UNIQUE_WRITES unique (Commit every $COMMIT) || Queries: $QUERIES, $UNIQUE_QUERIES unique "
        printf "%s\n" " Schema: $CHARS character columns, $INTS numeric with auto-increment PK and $INDX secondary indexes"
for engine in $ENGINES
    do
        # columns 48-53 hold the timing on each result line; join the lines with spaces
        SLAPPED=`$SLAP -e $engine 2>/dev/null | cut -c48-53 | tr '\n' ' '`
        echo $SLAPPED >> $0.txt
        parse_slap $SLAPPED
            printf "For %s: %s Average, %s Min, %s Max\n" $engine $AVERAGE $MINIMUM $MAXIMUM
    done
}
run_slap "Base values"
echo
let WRITES=WRITES*10
let QUERIES=QUERIES*100
let UNIQUE_QUERIES=QUERIES/4
run_slap "more reads"
echo
UNIQUE_QUERIES=$QUERIES
UNIQUE_WRITES=$WRITES
let COMMIT=COMMIT*3
run_slap "More unique reads and writes"
echo
let INTS=INTS*5
let CHARS=CHARS*5
let INDX=INTS+CHARS-1
run_slap "wide indexed tables"
echo
let CONCURRENCY=CONCURRENCY*10
run_slap "massive concurrency"

Lately I find myself running quite a few large compiles and virtual machines, so I tricked out my Thinkpad T61 with a full 4 GB of RAM. Anyone with more than a casual acquaintance with 32-bit operating systems and/or the powers of 2 will quickly see the problem I faced – on next boot Suse proudly reported I had a full 3.8G of RAM available.

 Now, to a man of my age that means that there is more memory lying fallow on this machine than I have had in most of the machines I’ve worked on throughout my career, and that’s just a shame. So I decided to dual-boot and try out a 64-bit distribution.

 At this point let’s reflect – when was the last time you actually worked on a 32bit desktop? Can you buy a laptop with a true 32-bit processor in it? Shouldn’t this charade have ended long ago?

 Well, I tried Fedora 10 x86_64, and I can tell you the charade has not ended. There is hope: For your browsing pleasure, Adobe now has an alpha available of their 64 bit flash plugin for linux, and my own employer just this month released update 6 of the JDK/JRE featuring for the first time a 64 bit Java plugin for mozilla. That’s right, folks, 64 bit linux will soon have cutting edge technology like Flash and Java applets in production-ready form!

Now for the bad news. Webex does not like the 64-bit Java environment (though admittedly I have yet to try the actual release, I have been using the final beta), which is a showstopper for me. Skype is still in 32-bit form, and while it works fine for chat & voice I never did get video to do anything but crash it. And I’ve found some really bad (freeze-your-laptop bad) bugs with the nvidia driver and NetworkManager – more and worse than this laptop has seen in a lot of distro-hopping. 

 The more cynical may say that my experience of instability could be chalked up to the fact that I’m not actually comparing apples to apples – I went from OpenSuse 11.0 to Fedora 10 as well. I will say OpenSuse is the most stable distro I’ve used, but I don’t think Fedora is so lacking as to explain the experience. I’m tempted to re-install the whole box with OpenSuse 11.1 in both flavors of bitness… but I’ll probably just go to 11.1 x86.

 I will still feel wasteful and depressed every time I run `free`, though.

I’ve been running OpenSuse for a while now, and there are some genuinely annoying things about it:

Sudo
‘Defaults targetpw’ (i.e. require root’s password to do anything as root) is not a good idea. Having the installer recommend I make my password also be root’s password compounds the stupidity.

Path
So sudo is already backwards, now I also have to remember what is in /sbin/ or /usr/sbin/ because they’re not in my path? Regular users don’t use ifconfig?
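For the path gripe, a minimal sketch of the kind of fix I mean, assuming bash as the login shell and something like ~/.bashrc as the file to edit. The helper name path_append is made up for illustration; it only adds a directory that is not already on the path.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;       # otherwise add it at the end
    esac
}

# Put the admin tools (ifconfig and friends) within reach of a regular user
path_append /sbin
path_append /usr/sbin
export PATH
```

Appending rather than prepending keeps the regular user directories first in the lookup order.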

Ok, so I should have reconfigured both of these things (edited one file to change each) on day one and gotten these complaints out of the way. But I have this idea that you should try the spirit of a distro (after all, what else is there?). I do have bigger gripes, namely:

lsusb
I don’t have it. /proc/ doesn’t have what I’m looking for. Wtf?

RPM
I’ve been to the mountaintop. I’ve seen aptitude. I’ve abandoned fedora more than once in a mire of dependencies and conflicting repositories. No, it hasn’t happened yet here, but I’m still wary.
Yet Suse is just So Stable. Somehow it is able to just do what I need it to when I need it to, and the packages are updated just often enough. I ditched Hardy, after all, for the lack of crucial features on my T61, and Suse was able to deliver all of them.
And it still does have Just Enough. I’m worried by what will happen when I really need to work to get something to work (for the reasons listed above)… but I haven’t yet.

Thanks Suse.

I’m off to do some MySQL training in Asia for the rest of the month. One week in Singapore, one week (and the weekend) in Bangalore.

Correction, Asia and the Emerging Markets. Apparently thanks to phenomenal growth, China and India have left Asia in the last few years (in our corporate parlance).

Always good to check flight status before you head to the airport, even (especially?) at 7am when you just checked 6 hours before. I now have 5 extra hours in beautiful John F Kennedy Airport, terminal 7. I haven’t seen the new JetBlue terminal, but this one almost makes JFK feel like a major international airport.

Cathay Pacific seems a fine airline, though perhaps technically challenged. The self-service kiosks were all down, and my plane is somewhere between here & Canada… but they gave me $25 to get breakfast & lunch plus a calling card while I wait! If the promise of "outlets for everyone" is true, I’ll give them a full thumbs-up.