worldgonemad.com

If the world was perfect, it wouldn't be. – Yogi Berra

I was reading another post comparing the different forks of MySQL (disclaimer: my employer), and again it seemed to me the term “fork” is somewhat imprecise. I agree with Morgan Tocker that “delta” does not capture these other creatures either – after all, isn’t a delta what makes a fork not a copy?

Wikipedia cites Eric Raymond’s definition that “The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community”, but also notes “However, this is not common present usage”. Kind of a shame – at least esr drew a hard line. The definition we’re left with could include any copy of MySQL with a patch or even a UDF attached to it.

“Fork” implies a complete departure. Drizzle is no doubt a fork of MySQL. They took the source, dramatically reworked it, and made something brand new. Can’t wait to see where it goes. Tracking MySQL development and maintaining a set of additions and enhancements seems to be a different concept.

“Distribution” is the OS-derived term that some people prefer, but I’d like to propose a different one: MySQL now has several spins. Quoth Fedora: “spins are alternate version (sic) of [the software], tailored for various types of users via hand-picked application set or customizations.”

What’s the difference? Drizzle has a limited ability to incorporate what the MySQL development team comes up with next, and vice versa. There is much compatibility, but it is likely only going to become less over time, not more. If you are using a “drop-in replacement branch of the MySQL Database Server”, or a binary that “adds enhancements to the MySQL server code”, I should think you are counting on it not being a fork. So what seems like an issue of semantics is really an attempt to give end users a sense of current & future compatibility.

Terminology is important. Comments are welcome.

Shouldn’t really be that hard, should it? Ubuntu (current LTS is fine) on a mini-tower, with the hardware certified and supported.

Dell has a lot to say about linux, but pretty much on the server. You get to the laptops & workstations, and Linux is up there with FreeDOS. I’m scared to even ask for a quote (which you have to do by email) from Penguin Computing, because they are really geared toward the power workstation.

I don’t think I’m so unique: I’ve had a linux desktop for 10+ years, my employer pays for my main work machine, and I want a second box for home stuff. ‘Zat so hard?

If you have done a good job of building your rails models, you may find that they are helpful for your non-rails system maintenance and such. They may even be necessary to reuse if you follow the rails model of using activerecord validations (rather than database RI) to preserve the integrity of your data.

Or you may just find yourself rewriting the same code again and again, and want all that good railsiness to make it easier to write and maintain. Personally I find myself in some instance of ./script/console as often as irb just so I can get the activesupport helper methods ( 4.days.from_now and such) that many rails developers are surprised to find are not actually a standard part of ruby.
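For comparison, here is roughly what you are left with in plain ruby when activesupport is not loaded. This is a minimal sketch: `SECONDS_PER_DAY` and `days_from_now` are names made up for the example, and it simply adds seconds, so it ignores any time zone or daylight-saving handling.

```ruby
# Plain-Ruby stand-in for ActiveSupport's 4.days.from_now.
# SECONDS_PER_DAY and days_from_now are hypothetical names for this sketch.
SECONDS_PER_DAY = 24 * 60 * 60

# Returns the time `days` days after `now` by adding raw seconds.
def days_from_now(days, now = Time.now)
  now + days * SECONDS_PER_DAY
end

# A fixed starting point keeps the example repeatable.
base = Time.utc(2009, 6, 1, 12, 0, 0)
puts days_from_now(4, base)
```

Not hard, but it is the kind of thing you end up retyping in every script, which is the argument for pulling in the rails environment.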

So, the good news is it is easy to reuse rails code outside of rails.

Let’s say you want to do some data manipulation (reporting, loading, scrubbing, etc) in your rails db, and want to use your models to do it. A few requires in your ruby script get the necessary environment in place:

require 'rubygems'
require 'yaml'
require 'active_record'
require 'logger'

and a few more will load up your models (note: they’re probably not in the same location as mine, unless you are also working on an app called ‘seweb’ in your home dir):

PROJECT_HOME = "#{ENV['HOME']}/seweb"
require "#{PROJECT_HOME}/app/models/sales_rep.rb"
require "#{PROJECT_HOME}/app/models/organization.rb"
require "#{PROJECT_HOME}/app/models/team.rb"

Then connect to the appropriate database (note I’m connecting to the development environment – can you guess how I’d connect to ‘test’ or ‘production’?), with rails logging enabled:

ActiveRecord::Base.logger = Logger.new( STDERR )
db_config = YAML::load( File.open("#{PROJECT_HOME}/config/database.yml"))
ActiveRecord::Base.establish_connection( db_config["development"])
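As for the ‘test’ or ‘production’ question: the loaded YAML is just a hash keyed by environment name, so you pick a different key. Here is a sketch of that lookup using only the yaml library. The inline `DATABASE_YML` text and the `db_config_for` helper are invented for the example (your database.yml will have its own values), and in a real script the environment name could just as well come from something like `ENV['RAILS_ENV']`.

```ruby
require 'yaml'

# Stand-in for config/database.yml; every value here is hypothetical.
DATABASE_YML = <<~YAML
  development:
    adapter: mysql
    database: seweb_development
  test:
    adapter: mysql
    database: seweb_test
  production:
    adapter: mysql
    database: seweb_production
YAML

# Returns the settings hash for one environment, or fails loudly
# if database.yml has no such section.
def db_config_for(env, yaml_text = DATABASE_YML)
  config = YAML.load(yaml_text)
  config.fetch(env) { raise ArgumentError, "no '#{env}' section in database.yml" }
end

%w[development test production].each do |env|
  puts "#{env}: #{db_config_for(env)['database']}"
end
```

The hash that comes back is the same kind of thing handed to `establish_connection` above.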

And you are good! If you are using a transactional database (such as my personal favorite, MySQL with InnoDB), you can make nice transaction wrappers for your work thusly:

ActiveRecord::Base.transaction do
    begin
        rep = SalesRep.find_or_initialize_by_name( 'Kyllin D. Quota' )
        # create the component parts
        if( rep.changed? )
            rep.organization = Organization.find_or_create_by_name 'APAC'
            rep.team = Team.find_or_create_by_name 'Enterprise'
            rep.save!
        end
    rescue Exception
        # Rollback undoes the transaction; it is not re-raised outside the block
        raise ActiveRecord::Rollback, "Invalid record for #{rep.name}"
    end
end

Pow. You get your rails sugar, rails validations, rails logging. Are you happy? Why yes, yes you are.

I was warned by my brother a while ago that should I start tweeting, he would stage an intervention. I had already confessed to accounts with facebook, multiply, myspace, and several others (disclaimer/explanation: all of those sites are customers of my longtime employer). Twitter, to the uninitiated, looks like the crack cocaine of social networking that turns the weekend photo-poster into a hardcore jittering lifecaster. Nobody wants to see their family member come to that, right? But follow along, twitter has purpose. Or just skip to the bottom.

 I was never an active friendster user. The first site I used regularly was the more inward-facing multiply.com – and then mostly because it was an easy way to foist photos of my daughter on my extended family. Multiply is more of a community-based, relationship-savvy site than a place to find online friends. If I am cousin-of-John, it makes the assumption (with my consent) that I am interested in content by wife-of-John, brother-of-John, etc. Combined with who I have directly connected to, Multiply can quickly become a nice "walled garden" of family and friends content and connections. Easy media uploading, a slider I can set to "show me what the people close to me are up to", and that’s really about all I needed.

 But then, like everyone else with a facebook account, sometime in the last 18 months a bizarre game of "this is your life" began. Invitations to connect came in from grade school friends, distant family, former co-workers, babysitters and fellow inmates. Scratch that last one.

I’m in (facebook-only, mostly) contact with literally dozens of people that I hadn’t talked to in 5-25 years, that I seriously doubt I would have ever heard from or about. It’s really interesting, and I enjoy seeing the 2-3 things a week they note about their lives, the occasional photo or link to what they’re doing. Unlike Multiply, facebook defaults to "everyone is an acquaintance" and gives you one firehose of updates sorted by when they were posted. It’s fun to gaze over when I have a few minutes online and see who is up to what.

Tweets then started to bleed in to the status update page. I knew of, and had zero interest in, this "text the world" service called Twitter. Why would I even use an online outlet to send a message to a friend of mine? I have an email address, cell phone number and at least one IM handle for most of my friends and family… can’t I connect with them easily enough? So why on earth are there facebook updates like "hanging out with @tomjefferson in #philly #constitution #usa"? What’s with all the @ signs and hashes?

Well, it was low-effort enough (no "friending", much less relationship definition, required for most posts) to review the updates for an individual person on twitter. Good friends who lived far away, co-workers involved in a crucial event, things like that (the celebs like Lance came later). To streamline the process of viewing those, I got an account.

 Twitter is the most stalker-friendly social networking site. Twitter by default does not even ask if you know or like anyone, only if you want to "follow" them. Seems kinda creepy for those unfamiliar with the Information Age. And it really is like a big water cooler on the internet. It’s hard to resist joining in after a bit.

The photos of the kids still go to multiply, and I still watch facebook to keep up with a more extended group. But the most inane of updates and commentary are best put out to twitter, and here is why. My frustrated tweet when a desktop social networking client crashed on me:

pantoniades: #gwibber unstable on fedora 11. Need a new desktop Twitter client.

Inane, right? On multiply I’d confuse more people than inform. I promise you most of my family doesn’t understand the context for half of the words above. On facebook, that’s just clutter. But on twitter, here’s what happens next:

pauljakma: @pantoniades grab #gwibber 1.2.0 (e.g. from #Fedora rawhide) – works great

I don’t know Paul Jakma. I’m guessing he’s a gwibber developer or enthusiast. But more importantly, he’s right. Yes, I could likely have found that tip in an IRC channel, bugzilla note or through some google search, but I was really not invested in this client. I got the fix, he kept somebody on his project, and neither one of us invested much effort (presumably he has a tickler on "#gwibber").

I haven’t found any long-lost friends on twitter, and I’m not putting the photos of my kids goofing around in the bathtub on facebook. Perhaps there is one uber-site to rule them all, but I’m also quite happy with the three I’ve got. Provided I can dodge the van my brother sends to take me off to deprogramming.

Once again, I was unable to attend all of the sessions I wanted to at this year’s User Conference, but I was happy to make it to Bob Burgess’ talk on bash scripting with mysql. The slides and examples aren’t up yet, but when they are (which may be as you read this, check the last link), they would probably also be a great tutorial.

So, I got bore^D^D^D^D inspired later that day to put some of the practices into use, and worked up a script to run mysqlslap in various ways against a server, and then added a couple of functions to try it out on each storage engine. The script is below in its entirety – bash scripters, please be kind in your comments. No, I didn’t write all this just for the pun in the subject. But I’m not above that.

The result?

Why don’t I use more BLACKHOLE tables? They are blazing fast!

 My results (on my lenovo T61, Fedora 10):

SLAP Base values:
 50 simultaneous connections ||  10 runs through
 Writes : 1000, 500 unique (Commit every 500) || Queries: 1000, 200 unique
 Schema: 4 character columns, 8 numeric with auto-increment PK and 10 secondary indexes
For InnoDB: 0.389 Average, 0.299 Min, 0.651 Max
For MyISAM: 0.364 Average, 0.355 Min, 0.377 Max
For BLACKHOLE: 0.137 Average, 0.124 Min, 0.147 Max
For CSV: n/a Average, n/a Min, n/a Max
For MEMORY: 0.375 Average, 0.363 Min, 0.444 Max
For ARCHIVE: n/a Average, n/a Min, n/a Max
For MRG_MYISAM: n/a Average, n/a Min, n/a Max

The "n/a" ones are tables that, generally for obvious reasons, couldn’t do the slap. My error handling needs work.

There are some expected trends that are good to validate – InnoDB improves with more concurrency (in a relative sense), MEMORY has remarkably little fluctuation in response time, things like that. But the marketing guys really have to capitalize on those BLACKHOLE numbers :-)

#!/bin/bash
shopt -s -o nounset
printf "Enter root pwd: "
read -s PASSWORD
# get the list of active engines from MySQL
ENGINES=`mysql -uroot -p$PASSWORD -B -N -e "SELECT ENGINE from ENGINES WHERE SUPPORT<>'NO'" INFORMATION_SCHEMA`
#for e in $ENGINES; do
#    printf "\nFound engines: %s" $e
#done
printf "\nStarting test at %s \n" `date +%H:%M:%S`
# default initial settings
ITERATIONS=10
CONCURRENCY=50
COMMIT=500
WRITES=1000
let "UNIQUE_WRITES=$WRITES/2"
QUERIES=1000
let "UNIQUE_QUERIES=$QUERIES/5"
LOAD_TYPE=mixed
CHARS=4
INTS=8
INDX=10
function parse_slap {
    if [ $# -lt 1 ]; then
        AVERAGE="n/a"
        MINIMUM="n/a"
        MAXIMUM="n/a"
    else
        AVERAGE=$1
        MINIMUM=$2
        MAXIMUM=$3
    fi
}
function run_slap {
        # build the command on every call; as a top-level string it would
        # freeze the initial settings and ignore the changes made below
        SLAP="mysqlslap -u root -p$PASSWORD -h 127.0.0.1 -a -c $CONCURRENCY -i $ITERATIONS --auto-generate-sql-add-autoincrement --auto-generate-sql-secondary-indexes=$INDX --auto-generate-sql-write-number=$WRITES --auto-generate-sql-unique-write-number=$UNIQUE_WRITES --auto-generate-sql-unique-query-number=$UNIQUE_QUERIES -x $CHARS  -y $INTS --number-of-queries=$QUERIES --commit=$COMMIT --auto-generate-sql-load-type=$LOAD_TYPE "
        printf "%s\n" "SLAP $1:"
        printf "%s\n" " $CONCURRENCY simultaneous connections ||  $ITERATIONS runs through "
        printf "%s\n" " Writes : $WRITES, $UNIQUE_WRITES unique (Commit every $COMMIT) || Queries: $QUERIES, $UNIQUE_QUERIES unique "
        printf "%s\n" " Schema: $CHARS character columns, $INTS numeric with auto-increment PK and $INDX secondary indexes"
    for engine in $ENGINES
    do
        # keep only the timing columns, and turn newlines into spaces
        # (an unquoted \n would have made tr delete the letter "n")
        SLAPPED=`$SLAP -e $engine 2>/dev/null | cut -c48-53 | tr '\n' ' '`
        echo $SLAPPED >> $0.txt
        parse_slap $SLAPPED
        printf "For %s: %s Average, %s Min, %s Max\n" $engine $AVERAGE $MINIMUM $MAXIMUM
    done
}
run_slap "Base values"
echo
let WRITES=WRITES*10
let QUERIES=QUERIES*100
let UNIQUE_QUERIES=QUERIES/4
run_slap "more reads"
echo
UNIQUE_QUERIES=$QUERIES
UNIQUE_WRITES=$WRITES
let COMMIT=COMMIT*3
run_slap "More unique reads and writes"
echo
let INTS=INTS*5
let CHARS=CHARS*5
let INDX=INTS+CHARS-1
run_slap "wide indexed tables"
echo
let CONCURRENCY=CONCURRENCY*10
run_slap "massive concurrency"
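
The `cut -c48-53` in the script depends on the timings sitting at exact column positions. If that ever bites, matching on the label text is one alternative. Below is a rough sketch of the idea in ruby rather than bash; the sample text is modeled on what mysqlslap printed for me, so treat the exact wording of the report lines as an assumption, and note that this `parse_slap` is a stand-in written for the example, not a drop-in for the bash function.

```ruby
# Matches the three timing lines of a mysqlslap report.
# The line wording is an assumption based on sample output.
TIMING = /^\s*(Average|Minimum|Maximum) number of seconds to run all queries: ([\d.]+) seconds/

# Returns a hash of label => seconds; any label not found reads as "n/a".
def parse_slap(output)
  times = Hash.new('n/a')
  output.scan(TIMING) { |label, secs| times[label] = secs }
  times
end

sample = [
  'Benchmark',
  "\tRunning for engine innodb",
  "\tAverage number of seconds to run all queries: 0.389 seconds",
  "\tMinimum number of seconds to run all queries: 0.299 seconds",
  "\tMaximum number of seconds to run all queries: 0.651 seconds",
  "\tNumber of clients running queries: 50",
  "\tAverage number of queries per client: 20"
].join("\n")

ok     = parse_slap(sample)
failed = parse_slap('')   # an engine that produced no report at all

printf("For %s: %s Average, %s Min, %s Max\n", 'InnoDB', ok['Average'], ok['Minimum'], ok['Maximum'])
printf("For %s: %s Average, %s Min, %s Max\n", 'CSV', failed['Average'], failed['Minimum'], failed['Maximum'])
```

The default value on the hash covers the "n/a" case without a separate branch, whether all of the timings are missing or only some of them.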

Lately I find myself running quite a few large compiles and virtual machines, so I tricked out my Thinkpad T61 with a full 4 GB of RAM. Anyone with more than a casual acquaintance with 32-bit operating systems and/or the powers of 2 will quickly see the problem I faced – on next boot Suse proudly reported I had a full 3.8G of RAM available.
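
For those without that acquaintance, the arithmetic goes roughly like this. It is a back-of-the-envelope sketch: it reads the 3.8G figure as GiB (an assumption about the units), and the usual explanation, that part of a 32-bit address range goes to devices rather than RAM, is the general case and not something I measured on this machine.

```ruby
# Sizes in bytes. REPORTED_GIB is the 3.8G figure from above,
# treated here as GiB (an assumption about the units).
GIB           = 2**30
ADDRESS_SPACE = 2**32      # everything a 32-bit address can reach
INSTALLED     = 4 * GIB
REPORTED_GIB  = 3.8

# Converts a byte count to GiB as a float.
def gib(bytes)
  bytes.to_f / GIB
end

hidden_mib = ((INSTALLED - REPORTED_GIB * GIB) / 2**20).round

puts "address space: #{gib(ADDRESS_SPACE)} GiB"
puts "installed RAM: #{gib(INSTALLED)} GiB"
puts "out of reach:  about #{hidden_mib} MiB"
```

The installed RAM exactly fills the address space, so anything else that needs addresses pushes some of that RAM out of reach.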

Now, to a man of my age that means that there is more memory lying fallow on this machine than I have had in most of the machines I’ve worked on throughout my career, and that’s just a shame. So I decided to dual-boot and try out a 64-bit distribution.

 At this point let’s reflect – when was the last time you actually worked on a 32bit desktop? Can you buy a laptop with a true 32-bit processor in it? Shouldn’t this charade have ended long ago?

 Well, I tried Fedora 10 x86_64, and I can tell you the charade has not ended. There is hope: For your browsing pleasure, Adobe now has an alpha available of their 64 bit flash plugin for linux, and my own employer just this month released update 6 of the JDK/JRE featuring for the first time a 64 bit Java plugin for mozilla. That’s right, folks, 64 bit linux will soon have cutting edge technology like Flash and Java applets in production-ready form!

Now for the bad news. Webex does not like the 64-bit Java environment (though admittedly I have yet to try the actual release, I have been using the final beta), which is a showstopper for me. Skype is still in 32-bit form, and while it works fine for chat & voice I never did get video to do anything but crash it. And I’ve found some really bad (freeze-your-laptop bad) bugs with the nvidia driver and NetworkManager – more and worse than this laptop has seen in a lot of distro-hopping. 

The more cynical may say that my experience of instability could be chalked up to the fact that I’m not actually comparing apples to apples – I went from OpenSuse 11.0 to Fedora 10 as well. I will say OpenSuse is the most stable distro I’ve used, but I don’t think Fedora is so lacking as to explain the experience. I’m tempted to re-install the whole box with OpenSuse 11.1 in both flavors of bitness… but I’ll probably just go to 11.1 x86.

 I will still feel wasteful and depressed every time I run `free`, though.

I’ve been running OpenSuse for a while now, and there are some genuinely annoying things about it:

Sudo
‘Defaults targetpw’ (i.e. require root’s password to do anything as root) is not a good idea. Having the installer recommend I make my password also be root’s password compounds the stupidity.

Path
So sudo is already backwards, now I also have to remember what is in /sbin/ or /usr/sbin/ because they’re not in my path? Regular users don’t use ifconfig?

Ok, so I should have reconfigured both of these things (edited one file to change each) on day one and gotten these complaints out of the way. But I have this idea that you should try the spirit of a distro (after all, what else is there?). I do have bigger gripes, namely:

lsusb
I don’t have it. /proc/ doesn’t have what I’m looking for. Wtf?

RPM
I’ve been to the mountaintop. I’ve seen aptitude. I’ve abandoned fedora more than once in a mire of dependencies and conflicting repositories. No, it hasn’t happened yet here, but I’m still wary.
Yet Suse is just So Stable. Somehow it is able to just do what I need it to when I need it to, and the packages are updated just often enough. I ditched Hardy, after all, for the lack of crucial features on my T61, and Suse was able to deliver all of them.
And it still does have Just Enough. I’m worried by what will happen when I really need to work to get something to work (for the reasons listed above)… but I haven’t yet.

Thanks Suse.

Off to Asia

I’m off to do some MySQL training in Asia for the rest of the month. One week in Singapore, one week (and the weekend) in Bangalore.

Correction, Asia and the Emerging Markets. Apparently thanks to phenomenal growth, China and India have left Asia in the last few years (in our corporate parlance).

Always good to check flight status before you head to the airport, even (especially?) at 7am when you just checked 6 hours before. I now have 5 extra hours in beautiful John F Kennedy Airport, terminal 7. I haven’t seen the new JetBlue terminal, but this one almost makes JFK feel like a major international airport.

Cathay Pacific seems a fine airline, though perhaps technically challenged. The self-service kiosks were all down, and my plane is somewhere between here & Canada… but they gave me $25 to get breakfast & lunch plus a calling card while I wait! If the promise of "outlets for everyone" is true, I’ll give them a full thumbs-up.

After 4 good releases, Ubuntu let me down with 8.04. Maybe it was the timing – I upgraded my laptop as part of restoring it from a hard drive crash a few weeks ago – but isn’t a brand new disk a good time to change your OS version? 

 On the upside, Hardy was the first OS I’ve installed where I opted to keep the default wallpaper (the bird is purty). And I’m pretty sure suspend (nVidia driver and all) was working better than previously, which is always good news. 

But I no longer had use of the VGA port for cloned or extended desktop, and I was unable to find a solution. That’s a dealbreaker for anyone who needs to do frequent presentations (or, for that matter, uses their laptop as a primary workstation and has a > 15" monitor).

Worse, vpnc was, at best, squirrelly. I do quite a lot over VPN, and we still have two of them (Sun & MySQL). My sunray solves half of that problem for me, but until that mobile sunray comes out…

So I was on the market. One of my coworkers mentioned to me that he was already using the RC of Suse 11 on his laptop – same model as mine, and with the same list of concerns – and so I thought I’d give it a shot.

There are a lot of good things to be said for Suse 11. A lot of the notebook stuff works better – suspend/resume, monitor changes, network switching – I hold my breath a lot less than I used to. NetworkManager recognized my WLAN card and integrated it in seamlessly (no more wvdial for me). For a dot-zero release it is also remarkably stable, and I even took the plunge and went with KDE4.

My only complaints thus far have been in the UI department – and those may be taken with a grain of salt coming from an Ubuntu/GNOME person. I was unable to actually create a new VPN connection from the NetworkManager applet (though once configured, it did show up). I’ve had some confusion over what to do in YaST and what is part of desktop management – the former changes between an external and laptop screen, the latter can change resolutions. I can’t seem to tell Suse to conserve power more when the laptop is unplugged, or allow me to use an external monitor with the laptop closed.

And then Suse has different ideas about the root password than I do. The default setting when creating an initial user at install time was "Use same password for root" – I am guessing this is a compromise for Suse’s default sudo setting of requiring the root password rather than the user’s, which I’ve always thought odd. But Suse 11 did gently prod me into using enigmail and was very thorough about gpg key management, so they still get high marks on security from me.

So I’m sticking with Suse 11 for a while, though I might go over to GNOME depending on how the KDE4 thing shakes out over the next few weeks.

Bike month ends this weekend in the city, so I felt I had to bike to work at least once – it was great! Brooklyn to 101 Park (the Sun NY office) – six miles in just under a half hour.
My route, for those interested:

It’s a perk of having an office that I’d forgotten – I was a lot more awake than I usually am when I first sit at my keyboard. No bike parking or showers at 101 Park, but those great friends of the bike commuter, talc and baby wipes, spruced me up nicely on arrival.