Fallacies Everywhere

The thing that has got me motivated enough to write this as a post is this article on the BBC website. It’s got a provocative title, “Green food report favours home-grown curry,” so of course I checked it out. The real meat of the story is that a commission in the UK has just turned in their report on the food production infrastructure in that country and they’ve made some recommendations. Well done, good work, I’m sure the Department of Agriculture (or whatever it’s called in the UK) will be pleased and start publishing tracts and faxing flyers to farms all over England and Wales. But here’s an interesting line, down in the middle of the story:

The project consisted of five subgroups to look at particular areas within the food system – wheat, dairy, bread, curry and geographical areas [emphasis mine] – with the goal of considering ways to “reconcile how we will achieve our goals of improving the environment and increasing good production”.

A few ideas occurred to me when I read this:

  1. That’s a heck of a food pyramid. “I think I’ll only have one helping of Kent, I’m trying to stay slim.” “Oh, go on, you know it’s the midlands that have all the calories.”
  2. Wait a second, we’re gonna look at five different things, two of which are grain? No, I get it, bread is different from grain because bread is processed grain so we need to look at the whole supply and production chain. It’s not a boondoggle to get the commission to pay for junkets to France to look at “bakeries” there.
  3. Curry? Are we sure this wasn’t put in there to justify the lunch tab at every commission meeting?
  4. So, a commission that starts out investigating how to increase production of curry (and by the way, are we talking hing, chiles, pepper, ginger, fenugreek, turmeric, and on and on? Really? That’s some amazing climate change y’all are expecting in England) comes to the surprising conclusion that increasing production of curry would be a good thing! And this is so surprising that the Beeb makes that the headline!

It should be no surprise to anyone who’s ever eaten at a hotel or restaurant in the UK that fresh vegetables and fruits are not on that list. Oh hey, this is like in those Stieg Larsson books where the only thing anyone ever eats is white bread and cheese and all they drink is coffee with milk.

Hey, The Beeb, check this out! (Oh yeah, specifically, this.)

I Didn’t Geddit

I just got an email from Yammer announcing that Yammer has been acquired by Microsoft. I had a couple of Yammer accounts because some people I wanted to play nice with have Yammer accounts, but I confess: I just don’t get why Yammer is something I should want. It doesn’t solve any problems I have. It’s sort of like spam: in theory, some person might be sitting around the computer thinking, “Golly. I really wish I knew where I could get some Au)t}*h-entic Tablets. If only someone with a fake email address would send me an email with a link to a website where I could score that!” Similarly, there might possibly be someone sitting in an office somewhere thinking, “Dang, I’m getting too much work done. I wish there were some kind of website that would have the faint whiff of corporate approval but that would really amount to my wandering the halls and chatting with everyone.” Someone, somewhere, who gets paid not for working but just for being some special and precious snowflake. You might be that person (if so, check your spam folder, it’s got what you want).

So now Microsoft has acquired Yammer. This is all the signal anyone should really need, and now it explains why I didn’t get what the heck Yammer is good for. It isn’t. It’s got the smell of something that’s got a lot of people excited; it’s got the minimum feature set required to be written up in a trade magazine as being an exemplar of whatever the hell it is supposed to be; it’s got confused corporate IT departments paying for it; and it’s not a big enough player to be unpurchasable. So, it’s crapware that Microsoft will rebrand as some kind of productivity thing for groups but is fundamentally a waste of time. Got it.

New Project

This seems to happen to me all the time. I’m wrapping up whatever I’m working on and I have a plan for what comes next. Then, before I have finished the current project but after all the real decisions have been made, I start coming up with all sorts of new projects. I haven’t really analyzed this behavior before, nor even reflected too much on it (although it does bear a strong resemblance to how I envision software development in general – but that’s a separate discourse), but I suppose it’s because the part of my mind that’s involved in creative problem solving gets antsy when it’s idle and starts coming up with things to do.

Some of these things might actually be really cool and worth pursuing, but most of them are, I think, the equivalent of sudoku or crossword puzzles. They’re engaging and require mental effort but in the end they just don’t produce anything good or useful. They’re intellectual busywork. That’s not bad if one is just trying to stay sharp, but it can be really distracting if there is real work to be done. To me, being able to tell when it’s appropriate to follow up on these fun side projects and when it’s not is a skill that I prize and try to develop. Acting on that decision is discipline.

I commonly talk about startups being “resource constrained” and use that as the starting point of my analysis of how a startup chooses technology, human resource policy, and other business decisions. Given that you want to do lots of things but you only have the ability to do a few things, which things do you choose to do? How you answer indicates, in some fundamental ways, what kind of a person you are; it reveals what you hold to be really important. What’s most important, people or money? Who’s more important, yourself or your family or strangers or shareholders?

It’s a common observation that ideas are cheap; that it’s execution that is expensive and valuable. So here’s an idea that has popped up to the forefront of my mind; it’s been kicking around for a while, but I don’t claim any kind of proprietary interest in it. It is certainly in the category of distraction to me since it is nowhere near the projects that I have coming up. I’ll write it down here so that if any of the three occasional readers wants to pick it up and run with it, they can be my guest. And if not, then the next time I’m actually idle and trying to stay sharp, it’ll be there for consideration.

Think about LACS; the idea was pretty nifty. An app that discovered other instances without being told explicitly by the user where those instances were; it exchanged data with the discovered instances without really interacting with the user, and the user got to see what was going on and to inject new data into the mix. That was cool. What if the messages had some extra metadata, like how reliable the author considers the message, or the repeater does; when the message was created, maybe other stuff. But really just playing urban legend, right? How interesting would that be? This is almost like Twitter, except that with Twitter (and Facebook, and G+, and the like) you have to choose what you see. You explicitly follow people, and they explicitly retweet (or like or repost or whatever) items. That gets you the water cooler conversations, but it doesn’t get you the snippets you overhear at a restaurant or in line at the grocery store. What if your phone communicated with the phones around you, collecting messages and offering messages and then the results of those exchanges came up in your Twitter feed? We all have these explicit networks – our friends, our families, our coworkers, our congregations and classmates – but there are also these implicit, loosely defined and weakly bound networks based on where we stop for gas, where we shop for butter, and where we go for burritos. Social networking apps try to emulate the explicit networks and extend their reach, sending messages out farther than we normally do in personal interactions. What about the others?

I hate the idea of foursquare because I think it’s creepy (and more than a little stupid) to advertise your specific, personal location at any and every moment. “Hey, Internet, Charles Emerson Winchester III is at Taqueria Vallarta in Felton right now and therefore a good hour and a half away from his house full of expensive and easily fenced consumer electronics. Also, please let his psycho stalkers know this.”

PDF::API2 and Landscape

I’m currently working on a way to generate a custom PDF of a Bugzilla record (which, in turn, contains a bunch of custom fields). One of the requirements of the spec is that the PDF is targeted at US letter size paper in landscape orientation. I’m looking at using PDF::API2 and PDF::Table but one of the problems I had, just like manu, is that when you call $page->rotate(90) it sure-enough rotates the page at render time, but it’s really as if you printed on a portrait-oriented page and then turned the page sideways. This is not what one expects, since this is not, at the end of the day, something that one has much call to do.

Here’s how to achieve the actually desired effect: set the mediabox manually to the appropriate settings for a landscape orientation.

$page = $pdfdoc->page();
$page->mediabox(0, 0, 11*72, 8.5*72);

In my code I’ve defined a function that returns that list so that the code can be a little more descriptive:

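sub landscape_letter {    # the name is a stand-in; call it whatever reads well
    return (0, 0, 11*72, 8.5*72);
}

$page = $pdfdoc->page();
$page->mediabox(landscape_letter());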

There’s probably a prettier way to do it, but at least this works. Gotta ship code, man.

Prime Factorization – Actual Code

I finally had a little time free and put together a solution to the prime factorization problem. As with the tic-tac-toe program before, there are lots of opportunities to make this program better. This only runs once, rather than looping. The number being factored is a constant so you have to edit the program to get the factors of a different number. I limited the list of primes so the program as it stands will only compute the prime factors of numbers less than 65,535 (an homage to the old 8-bit days).

I Hate Your Favorite VCS

Whenever I have to start working with a new software package to do a task I already know how to perform using a different software package, I feel a little frustrated. I’m sure everyone can relate to this. Programs that do similar things are often unnecessarily differentiated. It’s as if DeWalt and Makita made cordless drills that not only had different colored plastic and different battery packs but also spun around in entirely different dimensions and one was hand operated while the other was controlled with facial tics.

I’ve now used, professionally, several different version control systems: Visual SourceSafe, CVS, Perforce, Subversion, Mercurial, Bazaar, and git. I hate them all.

Okay, maybe that’s a bit strong. I’ve figured out how to get work done with Perforce. I really love its changelists. That’s great. Other systems let you shelve changes, and that’s great, too. But here’s the deal: I’m in a new environment, I know how to write software and I’ve got bugs to fix. I don’t want my tools to get in the way. And yet, here I am, trying to figure out how in the world to undelete a file, how to commit a change, how to remove files, and how to generate a diff that doesn’t make my eyes bleed. And did I mention that I’ve got actual work to do?

The latest crop of distributed SCM tools (git, Mercurial, and Bazaar) want you to drink their Kool-Aid and spend days just becoming a dittohead for their path. I’m getting a bit profane and testy because it’s taking me too dang long just to get done what I want to get done. I’m not a 16 year old with nothing better to do. I will pay actual money for someone to write a decent manual.

I want it to be task oriented and with real examples. I have a repo with files in it that shouldn’t be there. How do I delete them? I want my cleaned up repo to be picked up by the main repo. How do I do that? Those deleted files are actually metadata for my development environment; once I delete them from the repo I want to put them back in place locally; how do I keep the VCS client from deleting my files or bitching about them the next time I pull changes from the repo?

Everyone puts up “how-to” pages for creating a new repo (which you do once per project), checking out source, adding files, checking in changes, and sometimes even branching and merging. That’s not enough. Revert files to the unlabeled version from yesterday before lunch. Restore a deleted file. Ignore some files in the source tree that were temporary files or program logs or IDE metadata. Rename a file. Move a file from one directory to another. Move a whole directory. Look at the version history of a file to figure out who, four years ago, was the person who wrote an otherwise undocumented subroutine so you can ask about it. Start doing these things and you realize why configuration management is an actual professional field distinct from software development. You’ll also discover that whoever set up the repo in your company didn’t know what he was doing, any more than you do.


Hey, I Built a Thing

My new job involves hacking on Bugzilla and part of that involves email. Bug email, system administration email, yadda yadda. I wanted a way to test that email without it ever leaving my development system. We had a mechanism in place where the emails would just get written to a plain file, but that doesn’t help with HTML email. I wanted to be able to see the email rendered all nice and tidy in Mail.app. A couple of jobs ago, one of my coworkers put something together out of Python and it worked great, but I couldn’t find an already-done example online anywhere. Other servers were GUI or Mac-only and therefore wouldn’t work on a headless Linux test server. Whatever solution I picked, I’d have to write some code.

Well, I found a Java SMTP test server and wrote a simple POP3 server into it. So now I’ve modified dumbster and made my modifications available to the world. Use it, improve it, do as you like. It’ll make my life easier, and I hope it makes yours a little better, too.

Always Ask One More Question

Badb just asked me if I could show her how to calculate the square root of a number. I cast my mind back to 7th grade and tried to remember going up to the blackboard and calculating arbitrary square roots. “Um, yeah, but let me go look it up. It’s been a long time and I want to be sure I’m right,” I said. I found a nice explanation that aligned well with the method I learned oh-so-many years ago (and haven’t used since — what do you suppose slide rules, calculators, and general-purpose computers are for, if not for doing homework?) and came back to her. “Okay, so show me the number you’re supposed to take the square root of.”

It was 3600.

Srsly. Always ask another question. But keep the interwebs handy just in case your kid does need to compute the square root of 5,287 and show her work.

Time Out for SQL

Basic SQL is pretty simple, really. It’s only when I start dealing with big joins and group functions that I start to lose track of what exactly is going on. I’m in that situation now, so I’m constructing a small data set with short table and column names so I don’t have to type as much while I figure out the right syntax. The general problem statement I’ve got is this:

Given a pair of tables where one table holds a bunch of records and the other table holds labels for groups of those records, construct a single SELECT statement that will select all of the records as a single output cell with the format: “label1:'record1','record2','record3',label2:'record4','record5','record6'”.

I’m working with MySQL and intend to use some swell functions, namely CONCAT and GROUP_CONCAT. My initial demo setup is this:

create table demo_a (
  c_id    int NOT NULL,
  c_name  varchar(60) NOT NULL
);

create table demo_b (
  r_id    int NOT NULL,
  r_name  varchar(60) NOT NULL,
  c_id    int NOT NULL
);

insert into demo_a(c_id, c_name) values (1, 'first cohort');
insert into demo_a(c_id, c_name) values (2, 'second cohort');

insert into demo_b(r_id, r_name, c_id) values(1, 'Dave', 1);
insert into demo_b(r_id, r_name, c_id) values(2, 'Sunny Jim', 1);
insert into demo_b(r_id, r_name, c_id) values(3, 'Hoos-Foos', 1);
insert into demo_b(r_id, r_name, c_id) values(4, 'Paris Garters', 2);
insert into demo_b(r_id, r_name, c_id) values(5, 'Harris Tweed', 2);
insert into demo_b(r_id, r_name, c_id) values(6, 'Zanzibar Buck-Buck McFate', 2);

Okay, now this query will return six rows, showing the names and labels of each record:

SELECT b.r_name, a.c_name from demo_b b left join demo_a a on b.c_id = a.c_id;

Let’s see if we can get the output down to two rows, each consisting of a label and some concatenated names.

SELECT a.c_name, GROUP_CONCAT(b.r_name ORDER BY b.r_id SEPARATOR ', ') from demo_b b left join demo_a a on b.c_id = a.c_id GROUP BY b.c_id;

That works! Okay, now let’s see if we can get that down to a single column.

SELECT CONCAT(a.c_name, ": ", GROUP_CONCAT(CONCAT("'",b.r_name,"'") ORDER BY b.r_id SEPARATOR ', ')) from demo_b b left join demo_a a on b.c_id = a.c_id GROUP BY b.c_id;

Hey, that’s pretty good! Okay, now how do I get these two rows collapsed down into a single one? I wonder if I need to wrap this select in another select, defining the two-row result set from the above query as a table and then grouping all those rows together. This sort of matryoshka nesting of queries is what gives me SQL headaches.

SELECT GROUP_CONCAT(x.labeled_names SEPARATOR ', ') single_row FROM
(SELECT CONCAT(a.c_name, ": ", GROUP_CONCAT(CONCAT("'",b.r_name,"'") ORDER BY b.r_id SEPARATOR ', ')) labeled_names
 from demo_b b left join demo_a a on b.c_id = a.c_id GROUP BY b.c_id) x;

OMG it totally works! Yay!
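For a sanity check outside the database, here is a rough Python sketch that builds the same single string from the demo rows. It is not part of the MySQL solution; the function name `single_row` and the in-memory data layout are assumptions made purely for illustration. One thing it makes explicit is the ordering of the cohorts, which the outer GROUP_CONCAT above does not specify.

```python
from itertools import groupby

# Demo rows copied from the insert statements above.
cohorts = {1: "first cohort", 2: "second cohort"}
records = [
    (1, "Dave", 1),
    (2, "Sunny Jim", 1),
    (3, "Hoos-Foos", 1),
    (4, "Paris Garters", 2),
    (5, "Harris Tweed", 2),
    (6, "Zanzibar Buck-Buck McFate", 2),
]


def single_row(records, cohorts):
    """Join every cohort label with its quoted record names into one string."""
    # Sort by cohort id, then record id, so groupby sees each cohort as one run.
    ordered = sorted(records, key=lambda r: (r[2], r[0]))
    parts = []
    for c_id, rows in groupby(ordered, key=lambda r: r[2]):
        names = ", ".join(f"'{r_name}'" for _, r_name, _ in rows)
        parts.append(f"{cohorts[c_id]}: {names}")
    return ", ".join(parts)


print(single_row(records, cohorts))
```

If the string this prints matches the `single_row` cell from the nested query, the two agree on both grouping and order.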

Factorize, More Boxes

Boxes and arrows, I know, it doesn’t look like a program yet. Still, it all is going somewhere, I promise. Let’s take a minute and talk about how factorization works. It’s pretty straightforward: either a number is evenly divisible by some other number (a factor) or it isn’t. Let’s look at an example: 48. What numbers go into 48? Well, I always think of it as 6 times 8, but it’s also 4 times 12. I’m the one with the keyboard, so let’s do 6 times 8. Now, let’s look at 6. Is it evenly divisible by anything? Sure, 2 and 3. How about them? Nope, they are not evenly divisible into smaller integer factors. That’s what it means to be prime, and two and three are the first two prime numbers. Eight, on the other side, divides down into twos. Here’s a tree, where each number gets split down into factors. When a factor can’t be split any more, that factor is prime:

The way the program is going to work (refer back to the previous drawing) is to start out with a number to factorize — let’s repeat this example and use 48 — and a list of primes. It’s going to start at the small end of the list of primes and keep trying until it runs out of primes or until it divides the number all the way down to primes. Here’s the tree the program would draw:

Here, the program would try the first prime, two. That works, so it would divide 48 by 2 and get 24. It would start over with 24 and see if two goes into 24. It does, giving 12, and two goes into 12 as well, giving 6. It would try again on 6. That works, too, giving 3, and it would try again to divide 2 into 3. That doesn’t go evenly, so it would go to the next prime. The next prime is 3, which does go into 3 evenly. 3 divided by 3 is 1, so the program would know that it was done. This kind of drawing is called a tree (it’s upside down, don’t worry about it, but 48 is the root and all the primes are leaves). I’ve colored the primes green to highlight how they’re not being divided, and because they’re the leaves.
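That walk-through can be sketched in a few lines of Python. This is only a sketch, not the program this series is building toward: the function name `prime_factors` and the hard-coded `PRIMES` list are assumptions for illustration, and with a list this short it only fully factors numbers whose prime factors all appear in it.

```python
# Hypothetical sketch of the trial-division walk-through; all names are illustrative.
PRIMES = [2, 3, 5, 7, 11, 13]  # assumed short list; a real program needs a longer one


def prime_factors(n, primes=PRIMES):
    """Factor n (n >= 2) by trial division against the given primes, smallest first."""
    factors = []
    for p in primes:
        # Keep dividing by the same prime for as long as it goes in evenly.
        while n % p == 0:
            factors.append(p)
            n //= p
        if n == 1:
            break  # divided all the way down; nothing left to factor
    if n != 1:
        # Ran out of primes before reaching 1. What is left has no factor in
        # the list, but it is only guaranteed prime if the list is long enough.
        factors.append(n)
    return factors


print(prime_factors(48))  # [2, 2, 2, 2, 3]
```

The inner `while` loop is the "start over with the same prime" step, and moving to the next iteration of the `for` loop is the "go to the next prime" step.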

This process is pretty darned simple to explain. The only thing missing is, gosh, a list of prime numbers! It’s getting late and once again, I’m going to have to leave that for the next time. Here’s a bit more detail on the flowchart, though. This is the detail for the “let p be the next prime number” box. The first time we run through, we want to start at the circle labeled, “A,” but on all the following times, we want to come in at the circle marked, “B.”
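One way to read those two entry points, sketched in Python under the assumption that the primes already sit in a list: the first request starts at the small end (entry "A"), and every later request just moves one step along (entry "B"). The class name `PrimeCursor` and the sample list are made up for illustration, and the flowchart itself may differ in detail.

```python
class PrimeCursor:
    """Hypothetical stand-in for the 'let p be the next prime number' box."""

    def __init__(self, primes):
        self.primes = primes
        self.index = None  # no prime handed out yet, so the first call enters at "A"

    def next_prime(self):
        if self.index is None:
            self.index = 0    # entry "A": start at the small end of the list
        else:
            self.index += 1   # entry "B": move on to the following prime
        if self.index >= len(self.primes):
            return None       # ran out of primes
        return self.primes[self.index]


cursor = PrimeCursor([2, 3, 5, 7])
print(cursor.next_prime())  # 2
print(cursor.next_prime())  # 3
```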