Thursday, November 05, 2009

Fun With Data: SQL Server Unpivot

My dad ran for a township trustee seat this past election, so I helped him organize some data he got from the Board of Elections.  I massaged and finagled it into a SQL Server table in the format seen below:

Capture

This table has all registered voters for the township he was running in.

I started to wonder about the frequency that people vote.  How many people voted 0/4 years, 1/4 years, 2/4 years, etc.:

results

Well, lucky for me I’m working with Paul Montgomery who happens to be a whiz at all things SQL Server and he gave me a hand.

The first step in solving this challenge is manipulating the data so each voter has one row per year voted.  Paul pointed me to the UNPIVOT keyword.  UNPIVOT rotates columns into rows:

select * from (
SELECT * FROM Voters) src
UNPIVOT (Voted FOR [Year] IN (Voted200905, Voted200811, Voted200803, Voted2007)) unpvt

This query gives me this result:

unpivot 

Now I can group by the voter’s ID and sum the times they voted.

SELECT VoterID, Years = SUM(CAST(Voted AS TINYINT)) FROM (
SELECT * FROM Voters.dbo.Voters) src
UNPIVOT (Voted FOR [Year] IN (Voted200905, Voted200811, Voted200803, Voted2007)) unpvt
GROUP BY VoterId

grouped

Now I just need to group by the “Years” column and I’ve got my result:

SELECT Years, count(1) as Voters FROM (
    SELECT VoterID, Years = SUM(CAST(Voted AS TINYINT))
        FROM (SELECT * FROM Voters.dbo.Voters) src
        UNPIVOT (Voted FOR [Year] IN (Voted200905, Voted200811, Voted200803, Voted2007)) unpvt
        GROUP BY voterId) voters
GROUP BY Years
ORDER BY Years

results

Only 355 of the 5091 registered voters (which already is a small subset of the population) voted in all 4 of the elections I had data for!