Hi everyone! We're just getting settled and plugged into our laptops. Workshop will be starting shortly.
Fred Vallance-Jones is introducing Glen McGregor. We're starting off by downloading some Excel sheets of raw data we'll be working off of.
McGregor: Nowadays, anyone working in a newsroom should have some literacy with Excel. If you're working with a city council budget, or analyzing police crime stats, you need to be able to sift through numbers with it.
He'll be showing us some ways to find stories in data using Excel today.
Right now we're all working with a dataset from 2013. McGregor was curious about how much money Justin Trudeau had around the time that it looked like we was going to win the Liberal party. Trudeau's team had agreed to give him numbers on how much he made in speaking fees that year.
What's the first number you'd want from a list like that? The total. The formula to do that with a list is =SUM(first cell in range:last cell in range).
So: Trudeau charged $1,316,500 in speaking fees that year.
Next number: the average. The formula to do that is =AVERAGE(first cell in range:last cell in range).
So: Trudeau charged an average of $10,703 per event.
Question from audience: How would you sift to find out how much he charged for events in one region of Canada?
McGregor: The dataset would need a separate column that sifted that out first.
Next news-worthy number: the maximum amount he charged for an event. To do this: Keep your cursor in a cell in the amount range, head to SORT on the toolbox at the top of the screen, and do so from A-Z.
So: the most Trudeau charged for an event that year was $20,000 (for 11 events).
Next Q: How did what he charge for speaking fees change over the years? Excel recognizes dates, so if you sort by the date column, and highlight the numbers in the year that you're interested in, the sum of the year you're looking for will be calculated at the bottom of the screen.
Next topic: Gymnastics. He's got a dataset looking at the scores of gymnasts from a specific competition (his daughter is a gymnast).
If you'd like to calculate the total score for each gymnast in a column without re-writing the SUM formula over and over, you can copy and paste. Ctrl + Paste is relative to the position of where you're placing the formula.
How to find out the ranking of each gymnast for a specific section of the competition (in this case freestyle?) Use the formula =RANK(first cell in the column, first cell in the range;last cell in the range).
Note: To keep the formula for changing its position, use anchors. That's putting writing out the cells in a range with a dollar sign. For example: B$2. So an example formula would look like =RANGE(B2,B$2:b$9)
Next topic: Sunshine lists!
What are some of the first numbers you'd want from a sunshine list?
Who has the highest salary in the province? You can use SORT to figure this out.
McGregor: Usually, for journalism, total, highest and average are going to be the quickest numbers to use for a story; they're the ones that jump out at you for a lede.
Next data set: a list of Texas inmates who have been executed by the state. The state makes this info freely available, which includes names, age, date of execution, race, county.
First question: how does this data break down by race? Or by year? You'll use pivot tables to do this: it's a way of aggregating data within Excel by type that reaches beyond simple numbers.
To do this: create a new column for the dataset you're looking for. In this case, we're going by year. The formula for the first cell in this new "Year" row is =YEAR(first cell in original date range:last cell in original date range).
Remember: the YEAR formula works since Excel recognizes date formats.
For a pivot table: keep your cursor on any cell in the data set, and click on the pivot table option in the tool bar. It'll ask you if you want to put the data in a new worksheet or use the existing one. McGregor: It's often easier to do this in a new sheet.
In the new worksheet, Excel will offer you the different pivot table fields you can re-organize the dataset by. Drag and drop the datasets you're interested in, and they'll show up on the table sorted into the range you're looking for.
In the case of year, you can go from first year to last. In this case you'll find the number of executions from the first year (1983) to last (present). You'll find that there are fewer executions in the earlier years.
We're not organizing by race. If you wanted to find out, for example, what the racial breakdown of executions was like when a certain governor held office, you'd select race for columns and governors for rows.
Finding: Perry presided over the most executions (236) of that period.
How about ratios? Underneath the column for Bush, for example, you could use the formula =cell of ratio subject/cell of total*100
We're now using pivot tables to sort through Ottawa bicycle theft data.
Question from audience: Did you take an Exel workshop when you first started reporting?
McGregor: Not until much later. A few years ago I wrote a story where a source gave him bad information, and he was served a libel notice over the story that resulted from that info. "It really rattled me." He's been keen to use datasets to backup stories ever since.
Last dataset: registered pets in Ottawa.
This data set has been used in the past to find out bite/attack rates for pitbulls in the province. But we'll be looking for some more light-hearted facts with this sheet today: most popular dog and cat names in Ottawa.
Using pivot tables with the filter for name, the top 5 names for dogs and cats are: Bailey, Buddy, Max, Maggie, Toby.
McGregor: Do this and write a story about the top 3 cat names, and colleagues will be leaving cans of cat food and notes on your desk for weeks. But it'll also be one of the most-read stories on the site.
Everything we've covered are super simple formulas to sift through data quickly—and turn around a story quickly, if you know what numbers you're looking for.
More to come on datasets in tomorrow's day-long workshop.
And that wraps up this panel! Thanks for joining us.