Fun with ABS datapack, top 20 Viet suburbs in Victoria

I downloaded the 2011 ABS (Australian Bureau of Statistics) data pack the other day. I first heard about it on Slashdot, where people mentioned it was a pain in the ass to download the data directly. The alternative is to fork out $200 to get a DVD delivered!! Fortunately, someone was being a true Aussie and packaged it all up into a single 4.9 GB torrent. When decompressed it expands to a whopping 22 GB of CSV files plus some sort of map file.

Navigating the CSV files is a bit tricky because they make heavy use of acronyms and ID codes that require a separate lookup file. Nonetheless, after 30 minutes or so I thought I’d compile some simple stats. For fun I made a list of the top 20 Viet suburbs in Victoria, Australia. Why? Coz I’m Viet.

Rank Suburb 2011 count (ABS may have added random noise)
1 Springvale 4183
2 St Albans – South 3111
3 Braybrook 2891
4 Sunshine North 2462
5 St Albans – North 2386
6 Noble Park 2293
7 Springvale South 2227
8 Sunshine West 2144
9 Keysborough 2005
10 Kings Park (Vic.) 1639
11 Deer Park – Derrimut 1575
12 Cairnlea 1565
13 Richmond (Vic.) 1343
14 Footscray 1239
15 Maribyrnong 1125
16 Thomastown 1051
17 Sunshine 990
18 Keilor East 891
19 West Footscray – Tottenham 824
20 Lalor 790

I believe the count is based on people born in Vietnam; I’m not sure whether it includes Viets born in Australia. The numbers above line up with what I’ve observed.
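For anyone wanting to build a list like this themselves, here is a rough sketch of the approach: read the lookup file into a dictionary, swap each region code for its name, then sort by count. The file contents, the region_id codes, the column names and the "Somewhere Small" row are all made up for illustration. The real datapack uses its own file names and acronyms, so treat this as a starting point rather than what I actually ran.

```python
import csv
import io

# Hypothetical stand-ins for a data file and its lookup file. The real
# datapack has its own file names, region codes and column acronyms.
DATA_CSV = """region_id,born_in_vietnam
20001,2891
20002,12
20003,4183
20004,2462
"""
LOOKUP_CSV = """region_id,region_name
20001,Braybrook
20002,Somewhere Small
20003,Springvale
20004,Sunshine North
"""

def top_regions(data_file, lookup_file, count_column, n=20):
    """Return the n (name, count) pairs with the highest count."""
    # Map each region code to a readable name.
    names = {row["region_id"]: row["region_name"]
             for row in csv.DictReader(lookup_file)}
    # Fall back to the raw code if the lookup file has no entry for it.
    counts = [
        (names.get(row["region_id"], row["region_id"]), int(row[count_column]))
        for row in csv.DictReader(data_file)
    ]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:n]

top = top_regions(io.StringIO(DATA_CSV), io.StringIO(LOOKUP_CSV),
                  "born_in_vietnam", n=3)
for rank, (name, count) in enumerate(top, start=1):
    print(rank, name, count)
```

With real files you would pass open file handles instead of the io.StringIO objects.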

What I found interesting about the data in general is the issue of confidentiality. To stop the data from being traced back to individuals, the ABS added random noise and even advises against relying on stats with small numbers. How small is small? I have no idea. Also of interest is that the Act behind this is fairly old:

Under the Census and Statistics Act (1905) it is an offence to release any information collected under the Act that is likely to enable identification of any particular individual or organisation. Introduced random error is used to ensure that no data are released which could risk the identification of individuals in the statistics.
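To get a feel for why small numbers are the risky ones, here is a toy sketch. It is not the ABS method, which I don't know the details of. It just assumes each count gets shifted by a random integer of at most MAX_NOISE, a value I picked purely for illustration, and shows how large that shift is relative to the true count.

```python
import random

def perturb(count, max_noise, rng):
    """Add a random integer in [-max_noise, max_noise], never going below zero."""
    return max(0, count + rng.randint(-max_noise, max_noise))

def worst_case_relative_error(count, max_noise):
    """Largest possible shift as a fraction of the true count."""
    return max_noise / count

MAX_NOISE = 3  # assumption for illustration only, not an ABS figure
rng = random.Random(0)  # fixed seed so repeated runs give the same output

for true_count in (5, 50, 790, 4183):
    noisy = perturb(true_count, MAX_NOISE, rng)
    err = worst_case_relative_error(true_count, MAX_NOISE)
    print(f"{true_count:>5} -> {noisy:>5}  worst case error {err:.1%}")
```

Under that assumption a count of 5 could be off by more than half, while a count in the thousands barely moves, which would fit the advice to be wary of small cells.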

Here are some links of interest on this topic:

I’ll probably spend more time playing with the data trying to come up with more racially targeted stats, because they’re cool, interesting and this is Australia 🙂
