About a week ago, I posted an SQLite database containing 2010 US Census population data. However, I discovered a slight problem recently and am releasing an update today. The problem is that I made the assumption that there is a one-to-many relationship between states and ZCTAs. In other words, I made the assumption that states contain many ZCTAs. While this relationship certainly exists, it’s actually too simple. The real relationship is many-to-many. There are a few instances of ZCTAs that are actually contained in multiple states (the ZCTA 30165, for example, is located in both Georgia and Alabama).
This updated schema accounts for this many-to-many relationship:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 | CREATE TABLE states ( id INTEGER NOT NULL, state TEXT NOT NULL, centroid_longitude REAL NOT NULL, centroid_latitude REAL NOT NULL, population_total INTEGER NOT NULL, population_male_total INTEGER NOT NULL, population_male_lt5 INTEGER NOT NULL, population_male_5to9 INTEGER NOT NULL, population_male_10to14 INTEGER NOT NULL, population_male_15to17 INTEGER NOT NULL, population_male_18to19 INTEGER NOT NULL, population_male_20 INTEGER NOT NULL, population_male_21 INTEGER NOT NULL, population_male_22to24 INTEGER NOT NULL, population_male_25to29 INTEGER NOT NULL, population_male_30to34 INTEGER NOT NULL, population_male_35to39 INTEGER NOT NULL, population_male_40to44 INTEGER NOT NULL, population_male_45to49 INTEGER NOT NULL, population_male_50to54 INTEGER NOT NULL, population_male_55to59 INTEGER NOT NULL, population_male_60to61 INTEGER NOT NULL, population_male_62to64 INTEGER NOT NULL, population_male_65to66 INTEGER NOT NULL, population_male_67to69 INTEGER NOT NULL, population_male_70to74 INTEGER NOT NULL, population_male_75to79 INTEGER NOT NULL, population_male_80to84 INTEGER NOT NULL, population_male_ge85 INTEGER NOT NULL, population_female_total INTEGER NOT NULL, population_female_lt5 INTEGER NOT NULL, population_female_5to9 INTEGER NOT NULL, population_female_10to14 INTEGER NOT NULL, population_female_15to17 INTEGER NOT NULL, population_female_18to19 INTEGER NOT NULL, population_female_20 INTEGER NOT NULL, population_female_21 INTEGER NOT NULL, population_female_22to24 INTEGER NOT NULL, population_female_25to29 INTEGER NOT NULL, population_female_30to34 INTEGER NOT NULL, population_female_35to39 INTEGER NOT NULL, population_female_40to44 INTEGER NOT NULL, population_female_45to49 INTEGER NOT NULL, population_female_50to54 INTEGER NOT NULL, population_female_55to59 INTEGER NOT NULL, population_female_60to61 INTEGER NOT NULL, population_female_62to64 INTEGER NOT NULL, population_female_65to66 INTEGER NOT NULL, population_female_67to69 INTEGER NOT NULL, population_female_70to74 INTEGER NOT NULL, population_female_75to79 INTEGER NOT NULL, population_female_80to84 INTEGER NOT NULL, population_female_ge85 INTEGER NOT NULL, PRIMARY KEY (id) ); CREATE TABLE counties ( id INTEGER NOT NULL, state_id INTEGER NOT NULL, county TEXT NOT NULL, centroid_longitude REAL NOT NULL, centroid_latitude REAL NOT NULL, population_total INTEGER NOT NULL, population_male_total INTEGER NOT NULL, population_male_lt5 INTEGER NOT NULL, population_male_5to9 INTEGER NOT NULL, population_male_10to14 INTEGER NOT NULL, population_male_15to17 INTEGER NOT NULL, population_male_18to19 INTEGER NOT NULL, population_male_20 INTEGER NOT NULL, population_male_21 INTEGER NOT NULL, population_male_22to24 INTEGER NOT NULL, population_male_25to29 INTEGER NOT NULL, population_male_30to34 INTEGER NOT NULL, population_male_35to39 INTEGER NOT NULL, population_male_40to44 INTEGER NOT NULL, population_male_45to49 INTEGER NOT NULL, population_male_50to54 INTEGER NOT NULL, population_male_55to59 INTEGER NOT NULL, population_male_60to61 INTEGER NOT NULL, population_male_62to64 INTEGER NOT NULL, population_male_65to66 INTEGER NOT NULL, population_male_67to69 INTEGER NOT NULL, population_male_70to74 INTEGER NOT NULL, population_male_75to79 INTEGER NOT NULL, population_male_80to84 INTEGER NOT NULL, population_male_ge85 INTEGER NOT NULL, population_female_total INTEGER NOT NULL, population_female_lt5 INTEGER NOT NULL, population_female_5to9 INTEGER NOT NULL, population_female_10to14 INTEGER NOT NULL, population_female_15to17 INTEGER NOT NULL, population_female_18to19 INTEGER NOT NULL, population_female_20 INTEGER NOT NULL, population_female_21 INTEGER NOT NULL, population_female_22to24 INTEGER NOT NULL, population_female_25to29 INTEGER NOT NULL, population_female_30to34 INTEGER NOT NULL, population_female_35to39 INTEGER NOT NULL, population_female_40to44 INTEGER NOT NULL, population_female_45to49 INTEGER NOT NULL, population_female_50to54 INTEGER NOT NULL, population_female_55to59 INTEGER NOT NULL, population_female_60to61 INTEGER NOT NULL, population_female_62to64 INTEGER NOT NULL, population_female_65to66 INTEGER NOT NULL, population_female_67to69 INTEGER NOT NULL, population_female_70to74 INTEGER NOT NULL, population_female_75to79 INTEGER NOT NULL, population_female_80to84 INTEGER NOT NULL, population_female_ge85 INTEGER NOT NULL, PRIMARY KEY (id), FOREIGN KEY (state_id) REFERENCES states(id) ); CREATE TABLE zctas ( id INTEGER NOT NULL, zcta TEXT NOT NULL, centroid_longitude REAL NOT NULL, centroid_latitude REAL NOT NULL, population_total INTEGER NOT NULL, population_male_total INTEGER NOT NULL, population_male_lt5 INTEGER NOT NULL, population_male_5to9 INTEGER NOT NULL, population_male_10to14 INTEGER NOT NULL, population_male_15to17 INTEGER NOT NULL, population_male_18to19 INTEGER NOT NULL, population_male_20 INTEGER NOT NULL, population_male_21 INTEGER NOT NULL, population_male_22to24 INTEGER NOT NULL, population_male_25to29 INTEGER NOT NULL, population_male_30to34 INTEGER NOT NULL, population_male_35to39 INTEGER NOT NULL, population_male_40to44 INTEGER NOT NULL, population_male_45to49 INTEGER NOT NULL, population_male_50to54 INTEGER NOT NULL, population_male_55to59 INTEGER NOT NULL, population_male_60to61 INTEGER NOT NULL, population_male_62to64 INTEGER NOT NULL, population_male_65to66 INTEGER NOT NULL, population_male_67to69 INTEGER NOT NULL, population_male_70to74 INTEGER NOT NULL, population_male_75to79 INTEGER NOT NULL, population_male_80to84 INTEGER NOT NULL, population_male_ge85 INTEGER NOT NULL, population_female_total INTEGER NOT NULL, population_female_lt5 INTEGER NOT NULL, population_female_5to9 INTEGER NOT NULL, population_female_10to14 INTEGER NOT NULL, population_female_15to17 INTEGER NOT NULL, population_female_18to19 INTEGER NOT NULL, population_female_20 INTEGER NOT NULL, population_female_21 INTEGER NOT NULL, population_female_22to24 INTEGER NOT NULL, population_female_25to29 INTEGER NOT NULL, population_female_30to34 INTEGER NOT NULL, population_female_35to39 INTEGER NOT NULL, population_female_40to44 INTEGER NOT NULL, population_female_45to49 INTEGER NOT NULL, population_female_50to54 INTEGER NOT NULL, population_female_55to59 INTEGER NOT NULL, population_female_60to61 INTEGER NOT NULL, population_female_62to64 INTEGER NOT NULL, population_female_65to66 INTEGER NOT NULL, population_female_67to69 INTEGER NOT NULL, population_female_70to74 INTEGER NOT NULL, population_female_75to79 INTEGER NOT NULL, population_female_80to84 INTEGER NOT NULL, population_female_ge85 INTEGER NOT NULL, PRIMARY KEY (id) ); CREATE TABLE states_zctas ( state_id INTEGER NOT NULL, zcta_id INTEGER NOT NULL, PRIMARY KEY (state_id, zcta_id), FOREIGN KEY (state_id) REFERENCES states(id), FOREIGN KEY (zcta_id) REFERENCES zctas(id) ); -- need an index on (state_id, zcta_id) and (zcta_id, state_id) CREATE INDEX states_zctas_index2 ON states_zctas(zcta_id, state_id); |
I haven’t changed a lot, but a few things have changed. First, note the states_zctas table which now accounts for the many-to-many relationship between states and ZCTAs. I’ve also renamed the population fields in the tables. *_population is now population_* for clarity reasons.
Accessing data is a little trickier as an INNER JOIN is required. For example, to access the total population for each ZCTA in Iowa, this query does the trick:
1 2 3 4 5 | SELECT zctas.zcta, zctas.population_total FROM zctas, states INNER JOIN states_zctas ON states_zctas.zcta_id=zctas.id AND states_zctas.state_id=states.id WHERE states.state="Iowa"; |
Download: 2010CensusPopulation.7z (size: 3mb)
