4 Data revision (merging the most sparse units)

 
compute hbst3=hbst. 
compute wbst3=wbst. 
** 1) Manual recodes :. 
* A (typically very long) list of manual recodes as derived from 
* the surveying of originally sparsely represented units and 
* preliminary model results. 
recode hbst3 (13192=13191) (21132 = 21131) (21302, 21312 = 21392) 
     (21482 = 21412) (21432, 21442, 21452 = 21422) (22212 = 22211) 
     (22222 = 22221) (etc = etc) . 
recode wbst3  (11401, 11202 = 11201) (12261, 12221 = 12251) 
    (12222 = 12252) (13192 = 13191) (21211, 21131, 21132, 21111, 21112 = 21101) (21301 = 21311) 
     (21401, 21421, 21431, 21441, 21451, 21461, 21481 = 21411) 
     (21312 = 21392) (etc = etc) . 
 

** 2) Pseudo-diagonal corrections . 
* The preliminary models already identify some of the major 
* pseudo-diagonals (many more may yet be defined at a later 
* stage). These cases were excluded from the manual revisions 
* above, so they must now be recoded (which is also convenient 
* for other reasons) into categories that still exist in the 
* revised units. 
* eg 1 : all farming pseudo-diagonals:. 
compute agric1=0. 
if ( (h2gp=61 | hocc=9211 | hocc=3212) & (w2gp=61 | wocc=9211 | wocc=3212) ) agric1=1. 
* eg 2 : all diagonals by occupational title. 
compute diocc=(hocc=wocc). 
* recoded base unit values : for agric1, choose 1 category each; 
* for diocc, choose one category for each of the 5 major group units. 
if (agric1=1) hbst3=61101. 
if (agric1=1) wbst3=61101. 
if (diocc=1 & h1gp=1) hbst3=12303. 
if (diocc=1 & h1gp=2) hbst3=23593. 
if (diocc=1 & h1gp=3) hbst3=34103. 
if (diocc=1 & h1gp=4) hbst3=41103. 
if (diocc=1 & h1gp=5) hbst3=52203. 
if (diocc=1 & w1gp=1) wbst3=12303. 
if (diocc=1 & w1gp=2) wbst3=23593. 
if (diocc=1 & w1gp=3) wbst3=34103. 
if (diocc=1 & w1gp=4) wbst3=41103. 
if (diocc=1 & w1gp=5) wbst3=52203. 
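* Optional check (a sketch, not part of the original procedure) : 
* the diocc recodes above run after the agric1 recodes, so a couple 
* flagged on both (for example hocc=wocc=9211) ends up with the diocc 
* category rather than 61101 whenever h1gp or w1gp lies in 1 to 5 
* The cross-tabulation below counts such overlapping cases, assuming 
* the cell counts are held in freq as in the summary tables later on. 
weight by freq. 
crosstabs /tables=agric1 by diocc. 
weight off. 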

** 3) Recompute derivative units and add value labels. 
compute hocc3=trunc(hbst3 / 10). 
compute hempst3=hbst3 - (10 * (trunc(hbst3 / 10) ) ). 
compute wocc3=trunc(wbst3 / 10). 
compute wempst3=wbst3 - (10 * (trunc(wbst3 / 10) ) ). 
execute. 
compute h1gp3=trunc(hocc3/1000). 
compute w1gp3=trunc(wocc3/1000). 
recode h1gp3 w1gp3 (5 thru hi=5). 
compute h2gp3=trunc(hocc3/100). 
*etc through all subgroups. 
compute h1gp3st=h1gp3*10 + hempst3. 
compute w1gp3st=w1gp3*10 + wempst3. 
*etc through all subgroups-by-status. 
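* Worked example (illustrative) : for a case with hbst3=61101, as 
* assigned to the farming pseudo-diagonals above, the computations 
* give hocc3 = trunc(61101 / 10) = 6110 ; hempst3 = 61101 - 10 * 6110 = 1 ; 
* h1gp3 = trunc(6110 / 1000) = 6, recoded to 5 by (5 thru hi=5) ; 
* h2gp3 = trunc(6110 / 100) = 61 ; h1gp3st = 5 * 10 + 1 = 51. 
* Optional check (a sketch, not part of the original procedure) : 
* list the status digits that actually occur, to confirm that only 
* the expected employment status codes remain after the recodes. 
frequencies variables=hempst3 wempst3. 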
* Value Labels : . 
include file="versionlabels.sps". 
occlab occ={hocc3 wocc3}. 
stlab occ={hempst3 wempst3}. 
bstlab occ={hbst3 wbst3}. 
majlab occ={h1gp3 w1gp3}. 
* plus variable labels if desired :. 
variable label hocc3 "Title unit males revision 1". 
variable label hbst3 "Title-by-status unit males revision 1". 
*etc. 
* summary info :. 
weight by freq. 
tables /format blank missing ('.') /ftotal=ftot1 "Total" 
  /tables (labels) + ftot1 by (hbst3 + wbst3) 
   /statistics count ((F5.0) ' Cases ') 
      /title "Male and Female title-by-status revision 1". 
*etc. 
weight off. 

** 4) Calculate square autorecoded values. 
* [FOLLOW THE ILLUSTRATION SHOWN IN SECTION 2.4 EARLIER, REPLACING THE 
* VARIABLE NAME ENDINGS [BLANK] AND 2 WITH 3 AND 4 RESPECTIVELY 
* (IE, HOCC -> HOCC3; HOCC2 -> HOCC4, ETC) ]. 
save outfile="revision1.sav". 

 
 
 

Last modified 14 February 2002
This document is maintained by Paul Lambert (paul.lambert@stirling.ac.uk)