5.2.1 Handling pseudo-diagonals in LEM: SPSS syntax

 
5.2.1.2 
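* Read the all-zero starting matrix (columns wunit1 to wunit560). 
data list file="des560zeros.dat" free / wunit1 to wunit560. 
* Number the rows by case order, so that hbst4 identifies each row unit. 
compute hbst4=$casenum. 
sav out="temp1.sav". 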



 
5.2.1.3 
* starting point is defining a short macro which is run for each identified 
* pseudo-diagonal combination. 
define psd1 (hoc=!tokens(1) / h=!tokens(1) /w=!tokens(1) ) . 
if (!hoc=!h) !concat(wunit,!w)=lagwend. 
!enddefine. 
* This relies on the existence of a variable, 'lagwend', which starts 
* equal to 0 for all cases and is then incremented by 1 for each new pseudo-diagonal. 
* For the a priori diagonals, we fill in the values of the pseudo-diagonals 
* by inspection; ie by checking which values of the square-autorecoded 
* variables {h/w}bst4 correspond to the original 
* values of the variables {h/w}bst. 
* This inspection can be done by crosstabulating the data in terms 
* of the new units, conditional upon the pseudo-diagonals from the old units. 
get file="revisions1.sav". 
weight by freq. 
* Eg 1 : all diagonal titles. 
compute diocc=(hocc=wocc). 
temp. 
select if (diocc=1). 
cro hbst4 by wbst4. 
* Eg 2 : farming semi-diagonal. 
compute farm=(hocc=110 & (wocc=110 | wocc=160)). 
temp. 
select if (farm=1). 
cro hbst4 by wbst4. 
* etc etc. 
weight off. 
* The crosstabs should not generally cover too many combinations, 
* as during the data revision section we already 'tidied up' 
* the treatment of pre-defined problem occupations. 
* Next, we use that information from inspection to fill in the 
* relevant values (of {h/w}bst4) into the following sequence of macros. 
* [eg, say the first table revealed three combinations, namely 
* (1,1), (75,75) and (450,450), and the second table showed one 
* combination, namely (200,205)]. 
get file="temp1.sav". 
compute lagwend=0. 
execute. 
compute lagwend = lagwend +1. 
psd1 hoc=hbst4 h=1 w=1. 
compute lagwend = lagwend +1. 
psd1 hoc=hbst4 h=75 w=75. 
compute lagwend = lagwend +1. 
psd1 hoc=hbst4 h=450 w=450. 
compute lagwend = lagwend +1. 
psd1 hoc=hbst4 h=200 w=205. 
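
The effect of the four psd1 calls above can be traced outside SPSS. The following Python sketch is not part of the original syntax. It is a minimal illustration, assuming a 560 x 560 matrix of zeros whose rows correspond to hbst4 and whose columns correspond to wunit1 to wunit560. The function name build_design is hypothetical.

```python
# Illustrative only: mimics the SPSS steps above in plain Python.
N_UNITS = 560  # number of wunit columns, and of rows (one per hbst4 value)

def build_design(pairs, n_units=N_UNITS, start=0):
    """Return (matrix, lagwend) after numbering each (h, w) pair in turn."""
    # Row h-1 stands for the case with hbst4 == h; column w-1 for wunit<w>.
    matrix = [[0] * n_units for _ in range(n_units)]
    lagwend = start
    for h, w in pairs:
        lagwend += 1                    # compute lagwend = lagwend +1.
        matrix[h - 1][w - 1] = lagwend  # psd1 hoc=hbst4 h=<h> w=<w>.
    return matrix, lagwend

# The four combinations used in the example above.
design, last = build_design([(1, 1), (75, 75), (450, 450), (200, 205)])
print(last)              # running index of the last pseudo-diagonal
print(design[199][204])  # cell for hbst4=200, wunit205
```

Each named cell receives its own index, and every other cell stays at 0, which is the pattern the SPSS data file should show after the macro calls.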



 
5.2.1.4 
* assumes continuation from stages 1-3 above. 
* starting value of lagwend is already specified (max value of diags). 
fre var=lagwend. 
* Because of the complexity of the multiple data entries at this 
* stage, we use include files to add in the relevant additional pseudo-diagonals. 
include file="pseuddiags1.sps". 
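* where the include file has the following structure, and is 
* updated as we move from each version of the model to the next. 
* (The macro 'prob2' called in it is not defined above ; it is assumed 
* to have the same definition as 'psd1'). 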
 
* CONTENTS OF PSEUDDIAGS1.SPS. 
* Extreme positive pseudo-diagonals :. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=408 w=360. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=408 w=284. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=283 w=7. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=283 w=20. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=283 w=195. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=283 w=86. 
* Extreme negative pseudo-diagonals :. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=356 w=239. 
* Extreme positive outliers :. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=44 w=240. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=534 w=269. 
* Extreme negative outliers :. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=7 w=18. 
compute lagwend = lagwend +1. 
prob2 hoc=hbst4 h=195 w=241. 
* (Updates from successive versions can either be pasted into each 
* successively run include files, or pasted into the bottom of the single 
* include file). 
* Note that because the include file uses a lot of duplicate 
* text, it can often be created by using 'replace' functions from 
* starting data which included just two columns for the combination values. 
* (this is desirable because it saves time during data entry). 
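
As an alternative to the 'replace' approach just described, the repetitive include-file text could be generated from the two columns of combination values. The Python sketch below is only an illustration of that idea, not the procedure actually used. The function name include_lines is hypothetical, and the macro name prob2 and variable hbst4 are taken from the listing above.

```python
# Illustrative only: builds the repetitive include-file text from (h, w) pairs.
def include_lines(pairs, macro="prob2", hoc="hbst4"):
    """Return the SPSS lines that number and register each combination."""
    lines = []
    for h, w in pairs:
        lines.append("compute lagwend = lagwend +1.")
        lines.append(f"{macro} hoc={hoc} h={h} w={w}.")
    return lines

# First two combinations from the listing above.
text = include_lines([(408, 360), (408, 284)])
print("\n".join(text))
```

The printed lines can then be pasted into the include file, one SPSS command per line.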



 
5.2.1.5 
* assumes continuation from stages 1-4 above. 
* The current SPSS data file now has the form of the desired design matrix. 
fre var=lagwend. 
* The value (the max number of pseudo-diagonal combinations named) 
* is important : it will have to be named in subsequent lEM command files. 
write out="des560r1u1p1.txt" / wunit1 to wunit560 (560(F4.0)). 
execute. 
* (the indicators r1, u1 and p1 will be explained later). 
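
To picture the layout of the text file written above, the Python sketch below formats one row of the design matrix. It assumes that the F4.0 format writes each value as a whole number, right-justified in four columns. The function name format_row is hypothetical.

```python
# Illustrative only: one design-matrix row as fixed-width text.
def format_row(values, width=4):
    """Right-justify each whole-number value in `width` columns, no separators."""
    return "".join(f"{int(v):{width}d}" for v in values)

sample = format_row([0, 1, 12])
full_row_length = len(format_row([0] * 560))
print(repr(sample))
print(full_row_length)
```

Under that assumption each of the 560 values occupies four characters, so a complete row is 2240 characters long.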



 
5.2.1.6 
get file="revisions1.sav". 
* 6.1) first check the initial pseudo-diagonals. 
* again calculating dummy indicators from original data :. 
* Eg 1 : all diagonal titles. 
compute diocc=(hocc=wocc). 
* Eg 2 : farming semi-diagonal. 
compute farm=(hocc=110 & (wocc=110 | wocc=160)). 
* etc etc. 
* Create an aggregate indicator of all initial pseudo-diagonals. 
compute psds1=(diocc=1 | farm=1 | etc etc). 
variable label psds1 "Pseudo-diagonals original sources". 
weight by freq. 
tables /format blank missing ('.') /ftotal=ftot1 "Total" 
  /tables (labels) + ftot1 by psds1 > (hbst4 + wbst4) /statistics count ((F5.0) ' Cases ') 
  /title "Male and female title-by-status units". 
weight off. 
* 6.2) Then incorporate all the other pseudo-diagonals using 
* a macro and an adjusted version of the include files. 
compute h1=hbst4.
compute w1=wbst4. 
compute psds2=psds1. 
variable label psds2 "All pseudo-diagonals all sources". 
* macro for identifying other pseudo-diagonals by indicator variable. 
define idpr2 ( h=!tokens(1) /w=!tokens(1) ). 
if (h1=!h & w1=!w) psds2=1. 
!enddefine. 
* Include the file idenpsds1.sps as follows :. 
include file="idenpsds1.sps". 
* where the include file has the following structure, and is 
* updated (with more combinations) as we move from each version of the model to the next : 
* (note that the named values in the file are identical to those 
* in the pseuddiags1.sps file, so the idenpsds1.sps file can 
* be created by use of a 'replace' function for the relevant terms). 
 
* CONTENTS OF IDENPSDS1.SPS. 
* Extreme positive pseudo-diagonals :. 
idpr2 h=408 w=360. 
idpr2 h=408 w=284. 
idpr2 h=283 w=7. 
idpr2 h=283 w=20. 
idpr2 h=283 w=195. 
idpr2 h=283 w=86. 
* Extreme negative pseudo-diagonals :. 
idpr2 h=356 w=239. 
* Extreme positive outliers :. 
idpr2 h=44 w=240. 
idpr2 h=534 w=269. 
* Extreme negative outliers :. 
idpr2 h=7 w=18. 
idpr2 h=195 w=241. 
weight by freq. 
cro psds1 by psds2. 
weight off. 
* then check : are any occupational unit distributions 
* adversely affected (under-represented) after pseudo-diagonal 
* combinations are excluded. 
set ovars both onums both tvars both tnumbers labels. 
weight by freq. 
tables /format blank missing ('.') /ftotal=ftot1 "Total" 
  /tables (labels) + ftot1 by psds2 > (hbst4 + wbst4) /statistics count ((F5.0) ' Cases ') 
  /title "Male and female title-by-status by exclusion as Pseudo-diagonal". 
weight off. 
set ovars both onums both tvars both tnumbers both. 
* This table is significant : it shows all the base units and the 
* number of cases with which they are represented in the latest 
* model version, plus the number of cases excluded as pseudo-diagonals. 
* We usually save it (eg paste into excel), and use it in the final report. 
sav out="datamodel1.sav". 
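
The logic of the psds2 flag and of the weighted crosstabulation of psds1 by psds2 can be sketched in Python as follows. This is an illustration only, not the original procedure. The function name crosstab_flags is hypothetical, and the three cases are invented toy data rather than values from the real file.

```python
# Illustrative only: flag extra pseudo-diagonals and cross-tabulate, weighted by freq.
from collections import Counter

def crosstab_flags(cases, extra_pairs):
    """Return weighted counts keyed by (psds1, psds2).

    cases: iterable of dicts with keys hbst4, wbst4, psds1 and freq.
    extra_pairs: the (h, w) combinations listed in the include file.
    """
    extra = set(extra_pairs)
    table = Counter()
    for case in cases:
        in_extra = (case["hbst4"], case["wbst4"]) in extra
        # psds2 starts as a copy of psds1 and is set to 1 for any listed pair.
        psds2 = 1 if case["psds1"] == 1 or in_extra else 0
        table[(case["psds1"], psds2)] += case["freq"]  # weight by freq
    return table

# Invented toy cases, for illustration only.
toy_cases = [
    {"hbst4": 408, "wbst4": 360, "psds1": 0, "freq": 3},
    {"hbst4": 1, "wbst4": 1, "psds1": 1, "freq": 5},
    {"hbst4": 2, "wbst4": 3, "psds1": 0, "freq": 7},
]
table = crosstab_flags(toy_cases, [(408, 360), (408, 284)])
print(dict(table))
```

Since psds2 is only ever switched on, the cell (psds1=1, psds2=0) should always be empty. A non-zero count there would point to an error in the flagging step.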



  


Last modified 14 February 2002
This document is maintained by Paul Lambert (paul.lambert@stirling.ac.uk)