***Exporting occupational codes to Pajek
**Dave Griffiths, University of Stirling
**17 November 2010
**This do file converts occupational pairings into a file that can be read into Pajek.
**It predicts the expected number of pairings given the number of people, by gender, in each unit group.
**This expectation is then compared with the number actually observed to identify the relative level of over- or under-representation.
**The resulting dataset can then be exported and converted by txt2pajek to produce a social network matrix.
*txt2pajek can be downloaded from: http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/text2pajek.htm
**hocc and wocc can be exported to populate a social network.
**pro_obs and value can be used as thresholds to dichotomise the network, as the values of lines, or both.
**Examples of research using these methods can be found at: http://www.camsis.stir.ac.uk/sonocs/papers

************************
**Requirements
* A horizontal dataset (one row per pairing) with:
* i) an occupation variable for the ego (called 'hocc')
* ii) an occupation variable for their alter (called 'wocc')
* iii) all cases without occupational information for respondent and/or spouses removed.
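**A minimal sketch of checking requirements i) to iii) before running the rest of this file.
**It is an illustration, not part of the original procedure.
**Assumption: missing occupations are held as Stata missing values (. for numeric, "" for string);
**if your data uses other missing codes (for instance -9), recode them first.
*stop with an error if either occupation variable is absent
confirm variable hocc wocc
*remove pairings where either occupation is missing
drop if missing(hocc) | missing(wocc)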
**Please note, this follows the CAMSIS convention of holding the husband's occupation (hocc) and the wife's (wocc)
* This method can be used for other pairings of data, including, for instance, fathers and sons, or housemates

******Exporting only those linkages which are above the expected values

**create frequency dataset (one row per hocc-wocc combination)
capture drop freq
gen freq = 1
collapse (count) freq, by(hocc wocc)
list in 1/20

*****Find total number of pairings in the sample
capture drop tot
egen tot=sum(freq)
summarize tot

*******Find totals for men and women in each occupation
capture drop nhocc
capture drop nwocc
egen nhocc=sum(freq), by(hocc)
egen nwocc=sum(freq), by(wocc)
list hocc wocc freq nhocc nwocc in 1/20

****Find proportion of men and of women in each occupation
capture drop phocc
capture drop pwocc
gen phocc=nhocc/tot
gen pwocc=nwocc/tot
summarize
list hocc wocc freq phocc pwocc in 1/20

*******Calculate expected number of pairings for each combination
capture drop ewocc
gen ewocc=pwocc*nhocc
summarize
list hocc wocc ewocc freq nhocc nwocc in 1/20

**************create expectation surplus (ratio of observed to expected)
capture drop value
gen value=freq/ewocc

************Create standard error predictions
capture drop prop
gen prop = freq/tot
capture drop staner
gen staner = sqrt((prop)*(1 - prop) / tot)
list freq tot phocc pwocc ewocc value prop staner in 1/20

**staner is the standard error of the observed proportion
**therefore, we need to compare the observed proportion with the expected proportion
**pro_min and pro_max are one standard error either side of the observed proportion
capture drop pro_obs
gen pro_obs = freq/tot
capture drop pro_exp
gen pro_exp = ewocc/tot
capture drop pro_min
gen pro_min = pro_obs - staner
capture drop pro_max
gen pro_max = pro_obs + staner
capture drop value
gen value = pro_obs / pro_exp
capture drop val_min
gen val_min = pro_min / pro_exp
capture drop val_max
gen val_max = pro_max / pro_exp

***********************label variables
label variable tot "total number in sample"
label variable nhocc "total number of males in occupation"
label variable nwocc "total number of females in occupation"
label variable phocc "proportion of men in occupation"
label variable pwocc "proportion of women in occupation"
label variable ewocc "expected number of partnerships"
label variable staner "Standard error for tie"
label variable pro_obs "Observed proportion of all ties"
label variable pro_exp "Expected proportion of all ties"
label variable pro_min "Lower confidence interval of observed proportion"
label variable pro_max "Higher confidence interval of observed proportion"
label variable value "Observed value of representation"
label variable val_min "Value of representation for lower confidence interval"
label variable val_max "Value of representation for higher confidence interval"

**This do file was created as part of the Economic and Social Research Council funded project:
**Social Networks and Occupational Structure (ESRC grant no: RES-062-23-2497)
**Paul Lambert and Dave Griffiths, University of Stirling
*For more information on the project, see http://www.camsis.stir.ac.uk/sonocs/