##### SPSS Syntax for Demographic Analysis #####

* By Joao Pedro de Magalhaes (http://jp.senescence.info/contact.php)
* Last modified on February 6, 2018
* Further information and updates at: http://genomics.senescence.info/software/demographic.html
* I am not responsible for any damage the use of this syntax may cause. Use at your own risk.


* Brief description and tutorial

Normally, the demographic analysis is conducted in the following order: 1) label the variables and calculate the hazard function; 2) generate graphs; 3) perform linear or non-linear regression; 4) analyse the regression coefficients and, if applicable, determine the statistical significance of differences in slope between two or more cohorts.

To run a quick analysis of the data in mortality.txt, load the data into SPSS and run the "Initialization" syntax, which labels the variables and calculates the hazard function. Then run the "New Calculations (multiple experiments)" syntax, which generates graphs of the data and estimates the parameters of the Gompertz equation by non-linear regression of ln hazard on age, ln m(t) = A + G*t. For the controls, you should obtain A = -6.109890153 and G = 2.822259306, giving the equation m(t) = 0.0022 * e^(2.82*t) (note: exp(-6.11) = 0.0022). To determine whether the slope of the Gompertz curve differs between the two samples, use the "G significance test" below. In the mortality.txt example, the significance is < 0.001; it is reported in the "Coefficients" table, in the "Sig." column of the "test_t" row.


* Start of syntax

### Initialization ###
* The original variable names depend on the method used to import the data into SPSS; run only the set of rename commands below that matches your data.

rename variables var00001 = time.
rename variables var00002 = qx.
rename variables var00003 = deaths.
rename variables var00004 = sample.

rename variables V1 = time.
rename variables V2 = qx.
rename variables V3 = deaths.
rename variables V4 = sample.

compute hz = 2*qx/(2-qx).
compute ln_hz = LN(hz).
Variable label time "Age (yrs)".
Variable label hz "Hazard".
Variable label ln_hz "ln Hazard".
execute.
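
* Optional check, not part of the original syntax: a minimal sketch that summarises the variables computed above.
* Ages with qx = 0 give hz = 0, for which LN(hz) is undefined, so ln_hz is system-missing for those cases.
DESCRIPTIVES VARIABLES=qx hz ln_hz
  /STATISTICS=MEAN MIN MAX.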

string label (A6).
compute label = "WT".
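* sample is treated as numeric in the sections below, so it is compared with the number 2 here, not the string '2'.
if (sample = 2) label = 'KO'.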
execute.

### New Calculations (single experiment) ###

weight by deaths.
execute.

GRAPH
  /SCATTERPLOT(BIVAR)=time WITH ln_hz
  /TITLE= 'ln Hazard'
  /MISSING=LISTWISE .
execute.

MODEL PROGRAM A=-1 G=1 .
COMPUTE PRED_ = A+G*time.
NLR ln_hz
  /PRED PRED_
  /CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8 .
execute.
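
* Optional sketch, not part of the original syntax: back-transform the fitted line to the hazard scale, m(t) = exp(A)*exp(G*t).
* a_hat, g_hat and hz_fit are hypothetical variable names. The values are the control estimates quoted in the tutorial at the top of this file; replace them with the A and G reported in your own NLR output.
compute a_hat = -6.109890153.
compute g_hat = 2.822259306.
compute hz_fit = EXP(a_hat + g_hat*time).
variable label hz_fit "Fitted hazard".
execute.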

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT ln_hz
  /METHOD=ENTER time
  /SCATTERPLOT=(ln_hz ,*ZRESID )
  /SAVE COOK RESID .
execute.

### New Calculations (multiple experiments) ###

weight by deaths.
execute.

GRAPH
  /SCATTERPLOT(BIVAR)=time WITH ln_hz BY label
  /TITLE= 'ln Hazard'
  /MISSING=LISTWISE .
execute.

COMPUTE filter_$=(sample=1).
FILTER BY filter_$.

MODEL PROGRAM A=-1 G=1 .
COMPUTE PRED_ = A+G*time.
NLR ln_hz
  /PRED PRED_
  /CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8 .
execute.

COMPUTE filter_$=(sample=2).
FILTER BY filter_$.

MODEL PROGRAM A=-1 G=1 .
COMPUTE PRED_ = A+G*time.
NLR ln_hz
  /PRED PRED_
  /CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8 .
execute.

COMPUTE filter_$=(sample=1).
FILTER BY filter_$.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT ln_hz
  /METHOD=ENTER time
  /SCATTERPLOT=(ln_hz ,*ZRESID )
  /SAVE COOK RESID .
execute.

COMPUTE filter_$=(sample=2).
FILTER BY filter_$.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT ln_hz
  /METHOD=ENTER time
  /SCATTERPLOT=(ln_hz ,*ZRESID )
  /SAVE COOK RESID .
execute.
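
* Optional sketch, not part of the original syntax: a possible alternative to the repeated FILTER blocks above.
* SPLIT FILE makes REGRESSION run once per value of sample; the filter is switched off first so that all cohorts are included.
filter off.
sort cases by sample.
split file separate by sample.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF R ANOVA
  /DEPENDENT ln_hz
  /METHOD=ENTER time.
split file off.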

### G significance test ###

filter off.
execute.

weight by deaths.
execute.

compute test = 0.
if sample = 1 test = 1.
compute test_t = test*time.
execute.

regression
 /dep ln_hz
 /method = enter test time test_t.
execute.
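
* How to read the test above: the regression fits ln_hz = b0 + b1*test + b2*time + b3*test_t.
* For cases with test = 0 the slope (G) is b2; for sample 1 it is b2 + b3, so the "Sig." value of test_t tests whether b3, the difference in G, is zero.
* Optional sketch, not part of the original syntax: the same model with confidence intervals requested for the coefficients.
regression
 /statistics coeff r anova ci
 /dependent ln_hz
 /method = enter test time test_t.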