2. Materials and methods
2.1. Study population and design
We used data from the Korean National Health Insurance Service (NHIS) collected from January 2002 to December 2015. The Korean health insurance system officially started in 1977 and gradually expanded into the NHIS, achieving nationwide health insurance that covered 97% of the population of South Korea by 1989. The NHIS collects individuals' medical claims data, such as disease diagnoses and medication prescriptions, as well as demographic factors. From the initial data, we designed a retrospective cohort study by randomly selecting 5% of adults in the NHIS database who were aged 50 years or older and were followed up for at least five years during the period 2 January 2002 to 31 December 2015 (N = 431,454). We excluded individuals who had been diagnosed with cancer (N = 13,038) and individuals who had any record of sedative-hypnotic prescription (N = 188,928) during the initial two years. A total of 236,759 participants were included in the final analysis.
2.2. Definition of exposures
We pre-defined the types of sedative-hypnotic medications, which included benzodiazepines, zolpidem, and other medications prescribed for off-label uses (antidepressants (amitriptyline, imipramine, low-dose formulation mirtazapine, nortriptyline, trazodone) and low-dose antipsychotics (chlorpromazine, levomepromazine, quetiapine)). We tracked the total amount of sedative-hypnotic medication use per person from 2004 and standardized it by the defined daily dose (DDD). We included in the exposure group only those participants whose cumulative dose of sedative-hypnotic medication exceeded 30 DDD. The total cumulative dose was categorized into three ranges: 30–179 DDD, 180–359 DDD, and over 360 DDD.
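The dose standardization and categorization described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function names, the input layout (drug name plus total dispensed milligrams), and the DDD values in `DDD_MG` are assumptions introduced for the example, and the handling of values lying exactly on a category boundary is also assumed.

```python
from typing import Iterable, Optional, Tuple

# Illustrative DDD values in mg (placeholders for this sketch; a real
# analysis would take them from the official ATC/DDD index).
DDD_MG = {"zolpidem": 10.0, "diazepam": 10.0}


def cumulative_ddd(dispensings: Iterable[Tuple[str, float]]) -> float:
    """Sum dispensed amounts (mg) after dividing each by the drug's DDD."""
    return sum(total_mg / DDD_MG[drug] for drug, total_mg in dispensings)


def dose_category(total_ddd: float) -> Optional[str]:
    """Map a cumulative dose to the three ranges used in the text.

    Returns None below 30 DDD (not counted as exposed). Boundary handling
    (>= at 30, 180, and 360) is an assumption of this sketch.
    """
    if total_ddd < 30:
        return None
    if total_ddd < 180:
        return "30-179 DDD"
    if total_ddd < 360:
        return "180-359 DDD"
    return "360 DDD or more"


# Hypothetical participant: 280 mg of zolpidem and 150 mg of diazepam.
total = cumulative_ddd([("zolpidem", 280.0), ("diazepam", 150.0)])
category = dose_category(total)
```

Expressing every drug in DDD units before summing is what allows doses of different medications to be pooled into one cumulative exposure.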
For subgroup analyses, we classified the sedative-hypnotic medications into two groups: the gamma-aminobutyric acid (GABA) receptor agonist (GABAA) group, which included the benzodiazepines and zolpidem, and the non-GABA group, which included the antidepressants and low-dose antipsychotics. When a participant's cumulative dose exceeded 30 DDD for one of the groups, the participant was considered exposed to that group. If a participant reached 30 DDD for GABAA-group exposure and 30 DDD for non-GABA-group exposure during follow-up, in either order, they were analysed as having exposure to both groups; such cases were referred to as ‘combination exposure.’ In such cases, the cumulative doses of GABAA-group medications and of non-GABA-group medications each had to exceed 30 DDD for the participant to be considered exposed to sedative-hypnotic medications.
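The subgroup rule above can be illustrated with a short sketch. The function name, the group labels, and the choice of `>=` at the 30 DDD threshold are assumptions of this example (the text uses both "exceeded" and "reached"); the set of non-GABA drugs is taken from the list in Section 2.2.

```python
# Non-GABA drugs as listed in Section 2.2; any other sedative-hypnotic
# (benzodiazepines, zolpidem) counts towards the GABAA group.
NON_GABA_DRUGS = {
    "amitriptyline", "imipramine", "mirtazapine", "nortriptyline",
    "trazodone", "chlorpromazine", "levomepromazine", "quetiapine",
}


def exposure_group(gabaa_ddd: float, non_gaba_ddd: float,
                   threshold: float = 30.0) -> str:
    """Classify a participant from group-specific cumulative doses.

    Each group's dose is compared with the threshold separately, so a
    participant below the threshold in both groups is not assigned to
    either subgroup, whatever the combined total.
    """
    gabaa = gabaa_ddd >= threshold
    non_gaba = non_gaba_ddd >= threshold
    if gabaa and non_gaba:
        return "combination"
    if gabaa:
        return "GABAA"
    if non_gaba:
        return "non-GABA"
    return "unexposed"
```

For example, under these assumptions `exposure_group(45.0, 12.0)` yields the GABAA group, while `exposure_group(45.0, 31.0)` yields combination exposure.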
2.3. Definition of outcomes
We used the classification codes from the Korean Standard Classification of Diseases, sixth version (KCD-6; Appendix I) to classify cancer diagnoses, as this was the standard code used for disease coding in the NHIS. The KCD-6 uses a coding system identical to the World Health Organization International Classification of Diseases, 10th version (ICD-10).
To define cancer cases, we used a working definition based on the ‘major disease’ and ‘first minor disease’ items in the claims data. If either of these items reported a pre-defined cancer code (Appendix I) between 1 January 2004 and 31 December 2015 with a confirmed hospitalization, the participant was considered to be a case. The first date on which the cancer code was reported was considered to be the index date. For external validity, we compared the incidence of cancer in our study with cancer statistics from the Korean Central Cancer Registry.
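The case definition above can be sketched as a small function over claim records. The record layout (`date`, `major`, `first_minor`, `hospitalized`), the function name, and the two codes in `CANCER_CODES` are hypothetical stand-ins for the claims data and for Appendix I; the sketch also assumes that the hospitalization is recorded on the same claim as the cancer code.

```python
from datetime import date
from typing import Iterable, Mapping, Optional

# Hypothetical stand-in for the pre-defined code list in Appendix I.
CANCER_CODES = {"C18", "C34"}

WINDOW_START = date(2004, 1, 1)
WINDOW_END = date(2015, 12, 31)


def index_date(claims: Iterable[Mapping]) -> Optional[date]:
    """Return the first qualifying cancer claim date, or None if no case.

    A claim qualifies when it falls inside the outcome window, involves a
    hospitalization, and carries a cancer code in either diagnosis item.
    """
    dates = [
        c["date"]
        for c in claims
        if WINDOW_START <= c["date"] <= WINDOW_END
        and c["hospitalized"]
        and (c["major"] in CANCER_CODES or c["first_minor"] in CANCER_CODES)
    ]
    return min(dates) if dates else None


# Hypothetical claims for one participant.
example_claims = [
    {"date": date(2008, 5, 2), "major": "C34", "first_minor": "I10",
     "hospitalized": False},
    {"date": date(2009, 3, 14), "major": "I10", "first_minor": "C34",
     "hospitalized": True},
    {"date": date(2011, 7, 1), "major": "C34", "first_minor": "I10",
     "hospitalized": True},
]
first_case_date = index_date(example_claims)
```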
2.4. Covariates
Age, sex, comorbidities, and insurance premium were included as possible confounders in the final models. Insurance premium was used as a proxy for socio-economic status because it was determined by the economic status of the beneficiary, and it was grouped into three categories by distribution. To measure the comorbidities of each participant, we used a previously developed algorithm that applies the Charlson Comorbidity Index using ICD-10-CM codes with pre-designated weights. We counted any disease listed in the Charlson Comorbidity Index that was reported more than twice during the study period. Because we coded the covariates using operational definitions, there were no missing data for these variables.
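The comorbidity scoring rule can be illustrated as below. This is a sketch under stated assumptions, not the previously developed algorithm itself: the function name and condition labels are hypothetical, only a small subset of Charlson categories is shown with their original index weights, and "more than twice" is read as at least three reports.

```python
from collections import Counter
from typing import Iterable

# Illustrative subset of Charlson categories and weights; the full
# ICD-10-CM code mapping used in the study is not reproduced here.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "renal_disease": 2,
    "moderate_severe_liver_disease": 3,
}


def charlson_score(reported: Iterable[str], min_reports: int = 3) -> int:
    """Sum the weights of conditions reported at least `min_reports` times."""
    counts = Counter(reported)
    return sum(
        weight
        for condition, weight in CHARLSON_WEIGHTS.items()
        if counts[condition] >= min_reports
    )


# Hypothetical participant: renal disease appears only twice, so it is
# not counted; the other two conditions contribute 1 + 3.
reports = (["myocardial_infarction"] * 3
           + ["renal_disease"] * 2
           + ["moderate_severe_liver_disease"] * 4)
score = charlson_score(reports)
```

Requiring repeated reports before a condition is counted reduces the influence of one-off or rule-out diagnosis codes in claims data.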
2.5. Statistical analysis
We compared baseline characteristics by exposure status. For exposure status, we applied a time-varying analysis to capture changes in exposure over time. Because reverse causation was possible, we applied a five-year lag period between exposure and outcome. A Cox proportional hazards model was used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between sedative-hypnotic medication use and cancer incidence. Person-time was measured from January 2002 until the first occurrence of cancer diagnosis, loss to follow-up, or the end of follow-up in December 2015. For individual cancers, we did not consider the presence of other types of cancer; e.g., when calculating the risk of incident lung cancer, a diagnosis of colon cancer before the lung cancer diagnosis did not change the analyses. All analyses were conducted after stratification by sex. We applied the Bonferroni correction for multiple comparisons in the analysis by cancer site. Age was used as the time scale. For covariates, insurance premium at baseline and comorbidity status calculated with the Charlson Comorbidity Index were added to the model. All statistical analyses were conducted with SAS 9.4 (SAS Institute Inc., Cary, NC, USA).
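Two of the steps above, the exposure lag and the Bonferroni correction, can be sketched in a few lines. The analyses themselves were run in SAS; the Python below is only an illustration, and the function names, the input layout, the approximation of five years as a fixed number of days, and the number of cancer sites in the example are all assumptions. It shows one common way to implement a lag: at any evaluation date, only doses dispensed at least five years earlier count towards exposure.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple


def lagged_cumulative_ddd(dispensings: Iterable[Tuple[date, float]],
                          at: date, lag_years: int = 5) -> float:
    """Cumulative DDD at `at`, ignoring doses within the lag window.

    The lag is approximated as 365.25 days per year (an assumption).
    """
    cutoff = at - timedelta(days=round(365.25 * lag_years))
    return sum(ddd for when, ddd in dispensings if when <= cutoff)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical dispensing history: (date, DDD dispensed).
history = [(date(2004, 3, 1), 40.0), (date(2009, 6, 1), 100.0)]

early = lagged_cumulative_ddd(history, at=date(2012, 1, 1))   # first only
late = lagged_cumulative_ddd(history, at=date(2015, 12, 31))  # both count

# With a hypothetical 10 cancer sites, each test uses alpha = 0.05 / 10.
per_site_alpha = bonferroni_alpha(0.05, 10)
```

Evaluating the lagged dose at each event time is what makes the exposure time-varying: the same participant can move from unexposed to a higher dose category as follow-up proceeds.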