Abstract
Target trial emulation (TTE) is an observational, quasi-experimental research design that emulates a randomized clinical trial (RCT) structure within a large set of observational data; the “target trial” is a hypothetical RCT that would have ideally answered the research question. TTEs can address study objectives that, for ethical or logistic reasons, cannot easily be examined in RCTs. Advantages of TTEs over conventional approaches to observational data are that TTEs can reduce bias, improve the understanding of findings, and facilitate causal inference. This article explains what TTEs are, how TTEs are performed, and how TTEs differ from observational studies, quasi-experimental studies, and RCTs. Prevalent user bias and immortal time bias are explained, as is how TTEs are designed to avoid these biases. Strengths and limitations of TTEs are discussed. This article also presents 2 recent studies: one, comprising 3 TTEs that examined scholastic outcomes in children gestationally exposed to benzodiazepines and z-drugs in different periods during pregnancy; and the other, a TTE that examined manic switch as an outcome in bipolar depression patients who received antidepressant treatment. The TTEs found that early, mid, or late pregnancy exposure to benzodiazepines or z-drugs was not associated with impairment in fifth-grade numeracy and literacy performance; and that, in patients with bipolar depression, antidepressant drugs (with or without concurrent mood stabilizers) did not increase the 1-year risk of hypomania, mania, or mixed episodes, nor did they reduce the risk of recurrence of bipolar depression. The TTEs that yielded these results had limitations, and so these findings are suggestive, not definitive. As a general conclusion, TTEs may be viewed as pragmatic, naturalistic, real-world emulations of RCTs, with some advantages over conventional observational studies, but they cannot drive causal inference.
J Clin Psychiatry 2025;86(1):25f15796
Author affiliations are listed at the end of this article.
Target trial emulation (TTE) is an observational research design that emulates a randomized clinical trial (RCT) structure within a large set of observational data, such as data sourced from a health care database, insurance database, or medical registry; the (hypothetical) RCT that would have ideally answered the research question is the “target trial” that is emulated.
What TTEs Do
TTEs use observational data to answer research questions that, for ethical or logistic reasons, cannot easily be examined in RCTs. As examples, TTEs can examine childhood outcomes associated with gestational exposure to neuropsychiatric drugs, or rare adverse effects of neuropsychiatric treatments. TTEs can also conduct head-to-head comparisons of medical, psychosocial, or lifestyle interventions, something that is not easy to do prospectively in research environments. Performing TTEs can be better than performing conventional analyses of observational data because TTEs can reduce bias, improve the understanding of findings, and facilitate causal inference.
Background
The first study employing TTE was probably published in 2008.1 In this study, the authors emulated an RCT within the Nurses’ Health Study data to determine the risk of coronary heart disease events in postmenopausal women who initiated vs did not initiate hormone replacement therapy early vs late after menopause. Importantly, the TTE results appeared to resolve discrepancies in findings between the observational Nurses’ Health Study and the Women’s Health Initiative RCT.1 In other words, if RCTs are a gold standard for detecting cause-effect relationships, analyzing observational data using conventional methods can result in misleading conclusions, but emulating an RCT within the same observational data can align the study findings with the gold standard results.
The term “target trial emulation” and the framework for its use were probably first described in 2016.2 The TTE research design has now been used in dozens of studies across many different branches of medicine, including psychiatry, and the concept and its applications have been explained in many articles.3–8
TTE is easy to understand if readers are familiar with RCTs and observational studies. This article explains TTE and describes 2 recent studies that used TTE methods to examine research questions that are difficult to examine in formal RCTs. One study was conducted in women who were pregnant,9 and the other, in adults diagnosed with bipolar depression.10
Limitations of Conventional Cohort Studies
Observational cohort study data, extracted from health care or related databases, are conventionally analyzed using regression models. In such studies, few restrictions are set on the eligibility of subjects for analysis. As a result, subjects are included regardless of sociodemographic and clinical characteristics, presence of medical and psychiatric comorbidities, presence of alcohol and substance use disorders, choice of drug and dose of drug, use of concurrent medications, duration of treatment, duration of follow-up, and other details, all of which are clearly and (usually) restrictively defined in RCTs. The overarching inclusiveness results in some strengths and some limitations.
A strength of such broadly representative data is that these are real-world data, and so the analyses yield results that have good external validity. However, there are many limitations to the results obtained from conventional approaches to such data. An important but seldom acknowledged limitation is that readers do not get a sense of the effects of specific drugs, doses, and treatment schedules within a specified time span, or the patient subpopulations to which the results can be generalized. Generalization of findings is especially a problem when complex statistical models are applied to address unbalanced covariates and confounding.
Other problems with conventional approaches relate to special kinds of bias, such as prevalent user bias and immortal time bias. These are discussed in the next sections.
Prevalent User Bias
Consider a health care database in which we identify all patients who had a diagnosis of bipolar disorder on our chosen study start date, January 1, 2015. We classify this sample of patients into 2 groups: patients who were using an antidepressant drug on our study start date (treatment group) and patients who were not using an antidepressant drug (comparison group). We wish to determine whether antidepressant use increases the risk of manic switch in bipolar disorder. We follow our sample in the health care database till December 31, 2020, our chosen study end date. We define our primary endpoint as the occurrence of a manic switch. Comparison group patients are censored if they start antidepressant treatment; patients in either group are censored if they are lost to follow-up or if they reach the study end date without experiencing a manic switch (right censoring).
Such a study appears sound on the surface but is vulnerable to prevalent user bias. That is, patients who were already receiving an antidepressant on the study start date are likely to be a selected group: those who tolerated antidepressants well and did not experience a manic switch while on these drugs. Patients who switched into mania soon after starting an antidepressant would probably have stopped the drug before the study start date, and so the risk of manic switch in the treatment group may be underestimated.
Prevalent user bias occurs when follow-up starts after, rather than at, the time of treatment assignment. It is sometimes described in other ways, such as depletion of susceptibles bias, current user bias, or persistent user bias.3,6
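The depletion of susceptible patients can be illustrated with a small calculation. The sketch below is not drawn from any study; it assumes a hypothetical population in which 30% of patients are susceptible to antidepressant-associated manic switch (hazard 1.0 per year) and the rest are not (hazard 0.1 per year), and it assumes that patients who switch leave the pool of current users. The function name and all numbers are illustrative.

```python
import math

def mean_hazard_among_current_users(years_on_drug, p_susceptible=0.3,
                                    hazard_susceptible=1.0, hazard_other=0.1):
    """Average switch hazard among patients still on the drug after a given time.

    Hypothetical two-subgroup model: patients who switch leave the user pool,
    so susceptible patients are progressively depleted from it.
    """
    still_susceptible = p_susceptible * math.exp(-hazard_susceptible * years_on_drug)
    still_other = (1 - p_susceptible) * math.exp(-hazard_other * years_on_drug)
    share_susceptible = still_susceptible / (still_susceptible + still_other)
    return (share_susceptible * hazard_susceptible
            + (1 - share_susceptible) * hazard_other)

new_users = mean_hazard_among_current_users(0)        # follow-up starts at initiation
prevalent_users = mean_hazard_among_current_users(3)  # already on the drug for 3 years
print(round(new_users, 3), round(prevalent_users, 3))
```

Under these assumptions, the average hazard falls from 0.37 per year among new users to about 0.125 per year among patients who have already been on the drug for 3 years, although the drug acts identically in both groups. A study that enrolls prevalent users would therefore see an artificially low risk.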
Immortal Time Bias
In the example above, to avoid a prevalent user bias, we decide that we will not use a calendar date as our study start date. Instead, we identify antidepressant-naïve bipolar patients and set the start date as the date of diagnosis of their first episode of bipolar depression. We now follow these patients in the health care database till December 31, 2020. We compare patients who started antidepressant treatment within 6 months of bipolar depression diagnosis with those who did not start antidepressant treatment to determine whether antidepressant initiation is associated with switch into mania. In all other regards, the study design is the same as that described above.
This study appears better designed than the previous one but suffers from immortal time bias because patients in the antidepressant group would not have experienced a manic switch between the date of diagnosis of depression and the date of starting antidepressant treatment (had they experienced a manic switch, they would not have been prescribed an antidepressant drug). So, the patients in the antidepressant group were immortal to the study outcome for up to 6 months, by design. The comparison group would not have enjoyed any period of immortality.
Immortal time bias occurs when follow-up starts before patients are eligible for the trial and are assigned to their treatment.3 Immortal time bias can be avoided by synchronizing time zero for treatment allocation in the groups being compared.
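A small simulation can make the bias visible. The sketch below is illustrative only: it assumes a constant hazard of manic switch (0.5 per year) that is identical with and without treatment, so the true rate ratio is 1, and it assumes that half the patients intend to start an antidepressant at a random time within 6 months of diagnosis. The "aligned" analysis shown here uses simple person-time reallocation (pretreatment time is counted as untreated time), which is one way of synchronizing time zero and is not the full TTE machinery. All names and parameter values are hypothetical.

```python
import random

def simulate(n=20000, hazard=0.5, follow_up=2.0, window=0.5, seed=1):
    """Return (naive_rate_ratio, aligned_rate_ratio) when treatment has no effect."""
    rng = random.Random(seed)
    # Each cell holds [events, person-years].
    naive = {"treated": [0, 0.0], "untreated": [0, 0.0]}
    aligned = {"treated": [0, 0.0], "untreated": [0, 0.0]}
    for _ in range(n):
        t_event = rng.expovariate(hazard)            # time to manic switch
        intends = rng.random() < 0.5                 # plans to start treatment
        t_start = rng.uniform(0.0, window) if intends else None
        # Treatment is started only if no switch has occurred before t_start.
        treated = t_start is not None and t_event > t_start
        t_end = min(t_event, follow_up)
        had_event = t_event <= follow_up

        # Naive analysis: group fixed by eventual treatment, clock starts at diagnosis.
        group = "treated" if treated else "untreated"
        naive[group][0] += had_event
        naive[group][1] += t_end

        # Aligned analysis: time before treatment start counts as untreated time.
        if treated:
            aligned["untreated"][1] += t_start
            aligned["treated"][0] += had_event
            aligned["treated"][1] += t_end - t_start
        else:
            aligned["untreated"][0] += had_event
            aligned["untreated"][1] += t_end

    def rate_ratio(table):
        treated_rate = table["treated"][0] / table["treated"][1]
        untreated_rate = table["untreated"][0] / table["untreated"][1]
        return treated_rate / untreated_rate

    return rate_ratio(naive), rate_ratio(aligned)

naive_rr, aligned_rr = simulate()
print(f"naive rate ratio: {naive_rr:.2f}, aligned rate ratio: {aligned_rr:.2f}")
```

In this setup the naive rate ratio falls well below 1, making the antidepressant look protective, whereas the aligned rate ratio stays close to the true value of 1. The spurious protection arises entirely from the event-free time that the treated group is guaranteed by design.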
Target Trial Emulation
As already stated, TTE is a research design that attempts to create an RCT structure within an available set of observational data. The hypothetical RCT that ideally answers the research question is the target trial to be emulated. The source of the observational data is any medical record system, including health care databases, insurance databases, national registries, or disease or other specific registries. TTE is a quasi-experimental research design,11 and randomization is emulated by adjustment for covariates and confounds.12 With proper emulation, TTE can overcome the limitations of conventional analyses, and the biases listed in the previous sections.
The first step in TTE is to outline the target trial; that is, the protocol for the gold standard RCT that addresses the research question. The second step is to outline the TTE that resembles the RCT; thus, the protocol for the TTE is prespecified. The third and final step is to extract the data from the source and to analyze the data, following the specifications in the TTE protocol.
Critical elements that need to be defined in the target RCT and emulated in the TTE are presented in Table 1. Key differences between TTEs and observational studies, quasi-experimental studies, and RCTs are presented in Table 2. Readers who want more detailed discussions can consult the sources cited earlier.3–8
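The definitions of prevalent user bias and immortal time bias given earlier both reduce to a mismatch between eligibility, treatment assignment, and the start of follow-up. The sketch below encodes that check for a prespecified protocol. It is a hypothetical helper written for illustration, not part of any TTE software; the class, field names, and warning texts are assumptions.

```python
from dataclasses import dataclass

@dataclass
class TimeZeroSpec:
    """Study days (relative to any fixed origin) for three protocol elements."""
    eligibility_day: int      # day on which eligibility criteria are met
    assignment_day: int       # day on which the treatment strategy is assigned
    follow_up_start_day: int  # day on which outcome counting begins

def alignment_warnings(spec):
    """List the biases that a misaligned time zero could introduce."""
    warnings = []
    if spec.follow_up_start_day > spec.assignment_day:
        warnings.append("follow-up starts after assignment: "
                        "prevalent user bias possible")
    if spec.follow_up_start_day < max(spec.eligibility_day, spec.assignment_day):
        warnings.append("follow-up starts before eligibility or assignment: "
                        "immortal time bias possible")
    return warnings

# Calendar-date design: treatment began on day 0, outcomes counted from day 400.
print(alignment_warnings(TimeZeroSpec(0, 0, 400)))
# Diagnosis-date design: outcomes counted from day 0, treatment assigned on day 90.
print(alignment_warnings(TimeZeroSpec(0, 90, 0)))
# Aligned design: all three coincide, so no warning is raised.
print(alignment_warnings(TimeZeroSpec(14, 14, 14)))
```

Writing the three time points down explicitly, before the data are extracted, is the practical benefit of specifying the target trial first.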
Benzodiazepines and z-Hypnotics in Pregnancy
For ethical and other reasons, RCTs of neuropsychiatric drugs are rare in pregnancy. There is therefore uncertainty about the antenatal and perinatal safety of these drugs, including uncertainty about whether gestational exposure to these drugs affects neurodevelopment during childhood and adolescence. In this context, Sundbakk et al9 described 3 TTE studies that examined scholastic skills in children who had been gestationally exposed to benzodiazepine and z-hypnotic drugs. Separate TTEs examined outcomes after early (until gestational week 16), mid (between weeks 17 and 28), and late (between week 29 and the end of pregnancy) gestational exposure. For simplicity, only what the authors did is described, and not the target trial and how it was emulated; these details were presented by the authors in their paper and should be reasonably obvious from the description that follows.
The samples for the 3 TTEs were drawn from the Norwegian Mother, Father and Child (MoBa) cohort study and from linked national medical registers. Women were eligible if they were recruited during 2002–2008, if they had a singleton pregnancy, if they had a self-reported history of anxiety or depression (or self-reported use of antidepressant drugs) before pregnancy, and if they had completed relevant MoBa questionnaires.
For each study, exposure was based on treatment initiation with a benzodiazepine or z-drug during the window that defined that study, with no exposure to these drugs prior to that window. For each study, time zero was the start of the eligibility period and the time of possible initiation of exposure; that is, the start of the window that defined that study. Thus, time zero was gestational week 0, week 17, and week 29 for the early, mid, and late pregnancy exposure studies, respectively.
Groups were defined by exposure. Exposed pregnancies were those with exposure during the gestational window that defined that study but no exposure earlier during that pregnancy. Unexposed (control) pregnancies were those with exposure neither during the window that defined that study nor earlier.
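The grouping rule described above can be written as a short function. This is a sketch based only on the description in this article, not the authors' code; it assumes that each pregnancy is summarized by the gestational week of first exposure (or no exposure at all), and the names WINDOWS and classify_pregnancy are hypothetical.

```python
# Gestational windows (in weeks) as described in the text; None means end of pregnancy.
WINDOWS = {"early": (0, 16), "mid": (17, 28), "late": (29, None)}

def classify_pregnancy(first_exposure_week, window):
    """Classify one pregnancy for one of the 3 emulated trials.

    first_exposure_week: gestational week of first benzodiazepine or z-drug
    exposure in this pregnancy, or None if there was no exposure.
    """
    start, end = WINDOWS[window]
    if first_exposure_week is None:
        return "unexposed"
    if first_exposure_week < start:
        return "excluded"      # exposed before the window opened
    if end is None or first_exposure_week <= end:
        return "exposed"       # first exposure falls inside the window
    return "unexposed"         # first exposure occurred only after the window

for week in (10, 20, 30, None):
    print(week, {name: classify_pregnancy(week, name) for name in WINDOWS})
```

Note the last branch: a pregnancy first exposed at week 30 counts as unexposed in the early and mid pregnancy emulations, because the rule looks only at the window and the time before it.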
Children were followed until assessment at the national fifth-grade tests, and the outcomes were the numeracy and literacy scores in these tests. The data were analyzed with adjustment for baseline covariates by inverse probability of treatment weights using propensity scores to emulate randomization. Covariates that were adjusted for included maternal age, education, socioeconomic status, parity, smoking and alcohol use, body mass index, sleeping problems, medical conditions, other medication use, and self-reported anxiety or depression, among others.
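Inverse probability of treatment weighting can be shown with a toy example. The sketch below is a deliberate simplification: it uses a single hypothetical binary covariate and estimates the propensity score as the proportion exposed within each covariate stratum, whereas a real analysis would model many covariates jointly. The data are invented so that exposure has no effect on the test score; any crude difference is due to confounding alone.

```python
from collections import defaultdict

def crude_mean_difference(rows):
    """Unadjusted difference in mean outcome, exposed minus unexposed."""
    exposed = [y for _, t, y in rows if t == 1]
    unexposed = [y for _, t, y in rows if t == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def iptw_mean_difference(rows):
    """Weighted difference in mean outcome using stratum-based propensity scores.

    rows: list of (stratum, treated, outcome) with treated coded 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])          # stratum -> [n, n_treated]
    for stratum, treated, _ in rows:
        counts[stratum][0] += 1
        counts[stratum][1] += treated
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}         # arm -> [sum of w*y, sum of w]
    for stratum, treated, outcome in rows:
        n, n_treated = counts[stratum]
        propensity = n_treated / n                # P(treated | stratum)
        weight = 1 / propensity if treated else 1 / (1 - propensity)
        sums[treated][0] += weight * outcome
        sums[treated][1] += weight
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]

# Hypothetical data: stratum 1 (for example, sleeping problems) has lower scores
# and a higher chance of exposure; exposure itself changes nothing.
rows = ([(0, 0, 50)] * 8 + [(0, 1, 50)] * 2 +
        [(1, 0, 40)] * 5 + [(1, 1, 40)] * 5)
print(crude_mean_difference(rows), iptw_mean_difference(rows))
```

The crude comparison suggests that exposed children score about 3 points lower, while the weighted comparison returns a difference of essentially zero. Weighting achieves balance only on covariates that were measured and included, which is why the emulation of randomization remains imperfect.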
The exposed and unexposed samples comprised 197 and 7,598 pregnancies for the early pregnancy TTE, 34 and 6,651 pregnancies for the mid pregnancy TTE, and 24 and 5,719 pregnancies for the late pregnancy TTE.
Important findings from the TTEs9 are presented in Table 3. The results suggested that, regardless of period of exposure during pregnancy, benzodiazepines and z-drugs do not significantly affect fifth-grade numeracy and literacy performance.
Limitations of the MoBa TTEs
The MoBa TTEs9 suffered from many limitations, the most obvious of which was the very small number of exposed pregnancies, especially for the mid (n = 34) and late (n = 24) pregnancy TTEs. Another limitation was that for the early and mid pregnancy TTEs, pregnancies were considered unexposed if there was no exposure during the study time window even if there was exposure during that pregnancy but after that time window. This assumes, without justification, that later exposure does not affect outcomes, potentially compromising the internal validity of the TTEs. Finally, as in most other studies of this nature, there was no consideration of post-recruitment bias (in RCTs, postrandomization bias).13 Examples of such bias are pregnancy complications and early childhood adversities.
Antidepressants in Bipolar Depression
The use of antidepressant drugs in bipolar depression is discouraged because of concerns that it could result in roughening of the course of the illness, manic switch, and cycle acceleration. Yet, some RCTs have failed to identify such risks, and the use of antidepressants in bipolar depression is widely prevalent, especially under cover of a mood stabilizer or atypical antipsychotic drug. The safety of antidepressants in bipolar depression is hard to address in an RCT because of the sample size necessary for an adequately powered study, the duration of follow-up required for a sufficient number of events to accrue, and the inevitability of dropouts during follow-up, making detection of events difficult.
In this context, Rohde et al10 described a TTE that examined the risk of antidepressant-induced mania in patients with bipolar depression. Only what the authors did is described, and not the target trial and how it was emulated; this information was presented by the authors in their paper and should be understandable from the description that follows.
The data were drawn from nationwide Danish health registers. Eligible subjects were adult inpatients, discharged with a first diagnosis of bipolar depression, who did not have a diagnosis of bipolar depression in the previous 2 years, who did not use antidepressant medication in the previous 2 years, and who did not have a schizophrenia spectrum disorder diagnosis. Out of 7,877 subjects, 979 were found eligible. These subjects were assigned to treated (n = 358) or untreated (n = 621) groups based on redemption of a prescription for an antidepressant within 2 weeks of discharge.
Time zero was the time of eligibility and treatment assignment; that is, 2 weeks after discharge. Patients were followed for 1 year from time zero with the primary outcome being admission for hypomania or mania. In a sensitivity analysis, admission for mixed episode was added as a primary outcome. Admission for bipolar depression was examined as a secondary outcome.
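The assignment rule and time zero described above can be sketched as follows. This is an illustration based on the description in this article, not the authors' code; the function name, the inclusive handling of the 14-day boundary, and the use of 365 days for 1 year are assumptions.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=14)  # window after discharge for redeeming a prescription
FOLLOW_UP = timedelta(days=365)    # 1 year of follow-up from time zero

def assign_patient(discharge, first_redemption=None):
    """Assign a patient to a group and set time zero at the end of the grace period."""
    time_zero = discharge + GRACE_PERIOD
    treated = (first_redemption is not None
               and discharge <= first_redemption <= time_zero)
    return {
        "group": "treated" if treated else "untreated",
        "time_zero": time_zero,
        "follow_up_end": time_zero + FOLLOW_UP,
    }

# Redeemed 9 days after discharge: treated.
print(assign_patient(date(2015, 3, 1), date(2015, 3, 10)))
# Redeemed 50 days after discharge: stays in the untreated group (intent-to-treat).
print(assign_patient(date(2015, 3, 1), date(2015, 4, 20)))
```

Because follow-up starts for both groups on the same day, after group membership is already known, neither group accrues a guaranteed event-free period relative to the other.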
The data were analyzed in an intent-to-treat model using Cox proportional hazards regression, and randomization was emulated by adjustment for age, sex, calendar year, education, marital status, occupation status, number of previous outpatient visits and inpatient admissions, medical comorbidity, previous substance use, other psychiatric disorders, previous admissions for hypomania, mania, or mixed episodes, use of sedative/ hypnotic drugs, and severity of the index depressive episode.
Important findings from the study10 are presented in Table 4. In summary, in no analysis was antidepressant treatment associated with an increased risk of hypomania, mania, or mixed episodes. Antidepressant drugs did not protect against recurrence of bipolar depression, either.
Limitations of the TTE of Antidepressants in Bipolar Depression
The TTE10 did not censor or exclude the comparison group patients (26.4%) who redeemed prescriptions for an antidepressant drug during follow-up; the analysis, as performed, could have biased the findings of the study towards the null. In all analyses, the number of events was small and so the fully adjusted models may have suffered from overfitting. The TTE could not examine roughening of the course of illness, or mood disturbance not requiring admission, as antidepressant-associated adverse outcomes. The TTE could not examine findings in bipolar I vs bipolar II subgroups. The latter 2 limitations were related to unavailability of the relevant data.
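The direction of the bias from comparison group patients who later started an antidepressant can be seen with simple arithmetic. The sketch below uses the 26.4% figure from the study but hypothetical risks (10% with treatment, 5% without), and it makes the crude assumption that patients who cross over carry the treated risk for the whole of follow-up, which overstates the dilution. The function name is hypothetical.

```python
def observed_risk_ratio(risk_treated, risk_untreated, crossover_fraction):
    """Risk ratio seen when part of the comparison group is actually treated."""
    diluted_comparison_risk = ((1 - crossover_fraction) * risk_untreated
                               + crossover_fraction * risk_treated)
    return risk_treated / diluted_comparison_risk

true_rr = observed_risk_ratio(0.10, 0.05, 0.0)       # no crossover
observed_rr = observed_risk_ratio(0.10, 0.05, 0.264)  # 26.4% crossover
print(round(true_rr, 2), round(observed_rr, 2))
```

Under these assumptions, a true risk ratio of 2.0 would be observed as roughly 1.58. Any real harm or benefit is pulled towards 1, which is what bias towards the null means.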
General Limitations of TTEs
TTEs address immortal time bias, prevalent user bias, and other biases that limit conventional research designs that examine observational data. TTEs also improve the understanding of the treatment environment to which the results can be generalized. However, TTEs have limitations, some of which are not well acknowledged.
First, as would have been apparent from previous sections, the sample of eligible participants becomes small when target trial eligibility criteria are applied; this reduces statistical power. Second, TTE data extracted from records are unlikely to meet RCT standards for reliability and validity. That is, they are likely to be imprecise, and imprecision in diagnosis, treatment details, and assessment of outcomes will blur the values of key variables. The resultant statistical noise will further compromise statistical power. This is a limitation of all retrospective studies and not TTEs alone, but impacts TTEs more because of the sample size attenuation. This limitation also means that power calculation procedures that work well for RCTs will overestimate power in TTEs.
Third, postrandomization biases are more difficult to control and identify when data are extracted from records than when data are collected prospectively in RCTs. Fourth, TTEs cannot emulate subject-blinding and assessor-blinding, nor can they emulate placebo controls. Fifth, follow-up is unlikely to be as rigorous in a TTE as in an RCT. Sixth, data are more likely to be missing in TTEs than in RCTs. All of these are also limitations of traditional retrospective observational studies.
Finally, emulation of randomization in a TTE is irremediably compromised by confounding by severity of indication and by inadequately measured, unmeasured, and unknown confounds; admittedly, these are limitations of all non-randomized studies, and not TTEs alone. Thus, optimism notwithstanding,4,6 TTEs may be viewed as pragmatic, naturalistic, real-world emulations of RCTs, with some advantages over conventional observational studies, but they cannot drive causal inference.
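The first limitation, loss of power when the exposed group is small, can be quantified approximately. The sketch below applies the usual two-sample formula for a difference in means to the group sizes of the 3 pregnancy TTEs described earlier; it expresses the smallest difference detectable with 80% power at a 2-sided alpha of 0.05 in standard deviation units. It ignores weighting and measurement error, both of which would make matters worse, so the values are optimistic. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimal_detectable_difference(n_exposed, n_unexposed, sd=1.0,
                                  alpha=0.05, power=0.80):
    """Smallest true mean difference detectable with the stated power."""
    z = NormalDist()
    standard_error = sd * sqrt(1 / n_exposed + 1 / n_unexposed)
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * standard_error

# Exposed and unexposed pregnancies in the early, mid, and late pregnancy TTEs.
for label, n1, n0 in [("early", 197, 7598), ("mid", 34, 6651), ("late", 24, 5719)]:
    print(label, round(minimal_detectable_difference(n1, n0), 2))
```

With 197 exposed pregnancies, a difference of about 0.2 standard deviations is detectable; with 24, the difference must approach 0.6 standard deviations. The very large unexposed groups barely help, because the standard error is dominated by the smaller group.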
Other Limitations of TTEs
TTEs cannot study interventions that have not yet become available for clinical use because no information about these will be available in health care databases. Likewise, TTEs cannot study novel dosing or unusual treatment strategies, or use of a treatment for a new indication, if information for these does not exist in health care databases. TTEs cannot emulate disease vs healthy control or biomarker present vs absent designs because subjects cannot be randomized to these groups; so, there is no RCT to emulate (however, the principles of TTE can certainly be applied in these contexts).
Take-Home Messages
Early, mid, or late pregnancy exposure to benzodiazepines or z-drugs was not associated with impairment in fifth-grade numeracy and literacy performance. In patients who were discharged after treatment for bipolar depression, antidepressant drug prescription (with or without concurrent mood stabilizers) within 2 weeks of discharge did not increase the 1-year risk of hypomania, mania, or mixed episodes, nor did it reduce the risk of bipolar depression recurrence. The TTE studies that yielded these results had specific limitations over and above the general limitations of TTEs, and so these conclusions are suggestive, not definitive.
Article Information
Published Online: February 12, 2025. https://doi.org/10.4088/JCP.25f15796
© 2025 Physicians Postgraduate Press, Inc.
To Cite: Andrade C. Target trial emulation: a concept simply explained. J Clin Psychiatry. 2025;86(1):25f15796.
Author Affiliations: Department of Psychiatry, Kasturba Medical College, Manipal Academy of Higher Education, Manipal, India; Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bangalore, India ([email protected]).
Relevant Financial Relationships: None.
Funding/Support: None.
Each month in his online column, Dr Andrade considers theoretical and practical ideas in clinical psychopharmacology with a view to update the knowledge and skills of medical practitioners who treat patients with psychiatric conditions.
Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bangalore, India. Please contact Chittaranjan Andrade, MD, at Psychiatrist.com/contact/andrade.
References (13)
- Hernán MA, Alonso A, Logan R, et al. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology. 2008;19(6):766–779.
- Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–764.
- Hernán MA, Sauer BC, Hernández-Díaz S, et al. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol. 2016;79:70–75.
- Hernán MA, Wang W, Leaf DE. Target trial emulation: a framework for causal inference from observational data. JAMA. 2022;328(24):2446–2447.
- Matthews AA, Danaei G, Islam N, et al. Target trial emulation: applying principles of randomised trials to observational studies. BMJ. 2022;378:e071108.
- Fu EL. Target trial emulation to improve causal inference from observational data: what, why, and how? J Am Soc Nephrol. 2023;34(8):1305–1314.
- Hubbard RA, Gatsonis CA, Hogan JW, et al. “Target trial emulation” for observational studies - potential and pitfalls. N Engl J Med. 2024;391(21):1975–1977.
- Honap S, Danese S, Peyrin-Biroulet L. Target trial emulation: improving the quality of observational studies in inflammatory bowel disease using the principles of randomized trials. Inflamm Bowel Dis. 2024 Jun;11:izae131.
- Sundbakk LM, Wood M, Gran JM, et al. Prenatal exposure to benzodiazepine and z-hypnotics and fifth-grade scholastic skills - emulating target trials using data from the Norwegian Mother, Father and Child Cohort Study. Am J Epidemiol. 2025;194(1):73–84.
- Rohde C, Østergaard SD, Jefsen OH. A nationwide target trial emulation assessing the risk of antidepressant-induced mania among patients with bipolar depression. Am J Psychiatry. 2024;181(7):630–638.
- Andrade C. The limitations of quasi-experimental studies, and methods for data analysis when a quasi-experimental research design is unavoidable. Indian J Psychol Med. 2021;43(5):451–452.
- Andrade C. Confounding by indication, confounding variables, covariates, and independent variables: knowing what these terms mean and when to use which term. Indian J Psychol Med. 2024;46(1):78–80.
- Andrade C. Poorly recognized and uncommonly acknowledged limitations of randomized controlled trials. Indian J Psychol Med. 2024;20:02537176241297953.