Studies
PROMISE-1 (NCT02559895) and PROMISE-2 (NCT02974153) were double-blind, randomized, placebo-controlled, parallel-group studies. Study protocols have been published previously [7, 8]. For both clinical trials, the primary endpoint was change in frequency of migraine days over Weeks 1‒12. The ≥ 50% migraine responder rate (MRR), captured over Weeks 1‒12 and 13‒24, was a prespecified secondary endpoint; patients achieving ≥ 50% MRR were those with a reduction of ≥ 50% in the number of MMDs. The ≥ 50% MRR was selected as the endpoint for comparison because it is commonly used to evaluate preventive migraine treatment and is comparable across the studies evaluated here. Because it is expressed relative to baseline, it also normalizes the variation inherent in MMDs. In addition, the ≥ 50% MRR is consistent with the International Headache Society guidelines and the American Headache Society Position Statement on setting realistic expectations with use of advanced preventive therapy [9, 10].
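The responder definition above can be illustrated with a minimal sketch. It assumes each patient is summarized by an average MMD value at baseline and over the interval of interest; how those averages were derived is not described in this section, and the function names and example values are hypothetical.

```python
def percent_reduction(baseline_mmd: float, interval_mmd: float) -> float:
    """Percent reduction from baseline in monthly migraine days (MMDs)."""
    if baseline_mmd <= 0:
        raise ValueError("baseline MMDs must be positive")
    return 100.0 * (baseline_mmd - interval_mmd) / baseline_mmd


def is_50_percent_responder(baseline_mmd: float, interval_mmd: float) -> bool:
    """True if the patient had a reduction of at least 50% in MMDs."""
    return percent_reduction(baseline_mmd, interval_mmd) >= 50.0


def responder_rate(patients) -> float:
    """Proportion of (baseline, interval) MMD pairs meeting the >= 50% threshold."""
    flags = [is_50_percent_responder(b, t) for b, t in patients]
    return sum(flags) / len(flags)


# Hypothetical patients: (baseline MMDs, MMDs over the 12-week interval)
example = [(8.0, 4.0), (16.0, 9.0), (10.0, 2.0), (12.0, 7.0)]
rate = responder_rate(example)
```

Because the threshold is relative to each patient's own baseline, a patient going from 16 to 8 MMDs and one going from 8 to 4 MMDs are both counted as responders.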
PROMISE-1 included patients with episodic migraine (EM) and randomized eligible patients to receive eptinezumab 30 mg, 100 mg, 300 mg, or placebo, administered intravenously on Day 0 and every 12 weeks through Week 36 (i.e., up to 4 doses) [8]. Post-dose clinic visits occurred at Weeks 4, 8, 12, 16, 20, 24, 28, 36, 48, and 56. In PROMISE-1, MOH diagnosis and body mass index (BMI) were not used to determine inclusion/exclusion from the study. Baseline MMDs across all treatment groups were ~ 8.6, and baseline MHDs across all treatment groups were ~ 10.1.
PROMISE-2 included patients with chronic migraine (CM) and randomized eligible patients to receive eptinezumab 100 mg, 300 mg, or placebo, administered intravenously on Day 0 and Week 12 (i.e., up to 2 doses) [7]. Post-dose clinic visits occurred at Weeks 2, 4, 8, 12, 16, 20, 24, and 32. Patients with MOH not associated with opioid analgesics or barbiturate compounds were eligible for the study, and 40.5% (286/706) of eptinezumab-treated patients had an MOH diagnosis at baseline (100 mg, n = 139; 300 mg, n = 147). Patients with a BMI ≥ 39 kg/m² at screening were excluded from the study. Baseline MMDs across all treatment groups were ~ 16.1, and baseline MHDs across all treatment groups were ~ 20.4.
Patient subgroups
Patients were divided into subgroups based on patient characteristics, baseline migraine characteristics, and baseline patient-reported outcome responses. Patient characteristics included BMI (i.e., normal/underweight, overweight, obese), age (i.e., > 40 and ≤ 40 years), and MOH diagnosis (i.e., yes or no; PROMISE-2 only). Migraine characteristics consisted of baseline MMDs, baseline MHDs, history of migraine with aura (i.e., yes or no), average headache length (hours), and patient-identified most bothersome symptom (PI-MBS; PROMISE-2 only). Validated patient questionnaires consisted of the EuroQol 5-dimension, 5-level scale (EQ-5D-5L), 36-item Short-Form Health Survey (SF-36; v2.0), and 6-item Headache Impact Test (HIT-6; PROMISE-2 only).
BMI subgroups are in alignment with weight status categories indicated by the US Food and Drug Administration [11], and age subgroups are in alignment with the median age of the US population (38.3 years in 2019) [12]. Thresholds defining MMD and MHD subgroups in the PROMISE-1 study have been used to distinguish between low-frequency and high-frequency episodic migraine [13,14,15,16,17,18,19,20]. Eligibility for PROMISE-2 included ≥ 15 MHDs, including ≥ 8 MMDs; thus, the cutpoints used for those subgroup definitions represented 1 week above those values.
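As an illustration of how the BMI and age subgroups might be assigned, the sketch below uses the conventional adult weight-status cutpoints of 25 and 30 kg/m². These cutpoints are an assumption, since the exact values are not stated in this section; the age split at 40 years follows the text. Function names and labels are hypothetical.

```python
def bmi_subgroup(bmi: float) -> str:
    """Assign a BMI subgroup (cutpoints of 25 and 30 kg/m^2 are assumed)."""
    if bmi < 25.0:
        return "normal/underweight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def age_subgroup(age_years: float) -> str:
    """Assign an age subgroup using the > 40 vs <= 40 years split."""
    return "> 40 years" if age_years > 40 else "<= 40 years"


# Hypothetical patient: BMI 27.3 kg/m^2, age 40 years
labels = (bmi_subgroup(27.3), age_subgroup(40))
```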
Groupings for the PI-MBS were based on those established previously for validation [21, 22]. “Pain related to migraine” included eye pain, headache, pain, pain—anatomical, pain with activity, and throbbing/pulsation; “cardinal/traditional MBS” included nausea/vomiting, sensitivity to light, and sensitivity to sound; and “other symptoms” included allodynia, aura, cognitive disruption, dizziness, fatigue, inactivity, mood changes, neck pain, pressure/tightness, sensitivity to smell, sensory disturbance, sleep disturbance, speech difficulty, vision impacts, multiple, and other.
The EQ-5D-5L consists of five dimensions of health and the patient’s self-rated health [23]. Dimensions include mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. For each dimension, the patient must select one of five responses: no problems, slight problems, moderate problems, severe problems, and extreme problems. Self-rated health is recorded on a visual analogue scale from “the best health you can imagine” (100) to “the worst health you can imagine” (0). For creating subgroups, responses on all dimensions were divided into two distinct groups: responses of no problems and responses of slight, moderate, severe, or extreme problems, representing those with no issues versus those with any issues in the respective domains. For the self-rated health scale, scores were divided into two response categories, > 80 or ≤ 80, based on the normative US population mean [24].
The SF-36 (v2.0) comprises 36 questions that measure functional health and well-being across eight domains [25]. These domains include physical functioning (10 items), role-physical (4 items), bodily pain (2 items), general health (5 items), vitality (4 items), social functioning (2 items), role-emotional (3 items), and mental health (5 items). The individual domains can then be combined to calculate a physical component summary score and a mental component summary score. The SF-36 utilizes a normative-based scoring system, which standardizes all SF-36 domain and component summary scores to a scale from 0 to 100 with a mean of 50 and a standard deviation of 10, designed to be representative of the general US population [25]. To create subgroups, baseline scores were dichotomized into a more favorable health state (> 45.0) versus a less favorable health state (≤ 45.0), with the cutpoint of 45.0 representing half a standard deviation below the normative US population mean of 50.
The HIT-6 (PROMISE-2 only) is a questionnaire used to measure the impact of headaches on the ability to function normally in daily life [26]. It is a 6-question, Likert-type, self-reported questionnaire (response scores: never = 6, rarely = 8, sometimes = 10, very often = 11, and always = 13) that asks: 1) “When you have headaches, how often is the pain severe?”; 2) “How often do headaches limit your ability to do usual daily activities including household work, work, school, or social activities?”; 3) “When you have a headache, how often do you wish you could lie down?”; 4) “In the past 4 weeks, how often have you felt too tired to do work or daily activities because of your headaches?”; 5) “In the past 4 weeks, how often have you felt fed up or irritated because of your headaches?”; and 6) “In the past 4 weeks, how often did your headaches limit your ability to concentrate on work or daily activities?” The total score for the HIT-6 is calculated by summing the individual item scores (score range of 36‒78 points), with scores of ≥ 60 representing severe impact, 56–59 substantial impact, 50–55 some impact, and ≤ 49 little to no impact. HIT-6 total score subgroups were based on the threshold for severe life impact at baseline, and most patients fell into the severe category (eptinezumab 100 mg, 89.6%; eptinezumab 300 mg, 88.6%; and placebo, 87.4%). For each HIT-6 item, patients were divided into two distinct subgroups: those who responded never, rarely, or sometimes (i.e., patients who were less severely impacted) and those who responded very often or always (i.e., patients who were more severely impacted).
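The HIT-6 scoring and subgrouping rules described above can be sketched as follows. The item scores, impact bands, and item-level split are taken from the text; the function names and the example responses are hypothetical.

```python
# Item response scores as listed in the text
HIT6_ITEM_SCORES = {
    "never": 6,
    "rarely": 8,
    "sometimes": 10,
    "very often": 11,
    "always": 13,
}


def hit6_total(responses) -> int:
    """Sum the six item scores (possible range: 36 to 78)."""
    if len(responses) != 6:
        raise ValueError("HIT-6 requires exactly six item responses")
    return sum(HIT6_ITEM_SCORES[r] for r in responses)


def hit6_impact_category(total: int) -> str:
    """Map a total score to the impact bands given in the text."""
    if total >= 60:
        return "severe impact"
    if total >= 56:
        return "substantial impact"
    if total >= 50:
        return "some impact"
    return "little to no impact"


def hit6_item_subgroup(response: str) -> str:
    """Dichotomize a single item response as described for the item subgroups."""
    score = HIT6_ITEM_SCORES[response]  # raises KeyError for unknown responses
    return "more severely impacted" if score >= 11 else "less severely impacted"


# Hypothetical set of six responses
answers = ["sometimes", "very often", "always", "rarely", "never", "always"]
total = hit6_total(answers)             # 10 + 11 + 13 + 8 + 6 + 13 = 61
category = hit6_impact_category(total)  # "severe impact"
```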
Statistical analysis
Eptinezumab 100 mg and 300 mg were compared by calculating the odds ratio of achieving ≥ 50% MRR over Weeks 1‒12 and over Weeks 13‒24 in PROMISE-1 and in PROMISE-2.
The odds ratios and 95% confidence intervals (CIs) were calculated from logistic regression; p-values were not calculated because of the post hoc nature of this analysis. A separate model was fitted for each subgroup, with ≥ 50% migraine responder status as the response variable and treatment (100 mg vs 300 mg) as the predictor variable. If the 95% CI included 1, this was interpreted as indicating no meaningful difference in efficacy between the two doses of eptinezumab.
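For a logistic model with a single binary predictor, the fitted odds ratio equals the cross-product ratio of the 2 × 2 table of treatment by responder status, and a Wald-type CI can be computed on the log-odds scale. The sketch below uses that equivalence rather than fitting a model. It assumes a Wald interval and an arbitrary choice of reference group, neither of which is specified in this section, and the counts shown are hypothetical rather than study data.

```python
import math


def odds_ratio_wald_ci(resp_a: int, n_a: int, resp_b: int, n_b: int,
                       z: float = 1.959964):
    """Odds ratio (group A vs group B) with a Wald 95% CI.

    resp_* are responder counts and n_* are group sizes. The result matches
    a logistic regression with one binary treatment predictor.
    """
    a, b = resp_a, n_a - resp_a   # responders / non-responders, group A
    c, d = resp_b, n_b - resp_b   # responders / non-responders, group B
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


def ci_includes_one(lower: float, upper: float) -> bool:
    """True if the interval contains 1 (no meaningful difference flagged)."""
    return lower <= 1.0 <= upper


# Hypothetical subgroup: 60/100 responders in group A, 40/100 in group B
or_, lo, hi = odds_ratio_wald_ci(60, 100, 40, 100)
flagged = not ci_includes_one(lo, hi)
```

In small subgroups a cell count can be zero, in which case this interval is undefined and a different approach would be needed.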
No attempt has been made to control for multiplicity; however, this analysis has been designed to allow for a degree of independent replication. The results from the two studies have been analyzed separately and for the two dosing intervals (Weeks 1‒12 and 13‒24) with the goal of identifying factors observed in both studies versus those seen in only one study. This independent replication may help distinguish random results, which are expected to occur when many repeated comparisons are performed, from real and repeatedly observed effects.
In evaluating the results presented in this manuscript, it is important to understand the statistical properties of the analyses performed. The goal of these analyses is not to posit that 100 mg and 300 mg are identical; there is no clinical or biologic reason to assume exact equivalence of these doses. The goal is instead to determine if meaningful differences might exist. The definition of meaningful is, by necessity, left to the reader. As a data reduction step, this manuscript focuses on cases in which the CI of the odds ratio fails to cover 1. However, from the point of view of false-positive and false-negative findings, situations in which the CI does not cover 1 may be false positives, and cases in which the CI does cover 1 do not indicate that the two treatments are identical. Given that 95% CIs are being used, approximately 1 in 20 CIs is expected to fail to cover 1 even if the “truth” were exact equivalence. The sample size available for each CI also limits the size of effect that can be detected using this 95% interval data reduction approach. To attempt to address these issues, results have been presented for both studies individually and for two separate time points to mimic the concept of experimental replication within the confines of the available data.
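The "1 in 20" expectation can be made concrete with a short calculation. The sketch below assumes the comparisons are independent, which is a simplification because subgroups drawn from the same patients are correlated, and the number of comparisons used in the example is hypothetical.

```python
def expected_false_positives(n_comparisons: int, alpha: float = 0.05) -> float:
    """Expected number of CIs excluding 1 if the doses were truly equivalent."""
    return n_comparisons * alpha


def prob_at_least_one_false_positive(n_comparisons: int,
                                     alpha: float = 0.05) -> float:
    """Chance that at least one CI excludes 1 under equivalence (independence assumed)."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def prob_flagged_in_both_studies(alpha: float = 0.05) -> float:
    """Chance the same comparison is falsely flagged in two independent studies."""
    return alpha * alpha


# Hypothetical: 40 subgroup comparisons in one study and one dosing interval
n = 40
expected = expected_false_positives(n)
any_flag = prob_at_least_one_false_positive(n)
both = prob_flagged_in_both_studies()
```

Under these assumptions a chance finding is common within a single study but much less likely to recur for the same subgroup in a second study, which is the rationale for looking at replication across studies and dosing intervals.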