Identifying menstrual migraine – improving the diagnostic criteria using a statistical method
The Journal of Headache and Pain, volume 20, Article number: 95 (2019)
Abstract
Objective
To develop a robust statistical tool for the diagnosis of menstrually related migraine.
Background
The International Classification of Headache Disorders (ICHD) has diagnostic criteria for menstrual migraine within the appendix. These include the requirement for menstrual attacks to occur within a 5-day window in at least \(\frac {2}{3}\) of menstrual cycles (\(\frac {2}{3}\)-criterion). While this criterion has been shown to be sensitive, it is not specific. Yet in some circumstances, for example to establish the underlying pathophysiology of menstrual attacks, specificity is also important, to ensure that only women in whom the relationship between migraine and menstruation is more than a chance occurrence are recruited.
Methods
Using a simple mathematical model, a Markov chain, to model migraine attacks, we developed a statistical criterion to diagnose menstrual migraine (sMM). We then analysed a data set of migraine diaries using both the \(\frac {2}{3}\)-criterion and the sMM.
Results
sMM was superior to the \(\frac {2}{3}\)-criterion for varying numbers of menstrual cycles and increased in accuracy with more cycle data. In contrast, the \(\frac {2}{3}\)-criterion showed maximum sensitivity only for three cycles, although specificity increased with more cycle data.
Conclusions
While the ICHD \(\frac {2}{3}\)-criterion is a simple screening tool for menstrual migraine, the sMM provides a more specific diagnosis and can be applied irrespective of the number of menstrual cycles recorded. It is particularly useful for clinical trials of menstrual migraine where a chance association between migraine and menstruation must be excluded.
Introduction
Menstrual migraine
The International Classification of Headache Disorders 3 (ICHD-3) provides diagnostic criteria for menstrual migraine without aura (MM)^{Footnote 1} [1]. As these criteria have not been thoroughly validated, they are placed in the appendix.
The criteria are based on three main features:

1. The type of migraine: migraine without aura (MO);
2. The timing of attacks in relation to menstruation: they should occur during the menstrual window, i.e. the 5 days starting two days before onset of menstruation until the third day of bleeding (i.e. day 1 ± 2); and
3. The frequency of attacks in relation to menstruation: attacks should be present in at least two out of three consecutive menstruations.
The term MM covers two subtypes: A1.1.1 pure menstrual migraine (PMM), and A1.1.2 menstrually related migraine (MRM). Women with PMM have exclusively perimenstrual attacks, while women with MRM have additional non-menstrual attacks. The focus of this research is on MRM, and to some extent, PMM. Here we refer to the above ICHD criteria jointly as the two-out-of-three (\(\frac {2}{3}\))-criterion and to MRM diagnosed by this criterion as \(\frac {2}{3}\)-MRM.
There is evidence to support features 1 and 2, i.e. the migraine type and the timing of the attacks [2–7]. However, the third feature, considering the frequency of attacks on menstrual days, is not statistically sound, although it was originally introduced to rule out spurious association between menstruation and migraine [8–10]. These criticisms remain forceful even when migraine diaries of high quality are available for the patients. A pertinent question is: how can we ensure that attacks with menstruation are not occurring by chance [9]?
It is debated whether MM should be regarded as MO triggered by menstruation, or whether MM constitutes a distinct entity [9, 11, 12]. Indeed, after decades of research, the pathophysiological mechanisms of MM are poorly understood. In order to further penetrate these mechanisms it is crucial that a homogeneous population of patients – where the association between menstruation and migraines is greater than chance – is studied.
Statistical criteria
An inherent shortcoming of the \(\frac {2}{3}\)-criterion is that it is neither sensitive nor specific for a de facto association: it risks including women in whom the association is entirely absent [8, 10] and, conversely, it may exclude women with a clear and statistically significant association. The latter occurs when migraine attacks are less frequent (e.g. women with migraine attacks in every second menstruation and only very rarely outside the menstrual window). Furthermore, it is unclear how the criteria are to be applied to diaries with more than three 5-day menstrual windows.
Partly to address the concern regarding spurious associations, a probability criterion (PC) for MRM was proposed by Marcus et al. [8]. Unfortunately, the PC’s original formalization was mathematically flawed. Later, Barra et al. published a corrected version of the PC, together with a simulation analysis of its test characteristics [10].
The statistical test that underpins the PC from Barra et al. [10] relies on a non-clustering assumption for correct size, i.e. for the criterion’s rate of type I errors. The non-clustering assumption (or the independence-of-attacks assumption) asserts that there is a day-to-day constant and independent probability of migraine that is unaltered by observing headaches. However, this assumption does not hold. Migraine days do cluster. According to the ICHD definition, migraine attacks may last up to three days (72 h) when untreated or unsuccessfully treated [1]. In a recent study^{Footnote 2}, it was shown that about 50% of migraine attacks are expected to span more than one day [13].
The aim of the present work is to develop the PC into a more robust statistical criterion for MRM, which is independent of the clustering of attacks. By focusing on the number of migraine attacks – rather than the number of migraine days falling inside or outside the menstrual window – the simple statistical test (and its interpretation) from Barra et al. can be retained [10]. This leads to a novel and statistically attractive alternative diagnostic criterion for MM: statistical MM (sMM). Furthermore, we analyse a data set of migraine diaries to compare the \(\frac {2}{3}\)-MRM to the sMM, and discuss the differences and their implications for further research on MM. We also assessed the new criterion’s accuracy in a simulation study.
We appreciate that the sMM criterion developed here necessitates somewhat more complicated calculation and bookkeeping of the migraine diaries than the PC from Barra et al. [10], but argue that this trade-off is worthwhile. On this note, some of the material presented over the next sections might appear intimidating to the mathematically untrained. However, the mathematics presented is quite simple, and most readers will be able to understand the formulae and reasoning with some effort. This is not to say that it is easy to penetrate all the details, nor that a quick read-through will suffice for a full understanding. The “Discussion” section therefore begins with a very simplified account of what we have done.
A note on the terminology is warranted. The term MM is taken to mean menstrual migraine, and includes both the pure variant (PMM) and menstrually related migraine (MRM). In this article, \(\frac {2}{3}\)-MRM and sMM (and the PC from Barra et al. and Marcus et al.) denote diagnostic criteria for MRM. However, the sMM criterion can diagnose PMM, since sMM will also classify most migraine diaries displaying PMM as a case of sMM. There is clearly a strong statistical association between menstruation and migraine in women with PMM, and the sMM criterion will identify this.
Methods
Theory
In this paper we will assume that migraine attacks can be modelled by the simple Markov chain model in Fig. 1, as suggested by Barra et al. [13].
Within this framework, MM can be defined as a patient’s tendency to have an increased migraine probability (μ^{M}) during her 5-day menstrual windows, as compared to the non-menstrual migraine probability (μ^{NM}). We may then ask: does the individual patient experience a statistically significant increase in the probability of migraine onset during the menstrual window?
The previous publications on the PC used a very simple exact test (one-sided Fisher’s exact test with mid-p correction [10, 14]) yielding p-values for a null hypothesis of non-association between menstruation and migraine, so that low p-values indicate a likely association between menstrual window days and migraine days (p-values are inherently hard to interpret [15]; we give a precise statement below). In terms of the Markov chain model, the non-clustering assumption is equivalent to μ=δ. But the assumption of non-clustering is empirically false: migraine days do cluster [13].
Here we show that by focusing on when attacks start – that is, estimating individuals’ μ’s based on their headache diaries – we retain most of the simplicity, and all of the statistical rigour, of the PC, while relaxing the non-clustering assumption.
The main points about the criterion we introduce below are as follows. Firstly, the p-values are computed from a patient’s 2×2 table classifying days on which a migraine attack could start as any of the four possible combinations of menstrual vs. non-menstrual and migraine started vs. migraine did not start. Secondly, a one-sided test is employed: we are only interested in patients with an elevated migraine probability during the menstrual window. A two-sided test would be unnecessarily conservative for our purposes, and would furthermore obscure the desired interpretation of the resulting p-value. This p-value can be interpreted thus:

if there is no association between the patient’s migraine attack pattern and the menstrual cycle, so that there is no increased probability of observing migraine attacks on menstrual days, i.e. if
$$\mu^{\mathrm{M}}-\mu^{\text{NM}}\stackrel{\text{\tiny def}}{=}\Delta\mu=0,$$
then the probability of seeing attacks start at least as frequently within the patient’s menstrual windows, compared to outside them, as observed in the patient’s diary, equals p.
Hence, a ‘low p’ means that association between menstruation and migraine is likely.
Consider the excerpt from a hypothetical headache diary given in Fig. 2. The first row records a first day of menstrual bleeding on the fourth day (X), meaning that days 2–6 define a 5-day menstrual window (indicated by shading). In the second row, each day on which migraine was present is indicated (M), i.e. days 2–4, 8 and 9. Counting migraine days within and outside the menstrual window yields that out of the N=9 days (of the excerpt) we count n=5 migraine days in total, k=3 menstrual migraine days, and K=5 menstrual days, for the following contingency table (Fig. 3):
These key figures can then be used to compute the probability of seeing k (or more) migraine days falling within the menstrual window days, given that we have observed a total of N days, out of which n were migraine days by the following formulae:
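Written out with the symbols defined above, and assuming the standard parametrisation of the hypergeometric distribution (the numbering (1) and (2) follows the description in the text), the two formulae take the form:
$$f_{\text{HG}}(k;N,K,n)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}} \tag{1}$$
$$p=\sum_{i\geq k} f_{\text{HG}}(i;N,K,n)-\frac{1}{2}\,f_{\text{HG}}(k;N,K,n) \tag{2}$$
For the excerpt of Fig. 2 (N=9, K=5, n=5, k=3) this would give p = (60+20+1)/126 − 30/126 = 51/126 ≈ 0.40.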
Formula (1) specifies the probability mass function f_{HG} for the hypergeometric distribution: it computes the probability of seeing exactly k migraine days within the menstrual window, given that migraine days are equally likely to occur on any day. Formula (2) gives the p-value we seek: the sum of the probabilities of the values i that are greater than or equal to k. The last term is the mid-p correction, which is justified for our purposes because n itself is random prior to observing each woman’s diary. For a further discussion of this test see Lydersen et al. and Barra et al. [10, 14]. However, the non-clustering assumption is crucial for this test to be of correct size. For an appropriate statistical test the size should be bounded by the preset significance level, so that for a significance level of e.g. α=0.05, the probability of rejecting non-association ought never to exceed 5% on a sample of diaries satisfying the null hypothesis.
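As a minimal sketch of this computation, the mid-p corrected one-sided p-value can be evaluated directly from the four counts N, K, n and k. The function names are illustrative and are not taken from the original analysis (which was carried out in R); the formula assumed is the standard hypergeometric upper tail minus half the probability of the observed count.

```python
from math import comb


def hypergeom_pmf(i, N, K, n):
    """Probability of exactly i 'menstrual' events among n events,
    when K of the N days are menstrual days (hypergeometric pmf)."""
    # Outside the support the probability is zero.
    if i < max(0, n - (N - K)) or i > min(K, n):
        return 0.0
    return comb(K, i) * comb(N - K, n - i) / comb(N, n)


def midp_one_sided(k, N, K, n):
    """One-sided mid-p value: P(X >= k) - 0.5 * P(X = k)."""
    upper = min(K, n)
    tail = sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))
    return tail - 0.5 * hypergeom_pmf(k, N, K, n)


# Counts from the untrimmed diary excerpt (N=9, K=5, n=5, k=3).
p_excerpt = midp_one_sided(k=3, N=9, K=5, n=5)
```

For the excerpt counts the function returns 51/126, about 0.40, so no association would be claimed at any conventional α-level.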
Removing the need for non-clustering – trimming: counting only attack starts
Returning to the Markov chain model, we realize that it can be appropriate to perform the test just discussed if we focus solely on days of the headache diary which correspond to the transition probability μ. Under the null hypothesis, this parameter ought not to be influenced by whether or not a day falls within a menstrual window. Conversely, if μ depends on the menstrual status, then we could hope to detect this by the one-sided test for Δμ=0 versus the alternative hypothesis Δμ>0.
This can be achieved quite straightforwardly by subjecting the headache diaries to what we call trimming. Trimming is illustrated in the bottom panel of Fig. 2. Now, we ignore information from days on which an attack is ongoing, and consider only information from days on which an attack may potentially start. Note that we must also disregard any two days immediately following a migraine attack, and also make sure that only migraine attacks with an identifiable start are included.
The rationale for trimming has been explained in Barra et al. [13] as well as in the guidelines for controlled trials of drugs in migraine, in which the International Headache Society considers that any headache pain from 2–48 h after initial pain freedom should be considered a relapse, i.e. part of the same attack [16]. As a consequence we must count so-called migraine-locked days – i.e. days that are immediately preceded and immediately succeeded by migraine days – as migraine days. For example, if day 3 in Fig. 2 had been recorded in the diary as a non-migraine day which was ‘migraine-locked’ by migraine days recorded on days 2 and 4, then day 3 would be imputed as a migraine day. We refer to Barra et al. for a more detailed exposition of how to map days to Markov chain states, and for a justification for imputing onto migraine-locked days.
By performing this trimming, we may classify the remaining diary days according to the exact same logic as before and, furthermore, revert to using formulae (1) and (2) above. Importantly, this test will have size equal to the chosen α-level, regardless of the behaviour of the δ-transitions.
Returning to Fig. 2, the days removed by trimming are 3–6 and 9 (hatched in the second row). This yields N=4, n=2, K=1, and k=1 for computing the p-value.^{Footnote 3} We now have all the pieces necessary for our proposed statistical MM diagnosis:
Statistical Menstrual Migraine – sMM(α)

1. Migraine without^{Footnote 4} aura;
2. A trimmed (migraine-locked free) headache diary’s one-sided Fisher’s exact mid-p corrected p-value <α on a test of Δμ=0.
This diagnosis is properly a family of diagnoses: any α<0.5 defines a possible cut-off, hence e.g. sMM(0.1) means that an α-level of 0.1 has been employed – more on this in the empirical part of the study.
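The trimming step that feeds this criterion can be sketched as follows. This is an illustrative reading of the description above, not the authors’ code: the function names are hypothetical, a diary is assumed to be given as two equally long lists of booleans (migraine day, menstrual-window day), and a migraine recorded on the very first diary day is treated as an attack without an identifiable start.

```python
def impute_locked(migraine):
    """Mark 'migraine-locked' days (free day between two migraine days) as migraine."""
    filled = list(migraine)
    for d in range(1, len(migraine) - 1):
        if not migraine[d] and migraine[d - 1] and migraine[d + 1]:
            filled[d] = True
    return filled


def trimmed_counts(migraine, window, washout=2):
    """Return (N, n, K, k) counted on days where an attack could start.

    N: retained days, n: attack starts, K: retained menstrual-window days,
    k: attack starts inside a menstrual window.
    """
    m = impute_locked(migraine)
    N = n = K = k = 0
    blocked = 0  # post-attack days still to be skipped
    for d, is_migraine in enumerate(m):
        if is_migraine:
            blocked = washout
            if d == 0 or m[d - 1]:
                continue  # ongoing attack, or no identifiable start
            start = 1
        else:
            if blocked > 0:
                blocked -= 1
                continue  # within the two days following an attack
            start = 0
        N += 1
        n += start
        K += int(window[d])
        k += start * int(window[d])
    return N, n, K, k


# Diary excerpt: days 1-9, migraine on days 2-4, 8 and 9; menstrual window days 2-6.
migraine = [False, True, True, True, False, False, False, True, True]
window = [False, True, True, True, True, True, False, False, False]
counts = trimmed_counts(migraine, window)
```

On the excerpt this reproduces N=4, n=2, K=1 and k=1; a diary would then be classified as sMM(α) if the mid-p corrected one-sided p-value computed from these counts falls below α.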
Data
We used a data set of headache diaries from 165 women attending the City of London Migraine Clinic during the period 1996–1998; details on this data set have been published previously [4]. Importantly, none of the women were using hormonal contraception, all initial diagnoses of migraine-type headache were set by headache experts, and only records with a minimum of three consecutive menstrual cycles were included in our study; other characteristics of the migraine episodes (e.g. laterality) were not relevant for the method being developed here, and were not analysed.
We computed the length (in days) of each menstrual cycle, and the individual mean cycle lengths. Cycles of duration longer than twice that woman’s (individual) mean cycle length were assumed to represent missing data, and the respective portions of the headache diaries were omitted. For example, if a woman displayed cycle lengths of (28, 28, 80, 28, 28, 28) days, we retained only the latter three cycles in the final analysis; in the case (28, 28, 80, 28, 28) the entire diary was excluded, as three consecutive cycles were not extractable. We imputed migraines on any migraine-locked days. Furthermore, to ensure that no migraines were erroneously registered as within or outside a menstrual window, all diaries were truncated at 15 days prior to the first, and 15 days post the last, registered menstrual bleeding. We computed descriptive statistics (means, medians, interquartile ranges (IQR)) for the number of cycles, migraine days and attacks, and migraine-locked days, both for the individual women and for the pooled data.
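The cycle-length rule can be illustrated with a short sketch. The function name is hypothetical, and returning every run of at least three consecutive usable cycles is an assumption; the text only states what was done in the two examples given.

```python
def usable_runs(cycle_lengths, min_run=3):
    """Return lists of indices of consecutive cycles that are not longer than
    twice the woman's own mean cycle length, keeping runs of >= min_run cycles."""
    mean_len = sum(cycle_lengths) / len(cycle_lengths)
    runs, current = [], []
    for i, length in enumerate(cycle_lengths):
        if length > 2 * mean_len:
            # Atypically long cycle: treated as missing data, closes the current run.
            if len(current) >= min_run:
                runs.append(current)
            current = []
        else:
            current.append(i)
    if len(current) >= min_run:
        runs.append(current)
    return runs


kept = usable_runs([28, 28, 80, 28, 28, 28])   # the last three cycles remain
excluded = usable_runs([28, 28, 80, 28, 28])   # no run of three cycles remains
```

In the first example the mean is about 36.7 days, so the 80-day cycle exceeds the threshold of about 73.3 days and splits the diary; only the trailing run of three cycles is long enough to keep.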
Diagnosing
Each woman was diagnosed by each of the two methods: the \(\frac {2}{3}\)-MRM and the sMM. The \(\frac {2}{3}\)-MRM diagnoses were set by an algorithm which verified that a migraine attack started within \(\geq \frac {2}{3}\) of the menstrual windows. Furthermore, an sMM p-value was computed for each patient based on her trimmed diary.
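A minimal sketch of such a \(\frac {2}{3}\) check is given below. The function name and the input format (one boolean per recorded menstrual window, true if an attack started in that window) are assumptions for illustration, as is the choice to apply the fraction over all recorded windows rather than over selected triples of cycles.

```python
def two_thirds_mrm(window_has_onset):
    """True if an attack started in at least two thirds of the menstrual windows."""
    n_windows = len(window_has_onset)
    if n_windows == 0:
        return False
    hits = sum(1 for onset in window_has_onset if onset)
    # Integer comparison avoids floating-point issues with 2/3.
    return 3 * hits >= 2 * n_windows


three_cycles = two_thirds_mrm([True, True, False])        # 2 of 3 windows
four_cycles = two_thirds_mrm([True, True, False, False])  # 2 of 4 windows
```

The two calls illustrate the discreteness of the rule: attacks in the first two of three windows satisfy it, whereas the same two attacks followed by two attack-free windows do not.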
Analysis
We compared the subgroups of patients diagnosed with each of the two diagnoses, considering various levels of α as a cutoff. Descriptive statistics were computed for each group for comparison. Empirical parameters for the Markov chain (μ, μ^{NM}, μ^{M}) were estimated from the data.
The specificity of the test is 1−α, where α is the chosen level – by construction. The sensitivity of the test depends on numerous circumstances, but clearly increases in both Δμ=μ^{M}−μ^{NM} and the number of days/menstrual cycles in a diary [10].
Since a true ‘gold standard’ for MM does not exist, we conducted a simulation study to explore the two criteria’s test characteristics by ROC curve analysis and AUC scoring [17, 18]. The idea here is to exploit the Markov chain model so that we can generate two sample populations, one of true positives and one of true negatives. The Markov chain model was populated by sampling from the empirical distributions of μ’s, drawing from the patients who were diagnosed with both \(\frac {2}{3}\)-MRM and sMM(0.1) for simulating true positives (μ^{M} and μ^{NM}), and from patients receiving neither diagnosis for simulating true negatives (μ). We simulated 10 000 diaries containing three menstrual windows for 28-day cycles (23 + 5 days) together with 10 days into the fourth cycle, for each category. Each diary was diagnosed for sMM(0.1) and \(\frac {2}{3}\)-MRM, and sensitivity and specificity were computed. Accompanying ROC curve plots were also generated. This simulation was repeated for 4–9 cycle diaries.
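To make the simulation set-up concrete, a single diary could be generated as sketched below. This is an assumption-laden illustration rather than the procedure actually used: it assumes a two-state chain in which μ is the probability of a migraine day after a migraine-free day and δ the probability of a migraine day after a migraine day, it lays out each cycle as 23 non-window days followed by 5 window days, and the parameter values in the example call are arbitrary rather than estimates from the data.

```python
import random


def window_mask(n_cycles=3, cycle_len=28, window_len=5, tail=10):
    """Menstrual-window indicator for n_cycles cycles plus trailing days."""
    mask = []
    for _ in range(n_cycles):
        mask += [False] * (cycle_len - window_len) + [True] * window_len
    return mask + [False] * tail


def simulate_diary(mu_nm, mu_m, delta, mask, rng):
    """Simulate migraine days from a two-state Markov chain.

    mu_nm / mu_m: onset probability outside / inside the menstrual window,
    delta: probability that a migraine day is followed by another migraine day.
    """
    diary, previous = [], False
    for in_window in mask:
        if previous:
            p = delta
        else:
            p = mu_m if in_window else mu_nm
        previous = rng.random() < p
        diary.append(previous)
    return diary


rng = random.Random(2019)  # fixed seed for reproducibility
mask = window_mask()       # 3 * 28 + 10 = 94 days, 15 of them window days
positive = simulate_diary(mu_nm=0.05, mu_m=0.20, delta=0.5, mask=mask, rng=rng)
negative = simulate_diary(mu_nm=0.05, mu_m=0.05, delta=0.5, mask=mask, rng=rng)
```

Setting mu_m equal to mu_nm yields a diary satisfying the null hypothesis Δμ=0 (a simulated true negative), while mu_m greater than mu_nm yields a simulated true positive; each simulated diary can then be passed through both diagnostic procedures.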
All statistical analyses were performed with the statistical software R (v.3.4.0, 2017-04-21) within the RStudio platform; plots were generated with ggplot2 and plotly [19–22].
Ethics
All data were fully anonymised prior to analysis for this study. At the time of data collection (1996—1998) consent was not required for surveillance studies [4].
Results
Descriptive statistics
A total of 46 (27.9%) diaries were excluded: 38 did not contain three consecutive menstrual cycles; 8 contained menstrual cycles of atypical duration; leaving 119 diaries eligible for analysis. A total of 15 358 diary days, 541 menstrual bleeds, 2 153 migraine days, and 1 070 migraine attacks were recorded in the retained data. The women recorded an average of 4.5 menstrual cycles (median = 4; range = 3—15). The median of the individual mean cycle lengths was 28.0 days (mean = 28.8, range = 15—84). See also Table 1.
Comparison of \(\frac {2}{3}\)-MRM and sMM
Among the 119 women, 54 (45.4%) fulfilled the criteria for \(\frac {2}{3}\)-MRM. For sMM the number of women diagnosed depended on the chosen α-level (Fig. 4).
We (arbitrarily) set α=0.1 for diagnosing sMM in the subsequent analyses comparing those who were diagnosed with either/neither \(\frac {2}{3}\)-MRM and/or sMM; see Fig. 5. This α-level seems a reasonable compromise between sensitivity and specificity for MM. However, it is important to note that about 10% of those without an association will then be diagnosed with sMM(0.1): the specificity of sMM equals 1−α by construction.
Summary statistics for the diagnosis-based subgroups are displayed in Table 2.
Women who fulfilled the \(\frac {2}{3}\)-MRM criteria exclusively – i.e. \(\frac {2}{3}\)-MRM but not sMM(0.1) – presented with fewer recorded cycles and elevated overall migraine frequencies, i.e. the typical profile of a candidate false positive. Conversely, the five women who fulfilled the sMM(0.1) criteria exclusively had longer observational lengths, but lower migraine frequency. The group of sMM-exclusive women all had sMM p-values in the range 0.05–0.10, and represent roughly the expected count of false positives given \(\frac {2}{3}\)-MRM as the ‘gold standard’. If, conversely, sMM(0.1) is held as a ‘gold standard’, this suggests that \(\frac {2}{3}\)-MRM is quite sensitive, but unacceptably unspecific.
Figure 6 displays this relationship graphically, and also visualises the differences between the two methods with respect to migraine frequency and the number of cycles recorded.
In an ad hoc subanalysis, we also computed summary statistics for the 27 women with six or more recorded cycles, under the rationale that more information ought to yield more trustworthy estimates. The general trends remained; see Table 3.
Sensitivity–specificity simulation and criteria performance
The results of the simulation analyses are contained in Fig. 7. As expected, both methods tend to perform better as the number of cycles observed in the underlying simulation increases, reflected in an overall increasing AUC value.
We note that the sMM is superior across the simulations of varying number of menstrual windows.
Strikingly, the \(\frac {2}{3}\)-MRM diagnosis loses sensitivity when the number of observed menstrual windows is increased, until the number of cycles reaches the next multiple of three. For the series of simulations involving three, four, and five cycles, we observe an increasing specificity of \(\frac {2}{3}\)-MRM, but an accompanying drop in sensitivity, resulting in an overall deterioration as measured by the AUC value. For six cycles, the AUC value increases, followed by a similar pattern through the seven- and eight-cycle simulations, before the AUC again increases for the nine-cycle simulation. Furthermore, the maximal sensitivity is observed for 3 cycles, revealing this criterion’s inability to convert the additional information into sensitivity for MM.
The sMM, on the other hand, shows the expected monotonic gain in accuracy with increasing information.
Discussion
We have presented a novel statistical criterion, sMM, for diagnosing MM in women: a statistically more robust version of the previously proposed probability criterion [8, 10], which is inappropriate given the empirically observed clustering of migraine days [13].
To remedy this we have developed a methodology for quantifying the probability that a woman’s migraine pattern is associated with her menstrual cycle based on (i) a simple model for the progression of migraine attacks (the Markov chain model), and (ii) standard statistical hypothesis tests (Fisher’s exact test). This method improves on previously suggested criteria by being more accurate (fewer false positives and fewer false negatives were shown in the simulation analysis) and more robust (no dubious assumptions like non-clustering of migraine days). We also saw that the sMM identifies most of the women identified by the ICHD’s \(\frac {2}{3}\)-MRM criterion, but is more restrictive, in particular with regard to women with a relatively elevated number of migraine days per 30 days. This might mean that the \(\frac {2}{3}\)-MRM criterion yields unacceptably many false positives. We also saw that sMM was able to establish association for a few women who did not satisfy the \(\frac {2}{3}\)-MRM criterion, which highlights that ‘\(\frac {2}{3}\)’ might be arbitrary.
Main findings
We found that women with shorter migraine diaries – in particular those that contained fewer 5-day menstrual windows – paired with increased overall migraine frequency, appeared more likely to be diagnosed with \(\frac {2}{3}\)-MRM than sMM: in about \(\frac {1}{3}\) of those fulfilling the \(\frac {2}{3}\)-criterion, the association between migraine and menstruation was weak or even absent as measured by sMM(0.1), suggesting that the current criteria are quite unspecific.
The ICHD \(\frac {2}{3}\)-criterion is also ambiguous, particularly when considering extended diary data over a number of menstrual cycles; some women may fulfil the diagnostic criteria for some periods during the total period of observation, but not during other periods. For example, a woman with four cycles and migraine in the first two will not fulfil the criteria because of the 4th cycle. If she had only recorded 3 cycles, she would be diagnosed with MM. Furthermore, a serious deficiency with the \(\frac {2}{3}\)-MRM criterion is the discrete nature of the test, and the arbitrary cut-off ‘two-out-of-three’. As demonstrated in the simulation study, this feature makes the \(\frac {2}{3}\)-criterion unable to exploit information gained in e.g. four- or five-cycle diaries; instead there is an implicit trade-off between sensitivity and specificity which is controlled by the number of recorded cycles, rather than by the researcher or clinician. It is beyond the scope of this work to investigate this further, but these results suggest that one could choose cut-off values other than \(\frac {2}{3}\), depending on the number of cycles recorded, to partially ameliorate this situation.
Why do we need an alternative diagnostic tool?
Menstrual migraine is still a disorder characterized by large knowledge gaps. The pathophysiology is incompletely understood and consequently few high-quality studies on medical treatment are available. Most of the current treatment strategies are based on the assumption that oestrogen withdrawal is a direct or indirect trigger, while other possible mechanisms have received little attention.
To pin down the pathophysiological mechanisms responsible for MM, we need a homogeneous group of women in whom the association between migraine attacks and menstruation is proven, preferably at the .05 (or lower) level of significance. In our sample, there were 29 (24%) women with a p-value <.05, and 16 (13%) with a <.01 association; these latter also all fulfilled the \(\frac {2}{3}\)-criterion.
For clinical trials on MM one should be cautious both with respect to sMM and \(\frac {2}{3}\)-MRM: a false association can introduce unwanted noise, while a lower than \(\frac {2}{3}\) frequency in menstrual windows could artificially inflate the measured effect of a prophylactic regime. Since the sMM does not take the regularity of attacks into account, it could be necessary to combine the criteria into a ‘\(\frac {2}{3}\)-sMM’ criterion if the context of diagnosing women calls for both a certain migraine burden and a high confidence in a true association.
Clinically one would want to treat these women, at least on a watchful waiting basis, with possible further headache diary keeping for obtaining better certainty of association.
The proposed criterion is statistically robust in the sense that if sMM is diagnosed even after only two cycles, the accompanying p-value is still valid. A p-value of .03 means that, if there were no association, attack onsets would coincide with the menstrual windows at least as often as observed with a probability of only 3%. Of course, a diagnosis after only two cycles is also possible with \(\frac {2}{3}\)-MRM: if a woman completing two cycles in her diary had migraine onset during both menstrual windows, then technically she would already qualify for \(\frac {2}{3}\)-MRM. This is, incidentally, exactly the problem with the \(\frac {2}{3}\)-criterion: for such women the information from the third cycle is completely disregarded, and it is worrisome that even the presence of several non-menstrual attacks during the third cycle, combined with a migraine-free third menstrual window, would not inform the diagnosis. In our data, of the 54 women diagnosed with \(\frac {2}{3}\)-MRM, 5 (9.3%) had only three cycles recorded, with migraine in exactly the first two cycles. None of these women received an sMM diagnosis (p-values in the range 0.15–0.35).
Indeed, the \(\frac {2}{3}\)-MRM diagnosis depends more on μ – the overall migraine frequency – than on Δμ, because an elevated overall migraine frequency is likely to result in a \(\frac {2}{3}\)-MRM diagnosis regardless of Δμ. Furthermore, a non-zero Δμ paired with a low μ^{NM} is unlikely to be picked up. The sMM method, in contrast, is sensitive and specific only for Δμ.
We also remark that the sMM is closer in spirit to \(\frac {2}{3}\)-MRM than the PC in the following sense: the sMM and \(\frac {2}{3}\)-MRM criteria focus on migraine onset during the 5-day menstrual window, whereas the PC is sensitive to overlap with the 5-day menstrual window. We believe this is a further reason to encourage the use of sMM over the PC, if a replacement or complementary criterion for the \(\frac {2}{3}\)-MRM is desirable.
Limitations
This study has some limitations. Firstly, the data set was not large and the method should be tested on a larger data set before full adoption. Secondly, we rely on a migraine model with temporal unit ‘day’. Some might argue that the ‘hour’ is more appropriate. It is, however, straightforward to adapt the Markov chain migraine model from Barra et al. together with the sMM criterion presented here to any temporal unit, given that rich enough data are available so that its parameters can be estimated [13].
The sMM detects women with a statistical association between migraine and menstruation. Moreover, in contrast to the \(\frac {2}{3}\)-MRM, it does not take the regularity of attacks into account. This means that a combination of both methods could be indicated in certain cases, e.g. in clinical trials. The sMM does not directly distinguish between PMM and MRM, although women with PMM will form a subgroup of women with low p-values. Whether a distinction between PMM and MRM is necessary within a population with a significant association is questionable.
Paradoxically, the ultimate aim of developing the sMM diagnosis is that it will catalyse its own redundancy. It is developed as a means to an end, the end being a pathophysiology-based MRM diagnosis. That is, we would like to identify and treat MM without having to resort to statistical analyses, instead relying on objective biomarkers. In order to achieve this, increased statistical accuracy for recognising MM is wanted.
Conclusions
The current ICHD criteria for MRM are a useful screening tool, but when diagnostic accuracy is a requisite, the more sensitive and specific sMM diagnosis could subsequently be applied so that only those with an sMM diagnosis are included. For example, studies exploring pathophysiological mechanisms need to ensure that the association between migraine and menstruation is greater than chance. The sMM diagnosis reported here may be used as a supplement to – or as a replacement for – the appendix criteria in the ICHD.
We do not advocate using this methodology without caution, and applying either \(\frac {2}{3}\)-MRM or sMM to individual patients should be guided by sound clinical judgement. However, in a context of selecting a larger group of patients for certain types of clinical trials, the sMM should be considered as an important aid.
Availability of data and materials
The raw, anonymised headache diaries are available from https://www.researchgate.net/publication/335577175_Headache_Diaries_DATABASE; https://doi.org/10.13140/RG.2.2.18517.37608.
Notes
 1.
Recently, MM with aura has been introduced in the ICHD3. However, in this paper we consider MM without aura; the methods can be applied to both subtypes.
 2.
Available as a preprint from url.to.be.provided.upon.acceptance, or by contacting the corresponding author.
 3.
The obvious catch is the reduction of the number of observations, and thus the power of the test: N=9 in the original diary, and only N=4 for the trimmed diary.
 4.
Please note that if one wants to include migraine with aura, the identical framework can be used. However, we focus on migraine without aura here.
References
 1
Headache Classification Committee of the International Headache S (2018) Headache Classification Committee of the International Headache Society (IHS) The International Classification of Headache Disorders, 3rd edition. Cephalalgia 38(1):1–211.
 2
Johannes CB, Linet MS, Stewart WF, Celentano DD, Lipton RB, Szklo M (1995) Relationship of headache to phase of the menstrual cycle among young women: a daily diary study. Neurology 45(6):1076–1082.
 3
Stewart WF, Lipton RB, Chee E, Sawyer J, Silberstein SD (2000) Menstrual cycle and headache in a population sample of migraineurs. Neurology 55(10):1517–1523.
 4
MacGregor EA, Hackshaw A (2004) Prevalence of migraine on each day of the natural menstrual cycle. Neurology 63(2):351–353.
 5
MacGregor EA, Frith A, Ellis J, Aspinall L, Hackshaw A (2006) Incidence of migraine relative to menstrual cycle phases of rising and falling estrogen. Neurology 67(12):2154–2158.
 6
Wöber C, Brannath W, Schmidt K, Kapitan M, Rudel E, Wessely P, Wöber-Bingöl Ç, PAMINA Study Group (2007) Prospective analysis of factors related to migraine attacks: the PAMINA study. Cephalalgia 27(4):304–314.
 7
Pinkerman B, Holroyd K (2010) Menstrual and nonmenstrual migraines differ in women with menstrually-related migraine. Cephalalgia Int J Headache 30(10):1187–1194.
 8
Marcus DA, Bernstein CD, Sullivan EA, Rudy TE (2010) A Prospective Comparison Between ICHD-II and Probability Menstrual Migraine Diagnostic Criteria. Headache J Head Face Pain 50(4):539–550.
 9
MacGregor EA (2012) Classification of Perimenstrual Headache: Clinical Relevance. Curr Pain Headache Rep 16(5):452–460.
 10
Barra M, Dahl FA, Vetvik KG (2015) Statistical Testing of Association Between Menstruation and Migraine. Headache J Head Face Pain 55(2):229–240.
 11
MacGregor EA (2008) Menstrual migraine. Curr Opin Neurol 21(3):309.
 12
MacGregor EA (2007) Menstrual migraine: a clinical review. BMJ Sex Reprod Health 33(1):36–47.
 13
Barra M, Dahl FA, Vetvik KG, MacGregor EA, Vetvik KG (2019) What Constitutes a Migraine Attack? – A Counting Clinician’s Perspective. https://doi.org/10.13140/RG.2.2.24389.40169.
 14
Lydersen S, Fagerland MW, Laake P (2009) Recommended tests for association in 2 × 2 tables. Stat Med 28(7):1159–1175.
 15
Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG (2016) Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol 31(4):337–350.
 16
Diener HC, Tassorelli C, Dodick DW, Silberstein SD, Lipton RB, Ashina M, Becker WJ, Ferrari MD, Goadsby PJ, Pozo-Rosich P, Wang SJ, Mandrekar J, International Headache Society Clinical Trials Standing Committee (2019) Guidelines of the International Headache Society for controlled trials of acute treatment of migraine attacks in adults: Fourth edition. Cephalalgia Int J Headache 39(6):687–710.
 17
Hanley JA (1989) Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagn Imaging 29(3):307–335.
 18
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861–874.
 19
R Core Team (2017) R: A Language and Environment for Statistical Computing (version 3.4.0). Vienna. http://www.R-project.org/.
 20
RStudio Team (2016) RStudio (version 1.0.143). Windows, desktop, English. Boston, MA: RStudio, Inc. http://www.rstudio.com/.
 21
Wickham H (2009) ggplot2: Elegant Graphics for Data Analysis. Use R!. Springer, New York.
 22
Sievert C, Hocking T, Chamberlain S, Ram K, Corvellec M, Despouy P (2017) Plotly: Create Interactive Web Graphics via “Plotly.Js” (version 4.7.1). R, English. Plotly Technologies Inc. https://cran.r-project.org/web/packages/plotly/plotly.pdf.
Acknowledgments
Not applicable.
Funding
MB and FAD were partly funded by Norwegian Research Council grants No. 196454 and 237809.
Author information
Affiliations
Contributions
MB conceived the study, designed and performed the analyses, interpreted results, and drafted the manuscript. FAD designed the analyses, interpreted results, and revised the manuscript. EAM collected the data, interpreted results, and revised and drafted the manuscript. KGV conceived the study, interpreted results, and revised the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Mathias Barra.
Ethics declarations
Ethics approval and consent to participate
See “n” section above.
Consent for publication
See “Ethics” section above.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Keywords
 Menstrually related migraine
 Diagnostic criteria
 Statistical criteria
 Markov chain model
 Operations research