An approach to finding specific forms of dysbiosis that associate with different disorders
==========================================================================================

* Jonathan Williams
* Inga Williams
* Karl Morten
* Julian Kenyon

## Abstract

**Background** Many disorders display dysbiosis of the enteric microbiome, compared with healthy controls. Different disorders share a pattern of dysbiosis that may reflect ‘reverse causation’, due to non-specific effects of illness-in-general. Combining a range of disorders into an ‘aggregate non-healthy active control’ (ANHAC) group should highlight such non-specific dysbiosis. Differential dysbiosis between the ANHAC group and specific disorders may then reflect effects of treatment or bowel dysfunction, or may potentially be causal. Here, we illustrate this logic by testing if individual genera can differentiate an ANHAC group from two specific diagnostic groups.

**Methods** We constructed an ANHAC group (n = 17) that spanned 14 different disorders. We then used random forest analyses to test differential dysbiosis between the ANHAC group and two other disorders that have no known pathology, but have: (i) symptoms of illness (Myalgic Encephalomyelitis / Chronic Fatigue Syndrome – ME/CFS – n = 38); or (ii) both illness and bowel dysfunction (ME/CFS comorbid with Irritable Bowel Syndrome – IBS – n = 27).

**Results** Many genera differentiated the ANHAC group from co-morbid IBS. However, only two genera – Roseburia and Dialister – discriminated the ANHAC group from ME/CFS.

**Conclusions** Different disorders can associate with specific forms of dysbiosis, over-and-above non-specific effects of illness-in-general. Bowel dysfunction may contribute to dysbiosis in IBS via reverse causation. However, ME/CFS has symptoms of illness-in-general, but lacks known pathology or definitive treatment that could cause dysbiosis. Therefore, the specific dysbiosis in ME/CFS may be causal.

**Contribution to the field**

Many disorders associate with enteric dysbiosis. The pattern of dysbiosis is largely consistent between unrelated disorders, which suggests that it mainly reflects non-specific secondary effects of illness-in-general (e.g. due to changes in activity levels, or diet). However, faecal microbiome transplantation (FMT) can be therapeutic in some disorders. This implies that unique features of dysbiosis may cause those specific disorders. Here, we propose a way to assess causal effects of dysbiosis, by testing if individual genera can discriminate individual disorders from an ‘aggregate non-healthy active control’ (ANHAC) group. Dysbiosis in the ANHAC group can control for non-specific effects of illness-in-general on the microbiome and so highlight potentially-causal forms of dysbiosis in specific disorders. This approach may provide insight into pathogenetic mechanisms of individual disorders and help to design specific forms of FMT to counteract them.

## Introduction

Many disorders associate with enteric dysbiosis. The pattern of dysbiosis is largely consistent between unrelated disorders.[1] So, this shared pattern may reflect reverse causation, due to non-specific secondary effects of illness-in-general (e.g. changes in activity, or diet). However, faecal microbiome transplantation (FMT) can be therapeutic in some disorders,[2–6] which indicates that unknown elements of dysbiosis may cause those specific disorders. Identifying such causal elements of dysbiosis could illuminate pathogenetic mechanisms and hence improve therapeutic strategies. The first step toward this identification is to control for non-specific secondary effects of illness-in-general on the microbiome.[1] Here, we propose a method to achieve this.

In principle, (a) illness-in-general may cause dysbiosis, (b) specific features of single disorders may cause dysbiosis, or (c) dysbiosis can cause single disorders.
In order to highlight (b) and (c), we propose eliminating (a) by constructing an ‘aggregate non-healthy active control’ (ANHAC) group that includes a range of unrelated disorders (cf.[7]). The profile of dysbiosis in this ANHAC group should represent the ‘shared signature’[1] of illness-in-general ((a) above – see detailed rationale in Methods). Hence, differential dysbiosis between the ANHAC group and single disorders may reflect specific associations of dysbiosis with those single disorders ((b) and (c), above).

Ideally, an ANHAC group should comprise patients with a wide range of disorders whose demographic and clinical (e.g. organ system and treatment) characteristics resemble those of the target disorder(s). Each disorder in the ANHAC group may possess a specific form of dysbiosis, but overall these specific associations should cancel. Hence, if the dysbiosis in the whole ANHAC group differs from that in a single target disorder, then the elements of dysbiosis that differentiate the two groups may be unique to the target disorder. Moreover, if the demographic and clinical features and treatments of the ANHAC group and target disorder match, then elements of dysbiosis that are unique to the target disorder may be causal.

One way to assess our proposed method is to compare the ANHAC group with single disorders that have no established pathology or definitive treatment. Irritable Bowel Syndrome (IBS) lacks known pathology or definitive treatment, but associates with bowel dysfunction. Hence, differential dysbiosis between the ANHAC group and IBS may be secondary to bowel dysfunction. Going one step further, Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) lacks known pathology, definitive treatment and major bowel dysfunction (comorbidity between IBS and ME/CFS is common[8, 9], but not ubiquitous).
Therefore, because ‘average’ dysbiosis in an ANHAC group should reflect only illness-in-general, any differential dysbiosis between an ANHAC group and ME/CFS should not result from ‘reverse causation’ (see above), but may reflect microbiome features that contribute to causing ME/CFS.

In principle, a potential problem for our proposed method is heteroscedasticity. That is, the ANHAC group is diagnostically heterogeneous by design, so that the *mean* abundance of each genus reflects the effects of illness-in-general. However, the diagnostic heterogeneity may cause the *range* of observations of each individual genus to be wide. Such heteroscedasticity may make it impossible to detect differences between the *mean* levels of individual genera in the ANHAC group and individual disorders. One goal of the present study is to determine how far the problem of heteroscedasticity may be important in practice.

## Methods

### Ethics

We obtained microbiome data routinely for the primary purpose of guiding clinical management. Additionally, we obtained written consent from each participant to use their data for research. In line with UK legislation, the South Central Hampshire Research Ethics Committee deemed that our study did not, therefore, need ethical review.

### Patients

We studied 4 groups of patients – ANHAC, cancer, ME/CFS and IBS. We report only comparisons between the ANHAC group and ME/CFS or IBS, here (the comparison between the ANHAC and cancer groups is in the Supplementary information). Our IBS group comprised mainly people with co-morbid IBS and ME/CFS (such co-morbidity is common[3, 10–12] – but see[13]). This co-morbidity is advantageous in the present context, as it can highlight effects of bowel dysfunction. We have reported data from some participants previously and posted the current data-set online in May 2023.[3, 14, 15] Here, we analysed only participants aged over 14.
None had received FMT, but (i) all received dietary advice, (ii) those with established pathologies or treatments had ‘treatment-as-usual’ and (iii) those with ME/CFS or IBS had received β-glucan 1.3 and 1.6 for about 4 weeks prior to stool sampling. We coarsened patients’ ages to the nearest 5 years.

### Microbiome DNA sequencing

Participants collected stool samples and sent them for analysis by 16S ribosomal RNA gene sequencing of bacterial DNA (Atlas Biomed – see[16]). Specifically, identification of microbiota used NanoSeq Illumina paired sequencing to obtain 250 bp reads after merging. The analysis generated Amplicon Sequence Variants using QIIME2.[17]

### Rationale

Our design extends that of Gupta et al., who pooled stool metagenomes from 12 different disease or abnormal bodyweight conditions into “a single aggregate nonhealthy group”.[7] Gupta and colleagues reasoned that comparing the profile of abundances of different genera in this aggregate nonhealthy group with the corresponding profile of healthy people would help to define the dysbiosis associated with illness-in-general. We extended Gupta’s logic to compare our aggregate non-healthy active control (ANHAC) group with groups that had single disorders – ME/CFS or IBS. We reasoned that these comparisons could identify genera that associate specifically (and, possibly, causally) with individual disorders, after accounting for effects of illness-in-general.

Following the above reasoning, we tested if abundances of individual genera in an ANHAC group differ from those in single disorders. Specifically, we tested differential dysbiosis between an ANHAC group and two specific disorders: (1) IBS; and (2) ME/CFS. IBS lacks known pathology, so (1) can assess how far bowel dysfunction may cause dysbiosis, whereas in (2) differential dysbiosis between ME/CFS and the ANHAC group may reflect causes of ME/CFS (see Introduction).

The present method requires the ANHAC group to be heterogeneous.
However, if the microbiota of the ANHAC group are too heterogeneous, then comparisons with other, single-diagnosis groups may lack power to detect differences (see Introduction). We initially used binomial tests to compare (i) the proportion of zeros and (ii) the variances of each genus in the ANHAC group with those in each single-diagnosis group. However, (i) and (ii) inter-relate. Therefore we assessed heterogeneity more accurately using hierarchical mixed-membership beta regression, to test effects of diagnostic grouping on variance of abundances and on zero inflation while accounting for different levels of abundance of each genus between people (see Supplementary Information). We used random forest (RF) analyses to test if the relative abundances of microbial genera can predict diagnostic categories.[18] RF may be optimal for this purpose.[19–22] However, RF tends to perform badly in high-dimensional analyses with a low fraction of relevant predictors and small sample sizes.[23] Therefore, we used preliminary RF analyses to select important variables for each classification, before constructing large RFs that used the selected variables to predict each diagnostic grouping. We weighted the RFs’ sampling procedures to eliminate effects of imbalance between group sizes, by down-sampling the larger group when growing decision trees. We assessed the RFs’ abilities to predict each diagnostic grouping using the areas under the Receiver Operating Characteristic curves (AUROCs) along with their overall classification errors, which are available from the RF output.[24] We computed the 95% confidence limits for each AUROC.[25] The Supplementary Information gives the details of the RF analyses. Finally, we assessed the dependence of clinical groupings on abundances of different microbiota qualitatively, by generating partial plots to show the forms of their (independent) effects on diagnostic groupings. 
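
The two bookkeeping steps described above, balancing group sizes by down-sampling and scoring discrimination by AUROC with a 95% interval, can be sketched in a few lines. This is only an illustrative sketch and not the authors' code, which is in R and provided in the Supplementary Information. The function names are hypothetical, and the interval uses the Hanley and McNeil (1982) approximation as a simple stand-in for the cvAUC method cited in the text.

```python
import math
import random


def balanced_bootstrap(labels, rng):
    """Draw one bootstrap sample with equal numbers from each class.

    Every class contributes as many indices (sampled with replacement) as
    the smallest class has members, so the larger group is down-sampled.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(members) for members in by_class.values())
    sample = []
    for members in by_class.values():
        sample.extend(rng.choices(members, k=n_min))
    return sample


def auroc(scores, labels):
    """AUROC as the chance that a case (1) outscores a control (0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95% CI for an AUROC (Hanley-McNeil formula), clipped to [0, 1]."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Group sizes taken from the text: 17 ANHAC (label 0) and 38 ME/CFS (label 1).
labels = [0] * 17 + [1] * 38
sample = balanced_bootstrap(labels, random.Random(0))
print(sum(labels[i] == 0 for i in sample), sum(labels[i] == 1 for i in sample))
print(hanley_mcneil_ci(0.8, n_pos=38, n_neg=17))  # 0.8 is a made-up AUROC
```

Each call to `balanced_bootstrap` yields the indices for one tree, so the majority group never outweighs the minority group within any single tree. With groups as small as these, the resulting interval is wide, which is one reason to treat any single AUROC cautiously.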
RF analyses can, in principle, discriminate different groups on the basis of heteroscedasticity (the variances of individual genera in each group), rather than differences in location (the mean abundance of each genus in each group). In this case, the partial relation between group membership and the abundance of a genus should show a biphasic (U-shaped) form. In contrast, if the RF discriminates two groups on the basis of the mean abundance of a genus in each group, then the partial relationship should show a monotonic form. We examined the partial dependence of group membership on each genus that was important in the RF analyses.

## Results

### Sample

The main study included 82 participants (see Supplementary Information for cancer group). Three-quarters (61/82) were female and the median age was 58 (IQR 35-67). The age and sex distributions of these diagnostic groups were similar (age: Kruskal-Wallis χ² = 0.8, 2 df, p = 0.67; sex: χ² = 2.7, 2 df, p = 0.25). The ANHAC group comprised 17 people with miscellaneous disorders (1 each of morbid obesity, acne rosacea, polymyalgia rheumatica, alopecia areata, anxiety and insomnia, autism, herpes genitalis, seborrhoeic dermatitis, eczema, Alzheimer-type dementia, and Parkinson’s Disease; 2 each with Crohn’s disease, ulcerative colitis, and motor neurone disease); 38 people had ME/CFS; 27 people had IBS (23 co-morbid with ME/CFS).

### Compositions of microbiota

Data were available for 193 genera, but only 94 genera had fewer than 80% non-zero abundances (see Supplementary Table S1). There was no evidence of heteroscedasticity between different diagnostic groups (see Supplementary Information).

### Differential dysbiosis between the ANHAC group and single-diagnosis groups

The proportions of individual genera discriminated the ANHAC group from the ME/CFS and IBS groups (see Table 1).

[Table 1](http://medrxiv.org/content/early/2024/04/24/2024.04.23.24306162/T1): accuracy of the discrimination between clinical groupings by the random forest analyses

Thirty individual genera discriminated the ANHAC group from the IBS group (see Supplementary Information). In contrast, only two specific genera – Roseburia and Dialister – discriminated the ANHAC group from the ME/CFS group (see Figures 1A-1B). The grouping of ‘Unclassified Bacteria’ also contributed to discriminating the ANHAC group from both the IBS and ME/CFS groups. The Supplementary Information shows the full details of the dependence of diagnostic categories on individual genera that contribute to differential dysbiosis.

![Figures 1A-1B](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/04/24/2024.04.23.24306162/F1.medium.gif)

[Figures 1A-1B](http://medrxiv.org/content/early/2024/04/24/2024.04.23.24306162/F1): partial plots of the dependence of ME/CFS on Roseburia and Dialister. Each panel shows the probability of ME/CFS (compared with the ANHAC group; y-axis) over the range of abundances (%) of an individual genus (x-axis), adjusted for all other predictors in the random forest (RF) model. Both plots show monotonic dependence of ME/CFS on the abundance of the genera, which indicates that heteroscedasticity did not exert undue influence in the RF model. See Supplementary files for partial plots of all genera selected for the random forest analyses.

## Discussion

The present study found that proportions of microbiota can discriminate a heterogeneous aggregate non-healthy active-control (ANHAC) group from other, more homogeneous patient groups. Overall dysbiosis in the ANHAC group may reflect non-specific secondary effects of illness-in-general. Hence, differential dysbiosis between the ANHAC group and other, more homogeneous diagnostic groups may reflect specific effects of pathologies or their treatments in the homogeneous groups.
Alternatively, in groups that lack established pathology or definitive treatment (e.g. ME/CFS) – differential dysbiosis from the ANHAC group may indicate causal elements of the microbiome. Overall, our results provide preliminary proof of the principle that this approach can delineate specific dysbiosis in individual disorders, which may help to tailor appropriate treatments. The present study did not include healthy controls. Many single-disorder groups differ from healthy controls[1, 7] and there is “a common signal for gut dysbiosis …. shared across unrelated diseases”. [1] Hence, it is likely that (a) the microbiomes of healthy controls will differ from any patient group (e.g. [1, 26–28]); and (b) our ANHAC group showed dysbiosis, overall. Consistent with (b), the random forest could not differentiate the ANHAC group from a grouping of cancer diagnoses (see Supplementary Information and compare[1, 29–32]). However, the absence of healthy controls limits characterization of our ANHAC group and the nature of its dysbiosis. This limitation (which we address further in the Supplementary Information) does not prevent the proof of principle of our approach of comparing an ANHAC group with single-disorder groups. Further studies should compare ANHAC groups with healthy controls to assess the nature of dysbiosis that reflects illness-in-general.[7] The random forest analyses differentiated the ANHAC grouping from both the ME/CFS and IBS groups. Previous studies have shown that the microbiomes of healthy controls differ from both IBS[1, 33, 34] and from ME/CFS.[14, 35–38] So, the differences that we observed imply that ME/CFS and IBS have particular forms of dysbiosis. Almost all of the IBS group had co-morbid IBS and ME/CFS. Hence, the genera that discriminate the ANHAC and IBS groups may represent the combined effects of ME/CFS and gastrointestinal dysfunction. 
Consistent with this, low levels of Dialister and high levels of Roseburia or Unclassified bacteria contributed to discriminating both ME/CFS and IBS from the ANHAC grouping (see Supplementary file of partial plots). These convergent findings strengthen each other. The remaining genera that discriminate IBS from the ANHAC grouping may reflect causes or effects of the gastrointestinal dysfunction in IBS. The ME/CFS group had higher proportions of Roseburia than the ANHAC group. The logic of comparing single disorders with an ANHAC group is that this can control for ‘reverse causation’ due to illness-in-general (see Introduction). Since ME/CFS lacks established pathology,[1, 39] our result implies that Roseburia may cause ME/CFS. At first glance, this conflicts with reports that *low* abundance of Roseburia generally associates with illness-in-general[1] and fatigue.[40] However, people with ME/CFS complain of cognitive difficulties (“brain fog”[41, 42]) and our observation of more Roseburia in ME/CFS fits with findings of recent Mendelian Randomisation (MR) studies that higher Roseburia can *cause* worse cognition.[43, 44] Additionally, our observation of low Dialister in ME/CFS fits with findings that levels of Dialister were almost significantly lower in ME/CFS[1] and that low levels of Dialister can *cause* worse cognition.[43, 44] It is unlikely that 2 out of 3 taxa that predicted ME/CFS in our small study would concur with findings of causal MR analyses of a large sample (¼ million) by chance alone (p=0.006 – see Supplementary Information). Hence, the congruity of these results reinforces our logic that excluding reverse causation, by comparing single disorders with an ANHAC group, may help to identify causal elements of dysbiosis. The ME/CFS group had a higher proportion of “Unclassified Bacteria” than the ANHAC group. Despite advances in understanding the microbiome, many of its bacteria remain unclassified. 
Again, the logic that comparing single disorders with an ANHAC group can exclude reverse causation suggests that chronic infection with unknown bacteria could cause ME/CFS. An alternative possibility is that Roseburia may interact with an unknown taxon (see Supplementary Information). If confirmed, these possibilities would have important clinical implications.

No other studies have compared individual disorders with an ANHAC group. Therefore, we used publicly-available data (see Supplementary Information) to do this for two of the specific disorders that we studied – cancer and IBS. In brief, we used publicly-available data from a recent large study[1] to create a new ANHAC group that represents the microbiome signature of illness-in-general. We reasoned that deviations from this signature in specific disorders may reflect specific forms of dysbiosis that associate uniquely with those disorders. The genera that deviated from the signature of illness-in-general in the regressions for cancer and IBS in those independent data correspond with the genera that random forest selected to discriminate these disorders in our study. This correspondence reinforces the reliability and validity of the ANHAC approach to defining associations between specific disorders and specific forms of dysbiosis. We describe the methods, results and limitations of these analyses in more detail in the Supplementary Information.

Heteroscedasticity was not a major limitation for the ANHAC comparisons. Not only did these comparisons yield results that resemble other recent findings (see above), but the partial plots of group membership on the abundances of individual genera were monotonic (which indicates that heteroscedasticity may be unimportant – see Methods). However, the ANHAC group in our study was small, which may limit the influence of heteroscedasticity. Further studies should construct larger ANHAC groups and examine the influence of heteroscedasticity more fully.
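
The reasoning that links the shape of a partial plot to heteroscedasticity can be illustrated with a small simulation. The sketch below is not the authors' analysis. It uses simulated abundances, hypothetical bin edges, and a crude binned proportion in place of a random forest partial plot. It shows that two groups differing only in spread give a U-shaped relation between abundance and group membership, whereas two groups differing only in mean give a monotonic one.

```python
import random


def binned_share(group_a, group_b, edges):
    """Share of group B among all observations falling in each abundance bin."""
    shares = []
    for low, high in zip(edges, edges[1:]):
        n_a = sum(low <= x < high for x in group_a)
        n_b = sum(low <= x < high for x in group_b)
        total = n_a + n_b
        shares.append(n_b / total if total else float("nan"))
    return shares


def curve_shape(curve, tolerance=0.05):
    """Label a curve 'monotonic' unless it both rises and falls by more than tolerance."""
    steps = [b - a for a, b in zip(curve, curve[1:])]
    rises = any(step > tolerance for step in steps)
    falls = any(step < -tolerance for step in steps)
    return "non-monotonic" if rises and falls else "monotonic"


rng = random.Random(1)          # fixed seed, so the sketch is reproducible
edges = [2, 6, 9, 11, 14, 18]   # hypothetical abundance bins (%)

# Case 1: same mean, different spread (heteroscedasticity only).
narrow = [rng.gauss(10, 1) for _ in range(2000)]
wide = [rng.gauss(10, 4) for _ in range(2000)]
spread_curve = binned_share(narrow, wide, edges)

# Case 2: same spread, different mean (a difference in location).
low_mean = [rng.gauss(8, 2) for _ in range(2000)]
high_mean = [rng.gauss(12, 2) for _ in range(2000)]
location_curve = binned_share(low_mean, high_mean, edges)

print(curve_shape(spread_curve), [round(s, 2) for s in spread_curve])
print(curve_shape(location_curve), [round(s, 2) for s in location_curve])
```

In the first case the wider group dominates both tails and the narrower group dominates the middle, so the curve dips and then recovers. A monotonic partial plot, as reported for Roseburia and Dialister, is therefore more consistent with a difference in location than with a difference in variance alone.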
We used random forest (RF) analyses to compare individual diagnoses with our ANHAC group. Using RF is not essential to the ANHAC approach that we present here. In principle, this approach can use any form of discriminant analysis, or machine learning model. However, RF analyses are generally both robust and sensitive when analysing microbiome data.[20, 45] Moreover in the present context, RF models generated partial plots that could help to assess the influence of heteroscedasticity in the ANHAC approach (see above). Our study focused on excluding reverse causation in associations between enteric dysbiosis and disorders that lack established pathology. However, there is increasing recognition that reverse causation can complicate many kinds of clinico-pathological associations that underpin our understanding of diseases (e.g. [46–53]). Methods in current use to exclude reverse causation may rely on unproven assumptions,[54] large samples,[55] long-term follow-up (e.g.[46, 56]) and/or complex statistical methods.[50, 56] In contrast, our method of constructing an ANHAC grouping uses clinical reasoning to help to exclude reverse causation due to illness-in-general. In the case of ME/CFS (where reverse causation due to definitive treatment is unlikely), our small study yielded findings comparable with those of complex Mendelian Randomisation analyses of a large sample. [43, 44] Such large samples are likely to include people with a wide range of illnesses, as well as a majority of ‘healthy controls’ – and so may already incorporate elements of our ANHAC approach. Potentially, (a) our ANHAC method may be applicable with other types of data that may carry a shared signature of illness-in-general – e.g. metabolomic or immunological measures – and (b) combining our ANHAC method with other strategies to exclude reverse causation may help to clarify causal mechanisms more efficiently in cross-sectional samples. 
In summary, we present an approach to control for reverse causation of dysbiosis due to illness-in-general, and so help to define forms of dysbiosis that may contribute to causing specific disorders. We used the unique features of ME/CFS to illustrate our method, but did not expect our method to illuminate ME/CFS because our sample is so small. Nevertheless, correspondence between our findings and those of Mendelian Randomisation studies of microbiota that affect cognition[43, 44] supports the rationale of our approach. Ultimately, such causal knowledge may help to tailor FMT to treat ME/CFS.

## Supporting information

* Supplementary information [supplements/306162_file02.pdf]
* R code for random forest analysis [supplements/306162_file03.pdf]
* R code for analysis of Cao's dataset [supplements/306162_file04.pdf]
* R code for analysis of Gacesa's dataset [supplements/306162_file05.pdf]
* R code for heteroscedasticity analysis [supplements/306162_file06.txt]
* Examples of decision trees of interaction between genera [supplements/306162_file07.pdf]
* AUROCs and partial plots [supplements/306162_file08.pdf]
* Dove_Atlas data [supplements/306162_file09.xlsx]

## Data Availability

All data produced in the present study are available upon reasonable request to the authors, or are already available online (as indicated in the text).

[https://figshare.com/articles/dataset/Raw\_data\_for\_microbiome\_abondances\_for\_patients\_1\_Cancer\_2\_Chronic\_Fatigue\_syndrome\_CFS\_Irritable\_bowel\_Syndrome\_IBS\_4\_Miscellaneous\_6\_CFS\_/22762448](https://figshare.com/articles/dataset/Raw\_data\_for\_microbiome\_abondances\_for\_patients\_1\_Cancer\_2\_Chronic\_Fatigue\_syndrome\_CFS\_Irritable\_bowel\_Syndrome\_IBS\_4\_Miscellaneous_6_CFS_/22762448)

## Acknowledgements

IW received support from the Action for ME Foundation.

## Footnotes

* † These authors share senior authorship
* Received April 23, 2024.
* Revision received April 23, 2024.
* Accepted April 24, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at [http://creativecommons.org/licenses/by-nc/4.0/](http://creativecommons.org/licenses/by-nc/4.0/)

## References

1. Gacesa R, Kurilshikov A, Vich Vila A, Sinha T, Klaassen MAY, Bolte LA, et al. Environmental factors shaping the gut microbiome in a Dutch population. Nature. 2022;604:732–9. doi:10.1038/s41586-022-04567-7
2. Hu X-F, Zhang W-Y, Wen Q, Chen W-J, Wang Z-M, Chen J, et al. Fecal microbiota transplantation alleviates myocardial damage in myocarditis by restoring the microbiota composition. Pharmacol Res. 2019;139:412–21.
3. Kenyon JN, Coe S, Izadi H. A retrospective outcome study of 42 patients with Chronic Fatigue Syndrome, 30 of whom had Irritable Bowel Syndrome. Half were treated with oral approaches, and half were treated with Faecal Microbiome Transplantation. Hum Microbiome J. 2019;13:100061.
4. Gheorghe CE, Ritz NL, Martin JA, Wardill HR, Cryan JF, Clarke G. Investigating causality with fecal microbiota transplantation in rodents: applications, recommendations and pitfalls. Gut Microbes. 2021;13:1941711.
5. Mullish BH, Tohumcu E, Porcari S, Fiorani M, Di Tommaso N, Gasbarrini A, et al. The role of faecal microbiota transplantation in chronic noncommunicable disorders. J Autoimmun. 2023;:103034.
6. Shao T, Hsu R, Hacein-Bey C, Zhang W, Gao L, Kurth MJ, et al. The Evolving Landscape of Fecal Microbial Transplantation. Clin Rev Allergy Immunol. 2023;:1–20.
7. Gupta VK, Kim M, Bakshi U, Cunningham KY, Davis JM, Lazaridis KN, et al. A predictive index for health status using species-level gut microbiome profiling. Nat Commun. 2020;11:4635. doi:10.1038/s41467-020-18476-8
8. Riedl A, Schmidtmann M, Stengel A, Goebel M, Wisser A-S, Klapp BF, et al. Somatic comorbidities of irritable bowel syndrome: A systematic analysis. J Psychosom Res. 2008;64:573–82. doi:10.1016/j.jpsychores.2008.02.021
9. Mariman A, Delesie L, Tobback E, Hanoulle I, Sermijn E, Vermeir P, et al. Undiagnosed and comorbid disorders in patients with presumed chronic fatigue syndrome. J Psychosom Res. 2013;75:491–6.
10. Aaron LA, Burke MM, Buchwald D. Overlapping conditions among patients with chronic fatigue syndrome, fibromyalgia, and temporomandibular disorder. Arch Intern Med. 2000;160:221–7. doi:10.1001/archinte.160.2.221
11. Nagy-Szakal D, Barupal DK, Lee B, Che X, Williams BL, Kahn EJR, et al. Insights into myalgic encephalomyelitis/chronic fatigue syndrome phenotypes through comprehensive metabolomics. Sci Rep. 2018;8:10056. doi:10.1038/s41598-018-28477-9
12. Petersen MW, Schröder A, Jørgensen T, Ørnbøl E, Dantoft TM, Eliasen M, et al. Prevalence of functional somatic syndromes and bodily distress syndrome in the Danish population: the DanFunD study. Scand J Public Health. 2020;48:567–76.
13. Monden R, Rosmalen JGM, Wardenaar KJ, Creed F. Predictors of new onsets of irritable bowel syndrome, chronic fatigue syndrome and fibromyalgia: the lifelines study. Psychol Med. 2022;52:112–20.
14. Morten KJ, Staines-Urias E, Kenyon J. Potential clinical usefulness of gut microbiome testing in a variety of clinical conditions. Hum Microbiome J. 2018;10:6–10.
15. Williams J, Karl Morten, Kenyon J, Williams I. Raw data for microbiome abondances for patients 1)Cancer, 2)Chronic Fatigue syndrome(CFS)+ Irritable bowel Syndrome (IBS), 4) Miscellaneous, 6) CFS. 2023.
16. Romanov VA, Karasev IA, Klimenko NS, Koshechkin SI, Tyakht AV, Malikhova OA. Luminal and Tumor-Associated Gut Microbiome Features Linked to Precancerous Lesions Malignancy Risk: A Compositional Approach. Cancers. 2022;14:5207.
17. Janssen S, McDonald D, Gonzalez A, Navas-Molina JA, Jiang L, Xu ZZ, et al. Phylogenetic Placement of Exact Amplicon Sequences Improves Associations with Clinical Information. mSystems. 2018;3:e00021–18. doi:10.1128/mSystems.00021-18
18. Ishwaran H, Kogalur UB, Chen X, Minn AJ. Random survival forests for high-dimensional data. Stat Anal Data Min ASA Data Sci J. 2011;4:115–32.
19. Song K, Wright FA, Zhou Y-H. Systematic Comparisons for Composition Profiles, Taxonomic Levels, and Machine Learning Methods for Microbiome-Based Disease Prediction. Front Mol Biosci. 2020;7:610845.
20. Troll M, Brandmaier S, Reitmeier S, Adam J, Sharma S, Sommer A, et al. Investigation of Adiposity Measures and Operational Taxonomic unit (OTU) Data Transformation Procedures in Stool Samples from a German Cohort Study Using Machine Learning Algorithms. Microorganisms. 2020;8:547.
21. Kubinski R, Djamen-Kepaou J-Y, Zhanabaev T, Hernandez-Garcia A, Bauer S, Hildebrand F, et al. Benchmark of Data Processing Methods and Machine Learning Models for Gut Microbiome-Based Diagnosis of Inflammatory Bowel Disease. Front Genet. 2022;13:784397.
22. Bakir-Gungor B, Hacılar H, Jabeer A, Nalbantoglu OU, Aran O, Yousef M. Inflammatory bowel disease biomarkers of human gut microbiota selected via different feature selection methods. PeerJ. 2022;10:e13205.
23. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. 2nd edition. New York: Springer; 2009.
24. Sing T, Sander O, Beerenwinkel N, Lengauer T. ROCR: visualizing classifier performance in R. Bioinformatics. 2005;21:3940–1. doi:10.1093/bioinformatics/bti623
25. LeDell E, Petersen M, van der Laan M. cvAUC: Cross-Validated Area Under the ROC Curve Confidence Intervals. 2022.
26. DeGruttola AK, Low D, Mizoguchi A, Mizoguchi E. Current understanding of dysbiosis in disease in human and animal models. Inflamm Bowel Dis. 2016;22:1137–50. doi:10.1097/MIB.0000000000000750
27. Safadi JM, Quinton AMG, Lennox BR, Burnet PWJ, Minichino A. Gut dysbiosis in severe mental illness and chronic fatigue: a novel trans-diagnostic construct? A systematic review and meta-analysis. Mol Psychiatry. 2021;:1–13.
28. Evans T, Ali U, Anderton R, Raby E, Manning L, Litton E. Lower gut dysbiosis and mortality in acute critical illness: a systematic review and meta-analysis. Intensive Care Med Exp. 2023;11:6.
29. Oliva M, Mulet-Margalef N, Ochoa-De-Olza M, Napoli S, Mas J, Laquente B, et al. Tumor-Associated Microbiome: Where Do We Stand? Int J Mol Sci. 2021;22:1446.
30. Sędzikowska A, Szablewski L. Human Gut Microbiota in Health and Selected Cancers. Int J Mol Sci. 2021;22:13440.
31. Avuthu N, Guda C. Meta-Analysis of Altered Gut Microbiota Reveals Microbial and Metabolic Biomarkers for Colorectal Cancer. Microbiol Spectr. 2022;10:e0001322.
32. Suga D, Mizutani H, Fukui S, Kobayashi M, Shimada Y, Nakazawa Y, et al. The gut microbiota composition in patients with right- and left-sided colorectal cancer and after curative colectomy, as analyzed by 16S rRNA gene amplicon sequencing. BMC Gastroenterol. 2022;22:313.
33. Chong PP, Chin VK, Looi CY, Wong WF, Madhavan P, Yong VC. The Microbiome and Irritable Bowel Syndrome - A Review on the Pathophysiology, Current Research and Future Therapy. Front Microbiol. 2019;10:1136. doi:10.3389/fmicb.2019.01136
34. Napolitano M, Fasulo E, Ungaro F, Massimino L, Sinagra E, Danese S, et al. Gut Dysbiosis in Irritable Bowel Syndrome: A Narrative Review on Correlation with Disease Subtypes and Novel Therapeutic Implications. Microorganisms. 2023;11:2369.
35. Giloteaux L, Goodrich JK, Walters WA, Levine SM, Ley RE, Hanson MR. Reduced diversity and altered composition of the gut microbiome in individuals with myalgic encephalomyelitis/chronic fatigue syndrome. Microbiome. 2016;4:30. doi:10.1186/s40168-016-0171-4
36. Du Preez S, Corbitt M, Cabanas H, Eaton N, Staines D, Marshall-Gradisnik S. A systematic review of enteric dysbiosis in chronic fatigue syndrome/myalgic encephalomyelitis. Syst Rev. 2018;7:241.
37. Guo C, Che X, Briese T, Allicock O, Yates RA, Cheng A, et al. Deficient butyrate-producing capacity in the gut microbiome of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome patients is associated with fatigue symptoms. 2021.
38. König RS, Albrich WC, Kahlert CR, Bahr LS, Löber U, Vernazza P, et al. The Gut Microbiome in Myalgic Encephalomyelitis (ME)/Chronic Fatigue Syndrome (CFS). Front Immunol. 2021;12:628741.
39. Wang T, Sternes PR, Guo X-K, Zhao H, Xu C, Xu H. Autoimmune diseases exhibit shared alterations in the gut microbiota. Rheumatology. 2023;:kead364.
40. Borren NZ, Plichta D, Joshi AD, Bonilla G, Peng V, Colizzo FP, et al. Alterations in Fecal Microbiomes and Serum Metabolomes of Fatigued Patients With Quiescent Inflammatory Bowel Diseases. Clin Gastroenterol Hepatol Off Clin Pract J Am Gastroenterol Assoc. 2021;19:519–527.e5.
41. Jason LA, Boulton A, Porter NS, Jessen T, Njoku MG, Friedberg F. Classification of myalgic encephalomyelitis/chronic fatigue syndrome by types of fatigue. Behav Med Wash DC. 2010;36:24–31.
42. Zalewski P, Kujawski S, Tudorowska M, Morten K, Tafil-Klawe M, Klawe JJ, et al. The Impact of a Structured Exercise Programme upon Cognitive Function in Chronic Fatigue Syndrome Patients. Brain Sci. 2019;10:4.
43. Cao W, Xing M, Liang S, Shi Y, Li Z, Zou W.
Causal relationship of gut microbiota and metabolites on cognitive performance: A mendelian randomization analysis. Neurobiol Dis. 2024;191:106395. 44. 44.Wang Q, Song Y, Wu X, Luo Y, Miao R, Yu X, et al. Gut microbiota and cognitive performance: A bidirectional two-sample Mendelian randomization. J Affect Disord. 2024;353:38–47. 45. 45.Volkova A, Ruggles KV. Predictive Metagenomic Analysis of Autoimmune Disease Identifies Robust Autoimmunity and Disease Specific Microbial Signatures. Front Microbiol. 2021;12:621310. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3389/fmicb.2021.621310&link_type=DOI) 46. 46.Flensborg-Madsen T, von Schloten M, Flachs E, Mortensen E, Prescott E, Tolstrup J. Tobacco smoking as a risk factor for depression. A 26-year population-based follow-up study. J Psychiatr Res. 2011;45:143–9. 47. 47.Pearl J. Linear Models: A Useful “Microscope” for Causal Analysis. J Causal Inference. 2013;1:155–70. 48. 48.Barbui C, Gastaldon C, Cipriani A. Benzodiazepines and risk of dementia: true association or reverse causation? Epidemiol Psychiatr Sci. 2013;22:307–8. 49. 49.Banack HR, Bea JW, Kaufman JS, Stokes A, Kroenke CH, Stefanick ML, et al. The Effects of Reverse Causality and Selective Attrition on the Relationship Between Body Mass Index and Mortality in Postmenopausal Women. Am J Epidemiol. 2019;188:1838–48. 50. 50.Bucur IG, Claassen T, Heskes T. Inferring the direction of a causal link and estimating its effect via a Bayesian Mendelian randomization approach. Stat Methods Med Res. 2020;29:1081–111. 51. 51.Garcia GR, Coleman NC, Pond ZA, Pope CA. Shape of BMI-Mortality Risk Associations: Reverse Causality and Heterogeneity in a Representative Cohort of US Adults. Obes Silver Spring Md. 2021;29:755–66. 52. 52.Ferrari G, de Maio Nascimento M, Petermann-Rocha F, Rezende LFM, O’Donovan G, Gouveia ÉR, et al. 
Lifestyle risk factors and all-cause and cause-specific mortality in the Mexico City prospective study: Assessing the influence of reverse causation. J Affect Disord. 2024;352:517–24. 53. 53.Xiao R, Dong L, Xie B, Liu B. A Mendelian randomization study: physical activities and chronic kidney disease. Ren Fail. 2024;46:2295011. 54. 54.Boef AGC, Dekkers OM, le Cessie S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int J Epidemiol. 2015;44:496–511. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyv071&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25953784&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F24%2F2024.04.23.24306162.atom) 55. 55.Burgess S. Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome. Int J Epidemiol. 2014;43:922–9. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyu005&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24608958&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F24%2F2024.04.23.24306162.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000338127000037&link_type=ISI) 56. 56.Leszczensky L, Wolbring T. How to Deal With Reverse Causality Using Panel Data? Recommendations for Researchers Based on a Simulation Study. Sociol Methods Res. 2019;:0049124119882473.