P-Curve Analysis: Moral Injury and Mental Health in Civilians

By Aaliya Hussain

For Want of P-Values…

Initially, my topic of choice was lesion network mapping (LNM), a technique I have previously applied in a research project. LNM uses human connectome data to identify functional brain networks associated with neurological and psychiatric conditions that are caused by lesions. I planned to extract p-values from the ten most recent studies that used LNM network “connectivity” or “disconnection” (as defined by the original authors) to predict behavioral outcomes, create a p-curve from those p-values, and test the evidential value of the studies through right-skewness tests of the p-curve.

After pre-registering my meta-analysis (link: https://doi.org/10.17605/OSF.IO/HY6F7) and spending approximately two-thirds of a weekend searching for studies, I realized that finding ten studies that fit my selection criteria was likely not realistic within my time constraints. LNM is a fairly new technique, first appearing in the literature about seven years ago in the seminal Boes et al. (2015) publication. Many LNM studies, particularly the earlier ones, mainly reported qualitative findings about the anatomical locations of functional connectivity networks disrupted by brain lesions associated with specific conditions rather than quantitative predictions of behavioral metrics, and several studies that did report statistical results relevant to my analysis included statistical parameters which (to my knowledge) do not necessarily produce p-values with uniform null distributions. All three factors likely contributed to the paucity of studies that fit my selection criteria.

A Meta-Meta-Analysis

I restarted with a new topic: moral injury. Morality, especially when examined through a neuropsychological lens, is fascinating to me. The concept of moral injury, loosely defined as psychological harm caused by experiencing events that violate deeply held moral principles, affirms one of my own personal intuitions about human morality (which, according to Prentice et al. (2018), is often assumed but rarely tested directly and empirically): the majority of people have a fundamental psychological need to view themselves as morally good. To indirectly examine this intuition and address the lack of focus on civilian populations in moral injury research, I decided to test the evidential value of studies examining associations between moral injury and mental health metrics in civilian groups, and I pre-registered my meta-analysis (link: https://doi.org/10.17605/OSF.IO/KTJ4C). Based on my intuition, I hypothesized that the studies would demonstrate evidential value, which would be supported by statistically significant (p < 0.05) right-skewness tests for both the full and half p-curves.

Rather than conducting my own search of a scientific database, I started with an initial pool of eleven studies drawn from the studies included in the Williamson et al. (2018) and the McEwen et al. (2021) meta-analyses on moral injury and mental health. The two meta-analyses collectively included over seventy studies, but the overwhelming majority examined military groups rather than civilians and were excluded. I defined further exclusion criteria, detailed below, to ensure that I did not violate the assumptions of p-curve analyses.

Additional exclusion criteria:

  • The study does not contain a statistically significant (i.e. p < 0.05) p-value from testing a hypothesis about an association between moral injury and a mental health metric.
  • The study contains participant overlap with another study; in the case of studies with participant overlap, the study published first, in chronological terms, was included, and later studies were excluded.

After applying the exclusion criteria, ten studies remained, with Nickerson et al. (2018) excluded based upon the second additional exclusion criterion listed above.

Sausage-Making: P-Curve Edition

Extracting the appropriate p-values was far messier and involved much more subjective judgement than I consider ideal. Most of the included studies made their hypotheses clear, and I did specify in my pre-registration that I would select the first (in terms of reference in the study text) statistically significant p-value and test statistic in the event that a study reported multiple “relevant” statistical results. However, I rather loosely defined “relevant” in my pre-registration.

Ultimately, I did my best to follow the process* below when selecting p-values:

  • The first statistically significant (i.e. p < 0.05) p-value resulting from a test of the study’s hypothesis about an association between moral injury and a mental health metric was extracted along with the corresponding test statistic. If the study tested multiple hypotheses about associations between moral injury and mental health metrics, the first hypothesis to be stated in the text of the study was selected. Tests which included demographic information as covariates were not excluded.
  • If the selected test statistic was not known (to my best knowledge of statistics) to produce p-values with a uniform null distribution, the next statistically significant p-value resulting from a test of the study’s hypothesis about an association between moral injury and a mental health metric was extracted along with its corresponding test statistic. This step was repeated as many times as necessary. In practice, all test statistics which were not Pearson correlation coefficients (r), t-statistics, or F-statistics were excluded.
  • If no p-values remained after applying the above step, the first statistically significant p-value reported in the supplementary data extraction tables (which contained additional data requested by the authors of the meta-analyses from the authors of the original studies) of the McEwen et al. (2021) or the Williamson et al. (2018) meta-analyses corresponding to a test of an association between moral injury and a mental health outcome was extracted along with its corresponding test statistic. In practice, this step was necessary only for the Hoffman et al. (2018) study, in which a Pearson correlation coefficient was extracted from a supplementary table of the McEwen et al. (2021) meta-analysis.

*(Note: This was not stated in my pre-registration as I did not anticipate having to define a full procedure for extracting p-values and test statistics.)
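The selection process above can be sketched in code. This is only an illustrative simplification: the `ReportedResult` class, its field names, and the `select_result` function are hypothetical, the extraction itself was done by hand, and the sketch does not model the rule about choosing the first-stated hypothesis.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

ALPHA = 0.05
# Statistic types retained in practice: Pearson r, t, and F.
ALLOWED_STATISTICS = {"r", "t", "F"}


@dataclass
class ReportedResult:
    """One statistical result as reported in a study (hypothetical structure)."""
    statistic_type: str      # e.g. "r", "t", "F", or something else such as "beta"
    p_value: float
    tests_hypothesis: bool   # True if it tests a moral injury / mental health association


def select_result(
    results_in_text_order: Iterable[ReportedResult],
    supplementary_results: Iterable[ReportedResult] = (),
) -> Optional[ReportedResult]:
    """Return the first eligible result, checking the study text before the
    supplementary tables of the meta-analyses."""
    for pool in (results_in_text_order, supplementary_results):
        for result in pool:
            if (
                result.tests_hypothesis
                and result.p_value < ALPHA
                and result.statistic_type in ALLOWED_STATISTICS
            ):
                return result
    return None  # the study would contribute nothing to the p-curve
```

Falling back to the supplementary pool only after the study text is exhausted mirrors the third step, which was needed for a single study.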

Once I finished extracting the p-values, I encountered another meta-analytical snag: finding the degrees of freedom used to calculate each of the corresponding test statistics. To my surprise, not all of the studies explicitly reported the degrees of freedom for each statistical test. When a study did not report them, I inferred them from the following statistical rules:

  • For simple linear regression: degrees of freedom = n (sample size) – 2
  • For multiple linear regression: degrees of freedom = n (sample size) – k (number of predictors) – 1
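Both rules are the same formula, since a simple regression has exactly one predictor. A minimal sketch follows; the function names and the sample sizes in the usage lines are made up for illustration and are not taken from any of the included studies. The second helper uses the standard conversion from a Pearson correlation to a t-statistic, which can be handy when only r and n are reported.

```python
import math


def residual_df(n: int, k: int = 1) -> int:
    """Residual degrees of freedom for a linear regression with k predictors.

    With k = 1 this reduces to the simple-regression rule n - 2.
    """
    df = n - k - 1
    if df <= 0:
        raise ValueError("sample size is too small for this many predictors")
    return df


def r_to_t(r: float, df: int) -> float:
    """Convert a Pearson correlation to the equivalent t-statistic on df degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r ** 2))


# Hypothetical sample sizes, for illustration only.
simple_df = residual_df(120)         # one predictor: 120 - 2
multiple_df = residual_df(120, k=4)  # four predictors: 120 - 4 - 1
t_equivalent = r_to_t(0.30, simple_df)
```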

I created a study disclosure table** (link: https://docs.google.com/document/d/169wyz1GbZpOD1mkQDVoP7lyoZhXVv–MuPL3yJTElZE/edit?usp=sharing) to summarize and organize all of the information I extracted. Screenshots of the table are below:

**(Note: The “Robustness results” column is blank because I could not determine the appropriate robustness results to select for the studies.)

Pretty P-Curves

I used an online p-curve calculator (version 4.06, link: http://www.p-curve.com/app4/), which implements the methodology first detailed in Simonsohn et al. (2014), to conduct the p-curve analyses.***

Pictures of the outputted p-curve and table with right-skewness test results are below:

The results of the p-curve analysis support my hypothesis. Both the full and half p-curves were found to be statistically significantly right-skewed (p < 0.0001 for both continuous tests), suggesting that the studies do contain evidential value and that the association between moral injury and mental health metrics is a real effect.
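For intuition about what a right-skewness test does, here is a minimal sketch of a Stouffer-style version. It rests on several assumptions: as far as I understand, version 4 of the app combines "pp-values" with Stouffer's method and computes them from the test statistics themselves, whereas this sketch works directly from p-values, so it will not reproduce the app's output exactly. The function name and the p-values in the usage lines are hypothetical and are not the values extracted from the included studies.

```python
import math
from statistics import NormalDist


def right_skew_test(p_values, threshold=0.05):
    """Stouffer-style right-skewness test on the p-values below `threshold`.

    Under the null of no true effect, significant p-values are uniform on
    (0, threshold), so pp = p / threshold is uniform on (0, 1). A strongly
    negative combined Z means small p-values are over-represented.
    Use threshold=0.05 for the full p-curve and 0.025 for the half p-curve.
    """
    std_normal = NormalDist()
    significant = [p for p in p_values if 0.0 < p < threshold]
    if not significant:
        raise ValueError("no p-values below the threshold")
    pp_values = [p / threshold for p in significant]
    z = sum(std_normal.inv_cdf(pp) for pp in pp_values) / math.sqrt(len(pp_values))
    return z, std_normal.cdf(z)  # one-sided p-value for right skew


# Hypothetical p-values, for illustration only.
example = [0.001, 0.002, 0.004, 0.011, 0.03]
full_z, full_p = right_skew_test(example, threshold=0.05)
half_z, half_p = right_skew_test(example, threshold=0.025)
```

A set of p-values spread evenly across the significant range would give a Z near zero and a test p-value near 0.5, which is the pattern expected when there is no true effect.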

I generated an additional p-curve and table of right-skewness test results (screenshots included below) using only the seven studies from the McEwen et al. (2021) meta-analysis. Moral injury is a loosely defined concept within the psychological research literature, and controversy exists over whether the concept can be meaningfully distinguished from trauma, given that many “potentially morally injurious events” are also potentially traumatic. Both the McEwen et al. (2021) and the Williamson et al. (2018) meta-analyses were predicated on the assumption that moral injury is a distinct concept from trauma, but the McEwen et al. (2021) meta-analysis applied more stringent selection criteria regarding measures of moral injury, requiring that studies use measures that captured moral injury itself rather than exposure to “potentially morally injurious events.”

The results of p-curve analyses on the McEwen et al. (2021) subset of studies are similar to those on the full set of studies. Both the full and half p-curves were found to be statistically significantly right-skewed (p < 0.0001 for both continuous tests), providing support for the evidential value of the subset of studies.

***(Note: The exact input used for both p-curve analyses can be found at this link: https://docs.google.com/document/d/1vAFZvSMSYI-230YU_sGBoBknBccFmc6myU1pRlHkuOY/edit?usp=sharing.) 

(Somewhat Psychoanalytic) Reflections

The results of the p-curve analyses also indirectly support my intuition about human morality. If my intuition is correct, and most humans do have a psychological need to perceive themselves as morally good, then undergoing moral injury would be expected to lead to detrimental mental health outcomes, which is what the results suggest. While I am quite relieved that I do not need to uproot one of my fundamental beliefs about human nature (and somewhat ego-stroked from receiving empirical support for my intuition), I also recognize that having my intuitions supported may lead to unconscious biases.

As I have observed in every other scientific research discipline that I have been exposed to, p-curving involves far more subjective judgement than I initially expected. Though I made those judgements before generating the results and had no intention of swaying the results with them, I cannot rule out the possibility that I am examining the soundness of my judgements less critically than I would have if the results had challenged my intuitions. Biases, after all, are significant contributors to the very need for meta-analyses and meta-statistics.

My possible biases aside, I found p-curving to be a wonderful introduction to meta-analysis. Though I do endorse the Twainian view of statistics, I also find the discipline of statistics fascinating, and meta-statistics, even more so. I will certainly consider learning more about meta-statistics and conducting meta-analyses.     

References


Backholm, K., & Idås, T. (2015). Ethical dilemmas, work-related guilt, and posttraumatic stress reactions of news journalists covering the terror attack in Norway in 2011. Journal of Traumatic Stress, 28(2), 142–148. https://doi.org/10.1002/jts.22001 

Boes, A. D., Prasad, S., Liu, H., Liu, Q., Pascual-Leone, A., Caviness, V. S., & Fox, M. D. (2015). Network localization of neurological symptoms from focal brain lesions. Brain, 138(10), 3061–3075. https://doi.org/10.1093/brain/awv228 

Crane, M. F., Phillips, J. K., & Karin, E. (2015). Trait perfectionism strengthens the negative effects of moral stressors occurring in veterinary practice. Australian Veterinary Journal, 93(10), 354–360. https://doi.org/10.1111/avj.12366 

Currier, J. M., Holland, J. M., Rojas-Flores, L., Herrera, S., & Foy, D. (2015). Morally injurious experiences and meaning in Salvadorian teachers exposed to violence. Psychological Trauma: Theory, Research, Practice, and Policy, 7(1), 24–33. https://doi.org/10.1037/a0034092 

Drevo, S. E. (2017). The war on journalists: Pathways to posttraumatic stress and occupational dysfunction among journalists (Order No. 10605088). Available from ProQuest Dissertations & Theses Global. (1949686678). https://www.proquest.com/dissertations-theses/war-on-journalists-pathways-posttraumatic-stress/docview/1949686678/se-2?accountid=12492 

Feinstein, A., Pavisian, B., & Storm, H. (2018). Journalists covering the refugee and migration crisis are affected by moral injury not PTSD. JRSM Open, 9(3), 205427041875901. https://doi.org/10.1177/2054270418759010 

Hoffman, J., Liddell, B., Bryant, R. A., & Nickerson, A. (2018). The relationship between moral injury appraisals, trauma exposure, and mental health in refugees. Depression and Anxiety, 35(11), 1030–1039. https://doi.org/10.1002/da.22787 

Komarovskaya, I., Maguen, S., McCaslin, S. E., Metzler, T. J., Madan, A., Brown, A. D., Galatzer-Levy, I. R., Henn-Haase, C., & Marmar, C. R. (2011). The impact of killing and injuring others on mental health symptoms among police officers. Journal of Psychiatric Research, 45(10), 1332–1336. https://doi.org/10.1016/j.jpsychires.2011.05.004 

McEwen, C., Alisic, E., & Jobson, L. (2021). Moral Injury and Mental Health: A systematic review and meta-analysis. Traumatology, 27(3), 303–315. https://doi.org/10.1037/trm0000287

Nickerson, A., Schnyder, U., Bryant, R. A., Schick, M., Mueller, J., & Morina, N. (2015). Moral injury in traumatized refugees. Psychotherapy and Psychosomatics, 84(2), 122–123. https://doi.org/10.1159/000369353

Papazoglou, K. (2018). The examination of different pathways leading towards police traumatization: Exploring the role of moral injury and personality in police compassion fatigue. Dissertation Abstracts International: Section B: The Sciences and Engineering, 79(3-B(E)).

Prentice, M., Jayawickreme, E., Hawkins, A., Hartley, A., Furr, R. M., & Fleeson, W. (2018). Morality as a basic psychological need. Social Psychological and Personality Science, 10(4), 449–460. https://doi.org/10.1177/1948550618772011

Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534–547. https://doi.org/10.1037/a0033242 

Steinmetz, S. E., Gray, M. J., & Clapp, J. D. (2019). Development and evaluation of the perpetration‐induced distress scale for measuring shame and guilt in civilian populations. Journal of Traumatic Stress, 32(3), 437–447. https://doi.org/10.1002/jts.22377 

Williamson, V., Stevelink, S. A. M., & Greenberg, N. (2018). Occupational moral injury and mental health: Systematic review and meta-analysis. The British Journal of Psychiatry, 212(6), 339–346. https://doi.org/10.1192/bjp.2018.55