Is mindfulness-based therapy ready for rollout to prevent relapse and recurrence in depression?

Doubts that much of clinical or policy significance was learned from a recent study published in The Lancet

Promoters of Acceptance and Commitment Therapy (ACT) notoriously established a record for academics endorsing a psychotherapy as better than alternatives, in the absence of evidence from adequately sized, high-quality studies with suitable active control/comparison conditions. The credibility of designating a psychological intervention as “evidence-based” took a serious hit with the promotion of ACT, before its enthusiasts felt they had attracted enough adherents to be able to abandon claims of “best” or “better than.”

But the tsunami of mindfulness promotion has surpassed anything ACT ever produced, and still with insufficient quality and quantity of evidence.

Could that be changing?

Some might think so with a recent randomized controlled trial reported in The Lancet of mindfulness-based cognitive therapy (MBCT) to reduce relapse and recurrence in depression. The headline of a Guardian column by one of the Lancet article’s first author’s colleagues at Oxford misleadingly proclaimed what the study showed.

And that misrepresentation was echoed in the Mental Health Foundation’s call for mindfulness to be offered through the UK National Health Service.

The Mental Health Foundation is offering a 10-session online course for £60 and is undoubtedly prepared for an expanded market.

Patient testimonial accompanying Mental Health Foundation’s call for dissemination.

The Declaration of Conflict of Interest for the Lancet article mentions that the first author and one other are “co-directors of the Mindfulness Network Community Interest Company and teach nationally and internationally on MBCT.” The first author notes the marketing potential of his study in comments to the media.

To the authors’ credit, they modified the registration of their trial to reduce the likelihood of it being misinterpreted.

Reworded research question. To ensure that readers clearly understand that this trial is not a direct comparison between antidepressant medication (ADM) and Mindfulness-based cognitive therapy (MBCT), but ADM versus MBCT plus tapering support (MBCT-TS), the primary research question has been changed following the recommendation made by the Trial Steering Committee at their meeting on 24 June 2013. The revised primary research question now reads as follows: ‘Is MBCT with support to taper/discontinue antidepressant medication (MBCT-TS) superior to maintenance antidepressant medication (m-ADM) in preventing depression over 24 months?’ In addition, the acronym MBCT-TS will be used to emphasise this aspect of the intervention.

I would agree and amplify: this trial does nothing to remedy the paucity of evidence from well-controlled trials that MBCT is a first-line treatment for patients experiencing a current episode of major depression. The few studies to date are small and of poor quality, and they are insufficient to recommend MBCT as a first-line treatment of major depression.

I know, you would never guess that from promotions of MBCT for depression, especially not in the current blitz promotion in the UK.

The most salient question is whether MBCT can provide an effective means of preventing relapse in depressed patients who have already achieved remission and seek discontinuation.

Despite a chorus of claims in the social media to the contrary, the Lancet trial does not demonstrate that

  • That formal psychotherapy is needed to prevent relapse and recurrence among patients previously treated with antidepressants in primary care.
  • That a depression care manager, who requires less formal training than an MBCT therapist, would have achieved any less benefit.
  • That primary care physicians simply tapering antidepressant treatment that may not even have been appropriate in the first place would have achieved any less benefit.
  • That the crucial benefit to patients assigned to the MBCT condition was their acquisition of skills.
  • That practicing mindfulness is needed or even helpful in tapering from antidepressants.

We are all dodos and everyone gets a prize

Something also lost in the promotion of the trial is that it was originally designed to test the hypothesis that MBCT was better than maintenance antidepressant therapy in terms of relapse and recurrence of depression. That is stated in the registration of the trial, but not in the actual Lancet report of the trial outcome.

Across the primary and secondary outcome measures, the trial failed to demonstrate that MBCT was superior. Essentially, the investigators had a null trial on their hands. But in a triumph of marketing over accurate reporting of a clinical trial, they shifted the question to whether MBCT is inferior to maintenance antidepressant therapy and declared success in demonstrating that it was not.

We saw a similar move in an MBCT trial that I critiqued just recently. The authors there opted for the noninformative conclusion that MBCT was “not inferior” to an ill-defined routine primary care for a mixed sample of patients with depression, anxiety, and adjustment disorders.

An important distinction is being lost here. Null findings in a clinical trial with a sample size set to answer the question of whether one treatment is better than another are not the same as a demonstration that the two treatments are equivalent. The latter question requires a non-inferiority design with a much larger sample size, in order to demonstrate that by some pre-specified criterion the two treatments do not differ from each other in clinically significant terms.

Consider this analogy: we want to test whether yogurt is better than aspirin for a headache. So we do a power analysis tailored to the null hypothesis of no difference between yogurt and aspirin, conduct a trial, and find that yogurt and aspirin do not differ. But if we were actually interested in the question of whether yogurt can be substituted for aspirin in treating headaches, we would have to estimate what size of study would leave us comfortable with the conclusion that treating headaches with yogurt rather than aspirin makes no clinically significant difference. That would require a much larger sample size, typically several times the size of a clinical trial designed to test the efficacy of an intervention.
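To put rough numbers on that, here is a minimal sketch using the standard normal-approximation sample-size formulas for comparing two proportions. The relapse rates and the non-inferiority margin below are illustrative assumptions, not figures from the Lancet trial.

```python
# A minimal sketch, assuming illustrative relapse rates and margin,
# of why a non-inferiority trial needs to be much larger than a
# superiority trial. Formulas are the usual normal approximations.
from scipy.stats import norm

def n_superiority(p1, p2, alpha=0.05, power=0.80):
    """Per-arm n to detect a true difference p1 - p2 (two-sided test)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2)**2

def n_noninferiority(p, margin, alpha=0.025, power=0.80):
    """Per-arm n to rule out being worse than control by more than `margin`,
    assuming both arms truly share event rate p (one-sided test)."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return z**2 * 2 * p * (1 - p) / margin**2

# Superiority: powered to detect 50% vs 35% relapse -> roughly 170 per arm.
print(round(n_superiority(0.50, 0.35)))
# Non-inferiority: rule out a 5-point disadvantage when both arms relapse
# at 45% -> roughly 1,550 per arm, an order of magnitude more patients.
print(round(n_noninferiority(0.45, 0.05)))
```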

The often confusing differences between standard efficacy trials and noninferiority and superiority trials are nicely explained here.

Do primary care patients prescribed an antidepressant need to continue?

Patients taking antidepressants should not stop without consulting their physician and agreeing on a plan for discontinuation.

NICE Guidelines, like many international guidelines, recommend that patients with recurrent depression continue their medication for at least two years, out of concern for a heightened risk of relapse and recurrence. But these recommendations are based on research conducted in specialty mental health settings with patients with an established diagnosis of depression. The generalization to primary care patients may not represent best evidence.

Major depression is typically a recurrent, episodic condition with onset in the teens or early 20s. Many depressed adults beyond that age would be characterized as having recurrent depression. In a study conducted at primary care practices associated with the University of Michigan, we found that most patients in waiting rooms identified as depressed on the basis of a two-stage screening and formal diagnostic interview had recurrent depression, with the average patient having had over six episodes before our point of contact.

However, depression in primary care may involve less severe symptoms in a given episode and an overall less severe course than in the patients who make it to specialty mental health care. And primary care physicians’ decisions about placing patients on antidepressants are typically not based upon a formal, semi-structured interview with symptom counts to ascertain whether patients have the necessary number of symptoms (five for DSM-5) to meet diagnostic criteria.

My colleagues in Germany and I conducted another relevant study in which we randomized patients to antidepressants, behavior therapy, or their preference between antidepressants and behavior therapy. What was unusual was that we relied on primary care physician diagnosis, not our own formal research criteria. We found that many patients enrolling in the trial did not meet criteria for major depression and, at least by DSM-IV criteria, would be given the highly ambiguous diagnosis of Depression, Not Otherwise Specified. The patients identified by the primary care physicians as requiring treatment for depression were quite different from those typically entering clinical trials evaluating treatment options. You can find out more about the trial here.

It is thus important to note that patients in the Lancet study were not originally prescribed antidepressants on the basis of a formal research diagnosis of major depression. Rather, the decisions of primary care physicians to prescribe antidepressants are not usually based on a systematic interview aimed at a formal diagnosis requiring a minimum number of symptoms to be present. This is a key issue.

The inclusion criteria for the Lancet study were that patients currently be in full or partial remission from a recent episode of depression and have had at least three episodes, counting the recent one. But their diagnosis at the time they were prescribed antidepressants was retrospectively reconstructed and may have been biased by their having received antidepressants.

Patients enrolled in the study were thus a highly select subsample of all patients receiving antidepressants in UK primary care. A complex recruitment procedure involving not only review of GP records but also advertisement in the community means that we cannot tell what proportion of patients receiving antidepressants and otherwise meeting criteria would have agreed to be in the study.

The study definitely does not provide a basis for revising guidelines for determining when and if primary care physicians should raise the issue of tapering antidepressant treatment. But that’s a vitally important clinical question.

Questions not answered by the study:

  • We don’t know the appropriateness of the prescription of antidepressants to these patients in the first place.
  • We don’t know what review of the appropriateness of the antidepressant prescriptions the primary care physicians conducted in agreeing that their patients participate in the study.
  • We don’t know the selectivity with which primary care physicians agreed for their patients to participate. To what extent are the patients to whom they recommended the trial representative of other patients in the maintenance phase of treatment?
  • We don’t know enough about how the primary care physicians treating the patients in the control group reacted to the advice from the investigator group to continue medication. Importantly, how often were there meetings with these patients, and did that change as a result of participation in this trial? Like every other trial of CBT in the UK that I have reviewed, this one suffers from an ill-defined control group that was nonequivalent in terms of contact time with professionals and support.
  • The question persists whether any benefits claimed for cognitive behavior therapy or MBCT from recent UK trials could have been achieved with nonspecific supportive interventions. In this particular Lancet study, we don’t know whether the same results could have been achieved by simply tapering antidepressants assisted by a depression care manager less credentialed than is required to provide MBCT.

The investigators provided a cost analysis. They concluded that there were no savings in health care costs from moving patients in full or partial remission off antidepressants to MBCT. But the cost analysis did not take into account the added patient time invested in practicing MBCT. Indeed, we don’t even know whether the patients assigned to MBCT actually practiced it with any diligence or will continue to do so after treatment.

The authors promise a process analysis that will shed light on what element of MBCT contributed to the equivalency of outcomes with the maintenance of antidepressant medication.

But this process analysis will be severely limited by the inability to control for nonspecific factors such as contact time with the patient and support provided to the primary care physician and patient in tapering medication.

The authors seem intent on arguing that MBCT should be disseminated into the UK National Health Service. But a more sober assessment is that this trial only demonstrates that a highly select group of patients currently receiving antidepressants within the UK health system could be tapered without heightened risk of relapse and recurrence. There may be no necessity or benefit in providing MBCT per se during this process.

The study is not comparable to other noteworthy studies of MBCT to prevent relapse, like Zindel Segal’s complex study. That study started with an acutely depressed patient population defined by careful criteria and treated patients with a well-defined algorithm for choosing and making changes in medications. Randomization to continued medication, MBCT, or pill placebo occurred only in the patients who remitted. It is unclear how much the clinical characteristics of the patients in the present Lancet study overlapped with those in Segal’s study.

What would be the consequences of disseminating and implementing MBCT into routine care based on current levels of evidence?

There are lots of unanswered questions concerning whether MBCT should be disseminated and widely implemented in routine care for depression.

One issue is where the resources for this initiative would come from. There are already long waiting lists for cognitive behavior therapy, generally 18 weeks. Would disseminating MBCT draw therapists away from providing conventional cognitive behavior therapy? Therapists are often drawn to therapies based on their novelty and initial, unsubstantiated promises rather than strength of evidence. And the strength of evidence for MBCT is not such that we could recommend substituting it for CBT for treatment of acute, current major depression.

Another issue is whether most patients would be willing to commit not only to the time for training sessions in MBCT but also to actually practicing it in their everyday lives. Of course, again, we don’t even know from this trial whether actually practicing MBCT matters.

There hasn’t been a fair comparison of MBCT to equivalent time with a depression care manager who would review patients currently receiving antidepressants and advise physicians as to whether and how to taper suitable candidates for discontinuation.

If I were distributing scarce research resources to reduce unnecessary treatment with antidepressants, I would focus on a descriptive, observational study of the clinical status of patients currently receiving antidepressants, the amount of contact time they are receiving with a primary health care professional, and the adequacy of their response in terms of symptom levels, as well as their adherence. Results could establish the usefulness of targeting long-term use of antidepressants, the level of patients’ adherence to taking the medication, and the extent of physicians’ monitoring of symptom levels and adherence. I bet there is a lot of poor-quality maintenance care for depression in the community.

When I was conducting NIMH-funded studies of depression in primary care, I never could get review committees interested in the issue of overtreatment and unnecessarily continued treatment. I recall one reviewer’s snotty comment that these are not pressing public health issues.

That’s too bad, because I think they are key in considering how to distribute scarce resources to study and improve care for depression in the community. Existing evidence suggests that a substantial portion of the cost of treating depression with antidepressants in general medical care is squandered on patients who do not meet guideline criteria for receiving antidepressants or who do not receive adequate monitoring.


Delusional? Trial in Lancet Psychiatry claims brief CBT reduces paranoid delusions

In this issue of Mind the Brain, I demonstrate a quick assessment of the conduct and reporting of a clinical trial. The authors claimed in Lancet Psychiatry a “first ever” in targeting “worries” with brief cognitive therapy as a way of reducing persistent persecutory delusions in psychotic persons. A Guardian article written by the first author claims effects were equivalent to what is obtained with antipsychotic medication. Lancet Psychiatry allowed the authors a sidebar to their article presenting glowing testimonials of 3 patients making extraordinary gains. Oxford University lent its branding* to the first author’s workshop, promoted with a video announcing a status of “evidence-based” for the treatment.

There is much claiming to be new here. Is it a breakthrough in treatment of psychosis and in standards for reporting a clinical trial? Or is what is new not praiseworthy?

I identify the kinds of things that I sought in first evaluating the Lancet Psychiatry article and what additional information needed to be consulted to assess the contribution to the field and relevance to practice.

The article is available open access.

Its publication was coordinated with the first author’s extraordinarily self-promotional article in The Guardian.

The Guardian article makes the claim that

benefits were what scientists call “moderate” – not a magic bullet, but with meaningful effects nonetheless – and are comparable with what’s seen with many anti-psychotic medications.

The advertisement for the workshop is here.

 

The Lancet Psychiatry article also cites the author’s self-help book for lay persons. There was no conflict of interest declared.

Probing the article’s Introduction

Reports of clinical trials should be grounded in a systematic review of the existing literature. This allows readers to place the study in the context of existing research and the unsolved clinical and research problems the literature poses. This background prepares the reader to evaluate the contribution the particular trial can make.

Just by examining the references for the introduction, we can find signs of a very skewed presentation.

The introduction cites 13 articles, 10 of which are written by the author and an eleventh is written by a close associate. The remaining 2 citations are more generic, to a book and an article about causality.

Either the author is at the world center of this kind of research or seriously deficient in his attention to the larger body of evidence. At the outset, the author announces a bold reconceptualization of the role of worry in causing psychotic symptoms:

Worry is an expectation of the worst happening. It consists of repeated negative thoughts about potential adverse outcomes, and is a psychological component of anxiety. Worry brings implausible ideas to mind, keeps them there, and increases the level of distress. Therefore we have postulated that worry is a causal factor in the development and maintenance of persecutory delusions, and have tested this theory in several studies.

This is controversial, to say the least. The everyday experience of worrying is being linked to persecutory delusions. A simple continuum seems to be proposed – people can start off with everyday worrying and end up with a psychotic delusion and twenty years of receiving psychiatric services. Isn’t this too simplistic or just plain wrong?

Has no one but the author done relevant work or even reacted to the author’s work? The citations provided in the introduction suggest the author’s work is all we need in order to interpret this study in the larger context of what is known about psychotic persecutory delusions.

Contrast my assessment with the author’s own:

Panel 2: Research in context
Systematic review: We searched the ISRCTN trial registry and the PubMed database with the search terms “worry”, “delusions”, “persecutory”, “paranoia”, and “schizophrenia”, without date restrictions, for English-language publications of randomised controlled trials investigating the treatment of worry in patients with persecutory delusions. Other than our pilot investigation [12], there were no other such clinical trials in the medical literature. We also examined published meta-analyses on standard cognitive behavioural therapy (CBT) for persistent delusions or hallucinations, or both.

The problem is that “worry” is a nonspecific colloquial term, not a widely used scientific one. For the author to require that studies have “worry” as a keyword in order to be retrieved is a silly restriction.

I welcome readers to redo the PubMed search dropping this term. Next, replace “worry” with “anxiety.” Furthermore, the author makes unsubstantiated assumptions about a causal role for worry/anxiety in the development of delusions. Drop the “randomized controlled trial” restriction from the PubMed search and you find a large relevant literature. Persons with schizophrenia and persecutory delusions are widely acknowledged to be anxious. But you won’t find much suggestion in this literature that the anxiety is causal or that people progress from worrying about something to developing schizophrenia and persecutory delusions. This seems a radical version, gone wild, of the idea that normal and psychotic experiences are on a continuum, concocted with a careful avoidance of contrary evidence.

Critical appraisal of clinical trials often skips examination of whether the background literature cited to justify the study is accurate and balanced. I think this brief foray has demonstrated that it can be important in establishing whether an investigator is claiming false authority for a view with cherry picking and selective attention to the literature.

Basic design of the study

The 150 patients randomized in this study are around 40 years old. Half of the sample has been in psychiatric services for 11 or more years, with 29% of the patients in the intervention group and 19% in the control group receiving services for more than 20 years. The article notes in passing that all patients were prescribed antipsychotic medication at the outset of the study except 1 in the intervention group and 9 in the control group – 1 versus 9? It is puzzling how such differences emerged if randomization was successful in controlling for baseline differences. Maybe it demonstrates the limitations of block randomization.
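Out of curiosity, here is my own back-of-the-envelope check of that imbalance, nothing reported in the article itself. The arm sizes of 75 and 75 are an assumption based on the 150 patients randomized.

```python
# A back-of-the-envelope check (my own, not from the article) of the reported
# baseline imbalance: 1 patient not on antipsychotics in the intervention arm
# versus 9 in the control arm. The 75/75 split is an assumption.
from scipy.stats import fisher_exact

table = [[1, 74],   # intervention: not on antipsychotics, on antipsychotics
         [9, 66]]   # control:      not on antipsychotics, on antipsychotics
odds_ratio, p_value = fisher_exact(table)
print(f"two-sided Fisher exact p = {p_value:.3f}")   # roughly 0.02, i.e. an
                                                     # unlikely result of chance
```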

The intervention is decidedly low intensity for what is presumably a long-standing symptom in a chronically psychotic population.

We aimed to provide the CBT worry-reduction intervention in six sessions over 8 weeks. Each session lasted roughly an hour and took place in NHS clinics or at patients’ homes.

The six sessions were organized around booklets shared by the patient and therapist.

The main techniques were psychoeducation about worry, identification and reviewing of positive and negative beliefs about worry, increasing awareness of the initiation of worry and individual triggers, use of worry periods, planning activity at times of worry (which could include relaxation), and learning to let go of worry.

Patients were expected to practice exercises from the author’s self-help book for lay persons.

The two main practical techniques to reduce worry were then introduced: the use of worry periods (confining worry to about a 20 minute set period each day) and planning of activities at peak worry times. Worry periods were implemented flexibly. For example, most patients set up one worry period a day, but they could choose to have two worry periods a day or, in severe instances, patients instead aimed for a worry-free period. Ideally, the worry period was then substituted with a problem-solving period.

Compared to what?

The treatment of the control group was ill-defined routine care “delivered according to national and local service protocols and guidelines.” Readers are not told how much treatment the patients received or whether their care was actually congruent with these guidelines. Routine care of mental health patients in the community is notoriously deficient. That over half of these patients had been in services for more than a decade suggests that treatment for many of them had tapered off and was being delivered with no expectation of improvement.

To accept this study as an evaluation of the author’s therapy approach, we need to know how much other treatment was received by patients in both the intervention and control groups. Were patients in the routine care condition, as I suspect, largely being ignored? The intervention group got 6 sessions of therapy over 8 weeks. Is that a substantial increase in psychotherapy, or even in time to talk with a professional, over what they would otherwise receive? Did being assigned to the intervention also increase patients’ other contact with mental health services? If the intervention therapists heard that a patient was having problems with medication or serious unmet medical needs, how did they respond?

The authors report collecting data concerning receipt of services with the Client Service Receipt Inventory, but nowhere are those data reported.

Most basically, we don’t know what elements the comparison/control group controlled. We have no reason to presume that the amount of contact time and basic relationship with a treatment provider was controlled.

As I have argued before, it is inappropriate and arguably unethical to use ill-defined routine care or treatment-as-usual in the evaluation of a psychological intervention. We cannot tell if any apparent benefits to patients assigned to the intervention are due to correcting the inadequacies of routine care, including its missing basic elements of support, attention, and encouragement. We therefore cannot tell if there are effective elements to the intervention other than these nonspecific factors.

We cannot tell whether any positive results of this trial encourage dissemination and implementation, or only point to improving likely deficiencies in the treatment received by patients in long-term psychiatric care.

In terms of quickly evaluating articles reporting clinical trials, we see that simply asking “compared to what” and jumping to the comparison/control condition exposed a lot of limits at the outset on what this trial could reveal.

Measuring outcomes

Two primary outcomes were declared – changes in the Penn State Worry Questionnaire and the Psychotic Symptoms Rating Scale-Delusion (PSYRATS-delusion) subscale. The authors use multivariate statistical techniques to determine whether patients assigned to the intervention group improved more on either of these measures, and whether reduction in worry specifically caused reductions in persecutory delusions.

Understand what is at stake here: the authors are trying to convince us that this is a groundbreaking study that shows that reducing worry with a brief intervention reduces long standing persecutory delusions.

The authors lose substantial credibility if we look closely at their primary measures, including their items, not just the scale names.

The Penn State Worry Questionnaire (PSWQ) is a 16-item questionnaire widely used with college student, community, and clinical samples. Items include

When I am under pressure I worry a lot.

I am always worrying about something.

And reverse direction items scored so greater endorsement indicates less worrying –

I do not tend to worry about things.

I never worry about anything.

I know, how many times does basically the same question have to be asked?

The questionnaire is meant to be general. It focuses on a single complaint that could be a symptom of anxiety. While the questionnaire could be used to screen for anxiety disorders, it does not provide a diagnosis of a mental disorder, which requires other symptoms to be present. Actually, worry is only one of three components of anxiety. The others are physiological – like racing heart, sweating, or trembling – and behavioral – like avoidance or procrastination.

But “worry” is also a feature of depressed mood. Another literature discusses “worry” as “rumination.” We should not be surprised to find this questionnaire functions reasonably well as a screen for depression.

But past research has shown that even in nonclinical populations, using a cutpoint to designate high versus low worriers results in unstable classification. Without formal intervention, many of those who are “high” become “low” over time.

In order to be included in this study, patients had to have a minimum score of 44 on the PSWQ. If we skip to the results of the study, we find that the patients in the intervention group dropped from 64.8 to 56.1 and those receiving only routine care dropped from 64.5 to 59.8. The average patient in either group would have still qualified for inclusion in the study at the end of follow-up.

The second outcome measure, the Psychotic Symptoms Rating Scale-Delusion subscale, has six items: duration and frequency of preoccupation; intensity of distress; amount of distressing content; conviction; and disruption. Each item is scored 0-4, with 0 = no problem and 4 = maximum severity.

The items are so diverse that interpretation of a change in the context of an intervention trial targeting worry becomes difficult. Technically speaking, the lack of comparability among items is so great that the measure cannot be considered an interval scale for which conventional parametric statistics could be used. We cannot reasonably assume that a change in one item is equivalent to a change in another.

It would seem, for instance, that amount of preoccupation with delusions, amount and intensity of distress, and conviction that the delusions are true are very different matters. The intervention group changed from a mean of 18.7 on a scale with a possible score of 24 to 13.6 at 24 weeks; the control group from 18.0 to 16.4. This change could simply represent a reduction in the amount and intensity of distress, not in patients’ preoccupation with the delusions, their conviction that the delusions are true, or the disruption in their lives. Overall, the PSYRATS-delusion subscale is not a satisfactory measure on which to make strong claims about reducing worry reducing delusions. The measure is too contaminated with content similar to the worry questionnaire. We might only be finding that “changes in worries result in changes in worries.”

Checking primary outcomes is important in evaluating a clinical trial, but in this case, it was crucial to examine what the measures assessed at an item content level. Too often reviewers uncritically accept the name of an instrument as indicating what it validly measures when used as an outcome measure.

The fancy multivariate analyses do not advance our understanding of what went on in the study. The complex statistical analyses might simply be demonstrating patients were less worried as seen in questionnaires and interview ratings based on what patients say when asked whether they are distressed.

My summary assessment is that a low intensity intervention is being evaluated against an ill-defined treatment as usual. The outcome measures are too nonspecific and overlapping to be helpful. We may simply be seeing effects of contact and reassurance among patients who are not getting much of either. So what?

Bring on the patient endorsements

Panel 1: Patient comments on the intervention presents glowing endorsements from 3 of the 73 patients assigned to the intervention group. The first patient describes the treatment as “extremely helpful” and as providing a “breakthrough.” The second patient describes starting treatment lost and without self-confidence but now being relaxed at times of the day that had previously been stressful. The third patient declared

“The therapy was very rewarding. There wasn’t anything I didn’t like. I needed that kind of therapy at the time because if I didn’t have that therapy at that time, I wouldn’t be here.”

Wow, but these dramatic gains seem inconsistent with the modest gains registered on the quantitative primary outcome measures. We are left guessing how these endorsements were elicited – were they obtained in a context where patients were expected to express gratitude for the extra attention they received? – and the criteria by which the particular quotes were selected from what is presumably a larger pool.

Think of the outcry if Lancet Psychiatry extended this innovation in the reporting of clinical trials to evaluations of medications by their developers. If such side panels are going to be retained in future reporting of clinical trials, maybe it would be best that they be marked “advertisement” and accompanied by a declaration of conflict of interest.

A missed opportunity to put the authors’ intervention to a fair test

In the Discussion section the authors state

although we think it highly unlikely that befriending or supportive counselling [sic] would have such persistent effects on worry and delusions, this possibility will have to be tested specifically in this group.

Actually, the authors don’t have much evidence of anything but a weak effect that might well have been achieved with befriending or supportive counseling delivered by persons with less training. We should be careful about accepting claims of any clinically significant effects on delusions. At best, the authors have evidence that distress associated with delusions was reduced, and any correlation in scores between the two measures may simply reflect confounding of the two outcome measures.

It is a waste of scarce research funds, and an unethical waste of patients’ willingness to contribute to science, to compare this low intensity psychotherapy to ill-described, unquantified treatment as usual. Another low intensity treatment like befriending or supportive counseling might provide sufficient elements of attention, support, and raised expectations to achieve comparable results.

Acknowledging the Supporting Cast

In evaluating reports of clinical trials, it is often informative to look to footnotes and acknowledgments, as well as the main text. This article acknowledges Anthony Morrison as a member of the Trial Steering Committee and Douglas Turkington as a member of the Data Monitoring and Ethics Committee. Readers of Mind the Brain might recognize Morrison as first author of a Lancet trial that I critiqued for exaggerated claims and Turkington as the first author of a trial that became an internet sensation when post-publication reviewers pointed out fundamental problems in the reporting of data. Turkington and an editor of the journal in which the report of the trial was published counterattacked.

All three of these trials involve exaggerated claims based on a comparison between CBT and ill-defined routine care. Like the present one, Morrison’s trial failed to report the data it collected concerning receipt of services. And in an interview with The Lancet, Morrison admitted to avoiding a comparison between CBT and anything but routine care out of concern that differences might not be found with any treatment providing a supportive relationship, even basic supportive counseling.

A note to funders

This project (09/160/06) was awarded by the Efficacy and Mechanism Evaluation (EME) Programme, and is funded by the UK Medical Research Council (MRC) and managed by the UK NHS National Institute for Health Research (NIHR) on behalf of the MRC-NIHR partnership.

Really, UK MRC, you are squandering scarce funds on methodologically poor, often small trials for which investigators make extravagant claims and that don’t include a comparison group allowing control for nonspecific effects. You really ought to insist on better attention to the existing literature in justifying another trial and adequate controls for amount of contact time, attention and support.

Don’t you see the strong influence of investigator allegiance dictating reporting of results consistent with the advancement of the investigators’ product?

I don’t understand why you allowed the investigator group to justify the study with such idiosyncratic, highly selective review of the literature driven by substituting a colloquial term “worry” for more commonly used search terms.

Do you have independent review of grants by persons who are more accepting of the usual conventions of conducting and reporting trials? Or are you faced with the problems of a small group of reviewers giving out money to like-minded friends and family? Note that the German Federal Ministry of Education and Research (BMBF) has effectively dealt with inbred old boy networks by excluding Germans from the panels of experts reviewing German grants. Might you consider the same strategy in getting more serious about funding projects with some potential for improving patient care? Get with it: insist on rigor and reproducibility in what you fund.

*We should not make too much of Oxford lending its branding to this workshop. Look at the workshops to which Harvard Medical School lends its labels.


What the pot and pain pill overdose study teaches us about ecological fallacies

I am delighted to offer Mind the Brain readers a guest blog written by Keith Humphreys, Ph.D., John Finney, Ph.D., Alex Sox-Harris, Ph.D., and Daniel Kivlahan, Ph.D. Drs. Humphreys, Sox-Harris, and Finney are at the Palo Alto VA and Stanford University. Dr. Kivlahan is at the Seattle VA and the University of Washington.

Follow Professor Humphreys on Twitter @KeithNHumphreys.

 

Image Credit: Bogdan, Wikimedia Commons

A team of scientists recently reported that states with laws permitting medical marijuana had lower rates of opioid overdose than states without such laws. In a New York Times essay, two members of the team suggested this state-level association between medical marijuana access and deaths reflects the behavior of individuals in pain:

 

If enough people opt to treat pain with medical marijuana instead of prescription painkillers in states where this is legal, it stands to reason that states with medical marijuana laws might experience an overall decrease in opioid painkiller overdoses and deaths.

 

At first blush, saying it “stands to reason” seems, well, reasonable. But in the current issue of the journal in which the study appeared, we point out that the assumption that associations based on aggregations of people (e.g., counties, cities and states) must reflect parallel relationships for individuals is a seductive logical error known as the “ecological fallacy.”

 

Once you understand the ecological fallacy, you will recognize it in many interpretations of and media reports about science. Here are some examples that have been reported over the years:

 

 

Such differences are counter-intuitive and therefore a bit baffling. If individuals having heart attacks who receive high quality care are far more likely to survive, doesn’t it follow that hospitals that provide higher quality care to larger percentages of their heart attack patients would have substantially lower mortality rates? (Answer: No, their results are barely better). Why don’t patterns we see in the aggregate always replicate themselves with individuals, and vice versa?

 

The mathematical basis for the ecological fallacy has multiple and complex aspects (our detailed explanation here), but most people find it easiest to understand when presented with a simple example. Imagine two states with 100 people each residing in them, with each state population including a comparable proportion of people in pain. Potsylvania has a loosely regulated medical marijuana system that 25% of residents access. Alabstentia, in contrast, limits access to medical marijuana so only 15% of residents can obtain it.

 

Potsylvania

                          Medical marijuana user   Medical marijuana non-user   Totals
Died of opioid overdose                        2                             3        5
Did not die of overdose                       23                            72       95
Totals                                        25                            75      100

Alabstentia

                          Medical marijuana user   Medical marijuana non-user   Totals
Died of opioid overdose                        4                             6       10
Did not die of overdose                       11                            79       90
Totals                                        15                            85      100

 

Ganja-loving Potsylvania has a lower opioid overdose death rate (5%) than more temperate Alabstentia (10%). Does this prove that individuals in those states who use medical marijuana lower their risk of opioid overdose death? Nope. In both states, medical marijuana users are more likely to die of a pain medication overdose than are non-users: 2 of 25 (8%) marijuana users dying versus 3 of 75 (4%) non-users in Potsylvania; 4 of 15 (26.7%) marijuana users dying versus 6 of 85 (7.1%) non-users in Alabstentia!
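The whole point can be reproduced in a few lines of arithmetic. Here is a small sketch using only the hypothetical counts from the tables above: the state-level rates point one way, and the individual-level rates within each state point the other way.

```python
# Hypothetical counts from the two-state example above. Potsylvania has the
# lower state-level overdose rate, yet within BOTH states marijuana users die
# of opioid overdose at a higher rate than non-users.
states = {
    "Potsylvania": {"user": (2, 25), "non-user": (3, 75)},   # (deaths, group size)
    "Alabstentia": {"user": (4, 15), "non-user": (6, 85)},
}

for name, groups in states.items():
    deaths = sum(d for d, _ in groups.values())
    total = sum(n for _, n in groups.values())
    print(f"{name}: state-level rate = {deaths}/{total} = {deaths / total:.0%}")
    for label, (d, n) in groups.items():
        print(f"  {label}: {d}/{n} = {d / n:.1%}")
```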

 

Embracing the ecological fallacy is tempting, even to very bright people, but it must be resisted if we want to better understand the world around us. So, the next time you see a study saying, for example, that politically conservative states have higher rates of searching for sex and pornography online, and you want to immediately speculate about why conservative individuals are so hypocritical, pause and remember that what applies at the aggregate level does not necessarily apply to individuals. For all we know, alienated liberals in red states may just be feeling lonely and frustrated.


Busting foes of post-publication peer review of a psychotherapy study

As described in the last issue of Mind the Brain, peaceful post-publication peer reviewers (PPPRs) were ambushed by an author and an editor. They used the usual home team advantages that journals have – they had the last word in an exchange that was not peer-reviewed.

As also promised, I will team up in this issue with Magneto to bust them.

Attacks on PPPRs threaten a desperately needed effort to clean up the integrity of the published literature.

The attacks are getting more common and sometimes vicious. Vague threats of legal action caused an open access journal to remove an article delivering fair and balanced criticism.

In a later issue of Mind the Brain, I will describe an incident in which authors of a published paper had uploaded their data set, but then modified it without notice after PPPRs used the data for re-analyses. The authors then used the modified data for new analyses and claimed the PPPRs were grossly mistaken. Fortunately, the PPPRs retained time-stamped copies of both data sets. You may like to think that such precautions are unnecessary, but just imagine what critics of PPPR would be saying if they had not saved this evidence.

Until journals get more supportive of post publication peer review, we need repeated vigilante actions, striking from Twitter, Facebook pages, and blogs. Unless readers acquire basic critical appraisal skills and take the time to apply them, they will have to keep turning to the social media for credible filters of all the crap that is flooding the scientific literature.

I’ve enlisted Magneto because he is a mutant. He does not have any extraordinary powers of critical appraisal. To the contrary, he unflinchingly applies skills we should all acquire. As a mutant, he can apply his critical appraisal skills without the mental anguish and physiological damage that could beset humans appreciating just how bad the literature really is. He doesn’t need to maintain his faith in the scientific literature or the dubious assumption that what he is seeing is just a matter of repeat-offender authors, editors, and journals making innocent mistakes.

Humans with critical appraisal skills risk demoralization and too often shrink from the task of telling it like it is. Some who used their skills too often were devastated by what they found and fled academia. More than a few are now working in California in espresso bars and escort services.

Thank you, Magneto. And yes, I again apologize for having tipped off Jim Coan about our analyses of his spinning and statistical manipulations of his work to get newsworthy findings. Sure, it was an accomplishment to get a published apology and correction from him and Susan Johnson. I am so proud of Coan’s subsequent condemnation of me on Facebook as the Deepak Chopra of Skepticism that I will display it as an endorsement on my webpage. But it was unfortunate that PPPRs had to endure his nonsensical Negative Psychology rant, especially without readers knowing what precipitated it.

The following commentary on the exchange in Journal of Nervous and Mental Disease makes direct use of your critique. I have interspersed gratuitous insults generated by Literary Genius’ Shakespearean insult generator and Reocities’ Random Insult Generator.

How could I maintain the pretense of scholarly discourse when I am dealing with an author who repeatedly violates basic conventions like ensuring tables and figures correspond to what is claimed in the abstract? Or an arrogant editor who responds so nastily when his slipups are gently brought to his attention and won’t fix the mess he is presenting to his readership?

As a mere human, I needed all the help I could get in keeping my bearings amidst such overwhelming evidence of authorial and editorial ineptness. A little Shakespeare and Monty Python helped.

The statistical editor for this journal is a saucy full-gorged apple-john.

 

Cognitive Behavioral Techniques for Psychosis: A Biostatistician’s Perspective

Domenic V. Cicchetti, PhD, quintessential biostatistician

Domenic V. Cicchetti, you may be, as your website claims,

 A psychological methodologist and research collaborator who has made numerous biostatistical contributions to the development of major clinical instruments in behavioral science and medicine, as well as the application of state-of-the-art techniques for assessing their psychometric properties.

But you must have been out of “the quintessential role of the research biostatistician” when you drafted your editorial. Please reread it. Anyone armed with an undergraduate education in psychology and Google Scholar can readily cut through your ridiculous pomposity, you undisciplined sliver of wild belly-button fluff.

You make it sound like the internet PPPRs misunderstood Jacob Cohen’s designation of effect sizes as small, medium, and large. But if you read a much-accessed article that one of them wrote, you will find a clear exposition of the problems with these arbitrary distinctions. I know, it is in an open access journal, but what you say about it paying reviewers is sheer bollocks. Do you get paid by the Journal of Nervous and Mental Disease? Why otherwise would you be a statistical editor for a journal with such low standards? Surely someone who has made “numerous biostatistical contributions” has better things to do, thou dissembling swag-bellied pignut.

More importantly, you ignore that Jacob Cohen himself said

The terms ‘small’, ‘medium’, and ‘large’ are relative . . . to each other . . . the definitions are arbitrary . . . these proposed conventions were set forth throughout with much diffidence, qualifications, and invitations not to employ them if possible.

Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988. p. 532.

Could it be any clearer, Dommie?


You suggest that the internet PPPRs were disrespectful of Queen Mother Kraemer in not citing her work. Have you recently read it? Ask her yourself, but she seems quite upset about the practice of using effects generated from feasibility studies to estimate what would be obtained in an adequately powered randomized trial.

Pilot studies cannot estimate the effect size with sufficient accuracy to serve as a basis of decision making as to whether a subsequent study should or should not be funded or as a basis of power computation for that study.

Okay you missed that, but how about:

A pilot study can be used to evaluate the feasibility of recruitment, randomization, retention, assessment procedures, new methods, and implementation of the novel intervention. A pilot study is not a hypothesis testing study. Safety, efficacy and effectiveness are not evaluated in a pilot. Contrary to tradition, a pilot study does not provide a meaningful effect size estimate for planning subsequent studies due to the imprecision inherent in data from small samples. Feasibility results do not necessarily generalize beyond the inclusion and exclusion criteria of the pilot design.

A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis testing study. Investigators should be forthright in stating these objectives of a pilot study.
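The imprecision Kraemer is describing is easy to put in numbers. Here is a rough sketch of the approximate 95% confidence interval around a Cohen's d estimated from a small pilot; the observed d and the arm sizes are purely illustrative assumptions, not data from any of the trials discussed here.

```python
# A rough sketch, assuming an illustrative pilot result, of how imprecise an
# effect size from a small feasibility study is.
from math import sqrt
from scipy.stats import norm

def cohens_d_ci(d, n1, n2, alpha=0.05):
    """Approximate CI for Cohen's d using the usual large-sample standard error."""
    se = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = norm.ppf(1 - alpha / 2)
    return d - z * se, d + z * se

# A "medium" d = 0.5 observed in a pilot with 20 patients per arm:
lo, hi = cohens_d_ci(0.5, 20, 20)
print(f"95% CI roughly ({lo:.2f}, {hi:.2f})")   # about (-0.13, 1.13): anywhere
                                                # from no effect to a very large one
```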

Dommie, although you never mention it, surely you must appreciate the difference between a within-group effect size and a between-group effect size.

  1. Interventions do not have meaningful effect sizes, between-group comparisons do.
  2. As I have previously pointed out

 When you calculate a conventional between-group effect size, it takes advantage of randomization and controls for background factors, like placebo or nonspecific effects. So, you focus on what change went on in a particular therapy, relative to what occurred in patients who didn’t receive it.
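In numbers, the distinction looks like this. This is a made-up illustration, not data from any trial discussed here.

```python
# A made-up illustration of why a within-group change overstates what a
# between-group comparison would show: the control arm improves too, and only
# the difference between arms speaks to the therapy itself.
sd = 10.0               # assumed common SD of the symptom scale at baseline
therapy_change = 8.0    # pre-to-post improvement in the therapy arm
control_change = 5.0    # pre-to-post improvement in the control arm

within_group_d = therapy_change / sd                      # 0.80, looks "large"
between_group_d = (therapy_change - control_change) / sd  # 0.30, far more modest
print(within_group_d, between_group_d)
```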

Turkington recruited a small convenience sample of older patients from community care who averaged over 20 years of treatment. It is likely that they were not getting much support and attention anymore, whether or not they ever were. The intervention in Turkington’s study provided that attention. Maybe some or all of any effects were due to simply compensating for what was missing from inadequate routine care. So, aside from all the other problems, anything going on in Turkington’s study could have been nonspecific.

Recall that in promoting his ideas that antidepressants are no better than acupuncture for depression, Irving Kirsch tried to pass off within-group effect sizes as equivalent to between-group effect sizes, despite repeated criticisms. Similarly, long-term psychodynamic psychotherapists tried to use effect sizes from wretched case series for comparison with those obtained in well-conducted studies of other psychotherapies. Perhaps you should send such folks a call for papers so that they can find an outlet in the Journal of Nervous and Mental Disease with you as a Special Editor in your quintessential role as biostatistician.

Douglas Turkington’s call for a debate

Professor Douglas Turkington: “The effect size that got away was this big.”

Doug, as you requested, I sent you a link to my Google Scholar list of publications. But you still did not respond to my offer to come to Newcastle and debate you. Maybe you were not impressed. Nor did you respond to Keith Laws’ repeated requests to debate. Yet you insulted internet PPPR Tim Smits with the taunt,


 

You congealed accumulation of fresh cooking fat.

I recommend that you review the recording of the Maudsley debate. Note how the moderator Sir Robin Murray boldly announced at the beginning that the vote on the debate was rigged by your cronies.

Do you really think Laws and McKenna got their asses whipped? Then why didn’t you accept Laws’ offer to debate you at a British Psychological Society event, after he offered to pay your travel expenses?

High-Yield Cognitive Behavioral Techniques for Psychosis Delivered by Case Managers…

Dougie, we were alerted that bollocks would follow with the “high yield” of the title. Just what distinguishes this CBT approach from any other intervention to justify “high yield,” except your marketing effort? Certainly not the results you obtained from an earlier trial, which we will get to.

Where do I begin? Can you dispute what I said to Dommie about the folly of estimating effect sizes for an adequately powered randomized trial from a pathetically small feasibility study?

I know you were looking for a convenience sample, but how did you get from Newcastle, England to rural Ohio and recruit such an unrepresentative sample of 40-year-olds with 20 years of experience with mental health services? You don’t tell us much about them, not even a breakdown of their diagnoses. But would you really expect that the routine care they were currently receiving was even adequate? Sure, why wouldn’t you expect to improve upon that with your nurses? But what would you be demonstrating?


 

The PPPR boys from the internet made noise about Table 2, made passing reference to the totally nude Figure 5, and noted how claims in the abstract had no apparent relationship to what was presented in the results section. And how nowhere did you provide means or standard deviations. But they did not get to Figure 2. Notice anything strange?

Despite what you claim in the abstract, none of the outcomes appear significant. Did you really mean standard errors of the mean (SEMs), not standard deviations (SDs)? People to whom I showed the figure did not think so.


 

And I found this advice on the internet:

If you want to create persuasive propaganda:

If your goal is to emphasize small and unimportant differences in your data, show your error bars as SEM, and hope that your readers think they are SD.

If your goal is to cover up large differences, show the error bars as the standard deviations for the groups, and hope that your readers think they are standard errors.
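For anyone unsure why the labelling matters so much, here is a quick sketch with made-up ratings: the two quantities describe the same data, but SEM bars shrink by a factor of the square root of the sample size.

```python
# SD describes the spread of individual scores; SEM (= SD / sqrt(n)) describes
# the precision of the group mean. For the same hypothetical data, SEM error
# bars come out roughly a third the size of SD bars, so mislabelling them
# changes how convincing a figure looks.
from statistics import mean, stdev
from math import sqrt

scores = [52, 61, 47, 58, 63, 55, 49, 60, 57, 54]   # hypothetical ratings, n = 10
sd = stdev(scores)
sem = sd / sqrt(len(scores))
print(f"mean = {mean(scores):.1f}, SD = {sd:.1f}, SEM = {sem:.1f}")
```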

Why did you expect to be able to talk about effect sizes of the kind you claim you were seeking? The best meta-analysis suggests an effect size of only .17 with blind assessment of outcome. Did you expect that unblinding assessors would lead to that much more improvement? Oh yeah, you cited your own previous work in support:

That intervention improved overall symptoms, insight, and depression and had a significant benefit on negative symptoms at follow-up (Turkington et al., 2006).

Let’s look at Table 1 from Turkington et al., 2006.

A consistent spinning of results


Don’t you just love those three-digit significance levels that allow us to see that p = .099 for overall symptoms meets the apparent criterion of p < .10 in this large sample? Clever, but it doesn’t work for depression with p = .128. But you have a track record of being sloppy with tables. Maybe we should give you the benefit of the doubt and ignore the table.

But Dougie, this is not some social priming experiment with college students getting course credit. This is a study that took up the time of patients with serious mental disorder. You left some of them in the squalor of inadequate routine care after gaining their consent with the prospect that they might get more attention from nurses. And then with great carelessness, you put the data into tables that had no relationship to the claims you were making in the abstract. Or in your attempts to get more funding for future such ineptitude. If you drove your car like you write up clinical trials, you’d lose your license, if not go to jail.


 

 

The 2014 Lancet study of cognitive therapy for patients with psychosis

Forgive me that I missed until Magneto reminded me that you were an author on the, ah, controversial paper

Morrison, A. P., Turkington, D., Pyle, M., Spencer, H., Brabban, A., Dunn, G., … & Hutton, P. (2014). Cognitive therapy for people with schizophrenia spectrum disorders not taking antipsychotic drugs: a single-blind randomised controlled trial. The Lancet, 383(9926), 1395-1403.

But with more authors than patients remaining in the intervention group at follow-up, it is easy to lose track.

You and your co-authors made some wildly inaccurate claims about having shown that cognitive therapy was as effective as antipsychotics. Why, by the end of the trial, most of the patients remaining in follow-up were on antipsychotic medication. Is that how you obtained your effectiveness?

In our exchange of letters in The Lancet, you finally had to admit

We claimed the trial showed that cognitive therapy was safe and acceptable, not safe and effective.

Maybe you should similarly be retreating from your claims in the Journal of Nervous and Mental Disease article? Or just take refuge in the figures and tables being uninterpretable.

No wonder you don’t want to debate Keith Laws or me.


A retraction for High-Yield Cognitive Behavioral Techniques for Psychosis…?

The Turkington article meets the Committee on Publication Ethics (COPE) guidelines for an immediate retraction (http://publicationethics.org/files/retraction%20guidelines.pdf).

But neither a retraction nor even a formal expression of concern has appeared.

Maybe matters can be left as they now are. On social media, we can point to the many problems of the article, like an out-of-order sign on a clogged toilet, warning that the Journal of Nervous and Mental Disease is not a fit place to publish – unless you are seeking exceedingly inept or nonexistent editing and peer review.

 

 

 

Vigilantes can periodically tweet Tripadvisor-style warnings, like

toilets still not working

 

 

Now, Dommie and Dougie, before you again set upon some PPPRs just trying to do their jobs for little respect or incentive, consider what happened this time.

Special thanks are due to Magneto, but Jim Coyne has sole responsibility for the final content. It does not necessarily represent the views of PLOS blogs or other individuals or entities, human or mutant.


Sordid tale of a study of cognitive behavioral therapy for schizophrenia gone bad

What motivates someone to publish that paper without checking it? Laziness? Naivety? Greed? Now that’s one to ponder. – Neuroskeptic, Science needs vigilantes.

We need to

  • Make the world safe for post-publication peer review (PPPR) commentary.
  • Ensure appropriate rewards for those who do it.
  • Take action against those who try to make life unpleasant for those who toil hard for a scientific literature that is more trustworthy.

In this issue of Mind the Brain, I set the stage for my teaming up with Magneto to bring some bullies to justice.

The background tale of a modest study of cognitive behavior therapy (CBT) for patients with schizophrenia has been told in bits and pieces elsewhere.

The story at first looked like it was heading for a positive outcome more worthy of a blog post than the shortcomings of a study in an obscure journal. The tale would go:

A group organized on the internet called attention to serious flaws in the reporting of a study. We then witnessed the self-correcting of science in action.

If only this story were complete and accurately described scientific publishing today…

Daniel Lakens’ blog post, How a Twitter HIBAR [Had I Been A Reviewer] ends up as a published letter to the editor, recounts the story, beginning with expressions of puzzlement and skepticism on Twitter.

Gross errors were made in a table and a figure. These were bad enough in themselves, but they seemed to point to the reported results not supporting the claims made in the article.

A Swedish lecturer blogged Through the looking glass into an oddly analyzed clinical paper.

Some of those involved in the Twitter exchange banded together in writing a letter to the editor.

Smits, T., Lakens, D., Ritchie, S. J., & Laws, K. R. (2014). Statistical errors and omissions in a trial of cognitive behavior techniques for psychosis: commentary on Turkington et al. The Journal of Nervous and Mental Disease, 202(7), 566.

Lakens explained in his blog

Now I understand that getting criticism on your work is never fun. In my personal experience, it very often takes a dinner conversation with my wife before I’m convinced that if people took the effort to criticize my work, there must be something that can be improved. What I like about this commentary is that it shows how Twitter is making post-publication reviews possible. It’s easy to get in contact with other researchers to discuss any concerns you might have (as Keith did in his first Tweet). Note that I have never met any of my co-authors in real life, demonstrating how Twitter can greatly extend your network and allows you to meet interesting and smart people who share your interests. Twitter provides a first test bed for your criticisms to see if they hold up (or if the problem lies in your own interpretation), and if a criticism is widely shared, can make it fun to actually take the effort to do something about a paper that contains errors.

Furthermore,

It might be slightly weird that Tim, Stuart, and myself publish a comment in the Journal of Nervous and Mental Disease, a journal I guess none of us has ever read before. It also shows how Twitter extends the boundaries between scientific disciplines. This can bring new insights about reporting standards  from one discipline to the next. Perhaps our comment has made researchers, reviewers, and editors who do research on cognitive behavioral therapy aware of the need to make sure they raise the bar on how they report statistics (if only so pesky researchers on Twitter leave you alone!). I think this would be great, and I can’t wait until researchers from another discipline point out statistical errors in my own articles that I and my closer peers did not recognize, because anything that improves the way we do science (such as Twitter!) is a good thing.

Hindsight: If the internet group had been the original reviewers of the article…

The letter was low-key and calmly pointed out obvious errors. You can see it here. Tim Smits’ blog Don’t get all psychotic on this paper: Had I (or we) Been A Reviewer (HIBAR) describes what had to be left out to keep within the word limit.

Table 2 had lots of problems –

  • The confidence intervals were suspiciously wide.
  • The effect sizes seemed too large for what the modest sample size should yield (see the sketch after this list).
  • The table was inconsistent with information in the abstract.
  • Neither the table nor the accompanying text included any test of significance or reported means and standard deviations.
  • Confidence intervals for two different outcomes were identical, yet one had the same value for its effect size as its lower bound.
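A rough plausibility check makes the first two points concrete. The width of a confidence interval for Cohen’s d is largely pinned down by the group sizes, so reported intervals and effect sizes can be checked against each other. The sketch below is in Python and uses hypothetical arm sizes of 30, not figures taken from the paper:

```python
import math

def ci_for_cohens_d(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d, using the common large-sample
    standard error: sqrt((n1 + n2)/(n1 * n2) + d**2 / (2 * (n1 + n2)))."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

# Hypothetical arm sizes in the range of a small feasibility trial
for d in (0.3, 0.8, 1.2):
    lo, hi = ci_for_cohens_d(d, n1=30, n2=30)
    print(f"d = {d:.1f}: 95% CI ({lo:.2f}, {hi:.2f}), width {hi - lo:.2f}")
```

With about 30 completers per arm, a 95% interval runs roughly d ± 0.5 whatever the point estimate, so intervals much wider than that, identical intervals for different outcomes, or an effect size sitting on its own lower bound are exactly the internal inconsistencies the letter flagged.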

Figure 5

Figure 5 was missing labels and definitions on both axes, rendering it uninterpretable. Duh?

The authors of the letter were behaving like a blue-helmeted international peacekeeping force, not warriors attacking bad science.

But you don’t send peacekeeping troops into an active war zone.

In making recommendations, the Internet group did politely introduce the R word:

We believe the above concerns mandate either an extensive correction, or perhaps a retraction, of the article by Turkington et al. (2014). At the very least, the authors should reanalyze their data and report the findings in a transparent and accurate manner.

Fair enough, but I doubt the authors of the letter appreciated how upsetting this reasonable advice was or anticipated the reaction that was coming.

A response from an author of the article and a late-night challenge to debate

The first author of the article published a reply

Turkington, D. (2014). The reporting of confidence intervals in exploratory clinical trials and professional insecurity: a response to Ritchie et al. The Journal of Nervous and Mental Disease, 202(7), 567.

He seemed to claim to have re-examined the study data and found that:

  • The findings were accurately reported.
  • A table of means and standard deviations was unnecessary because of the comprehensive reporting of confidence intervals and p-values in the article.
  • The missing details from the figure were self-evident.

The group that had assembled on the internet was not satisfied. An email exchange with Turkington and the editor of the journal confirmed that Turkington had not actually re-examined the raw data file, only a summary of the statistical tables.

The group requested the raw data. In a subsequent letter to the editor, they would describe Turkington as providing the data in a timely fashion, but the exchange between them was anything but cordial. Turkington at first balked, saying that the data were not readily available because the statistician had retired. He nonetheless eventually provided the data, but not before sending off a snotty email –

[Turkington’s email screenshot]

Tim Smits declined:

Dear Douglas,

Thanks for providing the available data as quick as possible. Based on this and the tables in the article, we will try to reconstruct the analysis and evaluate our concerns with it.

With regard to your recent invitation to “slaughter” me at Newcastle University, I politely want to decline that invitation. I did not have any personal issue in mind when initiating the comment on your article, so a personal attack is the least of my priorities. It is just from a scientific perspective (but an outsider to the research topic) that I was very confused/astonished about the lack of reporting precision and what appears to be statistical errors. So, if our re-analysis confirms that first perception, then I am of course willing to accept your invitation at Newcastle university to elaborate on proper methodology in intervention studies, since science ranks among the highest of my priorities.

Best regards,

Tim Smits

When I later learned of this email exchange, I wrote to Turkington and offered to go to Newcastle to debate, either as Tim Smits’ second or on my own. Turkington asked me to submit my CV to show that I wasn’t a crank. I complied, but he has yet to accept my offer.

A reanalysis of the data and a new table

Smits, T., Lakens, D., Ritchie, S. J., & Laws, K. R. (2015). Correcting Errors in Turkington et al. (2014): Taking Criticism Seriously. The Journal of Nervous and Mental Disease, 203(4), 302-303.

The group reanalyzed the data and the title of their report leaked some frustration.

We confirmed that all the errors identified by Smits et al. (2014) were indeed errors. In addition, we observed that the reported effect sizes in Turkington et al. (2014) were incorrect by a considerable margin. To correct these errors, Table 2 and all the figures in Turkington et al. (2014) need to be changed.

The sentence in the Abstract where effect sizes are specified needs to be rewritten.

A revised table based on their reanalyses was included:

[Revised table from the reanalysis]

Given that the recommendation in their first letter had apparently been dismissed, the group wrote –

To conclude, our recommendation for the Journal and the authors would now be to acknowledge that there are clear errors in the original Turkington et al. (2014) article and either accept our corrections or publish their own corrigendum. Moreover, we urge authors, editors, and reviewers to be rigorous in their research and reviewing, while at the same time being eager to reflect on and scrutinize their own research when colleagues point out potential errors. It is clear that the authors and editors should have taken more care when checking the validity of our criticisms. The fact that a rejoinder with the title “A Response to Ritchie et al. [sic]” was accepted for publication in reply to a letter by Smits et al. (2014) gives the impression that our commentary did not receive the attention it deserved. If we want science to be self-correcting, it is important that we follow ethical guidelines when substantial errors in the published literature are identified.

Sound and fury signifying nothing

Publication of their letter was accompanied by a blustery commentary from the journal’s statistical editor, full of innuendo and pomposity.

“A harmless hilarity and a buoyant cheerfulness are not infrequent concomitants of genius…” – Charles Caleb Colton

Cicchetti, D. V. (2015). Cognitive Behavioral Techniques for Psychosis: A Biostatistician’s Perspective. The Journal of Nervous and Mental Disease, 203(4), 304-305.

He suggested that the team assembled on the internet

reanalyzed the data of Turkington et al. on the basis that it contained some serious errors that needed to be corrected. They also reported that the statistic that Turkington et al. had used to assess effect sizes (ESs) was an inappropriate metric.

Well, did Turkington’s table contain errors, and was the metric inappropriate? If so, was a formal correction or even a retraction needed? Cicchetti reproduced the internet group’s table, but did not immediately offer his opinion. So the uncorrected article stands as published. Interested persons downloading it from behind the journal’s paywall won’t be alerted to the controversy.

Instead of dealing with the issues at hand, Cicchetti launched into an irrelevant lecture about Jacob Cohen’s arbitrary designation of effect sizes as small, medium, or large. Anything he said had already been said more clearly and accurately in an article by Daniel Lakens, one of the internet group’s authors. Cicchetti cited that article, but only as a basis for libeling the open access journal in which it appeared.

To be perfectly candid, the reader needs to be informed that the journal that published the Lakens (2013) article, Frontiers in Psychology, is one of an increasing number of journals that charge exorbitant publication fees in exchange for free open access to published articles. Some of the author costs are used to pay reviewers, causing one to question whether the process is always unbiased, as is the desideratum. For further information, the reader is referred to the following Web site: http://www.frontiersin.org/Psychology/fees.

Cicchetti further chastised the internet group for disrespecting the saints of power analysis.

As an additional comment, the stellar contributions of Helena Kraemer and Sue Thiemann (1987) were noticeable by their very absence in the Smits et al. critique. The authors, although genuinely acknowledging the lasting contributions of Jacob Cohen to our understanding of ES and power analysis, sought to simplify the entire enterprise

Jacob Cohen is dead and cannot speak. But good Queen Mother Helena is very much alive and would surely object to being drawn into this nonsense. I encourage Cicchetti to ask her what she thinks.

Ah, but what about the table based on the re-analyses of the internet group that Cicchetti had reproduced?

The reader should also be advised that this comment rests upon the assumption that the revised data analyses are indeed accurate because I was not privy to the original data.

Actually, when Turkington sent the internet group the study data, he included Cicchetti in the email.

The internet group experienced one more indignity from the journal that they had politely tried to correct. They had reproduced Turkington’s original table in their letter. The journal sent them an invoice for 106 euros because the table was copyrighted. It took a long email exchange before this billing was rescinded.

Science Needs Vigilantes

Imagine a world where we no longer depend on a few cronies of an editor to decide once and forever the value of a paper. This would replace the present order in which much of the scientific literature is untrustworthy, where novelty and sheer outrageousness of claims are valued over robustness.

Imagine we have constructed a world where post-publication commentary is welcomed and valued, data are freely available for reanalysis, and the rewards are there for performing those reanalyses.

We clearly are not there yet, and certainly not with this flawed article. The sequence of events that I have described has so far not produced a correction of the paper. As it stands, the paper concludes that nurses can and should be given a brief training that will allow them to effectively treat patients with severe and chronic mental disorders. This paper encourages actions that may put such patients and society at risk because of ineffectual and neglectful treatment.

The authors of the original paper and the editor responded with dismissal of the criticisms, ridicule, and, in the editor’s case at least, libel of open access journals. Obviously, we have not reached the point at which those willing to re-examine and, if necessary, re-analyze data are appropriately respected and protected from unfair criticism. The current system of publishing gives authors who have been questioned, and editors who are defensive of their work, the last word, no matter how incompetent and inept that work may be. But there is always the force of social media – tweets and blogs.

The critics were actually much too kind and restrained in a critique narrowly based on re-analyses. They ignored so much, including

  • The target paper being an underpowered feasibility study passed off as a source of estimates of what a sufficiently sized randomized trial would yield (see the sketch after this list).
  • The continuity between the mischief done in this article and the tricks and spin in Turkington’s past work.
  • The laughably inaccurate lecture from the journal’s statistical editor.
  • The lowlife journal in which the article was published.
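To put the first point in numbers, here is a minimal sketch using the standard normal-approximation sample-size formula. The effect size of .17 is the blinded-assessment meta-analytic estimate mentioned earlier; nothing here comes from the paper itself:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-group comparison of means:
    n = 2 * (z_{1 - alpha/2} + z_{power})**2 / d**2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

# d = 0.17 is the meta-analytic estimate with blind outcome assessment;
# the larger values are for comparison.
for d in (0.17, 0.30, 0.50):
    print(f"d = {d:.2f}: about {n_per_group(d)} patients per arm for 80% power")
```

At an effect size of about 0.17, you need on the order of 500 patients per arm for 80% power. A feasibility study with a few dozen patients per arm has essentially no chance of detecting such an effect, and the effect sizes it reports are far too noisy to serve as estimates for planning a definitive trial.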

These problems deserve a more unrestrained and thorough trashing. Journals may not yet be self-correcting, but blogs can do a reasonable job of exposing bad science.

Science needs vigilantes, because of the intransigence of those pumping crap into the literature.

Coming up next

In my next issue of Mind the Brain, I’m going to team up with Magneto. You may recall that I previously collaborated with him and Neurocritic to scrutinize some junk science that Jim Coan and Susan Johnson had published in PLOS One. Their article crassly promoted to clinicians what they claimed was a brain-soothing couples therapy. We obtained an apology and a correction in the journal for an undeclared conflict of interest.

But that incident left Magneto upset with me. He felt I did not give sufficient attention to the continuity between how Coan had slipped post hoc statistical manipulations into the PLOS article to get positive results and what he had done in a past paper with Richard Davidson. Worse, I had tipped off Jim Coan that we were checking his work. Coan launched a pre-emptive tirade against post-publication scrutiny, his now infamous Negative Psychology rant. He focused his rage on Neuroskeptic, not Neurocritic or me, but the timing was not a coincidence. He then followed up by denouncing me on Facebook as the Chopra Deepak of skepticism.

I still have not unpacked that oxymoronic statement and decided whether it was a compliment.

OK, Magneto, I will be less naïve and more thorough this round. I will pass on whatever you uncover.

Check back if you want to augment your critical appraisal skills with some unconventional ones, or if you just enjoy a spectacle. If you want to arrive at your own opinions ahead of time, email Douglas Turkington (douglas.turkington@ntw.nhs.uk) and ask for a PDF of his paywalled article. Tell him I said hello. The offer of a debate still stands.

 
