Psychosomatic Medicine 68:175-176 (2006)
© 2006 American Psychosomatic Society


LETTERS TO THE EDITOR

PLEASE DON’T TALK BAD ABOUT GOOD OLD AUNT ANOVA!: A REPLY TO A.J. VICKERS’ CRITIQUE

Armin Hartmann, PhD

Abteilung für Psychosomatische Medizin und Psychotherapie, Universitätsklinikum Freiburg, Freiburg, Germany, armin.hartmann{at}uniklinik-freiburg.de

In a recent article, Vickers criticizes the use and abuse of analysis of variance (ANOVA) (1). He claims that there are at least two major problems. First, the results of ANOVAs of randomized controlled trials (RCTs) are badly reported (very often only p-values are given, with no measures of effect size), and ANOVA does not provide adequate statistics to contrast groups. Second, the results of repeated-measures ANOVAs are hard to understand and easily misinterpreted, so that other kinds of analyses, such as "longitudinal mixed models" or "generalized linear modeling," are preferable.

In my opinion this is mainly a "cross-cultural" misunderstanding: Vickers seems to belong to the culture of medical biometricians, whereas ANOVA belongs to the culture of psychological methodologists (as do I). Let me therefore argue for my old "aunt" ANOVA.

RCTs and ANOVA
I do agree that any statistical analysis is incomplete if effect sizes are not determined. I would even claim that no reviewer of any journal should accept a manuscript in which measures of effect size are missing. But this problem is not specific to ANOVA; it arises with many other statistical procedures too. As a solution, Vickers suggests reporting differences between means and their corresponding confidence intervals, but I do not think that this is an adequate measure of effect size. In psychotherapy research, Cohen’s measures of effect size are widely used and well understood. An excellent and comprehensive overview is given by Rosenthal (2), who provides formulas for their computation and shows how they relate to other measures of effect size.

It is just not true that clinically relevant differences cannot be detected or reported when the overall difference between groups is nonsignificant. To my knowledge, any statistical package allows for the computation of contrasts between groups. Another issue is the power of trials, where we very often must conclude that the sample sizes were too small to detect small differences (which would be the case for the constructed trial of Vickers’ example).
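As an illustration of such a contrast, here is a minimal sketch in Python that computes a planned comparison between group means using the pooled ANOVA error term. The function name `planned_contrast`, the three arms, and all scores are hypothetical and chosen only for this example; they are not taken from Vickers' constructed trial. The sketch returns the t statistic and its degrees of freedom and leaves the p-value to a t table or a statistical package.

```python
from math import sqrt
from statistics import mean


def planned_contrast(groups, weights):
    """Return (estimate, standard error, t, error df) for a contrast of group means.

    groups  -- list of lists of scores, one list per group
    weights -- contrast weights, one per group, summing to zero
    """
    if len(groups) != len(weights):
        raise ValueError("one weight per group is required")
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")

    means = [mean(g) for g in groups]
    sizes = [len(g) for g in groups]

    # Pooled within-group variance, i.e. the mean square error of the one-way ANOVA.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_error = sum(sizes) - len(groups)
    mse = ss_within / df_error

    estimate = sum(w * m for w, m in zip(weights, means))
    se = sqrt(mse * sum(w ** 2 / n for w, n in zip(weights, sizes)))
    return estimate, se, estimate / se, df_error


# Hypothetical scores for three trial arms (illustration only).
control = [10, 12, 11, 13, 14]
treat_a = [8, 9, 10, 9, 9]
treat_b = [9, 11, 10, 12, 8]

# Contrast: control versus treatment A, ignoring treatment B.
est, se, t, df = planned_contrast([control, treat_a, treat_b], [1, -1, 0])
print(f"difference = {est:.2f}, SE = {se:.3f}, t({df}) = {t:.2f}")
```

The contrast uses the error variance of all three groups, which is what distinguishes it from a separate two-sample t-test on the two arms of interest.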

Yes, not all ANOVA procedures of the standard statistical packages provide "clickable" options for the computation of effect sizes. On the other hand, an experienced statistician should be able to write some program code which, with modern statistical software, is not much more work than a few clicks. A simple table for the computation of Cohen’s d, realized with SAS-JMP, is available from the author and may be freely distributed.
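To show how little code is involved, the following Python sketch computes Cohen’s d for two independent groups and converts it to the correlation r through the t statistic. It is not the SAS-JMP table mentioned above; it uses the standard textbook formulas (pooled standard deviation, r = t / sqrt(t² + df)), and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


def d_to_r(d, n1, n2):
    """Convert d to the point-biserial r via the two-sample t statistic."""
    t = d * sqrt(n1 * n2 / (n1 + n2))  # t for two independent groups
    df = n1 + n2 - 2
    return t / sqrt(t ** 2 + df)


# Hypothetical outcome scores (illustration only).
treated = [4, 6, 5, 7, 8]
control = [2, 4, 3, 5, 6]

d = cohens_d(treated, control)
r = d_to_r(d, len(treated), len(control))
print(f"d = {d:.3f}, r = {r:.3f}")
```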

To complain that the results of repeated-measures ANOVAs are hard to understand is just not fair. Have you ever tried to explain the meaning of an odds ratio to somebody who understands only a risk difference? Any scientific methodology must be learned and taught (sometimes to a whole community). The proposed solution, a regression with baseline scores as covariate, is statistically more or less the same. Its rationale and interpretation also need explanation or expertise, so the choice between them seems to me more a matter of taste (or of belonging to a certain culture).
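The closeness of the two approaches can be made concrete for the simplest design, two groups measured before and after treatment. In that case the time × treatment interaction compares mean change scores, which amounts to subtracting the baseline difference with a slope fixed at 1, whereas the regression with baseline scores estimates that slope from the data (the pooled within-group slope). The Python sketch below computes both estimates; the function name and the data are hypothetical, and the assumption of a common slope in both groups is the usual one for this model, not something stated in the letter.

```python
from statistics import mean


def treatment_effects(pre_t, post_t, pre_c, post_c):
    """Return (change-score effect, baseline-adjusted effect, pooled slope)."""

    def centered_sums(x, y):
        # Within-group sum of cross-products and sum of squares of the baseline.
        mx, my = mean(x), mean(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        return sxy, sxx

    sxy_t, sxx_t = centered_sums(pre_t, post_t)
    sxy_c, sxx_c = centered_sums(pre_c, post_c)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)  # common slope of post on pre

    post_diff = mean(post_t) - mean(post_c)
    pre_diff = mean(pre_t) - mean(pre_c)

    change_effect = post_diff - 1.0 * pre_diff    # interaction / change scores
    adjusted_effect = post_diff - slope * pre_diff  # regression with baseline
    return change_effect, adjusted_effect, slope


# Hypothetical symptom scores (illustration only); lower is better.
pre_t, post_t = [20, 22, 24, 26], [14, 15, 18, 17]
pre_c, post_c = [19, 21, 23, 25], [17, 20, 20, 23]

change, adjusted, slope = treatment_effects(pre_t, post_t, pre_c, post_c)
print(f"change-score effect = {change:.2f}, adjusted effect = {adjusted:.2f}, slope = {slope:.2f}")
```

In a randomized trial the baseline difference is small by design, so the two estimates tend to lie close together; they differ mainly in precision and in how they are explained to readers.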

I agree that it is not sufficient to report only the significance (and F, df) of a time × treatment interaction. This is the same point as the first argument: significance is worthless without effect size. A picture is worth a thousand numbers, so some visualization is required as soon as there are more than two points of measurement. The proposed solution, applying mixed models or hierarchical linear models, is fine, but these models are even more complex and harder to understand for inexperienced statisticians, let alone clinicians. Last but not least, these models require some decisions about the nature of the development of scores over time. For example, change can be modeled with "growth curves"/hierarchical linear models (3,4). These are a special case of mixed models, and they need a "formula" for the level-1 models of change. Researchers have to decide in advance on the best theoretical model of improvement. Linear, higher-order, exponential, and logarithmic functions have been discussed and fitted (5–9), but it is still an open issue what we should expect and which function to use (10).
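The decision about the level-1 function can be illustrated with a single patient. The Python sketch below fits a linear and a logarithmic function of session number to one hypothetical trajectory by ordinary least squares and compares the residual sums of squares. It covers only the level-1 step; a full hierarchical linear model would estimate such curves for all patients simultaneously, which this sketch does not attempt. The function name and the data are assumptions made for the example.

```python
from math import log
from statistics import mean


def fit_line(x, y):
    """OLS fit of y = intercept + slope * x; returns (intercept, slope, residual SS)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return intercept, slope, rss


# Hypothetical symptom scores of one patient over eight sessions (illustration only).
sessions = [1, 2, 3, 4, 5, 6, 7, 8]
symptoms = [30.0, 24.5, 21.0, 19.0, 17.5, 16.0, 15.5, 14.5]

# Level-1 candidates: score = a + b * session  versus  score = a + b * ln(session).
linear = fit_line(sessions, symptoms)
log_linear = fit_line([log(s) for s in sessions], symptoms)

print(f"linear:      slope = {linear[1]:.2f}, residual SS = {linear[2]:.2f}")
print(f"logarithmic: slope = {log_linear[1]:.2f}, residual SS = {log_linear[2]:.2f}")
```

For these invented data, with rapid early improvement that levels off, the logarithmic function leaves the smaller residual sum of squares; real trajectories need not behave this way, which is exactly why the choice has to be justified.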

To summarize, my good old "aunt" ANOVA is not as bad as Vickers suggests. She can do more than he credits her with (contrasting groups). She causes fewer problems than her younger relatives (HLMs, mixed models) do. They make life complex, whereas she gives you a K.I.S.S. (keep it safe and simple). Her limitations and statistical requirements are well known (who can’t spell homogeneity?). If you do not ask for more than ANOVA promises to give (a comparison of means between groups and/or over time), you will get reliable and interpretable answers (if you have learned to speak ANOVA).

Therefore, I conclude that what is needed is a "family tree" of methods and (longitudinal) designs, including the related family of event occurrence analysis (survival analysis, Cox regression). A decision tree or a mental map of the advantages and shortcomings of the available designs and their corresponding statistical methods could show researchers which option best matches their research questions. Such a tool would show that ANOVA, among others, is still a good choice for the analysis of RCTs when the issue is testing differences between group means in a design with a fixed schedule of measurement.

DOI:10.1097/01.psy.0000199925.51075.36

REFERENCES

  1. Vickers AJ. Analysis of variance is easily misapplied in the analysis of randomized trials: a critique and discussion of alternative approaches. Psychosom Med 2005;67:652–5.
  2. Rosenthal R. Parametric measures of effect size. In: Cooper H, Hedges LV, editors. The handbook of research synthesis. New York, NY, US: Russell Sage Foundation; 1994.
  3. Bryk AS, Raudenbush SW. Application of hierarchical linear models to assessing change. Psychol Bull 1987;101:147–58.
  4. Bryk AS, Raudenbush SW, Congdon R. Hierarchical linear and nonlinear modelling with the HLM/2L and HLM/3L programs. Chicago: Scientific Software International; 1996.
  5. Lutz W, Martinovich Z, Howard KI, Leon SC. Outcomes management, expected treatment response, and severity-adjusted provider profiling in outpatient psychotherapy. J Clin Psychol 2002;58:1291–304.
  6. Lutz W, Rafaeli E, Howard KI, Martinovich Z. Adaptive modeling of progress in outpatient psychotherapy. Psychother Res 2002;12:427–43.
  7. Lutz W. Patient-focused psychotherapy research and individual treatment progress as scientific groundwork for an empirically based clinical practice. Psychother Res 2002;12:251–72.
  8. Lueger RJ, Howard KI, Martinovich Z, Lutz W, Anderson EE, Grissom G. Assessing treatment progress of individual patients using expected treatment response models. J Consult Clin Psychol 2001;69:150–8.
  9. Howard KI, Moras K, Brill PL, Martinovich Z, Lutz W. Evaluation of psychotherapy. Efficacy, effectiveness, and patient progress. Am Psychol 1996;51:1059–64.
  10. Singer JD, Willett JB. Applied longitudinal data analysis. Oxford: Oxford University Press; 2003.