Third Education Group Review: A peer-reviewed electronic journal. ISSN 1557-2870

"The Source of Lake Wobegon" by Richard Phelps

Review by Bruce Thompson

In my view, this manuscript has great potential but could be considerably tightened and shortened, making it much easier to follow.  As it stands now, the basic organization of the paper seems unclear and demands too much effort on the part of the reader to follow the author's argument.

I suggest that the author organize the paper around an examination of what seem to be two competing hypotheses.  The first is that test inflation is due to high stakes.  The second, supported by the author, seems to consist of two parts: (1) that test inflation mainly reflects poor security and cheating by administrators, and (2) that if high-stakes tests are aligned with the curriculum, test results will rise to reflect increased student learning.

The author scatters a series of quotes throughout the article.  It appears from the context that these are mainly quotes from people associated with the National Center for Research on Evaluation, Standards, and Student Testing, but that is not made explicit.  I suggest that these be collected in the development of the first hypothesis.

There needs to be more discussion of just what is meant by high stakes, particularly in the view of those arguing that high stakes lead to test inflation.  Clearly the early tests are generally considered low stakes by today's standards, but some administrators felt compelled to inflate the results, implying that they regarded the tests as high stakes for themselves.

I found the author's discussion of low-stakes tests given close in time to high-stakes tests confusing.  Is the author saying that critics of high-stakes tests claim that they lead to inflation of other tests, so that in a high-stakes state all test results will be inflated, whether or not the particular test is high-stakes?  If so, this discussion should be made more explicit.

It would be helpful if the first mention of the Debra P. v. Turlington case described the outcome of the case.  As it stands, the outcome becomes clear only much later in the article.

The discussion of Bishop's work is confusing to me.  The author should explain what is meant by "positive backwash effects."

The discussion of ACT and SAT coaching seems out of place in this article.  These tests are quite different from the low- and high-stakes tests given by school districts.

I find footnote 6 both unconvincing and unnecessary to the argument.  It seems at least plausible that curricula could diverge more as students move up in grade.  One could argue that there are certain skills, both in reading and mathematics, that young students must learn before they can progress, forcing convergence in the early years.

It is not clear how a randomly drawn subsample can be unrepresentative.

The author refers to the National Council of Teachers of Mathematics standards.  However, my understanding is that the standards, particularly the early edition, did less to set what mathematics should be learned than to set forth an approach to teaching mathematics.

(This was one of four reviews of this submission in the fall of 2005.)