Steve Sailer: iSteve: The Golden Age of Standardized Test Creation

September 6, 2010

The Golden Age of Standardized Test Creation

Psychometrics is a relatively mature field of science, and a politically unpopular one. So you might think there isn't much money to be made in making up brand new standardized tests. Yet, there is.

From the NYT:

U.S. Asks Educators to Reinvent Student Tests, and How They Are Given

By SAM DILLON

Standardized exams — the multiple-choice, bubble tests in math and reading that have played a growing role in American public education in recent years — are being overhauled.

Over the next four years, two groups of states, 44 in all, will get $330 million to work with hundreds of university professors and testing experts to design a series of new assessments that officials say will look very different from those in use today.

The new tests, which Secretary of Education Arne Duncan described in a speech in Virginia on Thursday, are to be ready for the 2014-15 school year.

They will be computer-based, Mr. Duncan said, and will measure higher-order skills ignored by the multiple-choice exams used in nearly every state, including students’ ability to read complex texts, synthesize information and do research projects.

“The use of smarter technology in assessments,” Mr. Duncan said, “makes it possible to assess students by asking them to design products of experiments, to manipulate parameters, run tests and record data.”

I don't know what the phrase "design products of experiments" even means, so I suspect that the schoolchildren of 2014-15 won't be doing much of it.

Okay, I looked up Duncan's speech, "Beyond the Bubble Tests," and what he actually said was "design products or experiments," which almost makes sense, until you stop and think about it. Who is going to assess the products the students design? George Foreman? Donald Trump? (The Donald would be good at grading these tests: tough, but fair. Here's a video of Ali G pitching the product he designed -- the "ice cream glove" -- to Trump.)

Because the new tests will be computerized and will be administered several times throughout the school year, they are expected to provide faster feedback to teachers than the current tests about what students are learning and what might need to be retaught.

Both groups will produce tests that rely heavily on technology in their classroom administration and in their scoring, she noted.

Both will provide not only end-of-year tests similar to those in use now but also formative tests that teachers will administer several times a year to help guide instruction, she said.

And both groups’ tests will include so-called performance-based tasks, designed to mirror complex, real-world situations.

In performance-based tasks, which are increasingly common in tests administered by the military and in other fields, students are given a problem — they could be told, for example, to pretend they are a mayor who needs to reduce a city’s pollution — and must sift through a portfolio of tools and write analytically about how they would use them to solve the problem.

Oh, boy ...

There is some good stuff here -- adaptive tests are a good idea (both the military's AFQT and the GRE have gone over to them). But there's obvious trouble, too.

Okay, so these new tests are going to be much more complex, much more subjective, and get graded much faster than fill-in-the-bubble tests? They'll be a dessert topping and a floor wax!

These sound a lot like the Advanced Placement tests offered to high school students, which usually include lengthy essays. But AP tests take two months to grade, and are only offered once per year (in May, with scores coming back in July), because they use high school teachers on their summer vacations to grade them.

There's no good reason why fill-in-the-bubble tests can't be scored quickly. A lot of public school bubble tests are graded slothfully, but they don't have to be. My son took the ERB's Independent School Entrance Exam on a Saturday morning and his score arrived at our house in the U.S. Mail the following Friday, six days later.

The only legitimate reason for slow grading is if there are also essays to be read, but in my experience, essay results tend to be dubious, at least below the level of Advanced Placement tests, where there is specific subject matter in common. The Writing test that was added to the SAT in 2005 has largely been a bust, with many colleges refusing to use it in the admissions process.

One often overlooked problem with any kind of writing test, for example, is that graders have a hard time reading kids' handwriting. You can't demand that kids type because millions of them can't. Indeed, writing test results tend to correlate with number of words written, which is often more of a test of handwriting speed than of anything else. Multiple choice tests have obvious weaknesses, but at least they minimize the variance introduced by fine motor skills.

And the reference to "performance-based tasks" in which people are supposed to "write analytically" is naive. I suspect that Duncan and the NYT man are confused by all the talk during the Ricci case about the wonders of "assessment centers" in which candidates for promotion are supposed to sort through an in-basket and talk out loud about how they would handle problems. In other words, those are hugely expensive oral tests. The city of New Haven brought in 30 senior fire department officials from out of state to be the judges on the oral part of the test.

And the main point of spending all this money on an oral test is that an oral test can't be blind-graded. In New Haven, 19 of the 30 oral test judges were minorities, which isn't something that happens by randomly recruiting senior fire department officials from across the country.

But nobody can afford to rig the testing of 35,000,000 students annually.

Here are some excerpts from Duncan's speech:

President Obama called on the nation's governors and state education chiefs "to develop standards and assessments that don't simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking and entrepreneurship and creativity."

You know your chain is being yanked when you hear that schoolteachers are supposed to teach "21st century skills" like "entrepreneurship." So, schoolteachers are going to teach kids how to be Steve Jobs?

Look, there are a lot of good things to say about teachers, but, generally speaking, people who strive for union jobs with lifetime tenure and summers off are not the world's leading role models on entrepreneurship.

Further, whenever you hear teachers talk about how they teach "critical thinking," you can more or less translate that into "I hate drilling brats on their times tables. It's so boring." On the whole, teachers aren't very good critical thinkers. If they were, Ed School would drive them batty. (Here is an essay about Ed School by one teacher who is a good critical thinker.)

And last but not least, for the first time, the new assessments will better measure the higher-order thinking skills so vital to success in the global economy of the 21st century and the future of American prosperity. To be on track today for college and careers, students need to show that they can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings. ...

Over the past 19 months, I have visited 42 states to talk to teachers, parents, students, school leaders, and lawmakers about our nation's public schools. Almost everywhere I went, I heard people express concern that the curriculum had narrowed as more educators "taught to the test," especially in schools with large numbers of disadvantaged students.

Two words: Disparate Impact.

The higher the intellectual skills that are tested, the larger the gaps between the races will turn out to be. Consider the AP Physics C exam, the harder of the two AP physics tests: In 2008, 5,705 white males earned 5s (the top score) versus six black females.

In contrast, tests of rote memorization, such as having third graders chant the multiplication tables, will have smaller disparate impact than tests of whether students "can analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings." That's a pretty decent description of what IQ tests measure.

Duncan says that the new tests could replace existing high school exit exams that students must pass to graduate.

Many educators have lamented for years the persistent disconnect between what high schools expect from their students and the skills that colleges expect from incoming freshman. Yet both of the state consortia that won awards in the Race to the Top assessment competition pursued and got a remarkable level of buy-in from colleges and universities.

... In those MOUs, 188 public colleges and universities and 16 private ones agreed that they would work with the consortium to define what it means to be college-ready on the new high school assessments.

The fact that you can currently graduate from high school without being smart enough for college is not a bug, it's a feature. Look, this isn't Lake Wobegon. Half the people in America are below average in intelligence. They aren't really college material. But they shouldn't all have to go through life branded as a high school dropout instead of high school graduate because they weren't lucky enough in the genetic lottery to be college material.

The Gates Foundation and the U. of California ganged up on the LA public schools to get the school board to pass a rule that nobody will be allowed to graduate who hasn't passed three years of math, including Algebra II. That's great for UC, not so great for an 85 IQ kid who just wants a high school diploma so employers won't treat him like (uh oh) a high school dropout. But, nobody gets that.

Another benefit of Duncan's new high stakes tests will be Smaller Sample Sizes of Questions:

With the benefit of technology, assessment questions can incorporate audio and video. Problems can be situated in real-world environments, where students perform tasks or include multi-stage scenarios and extended essays.

By way of example, the NAEP has experimented with asking eighth-graders to use a hot-air balloon simulation to design and conduct an experiment to determine the relationship between payload mass and balloon altitude. As the balloon rises in the flight box, the student notes the changes in altitude, balloon volume, and time to final altitude. Unlike filling in the bubble on a score sheet, this complex simulation task takes 60 minutes to complete.

So, the NAEP has experimented with this kind of question. How did the experiment work out?

You'll notice that the problem with using up 60 minutes of valuable testing time on a single multipart problem instead of, say, 60 separate problems is that it radically reduces the sample size. A lot of kids will get off track right away and get a zero for the whole one hour segment. Other kids will have seen a hot air balloon problem the week before and nail the whole thing and get a perfect score for the hour.
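To put rough numbers on the sample-size point, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not something from Duncan's speech or the NAEP: a hypothetical student who gets any given item right with probability 0.7, items that are independent and equally hard, and a 60-minute task treated as a single all-or-nothing item. Under those assumptions, the standard error of a percent-correct score on n items is sqrt(p(1-p)/n).

```python
import math


def score_standard_error(p_correct: float, n_items: int) -> float:
    """Standard error of a percent-correct score on n independent items."""
    return math.sqrt(p_correct * (1.0 - p_correct) / n_items)


# Hypothetical student: 70% chance of getting any single item right.
P_CORRECT = 0.7

# One hour spent on a single all-or-nothing task versus 60 short items.
one_big_task = score_standard_error(P_CORRECT, 1)
sixty_items = score_standard_error(P_CORRECT, 60)

print(f"1 all-or-nothing task: +/- {one_big_task:.3f}")
print(f"60 separate items:     +/- {sixty_items:.3f}")
print(f"noise ratio:           {one_big_task / sixty_items:.2f}x")
```

Whatever value is picked for the probability, the ratio between the two standard errors is the square root of 60, a bit under 8. Real multipart tasks award partial credit, so the true penalty is smaller than this worst case, but the direction is the same: fewer independent questions means a noisier individual score.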

That kind of thing is fine for the low stakes NAEP where results are only reported by groups with huge sample sizes (for example, the NAEP reports scores for whites, blacks, and Hispanics, but not for Asians). But for high stakes testing of individual students and of their teachers, it's too random. AP tests have large problems on them, but they are only given to the top quarter or so of high school students in the country, not the bottom half of grade school students.

It's absurd to think that it's crucial for all American schoolchildren to be able to "analyze and solve complex problems, communicate clearly, synthesize information, apply knowledge, and generalize learning to other settings." You can be a success in life without being able to do any of that terribly well.

Look, for example, at the Secretary of Education. Arne Duncan has spent 19 months traveling to 42 states, talking about testing with teachers, parents, school leaders, and lawmakers. Yet, has he been able to synthesize information about testing terribly well at all? Has his failure to apply knowledge and generalize learning about testing gotten him fired from the Cabinet?

My published articles are archived at iSteve.com -- Steve Sailer

43 comments:

Anonymous said...: More pork for the educrats.

Otherwise, 100% useless.; 9/6/10, 11:34 PM
Black Death said...: Of course, subjectively graded written tests allow more room for politically correct responses. Totalitarian regimes love this stuff - answers are evaluated more on their political content than on their factual correctness. In the Third Reich, there was "German" science and "Jewish" science. And Stalin loved the pseudoscientific ideas of Trofim Lysenko - geneticists who actually believed in scientific facts were shot or sent to the Gulag if their ideas contradicted the party line.; 9/7/10, 12:50 AM
l said...: Arne Duncan bought $330 million worth of ice cream gloves.; 9/7/10, 2:52 AM
Anonymous said...: How long will it take for the education geniuses to decide that the kids who do poorly on multiple choice tests don't do any better on 'performance based' tests?; 9/7/10, 5:14 AM
polistra said...: "Entrepreneurship" is the white version of the NBA. A dream that educators and advertisers like to cultivate, but a dream that will only reach fruition for a tiny handful of kids.

More generally, I'm inclined to cut Duncan some slack because he's the first Ed Sec who seems to be serious about vo-tech schooling.; 9/7/10, 5:14 AM
Fred said...: Your last two paragraphs were just brutal.

Maybe it's time to bring back the old GOP idea of abolishing the Department of Education -- not on cutting waste grounds this time, but on Hippocratic grounds.; 9/7/10, 5:59 AM
Anonymous said...: I love reading this stuff even though Steve writes lots of articles about the intellectual gap between the races.

I'm trying to figure out how Steve and others who write intellectual arguments filled with research and data every day, are different from members of the KKK.

Oh yeah! It's the gap in intellect. It's the stereotype of the Klan member as a redneck racist while Steve is the intellectual.

If Steve's articles are not intended to encourage the KKKers and the rest of us to accept the beliefs of the KKKers, then what are they intended to accomplish?

I guess we're all Rebels now.; 9/7/10, 6:11 AM
Anonymous said...: Sounds like Bloom's Taxonomy eduspeak. Good article: "Can critical thinking be taught?" by Dan Willingham, American Educator, Summer, 8-19.; 9/7/10, 6:51 AM
Anonymous said...: You know, if you weren't such a notorious HBD-er, you could get something like this (suitably masked for hatefacts) published in an influential magazine.; 9/7/10, 6:52 AM
Bill said...: Further, whenever you hear teachers talk about how they teach "critical thinking," you can more or less translate that into "I hate drilling brats on their times tables. It's so boring." On the whole, teachers aren't very good critical thinkers. If they were, Ed School would drive them batty.

My daughter's teacher was teaching her charges about science and about data collection. My daughter was told to measure how long each of her family members could stand on one foot, hands at side, with a max of 2 minutes. I can do it for 10 seconds. My wife can do it all day.

When my daughter's teacher saw 2 minutes written down as my wife's time, she forced my daughter to change it to 30 seconds since "nobody can stand on one foot for 2 minutes."

Observe both the stupidity (how can a grown-up not know that some people have really good balance?) and the extreme meta-stupidity (the whole %@*!ing point of science is that when data and preconceptions conflict, data wins). She is trying to teach a subject that she, quite literally, does not know the first thing about.

I don't blame the poor woman, but there is nothing isolated about this. Schoolteachers are not smart enough to do what they are being asked to do. They can do drilling on times tables. They cannot teach science or critical thinking. How is it possible that every undergraduate knows that primary ed majors are airy bimbos, but every college graduate knows that schoolteachers are knowledgeable, highly skilled professionals?; 9/7/10, 6:57 AM
Le MNur said...: Duncan: "to develop standards and assessments that don't simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking and entrepreneurship and creativity."

That statement is dishonest and stupid, but also funny because Duncan himself clearly doesn't have any "critical thinking" skills, other than the apparent ability to fool some of the people all of the time.; 9/7/10, 7:02 AM
Jonathan Silber said...: How much brain power and "higher-level thinking skills" does it take to realize that a kid can be ignorant of what is taught in Algebra II and still make a good go of it in life?

Apparently more than Bill Gates possesses.; 9/7/10, 7:10 AM
Curvy said...: "But they shouldn't all have to go through life branded as a high school dropout instead of high school graduate because they weren't lucky enough in the genetic lottery to be college material."

In the old days, say pre-WWII, being a high-school dropout was not a huge stigma. Being an *8th-grade* dropout was the stigma.

Then we got the silly notion in the '50s that EVERY teen ought to drag his bum through 4 years of high school, accompanied by the necessary dumbing-down of curriculum so they could.

NOW you're a failure not only if you drop out before finishing 8th grade (which in fact isn't even allowed); you're a failure if you drop out of high school.
And we're well on our way to stigmatizing you as (uh-oh) a COLLEGE dropout, because the curriculum's been dumbed down.

What next? You're a failure if you don't have a Ph.D.?; 9/7/10, 7:18 AM
Kylie said...: Curvy said..." we're well on our way to stigmatizing you as (uh-oh) a COLLEGE dropout, because the curriculum's been dumbed down."

That cuts both ways. This college drop-out now presumes that those with college degrees from, say, the mid-80's on are hopelessly blinkered by leftist indoctrination and therefore incapable of critical thought or analysis. And the NAMS are AA babies.

In other words, college degree=useful idiot. Only I don't find them very useful.; 9/7/10, 7:44 AM
Anonymous said...: Two points:

1. If the tests are given on computers, then the kids almost certainly will be forced to type their answers. Most kids at that age can hunt and peck faster than they can write by hand, and if that doesn't seem fair enough, then we can just teach them to type. That's the sort of thing that kids of all intellectual abilities tend to excel at -- and if you doubt they have the coordination for it, why don't you challenge a six year old to play a video game with you and see whose coordination better allows him to blindly move his fingers over the 12 or so keys to the controller.

2. All essays will certainly be graded by machine. ETS introduced something called e-rater for several tests -- TOEFL and GMAT, perhaps -- almost a decade ago and they've been expanding it ever since. It agreed with human graders about 99.5 percent of the time in its first iteration and that has gotten dramatically better since then. Of course, machine grading that matches human grading does nothing to address your point that human grading can be pretty arbitrary, but it does address the cost and speed issue. It was stupid of the Times not to put this in, but it's almost certainly true. No one would move forward with this if they didn't think the machine grading was mastered.; 9/7/10, 7:57 AM
Anonymous said...: Sailer mentioned the words "college material," a term that my guidance counselors used many moons ago when I went to high school. In today's globalized economy, everybody is "college material" because low skill manufacturing jobs MUST be outsourced to cheap labor countries in order to bring more to the corporate bottom line.

The old SAT test average (a measure of college potential) reached a high (verbal + math) of 997 in 1963. By 1994, the score was 902 even with a bit of tweaking. In recognition of the new norms, the SAT was changed: analogies and antonyms were eliminated altogether, reading passages were made more relevant (i.e. easier), test questions were first tried out on certain minority groups to determine if there was "cultural bias" and if found, eliminated, calculator use was allowed (to mask deficient arithmetic skills), a written essay was added and for good measure the scores were curved upward by 100 points.

I wonder when the educational establishment will stop covering up their failures with new "assessments?"; 9/7/10, 7:58 AM
AMac said...: Some over-40 Americans who last took a standardized test back in college might be surprised by how parts of the enterprise are run today (I was). For instance, part of my mandatory certification involved passing tests designed by my field's SRO (self-regulatory organization) and offered at a few hundred testing centers. It's the same deal for periodic recertification.

A few weeks before taking the test, I scheduled the place and time on the Thomson Prometric website (competing test companies have similar setups). On the day, I signed in at their local center, and took the timed test at one of the two dozen or so PC-containing booths there (videotaped the whole time, of course). I got my grade and pass/fail outcome one minute after finishing.

Prometric provides similar services for dozens of fields--architects, engineers, med students (USMLE), etc.

There are only a couple of reasons for not giving immediate feedback for Duncan's proposed tests. One is obviously inclusion of essays a la AP. Another is being unwilling to spend the dough to create or license a Prometric-type system. Finally, there is the obvious bug or feature that if the test-taker is informed right away, then they know right away: little room for post hoc adjustments.; 9/7/10, 8:17 AM
eh said...: Your tax money at work. In addition, you'll be paying interest on the debt the government will have to incur to fund this. And last but not least, your social security money in retirement will have to be paid out of the taxes collected from these kids. So you better hope all of this works.; 9/7/10, 8:38 AM
Anonymous said...: Schools or skills or whatever "..for the twenty-first century" seems to be the big mantra in educrat circles now.

So, saying over and over that they awoke and realized it's "the twenty-first century" is supposed to work some sort of magic on the effort?; 9/7/10, 8:45 AM
Garland said...: "On the whole, teachers aren't very good critical thinkers. If they were, Ed School would drive them batty."

Word.; 9/7/10, 9:03 AM
Big bill said...: Now let's not start badmouthing the KKK, OK, Miss Anonymous? Like there is something wrong with traditional native punishments like stoning wayward women, chopping off hands, lynching or witch burning.

That sort of judgmental cultural imperialism is really inappropriate in our cultural-relativistic Boasian age. I thought you got the memo.

According to modernghana.com, folks in Africa (or at least in "Ghana, the Jewel of Africa") regularly avail themselves of these more traditional approaches to law enforcement. Apparently it works quite effectively in cultures with less than full confidence in law enforcement.

So trot on over to modernghana.com and search the archives for "lynch" or "witchcraft" or "muti". Judging by African standards, you cultural imperialists are really unfair.; 9/7/10, 9:41 AM
sabril said...: For Leftists, "critical thinking" means being able to trot out the standard Leftist responses to conservative arguments and data.

Any attempt to evaluate students based on their critical thinking skills will quickly turn into a test of their political correctness.; 9/7/10, 9:42 AM
Ed U. Crat said...: Schools or skills or whatever "..for the twenty-first century" seems to be the big mantra in educrat circles now.

I spose you wouldn't go for "Schools or skills for the third millennium"?

To infinity ... and beyond!; 9/7/10, 9:54 AM
alonzo portfolio said...: The education mess is a great opportunity for black parents. Simply by indulging their natural paranoia toward whites, they could shuck off all the post-1970 "child-centered" critical (leftist) teaching styles, and go back to rote learning of objective facts. Then, they could put their kids into competitions against the mainstream kids and kick ass. With the E-school product conclusively shown to be useless, they could demand that all federal monies theretofore going to Ed "research" be redirected to blacks themselves, as a reward for restoring the wisdom of the past. Nobody could complain.; 9/7/10, 9:58 AM
Anonymous said...: Any way they make the tests, the achievement gap is still going to remain. Unless maybe impromptu dancing is included.; 9/7/10, 10:31 AM
Anonymous said...: How does standardized testing improve education?

A stress on entrepreneurship makes sense if it is a renewal of civics classes where students learn how to use governments and regulatory bodies to achieve specific ends such as meeting safety and licensing regulations for one's own business. All high school students, if such a thing must exist, should be taught how to deal with corporate and government bureaucracy.; 9/7/10, 10:33 AM
Geoff Matthews said...: A few thoughts:

At my place of work, a study on grade inflation showed the students in the education college had the lowest ACT scores but the highest GPAs. I'm working with a coworker to see what the GPA of education students is in the sciences, compared to other students (along with other courses).

In regards to tests, I thought that one reason for introducing multiple choice items was to remove grader bias.; 9/7/10, 11:29 AM
Default User said...: ...whether they possess 21st century skills...
If we want a 21st century workforce, it might be a good idea to stop importing 19th century workers.

As others have said, I cannot see how these new tests will do anything but widen the "achievement gap."

I suppose the new improved test will cost a lot of money. That is OK, as we can spend even more when that "achievement gap" remains a gulf. At least the testing companies will be happy, no (profit) achievement gap for them.; 9/7/10, 12:45 PM
Baloo said...: Is there any way you can tweak this Blogger thing to make it impossible to post as "anonymous"? Not that I want to know who anybody is, just that it's very hard, as has been said here before, to keep track of all the Anons and reply to them efficiently.; 9/7/10, 1:20 PM
Anonymous said...: NAEP does track Asian performance

http://nces.ed.gov/nationsreportcard/pdf/stt2009/2010454TX4.pdf; 9/7/10, 2:59 PM
Anonymous said...: "Schools or skills or whatever "..for the twenty-first century" seems to be the big mantra in educrat circles now."

LOL

Bring back the factory jobs. The 21st century American workforce will be eerily similar to the late 20th century Chinese workforce.; 9/7/10, 3:01 PM
Anonymous said...: Hiding black and Hispanic underachievement by creating a subjective test everyone can ace sounds good to me. The motivation for throwing trillions of white taxpayer dollars at the so-called achievement gap will then vanish overnight. After that, the ball is in Harvard's court: they will need to explain to the DOJ why their admissions procedures require objective testing which has obvious disparate impact. A day will come when Harvard will need to create its own admissions test, implement racial quotas, and expand the student body so that it can continue producing a fixed number of students likely to prosper in the economy. The others can pretend they are getting an elite college degree while achieving no real higher learning.; 9/7/10, 3:04 PM
Severn said...: I'm trying to figure out how Steve and others who write intellectual arguments filled with research and data every day, are different from members of the KKK.

Wait, you are trying to figure out how Steve and others who write intellectual arguments filled with research and data every day ........... are different from the KKK?

Here is a clue - when you hear Alex Trebek say "writes intellectual arguments filled with research and data" do you typically press the buzzer and respond "What is the Ku Klux Klan"?; 9/7/10, 3:21 PM
Curvy said...: "In other words, college degree=useful idiot. Only I don't find them very useful."

Same here, Kylie, in too many instances.; 9/7/10, 4:22 PM
Anonymous said...: How much brain power and "higher-level thinking skills" does it take to realize that a kid can be ignorant of what is taught in Algebra II and still make a good go of it in life?

I dunno - Algebra II [IQ 115] is probably about where you need to be if you want to be gainfully employed as a certified electrician who can work some basic voltage/resistance/current calculations, or a licensed pipe-fitter who can run some steam pressure calculations, or a master carpenter who can square a wall using the Pythagorean theorem.

Or, for that matter, a Microsoft MCSE who can set up a network of Microsoft domains & Outlook servers. [If you doubt me, then try teaching the theory of IP subnet masks to someone who can't do any algebra.]

What can you accomplish at less than an Algebra II level?

Maybe become a commercial truck driver, or a carpenter's helper, or a cabling monkey?

"Entrepreneurship" is the white version of the NBA. A dream that educators and advertisers like to cultivate, but a dream that will only reach fruition for a tiny handful of kids.

Right - I doubt that you get much in the way of "entrepreneurship" below about IQ 115 [aka Algebra II/Trigonometry].

Although if anyone knows of large numbers of entrepreneurs in the IQ 105 to 114 range, then I'd love to hear about it.

Nowadays, if you're working in the above-ground economy, with checking accounts and credit cards and balance sheets and whatnot, then I'd say that you need an IQ of at least 115 just to deal with the IRS forms - which brings to mind an old Derbyshire observation about how the modern tax code amounts to little more than an internecine cold war that the highly intelligent are waging against the slightly-less-than-highly intelligent.; 9/7/10, 5:32 PM
Unknown said...: This is all just another episode in the saga to invent tests that minorities do well on; there will never be any. Everyone can know that the gaps that appeared in previous tests will simply be repeated in the future ones, no matter what form they take.; 9/7/10, 5:49 PM
David Davenport said...: ... adaptive tests are a good idea (both the military's AFQT and the GRE have gone over to them)

The GRE is adaptive? Dunno about that.

Steve, I took or re-took the GRE (Graduate Record Examination) in Dec, 2009. I'm one of America's oldest doctoral candidates, at least in Aerospace Engineering. I have been pursuing the doctorate off and on and part time since Clinton's first term in office.

Most U.'s consider the shelf life of the GRE to be five years. The distinguished and prestigious U. of Tennessee wanted me to re-take the GRE, so I did. (UT is one of 28 American U.'s that offer accredited doctoral programs in Aero. Eng.)

A paper and pencil version of the GRE is given at many locations in the US four or maybe five times a year. A computer version of the GRE is given several times a month every month.

I took the computerized version. There were three sections: an essay section, a multiple choice verbal section, and a math section.

The essay section allowed the testee to choose one of four different topics. I chose "Preserving the ecology versus the need for electrical power," or something like that.

Following the advice of the Kaplan's and Barron's GRE cram books, I strove for a long word count with a definite beginning, middle and end, and Pee Cee sentiments throughout. Was it thirty or forty-five minutes one was allowed to type an essay into the machine? I can't recall.

Scores for the essay section are graded by humans, so the results take weeks to arrive. However, scores for the multiple choice verbal and math sections are shown immediately after completing the computerized test.

The math section had forty-two (42) questions in forty (40) minutes. Speed was the primary virtue. The reward for creative mathematical answers was zero. Given enough time, I could have scored 100 percent. But there wasn't enough time.

The cram books said that the math section would ask harder or easier questions, depending on one's answer to the first few math questions. I suppose one could call that adaptive testing. The cram books -- there was also a Princeton brand cram book -- didn't say anything about the multiple choice verbal section being adaptive.

Why is the math section adaptive in this way? To avoid compression of scores.

For some historical reason, the multiple choice verbal and math parts of the GRE are graded on a scale from 200 to 800. Some test takers score poorly, while a fortunate few get 100 per cent correct.

I suspect that American U. science and engineering depts. want a fine-grained test on the right side of the Gaussian curve that can sort out the 95 per cent talent from the 90 per centers, the merely eighty-five per cent people, and so on. ... Somewhat covertly eee-leetist and un-Pee Cee.

Oh yes, American University sci. and eng. departments continue to take GRE scores very seriously, at least in math.; 9/7/10, 6:46 PM
Silver said...: You know your chain is being yanked when you hear that schoolteachers are supposed to teach "21st century skills"

Well, come on, we're not talking about solving simplistic 20th century problems -- like getting to the moon. That stuff was a cinch.

This is about 21st century problems.

Like: "The old adage has it that you can fool some of the people all of the time and all of the people some of the time but you can't fool all of the people all of the time. Design an experiment that would allow you to test the political trade-off between fooling all of the people some of the time and only some of the people all of the time. Bonus points for synchronizing the maxima."

Or: "The Secretary of Education has elected to pursue a strategy of fooling all of the people some of the time. Research historical data to (a) determine how he might maximize the temporal durability of the platitudes he deploys, and (b) optimize the point at which he changes tack and revitalizes platitudes temporarily out of favor."; 9/7/10, 7:07 PM
Anonymous said...: You want a great example of an inner city entrepreneur? I give you Antoine Dodson.

http://www.antoine-dodson.com/

http://www.zazzle.com/antoinedodson24+gifts; 9/7/10, 7:55 PM
Mitch said...: I agree with your post; there's just a few things I wanted to clarify.

First, the SAT writing test was a huge success if you realize that its primary purpose was to increase College Board profits and cover the costs of changing the SAT to make the UC (its biggest customer) happy. Testers had to pay more money to take a test that most colleges never required. It made them $18 mill the first year alone. (Here's more on this--written five years ago, but the links are still good.)

As a test, it's fine. The reason most schools don't use it is that they didn't want it in the first place. UC was the primary system that required the Writing test; about a quarter of the other top 100 schools required it, I think.

The standardized essay isn't bad--it's just not very useful. Too little granularity. It's better considered a test of written reasoning than a writing test. Each test (GRE, SAT, ACT, GMAT) provides its own definition of "support" required, and you go from there. I've taught several full classes of ACT/SAT and tracked my own grading for each student essay and collected their actual grades. I'm satisfied that they are graded well. Handwriting is almost never a factor. I've coached around 500 kids, I think, and had one kid who I told that if he didn't make his writing bigger, I'd kill him so he wouldn't have to worry about his damn score. It hurt my eyes just thinking about it.

But grammar is never going to matter more than a point, and a brilliant writer who doesn't answer the question is screwed. On the other hand, I've had a number of kids with very weak English skills get a 4 (out of 6) and, based on the criteria, it was entirely legit.

I'm not arguing in favor of them--I'm just disagreeing with the particular criticism. I don't think there's any way to do a standardized essay that doesn't start with "did the tester answer the question" as the primary criterion, and if that's the primary criterion, then it can't really be a writing test.

Count me as an opponent of CBT. I have coached around 100 students, some for hundreds of hours, watching test results. In the middle (the 500 range) the tests are useless. I'm particularly not a fan of the GRE, which is little more than a brutal vocabulary test that only 2% get over 700 on, and a relatively easy math test that 4% get an 800 on. I don't know enough about CBT methods to point out the flaws, but I quit coaching the tests because I couldn't confidently give my students any assurance about their prospects in a given range (480-570).

I'm also not a fan of the NAEP tests, precisely because they conflate writing with ability--in much the same way that these tests will, of course. I don't trust the NAEP results much. There was a report done in the early 90s pointing out the problems with it--I can't find it any more, but none of the criticisms have been addressed.

AP tests have large problems on them, but they are only given to the top quarter or so of high school students in the country, not the bottom half of grade school students.

More accurately, AP tests are not given to the middle half of high school students. Around 17% of the testers are low income and, language tests aside, they are also low ability for the most part.

These "new tests" are the baby of Linda Darling-Hammond, who is always going on and on about this. Here's an op-ed about it--check out the part about the Australian biology tests. As if.

I'd be really worried about all this if I thought it had a chance in hell of happening. But again, this is a Darling-Hammond baby, so it's going nowhere. Please god, let it go nowhere.

PS--Thanks!; 9/7/10, 11:10 PM
Mitch said...: Whoops--I said CBT (computer based training) when I meant CAT (computer adaptive testing). Hey, it's a letter.; 9/8/10, 9:44 AM
Curvy said...: "You want a great example of an inner city entrepreneur? I give you Antoine Dodson."

Yes, well, drug-dealing is very entrepreneurial.; 9/8/10, 3:32 PM
Anonymous said...: Entrepreneurship is not taught. It is culturally transmitted as part of formation in a culture, as are most important "life skills."; 9/9/10, 6:25 AM
