Sunday, July 4, 2010

Of anchor questions, leaks and Prometric’s plan for a glitch-free CAT 2010

The underlying process for creating and evaluating the Common Admission Test (CAT) 2010 will change little from CAT 2009. Prometric, the testing vendor commissioned by the Indian Institutes of Management (IIMs) to conduct the test, will continue to use the “psychometric process to ensure that the CAT is valid, reliable and fair.”



Speaking to mediapersons in New Delhi last week, Prometric’s Vice President of Test Development Services Stephen Williams said that he was happy with the content creation and evaluation of CAT 2009. “It went exactly as we would have liked it to be,” he said.



Prometric India’s Managing Director Soumitra Roy was quick to clarify that Mr Williams was speaking only about the academic content and evaluation aspect of CAT 2009 (and not the hardware failures and virus attacks).



Much of what Mr Williams told journalists about the processes and standards used in creation and evaluation of the CAT question papers has been shared in the public domain before. I will summarize the highlights and some points of interest captured during the interaction.



In summary, during the test development phase, “Prometric works closely with IIM professors along with specially trained subject matter experts from other well-regarded Indian universities. Each exam question is written, edited and reviewed in an iterative process. All modifications and approvals are tracked electronically in an audit trail.”



“After the tests are administered, the raw scores are calculated on the basis of a +3 for a correct answer, a -1 for a wrong answer while un-attempted questions are ignored. After equating the scores (read below), they are linearly scaled to a 0-450 range before being presented to the candidates as test results.”
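The scoring rule quoted above can be sketched in a few lines of Python. This is only an illustration: the +3/-1 marking and the 0-450 target range come from the quote, but the end points of the linear map (`lo` and `hi`) are assumptions, as is the 60-question paper used in the example. Prometric did not say how the scaling is anchored, and the equating step is left out here.

```python
def raw_score(correct: int, wrong: int) -> int:
    """+3 per correct answer, -1 per wrong answer; unattempted questions are ignored."""
    return 3 * correct - wrong


def scale_linear(score: float, lo: float, hi: float,
                 new_lo: float = 0.0, new_hi: float = 450.0) -> float:
    """Map a score from [lo, hi] onto [new_lo, new_hi] with a straight line.

    lo and hi are hypothetical end points (assumed here to be the lowest and
    highest possible raw scores); the source does not specify them.
    """
    if hi == lo:
        raise ValueError("lo and hi must differ")
    return new_lo + (score - lo) * (new_hi - new_lo) / (hi - lo)


# Illustrative paper of 60 questions (an assumption): raw scores span -60 to 180.
N_QUESTIONS = 60
raw = raw_score(correct=40, wrong=10)  # 10 questions left unattempted
scaled = scale_linear(raw, lo=-N_QUESTIONS, hi=3 * N_QUESTIONS)
print(raw, scaled)
```

Under these assumptions a candidate with 40 correct and 10 wrong answers has a raw score of 110, which lands a little above the two-thirds mark of the 0-450 range.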



Anchor questions, Cloned questions and question leakages



“Each question paper has four ‘anchor questions’, which are used to adjust differences in difficulty between different question papers. Two of these questions would have appeared in the question paper of a previous slot, and the other two would appear in a subsequent slot. All question papers are thus linked together in a sequential chain, the two anchor questions forming the links.” (Mr Williams refused to divulge whether the anchor questions were shared between question papers of consecutive slots.)



The performance of test-takers in these anchor questions is measured to establish a common metric of difficulty between two question papers and then adjust the raw scores accordingly.
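To make the idea concrete, here is a deliberately crude sketch of a mean-based adjustment chained across slots. It is not Prometric's method, which was not disclosed in any detail; the `Link` structure, the field names and all the numbers are hypothetical. The logic is only this: if two groups score the same on the questions they share, any remaining gap in their average scores is attributed to the papers rather than to the candidates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    """Hypothetical summary of two adjacent papers joined by shared anchor questions.

    All values are mean raw points per question.
    """
    prev_anchor_mean: float  # earlier group's mean on the shared anchors
    curr_anchor_mean: float  # later group's mean on the same anchors
    prev_paper_mean: float   # earlier group's mean on its own paper
    curr_paper_mean: float   # later group's mean on its own paper


def difficulty_offset(link: Link) -> float:
    """Per-question gap not explained by ability (positive = later paper easier)."""
    ability_gap = link.curr_anchor_mean - link.prev_anchor_mean
    observed_gap = link.curr_paper_mean - link.prev_paper_mean
    return observed_gap - ability_gap


def chain_offsets(links: List[Link]) -> List[float]:
    """Cumulative offset of every paper relative to the first paper in the chain."""
    offsets = [0.0]
    for link in links:
        offsets.append(offsets[-1] + difficulty_offset(link))
    return offsets


def equate(raw: float, offset_per_question: float, n_questions: int) -> float:
    """Remove the estimated easiness (or hardness) of a paper from a raw score."""
    return raw - offset_per_question * n_questions


# Made-up numbers: paper 2 looks harder than paper 1; paper 3's higher mean
# is fully explained by a stronger group, so it gets no further adjustment.
links = [Link(1.0, 1.0, 1.2, 1.0), Link(0.5, 0.75, 1.0, 1.25)]
offsets = chain_offsets(links)
print(offsets)
print(equate(100, offsets[1], n_questions=60))
```

In this toy chain a raw score of 100 on the second paper is adjusted upward by about 12 points, because that paper appears to be harder by roughly 0.2 points per question. A real equating procedure would also have to deal with sampling noise, which four anchor questions per paper leave plenty of room for.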



So in CAT 2009, which was held over 30 slots (3 slots per day for 10 days), there would have been about 60 anchor questions in total: 30 papers with four anchors each, every anchor being shared by two papers.



Mr Williams said that a few more questions similar in nature but different in form – called ‘cloned questions’ – were repeated across slots. A simplistic example is an algebraic problem in multiple versions, each version having a different set of substitution values. In reality, the differences between the versions would be a lot less apparent, said Mr Williams.



Anchor questions and cloned questions have a lot to do with the controversy over the ‘leakage’ of CAT 2009 questions on anonymous blogs, Orkut communities and coaching institute channels. For a candidate to gain an advantage in a later test slot, the leaked questions would have to fall in the anchor or cloned categories, since only those reappear.



Mr Williams played down the leakages. According to him, people tend to have nervous mindsets during a test and their ability to remember questions and reproduce them accurately a few hours later is overrated. “We went through the blogs and Orkut communities that were accused of sharing questions and found that very few of them had gotten it right. There was only a perception that the questions on the blogs were the same as those in the test, but they were not,” he said.



Not all candidates are exactly ‘nervous’, though. Test-preparation institutes, for example, regularly sent ‘proxy candidates’ (teachers, content developers) to CAT 2009 slots with the express purpose of memorizing questions. These questions were then shared clandestinely, in classrooms and on secret email lists, with test-takers in later slots.



Mr Williams countered that such concerted activity had not worked because “if that were to happen, then statistically we would have noticed strange patterns with the answering of cloned questions which we didn’t.”



Mathematically, the extent of the advantage gained by being privy to an anchor question is an interesting probability problem (anyone?).
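As a first stab at that problem, here is a back-of-the-envelope sketch in Python. It assumes the +3/-1 marking described earlier and a candidate who, without the leak, would get a given question right with probability `p_correct`, wrong with probability `p_wrong`, and skip it otherwise. Both probabilities are made-up inputs, and the effect of equating and scaling on the final score is ignored.

```python
def expected_points(p_correct: float, p_wrong: float) -> float:
    """Expected raw points on one question under +3 / -1 marking."""
    if p_correct < 0 or p_wrong < 0 or p_correct + p_wrong > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return 3 * p_correct - 1 * p_wrong


def leak_gain(p_correct: float, p_wrong: float, n_leaked: int = 1) -> float:
    """Expected raw-score gain from knowing the answers to n_leaked questions.

    Assumes a leaked question is always answered correctly (+3), which
    overstates the gain if the leaked version was remembered inaccurately.
    """
    return n_leaked * (3 - expected_points(p_correct, p_wrong))


# Hypothetical candidate: 50% right, 25% wrong, 25% skipped on such questions.
print(expected_points(0.5, 0.25))         # expected points without the leak
print(leak_gain(0.5, 0.25))               # gain from one leaked question
print(leak_gain(0.5, 0.25, n_leaked=2))   # gain if both earlier-slot anchors leaked
```

For this hypothetical candidate, one leaked question is worth 1.75 raw points on average. The gain shrinks for stronger candidates, who would probably have answered the question correctly anyway.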



CAT 2010 will be held in 32 cities and at fewer centers



The IIMs will release the CAT 2010 advertisement at the end of August, said Mr Roy. Prometric plans to use fewer testing centers this year after its disastrous tryst with India’s hardware infrastructure during CAT 2009.



“We will only choose test centers that adhere to the norms created by us for ensuring infrastructure quality,” he said.



As the testing window will also be longer (more like a month), there will clearly be more slots, more distinct question papers and therefore a larger pool of questions.
