Problem with probability of students' scores


eMPiko
Posts: 6
Joined: Wed Jul 11, 2012 11:12 am UTC

Problem with probability of students' scores

Postby eMPiko » Mon Nov 07, 2016 12:11 am UTC

Hi, I have an interesting statistics-related question. I have two groups of students taking a test. Each test is scored on a 0-20 point scale, and naturally the distributions of points for these two small groups are different. My null hypothesis is that both groups have the same distribution of points. Obviously I have only a limited number of samples, so I can't just compare the resulting distributions directly. How should I proceed in accepting or rejecting my null hypothesis?

lorb
Posts: 404
Joined: Wed Nov 10, 2010 10:34 am UTC
Location: Austria

Re: Problem with probability of students' scores

Postby lorb » Mon Nov 07, 2016 7:48 am UTC

The gold standard for discrete variables is a chi-squared test. (Test scores are usually discrete.) As a rule of thumb there should be at least 5 data points per bin (strictly speaking, an expected count of at least 5), so you may need to pool values together. If you prefer to treat them as continuous, you may use a Kolmogorov-Smirnov test or something similar. However, as you say yourself, sometimes there just isn't enough data to do meaningful statistical analysis.
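To make the pooling step concrete, here is a minimal sketch in plain Python. The scores, the bin edges and the helper names `pool_scores` and `chi2_homogeneity` are all made up for illustration; nothing here comes from the OP's data. It pools the 0-20 scores into three bins, computes the Pearson chi-squared statistic for the resulting 2 x k table, and reports the smallest expected count so you can check the rule of thumb.

```python
def pool_scores(scores, edges):
    """Count scores per bin; bin i covers edges[i] <= score < edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(edges) - 1):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def chi2_homogeneity(counts_a, counts_b):
    """Pearson chi-squared statistic for a 2 x k table of bin counts.

    Returns (statistic, degrees of freedom, smallest expected count).
    Bins that are empty in both groups are dropped.
    """
    cols = [(a, b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    n_a = sum(a for a, _ in cols)
    n_b = sum(b for _, b in cols)
    total = n_a + n_b
    stat = 0.0
    min_expected = float("inf")
    for a, b in cols:
        for observed, n_group in ((a, n_a), (b, n_b)):
            expected = n_group * (a + b) / total
            min_expected = min(min_expected, expected)
            stat += (observed - expected) ** 2 / expected
    return stat, len(cols) - 1, min_expected


# Made-up scores for two groups of 20 students each (assumption, not real data).
group_a = [3, 5, 6, 7, 8, 9, 9, 10, 11, 11, 12, 12, 13, 14, 14, 15, 16, 17, 18, 19]
group_b = [2, 4, 5, 6, 6, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 13, 13, 14, 15, 17]

edges = [0, 9, 13, 21]  # hypothetical bins: 0-8, 9-12, 13-20
counts_a = pool_scores(group_a, edges)
counts_b = pool_scores(group_b, edges)
stat, dof, min_expected = chi2_homogeneity(counts_a, counts_b)
print(counts_a, counts_b, stat, dof, min_expected)
```

The p-value is the upper tail of a chi-squared distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.chi2.sf(stat, dof)` should give it, or `scipy.stats.chi2_contingency` can do the whole test on the table of counts.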
Please be gracious in judging my english. (I am not a native speaker/writer.)
http://decodedarfur.org/

Xanthir
My HERO!!!
Posts: 5228
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: Problem with probability of students' scores

Postby Xanthir » Tue Nov 08, 2016 3:20 am UTC

Agree with lorb, a chi-squared test should do you right.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

Zamfir
I built a novelty castle, the irony was lost on some.
Posts: 7312
Joined: Wed Aug 27, 2008 2:43 pm UTC
Location: Nederland

Re: Problem with probability of students' scores

Postby Zamfir » Tue Nov 08, 2016 9:40 am UTC

I am a bit rusty on this, but wouldn't a chi-squared test be overly conservative? It throws away the ordering of the categories: it doesn't know that 5s and 6s are closer together than 5s and 10s. At small sample sizes, it might give too many inconclusive results. And you throw away information by binning.

Something like Mann-Whitney-Wilcoxon might be more appropriate. It assumes that both samples have the same distribution, but possibly shifted along an ordinal axis. You'll get a p-value, telling you whether there is a significant shift. It only requires ordinality for the result categories.
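For what it's worth, the statistic is easy to compute by hand. Below is a rough sketch in plain Python; the function name `mann_whitney_u` and the example scores are my own assumptions. It counts, over all pairs, how often a score from the first group beats one from the second (ties count as half), then turns that into a two-sided p-value with the normal approximation, including the tie correction but no continuity correction. The approximation is only rough for very small groups.

```python
import math
from collections import Counter


def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p-value via the normal approximation)."""
    n1, n2 = len(a), len(b)
    # U counts the pairs where a's score beats b's; a tie counts as half a win.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n = n1 + n2
    # Tie correction: integer test scores will almost always contain ties.
    tie_term = sum(t**3 - t for t in Counter(list(a) + list(b)).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # every score is identical, so no evidence of a shift
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, p


# Made-up scores on the 0-20 scale (assumption, not the OP's data).
group_a = [3, 5, 6, 7, 8, 9, 9, 10, 11, 11, 12, 12, 13, 14, 14, 15, 16, 17, 18, 19]
group_b = [2, 4, 5, 6, 6, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 13, 13, 14, 15, 17]
u, p = mann_whitney_u(group_a, group_b)
print(u, p)
```

If SciPy is installed, `scipy.stats.mannwhitneyu` does the same job and can use an exact method for small samples without ties.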

You might combine this with a variance test (Brown-Forsythe is best, I think), to test whether the spread is different even if the median is not. I am not 100% certain of how to interpret results from that for ordinal categories, but I think that your 0-20 points scheme can be interpreted a tad more strongly than mere ordinality.
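As a sketch of what Brown-Forsythe actually computes (plain Python, with a function name and example numbers that are purely my own): take each score's absolute deviation from its group median, then run an ordinary one-way ANOVA F statistic on those deviations. The code below returns the F statistic and its two degrees of freedom; it does not compute the p-value, and it will divide by zero if the deviations have no spread at all within the groups.

```python
from statistics import mean, median


def brown_forsythe(*groups):
    """Return (F statistic, numerator dof, denominator dof)."""
    # Absolute deviations from each group's own median.
    devs = []
    for g in groups:
        m = median(g)
        devs.append([abs(x - m) for x in g])
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total
    # One-way ANOVA on the deviations: between-group vs within-group spread.
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in devs)
    within = sum((z - mean(d)) ** 2 for d in devs for z in d)
    f_stat = (n_total - k) / (k - 1) * between / within
    return f_stat, k - 1, n_total - k


# Made-up example: same median, but the second group is twice as spread out.
tight = [1, 2, 3, 4, 5]
wide = [-1, 1, 3, 5, 7]
f_stat, dof_num, dof_den = brown_forsythe(tight, wide)
print(f_stat, dof_num, dof_den)
```

The p-value is the upper tail of an F distribution with those degrees of freedom. In SciPy, `scipy.stats.levene(tight, wide, center='median')` should run the same test and return the p-value directly.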

EDIT: an extra thought. If you have the results for individual questions, then you might be able to use that as well. You don't get a single ordinal value for each student, but a binary vector. There must surely exist some clustering algorithm that tells you whether the vectors in one group are significantly more alike than the vectors in another group.

lorb
Posts: 404
Joined: Wed Nov 10, 2010 10:34 am UTC
Location: Austria

Re: Problem with probability of students' scores

Postby lorb » Tue Nov 08, 2016 11:15 am UTC

Mann-Whitney-Wilcoxon does indeed use more information, but it only tells you whether the scores have a different "median". In this case it tells you whether one of the groups of students is doing better or worse on the test than the other. It's a good choice if that is what you want to know and if you can't do a t-test (which isn't clear here).
The drawback is that OP specifically asked for a test that tells them whether the distributions are different, and for that MWW is not the right tool.
Brown-Forsythe on a 0-20 point score is probably fine. It's not designed for ordinal variables, but for this test the scores are likely good enough to be used as numeric variables.
