Statistics - measure of overlap between populations?


Actaeus
Posts: 606
Joined: Thu Jan 10, 2008 9:21 pm UTC
Location: ZZ9 Plural Z Alpha

The original context of this problem was the search for a good statistic for how connected you are to someone on a social network such as Facebook. I wanted it to be independent of the total friends each person has, because otherwise friend-collectors would throw the whole thing off.

Original formula:
A = set of my friends
B = set of other person's friends
$\frac{|A\cap B|}{|B|}$
Modified to be commutative and to take into account the size of my friend set:
$\frac{|A\cap B|}{\sqrt{|A|\times|B|}}$
Note that this is the geometric mean of the original formula taken both ways (once with A = me, once with A = the other person).
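Spelling out the geometric-mean claim as a quick check (a sketch, assuming both friend sets are non-empty): taken the other way round, the original formula is $\frac{|A\cap B|}{|A|}$, so the geometric mean of the two directions is
$\sqrt{\frac{|A\cap B|}{|A|}\times\frac{|A\cap B|}{|B|}} = \sqrt{\frac{|A\cap B|^2}{|A|\times|B|}} = \frac{|A\cap B|}{\sqrt{|A|\times|B|}}$
which matches the modified formula and is unchanged when A and B are swapped.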
What would be a more useful way to do this? I've been messing with chi-square tests with little success.

Clarification edit: I'm trying to measure how closely our sets of friends overlap. The Jaccard index actually seems like a much simpler and better idea than what I've been doing. It's also similar enough to set off my "someone smarter than me had the same idea, better, a long time ago" dismay reaction. This happens to me far too often.
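For comparison, here is a minimal Python sketch of the three measures mentioned in this post: the original one-sided ratio, the geometric-mean version, and the Jaccard index $\frac{|A\cap B|}{|A\cup B|}$. The function names and the friend sets `me` and `other` are made up for illustration, and returning 0 for empty sets is just one possible convention.

```python
from math import sqrt


def one_sided_overlap(a, b):
    """Original formula: |A & B| / |B| (not symmetric in a and b)."""
    return len(a & b) / len(b) if b else 0.0


def geometric_overlap(a, b):
    """Modified formula: |A & B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0  # assumed convention for empty friend sets
    return len(a & b) / sqrt(len(a) * len(b))


def jaccard_index(a, b):
    """Jaccard index: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical friend sets: 4 friends vs. 9 friends, 2 of them mutual.
me = {"ann", "bob", "cat", "dan"}
other = {"cat", "dan", "eve", "fay", "gus", "hal", "ivy", "jon", "kim"}

print(one_sided_overlap(me, other))   # 2 mutual out of the other's 9
print(one_sided_overlap(other, me))   # 2 mutual out of my 4
print(geometric_overlap(me, other))   # 2 / sqrt(4 * 9)
print(jaccard_index(me, other))       # 2 / 11 distinct people overall
```

Both the geometric-mean version and the Jaccard index give the same value whichever person is treated as A; only the one-sided ratio depends on the order.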
Last edited by Actaeus on Wed Jun 17, 2009 11:17 pm UTC, edited 1 time in total.

GreedyAlgorithm
Posts: 286
Joined: Tue Aug 22, 2006 10:35 pm UTC

Re: Statistics - measure of overlap between populations?

...what?

What are you actually trying to find? |A intersect B| is the number of mutual friends the two people have, done. Why are you dividing by anything? If you want to exclude some people ("friend-collectors") from the domain, just exclude them. Chi-squared tests? Clearly you haven't told us what your actual goal is.

t0rajir0u
Posts: 1178
Joined: Wed Apr 16, 2008 12:52 am UTC
Location: Cambridge, MA

Re: Statistics - measure of overlap between populations?

You should probably tell us what you're actually trying to measure. Right now your thread says "I want to study the number of mutual friends two people have, but I don't want to use the number of mutual friends."