Statistics - measure of overlap between populations?



Post by Actaeus » Sat May 30, 2009 11:38 am UTC

The original context of this problem was the search for a good statistic for how connected you are to someone on a social network such as Facebook. I wanted it to be independent of the total friends each person has, because otherwise friend-collectors would throw the whole thing off.

Original formula:
A = set of my friends
B = set of other person's friends
[math]\frac{|A\cap B|}{|B|}[/math]
Modified to be commutative and to take into account the size of my friend set:
[math]\frac{|A\cap B|}{\sqrt{|A|\times|B|}}[/math]
Note that this is the geometric mean of the original formula taken both ways (once with A = me, once with A = the other person).
What would be a more useful way to do this? I've been messing with chi-square tests with little success.
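For concreteness, here is a minimal Python sketch of the two ratios above, with friend lists modelled as plain sets. The function names and the sample friend lists are made up for illustration, and returning 0.0 for an empty friend set is an assumption, not part of the original formulas.

```python
import math


def overlap_fraction(a: set, b: set) -> float:
    """Original formula: |A ∩ B| / |B|."""
    if not b:
        return 0.0  # assumption: treat an empty friend set as zero overlap
    return len(a & b) / len(b)


def symmetric_overlap(a: set, b: set) -> float:
    """Modified formula: |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0  # same assumption as above
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical friend lists: 4 friends vs. 9 friends, 2 in common.
mine = {"ann", "bob", "cat", "dan"}
theirs = {"bob", "cat", "eve", "fay", "gus", "hal", "ian", "jo", "kim"}

one_way = overlap_fraction(mine, theirs)    # 2 / 9
other_way = overlap_fraction(theirs, mine)  # 2 / 4
both_ways = symmetric_overlap(mine, theirs) # 2 / sqrt(36)

# The symmetric version equals the geometric mean of the two one-way ratios.
assert math.isclose(both_ways, math.sqrt(one_way * other_way))
```

The one-way ratio changes when the two people swap roles, while the symmetric version does not, which is the commutativity mentioned above.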

Clarification edit: I'm trying to measure how closely our sets of friends overlap. The Jaccard index actually seems like a much simpler and better idea than what I've been doing. It's also similar enough to set off my "someone smarter than me had the same idea, better, a long time ago" dismay reaction. This happens to me far too often.
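As a rough sketch of the Jaccard index mentioned here: it is the size of the intersection divided by the size of the union, so it is symmetric and always lands between 0 and 1. The function name and sample sets are hypothetical, and returning 0.0 when both sets are empty is a convention chosen for this example.

```python
def jaccard_index(a: set, b: set) -> float:
    """Jaccard index: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty friend sets count as no overlap
    return len(a & b) / len(union)


# Hypothetical friend lists: 2 mutual friends out of 11 distinct people.
mine = {"ann", "bob", "cat", "dan"}
theirs = {"bob", "cat", "eve", "fay", "gus", "hal", "ian", "jo", "kim"}

score = jaccard_index(mine, theirs)  # 2 / 11
```

Identical friend sets score 1 and disjoint ones score 0, regardless of how many friends either person has.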
Last edited by Actaeus on Wed Jun 17, 2009 11:17 pm UTC, edited 1 time in total.


Post by GreedyAlgorithm » Sun May 31, 2009 4:09 am UTC


What are you actually trying to find? |A intersect B| is the number of mutual friends A and B have, done. Why are you dividing by anything? If you want to exclude some people ("friend-collectors") from the domain, just exclude them. Chi-squared tests? Clearly you haven't told us what your actual goal is.


Post by t0rajir0u » Sun May 31, 2009 6:34 am UTC

You should probably tell us what you're actually trying to measure. Right now your thread says "I want to study the number of mutual friends two people have, but I don't want to use the number of mutual friends."
