Modularity randomize option

Computing metrics, community detection and data handling
Post Reply [phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable
tamara_cqr
Posts:3
Joined:19 Feb 2014 17:34
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable
Modularity randomize option

Post by tamara_cqr » 19 Feb 2014 18:11

Hello,

I was curious about randomize option in modularity.

I thought unchecking randomize would result in the same modularity value and the same communities over multiple runs. But this is not what's happening.

I looked at the source code a bit, and it seems to me that randomize option is used to control an order of the nodes being examined in computeModularity method.
So if I uncheck randomize option, I always examine nodes in the same order and I should get the same partition of the graph. Hope I'm not wrong so far...

So there is something else that causes the "randomness" of graph partitions over multiple runs. I did some debugging and it could be the fact that iterating nodeConnectionsWeight keyset (in updateBestCommunity) doesn't give always the same order of communities, thus altering the order of nodes . I tried changing nodeConnectionsWeight to a LinkedHashMap. This made the random option work as I taught it should work - same partition in every run when randomize option is unchecked.

So the question is:
Am I wrong to think that unchecked randomize option should always give the same partition (and why)?
And if I'm not wrong, what is the reason you kept some randomness for an unchecked randomize option (like some modularity algorithm theory that I missed).

Thanks in advance

Tamara

pegerp
Posts:124
Joined:21 Dec 2011 17:10
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Modularity randomize option

Post by pegerp » 19 Feb 2014 23:30

Hi tamara.

I don't know the exact details of modularity implementation in Gephi. However the randomness comes from the fact that each run of the algorithm tries to find good modularity value (Q-value) using approximation. The approximation algorithm works by starting from one node and walking from that node to other nodes with a probability that is relative to the weight of the edge connecting the two nodes. Repeating this kind of "random walk" multiple times some nodes are visited more often than the others and thus the algorithm finds nodes that are highly connected and those that are not - ie. nodes that are in communities together. Because the random walk uses probabilities to traverse network consecutive runs of the algorithm don't produce the same modularity value.

The initial step of the algorithm is to build a random network that has the same node+edge properties (eg. degree) that the network that is being investigated. By comparing connectivity of that random network to the actual network it is possible to evaluate whether the nodes in the actual network are more densely connected than those in the random network. My guess is that the "randomize" option in Gephi builds this initial random network several times during the run of the algorithm to make sure that the randomly generated network actually is random - ie. the random generation didn't accidentally create a network that had dense communities, thus making the actual network seem less connected than the random network.

Note that these are my (hopefully well educated) guesses on the subject. Actual details may vary. There are also many questions regarding modularity at Gephi forums, you might find more info on those threads as well: https://www.google.fi/search?q=site%3Af ... modularity.

Good luck with your study!

tamara_cqr
Posts:3
Joined:19 Feb 2014 17:34
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Modularity randomize option

Post by tamara_cqr » 20 Feb 2014 11:35

Hi pegerp,

thanks a lot for answering.

Your answer helped me understand better the use of a random network in the algorithm (in general, not only Gephi). I admit that it was not really crystal clear.

But my doubt is mostly about the sense and function of Gephi's randomize option.

I think your guess is correct - Gephi builds a random network until no node to community move improves the modularity.

And this makes perfect sense if the randomize option is checked because than rand() is used to randomly select the starting node for every build and the moves node-to-community are evaluated always in a different order of communities (due to an underling HashMap).

But I am still confused about the case of unchecked randomize option because than the starting node is always the same but results still differ because of the moves node-to-community are still evaluated in a different order of communities.

I'm wondering if these moves should be evaluated in the same order of communities instead (of course only for an unchecked randomize option), so to get always the same network partition.

Thanks again

Tamara

vtraag
Posts:3
Joined:16 Feb 2012 14:31
Location:the Netherlands
Contact:

Re: Modularity randomize option

Post by vtraag » 24 Feb 2014 09:40

Hi,

I've just taken a look at the community detection code of Gephi. They actually implement the Louvain method (see http://perso.uclouvain.be/vincent.blond ... uvain.html for some more info). This method basically loops over all nodes and (greedily) puts them in the best community (i.e. best for now, the one that maximizes the improvement of modularity). It then contracts the graphs ('zooms out', makes communities into nodes) and reiterates. Hence, there is no construction of a random graph, nor is there any random walk being performed (this is only the idea that is suggested in the referenced paper, it is not actually implemented here). The only thing that should be affected by the randomize option is the order in which the nodes are being considered for moving to another community. So this is how it should work.

Now, how it actually works, for some reason, apparently, there is some random effect in the code, which shouldn't be there (in my humble opinion). This might be due to the way the code works internally. With some structures, the order of iteration of elements is not fixed (e.g. HashMaps as you suggest), so that there might be diffferences between different runs. I'm not exactly sure where this would originate, one would have to debug the code to be sure.

In any case, I wouldn't worry about it too much. There is absolutely no reason to prefer iterating over nodes from 1,...,n instead of another random node order: it doesn't 'work better' in any way. If the only reason to choose this particular order is to have a stable outcome, why not just pick one and stick to it (i.e. save it somewhere)? On the other hand, if the runs give you very different results each time you run it, you might want to reconsider how 'valid' that particular partition is, if the next time it returns a completely different one. Of course, some minor changes will be acceptable (depending on your point of view), as long as the overall structure remains similar.

Best,

Vincent

tamara_cqr
Posts:3
Joined:19 Feb 2014 17:34
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Modularity randomize option

Post by tamara_cqr » 24 Feb 2014 10:36

Hi,
thanks for answering.

Yeah, there is no random graph construction, just a random (based on the order of extraction of nodes) "output" construction.

And it is possible that the randomness introduces by a HashMap shouldn't be there as you said.

Anyway, I agree that removing randomness is not so important and that actually might help find better solutions. It's just a bit confusing for Gephi users the fact that you have an option to remove randomness and than you get again random results. So I was asked just to find out why is that happening.

Thanks a lot for helping me,

Tamara

Post Reply
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable