[phpBB Debug] PHP Warning: in file [ROOT]/phpbb/session.php on line 583: sizeof(): Parameter must be an array or an object that implements Countable
[phpBB Debug] PHP Warning: in file [ROOT]/phpbb/session.php on line 639: sizeof(): Parameter must be an array or an object that implements Countable
[phpBB Debug] PHP Warning: in file [ROOT]/includes/functions.php on line 4516: Cannot modify header information - headers already sent by (output started at [ROOT]/includes/functions.php:3262)
[phpBB Debug] PHP Warning: in file [ROOT]/includes/functions.php on line 4516: Cannot modify header information - headers already sent by (output started at [ROOT]/includes/functions.php:3262)
[phpBB Debug] PHP Warning: in file [ROOT]/includes/functions.php on line 4516: Cannot modify header information - headers already sent by (output started at [ROOT]/includes/functions.php:3262)
Gephi forums •[SOLVED] Best Layout Algorithm for Massive Network
Page 1 of 2

[SOLVED] Best Layout Algorithm for Massive Network

Posted: 01 Jul 2011 14:19
by PMR3349
I have a network with ~100,000 nodes and ~1,600,000 edges.

Clearly such a monstrosity is going to take some time, but after letting a force-atlas attempt run overnight I check it out this morning to find the program crashed. Is there a better algorithm to use for a gigantic network like this one?

I still haven't ruled out the possibility that the hardware could be the limiting factor, but in case it's not, I'm wondering if one of the algorithms is best. Thanks

Re: Best Layout Algorithm for Massive Newtork

Posted: 01 Jul 2011 16:55
by jacomyma
Hi, I think ForceAtlas2 might be able to deal with networks of this size but I actually didn't test above 50K nodes. If you tried only ForceAtlas "version 1" it might be a good idea to test the second version (available in Gephi 0.8).
By the way, the layout considered as the most adapted to big networks is OpenOrd, which is available as a plug-in in the catalog.

And also, I think that Gephi in 64bits allows more RAM and can handle bigger graphs (if you have a 64bits CPU and at least 4Go RAM).

Re: Best Layout Algorithm for Massive Newtork

Posted: 01 Jul 2011 17:42
by PMR3349
Thanks, I'll give that a try

I got the memory allocation up by increasing it for Java, which helped. It seems that Gephi isn't made to use multiple cores on the CPU though since I can't get it above 25% CPU usage on my quad-core

Re: Best Layout Algorithm for Massive Newtork

Posted: 01 Jul 2011 18:01
by mbastian
Gephi takes advantage of your multi-quores when you are doing several things at the same time. For layout, the OpenOrd is able to use several threads.

Re: Best Layout Algorithm for Massive Newtork

Posted: 01 Jul 2011 20:41
by PMR3349
Thanks for the suggestion. Just thought I should follow up and say that the OpenOrd algorithm worked amazingly well. After waiting hours and finally crashing with other layouts, OpenOrd finished in about 5 minutes! Thanks again for the tip :D

Re: Best Layout Algorithm for Massive Newtork

Posted: 01 Jul 2011 20:48
by pbittner
I planned to write about that a little later but since you are talking multi-core, I’ll say it here. I have recently implemented a multi-core version of the Force Atlas layout in Fortran using OpenMP for the parallelism.

Since the repulsion in O(n2) is where most of the time is spent, I wrote a small function in Fortran that performs the repulsion on all nodes at once. This Fortran function is wrapped in a C++ function which is compiled as a DLL and the DLL is called from Java using JNI.

I have tested it with graphs up to 75000 nodes and it performs very well (around 50x faster than the Java implementation on an 8 cores workstation). However, it obviously doesn’t outperform the O(nlogn) Barnes-Hut repulsion of the new Force Atlas 2 layout. I have therefore started working on a parallel version of the Force Atlas 2 and I’ll release a plug-in containing the two parallel implementations when it is finished.

I think that the performance of the parallel Force Atlas 2 will increase by the number of cores you have (4 times faster with 4 cores) which would be great. It doesn’t solve the issue with the amount of RAM required to handle very large graphs but running Force Atlas 2 on a 100,000 nodes graph will hopefully be quite smooth. :)

My main concern for the plug-in is portability since the DLL has to be compiled separately for each architecture. The first version of the plug-in may only be available on Windows 64bits.

Then, I may have a look at the OpenOrd layout to make it even faster or I may try to use CUDA to speedup Force Atlas using GPGPU. I have very little experience with CUDA but the O(n2) repulsion of Force Atlas looks like a perfect candidate.

Any suggestion?

Re: Best Layout Algorithm for Massive Newtork

Posted: 02 Jul 2011 09:44
by admin
pbittner, THAT sounds amazing! :geek:

Re: Best Layout Algorithm for Massive Newtork

Posted: 03 Jul 2011 16:21
by seinecle
Just great, I can't wait for the release of the plugin!

Re: Best Layout Algorithm for Massive Newtork

Posted: 04 Jul 2011 23:07
by seinecle
A follow-up: how can I make sure that I installed the 64 bit version of Gephi? (there is no 32 vs 64 versions in the download page).

Thx,

Clement

Re: Best Layout Algorithm for Massive Newtork

Posted: 04 Jul 2011 23:45
by eduramiba
Parallel layout algorithms sound great :D !

Clement, to make sure you run Gephi under a 64 bit virtual machine you have to install a 64 bit jre/jdk.
Then you can edit {gephi installation}/etc/gephi.conf file and set larger maximum memory.

If you have problems because other jre/jdk versions are installed then uncomment the jdkhome and put, for example:

Code: Select all

jdkhome="C:\Program Files\Java\jdk1.6.0_25"
Here http://gephi.org/users/install there is some more info.

Eduardo