[SOLVED] Best Layout Algorithm for Massive Network

Algorithms and parameters to put data in space
User avatar
PMR3349
Posts:9
Joined:28 Jun 2011 17:30
Location:Cambridge, MA
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable
[SOLVED] Best Layout Algorithm for Massive Network

Post by PMR3349 » 01 Jul 2011 14:19

I have a network with ~100,000 nodes and ~1,600,000 edges.

Clearly such a monstrosity is going to take some time, but after letting a force-atlas attempt run overnight I check it out this morning to find the program crashed. Is there a better algorithm to use for a gigantic network like this one?

I still haven't ruled out the possibility that the hardware could be the limiting factor, but in case it's not, I'm wondering if one of the algorithms is best. Thanks

User avatar
jacomyma
Gephi Core Developer
Posts:61
Joined:09 Feb 2010 23:23
Contact:

Re: Best Layout Algorithm for Massive Newtork

Post by jacomyma » 01 Jul 2011 16:55

Hi, I think ForceAtlas2 might be able to deal with networks of this size but I actually didn't test above 50K nodes. If you tried only ForceAtlas "version 1" it might be a good idea to test the second version (available in Gephi 0.8).
By the way, the layout considered as the most adapted to big networks is OpenOrd, which is available as a plug-in in the catalog.

And also, I think that Gephi in 64bits allows more RAM and can handle bigger graphs (if you have a 64bits CPU and at least 4Go RAM).

User avatar
PMR3349
Posts:9
Joined:28 Jun 2011 17:30
Location:Cambridge, MA
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by PMR3349 » 01 Jul 2011 17:42

Thanks, I'll give that a try

I got the memory allocation up by increasing it for Java, which helped. It seems that Gephi isn't made to use multiple cores on the CPU though since I can't get it above 25% CPU usage on my quad-core

User avatar
mbastian
Gephi Architect
Posts:728
Joined:10 Dec 2009 10:11
Location:San Francisco, CA
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by mbastian » 01 Jul 2011 18:01

Gephi takes advantage of your multi-quores when you are doing several things at the same time. For layout, the OpenOrd is able to use several threads.

User avatar
PMR3349
Posts:9
Joined:28 Jun 2011 17:30
Location:Cambridge, MA
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by PMR3349 » 01 Jul 2011 20:41

Thanks for the suggestion. Just thought I should follow up and say that the OpenOrd algorithm worked amazingly well. After waiting hours and finally crashing with other layouts, OpenOrd finished in about 5 minutes! Thanks again for the tip :D

User avatar
pbittner
Gephi Plugin Developer
Posts:35
Joined:18 Mar 2010 23:03
Location:London, UK
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by pbittner » 01 Jul 2011 20:48

I planned to write about that a little later but since you are talking multi-core, I’ll say it here. I have recently implemented a multi-core version of the Force Atlas layout in Fortran using OpenMP for the parallelism.

Since the repulsion in O(n2) is where most of the time is spent, I wrote a small function in Fortran that performs the repulsion on all nodes at once. This Fortran function is wrapped in a C++ function which is compiled as a DLL and the DLL is called from Java using JNI.

I have tested it with graphs up to 75000 nodes and it performs very well (around 50x faster than the Java implementation on an 8 cores workstation). However, it obviously doesn’t outperform the O(nlogn) Barnes-Hut repulsion of the new Force Atlas 2 layout. I have therefore started working on a parallel version of the Force Atlas 2 and I’ll release a plug-in containing the two parallel implementations when it is finished.

I think that the performance of the parallel Force Atlas 2 will increase by the number of cores you have (4 times faster with 4 cores) which would be great. It doesn’t solve the issue with the amount of RAM required to handle very large graphs but running Force Atlas 2 on a 100,000 nodes graph will hopefully be quite smooth. :)

My main concern for the plug-in is portability since the DLL has to be compiled separately for each architecture. The first version of the plug-in may only be available on Windows 64bits.

Then, I may have a look at the OpenOrd layout to make it even faster or I may try to use CUDA to speedup Force Atlas using GPGPU. I have very little experience with CUDA but the O(n2) repulsion of Force Atlas looks like a perfect candidate.

Any suggestion?

admin
Gephi Community Manager
Posts:964
Joined:09 Dec 2009 14:41
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by admin » 02 Jul 2011 09:44

pbittner, THAT sounds amazing! :geek:

User avatar
seinecle
Gephi Community Support
Posts:546
Joined:08 Feb 2010 16:55
Location:Lyon, France
Contact:

Re: Best Layout Algorithm for Massive Newtork

Post by seinecle » 03 Jul 2011 16:21

Just great, I can't wait for the release of the plugin!

User avatar
seinecle
Gephi Community Support
Posts:546
Joined:08 Feb 2010 16:55
Location:Lyon, France
Contact:

Re: Best Layout Algorithm for Massive Newtork

Post by seinecle » 04 Jul 2011 23:07

A follow-up: how can I make sure that I installed the 64 bit version of Gephi? (there is no 32 vs 64 versions in the download page).

Thx,

Clement

User avatar
eduramiba
Gephi Code Manager
Posts:1064
Joined:22 Mar 2010 15:30
Location:Madrid, Spain
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable

Re: Best Layout Algorithm for Massive Newtork

Post by eduramiba » 04 Jul 2011 23:45

Parallel layout algorithms sound great :D !

Clement, to make sure you run Gephi under a 64 bit virtual machine you have to install a 64 bit jre/jdk.
Then you can edit {gephi installation}/etc/gephi.conf file and set larger maximum memory.

If you have problems because other jre/jdk versions are installed then uncomment the jdkhome and put, for example:

Code: Select all

jdkhome="C:\Program Files\Java\jdk1.6.0_25"
Here http://gephi.org/users/install there is some more info.

Eduardo

Post Reply
[phpBB Debug] PHP Warning: in file [ROOT]/vendor/twig/twig/lib/Twig/Extension/Core.php on line 1275: count(): Parameter must be an array or an object that implements Countable