Dynamic Two-Mode Network Analysis (CSV to GEXF)

nikita.andreew
Posts:7
Joined:26 Nov 2011 10:29

Post by nikita.andreew » 05 Apr 2012 18:05

Dear Community!

I would like to share something. I had a dataset containing the start and end dates of projects, the names of these projects and the corresponding lists of participating organizations: very interesting material for dynamic two-mode network analysis. I have never learned programming, and processing all 1745 rows by hand promised to destroy all the fun. So I decided to learn Ruby by writing a script that would do it for me. And it does :) http://andreew.userpage.fu-berlin.de/ruby/picon.txt

The data must be structured so that your resulting CSV file looks like this:

yyyy-mm-dd,yyyy-mm-dd,"Project 1","AAA","BBB","CCC",
yyyy-mm-dd,yyyy-mm-dd,"Project 2","DDD","EEE","FFF","GGG",
yyyy-mm-dd,yyyy-mm-dd,"Project 3","HHH","EEE","AAA",

No header!

Please make sure the cells don't contain double quotes and ampersands!
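
For anyone who wants to check a file against this layout before converting it, here is a minimal sketch of reading such rows with Ruby's standard csv library. It is not the picon script itself: `ProjectRow` and `parse_project_rows` are hypothetical names, and the sketch assumes the dates are valid ISO dates (yyyy-mm-dd) and that a trailing comma simply produces an empty last cell.

```ruby
require "csv"
require "date"

# Hypothetical record type; the field names are illustrative only.
ProjectRow = Struct.new(:start_date, :end_date, :project, :organizations)

# Parse headerless rows of the form: start,end,"Project","Org A","Org B",...
def parse_project_rows(csv_text)
  CSV.parse(csv_text).filter_map do |fields|
    # Drop the empty cell left by the trailing comma, and any blank cells.
    cells = fields.compact.map(&:strip).reject(&:empty?)
    next if cells.size < 3 # skip blank or incomplete lines

    start_date, end_date, project, *orgs = cells
    ProjectRow.new(Date.iso8601(start_date), Date.iso8601(end_date), project, orgs)
  end
end

sample = <<~ROWS
  2010-01-01,2010-12-31,"Project 1","AAA","BBB","CCC",
  2011-03-15,2012-03-14,"Project 2","DDD","EEE",
ROWS

rows = parse_project_rows(sample)
rows.each do |row|
  puts "#{row.project}: #{row.organizations.join(', ')} (#{row.start_date} to #{row.end_date})"
end
```

`Date.iso8601` raises an error on a malformed date, which makes a bad row easy to spot early.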

The script will produce a GEXF file describing a dynamic directed graph in which projects are targets, organizations are sources and dates are coded as spells. It does not yet write the node attributes that are needed to mark the nodes of the two different types. I was just so happy after I managed to get this thing running properly that I decided to add this later. Alternatively, someone else here could do that :) A professional will need much less time for it than I will. At the moment you can mark the projects (or whatever your mode-1 nodes are) by asking Gephi to color the nodes by their InDegree.
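
To make the output structure concrete, here is a minimal sketch of writing such a graph, including a node attribute for the node type. It is an illustration under stated assumptions, not the picon script: the input hashes, `xml_attr`, `spells_xml` and `build_gexf` are hypothetical names, the "type" attribute is an addition for illustration, and the element and attribute names follow the GEXF 1.2draft format as I understand it, so please check them against the specification. Overlapping spells on the same node are not merged here.

```ruby
# Hypothetical input: one hash per project row (key names are illustrative).
projects = [
  { start: "2010-01-01", end: "2010-12-31", name: "Project 1", orgs: %w[AAA BBB] },
  { start: "2011-03-15", end: "2012-03-14", name: "Project 2", orgs: %w[BBB] }
]

# Quote a value as an XML attribute; escapes &, <, > and double quotes.
def xml_attr(value)
  value.to_s.encode(xml: :attr)
end

def spells_xml(spells)
  inner = spells.uniq.map { |s, e| "<spell start=#{xml_attr(s)} end=#{xml_attr(e)}/>" }.join
  "<spells>#{inner}</spells>"
end

def build_gexf(projects)
  node_spells = Hash.new { |hash, key| hash[key] = [] } # [id, label, type] => spells
  edge_spells = Hash.new { |hash, key| hash[key] = [] } # [source, target] => spells

  projects.each do |row|
    spell = [row[:start], row[:end]]
    project_id = "project:#{row[:name]}"
    node_spells[[project_id, row[:name], "project"]] << spell
    row[:orgs].each do |org|
      org_id = "organization:#{org}"
      node_spells[[org_id, org, "organization"]] << spell
      edge_spells[[org_id, project_id]] << spell # organization is source, project is target
    end
  end

  nodes = node_spells.map do |(id, label, type), spells|
    "      <node id=#{xml_attr(id)} label=#{xml_attr(label)}>" \
      "<attvalues><attvalue for=\"0\" value=#{xml_attr(type)}/></attvalues>" \
      "#{spells_xml(spells)}</node>"
  end
  edges = edge_spells.each_with_index.map do |((source, target), spells), i|
    "      <edge id=\"#{i}\" source=#{xml_attr(source)} target=#{xml_attr(target)}>" \
      "#{spells_xml(spells)}</edge>"
  end

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
      <graph mode="dynamic" defaultedgetype="directed" timeformat="date">
        <attributes class="node">
          <attribute id="0" title="type" type="string"/>
        </attributes>
        <nodes>
    #{nodes.join("\n")}
        </nodes>
        <edges>
    #{edges.join("\n")}
        </edges>
      </graph>
    </gexf>
  XML
end

gexf = build_gexf(projects)
puts gexf
```

Because every attribute value goes through `xml_attr`, ampersands and double quotes in cell values are escaped in this sketch rather than breaking the XML.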

As I said, it is my first programming experience, so don't be harsh. The code might look a bit clumsy, and I realize the algorithm is not the fastest. But it works fine on Ubuntu Maverick with Ruby 1.9.3. I will be glad if this script saves some time for people who work with data structured this way. And I will be thankful for constructive criticism.

Best wishes!

Nikita

seinecle
Gephi Community Support
Posts:546
Joined:08 Feb 2010 16:55
Location:Lyon, France

Re: Dynamic Two-Mode Network Analysis (CSV to GEXF)

Post by seinecle » 11 Apr 2012 14:47

Hi,

interesting! I have a kind of similar project.
It is called "Eonydis" and you can see snapshots here: http://www.clementlevallois.net/portofolio.html

It takes a CSV file like this (let's imagine these are mobile communications):

caller,receiver,start of the communication, duration of the communication in seconds, age of caller, age of receiver (these fields could be in any order)
Marc,Gerald,2012/01/01 - 13:12:59,542,23,28
Marc,Denis,2012/01/01 - 17:08:11,51,23,17
Ted,Jan,2011/07/07 - 08:01:13,1201,58,72
Jan,Eloise,2012/03/29 - 23:01:50,117,72,71
...
...


From this kind of CSV file, it creates a GEXF file with:

- nodes and their dynamic attributes (as many attributes as you want)
- edges and their dynamic attributes (as many attributes as you want)
- an attribute can be assigned as edge weight.
- if two values are found for an attribute on the same date, the user can decide to take their average, or the sum
- spells for nodes and edges
- the time format can be specified by the user (see screenshot)
- fields can be ordered in any way (the file just needs to be a csv file)

limitations:
- attributes are just numerical (float), not textual (strings).
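
To illustrate the duplicate-value rule from the list above (average or sum when an attribute has two values on the same date), here is a minimal sketch. It is only an illustration of the idea, not how Eonydis implements it; `merge_duplicates` and the `strategy` parameter are hypothetical names, and the sample values are the first two call durations from the example rows, treated here as if they fell on the same date.

```ruby
# Combine values that share a date, either by summing or by averaging them.
# observations is a list of [date, value] pairs.
def merge_duplicates(observations, strategy: :average)
  observations.group_by(&:first).transform_values do |pairs|
    values = pairs.map(&:last)
    case strategy
    when :sum     then values.sum
    when :average then values.sum.to_f / values.size
    else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
    end
  end
end

durations = [["2012/01/01", 542], ["2012/01/01", 51], ["2011/07/07", 1201]]

summed = merge_duplicates(durations, strategy: :sum)
averaged = merge_duplicates(durations, strategy: :average)

puts summed["2012/01/01"]   # 542 + 51 = 593
puts averaged["2012/01/01"] # 593 / 2 = 296.5
```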

It is basically ready and could be released as an exe file today, but I prefer to write a small tutorial first and release it properly. If anybody is interested, I can already send them the exe file; the user interface is pretty self-explanatory.

[EDIT: I added a link to the exe file for early adopters who want to try it, even without many instructions. They can send me emails with questions or bug reports if they want.
Visit http://www.clementlevallois.net/portofolio.html and click on the download link for Eonydis.]


Best,

Clement


Re: Dynamic Two-Mode Network Analysis (CSV to GEXF)

Post by nikita.andreew » 13 Apr 2012 11:18

Hi Clement,

cool, thanks for sharing! It's great that your program gives the user these degrees of freedom. In my case the columns must be in exactly the order I described in my example. I'm thinking of allowing headers, so that the order of columns could be specified in a configuration file or in a GUI.
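
One possible reading of the header idea, as a minimal sketch: a header row names the three fixed columns and every other column is treated as an organization, so the columns can come in any order. The header names ("start", "end", "project", "org"), `FIXED_COLUMNS` and `rows_by_header` are assumptions made up for this example.

```ruby
require "csv"

# Assumed header names for the fixed columns; all other columns count as organizations.
FIXED_COLUMNS = %w[start end project].freeze

def rows_by_header(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    fixed = FIXED_COLUMNS.to_h { |name| [name.to_sym, row[name]] }
    orgs = row.headers.zip(row.fields)
              .reject { |header, value| FIXED_COLUMNS.include?(header) || value.nil? || value.empty? }
              .map(&:last)
    fixed.merge(orgs: orgs)
  end
end

text = <<~ROWS
  project,org,start,org,end,org
  "Project 1","AAA",2010-01-01,"BBB",2010-12-31,
ROWS

parsed = rows_by_header(text)
parsed.each do |row|
  puts "#{row[:project]} (#{row[:start]} to #{row[:end]}): #{row[:orgs].join(', ')}"
end
```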

One more thought: Is there any initiative collecting such projects? I think it would be cool to have some kind of a toolkit with programs converting different structures of data into different kinds of graphs (one- or two-mode) in GEXF format.

Nikita


Re: Dynamic Two-Mode Network Analysis (CSV to GEXF)

Post by seinecle » 21 Apr 2012 11:13

Yes indeed...

This could be on the wiki of Gephi, but I did not find a suitable page for that yet.
