Data Lab improvement: merging of rows

Discussion about future features
admin
Gephi Community Manager
Posts: 964
Joined: 09 Dec 2009 14:41

Post by admin » 15 Jun 2011 08:43

Hello,

I'd like to address a simple use case:

A graph is loaded, laid out, and explored, but then the user finds two nodes that should be a single node (say there is a "duplicate"). The user wants to merge the two nodes rather than create a meta-node, because the goal is to fix an error in the data.

The only way today is to edit the graph file by hand, but this is tricky: you don't always know how to do that, and you have to re-point every edge of the node being merged to the other one. It is even harder to merge node attributes (keep one value, sum the values, or average the values?).
The problem is the same when one wants to merge two edges.

So I'd like to see a new feature which, after selecting two rows, opens a panel to help with the merge. What do you think about that?

eduramiba
Gephi Code Manager
Posts: 976
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data Lab improvement: merging of rows

Post by eduramiba » 15 Jun 2011 21:52

Hi,

So could the user merge only 2 nodes, or any number of nodes?
I guess that the resulting node should have all edges that the nodes have (if not repeated).
To merge the attributes, the user could select a strategy for each column, or just select 1 node and keep all its attributes, for example.

But how would merging edges work, since graphs can't have duplicate edges?

Eduardo
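
To make the "strategy for each column" idea concrete, here is a minimal sketch in Python. It is not Gephi code: the strategy names, the `merge_attributes` helper and the sample rows are all made up for illustration, and rows are assumed to be plain dictionaries.

```python
from statistics import mean

# Hypothetical strategy names; the thread only mentions keeping one value,
# summing the values, or averaging them.
STRATEGIES = {
    "first": lambda values: values[0],
    "sum": sum,
    "average": mean,
}


def merge_attributes(rows, strategy_by_column):
    """Merge a list of attribute dicts into one dict, column by column."""
    merged = {}
    for column, strategy_name in strategy_by_column.items():
        # Ignore missing values so one empty cell does not break sum/average.
        values = [row[column] for row in rows if row.get(column) is not None]
        merged[column] = STRATEGIES[strategy_name](values) if values else None
    return merged


rows = [
    {"label": "Alice", "weight": 2.0, "visits": 10},
    {"label": "alice", "weight": 4.0, "visits": 5},
]
merged = merge_attributes(
    rows, {"label": "first", "weight": "average", "visits": "sum"}
)
print(merged)  # {'label': 'Alice', 'weight': 3.0, 'visits': 15}
```

Selecting "1 node to keep all its attributes" would then just be the `"first"` strategy applied to every column, with the chosen node placed first in `rows`.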

admin
Gephi Community Manager
Posts: 964
Joined: 09 Dec 2009 14:41

Re: Data Lab improvement: merging of rows

Post by admin » 16 Jun 2011 09:54

Yes, a merge strategy for any number of nodes is a good idea!
The merged node would indeed have all the edges.

About the edges, we could let the user choose the new source and target from all the sources and targets of the merged edges. So we just create a new edge and remove the old ones, but we can merge the attributes, which is the goal.
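
A rough sketch of that edge merge, in Python rather than Gephi's own code. Everything here is an assumption made for illustration: edges are stored in a dictionary keyed by a hypothetical edge id, each edge is a `(source, target, weight)` tuple, and the only attribute merged is the weight (summed by default).

```python
def merge_edges(edges, edge_ids, source, target, merge_weight=sum):
    """Replace the edges in edge_ids with a single edge source -> target.

    edges maps a hypothetical edge id to a (source, target, weight) tuple.
    """
    selected = [edges[edge_id] for edge_id in edge_ids]
    # The new endpoints must be picked among the endpoints of the merged edges.
    candidates = {node for s, t, _ in selected for node in (s, t)}
    if source not in candidates or target not in candidates:
        raise ValueError("source and target must come from the merged edges")

    new_weight = merge_weight([weight for _, _, weight in selected])
    # Remove the old edges first, then add the single replacement edge,
    # so the graph never holds duplicate edges.
    for edge_id in edge_ids:
        del edges[edge_id]
    new_id = "+".join(edge_ids)
    edges[new_id] = (source, target, new_weight)
    return new_id


edges = {
    "e1": ("a", "b", 1.0),
    "e2": ("a", "c", 2.5),
    "e3": ("d", "a", 1.0),
}
new_id = merge_edges(edges, ["e1", "e2"], source="a", target="b")
print(edges[new_id])  # ('a', 'b', 3.5)
```

Because the old edges are deleted and only one new edge is created, the "no duplicate edges" constraint is not violated; the merge is really about combining the attributes.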

eduramiba
Gephi Code Manager
Posts: 976
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data Lab improvement: merging of rows

Post by eduramiba » 16 Jun 2011 16:01

Cool,
Then I think I will build a node and edge manipulator to do this with some basic strategies.

Eduardo

elijah
Gephi Community Support
Posts: 169
Joined: 11 Sep 2010 18:09
Location: Stanford, CA

Re: Data Lab improvement: merging of rows

Post by elijah » 16 Jun 2011 21:40

This is an extremely common occurrence with my data. Fixing it by hand right now is tedious for any non-trivial changes and fixing it in the database requires that you reload the data and re-run your processing.

eduramiba
Gephi Code Manager
Posts: 976
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data Lab improvement: merging of rows

Post by eduramiba » 20 Jul 2011 23:25

I finally implemented this node merging and committed it to trunk (revision 2266) :) !

Steps to use it:
Select the nodes to merge, right-click, choose Merge nodes, and choose a strategy for each column.

I hope it helps and that you can use it soon. Also, please file a bug report if you run into any problem.

Josef_K
Posts: 2
Joined: 14 Nov 2011 16:04

Re: Data Lab improvement: merging of rows

Post by Josef_K » 14 Nov 2011 16:23

I am sorry, but am I the only one who has a problem with merging nodes?
It does not work well inside a more complex network. I am still trying to find where exactly the problem occurs, but basically it happens when I try to merge more than two nodes that have several edges: not all of the edges (from the old sub-nodes) are present afterwards.
It looks like it only works well when merging two nodes.

eduramiba
Gephi Code Manager
Posts: 976
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data Lab improvement: merging of rows

Post by eduramiba » 29 Dec 2011 14:45

Hi,
There could be some problem because assigning the edges is a bit tricky.

This feature first creates a new node.
Then, for each edge of each node to merge, it creates an edge between the new node and the other node (undirected or directed, depending on the original edge) if possible. If the other node is the same node, it creates a self-loop on the new node instead.
Finally, it deletes the old merged nodes if desired.

Looking at the code I can see that sometimes the self-loop will not be created (would need to check if the other node is any of the nodes to merge). Is that the problem?

Can you show me an example of wrong behaviour with your data?
Eduardo
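
For reference, the steps above can be sketched roughly like this in Python. This is not the Gephi implementation, only an illustration under simplifying assumptions: the graph is directed, nodes are a set of ids, edges are a set of `(source, target)` pairs, and `merge_nodes` is a made-up name. It includes the check mentioned above, that is, whether the other endpoint is any of the nodes to merge.

```python
def merge_nodes(nodes, edges, to_merge, new_id, delete_old=True):
    """Merge the nodes in to_merge into a new node called new_id.

    nodes is a set of node ids; edges is a set of (source, target) pairs
    describing a directed graph without parallel edges.
    """
    to_merge = set(to_merge)
    nodes.add(new_id)

    def redirect(node):
        # Any endpoint that is being merged now points at the new node.
        # Testing against the whole to_merge set is what turns an edge
        # between two merged nodes into a self-loop on the new node.
        return new_id if node in to_merge else node

    touched = {e for e in edges if e[0] in to_merge or e[1] in to_merge}
    # Using a set drops duplicates, e.g. when two merged nodes share a neighbor.
    new_edges = {(redirect(source), redirect(target)) for source, target in touched}

    if delete_old:
        edges.difference_update(touched)
        nodes.difference_update(to_merge)
    edges.update(new_edges)
    return new_id


nodes = {"a", "a2", "b", "c"}
edges = {("a", "b"), ("a2", "b"), ("a2", "c"), ("a", "a2"), ("b", "c")}
merge_nodes(nodes, edges, ["a", "a2"], "A")
print(sorted(nodes))  # ['A', 'b', 'c']
print(sorted(edges))  # [('A', 'A'), ('A', 'b'), ('A', 'c'), ('b', 'c')]
```

In this toy example the edge between the two merged nodes `a` and `a2` becomes the self-loop `('A', 'A')`, and the two edges to `b` collapse into one, since the graph cannot hold duplicates.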

ignacio.morer
Posts: 3
Joined: 12 Jul 2012 18:24
Location: Zaragoza, Spain

Re: Data Lab improvement: merging of rows

Post by ignacio.morer » 13 Jul 2012 11:13

Hi everyone,

I'm having trouble with this topic as well. I get duplicate nodes when I import data from a .csv file. It happens when a node appears both as a source and as a target. The problem is that Gephi does not identify duplicate nodes even though the information in all the columns is the same.

Any ideas on what is happening?

Thank you,

Ignacio

eduramiba
Gephi Code Manager
Posts: 976
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data Lab improvement: merging of rows

Post by eduramiba » 16 Jul 2012 14:57

Hi Ignacio,
Can you show some part of your file that is problematic?

Eduardo
