Gephi forums • Detect and merge by label

Detect and merge by label

Posted: 24 Jun 2017 23:13
by joshisanonymous
I have a very large directed graph which has many duplicate nodes. For some reason, one node was created for all incoming edges and another for all outgoing edges. For instance, there might be two copies of the node "Jim", one with an indegree of 5 and an outdegree of 0 and the other with an indegree of 0 and an outdegree of 11. I have no idea why that happened, but I think at this point the easiest fix is to merge these duplicates and rerun the statistics.

The problem is that I cannot get Gephi to automatically detect the duplicates by the Label column. It works fine if I go by the degree column or something like that, but that's it. Is there something I'm missing here?

Re: Detect and merge by label

Posted: 26 Jun 2017 09:36
by eduramiba
What's the error or problem when using Label?
A workaround might be to duplicate the label column.

Re: Detect and merge by label

Posted: 01 Jul 2017 01:51
by joshisanonymous
I choose detect and merge, choose the Label column, and then zero duplicates are found despite there being thousands.

I just attempted to duplicate the Label column as Label_copy and the result was the same: zero duplicates found in the Label_copy column.

Re: Detect and merge by label

Posted: 01 Jul 2017 02:01
by joshisanonymous
Perhaps there's something I need to do with the source .csv files that I used to import the nodes and edges to avoid the problem of having duplicate nodes in the first place? My edges .csv file has a column labeled "source" and another labeled "target", with the values in each being names. My nodes .csv file has a single column labeled "id" with one copy of each of those names as values.

Re: Detect and merge by label

Posted: 01 Jul 2017 11:29
by eduramiba
Your labels might not actually be equal. Make sure they use the same characters and don't have leading or trailing spaces.
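
One way to check this outside Gephi is to group the labels by their whitespace-stripped form and list every group that has more than one raw spelling. This is only a rough sketch: the function name `find_whitespace_variants` and the sample labels are made up for illustration, and it only catches leading/trailing whitespace, not other character differences.

```python
from collections import defaultdict


def find_whitespace_variants(labels):
    """Group labels that differ only by leading/trailing whitespace."""
    groups = defaultdict(set)
    for label in labels:
        # Labels that look identical once stripped end up in the same group.
        groups[label.strip()].add(label)
    # Keep only groups that contain more than one raw spelling.
    return {
        clean: sorted(variants)
        for clean, variants in groups.items()
        if len(variants) > 1
    }


# Hypothetical sample: "Jim " (trailing space) is a different string from "Jim".
sample = ["Jim", "Jim ", "Ann", "Bob"]
for clean, variants in find_whitespace_variants(sample).items():
    # repr() makes the otherwise invisible trailing space visible.
    print(clean, [repr(v) for v in variants])
```

Printing with `repr()` is the useful part here, since a trailing space is invisible in a normal table view.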

Are you importing an edges spreadsheet? Can you show a screenshot of your data laboratory tables?

Re: Detect and merge by label

Posted: 01 Jul 2017 15:35
by joshisanonymous
Ah, there is a trailing space only on the target column. That must be the problem. Thank you.

Re: Detect and merge by label

Posted: 01 Jul 2017 15:54
by eduramiba
Great! For your info, this will be more user friendly in Gephi 0.9.2, where all values will be trimmed. You can try the latest snapshot (prebuild) at https://github.com/gephi/gephi#nightly-builds
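
For anyone on an earlier version, a possible workaround is to trim every cell of the .csv files before importing them. The snippet below is a minimal sketch using Python's standard `csv` module; the function name `trim_csv` and the file names in the usage comment are assumptions, not anything from this thread.

```python
import csv


def trim_csv(src_path, dst_path):
    """Copy a CSV file, stripping leading/trailing whitespace from every cell."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # "Jim " and "Jim" become the same value after stripping.
            writer.writerow([cell.strip() for cell in row])


# Example usage (hypothetical file names):
# trim_csv("edges.csv", "edges_trimmed.csv")
# trim_csv("nodes.csv", "nodes_trimmed.csv")
```

Importing the trimmed files instead of the originals should mean the names in the source and target columns match the node ids exactly, so no duplicate nodes get created in the first place.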