Data preparation to show correlation between classes and keywords

Chazu1989
Posts: 4
Joined: 01 Oct 2017 15:32

Data preparation to show correlation between classes and keywords

Post by Chazu1989 » 01 Oct 2017 15:41

I'm totally new to Gephi and I'm already running into problems, starting with the data preparation.

I have the following data (sample):
Image

What I want to do is show a correlation between the keywords and the classes. It should be a simple task, but I can't figure it out.

Classes run from 0 to 10. An individual keyword can occur multiple times and can be connected to different classes.

For example:
"Keyword 1" can have a connection to "Class 1" and "Class 6"
"Keyword 2" can have a connection to "Class 2"
"Keyword 4" can have a connection to "Class 8", "Class 6" and "Class 2", and so on.

What I want to do is show the classes as nodes with the keywords around them. Each keyword is connected to a class by a line whose thickness depends on the number of connections between that keyword and that class.

Example with paint :)

Image


I am totally lost :? I hope someone can help me with this.

eduramiba
Gephi Code Manager
Posts: 922
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data preparation to show correlation between classes and keywords

Post by eduramiba » 01 Oct 2017 16:25

You can use the spreadsheet import and name your classes and keywords columns "Source" and "Target".

https://gephi.org/users/supported-graph ... readsheet/
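
As a rough illustration of the suggestion above, here is a minimal Python sketch that turns raw keyword/class pairs into an edge table. It rests on a few assumptions that are not stated in the thread: the raw data is available as (keyword, class) pairs, the keyword goes in "Source" and the class in "Target", and a "Weight" column holding the number of repeated pairs is what drives edge thickness in Gephi. The helper names `build_edge_rows` and `write_edges_csv` are hypothetical.

```python
import csv
import io
from collections import Counter


def build_edge_rows(pairs):
    """Count repeated (keyword, class) pairs; each count becomes an edge weight."""
    counts = Counter(pairs)
    return [(keyword, cls, weight) for (keyword, cls), weight in sorted(counts.items())]


def write_edges_csv(rows, out):
    """Write an edge table with Source, Target and Weight columns."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(rows)


# Made-up sample pairs, only to show the shape of the data.
pairs = [
    ("Keyword 1", "Class 1"),
    ("Keyword 1", "Class 6"),
    ("Keyword 1", "Class 6"),
    ("Keyword 2", "Class 2"),
]

rows = build_edge_rows(pairs)
buffer = io.StringIO()
write_edges_csv(rows, buffer)
edges_csv = buffer.getvalue()
print(edges_csv)
```

Because every edge runs from a keyword to a class, keywords are never linked to each other. To write a real file, pass `open("Edges.csv", "w", encoding="utf-8", newline="")` instead of the in-memory buffer.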

Chazu1989
Posts: 4
Joined: 01 Oct 2017 15:32

Re: Data preparation to show correlation between classes and keywords

Post by Chazu1989 » 01 Oct 2017 17:12

Thanks for your fast reply.

But I'm a "little" bit confused:

The keywords shouldn't be connected to each other, only to the classes.
So I have created the following two files.

Nodes.txt as UTF-8 with BOM
Attachment: Nodes.txt (139 bytes)
Screenshot: Nodes.JPG (14.96 KiB)

Edges.txt as UTF-8 with BOM
Attachment: Edges.txt (73.28 KiB)
Screenshot: Edges.JPG (24.84 KiB)
But when I try to import the edges table, I get the error message: Edges table needs a 'Source' and 'Target' column with nodes ids

eduramiba
Gephi Code Manager
Posts: 922
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data preparation to show correlation between classes and keywords

Post by eduramiba » 01 Oct 2017 17:39

Hi,
Try using the .csv (or even .xlsx) extension. Also, never include the BOM.
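
For illustration, one way to drop the BOM and switch the extension is a small Python script like the sketch below. The function name `strip_bom_to_csv` is hypothetical, and the sample file contents are made up for the demo. Reading with the `utf-8-sig` codec removes a leading BOM if there is one, and writing with plain `utf-8` does not add one back.

```python
import os
import tempfile


def strip_bom_to_csv(src_path, dst_path):
    """Copy a UTF-8 text file to dst_path, dropping a leading BOM if present."""
    # "utf-8-sig" skips the BOM on read; files without a BOM are read unchanged.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Plain "utf-8" never writes a BOM.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Demo with a throwaway file that starts with the UTF-8 BOM bytes EF BB BF.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "Edges.txt")
dst = os.path.join(workdir, "Edges.csv")
with open(src, "wb") as f:
    f.write(b"\xef\xbb\xbfSource,Target\nKeyword 1,Class 1\n")

strip_bom_to_csv(src, dst)

with open(dst, "rb") as f:
    converted = f.read()
print(converted)
```

Most text editors can do the same thing through a "save as UTF-8 without BOM" style option, so the script is only needed when several files have to be converted.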

Chazu1989
Posts: 4
Joined: 01 Oct 2017 15:32

Re: Data preparation to show correlation between classes and keywords

Post by Chazu1989 » 01 Oct 2017 18:38

Ahhh thanks!!
It works as a .txt file without the BOM. :mrgreen:

Now I can test a bit.

Thanks again!

Another question... I have about 4 million keywords for these 10 classes. Is that too many, or could it work?

eduramiba
Gephi Code Manager
Posts: 922
Joined: 22 Mar 2010 15:30
Location: Madrid, Spain

Re: Data preparation to show correlation between classes and keywords

Post by eduramiba » 01 Oct 2017 19:26

Are you still using 0.9.1? 0.9.2 is newer, more stable and friendlier, but you will have to change the extension from .txt to .csv (no changes to the file itself are required).

4 million should work, but you might need to increase Gephi's memory limit in gephi.conf.
