Anonymize Data
-
- Posts:13
- Joined:08 Jun 2011 07:31
Hello,
Does anyone know of a script or application to anonymize network data (in .CSV format)? I've got some new data I've collected and I want to make it available in the public domain. I've got CSV files that look like this:
ACTORA,ACTORB
ACTORA,ACTORC
ACTORB,ACTORD,ACTORF,ACTORG,ACTORH
ACTORC,ACTORD,ACTORG,ACTORI,ACTORJ
ACTORC,ACTORE
I'm trying to convert ACTORA to 1 and ACTORB to 2, etc. I.e., I want it to look like this:
1,2
1,3
2,4,5,6,7
3,4,6,8,9
3,10
Does anyone know how to do this? Have a look at the data sets at http://snap.stanford.edu/data/index.html. That format would be perfect. I'd really appreciate your help on this, as I want to make the data available to the community.
Thank you kindly!
- eduramiba
- Gephi Code Manager
- Posts:1064
- Joined:22 Mar 2010 15:30
- Location:Madrid, Spain
Re: Anonymize Data
Hi, I don't know of one, but if you are using Gephi you can use the default generated Ids and remove the personal data (for example, by copying the Id column to the label column).
Eduardo
-
- Posts:13
- Joined:08 Jun 2011 07:31
Re: Anonymize Data
Thank you for your help. The problem I have is that both the label and id columns are the same and they contain the personal data (I'm opening a CSV file). Any idea how I can anonymize these?
eduramiba wrote: Hi, well I don't know one but if you are using Gephi, you can use the default generated Ids and remove personal data (copy Id column to label column for example).
Eduardo
- eduramiba
- Gephi Code Manager
- Posts:1064
- Joined:22 Mar 2010 15:30
- Location:Madrid, Spain
Re: Anonymize Data
Oh, I see, I can't find a way to do this easily without programming.
-
- Posts:13
- Joined:08 Jun 2011 07:31
Re: Anonymize Data
Any idea where I can go to get help with writing some code for this? How much code would it be to do something like that?
eduramiba wrote: Oh, I see, I can't find a way to do this easily without programming.
- eduramiba
- Gephi Code Manager
- Posts:1064
- Joined:22 Mar 2010 15:30
- Location:Madrid, Spain
Re: Anonymize Data
It should only take a short piece of code. We can import the file with the Gephi Toolkit, set the node Ids to 1, 2, 3... and export it.
-
- Posts:13
- Joined:08 Jun 2011 07:31
Re: Anonymize Data
Hmm, can you walk me through an example?
eduramiba wrote: It should be a short code. We can import the file with Gephi toolkit, set the Nodes Ids to 1,2,3... and export it.
- eduramiba
- Gephi Code Manager
- Posts:1064
- Joined:22 Mar 2010 15:30
- Location:Madrid, Spain
Re: Anonymize Data
Hi, I just had an idea: you could do it with this wonderful plugin: http://gephi.org/plugins/script-console/
It is really simple:
Open Gephi 0.8 alpha
Go to Tools, Plugins, Available Plugins and there install the Script Console plugin
Reboot Gephi
Open your graph file, copy and paste the following code
Code:
import java.lang.String as String

i = 0
graph = getGraph()
# Give every node a sequential Id in place of its original one
for n in graph.getNodes():
    i = i + 1
    graph.setId(n, String.valueOf(i))
print i, "nodes"
Click Run
That should be enough to anonymize the Id column. For the label column, you can copy the Id column values in Data Laboratory, for example.
Eduardo
-
- Posts:13
- Joined:08 Jun 2011 07:31
Re: Anonymize Data
In addition to the solutions proposed above, I have obtained the following Python code to anonymize the data programmatically:
Code:
import sys

hashes = {}  # maps each original actor name to its numeric ID
count = 1    # next unused ID

with open(sys.argv[1]) as f1:
    for line in f1:
        actors = line.strip("\n").split(',')
        hashActors = []
        for actor in actors:
            try:
                # Reuse the ID if this actor has been seen before
                hashActors.append(hashes[actor])
            except KeyError:
                # First occurrence: assign the next ID
                hashes[actor] = str(count)
                hashActors.append(str(count))
                count += 1
        # Write the anonymized row to standard output
        print(",".join(hashActors))
Thought I would post it here in case someone needs to anonymize data in the future. Thank you to everyone who assisted with this issue! Much obliged.
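For anyone reusing this, here is a minimal sketch of the same first-seen numbering wrapped in a function, so the name-to-ID mapping can also be kept as a private key. The function name anonymize_lines and the variable names are hypothetical choices of mine, not part of the script above, and it assumes plain comma-separated names with no quoted fields.

```python
def anonymize_lines(lines):
    """Replace each comma-separated name with an ID assigned in order of first appearance.

    Hypothetical helper: returns (anonymized_lines, mapping).
    """
    mapping = {}  # original name -> ID string
    result = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines carry no data and are skipped
        ids = []
        for name in line.split(","):
            if name not in mapping:
                # First occurrence: the next ID is the current mapping size plus one
                mapping[name] = str(len(mapping) + 1)
            ids.append(mapping[name])
        result.append(",".join(ids))
    return result, mapping


# Sample rows taken from the first post in this thread
sample = [
    "ACTORA,ACTORB",
    "ACTORA,ACTORC",
    "ACTORB,ACTORD,ACTORF,ACTORG,ACTORH",
    "ACTORC,ACTORD,ACTORG,ACTORI,ACTORJ",
    "ACTORC,ACTORE",
]
anonymized, key = anonymize_lines(sample)
for row in anonymized:
    print(row)
```

On the sample from the first post this prints the five rows requested there, from 1,2 through 3,10. The key dictionary maps the real names to their IDs, so it should be kept out of the published data set.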
-
- Posts:13
- Joined:08 Jun 2011 07:31
Re: Anonymize Data
Not quite sure I understand. Would you be able to elaborate? Are you referring to the Python code?
seniyajw wrote: In the case of parallel edges, I suggest to alert the user and make the "road Import" to act as a CSV file importer, adding weight, if possible, and leave blank the other attributes. I opened a mistake.