I wrote a statistic that takes a connected graph, computes each node's betweenness centrality (BC), and removes the node with the highest BC. It then repeats the process until the graph becomes fragmented, i.e. it is no longer possible to reach every node from any other node.
It works great - until you reload the same file in a second workspace, that is.
I need to reload the same file because removing nodes destroys information about the graph I loaded. Short of keeping a list of all removed nodes in memory and re-adding them after the statistic is done, the simplest option is to reload the file.
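For reference, here is a minimal sketch of another way to avoid losing the original graph: run the removals on a throwaway copy. It uses a plain `Map<Integer, Set<Integer>>` adjacency map rather than Gephi's API, and the class name `GraphSnapshot` and its methods are made up for illustration. Whether Gephi offers an equivalent way to copy a graph is not something this sketch assumes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: not part of Gephi, only an illustration on plain collections.
public class GraphSnapshot {

    /** Deep-copies the adjacency map so the caller's graph is never mutated. */
    public static Map<Integer, Set<Integer>> copyOf(Map<Integer, Set<Integer>> adjacency) {
        Map<Integer, Set<Integer>> copy = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : adjacency.entrySet()) {
            // Copy each neighbor set too, otherwise both maps would share the same sets.
            copy.put(entry.getKey(), new HashSet<>(entry.getValue()));
        }
        return copy;
    }

    /** Removes a node and every edge that touches it. */
    public static void removeNode(Map<Integer, Set<Integer>> adjacency, Integer node) {
        Set<Integer> neighbors = adjacency.remove(node);
        if (neighbors == null) {
            return; // node was not in the graph
        }
        for (Integer neighbor : neighbors) {
            Set<Integer> back = adjacency.get(neighbor);
            if (back != null) {
                back.remove(node);
            }
        }
    }
}
```

With this, the statistic would call `copyOf` once at the start and only ever call `removeNode` on the copy, so the loaded graph stays intact between runs.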
When I re-open the file and run the statistic on it again, I get a different result than the one I got on the first run. I debugged this for about 4 hours and narrowed it down to something odd.
This is how I get the graph that I use in the statistic's execute(...) method.
GraphController graphController = Lookup.getDefault().lookup(GraphController.class);
ProjectController pc = Lookup.getDefault().lookup(ProjectController.class);
GraphModel model = graphController.getModel();
UndirectedGraph graph = model.getUndirectedGraph();
You see, the way I check if the network is fragmented relies on checking what nodes are reachable from a node - i.e. its neighbors. To that end, I use the getNeighbors(Node) method of my UndirectedGraph object.
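To make the check concrete, here is a simplified sketch of the reachability test described above. It runs a breadth-first search over a plain `Map<Integer, Set<Integer>>` adjacency map. The class name `ConnectivityCheck` and the map representation are stand-ins for illustration only; in the actual statistic the neighbor lookup is the getNeighbors(Node) call on the UndirectedGraph.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: illustrates the check on plain collections, not on Gephi objects.
public class ConnectivityCheck {

    /** Returns true if every node can be reached from every other node. */
    public static boolean isConnected(Map<Integer, Set<Integer>> adjacency) {
        if (adjacency.isEmpty()) {
            return true; // convention chosen for this sketch: an empty graph is connected
        }
        Integer start = adjacency.keySet().iterator().next();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            Integer current = queue.poll();
            Set<Integer> neighbors =
                    adjacency.getOrDefault(current, Collections.<Integer>emptySet());
            for (Integer neighbor : neighbors) {
                // Ignore stale references to nodes that are no longer in the graph.
                if (adjacency.containsKey(neighbor) && visited.add(neighbor)) {
                    queue.add(neighbor);
                }
            }
        }
        // Connected exactly when the search reached every node.
        return visited.size() == adjacency.size();
    }
}
```

Because the graph is undirected, a single search from any one node is enough: if it reaches every node, every pair of nodes is mutually reachable.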
While debugging, I noticed that each node has some internal ID number that is completely different from both the one visible in the data table view (an attribute value read from the gexf file) and the one returned by Node.getId(). On the second run, I could hover my mouse over the node object I got and see the number '20'. Considering that the second run operates on a network that has 35 nodes, why am I still seeing the number 20?
I don't know what to do. I've spent all day on this and it's getting really frustrating. I can tell that the nodes I am operating on the second time around belong to a completely new set of IDs (the ones I get via the Node.getId() method), which is fine. But the graph.getNeighbors(graph.getNode(id)) call seems to return only a subset of the actual neighbors, and I think that has to do with the previous run of my statistic, which removed some nodes and their edges.
I think I must be doing something wrong or using a wrong assumption about how Gephi keeps its graph models, workspaces, and graphs in check...
What should I do?
Sorry for the long post.