I'm trying to get my mind around a large data warehouse. I started out by filtering the SQL schemas with grep and sed to produce .gv input, which (simplified) looks like this:
digraph dg {
    // Node names containing dots must be quoted to be valid DOT identifiers.
    tablename1 -> "foreignkey1.in.tablename1";
    tablename1 -> "foreignkey2.in.tablename1";
    tablename2 -> "foreignkey1.in.tablename2";
    tablename2 -> "foreignkey2.in.tablename2";
    // and so on...
}
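For anyone who wants to reproduce the input, here is a rough Python sketch of the same filtering step, standing in for my grep and sed pipeline. The sample DDL, the `schema_to_dot` name, and the regular expressions are all made up for illustration; a real schema will almost certainly need a more forgiving parser.

```python
import re

# Hypothetical, heavily simplified DDL; not taken from the real warehouse.
SCHEMA = """
CREATE TABLE client (
    id INT PRIMARY KEY,
    provider_id INT,
    FOREIGN KEY (provider_id) REFERENCES provider (id)
);
CREATE TABLE provider (
    id INT PRIMARY KEY
);
"""

def schema_to_dot(ddl):
    """Emit one 'table -> foreignkey.in.table' edge per FOREIGN KEY clause."""
    lines = ["digraph dg {"]
    tables = re.findall(r"CREATE TABLE\s+(\w+)\s*\((.*?)\);", ddl, re.S | re.I)
    for table, body in tables:
        for column in re.findall(r"FOREIGN KEY\s*\((\w+)\)", body, re.I):
            # Quote both names so dots in the key node are valid DOT.
            lines.append(f'    "{table}" -> "{column}.in.{table}";')
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(schema_to_dot(SCHEMA))
```

The output can be saved as a .gv file and opened the same way as the hand-filtered one.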
The idea was to dump this into Graphviz and let dot have at it. Well, dot did its best, but the result was a large, indigestible lump of nodes and arcs.
Then I stumbled upon Gephi and fed the .gv file to it. I got another indigestible lump, but this time I was able to apply the Yifan Hu Proportional layout to it and zoom in. The most highly-connected nodes stood out from the other ones and allowed me to visually identify the most important top-level tables and relationships in the system. Also, the most interrelated nodes tended to end up together, so closeness became a measure of affinity and helped me envision an eventual partition of the system into families of tables and foreign keys. It looks very much like a star chart, with the client galaxy connected to the service provider galaxy via the intermediate payment relationship galaxy. Tres cool, mes vieux.
But I want more!
I'd like to know how to increase node size depending on the simple number of connections (arcs) it has, and (more power! more! more!), I'd like to know how to set up the Data Table to display only the nodes selected in the Graph window. This would be majorly cool (as my granddaughter might say) when doing interactive graph exploration and selecting little groups of nodes here and there to see what conceptual affinity they have as reflected in the table names, in addition to sheer arc connectivity. I have visions of brown bag sessions using this powerful tool to mine the subject matter experts for database knowledge.
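To pin down what I mean by sizing on "the simple number of connections", here is a small Python sketch that counts arcs per node straight from the .gv edge lines and maps the counts linearly onto a size range. This works outside Gephi and is not a claim about how Gephi does it; the function names and the 10 to 50 size range are my own placeholders.

```python
import re
from collections import Counter

# Matches 'a -> b' edges with or without double quotes around the names.
EDGE = re.compile(r'"?([\w.]+)"?\s*->\s*"?([\w.]+)"?')

def node_degrees(dot_text):
    """Count arcs touching each node (incoming plus outgoing)."""
    degrees = Counter()
    for source, target in EDGE.findall(dot_text):
        degrees[source] += 1
        degrees[target] += 1
    return degrees

def scaled_sizes(degrees, min_size=10.0, max_size=50.0):
    """Map degrees linearly onto [min_size, max_size]; the range is arbitrary."""
    if not degrees:
        return {}
    lo, hi = min(degrees.values()), max(degrees.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all degrees are equal
    return {
        node: min_size + (deg - lo) * (max_size - min_size) / span
        for node, deg in degrees.items()
    }

if __name__ == "__main__":
    sample = """
    digraph dg {
    tablename1 -> foreignkey1.in.tablename1;
    tablename1 -> foreignkey2.in.tablename1;
    tablename2 -> foreignkey1.in.tablename2;
    }
    """
    for node, size in scaled_sizes(node_degrees(sample)).items():
        print(node, size)
```

The most connected tables come out largest, which is the effect I'm after in the Graph window.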
Summarizing,
1. How do I size node representations according to arc connections, and
2. how do I set up the Data Table to display only the nodes and arcs currently selected in the graph?

Posted by jsbenson, 05 Sep 2011 22:08