Doubts about Hubs & Authorities implementation
Posted: 02 May 2012 21:02
Hi everybody,
First of all, congratulations on this excellent piece of software you are developing. I started using Gephi a few weeks ago and I'm really impressed with its practical user interface, strong visualization capabilities and the variety of network analysis algorithms it includes.
At the moment I am working with eigenvector centralities for information retrieval, and I am writing to ask about the Hubs & Authorities (HITS) implementation.
My doubts concern some differences I've found between the source code of Hits.java on GitHub and the algorithm as described in the reference article http://www.cs.cornell.edu/home/kleinber/auth.pdf.
More precisely, my questions are about how each node's authority and hub values are updated in each iteration:
(1) At each iteration you initialize the new authority and hub values, "temp_authorities" and "temp_hubs", with the values obtained in the previous iteration, and then add the neighbors' hub or authority values, respectively, to those values. Shouldn't "temp_authorities" and "temp_hubs" start from zero instead?
(2) As I understand it, a node's hub value should be updated by summing the authority values of the nodes reached through its outgoing edges. Why do you iterate over incoming edges, just as you do for the authority update?
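To make both points concrete, here is a minimal Python sketch of the update as I read it in the paper. It is not taken from Hits.java; the function names `hits` and `normalize`, the edge-list input and the fixed iteration count are my own choices for illustration only.

```python
from math import sqrt


def normalize(scores):
    """Scale scores so that their squares sum to 1 (left unchanged if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {n: v / norm for n, v in scores.items()}


def hits(edges, iterations=50):
    """Plain HITS on a directed edge list [(source, target), ...].

    Returns two dicts: (hubs, authorities).
    """
    nodes = sorted({n for edge in edges for n in edge})
    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Point (1): the new scores start from zero, not from the old scores.
        new_auths = {n: 0.0 for n in nodes}
        new_hubs = {n: 0.0 for n in nodes}

        # Authority of a node: sum of hub scores over its INCOMING edges.
        for source, target in edges:
            new_auths[target] += hubs[source]

        # Point (2): hub of a node: sum of the freshly computed authority
        # scores over its OUTGOING edges.
        for source, target in edges:
            new_hubs[source] += new_auths[target]

        auths = normalize(new_auths)
        hubs = normalize(new_hubs)

    return hubs, auths
```

For example, calling `hits([("a", "c"), ("b", "c"), ("a", "d")])` on a toy graph gives node c the highest authority score and node a the highest hub score, which is what I would expect from the paper's definition.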
I've run some tests comparing the hub and authority values obtained from Gephi with those calculated by NetworkX, a Python library that also implements HITS, and the results are clearly different.
Probably my doubts don't point to a real problem and I am missing something important here!
Could you help me understand these differences?
Thanks in advance!
David