Hi all!
I have a little experience with Gephi and I'm new to this forum.
For my work (applying Bayesian reasoning to business and management problems) I have to analyze the structure and the navigation data of a web portal.
Does anyone have suggestions on the best layout, the best data format, or existing literature on the subject? Maybe there's already a thread on the forum, but I couldn't find one.
Thanks!
Marco
Analysis of a web portal with Gephi
Re: Analysis of a web portal with Gephi
Hi,
You should start by using a web crawling engine to collect the links between pages on this web portal. If you are interested in the content of these pages, you'll need web scraping capabilities too. A tool that integrates both (I haven't tried it myself) is http://www.80legs.com/services.html
Alternatively, if you have access to programming resources, several languages have straightforward libraries for crawling and scraping.
Once you have these data, you can visualize the network of pages and the hyperlinks connecting them. Another interesting angle would be to look at the most frequent terms in these pages, connecting two terms if they frequently co-occur on the same page.
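To make the term co-occurrence idea concrete, here is a minimal Python sketch. It assumes the scraped text is already available as a dictionary mapping each URL to its page text; that input format, the function name `cooccurrence_edges`, and the `top_n` and `min_count` parameters are illustrative choices, not something a particular crawler produces.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(pages, top_n=50, min_count=2):
    """Return {(term_a, term_b): number of pages containing both terms}.

    pages: dict mapping URL -> page text (assumed input format).
    top_n: keep only the top_n terms by number of pages they appear on.
    min_count: drop term pairs that co-occur on fewer pages than this.
    """
    # One set of lowercase words (3+ letters) per page: presence, not frequency.
    term_sets = [set(re.findall(r"[a-z]{3,}", text.lower()))
                 for text in pages.values()]

    # Count on how many pages each term appears, then keep the most common ones.
    page_counts = Counter(term for terms in term_sets for term in terms)
    top_terms = {term for term, _ in page_counts.most_common(top_n)}

    # Count each unordered pair of retained terms once per page.
    edges = Counter()
    for terms in term_sets:
        for a, b in combinations(sorted(terms & top_terms), 2):
            edges[(a, b)] += 1

    return {pair: n for pair, n in edges.items() if n >= min_count}
```

Each key in the result is an undirected edge between two terms and the value can serve as its weight. In practice you would probably also want to filter out stop words before counting.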
Best,
Clement
Re: Analysis of a web portal with Gephi
Thanks, Clement!
We have already crawled the portal (we did it ourselves with some R code). The key feature we have been asked to analyze is the behavior of customers on the portal, via an analysis of their navigation sessions.
Should we go for a weighted network to rank the "quality" of the paths?
thanks again,
regards,
Marco
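One way to picture such a weighted network: treat each page as a node and weight a directed edge by how many times visitors moved from one page to the next. The Python sketch below assumes each session is an ordered list of page paths, which is a guess at how the navigation data is laid out; the function names are illustrative. It writes a CSV edge list with `Source`, `Target` and `Weight` columns, which Gephi's spreadsheet import should recognize.

```python
import csv
from collections import Counter


def transition_weights(sessions):
    """Count page-to-page transitions across all sessions.

    sessions: iterable of ordered lists of page identifiers (assumed format).
    Returns a Counter mapping (source, target) -> number of transitions.
    """
    weights = Counter()
    for pages in sessions:
        # Pair every page with the one visited right after it.
        for src, dst in zip(pages, pages[1:]):
            if src != dst:  # skip reloads of the same page
                weights[(src, dst)] += 1
    return weights


def write_edge_list(weights, path):
    """Write the transitions as a directed, weighted CSV edge list."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target", "Weight"])
        for (src, dst), weight in sorted(weights.items()):
            writer.writerow([src, dst, weight])
```

Raw transition counts only say how often a path is used, so ranking the "quality" of paths would still need a definition of quality, for example whether sessions following a path end in a purchase.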