Graph Streaming

GSoC developers forum
admin
Gephi Community Manager
Posts:964
Joined:09 Dec 2009 14:41
Graph Streaming

Post by admin » 22 Mar 2010 09:45

This is the thread for asking for more details about the Graph Streaming proposal.

Gayana
Posts:5
Joined:04 Apr 2010 14:11

Re: Graph Streaming

Post by Gayana » 04 Apr 2010 20:30

Hi,
I went through this idea. As I understand it, what Gephi needs is this: if the graph data comes from an external data source that is highly dynamic, there should be a proper monitoring mechanism to gracefully append the correct nodes and edges to the relevant graph. This is the main task of the Graph Streaming API, isn't it?
That led me to these questions.
1) What are the properties of the nodes and edges provided by the external data source?
2) Is that data pre-processed before it is processed by the Streaming API?
(The reason for this question is that there is another project idea called Direct Social Networks Import, which may process the data earlier.)

I really appreciate your help on this topic. A broader explanation of this idea would help me understand it correctly.

Thanks in advance,
Gayana.


Re: Graph Streaming

Post by admin » 05 Apr 2010 09:35

Hi,

The goal is to develop a multi-threaded socket server. The monitoring mechanism already exists in the form of containers: nodes and edges are loaded into a container, an intermediate class which checks the consistency of the data before adding it automatically to the graph.
1) What are the properties of the nodes and edges provided by the external data source?
Every possible property a node or an edge could have. You may find some details in the GEXF file format primer.
2) Is that data pre-processed before it is processed by the Streaming API?
No. As said above, the container is responsible for verifying the data, and we assume that the received graph does not need any additional processing.
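
To make the container idea above concrete, here is a minimal sketch in Python (not Gephi's actual Java API). The class name `Container`, its methods and the `feed` helper are all hypothetical; the only consistency check shown is that every edge must connect two nodes known to the container, and a lock stands in for whatever thread safety the real server would need.

```python
import threading


class Container:
    """Hypothetical buffer that validates elements before they reach the graph."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = {}   # node id -> attribute dict
        self._edges = []   # (source id, target id, attribute dict)

    def add_node(self, node_id, **attrs):
        with self._lock:
            # If the same id arrives twice, merge the attributes.
            self._nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, source, target, **attrs):
        with self._lock:
            self._edges.append((source, target, attrs))

    def flush(self):
        """Return (nodes, valid edges, rejected edges) and empty the buffer."""
        with self._lock:
            nodes = dict(self._nodes)
            valid, rejected = [], []
            for edge in self._edges:
                source, target, _ = edge
                # Consistency check: both endpoints must be known nodes.
                ok = source in nodes and target in nodes
                (valid if ok else rejected).append(edge)
            self._nodes.clear()
            self._edges.clear()
            return nodes, valid, rejected


def feed(container, prefix, count):
    """Simulate one client thread pushing a chain of nodes and edges."""
    for i in range(count):
        container.add_node(f"{prefix}{i}", label=f"{prefix}{i}")
        if i > 0:
            container.add_edge(f"{prefix}{i - 1}", f"{prefix}{i}")


container = Container()
threads = [threading.Thread(target=feed, args=(container, p, 100)) for p in "abc"]
for t in threads:
    t.start()
for t in threads:
    t.join()

container.add_edge("a0", "missing")  # dangling edge: "missing" was never added
nodes, valid, rejected = container.flush()
```

Only the validated nodes and edges would then be appended to the graph. A real container would presumably also check incoming edges against nodes already present in the graph, which this sketch does not do.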


Re: Graph Streaming

Post by Gayana » 06 Apr 2010 21:02

Hi,

Thank you very much for your reply. I went through your reply many times and read the GEXF file format primer as well. Now I have an idea of what the socket server is for and what the Graph Streaming API is supposed to do. I have several questions about the following:

1) What type of input stream would come to the Streaming API? Is it XML like GEXF with dynamic mode enabled, or just a stream of data from which the API is supposed to create the relevant GEXF file to import?

2) Is it like using a Java API to access an XML document and add new tags according to the input stream, such as "who added the element and touched it later", "time stamp", etc.?

I am currently writing my proposal and hope to publish it as soon as possible. If you can give me some tips on making the proposal stand out, and on the most important areas that I must explore and describe for this topic, that would be really helpful. Your comments on the above points are highly appreciated.

Thanks in advance,
Gayana.

mbastian
Gephi Architect
Posts:728
Joined:10 Dec 2009 10:11
Location:San Francisco, CA

Re: Graph Streaming

Post by mbastian » 06 Apr 2010 21:48

Hi Gayana,

You should first point out the problems that lead us to need a Graph Streaming API. Why not just use the Graph API and push data? For instance, imagine three different sources that each use a web crawler and send a graph to Gephi. Nodes are websites identified by the "url" value. The Graph Streaming API would provide a safe container where data can be appended from several threads, and would apply different merge strategies:
* Use the ID and the LABEL values to merge duplicates
* Or keep all elements and set different time slices
* Or set an attribute column that traces the data source name

It's a real challenge to make graph push as simple as it seems to be, with the API taking care of everything. On the user side, the module would be nicely integrated, with the following features:
* Container view, which shows the graph size in the buffer in real time
* Sort by sources, to see what is coming from the different sources
* Merge strategy chooser: pick a strategy and configure it. For instance, choose on which attribute column matching has to be done.
* Append button: choose the workspace to which the new graph is appended
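
The three merge strategies listed above can be sketched roughly as follows, in Python rather than Gephi's Java. The `Incoming` record, the function names and the sample crawler names are all hypothetical, and the third function is only one possible reading of "an attribute column that traces the data source name".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incoming:
    """Hypothetical node as received from one source."""
    source: str  # name of the crawler that sent the node
    time: int    # arrival time, arbitrary units
    url: str     # identifying value
    label: str


def merge_duplicates(items, keys=("url",)):
    """Strategy 1: nodes that agree on the chosen columns become one node."""
    merged = {}
    for item in items:
        key = tuple(getattr(item, name) for name in keys)
        merged.setdefault(key, {"url": item.url, "label": item.label})
    return list(merged.values())


def keep_with_time_slices(items):
    """Strategy 2: keep every element and record when it arrived."""
    return [{"url": i.url, "label": i.label, "slice": i.time} for i in items]


def trace_sources(items, keys=("url",)):
    """Strategy 3: merge, but keep a column listing the contributing sources."""
    merged = {}
    for item in items:
        key = tuple(getattr(item, name) for name in keys)
        node = merged.setdefault(
            key, {"url": item.url, "label": item.label, "sources": []}
        )
        if item.source not in node["sources"]:
            node["sources"].append(item.source)
    return list(merged.values())


items = [
    Incoming("crawler-a", 1, "http://gephi.org", "Gephi"),
    Incoming("crawler-b", 2, "http://gephi.org", "Gephi"),
    Incoming("crawler-c", 3, "http://example.org", "Example"),
]
merged = merge_duplicates(items)
sliced = keep_with_time_slices(items)
traced = trace_sources(items)
```

The `keys` parameter plays the role of the "on which attribute column matching has to be done" setting: passing `keys=("url", "label")` would merge only nodes that agree on both values.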


Re: Graph Streaming

Post by Gayana » 08 Apr 2010 13:45

Thank you very much for your reply.
I have now created my proposal and added it to the proposal list. If you could look at it and give me some feedback, that would be really helpful.

Thanks in advance,
Gayana.


Re: Graph Streaming

Post by Gayana » 19 Apr 2010 14:59

Hi,

In the public reviews of my GSoC proposal, I have been asked for more details about the data format transmitted from the source. I therefore wrote the following explanation. Could you please give me some feedback on it, so I can get a good understanding of the project?

I have a few points to make on this:

1) The document named "Gephi : An Open Source Software for Exploring and Manipulating Networks" mentions that the dynamic module can get network data either from a compatible graph file or from an external source. It further says that a data stream can send network data and the results are immediately visible in the visualization module. In that case there should be a very well designed way of sending data from the source to Gephi. The document does not specify an exact way to do it, or the format of the data to be streamed.

2) I suppose this is still an open area to be discussed, in order to come up with a better generic design. I would not like the source to be told "send the GEXF file directly"; instead, I propose creating an XML-based structure as a common format for sending data from the source to Gephi. This can be designed in a short time, because all the source has to do is push data with a tag that represents the data, as explained below.
The source should only push data elements to Gephi, not nodes and edges. A data element contains the attributes, time slices, source names, etc. When it is pushed to the safe container of Gephi, which is to be developed using the Graph Streaming API, the user can define what the nodes and edges are and create the graph. Then the data source sends data elements, and the safe container knows how to map them to the graph as the user defined earlier. In this way we get the same visualization as pushing the graph in real time.

Best Regards,
Gayana.


Re: Graph Streaming

Post by mbastian » 22 Apr 2010 07:37

"The source should only push data elements to Gephi, not nodes and edges. A data element contains the attributes, time slices, source names, etc. When it is pushed to the safe container of Gephi, which is to be developed using the Graph Streaming API, the user can define what the nodes and edges are and create the graph. Then the data source sends data elements, and the safe container knows how to map them to the graph as the user defined earlier. In this way we get the same visualization as pushing the graph in real time."
That could be a very interesting extension, bringing a generic way to define dimensions of data and let Gephi create the graph structure. The same thinking applies to the Excel import assistant: from a set of columns and values, let the user create different types of network. It would be nice to include this in the API, but I think the regular graph is the priority. Don't hesitate to share your ideas at this general level, though; I think we gain from listing possible use cases.
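
As a rough illustration of the "data elements plus user-defined mapping" idea discussed above, here is a small Python sketch. The function `build_graph`, its parameters and the sample records are all assumptions made for the example; nothing here reflects an existing Gephi API.

```python
def build_graph(records, node_columns, edge_between):
    """Turn flat data elements into nodes and edges using a user-chosen mapping.

    node_columns: columns whose values become nodes.
    edge_between: pair of columns; each record links its two values.
    """
    nodes = set()
    edges = set()
    source_col, target_col = edge_between
    for record in records:
        for column in node_columns:
            # Prefix with the column name so "p1" the paper and
            # "p1" the author would not collide.
            nodes.add((column, record[column]))
        edges.add(
            ((source_col, record[source_col]), (target_col, record[target_col]))
        )
    return nodes, edges


# Data elements as they might arrive from a source: no nodes or edges yet.
records = [
    {"author": "alice", "paper": "p1", "source": "feed-1"},
    {"author": "bob", "paper": "p1", "source": "feed-2"},
    {"author": "alice", "paper": "p2", "source": "feed-1"},
]

# The user decides that authors and papers are nodes, linked per record.
nodes, edges = build_graph(
    records, node_columns=("author", "paper"), edge_between=("author", "paper")
)
```

Choosing a different mapping over the same elements, for example linking `author` to `source`, would produce a different network, which is the kind of flexibility the Excel import assistant comparison points at.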

elishowk
Posts:3
Joined:04 Jun 2010 13:14

Re: Graph Streaming

Post by elishowk » 04 Jun 2010 13:21

About the serialization problem, we think we could propose a GEXF format working with JSON. That would lower the size of messages a lot and fit the "network world" better than XML. Do you agree, and how do you think that is possible? You have more experience with JSON than me.
JSON may be a good choice, but you should also consider the BSD-licensed Protocol Buffers (http://code.google.com/intl/fr/apis/pro ... orial.html), which provides a nice object-oriented API.
Synchronization issues are not directly related to the GEXF format, yet it is an interesting topic. Feel free to comment on this point as well. Read the wiki page and imagine possible use cases. For instance, if several instances of Gephi synchronize, how do we handle versioning and keep the data consistent and up to date everywhere? I personally think I am missing some real-world examples of this. Do you have in mind other projects or articles that could help to see the problems?
About syncing data, I think a decentralized versioning system like git could be a nice and efficient tool.
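
To give a feel for the JSON versus XML size question raised above, here is a small Python sketch that serializes one "add node" message both ways. The JSON event layout (`add_node`, `attrs`) is purely hypothetical; the XML side is modelled on GEXF's `node`, `attvalues` and `attvalue` elements but is only a fragment, not a complete GEXF document.

```python
import json
import xml.etree.ElementTree as ET


def node_as_json(node_id, label, attrs):
    """Encode one node as a compact JSON event (hypothetical layout)."""
    event = {"add_node": {"id": node_id, "label": label, "attrs": attrs}}
    return json.dumps(event, separators=(",", ":"))


def node_as_gexf(node_id, label, attrs):
    """Encode the same node as a GEXF-style XML fragment."""
    node = ET.Element("node", {"id": node_id, "label": label})
    attvalues = ET.SubElement(node, "attvalues")
    for name, value in attrs.items():
        ET.SubElement(attvalues, "attvalue", {"for": name, "value": value})
    return ET.tostring(node, encoding="unicode")


as_json = node_as_json("n1", "gephi.org", {"url": "http://gephi.org"})
as_xml = node_as_gexf("n1", "gephi.org", {"url": "http://gephi.org"})

# Compare the size of the two encodings for this single message.
print(len(as_json), len(as_xml))
```

For this one message the JSON form comes out shorter, but that is a single toy comparison, not a benchmark; a binary format such as Protocol Buffers would need its own measurement.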

wumalbert
Posts:2
Joined:19 Mar 2012 17:17
Location:Tsinghua University, Beijing, China

Re: Graph Streaming

Post by wumalbert » 25 Mar 2012 16:15

Hello,

Since there is a Graph Streaming idea in Gephi for GSoC 2012, I want to know what the difference is between this post and the new graph streaming idea. What is the status of this post? Has it been implemented? I want to be clear about this year's graph streaming idea so that I can write a proper proposal. Thank you!

Best regards

Min WU
