Gephi forums • Special characters in GEXF import

Special characters in GEXF import

Posted: 22 Apr 2012 18:54
by GapaxGermany
Hello,

I'm importing some data into Gephi via self-written GEXF files. In the GEXF files, I set the size of the nodes:

Code:

<node id="3060473429321147825814991366103" label="Mühe" >
<viz:color r="127" g="201" b="127" a="1" />
<viz:shape value="triangle" />
<viz:size value="30"></viz:size>
</node>
You see that the node has the label "Mühe" with the German special character "ü". The file itself is UTF-8 encoded and seems to be fine.

When I import these files into Gephi, all the special characters are garbled. It seems to me that a string conversion is being done somewhere with the wrong encoding. When I don't add the size information, the file is imported fine and everything looks perfect.
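To illustrate the kind of conversion error suspected here, the following is a minimal sketch, not taken from Gephi: it encodes a label as UTF-8 and then decodes the bytes as ISO-8859-1, which is one common way such damage arises. The class and method names are hypothetical, and the thread does not confirm which wrong charset is actually involved.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those bytes with the wrong
    // charset (ISO-8859-1 is an assumption chosen for illustration).
    // The two UTF-8 bytes of "ü" (0xC3 0xBC) come back as two characters.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String label = "M\u00fche"; // "Mühe", written with an escape
        String broken = misdecode(label);
        // The label grows by one character because "ü" was split in two.
        System.out.println(label.length() + " -> " + broken.length());
    }
}
```

If the labels in the Data Laboratory show this pattern (each special character replaced by two others), a wrong single-byte decoding step is a plausible cause.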

Re: Special characters in GEXF import

Posted: 24 Apr 2012 13:48
by GapaxGermany
Dear all,

I want to bring up this issue again, please forgive me ;-)

I have now tried it not only on Windows but also on Mac, and the problem is the same there. I made a screenshot which shows the problem: on the left, you can see Gephi's Data Laboratory with the wrongly displayed label. On the right is a simple text editor which has opened the same GEXF file and shows the label correctly (it should be "König"):

Image

Re: Special characters in GEXF import

Posted: 24 Apr 2012 14:51
by eduramiba
Hi,
Can you share at least some part of the file to see what can be wrong?

Eduardo

Re: Special characters in GEXF import

Posted: 24 Apr 2012 14:53
by GapaxGermany
I've uploaded the full file ...

... meanwhile, I'm already looking at the code; it seems to be a problem with the XMLStreamReader in ImporterGEXF.java ... ;-)

Re: Special characters in GEXF import

Posted: 24 Apr 2012 15:27
by eduramiba
Adding the Byte Order Mark to the file seems to make it load fine.
But I guess it should not be necessary. I'll check the code.

Eduardo
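The Byte Order Mark workaround mentioned above can be tried without a special editor. This is a minimal sketch, assuming the GEXF file is already UTF-8 encoded; the class and method names are hypothetical and are not part of Gephi.

```java
class Utf8Bom {
    // The UTF-8 Byte Order Mark: EF BB BF.
    static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the data already starts with the UTF-8 BOM.
    static boolean hasBom(byte[] data) {
        return data.length >= 3
                && data[0] == BOM[0] && data[1] == BOM[1] && data[2] == BOM[2];
    }

    // Returns the data with a BOM in front; data that already has one
    // is returned unchanged.
    static byte[] withBom(byte[] data) {
        if (hasBom(data)) {
            return data;
        }
        byte[] out = new byte[data.length + BOM.length];
        System.arraycopy(BOM, 0, out, 0, BOM.length);
        System.arraycopy(data, 0, out, BOM.length, data.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] plain = {'<', 'g', 'e', 'x', 'f', '/', '>'};
        System.out.println(hasBom(plain) + " " + hasBom(withBom(plain)));
    }
}
```

For a real file, the bytes could be read with `java.nio.file.Files.readAllBytes` and written back with `Files.write`. As noted above, the BOM should not be necessary for a UTF-8 XML file, so this is only a workaround.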

Re: Special characters in GEXF import

Posted: 24 Apr 2012 15:34
by GapaxGermany
Eduardo, thank you very much!

I took a look at the code in "ImporterGEXF.java". In "execute", I changed some lines:

Code:

 InputStream in = new ReaderInputStream(reader);
 xmlReader = inputFactory.createXMLStreamReader(in, "UTF-8");
Here, "ReaderInputStream" is a class which converts a Reader to an InputStream (quite old stuff, but it works pretty well here). The xmlReader can then be created with any encoding (here just "UTF-8", hardcoded).

Is there probably a better way of dealing with UTF-8 files?
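One possible direction, sketched under assumptions: if the importer can get at the raw bytes of the file, the standard StAX factory accepts an InputStream without an explicit encoding and then works out the encoding itself from the BOM or the XML declaration. The class and method names below are hypothetical, and this is not the Gephi code; it only shows the plain `javax.xml.stream` behaviour on a small in-memory document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamLabelReader {
    // Returns the "label" attribute of the first <node> element, or null.
    // The parser is given bytes, not chars, so it detects the encoding
    // itself instead of relying on a previously decoded Reader.
    static String firstNodeLabel(InputStream in) {
        try {
            XMLStreamReader xml =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT
                            && "node".equals(xml.getLocalName())) {
                        return xml.getAttributeValue(null, "label");
                    }
                }
                return null;
            } finally {
                xml.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<gexf><node id=\"1\" label=\"K\u00f6nig\"/></gexf>";
        byte[] bytes = doc.getBytes(StandardCharsets.UTF_8);
        String label = firstNodeLabel(new ByteArrayInputStream(bytes));
        System.out.println("K\u00f6nig".equals(label));
    }
}
```

This avoids hardcoding "UTF-8", but it only helps if the bytes reach the parser untouched; whether that is practical inside the importer is not clear from this thread.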

Re: Special characters in GEXF import

Posted: 26 Apr 2012 15:26
by eduramiba
Well, we should not force the charset to UTF-8, since the reader is designed to auto-detect the charset, and it does so on most files.
I'm not an expert on this, but I guess sometimes it is just not possible to detect the charset correctly without the BOM?

Eduardo

Re: Special characters in GEXF import

Posted: 29 Apr 2012 21:54
by mbastian
Should we open an issue for that?

Re: Special characters in GEXF import

Posted: 30 Apr 2012 08:05
by admin
We should re-open this one, as it is the same bug described differently.