Unsupported Screen Size: The viewport size is too small for the theme to render properly.

Utf-8 Sql Errors And Export [help Needed]

forums forums SQLyog SQLyog: Bugs / Feature Requests Utf-8 Sql Errors And Export [help Needed]

  • This topic is empty.
Viewing 20 reply threads
  • Author
    Posts
    • #10371
      ekod
      Member

      I have messed up a database due to incompatibilities in sqlyog 602 I guess, since until now I have not encountered problems with UTF-8 charsets. I will explain below what is the problem, and until it is solved I will not use my recently bought sqlyog.

      First some background info:

      character_set_client utf8

      character_set_connection utf8

      character_set_database latin1

      character_set_results utf8

      character_set_server latin1

      character_set_system utf8

      character_sets_dir /usr/share/mysql/charsets/

      Mysql Server version: 4.1.11

      What is the problem:

      I have a russian text : начать руководство сэкономленного

      This translates to UTF-8 string : начать руководство сэкономленного

      It is stored in the following table:

      Code:
      CREATE TABLE `test` (
      `ID` bigint(20) unsigned NOT NULL auto_increment,
      `post_title` text character set latin1 NOT NULL,
      `post_status` enum('publish','attachment') character set latin1 NOT NULL default 'publish',
      PRIMARY KEY (`ID`),
      KEY `type_status_date` (`post_status`,`ID`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

      So far so good: lets add some data:

      Code:
      INSERT INTO `test` VALUES (1, 'this is first row', 'publish');
      INSERT INTO `test` VALUES (2, 'начать руководÑтво ÑÑкономленного', 'attachment');
      INSERT INTO `test` VALUES (9, 'начать руководство сэкономленного', 'publish');

      Now I add a space in the title of record ID 2:

      In phpmyadmin:

      UPDATE `test` SET `post_title` = 'начать

      руководство сэкономленного ' WHERE `ID` = '2';

      In sqlyog:

      update `test` set `ID`='2',`post_title`='начать руководÑ?тво Ñ?Ñ?кономленного ',`post_status`='attachment' where `ID`='2'

      So the sqlyog messes up:

      начать руководство сэкономленного (original)

      начать руковод?тво ??кономленного (after sqlyog)

      I just made another test, and it also appears in the SQL generated by the export function:

      Code:
      insert into `test`(`ID`,`post_title`,`post_status`) values ('1','this is first row','publish');
      insert into `test`(`ID`,`post_title`,`post_status`) values ('2','начать руководÑ?тво Ñ?Ñ?кономленного','attachment');
      insert into `test`(`ID`,`post_title`,`post_status`) values ('9','начать руководÑ?тво Ñ?Ñ?кономленного','publish');

      If you have similar problems, or have a resolution for it, please let me know. I need it urgently….

    • #24193
      peterlaursen
      Participant

      The charset DEFAULT is utf8, but every string column is defined as 'latin1'. That looks weird!

      If you are not using SQLyog, what are you using now to generate this example!  How was the table and the data created?

      What happens if you simply ALTER TABLE and define utf8 charset for the columns? (back up first!)

    • #24194
      ekod
      Member
      peterlaursen wrote on Jun 7 2007, 02:20 PM:
      The charset DEFAULT is utf8, but every string column is defined as 'latin1'. That looks weird!

      If you are not using SQLyog, what are you using now to generate this example! How was the table and the data created?

      What happens if you simply ALTER TABLE and define utf8 charset for the columns? (back up first!)

      Thank you for the response:

      I changed collation and that works, thats nice for the future, but my data is still corrupted :-(.

      Default server collation was set to latin, DB and table definition was set to UTF.

      This was done by a wordpress install.

      Of course this should be better SQL, but if you dont specify column collation when creating the table (who thinks of that anyway when creating a table) , it will revert to server default. (lin this case latin)

      This is common practice (in a lot of web based installers of cms/blog packages), but in case you have this conflict, SQLyog should see the inconsistency and report that it can lead to corrupted data and DENY the update, because corrupting data is the worst that can happen.

      If I use phpmyadmin as client this does not happen, neither this behaviour appears with any other client (web based, mysql front, didnt try heidisql), and no data gets corrupted, so the combination of database (utf)table(utf)column(latin) is not be the problem.

      I consider this critical behaviour as extremely dangerous.

      I know sqlyog pushes everything to UTF-8, which is good, but it should handle these situations better imho

    • #24195
      peterlaursen
      Participant
      Quote:
      This is common practice (in a lot of web based installers of cms/blog packages), but in case you have this conflict, SQLyog should see the inconsistency and report that it can lead to corrupted data and DENY the update, because corrupting data is the worst that can happen. … I consider this critical behaviour as extremely dangerous.

      I am not sure I understand (or rather I am sure I do not understand!).  Columns will be created with the table default (if exists), next the database default (if exists) and only if neither exist the server default will be used.  Unless of course a charset or collation is explicitly specified for the column.

      Could you please explain more in detail what you think SQLyog has done wrong here and what some other application has done wrong!

      I do not understand either how SQLyog should deny creating columns with user-specified charsets.  There can be lots of situations where it makes good sense to have columns with different charsets.  Although using unicode would be preferable in most situations then (if all applications accesing the data support it).

    • #24196
      ekod
      Member

      Ok, ill have a look at this later, for now Ill try to read http://trac.wordpress.org/ticket/3517 which has a nice resume of the problem.

      If I can pinpoint the problem then Ill post it here.

    • #24197
      peterlaursen
      Participant

      From your link I quote:

      WP is running in semi-unicode and ascii/latin mode. As a result, people with weird languages that require UTF-8 character sets are having major problems. The issue isn't easily detectable, since storing and retreiving UTF-8 data to an SQL database with latin character set seems to work. Unfortuantely, it doesn't really work. WP can store UTF-8 data on a database/table/field with latin character set, but all SQL-based text functions return wrong values.

      Quote:
      semi-unicode and ascii/latin mode

      I think that one day I will invent a moon made from green cheese and call it 'semi moon stone' 😉

      This is very non-standard and probably origines from back in time where MySQL did not support unicode. This trick makes it possible to display data from different languages at the same time without using unicode.

      Put another way, this system stores data as latin1 strings and does a 'client side encoding/decoding' where certain sequences of latin ASCII/ANSI characters has the meaning of non-latin characters.  But this is only the meaning of those data for this specific system. In my opinion you will only be able to read/write such data with the client software/scripts shipped with this system. 

      SQLyog displays data as it is stored in the database!

      but actully, I think the 'workaround' as described in the link too

      Quote:
      The solution is to ALTER all TEXT and related fields to BLOB, then alter the character set and finaly change back the BLOB fields to TEXT.

      … would work actually! And that would be very easy with SQLyog actually!

    • #24198
      ekod
      Member

      Computers make your grey hair come off ….

      Peter, I couldnt agree more with your sig at this time, I think the workaround is ok and I can live with it, but one question (in non nagging mode), the scenario I depicted that you make a db/table in UTF and then do a dds sql without explicitly stating UTF as collation would result in deformed sql is still a risk..

      Can you give me an example where db (utf), table (utf) and column (latin) would make sense? Of course this is not a wordpress support forum, I m just curious how to circumvent possible problems here.

      Ok, I will withdraw my harsh words to sqlyog, but I am still wondering why other clients still can make sense of this without errors (see phpmyadmin).

      FYI, I am using the client in https tunneling mode, I did not try yet with direct 3306 port, could be that there is some issue with SQLyogTunnel.php?

      Thanks for your response!

    • #24199
      peterlaursen
      Participant

      “but I am still wondering why other clients still can make sense of this without errors”  I do not know either.  Maybe becasue it is webbased/PHP.  Do you know any BINARY supporting this?

      An example whre different charsets would make sense is a product catalogue of an international firm.  If 'rail' is 'lang-tung-ting' in Chinese (you need to be Dansih to understand the joke!) than you may have 2 columns – 1 displaying 'rail' in English and aanother 'lang-tung-ting' in Chinese.  But using Unicode all over would be easier of course.  But Unicode and PHP based web applications for instance also has its problems!

    • #24200
      peterlaursen
      Participant

      Please also explain

      Quote:
      Now I add a space in the title of record ID 2:

      In phpmyadmin:

      UPDATE `test` SET `post_title` = 'начать

      руковод�тво ��кономленного ' WHERE `ID` = '2';

      In sqlyog:

      update `test` set `ID`='2',`post_title`='начать руководÑ?тво Ñ?Ñ?кономленного ',`post_status`='attachment' where `ID`='2'

      So the sqlyog messes up:

      начать руководство сэкономленного (original)

      начать руковод?тво ??кономленного (after sqlyog)

      This is not reproducable here!  If i add spaces after opening in the blob viewer spaces DO NOT become ?'s and the SQL from HISTORY goes like

      update `test` set `ID`='2',`post_title`='начать Ñ€ÑƒÐºÐ¾Ð²Ð¾Ð´Ñ Ñ‚Ð²Ð¾ Ñ Ñ ÐºÐ¾Ð½Ð¾Ð¼Ð»ÐµÐ½Ð½Ð¾Ð³Ð¾',`post_status`='attachment' where `ID`='2'

      But also I am not sure what you think is the issue now?

    • #24201
      ekod
      Member

      Ok, i will try again: the problem is that data corruption occurs:

      It happens when latin encoded TEXT field exists, with UTF-8 content inside.

      I use https incl tunnel. Serverparams are already posted

      You said

      Quote:
      SQLyog displays data as it is stored in the database

      , but here it doesnt exactly.

      See the picture for what happens. [attachment=680:thumb_022.gif]

      To test you can use the sql attached, which is generated by phpmyadmin [attachment=681:test.sql.txt]

      I have not been able to try this with direct connection (port 3306)

    • #24202
      peterlaursen
      Participant

      At least we are getting closer in understanding one another.

      You store a utf8 string in a table column created as 'latin1'.

      This will NOT be stored as unicode (in the MySQL sense of the word) but it will be stored as 'latin1' and every byte stored will display as the latin1 glyph for that byte value!  (what the docs for this system calls 'semi-unicode)

      If you omit the 'latin1' desriptor for the column like

      CREATE TABLE `test2` (

      `ID` bigint(20) unsigned NOT NULL auto_increment,

      `post_title` text NOT NULL,

      PRIMARY KEY (`ID`),

      KEY `type_status_date` (`ID`)

      ) ENGINE=MyISAM DEFAULT CHARSET=utf8

      then it is real unicode.  

      But with your data you should still be able to edit the 'semi-unicode/latin1' string.  At least inserting spaces (and any ascii-character actually) should work seamlessly (with most other characters there will be a keyboard problem!).  Also copy/pasting should work!

      SQLyog displays data as it is stored in the database.  If it does not there is a

      bug or at least some issue!

      So far I think we agree now!

      But it works for me here.  With both direct connection and HTTP-tunnelling if I insert a blank between everyone of your characters the SQL goes like

      insert into `test`(`ID`,`post_title`) values ( NULL,'Ð ´ Ñ ï¿½Ñ â€š Ð ² Ð ¾ Ñ ï¿½Ñ ï¿½Ð º Ð ¾ Ð ½')

      Also see attached picture.   Row 2 is inserted with direct connection, row 3 with HTTP connection.  Exactly the same.  And exactly the same SQL.  No '?'marks in the SQL or in the data display!

      It would be very nice if you could find some way to establish direct or SSH connection!  

      Is the SQL with '?'marks in it copied from SQLyog HISTORY tab?

      And when you insert blanks you open the text-string in the BLOB viewer an eidt form here (silly question probably – what else?!! But just to be perfectly sure!)

    • #24203
      peterlaursen
      Participant

      with real unicode the same exercise (inserting blanks) generates the SQL

      Code:
      update `test2` set `ID`='2',`post_title`='д Ñ� Ñ‚ в о Ñ� Ñ� к о н ' where `ID`='2'

      (and my browser does not render this correctly as cyrillic! But in SQLyog HISTORY it displays correctly!)

      This is real unicode as it was meant to be and as we support it!

    • #24204
      ekod
      Member
      Quote:
      At least we are getting closer in understanding one another.

      Always nice 😉 I understand your remarks about real utf and utf stored in latin.

      Quote:
      SQLyog displays data as it is stored in the database. If it does not there is a bug or at least some issue!

      See the screenshots of the sqlyog display. As already said, it does not show the content right.

      Quote:
      It would be very nice if you could find some way to establish direct or SSH connection!

      I ll work on that.

      Quote:
      Is the SQL with '?'marks in it copied from SQLyog HISTORY tab?

      Yes.

      Quote:
      And when you insert blanks you open the text-string in the BLOB viewer an eidt form here (silly question probably – what else?!! But just to be perfectly sure!)

      Affirmative again..

      As soon as I have direct ssh connection available, I will post my findings. Stay tuned..

    • #24205
      peterlaursen
      Participant
      Quote:
      See the screenshots of the sqlyog display. As already said, it does not show the content right.

      But it does for me.  Look at my screenshots!

      I cannot be sure what is the reson that you get '?'marks, but it could be a webserver/PHP related (configuration)issue with cyrillic.  So that is why I would like to know if this happens with HTTP(s) tunnelling only!

    • #24206
      peterlaursen
      Participant

      We have now tried on 4 different client machines (XP and Vista, English and Danish locales) connecting to different server with all sorts of available connection types – including HTTP.

      The '?'mark issue is not reproducable!

    • #24207
      peterlaursen
      Participant

      We have now reproduced this '?'mark thing.  But this is reproducable in MySQL 4.1.10 not in 4.1.21.

      It is wellknown that early 4.1.x MySQL builds had bugs in the charset implementatio

       (that were mostly fixed in 4.1.12 only).

      Now the problem is that quite a lot of Linux distros ship with 4.1.10 (RHEL4, Fedora3 for instance).  However there is an official Red Hat version 4.1.20 build if you prefer to use those (and not downloadables from MySQL mirrors).

      Also very early 5.0.x may have similar bugs!

      Conclusion: MySQL bug, not SQLyog bug. You should upgrade MySQL!   😛

      did I ask you about the MySQL version?

    • #24208
      peterlaursen
      Participant

      BTW: 'Query Browser' from MySQL AB and phpMyAdmin does the same. It is a server issue, not a client issue!

      and also BTW: all connection types are affected!

    • #24209
      ekod
      Member
      peterlaursen wrote on Jun 11 2007, 11:47 AM:
      BTW: 'Query Browser' from MySQL AB and phpMyAdmin does the same. It is a server issue, not a client issue!

      and also BTW: all connection types are affected!

      Damn, thats smart debugging then… I have MySQL 4.1.11-Debian_4sarge7 here, but it is still surprising that phpmyadmin (phpMyAdmin 2.5.7) has no problems with it ( bug + bug = …no bug…) , but phpmyadmin 2.10 it does reproduce the ? error.

      I think your mysql upgrading suggestion is the best way to go! Thanks a lot overthere 🙂

    • #24210

      I have the same kind of problem.

      When I export a dbase (with phpmyadmin) it shows “coupé” in de SQL file.

      When Import this file with SQLYog, (with creation of dbase/tables) it gives “CoupÔ

      When I import this file with mysql command line, everything is okay.

      So it looks like SQLYog can't handle utf-8 imports.

      I really like SQLYog, so a sollution would be nice

    • #24211
      peterlaursen
      Participant

      1) What is the program version you are using?

      Always tell that! and please try 6.03 if you don't allready!

      2) Please attach a very small SQL-file

      You may need to explicitly SET NAMES in the file if it is encoded differently than the server default charset!

      Refer to this FAQ http://webyog.com/faq/34_148_en.html

    • #24212
      peterlaursen
      Participant

      also note that the FAQ was updated wil a small clarification!

Viewing 20 reply threads
  • You must be logged in to reply to this topic.