Unsupported Screen Size: The viewport size is too small for the theme to render properly.

Forum Replies Created

Viewing 10 posts - 1 through 10 (of 10 total)
  • Author
    Posts
  • in reply to: Help With Where During Sync #32299
    Rich
    Member
    'peterlaursen' wrote:

    OK .. I think we can conclude here.

    1)

    The query that Vishal asked you to execute is a simplified query that Data Sync executes. It may be executed multiple times. It will calculate such checksums on CHUNKS of rows. Let us say CHUNK = 2000 rows. If The first CHUNK reveals no difference of data (as per the checksum) we will proceed to next chunk. So it you have 200,000 rows in table it will be executed 100 times even if data are completely identical. And if each execution take 4 seconds it wil be 400 seconds in total to get the information from MySQL. Moreover: if the calculation in a CHUNK reveals differences of data the query will be executed again in a recursive manner for smaller CHUNKs in order to identify exactly what rows are different on source and target. So in conclusion: how many times such query is executed depends on 'how much different' source and target are and how the differences are distributed in the table.

    I guess I am naive in how I thought this would be implemented. I would have expected the sync tool to get an initial result set by executing the query with the where condition, then do the sync'ing operations based strictly on that result set, ignoring all rows not meeting the condition.

    'peterlaursen' wrote:

    2)

    The execution time for the sample query show that it is not optimized well by the MySQL server. Because it uses a calculation in the WHERE clause, the result of the calculation is not stored and thus not indexed. Even if EXPLAIN tells that an index is used I believe it will only be for one/first operand of the calculation. I think taht for each (indexed) 1st operand it will still do a full table scan for comparing with the second operand. If you have SQLyog ULTIMATE profiling the query will reveal more information (in particular check the increase of 'full table scans' for the sample query).

    3)

    Option for improvement (as I can see) include:

    * If WHERE is less efficient than NOWHERE, don't use WHERE.

    * explore the COLUMNS option in Data Sync. The COLUMNS option defines on what columns checksums shoule be calculated (and NOT what columns are sync'ed. All columns are always synced). Specifying only a few and small columns necessary to ensure uniqueness will improve performance of checksums calculation.

    * Tuning of global server parameters could improve a lot. Some buffer settings etc. in MySQL are ridculously low with default parameters and it may result in swapping.

    * If you have access (and can manage) to rewrite the application you may store (and index) the TIMEDIFF in the table. If you want to avoid redundant information you may drop one of the existing columns.

    * Consider replacing MySQL 5.1 with MariaDB 5.2 (it is a drop-in replacement), define and index a 'virtual column' on the TIMEDIFF. However I cannot promise how mcuh it would help in this scenario. You will have to try for yourself in your environment.

    * Consider Replication rather than Data Sync for this.

    I use only the key columns (2) for compare.

    I'm not really disposed to tune the server params although I guess that's how people eventually get educated.

    I would like to use Replication but this operation has to be done over an ssh port with the slave outside of the secure network such that it cannot be pointed to the master. If I could find a source of info that would let me replicate such that the master points to the slaves rather than vice versa, I would do that (if can do over ssh)

    Thanks for putting significant effort into helping investigate this.

    Rich

    in reply to: Help With Where During Sync #32297
    Rich
    Member

    Hi Vishal,

    It takes ~ 3.78s consistently

    Rich

    in reply to: Help With Where During Sync #32295
    Rich
    Member
    'peterlaursen' wrote:

    Even if it may not be relevant please provide the *full* version (not 5.1 but 5.1.57 or whatever).

    But now I understand (I should before) that the WHERE uses a calculation. I think this is the problem. Because the result of the calculation is not stored and thus not indexed either. Some databases have 'virtual columns' (that can be indexed) for handling this. The standard MySQL server does not (but latest MariaDB has).

    Anyway I'll let Vishal and our testers comment too.

    5.1.43

    in reply to: Help With Where During Sync #32293
    Rich
    Member
    'vishal.pr' wrote:

    Hi Rich,

    Please tell what all columns from the table `tag_data5mins` is used in the Columns to Compare option.

    Also the exact where clause you supplied in the SQL Where (nothing more, nothing less).

    TO_DAYS(NOW()) – TO_DAYS(datatime) <= 1

    Once you provide these details we can frame a query that gets the rows to sync and you can execute the same in your environment to see how much time it takes. We can then help you to optimize the WHERE clause used.

    Regards,

    Vishal PR

    Hi,

    Only two columns: Datatime and Tagname

    Where Clause: TO_DAYS(NOW()) – TO_DAYS(datatime) <= 1

    Also, the mySQL version is 5.1 on both servers

    in reply to: Help With Where During Sync #32290
    Rich
    Member

    I have placed an index on the datatime column in both source and target database, yet still the sync time for a table with 2251881 rows is 595s using the WHERE clause when only 68 rows are different.

    A sync without using the WHERE clause is 153s.

    Is there anything else you can help me do to diagnose and figure out a way to speed this up by limiting the rows?

    Thanks

    Rich

    in reply to: Help With Where During Sync #32289
    Rich
    Member
    'peterlaursen' wrote:

    I am not sure if this index will work with such where clause. I think it will not. The DATETIME column is not first/leftmost in the index.

    Please refer http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html “MySQL cannot use an index if the columns do not form a leftmost prefix of the index.”

    Could you please try to add an index on that column alone? I know it could take some time to build the index.

    Also try to EXPLAIN a query using a WHERE clause on the column. Is the index used?

    OK, the following shows the index present (i.e., dateTimeNdx):

    CREATE TABLE `tag_data5mins` (

    `TAGNAME` varchar(255) NOT NULL,

    `DATATIME` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,

    `VALUE` float DEFAULT NULL,

    `EXPIRY` timestamp NULL DEFAULT NULL,

    `LASTFETCHED` timestamp NULL DEFAULT NULL,

    `ERROR_DESC` varchar(512) DEFAULT NULL,

    `QUALITY` float DEFAULT NULL,

    `WWRESOLUTION` float DEFAULT NULL,

    `WWRETRIEVALMODE` varchar(255) DEFAULT NULL,

    PRIMARY KEY (`TAGNAME`,`DATATIME`),

    KEY `dateTimeNdx` (`DATATIME`),

    CONSTRAINT `TAG_DATA5MINS_FK` FOREIGN KEY (`TAGNAME`) REFERENCES `tagnames` (`TAGNAME`) ON DELETE CASCADE

    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

    and in the 'possible_keys' column of the EXPLAIN result, the following

    PRIMARY,dateTimeNdx

    indicates that yes, it is “possibly” being used

    in reply to: Help With Where During Sync #32287
    Rich
    Member
    'peterlaursen' wrote:

    I wonder if you have an index on this DATETIME column? If not a lot of full table scans could happen.

    Anyway it is a little difficult for us to understand. Maybe you could post the SHOW CREATE TABLE output for this table as well as the exact WHERE.clauses you use. Next we can frame queries that you can profile/EXPLAIN.

    I meant to make clear that a primary key is made up of the 'Datatime' (which is timestamp) and 'Tagname' (which is varchar) columns. I am under the impression that they are then indexed.

    The SHOW CREATE TABLE has the following result:

    CREATE TABLE `tag_data15mins` (

    `TAGNAME` varchar(255) NOT NULL,

    `DATATIME` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,

    `VALUE` float DEFAULT NULL,

    `EXPIRY` timestamp NULL DEFAULT NULL,

    `LASTFETCHED` timestamp NULL DEFAULT NULL,

    `ERROR_DESC` varchar(512) DEFAULT NULL,

    `QUALITY` float DEFAULT NULL,

    `WWRESOLUTION` float DEFAULT NULL,

    `WWRETRIEVALMODE` varchar(255) DEFAULT NULL,

    PRIMARY KEY (`TAGNAME`,`DATATIME`),

    CONSTRAINT `TAG_DATA15MINS_FK` FOREIGN KEY (`TAGNAME`) REFERENCES `tagnames` (`TAGNAME`) ON DELETE CASCADE

    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

    I tried the the following query in sqlYog, which would be ok for my sync job (but extended the time as mentioned in original post), however I would really like to always be looking back a full 24 hours:

    SELECT * , TO_DAYS(NOW()) – TO_DAYS(datatime) AS diff FROM tag_data15mins

    WHERE tagname LIKE 'WBSF_INCIN1_BREACHOUTLET'

    AND TO_DAYS(NOW()) – TO_DAYS(datatime) < 1

    ORDER BY datatime DESC

    in reply to: Error No. 2013 – Inconsistent Sync Experience #30739
    Rich
    Member

    Hi Peter,

    I'm sorry for delaying here, I had responded last week and somehow that response never made it into this thread. (maybe I never clicked the Reply button)

    A version update to 8.3 seemed to largely resolve the inconsistency per the log output originally posted.

    Can you verify that the WHERE clause limits the sync activity to the rows that meet the condition?

    If I can get the last 24 rows out of 1.5M using SqlYog in under 2 seconds, why does the sync operation take several minutes?

    I would like to have a binary with more detailed log output to see where all the time is being spent. Maybe it would be helpful for you as well. I would be happy to report back or send you log files to help you.

    Thanks,

    Rich

    'peterlaursen' wrote on '14:

    Now .. please read what I have been writing and next tell what help you expect from us here.  We may for instance provide a build that logs more details. Say something like 10000 rows done .. 20000 rows done …etc. But again with more simultaneously jobs it may be hard to understand.

    in reply to: Error No. 2013 – Inconsistent Sync Experience #30737
    Rich
    Member

    Thank you for responding Peter:

    Bullet 1 – server is on LAN behind firewall, sync is over SSH (I am not using HTTP-tunnel, I am using SSH tunnel)

    Bullet 7 – Source is using v8.2 on Windows 2000 MySql 5.1 (sorry if this wasn't more clear; SQLYog 8.2/Win 2000 Server/MySql 5.1)

    I will appreciate any light you can shed on this.

    Is there any issue with 1.5M rows in the table since I am only sync'ing on last 24 hrs?

    Rich

    'peterlaursen' wrote on '12:

    You did not tell the SQLyog version.  This is important.  I believe the 'sql_mode cannot be set to NULL' issue is solved in recent versions. It is also important whether you use HTTP-tunnel or not.

    Note that you first get “Lost connection to MySQL server during query”.  After that we reconnect automatically but the user variable used to stored the old sql_mode gets reset to NULL (user variables have a session-wide scope only).  We did not handle user variables properly when reconnecting.  Next we try to SET the sql_mode + the value of the user vraible, but as it has become NULL with a reconnection it returns an error.

    This is what I remember right now.  So you please tell 

    1) SQLyog version

    2) HTTP tunnel or not

    .. and I will check if we have more details about this

    in reply to: Error No. 2013 – Inconsistent Sync Experience #30735
    Rich
    Member

    Just a follow-up note.

    The following query executes against the target server in just over two seconds:

    SELECT * FROM tag_data60mins

    WHERE DATATIME > CURRENT_TIMESTAMP – INTERVAL 24 HOUR

    I am running the sync job manually from the UI for the case were there are no new rows, and it has been calculating checksum for over eight minutes.

    The screen shot shows a weird result: 37 rows inserted even the the source and target had the same number of rows. Note the time of 585 seconds.

    Does this spark any ideas?

    Thanks,

    Rich

Viewing 10 posts - 1 through 10 (of 10 total)