forums › forums › SQLyog › Sync tools, Migration, Scheduled Backup and Notifications › Boosted Two-way Sync (more Than Mysql Cluster?)
- This topic is empty.
-
AuthorPosts
-
-
October 4, 2006 at 1:55 am #9935sdozonoMember
This is just a suggestion.
I tested two-way sync feature of SQLyog. Because, so far, MYSQL Cluster needs more than 16GB RAM!
http://www.mysql.com/products/database/cluster/faq.html and this is not for my small project.
At first look, two-way sync looks suitable for synchronization, but there was a problem. If two servers have a same Primary Key, SQLyog will ALWAYS update the data in the “SOURCE” server even the “TARGET” server has the latest infomation. So I couldn't use this feature for “making a second server for emergency situation.” For my project, each existed data is being updated frequently.
The problem is, SQLyog doesn't know “WHICH DATA IS THE LATEST”!
I'm a php programmer (so far…), and it is easy for me to add a “MODIFIED”(timestamp) field in my DB-Table and trace the time of transaction. And recently, I'm using CakePHP Framework, and it automatically save the time if there is “modified” field in a DB-Table. Other frameworks might have that kind of feature. (May be MYSQL itself should have this…. 🙄 )
If SQLyog can make use of the “modified” field in DB-Table, the two way sync feature may became very useful. Maybe, check option and “field name” option will be needed.
Then, we can achive real “synchronization” with SQLyog even with shared servers, and you don't have to set up complicated environment.
What's your opinion?
-
October 4, 2006 at 3:58 am #22682peterlaursenParticipantQuote:The problem is, SQLyog doesn't know “WHICH DATA IS THE LATEST”!
This is nothing new! We have been discussing for years! Once every second month someone gets the 'smart' idea' to use a timestamp. That is just too simple! There are 'replication managers' for Access and Oracle that we are studying. And yes: Once we get finisshed with the 5.1 stuff (events, partitions) we are actually planning what we can do of the stuff!
Now who says that 'latest is right' is the rule in this situation? It could also be a 'user hierachy'. Or one column could be more important than the other. There are lots of other possibilities. And what if a record has been updated at one server and deleted at the other?
I think the solution is another: To save the checksum on both servers! Because we can then compare the checksum just generated with the checksum from last sync and 'row-based synchronisation' will be possible – that is for each row it will be possible to tell on what server data has changed.
-
October 4, 2006 at 5:37 am #22683TomBezMemberpeterlaursen wrote on Oct 4 2006, 05:58 AM:This is nothing new! We have been discussing for years! Once every second month someone gets the 'smart' idea' to use a timestamp. That is just too simple! There are 'replication managers' for Access and Oracle that we are studying. And yes: Once we get finisshed with the 5.1 stuff (events, partitions) we are actually planning what we can do of the stuff!
WOW getting inside the replication manager from oracle and understand not only the basic concepts is a big thing and not very easy.
the problem is, replication if you want to make it really clean, what means that every change in every dataset from the master has to be done also on the slave, you will need to have some kind of archive logs to get the changes and not only the changed data. especially if you have mechanismen to historize your data inside the database, you will either have to replicate the historized data aswell, or to get all update/insert/delete requests to database since last replication. I'm not sure if MySQL does have the feature of archive logs where those information is stored, but I don't really think so.
Maybe the solution shouldn't be in the complicated way db2/oracle/etc are handle that. How about a simplified replication mechanism (simple & stupid)? some kind reduced to a timestamp-based replication instead of a full replication. I see the main problem in that solution that you must trust foreign application that they will update the timestamp correctly. but I think with a FAQ that explains that you will cover at least 80% of the replication needs.
regards
-
October 4, 2006 at 6:28 am #22684peterlaursenParticipantQuote:I'm not sure if MySQL does have the feature of archive logs
It does not! With MySQL 5.1 you can configure the 'binary log' for row-based repleication, and that log could in principle be used. But in principle only – in practice it is not practicable for preformance reasons. It is alos a log for a complete server and not a database or a table.
Some on the new emerging Storage engines have better logging option. In particular the PrimeXT engine where the table itself is organised as a log.
Quote:How about a simplified replication mechanism (simple & stupid)? some kind reduced to a timestamp-based replicationI still think that saving the 'checksum for last sync' and comparing each row of data with that for each row is a more perfect solution, because:
1: A MySQL timestamp has an accuracy of 1 second only. Certain data may be updated several times per second on fast servers running heavy business applications
2: System Clocks may get 'out of tune'. That is not only PC's actually. Even on servers that can occur. There is also timezone issues. 'light saving hours'.
With a saved checksum we could do like:
1) When data on the two hosts differ we generate checksum on both hosts and compare with 'last checksum'. The host where cureent checksum and last checksum differ is where data have changed! So for that individual row (only) that will define the host where change has taken place asBut true: it is still not possible to handle situations where THE SAME ROW has been updated on both servers. Also there will need to be a 'rule' whether DELETES has priority to UPDATES, or opposite. Or if such rows should be left untouched and logged instead.
And anything like this will of course need addition of data.
-
October 4, 2006 at 7:43 am #22685TomBezMemberpeterlaursen wrote on Oct 4 2006, 08:28 AM:With a saved checksum we could do like:
1) When data on the two hosts differ we generate checksum on both hosts and compare with 'last checksum'. The host where cureent checksum and last checksum differ is where data have changed! So for that individual row (only) that will define the host where change has taken place as
Hm, you got than a big problem, what are you doing when on both (more than 1) servers you have a difference between last checksum and current checksum? define one as master and override it on the other? let user interact? mark them as replication conflicts like in lotus notes?
you will also have to insert additional fields or tables on the databases. what won't work in all cases. i have seen client-applications that couldn't handle additional fields, so for them it wouldnt be an option to have an additional field (either timestamp or checksum doesnt matter). so maybe something more generic with sqlyog specific tables to store checksums and primary-keys with automatically generated triggers on the tables to be synchronized, so that checksum is build automatically (and you will also have information that something is deleted).
-
October 4, 2006 at 7:54 am #22686peterlaursenParticipantQuote:you got than a big problem, what are you doing when on both (more than 1) servers you have a difference between last checksum and current checksum?
true! that was why I wrote this “it is still not possible to handle situations where THE SAME ROW has been updated on both servers”. 'rules', 'no touch and log' are both possible. I do not like user-interaction as I do not think it makes sense with SJA (and scheduling).
Quote:i have seen client-applications that couldn't handle additional fieldsTrue. I use one of those myslef: ThumbsPlus http://www.cerious.com
Quote:so maybe something more generic with sqlyog specific tablesIt will then have to be a SQLyog-specific DATABASE. ThumbsPlus would not accept an 'alien' TABLE in its database either
Quote:automatically generated triggers on the tablesActually Acces Replication Manager uses triggers. I would like to avoid them. Because it should preferably work with all MySQL versions and not only 5.x
-
October 4, 2006 at 9:53 am #22687TomBezMemberpeterlaursen wrote on Oct 4 2006, 09:54 AM:true! that was why I wrote this “it is still not possible to handle situations where THE SAME ROW has been updated on both servers”. 'rules', 'no touch and log' are both possible. I do not like user-interaction as I do not think it makes sense with SJA (and scheduling).
Nor do I, 90 % of all computer problems are sitting in front of it 😉
peterlaursen wrote on Oct 4 2006, 09:54 AM:It will then have to be a SQLyog-specific DATABASE. ThumbsPlus would not accept an 'alien' TABLE in its database eitherActually Acces Replication Manager uses triggers. I would like to avoid them. Because it should preferably work with all MySQL versions and not only 5.x
In prinicple I agree with you. I just see a lot of problems and only very complicated solution possibilities if not using triggers. for example imagine a rather large database with rows about 10M, I think generating the checksums and maybe running the replication via a different machine would cause enormous traffic and the performance of the machines would be also something to concern about. what happens if one replication job isn't finished? having a kind of joblist would than be necessary. how would you deal if running more than one replication job and if 2 or more of them share the need for a table? what to do with locked tables/rows when using INNODB, wait until it's unlocked or continue, if waiting how about a timeout in case some other application hangs and constantly locks a table? what's with application that lock tables all the time to implement their own transaction mechanism?
maybe a possibility would be a version with triggers which only works with mysql 5.x, which is probably a more easy solution than any other system, and than focusing on the previous mysql versions to get a similar replication mechanism.
-
October 4, 2006 at 10:14 am #22688peterlaursenParticipantQuote:I think generating the checksums …
we do generate checksums allrady. We just do not save them. I think this save can be implemented on server side alone.
Quote:what happens if one replication job isn't finished?it puts a little more load on the server and the connection. that is the only thing that happens. That can happen with current version too. There could be an issue too with conflict when accessing the jobfile of sessionfile. Could cause one of the instance to crash. And so solved 🙂
I do not see any NEW issues with locked tables or CONSTRAINTS.
We have to be realistic and realize that most Webyog Customers use 3.23 and 4.0 hosting on shared server at an ISP. And it will be no less than 5 years before cheap ISP's change over to 5.0, I think!
-
October 5, 2006 at 12:06 am #22689rotorboyMember
I appologize in advance for jumping in here only a couple hours after I purchased this software.
We are currently running MySQL 5.0.20 on a Master-Master setup of 2 machines. The system is used by a pair of webservers running a busy Ecommerce store where the load is split between the 2 machines. Most of the load is visitors viewing pages of products with only a few actual orders per day. So far this has been running smoothly for months at a time.
The databases are setup with a trigger to update a timestamp on each row whenever an update is done. What I need is the ability to use our timestamp field to bring the 2 systems back into sync so I can restore MySQL replication.
Example:
IF timestamp on Alpha is newer than timestamp on Beta, Select ROW from Alpha and update ROW to Beta.
ELSEIF timestamp on Beta is newer than timestamp on Alpha, Select ROW from Beta and update ROW to Alpha.
ELSEIF ROW doesn't exist on Beta INSERT to BETA
ELSEIF ROW doesn't exist on Alpha INSERT to ALPHA
Anyway, you all get what I mean I'm sure.
I've read this thread. It seems that WebYog doesn't yet provide this function for an assortment of reasons, yet there seems to be a number of people looking for this option. Even if it's NOT the ideal setup, it's pretty damn close, in my experience, close enough to handle 99.9% of anything I'm faced with.
So…. what would it take to convince the developers of what seems to be a beautiful software package to put in some ability to handle this? Use BIG disclaimers or say it only works on 5.x or whatever…
-
October 5, 2006 at 12:22 am #22690peterlaursenParticipant
one option does not exclude the other!
But I would be very carefull with timestamps when a ordinary PC is involved.
If system clock gets 'out of tune' it can be disastrous.
That typically happens after 3-5 years when the motherboard battery get 'flat' – and we are talking about errors in the range of YEARS.
Also some software may reset the system clock to BIOS initial time by error (I have an example!)
BTW: you do not need a trigger to update a timestamp on update.
Just define as 'TIMESTAMP … ON UPDATE CURRENT TIMESTAMP'. that is possible from MySQL 4.1 and with versions before that the first timestamp in a table behaves this way. But triggers and a special table will be necessary to keep track of DELETES as a TIMESTAMP or a CHECKSUM saved in a row gets deleted with the row itself!
Keep discussing … we want to hear the opinion of everyone having one 😀
SQLyog sync as a tool to 'save' a broken MySQL replication is a nice idea indeed!
-
October 5, 2006 at 3:15 am #22691rotorboyMember
It's our own responsibility to ensure our servers have properly synchronised time. It's not a huge deal to have a check in place to ensure any replication machines are using something close to the same time. It's just a tool that would be incredibly useful to those of us whom are in shops that can't spend tens (hundreds) of thousands on big expensive commercial grade solutions. At the very least, we can run a MySQL (5+) Master-Master setup and have the tools to deal with the odd breakages if/when they occur.
Timestamps — yeah that's right I forgot about that… I don't actually do the implementation of the databases and tables, I'm just the guy that administers the servers and fixes the problems.
-
October 5, 2006 at 5:28 am #22692peterlaursenParticipantCode:It's our own responsibility to ensure our servers have properly synchronised time.
The reality is if people suffer dataloss, webyog will suffer financially.
We have to be realistic about the world we are living in that 75% of users do accept 'their own responsibility' here!
In LEGAL terms you are right of course. We need to think in MARKETING terms too!
But you don't have to, beause you do not risk anything!
WE CANNOT AFFORD TO BE THAT NAIVE ABOUT OUR MARKET POSITION!
-
October 5, 2006 at 8:57 am #22693TomBezMemberpeterlaursen wrote on Oct 5 2006, 07:28 AM:WE CANNOT AFFORD TO BE THAT NAIVE ABOUT OUR MARKET POSITION!
this is starting to get more like a strategic discussion than a technical 🙂
doesn't it depend on where SQLyog should go in the future, more than it depends on
where SQLyog is now? I think it should be made clear if SQLyog is be more an administration
tool, developement tool or data user tool.
Replication and stuff like that is more in the focus of a databaseadministrator, but if you would
like to see SQLyog more as a tool for data users than the focus must be stronger to develop
simple input possibilities.
-
October 5, 2006 at 10:35 am #22694peterlaursenParticipant
Sorry for SHOUTING! 😎
I do not like something that works as expected 99.9% of the times. For instance when a cell has been updated twice within a second, or some ISP admin forgot to configure 'light saving hours'.
I think my proposal will work 100% as expected! But both options could be there!
Keep on discussing!
-
October 5, 2006 at 3:47 pm #22695rotorboyMemberpeterlaursen wrote on Oct 5 2006, 10:35 AM:WE CANNOT AFFORD TO BE THAT NAIVE ABOUT OUR MARKET POSITION!
That's incredibly true. The reason I purchased this software yesterday was the “replication” feature alone. Now it's ok for some basic tasks, but what I really need is what I've outlined.
I understand the concern about “sue-age” when some numbskull screws things up and wants to pin the blame on someone else, sure we see it every day. This threat can be minimized with a strong EULA and a big fat DISCLAIMER for any feature you feel has an inherent risk.
For example. If I have 2 databases and I click the 2-way replication feature, not realizing that the source will overwrite the target rows if they should be different – that could be rather nasty. Where I'm sitting, that's just as dangerous, perhaps even more dangerous, than the odd chance of a timestamp being far enough out to screw up my data.
So. Do you offer a full solution and a EULA and DISCLAIMER that covers your butt, or do you offer a product that only handles a portion of what the users need? From what I can see so far, this is a nice package. More replication options would only make it even more useful. Hey, you can even go as far as to have 3 levels of protection. You must accept the terms on download. You accept them on install, and you accept them before you use “dangerous” features.
-
October 30, 2006 at 12:49 pm #22696sdozonoMember
I just found this: FYI.
http://opensource.replicator.daffodilsw.com/
Open-source project. I haven't tested yet. Some techniques are interesting.
And there is a explanation.
http://opensource.replicator.daffodilsw.co…_Resolution.htm
http://opensource.replicator.daffodilsw.co…date-Update.htm
http://opensource.replicator.daffodilsw.co…cator-help.html
-
-
AuthorPosts
- You must be logged in to reply to this topic.