Forums › SQLyog › Sync tools, Migration, Scheduled Backup and Notifications › Migrating 1 Billion Rows Takes A Month
March 31, 2011 at 9:36 pm #12294 | Ptolemy (Member)
We need to migrate a table with 1 BILLION rows from MS SQL Server to MySQL.
We started the migration process; so far it has run for 2 hours and, according to the process logs, has transferred about 3 million records. At that rate, 1 billion rows would take us about 666 hours – roughly 27 days – which seems “a bit excessive”.
How can the migration speed be optimized? How can we diagnose what is slowing the process down? (None of the involved hosts shows much CPU load.)
Would it help to convert the target table from InnoDB to MyISAM? Would it help to remove all indexes?
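For illustration, this is roughly the preparation we have in mind (table and index names here are made up):

-- Hypothetical sketch: prepare the target table before the bulk load.
-- Dropping secondary indexes means they are not maintained on every insert.
ALTER TABLE big_table DROP INDEX idx_created_at;

-- Optionally switch to MyISAM for the duration of the load (no transaction
-- overhead), then convert back to InnoDB once the data is in.
ALTER TABLE big_table ENGINE=MyISAM;

-- ... run the import here ...

ALTER TABLE big_table ENGINE=InnoDB;
ALTER TABLE big_table ADD INDEX idx_created_at (created_at);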
Should we use dump-to-file and load-from-file utilities?
Any help is highly appreciated.
Thanks,
Philopator.
April 1, 2011 at 8:05 am #32101 | peterlaursen (Participant)
I fully understand that this is not acceptable, but I am afraid that the main slowdown is in the ODBC driver interface to the SQL Server. You can check this by seeing whether an import to Access using the same driver is faster. If you are connecting over the Internet, the connection speed could also be an issue.
You could start by checking CPU and memory of the SQL Server machine/process, the MySQL machine/process, and the SQLyog machine/process. Does anything peak anywhere, or is there any indication that memory swaps? Also monitor network utilization.
I do not know if SQL Server has some configuration option or if it can produce a SQL dump that can be read by MySQL. I will ask. But exporting as CSV and importing into MySQL using LOAD DATA could also be an option. I do not think we have any benchmarks on this, but it could be specific data types causing the problem, for instance very big column types. It is likely not only the number of rows, but also the average size of the rows that matters. So to comment further we will at least need the structure – or better, a subset of the table.
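As a rough sketch of the CSV route (file path and table layout are only for illustration; the export itself would be done with a SQL Server tool such as bcp or the SSMS export wizard):

-- On the MySQL side, a CSV file exported from SQL Server could be loaded like this:
LOAD DATA LOCAL INFILE '/tmp/big_table.csv'
INTO TABLE big_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';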
April 1, 2011 at 5:27 pm #32102 | Ptolemy (Member)
Thanks a lot for the reply.
We will check resource utilization on all hosts.
Also, is there any way to control the batch and transaction commit size (e.g. commit every 5,000 rows)?
Maybe that could also help with the speed.
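To illustrate what we mean (just a hypothetical pattern, not necessarily something the tool supports):

-- Hypothetical chunked-commit pattern: commit every ~5,000 rows instead of
-- holding the whole import in one huge transaction.
SET autocommit = 0;
INSERT INTO big_table (id, payload) VALUES (1, '...'), (2, '...'); -- ... ~5,000 rows per statement
COMMIT;
-- repeat the INSERT/COMMIT pair for each batch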
Thanks again,
Philopator.
April 1, 2011 at 8:19 pm #32103 | peterlaursen (Participant)
I am not able to answer this question (transaction or chunk size) myself.
It could very well be that this is not optimal. After all, the Import Tool is *a generic ODBC import* (for data sources both with and without transaction support) and is not optimized for any specific source (SQL Server or otherwise).
As soon as possible after the weekend I will ask the SQLyog developer team to comment on that.
May 26, 2011 at 12:13 pm #32104 | Chirantan Thakkar (Member)
Hi,
We use bulk INSERT statements to insert data into the target MySQL table. The size of each bulk INSERT statement will be the minimum of max_allowed_packet and 16 MB, so you can have a maximum of 16 MB of data inserted/updated at once. The 16 MB limit is not configurable.
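If max_allowed_packet on the target server is below 16 MB, raising it lets the import use the full statement size, for example:

-- Check the current limit on the target MySQL server.
SHOW VARIABLES LIKE 'max_allowed_packet';
-- Raise it to 16 MB for new connections (also set it in my.cnf/my.ini so it
-- survives a server restart).
SET GLOBAL max_allowed_packet = 16 * 1024 * 1024;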
Also, we turn off AUTOCOMMIT when the import starts and commit the changes at the end. This affects tables with engines that support transactions (e.g. InnoDB). But with non-transactional engines like MyISAM, the inserts/updates will be reflected immediately in the target table.
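In SQL terms, the import effectively does something like the following (table name and values are only illustrative):

SET autocommit = 0;
INSERT INTO big_table (id, payload) VALUES
  (1, 'row 1 data'),
  (2, 'row 2 data'),
  (3, 'row 3 data');   -- ... up to ~16 MB of rows per statement
-- further bulk INSERT statements follow ...
COMMIT;                -- a single commit at the end of the import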
And of course, it is an ODBC import, so factors like physical separation of the hosts, connection type, network speed, etc. will have a significant effect on the time required to complete the import.
June 10, 2011 at 6:37 pm #32105 | peterlaursen (Participant)
I am sorry, but we did not find a way to improve the performance against the server using ODBC. There may be other ways (using JET directly, for instance), but SQLyog is a MySQL client and we do not plan to add code specific to non-MySQL databases.