
SQL Injection: How To Prevent Security Flaws In PHP / MySQL

What is SQL Injection

Most new web developers have heard of SQL injection attacks, but not many know that it is fairly easy to prevent an attacker from gaining access to your data by filtering user input with the MySQL extension functions found in PHP. An SQL injection attack occurs when a hacker or cracker (a malicious hacker) attempts to dump the data in a database table in a database-driven web site. In an unprotected and vulnerable site, this is pretty easy to do.

SQL injection is a common vulnerability that is the result of lax input validation. Unlike cross-site scripting vulnerabilities, which are ultimately directed at your site's visitors, SQL injection is an attack on the site itself, in particular its database.

The goal of SQL injection is to insert arbitrary data, most often a database query, into a string that is eventually executed by the database. The insidious query may attempt any number of actions, from retrieving alternate data to modifying or removing information from the database.

How an SQL injection attack works

For an SQL injection attack to work, the site must use an unprotected SQL query that uses data submitted by a user to look up something in a database table. The data could come from a search box, a login form, or any other query that looks up data using user input. It also means that querystring data used to query a database can create vulnerabilities.

A very simple unprotected query might look like this:

SELECT * FROM items WHERE itemID = '$itemID'

Normally, you would expect the submitted value, such as an item ID or a username and password, to be used to look up a matching row in the database table. But what if someone entered the following instead?

' OR '1' = '1

That would make the query look like this:

SELECT * FROM items WHERE itemID = '' OR '1' = '1'

The WHERE clause is now always true, so the query could literally return the entire table. This is a pretty scary thought if you are trying to keep your data secure. The problem with SQL injection is that a hacker does not have to know anything about your database or table structure.

What if an error or some other issue caused your table structure to be exposed? Hackers are very good at forcing errors that expose information that lets them penetrate a site more deeply. What if the following was entered in the password field?

'; drop table users;
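To see why the concatenated query is dangerous, the sketch below reproduces the ' OR '1' = '1 input against a throwaway table. It is an illustration under stated assumptions: PHP with the PDO SQLite driver is assumed to be available (an in-memory SQLite database stands in for MySQL), and the table contents and the findItems helper are made up for the example.

```php
<?php
// Hypothetical demo data; in-memory SQLite stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE items (itemID TEXT, name TEXT)");
$db->exec("INSERT INTO items VALUES ('1', 'hammer'), ('2', 'saw'), ('3', 'drill')");

// Deliberately vulnerable: user input is pasted straight into the SQL text.
function findItems(PDO $db, $itemID)
{
    $sql = "SELECT * FROM items WHERE itemID = '$itemID'";
    return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

$normal   = findItems($db, "2");              // matches a single row
$injected = findItems($db, "' OR '1' = '1");  // WHERE clause is always true

echo count($normal), " row(s) for a normal ID\n";
echo count($injected), " row(s) for the injected input\n";
```

With three rows in the table, the normal lookup returns one row and the injected input returns all three.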

How to protect your database from SQL injection attacks

There is a method for filtering the data that is used on the right side of the WHERE clause to look up a row in a database. The trick is to escape any characters in the user-supplied portion of the query that could lead to a successful attack.

Use the following function to add backslashes to suspect characters and filter any data that is input by a user.

// Note: the mysql_* functions were removed in PHP 7.0 and
// get_magic_quotes_gpc() in PHP 8.0, so this applies to older PHP versions.
function cleanQuery($string)
{
    // Undo magic-quotes escaping first; this prevents duplicate backslashes
    if (get_magic_quotes_gpc()) {
        $string = stripslashes($string);
    }
    // mysql_real_escape_string() exists only in PHP 4.3.0 and newer
    if (version_compare(phpversion(), '4.3.0', '>=')) {
        $string = mysql_real_escape_string($string);
    } else {
        $string = mysql_escape_string($string);
    }
    return $string;
}

// if you are using form data, use the function like this:
if (isset($_POST['itemID'])) {
    $itemID = cleanQuery($_POST['itemID']);
}

// you can also filter the data as part of your query:
$query = "SELECT * FROM items WHERE itemID = '" . cleanQuery($itemID) . "'";

The first part checks whether magic quotes is turned on. If so, backslash escapes may already have been added to data passed through a POST or GET request. If backslashes were added, they need to be removed before the string is run through the rest of the function.

The next part checks the PHP version. The built-in function that we want to use is called mysql_real_escape_string. This function only exists in PHP version 4.3.0 or newer. If you are using an older version of PHP, another function called mysql_escape_string is used instead.

mysql_escape_string is not as effective as the newer mysql_real_escape_string. The newer version escapes the string according to the current character set. The character set is ignored by mysql_escape_string, which can leave some vulnerabilities open for sophisticated hackers. If you find that you are using an older version of PHP and you are trying to protect sensitive data, you really should upgrade to a current version of PHP.

So what does mysql_real_escape_string do?

This PHP library function prepends backslashes to the following characters: \n, \r, \, \x00, \x1a, ' and ". The important part is that the single and double quotes are escaped, because these are the characters most likely to open up vulnerabilities.

For those who do not know what an escape is, it is a character that is prepended to another character. When a character is escaped, the database treats it as literal data rather than as part of the SQL syntax. In other words, it makes that character ineffective in a query. In the case of PHP, an escaped character is treated differently by the PHP parser. The standard escape character used by PHP and MySQL is the backslash.

In the case of the SQL query example used above, after running the input through the routine, the query now looks like this, which defeats the injection:

SELECT * FROM items WHERE itemID = '\' OR \'1\' = \'1'
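Escaping is one defence. A commonly recommended alternative, which is not part of the cleanQuery approach described here, is a prepared statement: the user value is sent as a bound parameter and never becomes part of the SQL text. The sketch below uses PDO for this. It assumes the PDO SQLite driver is available (an in-memory database stands in for MySQL), and the table contents and the findItemSafely helper are hypothetical.

```php
<?php
// Hypothetical demo data; in-memory SQLite stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE items (itemID TEXT, name TEXT)");
$db->exec("INSERT INTO items VALUES ('1', 'hammer'), ('2', 'saw')");

// The ? placeholder is bound by the driver, so quotes inside $itemID
// are treated as data rather than as SQL syntax.
function findItemSafely(PDO $db, $itemID)
{
    $stmt = $db->prepare("SELECT * FROM items WHERE itemID = ?");
    $stmt->execute(array($itemID));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$normal   = findItemSafely($db, "2");
$injected = findItemSafely($db, "' OR '1' = '1");

echo count($normal), " row(s) for a normal ID\n";
echo count($injected), " row(s) for the injected input\n";
```

Here the injected string is simply compared against itemID as a literal value, so it matches nothing. Against a real MySQL server the same calls would be made on a PDO object created with a mysql: DSN.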

Escaping input this way should stop the bulk of SQL injection attacks, but crackers and hackers are very creative and are always finding new methods to break into systems. There are additional steps that can be taken to filter out certain words, such as drop, grant, union, etc., but doing so will also strip these words from searches performed by your users. However, if you want to add another level of security and do not have an issue with certain words being deleted from queries, you can add the following to cleanQuery just before the PHP version check.

$badWords = array("/delete/i", "/update/i", "/union/i", "/insert/i", "/drop/i", "/http/i", "/--/i");
$string = preg_replace($badWords, "", $string);
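The effect of this filter on ordinary input can be checked with a few sample strings. The search phrases below are made up for illustration; the word list is the one shown above.

```php
<?php
// Same word list as above; every match is replaced with an empty string.
$badWords = array("/delete/i", "/update/i", "/union/i", "/insert/i",
                  "/drop/i", "/http/i", "/--/i");

$searches = array("drop cloth", "plumbing union", "garden hose");
$results  = array();
foreach ($searches as $search) {
    $results[$search] = preg_replace($badWords, "", $search);
    // Brackets make leading and trailing spaces visible in the output
    echo "[$search] => [{$results[$search]}]\n";
}
```

The first two searches lose the words "drop" and "union" (leaving " cloth" and "plumbing "), while the third passes through unchanged.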

This additional step should prevent a malicious attacker from damaging a database if they found a way to slip through. Just remember that if you take this additional step and you have a site where someone might search for a "plumbing union" or a "drop cloth", those queries will not work as intended. If you are wondering what the trailing 'i' following each word in the array is for, it makes the preg_replace replacements case insensitive. This wasn't needed with eregi_replace, but that function has been deprecated in PHP 5.3.

Another important step that needs to be taken with any database is controlling user privileges. When setting up a MySQL user, you should never assign any more privileges than the user actually needs to accomplish the tasks that you allow on your site. Privileges are easily assigned and managed through phpMyAdmin, which is found in the control panel (cPanel, Plesk, etc.) of most hosting companies.

Useful Links

http://en.wikipedia.org/wiki/SQL_injection
http://www.learnphponline.com/securi…tion-mysql-php
http://dev.mysql.com/tech-resources/…curity-ch3.pdf
http://www.tizag.com/mysqlTutorial/m…-injection.php

Online Schema Change for mySQL


It is great to be able to build small utilities on top of an excellent RDBMS.


Thank you MySQL. This is a small but complex utility to perform online schema change for MySQL. We call it OSC and the source is here (Code is also available at bottom).

Some ALTER TABLE statements take too long from the perspective of some MySQL users. The fast index create feature for the InnoDB plugin in MySQL 5.1 makes this less of an issue, but an ALTER can still take minutes to hours for a large table, and for some MySQL deployments that is too long.

A workaround is to perform the change on a slave first and then promote the slave to be the new master. But this requires a slave located near the master. MySQL 5.0 added support for triggers, and some replication systems have been built using triggers to capture row changes. Why not use triggers for this? The openarkkit toolkit did just that with oak-online-alter-table. We have published our version of an online schema change utility (OnlineSchemaChange.php, aka OSC).

The remainder of this document is copied from the internal documents that were written for this project. Note that this project was done by Vamsi and he did an amazing job with it. In addition to writing the tool, writing the docs and doing a lot of testing, he also found and fixed or avoided a few bugs in MySQL to make sure OSC would be reliable.

Overview

If the row format of a database allows a new column to be added (possibly positioned at the end of the existing columns, with some default value) without modifying every row of the table, adding a column can be just a metadata change, which can be done very fast. In such databases, an exclusive lock is needed only for the very short time the metadata change takes. Our understanding is that the InnoDB row format does not allow this, and changing the row format would be a significant project. Hence we do not consider this approach. Also, note that this approach would not work for operations like defragmentation.

OSC algorithms typically have several phases:

  • copy – where they make a copy of the table
  • build – where they work on the copy until the copy is ready with the new schema
  • replay – where they propagate the changes that happened on the original table to the copy table. This assumes that there is a mechanism for capturing changes.
  • cut-over – where they switch the tables, i.e. rename the copy table as the original. There is typically a small amount of downtime while switching the tables. A small amount of replay is also typically needed during the cut-over.

Note that the above operations can be done within the storage engine itself, or using an external (PHP) script. We followed the latter approach as it can be implemented much faster. An advantage of doing this within the storage engine is that some optimizations are possible that are not available when doing these operations externally.

Copy Phase

When the binlog is enabled, InnoDB gets read locks on table S during a statement such as "insert into T select * from S". To avoid this and to reduce the load on MySQL, we select the data into an outfile and load from the outfile.

Rather than generating one big outfile, we do multiple scans of the table, where each scan covers about 500000 rows (batchsize is a parameter to OnlineSchemaChange, and its default value is 500000). The first scan starts from the beginning of the table using LIMIT 500000. Subsequent scans start at the position where the previous scan left off. For example, for a 2-column PK, if the 1st select reads up to [x, y], the where clause of the 2nd select will look like ((col1 = x and col2 > y) OR (col1 > x)). We patched InnoDB to not get the read locks and expect to get rid of these outfiles in the future. However, we will continue to do multiple scans of the table, with each scan going after different rows than previous scans.

For efficiency, in innodb_plugin 5.1 we drop all non-clustered (NC) indexes before loading data into the copytable, and recreate them after the load. As a future optimization, there may be some cases where it is useful to drop and recreate the clustered (C) index as well. We do not drop NC indexes in which the 1st column is an AUTO_INCREMENT column. Also, in InnoDB 5.0 we do not drop non-clustered indexes, as recreating them is very slow.

Capturing changes

Some of the approaches for capturing changes for replay are as follows:

  • Use statement level binary log. Unfortunately, this approach is problematic when statements that refer to other tables are replayed. If the other tables are read during the replay phase, they may return different data than what they returned during the original execution of the statement. If those statements update the other tables, those updates need to be skipped during replay.
  • Use row level binary log. This approach would work assuming we filter out updates to other tables during replay. However, many MySQL deployments don't use row based replication (RBR) yet. Also, we need the original SQL in the binlog even when RBR is used, and that feature has yet to appear in an official release.
  • Use triggers to capture changes. This approach has extra overhead, as changes to the table are recorded in a change capture table. Also, we need to get a short duration exclusive lock for creating triggers, as MySQL does not allow creating triggers while holding a READ lock. If we don't get any lock while creating triggers, we risk losing changes done by transactions that are active at the time of selecting data into the outfile, if those changes were done prior to creating the triggers. The trigger approach has the advantage of less effort and less risk of breaking stuff, so we decided to use it for OSC.

The change capture table is referred to as the deltas table. It has all the columns of the original table plus two additional columns: an integer auto-increment column to track the order of changes and an integer column to track the DML type (insert, update or delete).

  • An insert trigger is created on the original table to capture all column values of the row being inserted in deltas.
  • A delete trigger is created on the original table to capture only the PK columns of the row being deleted in deltas.
  • An update trigger is created on the original table so that if the update does not change the PK columns, it captures the new values of all columns in deltas. If the update changes the PK columns, the update trigger captures it as a delete followed by an insert. A possible optimization in the future is to log only changed columns.
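The three triggers above can be tried out on a toy table. The sketch below is only an illustration of the idea, not OSC's own code: it assumes the PDO SQLite driver (in-memory SQLite stands in for MySQL), a made-up table t(id, val), and assumed numeric DML type codes. SQLite triggers have no IF statement, so the update case is split into two triggers with WHEN clauses.

```php
<?php
// Assumed numeric codes for the DML type column (hypothetical values).
const DML_INSERT = 1;
const DML_DELETE = 2;
const DML_UPDATE = 3;

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Original table, plus a deltas table with the same columns, an
// auto-increment ID to order the changes, and a DML type column.
$db->exec("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)");
$db->exec("CREATE TABLE deltas (
    delta_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dml_type INTEGER, id INTEGER, val TEXT)");

// Insert trigger: capture all column values of the new row.
$db->exec(sprintf("CREATE TRIGGER t_ins AFTER INSERT ON t FOR EACH ROW BEGIN
    INSERT INTO deltas (dml_type, id, val) VALUES (%d, NEW.id, NEW.val);
END", DML_INSERT));

// Delete trigger: capture only the PK of the deleted row.
$db->exec(sprintf("CREATE TRIGGER t_del AFTER DELETE ON t FOR EACH ROW BEGIN
    INSERT INTO deltas (dml_type, id) VALUES (%d, OLD.id);
END", DML_DELETE));

// Update that keeps the PK: capture the new values of all columns.
$db->exec(sprintf("CREATE TRIGGER t_upd AFTER UPDATE ON t FOR EACH ROW
WHEN NEW.id = OLD.id BEGIN
    INSERT INTO deltas (dml_type, id, val) VALUES (%d, NEW.id, NEW.val);
END", DML_UPDATE));

// Update that changes the PK: capture it as a delete followed by an insert.
$db->exec(sprintf("CREATE TRIGGER t_upd_pk AFTER UPDATE ON t FOR EACH ROW
WHEN NEW.id <> OLD.id BEGIN
    INSERT INTO deltas (dml_type, id) VALUES (%d, OLD.id);
    INSERT INTO deltas (dml_type, id, val) VALUES (%d, NEW.id, NEW.val);
END", DML_DELETE, DML_INSERT));

$db->exec("INSERT INTO t VALUES (1, 'a')");        // insert
$db->exec("UPDATE t SET val = 'b' WHERE id = 1");  // update, PK unchanged
$db->exec("UPDATE t SET id = 2 WHERE id = 1");     // update, PK changed
$db->exec("DELETE FROM t WHERE id = 2");           // delete

// Read the captured DML types back in the order they were recorded.
$types = array_map('intval', $db->query(
    "SELECT dml_type FROM deltas ORDER BY delta_id")->fetchAll(PDO::FETCH_COLUMN));
echo implode(',', $types), "\n";
```

The four statements produce five deltas rows, in order: insert, update, delete, insert, delete. The PK-changing update accounts for the delete and insert pair in the middle.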

Replay phase

It is useful to do most of the replay work without blocking changes to the original table. Multiple replay passes are used, and only the final replay is done under a WRITE lock on the table. Because there are multiple passes, we need to avoid replaying the same change multiple times. The following approaches are available to do this:

  • Delete the records from deltas as they are replayed. When a record is deleted, the entire record is put in the transaction log (possibly containing large columns), and this might be too much load.
  • Have a column 'done' in deltas and set it for the records as they are replayed. Updates generate less transaction log than deletes, but if the 'done' column is not indexed, we will be scanning deltas on each pass anyway.
  • Save the IDs of the replayed records in a temporary table so that OSC does not write to deltas.

We chose to save IDs in a temporary table.

Another consideration is how to propagate changes from the deltas table to the copytable. There are at least two approaches:

  • Select the columns from the deltas table into PHP code and pass them back to MySQL through update, insert or delete commands. This could move large column values back and forth between PHP and MySQL.
  • Only fetch the ID column in deltas to PHP code, and then construct the insert, update or delete statements such that column values are copied directly from deltas to the copytable.
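The second approach, where PHP sees only the ID and MySQL copies the column values itself, might build its statements along the following lines. This is a sketch rather than OSC's actual code: the buildReplaySql function, the delta_id column name and the numeric DML type codes are all assumptions made for the example.

```php
<?php
// Assumed DML type codes; all names in this sketch are hypothetical.
const DML_INSERT = 1;
const DML_DELETE = 2;
const DML_UPDATE = 3;

/**
 * Build one replay statement for a deltas row, given only its ID and DML type.
 * Column values are copied inside MySQL, from the deltas table to the copy table.
 */
function buildReplaySql($dmlType, $id, $copy, $deltas, array $columns, array $pkColumns)
{
    $id = (int) $id;

    // Join condition on the primary key: copy.pk1 = deltas.pk1 AND ...
    $pkMatch = array();
    foreach ($pkColumns as $col) {
        $pkMatch[] = "$copy.$col = $deltas.$col";
    }
    $pkMatch = implode(' AND ', $pkMatch);

    switch ($dmlType) {
        case DML_INSERT:
            $cols = implode(', ', $columns);
            return "INSERT INTO $copy ($cols) SELECT $cols FROM $deltas"
                 . " WHERE $deltas.delta_id = $id";
        case DML_DELETE:
            return "DELETE $copy FROM $copy, $deltas"
                 . " WHERE $deltas.delta_id = $id AND $pkMatch";
        case DML_UPDATE:
            $sets = array();
            foreach ($columns as $col) {
                $sets[] = "$copy.$col = $deltas.$col";
            }
            return "UPDATE $copy, $deltas SET " . implode(', ', $sets)
                 . " WHERE $deltas.delta_id = $id AND $pkMatch";
        default:
            throw new InvalidArgumentException("unknown DML type: $dmlType");
    }
}

$sql = buildReplaySql(DML_DELETE, 7, '__osc_new_t', '__osc_deltas_t',
                      array('id', 'val'), array('id'));
echo $sql, "\n";
```

Only the integer ID travels between PHP and the server, so large column values never leave MySQL.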

We use the second approach and only fetch the ID column.

Changes are replayed at three points: after all outfiles are loaded, after indexes are recreated, and during the swap phase.

Cut-over phase

MySQL offers two ways of renaming a table foo to bar.

  • Rename table 'foo' to 'bar'. Multiple tables can be renamed atomically using the rename command, which makes it attractive for swapping two tables. Unfortunately, this command cannot be executed while holding table locks, or inside a larger transaction (i.e. the rename has to be a transaction all by itself). So we are unable to use this.
  • Alter table 'foo' rename 'bar'. Alter table causes an implicit commit, but it can be the last statement in a multi-statement transaction, and it can be executed while holding table locks. So we use this, but two tables cannot be swapped atomically using the alter table command. We need to use two separate statements.

Our cut-over phase looks like this:

  • lock tables (original table, new table, change capture table) in exclusive mode
  • replay any additional changes that happened after the last replay
  • alter the original table by renaming it as some old table
  • alter the copytable by renaming it as the original table
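Written out as SQL, the cut-over steps might look like the statements generated below. This is a hedged sketch: the buildCutoverSql function and the table names passed to it are hypothetical, and the replay step is left to the caller, which would run it between the lock and the first rename.

```php
<?php
/**
 * Return the SQL statements for the cut-over, in execution order.
 * The caller replays any remaining changes after 'lock' and before 'rename_old'.
 */
function buildCutoverSql($original, $copy, $deltas, $backup)
{
    return array(
        // WRITE locks are exclusive table locks in MySQL
        'lock'       => "LOCK TABLES $original WRITE, $copy WRITE, $deltas WRITE",
        // Two separate ALTERs: the swap is not atomic
        'rename_old' => "ALTER TABLE $original RENAME $backup",
        'rename_new' => "ALTER TABLE $copy RENAME $original",
    );
}

$steps = buildCutoverSql('t', '__osc_new_t', '__osc_deltas_t', '__osc_old_t');
foreach ($steps as $name => $sql) {
    echo "$name: $sql\n";
}
```

Keeping the two renames as separate entries makes the window between them explicit, which is where a concurrent transaction can briefly fail to find the original table.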

Since alter table causes an implicit commit in InnoDB, InnoDB locks get released after the first alter table. So any transaction that sneaks in after the first alter table and before the second alter table gets a 'table not found' error. The second alter table is expected to be very fast, though, because the copytable is not visible to other transactions and so there is no need to wait.

Error handling

There are two basic cases of errors:

  • An SQL command issued by OSC fails due to some error, but the MySQL server is still up
  • The MySQL server crashes during OSC

Here are the various entities created by OSC:

  • triggers
  • new non-temporary tables (copy table, deltas table, backup table to which the original table is renamed)
  • temp tables
  • outfiles

As we create an entity, we use a variable to track its cleanup. For example, when we create deltas, we set a variable $this->cleanup_deltastable to indicate that deltas needs to be cleaned up. This is not necessary for temp tables, as they are automatically dropped when the script ends. A cleanup() method does the cleanup based on these cleanup variables. The cleanup() method is used on both successful and failed termination of the script.

However, if the MySQL server crashes, the cleanup steps would also fail. The plan to handle MySQL failures is to have a 'force_cleanup' mode for the OSC script, which would clean up all the triggers, non-temporary tables, and outfiles that would have been created by OSC. One caution when using the force_cleanup mode: if the names of the triggers/outfiles/tables that OSC would have created coincide with an existing trigger/outfile/table that has nothing to do with OSC, that entity may get dropped. The chances of such a coincidence are very slim, though, as we use prefixes like __osc_ for entities that OSC creates. This issue does not arise during regular (i.e. non-forced) cleanup, because cleanup is done based on cleanup variables in that case.

Note that normally failures during OSC don't have to be acted on urgently, as the existence of stray tables/outfiles/triggers is not a critical problem. However, an exception is a failure that happens after the original table is renamed to a backup table but before the copy table is renamed as the original table. In that case there are two tables, the backup table and the copytable, with identical contents except for the schema change. Applications would get 'table not found' errors until the issue is fixed. During force_cleanup, if OSC detects that both the backup table and the copytable exist, it renames the backup table to the original table.

Replication

OSC is not really making any changes on its own, but only propagating the changes done by other transactions (which are replicated). So we set sql_log_bin = 0 for OSC. For schema changes like adding a column, this means that the schema change must be done on slaves first.

Assumptions that are validated in the code

  1. The original table must have a PK. Otherwise an error is returned.
  2. No foreign keys should exist. Otherwise an error is returned.
  3. No AFTER_{INSERT/UPDATE/DELETE} triggers may exist. Otherwise create trigger would fail and an error is returned.
  4. If the PK is being altered, the table after the alter should support efficient lookups on the old PK columns. Otherwise an error is returned. The reason for this assumption is that PHP code may have queries/inserts/updates/deletes based on the old PK columns and they need to be efficient. Another reason is that during replay, the 'where' clauses generated use the old PK columns, so the replay phase would otherwise be very slow.
  5. If two OSCs are executed on the same table concurrently, only the first one to create the copytable succeeds and the other one returns an error.
  6. OSC creates triggers/tables/outfiles with the prefix __osc_. If objects with those names coincidentally already exist, an error is returned, as object creation would fail.
  7. Since we only tested OSC on 5.1.47 and 5.0.84, it returns an error if the server is not one of those two versions.

Assumptions that are NOT validated in the code

  1. Schema changes are done on the slave before the master. (If the master has more columns than the slave, replication may break.)
  2. If OSC is done concurrently with alter table on the same table, a race condition could cause "lost schema changes". For example, if column foo is being added using OSC and column bar is being added using alter table directly, it is possible that one of the column additions is lost.
  3. Schema changes are backward compatible, such as the addition of a column. Column name changes or dropping a column would cause an error on the 1st load.
  4. When OSC is run with OSC_FLAGS_FORCE_CLEANUP, it drops triggers/tables/outfiles with the prefix __osc_. So if objects with those names coincidentally exist that have nothing to do with OSC, they would get dropped.

Steps in detail (listed in the order of execution)

  • Initialization
  • create_copy_table
  • alter_copy_table
  • create_deltas_table
  • create_triggers
  • start snapshot xact
  • select_table_into_outfile
  • drop NC indexes
  • load copy table
  • Replay Changes
  • recreate NC indexes
  • Replay Changes
  • Swap tables
  • Cleanup

They are described in more detail below.

Slight difference in the sequence of steps in 5.0 and 5.1

Note that, unfortunately, we need to use slightly different sequences for 5.0 and 5.1. This is needed to compensate for different behavior in those versions.

This order works in 5.1 but not in 5.0 (only the relevant part of the sequence is shown):

  1. Lock table in WRITE mode
  2. Create insert, update, delete triggers
  3. Unlock tables
  4. Start snapshot transaction
  5. Scan deltas table and track these deltas in 'changes to exclude from replay'
  6. Scan original table into multiple outfiles
  7. End snapshot xact
  8. Load data from all outfiles to copytable
  9. Replay changes that have not been excluded in step 5

Since the scan done in step 6 should already see the changes captured in step 5, we exclude them from replay.

The above order does not work for 5.0 because creating a trigger after locking the table hangs in 5.0. See bug 46780.

This order works in 5.0 but not in 5.1

Same as above except that steps 1 and 2 are reversed, i.e. the triggers are created before locking.

Note that the table lock is for ensuring that transactions that changed the table before the triggers were created are all committed. Any changes done after the snapshot transaction began in step 4 should be captured by the triggers. So even if we get the table lock after creating the triggers, the purpose of waiting for all prior transactions would still be achieved. So it should work in theory.

However, this sequence does not work in 5.1 in my automated unit tests, as it causes the scan in step 5 to exclude some changes from replay that are not captured by the scan in step 6. (For example, if a concurrent xact updates row R, the snapshot xact in step 5 sees the row in the deltas table inserted by the update, but step 6 sees the old image of the row instead of the new image.)

MySQL docs state:

For transactional tables, failure of a statement should cause rollback of all changes performed by the statement. Failure of a trigger causes the statement to fail, so trigger failure also causes rollback.

So that means the trigger is executed as part of the same transaction as the DML that activated the trigger, right? We don't know why the snapshot xact in OSC is seeing the effect of the trigger but not the effect of the original DML, and we are not sure if this is a bug.

Code Vocabulary/Glossary

  • $this->tablename is the name of the original table (i.e. the table being altered)
  • $this->dbname is the name of the database
  • $this->newtablename is the name of the copy table or new table
  • $this->deltastable is the name of the [deltas] table
  • $this->renametable is the name to which the original table is renamed before discarding
  • $this->columns, $this->pkcolumns, $this->nonpkcolumns are comma separated lists of all columns, just PK columns and just non-PK columns respectively of the original table
  • $this->newcolumns and $this->oldcolumns are comma separated lists of columns of the original table prefixed by 'NEW.' and 'OLD.' respectively. Similarly we also have $this->oldpkcolumns and $this->newpkcolumns.
  • IDCOLNAME and DMLCOLNAME are the names of the ID column and the DML TYPE column in the [deltas] table
  • TEMP_TABLE_IDS_TO_EXCLUDE refers to the temp table used for IDs to exclude. Its actual name is '__osc_temp_ids_to_exclude'.
  • TEMP_TABLE_IDS_TO_INCLUDE refers to the temp table used for IDs to include. Its actual name is '__osc_temp_ids_to_include'.
  • $this->insert_trigger, $this->delete_trigger, and $this->update_trigger refer to trigger names.
  • $this->get_pkmatch_condition($tableA, $tableB) generates a condition of the form tableA.pkcolumn1=tableB.pkcolumn1 AND tableA.pkcolumn2=tableB.pkcolumn2 … where pkcolumn1, pkcolumn2 etc. are PK columns of the original table. tableA and tableB would be table references in the FROM clause.
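As an illustration of the last entry, a helper producing that kind of condition could look like the sketch below. It is a standalone function written for this example, not the method from OSC: the name pkMatchCondition is hypothetical, and the PK column list is passed in explicitly instead of being read from the object.

```php
<?php
/**
 * Sketch of a helper that produces "A.pk1=B.pk1 AND A.pk2=B.pk2 ...".
 * $pkColumns is passed explicitly here (an assumption for this example).
 */
function pkMatchCondition($tableA, $tableB, array $pkColumns)
{
    $parts = array();
    foreach ($pkColumns as $col) {
        $parts[] = "$tableA.$col=$tableB.$col";
    }
    return implode(' AND ', $parts);
}

// Example with a made-up two-column primary key
$condition = pkMatchCondition('NEW', 'OLD', array('user_id', 'item_id'));
echo $condition, "\n";
// prints: NEW.user_id=OLD.user_id AND NEW.item_id=OLD.item_id
```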

Initialization

  • Here we turn off the bin log using 'SET sql_log_bin = 0'.
  • We do validations such as checking that there are no foreign keys, that a PK exists, and the InnoDB version.
  • We also retrieve all column information of the table being altered, so that we don't have to read from the information schema multiple times. (QUESTION: what happens if columns get changed by another alter table running in parallel? For now we assume that operations is aware of alter table commands being run and won't run two in parallel.)

This query retrieves column names.

$query = "select column_name, column_key from " .
         "information_schema.columns " .
         "where table_name ='%s' and table_schema='%s'";
$query = sprintf($query, $this->tablename,
                 $this->dbname);

If column_key is not 'PRI', we infer that the column is NOT part of the primary key.

// for PK columns we need them to be in correct order as well.
$query = "select * from information_schema.statistics " .
         "where table_name = '%s' and TABLE_SCHEMA = '%s' " .
         " and INDEX_NAME = 'PRIMARY' " .
         "order by INDEX_NAME, SEQ_IN_INDEX";
$query = sprintf($query, $this->tablename, $this->dbname);

create_copy_table

The copy table is named concatenate('__osc_new_', originaltablename), truncated to 64 characters (maxlen). It is created with 'create table <copytable> LIKE <originaltable>'.

alter_copy_table

The DDL command to alter the original table is given as input. We replace the original table name with the copy table name by doing:

$altercopy = preg_replace('/alter\s+table\s+/i',
                          'ALTER TABLE ', $this->altercmd,
                          -1, $count);
$altercopy = preg_replace('/ALTER\s+TABLE\s+' . $this->tablename . '/',
                          'ALTER TABLE ' . $this->newtablename,
                          $altercopy, -1, $count);

The command is then run to alter the copytable so that it looks the way we want the original table to look after the alter. If either of the preg_replace calls above produces fewer or more than one match, an exception is raised.

Now we also retrieve index info using the following query so that we can drop and recreate NC indexes. (QUESTION: what happens if a concurrent alter table adds or drops an index while this is running? For now we assume that operations is aware of alter table commands being run and won't run two in parallel.)

$query = "select * from information_schema.statistics " .
         "where table_name = '%s' and " .
         "TABLE_SCHEMA = '%s' " .
         "order by INDEX_NAME, SEQ_IN_INDEX";
$query = sprintf($query, $this->newtablename,
                 $this->dbname);

The following columns in the select list are used:

  • NON_UNIQUE column: gives info on whether the index is non-unique.
  • COLUMN_NAME gives the name of the column that is in the index.
  • SUB_PART column indicates if the index is on part of a column. (For example, if an index is created on a varchar(1000) column, InnoDB only creates the index on the first 767 chars. The SUB_PART column gives this value.)
  • INDEX_NAME gives the name of the index. If the name is 'PRIMARY', it is inferred to be the primary index.

We also check that the old PK (available in $this->pkcolumnarry) is a prefix of at least one index after the alter table. Note that if the old PK is (a, b) and after the alter table there is an index on (b, a), that is OK, as it supports efficient lookups if values of both a and b are provided. This check is done because replay would be very inefficient if lookups based on the old PK columns were inefficient after the alter table.

create_deltas_table

Creates the change capture table. It is named concatenate('__osc_deltas_', originaltablename), truncated to 64 characters (maxlen). It is created using:

$createtable = 'create table %s' . '(%s INT AUTO_INCREMENT, ' .
               '%s INT, primary key(%s)) ' .
               'as (select %s from %s LIMIT 0)';
$createtable = sprintf($createtable, $this->deltastable,
                       IDCOLNAME, DMLCOLNAME,
                       IDCOLNAME, $this->columns,
                       $this->tablename);

\r\ncreate_triggers\r\n\r\nAs mentioned before, in 5.1 we lock table and create triggers and then unlock table, but in 5.0, we create the triggers and then lock the table and unlock it.\r\n\r\nInsert trigger is created as:\r\n

$trigger = ‘create trigger %s AFTER INSERT ON %s’.\r\n\r\n’FOR EACH ROW ‘.\r\n\r\n’insert into %s(%s, %s) ‘. ‘values (%d, %s)';\r\n\r\n$trigger = sprintf($trigger, $this->insert_trigger,\r\n\r\n$this->tablename,\r\n\r\n$this->deltastable, DMLCOLNAME,\r\n\r\n$this->columns, DMLTYPE_INSERT,\r\n\r\n$this->newcolumns);

\r\nDelete trigger is created as\r\n

$trigger = ‘create trigger %s AFTER DELETE ON’.\r\n\r\n’%s FOR EACH ROW ‘.\r\n\r\n’insert into %s(%s, %s) ‘. ‘values (%d, %s)';\r\n\r\n$trigger = sprintf($trigger, $this->delete_trigger,\r\n\r\n$this->tablename,\r\n\r\n$this->deltastable, DMLCOLNAME,\r\n\r\n$this->pkcolumns, DMLTYPE_DELETE,\r\n\r\n$this->oldpkcolumns);

The update trigger is created as:

// if primary key is updated, map the update
// to delete followed by insert
$trigger = 'create trigger %s AFTER UPDATE ON '.
    '%s FOR EACH ROW '.
    'IF (%s) THEN '. ' insert into %s(%s, %s) '.
    ' values(%d, %s); '.
    'ELSE '. ' insert into %s(%s, %s) '.
    ' values(%d, %s), '. ' (%d, %s); '. 'END IF';
$trigger = sprintf($trigger, $this->update_trigger,
    $this->tablename,
    $this->get_pkmatch_condition('NEW', 'OLD'),
    $this->deltastable, DMLCOLNAME,
    $this->columns,
    DMLTYPE_UPDATE, $this->newcolumns,
    $this->deltastable, DMLCOLNAME,
    $this->columns, DMLTYPE_DELETE,
    $this->oldcolumns,
    DMLTYPE_INSERT, $this->newcolumns);
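The body of get_pkmatch_condition is not shown here. As a rough sketch of what such a helper might do, assuming it simply compares every primary-key column between two table aliases, it could look like this. The function name pkMatchCondition and the $pkColumns parameter are hypothetical.

```php
<?php
// Hypothetical stand-in for get_pkmatch_condition(): emits one equality per
// primary-key column and joins them with AND. $pkColumns is an assumed list
// of column names; $left and $right are table names or aliases such as
// NEW and OLD inside a trigger.
function pkMatchCondition(array $pkColumns, string $left, string $right): string
{
    $parts = [];
    foreach ($pkColumns as $col) {
        $parts[] = sprintf('%s.%s = %s.%s', $left, $col, $right, $col);
    }
    return implode(' AND ', $parts);
}
```

For a two-column key, pkMatchCondition(['a', 'b'], 'NEW', 'OLD') yields NEW.a = OLD.a AND NEW.b = OLD.b, which is the kind of condition the IF in the update trigger needs in order to tell a plain update from a primary-key change.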

start snapshot xact

Here we ‘start transaction with consistent snapshot’. At this point the deltas table may already contain changes made by transactions that committed before our snapshot began. Since such changes are already reflected in our snapshot, we don’t want to replay them again during the replay phase. So we also create a temp table named __osc_temp_ids_to_exclude to save the IDs of records that already exist in the deltas table.

$createtemp = 'create temporary table %s(%s INT, %s '.
    'INT, primary key(%s))';
$createtemp = sprintf($createtemp, $temptable,
    IDCOLNAME,
    DMLCOLNAME, IDCOLNAME);

Since InnoDB takes read locks during “insert into T1 select * from T2” statements, we select into an outfile and load from that. The outfile is created in the ‘secure-file-priv’ folder with the name concatenate(‘__osc_ex_’, $this->tablename).

$selectinto = "select %s, %s from %s ".
    "order by %s into outfile '%s' ";
$selectinto = sprintf($selectinto, IDCOLNAME,
    DMLCOLNAME, $this->deltastable,
    IDCOLNAME, $outfile);

// read from outfile above into the temp table
$loadsql = sprintf("LOAD DATA INFILE '%s' INTO ".
    "TABLE %s(%s, %s)",
    $outfile, $temptable,
    IDCOLNAME, DMLCOLNAME);

select_table_into_outfile

If an outfile folder is passed in, we use that. Otherwise, if @@secure_file_priv is non-NULL, we use it as the outfile folder. Otherwise we use @@datadir/dbname as the outfile folder. We assume @@datadir is non-NULL.

The outfile is named concatenate(‘__osc_tbl_’, originaltablename). Since we use multiple outfiles, they are suffixed .1, .2, .3, etc.

We also commit the snapshot xact here.

drop NC indexes

In 5.1 we iterate over the index info gathered in the previous step and drop all indexes whose name is NOT ‘PRIMARY’. We also don’t drop indexes whose first column is the AUTO_INCREMENT column. We use this command to drop an index:

$drop = sprintf('drop index %s on %s', $this->indexname, $this->newtablename);

Indexes are not dropped in 5.0, as mentioned before.

load copy table

We use this command to load each outfile:

$loadsql = sprintf("LOAD DATA INFILE '%s' INTO ".
    "TABLE %s(%s)",
    $this->outfile_table,
    $this->newtablename,
    $this->columns);
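The outfile folder fallback and naming rule from select_table_into_outfile can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the server values are passed in as plain strings rather than queried, and an empty string is treated the same as "not set", which the description above does not specify.

```php
<?php
// Picks the outfile folder in the fallback order described above:
// passed-in folder, then secure_file_priv, then datadir/dbname.
// Assumption: null or '' means the value is not available.
function chooseOutfileFolder(?string $passedIn, ?string $secureFilePriv,
                             string $dataDir, string $dbName): string
{
    if ($passedIn !== null && $passedIn !== '') {
        return rtrim($passedIn, '/') . '/';
    }
    if ($secureFilePriv !== null && $secureFilePriv !== '') {
        return rtrim($secureFilePriv, '/') . '/';
    }
    return rtrim($dataDir, '/') . '/' . $dbName . '/';
}

// Builds the numbered outfile paths: <folder>__osc_tbl_<table>.1, .2, ...
function outfileNames(string $folder, string $tableName, int $count): array
{
    $names = [];
    for ($i = 1; $i <= $count; $i++) {
        $names[] = $folder . '__osc_tbl_' . $tableName . '.' . $i;
    }
    return $names;
}
```

Each path returned by outfileNames would then be substituted for the first %s of the LOAD DATA INFILE statement, one load per file.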

recreate NC indexes

We iterate over the index info gathered in the ‘alter_copy_table’ step and recreate all indexes whose name is NOT ‘PRIMARY’.

We use one alter table command to create all NC indexes.

If the ‘SUB_PART’ column value in information_schema.statistics is not NULL, we use it while building the column list. For example, if the SUB_PART value for column ‘comment’ is 767, we use ‘comment(767)’ in the column list passed to the create index command.

Replay changes

As mentioned before, replaying changes can be done multiple times. We maintain a temp table called TEMP_TABLE_IDS_TO_EXCLUDE to track the IDs that have already been processed. The set of IDs to process is obtained by taking the IDs from the deltas table and excluding those that are in TEMP_TABLE_IDS_TO_EXCLUDE; it is saved in TEMP_TABLE_IDS_TO_INCLUDE.

// Select from deltastable that are not in
// TEMP_TABLE_IDS_TO_EXCLUDE.
// Use left outer join rather than 'in' subquery for better perf.
$idcol = $this->deltastable.'.'.self::IDCOLNAME;
$dmlcol = $this->deltastable.'.'.self::DMLCOLNAME;
$idcol2 = self::TEMP_TABLE_IDS_TO_EXCLUDE.'.'.self::IDCOLNAME;
$selectinto = "select %s, %s ". "from %s LEFT JOIN %s ON %s = %s ".
    "where %s is null order by %s into outfile '%s' ";
$selectinto = sprintf($selectinto, $idcol, $dmlcol,
    $this->deltastable,
    self::TEMP_TABLE_IDS_TO_EXCLUDE,
    $idcol,
    $idcol2, $idcol2, $idcol, $outfile);

We process about 500 rows in a transaction (except for the final replay, which happens while holding the WRITE lock on the table and is done without starting any new transaction).

Here is the query to retrieve the IDs and DML type from TEMP_TABLE_IDS_TO_INCLUDE:

$query = sprintf('select %s, %s from %s order by %s',
    IDCOLNAME, DMLCOLNAME,
    TEMP_TABLE_IDS_TO_INCLUDE,
    IDCOLNAME);

The DMLCOLNAME column tells whether the change is an insert, delete, or update.

Here is how an insert is replayed:

$insert = sprintf('insert into %s(%s) select %s '.
    'from %s where %s.%s = %d',
    $this->newtablename,
    $this->columns,
    $this->columns,
    $this->deltastable,
    $this->deltastable,
    IDCOLNAME,
    $row[IDCOLNAME]);

Here is how a delete is replayed:

$delete = sprintf('delete %s from %s, %s '.
    'where %s.%s = %d AND %s',
    $newtable, $newtable,
    $deltas, $deltas,
    IDCOLNAME,
    $row[IDCOLNAME],
    $this->get_pkmatch_condition($newtable,
        $deltas));

Here is how an update is replayed:

$update = sprintf('update %s, %s SET %s where '.
    '%s.%s = %d AND %s ',
    $newtable, $deltas,
    $assignment, $deltas,
    IDCOLNAME,
    $row[IDCOLNAME],
    $this->get_pkmatch_condition($newtable,
        $deltas));
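To tie the replay pieces together, here is a rough sketch of how the rows read from TEMP_TABLE_IDS_TO_INCLUDE might be grouped into transactions of about 500 and dispatched by DML type. It is a planning function only and runs no SQL. The constant values, the row keys 'id' and 'dml_type', and the function name planReplay are all assumptions made for illustration.

```php
<?php
// Hypothetical DML type codes; the real constants' values are not given above.
const DMLTYPE_INSERT = 1;
const DMLTYPE_DELETE = 2;
const DMLTYPE_UPDATE = 3;

// Groups delta rows into batches and labels each row with the replay action
// its DML type calls for. Each inner array is meant to be replayed inside
// one transaction. Row keys 'id' and 'dml_type' are assumed names.
function planReplay(array $rows, int $batchSize = 500): array
{
    $actions = [
        DMLTYPE_INSERT => 'insert',
        DMLTYPE_DELETE => 'delete',
        DMLTYPE_UPDATE => 'update',
    ];
    $plan = [];
    foreach (array_chunk($rows, $batchSize) as $batch) {
        $steps = [];
        foreach ($batch as $row) {
            $type = (int) $row['dml_type'];
            if (!isset($actions[$type])) {
                throw new InvalidArgumentException("unknown DML type: $type");
            }
            $steps[] = ['id' => (int) $row['id'], 'action' => $actions[$type]];
        }
        $plan[] = $steps;
    }
    return $plan;
}
```

A caller would start a transaction per batch, build the insert, delete, or update statement shown above for each step, and commit at the end of the batch. For the final replay under the WRITE lock, the same steps would run without opening new transactions.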

Swap tables

Here are the steps, as mentioned in the cut-over phase:

  • TURN AUTOCOMMIT OFF: ‘set session autocommit=0’ // without this, lock tables does not get the InnoDB lock

  • lock all tables in WRITE mode:

$lock = sprintf('lock table %s WRITE, %s WRITE, %s WRITE', $this->tablename,
    $this->newtablename, $this->deltastable);

  • final replay

  • rename the original table out of the way and the new table into its place:

$rename_original = sprintf('alter table %s rename %s',
    $this->tablename, $this->renametable);
$rename_new = sprintf('alter table %s rename %s',
    $this->newtablename, $this->tablename);

  • COMMIT // alter tables would have already caused implicit commits in InnoDB

  • unlock tables

  • TURN AUTOCOMMIT ON: ‘set session autocommit=1’
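Seen end to end, the cut-over amounts to a short, fixed sequence of statements. The sketch below collects them in order so the sequence is easier to review. The function name cutOverStatements is hypothetical, and the final replay is represented only by a placeholder comment entry because it is a procedure rather than a single statement.

```php
<?php
// Returns the cut-over statements in the order listed above. The third
// entry is a placeholder: the final replay runs there while the WRITE
// locks are held.
function cutOverStatements(string $table, string $newTable,
                           string $deltasTable, string $renamedTable): array
{
    return [
        'set session autocommit=0',
        sprintf('lock table %s WRITE, %s WRITE, %s WRITE',
            $table, $newTable, $deltasTable),
        '-- final replay of remaining deltas happens here',
        sprintf('alter table %s rename %s', $table, $renamedTable),
        sprintf('alter table %s rename %s', $newTable, $table),
        'COMMIT',
        'unlock tables',
        'set session autocommit=1',
    ];
}
```

The order matters: the original table is renamed away before the new table takes its name, so a failure between the two renames leaves both the new table and the renamed table in existence.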

Cleanup

  • ROLLBACK in case we are in the middle of a xact

  • Turn on autocommit in case we turned it off

  • If trigger cleanup variables are set, drop the triggers and unset the trigger cleanup variables

  • If outfile cleanup variables are set, delete the outfiles and unset the outfile cleanup variables

  • If the cleanup variable is set for both newtable and renamedtable, the failure happened between the two alter tables. In this case just rename renamedtable back to the original table name and unset the cleanup variable for renamedtable.

  • If the cleanup variable is set for newtable, renamedtable, or the deltas table, drop the corresponding tables and unset the corresponding cleanup variables
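The table-related part of the cleanup list can be sketched as a small decision function. This is an illustration only: the function name planTableCleanup and the plain boolean flags are assumptions standing in for the cleanup variables, and the returned strings describe actions rather than execute them.

```php
<?php
// Decides which table cleanup actions apply, given which cleanup
// variables are set. Booleans stand in for the real cleanup variables.
function planTableCleanup(bool $newTable, bool $renamedTable,
                          bool $deltasTable): array
{
    $actions = [];
    if ($newTable && $renamedTable) {
        // Failure happened between the two alter tables: put the original
        // table back under its own name and stop tracking renamedtable.
        $actions[] = 'rename renamedtable back to original table';
        $renamedTable = false;
    }
    if ($newTable) {
        $actions[] = 'drop newtable';
    }
    if ($renamedTable) {
        $actions[] = 'drop renamedtable';
    }
    if ($deltasTable) {
        $actions[] = 'drop deltas table';
    }
    return $actions;
}
```

Doing the rename-back check first is what keeps the original data safe: once renamedtable has been restored and its flag cleared, the later drop step can no longer touch it.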

In force cleanup mode, we pretend that all cleanup variables are set and use ‘drop if exists’.

For details, click here. Thank you, Mike.

If you face any issue with the original schema on a single page, please click here.

MySQL – How to create a database diagram basing on the image available

Often we make some sketches of the database we plan to create on a blackboard or a sheet of paper before we actually design its structure on a computer. After that we discuss the entities we’ve got, normalize them, and repeat these actions several times. As a result we get a fully approved database structure in the form of an image file in the project documentation. In this article we’ll try using the Database Designer of dbForge Studio for MySQL to turn such an image into a database diagram.

Suppose that you have a sketch of the future database:

[Image: Database Structure]

To place this picture onto an empty diagram, create an empty document, for example Diagram1.dbd, by pressing New Database Diagram on the Standard toolbar. Then press the New Image button on the Database Diagram toolbar. The mouse pointer will change to an icon with a picture. Click on any part of the diagram. In the Open dialog window that appears, select the image with the database structure sketch.

[Image: Database Designer: Open New Image]

Now that you can see the database sketch, you can recreate the database from it. Let’s create the necessary tables with primary keys and indexes one by one. For example, to create the Sessions table, press the New Table button on the Database Diagram toolbar. The mouse pointer should change to an icon with a table. Click on any part of the diagram. A window for editing the Table1 table should appear.

[Image: Database Designer: Create New Table]

Using the database editor window, you should do the following:

  • On the General tab, edit the table name; add a key column (edit its name and datatype, and set the Primary option); add all other columns (uncheck the additional Allow nulls(*) option)

  • On the Indexes tab, create indexes for all key columns and uncheck the Unique option

As a result we’ve got a new entity on the diagram: the Sessions table.

[Image: Database Designer: Design New Table]

Now we can add a relation between the Hits and Sessions tables. To do this, you should:

  • Press the New Relation button on the Database Diagram toolbar. The mouse pointer should change to an icon with an arrow. Then click the Hits table and, without releasing the mouse button, drag the cursor to any part of the Sessions table and release the mouse button(**).

  • In the Foreign Key Properties window that appears, select the SessionID column from the “Table Columns” list and press the [→] button. The SessionID column moves to the “Constraints Columns” list. Save these changes by pressing OK.

As a result, we’ve bound two tables, “Hits” and “Sessions”, using the foreign key “hits_FK”.

[Image: Database Designer: Create New Relation]

Now we should repeat the same operations for the remaining tables: creating and designing tables, and creating indexes and relations between tables.

[Image: Database Designer: Display Relation]

An important part of the database design process is the logical division of database objects into groups. The Database Designer in dbForge Studio for MySQL has a special Container component for this purpose.

To create a new container and move the necessary objects into it, you should:

  • Press the New Container button on the Database Diagram toolbar. The mouse pointer should change to an icon with three squares. Click on an empty place on the diagram. A container named Group1 appears. Let’s change the container name.

  • Select the tables you want to move to the container. For example, let’s select the Users, Registrars, Products, and OrderLinks tables.

  • Move the selected tables onto the container.

[Image: Database Designer: New Container]

The final step in creating a database from a sketch is optimizing the location of database objects on the diagram. The algorithm used by Layout Diagram redraws the relations between tables so that they do not intersect. This saves space on the diagram and makes it more readable.

[Image: Database Designer: Layout Diagram]

As a result of the actions described above, we’ve created a database from a sketch without using Alt+Tab to switch to another application that displays the image, and without printing the sketch, owing to the unique functionality of dbForge Studio for MySQL.

  • (*) On the diagram, columns with the Not Null property enabled are displayed in bold (for example, the HitDate column of the SpiderHits table), unlike other columns (for example, the HitUrl column of the SpiderHits table).

  • (**) To create a foreign key between two tables, both tables must have been created with Engine=InnoDB.

You can download an evaluation copy of dbForge Studio for MySQL.

Hello world!

Welcome to Community

Welcome to the SysAdmin community site. You’ll get help, news, discussions, and a collection of tools for system administration and IT professionals. The goal is to make this one of the biggest communities of system administrators and IT professionals.