PHP Design – Biggest Database Oversights

Over the last three years I've had the opportunity to work on several PHP projects, some of which grew rapidly and had to scale quickly. Three in particular have been a fantastic learning experience for me. I don't consider myself a total expert, but I thought I would share my experiences. Many PHP developers are self-taught, like I was, and are learning PHP to be able to create dynamic websites. Many haven't studied or learned best practices for Object-Oriented Programming (OOP). They are web designers first, and programming in PHP is just a way to do things simple HTML can't do.

Because most PHP applications rely heavily on reading and storing information in a database, how you design your interaction with that database is critical. If you design something but don't foresee a requirement, it can be extremely time-consuming to go back and change it. These "oversights" can be costly in time, money, and resources.

I've thought of some of the biggest oversights I've had when working with PHP and MySQL and put them in a list. This is my personal list, and I'm sure some people can think of some other oversights that belong on the list as well. This list is just for PHP & MySQL, not PHP and any database. I know many people like using software like Doctrine to allow switching between different database types. That is beyond the scope of this article.

Many of the examples in this article are very simple and are designed for showing techniques of better programming, not for actual use. Straight copy and paste most likely will not create sound, robust solutions. Search the web for a solid MySQL PHP Class to handle these issues for you, or you can make your own.

Now that I've thrown out my disclaimers and thoughts, let's venture on to my list:

  • Oversight #1 - No Data Access Layer
  • Oversight #2 - Design for Only One Database Connection
  • Oversight #3 - No Developer Logging
  • Oversight #4 - Queries Written In Procedural Processes
  • Oversight #5 - No Separation of Reads & Writes

Oversight #1 - No Data Access Layer

I first heard the term Data Access Layer when I was reading about ASP.NET. Basically, a Data Access Layer (DAL) is a layer of code that you access data through. Sounds simple, right? Many new PHP developers don't have any type of DAL. Here is an example of not having a Data Access Layer:

PHP:
<?php

// Create MySQL Connection
$connection = mysql_connect('localhost', 'mysql_user', 'mysql_password');

// Query User List
$result = mysql_query("SELECT * FROM users", $connection);

// Get User List in an Array
$user_list = array();
while($user = mysql_fetch_assoc($result))
{
     $user_list[] = $user;
}

// Take the populated user list and display a table...
// ..

?>

Now, that wasn't so bad, was it? Lots of developers just use the mysql commands directly. I have several projects from my early days with PHP that fall victim to this oversight. One in particular has grown so far out of hand that we've decided to start from scratch for a whole new version. Why? Let's say you have 3000+ PHP files, and your boss says "Hrm, we're seeing some problems with performance. Can you display at the bottom of each page the # of queries you use on that page?" If you coded your entire project like the example above, you would be totally screwed. You would have to find each and every mysql_query() and add a counter after it. It would be a management nightmare. So how could you solve this problem?

PHP:
<?php

// Declare MySqlDAL class
class MySqlDAL
{
     public $connection = null; // hold the connection link resource
     public $query_count = 0; // total number of queries executed

     // Connect when creating an instance of this class
     public function __construct($host, $user, $pass)
     {
          $this->connection = mysql_connect($host, $user, $pass);
     }

     // Run a query against the database. The key here is to make sure
     // every query command goes through this function
     public function Query($sql)
     {
          // Execute Query on this object's own connection & Get Result Resource
          $result = mysql_query($sql, $this->connection);

          // Increment Query Counter
          $this->query_count++;

          // Return the result (false if the query failed)
          return $result;
     }

     // Function to take an SQL query, execute it, and return all
     // the rows in an assoc. array.
     public function FetchArray($sql)
     {
          // Execute the Query and get the Result
          $result = $this->Query($sql);

          // Create Empty Array to Store all the rows
          $array = array();

          // A failed query returns false, so there are no rows to fetch
          if($result === false)
          {
               return $array;
          }

          // Loop through each row
          while($row = mysql_fetch_assoc($result))
          {
               // Add the row to the array
               $array[] = $row;
          }

          // Return the array containing all the rows
          return $array;
     }
}

$dal = new MySqlDAL('localhost','mysql_user','mysql_pass');
$user_list = $dal->FetchArray("SELECT * FROM users");

// Display Table of Users...
// ..

?>

First of all, you've satisfied the boss's wish of keeping a counter for the total # of queries. The trick is to have all queries go through the Query($sql) function, which ensures the counter stays current. Because every query passes through one place, you can also add code there to catch errors, display error messages, log queries, etc. Having a class that wraps your database connection and queries can make your life easier and simpler.
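
The mysql_* functions used above were removed in PHP 7.0, so on newer PHP versions the same wrapper idea has to sit on top of a different extension. Here is a rough sketch of what that might look like with PDO. The class name PdoDAL is made up for illustration, and the in-memory SQLite DSN is only there so the snippet runs without a database server; for MySQL you would pass a DSN along the lines of 'mysql:host=localhost;dbname=my_database' plus a user and password.

```php
<?php

// Hypothetical PDO-based counterpart of MySqlDAL (the name is an assumption)
class PdoDAL
{
    private $pdo;              // the underlying PDO connection
    public $query_count = 0;   // total number of queries attempted

    public function __construct($dsn, $user = null, $pass = null)
    {
        $this->pdo = new PDO($dsn, $user, $pass);
        // Throw exceptions on SQL errors instead of failing silently
        $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    }

    // Every query goes through here, so the counter stays current
    public function Query($sql, array $params = array())
    {
        $this->query_count++;

        $statement = $this->pdo->prepare($sql);
        $statement->execute($params);

        return $statement;
    }

    // Run a query and return all the rows as associative arrays
    public function FetchArray($sql, array $params = array())
    {
        return $this->Query($sql, $params)->fetchAll(PDO::FETCH_ASSOC);
    }
}

// In-memory SQLite is an assumption made only to keep the sketch self-contained
$dal = new PdoDAL('sqlite::memory:');
$dal->Query("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");
$dal->Query("INSERT INTO users (name) VALUES (?)", array('alice'));
$user_list = $dal->FetchArray("SELECT * FROM users");

echo "Queries on this page: " . $dal->query_count . "\n";
```

The counter is incremented before the query runs, so failed queries are counted as well. Whether that is what you want depends on what your boss is really asking for.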


Oversight #2 - Design for Only One Database Connection

I have a project that I've taken over right now that suffers from being designed to handle only one database connection. "But why would you ever need to connect to more than one database?" Well, I like to have one database for important information: invoices, users, payments, etc. Stuff that is updated every day and that the end user manipulates. Then I like to have another database for logging statistical information: user log-ins, page hits, visitor counts. The problem I've run into lately is that I want hourly backups of my important information, but only daily backups of simple statistical information. That would be much easier if they were separated into different databases, but many times they are in the same one, and I don't want to spend the extra time and space backing up the statistical information every hour.

Many people who run into this problem use a Singleton class to handle their Data Access Layer. A singleton class is a class that can only have one instance at a time during execution. This can be super easy and quick to use in your code, but may lock you into a situation where your code only handles a single database connection:

PHP:
<?php

// I've declared a singleton class called MyDB.

$users = MyDB::FetchArray("SELECT * FROM users");

// ..

?>

Now your boss comes in and asks you to connect to another database to get a list of customers. You think "uh oh, I can't really do that with my design, now can I?" The solution I like to use is having Database Connection classes, and then a Database Connection Manager. Imagine I have the MySqlDAL class from the previous example in oversight #1. I would create a singleton class called MySqlManager. Its job would be to handle the creation and management of all the MySqlDAL instances. Here is an example:

PHP:
<?php

// ... declare my MySqlDAL class before

// Handles the creation and management of MySQL Connections
class MySqlManager
{

    // Holds the singleton instance
    private static $instance = null;

    // Contains each MySqlDAL object created
    private $connections;

    private function __construct()
    {
        // It's private to prevent creation outside of the GetInstance function
        $this->connections = array();
    }

    public static function GetInstance()
    {
        // If the instance is null, make one
        if(!self::$instance)
        {
            self::$instance = new MySqlManager();
        }

        return self::$instance;
    }

    // Connect to a new database and remember the connection under $name
    public static function CreateConnection($host, $user, $pass, $name = 'default')
    {
        $manager = self::GetInstance();
        $manager->connections[$name] = new MySqlDAL($host, $user, $pass);

        return $manager->connections[$name];
    }

    public static function GetConnection($name = 'default')
    {
        $manager = self::GetInstance();
        if(isset($manager->connections[$name]))
        {
            return $manager->connections[$name];
        }
        else
        {
            // handle connection not found error...
        }
    }
}

// Now you can create a connection, and if you don't give it a name, it will be named 'default' for you

$dal = MySqlManager::CreateConnection('localhost','mysql_user','mysql_pass');

$customer_dal = MySqlManager::CreateConnection('mysql.another-location.com','another_user','another_pass', 'customer');

// ..

// later in your code, maybe even in a class or function you could get the connection by name:

// get default connection
$dal = MySqlManager::GetConnection();

// Get customer database connection
$customer_dal = MySqlManager::GetConnection('customer');

$users = $dal->FetchArray('SELECT * FROM users');

$customers = $customer_dal->FetchArray("SELECT * FROM customers");

// ..

?>

Now you have a MySQL Connection Manager class that handles the creation and management of your MySQL connections, and it is the natural place to add disconnection as well. Your boss will be happy to hear you can connect to another database with ease!

Oversight #3 - No Developer Logging

There are many times where I wish I could see what was going on behind the scenes. It is very easy to add Query Logging to your Data Access Layer if you choose. I won't give a full example here, but there are several very important parts to it. First, you need to be able to enable and disable query logging. Second, you should log whether each query was successful or not. Also, if you play around with PHP's back tracing abilities you can trace where each query was made. I'll make a blog post about this another day.
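
To make those three parts a little more concrete, here is a minimal sketch of a log that a DAL's Query() function could call into. The QueryLog class, its method names, and the $skip_frames parameter are all assumptions made up for this illustration. The only built-in it leans on is debug_backtrace(), whose first frame holds the file and line that called the current function.

```php
<?php

// Hypothetical query log; none of these names come from a real library
class QueryLog
{
    private $enabled = false;    // logging stays off until switched on
    private $entries = array();  // one entry per logged query

    public function Enable()
    {
        $this->enabled = true;
    }

    public function Disable()
    {
        $this->enabled = false;
    }

    // Record a query, whether it succeeded, and where it was issued from.
    // $skip_frames is the number of wrapper calls between the calling code
    // and this method: 0 when Record() is called directly, 1 when it is
    // called from inside a DAL's Query() function.
    public function Record($sql, $success, $skip_frames = 0)
    {
        if(!$this->enabled)
        {
            return;
        }

        $trace = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS);
        $frame = isset($trace[$skip_frames]) ? $trace[$skip_frames] : array();

        $this->entries[] = array(
            'sql'     => $sql,
            'success' => (bool) $success,
            'file'    => isset($frame['file']) ? $frame['file'] : null,
            'line'    => isset($frame['line']) ? $frame['line'] : null,
        );
    }

    public function GetEntries()
    {
        return $this->entries;
    }
}

$log = new QueryLog();
$log->Record("SELECT 1", true);                       // ignored, logging is disabled
$log->Enable();
$log->Record("SELECT * FROM users", true);
$log->Record("SELECT * FROM missing_table", false);

foreach($log->GetEntries() as $entry)
{
    $status = $entry['success'] ? 'ok' : 'FAILED';
    echo "[$status] {$entry['sql']} ({$entry['file']}:{$entry['line']})\n";
}
```

Inside a DAL, Query() would call something like $log->Record($sql, $result !== false, 1) so that the logged file and line point at the page that asked for the query rather than at the DAL itself.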

Oversight #4 - Queries Written In Procedural Processes

I absolutely *loathe* when I get into a situation where queries are being written all over the place. If there is one constant in any software development, it is that things will always change. When queries are being placed in procedural code, it makes management of database changes a total hassle. "But what about find and replace?" That can help you find places where you need to change, but you still can miss other places in your code, which can lead to bugs, headaches, and self-combustion. Here is an example:

PHP:
<?php

// .. code displaying a user's links page:

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on links page...

// .. another page

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on profile page...

// .. another page

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on about me page...

?>

Not seeing the big issue here? Let's say you copy and paste that line of code '$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");' on twenty different pages. Then your boss comes and says "Hey, you know how the user_links table has a title? Well, everywhere we display the title, I'd like to have it show which # of link it is, for example: Link #1 - My Blog, Link #2 - My Facebook. Oh, and can we order them by date_added? Thanks!"

Have fun trying to find every place that query is executed. Don't be surprised when only 80% of the queries get changed, and the other 20% become bug tickets with your boss wondering why you didn't find them before. This is an example of writing queries all over the place when you need them, instead of organizing them and abstracting them into classes and functions. Here is an example of abstracting the query:

PHP:
<?php

class User
{
    public $id; // contains user_id

    public function __construct($id)
    {
        $this->id = $id;
    }

    // ..

    public function GetLinks()
    {
        // Ask the manager for the default connection (a MySqlDAL object)
        $dal = MySqlManager::GetConnection();

        // Cast the id to an integer so it is safe to place in the query
        $user_id = (int) $this->id;
        $links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = $user_id ORDER BY date_added");

        // Prefix each title with its link number, e.g. "Link #1 - My Blog"
        $new_link_list = array();
        foreach($links as $link)
        {
            $link['title'] = "Link #".$link['number'].' - '.$link['title'];
            $new_link_list[] = $link;
        }

        return $new_link_list;
    }

    // ..
}

$user = new User(1);

$user_links = $user->GetLinks();

// ..

?>

Now your boss will give you a raise when you can make changes this quickly. While this example may seem simple, there are many times where I've had projects where queries have required major changes and rewrites. Bugs quickly surface if queries that are the same, or similar, are copied and re-copied all over the place. This leads to a giant mess very quickly. So remove queries from the procedural flow of things by abstracting them into classes and functions.

Oversight #5 - No Separation of Reads & Writes

This oversight stung me for the first time just a few weeks ago. A project of mine had reached critical mass and outgrew its beefy dedicated MySQL server. I started investigating a Master-Slave replication setup, but every article said I needed to route SELECT statements to the slave servers and just about everything else (Insert, Update, Delete, etc.) to the Master. This project didn't have any Data Access Layer at all, and had 2+ years of development behind it. There was no way I could go back and rewrite all the queries on budget and on schedule without causing bugs, especially if Insert/Update/Delete commands accidentally got routed to a slave server.

My suggestion is to funnel all queries into two functions, Select() and Execute(). Then, when you create a MySqlDAL instance, it would actually hold both a $this->master_connection and a $this->slave_connection. It wouldn't connect on creation, but the first time a query needs that connection. That way a page that only reads never opens a connection to the Master, and relieving stress on the Master is the whole purpose of Master-Slave replication setups. Here is a mock class that would handle this:

PHP:
<?php

class MySqlDAL
{
    public $master_connection;
    public $master_host;
    public $slave_connection;
    public $slave_host;
    public $user;
    public $pass;

    // ..

    public function Select($sql)
    {
        if(!$this->slave_connection)
        {
            $this->ConnectSlave();
        }
        // .. Increment Slave Counter ..

        return $this->Query($sql, $this->slave_connection);
    }

    public function Execute($sql)
    {
        if(!$this->master_connection)
        {
            $this->ConnectMaster();
        }

        // .. Increment Master Counter ..

        return $this->Query($sql, $this->master_connection);
    }

    // ..

}

?>
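
The connect-on-first-use part is left out of the mock class, so here is one way it could be sketched. Everything in it is an assumption made for illustration: the LazySplitDAL name, the idea of passing in two "connector" callbacks, and the stand-in Query() function, which just reports which link it would have used instead of talking to a real server.

```php
<?php

// Hypothetical sketch of lazy master/slave connections
class LazySplitDAL
{
    private $connect_master;            // callback that opens the master link
    private $connect_slave;             // callback that opens the slave link
    private $master_connection = null;
    private $slave_connection = null;

    public $master_count = 0;           // queries sent to the master
    public $slave_count = 0;            // queries sent to the slave

    public function __construct($connect_master, $connect_slave)
    {
        // Nothing is connected yet; we only remember how to connect
        $this->connect_master = $connect_master;
        $this->connect_slave = $connect_slave;
    }

    // Reads go to the slave, which is opened on first use
    public function Select($sql)
    {
        if($this->slave_connection === null)
        {
            $this->slave_connection = call_user_func($this->connect_slave);
        }
        $this->slave_count++;

        return $this->Query($sql, $this->slave_connection);
    }

    // Writes go to the master, which is also opened on first use
    public function Execute($sql)
    {
        if($this->master_connection === null)
        {
            $this->master_connection = call_user_func($this->connect_master);
        }
        $this->master_count++;

        return $this->Query($sql, $this->master_connection);
    }

    // Stand-in for the real query call: a real DAL would hand $sql and
    // $connection to its database extension here
    private function Query($sql, $connection)
    {
        return array('server' => $connection, 'sql' => $sql);
    }
}

// Fake connectors that record when they are called
$opened = array();
$dal = new LazySplitDAL(
    function() use (&$opened) { $opened[] = 'master'; return 'master-link'; },
    function() use (&$opened) { $opened[] = 'slave'; return 'slave-link'; }
);

$rows = $dal->Select("SELECT * FROM users");
$opened_after_select = $opened;   // only the slave has been opened so far

$result = $dal->Execute("UPDATE users SET active = 1");

echo "Opened after the read: " . implode(', ', $opened_after_select) . "\n";
echo "Opened after the write: " . implode(', ', $opened) . "\n";
```

Passing callbacks into the constructor keeps the hosts and passwords out of the class and makes the routing easy to check without a database, but storing the hosts as properties, like the mock class does, would work just as well.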

Conclusion

Hopefully these insights into oversights with PHP and MySQL can help someone out there tackle their next project, big or small. If you have any questions, feel free to leave a comment and I'll respond as soon as possible.

Posted in Programming.

19 Responses

  1. padma krishnan said

    nicely written and it explains well the limitations of PHP MySql.

    Have you in anytime thought of a solution for mapping PHP objects to Mysql tables with a layer?
    Any limitations?

  2. I have done something along those lines, I just need to clean up my code and make good examples to show it off. One day I’ll get around to it.

  3. Stephen Beattie said

    Looks like I’m doing something right then. The DB Connection class I wrote recently looks almost line for line identical to yours except I’m using PDO :-)

    Thanks for the excellent article Justin.

  4. EllisGL said

    Shouldn’t the new link argument be set for mysql_connect?

  5. In your solution to oversight #4 you should mention the use of iterators. You’ve fallen into a pattern where in your method that encapsulates some very specific functionality you loop over a set of database results. It’s also the case where in your FetchArray method you loop over the result set. In all likelihood, the array you return from your GetVideos method would be looped to display video information to the user.

    In sum, you’ve looped over the mysql result set three times simply to display results to the user. Given a big result set this is a huge waste of resources.

    The solution, which I would say is oversight #6 is to use iterators to make it so that fetching from the database, formatting results, and displaying them, is all only done when it needs to be done.

    I suggest you go research PHP’s Iterator interface and the IteratorIterator class.

  6. @Justin, Great Article!

    I’m working on a framework (Recess! Framework, http://www.recessframework.com/) which abstracts away each of your oversights.

    I wrote a response to how Recess! addresses these oversights internally here: http://www.krisjordan.com/2008/11/28/how-recess-solves-common-phpmysql-issues/

    Cheers!

  7. juust said

    Very useful article, Justin, thanks. I am rebuilding a site’s data warehouse, this explains what I wanted to know in one page.

  8. coolerthanthou said

    $dal = new MySqlDAL('localhost','mysql_user','mysql_pass');

    What happens if they change the username & password? You’ll have to change that line in 3000+ files?

  9. @coolerthanthou

    You bring up a good point I didn’t mention in the article. It is *critical* that you use configuration files. That way your website finds the username & password from a configuration file, so when you change your login information, you change it in one place.

    Another option is to have an included file that is always loaded at the beginning of your website that establishes a connection.

  10. Cool post, thank you!

  11. Great post, I follow all of those measures and I must say that they do solve a lot of things, although it took me years of trial and error to get to the point where I am now.

    It is great that you showed this to everyone, and I recommend that you post someday your own MySQL handling class. The one I use was developed by Sebastien Laout and it has one of the best debugging systems I have used.

    Best regards,
    Alex

  12. a dada said

    i think most of the points here are for those who obsessed with oop structure. most of these examples will increase the code-base without adding an useful functionality. php is a 4gl and does not need excessive structure to get things done for small apps. is there really a need for a data abstraction layer if you never plan on migrating from mysql?

    consider oversight 4 – ‘Queries Written In Procedural Processes’ – if you’ve got to update the same query in 20 different pages for a minor functionality change, your system architecture must be quite convoluted anyway. if you have to disturb the flow of an application by continually accessing class upon class, you are not making it easier to maintain. it means each time you want to modify a simple logic change in a procedure, you have to have multiple files open figure out where the change is.

    no piece of code is permanent. the purpose of software is to make effective use of hardware in a constantly changing environment. a language provides sufficient structure – your app doesnt need to.

  13. AWLearningTheHardWay said

    Hmm this is really good,

    Will there be a Follow-up Artical with more… The whole why fix something if you can make it so it doesnt break in the first place is a key ideaology which i am trying to follow .

    And secondly from your own experiance if you was starting a project from scratch, which would be avalible for others database specific say ‘mySQL’ or so that its possible to add there own handleing class such as how phpBB does it?

  14. wow…nice tutorial, thanks.

Continuing the Discussion

  1. Justin Carmony’s Blog: PHP Design - Biggest Database Oversights : WebNetiques linked to this post on November 27, 2008

    [...] Carmony recently put together a blog post looking at the biggest database design oversights that PHP developers can make in their [...]

  2. How Recess Solves Common PHP/MySQL Issues | Kris Jordan linked to this post on November 28, 2008

    [...] Carmony wrote a great article titled ‘PHP Design – Biggest Database Oversights‘. The article points out 5 naive design decisions that will come back to haunt projects if [...]
