PHP Design – Biggest Database Oversights

Over the last three years I’ve had the opportunity to work on several PHP projects, some of which grew rapidly and had to scale quickly. Three in particular have been a fantastic learning experience for me. I don’t consider myself an expert, but I thought I would share my experiences. Many PHP developers are self-taught, like I was, and are learning PHP to be able to create dynamic websites. Many haven’t studied or learned best practices for Object-Oriented Programming (OOP). They are web designers first, and programming in PHP is just a way to do things simple HTML can’t do.

Because most PHP applications rely heavily on reading and storing information in a database, how you design your interaction with that database is critical. If you design something but don’t foresee a requirement, it can be extremely time consuming to go back and change it. These “oversights” can be costly in time, money, and resources.

I’ve thought of some of the biggest oversights I’ve had when working with PHP and MySQL and put them in a list. This is my personal list, and I’m sure some people can think of some other oversights that belong on the list as well. This list is just for PHP & MySQL, not PHP and any database. I know many people like using software like Doctrine to allow switching between different database types. That is beyond the scope of this article.

Many of the examples in this article are very simple and are designed for showing techniques of better programming, not for actual use. Straight copy and paste most likely will not create sound, robust solutions. Search the web for a solid MySQL PHP Class to handle these issues for you, or you can make your own.

Now that I’ve thrown out my disclaimers and thoughts, let’s venture on to my list:

  • Oversight #1 – No Data Access Layer
  • Oversight #2 – Design for Only One Database Connection
  • Oversight #3 – No Developer Logging
  • Oversight #4 – Queries Written In Procedural Processes
  • Oversight #5 – No Separation of Reads & Writes

Oversight #1 – No Data Access Layer

I first heard the term Data Access Layer when I was reading about ASP.NET. Basically, a Data Access Layer (DAL) is a layer of code that you access data through. Sounds simple, right? Many new PHP developers don’t have any type of DAL. Here is an example of not having a Data Access Layer:

<?php

// Create MySQL Connection
$connection = mysql_connect('localhost', 'mysql_user', 'mysql_password');

// Query User List
$result = mysql_query("SELECT * FROM users", $connection);

// Get User List in an Array
$user_list = array();
while($user = mysql_fetch_assoc($result))
{
     $user_list[] = $user;
}

// Take the populated user list and display a table...
// ..

?>

Now, that wasn’t so bad, was it? Lots of developers just use the mysql_* functions directly. I have several projects from my early days with PHP that fall victim to this oversight. One in particular has grown so out of hand that we’ve decided to start from scratch for a whole new version. Why? Let’s say you have 3000+ PHP files, and your boss says “Hrm, we’re seeing some problems with performance. Can you display at the bottom of each page the # of queries you use on that page?” If you coded your entire project like the example above, you would be totally screwed. You would have to find each and every mysql_query() call and add a counter after it. It would be a management nightmare. So how could you solve this problem?


<?php

// Declare MySqlDAL class
class MySqlDAL 
{
     public $connection = null; // hold the connection link resource
     public $query_count = 0; // total number of queries executed

     // Connect when creating an instance of this class
     public function __construct($host, $user, $pass)
     {
          // A constructor's return value is ignored, so just store the link
          $this->connection = mysql_connect($host, $user, $pass);
     }

     // Run a query against the database. The key here is to make sure 
     // every query command goes through this function
     public function Query($sql)
     {
          // Execute Query on this object's own connection & Get Result Resource
          $result = mysql_query($sql, $this->connection);
          
          // Increment Query Counter
          $this->query_count++;
          
          // Return the result
          return $result;
     }

     // Function to take an SQL query, execute it, and return all 
     // the rows in an assoc. array.
     public function FetchArray($sql)
     {
          // Execute the Query and get the Result
          $result = $this->query($sql);

          // Create Empty Array to Store all the rows
          $array = array();

          // Loop through each row
          while($row = mysql_fetch_assoc($result))
          {
               // Add the row to the array
               $array[] = $row;
          }

          // Return the array containing all the rows
          return $array;
     }
}

$dal = new MySqlDAL('localhost','mysql_user','mysql_pass');
$user_list = $dal->FetchArray("SELECT * FROM users");

// Display Table of Users...
// ..

?>

First of all, you’ve satisfied the boss’s wish of keeping a counter for the total # of queries. The trick is to have all queries go through the Query($sql) function, which ensures the counter stays current. Because everything passes through that one function, it is also the place to add code to catch errors, display error messages, log queries, etc. Having a class wrapped around your database connection and queries can make your life much simpler.
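
As a rough sketch of the “catch errors” idea, here is what the same wrapper could look like with error handling added. It assumes the mysqli extension rather than the mysql_* functions used above (those were removed in PHP 7), and the class name MysqliDal and the choice to throw a RuntimeException are just for illustration:

<?php

// Hypothetical variant of MySqlDAL built on mysqli (an assumption; the
// examples above use the older mysql_* functions).
class MysqliDal
{
     private $connection;      // mysqli connection object
     public $query_count = 0;  // total number of queries attempted

     // Takes an already-connected mysqli object
     public function __construct(mysqli $connection)
     {
          $this->connection = $connection;
     }

     // Every query still goes through this one function
     public function Query($sql)
     {
          // Count the attempt, whether or not it succeeds
          $this->query_count++;

          $result = $this->connection->query($sql);
          if($result === false)
          {
               // One single place to decide what a failed query means
               throw new RuntimeException('Query failed: '.$this->connection->error);
          }

          return $result;
     }

     // Run a SELECT and return all the rows in an assoc. array
     public function FetchArray($sql)
     {
          $result = $this->Query($sql);

          $array = array();
          while($row = $result->fetch_assoc())
          {
               $array[] = $row;
          }
          $result->free();

          return $array;
     }
}

?>

You would create it with something like new MysqliDal(new mysqli('localhost', 'mysql_user', 'mysql_password', 'my_database')), where my_database is a placeholder name, and print $dal->query_count at the bottom of the page. On PHP 8.1 and later mysqli throws its own mysqli_sql_exception by default, so the check for false only matters when that reporting is turned off.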


Oversight #2 – Design for Only One Database Connection

I have a project that I’ve taken over right now that suffers from being programmed to handle only one database connection. “But why would you ever need to connect to more than one database?” Well, I like to have one database for important information: invoices, users, payments, etc. This is stuff that is updated every day and that the end user manipulates. Then I like to have another database for logging statistical information: user log-ins, page hits, visitor counts. The problem I’ve run into lately is that I want hourly backups of my important information, but only daily backups of simple statistical information. That would be much easier if the two were in separate databases, but many times they share one, and I don’t want to spend the extra time and space backing up the statistical information every hour.

Many people who run into this problem use a Singleton class to handle their Data Access Layer. A singleton class is a class that can only have one instance at a time during execution. This can be super easy and quick to use in your code, but it may lock you into a situation where your code only handles a single database connection:

<?php

// I've declared a singleton class called MyDb.

$users = MyDb::FetchArray("SELECT * FROM users");

// ..

?>

Now your boss comes in and asks you to connect to another database to get a list of customers. You think “uh oh, I can’t really do that with my design, now can I?” The solution I like to use is having Database Connection classes, and then a Database Connection Manager. Imagine I have the MySqlDAL class from the previous example in oversight #1. I would create a singleton class called MySqlManager. Its job would be to handle the creation and management of all the MySqlDAL objects. Here is an example:

<?php

// ... declare my MySqlDAL class before

// Handles the creation and management of MySQL Connections
class MySqlManager
{

    // Holds the singleton instance
    private static $instance = null;

    // Contains each MySqlDAL object created, keyed by connection name
    private $connections;

    private function __construct()
    {
        // It's private to prevent creation outside of the GetInstance function
        $this->connections = array();
    }

    public static function GetInstance()
    {
        // If the instance is null, make one
        if(!self::$instance)
        {
            self::$instance = new MySqlManager();
        }

        return self::$instance;
    }
    
    // Connect to a new database;
    public static function CreateConnection($host, $user, $pass, $name = 'default')
    {
        $manager = self::GetInstance();
        $manager->connections[$name] = new MySqlDAL($host, $user, $pass);

        return $manager->connections[$name];
    }

    public static function GetConnection($name = 'default')
    {
        $manager = self::GetInstance();
        if(isset($manager->connections[$name]))
        {
            return $manager->connections[$name];
        }
        else
        {
            // handle connection not found error...
        }
    }
}

// Now you can create a connection, and if you don't give it a name, it will be named 'default' for you

$dal = MySqlManager::CreateConnection('localhost','mysql_user','mysql_pass');

$customer_dal = MySqlManager::CreateConnection('mysql.another-location.com','another_user','another_pass', 'customer');

// ..

// later in your code, maybe even in a class or function you could get the connection by name:

// get default connection
$dal = MySqlManager::GetConnection();

// Get customer database connection
$customer_dal = MySqlManager::GetConnection('customer');

$users = $dal->FetchArray('SELECT * FROM users');

$customers = $customer_dal->FetchArray("SELECT * FROM customers");

// ..

?>

Now you have a MySQL Connection Manager class that handles the creation and management of your MySQL connections, and it is the natural place to add disconnection as well. Your boss will be happy to hear you can connect to another database with ease!
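
The “connection not found” branch in GetConnection() is left empty above. One way to fill it, sketched here under my own assumptions, is to throw an exception so a mistyped connection name fails loudly instead of quietly returning null. ConnectionRegistry and its method names are hypothetical stand-ins, not part of MySqlManager, and the registry stores whatever object you hand it so the example runs without a database:

<?php

// Stripped-down, hypothetical registry that only shows the error handling
class ConnectionRegistry
{
    // Holds each stored connection object, keyed by name
    private static $connections = array();

    public static function Add($connection, $name = 'default')
    {
        self::$connections[$name] = $connection;

        return $connection;
    }

    public static function Get($name = 'default')
    {
        if(!isset(self::$connections[$name]))
        {
            // Fail loudly instead of returning null
            throw new InvalidArgumentException("No database connection named '$name'");
        }

        return self::$connections[$name];
    }
}

// A plain object stands in for a real MySqlDAL instance here
ConnectionRegistry::Add(new stdClass(), 'customer');

try
{
    ConnectionRegistry::Get('custmer'); // typo on purpose
}
catch(InvalidArgumentException $e)
{
    echo $e->getMessage(); // No database connection named 'custmer'
}

?>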

Oversight #3 – No Developer Logging

There are many times where I wish I could see what was going on behind the scenes. It is very easy to add query logging to your Data Access Layer if you choose. I won’t go through a full implementation here, but there are several important parts to it. First, you need to be able to enable and disable query logging. Second, you should log whether each query was successful or not. Also, if you play around with PHP’s backtrace abilities (debug_backtrace()), you can trace where each query was made. I’ll make a blog post about this another day.
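
To make those three pieces a little more concrete, here is a minimal sketch of a logger that a Data Access Layer could call from its Query() function. QueryLogger and its method names are made up for this illustration; the only built-in it relies on is PHP’s debug_backtrace():

<?php

// Hypothetical query logger: can be switched on and off, records success
// or failure, and records where the query came from.
class QueryLogger
{
    private $enabled = false;   // logging is off until you ask for it
    private $entries = array(); // one array per logged query

    public function Enable()
    {
        $this->enabled = true;
    }

    public function Disable()
    {
        $this->enabled = false;
    }

    // Record one query. $success is whatever your DAL decided (true/false).
    // $depth is how many stack frames sit between this call and the code
    // you want to blame: 0 is the code that called Log() directly, 1 is
    // the caller of that code, and so on.
    public function Log($sql, $success, $depth = 0)
    {
        if(!$this->enabled)
        {
            return;
        }

        $trace = debug_backtrace();
        $frame = isset($trace[$depth]) ? $trace[$depth] : array();

        $this->entries[] = array(
            'sql'     => $sql,
            'success' => (bool) $success,
            'file'    => isset($frame['file']) ? $frame['file'] : 'unknown',
            'line'    => isset($frame['line']) ? $frame['line'] : 0,
        );
    }

    public function GetEntries()
    {
        return $this->entries;
    }
}

$logger = new QueryLogger();
$logger->Log("SELECT 1", true);               // ignored, logging is disabled
$logger->Enable();
$logger->Log("SELECT * FROM users", true);    // logged as a success
$logger->Log("SELECT * FROM missing", false); // logged as a failure

?>

If Log() is called from inside the DAL’s Query() function, passing a $depth of 1 points the entry at the line that called Query() rather than at the DAL itself.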

Oversight #4 – Queries Written In Procedural Processes

I absolutely *loathe* when I get into a situation where queries are being written all over the place. If there is one constant in any software development, it is that things will always change. When queries are being placed in procedural code, it makes management of database changes a total hassle. “But what about find and replace?” That can help you find places where you need to change, but you still can miss other places in your code, which can lead to bugs, headaches, and self-combustion. Here is an example:

<?php

// .. code displaying the links page:

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on links page...

// .. another page

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on profile page...

// .. another page

// Get user links
$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");

// Display them on about me page...

?>

Not seeing the big issue here? Let’s say you copy and paste that line of code, ‘$links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = 1 ORDER BY number");’, onto twenty different pages. Then your boss comes and says “Hey, you know how the user_links table has a title? Well, everywhere we display the title, I’d like it to show which number link it is, for example: Link #1 – My Blog, Link #2 – My Facebook. Oh, and can we order them by date_added? Thanks!”

Have fun trying to find every place that query is executed. Don’t be surprised when only 80% of the queries get changed, and the other 20% become bug tickets with your boss wondering why you didn’t find them before. This is an example of writing queries all over the place when you need them, instead of organizing them and abstracting them by Classes and Functions. Here is an example of abstracting the query:

<?php

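class User
{
    public $id; // contains user_id

    public function __construct($id)
    {
        // Force an integer so the id is safe to place in a query
        $this->id = (int) $id;
    }

    // ..

    // The one and only place the user_links query lives
    public function GetLinks()
    {
        // Ask the manager for the default connection (a MySqlDAL object)
        $dal = MySqlManager::GetConnection();
        $links = $dal->FetchArray("SELECT * FROM user_links WHERE user_id = ".(int) $this->id." ORDER BY date_added");

        // Prefix each title with its link number, e.g. "Link #1 - My Blog"
        $new_link_list = array();
        foreach($links as $link)
        {
            $link['title'] = "Link #".$link['number'].' - '.$link['title'];
            $new_link_list[] = $link;
        }

        return $new_link_list;
    }

    // ..
}

$user = new User(1);

$user_links = $user->GetLinks();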

// ..

?>

Now your boss will give you a raise when you can make changes this quickly, because the query now lives in exactly one place. While this example may seem simple, I’ve had many projects where queries required major changes and rewrites. Bugs quickly surface when queries that are the same, or similar, are copied and re-copied all over the place. This leads to a giant mess very quickly. So remove queries from the procedural flow of things by abstracting them into classes and functions.

Oversight #5 – No Separation of Reads & Writes

This oversight stung me for the first time just a few weeks ago. A project of mine had reached critical mass and outgrew its beefy dedicated MySQL server. I started investigating a Master-Slave replication setup, but every article said I needed to route SELECT statements to the slave servers, and just about everything else (INSERT, UPDATE, DELETE, etc.) to the Master. This project didn’t have any Data Access Layer at all, and had 2+ years of development behind it. There was no way I could go back and rewrite all the queries on budget and on schedule without causing bugs, especially if INSERT/UPDATE/DELETE commands accidentally got routed to a slave server.

My suggestion is to funnel all queries into two functions, Select() and Execute(). Then, when you create a MySqlDAL instance, it would actually hold both a $this->master_connection and a $this->slave_connection. It wouldn’t connect on creation, but the first time a query is executed on that connection. Reads go to the slave, which takes stress off the Master server, the whole purpose of a Master-Slave replication setup. Here is a mock class that would handle this:

<?php

class MySqlDAL
{
    public $master_connection;
    public $master_host;
    public $slave_connection;
    public $slave_host;
    public $user;
    public $pass;

    // ..

    public function Select($sql)
    {
        if(!$this->slave_connection)
        {
            $this->ConnectSlave();
        }
        // .. Increment Slave Counter ..

        return $this->Query($sql, $this->slave_connection);
    }

    public function Execute($sql)
    {
        if(!$this->master_connection)
        {
            $this->ConnectMaster();
        }
        
        // .. Increment Master Counter ..

        return $this->Query($sql, $this->master_connection);
    }

    // ..   

}

?>
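
The mock above leaves ConnectSlave(), ConnectMaster() and the two-argument Query() to the imagination. Here is one way they might be filled in, as a sketch only. It assumes the mysqli extension (the mysql_* functions were removed in PHP 7), and the class name ReplicatedDal, the $database parameter and the two counter properties are names made up for the example:

<?php

// Hypothetical read/write splitting DAL built on mysqli (an assumption)
class ReplicatedDal
{
    private $master_connection = null; // opened on the first Execute()
    private $slave_connection = null;  // opened on the first Select()
    private $master_host;
    private $slave_host;
    private $user;
    private $pass;
    private $database;

    public $master_count = 0; // queries sent to the master
    public $slave_count = 0;  // queries sent to the slave

    // Only remembers the settings; nothing connects here
    public function __construct($master_host, $slave_host, $user, $pass, $database)
    {
        $this->master_host = $master_host;
        $this->slave_host = $slave_host;
        $this->user = $user;
        $this->pass = $pass;
        $this->database = $database;
    }

    // Reads go to the slave
    public function Select($sql)
    {
        if(!$this->slave_connection)
        {
            $this->slave_connection = $this->Connect($this->slave_host);
        }
        $this->slave_count++;

        return $this->slave_connection->query($sql);
    }

    // Everything else (INSERT, UPDATE, DELETE, ...) goes to the master
    public function Execute($sql)
    {
        if(!$this->master_connection)
        {
            $this->master_connection = $this->Connect($this->master_host);
        }
        $this->master_count++;

        return $this->master_connection->query($sql);
    }

    // Open a mysqli connection to the given host with the shared login
    private function Connect($host)
    {
        return new mysqli($host, $this->user, $this->pass, $this->database);
    }
}

?>

Because nothing connects in the constructor, a page that only calls Select() never opens a connection to the master at all.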

Conclusion

Hopefully these insights into oversights with PHP and MySQL can help someone out there tackle their next project, big or small. If you have any questions, feel free to leave a comment and I’ll respond as soon as possible.

Justin is currently the Director of Development for the Deseret News. He is active in the Utah Open Source community. He is an advisory member of the Utah Open Source Foundation, and helps with the annual Utah Open Source Conference. He primarily focuses on PHP, MySQL, Redis, HTML, CSS, jQuery, and JavaScript. When he gets the time, he enjoys playing jazz piano.
