<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Justin Carmony &#187; PHP</title>
	<atom:link href="http://www.justincarmony.com/blog/tag/php/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.justincarmony.com/blog</link>
	<description>Web Designer &#38; Software Engineer</description>
	<lastBuildDate>Wed, 01 Feb 2012 04:30:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SMS Nagios Notifications with PHP &amp; Twilio</title>
		<link>http://www.justincarmony.com/blog/2012/01/30/sms-nagios-notifications-with-php-twilio/</link>
		<comments>http://www.justincarmony.com/blog/2012/01/30/sms-nagios-notifications-with-php-twilio/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 01:12:14 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Nagios]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[sms]]></category>
		<category><![CDATA[system administration]]></category>
		<category><![CDATA[twilio]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=1089</guid>
		<description><![CDATA[I&#8217;ve had a few requests to share my Nagios SMS notifications using Twilio. I&#8217;m almost embarrassed to share them, since they are so dead simple. There was another plugin out there to do the same, but it was a lot more advanced and more work to setup, so I wrote my own in PHP. In ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2010/05/04/setting-up-nagios-for-servers/' rel='bookmark' title='Setting up Nagios for Servers'>Setting up Nagios for Servers</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/10/11/presentation-real-life-scaling/' rel='bookmark' title='Presentation: Real Life Scaling'>Presentation: Real Life Scaling</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/02/24/bestbuy-gives-me-8770-wait-just-87-00/' rel='bookmark' title='BestBuy Gives Me $8,770 &#8212; Wait, Just $87.00'>BestBuy Gives Me $8,770 &#8212; Wait, Just $87.00</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had a few requests to share my <a href="http://www.nagios.org/">Nagios</a> SMS notifications using <a href="http://www.twilio.com/">Twilio</a>. I&#8217;m almost embarrassed to share them, since they are so dead simple. There was another plugin out there to do the same, but it was a lot more advanced and more work to setup, so I wrote my own in PHP.</p>
<p>In the past I would just use my iPhone&#8217;s email-to-txt email address. However, when I received the txt message, it wasn&#8217;t formatted very nicely, and each one would come from a different &#8220;From Number.&#8221; So if we had a crazy day, I would have 20-30 message threads in my iPhone all about Nagios. I&#8217;ve been lazy in the past, and didn&#8217;t clear them out until I had about 200 of them, and it was a lot of &#8220;swipe, delete, swipe, delete, swipe, delete&#8221; to get rid of them.</p>
<p>What I like about this setup is that with Twilio, I can buy a phone number for $1 a month, so all my notifications come through the same number. I&#8217;m also planning a way to call/text the number and have it read/send the current status and list the hosts that are down.</p>
<p>First thing is to get your Twilio account set up. It&#8217;s pretty easy, and when I signed up (a long time ago) I got $30 in credit; I&#8217;m not sure if they are still doing that for new customers. Either way you can add a credit card and say how much credit you want to add when your account dips below a certain amount. </p>
<p>Now, you can buy a number and set it up. Even if you&#8217;re only sending text messages, Twilio wants you to have an &#8220;SMS Request URL&#8221; before you can send any text messages. So I just put in a URL to my site that does nothing (yet). You&#8217;ll need to grab your Account SID and Token.</p>
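<p>If you want that placeholder URL to return something Twilio understands, an empty TwiML response should be enough. This is only a sketch: the function name is made up for illustration, and it assumes Twilio treats an empty &lt;Response&gt; element as &#8220;do nothing.&#8221;</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Hypothetical placeholder endpoint for the &quot;SMS Request URL&quot;.
// Assumption: an empty Response element is valid TwiML and means &quot;do nothing&quot;.

function emptyTwimlResponse()
{
    return '&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;' . &quot;\n&quot;
         . '&lt;Response&gt;&lt;/Response&gt;';
}

// Only send headers when running under a web server, not the CLI.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/xml');
}

echo emptyTwimlResponse();
</pre>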
<p><strong>How it Works</strong></p>
<p>I threw <a href="https://github.com/JustinCarmonyDotCom/Nagios-SMS-Requests-with-PHP-Twilio">the code up on GitHub</a>. There are a few things you&#8217;ll need to do:</p>
<p>First in sendTextMsg.php change your configurations to match your phone number, account SID and token. Here is the entire sendTextMsg.php file:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php

require('twilio-php/Services/Twilio.php');

/* Start Configs */

$sid = &quot;A123.....&quot;;
$token = &quot;29d6b9f.......&quot;;
$twilio_number = '4045550101';

/* End Configs */

# Get the Argvs
$phone  = $argv[1];
$msg    = $argv[2];
$msg    = str_replace('\n', &quot;\n&quot;, $msg);

$client = new Services_Twilio($sid, $token);
try
{
    $message = $client-&gt;account-&gt;sms_messages-&gt;create(
        $twilio_number, // From a valid Twilio number
        $phone, // Text this number
        $msg
    );
} catch (Exception $ex)
{
    var_dump($ex);
}
</pre>
<p>As you can see, it is super simple. All you need to do is put this script somewhere with the Twilio PHP libraries (or just use my github code with all of it in there). Then, in your Nagios configurations add the commands:</p>
<code class="code">define command{
    command_name    notify-host-by-txt
    command_line    /usr/bin/php /path/to/sendTxtMsg/sendTxtMsg.php &quot;$CONTACTPAGER$&quot; &quot;Nagios Alert\nType: $NOTIFICATIONTYPE$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $HOSTSTATE$\nWhen: $LONGDATETIME$&quot;
}

define command{
    command_name    notify-service-by-txt
    command_line    /usr/bin/php /path/to/sendTxtMsg/sendTxtMsg.php &quot;$CONTACTPAGER$&quot; &quot;Nagios Alert\nType: $NOTIFICATIONTYPE$\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\nWhen: $LONGDATETIME$&quot;
}</code>
<p>Then you need to add it to your contacts:</p>
<code class="code">define contact {
        contact_name                    justin
        alias                           Justin Carmony
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        
        # Add the new commands to your list
        service_notification_commands   notify-service-by-email,notify-service-by-txt
        host_notification_commands      notify-host-by-email,notify-host-by-txt
        email                           justin@example.com
        
        # Add a pager number, cause we rock it old school 
        # like that: http://static.howstuffworks.com/gif/restaurant-pager-motorola.jpg
        pager                           4045551234
}</code>
<p>Simple, easy, and works very well. You can test the script by simply executing the command:</p>
<code class="code">php /path/to/sendTxtMsg.php 4045551234 &quot;This is my test.\nI am awesome.\n&quot;</code>
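<p>Note that the test command passes the two characters backslash and n, not a real line break, and the Nagios command lines do the same. That is why the script runs str_replace() on the message. Here is that step in isolation, as a small sketch; expandNewlines() is a made-up helper name and web01 is a made-up host:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Hypothetical helper wrapping the str_replace() call from sendTextMsg.php.
function expandNewlines($msg)
{
    // Single quotes: the two literal characters backslash and n.
    // Double quotes: one real newline character.
    return str_replace('\n', &quot;\n&quot;, $msg);
}

$raw = 'Nagios Alert\nHost: web01';   // what the shell hands to PHP
$sms = expandNewlines($raw);

echo $sms, &quot;\n&quot;;
</pre>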
<p>Enjoy!</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2010/05/04/setting-up-nagios-for-servers/' rel='bookmark' title='Setting up Nagios for Servers'>Setting up Nagios for Servers</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/10/11/presentation-real-life-scaling/' rel='bookmark' title='Presentation: Real Life Scaling'>Presentation: Real Life Scaling</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/02/24/bestbuy-gives-me-8770-wait-just-87-00/' rel='bookmark' title='BestBuy Gives Me $8,770 &#8212; Wait, Just $87.00'>BestBuy Gives Me $8,770 &#8212; Wait, Just $87.00</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2012/01/30/sms-nagios-notifications-with-php-twilio/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PHP Workers with Redis &amp; Solo</title>
		<link>http://www.justincarmony.com/blog/2012/01/10/php-workers-with-redis-solo/</link>
		<comments>http://www.justincarmony.com/blog/2012/01/10/php-workers-with-redis-solo/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 18:20:10 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Videos]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[redis]]></category>
		<category><![CDATA[solo]]></category>
		<category><![CDATA[Tips and Tricks]]></category>
		<category><![CDATA[workers]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=1072</guid>
		<description><![CDATA[I&#8217;ve come across an awesome combination of tools for managing PHP Workers, and thought I&#8217;d share. Why Workers? Sometimes there are situations when you want to parallel process things. Other times you might have a list of tasks to accomplish, and you don&#8217;t want to make the user wait after pressing a button. This is ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/05/23/mysql-redis-and-a-billion-rows-a-love-story/' rel='bookmark' title='MySQL, Redis, and a Billion Rows &#8211; A Love Story'>MySQL, Redis, and a Billion Rows &#8211; A Love Story</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/07/creating-chatroom-walls-with-redis-and-php/' rel='bookmark' title='Creating Chatroom / Walls with Redis &amp; PHP'>Creating Chatroom / Walls with Redis &#038; PHP</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/10/debugging-with-php-stack-traces-and-redis/' rel='bookmark' title='Debuging with PHP, Stack Traces, and Redis'>Debuging with PHP, Stack Traces, and Redis</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve come across an awesome combination of tools for managing PHP Workers, and thought I&#8217;d share.</p>
<h3>Why Workers?</h3>
<p>Sometimes there are situations when you want to process things in parallel. Other times you might have a list of tasks to accomplish, and you don&#8217;t want to make the user wait after pressing a button. This is where &#8220;Workers&#8221; can come in. They are independent scripts that run alongside your application, performing tasks, or &#8220;jobs.&#8221; </p>
<p>An example is with Dating DNA and our score system. We generate scores between users to show how compatible they are with each other. When a user signs up, or makes a significant change to their profile questionnaire, we need to run a job to query our database, build a list of potential users, and generate scores. This takes 10-20 seconds, and while it is pretty fast, we don&#8217;t want to make the user wait for that. So we queue up a job for the user, divide up the work among several workers, and process the work.</p>
<h3>General Concept</h3>
<p>For this post, we&#8217;ll use the example of generating reports. Let&#8217;s say on your internal website there is a button that you can click and it will email the user a report, and the report takes 2-3 minutes to generate. When the button is clicked, your code will insert the job into the queue. Meanwhile, workers are monitoring the queue. A worker script will pull the job off the queue, process the report, and send the email when it&#8217;s done.</p>
<p>For the queue management, we&#8217;ll use Redis. To let PHP read and write data to Redis, we&#8217;ll use the PHP Library <a href="https://github.com/nrk/predis">predis</a>. In our examples we&#8217;ll use PHP 5.3, however predis has a PHP 5.2 backport if you are not running 5.3.</p>
<h3>Adding Jobs</h3>
<p>To add jobs, we&#8217;ll need to connect to our Redis server:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Connecting to Redis
 */

const REDIS_HOST = '127.0.0.1';
const REDIS_PORT = 6379;

$predis = new Predis\Client(array(
    'scheme' =&gt; 'tcp',
    'host'   =&gt; REDIS_HOST,
    'port'   =&gt; REDIS_PORT,
));
</pre>
<p>We&#8217;ll assume in all of our examples that we&#8217;ve run the code above &#038; connected to Redis. <span id="more-1072"></span></p>
<p>Now, to manage our queues we&#8217;ll use the Redis datatype LIST. What&#8217;s awesome about lists is that regardless of size, adding or removing at the start or end of a list is extremely fast. So if your queue has 10 items, or 10,000,000 items, Redis will be able to push and pop entries quickly.</p>
<p>We&#8217;ll have three queues, one for each priority: high, normal, and low. For the Redis key names, we&#8217;ll use queue.priority.high, queue.priority.normal, etc. When interacting with lists, you work with the ends, one called right, the other called left. So we&#8217;ll add items on the right with the RPUSH (Right Push) command, and we&#8217;ll pull items off the left with the BLPOP (Blocking Left Pop) command. We won&#8217;t worry about the pulling items just yet.</p>
<p>You store strings as the values for the list. My personal preference is to store JSON objects so you can easily pass variables needed to perform the job.</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Adding items to the queue
 */

$job = new stdClass();
$job-&gt;id = 1;
$job-&gt;report = 'general';
$job-&gt;email = 'test@example.com';

// Add the job to the high priority queue
$predis-&gt;rpush('queue.priority.high', json_encode($job));

// Or, you could add it to the normal or low priority queue.
$predis-&gt;rpush('queue.priority.normal', json_encode($job));
$predis-&gt;rpush('queue.priority.low', json_encode($job));
</pre>
<p>Simple enough! Having different queue priorities is very beneficial in managing which jobs should get done first. For example, you might have an Executive&#8217;s request go into the high priority queue so they get the report quickly. You might also have a weekly cron that queues up reports to be sent automatically, so those can go in the low priority as to not disrupt people trying to get a manual report.</p>
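<p>If you queue jobs from several places, it can help to wrap the key naming and the JSON encoding in small functions. This is only a sketch: queueKey() and encodeReportJob() are made-up names, and the job fields are the ones from the example above.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Hypothetical helper: map a priority name to the Redis key used above.
function queueKey($priority)
{
    $allowed = array('high', 'normal', 'low');
    if (!in_array($priority, $allowed, true)) {
        throw new InvalidArgumentException('Unknown priority: ' . $priority);
    }
    return 'queue.priority.' . $priority;
}

// Hypothetical helper: build the JSON payload for a report job.
function encodeReportJob($id, $report, $email)
{
    return json_encode(array(
        'id'     =&gt; $id,
        'report' =&gt; $report,
        'email'  =&gt; $email,
    ));
}

// With a connected client you would push it like this:
//   $predis-&gt;rpush(queueKey('low'), encodeReportJob(2, 'weekly', 'test@example.com'));
echo queueKey('low'), ' ', encodeReportJob(2, 'weekly', 'test@example.com'), &quot;\n&quot;;
</pre>
<p>Rejecting unknown priorities up front keeps a typo from quietly creating a queue that no worker is listening on.</p>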
<p>Now, on to the worker&#8217;s code.</p>
<h3>Processing Jobs</h3>
<p>For now, let&#8217;s say we have a script running in the PHP CLI (Command Line Interface) that you started by running this command on the server:</p>
<code class="code">php /path/to/worker.php</code>
<p>First thing is we want this worker to work continuously, so we can do a while loop:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Simple Continuous While Loop
 */

// Always True
while(1)
{
	/* ... perform tasks here ...  */
}
</pre>
<p>We&#8217;ll worry about making them more intelligent later. Now, let&#8217;s have our worker check the queue. You can do so with the BLPOP command:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Checking the Queue
 */
$job = $predis-&gt;blpop('queue.priority.high'
						, 'queue.priority.normal'
						, 'queue.priority.low'
						, 10);
</pre>
<p>What we&#8217;re telling PHP to do is to check each queue in order of priority: high, normal, and then low. If it finds an item, it will immediately return an array with the name of the queue it came from, and the string of data that was pulled.</p>
<p>The B in BLPOP is &#8220;blocking.&#8221; What that means is that Redis will wait until either an item enters one of the queues, or the timeout is reached. In this case, the timeout is 10 seconds. So instead of polling (checking every few seconds in a loop), we check and wait, and after 10 seconds it will return null and we can check again.</p>
<p>What this gives us is near instantaneous queues. As soon as something is available, it is passed to the workers that are listening. You can also have multiple workers, and it will pass jobs to the first listening worker, and the next job to the next worker, so you don&#8217;t have to worry about multiple workers getting the same queued item.</p>
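<p>To make the priority order concrete without a Redis server, here is a small in-memory stand-in. It only illustrates how the keys are checked in the order given; it does not block, and popFirstAvailable() is a made-up function, not part of predis.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// In-memory stand-in for Redis lists, used only to illustrate key order.
// Returns array(key, value) like BLPOP, or null if every list is empty.
function popFirstAvailable(array &amp;$queues, array $keys)
{
    foreach ($keys as $key) {
        if (!empty($queues[$key])) {
            return array($key, array_shift($queues[$key]));
        }
    }
    return null; // the real BLPOP would wait for the timeout here instead
}

$queues = array(
    'queue.priority.high'   =&gt; array(),
    'queue.priority.normal' =&gt; array('{&quot;id&quot;:2}'),
    'queue.priority.low'    =&gt; array('{&quot;id&quot;:3}'),
);
$order = array('queue.priority.high', 'queue.priority.normal', 'queue.priority.low');

$job = popFirstAvailable($queues, $order);
echo $job[0], &quot;\n&quot;; // queue.priority.normal
</pre>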
<p>After $predis->blpop() returns, if it has an array, it returned an item. If not, the timeout had been reached. We can check to see if a Job was returned, and if so to process the job:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Checking to see if a Job was returned
 */

if($job)
{
	// Index 0 of the array holds which queue was returned
	$queue_name = $job[0];
	// Index 1 of the array holds the string value of the job.
	// Since we are passing it JSON, we'll decode it:
	$details = json_decode($job[1]);

	/* ... do job work ... */
}
</pre>
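<p>Since anything can end up in a queue, it is worth checking the payload before doing the work. A minimal sketch, assuming jobs carry the report and email fields from the earlier example; parseJob() is a made-up name:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Hypothetical helper: turn a BLPOP result into a validated job object.
// Returns null when the payload is not a JSON object with the fields we need.
function parseJob(array $result)
{
    list($queue_name, $payload) = $result;
    $details = json_decode($payload);

    if (!is_object($details) || !isset($details-&gt;report, $details-&gt;email)) {
        return null; // malformed job: skip it (or log it) instead of crashing
    }

    $details-&gt;queue = $queue_name; // remember where the job came from
    return $details;
}

$good = parseJob(array('queue.priority.high', '{&quot;id&quot;:1,&quot;report&quot;:&quot;general&quot;,&quot;email&quot;:&quot;test@example.com&quot;}'));
$bad  = parseJob(array('queue.priority.low', 'not json'));

var_dump($good-&gt;report); // string(7) &quot;general&quot;
var_dump($bad);          // NULL
</pre>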
<p>Now we can have multiple workers listening to the same queues and scale our workload. Redis is very fast &#038; efficient, and you could have hundreds or even thousands of workers listening to a single redis server.</p>
<h3>Continuously Running Workers</h3>
<p>There are a lot of options when it comes to deploying these workers. You can use a framework like Gearman, but for simple things, I like very simple solutions. I came across a <a href="http://josephscott.org/archives/2011/09/solo/">blog post by Joseph Scott</a> about a little 10 line perl script called <a href="http://timkay.com/solo/solo">solo</a>. What it does is run a command, and to ensure that no one else is running that same exact command, it locks a configurable port. This is awesome because you don&#8217;t have to worry about lock files or filesystem tricks; the kernel handles it all. </p>
<p>So what you can do is create a cronjob using solo to execute your script. First copy solo somewhere, I put it in my /usr/local/bin on my linux server. Then add this to your cron job using the command &#8220;crontab -e -u (which user to use)&#8221;:</p>
<code class="code">* * * * * /usr/local/bin/solo -port=5001 php /path/to/worker.php</code>
<p>What this will do is try to run this command every minute. Solo will check to see if the port is already in use, and if it is, it will exit. Otherwise, it will lock the port and then execute the command. The port will stay locked as long as the command is executing. Once the command terminates, the port will unlock.</p>
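<p>If the port trick seems like magic, here is the same idea sketched in PHP. This is only an analogy to show the concept, not how solo is written (solo is Perl), and acquirePortLock() is a made-up name. It relies on a second bind to a port that is already being listened on failing, which is what I would expect on Linux.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Sketch of the port-as-lock idea. This is an analogy, not solo itself.
// Returns the listening socket on success, or false if the port is taken.
function acquirePortLock($port)
{
    // Binding fails if another process already holds the port.
    // The @ suppresses the warning so the caller can just test the result.
    return @stream_socket_server('tcp://127.0.0.1:' . $port, $errno, $errstr);
}

$lock = acquirePortLock(5001);
if ($lock === false) {
    echo &quot;Another instance holds port 5001, exiting.\n&quot;;
    exit(0);
}

echo &quot;Lock acquired, doing work...\n&quot;;
// Keep $lock in scope while working; the port is released when the script exits.
</pre>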
<p>Now, PHP is a great language, but it has been known to have some memory leaks while running a long time in a single instance. So we can have our scripts exit periodically to be restarted by our cron job. So let&#8217;s make our &#8220;while(1)&#8221; statement a little smarter:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * A Smarter While Statement
 */

// Set the time limit for php to 0 seconds
set_time_limit(0);

/*
 * We'll set our base time, which is one hour (in seconds).
 * Once we have our base time, we'll add anywhere between 0
 * to 10 minutes randomly, so all workers won't quit at the
 * same time.
 */
$time_limit = 60 * 60 * 1; // Minimum of 1 hour
$time_limit += rand(0, 60 * 10); // Adding additional time

// Set the start time
$start_time = time();

// Continue looping as long as we don't go past the time limit
while(time() &lt; $start_time + $time_limit)
{
	/* ... perform BLPOP command ... */
	/* ... process jobs when received ... */
}

/* ... will quit once the time limit has been reached ... */
</pre>
<p>One key thing to note is the random shift in the script&#8217;s time limit. I like to do this because you don&#8217;t want your workers all stopping and starting at the same time. So if I have 8 workers, one might exit, but the other 7 will continue working until the 8th is started back up by the cron job.</p>
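<p>The same time limit logic can be pulled into two small functions, which makes it easy to check without waiting an hour. This is a sketch; the function names and default values are mine, and they mirror the one hour plus up to 10 minutes used above.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Base lifetime plus random jitter, both in seconds.
function workerTimeLimit($base = 3600, $max_jitter = 600)
{
    return $base + mt_rand(0, $max_jitter);
}

// True while the worker should keep looping.
function shouldKeepRunning($start_time, $time_limit, $now)
{
    return $now &lt; $start_time + $time_limit;
}

$limit = workerTimeLimit();
echo 'This worker will run for ', $limit, &quot; seconds\n&quot;;
</pre>
<p>In the worker, the loop condition would then be shouldKeepRunning($start_time, $limit, time()).</p>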
<h3>Bells &#038; Whistles</h3>
<p>After using workers for a while, here are a couple of ideas to enhance your workers &#038; the system managing them. First off, you can add some monitoring for your queues. Using a Redis HASH, you can store the state of your workers. </p>
<pre class="brush: php; title: ; notranslate">
/*
 * Assigning Worker IDs &amp; Monitoring
 *
 * Usage: php worker.php 1
 */

// Gets the worker ID from the command line argument
$worker_id = $argv[1];

// Setting the Worker's Status
$predis-&gt;hset('worker.status', $worker_id, 'Started');

// Set the last time this worker checked in, use this to
// help determine when scripts die
$predis-&gt;hset('worker.status.last_time', $worker_id, time());
</pre>
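<p>A monitor script can then read that hash and flag workers that have gone quiet. Here is a rough sketch. It assumes each worker refreshes its last_time at the top of every loop (the snippet above only sets it once at startup), and the 120 second threshold and the function name are made up for illustration.</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php
// Given the contents of the worker.status.last_time hash (worker id =&gt; unix
// timestamp), return the ids that have not checked in within $max_age seconds.
function findStaleWorkers(array $last_times, $now, $max_age = 120)
{
    $stale = array();
    foreach ($last_times as $worker_id =&gt; $last_time) {
        if ($now - (int) $last_time &gt; $max_age) {
            $stale[] = $worker_id;
        }
    }
    return $stale;
}

// With a live connection the array would come from:
//   $last_times = $predis-&gt;hgetall('worker.status.last_time');
$last_times = array('1' =&gt; '1000', '2' =&gt; '1290', '3' =&gt; '700');

print_r(findStaleWorkers($last_times, 1300)); // workers 1 and 3
</pre>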
<p>Another problem with workers that run for a long time (several hours) is when you make a change to their code, they won&#8217;t reload that change until they exit. What I&#8217;ve found to successfully restart them is having a &#8220;version&#8221; number set in Redis that is checked at the end of every loop:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Using Versions to Check for Reloads
 */

$version = $predis-&gt;get('worker.version'); // i.e. number: 6

while(time() &lt; $start_time + $time_limit)
{
	/* ... check for jobs and process them ... */

	/* ... then, at the very end of the while ... */
	if($predis-&gt;get('worker.version') != $version)
	{
		echo &quot;New Version Detected... \n&quot;;
		echo &quot;Reloading... \n&quot;;
		exit();
	}
}
</pre>
<p>You would simply INCR (increment) worker.version and after finishing their last job, the worker would exit, and solo would start it up again.</p>
<p>You can also kill specific workers by having them check for their value in a hash:</p>
<pre class="brush: php; title: ; notranslate">
/*
 * Using Kill Switches to Check for Reloads
 */

while(time() &lt; $start_time + $time_limit)
{
	/* ... check for jobs and process them ... */

	/* ... then, at the very end of the while ... */
	// Check to see if a kill has been set.
	if($predis-&gt;hget('worker.kill', $worker_id))
	{
		// Make sure to unset the kill request before exiting, or
		// your worker will just keep restarting.
		$predis-&gt;hdel('worker.kill', $worker_id);

		echo &quot;Kill Request Detected... \n&quot;;
		echo &quot;Reloading... \n&quot;;
		exit();
	}
}
</pre>
<h3>Tweak to Solo &#038; Logging </h3>
<p>I made one small tweak in my version of solo, and that was to help it enable logging. Lets say I had three workers in my crontab:</p>
<code class="code"># crontab for user to run workers
* * * * * /usr/local/bin/solo -port=5001 php /path/to/worker.php 1 &gt;&gt; /tmp/worker.log.1
* * * * * /usr/local/bin/solo -port=5002 php /path/to/worker.php 2 &gt;&gt; /tmp/worker.log.2
* * * * * /usr/local/bin/solo -port=5003 php /path/to/worker.php 3 &gt;&gt; /tmp/worker.log.3</code>
<p>The &#8220;>> /tmp/worker.log.1&#8221; part appends the output to a tmp file that I can tail to monitor each worker&#8217;s progress. This is great for debugging problems. However, when I did this, only solo&#8217;s own output ended up in the tmp file, and not the output from my script. To overcome this I changed the last line of solo:</p>
<pre class="brush: perl; title: ; notranslate">
# old
exec @ARGV;
# new
exec &quot;@ARGV&quot;;
</pre>
<p>This would ensure my script wrote out to the tmp file, and not just solo.</p>
<h3>Examples</h3>
<p>I&#8217;ve created an <a href="https://github.com/JustinCarmony/PHP-Workers-with-Redis-Solo-Examples">example on GitHub</a> that you can clone on your own machine. All you will need is PHP 5.3 and Redis installed.</p>
<p>To install redis, simply run these commands on your unix based system:</p>
<code class="code">wget http://redis.googlecode.com/files/redis-2.4.5.tar.gz
tar -xzvf redis-2.4.5.tar.gz
cd redis-2.4.5
make
make install</code>
<p>It will copy the redis binaries to /usr/local/bin.</p>
<p>To get a copy of the code, you can <a href="https://github.com/JustinCarmony/PHP-Workers-with-Redis-Solo-Examples/zipball/master">download them here</a>. <strong>HOWEVER, it doesn&#8217;t include predis! You&#8217;ll have to download and copy predis inside there via this link.</strong> It is much easier to clone it as so:</p>
<code class="code">git clone git://github.com/JustinCarmony/PHP-Workers-with-Redis-Solo-Examples.git php_example/
cd php_example
git submodule init
git submodule update</code>
<p>Then, using different terminal windows (or using screen), you can run different worker.php instances, use creator.php to insert jobs, and monitor.php to watch the progress. This is all done from the command line.</p>
<p>If you&#8217;re using Windows, I suggest installing a VM of Ubuntu and using that. If you really want to use Redis on Windows, there are some Windows binaries you can google and download. Good luck!</p>
<p>Here is a video where I demo the example:</p>
<p><iframe width="640" height="480" src="http://www.youtube.com/embed/jhgGhBgY14U?hd=1" frameborder="0" allowfullscreen></iframe></p>
<p>(sorry for the poor mic quality)</p>
<h3>Final Thoughts</h3>
<p>I&#8217;ll post here shortly about how to run Redis in production with the init.d scripts and configuration files. One caveat to using solo is if your server has an application that randomly selects ports to use (e.g. VoIP, FTP), it might select one of your worker&#8217;s ports. But on a production server, you should have a good feel for which ports are available for locking.</p>
<p>If you want to learn more about Redis, <a href="http://redis.io/">check out their website</a>.  </p>
<p>Hopefully this will be helpful for anyone looking to use PHP Workers in an easy, simple way.</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/05/23/mysql-redis-and-a-billion-rows-a-love-story/' rel='bookmark' title='MySQL, Redis, and a Billion Rows &#8211; A Love Story'>MySQL, Redis, and a Billion Rows &#8211; A Love Story</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/07/creating-chatroom-walls-with-redis-and-php/' rel='bookmark' title='Creating Chatroom / Walls with Redis &amp; PHP'>Creating Chatroom / Walls with Redis &#038; PHP</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/10/debugging-with-php-stack-traces-and-redis/' rel='bookmark' title='Debuging with PHP, Stack Traces, and Redis'>Debuging with PHP, Stack Traces, and Redis</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2012/01/10/php-workers-with-redis-solo/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>My 2011 Recap</title>
		<link>http://www.justincarmony.com/blog/2011/12/30/my-2011-recap/</link>
		<comments>http://www.justincarmony.com/blog/2011/12/30/my-2011-recap/#comments</comments>
		<pubDate>Sat, 31 Dec 2011 02:00:20 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[Goals]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=1066</guid>
		<description><![CDATA[As I look back on 2011, its been a pretty fun year technology &#038; work wise. While I have a few more draft posts to finish up, I thought I&#8217;d hit some highlights from the past year. PHP Its been a fun year with the Utah PHP Usergroup. I look forward to it each month, ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2009/12/23/my-2009-technology-recap/' rel='bookmark' title='My 2009 Technology Recap'>My 2009 Technology Recap</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/05/14/attending-php-tek-2011/' rel='bookmark' title='Attending PHP Tek 2011'>Attending PHP Tek 2011</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/12/29/goal-for-2011-learn-c/' rel='bookmark' title='Goal for 2011: Learn C'>Goal for 2011: Learn C</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>As I look back on 2011, it&#8217;s been a pretty fun year technology &#038; work wise. While I have a few more draft posts to finish up, I thought I&#8217;d hit some highlights from the past year.</p>
<h3>PHP</h3>
<p>It&#8217;s been a fun year with the <a href="http://uphpu.org/">Utah PHP Usergroup</a>. I look forward to it each month, and if you&#8217;re in Utah, you definitely should come. We&#8217;ve had a lot of interesting talks, such as running your app in different clouds, several MySQL forks, and NoSQL stuff. </p>
<p>I also had a great time going to my first PHP specific conference. While I&#8217;ve attended and spoken at other conferences, Tek 11 was a lot of fun. Met a lot of smart and fun people.</p>
<p>As for work, I do almost exclusively PHP work. We power the Dating DNA website and APIs using PHP. All of the Clipish Apps&#8217; APIs are powered by PHP. Alienware Arena, The GeForce StarCraft II Pro/Am Tournament, Thermaltake eSports, and a few other gaming tournaments sites, all powered by PHP. </p>
<p>I&#8217;m hoping to blog more about PHP this year. While I use it all over the place, it seems I write more about all the other pieces that interface with and support PHP.</p>
<h3>New Technologies</h3>
<p>There have been a handful of new technologies gaining traction. We&#8217;ve been using Redis in production for quite a while now, and finding more and more uses for it. I&#8217;ve played around quite a bit with node.js, and I see some potential for augmenting PHP with it. MongoDB, CouchDB, and a few other data stores have shown a lot of potential in situations that play to their strengths. I&#8217;m still a major fan of MySQL, but it&#8217;s nice to have good options for things that MySQL isn&#8217;t as strong at.</p>
<h3>New Goals</h3>
<p>My wife and I have been working out and eating well for almost 8 weeks now. The net effect is I&#8217;ve lost 16 pounds, and my wife has lost quite a bit too. This was even done over the holidays. We still have some ways to go, but we&#8217;re hoping to hit our target goals in 2012. We&#8217;re even starting to enjoy lifting, cardio, and eating right.</p>
<p>I&#8217;m also setting the goal to blog once a week this year. I&#8217;ve tried daily, and that is just too much, but once a week is manageable. Hopefully the posts will relate to the technologies I use, to help share the cool stuff we get to use in-house.</p>
<h3>Upcoming in 2012</h3>
<p>We&#8217;re launching a new product in 2012 for Dating DNA, hopefully within a month or so. I&#8217;m also hoping to go to a few conferences this year, and perhaps speak at a few of them. Also, on the fun side of things, we&#8217;ll be going on a cruise in April, and perhaps a trip to China in September. </p>
<p>I hope everyone had an excellent 2011, and will have an amazing 2012!</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2009/12/23/my-2009-technology-recap/' rel='bookmark' title='My 2009 Technology Recap'>My 2009 Technology Recap</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/05/14/attending-php-tek-2011/' rel='bookmark' title='Attending PHP Tek 2011'>Attending PHP Tek 2011</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/12/29/goal-for-2011-learn-c/' rel='bookmark' title='Goal for 2011: Learn C'>Goal for 2011: Learn C</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/12/30/my-2011-recap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Setting Up Nginx &amp; PHP-FPM on Ubuntu 10.04</title>
		<link>http://www.justincarmony.com/blog/2011/10/24/setting-up-nginx-php-fpm-on-ubuntu-10-04/</link>
		<comments>http://www.justincarmony.com/blog/2011/10/24/setting-up-nginx-php-fpm-on-ubuntu-10-04/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 03:50:59 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[nginx]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[system administration]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[web servers]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=1020</guid>
		<description><![CDATA[This is another wonderful setup that I&#8217;ve found myself using rather than the traditional Apache &#038; mod_php setup. What is Nginx? Nginx (pronounced engine-x) is a fast, powerful, lightweight web server. I won&#8217;t go into the theory under-the-hood, but it&#8217;s focus is high concurrency with low memory usage. So while Apache is more robust in ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/10/24/setting-up-percona-server-5-5-on-ubuntu-10-04/' rel='bookmark' title='Setting Up Percona Server 5.5 on Ubuntu 10.04'>Setting Up Percona Server 5.5 on Ubuntu 10.04</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/09/13/preparing-a-vmware-ubuntu-guest-os/' rel='bookmark' title='Preparing a VMWare Ubuntu Guest OS'>Preparing a VMWare Ubuntu Guest OS</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/24/php-nginx-and-output-flushing/' rel='bookmark' title='PHP, Nginx, and Output Flushing'>PHP, Nginx, and Output Flushing</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>This is another wonderful setup that I&#8217;ve found myself using rather than the traditional Apache &#038; mod_php setup.</p>
<h2>What is Nginx?</h2>
<p>Nginx (pronounced engine-x) is a fast, powerful, lightweight web server. I won&#8217;t go into the theory under the hood, but its focus is high concurrency with low memory usage. So while Apache is more robust in supporting many different features, Nginx focuses on handling the important features very quickly. I still use Apache internally for our SVN &#038; Trac web server. Heck, even at this time I&#8217;m using Apache to host this blog. However, Dating DNA, Clipish, CEVO, Alienware Arena, and some other high-traffic sites and APIs use Nginx.</p>
<p>Nginx, unlike Apache, doesn&#8217;t actually load PHP. Instead, it acts as a proxy and hands PHP requests off to a &#8220;PHP handler,&#8221; which acts like an application server. So Nginx by itself won&#8217;t serve PHP files, just static files.</p>
<h2>What is PHP-FPM?</h2>
<p>In the past, when working with something like Nginx or lighttpd, you would use spawn-fcgi to host your PHP application. However, spawn-fcgi had some major drawbacks and problems. So a guy named Andrei Nigmatulin created PHP-FPM, which stands for &#8220;PHP FastCGI Process Manager.&#8221; Since then, several others have contributed, and ultimately it was included in the PHP core in version 5.3.3.</p>
<p>From a high level, with the traditional mod_php setup every Apache worker process carries the entire PHP environment, even when it is only serving a static file. While that has been optimized as much as it can, it is a <strong>lot</strong> of overhead! PHP-FPM instead spins up a configurable number of child processes. Each child loads the PHP environment once and then serves as many requests as it can without having to reload it, while Nginx handles the static files on its own. This saves a lot of overhead!</p>
<h2>Why use Nginx &#038; PHP-FPM?</h2>
<p>I should note, it is possible to configure/compile Apache in such a way that it can have similar performance capabilities. However, it takes a <strong>ton of work</strong>. Meanwhile, Nginx &#038; PHP-FPM are very fast from the start, so I prefer just using them. You do lose some features: for example, .htaccess files won&#8217;t work, so you&#8217;ll have to move that configuration into your virtual hosts.<br />
<span id="more-1020"></span></p>
<h2>How to Set Up Nginx &#038; PHP-FPM</h2>
<p><strong>Nginx</strong></p>
<p>First off, let&#8217;s set up Nginx.</p>
<code class="code">sudo aptitude update
sudo aptitude install nginx
sudo /etc/init.d/nginx start</code>
<p>That&#8217;s it! If you go to your server&#8217;s IP address or domain name, you should see a &#8220;Welcome to Nginx!&#8221; page.</p>
<p><strong>PHP-FPM</strong></p>
<p>Because PHP-FPM is only included by default in PHP 5.3.3 and later, and Ubuntu 10.04 LTS only has PHP 5.3.2, we have two options. Either we can install from source, or we can add another repository to install PHP-FPM. The latter is much, much easier, and there is a good PHP-FPM repo for Ubuntu 10.04. To add it, you just run the following commands:</p>
<code class="code">sudo aptitude install python-software-properties
sudo add-apt-repository ppa:brianmercer/php
sudo aptitude update</code>
<p>Now that we have the new repository, we can install PHP5:</p>
<code class="code">sudo aptitude install php5-cli php5-common php5-mysql php5-suhosin php5-gd php5-dev
sudo aptitude install php5-fpm php5-cgi php-pear php5-memcache php-apc
sudo /etc/init.d/php5-fpm restart</code>
<p>Excellent! Now, if you need to change some of PHP-FPM&#8217;s configuration, the files are found in /etc/php5/fpm/. The file php5-fpm.conf configures how FPM will operate, and php.ini is the settings file that PHP will use while running in FPM.</p>
<p>A few settings I like to change in /etc/php5/fpm/php5-fpm.conf:</p>
<code class="code">pm.max_children = 20</code>
<p>The stock php5-fpm.conf documents the different settings pretty well. Once you make a change, make sure to restart php5-fpm: sudo /etc/init.d/php5-fpm restart</p>
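<p>How high should pm.max_children go? A common rule of thumb (an assumption on my part, not something taken from the php5-fpm.conf comments) is to divide the memory you can spare for PHP by the average size of one PHP-FPM child. Here is a minimal sketch of that arithmetic; the function name and the memory figures are hypothetical, so measure your own processes before trusting the result:</p>

```php
<?php
// Rough sizing sketch for pm.max_children. The function name and all
// numbers below are hypothetical; measure your own processes first.
function estimate_max_children($totalMb, $reservedMb, $avgChildMb)
{
    if ($avgChildMb <= 0) {
        throw new InvalidArgumentException('Average child size must be positive.');
    }
    // Memory left for PHP after the OS, Nginx, and anything else on the box.
    $availableMb = max(0, $totalMb - $reservedMb);
    // Round down so a full pool can never exceed the available memory.
    return (int) floor($availableMb / $avgChildMb);
}

// Example: 1024 MB server, 384 MB reserved, ~32 MB per PHP-FPM child.
echo estimate_max_children(1024, 384, 32), "\n"; // prints 20
```

<p>Treat the number as a starting point only. If the box also runs MySQL or memcached, reserve memory for those too.</p>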
<p><strong>Configuring Nginx</strong></p>
<p>Now, we have a few settings for Nginx. The configuration files are found in /etc/nginx/. First we&#8217;ll edit nginx.conf. Here are a few settings we&#8217;ll want to change:</p>
<code class="code">user www-data;
worker_processes  4; # 1 to 4, I normally put this to the number of cores

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
    multi_accept on; # uncomment this line
    use epoll; # Add this - use epoll for event notification on Linux
}

http {
    include       /etc/nginx/mime.types;

    access_log  /var/log/nginx/access.log;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;
    tcp_nodelay        on;

    gzip  on;
    gzip_disable &quot;MSIE [1-6]\.(?!.*SV1)&quot;;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}</code>
<p>Now, we need to add a virtual host! On Ubuntu, Nginx uses the same layout as Apache, so we&#8217;ll add a configuration for each site we want under /etc/nginx/sites-available/. Using vi, nano, or whichever editor you prefer, create a /etc/nginx/sites-available/www.example.com file:</p>
<code class="code"># rewrite from example.com to www.example.com
server { 
	listen 80;
	server_name example.com;
	rewrite ^(.*) http://www.example.com$1 permanent;
}

server {
    listen   80;
    server_name www.example.com;
    access_log /var/log/nginx/www.example.com.access.log;
    error_log /var/log/nginx/www.example.com.error.log;

	client_max_body_size 4M;
	client_body_buffer_size 128k;
	expires 24h;
 
    location / {
        root   /var/www/example.com/;
        index index.html index.php;
		
        # if file exists return it right away
        if (-f $request_filename) {
                break;
        }

        if (-e $request_filename)
        {
                break;
        }

        # Useful rewrite for most frameworks, wordpress
        if (!-e $request_filename) {
                rewrite ^(.+)$ /index.php last;
                break;
        }

    }

    location /nginx_status {
      # copied from http://blog.kovyrin.net/2006/04/29/monitoring-nginx-with-rrdtool/
      stub_status on;
      access_log   off;
      allow 127.0.0.1;
      deny all;
    }

    location ~ \.php$ {
        expires off;
        include /etc/nginx/fastcgi_params;
        fastcgi_pass  127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param  SCRIPT_FILENAME  /var/www/example.com$fastcgi_script_name;
    }
}</code>
<p>Now, we need to create the symlink from sites-enabled to sites-available:</p>
<code class="code">sudo ln -s /etc/nginx/sites-available/www.example.com /etc/nginx/sites-enabled/www.example.com</code>
<p>Restart nginx with &#8220;/etc/init.d/nginx restart&#8221;. Go ahead and put a test.php file in your directory with a Hello World example, and see if it works. It should work, and you should be good to go.</p>
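<p>For the test.php file, something like the following works as a minimal sketch. The hello_message() helper is just a name I made up for the example. php_sapi_name() should report fpm-fcgi when the request came through PHP-FPM, which is a quick way to confirm that Nginx is handing PHP off correctly:</p>

```php
<?php
// test.php - minimal check that Nginx is passing requests to PHP-FPM.
// hello_message() is a hypothetical helper name used only for this example.
function hello_message()
{
    // php_sapi_name() returns the interface PHP is running under,
    // e.g. "fpm-fcgi" behind PHP-FPM or "cli" on the command line.
    return 'Hello World from PHP ' . PHP_VERSION . ' via ' . php_sapi_name();
}

echo hello_message(), "\n";
```

<p>Save it as /var/www/example.com/test.php, load it in your browser, and delete it once you have seen it work.</p>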


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/10/24/setting-up-percona-server-5-5-on-ubuntu-10-04/' rel='bookmark' title='Setting Up Percona Server 5.5 on Ubuntu 10.04'>Setting Up Percona Server 5.5 on Ubuntu 10.04</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/09/13/preparing-a-vmware-ubuntu-guest-os/' rel='bookmark' title='Preparing a VMWare Ubuntu Guest OS'>Preparing a VMWare Ubuntu Guest OS</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/01/24/php-nginx-and-output-flushing/' rel='bookmark' title='PHP, Nginx, and Output Flushing'>PHP, Nginx, and Output Flushing</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/10/24/setting-up-nginx-php-fpm-on-ubuntu-10-04/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Preparing a VMWare Ubuntu Guest OS</title>
		<link>http://www.justincarmony.com/blog/2011/09/13/preparing-a-vmware-ubuntu-guest-os/</link>
		<comments>http://www.justincarmony.com/blog/2011/09/13/preparing-a-vmware-ubuntu-guest-os/#comments</comments>
		<pubDate>Tue, 13 Sep 2011 07:11:09 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[development environment]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[vmware]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=1006</guid>
		<description><![CDATA[I forget these steps all the time, so I figured I should record them here. A lot of times VMWare will auto-install the VMWare tools for you when you first set up your VM. However, many times after setting it up I&#8217;ll do updates, and updates to the kernel will be applied, knocking out the VMWare ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/10/10/ubuntu-desktop-terminal-su/' rel='bookmark' title='Ubuntu Desktop Terminal &#8211; Su'>Ubuntu Desktop Terminal &#8211; Su</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/03/30/sending-email-from-non-email-ubuntu-server/' rel='bookmark' title='Sending Email from non-email Ubuntu Server'>Sending Email from non-email Ubuntu Server</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/19/my-honest-attempt-with-linux-desktop/' rel='bookmark' title='My Honest Attempt With Linux Desktop'>My Honest Attempt With Linux Desktop</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I forget these steps all the time, so I figured I should record them here.</p>
<p>A lot of times VMWare will auto-install the VMWare Tools for you when you first set up your VM. However, after setting it up I&#8217;ll often run updates, and any kernel update that gets applied knocks out the VMWare Tools changes. These tools are used for things like file sharing between the guest and host.</p>
<p>So here are the steps to take to fully update and re-install the VMWare Tools. I got most of these from an <a href="https://help.ubuntu.com/community/VMware/Tools">article on Ubuntu&#8217;s website</a>. <span id="more-1006"></span></p>
<p>First, <code>sudo aptitude update</code> to make sure all my repository information is up-to-date.</p>
<p>Second, <code>sudo aptitude safe-upgrade</code> to get any and all updates for my newly installed OS.</p>
<p>Third, <code>sudo apt-get install build-essential linux-headers-`uname -r` psmisc</code> to install the build tools and the Linux headers for my running kernel.</p>
<p>Fourth, copy and install the VMWare Tools:</p>
<code class="code"># Make a mount point if needed
sudo mkdir /media/cdrom

# Mount the CD
sudo mount /dev/cdrom /media/cdrom

# Make a dir for the VMWare Tools files
mkdir ~/vmtools

# Copy and extract VMWareTools
sudo cp /media/cdrom/VMwareTools*.tar.gz ~/vmtools

# You can extract with archive manager, right click on the archive and extract ... or
cd ~/vmtools
tar xvf VMwareTools*.tar.gz

# Install the tools
cd vmware-tools-distrib
sudo ./vmware-install.pl</code>
<p>Just follow the prompts and hit &lt;Enter&gt; to accept all the default values.</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/10/10/ubuntu-desktop-terminal-su/' rel='bookmark' title='Ubuntu Desktop Terminal &#8211; Su'>Ubuntu Desktop Terminal &#8211; Su</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/03/30/sending-email-from-non-email-ubuntu-server/' rel='bookmark' title='Sending Email from non-email Ubuntu Server'>Sending Email from non-email Ubuntu Server</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/19/my-honest-attempt-with-linux-desktop/' rel='bookmark' title='My Honest Attempt With Linux Desktop'>My Honest Attempt With Linux Desktop</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/09/13/preparing-a-vmware-ubuntu-guest-os/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Working with Middle-Scale Websites</title>
		<link>http://www.justincarmony.com/blog/2011/07/18/working-with-middle-scale-websites/</link>
		<comments>http://www.justincarmony.com/blog/2011/07/18/working-with-middle-scale-websites/#comments</comments>
		<pubDate>Tue, 19 Jul 2011 00:02:40 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[middle-scale]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[scaling]]></category>
		<category><![CDATA[system administration]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=940</guid>
		<description><![CDATA[I&#8217;ve been thinking about this idea for a while, and I thought I would put a name to the thought. I brought up this idea while I was giving my &#8220;Real Life Scaling&#8221; presentation at the Utah Open Source Conference in 2009. Here is the problem I think most individuals in web development face: Hopefully ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2009/04/18/data-backups-there-are-no-excuses/' rel='bookmark' title='Data Backups &#8211; There Are No Excuses'>Data Backups &#8211; There Are No Excuses</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/16/speaking-utah-open-source-conference-2009/' rel='bookmark' title='Speaking: Utah Open Source Conference 2009'>Speaking: Utah Open Source Conference 2009</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/10/11/presentation-real-life-scaling/' rel='bookmark' title='Presentation: Real Life Scaling'>Presentation: Real Life Scaling</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been thinking about this idea for a while, and I thought I would put a name to the thought. I brought up this idea while I was giving my &#8220;Real Life Scaling&#8221; presentation at the Utah Open Source Conference in 2009. Here is the problem I think most individuals in web development face:</p>
<p>Hopefully at some point, your website gets a lot of traffic. Yay, you&#8217;ve reached your goal of getting good traffic, but it is soon followed by issues with performance and load. I like to call these the growing pains of a website. So as a web developer, I suddenly have the epiphany of &#8220;Hey, I need to scale my website!&#8221; What follows next is the biggest mistake a web developer can make:</p>
<p>They start looking at articles on how Google scales, or maybe how Facebook manages all of their traffic.</p>
<p><strong>This is a mistake!</strong> To be brutally honest, you are <strong>not</strong> Google. You are not Facebook. You are not Twitter. You are a website that receives less than 0.000001% of the traffic that some major websites receive.</p>
<p>Why is this dangerous for web developers to do? Google, Twitter, Facebook, and others like them are solving complicated problems at a very large scale. I remember a presentation by a Twitter engineer who developed a unique ID generator that can generate millions of IDs per second. The probability of you needing this type of solution is about the same as being struck by lightning. Applying these same practices at a much smaller scale is not realistic. If a locally owned grocery store wanted to open a second store, it would not adopt the same practices that Wal-mart uses to manage its 8,970 stores.</p>
<h2>A Little Reality Check</h2>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/StackExchange.png"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/StackExchange.png" alt="" title="StackExchange" width="300" height="80" class="alignnone size-full wp-image-941" /></a></p>
<p>I&#8217;m sure that most of my readers know of <a href="http://stackexchange.com/">StackExchange.com</a>. They power the popular website <a href="http://stackoverflow.com/">StackOverflow</a> and <a href="http://stackexchange.com/sites">several others</a>. They have about two million visitors per day. That is a <em>lot</em> of traffic. StackOverflow is ranked #123 on Alexa. So you would imagine that they have a very large infrastructure serving all of this traffic?</p>
<p>Earlier this year, Stack Exchange wrote <a href="http://blog.serverfault.com/post/stack-exchanges-architecture-in-bullet-points/">an article about their production environment</a>. I was surprised by what exactly they were using. In particular, the number of Production Servers*:</p>
<blockquote><ul>
<li>12 Web Servers (Windows Server 2008 R2)</li>
<li>2 Database Servers (Windows Server 2008 R2 and SQL Server 2008 R2)</li>
<li>2 Load Balancers (Ubuntu Server and HAProxy)</li>
<li>2 Caching Servers (Redis on CentOS)</li>
<li>1 Router / Firewall (Ubuntu Server)</li>
<li>3 DNS Servers (Bind on CentOS)</li>
</ul>
</blockquote>
<p>That is 22 servers for 2 million visits per day, serving 800 HTTP requests per second. Now, StackExchange did clarify that they have other servers for management and failover, but 22 servers handle their production load. This is a website that is ranked the 123rd most visited website in the world.</p>
<p>Honestly, most websites could be run on half a dozen servers if designed and configured correctly, including redundancy. Some really busy websites could run off a dozen servers. Unless you&#8217;re in the top 5,000 websites on the web, you really shouldn&#8217;t be worried about large-scale techniques. </p>
<p>So when your website is starting to grow, and you leave small scale, you&#8217;ll enter the phase of &#8220;Middle-Scale.&#8221;</p>
<h2>What is Middle-Scale?</h2>
<p>Middle-Scale is like being an awkward teenager:</p>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/cera-awkward.jpeg"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/cera-awkward-300x200.jpg" alt="" title="cera-awkward" width="300" height="200" class="alignnone size-medium wp-image-942" /></a></p>
<p>You know that you can&#8217;t be the only one suffering through this, but you&#8217;re unsure how to proceed. It feels like you&#8217;re missing out on things everyone else must already know but isn&#8217;t talking about. Like everyone else is an awesome vampire or something:</p>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/twilight-cast.jpeg"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/twilight-cast-300x225.jpg" alt="" title="twilight-cast" width="300" height="225" class="alignnone size-medium wp-image-943" /></a></p>
<p>But the reality is this: they don&#8217;t have some awesome secret! They are just normal teenagers.</p>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/teenage-friends.jpeg"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/07/teenage-friends-300x200.jpg" alt="" title="teenage-friends" width="300" height="200" class="alignnone size-medium wp-image-944" /></a></p>
<p>This same idea applies to Middle-Scale websites.</p>
<p>Middle-Scale is when the <strong>most important things are <em>still</em> the best practices</strong>. Only now when you deviate from them you can feel those consequences. When you only had 100 users, a couple of nested queries and missing indexes didn&#8217;t cause that much of a problem. Your database is powerful enough to hide the inefficiencies. However, when you get to 10,000 users, your database can no longer hide the inefficiencies.</p>
<p>Middle-Scale is when simply separating your web server and database server isn&#8217;t enough. You&#8217;ll probably need to add some sort of cache like <a href="http://www.justincarmony.com/blog/2009/06/24/writing-effictive-php-caches-with-memcached/">memcached</a>. You&#8217;ll need to start tweaking your MySQL, Apache, and PHP configurations. </p>
<p>Then, after you&#8217;ve ironed out your inefficiencies, you&#8217;ll start to use multiple servers. You&#8217;ll probably add a Load Balancer with multiple web servers. After that, you&#8217;ll probably have some sort of Master-Slave replication for your Database for backups and fail-overs.</p>
<p>You start to leave this &#8220;Middle-Scale&#8221; classification when you move to multiple data centers and start to do some load balancing at the DNS layer. This is when you&#8217;ll start to have a dedicated sys-admin team.</p>
<h2>Okay, I&#8217;m Middle-Scale! So what should I do? Where do I look?</h2>
<p>First off, you <strong>must adhere to best practices.</strong> If you are working with PHP, research PHP performance and best practices. Do the same for each of your technologies, like Apache and MySQL. You will need to stop treating your application as one big app, and start to understand all of its moving parts.</p>
<p>Second, you <strong>must understand your specific problems.</strong> Scaling isn&#8217;t a problem, nor is it a solution. It is a generic term for many different types of solutions. Without understanding why your website is running slow, or why it cannot handle the load, you will not be able to create an effective solution.</p>
<p>So you don&#8217;t have a scaling problem. You have a MySQL performance issue, or an Apache problem, or a PHP problem. Most likely, it is something extremely specific. You have a high volume of MySQL write operations (i.e. UPDATE, INSERT, DELETE, REPLACE), or perhaps you are missing some indexes and have too many full table scans. </p>
<p>Third, Googling for help will only get you so far. You are starting to enter a phase when it is harder and harder to find answers to your broad issues. Talking with other experienced people who have gone through the Middle-Scale pains before will help immensely. <a href="http://www.justincarmony.com/blog/2009/11/27/my-php-user-group-experience/">I cannot recommend going to user groups highly enough</a>. Being able to communicate with someone, whether face to face, on the phone, or over IRC, is invaluable. While I&#8217;ve learned a lot at conference and user group presentations, I&#8217;ve learned even more by just talking with the people attending and at the social gatherings.</p>
<h2>Profiling &#038; Performance will Naturally Lead to Scaling</h2>
<p>When you want to scale, it can feel like a very daunting task. It seems like this big unknown complicated solution. What in the world am I going to do? I remember feeling these worries when I first started to investigate load balancing and sharding for some websites I was working on.</p>
<p>The thing is, if you start to profile your application, you will discover its inefficiencies. I remember when I spent a solid week, working 12-16 hours a day, profiling and optimizing Dating DNA&#8217;s database. I found a lot of bad queries, and I was able to cut our load times from 2-5 seconds to under 0.1 seconds. The database server went from 80-90% CPU utilization to under 10%. It was incredible, and then I promptly took the entire next week off. When we migrated to new servers, I was able to move to a less powerful database server and still have the same great performance. So by profiling and optimizing our database, I didn&#8217;t need to worry about spinning up multiple master databases and sharding our data.</p>
<p>With Clipish, we faced almost the opposite scaling problem. The database was rarely an issue, but our web server CPUs were. We do a lot of ImageMagick manipulations of images, and at high volumes on virtual servers this can be a big issue. So over the last year we&#8217;ve introduced some load balancing and CDN tools to help serve all 10 TB of bandwidth for Clipish.</p>
<p>The thing is, when you start to profile your application, you start to understand its weak areas better, so you have a much better idea of what to do. Even if you don&#8217;t know your solution, it is much easier to find one with a sound understanding of the problem. For example, Googling &#8220;scaling mysql&#8221; yields much less helpful results than &#8220;mysql full table scans.&#8221;</p>
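<p>Profiling does not have to start with fancy tools. A minimal sketch, assuming nothing more than PHP&#8217;s built-in microtime(): wrap the suspicious piece of code and see how long it takes. The time_call() helper and the fake slow query below are hypothetical stand-ins, not code from any of the sites mentioned here:</p>

```php
<?php
// Hypothetical helper: run a callable and report how long it took.
function time_call($callable)
{
    $start = microtime(true);            // wall-clock time in seconds (float)
    $result = call_user_func($callable);
    $elapsed = microtime(true) - $start;
    return array($result, $elapsed);
}

// Stand-in for a slow query; usleep() pauses for 50,000 microseconds.
$slowQuery = function () {
    usleep(50000);
    return 'rows';
};

list($rows, $seconds) = time_call($slowQuery);
printf("Query returned %s in %.3f seconds\n", $rows, $seconds);
```

<p>Once a handful of timings like this point at the slow spot, a real profiler such as Xdebug will give you far more detail.</p>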
<h2>So should I ignore what Facebook and Google do for scaling?</h2>
<p>Of course not! First off, they do cool stuff. Just because I&#8217;ll watch NASA launch a space shuttle doesn&#8217;t mean I&#8217;ll try to make a rocket system for my broken lawn mower. But you have to put what they are doing into context. People from large websites have published several good &#8220;best practices&#8221; articles on techniques that help any website. Especially things on the client/browser side of things. Just use caution. I cringe when I hear someone say &#8220;we&#8217;re trying to use Cassandra to solve XYZ problem at work&#8221; when it is severe overkill. </p>
<h2>Final Thoughts</h2>
<p>Most of the time when I talk about performance and scaling with other people, it is when they are in &#8220;critical mode.&#8221; Their website is down, slow, unusable, etc., and they are looking to fix the problem. I will say, it is much more difficult to profile in &#8220;critical mode&#8221; than to profile beforehand. The reason is that you are much more desperately focused on getting it working again than on understanding the problem. </p>
<p>I&#8217;ll be giving a presentation this Thursday at UPHPU on Profiling PHP Applications. I&#8217;ll post the slides, and most likely write some articles on the subject afterwards. As always, feel free to email me or leave a comment.</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2009/04/18/data-backups-there-are-no-excuses/' rel='bookmark' title='Data Backups &#8211; There Are No Excuses'>Data Backups &#8211; There Are No Excuses</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/16/speaking-utah-open-source-conference-2009/' rel='bookmark' title='Speaking: Utah Open Source Conference 2009'>Speaking: Utah Open Source Conference 2009</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/10/11/presentation-real-life-scaling/' rel='bookmark' title='Presentation: Real Life Scaling'>Presentation: Real Life Scaling</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/07/18/working-with-middle-scale-websites/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Node.js, Evented I/O, and a Cup of Tea</title>
		<link>http://www.justincarmony.com/blog/2011/06/02/node-js-event-driven-development-and-tea/</link>
		<comments>http://www.justincarmony.com/blog/2011/06/02/node-js-event-driven-development-and-tea/#comments</comments>
		<pubDate>Thu, 02 Jun 2011 16:30:42 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[event driven proramming]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[node.js]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=937</guid>
		<description><![CDATA[All Arthur Dent really wanted was a cup of tea. I was reading The Restaurant at the End of the Universe by Douglas Adams, and I found a humorous example of event driven programming. So I thought I would share. The Story Arthur Dent was one of the last surviving humans after Earth had been ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/04/29/node-js-lamp-and-the-future/' rel='bookmark' title='Node.JS, LAMP, and The Future'>Node.JS, LAMP, and The Future</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>All Arthur Dent really wanted was a cup of tea.</p>
<p>I was reading <a href="http://en.wikipedia.org/wiki/The_Restaurant_at_the_End_of_the_Universe">The Restaurant at the End of the Universe</a> by Douglas Adams, and I found a humorous example of event-driven programming. So I thought I would share.</p>
<h3>The Story</h3>
<p>Arthur Dent was one of the last surviving humans after Earth had been destroyed to make way for a new hyperspace bypass. He was living on a spaceship known as the &#8220;Heart of Gold,&#8221; which had on board an amazing piece of equipment called the &#8220;Nutri-Matic Drinks Synthesizer,&#8221; which &#8220;claimed to produce the widest possible range of drinks personally matched to the tastes and metabolism of whoever cared to use it.&#8221;</p>
<p>The problem was that &#8220;when put to the test, however, it invariably produced a plastic cup filled with a liquid that was almost, but not quite, entirely unlike tea.&#8221; So Arthur tried to reason with the machine with no success. He would ask for tea, and the machine would respond with a cheerful &#8220;Share and enjoy&#8221; and produce the same sickly liquid. </p>
<p>Finally, after all reason and attempts failed, he tried one last time:</p>
<blockquote><p>&#8220;No,&#8221; he said, &#8220;look, it&#8217;s very, very simple &#8230; all I want &#8230; is a cup of tea. You are going to make one for me. Keep quiet and listen.&#8221;</p>
<p>And he sat. He told the Nutri-Matic about India, he told it about China, he told it about Ceylon. He told it about broad leaves drying in the sun. He told it about silver teapots. He told it about summer afternoons on the lawn. He told it about putting in the milk before the tea so it wouldn&#8217;t get scalded. He even told it (briefly) about the history of the East India Company.</p>
<p>&#8220;So that&#8217;s it, is it?&#8221; said the Nutri-Matic when he had finished.</p>
<p>&#8220;Yes,&#8221; said Arthur, &#8220;that is what I want.&#8221;</p>
<p>&#8220;You want the taste of dried leaves boiled in water?&#8221;</p>
<p>&#8220;Er, yes. With milk.&#8221;</p>
<p>&#8220;Squirted out of a cow?&#8221;</p>
<p>&#8220;Well, in a manner of speaking I suppose &#8230;&#8221;</p>
<p>&#8220;I&#8217;m going to need some help with this one,&#8221; said the machine tersely. All the cheerful burbling had dropped out of its voice and it now meant business.</p>
<p>&#8220;Well, anything I can do,&#8221; said Arthur.</p>
<p>&#8220;You&#8217;ve done quite enough,&#8221; the Nutri-Matic informed him.</p></blockquote>
<p>So everything was grand? Well, not quite. There were two problems.</p>
<p>First, the Nutri-Matic needed help processing all this information. So, it asked the spaceship&#8217;s computer to help:</p>
<blockquote><p>[The Nutri-Matic] summoned up the ship&#8217;s computer.</p>
<p>&#8220;Hi there!&#8221; said the ship&#8217;s computer.</p>
<p>The Nutri-Matic explained about tea to the ship&#8217;s computer. The computer boggled, linked logic circuits with the Nutri-Matic and together they lapsed into a grim silence.</p>
<p>Arthur watched and waited for a while, but nothing further happened.</p>
<p>He thumped it, but still nothing happened.</p>
<p>Eventually he gave up and wandered up to the bridge.</p></blockquote>
<p>Now, having the Heart of Gold&#8217;s entire computer system help calculate how to generate a cow and grow a tree to produce leaves so Arthur could have his tea isn&#8217;t a terrible issue, except the second problem:</p>
<p>A Vogon ship approached and began to attack.</p>
<p>So when the rest of the crew rushed to the bridge, they were perplexed that their entire computer was jammed, and all of their controls, navigation, and weapons were unresponsive. However, the reason was soon discovered:</p>
<blockquote><p>&#8220;What have you done to it, Monkeyman?&#8221; he breathed.</p>
<p>&#8220;Well,&#8221; said Arthur, &#8220;nothing in fact. It&#8217;s just that I think a short while ago it was trying to work out how to &#8230;&#8221;</p>
<p>&#8220;Yes?&#8221;</p>
<p>&#8220;Make me some tea.&#8221;</p>
<p>&#8220;That&#8217;s right guys,&#8221; the computer sang out suddenly, &#8220;just coping with that problem right now, and wow, it&#8217;s a biggy. Be with you in a while.&#8221; It lapsed back into a silence that was only matched for sheer intensity by the silence of the three people staring at Arthur Dent.</p>
<p>As if to relieve the tension, the Vogons chose that moment to start firing.</p></blockquote>
<p>Don&#8217;t worry: fortunately, the occupants of the ship were able to hold a seance, summon the captain&#8217;s great-grandfather, and have his spirit save them. (I know, such an awesome book.)</p>
<p>Anyway, that is a long example, but it illustrates a point:</p>
<h3>Evented I/O</h3>
<p>Basically, Evented I/O is when your program does some sort of I/O (like querying a database, reading a file, etc.) and continues executing instead of waiting (or blocking) for the response. In our example above, the ship behaves the opposite way, like a blocking PHP script:</p>
<pre class="brush: php; title: ; notranslate">
&lt;?php

$ship = new HeartOfGold();
$nutrimatic = new NutriMatic();

// Blocks here until the query has produced a recipe.
$recipe = $ship-&gt;processQuery('How to make tea.');
$nutrimatic-&gt;makeDrink($recipe);

// Only reached once the tea query above has finished.
$enemies = $ship-&gt;detectEnemies();
if($enemies)
{
    $ship-&gt;evasiveManuvers();
}
</pre>
<p>The problem is the ship can&#8217;t detect enemies until it finishes processing Arthur&#8217;s request on how to make tea. In Node.js, you instead pass a callback function along with the request. It is a way to say &#8220;Hey, when this request is done, do this afterwards.&#8221;</p>
<pre class="brush: jscript; title: ; notranslate">
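var ship = new HeartOfGold();
var nutrimatic = new NutriMatic();

// Starts the query and returns right away; the callback runs once the recipe is ready.
ship.processQuery('How to make tea.', function(recipe){
   nutrimatic.makeDrink(recipe);
});

// Runs immediately, without waiting for the tea query to finish.
var enemies = ship.detectEnemies();
if(enemies.length &gt; 0)
{
    ship.evasiveManuvers();
}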
</pre>
<p>Now the function processQuery is non-blocking, meaning it will start the process of learning how to make tea and create a recipe, but execution continues without waiting for the result. That way, if the ship detects enemies, it can perform evasiveManuvers().</p>
<p>In a real life example, processQuery could be querying a database with a very slow query. In PHP, you can&#8217;t continue executing your code until that result is returned. However, in Node.js, if your SQL library takes advantage of Node.js&#8217;s evented abilities, you can send off a query and continue executing, and when the query returns you can do additional logic through your callback.</p>
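<p>To make the ordering concrete, here is a minimal runnable sketch. It assumes a hypothetical slowQuery function that uses setTimeout to stand in for a slow database call; the function name and the SQL string are made up for illustration and do not come from any real SQL library.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical stand-in for a slow database query: it hands the result to
// the callback roughly 50 ms later instead of blocking the caller.
function slowQuery(sql, callback) {
    setTimeout(function () {
        callback('result of: ' + sql);
    }, 50);
}

var log = [];

log.push('query sent');
slowQuery('SELECT recipe FROM drinks', function (result) {
    // Runs later, once the simulated query has finished.
    log.push('callback got ' + result);
    console.log(log.join(' | '));
});

// Runs right away, while the query is still pending.
log.push('scanning for enemies');
</pre>
<p>The callback&#8217;s entry lands in the log last, even though the query was sent first.</p>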
<p>Hopefully that makes sense. I like finding programming analogies in books, especially in Douglas Adams&#8217;s Hitchhiker&#8217;s Guide to the Galaxy series, since they are so fun to read.</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2011/04/29/node-js-lamp-and-the-future/' rel='bookmark' title='Node.JS, LAMP, and The Future'>Node.JS, LAMP, and The Future</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/06/02/node-js-event-driven-development-and-tea/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>MySQL, Redis, and a Billion Rows &#8211; A Love Story</title>
		<link>http://www.justincarmony.com/blog/2011/05/23/mysql-redis-and-a-billion-rows-a-love-story/</link>
		<comments>http://www.justincarmony.com/blog/2011/05/23/mysql-redis-and-a-billion-rows-a-love-story/#comments</comments>
		<pubDate>Mon, 23 May 2011 15:20:35 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Dating DNA]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[redis]]></category>
		<category><![CDATA[scaling]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=923</guid>
		<description><![CDATA[This last week we pushed live a very large architecture change for Dating DNA. For those who know me, and have heard me talk about the Dating DNA Scoring System, they know how big of a problem we faced. For those who don&#8217;t know, let me give some background: The Problem With Dating DNA, our ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/07/01/mysql-php-sql_calc_found_rows-an-easy-way-to-get-the-total-number-of-rows-regardless-of-limit/' rel='bookmark' title='MySQL &amp; PHP  – SQL_CALC_FOUND_ROWS – An easy way to get the total number of rows regardless of LIMIT'>MySQL &#038; PHP  – SQL_CALC_FOUND_ROWS – An easy way to get the total number of rows regardless of LIMIT</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/12/mysql-40-million-rows-myisam-innodb/' rel='bookmark' title='MySQL, 40 Million Rows, MyISAM to InnoDB, 45 Minutes'>MySQL, 40 Million Rows, MyISAM to InnoDB, 45 Minutes</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/04/06/restoring-large-mysql-dump-900-million-rows/' rel='bookmark' title='Restoring Large MySQL Dump &#8211; 900 Million Rows'>Restoring Large MySQL Dump &#8211; 900 Million Rows</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>This last week we pushed live a very large architecture change for <a href="http://www.datingdna.com/">Dating DNA</a>. Those who know me and have heard me talk about the Dating DNA Scoring System know how big a problem we faced. For those who don&#8217;t know, let me give some background:</p>
<h3>The Problem</h3>
<p>With Dating DNA, our goal was to display a compatibility score with <strong>every other user</strong>. This score is generated by taking two sets of answers to our 20-page survey and running them through our algorithm. While it is super convenient for our users, this poses a problem. We didn&#8217;t just want people to be able to visit a profile and see a score, which is easy to generate on demand. We wanted our users to be able to <strong>browse</strong> other profiles sorted <strong>by</strong> their score with them. This requires us to <strong>pre-generate</strong> and <strong>store</strong> these scores, and then later query them.</p>
<p>So ultimately, in theory, our &#8220;scores&#8221; data scaled at the following rate, with X as the number of registered users: </p>
<h3>X^2 &#8211; X</h3>
<p>That is quadratic growth, which is practically impossible to keep up with. The very first version of Dating DNA (before I took over the project) had about 1,500 users. The scores were stored in a single table. Every night, a &#8220;cron job&#8221; would run, get a list of every user, loop through every possible pairing, and re-generate each score. At 1,500 users that was 2,248,500 records. That is a <strong>lot</strong> for just 1,500 users. With our current user count, we would roughly have 359,999,400,000 score records. That&#8217;s <strong>359 Billion</strong> records if you don&#8217;t want to count the commas. </p>
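<p>As a quick check of the arithmetic, here is a small sketch. The function name scoreRecords is only an illustration, and the 600,000 figure is not stated anywhere in this post; it is simply the value of X that reproduces the 359,999,400,000 total.</p>
<pre class="brush: jscript; title: ; notranslate">
// Each of X users has a stored score with every other user: X * (X - 1) = X^2 - X.
function scoreRecords(users) {
    return users * users - users;
}

console.log(scoreRecords(1500));   // 2248500
console.log(scoreRecords(600000)); // 359999400000
</pre>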
<p>This old system of daily cron jobs broke at about 2,000 users. The cron job would take over 48 hours to complete, and we would end up with 3 scripts running at the same time: one for today, one for yesterday, and one for the day before that.</p>
<h3>Smart Logic &#038; Threading</h3>
<p>We solved our first problem by using some common sense and smart logic. I won&#8217;t detail the lengthy measures we go through, but we can basically boil down our entire user base to an estimated top 5,000 matches for any given user. If we have a heterosexual man named Joe, he doesn&#8217;t care about the hundreds of thousands of other heterosexual men he scores a 2 or 3 with; he cares about the heterosexual women he scores above a 6 with. So we don&#8217;t store Joe&#8217;s scores with Frank, Jimmy, and Alan, but we do store his scores with Sally, Rachael, and Tiffany. </p>
<p>The second part we solved was pre-generating scores for a user. Once a user has reached a point in the survey where we have all the information we need to generate scores, and they are just filling in some miscellaneous details, we put them in a queue. We then have a server process that runs continuously, checking this queue and spinning off multiple generation &#8220;threads&#8221; that crunch the data and store the scores. We&#8217;ve spent a lot of time perfecting this system. Currently we can typically generate any given user&#8217;s matches in roughly 5 to 20 seconds, depending on how busy our website is. </p>
<h3>Storing The Score in MySQL</h3>
<p>The problem we now faced was the write throughput of MySQL. Even with sharding and partitioning, we wanted to be able to sustain 1,000 registrations per minute in a scalable and high performance manner. At roughly 5,000 stored matches per user, that comes down to about 83,000 records per second that are either being inserted or updated. We then needed to be able to retrieve large volumes of scores just as fast.</p>
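<p>The 83,000 figure is consistent with the top-5,000-matches estimate from the previous section. A quick sketch, assuming every registration writes its full set of 5,000 scores (the variable names are only for illustration):</p>
<pre class="brush: jscript; title: ; notranslate">
// Assumption: every new registration writes its full set of 5,000 top matches.
var registrationsPerMinute = 1000;
var scoresPerUser = 5000;

var writesPerSecond = Math.round(registrationsPerMinute * scoresPerUser / 60);
console.log(writesPerSecond); // 83333
</pre>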
<p>I believe we could have bent MySQL to our will and got it to work, but it would be at a high cost of server power, and that cost wouldn&#8217;t scale well with our revenue stream. After we moved from the MySQL storage of the scores, I ran a query to see how many scores we were indeed storing. The final total was 950,363,992. Just 50 Million shy of one billion. It took 1 hour 49 min 38.27 sec to calculate that count. It is evidence that even though MySQL wasn&#8217;t the best choice for storing this data, it did it pretty well considering this single table was holding 90 times more data than any other table.</p>
<h3>Picking Another Solution</h3>
<p>In 2009 we started to throw around ideas for a new scoring system. When talking to others about &#8220;NoSQL&#8221; solutions, I cannot stress enough that the best solution for any given job depends on your data&#8217;s <strong>characteristics</strong>. User registration data needs to be treated differently than activity logging and basic stats. It might be okay to lose a few minutes of activity logging (depending on the app), but you definitely don&#8217;t want to be losing user accounts.</p>
<p>With Dating DNA&#8217;s scores, we had one great advantage. The data could be somewhat volatile, because we can always re-generate a set of matches for any given user. Of course, we didn&#8217;t want to lose <strong>all</strong> of it, because having to regenerate everyone&#8217;s scores is a major pain and extremely resource intensive. But if we lost a few minutes, anything lost could easily be regenerated. So when we started researching a solution, we were willing to sacrifice some persistence for performance. We wouldn&#8217;t be doing the same for our user registration data.</p>
<p>At first, I was contemplating building a completely in-house project to handle the data storage and retrieval of the scores. It would have been a lot of work, so I decided against it. I then thought about hacking together a custom solution with memcached. The idea was that a user&#8217;s set of matches would be stored in a variable in memcached. The website and generation scripts would interface with memcached, and a server process would write inactive sets of scores (people who weren&#8217;t logged in) to a file on the disk. When a user logged in and scores were being pulled and stored for that user, it would load the data from disk into memcached.</p>
<p>While the general concept was sound, the actual execution would be difficult. Memcached only supported strings for values, and we would still need some sort of database to track which users had their data in memory vs. on disk. The server process (probably just a PHP, Python, or Node.js script) would have to be running constantly, and if that broke things could get messy.</p>
<p>It boiled down to too many points of failure and too much complexity. But it was a step in the right direction, so we kept looking for a better solution.</p>
<h3>Redis, the Advanced Key-Store</h3>
<p>I was talking with <a href="http://josephscott.org/">Joseph Scott</a>, who works at <a href="http://automattic.com/">Automattic</a> as a Bug Exorcist (not joking, <a href="http://automattic.com/about/">his real title</a>), and he mentioned I should look into <a href="http://redis.io/">Redis</a>. He gave me a brief overview of what it was, and I filed that info away in the back of my brain. I can&#8217;t remember how much longer it was before I checked out and compiled a copy of Redis, but I quickly discovered it could be a viable storage system for our scores.</p>
<p>So I spun up a virtual machine, installed Redis, and started to pound away at it. One of the things I wanted to test was the (then new) Virtual Memory feature for Redis. This allowed Redis to manage its own Virtual Memory on the server and store the least recently used data on disk. When a Redis object was retrieved and it was in the VM, Redis would swap it back into memory and swap older data to disk. This was just like my earlier idea of a hacked-together memcached system, but much more elegant.</p>
<p>The second thing was Redis&#8217;s support for multiple data types. So instead of having a JSON-encoded string that held a user&#8217;s scores and the matching user ids, we could have a hash or even a sorted set. It was a much more elegant solution than what we were thinking of before.</p>
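<p>To illustrate why a sorted set fits this data, here is a plain-JavaScript stand-in. It does not talk to Redis at all; SortedScores, its methods, and the score values are made up for this sketch, and it only mimics the spirit of the Redis commands ZADD and ZREVRANGE.</p>
<pre class="brush: jscript; title: ; notranslate">
// Plain-JavaScript stand-in for a per-user sorted set: member = other user id,
// score = compatibility score. No Redis client is involved here.
function SortedScores() {
    this.scores = {};
}

// Comparable in spirit to ZADD: insert or update the score of one member.
SortedScores.prototype.add = function (userId, score) {
    this.scores[userId] = score;
};

// Comparable in spirit to ZREVRANGE: the top `count` members, highest score first.
SortedScores.prototype.top = function (count) {
    var scores = this.scores;
    return Object.keys(scores)
        .sort(function (a, b) { return scores[b] - scores[a]; })
        .slice(0, count);
};

var joe = new SortedScores();
joe.add('sally', 9.1);
joe.add('rachael', 7.4);
joe.add('tiffany', 8.2);
joe.add('rachael', 8.8); // updating one score does not touch the others

console.log(joe.top(2)); // [ 'sally', 'rachael' ]
</pre>
<p>With a single JSON-encoded string, that last update would mean decoding, changing, and re-encoding the whole set of matches.</p>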
<h3>Some Limitations to Redis</h3>
<p>However, there were a few limitations that we faced when implementing Redis. Redis works flawlessly with smaller sets of data. But the larger your data set, the more careful and aware you need to be about a few things that will kill your redis instance.</p>
<p>First off, with memcached, if you set a memory limit, it is a hard limit. I&#8217;ve never seen memcached use more memory than what you allow it to use. Redis, on the other hand, has soft memory limits. This is because of the way the Background Saves work so you can have persistent data. When a Background Save is issued, Redis will fork itself: the child process saves a snapshot to the disk while the parent process continues to operate. Memory that changes while the save is running has to be kept in both copies, so Redis will exceed the standard memory limits, and your memory usage will go up much quicker. Once the background save is complete, the forked child process exits and memory usage drops back to a single data set. (I&#8217;m not a computer science guy, nor do I know a lot of lower level programming, so my description might not be 100% accurate, but this is how I envision it in my head). </p>
<p>Now, if you are not using the Virtual Memory, this isn&#8217;t that bad. However, when using Virtual Memory, the Background Saves take a great deal longer (going from seconds to almost a minute or so). That alone isn&#8217;t too bad, but there is a catch: you will not be able to swap to the VM disk until after the BG Save. This means that all Redis objects swapped into memory will stay in memory until the BG Save is complete. This is because, like the memory from the fork, the Virtual Memory file is being used by the BG Save instead of the process handling requests.</p>
<p>So the one limitation we&#8217;ve encountered is we cannot run scripts that &#8220;query&#8221; large amounts of data from Redis. For example, it would be very simple to get a listing of users using the KEYS command, and then loop through the keys using HLEN to read the length of each. This will cause you to swap from and to the virtual memory a great deal. If a Background Save is occurring, Redis will not swap to disk until the BG Save is complete. This means if you have 10 GB of data in Virtual Memory, and you have a 1 GB instance of Redis, you will suddenly be reading gigs of data into Redis&#8217;s memory. If you are on a 2 GB machine, you can easily use up all the memory on the server and then start using the system&#8217;s swap.</p>
<p><strong>Once you start using the Operating System&#8217;s Virtual Memory, it is game over.</strong> Your Redis instance&#8217;s performance will tank, your Background Save might not finish, and you will need to restart Redis. </p>
<p>There are some ways to give Redis a &#8220;Hard&#8221; limit on memory, but we opted to configure our servers in a way that doesn&#8217;t require this. If Redis hits the memory limit, it can start throwing write operation errors, which we didn&#8217;t want.</p>
<h3>How We Configure Redis</h3>
<p>After much trial and error and testing internally, we believe we found a sweet spot. We deploy a Redis server and spin up three Redis instances, each on a different port. Each is configured with 4 GB of Virtual Memory using 4096 size pages, and only 256 MB of &#8220;memory&#8221; using the vm-max-memory setting. While you would think this would mean a hard limit, it is a soft limit, and more of a goal: &#8220;we&#8217;ll try to only use 256 MB of memory to store the data, but we&#8217;ll exceed it if needed.&#8221; Given our patterns of usage, Redis&#8217;s actual usage fluctuates (based on ps aux&#8217;s reporting) between 640 MB and 1100 MB of RAM, depending on whether a background save is being executed or not. Redis is configured to perform background saves every 10 minutes, which take about a minute to perform.</p>
<p>So between the other admin services running on the box, and the three redis instances, we use just over 3GB RAM:</p>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/redis-instance-memory.jpg"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/redis-instance-memory.jpg" alt="" title="redis-instance-memory" width="602" height="230" class="alignnone size-full wp-image-926" /></a></p>
<p>The amount of CPU required is extremely low, and almost 100% from writing the background saves to the disk:</p>
<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/redis-cpu.jpg"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/redis-cpu.jpg" alt="" title="redis-cpu" width="601" height="229" class="alignnone size-full wp-image-927" /></a></p>
<p>So what do we get in return? We estimate each instance with 256 MB of data can hold roughly 2,000 active users. So with a single server we can support 6,000 users online at any given moment. We can store the scores for roughly 360,000 users on a single 4 GB box, which is about 1.8 Billion scores. Then, if we need more, we just provision another box, and our system will start assigning users to the instances on that machine.</p>
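<p>Those capacity numbers can be checked with a few lines of arithmetic. This sketch reuses the 5,000 top-matches estimate from earlier; the per-instance figure at the end is derived here and is not a number we measured.</p>
<pre class="brush: jscript; title: ; notranslate">
var instancesPerServer = 3;
var activeUsersPerInstance = 2000; // estimate for 256 MB of in-memory data
var storedUsersPerServer = 360000;
var scoresPerUser = 5000;          // top-matches estimate from earlier

console.log(instancesPerServer * activeUsersPerInstance); // 6000
console.log(storedUsersPerServer * scoresPerUser);        // 1800000000
console.log(storedUsersPerServer / instancesPerServer);   // 120000
</pre>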
<p>Because of our ability to re-generate the scores, we decided to only convert the users who had logged in during the past three months to the new system. If a user who hadn&#8217;t logged in since then logged in again, the system would, in the background, assign them to a Redis instance and rebuild their matches for them.</p>
<h3>Using Redis with PHP</h3>
<p>I currently recommend the PHP Redis client <a href="https://github.com/nrk/predis">predis</a>. I&#8217;ve used others like <a href="http://rediska.geometria-lab.net/">Rediska</a>, but I prefer the straightforward approach of predis.  </p>
<p>At a high level, our PHP based website uses a class called RedisManager that manages pretty much all the connections to Redis. It supports lazy connections (which is important to us, since we don&#8217;t want to have to connect to every instance of Redis we have), and I hope to open source it some day soon.</p>
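<p>The lazy connection idea can be sketched in a few lines. This is not our RedisManager class: LazyConnections, fakeConnect, the instance names, and the ports are all hypothetical, and it is written in JavaScript only so it runs without a Redis server. The same pattern carries over to PHP.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical manager: connections are opened on first use, not up front.
function LazyConnections(instances, connect) {
    this.instances = instances; // maps an instance name to { host, port }
    this.connect = connect;     // function that actually opens a connection
    this.open = {};
}

LazyConnections.prototype.get = function (name) {
    if (!this.open[name]) {
        this.open[name] = this.connect(this.instances[name]);
    }
    return this.open[name];
};

// Fake connect function so the sketch runs without a Redis server.
var opened = [];
function fakeConnect(config) {
    opened.push(config.port);
    return { port: config.port };
}

var manager = new LazyConnections({
    scoresA: { host: '127.0.0.1', port: 6379 },
    scoresB: { host: '127.0.0.1', port: 6380 },
    scoresC: { host: '127.0.0.1', port: 6381 }
}, fakeConnect);

manager.get('scoresB');
manager.get('scoresB'); // reuses the connection opened above
console.log(opened);    // [ 6380 ]
</pre>
<p>Only the instance that holds the requested data is ever connected to; the other two are never touched.</p>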
<p>One key performance trick we&#8217;ve noticed is to use Pipelining when talking to the Redis instance. We don&#8217;t use this so much on the website, but we do in our score generation &#8220;threads.&#8221; Writing thousands of scores one by one eats up a lot of network overhead versus sending them in batches (we send in batches of 500 or 1000, depending on the situation). Using pipelining is extremely fast for us, and I highly recommend it for any large batch of commands.</p>
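<p>Here is a rough sketch of the batching side of this. It is not predis code: toBatches and sendPipelined are hypothetical helpers, the key and member names are made up, and the transport only counts round trips instead of talking to Redis.</p>
<pre class="brush: jscript; title: ; notranslate">
// Split a list of commands into batches of at most `size`.
function toBatches(commands, size) {
    var batches = [];
    var count = Math.ceil(commands.length / size);
    for (var i = 0; i !== count; i += 1) {
        batches.push(commands.slice(i * size, (i + 1) * size));
    }
    return batches;
}

// Hypothetical transport: a real client would write the whole batch to the
// socket in one go and then read all the replies. Here we only count round trips.
var roundTrips = 0;
function sendPipelined(batch) {
    roundTrips += 1;
}

// 2,300 made-up score writes for one user.
var commands = [];
for (var n = 0; n !== 2300; n += 1) {
    commands.push(['ZADD', 'scores:joe', n % 10, 'user:' + n]);
}

toBatches(commands, 500).forEach(sendPipelined);
console.log(roundTrips); // 5 round trips instead of 2,300
</pre>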
<h3>Using Redis Elsewhere in Dating DNA</h3>
<p>Now, it might seem that we&#8217;ve put a lot of thought and effort into using Redis, and I want to make sure it is understood that Redis itself wasn&#8217;t difficult to use; the difficulty was the volume of data we were dealing with. On Dating DNA, we also use Redis to power our in-app chat system (which I&#8217;ve <a href="http://www.justincarmony.com/blog/2011/01/07/creating-chatroom-walls-with-redis-and-php/">written about previously</a>), and it works great: it currently uses only 146.70 MB of RAM and serves thousands of requests per second.</p>
<h3>The Future</h3>
<p>I still have a lot of great ideas for Redis and Dating DNA, both with the score system, and outside of it. I plan on writing several reporting tools for Redis and hope to share them on github. I am currently working on the code and scripts for automatic deployment for Redis servers for Dating DNA, so we can scale easily with the push of a button. I&#8217;m excited for the work that is being done on Redis, and highly recommend it to anyone.</p>
<p>If there are details you would like to know more about, leave a comment and I&#8217;ll try to answer them. If you see me at tek11, feel free to ask me about this, and I can show you in detail how it works (internet permitting).</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/07/01/mysql-php-sql_calc_found_rows-an-easy-way-to-get-the-total-number-of-rows-regardless-of-limit/' rel='bookmark' title='MySQL &amp; PHP  – SQL_CALC_FOUND_ROWS – An easy way to get the total number of rows regardless of LIMIT'>MySQL &#038; PHP  – SQL_CALC_FOUND_ROWS – An easy way to get the total number of rows regardless of LIMIT</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/12/mysql-40-million-rows-myisam-innodb/' rel='bookmark' title='MySQL, 40 Million Rows, MyISAM to InnoDB, 45 Minutes'>MySQL, 40 Million Rows, MyISAM to InnoDB, 45 Minutes</a></li>
<li><a href='http://www.justincarmony.com/blog/2011/04/06/restoring-large-mysql-dump-900-million-rows/' rel='bookmark' title='Restoring Large MySQL Dump &#8211; 900 Million Rows'>Restoring Large MySQL Dump &#8211; 900 Million Rows</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/05/23/mysql-redis-and-a-billion-rows-a-love-story/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Attending PHP Tek 2011</title>
		<link>http://www.justincarmony.com/blog/2011/05/14/attending-php-tek-2011/</link>
		<comments>http://www.justincarmony.com/blog/2011/05/14/attending-php-tek-2011/#comments</comments>
		<pubDate>Sat, 14 May 2011 15:24:20 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[speaking]]></category>
		<category><![CDATA[tek]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=917</guid>
		<description><![CDATA[I&#8217;m excited to say I will be attending PHP Tek 2011 in one week. This will be the first &#8220;PHP&#8221; conference I&#8217;ve attended, and hope I will learn a lot while also having some fun. It&#8217;ll be nice to meet some people who I&#8217;ve only talked to online. I&#8217;ve always wanted to attend a PHP ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2010/12/29/goal-for-2011-learn-c/' rel='bookmark' title='Goal for 2011: Learn C'>Goal for 2011: Learn C</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/16/speaking-utah-open-source-conference-2009/' rel='bookmark' title='Speaking: Utah Open Source Conference 2009'>Speaking: Utah Open Source Conference 2009</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/10/07/utosc-day-one/' rel='bookmark' title='UTOSC &#8211; Day One'>UTOSC &#8211; Day One</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><a href="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/tek11.png"><img src="http://c747925.r25.cf2.rackcdn.com/blog/wp-content/uploads/2011/05/tek11-300x89.png" alt="" title="tek11" width="300" height="89" class="alignright size-medium wp-image-918" /></a>I&#8217;m excited to say I will be attending PHP Tek 2011 in one week. This will be the first &#8220;PHP&#8221; conference I&#8217;ve attended, and hope I will learn a lot while also having some fun. It&#8217;ll be nice to meet some people who I&#8217;ve only talked to online. </p>
<p>I&#8217;ve wanted to attend a PHP conference for quite some time now, but for several years it seemed either it wasn&#8217;t in the budget, or I was just too busy. But I kept hearing great things about Tek and its organizers, so when the Call for Papers opened up for Tek 11 this year, I submitted. Unfortunately I wasn&#8217;t able to make it on the presenter list, so I immediately registered and booked my flights so I wouldn&#8217;t find an excuse later on to not go.</p>
<p>For those in the Utah PHP Users Group who are not able to come out with me, and other friends of mine, let me know if there is anything interesting in the <a href="http://tek11.phparch.com/schedule/">schedule</a>, and I&#8217;ll see if I can attend and take some notes for you. </p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2010/12/29/goal-for-2011-learn-c/' rel='bookmark' title='Goal for 2011: Learn C'>Goal for 2011: Learn C</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/16/speaking-utah-open-source-conference-2009/' rel='bookmark' title='Speaking: Utah Open Source Conference 2009'>Speaking: Utah Open Source Conference 2009</a></li>
<li><a href='http://www.justincarmony.com/blog/2010/10/07/utosc-day-one/' rel='bookmark' title='UTOSC &#8211; Day One'>UTOSC &#8211; Day One</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/05/14/attending-php-tek-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Node.JS, LAMP, and The Future</title>
		<link>http://www.justincarmony.com/blog/2011/04/29/node-js-lamp-and-the-future/</link>
		<comments>http://www.justincarmony.com/blog/2011/04/29/node-js-lamp-and-the-future/#comments</comments>
		<pubDate>Fri, 29 Apr 2011 23:35:37 +0000</pubDate>
		<dc:creator>Justin Carmony</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[LAMP]]></category>
		<category><![CDATA[node.js]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.justincarmony.com/blog/?p=901</guid>
		<description><![CDATA[I just read an article about &#8220;Node.JS and the JavaScript Age.&#8221; It was from a very &#8220;enthusiastic&#8221; point of view. My guess is they had tinkered around with Node.JS and using with client-side JavaScript rebuilding their Dashboard. You can do some pretty cool stuff with it, and it has a lot of potential. It is ...


Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/10/10/local-lamp-developement-user-content/' rel='bookmark' title='Local LAMP Developement &amp; User Content'>Local LAMP Developement &#038; User Content</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/21/list-of-50-php-tools/' rel='bookmark' title='List of 50 PHP Tools'>List of 50 PHP Tools</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/15/jquery-tip-better-toggle/' rel='bookmark' title='jQuery Tip: Better Toggle'>jQuery Tip: Better Toggle</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I just read an article about &#8220;<a href="http://metamarketsgroup.com/blog/node-js-and-the-javascript-age/">Node.JS and the JavaScript Age</a>.&#8221; It was from a very &#8220;enthusiastic&#8221; point of view. My guess is they had tinkered around with Node.JS, using it with client-side JavaScript to rebuild their Dashboard. You can do some pretty cool stuff with it, and it has a lot of potential. It is very easy to install and get going, much like Redis. Here is an excerpt from the article:</p>
<blockquote><p>This decision was driven by a realization: the LAMP stack is dead. In the two decades since its birth, there have been fundamental shifts in the web’s make-up of content, protocols, servers, and clients. Together, these mark three ages of the web:</p>
<p>I. 1991-1999: The HTML Age.</p>
<p>The HTML Age was about documents, true to Tim Berners-Lee’s original vision of a “big, virtual documentation system in the sky.” The web was dominated by static, hand-coded files, which web clients crudely formatted (with defaults that offend even the mildest of typographiles). Static documents were served to static clients.</p>
<p>II. 2000-2009: The LAMP Age.</p>
<p>The LAMP Age was about databases. Rather than documents, the dominant web stacks were LAMP or LAMP-like. Whether CGI, PHP, Ruby on Rails, or Django, the dominant pattern was populating an HTML template with database values. Content was dynamic server-side, but still static client-side.</p>
<p>III. 2010-??: The Javascript Age.</p>
<p>The Javascript age is about event streams. Modern web pages are not pages, they are event-driven applications through which information moves. The core content vessel of the web — the document object model — still exists, but not as HTML markup. The DOM is an in-memory, efficiently-encoded data structure generated by Javascript.</p>
<p>LAMP architectures are dead because few web applications want to ship full payloads of markup to the client in response to a small event; they want to update just a fragment of the DOM, using Javascript. AJAX achieved this, but when your server-side LAMP templates are 10% HTML and 90% Javascript, it’s clear that you’re doing it wrong.</p></blockquote>
<h3>Claiming LAMP is Dead is Reaching</h3>
<p>It goes on to explain how LAMP-like architectures are dead. I think that is a very overzealous point of view, so let me put it in some perspective:</p>
<p>During each of these &#8220;Ages,&#8221; new technology was developed to solve problems. HTML was created to publish inter-linked information. As demand grew and that information became more difficult to manage, LAMP-like tools such as PHP, Django, Ruby on Rails, and others were developed to help make delivering this information easier. They didn&#8217;t so much replace the previous technology as add on to it. Sometimes you will replace older technology with newer technology, but we&#8217;re still using HTML &#038; CSS, and augmenting them when needed.</p>
<p>LAMP-like systems are deployed in production all over the place, powering some of the most visited websites on the net. While <a href="https://github.com/joyent/node/wiki/Projects,-Applications,-and-Companies-Using-Node">there are some people who are using Node.JS</a>, like Yammer, Node.JS isn&#8217;t serving nearly the volume of content that PHP, Ruby, Python, or even .NET are. </p>
<h3>What is Node.JS?</h3>
<p>Basically, it is a lightweight framework wrapped around <a href="http://code.google.com/p/v8/">Google&#8217;s V8 JavaScript Engine</a>. V8 is the open source JavaScript engine that powers Google Chrome, their Web Browser. Another part of Node.JS is that it is event based, which can make it very quick and efficient.</p>
<p>I&#8217;ve been tinkering with Node.JS for a couple of weeks, and it is very interesting and has a lot of potential. Its unique scoping with JavaScript and event-based methodology make certain things easy to accomplish that are more difficult in other traditional tools. There are some pretty cool libraries for Node.JS like <a href="http://socket.io/">Socket.IO</a>, a library that supports multiple transports using feature detection. If your browser supports WebSocket or Adobe Flash Socket, it will use that; otherwise it falls back to AJAX long polling, multipart streaming, or even the Forever Iframe technique. So your application will use whichever type of communication is most efficient. </p>
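<p>The fallback idea can be sketched roughly like this. It is not Socket.IO&#8217;s actual API: pickTransport and the transport labels are made up for illustration, and the sketch assumes the order in which the transports are listed above is also the order of preference.</p>
<pre class="brush: jscript; title: ; notranslate">
// Transports in the order the paragraph lists them. The names are labels
// for this sketch, not Socket.IO identifiers.
var preferred = ['websocket', 'flashsocket', 'long-polling', 'multipart', 'forever-iframe'];

// Return the first transport the (hypothetical) browser reports as supported.
function pickTransport(supported) {
    for (var i = 0; i !== preferred.length; i += 1) {
        if (supported.indexOf(preferred[i]) !== -1) {
            return preferred[i];
        }
    }
    return null;
}

console.log(pickTransport(['long-polling', 'forever-iframe'])); // long-polling
console.log(pickTransport(['forever-iframe', 'websocket']));    // websocket
</pre>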
<h3>A Little Bit of a Reality Check</h3>
<p>But let&#8217;s be honest for a moment: Node.JS was created in 2009. It is still very, very young. The video on the front page of the Node.JS site shows its creator presenting to a PHP Users Group about Node, and he even cautions against using it in production. It still has a long way to go to mature as a tool. Granted, being based on the V8 Engine brings a lot of maturity to the project.</p>
<p>As for replacing technologies like PHP, Ruby, and Django, I&#8217;m extremely skeptical that they are &#8220;dead.&#8221; Instead, I see Node.JS augmenting our existing technologies. Just as I&#8217;ve implemented Redis as a data store where MySQL was poorly suited, I see Node.JS functioning as a tool for real-time communication for the future of websites.</p>
<p>I hope to write more about Node.JS, and look forward to using it to solve new and unique problems. But I won&#8217;t be rewriting all my websites in Node.JS; it would be a nightmare. So, as with NoSQL and Cloud Computing, yes, it is a new tool, but it won&#8217;t radically replace everything that came before it. When I adopted Redis, I didn&#8217;t get rid of MySQL; I just use Redis when it is a better choice.</p>


<p>Related posts:<ol><li><a href='http://www.justincarmony.com/blog/2008/10/10/local-lamp-developement-user-content/' rel='bookmark' title='Local LAMP Developement &amp; User Content'>Local LAMP Developement &#038; User Content</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/01/21/list-of-50-php-tools/' rel='bookmark' title='List of 50 PHP Tools'>List of 50 PHP Tools</a></li>
<li><a href='http://www.justincarmony.com/blog/2009/09/15/jquery-tip-better-toggle/' rel='bookmark' title='jQuery Tip: Better Toggle'>jQuery Tip: Better Toggle</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.justincarmony.com/blog/2011/04/29/node-js-lamp-and-the-future/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

