Us code poets over at ShootProof have been hard at work upgrading our server architecture, and one issue we ran across was how to always know which app servers are active and should be considered part of the memcached pool. The goal was to have it totally automated, so that we can bring machines up and down and never have to think about updating a list of IPs. The solution we came up with is a cron job on every app server that updates a record in SimpleDB. The cron runs each minute and gathers some information about the machine, but mainly it just updates a timestamp in its SimpleDB record to say that it is alive. It also deletes records that have not been updated in the last five minutes. Near the end of the script it queries SimpleDB for a list of all active servers, builds an array of their IP addresses and puts that array in the local memcached instance. The website application can then look up that list to build the cluster at run time. Below is a copy of the script; feel free to pick it apart in the comments.

<?php
// SimpleDB client library
require_once('lib/sdb.php');
 
define('AWS_ACCESS_KEY_ID', 'XXXXXXXXXXXXXXX');
define('AWS_SECRET_ACCESS_KEY', 'YYYYYYYYYYYYYYY');
 
// Machine-specific information from the EC2 instance metadata service.
// The record is keyed by instance ID, which is unique per machine. An AMI ID is
// shared by every instance launched from the same image, so keying on it would
// make all of those servers overwrite a single record.
$instanceId = file_get_contents('http://169.254.169.254/latest/meta-data/instance-id');
$machineData = array(
    'ami_id' => array('value' => file_get_contents('http://169.254.169.254/latest/meta-data/ami-id')),
    'ip' => array('value' => file_get_contents('http://169.254.169.254/latest/meta-data/local-ipv4')),
    'hostname' => array('value' => file_get_contents('http://169.254.169.254/latest/meta-data/local-hostname')),
    'availability_zone' => array('value' => file_get_contents('http://169.254.169.254/latest/meta-data/placement/availability-zone')),
    'last_updated' => array('value' => time())
);

$sdb = new SimpleDB(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY);
$domain = 'simple_db_domain_name';

$expectedMachineData = null;

// look up this machine's existing record, if there is one
$objectInfo = $sdb->select($domain, "select * from `" . $domain . "` where itemName() = '" . $instanceId . "'");

if (!empty($objectInfo)) {
    $attributes = $objectInfo[0]['Attributes'];

    // the record already exists, so replace its values instead of appending to them
    foreach ($machineData as $key => $data) {
        $machineData[$key]['replace'] = 'true';
    }

    // conditional put: only write if last_updated still holds the value we just read
    $expectedMachineData = array(
        'last_updated' => array(
            'value' => $attributes['last_updated']
        )
    );
}

// put the object into simple db
$sdb->putAttributes($domain, $instanceId, $machineData, $expectedMachineData);

// find any old records to clean up (not updated in the last 5 minutes)
$recordsToDelete = $sdb->select($domain, "select * from `" . $domain . "` where `last_updated` <= '" . (time() - 300) . "'");

foreach ($recordsToDelete as $toDelete) {
    // if this is the same machine as the one this script is running on, skip the
    // delete; the preceding update has already refreshed its record
    if ($toDelete['Name'] == $instanceId) {
        continue;
    }

    // let's delete the object
    $sdb->deleteAttributes($domain, $toDelete['Name']);
}
 
// find any active servers
$activeMachines = $sdb->select($domain, "select * from `" . $domain . "` where `last_updated` > '" . (time() - 300) . "'");
 
$ips = array();
 
// make the list of all ips
foreach ($activeMachines as $activeMachine) {
    if (trim($activeMachine['Attributes']['ip']) == '') {
        continue;
    }   
 
    $ips[] = trim($activeMachine['Attributes']['ip']);
}
 
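// store the list in the local memcached instance
// (this setup listens on port 11212; memcached's default port is 11211)
$memcache = new Memcache();
$memcache->addServer('127.0.0.1', 11212);
$memcache->set('cluster-ip-list', $ips, 0);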
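
For anyone who wants to poke at the liveness rule without a SimpleDB account, here is a minimal sketch of it in Python rather than PHP. It is an illustration only, not the production code: `active_ips` and `build_pool` are hypothetical names, the records are plain dicts shaped like the SimpleDB attributes above, and the default port of 11211 is an assumption.

```python
import time

ACTIVE_WINDOW_SECONDS = 300  # same five-minute window the cron script uses


def active_ips(records, now=None, window=ACTIVE_WINDOW_SECONDS):
    """Return the IPs of records whose last_updated falls inside the window."""
    now = time.time() if now is None else now
    ips = []
    for record in records:
        ip = str(record.get("ip", "")).strip()
        if not ip:
            continue  # skip records that never reported an IP
        if int(record["last_updated"]) > now - window:
            ips.append(ip)
    return ips


def build_pool(ips, port=11211):
    """Turn the cached IP list into (host, port) pairs for a memcached client."""
    return [(ip, port) for ip in sorted(set(ips))]


records = [
    {"ip": "10.0.0.5", "last_updated": "1000"},
    {"ip": "10.0.0.6", "last_updated": "600"},  # stale: older than 300 seconds
    {"ip": " ", "last_updated": "1000"},        # blank IP is ignored
]
pool = build_pool(active_ips(records, now=1000))
print(pool)
```

The web application side would do the equivalent of `build_pool` on whatever it reads back from the `cluster-ip-list` key.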

It is also a good idea to create a separate AWS user for these scripts, so that its access key and secret can be locked down to just that SimpleDB domain with only the needed permissions.
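
As a rough sketch of what "locked down" could look like, the Python snippet below builds an IAM policy document that allows only the three SimpleDB calls the script makes, on one domain. The function name, region and account ID are placeholders I made up, and the action names and ARN layout should be checked against the current IAM documentation before use.

```python
import json


def simpledb_policy(region, account_id, domain):
    """Build a policy limited to the SimpleDB calls the cron script makes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "sdb:Select",
                    "sdb:PutAttributes",
                    "sdb:DeleteAttributes",
                ],
                # placeholder region and account ID; only this one domain is allowed
                "Resource": f"arn:aws:sdb:{region}:{account_id}:domain/{domain}",
            }
        ],
    }


policy = simpledb_policy("us-east-1", "123456789012", "simple_db_domain_name")
print(json.dumps(policy, indent=2))
```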

Currently the general rule when using SSL is that you need one IP for each hostname you want to secure. This will change once Server Name Indication (SNI), a TLS extension that lets the client send the hostname during the handshake, is widely supported. For the time being, if you are lucky enough to only need to secure multiple subdomains of the same domain with a wildcard SSL cert, then keep reading below.

1. Ensure that your Apache config includes:

NameVirtualHost *:443

2. Your vhosts:

<VirtualHost *:443>
ServerName subdomain1.example.com
……
SSLEngine on
SSLProtocol all -SSLv2
SSLCipherSuite ALL:!ADH:!EXPORT:!SSLv2:RC4+RSA:+HIGH:+MEDIUM:+LOW
SSLCertificateFile /path/to/your/ssl.crt
SSLCertificateKeyFile /path/to/your/ssl.key
……
</VirtualHost>
<VirtualHost *:443>
ServerName subdomain2.example.com
……
SSLEngine on
SSLProtocol all -SSLv2
SSLCipherSuite ALL:!ADH:!EXPORT:!SSLv2:RC4+RSA:+HIGH:+MEDIUM:+LOW
SSLCertificateFile /path/to/your/ssl.crt
SSLCertificateKeyFile /path/to/your/ssl.key
……
</VirtualHost>
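
If my understanding of Apache is correct, it uses the certificate from the first SSL virtual host it finds for that IP and port to complete the handshake and decrypt the request. Once it can read the hostname in the request, it moves along to the virtual host whose ServerName matches and serves the request from there. Because the wildcard cert covers every subdomain, the certificate is valid no matter which virtual host ends up handling the request.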
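
The reason one certificate is enough for both virtual hosts is the wildcard matching rule: the `*` stands in for exactly one label. Here is a small Python sketch of that rule; `wildcard_covers` is a name I made up, and it is a simplification for illustration, not the check that browsers or Apache actually run.

```python
def wildcard_covers(pattern, hostname):
    """Return True if a cert name such as '*.example.com' covers hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname  # plain name: must match exactly
    suffix = pattern[1:]  # e.g. ".example.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    # the wildcard stands in for exactly one non-empty label
    return bool(label) and "." not in label


for host in ("subdomain1.example.com", "subdomain2.example.com",
             "example.com", "a.b.example.com"):
    print(host, wildcard_covers("*.example.com", host))
```

Note that the bare domain and deeper subdomains are not covered, so this trick only helps when every hostname sits one level below the same domain.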

I am currently working on a new startup called ShootProof. ShootProof uses many of Amazon's web services. We have recently been hearing sporadic feedback from our beta testers that their uploads are sometimes slower than they think they should be. The way we currently accept uploads is that we send each file via XMLHttpRequest to our EC2 instances, do some quick inspection of the file and then store it in an upload bucket. A few moments later a re-sizer batch job comes along, does resizing/watermarking/other stuff to the photo and moves it into place.

After we started to investigate why some beta testers were sometimes getting slower than ideal upload speeds, we decided to test doing our uploads directly to S3. Amazon S3 supports HTTP POST uploads, which is great as it takes us out of the middle of all of that traffic. Essentially this means that users of ShootProof should never be limited upload-wise by our EC2 instances. We also will not need to constantly spin EC2 instances up and down to handle load spikes. After each upload has been completely sent to S3, we fire off a small notification call that lets us know we have a new photo to take care of. Upload traffic to our EC2 instances will drop by at least 99%. To be sure that we never miss a new photo placed into our S3 upload bucket, we will also monitor the contents of the upload bucket to ensure that they match what we are expecting. All photos uploaded into S3 by the user are marked with an ACL of private, so they are essentially being put into a dropbox.
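
To give a feel for what a direct-to-S3 POST involves, here is a minimal Python sketch of signing an upload policy on the server so the browser can post straight to the bucket. It uses the older signature version 2 scheme (base64 policy signed with HMAC-SHA1); newer regions require signature version 4, so treat this as an illustration of the idea only. The bucket name, key prefix, expiration and size limit are all made-up values, and `sign_post_policy` is a hypothetical helper, not our actual code.

```python
import base64
import hashlib
import hmac
import json


def sign_post_policy(secret_key, bucket, key_prefix, expiration, max_bytes):
    """Return (policy_b64, signature) for a signature v2 style S3 POST form."""
    policy = {
        "expiration": expiration,
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],
            {"acl": "private"},  # uploads land as private objects
            ["content-length-range", 0, max_bytes],
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    return policy_b64.decode("ascii"), base64.b64encode(digest).decode("ascii")


policy_b64, signature = sign_post_policy(
    "YYYYYYYYYYYYYYY", "example-upload-bucket", "uploads/",
    "2030-01-01T00:00:00Z", 50 * 1024 * 1024,
)

# hidden fields the upload form would send along with the file itself
form_fields = {
    "key": "uploads/${filename}",
    "acl": "private",
    "AWSAccessKeyId": "XXXXXXXXXXXXXXX",
    "policy": policy_b64,
    "signature": signature,
}
print(form_fields)
```

The secret key never leaves the server; the browser only ever sees the encoded policy and its signature.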

Below is a table that shows the tests we did to come to our conclusion to post directly to S3. The file used for this test is a 13.1MB JPEG. All uploads were done over an internet connection that is a full 10Mbit up. Times are in seconds.

           XMLHttpRequest Post (EC2 -> S3)    HTTPS S3 Post    HTTP S3 Post
           13.2                               20.6             9.5
           14                                 19.1             9.6
           13.7                               22.7             10.3
           14.3                               24.9             9.3
           13.7                               15.3             9.4
           13.9                               18.4             9.2
           24.1                               24.2             9.7
           13.7                               17.5             9.4
           13.8                               17.3             9.9
           15.2                               17.4             9.4
Average    14.96 sec                          20.04 sec        9.57 sec
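
If you want to re-run the arithmetic, here is a short Python sketch that recomputes the per-column averages from the ten runs as listed. The dictionary keys simply mirror the column headers.

```python
from statistics import mean

# upload times in seconds, copied from the rows of the table
timings = {
    "XMLHttpRequest Post (EC2 -> S3)": [13.2, 14.0, 13.7, 14.3, 13.7, 13.9, 24.1, 13.7, 13.8, 15.2],
    "HTTPS S3 Post": [20.6, 19.1, 22.7, 24.9, 15.3, 18.4, 24.2, 17.5, 17.3, 17.4],
    "HTTP S3 Post": [9.5, 9.6, 10.3, 9.3, 9.4, 9.2, 9.7, 9.4, 9.9, 9.4],
}

averages = {name: round(mean(values), 2) for name, values in timings.items()}
for name, avg in averages.items():
    print(f"{name}: {avg} sec")
```

Run on the rows as listed, the first and last columns come out at 14.96 and 9.57, while the HTTPS column comes out at 19.74 rather than 20.04, so one of the HTTPS rows may have been copied down slightly off. Either way, the plain HTTP post to S3 is clearly the fastest of the three.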

With AT&T’s new A-List feature, available on accounts above a certain threshold, comes a new problem: which numbers to include in the list. While this might be easy for some, it wasn’t that simple for me, so I wrote a little script to compute what my optimal A-List would be. If you are an AT&T wireless customer, give this a shot; it might save you some money or add to your rollover balance! If you have any questions about this script or find a bug, you can find my contact information on the about page.
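
The basic idea can be sketched in a few lines of Python: add up the billed minutes per number and keep the numbers that cost you the most. This is only an illustration of the approach, not the script itself; the call records, the `optimal_a_list` name and the list size of 5 are all assumptions, so adjust the size to whatever your plan allows.

```python
from collections import Counter


def optimal_a_list(calls, list_size=5):
    """Return the list_size numbers with the most total minutes.

    calls is an iterable of (phone_number, minutes) pairs, for example
    parsed from a bill export. list_size is an assumed limit.
    """
    minutes = Counter()
    for number, mins in calls:
        minutes[number] += mins
    return minutes.most_common(list_size)


# made-up call records for illustration
calls = [
    ("555-0101", 42), ("555-0102", 10), ("555-0101", 18),
    ("555-0103", 75), ("555-0102", 5), ("555-0104", 3),
]
for number, total in optimal_a_list(calls, list_size=2):
    print(number, total)
```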

Update: I just received an email from support, the reports will show up in my account tomorrow at 6:00 AM. [9/23/2008 3:21 PM ET]

I typically will not write about anything non-technical, but I will make an exception this time. Why-oh-why is it like pulling teeth when dealing with iPhone App Developer Program support? I get the feeling that when I call and wait my turn in the phone queue, all I get to talk to is a receptionist who is going to “escalate the matter”. When I ask how long it will take to get a response back, I am told that it will take a week to a week and a half. This is a VERY simple matter. All I want to know is how many downloads my app has had. Shouldn’t this automatically, programmatically for that matter, be turned on in the iTunes Connect portal by default when an app goes live? If that were the case, I could be writing a blog post about how awesome the developer program is, but no, I am stuck here writing in frustration.

Having dealt with Apple support for a MacBook Pro a couple of times, I find that iPhone Developer support seems like a bunch of amateurs who have no idea what’s going on. As an AAPL shareholder, I am glad that the general public does not have to deal with iPhone Developer support, as it would turn people away in droves.

If anyone with power to help in this situation is reading this post please feel free to contact me at rswarthout [at] gmail [dot] com.