One of my Magento clients ran into a bit of trouble with their current hosting provider. Stability was not what it should have been, there were security issues that raised concern, and support was slow and indifferent at times. The client is a retailer that gets significant traffic spikes during shopping season and when running promotions. The infrastructure of the current host was not flexible enough to deal with huge traffic spikes in an automated fashion.
We decided to switch to a different hosting company to improve the situation. We looked into a bunch of companies and ended up giving Rackspace Magento Hosting a go.
This post is a post-mortem of a failed attempt to switch our hosting provider.
On paper Rackspace seemed to meet all our core requirements:
- Fully managed
- Located on the West Coast
- Flexible scaling (they run their infrastructure on AWS)
- Dedicated Magento team with significant experience hosting Magento installs
- Established company with a significant number of employees
Rackspace Magento Hosting seemed to check all the boxes and their proposed setup was within the budget. After some initial discussion we pulled the trigger and signed the agreement. The proposed timeline was to go live on the new infrastructure within 2 weeks.
Right off the bat some things seemed to be a little off. When we got access to the ticket system we ran into a bug that didn’t allow us to create additional user accounts. So our first support ticket was about an issue with the ticket system. In an era of GitHub Issues with Markdown support, the Rackspace (RS) ticket system seems completely archaic. It is impossible for more than one person to receive ticket update notifications, and the suggested workaround is to set up an external mail distribution list.
Before signing the agreement with RS we had worked out detailed specs for the new system and the RS Onboarding team was going to implement them. Because RS is just reselling AWS they are not able to order the instances on our behalf. The reselling process is not fully transparent, so we had to go through a somewhat cumbersome process to order the required AWS instances. No big deal, but it is something that stood out.
Once the cluster was set up, we were provided with the credentials and started setting up our application. Upon first login I got an update notification telling me that packages needed to be updated. After discussing this with RS it turned out that they don’t have a general security policy in place. They resolved the problem by turning on automatic updates on our instances. Based on previous work with other hosting companies, I was used to a much more fine-grained approach to handling updates. For example, other hosting companies would test new packages on a dedicated testing infrastructure before deploying them on a large scale.
In retrospect we should have pulled the plug at this point. From here, things took a turn for the worse. The environment we were presented with was not a fully functional cluster, but a bunch of instances with hardly any configuration. I was under the impression that a cluster ordered from a Magento hosting package would be handed over ready to run Magento pretty much out of the box.
Some of the issues we encountered were:
- System volumes were not sized to spec
- Web nodes couldn’t access NFS and Redis due to SELinux policy
- Firewall was not configured correctly for RDS
- Apache wasn’t configured to work with NFS
- Virtual hosts were not configured
- SSL was not set up
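Most of the items in this list can be detected from a web node with a few lines of code before any migration work starts. The following is a minimal sketch of such a pre-flight check, not a description of anything RS provides. The hostnames, ports, and mount path in `example_checks` are hypothetical placeholders and would have to be replaced with the values from your own spec.

```python
import os
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def writable_dir(path):
    """Return True if path is an existing directory this process may write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def run_checks(checks):
    """Run (label, callable) pairs and return the labels of the failed checks."""
    return [label for label, check in checks if not check()]


def example_checks():
    """Hypothetical endpoints and paths, for illustration only."""
    return [
        ("Redis reachable", lambda: tcp_reachable("redis.internal.example", 6379)),
        ("Database reachable", lambda: tcp_reachable("db.internal.example", 3306)),
        ("Shared media directory writable", lambda: writable_dir("/mnt/nfs/media")),
    ]
```

Calling `run_checks(example_checks())` on a web node returns the labels of the checks that failed, and an empty list means everything passed. The script should run as the web server user; even then, SELinux restrictions that apply only to the web server process itself may not show up in a check started from a shell.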
We started working with the Onboarding team to resolve these issues. The Onboarding team only works 8-5 CST and I’m on HST. Because of the time difference, small issues often took more than a day to resolve. The first major issue we ran into was importing the database dump. Again, for a Magento managed hosting package I would expect them to know what it takes to import a Magento database dump. Between firewall issues, RDS permission issues, and RDS configuration issues, it took us a full week to import the dump. At this point we were several weeks past the initial deadline.
While discussing the issues with support we learned that RS had only switched to AWS in November 2015 and that they were still working things out. One of the side effects of this was that in our cluster the web nodes ran CentOS while the rest of the instances ran Amazon Linux. The entire process had a very beta feel to it.
When working with RS one of the biggest problems is that you don’t work with a small team of people that maintains your infrastructure but with a large number of administrators who have no specific clients. The result is that lots of different people touch your tickets, often not reading the ticket history or being familiar with your setup. This resulted in additional delays, miscommunication, and frustration. Nothing is more annoying than after waiting a day to get a ticket resolved, getting a reply with a question that had been answered two comments below.
More than a month after we started the setup process, the Onboarding team handed the site over to general support. The site still wasn’t 100% working at this point, but at least we could load the front page. The remaining issues were to get SSL set up and resolve some issues around session handling. With regular support things turned from bad to insanely bad. It took days for anyone to even look at our tickets. Two weeks later, we were still not able to push the new cluster live. We had a final call with RS in which they said that they were not able to resolve the remaining issues and that we were on our own. After two months, we pulled the plug.
What RS offers with their Managed Magento Hosting package has nothing to do with managed hosting. Unless you have system administrators on staff and you are willing to pay for setting up and running the cluster yourself, I don’t see RS Magento Hosting as a viable option. I struggle to see any added value having RS run the AWS instances for you instead of going straight for AWS.
This was also a prime example of the sunk cost fallacy. The process of switching a hosting provider with a fairly large shop is quite involved and requires a lot of work and coordination. Once you are deep into the project, it is hard to cancel it even when red flags arise. The biggest lesson I learned from this is to make sure not to ignore these red flags in the future.