New CycleCloud HPC Cluster Is a Triple Threat: 30,000 cores, $1,279/Hour, & Grill monitoring GUI for Chef
Nicknamed after the magical “Nekomata” cat of Japanese nightmares, Cycle Computing’s monstrous new supercomputer can now be yours to rent for the low price of $1,279 an hour. By fusing together the face-melting power of 3,809 eight-core Amazon EC2 (Elastic Compute Cloud) instances, the company was able to create the world’s 30th fastest computer, with 30,472 processor cores and 27TB of memory, primarily used for complex modeling rather than Facebooking. Components of the beast hide out in three of Amazon’s EC2 data center lairs located in California, Virginia and Ireland, and communicate using HTTPS and SSH encrypted with AES-256 to keep their secrets safe and secure. Compared to the company’s previous 10,000-core offering ($1,060 / hour), the new version is far more powerful and only slightly more expensive, mostly because it uses spot instances (where customers bid on unused EC2 capacity) rather than pricier reserved instances. Good on you, Cycle Computing; not everyone has access to a Jeopardy champ.
In fact, we had to implement a triad of features within CycleCloud to make it a reality:
1) MultiRegion support: To achieve the mind-boggling core count of this cluster, we launched in three distinct AWS regions simultaneously, including Europe.
2) Massive Spot instance support: This was a requirement given the potential savings at this scale by going through the spot market. Besides, our scheduling environment and the workload had no issues with the possibility of early termination and rescheduling.
3) Massive CycleServer monitoring & Grill GUI app for Chef monitoring: There is no way that any mere human could keep track of all of the moving parts on a cluster of this scale. At Cycle, we’ve always been fans of extreme IT automation, but we needed to take this to the next level to monitor and manage every instance, volume, daemon, job, and so on, so that Nekomata would be an efficient 30,000-core tool instead of a big shiny on-demand paperweight. Our new Grill monitoring GUI app for CycleServer showed what’s cooking in our Opscode Chef environment, which handled cloud infrastructure automation for every instance in this cluster.
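To make the point about early termination and rescheduling concrete, here is a minimal toy sketch in Python. It is not CycleCloud's scheduler or any real scheduler API: the function name `run_with_requeue` and the `terminated_attempts` parameter are hypothetical, and a spot termination is simulated simply by marking certain attempt numbers as interrupted. It only illustrates why a workload made of independent, restartable jobs tolerates losing an instance.

```python
from collections import deque


def run_with_requeue(jobs, terminated_attempts):
    """Toy model: run jobs in FIFO order and requeue any interrupted job.

    jobs                -- iterable of job identifiers (hypothetical)
    terminated_attempts -- set of 0-based attempt numbers that are treated
                           as hitting a spot instance termination
    Returns (completed jobs in completion order, total attempts made).
    """
    queue = deque(jobs)
    completed = []
    attempt = 0
    while queue:
        job = queue.popleft()
        if attempt in terminated_attempts:
            # The instance was reclaimed mid-run: put the job back in line.
            queue.append(job)
        else:
            completed.append(job)
        attempt += 1
    return completed, attempt


if __name__ == "__main__":
    done, attempts = run_with_requeue(["job-a", "job-b", "job-c"], {1})
    print(done, attempts)
```

In this toy run the second attempt is interrupted, so `job-b` goes to the back of the queue and finishes last; every job still completes, at the cost of one extra attempt.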
The Stats
Before we step through these enhancements one by one, let’s take a moment to sit back and contemplate the sheer scale of this compute environment.
Resource | Count |
c1.xlarge instances | 3,809 |
cores | 30,472 |
RAM | 26.7 TB |
AWS Regions | 3 (us-east, us-west, eu-west) |
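The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this post, plus one assumption: 7 GB of RAM per c1.xlarge instance, which is consistent with the 26.7 TB total. The per-core-hour prices are derived values for illustration, not published rates.

```python
# Figures quoted in this post.
INSTANCES = 3809
CORES_PER_INSTANCE = 8          # "eight-core" c1.xlarge
HOURLY_COST_USD = 1279

# Assumption: 7 GB of RAM per c1.xlarge (matches the 26.7 TB total).
RAM_GB_PER_INSTANCE = 7

# Previous 10,000-core cluster, as quoted above.
PREVIOUS_CORES = 10000
PREVIOUS_HOURLY_COST_USD = 1060

total_cores = INSTANCES * CORES_PER_INSTANCE
total_ram_tb = INSTANCES * RAM_GB_PER_INSTANCE / 1000  # decimal TB
cost_per_core_hour = HOURLY_COST_USD / total_cores
previous_cost_per_core_hour = PREVIOUS_HOURLY_COST_USD / PREVIOUS_CORES

if __name__ == "__main__":
    print(f"cores: {total_cores:,}")
    print(f"RAM: {total_ram_tb:.1f} TB")
    print(f"cost per core-hour: ${cost_per_core_hour:.3f}")
    print(f"previous cost per core-hour: ${previous_cost_per_core_hour:.3f}")
```

Under these assumptions the new cluster works out to roughly $0.042 per core-hour, against about $0.106 for the earlier 10,000-core cluster, which is where the spot market savings show up.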