Last week, I was excited to attend James’s talk, “Datacenter networks are in my way”, in our internal talk series called Principals of Amazon. As always, James’s talk was illuminating. I highly encourage everyone to read James’s post and the slides.
A few takeaways from James’s talk worth calling out:
– Contrary to popular belief, power is not the leading driver for datacenter operational cost. It is actually the server cost (which is about 57%).
– It follows that techniques like shutting down servers when they are not in use, while interesting, do not yield a big return on the investment. Instead, you are better off doing the exact opposite: utilize your existing server investment to the fullest.
– Traditional DC networks are usually oversubscribed and live in a very vertical world: all network components come from a single vendor and are built more like mainframes, following a “scale up” (get bigger boxes) model instead of a “scale out” model. This is bad for sustainability and reliability.
– To enable higher server utilization, you need your datacenter networks to support full connectivity between hosts and not be oversubscribed.
The above takeaways tell us that we need to build DC networks that can be scaled easily (moving from oversubscribed to not oversubscribed). To do that, we need a scale-out DC network architecture, and systems like OpenFlow enable that. It is interesting to see that what we learnt in distributed systems and datastores applies to datacenter networks as well: scale out (horizontal scaling) is better in the long run than scale up (vertical scaling).
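To make “oversubscribed” a bit more concrete, here is a minimal sketch of how the ratio is commonly computed for a single switch and compounded across tiers. The function names and the port counts and speeds are hypothetical, chosen only for illustration; none of them come from James’s talk.

```python
import math


def oversubscription_ratio(host_ports, host_gbps, uplink_ports, uplink_gbps):
    """Host-facing bandwidth divided by uplink bandwidth for one switch.

    A ratio of 1.0 means the switch is not oversubscribed; anything
    above 1.0 means hosts can offer more traffic than the uplinks carry.
    """
    uplink_capacity = uplink_ports * uplink_gbps
    if uplink_capacity <= 0:
        raise ValueError("uplink capacity must be positive")
    return (host_ports * host_gbps) / uplink_capacity


def end_to_end_ratio(tier_ratios):
    """Oversubscription compounds multiplicatively across network tiers."""
    return math.prod(tier_ratios)


def worst_case_host_gbps(host_gbps, ratio):
    """Bandwidth each host gets if every host sends across the fabric at once."""
    return host_gbps / max(ratio, 1.0)


# Hypothetical top-of-rack switch: 48 x 1 Gbps host ports, 2 x 10 Gbps uplinks.
tor = oversubscription_ratio(48, 1.0, 2, 10.0)

# Hypothetical aggregation tier that is itself 4:1 oversubscribed.
total = end_to_end_ratio([tor, 4.0])

print(f"ToR ratio: {tor:.1f}:1, end to end: {total:.1f}:1")
print(f"Worst-case per-host bandwidth: {worst_case_host_gbps(1.0, total):.3f} Gbps")
```

In this made-up setup, a host with a 1 Gbps link can count on only a fraction of that when traffic crosses racks, which is why it is hard to place work on any free server and keep utilization high unless the network offers full connectivity.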