I was talking to my kids last night, and one thing you often tell your kids is how different things were when you were growing up. My daughter will soon be of the age where she can start going to a program called Code Ninjas, where she can start to learn the art of coding. I reflected on the fact that she is really fortunate to be growing up in an age where learning something like coding is relatively easy. When I started coding (and I started at a later age than she probably will), it was on a Commodore 64 with BASIC, and although the instruction set was limited and fairly, uh, basic, it was still a process that I mostly drove myself, and I didn't really know what I was doing. Looking back now, I realize that the PEEK command read the value stored at a memory address and the POKE command wrote a value to one.
Thinking about that led me to realize that I could easily spin up a machine in the cloud that had hundreds of thousands or even millions of times as much memory as that first computer, and it would barely cost me anything to do so. If I wanted, I could spin up hundreds or thousands of Commodore 64 VMs (or even just containers) just by specifying an image name and a configuration. If I could take that technology back to when I was still coding on the Commodore, well, I wouldn't have to do this for a living any more - it would literally be magic to the folks back then.
Spinning things up in the cloud has become so easy that almost anyone can do it now, and just about everyone can afford it (in some cases, it's free, and that's a price that everyone loves). Now the challenge is to automate and bootstrap the process so that you not only spin up some servers but actually spin up a baseline application environment where you just fill in the blanks on what you are trying to do. So many applications fall into the typical pattern of a web frontend, a server backend, and a database store that you can usually find existing templates that will fit your basic requirements.
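To make that three-tier pattern concrete, here's a minimal sketch of what such a template might look like as a docker-compose file. This assumes Docker Compose; the backend image name, credentials, and port are placeholders, not anything from a real project:

```yaml
# Hypothetical three-tier template: web frontend, API backend, database.
# Image names and credentials are illustrative placeholders.
services:
  frontend:
    image: nginx:alpine          # serves the web UI
    ports:
      - "80:80"
    depends_on:
      - backend
  backend:
    image: example.com/my-api:latest   # placeholder application image
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:15           # the database store
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data   # persist data across restarts
volumes:
  db-data:
```

Filling in the blanks here really does amount to swapping in your own backend image and credentials; the overall shape stays the same across a huge number of applications.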
However, some applications have more specific requirements. I'm not talking about what software is installed or the operating system - I'm talking about hardware requirements. A lot of cloud offerings have you pick a template with a fixed number of CPUs and a fixed amount of memory and then bill you on that. They may allow you to scale out by adding more servers, but you are still shoehorned into picking resources up front. The trouble is that not every problem you are going to solve has the same requirements.
Take Machine Learning, for example. There are a number of problems in Machine Learning that are easily handled by a basic setup, especially if you are just fitting simple linear regression models. These models are nice because the hard part is mostly in the training; once the model is trained, it executes very quickly. As we start to incorporate Deep Learning into the picture, however, the required computing power goes up significantly. Adding layers, and adding units to each layer, can buy you more accuracy, but it also significantly increases the number of computations required. Training takes significantly longer because you do lots of calculations forward through the layers and then have to do more calculations backwards through the layers in order to make the adjustments to each unit. Even after training, a deep network generally involves far more computations per prediction than a linear regression model would, although it is still relatively fast. I'm also glossing over the fact that there can be GPU requirements to make these calculations time effective.
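To put a rough number on that gap, here's a small Python sketch that counts the multiply-accumulate operations needed for one prediction. The feature count and layer sizes are made-up illustrative values, not anything from a real model:

```python
# Rough multiply-accumulate (MAC) counts for a single prediction,
# comparing linear regression with a small fully connected deep network.
# All sizes below are illustrative placeholders.

def linear_regression_macs(n_features: int) -> int:
    """One dot product: one multiply-add per input feature."""
    return n_features

def dense_network_macs(layer_sizes: list[int]) -> int:
    """Sum of (inputs x outputs) MACs over each fully connected layer."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

features = 100
shallow = linear_regression_macs(features)
deep = dense_network_macs([features, 512, 512, 10])
print(shallow, deep)  # the deep net is thousands of times more work
```

Even this toy three-layer network does over three thousand times the arithmetic of the linear model per prediction, which is exactly why the two workloads want very different hardware.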
There's another aspect to this - utilization of the resources you allocate. If you request a large number of cores and memory, there is a good chance that you won't be utilizing them all the time. Fortunately most cloud providers only charge you for what you use, so this might not be a problem. If you are paying for dedicated resources that aren't shared, however, you may actually be paying a penalty for the time that you aren't using these resources. This isn't necessarily an easy problem to solve, because after all, you don't really want to generate work for your resources just because they are there. At the same time, however, you really only want to have resources allocated that are actually needed.
There is a new approach to this problem that is slowly becoming the norm. Orchestration frameworks like Kubernetes and Mesos are beginning to abstract computing resources away from the actual tasks that run on them, such that you can simply specify the resource requirements for the task and let the framework figure out how to get them for you. One way of describing this is that the cloud suddenly starts acting like one enormous machine with lots and lots of different things running on it. No longer are you concerned with requesting adequate resources ahead of time; now you just request resources when you submit the task you want done.
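In Kubernetes, for instance, that declaration looks something like the following Pod spec - a minimal sketch with a placeholder image name and arbitrary sizes, not a recommendation for any particular workload:

```yaml
# Hypothetical Kubernetes Pod: declare what the task needs and let the
# scheduler find a node with room. Image name and sizes are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  containers:
  - name: trainer
    image: example.com/my-trainer:latest
    resources:
      requests:        # the scheduler guarantees at least this much
        cpu: "4"
        memory: 8Gi
      limits:          # the container is capped at this much
        cpu: "8"
        memory: 16Gi
```

You never say which machine this runs on; you state the requirements, and the cluster treats its pooled nodes as that one enormous machine.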
This is the next generation of cloud. It is likely how most software will be run and delivered in the near future. We won't be building individual applications that are in separate silos - we'll just define application requirements and then submit our tasks to the great cloud in the sky. If we need more resources we just adjust our requirements accordingly. Welcome to the future.