I recently started a lecture series at the Vrije University of Amsterdam (VU). As part of this I did a lecture on how to operate unreliable Information Systems in a reliable way – or: Everything breaks, All the time.
Behind the clouds of cloud computing! How can we reliably operate systems that are inherently unreliable?
What if, for some hours, we do not have access to services such as navigators, routers, and other communication technologies? It seems our lives will be at stake if major digital services fail! Many promises of digital technologies, from big data to the Internet of Things and many others, are based on reliable infrastructures such as cloud computing. What if these critical infrastructures fail? Do they, in fact, fail? How do the responsible companies and organizations manage these infrastructures in a reliable way? And what are the implications of all this for companies that want to base their business on such services?
As part of the lecture we explored modern complex systems and how we got there, using examples from Google and Amazon’s journey and how it relates to modern enterprise IT. We used the material of Mark Burgess to explore how to prevent systems from spiralling out of control. We closed off looking at knowledge management based on the ‘blameless retrospective’ principles and how feedback cycles from other domains are helping to create more reliable IT.
Relevant links supporting the lecture:
The presentation used can be found here: VU lecture
Recording of the session is available within the VU.
VU Assistant Professor Mohammad Mehrizi posted a nice lecture review on LinkedIn, including a picture with some of the attending students.
In 2009 AWS launched their EC2 Reserved Instance (RI) pricing model, providing a significant discount compared to the on-demand pricing model if you are willing to commit to 1 or 3 years of usage.
Cost management on AWS is a hot topic, with hundreds of blogs on making the right choice of RIs. A new market of AWS cost management tools emerged, with tool vendors promising massive ROI. Based on AWS billing & usage data analysis, these tools will provide RI recommendations. AWS’s own Trusted Advisor will also provide this kind of analysis, as included in their higher level support plans.
I highly recommend my SDL FredHopper colleague David Costa’s presentation on the topic, as he basically wrote the (internal) book on how to do this at scale: Cost Optimization at Scale.
You can still modify RI reservations after you have made them:
- Switch Availability Zones within the same region
- Change between EC2-VPC and EC2-Classic
- Change the instance size within the same instance type
There are, however, several use cases where you would want to end your reservation before the reservation end date:
- Switch Instance Types.
- Buy Reserved Instances on the Marketplace for your medium-term needs.
- Relocate region.
- Bad capacity management.
- Unforeseen business or technology changes.
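The trade-off behind these use cases can be illustrated with a rough break-even calculation. The sketch below is a minimal example and not how AWS or any cost tool computes recommendations: the function names and all prices are hypothetical, and it assumes the simplest case in which the reserved hourly fee is charged for every hour of the term.

```python
HOURS_PER_YEAR = 24 * 365


def reserved_total(upfront_fee, reserved_hourly, term_years):
    """Cost of holding a reservation for its full term.

    Simplifying assumption: the hourly fee is charged for every hour of the
    term, whether or not the instance is actually running.
    """
    return upfront_fee + reserved_hourly * HOURS_PER_YEAR * term_years


def break_even_utilization(on_demand_hourly, upfront_fee, reserved_hourly, term_years):
    """Fraction of the term an instance must run before the RI beats on-demand."""
    full_term_on_demand = on_demand_hourly * HOURS_PER_YEAR * term_years
    return reserved_total(upfront_fee, reserved_hourly, term_years) / full_term_on_demand


def cheaper_option(expected_utilization, on_demand_hourly, upfront_fee,
                   reserved_hourly, term_years):
    """Pick the cheaper pricing model for an expected utilization between 0 and 1."""
    threshold = break_even_utilization(
        on_demand_hourly, upfront_fee, reserved_hourly, term_years
    )
    return "reserved" if expected_utilization > threshold else "on-demand"


if __name__ == "__main__":
    # Hypothetical prices for illustration only, not actual AWS rates.
    prices = dict(on_demand_hourly=0.10, upfront_fee=300.0,
                  reserved_hourly=0.03, term_years=1)
    print(f"Break-even utilization: {break_even_utilization(**prices):.0%}")
    print("Runs 90% of the time:", cheaper_option(0.9, **prices))
    print("Runs 40% of the time:", cheaper_option(0.4, **prices))
```

With these made-up numbers the reservation only pays off when the instance runs for roughly 64% of the year. If a change of instance type, region or business plans pushes the expected usage below that threshold, holding on to the reservation costs more than running on-demand.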
Less than 24 hours after I published my ‘thank you team’ post, Datacenter Dynamics announced their nominees for this year’s DCD EMEA 2016 Awards.
I’m very proud that one of the projects mentioned in my original post got nominated in the category Cloud Journey of the Year.
The selected project is our move of SDL Machine Translation from our co-lo datacenter to an IaaS cloud solution:
Availability of content in multiple languages is key to driving useful international business. SDL’s Statistical machine translation delivers high quality translation services to thousands of customers. While SDL’s research organisation already explored a new approach to machine translation, the future development and deployment needed more flexibility in technology choice and dynamic scalability to be commercially successful. Over 10 months, SDL migrated their current workload deployment consisting of hundreds of servers, without customer downtime, to a private Cloud deployment. The migration included a project team of more than 35 staff in 5 time zones. Besides flexibility and scalability gains, the migration saves SDL more than 450k GBP over 4 years.
The teams worked long hours, overcoming many obstacles along the way. Congrats to all involved!
After 3 very dynamic years, I’m leaving SDL today. It has been a great journey and I enjoyed every minute of it. Anyone who has followed SDL in the last 9 months has seen a lot of changes announced; divestment of 3 business units, new CEO, new CTO,…
While I personally think these changes are good for the company and will bring focus and stability going forward, I also decided I wasn’t going to be part of that future anymore.
With this in mind, I shifted my focus in the last few months to helping find a good home for the divesting business units. It provided me with the option to slowly step away from my day-to-day responsibilities without disrupting them too much.
During the hand-over period, you automatically get confronted with what you are going to leave behind. <cue music> Don’t Know What You Got (Till It’s Gone) </cue music> and the saddest thing to leave behind are actually my teams & peers.
This week I will be celebrating my 15th year of active volunteer firefighter duty. As you naturally tend to do when celebrating milestones like these, I have been reflecting on the past years and what I learned.
What specifically stood out are the moments in my IT leadership career where I applied firefighter techniques and skills I picked up over the years.
Most of them revolve around problem solving and how to get the most out of teams. While there is an obvious link between firefighters and solving issues in a high pressure or crisis situation, I did learn the same tactics also apply to any challenge I was confronted with.
When firefighters arrive at the scene of a fire, they always follow the same protocol:
- Assess the situation
- Identify & control flow path
- Extinguish the fire
- Reset & evaluate
In business, and especially at higher leadership levels, some problems may seem very daunting, creating anxiety and leaving you with the feeling of being overwhelmed. Firefighters are used to stepping into highly unknown situations with confidence, and a protocol like the one above helps to gain control of the situation, step by step.
With AWS celebrating 10 years after the launch of Amazon S3 in March 2006 and Twitter also celebrating 10 years, I wanted to revisit my ‘cloud rules’ published on Twitter and on my Dutch blog in 2011. The original was written after 2 years working on an enterprise IT implementation of whatever was perceived ‘cloud’ in 2009, building a Gov version of Nebula (the OpenStack predecessor) and starting to utilize AWS & Google Apps in enterprise IT environments.
As the original rules were published on Twitter with its 140 character limit, they lacked some nuance and context, so I converted them into a blog post. The original 7 rules from 2011, with context:
Even though the debate on a definition of ‘what is cloud’ died down a bit, it does still surface now and then. Given the maturity of the solutions, the current market state and the speed of change in the market of ‘cloud’, I still stick to my opinion from 5+ years ago: a generic definition of cloud is not currently relevant.
The most commonly used definition seems to be the one NIST (pdf) published in 2011, which provides a very broad scope. Looking at IT market development the last few years and the potential of what is still to come, we are continually refining these definitions.
As ‘cloud’ products and services pushed to the current market will be common IT practice in a few years, we will slowly see the ‘cloud’ name being dropped as a result.
There is still a valid argument to have a common definition of ‘cloud’ and the delivery models within companies, to avoid miscommunication between IT and the business. The actual content of that internal definition can be whatever you want it to be, as long as there is a common understanding.
The definition debate in the general IT market will continue until the current hype phase has passed. As soon as we enter Gartner’s “Trough of Disillusionment” all marketing departments will want to move away from the ‘cloud’ term, and replace it with whatever the new hype is. We can already see this happening with the emergence of ‘DevOps’, ‘BigData’, ‘Internet of Things (IoT)’.
Just remember there is just one truth when it comes to ‘cloud’: “There is no cloud. It’s just someone else’s computer.”
Several of my quotes ended up in Dec 2015’s Database Marketing Magazine.
See the full version of the magazine here.
Or PDF of my quotes here.
According to AWS CTO Werner Vogels “Cloud is now the new normal.”
Where the first day keynote at AWS’s ReInvent 2015 conference was all about enabling companies to migrate their current services to the cloud, the second day keynote by Vogels was all about the ‘new normal’ – developer enablement.
With new services like AWS Snowball, AWS Database Migration Service and AWS Schema Conversion Tool, AWS tries to smooth the migration path from old on-premises infrastructure & application deployments to using AWS’s Infrastructure As A Service offering (EC2, RDS, VPC, S3, …).
While these new services help companies to move to a consumption model for compute, storage and networking, they are still very infrastructure focused. Design decisions around (virtual) network layout, load balancers and the build & management of the operating systems (Windows/Linux) are still the customer’s responsibility.
Needing to still deal with all these elements holds developers back from moving fast as they go from idea to the launch of a new service. It slows down the creation of real value for the company.
In the real ‘new normal’ world, the developer is enabled to deploy a new service by building & releasing something fast, without needing to worry about the infrastructure behind it. By stitching external managed capabilities/services together in a smart way the developer can move even faster.
Where in the past a developer would try to speed releases up by code-reuse with, for example, software libraries, the availability of developer-ready services like a fully managed message queuing service (AWS SQS) or a push messaging service (AWS SNS) has enabled developers to move even faster without worrying about the manageability of the solution.
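To make the idea of stitching managed services together a bit more concrete, here is a minimal sketch of handing work to a managed queue. It assumes the client object behaves like the SQS client from the boto3 library (its `send_message`, `receive_message` and `delete_message` calls). The helper names `enqueue_job` and `process_jobs`, the `InMemoryQueue` stand-in and the queue URL are hypothetical and only exist so the example runs without an AWS account.

```python
import json


def enqueue_job(sqs, queue_url, job):
    """Send one job as a JSON message and return the message id.

    `sqs` is expected to behave like boto3.client("sqs").
    """
    response = sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(job))
    return response["MessageId"]


def process_jobs(sqs, queue_url, handler):
    """Receive one batch, handle each message, delete it once handled.

    Returns the number of messages handled.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=0
    )
    messages = response.get("Messages", [])
    for message in messages:
        handler(json.loads(message["Body"]))
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return len(messages)


class InMemoryQueue:
    """Hypothetical local stand-in that mimics only the three calls used above."""

    def __init__(self):
        self._messages = {}
        self._next_id = 0

    def send_message(self, QueueUrl, MessageBody):
        self._next_id += 1
        message_id = str(self._next_id)
        self._messages[message_id] = MessageBody
        return {"MessageId": message_id}

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1, WaitTimeSeconds=0):
        batch = list(self._messages.items())[:MaxNumberOfMessages]
        if not batch:
            return {}  # like SQS, no "Messages" key when the queue is empty
        return {
            "Messages": [
                {"Body": body, "ReceiptHandle": message_id}
                for message_id, body in batch
            ]
        }

    def delete_message(self, QueueUrl, ReceiptHandle):
        self._messages.pop(ReceiptHandle, None)


if __name__ == "__main__":
    queue = InMemoryQueue()  # swap for boto3.client("sqs") against a real queue
    url = "https://example.invalid/my-queue"  # placeholder URL
    enqueue_job(queue, url, {"task": "send-welcome-mail", "user": 42})
    process_jobs(queue, url, print)
```

The producer and the worker only share the queue. Running, scaling and patching the messaging infrastructure is left to the provider, which is the kind of speed-up the keynote was about.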
Last year my friend Tim Crawford wrote an excellent article on why CIOs should get out of the datacenter business. Tim focused on how current big corporates are moving away from building, owning or renting datacenter facilities in favour of consuming IT at higher levels of the stack.
As he focused on the migration of leading big companies, it leaves the question: what about the future Fortune 500 companies?
Remember when the Cloud hype kicked off and we all looked mesmerised at the Cloud Unicorn companies (like Netflix) that got great benefit from Cloud usage? We all wanted that so badly. We wanted to get out of the pain of high maintenance cost and the lack of agility. Amazon, Google, Microsoft Azure all seemed to provide that. Just by the click of a button.
In 2012 I did a short whitepaper on what it takes to move an application to the cloud, based on my painful experience with some Enterprise IT moves to the cloud. I stated, “The idea that one can just move applications without change is flawed.” There was not enough benefit in moving a monolithic, 10-year-old application to the cloud. That type of move may deliver small cost savings, but that is actually a hosting exercise. It could even be dangerous to move in that way because the application may not be suitable for the cloud providers’ reference architecture. That could, for example, lead to availability and performance issues. The unicorn benefits could only be gained if you changed your way of working and thinking.
The same goes for the Docker hype now.
I agree with all the potential that Docker unlocks: portability & abstraction. It is a game changer and some even say ‘Docker changes everything’.
Hearing people talk at large conferences (like AWS ReInvent) about Docker seems like the first phase of the Cloud hype all over again. They state ‘Just docker-ize your app’ and all will be great. Surreal conversations with people that try to put anything and everything in a container. ‘Yes, just put that big monolithic app in a container.’
People seem to forget that Docker is an enabler for architecture elements like portability and micro-services (which lead to scalability).
I highly recommend reading James Lewis & Martin Fowler’s article on microservices first: http://martinfowler.com/articles/microservices.html
Then see this:
Because the problem with the Docker hype currently? It makes it about the tool. And the tool alone will not fix your problem.
Other things to consider around Docker: Docker Misconceptions