Part of the supposedly unbreakable Amazon cloud was down, and the world didn’t end.
What did happen was that a swarm of the best operations people in the world rapidly descended on the problem, diagnosed it, and fixed it. You can be sure the issue had top management attention, because Amazon’s brand, reputation, and business ride on its infrastructure.
With all due respect to your infrastructure and operations team, they are unlikely to have the manpower, specialization, and training that Amazon cloud engineering has. If the same issue had hit your own in-house data center, it would have taken you much longer to find and fix it.
It is drummed into every aspiring developer that duplicating code is bad and re-use is good. From the point of view of the organization hiring the developer, that is true. But for a developer under pressure to meet a deadline, it makes perfect sense to write the code again, even if the same functionality has been implemented before.
If you want to promote re-use across teams in your organization, you need to do three things:
Document all services with examples. For REST web services, you can use a tool like Swagger.
Implement the policy that old versions of services are not retired until nobody is calling them.
Enforce a policy of calling existing services instead of rewriting them.
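The retirement policy in the list above can be checked mechanically if you log who calls what. Here is a minimal sketch in Python, assuming you already collect call records somewhere. The record layout, the service names, and the 90-day window are all hypothetical choices for illustration, not a recommendation from any particular tool.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical call records: (service, version, calling team, timestamp).
CALL_LOG = [
    ("customer-lookup", "v1", "billing-team", datetime(2017, 1, 10)),
    ("customer-lookup", "v1", "billing-team", datetime(2017, 2, 1)),
    ("customer-lookup", "v2", "webshop-team", datetime(2017, 2, 3)),
    ("order-status", "v1", "webshop-team", datetime(2016, 6, 1)),
]


def recent_callers(call_log, service, version, now, window_days=90):
    """Count calls per caller to one service version inside the window."""
    cutoff = now - timedelta(days=window_days)
    return Counter(
        caller
        for (svc, ver, caller, timestamp) in call_log
        if svc == service and ver == version and timestamp >= cutoff
    )


def can_retire(call_log, service, version, now, window_days=90):
    """A version may be retired only if nobody called it in the window."""
    return not recent_callers(call_log, service, version, now, window_days)


if __name__ == "__main__":
    today = datetime(2017, 2, 15)
    for service, version in [("customer-lookup", "v1"), ("order-status", "v1")]:
        callers = recent_callers(CALL_LOG, service, version, today)
        verdict = "retire" if can_retire(CALL_LOG, service, version, today) else "keep"
        print(service, version, verdict, dict(callers))
```

The useful part is the list of remaining callers: it tells you exactly which teams to talk to before an old version can go away.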
This is an excerpt from the monthly Technology That Fits newsletter.
Moving your software to a cloud vendor has always been an act of faith. You believe the vendor will honor their promises, fulfil the SLA and stay in business.
That’s why many are choosing the big names like Amazon, Microsoft and Google.
Oracle wants to extend its brand into cloud computing as well, but it is not even on Gartner’s radar, and with its recent decision to double the cost of running Oracle on Amazon, it is not endearing itself to customers.
No matter which cloud vendor you choose, make sure that you establish an exit strategy in advance. You need to be able to keep your systems running even if your cloud vendor suddenly folds. That means that you need to establish a procedure to continually transfer data from your cloud to a third party (or back to yourself). Don’t get stuck in the cloud.
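What such a transfer procedure might look like depends entirely on your vendor, but the core loop is simple. The sketch below assumes that a vendor-specific export job (not shown) has already dropped your data as files into a local directory. It then copies anything new or changed to a second location and verifies each copy by checksum. The function names and directory layout are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_export(export_dir: Path, exit_dir: Path) -> List[str]:
    """Copy new or changed files from export_dir to exit_dir.

    Returns the relative paths that were copied. Raises IOError if a
    copy does not match its source after writing.
    """
    copied = []
    for source in sorted(export_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(export_dir)
        target = exit_dir / relative
        if target.exists() and sha256_of(target) == sha256_of(source):
            continue  # already transferred and unchanged
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        if sha256_of(target) != sha256_of(source):
            raise IOError(f"verification failed for {relative}")
        copied.append(relative.as_posix())
    return copied
```

Run something like this on a schedule, and test a restore from the copy now and then. A copy you have never restored from is only a hope, not an exit strategy.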
In my popular “Everything that’s wrong with IT” presentation, I use various technical gadgets as examples of the traps we tend to fall into when developing IT.
My favorite example of too much technology for technology’s sake has been my internet-connected socks. Unfortunately, these RFID-equipped wonder socks were discontinued after I started making fun of them. But I think I’ve just found a new favorite: a Bluetooth-equipped hairbrush.
This brush is so advanced that it can’t even be called a brush – it is a “hair coach.”
On a recent site visit, I went to the printer room to dispose securely of a draft of my confidential report. As expected, there was a container for confidential papers. As expected, it was locked. Unfortunately, the lock was only put through the bracket on the lid, not the container itself.
If I wanted to, I could have rummaged through all the departments’ confidential papers.
Much security is like this: locked, but not secure. The organization suffers all the inconvenience of strict security in a few spots, while overall security is still lacking.
The only way to build a secure IT infrastructure is to have someone regularly verify the security, including everything from the padlocks to the installation of vendor patches. This can be an internal compliance team or an external service – as long as the verification is not done by the people responsible for implementation.
For as long as we’ve had computers, we have staged competitions between humans and machines. In chess, world champion Garry Kasparov beat the specialized chess computer Deep Blue in 1996, only to lose to an improved version in 1997.
You want to be part of the solution, not part of the problem. If you have the responsibility for computers, websites or IoT systems, make sure you have hardened them appropriately.
Side note: When I checked this site, I realized that my anti-spam protection worked, but I had neglected to restrict new user registration. I had 15,777 registered users (!) and had to install a bulk delete plug-in to get rid of them. So if you’ve commented on my posts in the past, I regret to inform you that you’ll have to re-register to comment again (now with Google reCAPTCHA).
After some persuasion, one of my customers was ready to experiment with the Oracle cloud. So I signed him up for a trial Database Schema Cloud service and built him a little APEX application to show how fast and easy it was to get rid of some spreadsheet-based business processes.
This morning, my customer called me to say that the service didn’t work. Indeed it didn’t. I had neglected to put the expiry date into my calendar, and when your 30 days are up, Oracle will wipe out your instance. There is no warning email and your instance is gone without any possibility of restoring it.
So the demo was gone, and with it that potential Cloud customer.
Having used Amazon Web Services, Microsoft Azure and Oracle Public Cloud for quite some time, I have to say that Oracle Public Cloud lags far behind the other two in user experience.
I fully concur with that opinion. Additionally, when your process for trials is to wipe them out without warning, you are making it really hard for even your most enthusiastic supporters to recommend you.
Oracle still has a lot of work to do on their cloud services.
I’ve just started my Private Pilot’s License project, and the first order of business was to get a Class 2 medical. Being a triathlete and considering myself fairly healthy, I expected that to be a formality. To my surprise, the examiner detected that my blood pressure was too high, and I’ll have to work on getting it down before I can fly solo.
Similarly, I’m sure that Delta Air Lines considered their data center fairly healthy. Unfortunately, they did not test. So when the power failed, they discovered that 300 out of 7,000 devices were not properly connected to backup power. And 2,000 planes were grounded.
IT suffers from Ostrich Syndrome: The belief that if you put your head in the sand and refuse to face facts, nothing bad will happen. Real ostriches don’t do this, of course – that would soon make them extinct. But IT does.
Finding the right amount to spend on all elements of IT (security, testing, fault tolerance, etc.) requires proper risk analysis. This is taught in Project Management 101, but recent events show that not everybody in IT understands this.
For example, the Democratic National Committee apparently thought that nobody would bother to attack their systems. After all, it just contained boring political emails, right? Wrong.
Last month, it was Southwest Airlines that cancelled 2,000 flights, supposedly because a router went down. Talk about a single point of failure…
Network segmentation, security patching, high availability, and disaster recovery all cost money. But being hacked or down also costs money. Did DNC, Delta and Southwest make the right call? I don’t think so. Maybe it’s time you looked at your risk analysis. Because you do have one, don’t you?
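If you don’t have one, the arithmetic is not hard. A minimal sketch in Python: for each risk, multiply the yearly probability by the cost of the incident to get the expected yearly loss, then see how much of that loss a mitigation removes and what the mitigation itself costs. Every number below is invented for illustration; these are not figures from Delta, Southwest, or anybody else.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    yearly_probability: float     # chance of the incident in a given year
    impact: float                 # cost if the incident happens
    mitigation_cost: float        # yearly cost of the countermeasure
    mitigated_probability: float  # chance per year with the countermeasure

    def expected_loss(self) -> float:
        """Expected yearly loss without mitigation."""
        return self.yearly_probability * self.impact

    def mitigated_loss(self) -> float:
        """Expected yearly loss with mitigation in place."""
        return self.mitigated_probability * self.impact

    def net_benefit(self) -> float:
        """Loss avoided per year minus what the mitigation costs."""
        return self.expected_loss() - self.mitigated_loss() - self.mitigation_cost


# Hypothetical figures, for illustration only.
RISKS = [
    Risk("data center power loss", 0.05, 100_000_000, 2_000_000, 0.01),
    Risk("core router failure", 0.02, 50_000_000, 1_500_000, 0.001),
]

if __name__ == "__main__":
    for risk in sorted(RISKS, key=Risk.net_benefit, reverse=True):
        print(f"{risk.name}: net benefit per year {risk.net_benefit():,.0f}")
```

With these made-up numbers, the first mitigation pays for itself and the second does not. The point is not the numbers but that somebody wrote them down and made the call on purpose.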
My kitchen has a very nice range hood over the cooktop. It has a powerful fan and beautiful brushed steel finish. And it has a user experience like most IT systems: Lousy.
Let’s think about what a range hood does. It has two main functions:
Start the fan to extract grease and fumes
Turn on the light over the cooktop
Because of the shape of a range hood, the buttons to operate it are typically placed in a row. A row of buttons has two good, easily found positions:
To the far left
To the far right
Two primary functions, two good button locations. It would not take five minutes of thought to allocate functions to buttons. Unfortunately, the engineers at ATAG did not spend those five minutes. Instead, they placed the button for the light 5th from left, 3rd from right. And what did they use the good right-hand position for? The rarely-used feature of resetting the filter cleaning warning. A button I press every three months at most.
Most IT projects do not spend these five minutes of thought either. Large, professional organizations have a team of UX professionals, like the people I work with at Oracle. But even if you don’t have professional UX designers, every developer can spend five minutes thinking about the task the user wants to achieve.
Most IT systems are like my range hood: Just inconvenient enough to make users slightly annoyed every time they have to concentrate on an operation that should have been easy and obvious.
Next time you build a system, spend a little while thinking about your users before you code. They’ll love you for it.