JanWiersma.com

The problem with the Docker hype

hyundai-elantra-flattened-3Remember when the Cloud hype kicked off and we all looked mesmerised at the Cloud Unicorn companies (like Netflix) that got great benefit from Cloud usage? We all wanted that so badly. We wanted to get out of the pain of high maintenance cost and the lack of agility. Amazon, Google, Microsoft Azure all seemed to provide that. Just by the click of a button.

In 2012 I did a short whitepaper on what it takes to move an application the cloud, based on my painful experience with some Enterprise IT moves to the cloud.  I stated, “The idea that one can just move applications without change is flawed.” There was not enough benefit in moving the monolithic, 10 year old, application on the cloud. That type of move may deliver small cost savings, but that is actually a hosting exercise. It could even be dangerous to move in that way because the application may not be suitable for the cloud providers’ reference architecture. That could for example lead to availability and performance issues. The unicorn benefits could only be gained if you changed your way of working and thinking.

The same goes for the Docker hype now;

I agree with all the potential that Docker unlocks; portability & abstraction. It is a game changer and some even say ‘Docker changes everything’

Hearing people talk at large conferences (like AWS ReInvent) about Docker seems like the first phase of the Cloud hype all over again. They state ‘Just docker-ize your app’ and all will be great. Sureal conversations with people that try to put anything and everything in a container. ‘Yes, just put that big monolithic app in a container’

People seem to forget that Docker is an enabler for architecture elements like portability and micro-services (that leads to scalability).

I highly recommend reading James Lewis & Martin Fowler ‘s article on microservices first: http://martinfowler.com/articles/microservices.html

Then See this:

Because The problem with the Docker hype currently? It makes it about the tool. And only the tool will not fix your problem.

Other things to consider around Docker:  Docker Misconceptions

Share

Our Trust issue with Cloud Computing

Recently SDL (my employer) did a survey on customer ‘trust’ for the marketer.B0AgpWXIAAA5mZm

Being in the IT space I tend to deal a lot with ‘trust’ the last few years. Being responsible for the Cloud services delivery for my companies SAAS & hosted products we deal with clients evaluating and buying our services. My teams also evaluate & consume IAAS/PAAS/SAAS services in the market, on which we build our services.

The ‘trust’ issue in consuming Cloud services is an interesting one. IAAS platforms like Amazon abstract complexity away from its user. It is easy to consume. The same goes for SAAS services like Box.com of Gmail; the user has no clue what happens behind the scenes. Most business users don’t care about the abstraction of that complexity. It just works….

It’s the IT people that seems to have the biggest issue with gracefully losing control and surrendering data, applications, etc., to someone else. Control is often an emotional issue we are often unprepared to deal with. It leaves us with a feeling ‘they can’t take care of it as good as I can…’ Specifically IT people know how complex IT can be, and how hard it can be to deliver the guarantees that the business is looking for. For many years we have tried to manage the rising complexity of IT within the business with tools and processes, never completely able to satisfy the business as we where either to expensive or not hitting our SLA’s. Continue with reading

Share

German companies ask for Internet border patrol.

border_0_0In the last year multiple companies started serving German customers out of Germany based datacenter locations.

There seems to be a specifically strong sentiment around security & privacy with German companies after the Edward Sonwden leaks. The kneejerk reaction is to mandate that servers should sit within German borders, as that would take any security & privacy concern away. Cloud providers are now starting to follow this customer demand.

Interestingly this reaction is more sentiment driven as there is no legal ground to request this. Especially as more and more German companies are putting this in place as a default policy, regardless of what type of data (privacy sensitive or not…)

Looking at the Federal Data Protection Act (Bundesdatenschutzgesetz in German) (“BDSG”) it states that certain transfer of data (like personal data) outside of the EU needs to be reported and approved and Data controllers must take appropriate technical and organizational measures against unauthorized or unlawful processing and against accidental loss or destruction of, or damage to, personal data. Nothing says servers need to be in Germany.

Looking at other EU countries, Germany seems to be the only country where organizations express such behavior. The only next inline could be Switzerland.

Continue with reading

Share

My move to SDL…

It’s the weekend before the holiday season and just like last year I find my self at an US airport making my way home… just in time for Christmas.

Sitting at the airport lobby, listing Christmas songs, I can’t help to reflect on the past year.

A lot has happened and things changed a lot. I have left OCOM (LeaseWeb/EvoSwitch/Dataxenter) after 2 years in September this year. Something that some of my peers in the market didn’t expect, but was long overdue. For too long I couldn’t identify myself with the way the company was run and its strategy. No good or bad here… just a big difference of opinion on vision and execution.

The last 2 months I have been able to talk to lots of different organizations in an effort to see what my next career step should be. I needed some time to recuperate from my little USA adventure with LeaseWeb & EvoSwitch. It was a great project to participate in but all the travel took a big toll on my personal life and me.

I also learned how passion for your work can be killed and what it takes to be sparked again. And how people are motivated by the Why in their jobs.

I had some good conversations with DCP board members Mark and Tim about my frustration in the lack of progress in the datacenter industry. It felt like I have been doing the same datacenter and cloud talks for the last 3 years, and things still didn’t move. I wanted to have the opportunity to really make a difference.

Continue with reading

Share

EE-IAAS – call to join IAAS energy research

seflab

The cloud market is hot. The IAAS market sees a lot of growth and new IAAS providers seem to entering the market on a daily basis. Enterprise IT shops are exploring the usage of public cloud solutions and building their own private cloud environments.

IAAS distributions like OpenStack and CloudStack seem to have thriving communities.

Organizations that are successful in deploying IAAS solutions, either for customers or internal use, see rapid growth in demand for these services.

IAAS services still need IT equipment to run on and datacenters to be housed in.  While there has been a strong focus on making the datacenter facility and IT equipment more energy efficient, not much is know about the energy efficiency of software running these IAAS services.

The Amsterdam University of Applied Science (HvA) lanced an energy consumption lab focused on energy efficient software called SefLab 1,5 years ago.

Several EU Datacenter Pulse (DCP) members donated time and equipment to the research during the launch.

During the start of the SefLab students and researchers got familiar with the research facility and its equipment. They did a webbrowser compare on energy usage. (results on the SefLab website)

Now that the lab is operational, its looking for new area’s of research. One of the focus areas is the development of IAAS distributions like CloudStack, OpenStack, etc.. and their energy efficiency; what are the effects of architecture choices?, is their a difference in energy efficiency between the distributions ?, what power-manager & reporting elements are missing from the distributions ?

The HvA and SefLab are looking for:

  • IAAS distribution users willing to participate in the research. This is possible in several ways ranging from active participation in formulating research questions, managing a subproject, and supervising student projects, to participating in workshops and events of the project.
  • IAAS distribution communities willing to participate in the research. Future development of IAAS energy management components can be tested and validated in the SefLab facilities.
  • Vendors willing to support the effort with knowledge transfer and hardware/software donation.
  • Industry groups willing to support the effort in knowledge transfer network.

The research done by SefLab is really end-user driven, so this is a natural fit with DCP’s strategy.
Details can be obtained by contacting jan.wiersma@datacenterpulse.org

DCP members will be updated on the projects progress using the LinkedIn DCP member group.

See full call for participation here.

Share

The need for datacenter education in emerging countries.

<Cross-post with my DCP blog>

I recently had the privilege to travel to some of the emerging datacenter countries like Eastern Europe, Russia, Middle East, Africa and some of the Asian countries.

While I was impressed with the deployment of mobile bandwidth & devices the conversations with datacenter owners and operators scared me. With more and more people getting internet access in these countries, the need for local datacenters also rises. Taking the time to really talk and listen to the stories of some of the local datacenter owners, it reminded me of the way datacenters were build and operated 10 years ago in most western countries.

Zooming in to the way these local operators and owners gather their knowledge, two things stuck me:

  • Most of them aren’t connected to industry groups or communities like Datacenter Pulse, 7×24 Exchange, Uptime Institute, etc… Sometimes this is a cultural or language barrier but most of the time I found this to be an awareness issue.
  • There aren’t as many industry conferences in those locations. It’s hard for conference organizers to get something going from a sponsor & revenue perspective. So where do local operators go to learn?

I think the larger datacenter vendors need to step up and take responsibility.
I think western datacenter owners and operators have a moral responsibility to go out and educate.

We need to send our Dr-Bob’s out and educate them on the benefits of retrofits of old facilities.
We need to send the industry leaders (either vendor or DC-owners) out and educate the emerging countries on the technical opportunities they have for the many greenfield projects begin run in those countries.

This way they know what technology is available before some of the old-fashioned datacenter consultants step in and copy-past our 10-year-old datacenter designs.
This way we protect them from mistakes we already made and they can leapfrog in to a brighter datacenter future.

I know this sounds idealistic be we owe that to these emerging economies, especially from a resource (like energy) efficiency perspective.

So if you have the chance to travel to emerging datacenter countries, take it. Share, lecture, educate, and try to protect the local operators and owners from making the same mistakes we made 10 years ago. They deserve that chance.

Believe me… it will make the world a better place.

Share

Where is the rack density trend going ?…

<English cross post with my DCP blog>

When debating capacity management in the datacenter the amount of watts consumed per rack is always a hot topic.

Years ago we could get away with building datacenters that supported 1 or 2 kW per rack in cooling and energy supply. The last few years demand for racks around 5-7kW seems the standard. Five years ago I witnessed the introduction of blade servers first hand. This generated much debate in the datacenter industry with some claiming we would all have 20+ kW racks in 5 years. This never happened… well at least not on a massive scale…

So what is the trend in energy consumption on a rack basis ?

Readers of my Dutch datacenter blog know I have been watching and trending energy development in the server and storage industry for a long time. To update my trend analysis I wanted to start with a consumption trend for the last 10 years. I could use the hardware spec’s found on the internet for servers but most published energy consumption values are ‘name-plate ratings’. Green Grid’s whitepaper #23 states correctly:

Regardless of how they are chosen, nameplate values are generally accepted as representing power consumption levels that exceed actual power consumption of equipment under normal usage. Therefore, these over-inflated values do not support realistic power prediction

I have known HP’s Proliant portfolio for a long time and successfully used their HP Power Calculator tools (now: HP Power Advisor). They display the nameplate values as well as power used at different utilizations and I know from experience these values are pretty accurate. So; that seems as good starting point as any…

I decide to go for 3 form factors:

..minimalist server designs that resemble blades in that they have skinny form factors but they take out all the extra stuff that hyperscale Web companies like Google and Amazon don’t want in their infrastructure machines because they have resiliency and scale built into their software stack and have redundant hardware and data throughout their clusters….These density-optimized machines usually put four server nodes in a 2U rack chassis or sometimes up to a dozen nodes in a 4U chassis and have processors, memory, a few disks, and some network ports and nothing else per node.[They may include low-power microprocessors]

For the 1U server I selected the HP DL360. A well know mainstream ‘pizzabox’ server. For the blade servers I selected the HP BL20p (p-class) and HP BL460c (c-class). The Density Optimized Sever could only be the recently introduced (5U) HP Moonshot.

For the server configurations guidelines:

  • Single power supply (no redundancy) and platinum rated when available.
  • No additional NICs or other modules.
  • Always selecting the power optimized CPU and memory options when available.
  • Always selecting the smallest disk. SSD when available.
  • Blade servers enclosures
    • Pass-through devices, no active SAN/LAN switches in the enclosures
    • No redundancy and onboard management devices.
    • C7000 for c-class servers
    • Converted the blade chassis power consumption, fully loaded with the calculated server, back to power per 1U.
  • Used the ‘released’ date of the server type found in the Quickspec documentation.
  • Collected data of server utilization at 100%, 80%, 50%. All converted to the usage at 1U for trend analysis.

This resulted in the following table:

Server type

Year

CPU Core count

CPU type

RAM (GB)

HD (GB)

100% Util (Watt for 1U)

80% Util (Watt for 1U)

50% Util (Watt for 1U)

HP BL20p

2002

1

2x Intel PIII

4

2x 36

328.00

   

HP DL360

2003

1

2x Intel PII

4

2x 18

176.00

   

HP DL360G3

2004

1

2x Intel Xeon 2,8Ghz

8

2x 36

360.00

   

HP BL20pG4

2006

1

2x Intel Xeon 5110

8

2x 36

400.00

&n
bsp;
 

HP BL460c G1

2006

4

2x Intel L5320

8

2x 36

397.60

368.80

325.90

HP DL360G5

2008

2

2x Intel L5240

8

2x 36

238.00

226.00

208.00

HP BL460c G5

2009

4

2x Intel L5430

8

2x 36

368.40

334.40

283.80

HP DL360G7

2011

4

2x Intel L5630

8

2x 60 SSD

157.00

145.00

128.00

HP BL460c G7

2011

6

2x Intel L5640

8

2x 120 SSD

354.40

323.90

278.40

HP BL460c Gen8

2012

6

2x Intel 2630L

8

2x 100 SSD

271.20

239.10

190.60

HP DL360e Gen8*

2013

6

2x Intel 2430L

8

2x 100 SSD

170.00

146.00

113.00

HP DL360p Gen8*

2013

6

2x Intel 2630L

8

2x 100 SSD

252.00

212.00

153.00

HP Moonshot

2013

2

Intel Atom S1260

8

1x 500

177.20

172.40

165.20

* HP split the DL360 in to a stripped down version (the ‘e’) and an extended version (the ‘p’)

And a nice graph (click for larger one):

power draw server trend

The graph shows an interesting peak around 2004-2006. After that the power consumption declined. This is mostly due to power optimized CPU and memory modules. The introduction of Solid State Disks (SSD) is also a big contributor.

Obviously people will argue that:

  • the performance for most systems is task specific
  • and blades provide higher density (more CPU cores) per rack,
  • and some systems provide more performance and maybe more performance/Watt,
  • etc…

Well; datacenter facility guys couldn’t care less about those arguments. For them it’s about the power per 1U or the power per rack and its trend.

With a depreciation time of 10-15years on facility equipment, the datacenter needs to support many IT refresh cycles. IT guys getting faster CPU’s, memory and bigger disks is really nice and it’s even better if the performance/watt ratio is great… but if the overall rack density goes up, than facilities needs to supp
ort it.

To provide more perspective on the density of the CPU/rack, I plotted the amount of CPU cores at a 40U filled rack vs. total power at 40U:

power draw rack trend

Still impressive numbers: between 240 and 720 CPU cores in 40U of modern day equipment.

Next I wanted to test my hypotheses, so I looked at a active 10.000+ server deployment consisting of 1-10 year old servers from Dell/IBM/HP/SuperMicro. I ranked them in age groups 2003-2013, sorted the form factors 1U Rackmount, Blades and Density Optimized. I selected systems with roughly the same hardware config (2 CPU, 2 HD, 8GB RAM). For most age groups the actual power consumption (@ 100,80,50%) seemed off by 10%-15% but the trend remained the same, especially among form factors.

It also confirmed that after the drop, due to energy optimized components and SSD, the power consumption per U is now rising slightly again.

Density in general seemed to rise with lots more CPU cores per rack, but at a higher power consumption cost on a per rack basis.

Let’s take out the Cristal ball

The price of compute & storage continues to drop, especially if you look at Amazon and Google.

Google and Microsoft have consistently been dropping prices over the past several months. In November, Google dropped storage prices by 20 percent.

For AWS, the price drops are consistent with its strategy. AWS believes it can use its scale, purchasing power and deeper efficiencies in the management of its infrastructure to continue dropping prices. [Techcrunch]

If you follow Jevons Paradox then this will lead to more compute and storage consumption.

All this compute and storage capacity still needs to be provisioned in datacenters around the world. The last time IT experienced growth pain at the intersection between IT & Facility it accelerated the development of blade servers to optimize physical space used. (that was a bad cure for some… but besides the point now..) The current rapid growth accelerated the development of Density Optimized servers that strike a better balance between performance, physical space and energy usage. All major vendors and projects like Open Compute are working on this with a 66.4% year over year in 4Q12 growth in revenue.

Blades continue to get more market share also and they now account for 16.3% of total server revenue;

"Both types of modular form factors outperformed the overall server market, indicating customers are increasingly favoring specialization in their server designs" said Jed Scaramella, IDC research manager, Enterprise Servers "Density Optimized servers were positively impacted by the growth of service providers in the market. In addition to HPC, Cloud and IT service providers favor the highly efficient and scalable design of Density Optimized servers. Blade servers are being leveraged in enterprises’ virtualized and private cloud environments. IDC is observing an increased interest from the market for converged systems, which use blades as the building block. Enterprise IT organizations are viewing converged systems as a method to simplify management and increase their time to value." [IDC]

With cloud providers going for Density Optimized and enterprise IT for blade servers, the market is clearly moving to optimizing rack space. We will see a steady rise in demand for kW/rack with Density Optimized already at 8-10kW/rack and blades 12-16kW/rack (@ 46U).

There will still be room in the market for the ‘normal’ rackmount server like the 1U, but the 2012 and 2013 models already show signs of a rise in watt/U for those systems also.

For the datacenter owner this will mean either supply more cooling&power to meet demand or leave racks (half) empty, if you haven’t build for these consumption values already.

In the long run we will follow the Gartner curve from 2007:

image

With the market currently being in the ‘drop’ phase (a little behind on the prediction…) and moving towards the ‘increase’ phase.

More:

Density Optimized servers (aka microservers) market is booming

IDC starts tracking hyperscale server market

Documentation and disclaimer on the HP Power Advisor

Share

Google’s BMS got hacked. Is your datacenter BMS next ?

<English cross post with my DCP blog>

A recent USA Congressional survey stated that power companies are targeted by cyber attacks 10.000x per month.hacked-scada

After the 2010 discovery of the Stuxnet virus the North American Electric Reliability Corporation (NERC) established both mandatory standards and voluntary measures to protect against such cyber attacks, but most utility providers haven’t implemented NERC’s voluntary recommendations.

Stuxnet hit the (IT) newspaper front-pages around September 2010, when Symantec announced the discovery. It represented one of the most advanced and sophisticated viruses ever found. One that targeted specific PLC devices in nuclear facilities in Iran:

Stuxnet is a threat that was primarily written to target an industrial control system or set of similar systems. Industrial control systems are used in gas pipelines and power plants. Its final goal is to reprogram industrial control systems (ICS) by modifying code on programmable logic controllers (PLCs) to make them work in a manner the attacker intended and to hide those changes from the operator of the equipment.

DatacenterKnowledge picked up on it in 2011, asking ‘is your datacenter ready for stuxnet?’

After this article the datacenter industry didn’t seem to worry much about the subject. Most of us deemed the chance of being hacked with a highly sophisticated virus ,attacking our specific PLC’s or facility controls, very low.

Recently security company Cylance published the results of a successful hack attempt on a BMS system located at a Google office building. This successful hack attempt shows a far greater threat for our datacenter control systems.

 

The road towards TCP/IP

The last few years the world of BMS & SCADA systems radically changed. The old (legacy) systems consisted of vendor specific protocols, specific hardware and separate networks. Modern day SCADA networks consist of normal PC’s and servers that communicate through IT standard protocols like IP, and share networks with normal IT services.

IT standards have also invaded facility equipment: The modern day UPS and CRAC is by default equipped with an onboard webserver able of send warning using an other IT standard: SNMP.

The move towards IT standards and TCP/IP networks has provided us with many advantages:

  • Convenience: you are now able to manage your facility systems with your iPad or just a web browser. You can even enable remote access using Internet for your maintenance provider. Just connect the system to your Internet service provider, network or Wi-Fi and you are all set. You don’t even need to have the IT guys involved…
  • Optimize: you are now able to do cross-system data collection so you can monitor and optimize your systems. Preferably in an integrated way so you can have a birds-eye view of the status of your complete datacenter and automate the interaction between systems.

Many of us end-users have pushed the facility equipment vendors towards this IT enabled world and this has blurred the boundary between IT networks and BMS/SCADA networks.

In the past the complexity of protocols like Bacnet and Modbus, that tie everything together, scared most hackers away. We all relied on ‘security through obscurity’ , but modern SCADA networks no longer provide this (false) sense of security.

Moving towards modern SCADA.

The transition towards modern SCADA networks and systems is approached in many different ways. Some vendors implemented embedded Linux systems on facility equipment. Others consolidate and connected legacy systems & networks on standard Windows or Linux servers acting as gateways.

This transition has not been easy for most BMS and SCADA vendors. A quick round among my datacenter peers provides the following stories:

  • BMS vendors installing old OS’s (Windos/Linux) versions because the BMS application doesn’t support the updated ones.
  • BMS vendors advising against OS updates (security, bug fix or end-of-support) because it will break their BMS application.
  • BMS vendors unable to provide details on what ports to enable on firewalls; ‘ just open all ports and it will work’.
  • Facility equipment vendors without software update policies.
  • Facility equipment vendors without bug fix deployment mechanisms; having to update dozens of facility systems manually.

And these stories all apply to modern day, currently used, BMS&SCADA systems.

Vulnerability patching.

Older versions of the SNMP protocol have known several vulnerabilities that affected almost every platform, included Windows/Linux/Unix/VMS, that supported the SNMP implementation.

It’s not uncommon to find these old SNMP implementations still operational in facility equipment. With the lack of software update policies, that also include the underlying (embedded) OS, new security vulnerabilities will also be neglected by most vendors.

The OS implementation from most BMS vendors also isn’t hardened against cyber attacks. Default ports are left open, default accounts are still enabled.

This is all great news for most hackers. It’s much easer for them to attack a standard OS like a Windows or Linux server. There are lots of tools available to make the life of the hacker easer and he doesn’t have to learn complex protocols like Modbus or Bacnet. This is by far the best attack surface in modern day facility system environments.

The introduction of DCIM software will move us even more from the legacy SCADA towards an integrated & IT enabled datacenter facility world. You will definitely want to have your ‘birds-eye DCIM view’ of your datacenter anywhere you go, so it will need to be accessible and connected. All DCIM solutions run on mainstream OS’s, and most of them come with IT industry standard databases. Those configurations provide an other excellent attack surface, if not managed properly.

ISO 27001

Some might say: ‘I’m fully covered because I got an ISO 27001 certificate’.

The scope of ISO27001 audit and certificate is set by the organization pursuing the certification. For most datacenter facilities the scope is limited to the physical security (like access control, CCTV) and its processes and procedures. IT systems and IT security measures are excluded because those are part of the IT domain and not facilities. So don’t assume that BMS and SCADA systems are included in most ISO 27001 certified datacenter installations.

Natural evolution

Most of the security and management issues are a normal part of the transition in to a larger scale, connected IT world for facility systems.

The same lack of awareness on security, patching, managing and hardening of systems has been seen by the IT industry 10-15 year ago. The move from a central mainframe world to decentralized servers and networks, combined with the introduction of the Internet has forced IT administrators to focus on managing the security of their systems.

In the past I have heard Facility departments complain that IT guys should involve them more because IT didn’t understand power and cooling. With the introduction of a more software enabled datacenter the Facility guys now need to do the same and get IT more involved; they have dealt with all of this before…

Examples of what to do:

  • Separate your systems and divide the network. Your facility system should not share its network with other (office) IT services. The separate networks can be connected using firewalls or other gateways to enable information exchange.
  • Assess your real needs: not everything needs to be connected to the Internet. If facility systems can’t be hardened by the vendor or your own IT department, then don’t connect them to the Internet. Use firewalls and Intrusion Detection Systems (IDS) to secure your system if you do connect them to the Internet.
  • Involve your IT security staff. Have facilities and IT work together on implementing and maintaining your BMS/SCADA/DCIM systems.
  • Create awareness by urging your facility equipment vendor or DCIM vendor to provide a software update & security policy.
  • Include the facility-systems in the ISO 27001 scope for policies and certification.
  • Make arrangements with your BMS and/or DCIM vendor about management of the underlying OS and its management. Preferably this is handled by your internal IT guys who already should know everything about patching IT systems and hardening them. If the vendor provides you with an appliance, then the vendor needs to manage the patching process and hardening of the system.

If you would like to talk about the future of securing datacenter BMS/SCADA/DCIM systems than join me at Observe Hack Make (OHM) 2013. IOHM is a five-day outdoor international camping festival for hackers and makers, and those with an inquisitive mind. Starts July 31st 2013.

Note:
There are really good whitepapers on IDS systems (and firewalls) for securing Modbus and Bacnet protocols, if you do need to connect those networks to the internet. Example: Snort IDS for SCADA (pdf) or  books about SCADA & security at Amazon.

Source:
A large part of this blog is based on a Dutch article on BMS/SCADA security January 2012 by Jan Wiersma & Jeroen Aijtink (CISSP). The Dutch IT Security Association (PViB) nominated this article for ‘best security article of 2012’.

Share

DCIM–Its not about the tool; its about the implementation

 

<English cross post with my DCP blog>

Failure Success Road Sign

So you just finished your extensive purchase process and now the DCIM DVD is on your desk.

Guess what; the real work just started…

The DCIM solution you bought is just a tool, implementing it will require change in your organization. Some of the change will be small; for example no longer having to manually put data in an Excel file but have it automated in the DCIM tool. Some of the change will be bigger like defining and implementing new processes and procedures in the organization. A good implementation will impact the way everyone works in your datacenter organization. The positive outcome of that impact is largely determined by the way you handle the implementation phase.

These are some of the most important parts you should consider during the implementation period:

Implementation team.

The implementation team should consist of at least:

  • A project leader from the DCIM vendor (or partner).
  • An internal project leader.
  • DCIM experts from the DCIM vendor.
  • Internal senior users.

(Some can be combined roles)

Some of the DCIM vendors will offer to guide the implementation process themselves others use third party partners.

During your purchase process its important to have the DCIM vendor explain in detail how they will handle the full implementation process. Who will be responsible for what part? What do they expect from your organization? How much time do they expect from your team? Do they have any reference projects (same size & complexity?)?

The DCIM vendor (or its implementation partner) will make implementation decisions during the project that will influence the way you work. These decisions will give you either great ease of working with the tool or day-to-day frustration. Its important that they understand your business and way of working. Not having any datacenter experience at all will not benefit your implementation process, so make sure they supply you with people that know datacenter basics and understand your (technical) language.

The internal senior users should be people that understand the basic datacenter parts (from a technical perspective) and really know the current internal processes. Ideal candidates are senior technicians, your quality manager, backoffice sales people (if you’re a co-lo) and site/operations managers.

The internal senior users also play an important role in the internal change process. They should be enthusiast early adapters who really want to lead the change and believe in the solution.

Training.

After you kicked off your implementation team, you should schedule training for your senior users and early adaptors first. Have them trained by the DCIM vendor. This can be done on dummy (fictive) data. This way your senior users can start thinking about the way the DCIM software should be used within your own organization. Include some Q&A and ‘play’ time at the end of the training. Having a sandbox installation of the DCIM software available for your senior users after the training also helps them to get more familiar with the tool and test some of their process ideas.

After you have done the loading of your actual data and you made your process decisions surrounding the DCIM tool, you can start training all your users.

Some of the DCIM vendors will tell you that training is not needed because their tool is so very user friendly. The software maybe user friendly but your users should still need to be trained on the specific usage of the tool within your own organization.

Have the DCIM vendor trainer team up with your senior users in the actual training. This way you can make the training specific for your implementation and have the senior users at hand to answer any organization specific questions.

The training of general users is an important part of the change and process implementation in your organization.

Take any feedback during the general training seriously. Provide the users with a sandbox installation of the software so they can try things without breaking your production installation and data. This will give you broad support for the new way of working.

Data import and migration.

Based on the advise in the first article , you will already have identified your current data sources.

During the implementation process the current data will need to be imported in to the DCIM data structure or integrated.

Before you import you will need to assess your data; are all the Excel, Visio and AutoCAD drawings accurate? Garbage import means garbage output in the DCIM tool.

Intelligent import procedures can help to clean your current data; connecting data sources and cross referencing them will show you the mismatches. For example: adding (DCIM) intelligence to importing multiple Excel sheets with fiber cables and then generating a report with fiber ports that have more than 1 cable connected to them (which would be impossible i.r.l ).

Your DCIM vendor or its partner should be able to facilitate the import. Make sure you cover this in the procurement negotiations; what kind of data formats can they import? Should you supply the data in a specific format?

This also brings us back to the basic datacenter knowledge of the DCIM vendor/partner. I have seen people import Excel lists of fiber cable and connect them to PDU’s… The DCIM vendor/partner should provide you a worry free import experience.

Create phases in the data import and have your (already trained) senior users preform acceptance tests. They know your datacenter layout and can check for inconsistencies.

Prepare to be flexible during the import; not everything can be modeled the way you want it in the software.

For example when I bought my first DCIM tool in 2006 they couldn’t model blade servers yet and we needed a work around for it. Make sure the workarounds are known and supported by the DCIM vendor; you don’t want to create all your workaround assets again when the software update finally arrives that supports the correct models. The DCIM vendor should be able to migrate this for you.

Integration.

The first article did a drill down of the importance of integration. Make sure your DCIM vendor can accommodate your integration wishes.

Integration can be very complex and mess-up your data (or worse) if not done correctly. Test extensively, on non-production data, before you go live with integration connections.

The integration part of the implementation process is very suitable for a phased approach. You don’t need all the integrations on day one of the implementation.

Involve IT Information architects if you have them within your company and make sure external vendors of the affected systems are connected to the project.

Roadmap and influence.

Ask for a software development roadmap and make sure your wishes are included before you buy. The roadmap should show you when new features will be available in the next major release of you
r DCIM tool.

The DCIM vendor should also provide you with a release cycle displaying the scheduled minor releases with bug fixes. When you receive a new release it should include release-notes mentioning the specific bugs that are fixed and the new features included in that new release. Ask the DCIM vendor for an example of the roadmap and release-notes.

During the purchase process you may have certain feature requests that the vendor is not able to fulfill yet. Especially new technology, like the blade server example I used earlier, will take some time to appear in the DCIM software release. This is not a big problem as long as the DCIM vendor is able to model it within reasonable time.

One way to handle missing features is to make sure it’s on the software development roadmap and make the delivery schedule part of your purchase agreement.

After you signed the purchase order your influence on the roadmap will become smaller. They will tell you it doesn’t… but it does… Urge your DCIM vendor to start a user-group. This group should be fully facilitated by the vendor and provide valuable input for the roadmap and the future of the DCIM tool. A strong user-group can be of great value to the DCIM vendor AND its customers.

Got any questions on real world DCIM ? Please post them on the Datacenter Pulse network: http://www.linkedin.com/groups?gid=841187 (members only)

Share

Before you jump in to the DCIM hype…

dilbert-information-strategy

<English cross post with my DCP blog>

You’re ready to enter the great world of DCIM software and jump right into the hype ?

Do you actually know what you need from a DCIM solution ? What are your functional requirements ?

So before you jump in, let’s take a step back and look at DataCenter Information Management from a 40,000 feet level: the datacenter facility information architecture.

Let’s start with ‘data’;

Data is all around us in the datacenter environment. It’s on the post-it notes on your desk, the dozen Excel files you manage to report and collect measurements and the collection of electrical and architectural drawings sitting in your desk drawer.

A modern day datacenter is filled with sensors connected to control systems. Some of the equipment is connected to central SCADA or BMS systems, some handle all the process control locally at the equipment. HVAC, electrical distribution and conversion systems, access control and CCTV; they all generate data streams. With the growth of datacenters in square meters and megawatts, the amount of data grows too.

The introduction of PUE and focus on energy efficiency have shown us the importance of data and especially data analysis. For most of us this has introduced even more data points, but enabled us to do better analysis of our datacenter’s performance. So; more data has enabled more efficiency and a better return on investment. Some of us could even say they entered the BigData era with datacenter facility data.

DCIM can play a role in the analysis of all this data, but it’s important to know where your data is first. Where is the current data stored ? What are the data streams within your datacenter ? What data is actually available and what data actually matters to your operation ? It’s a false assumption that all the data needs to be pulled in to a DCIM solution; that depends on your processes and your information requirements.

Process

Every datacenter has its collection of structured activities or tasks that produce a specific service or product for our internal or external customer. These are the primary processes focusing on the services your datacenter needs to provide. Examples are operations processes like Work Orders or Capacity Management.

These primary processes are assisted by supporting processes that make the core (primary) processes work and optimize them. Examples are Quality, Accounting or Recruitment processes.

Indentifying the primary and supporting processes in your datacenter enables you to optimize them by executing them in a consisted way very time and checking the output.

If you run an ISO9001 certified shop, you will definitely know what I’m talking about.

To run the processes we need information. Information is used in our processes to make decisions. The needed information can be collected and supplied by an individual or an (IT) system.

When data is collected it’s not yet information. Applying knowledge creates information from data. IT systems can assist us to create information from data, with built-in or collected knowledge.

Indentifying your datacenter processes also enables you to get a grip on the information that is needed to move the processes forward. Is this information available ? What is the quality of the information and process output ? How much time does it take to make it available ? Can this be optimized ?

DCIM solutions can assist you in creating information from data and provide information and process optimization. Most of the DCIM solutions depend on built-in knowledge on how datacenters work and operate, to facilitate this and optimize processes.

DCIM is only one of the applications used to support and optimize our datacenter processes. To support the full stack of processes we need a whole range of applications and tools. These applications can be everything from Planning to Asset Management to Customer Relationship Management (CRM) to SCADA/BMS tools.

Most of us already have some type of SCADA or BMS system running in our datacenter to control and monitor our facility infrastructure. This SCADA or BMS system will handle typical facility protocols like Modbus, BACnet or LonWorks. The programming logic used in most SCADA/BMS systems is not something found in typical DCIM solutions.

With the growing amount of sensors and their data, the SCADA/BMS system must be able to handle hundreds of measurements per minute. It must store, analyze and be able to react-on the provided data to control things like remote valves and pumps. This functionality is also typically not found in DCIM solutions. (So SCADA/BMS does not equal (!=) DCIM.)

Anyone running a production datacenter will already have a collection of applications to support their datacenter processes. You may have a ticketing system, a CRM application, MS Office application, etc.. Some times DCIM is perceived as the only tool you need to manage your datacenter but it will definitely not replace all your current tools and applications.

Model

Now that you have indentified your data, processes and current applications it’s time to focus on what you need DCIM for anyway; define your functional requirements.

One way of assisting you in this definition is creating your own datacenter facility information model.

IT architects are trained in creating information models, so if you have any walking around ask them to assist you.

Example of a model would be the one that the 451 Group created for their DCIM analysis. This is featured in the DCK Guide to Data Center Infrastructure Management (DCIM) (The model doesn’t cover the full scope for every organization, but it helps me to explain what I mean in this blog…)

dcim-451group

The model displays functionality fields what would typically exist when operating a datacenter.

You can use a model like this to identify what functionality you currently don’t have (from a process and application perspective) and what can be optimized.

It also enables you to plot your current tools on the model and indentify gaps and overlap. In this example I have plotted one of my SCADA/BMS systems on the (slightly modified) model:

dcim-plot-bms

I have also plotted the DCIM need for that project:

dcim-plot-dcim

Using models like this will give you a sense of what you actually expect from a DCIM solution and assist in creating your functional requirements for DCIM tool selection (RFP/RFI).

Integration is key

Modern day IT information management consists of collections of applications and datastores, connected for optimal information exchange. IT information and business architects have already tried the ‘one application to rule them all’ approach before and failed. Because creating information islands also doesn’t work, we need to enable applications and information stores to talk to each other.

You may have some customer information about the usage of datacenter racks in a CRM system like Salesforce. You may already have some asset information of your CRAC’s in a asset management system or maybe an procurement system. This is all interesting and relevant information for your ‘datacenter view on the world’. Connecting all the systems and datastores could get really ugly, time consuming and error-prone:

dc-info-connect

IT architects have already struggled with this some time ago when integrating general business applications. This has started things like Service-oriented architecture (SOA) , enterprise service bus (ESB) and application programming interface (API). All fancy words (and IT loves their 3 letter acronyms) for IT architectural models, to be enable applications to talk to each other.

dc-info-connect-soa

The DCIM solution you select, needs to be able to integrate in to your current world of IT applications and datastores.

When looking at integration, you need to decide what information is authoritative and how the information will flow. Example: you may have an asset management system containing unique asset names and numbers for your large datacenter assets like pumps, CRACs and PDUs. You would want this information to be pushed out to the DCIM solution but changes in the asset names should only be possible in the asset management system. Your asset management system would then be considered authoritative for those information fields and information will only be pushed from the asset system to DCIM and not vice versa (flow).

Integration also means you don’t have to pull all the data from every available data source in to your DCIM solution. Select only the information and data that would really add value to your DCIM usage. Also be aware that integration is not the only way to aggregate data. Reporting tools (sometimes part of the DCIM solution) can collect data from multiple datasources and combine them in one nice report, without the need to duplicate information by pulling a copy in to the DCIM database.

The 451group model does an excellent job of displaying this need for integration showing the “integration and reporting” layer across all layers.

Using your own information model you can also plot integration and data sources.

dcim-plot-integrate

Integration within the full datacenter stack (from facilities to IT)  is also key for the future of datacenter efficiency like I mentioned in my “Where is the open datacenter facility API ?” blog.

So, to summarize:

  • Look at what data you currently have, where it is stored and how that data flows across your infrastructure.
  • Look at the information and functionality you need by analyzing your datacenter processes. Indentify information gaps and translate them to functional requirements.
  • Look at the current tools and applications ; what applications to replace with DCIM and what applications to integrate with DCIM. What are the integration requirements and what information source is authoritative ?
  • Create your own datacenter facility information model. Position all your current applications on the model. (If you have in-house IT (information) architects; have them assist you…)

Preparing your DCIM tool selection this way will save you from headaches and disappointment after the implementation.

In my next blog we will jump to the implementation phase of DCIM.

 

More resources:

Full credits for the DCIM model used in this blog, go to the 451Group. Taken from the excellent DCK Guide to Data Center Infrastructure Management (DCIM) at http://www.datacenterknowledge.com/archives/2012/05/22/guide-data-center-infrastructure-management-dcim/

Disclosure: between 2006 and 2012 I have selected, bought and implemented three different DCIM solutions for the companies I worked for. At that time I was also part of either the beta-pilot group for those vendors or on the Customer Advisory Board. That doesn’t make me a DCIM expert, but it generated some insight into what is sold and what actually works and gets used.

Share