How to manage cloud data?

Business applications accumulate a lot of data over time, whether they run in the cloud or on-site. Much of it is business-critical, covering everything from company strategy to customer records and sales forecasts.


The Challenge

Emails, documents, CRM data... they all contain information about your company. Sometimes the same information is stored in multiple systems, meaning it's duplicated. Other times the information stored in different applications does not match: the updated phone number for a customer is stored in your personal contact list in Exchange Online but not in the CRM system. Your employees may start to doubt the information stored in the CRM system, because "it is not right." Over time, this erodes the quality of the work your business does, since some of it is based on outdated or incorrect information. So how do you manage cloud data?

(Another scenario is a merger between two companies, where one company has used Zoho CRM and the other SugarCRM. Which system should the new, merged company use? Using both is cumbersome, so what's the solution?)


Obvious solution #1: Merge data

One obvious solution is to merge the data stored in different applications; this is the approach that CloudWork, among others, uses. Data from a CRM system is imported into (for instance) Google Calendar, and then kept in sync. The problem, however, is that the data is copied, and thus duplicated. Which calendar is the one that contains the aggregated data from the other systems: the one in the CRM system or Google Calendar? What happens when the CRM calendar contains an appointment at a time when I already have an appointment in Google Calendar? This approach raises many questions and has many potential pitfalls.

Although well intentioned, this obvious solution can lead to chaos, with multiple systems containing (almost) the same data.
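
To make the calendar pitfall concrete, here is a minimal sketch in Python of how overlapping appointments from two calendars could be detected before a sync. The Appointment type, the find_conflicts function and the sample data are hypothetical illustrations; they are not how CloudWork or any real sync tool works.

```python
from datetime import datetime
from typing import NamedTuple


class Appointment(NamedTuple):
    source: str  # hypothetical label for the originating system
    title: str
    start: datetime
    end: datetime


def find_conflicts(first, second):
    """Return pairs of appointments from the two calendars that overlap in time."""
    conflicts = []
    for x in first:
        for y in second:
            # Two intervals overlap when each one starts before the other ends.
            if x.start < y.end and y.start < x.end:
                conflicts.append((x, y))
    return conflicts


# Made-up sample data for illustration only.
crm = [
    Appointment("crm", "Customer demo",
                datetime(2024, 5, 6, 10, 0), datetime(2024, 5, 6, 11, 0)),
]
google = [
    Appointment("google", "Dentist",
                datetime(2024, 5, 6, 10, 30), datetime(2024, 5, 6, 11, 30)),
    Appointment("google", "Lunch",
                datetime(2024, 5, 6, 12, 0), datetime(2024, 5, 6, 13, 0)),
]

for crm_item, google_item in find_conflicts(crm, google):
    print(f"Conflict: {crm_item.title!r} overlaps {google_item.title!r}")
```

Detecting the overlap is the easy part. Deciding which appointment wins, and which system should hold the result, is still a question a person has to answer.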


Obvious solution #2: Data migration

OK, merging data wasn't an ideal solution. Then what about migrating data from multiple applications to one new application? That should eliminate the problem of duplicate data. The biggest obstacle when migrating data is deciding what to migrate and what to leave behind. Management normally wants "everything" to be migrated, from unimportant notes to valuable sales forecasts. Such a policy leads to an unnecessarily large and expensive data migration and puts a lot of garbage into the new application. In addition, it takes time, even when migrating from one CRM system to another, because the fields and the information stored in them do not map 1:1. So should we store all these notes about a customer in the description field? It quickly gets messy and hard to navigate.
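
The field mismatch can be illustrated with a small Python sketch. It assumes a made-up mapping between two CRM systems; the field names in FIELD_MAP and the sample record are hypothetical and do not come from Zoho CRM, SugarCRM or any other real product. The point is only that some source fields have no obvious home in the target system.

```python
# Hypothetical mapping from source CRM field names to target CRM field names.
FIELD_MAP = {
    "full_name": "name",
    "phone_work": "phone",
    "email_address": "email",
}


def migrate_record(source_record):
    """Split a source record into mapped fields and leftovers with no target field."""
    migrated, unmapped = {}, {}
    for field, value in source_record.items():
        if field in FIELD_MAP:
            migrated[FIELD_MAP[field]] = value
        else:
            # No counterpart in the target system: someone has to decide
            # whether to drop it, or squeeze it into a free-text field.
            unmapped[field] = value
    return migrated, unmapped


# Made-up sample record for illustration only.
source = {
    "full_name": "Jane Doe",
    "phone_work": "555-0100",
    "email_address": "jane@example.com",
    "internal_notes": "Prefers calls after lunch",
    "fax": "555-0199",
}

migrated, unmapped = migrate_record(source)
print("Migrated:", migrated)
print("Needs a decision:", sorted(unmapped))
```

Every field that ends up in the "needs a decision" pile is a discussion with the business, and a migration with "everything" in scope multiplies that pile.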

There are countless reports about the failure of data migration projects, such as "Why do so many data migration projects end in disaster?" and "10 Big data migration mistakes." Data migration is hard, and if you want to get it right you have to hire experts instead of letting your own people do the job. Costs are quickly racked up and, as the reports show, the risk of a data migration is not something you should ignore.


Obvious solution #3: Using cloud applications that talk to each other

One of the good things about cloud applications is that most of them have a set of plugins and extensions that enable them to talk to other cloud applications. Your company website can talk directly to the CRM system to capture leads, and your email system can integrate with Dropbox, so attachments no longer have to be sent with the email (just a link to Dropbox is added to the email).

Can cloud applications with a good set of plugins and extensions effectively manage cloud data in multiple applications? Fighting duplicate information is hard, so the best option is a set of systems that discourages entering duplicate information. If you need a calendar in both your email and CRM system, use an extension in one of the systems that displays the calendar from the other system. That way there is only one calendar, but it is available in multiple systems. However, if duplicate information is entered in multiple cloud applications, there's no miracle cure here either. The same goes for information that is slightly different: using cloud applications is no silver bullet in this scenario.


Best real-world solution

If you have duplicate data, why not leave data where it is and place a search engine on top of all your applications? Compare this with how Google works when helping you find the web page you're looking for on the web. Google contains information about countless websites and is able to pinpoint which website and page you want to find. A similar approach for enterprise applications makes sense. When you're looking for a strategy document or the contract with a customer, it doesn't really matter to you where the information is stored, as long as you can find it.

If you are looking for a phone number for a customer, a search across all your applications can reveal the contact information that is stored in different systems. You get the contact cards from both your email system and the CRM system and can see that the phone numbers are different. OK, which one should you choose? Chances are that the one that was updated last is the correct one, so you should try that one first. Voilà, problem solved! Additionally, a search engine can reveal phone numbers regardless of whether they're stored in contact cards or not, so its reach goes further than a search in contact cards alone.
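
As a rough illustration of the idea, the Python sketch below searches two in-memory data sets that stand in for an email system and a CRM system, and lists the most recently updated hit first. The Hit type, the SOURCES dictionary and the search_all function are hypothetical; a real enterprise search engine would index each application through its own connector rather than scan lists in memory.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    source: str   # which application the record lives in
    name: str
    phone: str
    updated: date


# Hypothetical stand-ins for the contact data of two applications.
SOURCES = {
    "email": [Hit("email", "Jane Doe", "555-0142", date(2024, 3, 1))],
    "crm": [Hit("crm", "Jane Doe", "555-0100", date(2022, 11, 15))],
}


def search_all(query):
    """Search every source for the query; most recently updated hits come first."""
    q = query.lower()
    hits = [
        hit
        for records in SOURCES.values()
        for hit in records
        if q in hit.name.lower()
    ]
    return sorted(hits, key=lambda h: h.updated, reverse=True)


results = search_all("jane")
for hit in results:
    print(f"{hit.source}: {hit.phone} (updated {hit.updated})")
```

The data stays where it is. The search only brings both versions side by side and puts the likeliest candidate on top, leaving the final call to the person searching.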

Choosing the solution that is right for you can be hard, and many people come to the discussion about data management with ready-made arguments like "data migration is always the best approach." Well, is it? I believe the best starting point for a data management strategy is a pragmatic one: take the suite of applications used by the company into account and analyze the information stored in them. If there's no duplicate information: problem solved. However, if there is (and this is normally the case), a solution should be found before the lack of data management starts to hurt your business. The best overall solution I've experienced is to leave data where it is and use modern search technology to index it, making it available regardless of storage location.