According to Shahed Latif, a global steering committee partner for cloud computing at KPMG and co-author of the book Cloud Security and Privacy, it’s critical for companies to understand three essential concepts or attributes of data in a cloud computing environment:
- Data lineage: Data flow is no longer linear in a highly virtualized environment like the cloud. Without understanding the precise data flow, it is difficult to know what controls to put in place to ensure data integrity.
- Data provenance: It’s important to understand latency issues in data processing. In a highly distributed and virtualized environment, using the wrong version of data at a given point in time can have a significant effect. (Think of exchange rates in financial markets.)
- Data remanence: In the dynamic cloud environment, data can remain in cache for unexpected periods of time. Without visibility into the vendor’s application hygiene, you cannot be sure exactly where your data resides at any given time.
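To make the exchange-rate example concrete, here is a minimal sketch of a freshness check that a consumer of distributed data might apply before using a value. It is an illustration only: the `Quote` type, the `is_fresh` helper, and the five-second threshold are all hypothetical and do not come from the book or from any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Quote:
    """Hypothetical record: a rate plus the time it was valid."""
    pair: str
    rate: float
    as_of: datetime


def is_fresh(quote: Quote, now: datetime,
             max_age: timedelta = timedelta(seconds=5)) -> bool:
    """Return True if the quote is no older than max_age (assumed threshold).

    Quotes timestamped in the future are rejected as well, since they
    suggest clock skew between systems.
    """
    age = now - quote.as_of
    return timedelta(0) <= age <= max_age


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, 10, tzinfo=timezone.utc)
recent = Quote("EUR/USD", 1.10, now - timedelta(seconds=2))
stale = Quote("EUR/USD", 1.09, now - timedelta(seconds=30))

print(is_fresh(recent, now))  # the 2-second-old quote passes
print(is_fresh(stale, now))   # the 30-second-old quote is rejected
```

A check like this only works if every copy of the data carries a trustworthy timestamp, which is exactly the kind of visibility a customer may lack in a vendor-controlled environment.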
For a more detailed description of these concepts and other issues, see the book Cloud Security and Privacy (Mather, Kumaraswamy, Latif; O’Reilly Media). For the purpose of this discussion, it suffices to say that you may not always know exactly where your data might reside, and in what form, in cloud applications.
Even if the vendor can precisely outline its application architecture, the nature of the cloud is dynamic – vendors update the software on a continuous basis, and as a customer you don’t have the option of staying on an earlier version or architecture. Without transparency, organizations have to take it on faith that their data is secure and protected. How can you perform a security audit on an application that is completely outside your control?