The main points of our session “Effective and Infinite Storage in the Cloud” at TechDays 2010 in Sweden are listed below, together with links to all of our related material and the zip with our demos. The slides are here too, but they are in Swedish only.
Our session at TechDays 2010 gave an overview of what storage in the Cloud is, according to Microsoft at present, and of some of the goals for the future in this area. We intended the session to give companies a basis for evaluating if and how moving storage to the Cloud can enhance their business in their own specific scenarios.
As consultants working with different clients, we see that companies in general are not yet ready to adopt Cloud storage on a broad scale. However, we also see a continuous increase in interest in this area, and we believe this is directly related to the richness of the offerings and services now becoming available. The story is maturing in the Microsoft camp, and the pace is rapid. Cloud storage from Microsoft is admittedly still very young. (Youngest in the business?) But Microsoft has now thrown the full weight of the company behind making the Cloud story stick. Expect to see a lot of new offerings, services, products, frameworks etc. in the Cloud coming out from Microsoft in the near future.
There are a few common concerns when it comes to storing data in the cloud. A quick (and perhaps not complete) listing follows:
Storage cost in the Cloud
One big concern we hear from clients, and in the buzz at current conferences, is the cost of storage and data transfer in the Cloud. This is naturally an equally important issue regardless of which Cloud you choose to store your data in.
The thing about storing your data in the Cloud is that the cost of this storage is a lot more self-evident than when you store your data on your own servers on premise. This is because you receive a bill each month stating exactly what you have spent on storage in the past period. What we tend to forget is that storage, backups, crash plans and managing all of this on premise, or even hosted, do not exactly come for free either. The big question then becomes what difference in cost there will be for your company between storing and transferring data in a Cloud and using local hardware operated by your local IT department.
Another aspect of the cost of storage in general is the question of how you build your apps. We believe that many have been a bit oblivious to this issue while they have run their apps and stored their data only on premise or with traditional hosting. Perhaps we have been cost inefficient in this area until now? In the age of soft deletes, where deleting data does not remove it but merely marks it as deleted, no data is ever actually deleted, and managing the cost of storing data will become a more and more important question for everyone. Cloud applications have to be built, and data stored in the Cloud, with storage and management costs in mind. As we hinted above, the same should have been true for all of your applications up until now. Storing data in the Cloud just makes this point more obvious.
Storage security when your data is in the Cloud
Perhaps the first fear when it comes to storing data in the Cloud is the fear of security breaches and the perceived loss of control over this security. Your data is no longer at home.
What you need to ask yourself, and here we choose to quote a good colleague, Sergio Molero @sergio_molero, is: “What sense of security do you have today? What security is it that you think you will lose when storing your data in the Cloud?”
Exactly the same can be said of hosted data, because in this regard hosting and the Cloud are very much the same thing. Data in the Cloud is also hosted data!
One important thing to keep in mind is that moving to a data storage solution in the Cloud does not mean all of your data must move “out there” somewhere. It is still quite possible to store the most sensitive data on premise and use Cloud storage for more public data only. We believe that the attention given to security for data in the Cloud will make you look harder at your security efforts than you do on premise today. In conclusion, the move to the Cloud could well raise the overall security level for your data.
Also look into Windows Identity Foundation (WIF) for insights on how to secure your data in the Cloud.
Trusting someone else with your data in the Cloud
If security against external attacks on your data in the Cloud is a concern you feel reasonably confident you can handle, then perhaps trusting someone else to store your company-sensitive data is the concern that makes you think twice. Trust is a complex issue, not to be taken lightly. We will only raise the concern here and not delve too deeply into it.
Can you trust a hosted data storage provider today? If you can, then this is no different from trusting a provider with data stored in the Cloud.
If you can’t trust anyone else with your data, then the Cloud is not an option for you. Clean and simple.
Legal ramifications for storing data in the Cloud
It feels important to also comment briefly on the potential legal ramifications of storing your data in a Cloud in another country somewhere. That being said, we’d just like to point out that we are not lawyers. Let’s be clear on that.
Some data cannot legally be moved outside the borders of your own country. Hospital patient information is a good example of this. Other data must for legal reasons stay within your geo-region, say for instance the EU.
As we hinted above, not all of your data has this limiting concern. Your business has more public data as well as more private data, and only the latter is a potential legal problem. There will be more hybrid solutions for data storage in the future: some of a company's data will be stored in the Cloud and other data will be stored on premise.
But how can we store data both on premise and in the Cloud without having two separate data storage solutions? One thing that will assist in this endeavor is the rumored upcoming availability of local Clouds, on premise or hosted, that can be installed locally and behave in the same way as the big Clouds out there on the Internet. Right now there are not many offerings of this nature. There will be in time, when companies start to demand them. In fact, given that the storage Clouds today have a REST interface, it is not very difficult to implement that same interface on your own on-premise machines! Doing this, however, will require a good effort on your part, and you will end up with a proprietary solution that you will have to evolve over time as the real Clouds shift. Another option is to wait a bit and let the Cloud vendors come up with packaged solutions that you can simply install in your own data centers. The hybrid solutions that result from this will play an important role in shaping data storage for the coming decade.
In the SQL Azure part of our talk we pointed out three pros and three cons for attendees to take away from our session.
- Because Windows Azure takes care of all of the maintenance normally associated with a SQL Server server, developers can now focus on the database rather than on administrative tasks.
- The SQL Azure database is always available, in the sense that upgrades, failover, backup and crash plans are all handled seamlessly for you.
- The scaling story is very strong in SQL Azure. If you can shard your database into many pieces, you can scale out from one database to many (say 100) databases to meet a peak load and then back down again to just one at will. For now there is no built-in support for this in SQL Azure, but it will likely come further down the road, and you can build it yourself today.
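To make the "build it yourself" part of the sharding point a little more concrete, here is a minimal sketch in Python of one common way to route a key to one of many databases. It is not something SQL Azure provides, and the function names and the `customers_NNN` naming scheme are our own assumptions for illustration only.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in a stable way (same key, same shard).

    Uses SHA-256 rather than Python's built-in hash(), because hash() of a
    string changes between processes and would scatter keys unpredictably.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_name(key: str, shard_count: int, prefix: str = "customers") -> str:
    """Build a hypothetical database name such as 'customers_042'."""
    return f"{prefix}_{shard_for(key, shard_count):03d}"


if __name__ == "__main__":
    # With one database everything lands in shard 0; with 100 the keys spread out.
    for user in ("alice", "bob", "carol"):
        print(user, database_name(user, 1), database_name(user, 100))
```

Note that with a simple modulo scheme like this, changing the number of shards moves most keys to a different database, so scaling out and back in also means moving data. That is part of the effort you take on when you build this yourself.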
There are some things to consider when looking into deploying to SQL Azure.
- Security is locked down compared to SQL Server. The sys-tables are not yet available, for instance. This is because your databases are co-located on the same servers as data belonging to someone else.
- Your data is no longer at home in the sense that it is moved outside your company and even outside your country.
- The maximum size right now for one database in SQL Azure is 10 GB, but this will increase to 50 GB in June.
Windows Azure Storage
With Windows Azure Storage Microsoft addresses a set of data storage stories that do not really fit well into a standard relational database structure.
- There is a powerful and “infinite” message queuing system that enables messaging between the different machines in your applications. This queuing system does not guarantee strict ordering of the messages in the queue when more than one machine is working on the same queue. However, removing a message takes two steps: a dequeue (GET) followed by a delete (DELETE). This ensures that each message is actually handled in full, even if an error occurs while dequeuing and handling it the first time.
- Windows Azure also has a concept known as a “Big Table”. Say you want to store 200 million user data sets for Facebook, or why not all the messages ever sent to Twitter. For this you need an “infinite” tabular structure with no limit that you will ever hit.
- The third thing you can store in Windows Azure is blobs, of which there are two kinds. 1) A so-called Block Blob. This is a blob that is handled in blocks when uploaded. A Block Blob is not really intended to be changed once uploaded. You can exchange blocks in the blob, but that is not the intent. This is typically a video file or a picture or something else that is rarely if ever changed. You would use block blobs to build the next YouTube, where you can stream files down into a player. 2) A Page Blob. This kind of blob is divided into pages rather than blocks. You can read or write pages at will in this kind of blob. This means that you are able to create a Virtual Hard Disk (VHD) and upload it as a page blob to Windows Azure. Then you can actually mount this blob on one or more virtual machines in Windows Azure as an extra hard drive. This comes in handy when moving an application that uses a lot of disk to Windows Azure. All of your old File.Open calls will continue to work.
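The two-step removal described for the queues above (dequeue, then delete) can be illustrated with a toy in-memory model. This sketch is not the Windows Azure Queue API. The class `ToyQueue`, the function `process_one` and the `visibility_timeout` parameter are names we made up to show the idea that a dequeued message is only hidden for a while, and comes back if nobody deletes it.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class ToyQueue:
    """In-memory stand-in for a queue with GET-then-DELETE semantics."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # message id -> (body, time at which the message becomes visible again)
        self._messages: Dict[str, Tuple[str, float]] = {}

    def put(self, body: str) -> str:
        msg_id = uuid.uuid4().hex
        self._messages[msg_id] = (body, 0.0)  # visible immediately
        return msg_id

    def get(self, visibility_timeout: float = 30.0) -> Optional[Tuple[str, str]]:
        """Return (id, body) of a visible message and hide it for a while."""
        now = self._clock()
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (body, now + visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id: str) -> None:
        """Second step: only now is the message really gone."""
        self._messages.pop(msg_id, None)


def process_one(queue: ToyQueue, handler: Callable[[str], None],
                visibility_timeout: float = 30.0) -> bool:
    """Dequeue one message, handle it, and delete it only on success."""
    item = queue.get(visibility_timeout)
    if item is None:
        return False
    msg_id, body = item
    handler(body)          # if this raises, delete() below is never reached
    queue.delete(msg_id)   # so the message reappears after the timeout
    return True
```

One consequence worth keeping in mind is that a message can be handed out more than once, for example when a worker crashes halfway through, so handlers should tolerate seeing the same message again.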
When considering Windows Azure for your storage think about:
- You should use only the parts of Windows Azure that you really need. If you only need to store backup-type blob data, use only that. There is a clear advantage to consider when you have applications that use Windows Azure Storage. If you deploy your application to Windows Azure, in the same datacenter as the data, then the internal data transfer between the application and storage is free of charge. If you instead run your apps on premise and store data in the Cloud, the transfer of data to and from the Cloud carries a cost.
- Regarding the cost, you should really make a data storage estimate and analysis before you start deploying. Your application should have a plan for the amount of data stored, the amount transferred and the duration of storage. This is true on premise too, as discussed above, and data storage plans and budgets will become a more and more important part of any application in the future.
- Windows Azure Storage means learning to use a new API. Perhaps not a dramatic thing, but the three storage options also affect the way you build the rest of your application. If you disagree, chances are you need to learn more about storing data in the Cloud.
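The data storage estimate mentioned above can start out as very simple arithmetic. Below is a minimal sketch in Python. The `Prices` fields, the function name and above all the price figures are placeholders we invented for the example, not actual Windows Azure prices, so look up the current price list before using anything like this. The `same_datacenter` flag models the point that transfer between an application and storage in the same datacenter is free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Hypothetical unit prices; replace with figures from a real price list."""
    storage_per_gb_month: float
    transfer_in_per_gb: float
    transfer_out_per_gb: float


def monthly_cost(prices: Prices, stored_gb: float, in_gb: float, out_gb: float,
                 same_datacenter: bool = False) -> float:
    """Estimate one month's bill for storage plus data transfer."""
    storage = stored_gb * prices.storage_per_gb_month
    if same_datacenter:
        # App and data live in the same datacenter: no transfer charge.
        transfer = 0.0
    else:
        transfer = (in_gb * prices.transfer_in_per_gb
                    + out_gb * prices.transfer_out_per_gb)
    return storage + transfer


if __name__ == "__main__":
    placeholder = Prices(storage_per_gb_month=0.25,
                         transfer_in_per_gb=0.5,
                         transfer_out_per_gb=0.5)
    on_premise_app = monthly_cost(placeholder, stored_gb=100, in_gb=10, out_gb=20)
    cloud_app = monthly_cost(placeholder, stored_gb=100, in_gb=10, out_gb=20,
                             same_datacenter=True)
    print(f"app on premise: {on_premise_app:.2f}, app in the Cloud: {cloud_app:.2f}")
```

Running the same numbers for a few growth scenarios (data stored after 6, 12 and 24 months, for instance) quickly shows whether storage or transfer dominates the bill for your application.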
Cloud Storage Studio – an SSMS lookalike for Windows Azure Storage
Fiddler2 – Web debugging proxy
CloudStorage.API @ CodePlex
Azure Contrib @ CodePlex
Zipped version of the demo code in the Windows Azure Queue Storage sample:
Slides from the presentation:
Slides for the demo attached as a zip above
We would like to thank Microsoft for arranging TechDays. We had a blast! Thanks to the audience for listening.