One of the challenges with cloud storage is the connection between
you and the storage. For almost everyone it is going to be slower
than what is available within the data center. This performance
difference does not have to limit how cloud storage is used; it
means that greater intelligence is needed to load data into the
cloud. With that intelligence, cloud storage can be leveraged for
even the most demanding applications.
In almost all use cases, but especially when cloud storage is part
of a primary storage solution, you will need some sort of local
presence to cache the active data set. This local presence can
come in the form of a standalone appliance or a virtual appliance,
or it can be integrated into the storage system itself. The goal of
the local presence is to keep the active subset of data on local
high-speed storage and then, as the data ages, transparently push
it out to the cloud storage service.
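
As a rough illustration of the age-based tiering described above, the sketch below picks the objects in a local cache that have gone unaccessed the longest. The `CachedObject` type, the `select_for_tiering` function, the 30-day idle threshold, and the sample file names and dates are all assumptions made for this example; real appliances apply their own policies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CachedObject:
    """Hypothetical record for one item held in the local cache."""
    name: str
    last_access: datetime
    size_bytes: int


def select_for_tiering(objects, now, max_idle=timedelta(days=30)):
    """Return objects idle longer than max_idle, oldest access first."""
    stale = [o for o in objects if now - o.last_access > max_idle]
    return sorted(stale, key=lambda o: o.last_access)


# Illustrative data only: one recently used file and two idle ones.
now = datetime(2024, 1, 31)
cache = [
    CachedObject("budget.xls", datetime(2024, 1, 30), 2_000_000),
    CachedObject("2019-archive.zip", datetime(2023, 6, 1), 900_000_000),
    CachedObject("old-proposal.doc", datetime(2023, 11, 15), 500_000),
]

to_cloud = select_for_tiering(cache, now)
print([o.name for o in to_cloud])
```

Only the two idle files are selected, so the recently edited file stays on local high-speed storage.
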
This hybrid type of deployment does mean that the data set has to
be something that can be segregated by access date. It also means
that the ideal data set has a short create-and-edit cycle and is
rarely accessed afterward. A file server is an obvious example, but
messaging and group collaboration tools qualify as well.
No matter what the local data set is, you will always need to copy
data to the cloud. Most of these hybrid solutions will want to copy
all new or modified data the moment the change occurs. This provides
a level of redundancy from a data protection perspective, but it
also means that the WAN bandwidth is consumed up front. Most of
these hybrid devices can trickle data to your cloud provider, so
bandwidth use can be throttled back. More importantly, most of them
have some form of WAN optimization, either compression or
deduplication, that reduces the data set before it is sent and keeps
it reduced after it lands. For example, in the solution we are
currently testing, we have placed 77 GB of data in the cache device,
but only 27 GB has actually been transferred and stored, thanks to
compression and deduplication.
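
To make the idea of reducing data before it crosses the WAN more concrete, here is a minimal sketch that deduplicates fixed-size chunks by hash and compresses each unique chunk. It is not how any particular appliance works: the function names, the 4 KB chunk size, and the choice of SHA-256 and zlib are assumptions for illustration. The last line simply works out the reduction implied by the 77 GB and 27 GB figures above, which comes to roughly 65 percent.

```python
import hashlib
import zlib


def reduce_for_upload(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks, keep one compressed copy of
    each unique chunk, and return them keyed by SHA-256 digest."""
    unique = {}
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in unique:          # duplicate chunks are skipped
            unique[digest] = zlib.compress(chunk)
    return unique


def reduction_percent(logical, stored):
    """Percentage saved when `logical` units shrink to `stored` units."""
    return 100.0 * (1 - stored / logical)


# Illustrative input: three identical chunks followed by one different chunk.
sample = (b"A" * 4096) * 3 + (b"B" * 4096)
stored = reduce_for_upload(sample)
print(len(stored))                         # number of unique chunks kept
print(round(reduction_percent(77, 27)))    # figures from the test above
```

Fixed-size chunking is the simplest choice for a sketch; products often use variable-size chunking, which copes better with data that shifts by a few bytes.
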
Even with this intelligent use of the available bandwidth, there
are some practical steps you will want to take. First, you need a
decent connection to the Internet. When we began our testing, we
immediately found our connectivity to be a little lacking. We
doubled our bandwidth for about 15 percent extra per month, and it
made the application significantly more usable.
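
A quick back-of-the-envelope calculation shows why doubling bandwidth matters. The sketch below estimates how long an upload takes at a given link speed. The 10 and 20 Mbit/s figures are hypothetical, since our actual link speeds are not stated here, and the estimate ignores protocol overhead and competing traffic. The 27 GB payload is the reduced data set mentioned earlier.

```python
def upload_hours(gigabytes, megabits_per_second):
    """Idealized transfer time in hours, using decimal units
    (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits) and no overhead."""
    bits = gigabytes * 8 * 1000**3
    seconds = bits / (megabits_per_second * 1000**2)
    return seconds / 3600


# Hypothetical link speeds before and after doubling the bandwidth.
for mbps in (10, 20):
    print(f"{mbps} Mbit/s: {upload_hours(27, mbps):.1f} hours")
```

Under these assumptions the upload drops from about six hours to about three, and the same function shows that sending the full 77 GB without reduction would take nearly three times as long at either speed.
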
Second, you want to select a data set that can be gradually
migrated to the cloud. Net new projects are ideal, as is data that
can be easily isolated by age, so that the oldest data sets can be
migrated one at a time. In our case, we used it for hosting our
various projects and simply decided that all new projects would go
on the cloud storage appliance. As a result, we have seen almost no
impact from having all of our data on the cloud storage device, and
we have seen an improvement in local response time, since our
appliance is on high-speed storage.
With these considerations in mind, cloud storage can be a viable
option for many applications and data sets. As hybrid technology
continues to improve and the cost of bandwidth continues to come
down, even more applications and data sets will be deemed
cloud-appropriate, but the time to develop a cloud storage strategy
is now.