March 15, 2006

Amazon S3

Amazon.com just launched S3 (Simple Storage Service), a serious game changer. It will be extremely interesting to see what innovators integrate S3 into over the coming months.

I definitely see interesting backup solutions, and potentially digital content/software distribution via their BitTorrent interface. Very nice touch! Naysayers may balk at the cost per GB, but I think they are really missing the boat here. Having your data securely stored across multiple physical locations (to avoid complete loss in a fire, etc.) is quite costly for a home user, particularly when you only need to store a small amount (e.g. 10GB) of data.

Think about it. If you bought the smallest drive that fits that data (30GB), it would cost you ~$25 (per Pricewatch, with S&H). Now buy 2 for local RAID, then 2 more for storing offsite. You are already up to $100. And that assumes the offsite disks are manually brought home to transfer data, then manually carried (e.g. walked over in sneakers) back to the offsite location. Want online access? Pony up for a CPU, case, motherboard, network connection, etc. Easily another $100+.

In comparison, backing up 10GB to S3 costs only $1.50/month for storage, plus $2 to transfer it. The peace of mind of knowing all my digital photos are safely archived offsite (and that I can access them from anywhere) is worth MUCH more than $1.50/month!
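To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python. The per-GB rates are assumptions worked backwards from the 10GB figures above ($1.50/month storage, $2 transfer), and the function names are just made up for illustration. It only counts the four drives, not the extra $100+ for an always-on box.

```python
# Figures from the post; per-GB rates are assumed, derived from the 10GB totals.
DATA_GB = 10
DRIVE_PRICE = 25.00           # ~$25 per 30GB drive, shipping included
DRIVES_NEEDED = 4             # 2 for local RAID + 2 for offsite
STORAGE_PER_GB_MONTH = 0.15   # assumed: $1.50/month for 10GB
TRANSFER_PER_GB = 0.20        # assumed: $2 to move 10GB


def diy_upfront(drive_price=DRIVE_PRICE, drives=DRIVES_NEEDED):
    """Upfront cost of the do-it-yourself drive setup (drives only)."""
    return drive_price * drives


def s3_cost(months, data_gb=DATA_GB):
    """Cumulative S3 cost: one upload plus storage for the given months."""
    upload = data_gb * TRANSFER_PER_GB
    storage = data_gb * STORAGE_PER_GB_MONTH * months
    return upload + storage


def break_even_months(data_gb=DATA_GB):
    """First whole month where cumulative S3 cost reaches the DIY drive cost."""
    months = 0
    while s3_cost(months, data_gb) < diy_upfront():
        months += 1
    return months


if __name__ == "__main__":
    print(f"DIY drives upfront: ${diy_upfront():.2f}")
    print(f"S3 after 12 months: ${s3_cost(12):.2f}")
    print(f"Break-even month:   {break_even_months()}")
```

Under these assumptions it takes about 66 months (five and a half years) of S3 bills just to match the price of the drives, and that is before counting drive failures, your own time, or the online-access hardware.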

Also interesting: the Amazon Web Services team published the principles of distributed system design they used to meet their requirements for S3.

  • Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.
  • Asynchrony: The system makes progress under all circumstances.
  • Autonomy: The system is designed such that individual components can make decisions based on local information.
  • Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.
  • Controlled concurrency: Operations are designed such that no or limited concurrency control is required.
  • Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.
  • Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.
  • Decomposed components: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.
  • Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.
  • Simplicity: The system should be made as simple as possible (but no simpler).
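To get a feel for a couple of these principles (symmetry and failure tolerance in particular), here is a toy, in-memory sketch in Python. It is definitely not how S3 is built. The `Node` class, the `replicated_put`/`replicated_get` helpers, and the "at least 2 acknowledgements" rule are all hypothetical, chosen only to show identical nodes carrying on when one of them is down.

```python
class Node:
    """A toy storage node; every node runs the same code (symmetry)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}

    def put(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


def replicated_put(nodes, key, value, min_acks=2):
    """Try to write to every node; succeed if at least min_acks accept."""
    acks = 0
    for node in nodes:
        try:
            node.put(key, value)
            acks += 1
        except ConnectionError:
            continue  # a dead node is treated as normal, not exceptional
    return acks >= min_acks


def replicated_get(nodes, key):
    """Return the value from the first node able to serve it."""
    for node in nodes:
        try:
            return node.get(key)
        except (ConnectionError, KeyError):
            continue
    raise KeyError(key)


if __name__ == "__main__":
    nodes = [Node("a"), Node("b"), Node("c")]
    nodes[0].alive = False  # simulate a failed node
    print(replicated_put(nodes, "photo.jpg", b"pixels"))
    print(replicated_get(nodes, "photo.jpg"))
```

With one of the three nodes down, the write still collects two acknowledgements and the read is served by a surviving node. No node is special, so there is no single point of failure in this tiny model.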