In an effort to replace my home backup server with Amazon's S3, I've been collecting a list of Amazon S3 compatible backup tools to look at. Here's what I've discovered, followed by my requirements.
The List
I've evaluated exactly zero of these so far. That's next.
- s3sync.rb is written in Ruby as a sort of rsync clone to replace the Perl script s3sync, which is now abandonware. Given that I already use rsync for much of my backup system, this is highly appealing.
- Backup Manager appears to now have S3 support as of version 0.7.3. It's a command-line tool for Linux (and likely other Unix-like systems).
- s3DAV isn't exactly a backup tool. It provides a WebDAV front-end (or "virtual filesystem") to S3 storage, so you could use many other backup tools with S3. Recent versions of Windows and Mac OS have WebDAV support built-in. Java is required for s3DAV.
- S3 Backup is an Open Source tool for backing up to S3. It's currently available only for Windows. Mac and Linux versions appear to be planned. The UI is built on wxWidgets.
- duplicity is a free Unix tool that uses S3 and the librsync library. It is written in Python but not considered suitable for backing up important data quite yet.
- S3 Solutions is a list of other S3 related tools on the Amazon Developer Connection.
- Brackup is a backup tool written by Brad Fitzpatrick (of LiveJournal, SixApart, memcached, perlbal, etc...). It's written in Perl, fairly new, and doesn't have a lot in the way of documentation yet.
- Jungle Disk provides clients for Mac, Windows, and Linux. It also offers a local WebDAV server.
- DragonDisk has Linux and Windows clients.
For those keeping track, non-S3 options suggested in the comments on my previous post are Carbonite, rsync.net, and a DreamHost account.
Are there other S3 tools that I'm missing?
Also, I've found that Amazon's S3 forum is quite helpful. The discussion there is generally of good quality and the software does the job nicely. Perhaps we should do something similar for YDN instead of using Yahoo! Groups?
My Requirements
Most of what I need to back up lives on Linux servers in a few colocation facilities around the country (Bowling Green, Ohio; San Jose, California; San Francisco, California). My laptop and desktop Windows boxes have USB backup and already get automatically synced to a Unix box on a regular basis using the excellent SyncBack SE, so I don't need to re-solve that problem.
I don't need a fancy GUI. I'm really looking for a standalone tool that's designed to work with S3 and keep bandwidth usage to a minimum. Alternatively, something that works at a lower level (such as a filesystem driver) to provide a "virtual drive" type of interface might work as well.
Posted by jzawodn at October 06, 2006 03:09 PM
I'm using BitBucket in Python. Works for me, and reliable enough in my experience of dumping 1 gig (and smaller) chunks up. I'm using a variation on the basic sync example, but I'm uploading gpg-encrypted tarball snapshots, not individual files.
There is this project which aims to treat S3 as an infinitely large disk (Linux only?):
http://dev.extensibleforge.net/wiki/s3/fuse
Latest brackup started to have more docs:
http://search.cpan.org/~bradfitz/Brackup/
But yeah, a little weak. Has a nice test suite and clear source code, though. :-)
How much do you trust the hosting service? Even if you have "nothing to hide", is it important that your data be encrypted or not? I'm simply mentioning because this is more important to some people than others (especially if you want to backup Quicken files or some such).
I’ve been thinking of using S3 to store backups of various machines (basically all linux/OSX ones), but what’s been holding me back is the inability of S3 to do rsync on the server side. rsync really needs an instance of rsync running “near” where the data is stored in order to do its cleverest compression/do-not-transmit smarts. Rsync is basically a win if you have a high-bandwidth link between the rsync server and the backing store, and a lower bandwidth link between the rsync server and client. With S3, you’d have to run the rsync server side yourself, remote from S3, which kind of defeats the purpose of rsync… But then I had a brainstorm.
Amazon’s EC2 service, which parallels S3, allows you to create a virtual machine and turn it on/off as needed in the Amazon compute cloud. The EC2 instances have high-bandwidth connectivity to S3 storage, and so would be ideal for running an rsync server! You can set up an EC2 instance which serves rsync, and then your backup script can turn the instance on, do the rsync, then shut the instance down when it’s done.
Now all I have to do is actually create the EC2 instance, then create some kind of wrapper around the EC2 API that handles the startup-backup-shutdown sequence, and voila!
How much upstream bandwidth do you have to make this satisfactory? My measly 512K upstream makes backing up online sound like a drag (though I do appreciate the reliability and general peace-of-mind).
As much as I'm willing to pay for. Again, I'm mainly backing up data that's already hosted in a data center somewhere.
I'm using s3sync for my newly-minted nightly backups to S3, and I have to say I like it a lot. I had some issues with version 1.0.1, but had a series of nearly real-time emails with the developer, was able to provide some tcpdump records of the problems, and version 1.0.2 came out soon thereafter that hasn't hiccupped once for me. I'm pushing MySQL database dumps up to S3 nightly with it, some of which are on the order of 30-50 MB, and everything works great.
The other tool you don't mention is jets3t:
It's a Java toolkit for using S3, but it also comes with two tools built atop the framework, one a command-line tool (Synchronize) that works flawlessly at moving directories of files up to S3 and back down to your computer (including metadata, and some version of checking to make sure files need to be pushed over the pipe before using the bandwidth). There's also a GUI, Cockpit, that lets you upload and download files as well as manage ACLs. I've now run enough tests with it that if I run into any problems with s3sync, I feel totally secure I could move over to it without a problem.
Oh, one more thing: I tried to get Backup Manager working on my Linux box, but couldn't get it to actually communicate with S3. Between it needing a bunch of Perl CPAN modules installed (without any documentation to that effect, meaning you have to watch for errors, decode which package is missing, go find it and hope it'll compile against your specific distro, etc.) and there not being any real way for me to debug what was going wrong with its attempt to talk to S3, it was an easy call for me to move on to other tools.
What Exactly Is Jeremy Zawodny's Job?
I have no idea. Neither does Jeremy.
you forgot this one.....
http://www.sync2s3.com/
Best,
E. David Zotter
Thom:
Huh? What does that have to do with any of this?
Does everything that I write on my web site have to be related to my job?
What are you getting at, anyway? Come on. Out with it!
(re: linkblog)
Well, what exactly is your official job title Jeremy? What exactly do you do everyday? I saw in a previous entry that you pretty much just answer emails. Come on. Out with it!
JungleDisk is exactly what I was looking for. I went ahead and just signed up for S3. Up until now S3 seemed very developer oriented and wouldn't hold up well for the person looking to back up various files or share them amongst computers. Now all I have to work on is getting a faster upload, as Comcast has something against us with a paltry 100kbit/s upload.
Interesting little tidbit on that Jeep on Water video (from the BBC's TopGear) - its presenter, Richard Hammond, recently crashed a jet car at 300mph and is currently in hospital. See http://news.bbc.co.uk/1/hi/england/north_yorkshire/5365676.stm
He suffered a serious brain injury and is probably only still with us due to the quick action of the Yorkshire Air Ambulance. Air Ambulances over here in the UK are completely supported by charity donations, and a page was set up after the accident to take donations for the Yorkshire Air Ambulance, which is here - http://www.justgiving.com/phrichardhammond
General opinion now is that Richard is going to be ok (http://news.bbc.co.uk/1/hi/uk/5390598.stm), but I think you can put that solely down to the Air Ambulance saving his life.
For those of you who are interested in trying out S3 without it costing you anything, or if you just want to test your homebrewed S3 backup suite, you might want to check out Park Place. It's an S3 clone that you can run on your own computer.
And for whatever reason, the URL didn't come through (maybe I shouldn't have tried putting it inside an <a> tag)...
Anyway, here's where you can find out more about Park Place.
It may not fit your requirements, but I'm evaluating Interarchy (Mac OS X) for use with S3. It treats S3 like any other remote server, and offers some nice automation features.
I have been thinking about offsite backups for awhile now. I posted a little bit about my thought process on my blog.
For now, I have decided to go with s3sync. I just finished a test yesterday backing up and restoring 2GB of photos. It worked great and I am now planning to document the process in detail when I implement it on my Linux server to back up 15GB-20GB of my important documents.
I actually upgraded my Comcast internet speed from 6 Mbps/384 Kbps up to 8 Mbps down/768 Kbps up, to double the upload speed ($10 more/month). In my test it took 3hr/GB upload, and about 17 minutes/GB download. Certainly it will take a long time for the initial upload, but it will only be the changed files after that.
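As a rough sanity check on the 3hr/GB figure, here is a minimal sketch of the arithmetic. It assumes the full upstream rate is usable, takes 1 GB as 10^9 bytes, and ignores protocol overhead; the function name is only illustrative.

```python
def upload_hours(size_gb: float, upstream_kbps: float) -> float:
    """Estimate the hours needed to upload size_gb gigabytes over an upstream_kbps link."""
    size_bits = size_gb * 1e9 * 8                 # gigabytes -> bits (1 GB = 10^9 bytes)
    seconds = size_bits / (upstream_kbps * 1000)  # kilobits per second -> bits per second
    return seconds / 3600


print(round(upload_hours(1, 768), 1))  # one gigabyte at the 768 Kbps tier
print(round(upload_hours(1, 384), 1))  # the same gigabyte at the old 384 Kbps tier
```

This works out to roughly 2.9 hours per GB at 768 Kbps, which is in line with the measured 3hr/GB, and about twice that at 384 Kbps.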
I was quickly reminded I haven't enabled QoS on my router yet. My girlfriend called and asked why the "server was so slow" while she was trying to work with files on my ssh server. Of course I could barely hear her with my VoIP phone.
Comcast note: I overheard my barber talking about calling up Comcast after seeing a TV special for TV pricing, and they updated his existing account with the promotional pricing. I thought that can't be true, so I tried it last month with my internet connection and they gave me the latest special, $33/month.
Here is a link to my thought process about different offsite backup solutions.
http://johneberly.blogspot.com/2006/10/cheap-reliable-secure-off-site-storage.html
not trying to spam, just thought it might help people.
By the way, is S3 backup really "open source"? I tried the Beta version which has an expiration date.
I am currently using a combination of BackupNinja (using the Duplicity module), Jungledisk and DAVFS2. A quite extensive list of programs, but it works perfectly.
BackupNinja is used to make dumps of the databases, to make a backup using Duplicity, and to schedule the backups.
JungleDisk provides webdav access to the S3 space, and DAVFS mounts the webdav as a normal directory to which BackupNinja writes the backups.
Works like a charm, much more stable than using Duplicity's S3 access features. Moreover JungleDisk supports caching, so no problem if the internet connection is lost during the backup!
Thom, EC2+S3 would be a killer app. If only an S3 account could be mounted as a device. I know that there has been some work in doing that (see the S3 forums) but there's nothing stable yet.
I use jungledisk with SuperSync http://supersync.com to access and back up multiple music collections. Running supersync on an EC2 system would be sweet, but just with the jungledisk cache, it's pretty sweet.
I couldn't tell if you are a developer or not, but if you are there is one more route: roll your own using the free library code amazon supplies. I did this as a "get to know ruby" project, and it isn't very difficult with the supplied code as a base.
something to keep in mind: s3 *isn't* a file system, it is a key/blob system where the keys can be almost anything and the blob can be anything as long as the size is less than 5G. want it encrypted? do it before you send it. compression? same. metadata? up to 4k anything you want using 'x-amz-meta-???' in the headers when you put the data. puts are theoretically all or nothing (there is no way to seek in a blob, so there is no way to provide true rsync-like functionality...if the blob changes you have to reput the whole thing). you can look at the md5 sum after a put so you can compare without resending all. there is no key rename and there is nothing equivalent to a hardlink (multiple keys pointing to one blob), so rolling .0, .1, etc style snapshot backups aren't convenient (because re-putting the data costs you both time and money). certain operations may require multiple gets, e.g. listing the contents of a bucket will at most give back 1000 keys per get, so you may need to iterate. there is no equivalent to 'rm -fr *' either, you must delete the contents of a bucket one key at a time. the upload speed is limited by your ISP, so don't be shocked if it is 35kbytes/sec (100 hours for 12G).
last, the simple way to put doesn't stream the contents, meaning it's ok for 3M jpegs, but if you want to put a 3G binary you'd better have a *lot* of memory. I think I recall from the forums that the s3sync author has figured out how to make streaming work, but I haven't looked at the code. you can ask him in the forum. when you are evaling tools you'll definitely want to check for streaming.
given what you want I'd look at s3sync first. I wouldn't have written something myself if it had been available at the time. I doubt you'd be happy with the ones that make s3 look like a disk in explorer because you can't automate using them. ruby/perl/python etc solutions are nice because it is at least possible to make them portable.
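The point above about bucket listings returning at most 1000 keys per request can be sketched as a simple loop. This is only an illustration: `fetch_page` is a hypothetical callable standing in for one listing request (it is not part of any real S3 library), and `make_fake_bucket` is an in-memory stand-in so the loop can be tried without an S3 account.

```python
from typing import Callable, Iterator, List, Tuple

# One page of results: the keys returned, plus a flag saying whether more remain.
Page = Tuple[List[str], bool]


def iter_all_keys(fetch_page: Callable[[str], Page]) -> Iterator[str]:
    """Yield every key by repeating the listing request until nothing remains.

    fetch_page(marker) is hypothetical: it takes the last key already seen
    ("" to start from the beginning) and returns (keys, truncated).
    """
    marker = ""
    while True:
        keys, truncated = fetch_page(marker)
        yield from keys
        if not truncated or not keys:
            break
        marker = keys[-1]  # resume after the last key of this page


def make_fake_bucket(all_keys: List[str], page_size: int = 1000) -> Callable[[str], Page]:
    """Build an in-memory fetch_page that serves sorted keys in pages."""
    ordered = sorted(all_keys)

    def fetch_page(marker: str) -> Page:
        remaining = [k for k in ordered if k > marker]
        return remaining[:page_size], len(remaining) > page_size

    return fetch_page


bucket = make_fake_bucket([f"photos/{i:05d}.jpg" for i in range(2500)])
print(sum(1 for _ in iter_all_keys(bucket)))  # counts keys across three pages
```

The same loop shape applies to deleting a bucket's contents: list a page, delete each key in it, and repeat until the listing comes back empty.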
gymbrall said:
"For those of you who are interested in trying out S3 without it costing you anything..."
Come on! Only if your own time is free. Installing an S3 sim is work. Whatever your hourly rate is, it's not pennies a month like S3.
S3 Backup got upgraded yet again: http://www.maluke.com/blog/s3-backup-beta5-gtd
I just signed up for S3 yesterday after doing some extensive research while I've been looking for an offsite backup solution. I looked at Carbonite and I do like it quite a lot, but S3 has much better prices.
Thank you for this post, this is literally EXACTLY what I was looking for.
jeremy, you say you store your data on three different hosting sites (ohio, san jose, sf). isn't this your "backup"? ohio goes kaput, copy from sf or sj, etc.
Most of these boxes are 3-4 years old and don't have the space I'd need to do that. The newest of them has ~120GB of RAID, so it can backup the others. But they cannot back it up.
I must say, s3 is very intriguing. I'm still not sure on it, mainly because carbonite and mozy exist. I'm really hoping a mac client comes out or else I'm sorta stuck w/ s3.
Pricing wise, carbonite wins because I'd back all my media up too.. which gets me into 40-60GB range. Mozy would be next because you can do block level changes, and have a private key. Mozy has a better interface I believe too IMO. If I were keeping my windows box, I'd go with them.
Jungledisk is interesting... I was almost sold until I read it stores your key in plain text on your computer. The jets3t client looks interesting as well, and s3 backup when it goes multiplatform.
I'm too drawn to carbonite's "unlimited storage" but I know I run risks if I dump 200GB of media on there. I also cannot understand how they plan on staying in business.
I started using Duplicity recently - it lets you do incremental (using librsync) encrypted (using gpg) backups (using tar) that can be stored remotely (using ssh/scp, ftp or supposedly S3 as well).
It is part of Fedora/Debian which I suppose gives it some non-abandonware-ness :)
someone in the previous post's comments mentioned Dreamhost offering 200GB / 2TB-per-month for $90 for 2 years... so that's what I've gone with. ;) An excellent deal.
Works like a dream. I've linked to you, written it up, and am spreading the word. Thanks.
This is a great blog and admittedly is aimed at those whose backup needs are stricter than mine.
All I want is a backup system for my key files and photos. Music not so important. I have a Mac and PC.
So, I'm planning to use iDisk of .mac to backup my photos on my Mac and the XDrive on AOL for my other things on the PC.
@Linh
I am in the same exact boat. I like the simplicity of S3, I really do, but I like the automation of Carbonite. Perhaps once the application front-ends for S3 have matured a bit I may make the full switch over, but for right now I think Carbonite is more what I am looking for.
As for the space consideration, I am looking to only backup around 10GB or less...unless I decide to backup my MP3's but then we're talking about needing 80GB of storage which I do not really trust any other company to hold on to.
I still wonder how Carbonite plans to stay in business. If I had to guess I'd say they are using S3 as their backend. There is no real other way they could offer the prices they do. I suppose I will stick with Carbonite for now, if (god forbid) they fail miserably then I will make the full switch over to S3, but I will also keep an eye on the development of some of the S3 backup programs.
This is a great blog, and it has a great group of commentors. Keep up the good work guys!
On the questions about how Carbonite will stay in business, just think about it a bit. They surely use S3 or can beat S3's prices internally.
They allow you to upload the first 50GB's unthrottled, but after that you are limited to 0.5 GB/day. ($7.50 /month at S3's prices) Plus they reserve the right to kick "abusers" of the network. (They define abuse.) So the maximum loss for power users is pretty limited.
Then consider that most of their users will only have a small amount of data that rarely changes and it does seem like on average they will make a profit. They just need to be so simple it is the solution you recommend to your mother. (A goal they meet pretty well.)
Whatever backup solution you use, it's vital that you try the following test:
1. Install it.
2. Do a backup.
3. Uninstall it.
4. Restore from the backup without it.
If you can't do this, then you will get instant bitrot when the backup program's manufacturer gets bored or goes bust. Or upgrades the format: all my Backup MyPC backups became suddenly unreadable when a new version was released without backwards compatibility!
assuming s3 itself survives, the keys to passing JB's test above are the format of the keys (I used the relative path as the key, e.g. "photos/2006/09.06.06/xyz.jpg", which is quite easy to figure out) and whether you can figure out the metadata (e.g. I keep the timestamp of the original file in the metadata as seconds since the epoch). if the key scheme isn't decipherable then you are dead meat without the tool. if you can't figure out the metadata and you need things like timestamp, links, etc, then you are not going to get what you want without the tool. if the tool is a script then you'll have the code, meaning in theory you can figure out what scheme the tool uses. even if you *can* dope it out don't forget applying a complicated scheme manually to possibly tens of thousands of files isn't going to happen.
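A minimal sketch of the kind of transparent scheme described above: the key is the file's path relative to the backup root, and the original timestamp travels as metadata in seconds since the epoch. The header name `x-amz-meta-mtime` and the function name are assumptions made for illustration, not what any particular tool uses.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def key_and_metadata(root: Path, file_path: Path) -> Tuple[str, Dict[str, str]]:
    """Return (key, metadata headers) for a file stored under root.

    The key is the path relative to root, with forward slashes.
    "x-amz-meta-mtime" is a hypothetical header name holding the file's
    modification time as whole seconds since the epoch.
    """
    key = file_path.relative_to(root).as_posix()
    mtime = int(file_path.stat().st_mtime)
    return key, {"x-amz-meta-mtime": str(mtime)}


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    photo = root / "photos" / "2006" / "xyz.jpg"
    photo.parent.mkdir(parents=True)
    photo.write_bytes(b"example")
    os.utime(photo, (1160000000, 1160000000))  # pin the timestamp for the demo
    print(key_and_metadata(root, photo))
```

With a scheme like this, anyone holding only a generic S3 client can see which key maps to which file and restore the timestamps by hand or with a few lines of script.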
I agree with Joseph Bruno, this is important.
I know JungleDisk uses some really weird naming scheme; I didn't figure it out in the limited time I spent trying.
S3 Backup uses straight keys (the way supersaurus suggested), if the file is compressed it has a meta field, returned as a HTTP header X-amz-meta-compression: zlib (or bz2 depending on the algo used). If encrypted it has a similar header X-amz-meta-cipher: AES or Blowfish. Figuring out the key to decrypt is not that simple however. For maximum security the password used for encryption is being 'salted' with a file name and a random string, different every time. The salt is stored as X-amz-meta-key-digest-salt, the SHA hash of the salted string is stored as X-amz-meta-key-digest (we need this to verify the password before downloading file and trying to decrypt it). Also there's X-amz-meta-decrypted-size that tells where to truncate the decrypted data (block ciphers need data to be aligned to a block boundary, it's funny how the anonymous guy who develops jungledisk took the easy way out -- he just used the not-so-secure RC4 cipher just because it doesn't require aligning).
http://en.wikipedia.org/wiki/RC4
I'll make sure to release a console tool compatible with S3 Backup, so you can be safe that your data are your data. BTW, you can make it happen sooner by voting -- see the last post in the blog (too many links to S3 Backup already, so I won't spam with more).
Oh, btw, I have already introduced incompatible changes to S3 Backup (in hash database format), but I took precautions so both old and new formats can be read; the app just uses the new, better format when writing. That's it -- fully transparent but still evolving.
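The password check described above (a random salt stored in X-amz-meta-key-digest-salt and a hash of the salted password stored in X-amz-meta-key-digest) might look roughly like the sketch below. The details are assumptions: the comment does not say which SHA variant is used, in what order the password, file name, and salt are combined, or how the values are encoded, so this uses SHA-1, simple concatenation, and hex strings purely for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Dict


def _digest(password: str, file_name: str, salt: str) -> str:
    # Assumed combination order and hash; the real tool may differ.
    return hashlib.sha1((password + file_name + salt).encode("utf-8")).hexdigest()


def make_digest_headers(password: str, file_name: str) -> Dict[str, str]:
    """Produce the two metadata headers to store alongside an encrypted object."""
    salt = secrets.token_hex(16)  # random string, different on every upload
    return {
        "x-amz-meta-key-digest-salt": salt,
        "x-amz-meta-key-digest": _digest(password, file_name, salt),
    }


def password_matches(password: str, file_name: str, headers: Dict[str, str]) -> bool:
    """Check a password against stored headers before downloading and decrypting."""
    expected = headers["x-amz-meta-key-digest"]
    actual = _digest(password, file_name, headers["x-amz-meta-key-digest-salt"])
    return hmac.compare_digest(actual, expected)


headers = make_digest_headers("correct horse", "photos/xyz.jpg")
print(password_matches("correct horse", "photos/xyz.jpg", headers))
print(password_matches("wrong guess", "photos/xyz.jpg", headers))
```

Because only the headers are needed, a wrong password can be rejected without paying for the download of the object itself.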
seems the list of backup services are pretty popular these days. Another similar post:
http://davie.wordpress.com/2006/06/30/couchsurfingcom-goes-down-but-a-lesson-learned/
And here is the list of more than 150 similar services:
http://www.listible.com/list/online-file-storage
Any recommendations? Strongspace seems pretty good to me (but a little expensive for my taste):
zhesto, general online file storage is VERY different from backup. Not to say that list is not useful, but those are not "similar services" at all.
Sergey, like you see somebody already complained in the comments about mixing different kind of services. Still there are several backup solutions inside the list (s3 based like jungledisk or not, like carbonite).
Jeremy, I just found this link from Scoble.
Might be another one to investigate? They serve up content for Digg, TWiT and others.
Jeremy, thanks again for starting this topic. It finally prodded me to implement the offsite backups I had been wanting to for awhile.
I think the idea of using EC2 with S3 to perform backups is very interesting, but I decided to go with a simple solution for now.
I posted how I automated my backups to S3 using s3sync here.. http://blog.eberly.org/
I have successfully uploaded and restored over 30000 objects (>4GB) with this method. I plan on backing up over 10GB starting this weekend. Once the initial upload is done, I estimate my monthly bill will be $2 or so.
Just another FYI guys,
For anyone who uses Firefox this is a nice extension,
http://www.rjonna.com/ext/s3fox.php
Basically looks/works just like a FTP client :)
Jeremy -- first, thanks for an excellent post! It's been keeping me busy. :) I had a quick comment: in the blog post your item for S3 Backup says that it's "an Open Source tool for backing up to S3". I didn't see any info on the S3 Backup site that it's Open Source.
Sergey, can you confirm or deny either way if the project is Open Source?
Thanks very much!
Omar, the app is "partially" open-source, not free in a FSF meaning for sure. I've been quite busy with the app itself, but I do plan to release a lot of underlying code with GPL or maybe MIT license. Another possibility is that I will release full source code with a restrictive license -- not for use in other projects, only for inspection. I don't mind making the application fully open-source if I was compensated for all the time I spent working on it, but I don't think it will happen any other way than that I will monetize it myself. BTW I wonder what Jungle Dave is going to do, JungleDisk has a massive install base already and as the app is closed source, I guess he has plans similar to mine.
In other news Mac OS X and Linux ports of S3 Backup are coming soon, maybe the next beta will span all platforms.
Thanks for the info, Sergey! I'm looking forward to using the next beta version. :-)
Quick update on state of S3 Backup on Mac: http://www.maluke.com/blog/s3-backup-mac
A new entry is Openfount's InfoMirror. It has a sophisticated Ajax GUI as well as a part that runs in the background.
I use SyncBackSE (love it for local backups), but even its backup to FTP did not cover my needs (unattended, encrypted, and secure).
Signed up for S3, tried most of the backup clients listed, stuck with JungleDisk, but was very disappointed with the upload speed. I know, it mostly depends on your up-stream, but it was painfully slow.
Came upon Mozy... and never looked back. Good interface, smart features, set it and forget it. No-nonsense people behind the company, which helps a lot. Check it out if you're on a PC.
(Shameful link/plug: i.e. I get free storage if you click this link as opposed to retyping it https://mozy.com/?ref=3F5CS8 )
Not really what you've been looking for, but saw this on CrunchGear today, and it's kinda cool..
Jeremy,
Thank you - and thanks to your commenters - for this list of tools. You pointed me in the direction of the s3sync Ruby script, which I now have running on my NSLU2 (a $100 NAS device from Linksys) - first successful sync today! - and this post helped a lot.
(Here’s the first of the blog posts I made as I went along, in case you or any of your readers are interested: http://www.gilesthomas.com/?p=5)
Cheers,
Giles
Just another solution: S3Drive
Basically looks/works just like a Windows network drive and is therefore accessible in all Windows and DOS applications and all programming languages that support file I/O.
I have been using iFolder.
www.ifolder.com
It is designed more for 'file sharing' than pure backup, but I use it for both.
What about s3backup ? Ruby-based executables ...
Hi,
We are about a week away from launching a private beta of PutPlace 1.0. This will initially offer automated backup to S3 and an explorer style view of your content once it's up there, with the ability to retrieve at will. Registration for the beta is available at http://putplace.com
Windows only at the moment with Mac in the works.
Hi Jeremy. Here is another tool to add to your list.
http://js3tream.sourceforge.net
It's a very simple bridge between TAR and S3.
Shane.
I tried jungle disk but it wouldn't do a scheduled backup - would do an on-demand one, but not scheduled - therefore pretty useless for me :-(
I have been using Super Flexible File Synchronizer for quite a while now to sync documents and photos between my laptop, desktop, and do ad-hoc backups to a second desktop drive and a USB HD.
It really is super flexible (well maybe too flexible, given the gui layout in some parts)
The latest version now supports S3 which I'm trying out. The program will run as a windows service, which is the sort of set-and-forget operation that I'd really like. So far so good.
Most tools I've looked at are simple and feel like a beta. SFFS goes beyond just a simple interface.
http://www.superflexible.com
30 day trial
windows only.
One little trick I've learnt is that Bucket names must be unique across all of S3 - so choose something obscure !
You can use a website to manage your Amazon S3 account: http://s3browse.com
Hello, a few months ago i started a similar list (inspired by this one). I'm trying to keep it up to date. If interested go to
http://elastic8.com/blog/tools_for_accessing_using_to_backup_your_data_to_and_from_s3.html
You can also check out S3-FTP at www.awsware.com. It implements an FTP server that stores to S3.
I run the s3fuse on a virtual server, http://www.redwoodvirtual.com/ However, I use dump/restore, and back up on an inode basis. I do a level 0 every 3-4 weeks and do the Tower of Hanoi method of diffs. This keeps transfers very low, as well as disk usage. rsync is great, but it is just a copy of the data as opposed to incremental backups. Sometimes people (myself included) want to recover something that was intentionally deleted a day or week ago.
I tried duplicity but went back to dump/restore since almost everything duplicity tried to do was just a complicated way to do the similar thing dump/restore already does, but didn't even bother doing it on an inode level, but only on a file level. duplicity, at least the last time I tried, did not allow scripted gpg encryption. It required the gpg encryption to be "signed", and who in their right mind would leave their secret key around with no password, just so duplicity could use it to sign+encrypt. encrypt would have been good enough, but it was not possible.
gpg was no problem with dump/restore because I just gpg -e (ed) the stdout of dump before writing it to its destination. Scriptable. Incremental.
Download using S3Fox doesn't seem to work on OS X. Looks like S3Fox is using backslashes where slashes are needed in destination paths for objects -- resulting in empty folders on the OS X side.
Thanks for the info, Sergey! I was looking forward to this.
You might want to check out Quillen. It is a new project that uses a novel approach to minimize data transfer and storage on S3. It also strives to be simple with just a command line interface.
Backup Review:
I found the best solution to back up your computer online. Mozy offers great customer service and wonderful services. You can back up your data for FREE for up to 2GB or you can purchase an UNLIMITED account for only $4.95 a month. Nate was correct, you should all try: http://mozy.com/?ref=3f9a896b&kbid=27203&m=5
I personally recommend them AAA+++
Iliana
Rdealz.com
I used mozy for a while and liked it until I switched to ubuntu and realized they don't currently support Linux.
I'm going to investigate some S3 solutions. I'm guessing that if I setup encfs I can back up the encrypted folder(s) for an additional level of security.
Btw if you do utilize mozy, they do offer private key encryption but files are encrypted on a per-each basis. You have to manually decrypt each file after a restore.
We released a User Interface today, which is called Bucket Explorer. Technically, it's probably the best solution out there to transfer files to S3 when you want to be in control, as it has an easy-to-use UI, which is built on top of the robust JetS3t API.
You can use Bucket Explorer as a simple FTP tool or a backup tool for Amazon S3 or you can use it to:
1. Browse buckets and the files stored at Amazon S3.
2. Upload and download files, to and from Amazon S3 buckets.
3. Set Server Access Logging (Bucket Logging) for Audits.
4. Synchronize data between multiple computers.
5. Upload files in HTML format for Web Hosting (even when the extension is not .html or .htm).
6. Create public URLs and signed URLs to share the files.
7. Access shared buckets and files from someone else's account.
8. Set Access Control on Buckets and Files for authorizing other Amazon users or non authenticated users with different access rights.
Bucket Explorer works on every OS where Java is supported. It uses a combination of Amazon's ETag and its own SHA-1 hash to make sure that a file is never transferred again to Amazon S3 if it has not changed, which saves bandwidth costs and time.
http://www.bucketexplorer.com/
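The idea of skipping unchanged files by comparing hashes, mentioned here and in the earlier comment about checking the md5 sum after a put, can be sketched as follows. This is not Bucket Explorer's actual logic (its SHA-1 scheme is not described here). It assumes the object was stored with a single plain PUT, in which case the ETag S3 returns is the hex MD5 of the content, usually wrapped in double quotes. The function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Optional


def md5_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path: Path, remote_etag: Optional[str]) -> bool:
    """Return True when the local file differs from the stored object.

    remote_etag is the ETag reported for the existing object, or None when
    the key does not exist yet. Surrounding quotes are stripped first.
    """
    if remote_etag is None:
        return True
    return md5_hex(path) != remote_etag.strip('"').lower()
```

A sync tool would fetch the ETag with a cheap metadata request and call `needs_upload` before sending any file content, so only changed files cost bandwidth.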
just an FYI I've posted "s3fs" (no relation to the original project named "s3fs"), a fuse based filesystem backed by amazon s3, on google code
We use S3 at our startup for our embed tool. Thanks for pointing me to all these useful tools.
Hello
BEST. POST. EVER.
I had no idea how important it was to post content as HTML.I liked your blog, it was fun to read. I look for blogs all the time that can influence me to come up with unique ideas for my blog.
Bye
I've just come across S3, read most of the blog but didn't get the answer that I was looking for :)
Wonder if you know?
Are there any products that will run a process on a Linux box to sync the data on the box with Amazon S3, based on a cron job?
WARNING. The JetS3t toolkit currently defaults to very weak (single 56-bit DES) encryption. Anybody considering using it needs to look into changing the algorithm it uses to PBEWithMD5AndTripleDES at least until they hopefully implement my bug report here:
Looked quickly at Backup Manager and saw that there is no S3 restore capability. Meaning "they will help you check your bits in but not out".
Here are a few more tools for S3.
S3Drive plugs into the Windows file system and maps an S3 folder as a local drive. It can be accessed like a network drive (\\servername\folder, where "servername" is the name of your S3 account as you specify it in the S3Drive config). S3Drive is free. It creates a virtual file system structure in XML format. That means the folder and file structure you see in Windows is not identical to the structure and names used for the resources in your S3 storage area. You can access the XML files via other tools, such as S3Fox.
S3 Plugin for Wordpress
http://tantannoodles.com/toolkit/wordpress-s3/
Flickr to S3 Backup Tool
http://www.postneo.com/2006/03/22/backing-up-flickr-photos-with-amazon-s3
Cheers!
check out 'Transmit' at http://www.panic.com/transmit/. so far so good, on the mac platform.
It might be interesting for you to consider this (free) next-generation backup product (Secobackup, S3SQL, www.secobackup.com). It's basically an "install and forget" type of product.
www.secobackup.com
Secobackup is a personal backup product - it provides continuous data protection - as you make changes and save them, they are automatically backed up into Amazon S3. One nice thing about this is that it uses deduplication internally. So if 3 of your PCs have the same JPEG picture stored on their disks, it will detect that they are duplicates and store only once.
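The deduplication described above is usually done by keying stored data on a hash of its content. A minimal Python sketch of that idea follows. It is not Secobackup's implementation: the `DedupStore` class is hypothetical, it keeps everything in memory instead of S3, and the choice of SHA-256 is an assumption.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical data is kept only once."""

    def __init__(self):
        self.blobs = {}  # content hash -> bytes, stored a single time
        self.index = {}  # (machine, path) -> content hash

    def backup(self, machine, path, data):
        """Record a file; return True only if new bytes had to be stored."""
        key = hashlib.sha256(data).hexdigest()
        is_new = key not in self.blobs
        if is_new:
            self.blobs[key] = data
        self.index[(machine, path)] = key
        return is_new

    def restore(self, machine, path):
        """Return the bytes that were backed up for this machine and path."""
        return self.blobs[self.index[(machine, path)]]
```

Backing up the same JPEG from three machines adds three entries to `index` but only one entry to `blobs`, so the picture is stored once and can still be restored for each PC.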
S3SQL is an automated MySQL backup product that uses Amazon S3 in a similar way. You set up scheduled backups of one or more of your MySQL databases, point the backup to your Amazon account, and it will automatically do regular backups, compress them, encrypt them, apply differential and de-duplication optimizations, and store them on Amazon's S3 service. It has an AJAX-based browser UI. Install it on a Windows XP box and you can set up backups of MySQL databases from any number of hosts: Linux, Solaris, Windows, etc.
It's free. Try it; our users are typically up and running in just a few minutes, with the backup already going to Amazon S3. It's simple to set up, and gets you immediate peace of mind from disk crashes, a leaky roof, or whatever...
Enjoy!
GP
Hi,
Well our own little Windows and Mac OSX - Amazon S3 software is 3 months old today!
http://www.databucketpro.com/about
Regards
Marc Liron
Microsoft MVP
Jeremy,
thanks for the post.
I have a requirement that needs to back-up 1 Terabyte of Data FOREVER - basically, it's an archive of data. Once it's uploaded, it stays the same size indefinitely. Looking at your solutions, I think it would be better for me just to get 2 DreamHost accounts (2 500GB accounts) or maybe Carbonite or s3. I could be wrong. Any suggestions?
For business use or if you have multiple computers, servers or alternative operating systems you should look at www.filesafebackup.com. Also good if you need to back up several GBs instead of like... 5.
Cheers, very nice list. Needed some ideas for a couple of sites and found a gem to base on thanks to this list.
Pageman, please keep in mind that using a DreamHost account as backup is against their TOS. There was some controversy about this a while back when they explicitly said you shouldn't do it.
They probably won't do anything if you just host a couple of files, but they said that if the majority of an account was non-web accessible, that would be considered abusing the service. I think two 500GB accounts just for storage would get you into trouble for sure!
I'm writing an implementation of S3 in PHP (for fun!). It's not exactly a backup tool, but if any of the tools above can back up to another S3 URL then it might be useful. I'm writing it because I don't believe in a single point of failure, and it might be useful for mocking or as a failover source. It's not close to being finished yet, but given a few more weeks it'll be much closer to being usable.
WOW, thanks for this list, I was looking for Mac OS X S3 backup solution. Thank you!
I started some C# source awhile ago for accessing S3 from PowerShell.
S3Nas PowerShell Provider:
http://s3nas.com/Home/tabid/53/EntryID/10/Default.aspx
Cheers,
Timothy
I like S3Fox on Windows and Mac — it works reasonably well, but still, I wish there was an elegant S3 backup utility for Windows in the same vein as Transmit, which looks pretty spiffy! The Windows ones I've tried so far have disappointed me and been very clunky/awkward to use. :(
You might want to check out S3stat while you're at it. It's a service that provides web stats for your S3 buckets:
I've also done something similar, which takes both file data and SQL data. It suits cPanel / WHM sites better too.
http://duivesteyn.net/2008/amazon-s3-backup-for-webserver-public_html-sql-bash/
Hey All,
I have recently developed "River Drive" (Yes! I know!), which is a Linux (via Mono) and Windows compatible GUI and command-line interface to S3. It supports drag and drop, and can be used to upload multiple files at the same time. It has a few things to be ironed out and I plan to add a load more features soon, such as a local FTP / DAV interface.
You can grab a download at www.tristanphillips.com/projects.php
I would appreciate any feedback anyone has?
Leave a comment on my blog at
http://www.sequencetechnology.co.uk/Blog/2008/07/amazon-simple-storage.aspx
Cheers!
I've been using SMEStorage.com. Currently they use Amazon S3 to host their service, but they are opening up the platform to any S3 user in September of 08. You get to use your files from iGoogle, iPhone and all sorts of other channels.
I've been using a tool called s3-bash, I'm not sure if it's endorsed by Amazon but it's been working pretty well for my needs.
A quick guide if anybody is interested:
http://blog.datajelly.com/2008/07/ubuntu-script-to-backup-data-to-amazon.html
Take a look at Manent: http://freshmeat.org/projects/manent (or go directly to the project page at http://trac.manent-backup.com).
One other tool is s3cmd from SF's s3tools project:
It is becoming pretty popular as it quickly approaches version 1.0. It supports rsync-like backup and restore, works great with non-ASCII (e.g. Unicode) filenames, and has plenty of other features. Worth giving it a try.
I've been trying the beta of a new service called SMEStorage, available at SMEStorage.com. These guys have a beta called OpenS3 which syncs your S3 information to their platform so that you can access the data from your iPhone, iGoogle, Facebook, their rich web client, etc.
They also enable your data to be integrated with other services such as Zoho, Picnik, Flickr, etc. It's pretty slick.
Try looking at:
http://sourceforge.net/projects/wins3fs/
it is in beta, but I have been using it with good success for a while now.
WebDrive maps a drive letter to S3 and it has a simple backup/sync utility built in. It can also map additional drives to servers using other protocols, like SFTP, WebDAV, etc.
Here is the freeware GUI client for Amazon S3:
http://s3browser.com/
I just checked out SMEStorage.com, which some of the other commenters here had written about. It allows me to use my S3 account with their service through something they call OpenS3. Now there is no shock there, because I have been using Jungle Disk for a while now to dump files to S3. However, I can now use my S3 account through their platform. That means I can access my S3 files via my iPhone, among many other channels. I also get a file explorer for Windows that lets me upload, download and share files easily, as well as plug-ins for Open Office and Microsoft Word. In short, my Amazon S3 experience just became much nicer!
SMEStorage recently started supporting GMail as a cloud storage provider, meaning you can store files on Gmail and access them on their platform exactly like I do my S3 files. And because I can export my S3 files to my GMail account via SMEStorage I have a back up of all my files that is free !
The only issue I have found is that I'm also using Jungle Disk, and it puts some weird information in the file names, so it is hard to figure out what a file is outside of Jungle Disk. I pinged the guys from SMEStorage and they said they would have a look at how they could parse this to make an import easier.
Hi Jeremy, hi all readers
I am the head developer of a new tool named "s3ganize", which comes with an Explorer-like user interface for Amazon S3. The URL of the tool is: http://www.s3ganize.com.
It's currently only available for MS Windows systems. But, as a Linux user, I will release a multi-platform application in the near future to give Linux users graphical access to S3 as well. It might be possible for me to support Mac OS X too, but I can't promise this right now.
I would be very glad if you would add this program to your list of S3 tools.
Thank you very much. If you list it, I'll give you updates as needed.
Kind regards,
Hagen
I want to store my company's data on Amazon's S3 for multiple users to access, edit, etc., and I'm looking for an application that would allow me to create a backup of the data stored on Amazon's S3 to an offsite location of my choice, rather than use S3 as the backup storage.
Any ideas?
Wait a sec... you say you want to store your company's info on S3 and then you say you don't want to use S3 as the backup storage; which is it? o_o
-jack
Because we are located in a major hurricane city on the Gulf Coast the idea is to no longer use our server as the primary location for our data, but to use S3 as the primary location, which multiple users can access and edit at anytime. What my bosses would like is something that will backup the data that is on S3, but maybe to a swappable drive that a member of management could take home as an extra sense of security. So essentially the original data is safely stored and used on S3 and we make backups to an external location. It is very sensitive data that basically runs the business and they want redundancy basically to the point of overkill.
I hope I explained this a little better. Let me know your thoughts.
Thanks,
Mike
Hi everyone,
want to try another S3 Browser? Check out www.cloudberrylab.com. It is completely FREE! We have released it as a side project from our main S3 based project and we want to share it with the rest of the world! Enjoy!
I found out about SMEStorage.com here from Morgan. They accepted me into their OpenS3 beta and I've been using their web file explorer and their Windows backup tools. Nice service, and free! One interesting thing you can do is register for a Gmail storage account (it uses Gmail as storage) and then import your Amazon S3 files into Gmail as a backup. It obviously won't work for large files, but it's kind of reassuring to have a cloud backup of vital documents and stuff. Plus it's easy to then use normal Gmail tools to access the files.
I've been using Dropbox. It uses Amazon S3 as its storage backend. It is one of the best sync tools I've found. No hassle. It has worked flawlessly for me since I started using it.
www.getdropbox.com
I'm using the solution I describe here:
http://sharph.net/2008/12/encrypted-offsite-backup-with-encfs-amazon-s3-and-s3cmd/
It's just a "reverse" encfs mount with s3cmd's sync command.
Hi,
http://www.s3rsync.com/ allows you to rsync to your S3 bucket.
I've written my own solution using PHP and mysqldump to automatically place backups on S3. Read it here: http://www.keithmander.com/?p=493
You have JS3Upload which is dedicated to upload (with MD5 and resume support):
http://www.jfileupload.com/products/js3upload/index.html
And JS3Explorer, which lets you manage the content of your buckets (delete, upload, download, move, copy, update ACL and more):
http://www.jfileupload.com/products/js3explorer/index.html
http://www.tarsnap.com/ has a good pedigree and looks very promising.
You should also try GoodSync for backup and synchronization to Amazon s3 as well as other types of online storage: http://www.goodsync.com/synchronize-applications/how-to-sync-via-amazon-s3.html
S3Toolbox is my personal tool I created to move files to Amazon S3. It is free and it works.
I've written yet another S3 backup script that you may want to check out: http://dev.davidsoergel.com/trac/s3napback/. It's very easy to use and handles backup rotation, incremental backups, compression, encryption, and MySQL and Subversion dumps. Enjoy!
I like Gladinet (http://www.gladinet.com). Have been using it for over 3 months now. Works well with S3. The bonus point is that I can use Google Docs, Google Picasa, SkyDrive all at the same time side by side with S3 as virtual drive. Periodically I copy my Google Docs files into S3 for backup purposes.
Hi Jeremy, thanks for your post!
Please add to your list a new CloudBerry Online Backup powered by Amazon S3 with friendly user interface, strong data encryption and scheduling capabilities. You can sign-up for beta at http://cloudberrydrive.com/
You have CrossFTP Pro which is an excellent FTP tool to manage your data on Amazon S3, and do the backups/synchronizations.
http://www.crossftp.com/
Found something very simple that gets the thing done called As3FileSync at http://www.as3soft.com/
Another very useful tool: S3fm, a free online Amazon S3 file manager. 100% Ajax, runs directly from amazon s3, secure and convenient.
Does anyone know if there is a way to map an S3 drive/bucket to a windows machine as a network drive?
Also, it's necessary that any files copied to the S3 bucket be set as public read so that they can be accessed from the web.
I have tried WebDrive, but all copies end up as Private and they don't know when they will have a version that can inherit ACLs from the parent bucket.
Thanks.
David Pascoe mentioned Super Flexible File Synchronizer in February of 2007. I'd like to bring it up again. "SFFS" is rapidly evolving and its S3 support has also received many revisions since its introduction.
The program is very flexible indeed.. so much so that it might take a while for a new user to figure out all it has to offer. That is to say, it is extremely configurable.
Currently available for Windows and Mac. 30-day free trial available at http://www.superflexible.com
(I'm not associated with "SFFS" other than being a satisfied user for several years.)
We are building Amazon S3 backup software for home users and enterprises.
It supports block copy, and binary patching is on the way.
Please add us.
Check out :
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2762
We offer a backup program that does online backups to the Amazon S3 service (http://backazon.com). We offer a 30 day free trial, so please check us out.
(P.S. - I read your posting policy, and I think this comment is on topic and not spammy. My apologies if you disagree.)
There is no such thing as "the best". You have to find one that fits your needs.
Some things to think about when you pick one:
1. Easy to use.
2. Can you get your files back when you need them? Most online backup services do well at backing up your files, but when you lose your files and need them back, you either can't do it or the download takes a long time. By then it is already too late to find out.
3. Real cost. Don't believe any "unlimited" myth; it is for PR only. First, you pay for each computer, and your usage is capped by your drive size. Second, your system files are not backed up. You need a rough idea of how much you will actually back up, then calculate the cost on a per-GB, per-month basis. You will find that "unlimited" is not really cheap in most cases.
4. How long do you think the company can last? Most startups can disappear, and your files will disappear with them. The better way is to back up to some well-established online storage like Amazon S3.
Kevin
Kevin@netcdp.com
http://www.netcdp.com
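The per-GB comparison in point 3 above comes down to simple arithmetic. A rough Python sketch follows; the function names are hypothetical, and any prices you plug in (for example a $5.00 flat monthly fee or a $0.15 per GB metered rate) are illustration values only, not quotes from any provider.

```python
def effective_price_per_gb(flat_monthly_fee, gb_backed_up):
    """What a flat "unlimited" plan really costs per GB per month,
    given how much data you actually back up."""
    if gb_backed_up <= 0:
        raise ValueError("gb_backed_up must be positive")
    return flat_monthly_fee / gb_backed_up


def break_even_gb(flat_monthly_fee, metered_price_per_gb):
    """Amount of data (in GB) at which a flat plan and a metered,
    pay-per-GB plan cost the same each month."""
    if metered_price_per_gb <= 0:
        raise ValueError("metered_price_per_gb must be positive")
    return flat_monthly_fee / metered_price_per_gb
```

With those illustration values, `effective_price_per_gb(5.00, 20)` gives 0.25 per GB, and `break_even_gb(5.00, 0.15)` comes out at roughly 33 GB: below that amount the metered plan is cheaper, above it the flat plan wins. Transfer and request charges are ignored here.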
For those looking for an up to date, easy to use, rather cheap online backup solution I've recently just started using backblaze.
$5/month (per PC), for unlimited storage.
Much cheaper than S3
s3backer is an open source user-mode block-oriented S3-backed filesystem for UNIX systems. Details here:
The upload method sometimes feels like the only reliable backup method. I personally use a hard drive, but then, what about a flood or fire, am I protected?
That's really a nice list and it's good to know about this.
Here's another service offering offsite backup using open standards (Rsync, SSH, Encfs). No proprietary client required...
www.datastorageunit.com
The basic plan with 100 GB of storage is only 2.99 a month.
@archie re: s3backer - thanks
We've got multiple machines and technical expertise and that looks ideal :)
Thanks for the post Jeremy
Ben
Make and take care of backup using backup resources.....http://rapiddigger.com/search/backup/
These backup tools really are great. I have used some of them. But at present I am using the Magic Backup online service, and it's really great. Magic Backup is so easy to use, and so reliable. Unlike other backup products that perform "scheduled" backups during the middle of the night, Magic Backup is always on the lookout for new or changed files that need to be backed up. The minute you're done editing a document (well, 10 minutes after, actually), Magic Backup will silently prepare and transfer a secure copy of that file to your private location on our servers. You never have to worry about complicated configuration settings, marking files for backup, changing backup tapes, burning backup CDs, or any of that old-school backup mumbo-jumbo.
well, beside all of the comments given above I'd like to say 'good work Jeremy...'
I need feedback for my S3 client (Qt based) :
http://www.jamdisk.com
Thanks.
Tony
Hi Tony,
I discovered the linux version of jamdisk. I like it very much.
John.
Amanda/Zmanda recently added S3 support. A bit more commitment to install than the others, but once in place, very effective.
Found a (commercial) command line tool for S3 with on-the-fly data encryption, multi-threaded, diff, symlinks,...
http://www.activ8.at/homepage/en/a8s3.php
I'm using it to create nightly backups of several servers (user files, DB dumps, config files, ...) to S3. Data is encrypted at S3 (so none of the Amazon guys can get to our users' info ;)). It even stores symlink info (symlink-destination metainfo) and restores symlinks as such (Java 1.7 is required for that, but that's not a problem).
I like S3Fox on Windows and Mac - it works reasonably well, but still, I wish there was an elegant S3 backup utility for Windows in the same vein as Transmit, which looks pretty spiffy! The Windows ones I've tried so far have disappointed me and been very clunky/awkward to use. :(
I have no issue with paying money for it, but I really care about my back up data that process in the data center.
The cloud has become a lot more powerful with Cloud Storage and Cloud IT Solution 5.0. It is far more than just storage or backup. Not only can you back up files to the cloud, you can also move your entire file server, FTP server, email server, web server and backup system to the cloud. You can create sub-users and sub-groups, set different user roles, and share different folders with different users with different permissions. For a small business, cloud-based storage, backup, sharing and Cloud IT Solution can save you a lot of cost, while offering better, more secure and more reliable services that can be accessed from anywhere.
DriveHQ.com is one of the first few companies offering such cloud based services. It is now offering the version 5.0 Cloud Storage and Cloud IT Solution. For more info, please visit: http://www.drivehq.com/. DriveHQ basic service is also free.