Cheapest way to put a hard drive on the internet.
I'm currently researching the cheapest off-site backup system, and it looks like leaving an HDD at a friend's house is the best option. The only thing I am stuck on is how to access it remotely. I need a system on a chip that I can plug into the HDD and Ethernet and that provides SSH access. My first thought was a Raspberry Pi with a SATA-to-USB cable, but since I will only be doing weekly backups it makes no sense to keep the drive spinning 24/7. I need some way to turn the drive off and then back on over the internet. From what I understand, there are Linux programs that can do it, but only directly over SATA, because the command doesn't work on USB SATA controllers.
What I need is a cheap Linux SoC that has SATA and Ethernet. Does anyone have any ideas?
Any USB drive that I can think of will automatically spin down after a short period of inactivity, so if that's your only concern then you should be fine. Just make sure that the drive either comes with an external power supply or has a low enough draw for the Pi to supply - spinning disks, especially if you go for 3.5", sometimes draw more than you'd think.
More generally, depending exactly how price sensitive you are in this (and what parts you have already, and how much you're trading off time vs. cash), it might be worth looking at the networked drives you can get from Buffalo, Western Digital, or Synology. They do what you need and are pretty much entirely plug and play; if you're happy with manufacturer refurbished, the WD ones are barely more expensive than the drive inside.
There's cheap in the sense of 'I paid the lowest amount for this to get started', and there's cheap in the sense of 'I get the best value for my money over time'. For the first, piggybacking on the friend's internet might do it, but you're now liable for all equipment failures etc. For the second, something like Dropbox is going to do far, far better. And if you have security concerns, pick up an open source encryption solution and just archive encrypted images.
I plan on doing backups using duplicity which encrypts everything with GPG before it leaves my computer so I should be OK on that side.
I understand what you are saying with the hardware failures but I want to store about 3-4TB and I did some calculations and buying a drive will be cheaper in about 4-5 months and I expect the drive will last a lot longer than that. I also have the upside of being able to recover it without having to download the data over the internet which would take ages.
There are failures you're aware of before it's too late, and failures you aren't. One of the biggest benefits of most cloud storage is that it's stored using some flavour of RAID, where the data is actually stored multiple times to ensure it can't be lost. Imagine you have a failure and want to roll back, only to find you've lost the drive sectors where your data is.
Unless you're running fairly intensive verification procedures on a regular basis (which themselves limit the life of the drive) or using a proper NAS with a drive array so that no one drive is a single point of failure, you're gambling.
For 4TB, even at the lowest price I could find, that's $25-30/month indefinitely. Within a year, you've already covered the cost of two redundant drives and a basic but solid NAS enclosure.
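To make the break-even reasoning concrete, here is a rough sketch in Python. The $25-30/month figures come from the comment above; the $300 hardware price is purely a made-up placeholder for "two drives plus a basic enclosure", and `break_even_months` is a hypothetical helper name, so plug in real quotes before relying on it.

```python
import math


def break_even_months(upfront_cost: float, monthly_fee: float) -> int:
    """Return the first month in which cumulative cloud fees reach the upfront cost."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return math.ceil(upfront_cost / monthly_fee)


# Assumed hardware price (hypothetical): 300 dollars for drives and enclosure.
ASSUMED_HARDWARE_COST = 300

low_fee_months = break_even_months(ASSUMED_HARDWARE_COST, 25)
high_fee_months = break_even_months(ASSUMED_HARDWARE_COST, 30)
print(low_fee_months, high_fee_months)
```

This ignores electricity, replacement drives, and your own time, all of which push the break-even point later.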
There are definitely trade offs either way, so I'd say it's actually not clear cut overall, but for pretty reasonable levels of reliability it looks as though local beats cloud pretty noticeably on price once you pass about the 1.5-2TB range.
[Edit] Looking even further, G Suite gives unlimited drive storage if you have 5+ users on the 'business' plan at $10/user/month. Which would probably be worthwhile again once you're up to 10-15TB required storage or more. I imagine they'd probably cut you off beyond 100TB or so, though.
Once you hit a certain point (and a certain style of use) Amazon S3 Glacier is easily the best. Cloud creams local for scale, without any possibility for competition unless you've missed something important - remember, the cloud provider is getting better drives than you, cheaper than you can buy bad drives, and their economy of labour means they can maintain them for a fraction of the cost.
The downside to Glacier is that if you want frequent retrieval it's better to go with S3 Standard, but if it's an emergency backup system then you're only likely to want a retrieval in an emergency.
Oh yeah, I agree on that for anything serious in scale, importance, or both - but that's generally the case in a business setting where an extra $50-100/month is a drop in the bucket. I wouldn't be telling anyone in a work context anything other than to put it on the cloud.
For personal backups, a small lump sum followed by zero ongoing cost can still be attractive - if it hits "good enough" and works out at half the price over 2-3 years then it might still be the way to go.
Piggybacking on the 5+ users thing: I have one gapps user and it still let me add the $5/mo unlimited drive add-on without adding 4 more users. So that's US$10/mo for unlimited storage. I'm sitting at just over 1TB used at the moment and it hasn't capped me at all.
Is it a Windows or a Mac machine you want to back up? Have you looked into Backblaze? They have "unlimited" backups for $50/year. They don't support duplicity, but they do let you put your own key into their program. They also have some pretty nice ways to restore, like getting a flash drive or hard drive with your data mailed to you.
If you're serious about backing things up safely you need to go for more than just one drive.
Pick up a Synology DS218j and a pair of drives to put in it (4TB+ each). Set the Synology up to mirror the data on both drives. This way you'll be protected from drive failure. The NAS software on these devices is second to none: just plug it into the network and set it up as your backup target. It'll spin drives down when not in use and stay in low power mode (a couple of watts). You can also configure the Synology to upload what you put on it to any cloud provider, just in case. These are about $170 without drives. They'll last a long, long time (I've got a DS1815+ that's over 10 years old and still working great).
I regularly sleep at two different locations, so I have an HDD set up at each location using Raspberry Pis.
You can use USB HDDs on a Raspberry Pi without problems if they have their own power supplies.
One can normally also set their spin-down time or manually spin them down using hdparm. The range of supported drives is quite wide, and from my experience it seems to support most USB drives.
I also use openvpn to connect the two locations and unison to sync them at 4 am, so that I can usually just back up really fast to the local drive and it gets synced to the respective other location. This doesn't seem really useful in this case, though, since you probably only want to transfer the backups from your main site to your offsite, and rsync is enough for that.
If you don't want to open ports at your friend's router but you're okay with opening ports at your own place, I would suggest running an openvpn client offsite to connect to your main site openvpn server.
Of course, any SoC with USB is okay, but it is preferable to have native Ethernet and SATA connectors instead of relying on USB peripherals.
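For anyone wanting to script the hdparm part, here is a small Python sketch that only builds the command lines and doesn't run them. It assumes the standard hdparm flags: `-S` to set the idle spin-down timeout (values 1 to 240 are multiples of 5 seconds) and `-y` to put the drive into standby immediately. The helper names and the `/dev/sda` device path are hypothetical, and whether a given USB bridge passes these commands through is something you'd have to test on your own hardware.

```python
import math


def hdparm_standby_value(seconds: int) -> int:
    """Encode an idle timeout for `hdparm -S` using the 5-second-unit range (1..240)."""
    if not 5 <= seconds <= 1200:
        raise ValueError("this sketch only covers timeouts from 5 s to 20 min")
    return math.ceil(seconds / 5)


def set_spindown_cmd(device: str, seconds: int) -> list[str]:
    """Build the command that sets the drive's idle spin-down timeout."""
    return ["hdparm", "-S", str(hdparm_standby_value(seconds)), device]


def spin_down_now_cmd(device: str) -> list[str]:
    """Build the command that sends the drive to standby right away."""
    return ["hdparm", "-y", device]


# Hypothetical device path; check yours with lsblk first.
print(" ".join(set_spindown_cmd("/dev/sda", 600)))
print(" ".join(spin_down_now_cmd("/dev/sda")))
```

The lists can be handed to `subprocess.run(..., check=True)` as root over SSH or from a cron job after the weekly sync finishes.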
For absolutely cheapest (although this is probably not what you're going for), Google offers free image and video hosting, which you could use to store data with some sort of data-to-image converter, then use those images as video frames.
I am 100% sure that if I uploaded 4TB of data encoded into QR codes my account would be banned.
But you sure could make a neat-o blog-post about the process ;)
I'd skip the Raspberry Pis and stuff and go with a proper tiny tower with a few drives, mirroring etc. Something like this.
What sort of transfer speeds are on each side of this process? For me, if I saw the person on a regular basis, I'd physically swap drives. It all depends on the frequency of updates and the types of files, though.
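As a rough way to answer that for your own connection, here is a back-of-the-envelope Python sketch. The 20 Mbit/s upload speed is only an assumed example, `transfer_hours` is a hypothetical helper, and it ignores protocol overhead, compression, and the fact that incremental backups move far less than the full set.

```python
def transfer_hours(size_tb: float, mbit_per_s: float) -> float:
    """Estimate hours needed to move size_tb terabytes over a link of mbit_per_s."""
    if mbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    bits = size_tb * 1e12 * 8          # decimal terabytes to bits
    seconds = bits / (mbit_per_s * 1e6)
    return seconds / 3600


# Assumed example: the 4TB from the original post over a 20 Mbit/s uplink.
hours = transfer_hours(4, 20)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

If the initial full backup works out to weeks, seeding the drive locally and then carrying it over is usually the quicker route, with only the weekly increments going over the wire.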