Episode: 544
Title: HPR0544: HPR: A private data cloud
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0544/hpr0544.mp3
Transcribed: 2025-10-07 22:50:37

---

My name is Ken Fallon, and in today's episode of Hacker Public Radio I'm going to talk about
setting up a secure private data cloud solution, where I use rsync over SSH to back up my data
to a remote location.

Over the last few years I've stopped using an analog camera and an analog video camera.
As a result, I don't print out photos anymore because, you know, the reel is never full as
such, which kind of explains why my parents don't have any recent pictures of the kids. But if
you're in this situation as well, you've got to ask yourself the question: how secure and safe
is my data? Up until this point, if your hard disk crashed, there wasn't really that much stuff
on it that was going to be important. Maybe you had a backup CD of a few documents, and maybe
you needed to re-create your CV, or resume, or whatever.

Things have changed now, though, because the only copy of your cherished photos is going to be
on that computer, and the only memories, the videos of your kids, are going to be on that hard
disk. And over time, as the capacity of hard disks has increased, the reliability has decreased.
They're doing an awful lot of tricks on the controller to make sure that, as data sectors fail,
they're moved seamlessly to another location, but you can't really trust that. I've included a
PDF of a study done at Google, and it shows the failure of disks over time, you know, from 2% up
to 50%, and within five years you can more or less be guaranteed that your hard disk is going to
fail. What was worrying in that study as well was that the SMART monitoring tools that are
supposed to report the health of the hard disk in many cases didn't. So you probably won't even
have any warning that something's going wrong until it's too late.

Traditionally, the way we backed up stuff was to copy the important files onto a CD or, you
know, copy a few documents and a few spreadsheets onto a USB stick, and we're happy out. But
those media themselves, as I've already said, are unreliable and degrade over time. DV video
from four years ago is already bad; DVDs and CDs get corrupted and it's difficult to read them.
Worse are SD cards: if they fail, that's it. There's nothing left, you're in big trouble. The
fact of the matter is that the capacity of these backup media is okay for a few documents, but
when you're talking about photos and videos, especially HD stuff, there isn't enough capacity
to cover it. So the problem is that the amount of data has increased and hard disk capacity has
increased, but the means to back it up hasn't.

So of course the solution to this problem is to go out and buy yourself even more hard disks.
You can pick up a 1.5 terabyte hard disk from Amazon, I just checked there and it was 95
dollars, and the equivalent over here in Europe is one terabyte for 70 euros from MyCom. So
basically the solution to this problem of hard disks failing is a juggling act, where you move
data from one to the other and, as hard disks fail, you replace them. You're looking at
something like a mirroring array, where everything is written to both disks and if one fails
you still have the backup, or a RAID 5 array, where you've got three or more disks and if one
fails there's enough information on the other two to be able to recreate the third.

Now there are also proprietary solutions out there, like a Drobo, which is an all-in-one unit
that you can just connect to the network. But there are also sub-100 euro or dollar NAS
solutions, network attached storage. My brother has an HP one that runs Debian, and I see them
every week: small, low-power devices that can take Serial ATA hard disks and automatically
mirror them, no thinking required. If you've got that in your home, then at least the risk of
one hard disk failing is kind of eliminated, or at least minimized somewhat. You've got to keep
an eye on it to make sure that if one hard disk fails you notice, so that you can replace it in
time, so you've still got to do your homework.

Okay, so you've got a good solution, but as any good sysadmin will tell you, there is no such
thing: RAID and mirroring are not backup solutions. You still have to have a backup solution,
and in this case you've got to have an off-site backup solution, because what if there's a
power outage and it accidentally fries all the disks, or your network attached storage blows up
and goes on fire, or somebody comes in and steals it, or there's a flood or whatever. So you
really want to have an off-site strategy, and a lot of this
stuff, you know, these concepts have been around for years, but they've been limited to banks
or whoever. Now that you have important data that's stored only electronically, you've got to
make sure that it is replicated to off-site locations. The first thing that springs to mind is
something like Gmail Drive or Ubuntu One or Dropbox, and even the TWiT network had ads for
Carbonite there for a while. The problem with those solutions is that they're good for
USB-stick amounts of data, but once you start talking about the amount of data I'm talking
about, I've got a quarter to a half of a terabyte, you're looking at $100 per month, and the
price of a one terabyte hard disk is itself $100. So it's far more economical to simply buy one
or more disks and have them in an off-site location. By off-site location I mean a family
member or friend, or a work acquaintance or colleague; your parents if they have broadband,
your brother if he's got broadband, your sister, whatever. It makes a lot more sense to put
your data over there rather than paying these prices. What I've heard about some of these
solutions is that the cost of uploading data is quite cheap, but you really pay through the
nose when your data is gone and you need to download it, because then they have you by the
short and curlies, basically, and you have to pay outrageous amounts to get your data back. So
the solution then is to buy more hard disks and
just put them in different locations, and use rsync to replicate the data from your NAS or
your RAID solution or your server or whatever, we'll call it your NAS, and replicate that via
rsync over SSH to the remote location.

Now, before I get into anything else, I want to say there's a degree of trust involved in this
on both sides. First of all, you've got to trust that the people on the other side are not
going to be looking at your private and confidential data. And the other way round, you've got
to take into account the fact that the person is going to have access to a machine on the
inside of your network, and they may be able to access your files as well, or they may be able
to set off processes that might get you in trouble with the law. The way around the looking at
the files would be to use encrypted disks and, instead of shipping just a hard disk, to ship
one of these low-cost NAS servers with just enough of an OS that it boots up, gets an IP
address, registers with dynamic DNS or whatever, and notifies you, so that you can SSH in and
mount an encrypted drive. That gets over that problem, but it doesn't actually get you over the
problem of somebody having a machine inside the network and doing nasty stuff on your side. So
proceed with caution, but, you know, if it's a family member there's a level of trust there, so
I wasn't actually that concerned about that, to be honest with you. So what you're going to do
is, you know,
if you both have newly provisioned disks and you don't have a lot to synchronize, you can
probably just go ahead and begin the synchronization. But if you do, you might want to consider
buying the disk that's going to be in the other location, putting the data on it, and then
shipping it physically. The way rsync works is that it's a bit like a copy on the initial run,
so you copy all the data from the source to the destination, and subsequently only the changes
are sent by rsync. So you're never going to copy everything over again, or at least you
shouldn't. If you are copying stuff over again, it could be that something's wrong. I had a
case where the parameter I was using specified that it should check the user ID and the group
ID to make sure that they were correct and, if not, modify them, so it was re-copying files
over because the user ID and group ID I was using on the other side were different. Anyway, I
got that fixed. So what I suggest you do on the first run is to use sneakernet, where you
physically put the data on a drive, put the drive in the post, and the postal service, DHL or
whoever, ships it off to the other side. Then you put it in, and you can just synchronize the
changes after that. So that's a nice, convenient solution. Another tip: if you're synchronizing
from a Unix system to a Windows system, you're going to lose some of the file attributes, so
users, permissions and that sort of thing will be lost. If you can, it's better to keep it
between Unix systems.

Of course, if you're not worried about permissions or whatever, and you're just worried about
the files themselves, rsync will happily support syncing between Unix and Windows.

Okay, the command. In our hypothetical situation, we've bought an additional one terabyte disk,
it's been shipped to your home, you've plugged it into your NAS server, and it's been mounted
as /media/disk. The rsync command, and I'll go through the options one by one afterwards, is:
rsync -vva --dry-run --delete --force, and then the source and destination, which in my case is
/data/AutoSync /media/disk. The first one is always the source, so /data/AutoSync is the
source, and the second one is always the destination, /media/disk. The --dry-run there, as you
can probably guess, makes sure that you only go through the motions of copying. It'll give you
all the messages about what it's going to do, but it doesn't actually do it, and that's very,
very useful on your first synchronization. It's a very good idea, when you're starting this for
the first time, to put some test directories on both the source and the destination and use
those until you're sure you've got the syntax down and correct. In those directories I'd
usually have identical files on both sides, files with the same name but different sizes, and
directories on one side that aren't on the other. Then run the command, and with that you
should see what's going on on both sides.

So let's go through the command. The -vv sets the verbosity level, which increases the amount
of information given during the transfer: one v is a small amount of information, and the more
v's, the more information you get. The --dry-run is, you know, don't perform any changes. The
--delete will delete any files on the destination that are not on the source, and the --force
will also delete empty directories on the destination that are not on the source. This is where
my warning comes in: you've got to be careful when transferring this data that you don't
accidentally overwrite anything. If you accidentally put /media/disk /data/AutoSync, what rsync
will do is look and say: oh, there's a lovely new empty hard disk here, and he wants me to
synchronize it over to that big location over there with all those files and folders, and I've
been told to force and delete, so okay, I'll go and delete everything over there. And then all
my videos and photos are gone, with no way to get them back. So be careful about what you're
doing. It might be no harm at this point to mount the destination through a loopback, and mount
that drive or that section of your device via a read-only share, so that the chance of that
happening is minimized.

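The test-directory approach described above could look roughly like the sketch below. The scratch paths and file names are made up for illustration; the episode's real paths are /data/AutoSync and /media/disk. The trailing slash on the source is an assumption on my part (it tells rsync to copy the contents of the directory rather than the directory itself), since the spoken command does not say either way.

```shell
# Scratch directories standing in for the real source and destination (hypothetical).
work=$(mktemp -d)
src="$work/source"
dest="$work/destination"
mkdir -p "$src/only-on-source" "$dest/only-on-destination"

echo "same"        > "$src/identical.txt"    # identical on both sides
echo "same"        > "$dest/identical.txt"
echo "longer text" > "$src/changed.txt"      # same name, different size
echo "short"       > "$dest/changed.txt"
echo "stale"       > "$dest/stale.txt"       # only on the destination

# -vv        extra verbosity
# -a         archive mode, shorthand for -rlptgoD
# --dry-run  report what would happen, change nothing
# --delete   delete destination files that are not on the source
# --force    force deletion of directories even if not empty
# The FIRST path is the source, the SECOND is the destination.
if command -v rsync >/dev/null 2>&1; then
    rsync -vva --dry-run --delete --force "$src/" "$dest/"
else
    echo "rsync is not installed; skipping the dry run" >&2
fi
```

Because of --dry-run, stale.txt and the short changed.txt are still on the destination afterwards; only the report shows what would have been deleted or copied.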
In this setup, rsync supports, you know, synchronizing in two ways and keeping things in sync,
but in this case you're just replicating out. Your NAS server is always going to be the master,
unless you need to get that information back. So the assumption here is that if a file is not
on the master, delete it from the slave; if it's on the master, put it on the slave; that sort
of thing.

The only option that's left is -a, which actually stands for -rlptgo, all in lower case, and
then capital D. Those are: recurse into directories; recreate the symlinks on the destination;
preserve permissions, times, groups and owners on the destination; and capital D is transfer
character and block device files, named sockets and FIFOs, so basically all the special files.

Once you're happy you know what you're doing, you've done your test directory, and you've done
a dry run against the read-only copy of your data drive, you can drop the --dry-run and
actually do the synchronization to your hard disk. Depending on the amount of data it might
take a while to synchronize, because it needs to do checks on everything, but actually it might
not take that long, because there's nothing on the destination so there's no comparison going
on. The next step might be to ship off the disk to the remote location and then set up rsync
over SSH, but I prefer to have an additional testing step where I rsync over SSH to a PC in the
home. So I'll take the disk that we've already rsynced, put it into a laptop, and set all the
SSH stuff up just like I want it on the remote location, just to make sure everything's
working. The steps to do this are going to be exactly the same as what we've done before, so
you can work along. Now, what you need to do on your
NAS server is generate a new SSH public and private key pair that has no password associated
with it. The reason you're going to do this is that you want the synchronization to occur
automatically, so you need to be able to access the remote system without having to enter a
password. If you do have an SSH key already, it's probably got a password on it, and you don't
want to be there in the middle of the night trying to type your password to get this script
going. Now, there are security concerns about passwordless SSH keys: anyone who's root on that
NAS device of yours will be able to get to that key, and then will be able to SSH to the other
location. But seeing as you're the user and it's your NAS device, I'm assuming that this is a
minor security concern. Once you've generated your keys, you probably want to call them
something different from the normal ones. I use rsync-key, so I know what they are; so I'll
have rsync-key and rsync-key.pub. Then I just take the contents of that .pub file and add it to
the end of the authorized_keys file on my laptop and on the remote PC. Do both at the same
time, so you get the SSH issues worked out. I've got a link in the show notes to Jeremy Mates's
website, which has more information on how you can go through all this. This has also been
covered on the HPR network before, so I'm not going to go into it too much.

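A minimal sketch of the key setup described above, run in scratch locations so it can be tried safely. The key type and size (RSA, 4096 bits) and the comment string are assumptions, not something stated in the episode; only the rsync-key name comes from it. In real use the key would live in ~/.ssh on the NAS and the public half would be appended to ~/.ssh/authorized_keys on the laptop and on the remote PC.

```shell
# Scratch stand-ins (hypothetical): keydir plays the role of ~/.ssh on the NAS,
# authfile plays the role of ~/.ssh/authorized_keys on the remote machine.
keydir=$(mktemp -d)
authfile="$keydir/authorized_keys"
touch "$authfile"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -N ""  gives the key an empty passphrase so unattended runs never prompt
    # -f     names the key files: rsync-key (private) and rsync-key.pub (public)
    ssh-keygen -q -t rsa -b 4096 -N "" -C "rsync backup key" -f "$keydir/rsync-key"

    # Append the public key to the end of the authorized_keys file.
    cat "$keydir/rsync-key.pub" >> "$authfile"
    chmod 600 "$authfile"
else
    echo "ssh-keygen is not installed; skipping key generation" >&2
fi
```

Where it is available, `ssh-copy-id -i` can do the append step over the network instead of copying the .pub contents by hand.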
Once you have the keys generated and the public key copied over to the authorized_keys files on
both machines, you're going to want to SSH into them. The only difference from a normal SSH is
that you need to use -i to specify the name of the new SSH key, so it's
ssh -i /home/user/.ssh/rsync-key and then the normal user@example.com. If this is the first
time you've logged into that machine from your NAS server as the user that's going to do the
rsync, you're going to need to type in yes, so that the sshd host keys from the other side are
added to that user's .ssh/known_hosts file. So it makes sense to actually log in as that user
so that that is done. So now you've got the keys on your laptop and on the remote location, and
you've got a console up, so that's very good.

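Once the first interactive login has been done and the host key accepted, a non-interactive check along these lines can confirm that the key really works without a password. This is only a sketch: the check_login name is made up, and the key path and user@example.com are the placeholder values from the episode.

```shell
KEY="$HOME/.ssh/rsync-key"     # private key generated earlier
REMOTE="user@example.com"      # placeholder user and host from the episode

check_login() {
    # BatchMode=yes makes ssh fail rather than prompt for a password,
    # which is how an unattended cron job would experience a broken key.
    ssh -i "$KEY" -o BatchMode=yes "$REMOTE" 'echo login ok'
}
```

Calling `check_login` from the account that will run rsync should print "login ok" if everything is in place.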
You might want to just create a file on both locations to make sure you can create files, and
delete that file again to make sure you can delete files. So that's the SSH part done.

Now we're going to put both of those commands together into the rsync command. The end command,
and this will all be in the show notes as well, is: rsync -va --delete --force -e, and this is
the new bit, the -e tells rsync to use this shell, which is, in double quotes,
"ssh -i /home/user/.ssh/rsync-key", then a space and the source, which is /data/AutoSync,
that's the same, and the destination has changed, so it'll be user@example.com, then a colon
and AutoSync. So that means the SSH user at the DNS name or IP address. All going well there
should be no updates, but you may want to try adding, deleting and modifying files at both ends
to make sure the process is working correctly. When you're happy, you can ship the disk to the
other side. The only requirement is that on the other network SSH is allowed through the
firewall to your server, and that you've got a well-known public IP address. If you don't have
a static IP address, then you can use services like dynamic DNS; there's a range of them, and
again there'll be a link in the show notes to where you can get that. You should then be able
to SSH to your server like before. If you're not able to connect to the server over port 22,
say, for example, the person you're peering with has already got port 22 in use, you can
specify a port option in your SSH string to connect on a different port. But the whole point of
this synchronization is that it should be seamless.

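Put together, the remote command described above might be wrapped like this. Treat it as a sketch: the sync_remote name and the PORT variable are made up, the trailing slashes are my assumption (so the contents of the source land directly in the remote AutoSync directory), and the host is the placeholder from the episode.

```shell
KEY="$HOME/.ssh/rsync-key"     # passwordless key; the path must not contain spaces here
SRC="/data/AutoSync/"          # source: always the first path
REMOTE="user@example.com"      # SSH user at the DNS name or IP address
DEST="AutoSync/"               # destination directory, relative to the remote home
PORT=22                        # change this if the peer already uses port 22 for something else

sync_remote() {
    # -e tells rsync which remote shell command to use for the connection.
    rsync -va --delete --force \
        -e "ssh -i $KEY -p $PORT" \
        "$SRC" "$REMOTE:$DEST"
}
```

Adding --dry-run to the rsync line for the first few runs gives the same safety net as in the local test.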
So you want your rsync to be rolling constantly, and the easiest way to do this is just to
start a screen session and then run the command given above in a simple loop. That has the
advantage of getting you going quickly, but it's not very resilient to reboots. So I've created
a simple script, which I put in /usr/local/bin/AutoSync, and it's got #!/bin/bash, while true,
do, date, then the rsync string exactly as it was before, then date, then sleep 3600, and then
done. What that does is: the while true do loop, look at my last episode on bash loops about
that, puts it into an infinite loop; it puts a date at the beginning of the rsync and a date at
the end of the rsync, so that I can see when it ran last; and then it sleeps for an hour, so
I'm not flooding either side.

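The script as described could be laid out roughly as follows. This is a reconstruction from the spoken description, not the author's actual file: splitting it into sync_once and sync_loop functions and the INTERVAL variable are my own additions, and the trailing slashes on the paths are assumed.

```shell
#!/bin/bash
# Sketch of /usr/local/bin/AutoSync, reconstructed from the description above.

INTERVAL=${INTERVAL:-3600}     # seconds to sleep between runs; one hour in the episode

sync_once() {
    date                       # timestamp before the run, so the log shows when it started
    rsync -va --delete --force \
        -e "ssh -i /home/user/.ssh/rsync-key" \
        /data/AutoSync/ user@example.com:AutoSync/
    date                       # timestamp after the run
}

sync_loop() {
    while true; do
        sync_once
        sleep "$INTERVAL"      # avoid flooding either side
    done
}

# Uncomment to loop forever when this file is run as a script:
# sync_loop
```

The loop call is left commented out so the file can be read or sourced without starting an endless sync; the original script simply runs the loop.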
So I take that and I put it into a crontab, and you should see my last episode about cron for
more information on cron. My crontab file is available with crontab -l, and I've got uppercase
MAILTO="", so an empty MAILTO line. Then I've got 0 1 * * * timeout 54000
/usr/local/bin/AutoSync, and then I redirect standard output to /tmp/AutoSync.log and I
redirect standard error to /tmp/AutoSync.log as well. Now, those of you among us may be
thinking: well, he's put a script with an infinite loop into a crontab, so surely the script is
going to be respawned every night, and at the end of the year I'm going to have 365 of these
scripts running and chewing up my resources. And the answer to that would be yes, that is
correct, except for the fact that I'm not calling the script directly from cron. The command
that's actually calling the script is the IBM command, written by the IBM dudes, which is
called timeout. What timeout does is terminate any application after a particular period of
time. So if you want to run, I don't know, a movie or a playlist for a particular period of
time and then stop, you put timeout and then the number of seconds after which you want it to
stop, and it kills the process. It's actually a very nice utility, just a little tool, short
and sweet, that does one thing.

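Written out, the crontab described above would look something like the fragment below, here saved to a scratch file for review rather than installed. I am assuming the timeout value is 54000 seconds, since 15 hours times 3600 matches the 01:00 to 16:00 window mentioned in the episode; the `2>&1` form is one common way to send standard error to the same log.

```shell
# Write the crontab entries to a scratch file for review (hypothetical path).
cronfile=$(mktemp)
cat > "$cronfile" <<'EOF'
# An empty MAILTO stops cron from emailing any output.
MAILTO=""
# At 01:00 every day, run AutoSync under timeout for 15 hours (54000 s), i.e. until 16:00.
0 1 * * * timeout 54000 /usr/local/bin/AutoSync > /tmp/AutoSync.log 2>&1
EOF

cat "$cronfile"
# To install it, note that this REPLACES the current user's whole crontab:
#   crontab "$cronfile"
```

Because timeout ends the looping script every afternoon, the next night's cron run starts a single fresh copy instead of piling up a new one each day.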
The reason I did this is that rsync does have a patch that allows you to specify what times it
runs, but this is a very quick and simple means to prevent rsync running during the evening.
That allows my brother to come home and browse the network, you know, from four o'clock in the
evening to midnight, and then my rsync script starts again at one o'clock and continues to four
o'clock in the afternoon. One o'clock in the morning to four o'clock in the afternoon, and then
it terminates. So the script will loop from one o'clock: it runs at one o'clock, at two, at
three, four, five, six, all the way around, and then it gets terminated. Now, the reason for
the MAILTO="" is that although I'm redirecting the output of AutoSync to a log file, the
timeout itself is run in a separate process, because you don't want to terminate the
application and then timeout gets terminated and it doesn't kill the other one. So that would
otherwise send you cron mail every time that runs. It's not too serious, so the MAILTO
basically means you're not getting emailed. If you're doing this with somebody else, they might
not have a problem with you running 24/7, as long as you can throttle the bandwidth. rsync does
have a switch called --bwlimit, then equals, and then you put in a value in kilobytes per
second, and that will limit all rsync copies to that bandwidth. So you might set it at 10% of
the total download bandwidth of the person that you're syncing to.

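As a worked example of the 10% rule above, under made-up numbers: if the person you sync to has an 8 Mbit/s download line, that is roughly 1000 kilobytes per second, so a tenth of it is about 100. The LINK_KBIT figure and the throttled_sync name are hypothetical; --bwlimit itself is a standard rsync option that takes kilobytes per second.

```shell
LINK_KBIT=8000                          # hypothetical download speed in kilobits per second
LIMIT_KB=$(( LINK_KBIT / 8 / 10 ))      # kilobits to kilobytes, then take 10 percent

throttled_sync() {
    rsync -va --delete --force --bwlimit="$LIMIT_KB" \
        -e "ssh -i /home/user/.ssh/rsync-key" \
        /data/AutoSync/ user@example.com:AutoSync/
}

echo "rsync would be limited to ${LIMIT_KB} KB/s"
```

With a limit like this in place, the sync could be left running around the clock instead of being stopped in the evening.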
So that is quite nice, and that's basically it with regard to rsync. I've probably made the
whole thing sound a lot more scary and complicated than it is. You can choose to do this or
not, but if there's something to take away from the show, it's that you should have some system
where, when anybody in your home network saves a file, it gets saved to some mirrored or
backed-up solution, and then that is sent to an off-site location. In this case I've done it
with one brother, and I'm in the process of doing it with two or possibly three of my other
brothers. There's no reason why this rsync script can't replicate to more locations; the more
the better.

Now I would like to take some time
out at the end of this podcast to mention another podcast which I'd like to recommend. It's
called Screencasters, and it's at heathenx.org. I'll just give you a little extract from their
about page: the goal of screencasters.heathenx.org is to provide a means, through a simple
website, of allowing new users in the Inkscape community to watch some basic and intermediate
tutorials by the authors of this website. So heathenx and Richard Querin have produced a show
that puts a lot of professional tutorials to shame. I have sat through some very, very sad
videos that I've purchased through work, but theirs are an actual joy to watch. I've been
watching them from episode one on my Acer Aspire One every evening coming home on the train,
and quite often you see people looking over my shoulder at what I'm looking at. They go through
the whole gamut of everything that Inkscape can do, which is a drawing program, if you didn't
know. They even have many tutorials that will get you started, like: this is the interface,
this is the menu bar, this is where you can find this, this is where you can find that. After
watching all the episodes, I'm now looking at posters at work and ads and all sorts of things,
going: oh, that's how they did that, and you see that effect there, and this logo or this icon
in this application was done like that. So it's really, really cool. Even if you're not
interested in graphics, it's something you want to download. If you know somebody who's using
Photoshop or something else, burn some of these onto a DVD and hand it over to them, and put a
copy of Inkscape on there; it runs on Windows as well. So
yeah, good stuff.

That is my recommendation for a podcast for this month, and with that I'll bid you adieu and
wrap this one up, and hope that you will take some time out of your busy day to record a show.
I'm very interested in hearing other episodes from other people. With that, thank you very
much. Talk to you, bye.

Thank you for listening to Hacker Public Radio. HPR is sponsored by Caro.net, so head on over
to C-A-R-O dot N-E-T for all of those