Hi,

I’m not sure if this is the right community for my question, but as my daily driver is Linux, it feels somewhat relevant.

I have a lot of data on my backup drives, and recently added 50GB to my already 300GB of storage (I can already hear the comments about how low/high/boring that is). It’s mostly family pictures, videos, and documents since 2004, much of which has already been compressed using self-made bash scripts (so it’s Linux-related ^^).

I have a lot of data that I don’t need regular access to and won’t be changing anymore. I’m looking for a way to archive it securely, separate from my backup but still safe.

My initial thought was to burn it onto DVDs, but that’s quite outdated and DVDs don’t hold much data. Blu-ray discs can store more, but I’m unsure about their longevity. Is there a better option? I’m looking for something immutable, safe, easy to use, and that will stand the test of time.

I read about data crystals, but they seem to be still in the research phase and not available for consumers. What about using old hard drives? Don’t they need to be powered on every few months/years to maintain the magnetic charges?

What do you think? How do you archive data that won’t change and doesn’t need to be very accessible?

Cheers

  • DasFaultier@sh.itjust.works
    1 month ago

    This is my day job, so I’d like to weigh in.

    First of all, there’s a whole community of GLAM institutions (galleries, libraries, archives, museums) involved in what is called Digital Preservation (try googling that specifically). Here in Germany, a lot of them have founded the Nestor Group (www.langzeitarchivierung.de) to further the cause and share knowledge. Recently, Nestor had a discussion group on Personal Digital Archiving, addressing just your use case. They have set up a website at https://meindigitalesarchiv.de/ with the results. Nestor publishes mostly in German, but online translators are a thing, so I think you will be fine.

    Some things that I want to address from your original post:

    • Keep in mind that file formats, just like hardware and software, become obsolete over time. Think about a migration strategy for moving your files to a more recent format if your current format falls out of style and isn’t as widely supported anymore. I assume your photos are JPGs, which are widely considered unsafe for preservation, as they use lossy compression and degrade with each re-encoding. A suitable replacement might be PNG, though I wouldn’t go ahead and convert my JPGs right away. For born-digital photo material, uncompressed TIFF is the preferred format.
    • Compression in general is considered a risk, because a single damaged bit can potentially affect a much larger block of compressed data. Saving a few bytes on your storage isn’t worth losing your precious memories.
    • Storage media have different retention times. It’s true that magnetic tape storage has the best chances for survival, and it’s what we use for long-term cold storage, but it’s prohibitively expensive for home use. Also, it’s VERY slow on random access, because the tape has to be wound to the specific location of your file before reading. If you insist on using it, format your tapes using LTFS to eliminate the need for a storage management system like IBM Spectrum Protect. The next best choice of storage media is NAS-grade HDDs, which will last you upwards of five years. Using redundancy and a self-correcting file system like ZFS (compression & dedup OFF!) will increase your chances of survival. Keep your hands off optical storage media; they tend to decay after as little as a year, according to studies on the subject. Flash storage isn’t much better either; avoid thumb drives at all costs. Quality SSD storage might last you a little longer. If you use ZFS or a comparable file system that provides snapshots, you can use that to implement immutability.
    • Kudos for using Linux standard tooling; it will help other people understand your stack if anything happens to you. Digital Preservation is all about removing dependencies on specific formats, technologies and (importantly) people.
    • Backup is not Digital Preservation, though I will admit that the two tend to get mixed up with one another in personal contexts. Backups save the state of a system at a specific point in time, whereas DigiPres tries to preserve only data that isn’t specific to a system and tends to change very little. Also, and this is important, DigiPres tries to save context along with the actual payload, so you might want to at least save some metadata along with your photos and store them all in a structure that is made for preservation. I recommend BagIt; there’s a lot of existing tooling for creating it, it’s self-contained, secured by strong checksums, and it’s an RFC (RFC 8493).
    • Keep complexity as low as possible!
    • Last of all, good on you for doing SOMETHING. You don’t have to be perfect to improve your posture, and you’re on the right track, asking the right questions. Keep on going, you’re doing great.
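To illustrate the compression point from the list above, here is a minimal Python sketch. It flips one bit in an uncompressed copy and one bit in a zlib-compressed copy of the same made-up payload, then counts how many bytes of the original are affected. The helper names (`flip_bit`, `bytes_lost`, `bytes_lost_after_decompress`) and the sample data are assumptions for illustration only, and a stream that fails to decompress is simply counted as a total loss.

```python
import zlib


def flip_bit(blob: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of blob with one bit inverted (simulated bit rot)."""
    damaged = bytearray(blob)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


def bytes_lost(original: bytes, recovered: bytes) -> int:
    """Count differing positions; missing or surplus bytes count as lost too."""
    differing = sum(a != b for a, b in zip(original, recovered))
    return differing + abs(len(original) - len(recovered))


def bytes_lost_after_decompress(original: bytes, damaged_stream: bytes) -> int:
    """Try to decompress a damaged zlib stream and count the lost bytes.

    Assumption of this sketch: a stream that fails to decompress is a
    total loss, although recovery tools can sometimes salvage parts of it.
    """
    try:
        recovered = zlib.decompress(damaged_stream)
    except zlib.error:
        return len(original)
    return bytes_lost(original, recovered)


# Made-up sample payload: varied but compressible text.
payload = "".join(
    f"IMG_{i:05d}.jpg;2004-07-{i % 28 + 1:02d}\n" for i in range(2000)
).encode()

# One flipped bit in the uncompressed copy damages exactly one byte.
raw_damage = bytes_lost(payload, flip_bit(payload, len(payload) // 2))

# One flipped bit in the middle of the compressed copy.
compressed = zlib.compress(payload)
compressed_damage = bytes_lost_after_decompress(
    payload, flip_bit(compressed, len(compressed) // 2)
)

print(f"uncompressed copy: {raw_damage} of {len(payload)} bytes affected")
print(f"compressed copy:   {compressed_damage} of {len(payload)} bytes affected")
```

In the uncompressed copy the damage stays confined to a single byte. For the compressed copy, the damaged stream will in most cases either fail to decode or fail zlib’s built-in checksum, which is the amplification effect described above.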
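As a rough illustration of the BagIt recommendation above, the following Python sketch builds a very small bag using only the standard library and re-checks its checksums later. It writes just the required `bagit.txt` declaration, a `data/` payload directory and a `manifest-sha256.txt`. The function names (`make_bag`, `verify_bag`) and folder names are hypothetical, and the sketch skips optional tag files and the percent-encoding of unusual file names, so for real archives an established BagIt tool is the safer choice.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(source_dir: Path, bag_dir: Path) -> None:
    """Copy source_dir into bag_dir/data and write the two required tag files."""
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(bag_dir).as_posix()
        lines.append(f"{sha256_of(path)}  {relative}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")


def verify_bag(bag_dir: Path) -> list[str]:
    """Return payload paths that are missing or no longer match the manifest."""
    failures = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, relative = line.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(relative)
    return failures


# Small demonstration in a throwaway directory (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    photos = Path(tmp) / "photos_2004"
    photos.mkdir()
    (photos / "holiday.txt").write_bytes(b"stand-in for a real photo")

    bag = Path(tmp) / "bag_photos_2004"
    make_bag(photos, bag)
    print(verify_bag(bag))  # [] because nothing has changed yet

    # Simulate silent corruption of one payload file.
    (bag / "data" / "holiday.txt").write_bytes(b"stand-in for a real phot0")
    print(verify_bag(bag))  # ['data/holiday.txt']
```

Running `verify_bag` on a schedule (say, once or twice a year) is one simple way to notice decay while another intact copy still exists.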

    Get back to me if you have any further questions.

  • JubilantJaguar@lemmy.world
    1 month ago

    The local-plus-remote strategy is fine for any real-world scenario. Make sure that at least one of the replicas is a one-way backup (i.e., no possibility of mirroring a deletion). That way you can increment it with zero risk.
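A minimal sketch of what such a one-way replica could look like in Python, assuming both locations are plain directories. The function name `add_only_sync` and the folder names are hypothetical. New files are copied over, but nothing in the archive is ever deleted or overwritten, so a deletion on the live side cannot be mirrored.

```python
import shutil
import tempfile
from pathlib import Path


def add_only_sync(source: Path, archive: Path) -> list[str]:
    """Copy files that exist in source but not yet in archive.

    Nothing in archive is deleted or overwritten, so a deletion or a
    damaged file in source cannot propagate to this replica.
    """
    copied = []
    for path in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = path.relative_to(source)
        target = archive / relative
        if target.exists():
            continue  # never overwrite what is already archived
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves timestamps
        copied.append(relative.as_posix())
    return copied


# Small demonstration in a throwaway directory (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    live, archive = Path(tmp) / "live", Path(tmp) / "archive"
    live.mkdir()
    (live / "a.txt").write_text("first")
    print(add_only_sync(live, archive))  # ['a.txt']

    # Deleting a.txt on the live side does not remove it from the archive.
    (live / "a.txt").unlink()
    (live / "b.txt").write_text("second")
    print(add_only_sync(live, archive))  # ['b.txt']
    print(sorted(p.name for p in archive.iterdir()))  # ['a.txt', 'b.txt']
```

For those who prefer standard tooling, rsync run without `--delete` and with `--ignore-existing` behaves in a similar add-only way.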

    And now for some philosophy. Your files are important, sure, but ask yourself how many times you have actually looked at them in the last year or decade. There’s a good chance it’s zero. Everything in the world will disappear and be forgotten, including your files and indeed you. If the worst happens and you lose it all, you will likely get over it just fine and move on. Personally, this rather obvious realization has helped me to stress less about backup strategy.

    • 8263ksbr@lemmy.mlOP
      1 month ago

      So you would suggest to get bigger and bigger storages?

      I really like and can embrace the philosophical part. I do delete data rigorously. At the same time, I once lost data because I was young and stupid and tried to install Suse without a backup. I am still sad that I can’t look at the pictures of me and my family from that time. I do look at those pictures/videos/recordings from time to time. It gives me a nice feeling of nostalgia. It also grounds me and shows me how much has changed.

      • JubilantJaguar@lemmy.world
        1 month ago

        Fair enough!

        So you would suggest to get bigger and bigger storages?

        Personally I would suggest never recording video. We did fine without it for aeons and photos are plenty good enough. If you can stick to this rule you will never have a single problem of bandwidth or storage ever again. Of course I understand that this is an outrageous and unthinkable idea for many people these days, but that is my suggestion.

        • 8263ksbr@lemmy.mlOP
          30 days ago

          Never recording videos… That is outrageous ;) Interesting train of thought, though. Video is the main data hog on my drives. It’s easy to mess up the compression. At the same time, it combines audio, image and time in one easy-to-consume file. Personally, I would miss it.

  • phanto@lemmy.ca
    1 month ago

    This is actually a real problem… A lot of digital documents from the 90’s and early 2000’s are lost forever. Hard drives die over time, and nobody out there has come up with a good way to permanently archive all that stuff.

    I am a crazy person, so I have RAID, Ceph, and JBOD in various and sundry forms. Still, drives die.

    • Sl00k@programming.dev
      1 month ago

      nobody out there has come up with a good way to permanently archive all that stuff

      Personally I can’t wait for these glass hard drives being researched to reach the consumer or even corporate level. Yes, they’re only writable one time and read-only after that, but I absolutely love the concept of being able to write my entire Plex server to a glass hard drive, plug it in and never have to worry about it again.