• cRazi_man@lemm.ee
    2 months ago

    I recently built a 12th Gen PC, expecting that a 13th Gen CPU would soon be a cheap and significant upgrade. Now there’s no way to know whether a second-hand CPU has been damaged in this way.

    • tal@lemmy.today
      2 months ago

      It can get a whole lot worse.

      I bought a $500 13th gen CPU that destroyed itself, replaced it (and didn’t keep the dead CPU) with a $500 14th gen CPU that destroyed itself, and spent another ~$500 on related hardware and dumping Intel stuff to go AMD to get a working system. I also spent a lot of time trying to resolve the problem. I’d bet that I’m not the person burned worst, because someone could very easily have replaced their motherboard or memory or power supply unit in the hopes of fixing the issue, as any of these could have looked like potential causes, and there’d be no way for anyone to prove to Intel that this was the cause even if Intel intended to reimburse for these.

      At most, I might get $500 back if Intel reimburses me for the 14th gen CPU; based on what they’ve been doing so far, I’d assume that at best they’d send out another Intel CPU (which I no longer have a use for, having gone AMD).

      And I was mostly using this system for fun. While I was corrupting my root filesystem regularly at boot at the end, I ultimately didn’t – as far as I know – suffer any serious data loss or expense from the data that the processor was corrupting. My system was mostly to be used for my own entertainment. I didn’t miss deadlines or lose critical information.

      As Steve Burke has pointed out in earlier episodes on this, there are people who have been impacted by those secondary costs, some of which might make my own costs look irrelevant.

      He was talking to video game companies who were using affected processors; they had apparently banned some customers for cheating because they knew that the internal state of the game was incorrect; they couldn’t figure out what the customers were doing, but knew that their game state was being modified. It apparently wasn’t the customers cheating, but their CPU, which had partially destroyed itself, corrupting memory.

      Another had been using CPUs for video game servers and those kept dying and taking down service; another company estimated that they’d lost $100k in player business due to the problem.

      Apparently these were also popular, due to high single-threaded performance, with hedge funds that do stock trading. I imagine that a system that suddenly stops working or corrupts data can very quickly become extremely expensive in that context, far in excess of what the CPUs cost.

      OEMs who built and sold systems containing these CPUs had apparently been taking back systems and repeatedly replacing parts; they probably incurred substantial costs and hits to their own reputation, as customers are upset with them.

      Same thing with datacenter providers, who incurred a lot of costs investigating and mitigating problems, swapping parts and CPUs. Burke quoted one of these as having advised customers to use an alternate AMD-based system; if they insisted on the Intel one, the provider would charge a $1000 additional service fee to cover all the costs the provider was incurring in dealing with systems based on the CPUs. That gives an idea of what they were losing.

      God only knows what the impact of having a ton of data around the world corrupted is. Probably no more than a tiny fraction of the problems related to corruption will ever actually be attributed to the CPUs themselves.

      And I don’t know how many systems out there may not be fully-tracked – so they don’t get updates to avoid the problem – and have the CPUs built into them. Industrial automation hardware? Ship navigation systems? Who knows? All kinds of things that might fail in absolutely spectacular ways if they work for a period of time, then down the road, eventually start corrupting data more and more severely.

      I mean, Intel might, at best, provide a cash refund for a dead CPU. But they aren’t gonna cover losses from secondary problems, and there’s no realistic way that most businesses and people who bought these could prove them, anyway.

      Buying the last CPU they made before this clusterfuck occurred is maybe one of the best things you could have done while still being indirectly affected, as you got a reasonably fast system that wasn’t directly affected. If Intel had disclosed this in advance rather than saying nothing, I’d have happily purchased a 12th gen CPU instead of buying another $1k in useless hardware and spending a ton of time trying to resolve my problems. At upgrade time, you’ll have the option to go AMD or 15th gen Intel and LGA 1851, if you want to hope that Intel’s 15th gen is more solid than their previous two. It just means a new motherboard and, if you’re using DDR4 memory, you’ll need to toss that and buy DDR5.

      • cRazi_man@lemm.ee
        2 months ago

        I would have gone AMD in the first place if this had happened at the time of my purchase.

        Oh well. Upgrade time is going to be a long way away. My last gaming PC served me well for almost 10 years before I did an in-socket upgrade.

    • kvasir476@lemmy.world
      2 months ago

      What, if anything, can customers do to slow or stop degradation ahead of the microcode update?

      Intel recommends that users adhere to Intel Default Settings on their desktop processors, along with ensuring their BIOS is up to date. Once the microcode patch is released to Intel partners, we advise users check for the relevant BIOS updates.

      • tal@lemmy.today
        2 months ago

        I destroyed my second CPU, a 14900KF, despite already being aware of that recommendation. Before ever inserting the replacement CPU, I had disabled all of the settings of that kind that the motherboard vendor had enabled by default, and I only ever used the CPU with those settings; it still destroyed itself, like the first. I am very confident that you can still destroy a CPU having done that.

        That isn’t to say that using conservative settings is a bad idea (and maybe going further than just using the Intel-recommended defaults instead of the motherboard vendor defaults, such as running memory at minimum frequency, might actually manage to reliably avoid CPU damage). But I am confident that just running the standard Intel-recommended settings is not, alone, enough to avoid damage.

    • Justin@lemmy.jlh.name
      2 months ago

      There’s no 100% reliable way until the new microcode is released next month. All affected CPUs are at risk of silicon degradation from excessive voltage.

      There are some power limits and July BIOS updates you can use that Intel says can help reduce the damage or prevent it entirely in some scenarios. I believe the damage is specifically caused by single-threaded spikes, so reducing LLC and running something like Prime95 in the background might hold the voltage low enough that it won’t happen. But there is no fix yet, so if your CPU is susceptible, running it will degrade the CPU, at least until the fix is out.

    • tal@lemmy.today
      2 months ago

      If I had a known unused one, I would absolutely not use it until Intel finishes putting out their patch to motherboards to address this. You have no idea whether you could cause damage that won’t be detected, leaving you with a slightly damaged processor that malfunctions occasionally.

      Intel may publish guidance on how to use unpatched processors. If they don’t – they sure have not been forthcoming with information thus far – here’s my own suggestion.

      When I did use it, I would, prior to booting any OS on the CPU, go into the BIOS and turn everything related to the CPU down to minimal performance. Memory speed down, disable Intel Turbo Boost, everything. If you can disable cores there, disable all but one. Even my severely-damaged pair of CPUs could still boot without corrupting my root filesystem as long as I ran using only a single core (though two cores induced problems), and I’d take that as an argument in favor of one core being preferable, though I cannot say for sure that doing so helps avoid damaging the chip rather than just avoiding being affected by the damage once incurred.

      And the first thing I’d do, booted into that minimal-performance CPU environment, would be to do that motherboard BIOS update. Then go back and reset the motherboard to defaults and use the thing normally.

      Maybe that’s over-cautious, but we know that the processors destroy themselves with use, and we have no idea what the minimum amount of time – if any – to incur damage is. Unless Intel can come out with some kind of diagnostic to reliably detect damaged CPUs, you won’t know if you damaged your CPU in that window before the BIOS update, and it is maybe occasionally corrupting data, which I’d guess is a situation that you probably don’t want to be in during the lifetime of the CPU.
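
      For anyone on Linux, one way to confirm that the BIOS update actually delivered new microcode is to read the revision the kernel reports in /proc/cpuinfo. The following is only a minimal sketch: the helper names and the `minimum_revision` parameter are made up for illustration, and the revision number to compare against has to come from Intel or your motherboard vendor, not from this snippet.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[int]:
    """Collect the distinct microcode revisions found in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux each logical CPU has a line such as "microcode : 0x123".
        match = re.match(r"microcode\s*:\s*(0x[0-9a-fA-F]+)", line)
        if match:
            revisions.add(int(match.group(1), 16))
    return revisions


def is_patched(cpuinfo_text: str, minimum_revision: int) -> bool:
    """True only if every reported revision is at least minimum_revision.

    minimum_revision is a hypothetical input: look up the real value in the
    vendor's release notes for your CPU.
    """
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) >= minimum_revision


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for revision in sorted(microcode_revisions(cpuinfo.read_text())):
            print(hex(revision))
```

      This only tells you which microcode is loaded. It says nothing about whether the CPU was already damaged before the update.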

    • darvocet@infosec.pub
      2 months ago

      Best to throw that away. Good job keeping it from affecting the performance of your pc.

      • Sanctus@lemmy.world
        2 months ago

        At that rate I’ll make a keychain out of it. It sucks ’cause it’s above my normal price range and was a gift.

  • taaz@biglemmowski.win
    2 months ago

    If your CPU is crashing or unstable, then yes, the damage is already done. But for the few of us who bought these later: update your BIOS to the latest version, set the Intel defaults, do not overclock (I have even undervolted mine a bit, but YMMV), and wait for the microcode update.

    Though I do wonder if Intel isn’t just stalling for time; I do hope they are not. I didn’t wanna touch my build for the next ~5 years.

  • kowcop@aussie.zone
    2 months ago

    I thought I read that Intel said this was from messing with voltages? I have had plenty of these processors in the last couple of years and never experienced crashes, but I don’t overclock.