16 votes

AMD launches EPYC 9004 "Genoa" processors - Up to 96 cores, AVX-512, incredible performance

19 comments

  1. [13]
    Luna
    (edited)
    Link

    AMD has outdone themselves yet again! Their lineup looks fantastic and I'm glad to see they aren't resting on their laurels. Highlights:

    • Genoa lineup has 16-96 cores/32-192 threads (lineup chart)
    • Support for 6 TB per socket of 12-channel DDR5-4800
    • AVX-512 support
      • On top of that, their AVX-512 extensions enable some ML workloads to be run on CPU (!!!)
    • 14% IPC uplift
    • CXL support (CXL.mem only for now)
    • Massive datacenter cost savings: 1, 2 (edit: Imgur backup in case those links 404)
    • Also laid out the roadmap for the rest of 4th-gen Epyc:
      • Genoa - general purpose
      • Bergamo - targeted to cloud providers; Zen 4C, up to 128 cores
      • Genoa-X - targeted to HPC; 3D V-Cache
      • Siena - targeted to telco/edge (lower-end servers)

    All I have to say is...wow.

    The possibilities here are enormous. Especially CXL, which I really don't think is being given enough attention. Having a shared memory bank that can be accessed in a cache-coherent manner is insane for HPC and HA applications.

    And Intel still hasn't released Sapphire Rapids. I wonder if they're going to delay it again after this tour de force.

    To quote SemiAnalysis:

    The 4th generation Epyc Genoa launch marks the 3rd consecutive generation where AMD beats Intel in the majority performance metrics. Rome and Milan made the cloud players start buying a lot of AMD, and Genoa is when the volumes jump across most remaining markets and end users. SemiAnalysis believes the gap between Genoa and Sapphire Rapids is larger than the gap between Milan and Ice Lake.

    Edit: Some benchmarks - https://www.tomshardware.com/reviews/amd-4th-gen-epyc-genoa-9654-9554-and-9374f-review-96-cores-zen-4-and-5nm-disrupt-the-data-center

    7 votes
    1. [12]
      JXM
      Link Parent

      Support for 6 TB per socket of 12-channel DDR5-4800

      That's insane. I realize this isn't a great benchmark for a number of reasons, but Crucial is selling 64 GB of DDR5 memory for $300. So that's almost $30,000 just to max out the memory.
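      The arithmetic behind that figure, as a rough sketch. It takes the 6 TB per socket, 64 GB module, and $300 price from the comment as given, and assumes 1 TB = 1024 GB. The function names are made up for illustration. A real socket does not have 96 DIMM slots, so actually reaching 6 TB would take much larger (and pricier) modules; this only reproduces the estimate.

```python
import math

GB_PER_TB = 1024  # assumption: binary units, as DIMM capacities are usually quoted


def modules_needed(capacity_tb: float, module_gb: int) -> int:
    """Number of modules of size module_gb needed to reach capacity_tb."""
    return math.ceil(capacity_tb * GB_PER_TB / module_gb)


def fill_cost(capacity_tb: float, module_gb: int, price_per_module: float) -> float:
    """Total price of enough modules to reach capacity_tb."""
    return modules_needed(capacity_tb, module_gb) * price_per_module


if __name__ == "__main__":
    count = modules_needed(6, 64)
    cost = fill_cost(6, 64, 300)
    print(f"{count} modules, ${cost:,.0f}")
```

      That comes out to 96 modules and $28,800, which matches the "almost $30,000" estimate.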

      My first computer had 128 MB of memory.

      6 votes
      1. Amarok
        Link Parent

        I remember spending about $300 for 8MB in the early 90s, maxed out my 486DX2. SIMMs were the new hotness, SIPPs were out of style (thank god). That memory was just to get Photoshop and Picture Publisher the memory they needed... it totally wasn't for playing X-Wing, Doom, and Descent. Couldn't even access that much memory without EMM386 which is still one of the most ridiculously shitty hacks that ever made it into mainstream computing.

        Back then any work on a new PC required a blood sacrifice. Everything was scalding hot, everything was sharp, every solder joint was like a tiny razor just waiting for you to brush up against it, and nothing really quite fit into anything that it was supposed to fit into without the use of a damn crowbar. They were much more sensitive to static electricity back then, too. Modern hardware is a lot more tolerant.

        6 votes
      2. [10]
        Greg
        Link Parent

        I forget exactly where, but a tech video I saw recently said "the numbers have got so big they sound small again" and I thought that was a really good way of putting it. I know 6TB is a huge amount of RAM, but without stopping and really thinking about it it's hard to internalise that "six" actually means "almost four hundred of my laptops stacked on top of each other".

        4 votes
        1. [9]
          vord
          Link Parent

          Lol yea the last time I remember this happening was the transition from MB -> GB of RAM and from MHz to GHz CPU, circa 2000ish.

          I recall some chips were marketed with 1000+ MHz instead of 1+ GHz for this reason.

          4 votes
          1. [8]
            nothis
            Link Parent

            I remember the GHz jump and it's all the more disappointing we're still in the low single digits on that scale, 20 years later. All the CPUs from this announcement are in the 3 to 4 GHz range. Any speed improvements come from clever optimization, removal of external bottlenecks and parallelization. Raw speed is stuck in like 2008.

            3 votes
            1. [7]
              spctrvl
              Link Parent

              We've actually crept up towards 6GHz in the last processor release cycle, but even if we abandon silicon, I wouldn't expect too much more than that because of speed of light issues. Grace Hopper, of computer programming legend, used lengths of wire decades ago to illustrate this limit when talking about inefficient programming, but these days the sheer speed of processors means that signals can only physically travel an inch or two per clock cycle.

              8 votes
              1. [6]
                nothis
                Link Parent

                Interesting, I thought it was mostly a heat problem. Are we touching some theoretical limit due to distance traveled?

                3 votes
                1. [5]
                  spctrvl
                  Link Parent

                  It is mostly a heat problem at present, but what I was getting at is that we are within about an order of magnitude of speed of light issues making further frequency increases redundant anyway, since the different parts of the chip will hardly have time to communicate with each other at a clock speed of e.g. 20GHz. It can be mitigated by packing things closer, which has been the trend (worsens heating problems though), and by further miniaturization of transistors, though we're actually approaching physical limits there too, with transistors already being made just a couple dozen atoms across. Basically, while I think there are still tons of gains to be had in the processor space, I'd expect most of them to come from clever design rather than brute force. Amazingly, we're starting to run out of room at the bottom, and brute force mostly died in the Pentium 4 era. But when it comes to clever processor design, I've been continually amazed at the IPC leaps and bounds that Apple and AMD have carried us through the last few years.

                  5 votes
                  1. [4]
                    Greg
                    Link Parent

                    I just did a back of the envelope calculation on this and I'm surprised just how close we already are. Boost clock on a 13900K is 5.8GHz, which is 52mm of travel distance per cycle at c. The diagonal die size of that chip is 26mm, so already halfway to the absolute theoretical limit.

                    Now I'm not sure how necessary it is in reality for signals to propagate from corner to corner in one cycle, nor how much slower they propagate in silicon compared to in vacuum, but I'm very much not used to seeing situations where real world consumer tech is already halfway to the speed of light limit (and quite possibly much closer when taking the physics of the material into account)!
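                    For anyone who wants to redo the envelope math, here is a rough sketch. The 5.8GHz clock and 26mm diagonal are the figures quoted above. The `velocity_factor` parameter is a hypothetical knob for "signals travel slower than c in the chip"; the 0.5 used in the example is an arbitrary placeholder, not a measured value for any real interconnect.

```python
C_M_PER_S = 299_792_458  # speed of light in vacuum


def mm_per_cycle(freq_ghz: float, velocity_factor: float = 1.0) -> float:
    """Distance in mm a signal covers in one clock cycle.

    velocity_factor is the assumed fraction of c at which the signal travels
    (1.0 means vacuum, which is the absolute upper bound).
    """
    metres = C_M_PER_S * velocity_factor / (freq_ghz * 1e9)
    return metres * 1000


def fraction_of_limit(die_diagonal_mm: float, freq_ghz: float,
                      velocity_factor: float = 1.0) -> float:
    """Die diagonal expressed as a fraction of the per-cycle travel distance."""
    return die_diagonal_mm / mm_per_cycle(freq_ghz, velocity_factor)


if __name__ == "__main__":
    print(f"{mm_per_cycle(5.8):.1f} mm per cycle at 5.8 GHz in vacuum")
    print(f"{fraction_of_limit(26, 5.8):.2f} of the limit for a 26 mm diagonal")
    # Placeholder slowdown: at half of c, the diagonal roughly equals one cycle of travel.
    print(f"{fraction_of_limit(26, 5.8, velocity_factor=0.5):.2f} at an assumed 0.5c")
    print(f"{mm_per_cycle(20):.1f} mm per cycle at 20 GHz in vacuum")
```

                    At 20GHz the vacuum figure drops to about 15mm, which is smaller than the 26mm diagonal used here.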

                    4 votes
                    1. [3]
                      spctrvl
                      Link Parent

                      Where it's a bigger problem is main memory access: the distance from the CPU to the RAM is at least 5-10cm even on the most compact boards. I suspect that may be a reason cache amounts have gone up so much in the last decade or so, and it's starting to become common practice to solder the RAM chips directly to the CPU in some designs.

                      4 votes
                      1. [2]
                        Amarok
                        Link Parent

                        Are the heat dissipation issues still the reason we haven't moved into thicker multi-layer 3D chip designs? There's still another physical dimension to exploit, so in theory we can find a way to realize some performance gains by stacking.

                        2 votes
                        1. spctrvl
                          Link Parent

                          Definitely one of the big reasons. We have actually started moving towards 3D chips, but in low heat output applications like HBM and flash memory.

                          2 votes
  2. soks_n_sandals
    Link

    I'm really curious to see how Genoa and Genoa-X will compare with the Sapphire Rapids HBM SKUs. The HBM advancement presents a really exciting hardware jump for simulation and scientific computing and it'll be fantastic to have two high performance chips in the marketplace.

    3 votes
  3. teaearlgraycold
    Link

    Always good to see Intel made to sweat a little bit.

    1 vote
  4. [4]
    JXM
    Link

    How does this compare to Intel's current offerings? I know AMD was lightyears ahead of them as of the last generation.

    1. [2]
      FlippantGod
      Link Parent

      Read the article? Sapphire Rapids isn't supposed to be announced until January, so Ice Lake is out here getting slaughtered.

      2 votes
      1. JXM
        Link Parent

        I did. All it says is that the current Intel chips are behind. But they haven't launched their newest iterations yet.

        Intel Xeon Sapphire Rapids is set to be finally announced in January while we'll see how long it takes before we get our hands on those next-generation Xeon Scalable processors. In any event, Sapphire Rapids looks like it will have some mighty competition ahead while current-generation Xeon Ice Lake gets annihilated by Genoa.

        The SKU table for AMD EPYC Genoa is nice and remains much smaller than Intel's often complex mix of processors. At the low-end is the EPYC 9124 and 9174F CPUs that are sixteen core. The EPYC 9174F is a rather interesting part in that it has a 4.1GHz base frequency and 4.4GHz boost frequency -- the highest clocked part of all the EPYC Genoa SKUs being revealed today. But for this high hitting frequency-optimized SKU, it has a 320 Watt TDP or a cTDP up to 400 Watts.

        I was more curious if there were any rumors about the Sapphire Rapids chip performance yet. I don't follow chip news that closely anymore so I have no clue.

        1 vote