Monday, May 22, 2017

Oracle SuperCluster M7 SLOB LIO Tests vs Intel Xeon E5-2699 V4

Here are some SLOB LIO figures from a DB zone configured with 16 threads running on Oracle SuperCluster M7 hardware. For comparison, I've also included numbers from an Intel Xeon E5-2699 V4 CPU.

It is worth mentioning that this is not exactly a fair comparison -- a single SPARC M7 core has 8 threads associated with it, so my zone is able to utilize a total of two SPARC M7 cores (16 threads total, 8 threads per core). The E5-2699 V4 is currently Intel's top-of-the-line high-core-count model, with 22 cores. So we're comparing two SPARC M7 cores vs 16 E5-2699 V4 cores. It does, however, help answer the question: if you're running on a certain number of Intel cores, what kind of performance can you expect when you move over to a heavily threaded M7 and transfer your workload "as is"?
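The core and thread bookkeeping behind that comparison can be sketched in a few lines. This is only an illustration of the arithmetic; the helper name `m7_cores_for_threads` is made up for this example, and the mapping of one SLOB session to one physical E5 core is an assumption, since the post does not say how sessions were placed.

```python
# Core/thread bookkeeping for the comparison described above.
M7_THREADS_PER_CORE = 8   # from the post: 8 hardware threads per SPARC M7 core
ZONE_THREADS = 16         # from the post: DB zone configured with 16 threads


def m7_cores_for_threads(threads, threads_per_core=M7_THREADS_PER_CORE):
    """Whole M7 cores needed to supply `threads` hardware threads (hypothetical helper)."""
    return -(-threads // threads_per_core)  # ceiling division


m7_cores = m7_cores_for_threads(ZONE_THREADS)
e5_cores = ZONE_THREADS  # assumption: one session per physical E5 core

print(f"{ZONE_THREADS} sessions -> {m7_cores} M7 cores vs {e5_cores} E5-2699 V4 cores")
```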

Below are the results:



The first thing worth mentioning is that the M7 still exhibits a large deficit in single-threaded performance -- a single SPARC M7 thread delivers about 60% of the performance of an E5-2699 V4 core (which is not even a top-bin CPU when it comes to frequency). Throughput is a different story -- two M7 cores almost match four E5-2699 V4 cores thanks to the heavily threaded design.
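To get a feel for what the 60% figure implies, here is a rough sizing sketch. It assumes a purely CPU-bound workload that scales linearly with thread count, which the post did not measure; the function name `m7_threads_needed` is hypothetical, and the only input taken from the post is the 60% ratio.

```python
# Rough sizing sketch: how many M7 threads to match N busy E5-2699 V4 cores?
# Assumptions (not measured in the post): CPU-bound work, linear scaling.
M7_THREAD_VS_E5_CORE_PCT = 60  # from the post: one M7 thread ~60% of an E5 core


def m7_threads_needed(e5_cores, ratio_pct=M7_THREAD_VS_E5_CORE_PCT):
    """M7 threads needed to match `e5_cores` cores, rounded up (integer math)."""
    return -(-(e5_cores * 100) // ratio_pct)


for cores in (1, 4, 16):
    print(f"{cores} E5 cores ~ {m7_threads_needed(cores)} M7 threads")
```

Treat the result as a lower bound: 16 M7 threads only almost match four E5 cores in the throughput figures above, which suggests that per-thread throughput drops as more threads share a core.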

7 comments:

  1. Why are you comparing SPARC M7 thread performance to Intel E5-2699 V4 core performance? A single SPARC M7 chip has 32 cores and 256 threads versus the Intel E5-2699 V4 with 22 cores and 44 threads, almost 6x fewer threads! Why not compare core-to-core performance, or at least run 88 threads to max out the Intel 2-socket box to see how the SPARC M7 would compare?

  2. Hello Phil,

    Thank you for your comment. The way you posed the question seems to indicate that you find this comparison unfair, which, by the way, I do too :) Assuming we agree on this, I also think that comparing a chip with 256 threads to a chip with 44 threads in a way that benefits the highly threaded architecture is unfair as well.

  3. Well, I guess I didn't understand what you're trying to prove here. If you are trying to compare SPARC M7 to Intel's E5-2699 V4, I would think you'd want to compare both architectures at their limits? And of course, licensing costs will most likely be the biggest consideration in comparing these two, and as licensing is mostly per-core based, comparing performance core vs core is most likely what most readers would be interested in. I'm certainly interested ;-)

  4. Hello, Alex!

    What storage equipment did you use for the tests? Thank you.

    Replies
    1. This was a LIO test so the entire test was done in-memory (buffer cache).

  5. Hi Alex,
    I can confirm that the single-threaded performance of a SPARC M7 core is worse than that of an Intel x86-64 core.
    It was a bad surprise that hash joins take longer after moving from x86-64 to SuperCluster...
    BTW:
    A good comparison would be core-by-core performance -- actually price-based performance, if we consider the core factor table.
    so
    22 cores(44 threads) of E5-2699V4
    vs
    22 cores(176 threads) of SPARC M7
    ?

    Replies
    1. Thanks for your comment. I think comparing M7 threads to E5 cores is still a good comparison because:

      1. It shows that the M7 still has a single-threaded performance deficit. There is a class of applications/tasks that care about it.

      2. If I have a query running with a certain DOP on Intel and run the same query with the same DOP on M7, I shouldn't be surprised if I get much slower performance (assuming a CPU-bound case). Hopefully this blog post will prevent some other people from having a "bad surprise". DOPs will likely have to be increased for such queries (potentially bringing more concurrency issues?).

      3. Would you rather have M7 with 4 threads per core where each thread is twice as fast?

      4. Why can't we compare 22 E5 cores vs 22 M7 cores where both are running 22 threads? Well that's because such a case renders M7 useless ;-) All the while it's a valid use case for Intel -- Azure runs with HT disabled, for instance. Food for thought.
