Cloud server CPU performance comparison – retailic.com
JAN RYCHTER,
PARTNER AT RETAILIC
Alternate titles: “The cloud makes no sense”, “Intel Xeon processors are slow”, “The Great vCPU Heist”.
I recently decided to try to move some of my CPU-intensive workload from my desktop into the “cloud”. After all, that’s supposedly what those big, strong and fast cloud servers are for.
I found that choosing a cloud provider is not obvious at all, even if you only consider raw CPU speed. Operators do not post benchmarks, only vague claims about “fastest CPUs”. So I decided to do my own benchmarking and compiled the results into a very unscientific, and yet revealing, comparison.
I was aiming for the fastest CPUs. Most of my work is interactive development and quick builds, so what matters to me is wall-clock performance. That means CPU speed matters a lot. Luckily, that’s what all cloud providers advertise, right?
I decided to write up my experiences because I wish I could have read about all this instead of doing the work myself. I hope this will be useful to other people.
Providers tested
In alphabetical order:
- Amazon AWS (c5.xlarge, c5.2xlarge, c5d.2xlarge, z1d.xlarge)
- Digital Ocean (c-8, c-16)
- IBM Softlayer (C1.8×8)
- Linode (dedicated 8GB 4vCPU, dedicated 16GB 8vCPU)
- Microsoft Azure (F4s v2, F8s v2)
- Vultr (404 4vCPU/16GB, 405 8vCPU/32GB)
Why those? Well, those are the ones I could quickly find and sign up for without too much hassle. Also, those are the ones that at least promise fast CPUs (for example, Google famously doesn’t much care about individual CPU speed, so I didn’t try their servers).
Setting up and differences between cloud providers
Signing up and trying to run the various virtual machines offered by cloud operators was very telling. In an ideal world, I would sign up on a web site, get an API key, put that into docker-machine and use docker-machine for everything else.
Sadly, this is only possible with a select few providers. I think every cloud operator should contribute their driver to docker-machine, and I don’t understand why so few do. You can use Digital Ocean, AWS and Azure directly from within docker-machine. The other drivers are non-existent, flaky or limited, so one has to use vendor-specific tools. This is rather annoying, as one has to learn all the cute names that the particular vendor has invented. What do they call a computer: is it a server, plan, droplet, size, node, horse, beast, or a daemon from the underworld?
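For illustration, this is roughly how the happy path looks with a provider that has a first-class docker-machine driver (Digital Ocean in this sketch; the token, machine name, size and region are placeholder values, not a recommendation):

export DO_TOKEN=...   # your Digital Ocean API token
docker-machine create --driver digitalocean \
  --digitalocean-access-token "$DO_TOKEN" \
  --digitalocean-size c-8 \
  --digitalocean-region fra1 \
  bench-c8
eval "$(docker-machine env bench-c8)"   # point the local docker CLI at the new server
docker info                             # any docker command now runs against the cloud server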
One thing I quickly discovered is that what the vendors advertise is often not available. As a new user, you get access to the basic VM types, and have to ask your vendor nicely so that they allow you to spend more money with them. This process can be quick and painless with smaller providers, but can also explode into a major time sink, like it does with Azure. There was a moment when I was spending more time dealing with various tiers of Microsoft support than testing. I find this to be rather silly and I don’t understand why in the age of global cloud computing I still have to ask and specify which instances I’d like to use in which particular regions before Microsoft kindly allows me to.
Assuming you can actually get access to VM instances, there is a big difference in how complex the management is. With Digital Ocean, Vultr or Linode you will be up and running in no time, with simple web UIs that make sense. With AWS or Azure, you will be spending hours dealing with resources, resource groups, regions, availability sets, ACLs, network security groups, VPCs, storage accounts and other miscellanea. Some configurations will be inaccessible due to weird limitations and you will have no idea why. A huge waste of time.
The benchmark
I used the best benchmark I possibly could: my own use case. A build task that takes about two and a half minutes on my (slightly overclocked) i7-6700K machine at home. I started signing up at various cloud providers and running the task.
After several tries, I decided to split the benchmark into two: a sequential build and a parallel build. Technically, both builds are parallel and use multiple cores to a certain extent, but the one called “parallel” uses “make -j2” to really load up every core the machine has, so that all cores are busy nearly all of the time.
The build is dockerized for easy and consistent testing. It mounts a volume with the source code, where output artifacts go, too. It does require a fair bit of I/O to store the resulting files, but I wouldn’t call it heavily I/O-intensive.
Methodology
A single test consisted of starting a cloud server, provisioning it with Docker (both were sometimes done automatically by docker-machine), copying my source code to the server, pulling all the necessary docker images, and performing a build.
The total wall clock time for the build was measured. The smaller the better. I always did one build to prime the caches and discarded the first result.
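Concretely, a single measured run looked roughly like this; the image and path names below are placeholders standing in for my own build container, not something you can pull:

# sequential build
time docker run --rm -v "$PWD/src:/work" -w /work my-build-image make
# “parallel” build, loading up all the cores
time docker run --rm -v "$PWD/src:/work" -w /work my-build-image make -j2

The wall-clock figure reported by time is what goes into the charts; the first, cache-priming run is thrown away.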
I tried to get six builds done, over the course of multiple days, to check if there is variance in the results. And yes, there is very significant variance, which was a surprise.
For some cloud providers (Linode and IBM) the build times were so abysmal that I decided to abandon the effort after just two builds. No point in torturing old rust.
I also threw in results for my own local build machine (a PC next to my desk), with no virtualization (but the build was still dockerized), and a dedicated EX62-NVMe server from Hetzner.
Results
I first created rankings for average build times, but then realized that with so much variance, these averages make little sense. What I really care about is the worst build time, because with all the overbooking and over-provisioning going on, this is what I really get. I might get better times if I’m lucky, but I’m paying for the worst case.
The error bars indicate how much better the best case can be. As you can see, in some cases the differences are very significant.
These are the worst-case results for “sequential” builds (see “The benchmark” above for a description of what “sequential” means):
These are the worst-case results for “parallel” builds:
And this is the best case you can possibly get using a “sequential” build, if you are lucky:
The ugly vCPU story
What cloud providers sell is not CPUs. They invented the term “vCPU”: you get a “virtual” CPU with no performance guarantees, while everybody still pretends this somehow corresponds to a real CPU. Names of physical chips are thrown around.
Those “vCPUs” correspond to hyperthreads. This is great for cloud providers, because it lets them sell 2x the number of actual CPU cores. It isn’t so great for us. If you try hyperthreading on your machine, you will see that the benefits are on the order of 5-20%. Hyperthreading does not magically double your CPU performance.
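A quick way to see this on any Linux VM is lscpu (part of util-linux, present on practically every distribution); the numbers below are just an illustration of what a “4 vCPU” instance often reports:

lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket)'
# CPU(s):              4
# Thread(s) per core:  2
# Core(s) per socket:  2
# i.e. two physical cores’ worth of hyperthreads, sold as four “vCPUs”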
If you wondered why everybody was so worried about hyperthreading-related vulnerabilities, it wasn’t because of performance loss. It was because if we pressured the cloud providers, they would have to disable hyperthreading, and thus cut the number of “vCPUs” they are selling by a factor of two.
In other words, we now have a whole culture of overselling and overbooking in the cloud, and everybody accepts it as a given. Yes, this makes me angry.
Now, you might get lucky, and your VMs might have neighbors who do not use their “vCPUs” much. In that case, your machines will run at full (single-core) performance and your “vCPUs” will not be much di#erent from actual CPU cores. But that is not guaranteed, and I found that most of the time you will actually get poor performance.
Intel® Xeon® processors are slow
There. I’ve said it. These processors are slow. Dog slow, in fact. We’ve been told over the years that the Intel® Xeon® chips are the powerhouses of computing, the performance champions, and cloud providers will often tell you which powerful Xeon® chips they are using. The model numbers are completely meaningless at this point, which I think is intentional confusion, so that even a 6-year-old chip branded with the Xeon® name appears to be powerful.
Fact is, Xeon® processors are indeed very good, but for cloud providers. They let providers pack lots of slow cores onto a single CPU die, put that into a server, and then sell twice that number of cores as “vCPUs” to us.
Now, if your workload is batch-oriented and embarrassingly parallel, and if you can make 100% use of all the cores, then Xeon® processors might actually make sense. For other, more realistic workloads, they are completely smoked by desktop chips with lower core counts.
Of course, if this were the case, then everybody would buy desktop chips. Which is why Intel intentionally cripples them, removing ECC RAM support and thus making them less reliable. And desktop chips are inconvenient for cloud providers, because you can’t get as many “vCPUs” out of a single physical server. Still, there are providers where you can get servers with desktop chips (Hetzner, for example), and these servers come out at the very top of my performance charts, at a fraction of the cost.
In other words, what we actually buy when we order our “Powerful compute-oriented Xeon®-powered VM” is a hyperthread on a dog-slow processor.
Enterprise shmenterprise
But, I can hear you say, this is wrong! Intel® Xeon® processors are for ENTERPRISE workloads! The serious stuff, the real deal, the corporate enterprisey synergistic large-mass cloud computing workloads that Real Enterprises use!
Well, my build is mostly Java execution and Java AOT compilation. Dockerized. That enterprisey enough? There is also some npm/grunt (it’s a modern enterprise), with a bunch of I/O. It can make use of multiple cores, although not perfectly. I’d say it’s the ideal “enterprise” use case.
Seriously, Xeon® chips are just plain slow. The benchmarks show it, especially in single-threaded CPU performance. They still rank relatively well in the multi-threaded benchmarks, but remember: a) your code is not embarrassingly parallel most of the time, and b) you will be renting 4-8 “vCPUs” (hyperthreads), not the 16 actual cores you’re looking at in the GeekBench results.
Takeaways
If you want to spin up a relatively fast developer-friendly cloud server for software development, I’d say that Vultr and Digital Ocean are the top picks.
Digital Ocean is by far the most user- and developer-friendly. If you have little time, just go with them. Things are simple, make sense, and are fun to use. As an example, Digital Ocean lets you configure firewall rules and apply them to servers based on server tags. Any server deployed with a certain tag will then use those firewall rules. Simple, makes sense, quick and easy to use. Now go and try doing the same in Azure, and let us know in a week how things are going.
Vultr has some rough edges, but is a very promising provider. Almost as user-friendly as Digital Ocean (but no docker-machine driver!). If you want to use attached storage, you will run into problems (attaching storage reboots the machine, which their support tells me is expected behavior).
You can get slightly faster machines at AWS if you pay a lot more. The z1d instances are advertised as fast. My testing shows them to be only slightly faster, which probably isn’t worth the price increase over a c5.2xlarge.
Buying more “vCPUs” often gets you better performance, even for the sequential build case. This is a bit surprising, until you realize that you are buying hyperthreads on an over-provisioned machine. If you buy more hyperthreads, you push out the neighbors and “reserve” more of the real CPU cores for yourself.
The best performance comes from… desktop-class Intel processors. My old i7-6700K is near the top of the charts, and so is Hetzner’s EX62-NVMe server with an i9-9900K. The EX62-NVMe is 64€/month, so for development it might make sense to just rent one or two and not bother with on-demand cloud servers at all.
Apart from Hetzner’s desktop CPU offerings, there seems to be no way to get a cloud server with fast single-core performance.
Another conclusion from these benchmarks is that I decided to buy an iMac as my development machine, not an iMac Pro. Sure, I would like to have the improved thermal handling of the iMac Pro, as well as better I/O, but I do not want the dog-slow Xeon® processor. Perhaps it makes sense if you load all cores with video encoding/processing, but for interactive development it most definitely does not, and a desktop-class Intel CPU is a much better fit.
AWS vs GCP vs on-premises CPU performance comparison | by Daniel Megyesi | Infrastructure adventures
Recently I had the chance to participate in a project where we had to evaluate the price/value ratio of different cloud providers and had to compare it to existing on-premises hardware. During our research on the Internet, we found a surprisingly small amount of actual, useful benchmarks when it comes to raw CPU performance, so we decided to make our own.
The goal: gather data which can support a decision about which cloud provider to choose, and help determine exactly how many vCPUs you need to buy in the cloud when you already know how many you normally use in a physical server in your own bare-metal environment.
This round of testing does not intend to be perfect and thorough; there are professional IT magazines that do that. We wanted quick and reliable benchmark data which fits our needs. If you have more time, it would be interesting to see detailed benchmarks with different kernels, before/after Meltdown-Spectre tests with different thread/CPU core counts, etc.
As a reference, I’m going to use a self-hosted physical server with a recent model of Intel Xeon. All the participants will be different Xeon models. Both on Amazon and Google you can only find Intel Xeon CPUs, literally nothing else, and this trend is pretty much the same in datacenters.
I made the tests using a Docker image of the well-known sysbench tool, but as a comparison, I did the same measurement with the binary, without using Docker. I found a <0.5% difference in multiple runs, so to make the testing procedure easier and ensure we use the exact same sysbench version with the same libraries (sysbench 1.0.13 (using bundled LuaJIT 2.1.0-beta2)), we decided to go all-in on Docker (CE 17.xx stable).
The following test commands were used:
docker run --cpus 1 --rm -ti severalnines/sysbench sysbench cpu --cpu-max-prime=20000 --threads=1 --time=900 run
docker run --cpus 2 --rm -ti severalnines/sysbench sysbench cpu --cpu-max-prime=20000 --threads=2 --time=900 run
docker run --cpus 8 --rm -ti severalnines/sysbench sysbench cpu --cpu-max-prime=20000 --threads=8 --time=900 run
Measurement time will be
- 10 seconds to see spike-performance and
- 15 minutes to see actual long-term performance.
We’re going to compare CPU speed by the events-per-second values from the test results.
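A small wrapper along these lines (a sketch, not the exact script used for this article) makes it easy to repeat the whole matrix of core counts and durations and extract only the events-per-second figure that sysbench 1.0 prints:

#!/bin/sh
# Run the sysbench CPU test for several core counts and durations,
# keeping only the "events per second" line from each run.
for cpus in 1 2 8; do
  for secs in 10 900; do
    echo "== ${cpus} core(s), ${secs}s =="
    docker run --cpus "$cpus" --rm severalnines/sysbench \
      sysbench cpu --cpu-max-prime=20000 --threads="$cpus" --time="$secs" run \
      | grep "events per second"
  done
done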
On bare metal, I ran several tests to see if there’s a significant difference based on the operating system (and therefore the kernel) used: I tested the same machine with CoreOS Container Linux stable (1632.3.0 — kernel 4.14.19), Ubuntu 14.04 LTS and CentOS 7. Again, the difference was within measurement error, so we are going to see the following operating systems:
- on bare-metal: CentOS 7 and CoreOS 1632.3.0
- on Amazon Web Services: Amazon Linux
- on Google Cloud Platform: CoreOS 1632.3.0
The reference machine: a 2016-model Intel(R) Xeon(R) CPU E5–2690 v4 @ 2.60GHz.
On a single-core, single-thread setup, during a short 10-second test we get 303.13 events/second, while the long-duration test showed a slightly better performance with 321.84 e/s. We will take the 15-min result as 100% and compare everything else to this value.
Next we’re going to do the benchmark on 2 dedicated CPU cores, using 2 parallel threads. Interestingly, the difference between the 10-second and 900-second benchmarks now seems to be very small: 670.61 vs 672.89 e/s. These results show that 2 CPU cores are 4.54% more performant than 2×1 CPU core on this specific Intel Xeon model.
Similarly, on 8 cores-8 threads, we get 2716.31 events per second, which gives us a +5.50% (or 105.50%) of the 8*1 CPU core performance.
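For clarity, these percentages are simply the measured multi-thread result divided by n times the single-thread result:

2 cores: 672.89 / (2 × 321.84) = 672.89 / 643.68 ≈ 1.0454, i.e. +4.54%
8 cores: 2716.31 / (8 × 321.84) = 2716.31 / 2574.72 ≈ 1.0550, i.e. +5.50%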
So let’s compare this to other physical machines!
Competitors:
- 2014-model of Intel(R) Xeon(R) CPU E5–2660 v3 @ 2.60GHz
- 2013-model of Intel(R) Xeon(R) CPU E5–2658 v2 @ 2.40GHz
- and for some fun, a 2009-model of Intel(R) Xeon(R) CPU X3460 @ 2.80GHz
As expected, the older the CPU, the slower it will be:
2016 → 2014 → 2013: 321.84 → 308.67 → 284.93 e/s on the single-core benchmark
Or in percentages, compared to the 2016 Xeon:
100.00% → 95.91% → 88.53% (1-core)
100.00% → 96.36% → 86.55% (2-core)
100.00% → 95.14% → 86.53% (8-core)
As you can see, on physical servers the CPU performance is linear with the number of cores and threads. The performance of n core vs. n*1 core is between 102–105%, similarly to the first tested model.
But hey, didn’t you mention 4 Xeons in the comparison?!
*drumroll* — the nearly 10-year-old Xeon X3460 caused some unexpected surprises: it beat the crap out of all its newer brothers on the single-thread synthetic benchmark, scoring an unbelievable 431.13 e/s — that’s 133.96% of the 2016 reference model. Yeah, back then multi-threading was not really a thing for the average application.
Of course, as expected, this advantage melts away very quickly as we increase the thread count first to 2, later to 8: while on the dual-core setup we still achieve a sparkling 127.71% of the 2016 reference, on 8 cores we’re already at only 73.52% of the big brother’s performance (1996.96 e/s vs 2716.31 e/s). This CPU has 8 logical cores, so we cannot go any further with the tests.
The 10-second spike benchmark results, on premises
The 15-minute benchmark results, on premises
By the way, interestingly, the benchmark showed the same results on the 20-core E5–2658 v2 (40 threads, or 40 logical cores with Hyper-Threading) whether run with 40, 60, 80 or 160 threads — and up to 40, it increased linearly: 10 cores gave 25% of the 40-core result, 20 cores 50%, 30 cores 75%, etc. So it looks like once you match the actual number of logical CPU cores, increasing the thread count above that doesn’t gain you anything in the long term.
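The oversubscription runs mentioned above are the same sysbench command as before, only with the thread count pushed past the number of logical cores, for example:

docker run --rm severalnines/sysbench sysbench cpu --cpu-max-prime=20000 --threads=160 --time=900 run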
To sum up the on-premises results:
- performance scales linearly with the number of cores: add more cores and you get proportionally more performance
- there seems to be about +5% gain each year in the new Xeon model, compared to the previous year’s
- the old 2009-model Xeon is significantly stronger on single-thread workloads, but quickly loses as multiple threads appear
Relative performance compared to the 2016 Xeon E5–2690 v4
Multi-thread optimization vs. single-thread workflows, on premises
On the AWS platform, you have a ton of different instance types you can tailor for your needs, so we made tests with quite a lot of them. I also included here the suggested use-case of these instance types by Amazon:
- reference: on-premises Intel(R) Xeon(R) CPU E5–2690 v4 @ 2.60GHz
- t2 (basic): Intel(R) Xeon(R) CPU E5–2676 v3 @ 2.40GHz
- m5 (generic): Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
- c5 (high CPU): Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
- r4 (high mem): Intel(R) Xeon(R) CPU E5–2686 v4 @ 2.30GHz
- i3 (high IOPS): Intel(R) Xeon(R) CPU E5–2686 v4 @ 2.30GHz
Except for the base t2 type (2015), all the CPUs are 2016 or 2017 models, so they are all comparable to our reference. An interesting side note: these specific Xeon Platinum models are actually tailor-made for Amazon; you cannot buy them on the market.
Amazon is selling vCPUs, which are, according to the fine print, logical CPU cores with Hyper-Threading enabled, not just the actual physical cores. These cores are normally not over-provisioned; while they are not shared “best effort” CPU cores, there’s no guarantee they don’t do optimisations between the different users on the same host. (With the micro instances, you have the option to buy partial cores shared between multiple tenants, for a much smaller price.)
So let’s go for the tests! After doing the same sysbench measurements, we arrived at the following values in the 10-second short test:
The 10-second spike benchmark results, AWS
You can already see:
- the single-core performance is much better than our reference, with only 1 exception
- but already with 2 threads, you start losing 10–25% compared to self-hosted physical hardware
- the t2 seems like a very reliable, stable instance with bare-metal performance
Don’t forget Amazon might allow temporary spikes in your workload without rate-limiting your CPU performance. That’s why we did the 15-min benchmarks:
The 15-minute benchmark results, AWS
Over the long term, the physical machines showed a constant 105% performance compared to the single-thread results.
Again, the t2 acts like our own self-hosted servers, with a very predictable performance.
The rest is not so appealing: even in the best case we lose ~17%, and that goes up to ~27% with the m5 general-purpose instances. It means that if you have 100 CPU cores in your data center, you need to buy 127 vCPU cores in Amazon to match the same performance.
AWS relative performance compared to the 2016 Xeon E5–2690 v4
Multi-thread optimization vs. single-thread workflows, AWS
Update: one of my colleagues pointed out that the t2 is a burstable type, unlike the others; it works with so-called “CPU credits”: https://aws.amazon.com/ec2/instance-types/#burst
So in general, this means that either you will suffer throttled performance after about 2 consecutive hours of 100% CPU usage (as in a synthetic benchmark), or you will need to pay at least an extra 5 cents per hour for the t2’s unlimited CPU burst feature. Unless you know your application’s characteristics very well, this could lead to unpredictable costs.
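Back-of-the-envelope, taking the “extra 5 cents per hour” above at face value (actual unlimited-mode pricing depends on instance size and how much you really burst): 0.05 USD × 24 h × 30 days ≈ 36 USD per month, per instance, on top of the base price if you burst continuously.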
I’m wondering whether it would be feasible to destroy and recreate all my t2 instances every 23 hours, so I can stay on the fixed-price, cheap, high-performance instances…? (Of course, only if the application and the infrastructure support it.)
In contrast to Amazon, Google offers a very simplified portfolio of instances: you either buy standard or CPU-optimized virtual machines — and that’s it. Even “CPU-optimized” means you get the same standardized hardware, just with more CPU cores allocated instead of, for example, more RAM.
It seems like they use a very simple, very uniform hardware fleet, and it probably helps them a lot with maintenance. They don’t actually tell you what hardware is running in your VM when you do a cat /proc/cpuinfo, but you can have a guess by the frequency, because they claim to have the following portfolio:
- 2.6 GHz Intel Xeon E5 (Sandy Bridge)
- 2.5 GHz Intel Xeon E5 v2 (Ivy Bridge)
- 2.3 GHz Intel Xeon E5 v3 (Haswell)
- 2.2 GHz Intel Xeon E5 v4 (Broadwell)
- 2.0 GHz Intel Xeon (Skylake)
On all of my tests I always received a 2.5 GHz model; the CPU info only said the following: Intel(R) Xeon(R) CPU @ 2.50GHz. This seems to be a 2013 model.
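If you want to repeat this identification step on your own instances, the usual quick checks only read what the kernel exposes:

grep -m1 "model name" /proc/cpuinfo    # on GCP this printed just "Intel(R) Xeon(R) CPU @ 2.50GHz"
lscpu | grep -E 'Model name|MHz'       # model string plus current/max frequency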
Since there are basically only two kinds of instances, the test was very quick and easy. I chose the n1-standard and the n1-highcpu types.
Let’s crunch the numbers!
The 10-second spike benchmark results, GCP
All the single-core results were better than our physical hardware (2016 Xeon), but only slightly. If it’s really the 2013 Xeon, then wow, all my respect to the Google optimization engineers!
As a reminder: Amazon had a 10–24% performance loss as we increased the number of cores. (Except for the very constant t2 instance.) Seems like Google is more or less the same so far.
Surprisingly, the high-CPU instance was actually slower than the standard one. But as I mentioned above, it is the same type of hardware; it just has a higher ratio of cores to RAM than the standard instance.
Again, similarly to Amazon, Google allows temporary spikes in your CPU usage without throttling your available computing capacity. So let’s see the long-term benchmarks:
The 15-minute benchmark results, GCP
Apparently, as we increase the workload, we consistently lose 15–22% of performance. On Amazon it was 17–27%.
Unfortunately, I didn’t see a t2-equivalent instance here; it’s supposed to be the n1-standard, but it definitely does not perform like our physical machines.
GCP relative performance compared to the 2016 Xeon E5–2690 v4
Multi-thread optimization vs. single-thread workflows, GCP
When you look only at the raw performance, Amazon seems to be very strong in the competition:
Relative CPU performance, AWS vs GCP compared to the 2016 Xeon E5–2690 v4
However, such a dumbed-down comparison is never really useful: Amazon offers a lot of different instance types, which might have a weak CPU but come with lightning-fast NVMe storage, etc. Sometimes that’s exactly what you need. Still, this article is only about raw CPU performance, so let’s see where the bill ends up:
On-demand prices for 8 vCPU cores, Amazon vs Google
Now you can see it’s much more balanced! You get what you pay for.
In case you need smaller machines, the diagram might look slightly different — let’s say for dual core instances:
On-demand prices for 2 vCPU cores, Amazon vs Google
Of course you can save a ton of money by using Amazon spot instances (a stock-exchange-like auction for spare computing capacity) or the preemptible Google instances (which can be turned off by Google at any random time, but at the latest after 24 hours). For a real production workload, I don’t find it realistic that you could reserve all your capacity through such risky bidding to win 20–90% discounts.
A realistic scenario might be to buy on-demand fixed instances for your usual core workload, then auto-scale with cheap spot/preemptible instances when there’s a peak of traffic. Also, for your QA environment the cheap instances should be perfectly fine; just adapt all your tools to correctly handle suddenly disappearing virtual machines and re-allocate resources dynamically. And of course, cloud is all about auto-scaling: when you don’t have many visitors during the night, you don’t need to pay for a lot of running instances. This is one of the areas where you can gain a lot compared to traditional on-premises infrastructure. (You don’t need to buy 200+ physical machines with maintenance contracts, etc. just because you have a 2-hour peak every day, after which those machines only consume electricity at 40% idle CPU…)
An additional option: both providers also offer long-term discounts if you commit to 12 or 36 months of continuous usage.
The cost of solution A or B is far more complex than just checking random instance hourly prices, once you start considering custom networking, storage requirements, bandwidth, etc. This article intended only to focus on raw computing capacity, as I found a lack of up-to-date information on the Internet.
There are a few key things we definitely realized by making this comparison:
- on physical machines: if you add more CPU cores, you get linearly bigger performance
- on the cloud providers, this was only partially true: performance still increases linearly as you add vCPUs, but you only tend to get ~80% of a physical machine’s performance (= you need to buy more CPUs in the cloud)
- on single-thread, single-CPU workflows the cloud providers win hands-down, because they have the most expensive, biggest CPUs, which are very strong on a single thread
One of the two cloud providers gave us direct feedback on the results. They said the performance loss is due to using Hyper-Threaded cores instead of real ones, unlike in a bare-metal test — because on the physical machine, when you restrict Docker to 8 CPU cores, you still have maybe 12 more installed, ready for the OS to use for interrupts, etc.
So they suggested that if we need 8 real cores to compare to physical machines, we should opt for a 16-core instance to get 8 true physical CPU cores reserved for us. On one hand, it absolutely makes sense; on the other hand, it still means I need to buy 2x the size (and the price) of the instance to achieve/surpass the actual on-premises performance…
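In Docker terms, their suggestion simply means running the same restricted benchmark on a bigger instance, for example (a sketch):

# On a 16-vCPU cloud instance: the container is still capped at 8 CPUs,
# the remaining 8 vCPUs stay idle for the OS and the noisy neighbours.
docker run --cpus 8 --rm severalnines/sysbench \
  sysbench cpu --cpu-max-prime=20000 --threads=8 --time=900 run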
To validate their claims, we did the same benchmarks on our on-premises KVM cluster, assigning 8, 2 and 1 vCPU cores, just like in the cloud. Then, just to test what they suggested, we also did a round with 2 extra vCPUs left only for the OS.
The results were consistent with our previous measurements from the non-KVM, on-premises hardware tests:
The 15-minute benchmark results, KVM on-premises
As you can see, it’s the exact same result: if you put 8x more virtual cores in KVM, you get 8x more performance, not just 6x or so.
Due to lack of time, I then just did a quick test in Google Cloud, using the above-mentioned method: overprovision the available cores by a lot — so basically I need only 2 cores for my application, but I will buy 8:
The 15-minute benchmark results, GCP with overprovisioned resources
Yes, it’s true: here I got a linear performance increase, just like on bare metal — but at the price of buying 2x, 8x, etc. more than what I originally wanted to pay, while with the physical machines I did not have this limit, even with KVM virtualization.
The next step would be to do a real Java benchmark or some other more realistic performance test, but these results can already be used in planning and calculations.
Thank you for taking the time to read this; I hope you found it useful. Please feel free to share your thoughts, and if you made a similar benchmark, it would be nice to see how it compares with these results.
New Intel processors crushed the Apple M1 Max. Performance test
Konstantin Alekseevich
Last week, Intel officially unveiled its Alder Lake-S 12th Gen Core processors and a compatible platform with the new LGA1700 socket. The desktop 10 nm Intel chips gained support for DDR5 RAM and the high-speed PCI Express 5.0 interface. Yesterday, the company lifted the review embargo, after which the first comparisons and benchmark results of the new products appeared online.
The tests involved MacBook Pro laptops based on the Apple M1 Max and M1 Pro chips with a clock speed of 3.2 GHz. The former includes eight high-performance and two energy-efficient cores, as well as 32 graphics cores; the second chip has the same CPU cores, but its graphics module has 16 cores. Their results were compared with the latest 12th-generation Intel Alder Lake-S processors, as well as AMD Ryzen 5000 chips.
The new Core i9-12900K achieved 4243 points in PassMark single-threaded testing, outperforming the M1 Pro and M1 Max chips with 3880 and 3850 points, respectively.
In the Geekbench 5 test, the flagship Core i9-12900K outperformed Apple’s chips in single-threaded and multi-threaded workloads with scores of 2,004 and 18,534 points, respectively; the M1 Max scored 1,764 and 12,430 points. Notably, even the mid-range Core i5-12600K processor was faster than the M1 Max.
It’s worth noting that Intel’s 12th Gen Alder Lake-S desktop processors can consume up to 241W in turbo mode, far exceeding the power-efficient Apple platform found in compact laptops.
Source: Hothardware
CPU performance comparison
Igor Pavlov’s 7-Zip archiver allows you to perform a Linux system performance test. Launch a terminal and run the command «7z b».
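For reference, a typical invocation looks like this; the -mmt switch limits the number of threads if you also want to see single-thread behaviour, and the Tot row at the bottom of the output appears to be the Total rating used for the charts below:

7z b          # benchmark on all available threads
7z b -mmt1    # the same benchmark restricted to one thread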
Below are graphical representations of multi-processor test results performed on various Linux distributions on various hardware platforms.
7-Zip tests, Total rating. Scale 100:1 (benchmark points, “parrots”, per pixel)
Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz × 56 (2CPUs x 14 cores 28 threads) (35M Cache, 145W, 22nm, FCLGA2011-3)
Intel® Xeon(R) CPU E3-1270 V2 @ 3.50GHz × 8 (8M Cache, 69W, 22nm, FCLGA1155)
Intel® Core™ i7-3630QM CPU @ 2.40GHz up to 3.40GHz × 8 (6M Cache, 45W, 22nm, FCPGA988)
Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz × 4 (6M Cache, 65W, 14nm, FCLGA1151)
Intel® Core™ i5-3470 CPU @ 3.20GHz x 4 (6M Cache, 77W, 22nm, FCLGA1155)
Intel® Xeon(R) CPU E5405 @ 2.00GHz × 8 (2 CPUs) (12M Cache, 80W, 45nm, LGA771)
Intel® Core™ i5-4460 CPU @ 3.20GHz × 4 (6M Cache, 84W, 22nm, FCLGA1150)
AMD Athlon(tm) II X3 455 Processor × 3 (3.3GHz, 1.5M Cache, 95W, 45nm, AM3)
7-Zip tests, Total rating. Scale 10:1 (benchmark points, “parrots”, per pixel)
Intel Core(tm) i3-2100 CPU @ 3.10GHz x 4 (3M Cache, HT, 65W, 32nm, FCLGA1155)
Intel(R) Celeron(R) N4100 CPU @ 1.10GHz x 4 (4M Cache, 6W, 14nm, Gemini Lake, FCBGA1090)
Intel® Pentium(R) CPU G2120 @ 3.10GHz × 2 (3M Cache, 55W, 22nm, FCLGA1155, DDR3-1600)
Intel® Core™ i3-5010U Processor @ 2.10 GHz x 4 Threads (3M Cache, 15W, 14nm)
Intel® Celeron(R) CPU J1900 @ 2.00GHz up to 2.42GHz × 4 (2M Cache, 10W, 22nm, FCBGA1170)
AMD Athlon(tm) II X2 270 Processor @ 3.4GHz × 2 (2M Cache, 65W, 45nm, AM3, DDR3)
Intel® Pentium® CPU G2010 @ 2.80GHz × 2 (3M Cache, 55W, 22nm, FCLGA1155, DDR3-1333)
Broadcom® BCM2711 @ 1.5GHz × 4 (ARM v8 A72 cores, 1M L2 Cache, OpenGL ES 3.0 graphics)
Intel® Core™2 Duo CPU E8400 @ 3.00GHz × 2 (6M Cache, 65W, 45nm, LGA775)
Intel(R) Pentium(R) CPU G850 @ 2.90GHz x 2 (3M Cache, 65W, 32nm, FCLGA1155)
Pentium(R) Dual-Core CPU E5700 @ 3.00GHz × 2 (2M Cache, 65W, 45nm, LGA775)
Intel® Core™2 Duo CPU E7200 @ 2.53GHz × 2 (3Mb Cache, 65W, 45nm, LGA775)
Intel® Core™2 Duo CPU E8200 @ 2.66GHz × 2 (6Mb Cache, 65W, 45 nm, LGA775)
Allwinner H6 Quad-core Cortex-A53 64bit @ 1.7GHz × 4 (Orange Pi One Plus)
Intel(R) Celeron(R) CPU G530 @ 2.40GHz x 2 (2M Cache, 65W, 32nm, FCLGA1155)
Intel® Core™2 Duo E6550 @ 2.33GHz x 2 (4M Cache, 65W, 65nm, PLGA775)
Intel® Core™2 Duo CPU E4600 @ 2.40GHz x 2 (2M Cache, 65W, 65nm, LGA775)
Intel® Celeron(R) CPU E3500 @ 2.70GHz x 2 (1M Cache, 65W, 45nm, LGA775)
Broadcom BCM2837B0 Quad-core Cortex-A53 64bit SoC @ 1.4GHz × 4 (Raspberry Pi 3 Model B+)
Allwinner H616 Quad-core Cortex-A53 64bit SoC @ 1.5GHz × 4 (Orange Pi Zero2)
Intel(R) Pentium(R) Dual CPU E2220 @ 2.40GHz × 2 (1M Cache, 65W, 65nm, LGA775)
Broadcom BCM2837 Quad-core Cortex-A53 64bit SoC @ 1.2GHz × 4 (Raspberry Pi 3 Model B)
Intel(R) Core™2 CPU 6400 @ 2.13GHz × 2 (2M Cache, 65W, 65nm, LGA775)
Allwinner H5 Quad-core Cortex-A53 64bit (Orange Pi PC 2)
Intel® Core™ Duo T2400 @ 1.83GHz x 2 (32 bit, 2M Cache, 31W, 65nm, PBGA479, PPGA478)
Allwinner H3 Quad-core Cortex-A7