Performance of NaCL vs. PNaCL
May 30, 2014
Since I am now able to build my experimental vector-based mathematics application Osoasso using Portable Native Client (PNaCL) with exceptions enabled, I decided to test the performance of PNaCL.
A confession
I chose to test the performance of Native Client (NaCL) and PNaCL by multiplying two \(n \times n\) matrices, with \(n=1024\). The Osoasso code can use multiple threads, but I must confess that my current matrix multiplication implementation does not scale well. In fact, it scales only to about three threads, so these numbers are not nearly as good as they could be. Still, they do provide an interesting comparison between NaCL and PNaCL, at least in my code base.
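To make the workload concrete, here is a minimal sketch of a naive matrix multiplication that also counts its floating-point operations. It is not the Osoasso implementation (which is not shown in this post); the names `multiply` and `random_matrix` are hypothetical, and the sketch is single-threaded. It assumes the textbook triple loop, which performs one multiply and one add per inner step, or \(2n^3\) operations in total.

```python
import random


def multiply(a, b):
    """Naive product of two square matrices given as lists of rows.

    Returns the product and the number of floating-point operations
    performed (one multiply plus one add per inner-loop step).
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    flops = 0
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
                flops += 2
            c[i][j] = total
    return c, flops


def random_matrix(n, seed):
    """Hypothetical helper: an n x n matrix of random doubles in [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n)] for _ in range(n)]


n = 8  # kept small for illustration; the tests in this post use n = 1024
product, flops = multiply(random_matrix(n, 1), random_matrix(n, 2))
print(flops == 2 * n**3)
```

Under this assumption, a single \(1024 \times 1024\) multiplication costs \(2 \cdot 1024^3 = 2{,}147{,}483{,}648\) floating-point operations.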
I ran both tests on two machines: a dual-core Intel T2050 1.6 GHz laptop and a 12-core Intel Xeon X5675 3.7 GHz machine. I used two threads on the former and three on the latter.
NaCL performance
Using version 19 of Osoasso, I multiplied the same two matrices of randomly generated double values five times. This version is built with the NaCL newlib tool chain, and it requires that the Native Client flag be enabled in Chrome. Here are the results, with run times in seconds:
Machine | Intel T2050 1.6 GHz | Intel Xeon X5675 3.7 GHz |
---|---|---|
Run 1 | 40.6093 | 8.76589 |
Run 2 | 40.636 | 8.9953 |
Run 3 | 41.5904 | 9.111 |
Run 4 | 40.756 | 9.11112 |
Run 5 | 40.81 | 8.84316 |
Average | 40.88034 | 8.965294 |
MFLOPS | 52.531 | 239.533 |
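The Average and MFLOPS rows can be reproduced from the five run times. The sketch below assumes that the run times are in seconds and that one multiplication costs \(2n^3\) floating-point operations with \(n = 1024\); neither is stated explicitly in the table, but the arithmetic matches under those assumptions. The function name `summarize` is hypothetical.

```python
N = 1024
FLOPS_PER_MULTIPLY = 2 * N**3  # assumption: naive algorithm, 2n^3 operations


def summarize(times_s):
    """Return (average run time in seconds, MFLOPS) for a list of run times."""
    average = sum(times_s) / len(times_s)
    mflops = FLOPS_PER_MULTIPLY / average / 1e6
    return average, mflops


# Run times copied from the NaCL table above.
nacl_runs = {
    "Intel T2050": [40.6093, 40.636, 41.5904, 40.756, 40.81],
    "Intel Xeon X5675": [8.76589, 8.9953, 9.111, 9.11112, 8.84316],
}

for machine, times in nacl_runs.items():
    average, mflops = summarize(times)
    print(f"{machine}: average {average:.6f} s, {mflops:.3f} MFLOPS")
```

The printed values agree with the table's Average and MFLOPS rows, which supports reading the run times as seconds.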
PNaCL performance
Using version 20 of Osoasso, I again multiplied the same two matrices of randomly generated double values five times. This version is built with the PNaCL tool chain, and it does not require that the Native Client flag be enabled in Chrome. Here are the results, with run times in seconds:
Machine | Intel T2050 1.6 GHz | Intel Xeon X5675 3.7 GHz |
---|---|---|
Run 1 | 47.2024 | 7.48539 |
Run 2 | 47.8889 | 8.73858 |
Run 3 | 49.0555 | 8.94701 |
Run 4 | 47.7729 | 8.91277 |
Run 5 | 48.9688 | 9.01714 |
Average | 48.1777 | 8.620178 |
MFLOPS | 44.574 | 249.123 |
Comparison
These results present a mixed bag. PNaCL was consistently slower on the older dual-core laptop (about 18% on average), but slightly faster on the Xeon (about 4% on average). For my experimental application at least, the portability of PNaCL seems worth the performance cost. Now that the PNaCL build is the default for Osoasso, it can run in Chrome without the need to enable special flags.
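The size of the gap on each machine can be worked out from the Average rows of the two tables. This is a small sketch of that calculation; the helper name `relative_change` is hypothetical, and the sign convention (positive means PNaCL took longer) is my own choice.

```python
# Average run times copied from the two tables above.
averages = {
    "Intel T2050": {"NaCL": 40.88034, "PNaCL": 48.1777},
    "Intel Xeon X5675": {"NaCL": 8.965294, "PNaCL": 8.620178},
}


def relative_change(nacl, pnacl):
    """Percent change in run time going from NaCL to PNaCL.

    Positive means PNaCL was slower; negative means it was faster.
    """
    return (pnacl - nacl) / nacl * 100.0


for machine, avg in averages.items():
    change = relative_change(avg["NaCL"], avg["PNaCL"])
    print(f"{machine}: {change:+.1f}%")
```

With the averages above, this gives roughly +17.9% on the T2050 and -3.8% on the Xeon. The Xeon figure leans on the first PNaCL run (7.48539): the other four runs average about 8.90, which is much closer to the NaCL average of 8.965294, so the Xeon difference may be within run-to-run noise.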