Performance of NaCL vs. PNaCL

May 30, 2014

Since I am now able to build my experimental vector-based mathematics application Osoasso using Portable Native Client (PNaCL) with exceptions enabled, I decided to test the performance of PNaCL.

A confession

I chose to test the performance of Native Client (NaCL) and PNaCL by multiplying two \(n \times n\) matrices, with \(n = 1024\). The Osoasso code can use multiple threads, but I must confess that my current matrix multiplication implementation does not scale well. In fact, it scales only to about three threads, so these numbers are not nearly as good as they could be. Still, they do provide an interesting comparison between NaCL and PNaCL, at least in my code base.
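To make the threading idea concrete, here is a minimal sketch of one common way to split a matrix multiplication across threads: each thread computes a contiguous band of rows of the result. This is not the Osoasso implementation. The names `multiply_rows` and `multiply_threaded`, the row-major `std::vector<double>` storage, and the row-band partitioning are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not from Osoasso): accumulates rows [row_begin, row_end)
// of C = A * B for n x n matrices stored in row-major order.
void multiply_rows(const std::vector<double>& a, const std::vector<double>& b,
                   std::vector<double>& c, std::size_t n,
                   std::size_t row_begin, std::size_t row_end)
{
    for (std::size_t i = row_begin; i < row_end; ++i)
    {
        for (std::size_t k = 0; k < n; ++k)
        {
            const double a_ik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a_ik * b[k * n + j];
        }
    }
}

// Hypothetical driver: splits the rows of the result into contiguous bands,
// one band per thread. Threads write to disjoint rows, so no locking is needed.
std::vector<double> multiply_threaded(const std::vector<double>& a,
                                      const std::vector<double>& b,
                                      std::size_t n, std::size_t thread_count)
{
    std::vector<double> c(n * n, 0.0);
    thread_count = std::max<std::size_t>(1, std::min(thread_count, n));
    const std::size_t rows_per_thread = (n + thread_count - 1) / thread_count;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < thread_count; ++t)
    {
        const std::size_t begin = t * rows_per_thread;
        const std::size_t end = std::min(n, begin + rows_per_thread);
        if (begin >= end)
            break; // No rows left for this thread.
        workers.emplace_back(multiply_rows, std::cref(a), std::cref(b),
                             std::ref(c), n, begin, end);
    }
    for (auto& worker : workers)
        worker.join();
    return c;
}
```

With this layout, `multiply_threaded(a, b, 1024, 3)` would hand roughly a third of the rows to each of three threads. A scheme like this is easy to reason about, but how far it scales depends on memory bandwidth and cache behavior as much as on core count.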

I ran both tests on two machines: a dual-core Intel T2050 1.6 GHz laptop and an Intel Xeon X5675 3.7 GHz machine with 12 cores. I used two threads on the former and three threads on the latter.

NaCL performance

I multiplied the same two matrices made up of randomly generated double values five times using version 19 of Osoasso. This version is built with the NaCL newlib tool chain, and it requires the Native Client flag be enabled in Chrome. Here are the results:

Machine    Intel T2050 1.6 GHz    Intel Xeon X5675 3.7 GHz
Run 1      40.6093                8.76589
Run 2      40.636                 8.9953
Run 3      41.5904                9.111
Run 4      40.756                 9.11112
Run 5      40.81                  8.84316
Average    40.88034               8.965294
MFLOPS     52.531                 239.533

All run times and averages are reported in seconds.


PNaCL performance

Again, I multiplied the same two matrices made up of randomly generated double values five times using version 20 of Osoasso. This version is built with the PNaCL tool chain, and it does not require the Native Client flag be enabled in Chrome. Here are the results:

Machine    Intel T2050 1.6 GHz    Intel Xeon X5675 3.7 GHz
Run 1      47.2024                7.48539
Run 2      47.8889                8.73858
Run 3      49.0555                8.94701
Run 4      47.7729                8.91277
Run 5      48.9688                9.01714
Average    48.1777                8.620178
MFLOPS     44.574                 249.123

All run times and averages are reported in seconds.


Comparison

These results present a mixed bag. While PNaCL was consistently slower on the older, dual-core laptop, it performed slightly better on average on the Xeon processor. For my experimental application at least, the benefit of portability seems to be worth the performance cost of using PNaCL. Now that the PNaCL build is the default for Osoasso, it can run in Chrome without the need to enable special flags.
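To put rough numbers on the comparison, here are the ratios of the PNaCL average time to the NaCL average time on each machine, using the averages from the two tables:

\[
\frac{48.1777}{40.88034} \approx 1.18, \qquad \frac{8.620178}{8.965294} \approx 0.96
\]

So on the T2050, PNaCL took about 18% longer, while on the Xeon it took about 4% less time. Much of the Xeon difference comes from the first PNaCL run (7.48539 seconds). The other four PNaCL runs fall within roughly the same range as the NaCL runs, so with only five samples that small advantage should be read cautiously.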


Content © Josh Peterson
