A friend, Allan de Medeiros Martins, has made me lose some time playing with Restricted Boltzmann Machines just for fun!
Matrix multiplication is critical to the performance of the algorithm we’ve been discussing. Ruby has a Matrix class in the standard library, and its Matrix#* method does the job!
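For reference, here is a minimal sketch of what Matrix#* does with the standard library class. The 2x2 values are made up for illustration and are not taken from the RBM code.

```ruby
require 'matrix'

# Two small illustrative matrices (values chosen arbitrarily).
a = Matrix[[1, 2], [3, 4]]
b = Matrix[[5, 6], [7, 8]]

# Matrix#* performs a true matrix product (rows of a times columns of b).
product = a * b

# Row 0: 1*5 + 2*7 = 19, 1*6 + 2*8 = 22
# Row 1: 3*5 + 4*7 = 43, 3*6 + 4*8 = 50
puts product.to_a.inspect
```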
But the whole thing was really slow compared to the MATLAB version of the code.
Then I implemented a simple version of the matrix multiplication using an Array of Arrays, and I was surprised that it was around 2.5x faster than the specialized Matrix#* method. Unfortunately, even this was not fast enough.
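The Array of Arrays approach could look roughly like the sketch below. The names array_dot and time_it are hypothetical helpers written for this illustration; the actual implementation is the one in the benchmark repository, and timings will vary by machine and Ruby version.

```ruby
require 'matrix'

# Hypothetical helper: multiplies two matrices stored as Arrays of row Arrays.
def array_dot(a, b)
  raise ArgumentError, 'inner dimensions must match' unless a.first.size == b.size

  # Transpose b once so each of its columns can be walked as a plain Array.
  b_columns = b.transpose
  a.map do |row|
    b_columns.map do |col|
      sum = 0
      row.each_index { |k| sum += row[k] * col[k] }
      sum
    end
  end
end

# Hypothetical helper: returns the elapsed seconds for the given block.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Small random square matrix, just to get a rough feel for the difference.
n = 40
rows = Array.new(n) { Array.new(n) { rand } }
matrix = Matrix.rows(rows)

matrix_seconds = time_it { matrix * matrix }
array_seconds = time_it { array_dot(rows, rows) }
puts format('Matrix#*: %.4fs, array_dot: %.4fs', matrix_seconds, array_seconds)
```

A single run on a tiny matrix is noisy, so treat the printed numbers only as a rough indication.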
After some searching, I (re)discovered the SciRuby project and its NMatrix library. AMAZING project! The NMatrix#dot method does the proper matrix multiplication (dot product), while NMatrix#* is just an element-by-element multiplication. Using NMatrix#dot I got about 3x faster than Matrix#* and around 1.2x faster than my Array version. But something was not right: I was expecting a much more significant improvement in speed. After digging around, gotcha! In one part of the code I was using NMatrix#map, and this method was returning an “untyped” NMatrix object, as the documentation warns:
“Note that map will always return an :object matrix, because it has no way of knowing how to handle operations on the different dtypes.”
Well, with an untyped matrix all the specialized C algorithms are disabled and we can’t get a good speed boost. So I changed the code to guarantee that a :float64 NMatrix was used in every step of the algorithm. Boom! NMatrix#dot with dtype: :float64 is more than 400x faster than Matrix#*.
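A minimal sketch of the dtype issue, assuming the nmatrix gem is installed (the method returns nil otherwise). The helper name nmatrix_dtype_demo and the 2x2 values are made up for illustration, and casting after map is just one possible way to get back to :float64; it is not necessarily how the benchmark code does it.

```ruby
# Hypothetical demo: shows the dtype before map, after map, and after casting back.
def nmatrix_dtype_demo
  begin
    require 'nmatrix'
  rescue LoadError
    return nil # nmatrix gem not installed; nothing to demonstrate
  end

  # Start from an explicitly typed :float64 matrix.
  a = NMatrix.new([2, 2], [1.0, 2.0, 3.0, 4.0], dtype: :float64)

  # Per the documentation quoted above, map returns an :object matrix.
  mapped = a.map { |x| x * 2.0 }

  # Cast back so the following dot product runs on a :float64 matrix.
  typed = mapped.cast(dtype: :float64)

  {
    original: a.dtype,
    after_map: mapped.dtype,
    after_cast: typed.dtype,
    product: typed.dot(a)
  }
end

result = nmatrix_dtype_demo
puts result.inspect if result
```

Checking dtype after each step like this is a cheap way to spot where a matrix silently loses its type.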
See code at:
https://github.com/abinoam/matrix_dot_bench – /exe/matrix_dot_bench
Environment:
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin15]
System Version: OS X 10.11.6 (15G1217)
Kernel Version: Darwin 15.6.0
MacBook Pro (13-inch, Late 2011)
2.4 GHz Intel Core i5
16 GB 1600 MHz DDR3