Matthias Troyer wrote:
On Jun 12, 2007, at 3:35 PM, Dirk Schuricht wrote:
Recompiling ATLAS and LAPACK on the Apple machine had no effect on this issue.
Recompiling them might have no effect at all if ALPS still picks up Apple's version instead.
My wild guess on the subject: does the code make many small BLAS/LAPACK calls? If so, are you using threading? (If you don't know, then you probably are; it's enabled by default.)
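To make the "many small calls" concern concrete, here is a rough analogy in plain Python. It is not a measurement of vecLib, ATLAS, or MKL: it only compares doing tiny matrix products inline with handing each one to a thread pool, so that the per-call dispatch cost becomes visible. The function names, the 2x2 matrices, and the call count are all made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def small_matmul(a, b):
    """Multiply two small square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def run_inline(pairs):
    """Do every small product directly in the calling thread."""
    return [small_matmul(a, b) for a, b in pairs]


def run_dispatched(pairs, workers=4):
    """Hand every small product to a thread pool (hypothetical worker count)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(small_matmul, a, b) for a, b in pairs]
        return [f.result() for f in futures]


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Illustrative workload: many products of tiny matrices.
a = [[1.0, 2.0], [3.0, 4.0]]
b = [[0.5, 0.0], [0.0, 0.5]]
pairs = [(a, b)] * 20000

inline_result, inline_s = timed(run_inline, pairs)
dispatched_result, dispatched_s = timed(run_dispatched, pairs)
print(f"inline: {inline_s:.3f}s, dispatched: {dispatched_s:.3f}s")
```

Both variants compute the same results; only the timings differ, and those depend on the machine. If the real code shows the same pattern (tiny amounts of work per call, with a fixed threading cost on each call), that would be consistent with the guess above.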
My experience with vecLib was that it is highly inefficient for small calls, as the threading overhead seems to be huge. What helped was either recompiling ATLAS for OSX with threads disabled, or using Intel's MKL and exporting OMP_NUM_THREADS=1.
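For the MKL route, the environment variable can also be set by a small launcher instead of in the shell. This is a minimal sketch: the helper names are hypothetical, and it only sets OMP_NUM_THREADS, which is what was mentioned above for MKL. It does not change how a threaded ATLAS build behaves, since that has to be handled when ATLAS is compiled.

```python
import os
import subprocess


def single_thread_env(base=None):
    """Return a copy of the environment with OMP_NUM_THREADS set to 1."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = "1"
    return env


def run_single_threaded(cmd):
    """Run cmd (a list of arguments) with OMP_NUM_THREADS=1; return its exit code."""
    return subprocess.run(cmd, env=single_thread_env()).returncode
```

For example, `run_single_threaded(["./my_simulation", "params.in"])` would start a (hypothetical) binary with the variable set for that process only, leaving the calling shell untouched.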
If you download CHUD from the OSX developer's page, you can use the 'shark' application to get a quick idea of where you're spending your time. If it's threading, then you'll see it there.
Regards, Emanuel