New Optimised Executables Links - a READ ONLY thread


Message boards : Cruncher's Corner : New Optimised Executables Links - a READ ONLY thread

Profile Mike Hewson
Forum moderator
Joined: Dec 1 05
Posts: 1984
ID: 135571
Credit: 4,980,593
RAC: 6,454
Message 28095 - Posted 14 Mar 2006 21:20:04 UTC
Last modified: 14 Mar 2006 22:09:50 UTC

Only akosf to post here, please, if he has any other links to download the new gear....

For any discussion, please start a new thread or add to an existing one. :-) :-)

The following was provided by akosf. I have tested by successfully downloading from the link given. It points to a zip file ( 1,044,480 bytes packed down to 412,787 bytes ) containing his optimised version of 'albert_4.37_windows_intel86.exe'.



C37 - optimised x86 compatible windows executable

Install:

1, download file
2, stop BOINC
3, unzip file to ../BOINC/projects/einstein.phys.uwm.edu/ directory
4, restart BOINC
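
The unzip step above could be scripted roughly as follows. This is only a sketch: the helper name `install_optimised_app` is made up for illustration, both paths are supplied by the caller, and BOINC still has to be stopped beforehand and restarted afterwards by hand.

```python
import zipfile
from pathlib import Path


def install_optimised_app(zip_path, project_dir):
    """Extract every file in zip_path into project_dir and return their names.

    Hypothetical helper, not part of BOINC or of akosf's package.
    Stop BOINC before calling it and restart BOINC afterwards.
    """
    project_dir = Path(project_dir)
    if not project_dir.is_dir():
        raise FileNotFoundError(f"project directory not found: {project_dir}")
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # Overwrites files of the same name already in the directory.
        archive.extractall(project_dir)
    return names
```

Here `project_dir` would be the einstein.phys.uwm.edu directory named in step 3.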


____________
"I have made this letter longer than usual, because I lack the time to make it short." - Blaise Pascal

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 28394 - Posted 20 Mar 2006 5:15:21 UTC

S38 - optimised x86-SSE compatible executable

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 28547 - Posted 22 Mar 2006 5:11:27 UTC

S39 - optimised x86-SSE compatible executable (for windows)

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 28668 - Posted 23 Mar 2006 3:52:04 UTC
Last modified: 23 Mar 2006 4:16:55 UTC

S39L - optimised SSE compatible windows executable

Change: S39L needs less L1 cache (~11kB) than S39 (~33kB) for its important data.

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 28909 - Posted 26 Mar 2006 6:49:27 UTC

C40 - optimised x86 compatible windows executable

Changes: sin/cos interpolator, AGI optimisations
Speedup: ~10% faster than C37
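
For readers unfamiliar with the term, a sin/cos interpolator can be sketched as below. This is not akosf's code (which is hand-tuned x86 assembly); the table size, the names `TABLE_SIZE` and `sincos_interp`, and the use of plain linear interpolation are all assumptions made for illustration.

```python
import math

TABLE_SIZE = 64  # assumed number of intervals per full turn
STEP = 2.0 * math.pi / TABLE_SIZE
# One extra entry so that index i + 1 is always valid.
SIN_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE + 1)]
COS_TABLE = [math.cos(i * STEP) for i in range(TABLE_SIZE + 1)]


def sincos_interp(x):
    """Approximate (sin x, cos x) by linear interpolation in the tables."""
    pos = (x % (2.0 * math.pi)) / STEP  # position measured in table steps
    i = min(int(pos), TABLE_SIZE - 1)   # index of the table entry at or below pos
    frac = pos - i                      # how far we are towards entry i + 1
    s = SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i])
    c = COS_TABLE[i] + frac * (COS_TABLE[i + 1] - COS_TABLE[i])
    return s, c
```

The point of the technique is that two table reads and a multiply-add replace a full library sin/cos call, at the price of a small, bounded error.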

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 29080 - Posted 28 Mar 2006 19:34:45 UTC

D40 - 3DNow! optimised windows executable

Comment: 3DNow! supported on these cpus:
K6-2,K6-III,Athlon,Duron,AthlonXP,Sempron,Athlon64,C3,Samuel-II,Ezra,Nehemiah,Winchip2

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 29529 - Posted 3 Apr 2006 3:46:53 UTC
Last modified: 3 Apr 2006 4:14:51 UTC

S40 - SSE optimised windows executable

Changes: better prefetching, AGI optimisations

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 29534 - Posted 3 Apr 2006 8:41:10 UTC

S40 is not perfect! Don't use it!
It sometimes returns an "access violation" on one of my Durons.
(I will check the code and the PC.)

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 30006 - Posted 8 Apr 2006 19:49:03 UTC

S40.03 - SSE optimised windows executable

Comment: S40 was good, just one of my Durons is in poor health.

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 30445 - Posted 16 Apr 2006 10:30:07 UTC

S40.04 - SSE optimised windows executable

Comment: bugfixed version of S40.03 (zero credit problem)

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 30632 - Posted 20 Apr 2006 16:10:07 UTC

S40.12 - SSE optimised windows executable

- double size sin/cos look-up table ( 2 more valuable bits but worse speed )
- 4 cache-lines are freed up ;-)
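
One way to see how doubling a table can be worth about two bits, assuming the table is read with linear interpolation (the post does not say how S40.12 reads it): the interpolation error scales with the square of the step, so halving the step cuts the error by about four. The function names below are hypothetical and the sketch only measures a plain sine table.

```python
import math


def max_interp_error(table_size, samples=10000):
    """Largest error of a linearly interpolated sine table over one turn."""
    step = 2.0 * math.pi / table_size
    table = [math.sin(i * step) for i in range(table_size + 1)]
    worst = 0.0
    for k in range(samples):
        x = 2.0 * math.pi * k / samples
        pos = x / step
        i = min(int(pos), table_size - 1)
        approx = table[i] + (pos - i) * (table[i + 1] - table[i])
        worst = max(worst, abs(approx - math.sin(x)))
    return worst


def bits_gained(small_size, large_size):
    """Extra bits of accuracy when growing the table from small to large."""
    return math.log2(max_interp_error(small_size) / max_interp_error(large_size))
```

Calling `bits_gained(256, 512)` should come out close to 2 under these assumptions. The trade-off mentioned in the post still applies: a larger table occupies more cache.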

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 30995 - Posted 25 Apr 2006 4:10:25 UTC

C41.00 - 386 compatible windows executable

- some tricks from S40.xx
- 2kB size sin/cos look-up table

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31383 - Posted 28 Apr 2006 15:11:55 UTC

C41.01 - 386 compatible windows executable

- better look-up table handling

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31532 - Posted 30 Apr 2006 8:25:38 UTC

D41.12 - 3DNow! compatible windows executable

- 3DNow! based trigonometry (2kB look-up table)
- reorganized address generation
- Newton-Raphson iteration
- mathematical reductions
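
The post does not say what the Newton-Raphson iteration is applied to. A common use alongside low-precision hardware reciprocal estimates is refining 1/d without a divide, so the sketch below assumes that case; `refine_reciprocal` is a hypothetical name.

```python
def refine_reciprocal(d, estimate, iterations=1):
    """Improve an approximate 1/d with Newton-Raphson steps.

    Each step computes x * (2 - d * x). When the estimate is already
    close, a step roughly doubles the number of correct bits and needs
    only multiplies and a subtract, no division.
    """
    x = estimate
    for _ in range(iterations):
        x = x * (2.0 - d * x)
    return x
```

For example, starting from the rough guess 0.3 for 1/3, one step gives 0.33 and each further step squares the relative error.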

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31695 - Posted 1 May 2006 13:05:13 UTC

D41.13 - 3DNow! compatible windows executable

- increased accuracy
- FPU based trigonometry with 4kB data
- common denominator for 4 pairs
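
The post does not explain "common denominator for 4 pairs". One plausible reading, and it is only a guess, is the usual trick of replacing four divisions with one: divide once by the product of all four denominators, then recover each quotient with multiplies. The function name and the tuple layout below are assumptions.

```python
def divide_four_pairs(pairs):
    """Return [n/d for n, d in pairs] using a single division.

    pairs is a sequence of exactly four (numerator, denominator) tuples
    with non-zero denominators.
    """
    (n0, d0), (n1, d1), (n2, d2), (n3, d3) = pairs
    d01 = d0 * d1
    d23 = d2 * d3
    inv = 1.0 / (d01 * d23)   # the only division
    inv01 = inv * d23         # equals 1 / (d0 * d1)
    inv23 = inv * d01         # equals 1 / (d2 * d3)
    return [
        n0 * (inv01 * d1),    # n0 / d0
        n1 * (inv01 * d0),    # n1 / d1
        n2 * (inv23 * d3),    # n2 / d2
        n3 * (inv23 * d2),    # n3 / d3
    ]
```

Division is far slower than multiplication on the CPUs listed in this thread, which is why such a trade can pay off; the results can differ from four separate divisions in the last bit or so.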

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31731 - Posted 1 May 2006 16:27:02 UTC
Last modified: 1 May 2006 16:39:34 UTC

D41.13 is refreshed (status bit bug is fixed).

Why did you give negative markings for these messages?

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31732 - Posted 1 May 2006 16:30:22 UTC

S41.06 - SSE compatible windows executable

- FPU based trigonometry with 4kB data
- common denominator for 4 pairs
- better FPU/SSE overlapping

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31767 - Posted 1 May 2006 19:42:06 UTC

U41.01 - SSE3 compatible windows executable

- SSE3 truncation

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 31915 - Posted 2 May 2006 9:52:29 UTC
Last modified: 2 May 2006 9:57:49 UTC

Hello! Don't use U41.01 because it doesn't work well. Sorry...

Edit: I will look into it and correct it.

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 32152 - Posted 3 May 2006 18:30:16 UTC

C41.02 - 386 compatible windows executable (386,486,Pentium,Pentium2,K5,K6,...)
D41.14 - 3DNow! compatible windows executable (K6-2,K6-3,Spitfire,Thunderbird,Cyrix3,Samuel-II,Ezra,...)
S41.07 - SSE compatible windows executable (Pentium-III,Pentium4,AthlonXP,Duron(1GHz+),...)
U41.04 - SSE3 compatible windows executable (AMD: from E3 stepping, Intel: Prescott? and newer)

- special rounding method (~2% speedup)

comments:
1,use at your own risk
2,use the observation threads for bug report and for your questions
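
The four build letters above can be summarised as a small lookup, sketched below. The flag names and the preference order (SSE3, then SSE, then 3DNow!, then plain 386) are assumptions made for illustration; the post lists which CPUs each build runs on but does not rank the builds, and detecting the CPU features is not shown.

```python
def pick_build(cpu_flags):
    """Map a collection of CPU feature names to a build letter from the post.

    cpu_flags is supplied by the caller; the names "sse3", "sse" and
    "3dnow" are hypothetical labels, matched case-insensitively.
    """
    flags = {f.lower() for f in cpu_flags}
    if "sse3" in flags:
        return "U"  # SSE3 compatible build
    if "sse" in flags:
        return "S"  # SSE compatible build
    if "3dnow" in flags:
        return "D"  # 3DNow! compatible build
    return "C"      # 386 compatible build, runs on anything x86
```

For a CPU that supports both SSE and 3DNow!, this sketch returns "S", in line with the post listing AthlonXP under the SSE build.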

Profile akosf
Volunteer developer
Joined: Nov 13 05
Posts: 545
ID: 121407
Credit: 3,779,672
RAC: 282
Message 37948 - Posted 10 Jun 2006 8:52:59 UTC
Last modified: 10 Jun 2006 9:40:16 UTC

C41.03
D41.15
S41.08
U41.05

- fixed memory prefetching fault

[af>quebec] Philippe Chayer
Joined: Feb 11 05
Posts: 1
ID: 16041
Credit: 232,496
RAC: 360
Message 73454 - Posted 16 Aug 2007 20:29:36 UTC - in response to Message 37948.

C41.03
D41.15
S41.08
U41.05

- fixed memory prefetching fault


These URLs do not work.... Have the files been moved somewhere?
____________

Stick
Joined: Feb 24 05
Posts: 786
ID: 36935
Credit: 218,623
RAC: 68
Message 73455 - Posted 16 Aug 2007 20:37:09 UTC - in response to Message 73454.
Last modified: 16 Aug 2007 20:58:30 UTC

C41.03
D41.15
S41.08
U41.05

- fixed memory prefetching fault


These URLs do not work.... Have the files been moved somewhere?


They were withdrawn (about a year ago) due to an issue with database integrity.

____________

Profile slavko.sk
Joined: Jan 22 05
Posts: 33
ID: 6210
Credit: 738,958
RAC: 2,880
Message 73456 - Posted 16 Aug 2007 21:02:48 UTC

The official applications now being distributed are already optimized. Don't worry about any others. The official ones are the right ones.
____________
ALL GLORY TO THE HYPNOTOAD!
Do You Dare?
Potrebujete pomoc?

Profile Bikeman
Forum moderator
Volunteer developer
Joined: Aug 28 06
Posts: 2173
ID: 210833
Credit: 5,839,880
RAC: 7,648
Message 73457 - Posted 16 Aug 2007 21:16:47 UTC - in response to Message 73456.

The official applications now being distributed are already optimized. Don't worry about any others. The official ones are the right ones.



Not quite...not yet.

The current suite of apps is not yet optimized in the sense of including hand-optimized code. It's pretty much the same code for all platforms, and only C/C++, no assembly language.

But Akos has contributed some optimizations to the code that will be used in one of the next major releases of the science apps, after one or two remaining stability problems are sorted out.

CU

BRM


____________

DanNeely
Joined: Sep 4 05
Posts: 825
ID: 106636
Credit: 5,227,733
RAC: 8,220
Message 73463 - Posted 17 Aug 2007 0:41:34 UTC

These were S4 apps and are totally useless for the current science run.

As for actual optimization: while the code in the current app hasn't been aggressively tweaked like the old app was, according to Bernd the new algorithm allows them to do 64x more work in a given time period than the old one did.
____________


This material is based upon work supported by the National Science Foundation (NSF) under Grant NSF-0200852 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2010 Bruce Allen for the LIGO Scientific Collaboration