Gravitational Wave search GPU App version

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130440 - Posted: 11 Apr 2014, 9:12:32 UTC

Due to the excellent work of our French volunteer Christophe Choquet we finally have a working OpenCL version of the Gravitational Wave search ("S6CasA") application. Thank you Christophe!

This App version is currently considered 'Beta' and is being tested on Einstein@Home. To participate in the Beta test, you need to edit your Einstein@Home preferences and set "Run beta/test application versions?" to "yes".

It is currently available for Windows (32 Bit) and Linux (64 Bit) only, and you should have a card which supports double precision FP in hardware.
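
As a rough illustration of the double precision requirement: OpenCL devices advertise double precision support through the `cl_khr_fp64` extension (older AMD drivers used `cl_amd_fp64`), listed in the device's extension string. The sketch below only checks such a string. The function name and the example string are made up for illustration, and nothing here says how the App itself detects double precision support.

```python
# Extension names that indicate double precision support in OpenCL.
FP64_EXTENSIONS = {"cl_khr_fp64", "cl_amd_fp64"}


def supports_double_precision(extensions: str) -> bool:
    """Return True if a space-separated OpenCL extension string lists an fp64 extension."""
    return any(ext in FP64_EXTENSIONS for ext in extensions.split())


# Hypothetical extension string, as a device might report it.
example = "cl_khr_global_int32_base_atomics cl_khr_fp64 cl_khr_byte_addressable_store"
print(supports_double_precision(example))
```

Note that the extension only says double precision is available, not how fast the card is at it.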

BM

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2,443,990
RAC: 1,354
Message 130449 - Posted: 11 Apr 2014, 11:32:32 UTC - in response to Message 130440.
Last modified: 11 Apr 2014, 11:35:36 UTC

It is currently available for Windows (32 Bit) and Linux (64 Bit) only, and you should have a card which supports double precision FP in hardware.

On my Win 7 x64 i5-3210M/GT650M/Intel_Graphics_HD4000 host I'm getting:

2014-04-11 11:18:14.2272 [PID=29146] Request: [USER#xxxxx] [HOST#8941572] [IP xxx.xxx.xxx.80] client 7.2.42
2014-04-11 11:18:14.2881 [PID=29146] [send] effective_ncpus 3 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2014-04-11 11:18:14.2881 [PID=29146] [send] effective_ngpus 2 max_jobs_on_host_gpu 999999
2014-04-11 11:18:14.2881 [PID=29146] [send] Not using matchmaker scheduling; Not using EDF sim
2014-04-11 11:18:14.2882 [PID=29146] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2014-04-11 11:18:14.2882 [PID=29146] [send] CUDA: req 17926.81 sec, 0.50 instances; est delay 0.00
2014-04-11 11:18:14.2882 [PID=29146] [send] Intel GPU: req 0.00 sec, 0.00 instances; est delay 0.00
2014-04-11 11:18:14.2882 [PID=29146] [send] work_req_seconds: 0.00 secs

Snip

2014-04-11 11:18:14.2882 [PID=29146] [send] active_frac 0.935463 on_frac 0.963980 DCF 1.528061
2014-04-11 11:18:14.2897 [PID=29146] [send] [HOST#8941572] is reliable
2014-04-11 11:18:14.2897 [PID=29146] [send] set_trust: random choice for error rate 0.029197: yes
2014-04-11 11:18:14.2897 [PID=29146] [mixed] sending non-locality work first (0.5707)
2014-04-11 11:18:14.3054 [PID=29146] [mixed] sending locality work second
2014-04-11 11:18:14.5549 [PID=29146] [version] Checking plan class 'SSE2'
2014-04-11 11:18:14.5561 [PID=29146] [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2014-04-11 11:18:14.5561 [PID=29146] [version] plan class ok
2014-04-11 11:18:14.5561 [PID=29146] [version] Don't need CPU jobs, skipping version 105 for einstein_S6CasA (SSE2)
2014-04-11 11:18:14.5561 [PID=29146] [version] Checking plan class 'GWopencl-ati-Beta'
2014-04-11 11:18:14.5561 [PID=29146] [version] beta test app versions not allowed in project prefs.
2014-04-11 11:18:14.5561 [PID=29146] [version] Checking plan class 'GWopencl-nvidia-Beta'
2014-04-11 11:18:14.5562 [PID=29146] [version] beta test app versions not allowed in project prefs.
2014-04-11 11:18:14.5562 [PID=29146] [version] Checking plan class 'SSE2-Beta'
2014-04-11 11:18:14.5562 [PID=29146] [version] beta test app versions not allowed in project prefs.
2014-04-11 11:18:14.5562 [PID=29146] [version] no app version available: APP#24 (einstein_S6CasA) PLATFORM#9 (windows_x86_64) min_version 0
2014-04-11 11:18:14.5562 [PID=29146] [version] no app version available: APP#24 (einstein_S6CasA) PLATFORM#2 (windows_intelx86) min_version 0
2014-04-11 11:18:14.5580 [PID=29146] [debug] [HOST#8941572] MSG(high) No work sent
2014-04-11 11:18:14.5580 [PID=29146] [debug] [HOST#8941572] MSG(high) see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/8941/8941572
2014-04-11 11:18:14.5580 [PID=29146] Sending reply to [HOST#8941572]: 0 results, delay req 60.00
2014-04-11 11:18:14.5583 [PID=29146] Scheduler ran 0.338 seconds
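
For anyone who wants to pull the relevant part out of a scheduler log like the one above, here is a minimal sketch that lists the plan classes rejected because beta versions are not allowed. It assumes the line format shown in this log. The function name is made up, and the sample is just three plan class checks copied from above.

```python
import re

# Matches lines such as: [version] Checking plan class 'GWopencl-ati-Beta'
PLAN_CLASS_RE = re.compile(r"\[version\] Checking plan class '([^']+)'")
BETA_BLOCKED = "beta test app versions not allowed in project prefs"


def beta_blocked_plan_classes(log_text):
    """Return the plan classes that were skipped because beta apps are disabled."""
    blocked = []
    current = None
    for line in log_text.splitlines():
        match = PLAN_CLASS_RE.search(line)
        if match:
            current = match.group(1)  # remember which plan class is being checked
        elif BETA_BLOCKED in line and current is not None:
            blocked.append(current)
    return blocked


sample = """\
2014-04-11 11:18:14.5549 [PID=29146] [version] Checking plan class 'SSE2'
2014-04-11 11:18:14.5561 [PID=29146] [version] plan class ok
2014-04-11 11:18:14.5561 [PID=29146] [version] Checking plan class 'GWopencl-ati-Beta'
2014-04-11 11:18:14.5561 [PID=29146] [version] beta test app versions not allowed in project prefs.
2014-04-11 11:18:14.5561 [PID=29146] [version] Checking plan class 'GWopencl-nvidia-Beta'
2014-04-11 11:18:14.5562 [PID=29146] [version] beta test app versions not allowed in project prefs.
"""
print(beta_blocked_plan_classes(sample))
```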


I've already received CPU beta CasA 1.06 work on this host. It is on the home venue, and Run beta is set to yes at that venue.
Generally, projects supply 32-bit CUDA apps on 64-bit Windows because there is no need for 64-bit addressing, and 64-bit CUDA apps run slower because of the 64-bit addressing.

Claggy
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130457 - Posted: 11 Apr 2014, 12:11:07 UTC - in response to Message 130449.

Thanks for reporting.

Should work now.

BM

Holmis
Joined: 4 Jan 05
Posts: 648
Credit: 101,397,590
RAC: 122,702
Message 130459 - Posted: 11 Apr 2014, 12:28:42 UTC
Last modified: 11 Apr 2014, 13:15:05 UTC

After editing my prefs and upping the cache a bit I managed to get a few S6 tasks assigned to my GTX660Ti. I then suspended the other GPU tasks in the queue to try one out while I'm here to check on things.
All tasks immediately got a computational error with the following in the stderr:
(unknown error) - exit code -1073741515 (0xc0000135)

Usually a sign of a missing file or .dll etc. The only file downloaded was:
einstein_S6CasA_1.06_windows_intelx86__GWopencl-nvidia-Beta.exe

Next step will be to try a driver upgrade to make sure all files are present and accounted for.
Other GPU work (BRP4G and FGRP3) run OK.

Edit: Updated the Nvidia driver to 335.23 via clean install but that did not change things, still getting instant error with the above error message.
Testing on hold until further notice.
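
For reference, the two numbers in the error message are the same value: the negative decimal is the signed 32-bit reading of the hex code, and 0xC0000135 is the Windows status STATUS_DLL_NOT_FOUND. A minimal sketch of the conversion follows. The function names and the small lookup table are illustrative only and far from a complete list of status codes.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex status value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Deliberately tiny lookup table of well-known Windows status codes.
KNOWN_STATUS = {
    "0xC0000135": "STATUS_DLL_NOT_FOUND",
    "0xC0000005": "STATUS_ACCESS_VIOLATION",
}


def describe_exit_code(exit_code: int) -> str:
    """Return the hex form of the exit code plus its name, if it is in the table."""
    status = to_ntstatus(exit_code)
    return f"{status} ({KNOWN_STATUS.get(status, 'unknown')})"


print(describe_exit_code(-1073741515))  # prints "0xC0000135 (STATUS_DLL_NOT_FOUND)"
```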

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1721
Credit: 64,824,904
RAC: 56,895
Message 130462 - Posted: 11 Apr 2014, 13:15:31 UTC - in response to Message 130459.

After editing my prefs and upping the cache a bit I managed to get a few S6 tasks assigned to my GTX660Ti. I then suspended the other GPU tasks in the queue to try one out while I'm here to check on things.
All tasks immediately got a computational error with the following in the stderr:
(unknown error) - exit code -1073741515 (0xc0000135)

Usually a sign of a missing file or .dll etc. The only file downloaded was:
einstein_S6CasA_1.06_windows_intelx86__GWopencl-nvidia-Beta.exe

Next step will be to try a driver upgrade to make sure all files are present and accounted for.
Other GPU work (BRP4G and FGRP3) run OK.

That can sometimes be unravelled by using dependency walker.
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130463 - Posted: 11 Apr 2014, 13:26:46 UTC

Is the Windows version running successfully anywhere?

BM

Holmis
Joined: 4 Jan 05
Posts: 648
Credit: 101,397,590
RAC: 122,702
Message 130464 - Posted: 11 Apr 2014, 13:34:20 UTC - in response to Message 130462.

Got this with dependency walker when opening "einstein_S6CasA_1.06_windows_intelx86__GWopencl-nvidia-Beta.exe":



The Swedish phrase "Det går inte att hitta filen" translates to "File not found".

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1721
Credit: 64,824,904
RAC: 56,895
Message 130465 - Posted: 11 Apr 2014, 13:35:26 UTC - in response to Message 130463.

Is the Windows version running successfully anywhere?

BM

I'll test too.
Freddykrug [Astronomy.Ru Forum]
Joined: 29 May 10
Posts: 1
Credit: 5,394,605
RAC: 225
Message 130466 - Posted: 11 Apr 2014, 13:45:25 UTC

Sorry, but why is this not done at Albert@home?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1721
Credit: 64,824,904
RAC: 56,895
Message 130469 - Posted: 11 Apr 2014, 14:00:53 UTC - in response to Message 130463.

Got a similar but rather shorter list of missing files with the 32-bit version of dependency walker (bitness matters, with that tool).



Host is host 5744895 - 64-bit Windows 7 with NV GTX 670, driver 335.23 (about 4 weeks ago).

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130470 - Posted: 11 Apr 2014, 14:02:37 UTC - in response to Message 130469.

Thanks. Looks like something went wrong with the build. While investigating, I'll disable the current Windows Beta App version.

BM

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1721
Credit: 64,824,904
RAC: 56,895
Message 130471 - Posted: 11 Apr 2014, 14:07:49 UTC - in response to Message 130470.

Googling suggests the problem might be related to missing Microsoft Visual Studio runtime redistributable packages. Are you using either VS 2008 or VS 2010 - if so, which?

(tasks are erroring, as Holmis described, but I'll save some for testing later)

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130472 - Posted: 11 Apr 2014, 14:28:57 UTC - in response to Message 130471.

Are you using either VS 2008 or VS 2010 - if so, which?


None - MinGW.

BM
Richard Haselgrove
Joined: 10 Dec 05
Posts: 1721
Credit: 64,824,904
RAC: 56,895
Message 130473 - Posted: 11 Apr 2014, 14:47:29 UTC - in response to Message 130472.

OK, those API- exports are probably not relevant, then.

Maybe these are more significant, if you can recognise any of them?

[D? ] DCOMP.DLL

Import Ordinal       Hint Function                 Entry Point
------ ------------- ---- ------------------------ -----------
[OE ]  1017 (0x03F9) N/A  N/A                      Not Bound
[CE ]  N/A           N/A  DCompositionCreateDevice Not Bound

[D? ] GPSVC.DLL

Import Ordinal Hint Function                              Entry Point
------ ------- ---- ------------------------------------- -----------
[CE ]  N/A     N/A  ProcessGroupPolicyCompletedExInternal Not Bound
[CE ]  N/A     N/A  RsopAccessCheckByTypeInternal         Not Bound
[CE ]  N/A     N/A  RsopFileAccessCheckInternal           Not Bound
[CE ]  N/A     N/A  RsopSetPolicySettingStatusInternal    Not Bound
[CE ]  N/A     N/A  ProcessGroupPolicyCompletedInternal   Not Bound
[CE ]  N/A     N/A  RsopResetPolicySettingStatusInternal  Not Bound

[D? ] IESHIMS.DLL

Import Ordinal Hint Function                             Entry Point
------ ------- ---- ------------------------------------ -----------
[CE ]  N/A     N/A  IEShims_Initialize                   Not Bound
[CE ]  N/A     N/A  IEShims_InDllMainContext             Not Bound
[CE ]  N/A     N/A  IEShims_GetOriginatingThreadId       Not Bound
[CE ]  N/A     N/A  IEShims_CreateWindowEx               Not Bound
[CE ]  N/A     N/A  IEShims_SetRedirectRegistryForThread Not Bound

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130474 - Posted: 11 Apr 2014, 15:19:47 UTC

The problem seems to be that libstdc++-6.dll is not linked statically into the App. The version of the MinGW compiler that I used for the first time apparently requires an additional option for this (-static-libstdc++).
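
A quick way to sanity-check a rebuilt executable without Dependency Walker is to look for the DLL name inside the binary, since imported DLL names are stored as plain ASCII strings in a Windows executable. The sketch below is a crude heuristic, not a real import-table parser: a match can also come from an unrelated string. The function names and the fake byte string are made up for illustration.

```python
from pathlib import Path


def mentions_dll(data: bytes, dll_name: str) -> bool:
    """Crude check: does the binary data contain the DLL name (case-insensitive)?"""
    return dll_name.lower().encode("ascii") in data.lower()


def exe_mentions_dll(exe_path: str, dll_name: str) -> bool:
    """Read an executable from disk and apply the same crude check."""
    return mentions_dll(Path(exe_path).read_bytes(), dll_name)


# Fake stand-in for the contents of an executable, for demonstration only.
fake_image = b"MZ\x00\x00KERNEL32.dll\x00libstdc++-6.dll\x00"
print(mentions_dll(fake_image, "libstdc++-6.dll"))
```

If a build made with -static-libstdc++ still mentions libstdc++-6.dll, that would be worth a closer look with a proper tool.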

Will build a new App, however I doubt that I can publish it before Monday.

BM

Sunny129
Joined: 5 Dec 05
Posts: 163
Credit: 160,342,159
RAC: 0
Message 130477 - Posted: 11 Apr 2014, 16:55:15 UTC

just a heads-up - my work host recently downloaded 2 S6casA 1.06 (SSE2 Beta) tasks, and this host most certainly is not set to accept beta/test applications. also, it shows as SSE2, not OpenCL...are there now SSE2 Beta tasks that i haven't yet seen mentioned on the boards?

Holmis
Joined: 4 Jan 05
Posts: 648
Credit: 101,397,590
RAC: 122,702
Message 130480 - Posted: 11 Apr 2014, 18:16:33 UTC - in response to Message 130477.

just a heads-up - my work host recently downloaded 2 S6casA 1.06 (SSE2 Beta) tasks, and this host most certainly is not set to accept beta/test applications. also, it shows as SSE2, not OpenCL...are there now SSE2 Beta tasks that i haven't yet seen mentioned on the boards?

Yes, the CPU beta apps were announced in the Tech news section here.
rattorosso [Marche]
Joined: 3 May 13
Posts: 8
Credit: 3,836,830
RAC: 2,667
Message 130500 - Posted: 12 Apr 2014, 22:52:04 UTC - in response to Message 130466.

Sorry, but why is this not done at Albert@home?


I'm also curious about this. Isn't albert@home the beta test platform for einstein@home?
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3611
Credit: 128,421,347
RAC: 56,240
Message 130507 - Posted: 13 Apr 2014, 8:21:42 UTC - in response to Message 130466.
Last modified: 13 Apr 2014, 8:38:57 UTC

Sorry, but why is this not done at Albert@home?


Albert@Home was originally set up to test server side code, including changes that were necessary to support new applications (e.g. locality scheduling). Historically we were testing Beta App versions on Einstein as "anonymous platform packages" as long as we had only one search (i.e. application). With the introduction of a second search (BRP), maintaining app_info.xml files became a hassle, so we switched to testing new app versions on Albert. This, however, has its own drawbacks:

- Providing the right amount of "work" of the right type (that belongs to the App version that we want to test) is getting increasingly difficult. We e.g. need to maintain a separate line of code for each workunit generator.

- The computing power and thus throughput on Albert is not very high, so you need to repeatedly reconfigure the system to not waste any on applications that you don't want to test right now.

- When you issue a new series of application versions and generate work for it, validation compares results of these only to that of versions of the same series. A comparison to the result of an established version - which is what we really want - happens only in very rare, accidental cases.

- The variety of systems attached to Albert is not at all representative of what's running on Einstein.

- Due to the low throughput, especially when tasks are assigned with additional constraints (locality scheduling), feedback (i.e. validation) is very slow, slowing down development.

- App version testing is not independent from the server code. Occasionally the need to test application versions prevented us from testing server side changes on Albert, or at least required us to postpone these.

For us these were enough arguments to shift testing of application versions back to Einstein. Server code and new applications will still be tested on Albert, and I think for some time at least BRP app versions, too.

BM
ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 709
Credit: 39,144,144
RAC: 1,488
Message 130510 - Posted: 13 Apr 2014, 10:49:06 UTC - in response to Message 130440.

and you should have a card which supports double precision FP in hardware.

Ouch - this could mean serious trouble! Does the entire calculation need DP or is it only for a few calculations? In the former case the app will work on all modern GPUs (except Intel), but will be very slow on all but a few high end chips.

You're surely aware of this, but most crunchers are not (as evidenced by the many people running Milkyway on hardware which is really not suitable for that task). Ideally any GPU slow in DP should rather stick with BRP calculations, as long as E@H has enough of those. Leave the DP stuff for the "big guns" and CPUs (ideally with SSE3/4 or AVX1/2).

But I don't know how to handle this properly and educate users. It would be a shame if they only found out about this after their RAC dropped to 1/2, 1/4 or even worse.

MrS
____________
Scanning for our furry friends since Jan 2002

This material is based upon work supported by the National Science Foundation (NSF) under Grants PHY-1104902, PHY-1104617 and PHY-1105572 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2016 Bruce Allen