Posts by Richard Haselgrove

1) Message boards : Problems and Bug Reports : "Input toplist has zero length" with "Gravitational Wave search S6Bucket Follow-up #2 v1.01 (SSE2)" tasks (Message 142294)
Posted 1 day ago by Richard Haselgrove
The fix deployed in v7.6.6 applied to one specific file only: stderr.txt

Projects vary in how important stderr.txt is to the validation process: Milkyway@Home returns its entire scientific result in stderr (no separate upload files), and SETI@Home returns some special-case processing flags used by the validator. But so far as I know, Einstein only uses the uploaded data files, and doesn't look at stderr.txt at all during validation. If that's the case, v7.6.6 will make no difference here.

There was an earlier fix, deployed in v7.6.2, relating to the non-deletion of files larger than 4 GB from slot directories. I doubt the questioner has checkpoint files that big... But while researching that problem, I learned a lot more about how the client ensures that the slot directories are clean before reuse: all files are deleted when a task finishes, all files are deleted (again, just in case) before a new task is started in an existing slot, and if any files are present after the second delete, the slot isn't reused and another one is chosen instead.

I was able to watch that process in action with the stderr.txt files from Milkyway, and it seems robust (and has been in place for a long time). Even if file system corruption made it impossible to delete slot files, I'd expect the client to go on making new slot directories and using them instead.
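As a rough illustration only (this is not BOINC's actual code), the slot clean-up behaviour described above could be sketched in Python as follows. The function names, the `busy` set of slot numbers and the injectable `delete` hook are all hypothetical, and the sketch covers only the "delete again, then check" step before a slot is reused.

```python
import shutil
from pathlib import Path


def delete_entry(entry: Path) -> None:
    """Delete one file or sub-directory found in a slot."""
    if entry.is_dir() and not entry.is_symlink():
        shutil.rmtree(entry)
    else:
        entry.unlink()


def clean_slot(slot: Path, delete=delete_entry) -> bool:
    """Try to delete everything in the slot; return True if it ends up empty."""
    for entry in list(slot.iterdir()):
        try:
            delete(entry)
        except OSError:
            pass  # an undeletable file just leaves the slot dirty
    return not any(slot.iterdir())


def pick_slot(slots_root: Path, busy=frozenset(), delete=delete_entry) -> Path:
    """Return the lowest-numbered slot that is not busy and is empty after cleaning."""
    n = 0
    while True:
        slot = slots_root / str(n)
        if n not in busy:
            slot.mkdir(parents=True, exist_ok=True)  # make a new slot if needed
            if clean_slot(slot, delete):
                return slot
        n += 1  # slot is busy or could not be emptied: try the next one
```

Under these assumptions, a slot holding a file that cannot be deleted is simply skipped, and a fresh higher-numbered directory is created and used instead.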
2) Message boards : Problems and Bug Reports : BRP4 Intel GPU app feedback thread (Message 142203)
Posted 8 days ago by Richard Haselgrove
p2030.20150531.G32.86+00.63.C.b6s0g0.00000_3408_0 223767791 11955750 23 Jul 2015 11:51:59 UTC 30 Jul 2015 11:51:59 UTC In progress --- --- --- --- Binary Radio Pulsar Search (Arecibo, GPU) v1.52 (BRP4G-Beta-opencl-intel_gpu)
p2030.20150531.G68.35+00.83.S.b3s0g0.00000_1536_0 223713556 11955750 22 Jul 2015 2:06:33 UTC 23 Jul 2015 14:08:40 UTC Completed, waiting for validation 81,587.69 493.70 4.28 Pending Binary Radio Pulsar Search (Arecibo, GPU) v1.52 (BRP4G-Beta-opencl-intel_gpu)
p2030.20150531.G68.35+00.83.S.b4s0g0.00000_1040_0 223713287 11955750 22 Jul 2015 2:06:33 UTC 22 Jul 2015 6:41:11 UTC Aborted by user 0.00 0.00 0.00 --- Binary Radio Pulsar Search (Arecibo, GPU) v1.52 (BRP4G-Beta-opencl-intel_gpu)

Why did I run for 81,587.69 seconds but only apply for 4.28 credits?

Task 510876799

It doesn't actually 'claim' or 'apply for' credit at all. It simply reports how long the processing took, and the rather elderly server software here works out that figure, on the assumption that the "CPU time: 493.70 seconds" was the only processing done. At this stage, the server doesn't take any notice of the fact that the bulk of the work was done on the intel_gpu.

Once your work is validated by a second user, you should get the same standard fixed credit as everyone else - 1,000 credits. That assumes that the results match.
3) Message boards : Technical News : BOINC transitioning to a Community Based Governance (Message 142080)
Posted 17 days ago by Richard Haselgrove
The email exchanges can be read directly at

https://groups.google.com/forum/#!forum/boinc_admin
4) Message boards : Technical News : Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6" (Message 142027)
Posted 20 days ago by Richard Haselgrove
It would probably help Bernd if people could (re-)state their GPU model when making remarks about speedup or otherwise.
5) Message boards : Problems and Bug Reports : BRP6_1.56_cuda55 - Driver API not loading on either Host (Message 142015)
Posted 20 days ago by Richard Haselgrove
Or just recognise that an API (Application Programming Interface) is a different beast from a driver, and not under your control.

Computers which are carefully managed by people who know exactly what they are doing - like archae86's GTX 970 - also report

[19:57:38][5500][INFO ] Version of installed CUDA driver: 6050
[19:57:38][5500][INFO ] Version of CUDA driver API used: 3020

in stderr_txt - that's the way the application was written. It's not under our control. Any of us.
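For readers unfamiliar with these numbers: assuming the two logged values follow the usual CUDA convention of encoding a version as 1000 × major + 10 × minor, they can be decoded with a few lines of Python. The function name here is hypothetical, and this is only a sketch of the arithmetic, not anything taken from the application itself.

```python
def decode_cuda_version(code: int) -> str:
    """Turn a CUDA-style integer (1000*major + 10*minor) into 'major.minor'."""
    major, rest = divmod(code, 1000)
    minor = rest // 10
    return f"{major}.{minor}"


# The two values quoted from the log above
for label, code in [("installed CUDA driver", 6050),
                    ("CUDA driver API used", 3020)]:
    print(f"{label}: {decode_cuda_version(code)}")
```

On that assumption, 6050 reads as 6.5 and 3020 as 3.2: the installed driver supports a newer CUDA level than the API version the application reports using.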
6) Message boards : Problems and Bug Reports : BRP6_1.56_cuda55 - Driver API not loading on either Host (Message 141999)
Posted 21 days ago by Richard Haselgrove
Use Process Explorer while the task is running to see which DLL files are being used, and where they're being loaded from.

I was caught out many years ago at SETI Beta, when I confidently predicted that an application would fail because it had been deployed wrongly - but it ran successfully and validated.

Process Explorer demonstrated (screenshot in that link) that Windows had found some SDK files I'd forgotten I'd even installed on that machine - and they were the right ones for the application.

In general, Windows will try to use the files supplied by Einstein first, but if you've installed developer files (not needed for normal crunching), they may be used instead.

Richard, I have no experience with Process Explorer yet, but I did see something curious when I ran the 1.57 CUDA 5.5 app through Dependency Walker. It showed names for the CUDA DLLs that differ from the actual names Einstein gives the CUDA 5.5 DLLs. Dependency Walker throws "file not found" errors.

See archae86's message 141792 in Gary's timing discussion thread, in Cruncher's Corner.

Einstein uses BOINC's "copy and rename" facility on these DLLs. Copy the app and anything mentioned in the <app_version> sections listed to a scratch folder. Rename those copies (only) of the two cuda DLLs as shown in <app_version>, and retry Dependency Walker. That's how I found the dependency on LIBWINPTHREAD-1.DLL for v1.54 reported in Technical News.

Edit - in reply to your second post, look in the slot directory while the app is running. And double-check your own <app_version> section for v1.57 in client_state.xml.
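The scratch-folder check described above could be sketched in Python along these lines. It assumes each `<file_ref>` inside `<app_version>` carries a `<file_name>` and, for renamed files, an `<open_name>`; check that against your own client_state.xml before relying on it. The function name and its arguments are hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def stage_for_dependency_walker(app_version_xml: str,
                                project_dir: Path,
                                scratch: Path) -> list[Path]:
    """Copy every file referenced by an <app_version> block into a scratch
    folder, giving each copy its <open_name> if one is present.
    The originals in project_dir are left untouched."""
    scratch.mkdir(parents=True, exist_ok=True)
    staged = []
    for ref in ET.fromstring(app_version_xml).iter("file_ref"):
        file_name = (ref.findtext("file_name") or "").strip()
        if not file_name:
            continue  # nothing to copy for this reference
        # Assumption: <open_name> is the name the app expects at run time
        open_name = (ref.findtext("open_name") or "").strip() or file_name
        target = scratch / open_name
        shutil.copy2(project_dir / file_name, target)
        staged.append(target)
    return staged
```

Pass it the text of one `<app_version>` block, the Einstein project folder and an empty scratch folder, then point Dependency Walker at the copied executable.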
7) Message boards : Technical News : Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6" (Message 141958)
Posted 22 days ago by Richard Haselgrove
My GTX 470 is running, but not under ideal conditions for timing. Recent work allocation has meant two PMPS running together, whereas previous timings were one Einstein with one SETI. I think I saw an improvement, but only a small one.
8) Message boards : Problems and Bug Reports : BRP6_1.56_cuda55 - Driver API not loading on either Host (Message 141932)
Posted 23 days ago by Richard Haselgrove
Use Process Explorer while the task is running to see which DLL files are being used, and where they're being loaded from.

I was caught out many years ago at SETI Beta, when I confidently predicted that an application would fail because it had been deployed wrongly - but it ran successfully and validated.

Process Explorer demonstrated (screenshot in that link) that Windows had found some SDK files I'd forgotten I'd even installed on that machine - and they were the right ones for the application.

In general, Windows will try to use the files supplied by Einstein first, but if you've installed developer files (not needed for normal crunching), they may be used instead.
9) Message boards : Technical News : Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6" (Message 141929)
Posted 23 days ago by Richard Haselgrove
Windows (32Bit) version 1.57 will be published soon. This just upgrades the CUDA 5.5 libraries to the latest available version 5.5.22 (from 5.5.20 used in version 1.56).

BM

It's a shame the version 5.5 CUDA FFT library is so big - 67 MB for each version, and BOINC doesn't discard the previous one. The next release up, CUDA 6.0, is back down to 27 MB (all for 32-bit files).
10) Message boards : Technical News : Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6" (Message 141881)
Posted 25 days ago by Richard Haselgrove
Found only at two subdirectories of f:\NVIDIA\DisplayDriver\344.60\Win8_WinVista_Win7_64\
DLL Name: cudart64_55.dll

Not found at all
DLL Name: cufft64_55.dll

Look at your app_version listing from last time. Einstein distributes CUDA DLLs with additional version information in the names, and renames them on copying to the slot directory. That bit's OK.

(assuming the app_version is essentially unchanged - I didn't catch one myself, again)


This material is based upon work supported by the National Science Foundation (NSF) under Grants PHY-1104902, PHY-1104617 and PHY-1105572 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2015 Bruce Allen