Posts by Claggy

1) Message boards : Cruncher's Corner : CUDA 6.5? (Message 135227)
Posted 1 day ago by Claggy
What about this scenario:
Deprecate the CUDA 3.2 app and replace it with CUDA 5.5.
If CUDA 6.5 brings reasonable performance (or other) advantages, make an app for it and maintain it in parallel (send it only to those hosts which support it). Would that be too much effort?

Why would you want to deprecate the Cuda32 version? It would mean all hosts on Cuda32, Cuda40, Cuda42 and Cuda50 drivers would no longer get work.
To run a Cuda55 app you have to have drivers that support that version of the Cuda SDK.

Claggy
2) Message boards : Cruncher's Corner : GPU error message (Message 135222)
Posted 1 day ago by Claggy
Could anyone give me a clue as to what is wrong and what to do?

You're not trying to use remote desktop, are you?

Claggy
3) Message boards : Cruncher's Corner : Mac Intel HD4000 GPU question (Message 135220)
Posted 1 day ago by Claggy
Is this related to why the Intel GPU isn't working? Am I doing something wrong, or are the drivers not available? (I would suggest it on the wish list if it's easy to do.)

Einstein doesn't have an Intel GPU app for the Mac deployed here; the only one deployed here is for Windows, on the Binary Radio Pulsar Search (Arecibo) sub-project.

You could attach to the Albertathome test project; there is an Intel GPU app deployed there, but only for the Gamma-ray pulsar search #3 sub-project:

http://albert.phys.uwm.edu/apps.php

Gamma-ray pulsar search #3
Platform: Mac OS X on Intel
Version: 1.12 (FGRPopencl-intel_gpu-lion)
Created: 11 Jul 2014, 6:27:36 UTC
Average computing: 5 GigaFLOPS


Claggy
4) Message boards : Cruncher's Corner : wasted time (Message 134888)
Posted 6 days ago by Claggy
See the other thread for analysis.

Claggy
5) Message boards : Problems and Bug Reports : wasted time (Message 134887)
Posted 6 days ago by Claggy
this task:
PB0056_00821_88_0 202720988 6 Nov 2014 17:07:22 UTC 15 Nov 2014 0:31:11 UTC Error while computing 554,121.40 4,593.92 9.42 --- Binary Radio Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-opencl-ati)

took 6.413 days & completed with an error - what a waste - I won't run another 1 of these on this computer!

Boinc aborted the task with 'Maximum elapsed time exceeded' because it took 10 times longer than it should have. Have you freed a CPU core to stop the GPU apps being slowed down?

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
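
The figures in the task row quoted above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes 554,121.40 is the elapsed run time in seconds (which matches the "6.413 days" in the complaint) and that 4,593.92 is the CPU time in seconds, which is an assumption about the column order rather than something the row states.

```python
SECONDS_PER_DAY = 86_400

run_time_s = 554_121.40  # elapsed time from the task row
cpu_time_s = 4_593.92    # ASSUMED to be the CPU-time column of the same row

run_time_days = run_time_s / SECONDS_PER_DAY
cpu_fraction = cpu_time_s / run_time_s

print(f"elapsed: {run_time_days:.3f} days")
print(f"CPU time as a share of elapsed time: {cpu_fraction:.2%}")
```

Under those assumptions the CPU time is under 1% of the elapsed time. That does not prove anything on its own, but it is at least consistent with the GPU app spending most of its time waiting rather than computing.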


That host is running a particularly old CAL driver, 1.4.1417. This equates to Cat 11.6 and should have an OpenCL runtime of 650.9 (assuming they haven't become mismatched); this was an SDK 2.4 driver/runtime:

ATI Driver Version Cheat Sheet

It is quite possible that your host's driver/OpenCL runtime is too old for the project's apps. They may have been built with SDKs as recent as SDK 2.7 or 2.8.

Claggy
6) Message boards : Cruncher's Corner : not getting cpu tasks (Message 134738)
Posted 11 days ago by Claggy
I can't get any cpu tasks and I don't understand the message that is referenced:
11/10/2014 5:52:37 AM | Einstein@Home | see scheduler log messages on http://einstein5.aei.uni-hannover.de/EinsteinAtHome/host_sched_logs/11681/11681266

You've got plenty of CPU tasks on that host (now); I make it 125 Gamma-ray pulsar search #4 v1.04 (FGRP4-SSE2) tasks on that host:

In progress Gamma-ray pulsar search #4 tasks for computer 11681266

But you also keep aborting them:

Error Gamma-ray pulsar search #4 tasks for computer 11681266

You've got a quad-core processor, but you're limited to using only 2 cores. Boinc isn't asking for CPU work, only GPU work. You've got the min cache set to 10 days of work.
You're asking for 10.5 days of GPU work for each GPU, and this project has maximum deadlines of 14 days.
The min cache setting also tells Boinc how many days it may be without internet access,
so Boinc will need to finish that work 10 days early, meaning Boinc has 4 days to get that 10 days of work done.
The scheduler should refuse to send you more work if you can't possibly get it done in time.
Set a more reasonable cache size; a couple of days is all you need at this project:

2014-11-10 12:50:19.1108 [PID=2015] Request: [USER#xxxxx] [HOST#11681266] [IP xxx.xxx.xxx.93] client 7.2.42
2014-11-10 12:50:19.1115 [PID=2015 ] [send] effective_ncpus 2 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2014-11-10 12:50:19.1115 [PID=2015 ] [send] effective_ngpus 2 max_jobs_on_host_gpu 999999
2014-11-10 12:50:19.1115 [PID=2015 ] [send] Not using matchmaker scheduling; Not using EDF sim
2014-11-10 12:50:19.1115 [PID=2015 ] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2014-11-10 12:50:19.1115 [PID=2015 ] [send] ATI: req 1822018.04 sec, 0.00 instances; est delay 544613.28
2014-11-10 12:50:19.1115 [PID=2015 ] [send] work_req_seconds: 0.00 secs
2014-11-10 12:50:19.1115 [PID=2015 ] [send] available disk 3.34 GB, work_buf_min 864000
2014-11-10 12:50:19.1115 [PID=2015 ] [send] active_frac 0.999936 on_frac 0.990347 DCF 0.733791
2014-11-10 12:50:19.1223 [PID=2015 ] [send] [HOST#11681266] is reliable
2014-11-10 12:50:19.1225 [PID=2015 ] [send] set_trust: error rate 0.310944 > 0.050000, don't trust
2014-11-10 12:50:19.1225 [PID=2015 ] [mixed] sending non-locality work first (0.4721)
2014-11-10 12:50:19.1513 [PID=2015 ] [version] Checking plan class 'FGRP4-SSE2'
2014-11-10 12:50:19.1533 [PID=2015 ] [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2014-11-10 12:50:19.1533 [PID=2015 ] [version] numerical Windows version: 601760100 (Microsoft Windows 7 Ultimate x64 Edition, Service Pack 1, (06.01.7601.00))
2014-11-10 12:50:19.1533 [PID=2015 ] [version] plan class ok
2014-11-10 12:50:19.1533 [PID=2015 ] [version] Don't need CPU jobs, skipping version 104 for hsgamma_FGRP4 (FGRP4-SSE2)
2014-11-10 12:50:19.1533 [PID=2015 ] [version] no app version available: APP#27 (hsgamma_FGRP4) PLATFORM#9 (windows_x86_64) min_version 0
2014-11-10 12:50:19.1533 [PID=2015 ] [version] no app version available: APP#27 (hsgamma_FGRP4) PLATFORM#2 (windows_intelx86) min_version 0
2014-11-10 12:50:19.1636 [PID=2015 ] [mixed] sending locality work second
2014-11-10 12:50:19.1666 [PID=2015 ] [debug] [HOST#11681266] MSG(high) No work sent
2014-11-10 12:50:19.1666 [PID=2015 ] [debug] [HOST#11681266] MSG(high) see scheduler log messages on http://einstein5.aei.uni-hannover.de/EinsteinAtHome/host_sched_logs/11681/11681266
2014-11-10 12:50:19.1666 [PID=2015 ] Sending reply to [HOST#11681266]: 0 results, delay req 60.00
2014-11-10 12:50:19.1676 [PID=2015 ] Scheduler ran 0.060 seconds
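
The numbers in that request can be worked through as a rough check. The sketch below only redoes the arithmetic from the post using values copied from the log; it is not Boinc's actual scheduling logic, and the variable names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400

work_buf_min_s = 864_000      # "work_buf_min 864000" from the log
gpu_request_s = 1_822_018.04  # "ATI: req 1822018.04 sec" from the log
n_gpus = 2                    # "effective_ngpus 2" from the log
deadline_days = 14            # maximum deadline at this project, per the post

min_cache_days = work_buf_min_s / SECONDS_PER_DAY
request_days_per_gpu = gpu_request_s / n_gpus / SECONDS_PER_DAY
# Work has to be finished min_cache_days before the deadline,
# so this is the time actually left to crunch it.
slack_days = deadline_days - min_cache_days

print(f"min cache:       {min_cache_days:.1f} days")
print(f"request per GPU: {request_days_per_gpu:.2f} days")
print(f"time available:  {slack_days:.1f} days")
```

With a 10 day min cache the host asks for roughly 10.5 days of work per GPU while leaving itself only 4 days to finish it, which is why a cache of a couple of days is the saner setting here.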


Claggy
7) Message boards : Problems and Bug Reports : How important is the GPU? (Message 134629)
Posted 14 days ago by Claggy
Hi Danie,

The scheduler log on your 662 host shows the following:
2014-11-07 19:18:26.9681 [PID=2005 ] [version] driver version required min: -30100, supplied: 26658

The driver for this video card needs to be updated to at least version 30100 before it will be seeing any gpu work...

Gord

He has to do nothing of the sort. There are two plan_classes, for drivers either side of the buggy 295.xx/296.xx drivers:

2014-11-07 19:18:26.9681 [PID=2005 ] [version] Checking plan class 'BRP5-cuda32'
2014-11-07 19:18:26.9681 [PID=2005 ] [version] parsed project prefs setting 'gpu_util_brp': 0.000000
2014-11-07 19:18:26.9681 [PID=2005 ] [version] Peak flops supplied: 4.4864e+10
2014-11-07 19:18:26.9681 [PID=2005 ] [version] plan class ok
2014-11-07 19:18:26.9681 [PID=2005 ] [version] Don't need CUDA jobs, skipping version 139 for einsteinbinary_BRP5 (BRP5-cuda32)
2014-11-07 19:18:26.9681 [PID=2005 ] [version] Checking plan class 'BRP5-cuda32-nv301'
2014-11-07 19:18:26.9681 [PID=2005 ] [version] parsed project prefs setting 'gpu_util_brp': 0.000000
2014-11-07 19:18:26.9681 [PID=2005 ] [version] driver version required min: -30100, supplied: 26658
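
The idea of two plan classes bracketing a buggy driver range can be sketched like this. It is an illustration only, not the project's real plan_class_spec.xml: the 30100 minimum comes from the log, but the upper bound on the first class (29499) is an assumption chosen to sit just below the 295.xx drivers, and the table and function names are hypothetical.

```python
from typing import List, Optional, Tuple

# (plan class, min driver, max driver); None means "no bound".
# Driver versions use the log's integer form, e.g. 26658 for 266.58.
PLAN_CLASSES: List[Tuple[str, Optional[int], Optional[int]]] = [
    ("BRP5-cuda32", None, 29499),        # upper bound is an ASSUMPTION
    ("BRP5-cuda32-nv301", 30100, None),  # minimum taken from the log
]


def eligible_plan_classes(driver_version: int) -> List[str]:
    """Return the plan classes whose driver range contains driver_version."""
    matches = []
    for name, lowest, highest in PLAN_CLASSES:
        if lowest is not None and driver_version < lowest:
            continue
        if highest is not None and driver_version > highest:
            continue
        matches.append(name)
    return matches


print(eligible_plan_classes(26658))  # the host's driver from the log
print(eligible_plan_classes(29600))  # a driver inside the buggy range
```

Under these assumed bounds the host's 26658 driver fails the nv301 class but still qualifies for plain BRP5-cuda32, which is why no driver update is needed to get GPU work.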


Claggy
8) Message boards : Problems and Bug Reports : Gamma-ray pulsar search 4 (Message 134495)
Posted 17 days ago by Claggy
You're attached to four projects, and three of them have work on your host. Boinc will take turns doing work for each project; when it's Einstein's turn, computation will start again
(assuming you haven't suspended CPU or GPU activity via the Activity menu, or suspended the project, or suspended its tasks).

There's no need to delete/abort any tasks. Be aware that some of their apps report their progress very infrequently; just let them run, they'll finish eventually.

Claggy
9) Message boards : Cruncher's Corner : No New Tasks Set in BOINC - STILL Getting New Tasks (Message 134447)
Posted 21 days ago by Claggy
I keep adding and removing Einstein@home to my BOINC client (I remove it because it hogs my CPU/GPU).

I was advised to reduce my cache size to fix the problem of Einstein@Home always running in high priority mode ...

So I have set 0.4 days as my cache size and 0.1 days additional work buffer (in network settings).

This should make sure I have no more than half a days worth of work units WUs in my BOINC client tasks list - right ?

WRONG

As soon as I added the project I got 12 WU/tasks each with an estimated 8 hour run time (4 days worth of work).

So I went and set "No New Tasks" for this project (then I hit the update button to make sure the Einstein@Home server gets the message).

Well the Einstein@Home got the message - and I got an ADDITIONAL 12 WUs/tasks !!!

So I thought xxxx you I'm aborting the last 12 WUs/tasks you sent me after I specifically told you NO NEW TASKS

This project has old Boinc scheduler code, where the scheduler will resend existing tasks allocated to your computer even with NNT set.
More recent scheduler code will only resend lost work if the client asks for work.

Your host has at present 159 tasks allocated to it already:

In progress tasks for computer 11635427

Boinc is only resending those existing tasks; it is not sending new tasks:

http://einstein5.aei.uni-hannover.de/EinsteinAtHome/host_sched_logs/11635/11635427

2014-10-31 05:02:10.0344 [PID=6650 ] [version] Checking plan class 'BRP5-opencl-ati'
2014-10-31 05:02:10.0344 [PID=6650 ] [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2014-10-31 05:02:10.0344 [PID=6650 ] [version] Peak flops supplied: 2.304e+12
2014-10-31 05:02:10.0344 [PID=6650 ] [version] plan class ok
2014-10-31 05:02:10.0344 [PID=6650 ] [version] Best version of app einsteinbinary_BRP5 is ID 481 (49.31 GFLOPS)
2014-10-31 05:02:10.0344 [PID=6650 ] [send] est. duration for WU 201891923: unscaled 9126.06 scaled 9003.44
2014-10-31 05:02:10.0347 [PID=6650 ] [debug] Sorted list of URLs follows [host timezone: UTC+36000]
2014-10-31 05:02:10.0347 [PID=6650 ] [debug] zone=-28800 url=http://einstein.ligo.caltech.edu
2014-10-31 05:02:10.0347 [PID=6650 ] [debug] zone=-21600 url=http://einstein-dl2.phys.uwm.edu
2014-10-31 05:02:10.0347 [PID=6650 ] [debug] zone=-18900 url=http://einstein-dl.syr.edu
2014-10-31 05:02:10.0348 [PID=6650 ] [debug] zone=+03600 url=http://einstein2.aei.uni-hannover.de
2014-10-31 05:02:10.0350 [PID=6650 ] [send] [HOST#11635427] Sending app_version 481 einsteinbinary_BRP5 9 139 BRP5-opencl-ati; 49.31 GFLOPS
2014-10-31 05:02:10.0351 [PID=6650 ] [send] [RESULT#462221471] [HOST#11635427] (resend lost work)
2014-10-31 05:02:10.0368 [PID=6650 ] [send] est. duration for WU 201891923: unscaled 9126.06 scaled 9003.44
2014-10-31 05:02:10.0368 [PID=6650 ] [HOST#11635427] Sending [RESULT#462221471 PB0053_02511_344_0] (est. dur. 9003.44 seconds, delay 1209600, deadline 1415941330)
2014-10-31 05:02:10.0384 [PID=6650 ] [send] est. duration for WU 201892158: unscaled 9126.06 scaled 9003.44
2014-10-31 05:02:10.0387 [PID=6650 ] [send] [HOST#11635427] Sending app_version 481 einsteinbinary_BRP5 9 139 BRP5-opencl-ati; 49.31 GFLOPS
2014-10-31 05:02:10.0389 [PID=6650 ] [send] [RESULT#462221941] [HOST#11635427] (resend lost work)
2014-10-31 05:02:10.0406 [PID=6650 ] [send] est. duration for WU 201892158: unscaled 9126.06 scaled 9003.44
2014-10-31 05:02:10.0406 [PID=6650 ] [HOST#11635427] Sending [RESULT#462221941 PB0053_024C1_34_0] (est. dur. 9003.44 seconds, delay 1209600, deadline 1415941330)
2014-10-31 05:02:10.0425 [PID=6650 ] [send] est. duration for WU 201892160: unscaled 9126.06 scaled 9003.44
2014-10-31 05:02:10.0426 [PID=6650 ] [send] [HOST#11635427] Sending app_version 481 einsteinbinary_BRP5 9 139 BRP5-opencl-ati; 49.31 GFLOPS
2014-10-31 05:02:10.0428 [PID=6650 ] [send] [RESULT#462221946] [HOST#11635427] (resend lost work)
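
The "delay" and "deadline" fields in those "Sending [RESULT#...]" lines can be decoded as a quick check. This is a small sketch that assumes the deadline is a Unix timestamp in UTC and the delay is in seconds; both values are copied from the log.

```python
from datetime import datetime, timedelta, timezone

deadline_epoch = 1415941330  # "deadline 1415941330" from the log
delay_seconds = 1209600      # "delay 1209600" from the log

deadline = datetime.fromtimestamp(deadline_epoch, tz=timezone.utc)
# Subtracting the delay should land on the time the task was (re)sent.
sent = deadline - timedelta(seconds=delay_seconds)

print("delay in days:", delay_seconds / 86400)
print("deadline:     ", deadline.isoformat())
print("sent:         ", sent.isoformat())
```

The delay works out to exactly 14 days, and the computed send time matches the 2014-10-31 05:02:10 timestamps on the log lines, so the resent tasks get a fresh 14 day deadline counted from the resend.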


Claggy
10) Message boards : News : Project downtime tomorrow (Message 134420)
Posted 23 days ago by Claggy
This "Notice" is still appearing on the BOINC Manager page ... ?

Stop posting in this thread; your very posting is causing it to get reshown, repeatedly.

The Boinc devs are aware of this problem and have put fixes in place; when they reach this project is another matter.

Claggy



This material is based upon work supported by the National Science Foundation (NSF) under Grants PHY-1104902, PHY-1104617 and PHY-1105572 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2014 Bruce Allen