Sudden lurch in remaining work display

archae86
Joined: 6 Dec 05
Posts: 1763
Credit: 357,893,516
RAC: 572,417
Message 79937 - Posted: 23 Jan 2008, 17:39:36 UTC

The S5R3 search progress pane on the server status page suddenly changed from saying we had well over 300 days of work to go to claiming only 6.2 days.

Bug, or news item?
____________

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3612
Credit: 128,657,615
RAC: 52,314
Message 79939 - Posted: 23 Jan 2008, 17:47:45 UTC

Actually news, but I haven't gotten around to writing up the details yet. We found we had to break the current run into two parts at a frequency of 800Hz. The display shows the work remaining below 800Hz. We'll have the upper half set up in a few days.

BM

th3
Joined: 24 Aug 06
Posts: 208
Credit: 2,208,434
RAC: 0
Message 79941 - Posted: 23 Jan 2008, 18:11:40 UTC

I noticed that all the WUs I had above 800Hz give way too much credit. Will future WUs in that range give less credit?

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3612
Credit: 128,657,615
RAC: 52,314
Message 79943 - Posted: 23 Jan 2008, 18:17:50 UTC - in response to Message 79941.
Last modified: 23 Jan 2008, 18:18:04 UTC

An unexpected side effect of the problems we have found with the >=800Hz WUs is that they run shorter than intended. The new ones will get the same credit, but run noticeably longer.

BM
DanNeely
Joined: 4 Sep 05
Posts: 1120
Credit: 189,724,803
RAC: 256,131
Message 79970 - Posted: 24 Jan 2008, 1:19:53 UTC - in response to Message 79943.

Would it be possible to keep the runtime unchanged and adjust the credit instead? This would reduce a lot of the grumbling from people with older machines that aren't on 24/7.
____________
Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 79971 - Posted: 24 Jan 2008, 1:25:11 UTC - in response to Message 79970.

If the official Windows app becomes 4.26, there may be enough of a speed boost to help the GUM (Great Unwashed Masses). If not, then boosting deadlines up to 16-18 days until SSE can be implemented in the Windows app may also help...
____________
Donald A. Tevault
Joined: 17 Feb 06
Posts: 443
Credit: 73,513,404
RAC: 0
Message 79972 - Posted: 24 Jan 2008, 1:32:36 UTC - in response to Message 79971.

Hmmm. . .

I don't know. If I understand Bernd correctly, it sounds like these >= 800Hz workunits don't run long enough to complete all of the needed calculations. Thus, the need to create new workunits with longer runtimes.
____________
Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 79974 - Posted: 24 Jan 2008, 1:40:45 UTC - in response to Message 79972.

Yes, and what I stated does depend on the runtime staying consistent between the <800 and the >=800 WUs. If >=800 takes longer than <800, then there is definitely going to be some need for panic...

____________
DanNeely
Joined: 4 Sep 05
Posts: 1120
Credit: 189,724,803
RAC: 256,131
Message 80002 - Posted: 24 Jan 2008, 11:43:43 UTC - in response to Message 79972.


Depends on what Bernd meant. The way I read it, the WUs were completing all the work they needed to do in significantly less time than was expected.
____________
Donald A. Tevault
Joined: 17 Feb 06
Posts: 443
Credit: 73,513,404
RAC: 0
Message 80007 - Posted: 24 Jan 2008, 12:42:58 UTC - in response to Message 80002.


If that's the case, then I don't understand what the problem is.

Hopefully, we'll get some more amplifying info on this later.
____________
Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 80010 - Posted: 24 Jan 2008, 13:09:50 UTC - in response to Message 80007.


The way I translated it, the workunits ran much faster than anticipated. What isn't stated is why they ran faster than anticipated. Another related message here was about how tasks at the 799.xx frequency were erroring out immediately...

I dunno... I've asked multiple times about deadline extensions. I was considering not asking again, given the speed increase from Windows 4.26. Will have to wait and see...
____________
Donald A. Tevault
Joined: 17 Feb 06
Posts: 443
Credit: 73,513,404
RAC: 0
Message 80024 - Posted: 24 Jan 2008, 13:54:21 UTC

I've finally received a pair of these >= 800Hz jobs. They completed in about 76,000 seconds, far less than the 110,000 - 120,000 seconds that would be normal for this machine. So, there's definitely something strange here.

Dual Pentium III 866
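
To put those run times in perspective, here is a rough sketch of how much shorter the >= 800Hz jobs ran on this host, and what that would do to credit per hour if the credit per task stays fixed. The function names are made up for illustration; the only inputs are the run times quoted above.

```python
# Run times quoted in the post above (seconds, Dual Pentium III 866).
SHORT_RUN_S = 76_000                # observed for the >= 800Hz jobs
NORMAL_RUN_S = (110_000, 120_000)   # usual range for this machine


def runtime_fraction(observed_s: float, normal_s: float) -> float:
    """Fraction of the normal run time that the observed task took."""
    return observed_s / normal_s


def credit_rate_boost(observed_s: float, normal_s: float) -> float:
    """Factor by which credit/hour rises if credit per task is unchanged."""
    return normal_s / observed_s


for normal in NORMAL_RUN_S:
    frac = runtime_fraction(SHORT_RUN_S, normal)
    boost = credit_rate_boost(SHORT_RUN_S, normal)
    print(f"vs {normal} s: {frac:.0%} of normal run time, "
          f"credit/hour x{boost:.2f}")
```

On these numbers the jobs took roughly 63% to 69% of the usual time, which is why the fixed credit looks generous.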
____________

Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 80027 - Posted: 24 Jan 2008, 14:20:33 UTC - in response to Message 80024.

My timing always sucks... I am only up to 779... :-(
____________
Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1,000,560
RAC: 0
Message 80029 - Posted: 24 Jan 2008, 14:34:46 UTC - in response to Message 80027.
Last modified: 24 Jan 2008, 14:38:58 UTC

Brian, here's a look at my Mobile AMD64 3700 laptop's WUs using Windows and the work done so far:

Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 80031 - Posted: 24 Jan 2008, 15:20:17 UTC - in response to Message 80029.

Yeah yeah... rub it in... You got the credit boost from going above 799 and then the performance boost by going to 4.26... :-P on you too...

____________
Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1,000,560
RAC: 0
Message 80032 - Posted: 24 Jan 2008, 15:42:00 UTC
Last modified: 24 Jan 2008, 16:14:37 UTC

Well, to be honest, I hadn't looked at the credits for the 800's until you mentioned it.

Calculating Credit/hour with the benchmark means it should get 14.16/hour.

After 407 Rosetta WUs (recent app), this host got 12.79/hour avg.
After 125 stock 5.27 Seti WUs, it got 15.21/hour avg.
With four 4.15 WUs <800, it got 16.225/hour, but with >800 and 4.15 it got 28.65/hour. And with >800 AND 4.26 it yields 33.1125/hour. WOW

OK, now in fairness/full disclosure: the other project I've recently run was Boinc Simap, and this host is getting an avg. of 22.45/hour there.
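
The comparison above can be made explicit with a few lines of arithmetic. This is only a sketch using the credit/hour figures quoted in this post; the dictionary labels and the helper `ratio_to` are invented for illustration.

```python
# Credit/hour figures quoted in the post above (one Mobile AMD64 3700 host).
BENCHMARK_RATE = 14.16  # credit/hour expected from the benchmark

observed = {
    "Rosetta": 12.79,
    "Seti 5.27 (stock)": 15.21,
    "Einstein 4.15, <800": 16.225,
    "Einstein 4.15, >800": 28.65,
    "Einstein 4.26, >800": 33.1125,
    "Simap": 22.45,
}


def ratio_to(baseline: float, rates: dict[str, float]) -> dict[str, float]:
    """Express each credit/hour figure as a multiple of the baseline rate."""
    return {name: rate / baseline for name, rate in rates.items()}


for name, ratio in ratio_to(BENCHMARK_RATE, observed).items():
    print(f"{name:22s} {ratio:.2f}x benchmark")

# Effect of the >800 data files (same app), and of the 4.26 app (same files).
above_vs_below = observed["Einstein 4.15, >800"] / observed["Einstein 4.15, <800"]
app_speedup = observed["Einstein 4.26, >800"] / observed["Einstein 4.15, >800"]
print(f">800 vs <800 (4.15): {above_vs_below:.2f}x")
print(f"4.26 vs 4.15 (>800): {app_speedup:.2f}x")
```

With these inputs the >800 WUs earn credit at about 1.77 times the rate of the <800 ones under the same app, and 4.26 adds roughly another 16% on top.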

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3612
Credit: 128,657,615
RAC: 52,314
Message 80033 - Posted: 24 Jan 2008, 15:43:22 UTC
Last modified: 24 Jan 2008, 15:48:00 UTC

The data files currently on Einstein@home for 800Hz and above (h1_0800.0_S5R2* / l1_0800.0_S5R2*) are wrong. While we are generating the correct ones, we have stopped generating workunits for 800Hz and above.

We intend to let the few thousand WUs that are already in the database and point to the wrong files simply run out. The ones on the boundary (that have 0799.5 files as well as 0800.0) will error out right at the beginning when trying to read the files ("error in SFT sequence"), with no CPU time wasted. The ones above 800Hz that are already in the database will run shorter than the assigned credit would suggest, because the run time, and thus the credit, was estimated based on correct data files. If we simply cancelled these workunits, people who have already completed such a task would get no credit at all for it, so I decided to err on the side of generosity and let them run.

The current WU generator will only generate new WUs below 800Hz. There are ~300,000 left to be generated, which should be work for the project for about a week in total. During that time we will generate correct data files and set up a new WU generator for the work of 800Hz and above.
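
As a back-of-the-envelope check on that estimate, the sketch below turns "~300,000 WUs in about a week" into an implied project throughput. It assumes "about a week" means 7 days and counts workunits rather than individual tasks; the function name is hypothetical.

```python
REMAINING_WUS = 300_000   # "~300,000 left to be generated"
DAYS_OF_WORK = 7          # assumption: "about a week" taken as 7 days


def implied_wus_per_day(remaining: int, days: float) -> float:
    """Workunits the project must get through per day to finish in `days`."""
    return remaining / days


rate = implied_wus_per_day(REMAINING_WUS, DAYS_OF_WORK)
print(f"Implied throughput: about {rate:,.0f} WUs/day")

# Same estimate using the 6.2 days shown on the server status page.
status_rate = implied_wus_per_day(REMAINING_WUS, 6.2)
print(f"Using 6.2 days: about {status_rate:,.0f} WUs/day")
```

Either way the implied rate is in the range of 40,000 to 50,000 workunits per day, so the 6.2-day display and the "about a week" estimate are broadly consistent.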

So the second half of the S5R3 run (currently called S5R3b internally) should start early next week. The new tasks will run as long as estimated and thus will get the same credit we currently give to the ones with the same base frequency (but wrong data files).

Brian, we are considering your proposal to extend the deadline for these new WUs.

BM

Mats Nilsson
Joined: 10 Dec 05
Posts: 94
Credit: 8,987,237
RAC: 6,702
Message 80034 - Posted: 24 Jan 2008, 15:49:57 UTC

Will the current app handle these new WUs?
____________

Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 3612
Credit: 128,657,615
RAC: 52,314
Message 80035 - Posted: 24 Jan 2008, 15:53:37 UTC - in response to Message 80034.

The new workunits will reference the same apps. No change there.

BM
Brian Silvers
Joined: 26 Aug 05
Posts: 782
Credit: 282,700
RAC: 0
Message 80036 - Posted: 24 Jan 2008, 16:26:40 UTC - in response to Message 80033.


Thanks... The speed increase from 4.26 is definitely appreciated and would probably reduce the number of tasks that miss the deadline by only a couple of days, as it appears to be 10-20% faster, depending on hardware. I guess it all will depend on how long the new results take...

Anyway, as for the boundary tasks, do you know if all of those have already been distributed? Since they fail very quickly, any host that gets them will likely be driven down to only 1/day quota...

____________

This material is based upon work supported by the National Science Foundation (NSF) under Grants PHY-1104902, PHY-1104617 and PHY-1105572 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2016 Bruce Allen