Constant WU restarts and "Other"

Joe
Joined: 24 Jan 08
Posts: 32
Credit: 1624667
RAC: 939

I've been having this trouble

I've been having this trouble too. After reading this thread, I tried upping the CPU percentage to 100 as suggested, and restarted. Here are the messages since the restart:

2/17/08 6:57:57 PM||Starting BOINC client version 5.10.30 for windows_intelx86
2/17/08 6:57:58 PM||log flags: task, file_xfer, sched_ops
2/17/08 6:57:58 PM||Libraries: libcurl/7.17.1 OpenSSL/0.9.8e zlib/1.2.3
2/17/08 6:57:58 PM||Data directory: C:\PROGRAM FILES\BOINC
2/17/08 6:57:59 PM||Processor: 1 GenuineIntel x86 Family 15 Model 1 Stepping 2 [x86 Family 15 Model 1 Stepping 2]
2/17/08 6:57:59 PM||Processor features: fpu sse sse2 mmx
2/17/08 6:57:59 PM||OS: Microsoft Windows 98: SE, (04.10.2222.00)
2/17/08 6:57:59 PM||Memory: 255.46 MB physical, 500.00 MB virtual
2/17/08 6:57:59 PM||Disk: 39.99 GB total, 30.44 GB free
2/17/08 6:57:59 PM||Local time is UTC -8 hours
2/17/08 6:57:59 PM|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 455076; location: (none); project prefs: default
2/17/08 6:57:59 PM|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 1098855; location: (none); project prefs: default
2/17/08 6:57:59 PM||General prefs: from Einstein@Home (last modified 17-Feb-2008 18:33:10)
2/17/08 6:57:59 PM||Host location: none
2/17/08 6:57:59 PM||General prefs: using your defaults
2/17/08 6:57:59 PM||Reading preferences override file
2/17/08 6:57:59 PM||Preferences limit memory usage when active to 191.59MB
2/17/08 6:57:59 PM||Preferences limit memory usage when idle to 191.59MB
2/17/08 6:57:59 PM||Preferences limit disk usage to 5.59GB
2/17/08 6:58:47 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:18:15 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:27:34 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:36:04 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:52:06 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426

I'm also having a tad of trouble with the WCG part, but I'm talking to them about it. Does anybody have any more suggestions about the constant restarts? If it helps, it's only been the last day or so.
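To put numbers on how often the task restarts, the log above can be parsed for its "Restarting task" lines. The following is only a minimal Python sketch: it assumes the `date|project|message` layout seen in the messages above, and the function names `restart_times` and `restart_gaps_minutes` are made up for illustration.

```python
from datetime import datetime

# The five restart lines copied from the message log above.
LOG = """\
2/17/08 6:58:47 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:18:15 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:27:34 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:36:04 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
2/17/08 7:52:06 PM|Einstein@Home|Restarting task h1_0783.75_S5R2__120_S5R3a_1 using einstein_S5R3 version 426
"""


def restart_times(log_text):
    """Return the timestamp of every 'Restarting task' message (hypothetical helper)."""
    times = []
    for line in log_text.splitlines():
        # Assumed layout: timestamp | project name | message text
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[2].startswith("Restarting task"):
            times.append(datetime.strptime(parts[0], "%m/%d/%y %I:%M:%S %p"))
    return times


def restart_gaps_minutes(log_text):
    """Return the minutes elapsed between consecutive restart messages."""
    times = restart_times(log_text)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in restart_gaps_minutes(LOG):
        print(f"{gap:.1f} minutes between restarts")
```

Run against a longer copy of the message log, this would show whether the gaps between restarts change after adjusting the CPU percentage.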

stewjack
Joined: 4 Mar 06
Posts: 17
Credit: 1168109
RAC: 0

RE: I've been having this

Message 78439 in response to message 78438

Quote:
I've been having this trouble too. After reading this thread, I tried upping the CPU percentage to 100 as suggested, and restarted.


I am the originator of this thread, but I have already detached from Einstein. However, I will keep monitoring this thread for a while.

My restart problem was not apparent when I had my CPU load set to 100%. It became apparent, and got progressively worse, as I increased the throttling (i.e., decreased the CPU load). At 50% CPU load I was only completing 3 checkpoints an hour. At 100% CPU load I was getting checkpoints about every two minutes, and no restart messages.

I do run WCG (dddt project only) and Rosetta, but I have had no problems with either of them.

About the Checkpoints
I have been crunching Rosetta for about two years. Rosetta has widely varying checkpoints and I have set up a cc_config.xml file to display checkpointing notices. I only mention this because my BOINC message output showed [checkpoint_debug] messages and yours did not. You can ignore that fact. We DO both get the same error messages.
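For readers who want the same checkpointing notices, a cc_config.xml with the relevant log flag can be generated as below. This is a sketch under assumptions, not the exact file used in this thread: the flag name `checkpoint_debug` is taken from the message tag mentioned above, `task` is one of the flags listed in the startup log, and the helper `build_cc_config` is a made-up name. The file is normally placed in the BOINC data directory, but check that against your own installation.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build a cc_config.xml string that turns on the given log flags.

    Hypothetical helper; the layout (cc_config > log_flags > flag) is the
    usual BOINC client configuration structure, assumed here.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # A value of 1 enables the flag.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["task", "checkpoint_debug"])

if __name__ == "__main__":
    # Save the printed text as cc_config.xml, then restart the client
    # so that it picks up the new log flags.
    print(xml_text)
```

With the flag enabled, the message log should gain [checkpoint_debug] lines, which makes it possible to compare checkpoint timing against the restart messages.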

My problems with Einstein started out with my first work unit. You don't mention how long you have been running BOINC or Einstein or WCG. That information could be important.

It's not clear that our problems are related. In my case, when I ran ageless's cc_config.xml file, we did get some information about BOINC noticing the need to restart, but nothing about the cause of the original problem.

Good luck,

Jack

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: It's not clear that our

Message 78440 in response to message 78439

Quote:
It's not clear that our problems are related. In my case, when I ran ageless's cc_config.xml file, we did get some information about BOINC noticing the need to restart, but nothing about the cause of the original problem.


I've sent all your information through to the BOINC developers, including your results with 5.10.42, so I am waiting to hear what they think is causing it.

But it could, indeed, also have to do with how the Einstein app writes its checkpoints.
