Killing Processes with Manual Refresh

Rauch Christian

Joined: 1 Mar 07

Posts: 8

Credit: 109429

RAC: 0

RE: This is one of my

24 Apr 2007 5:39:50 UTC

Message 62771 in response to message 62767


Quote:

This is one of my failed WUs

http://einsteinathome.org/task/83371387

I see the error you mentioned, though I can't see this one in my WUs.

Quote:

Also only running the manager in advanced view.

I was able to complete one S5R2 run, though. That one was on a server that doesn't even have OpenGL installed... so, just speculating here. I tried to disable graphics by moving the *.so away, but BOINC is just too smart and re-installs it from the server.

This machine completed 7 S5R2 WUs before trashing WUs.

Quote:

Anyway I don't need the graphics, how can I disable it for good in a safe way?
CU
BRM

Sorry, don't know either.

Regards,
Chris

Erik

Joined: 14 Feb 06

Posts: 2815

Credit: 2645600

RAC: 0

RE: Anyway I don't need the

24 Apr 2007 7:15:28 UTC

Message 62772 in response to message 62767


Quote:

Anyway I don't need the graphics, how can I disable it for good in a safe way?

CU

BRM

If possible, install BOINC as a service instead of a single user or shared installation.

bonnyscott

Joined: 4 Feb 06

Posts: 6

Credit: 5373362

RAC: 0

RE: RE: I will update to

24 Apr 2007 8:48:33 UTC

Message 62773 in response to message 62763


Quote:

Quote:

I will update to 5.8.17 now and see whether this helps.

Did so too. Now I'm waiting for some WUs to complete; let's see whether manual reporting crashes the running processes again.

The glitch does _not_ happen only with the newer BOINC versions.

Got the same mess here on my laptop, which still runs BOINC 5.4.9, as well as on an X2 running 5.8.15, both with Linux.
No debug output other than "caught sigabrt".
And that after several dozen ksec of number crunching.
Really "enchanting"!

And I _definitely_ didn't fiddle with the "update" button at the time the X2 WUs went up in smoke!

Regards, Bonnyscott

Kimegi Tepeex

Joined: 1 May 05

Posts: 8

Credit: 250148

RAC: 0

I have got several WUs

25 Apr 2007 19:32:04 UTC

Message 62774


I have had several WUs terminated in error (process got signal 11) since S5R2:

On one computer (AMD Duron 1800, Linux 2.6.12-12mdk, BOINC 5.4.11) this result lost 8,000 seconds.
This computer has successfully crunched 2 WUs of +60,000 seconds after this first crash.

On a second one (AMD Duron 1200, Linux 2.6.18-1.2868.fc6, BOINC 5.8.17), two results here and here have each spent just a little bit more than one full day (+87,000 seconds) of hard work before being sadly killed :(

Maybe one occurred after a manual update, but I am sure this was not the case for the two others: BOINC did it by itself.

In either case, no graphics, no screensaver, no other application, just crunching (almost bare) blades on a shelf with others...

Any suggestions on how to keep these crashes from happening again would be greatly appreciated.
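
For anyone unsure what the numbers in these reports mean: on Linux, signal 11 is SIGSEGV (a segmentation fault), and the "caught sigabrt" mentioned earlier in the thread is SIGABRT (signal 6). A minimal Python sketch for looking up the name that belongs to a signal number follows. The helper name `signal_name` is made up for illustration and is not part of BOINC or the E@H application.

```python
import signal


def signal_name(number):
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return "unknown"


if __name__ == "__main__":
    # 6 and 11 are the two signals reported in this thread.
    for n in (6, 11):
        print(n, signal_name(n))
```

This only names the signal. It says nothing about why the science application received it.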

Charles Dennett

Joined: 22 Jan 05

Posts: 22

Credit: 45273

RAC: 9

Reattached a couple of my

26 Apr 2007 14:38:52 UTC

Message 62775


Reattached a couple of my systems several days ago. I've also seen this problem. Just a short while ago I changed the share EAH was getting and did a manual update. A WU that was paused (but still in memory) aborted when I did this. It's here. The system is running Fedora 5 with a 2.6.20 kernel and the 5.8.17 core client as supplied by the BOINC project (I used to compile my own). The processor is an AMD XP2600+.

There are a couple of other WUs that did the same thing in the past several days when I was adjusting the resource share and doing a manual update.

Charlie

taoran

Joined: 16 Nov 05

Posts: 1

Credit: 2072305

RAC: 0

I also got some "process got

27 Apr 2007 12:07:27 UTC

Message 62776


I also got some "process got signal 11" errors after I restarted my computer.
The OS is Red Hat Linux (kernel 2.6.9-5.ELsmp).

http://einsteinathome.org/task/83512190
http://einsteinathome.org/task/83519160
http://einsteinathome.org/task/83519653
http://einsteinathome.org/task/83567271
http://einsteinathome.org/task/83571545
http://einsteinathome.org/task/83573933
http://einsteinathome.org/task/83592717
http://einsteinathome.org/task/83604575
http://einsteinathome.org/task/83608648
http://einsteinathome.org/task/83628361

Mikie Tim T

Joined: 22 Jan 05

Posts: 105

Credit: 263777741

RAC: 0

The update glitch also

28 Apr 2007 3:15:59 UTC

Message 62777


The update glitch also happened on my Linux machine. It runs 5.8.8, so it doesn't seem to matter which version of the core client (CC) we're running.

bahur

Joined: 31 Mar 05

Posts: 1

Credit: 4629271

RAC: 0

Alas, our new compute

28 Apr 2007 14:46:38 UTC

Message 62778


Alas, our new compute cluster failed about 50 WUs in just one day. I run E@H for thermal stability assessment, and at first I thought the machine was overheating, but then my C2D at home started failing WUs. The problem with the Linux client is annoying in two ways: we are wasting CPU seconds for nothing, and the E@H project is slowed down because the WUs have to be recomputed...

Now I've switched back to the 64-bit 5.4.11 core client by Debian that used to work flawlessly before S5R2.

ohiomike

Joined: 4 Nov 06

Posts: 80

Credit: 6453639

RAC: 0

I also have had the "Process

28 Apr 2007 17:22:53 UTC

Message 62779


I have also had the "Process got signal 11" fault on one of my machines. It is strange in that I have 2 almost identical machines: BOINC 5.8.17, Linux 2.6.21, on AMD X2 CPUs. Both have been running fine for months; now one throws 4 errors in one day, then continues on fine. In my case it looks like it crashed both running tasks after a reboot (I rebooted that machine twice to do updates on other SW).

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 692146435

RAC: 9004

Hi! Is anybody getting

29 Apr 2007 0:38:54 UTC

Message 62780 in response to message 62779


Hi!

Is anybody getting these "signal 11" errors on systems that don't have OpenGL installed (e.g. servers without X11)?

My observation (though only from 13 WUs) is that E@H on Linux runs more reliably when graphics are disabled (either because OpenGL isn't installed at all or because you keep the client from loading libGL).
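
To help answer the question above, here is a rough Python sketch that reports whether a libGL can be located on a host at all. It uses the standard library's `ctypes.util.find_library`; the function name `opengl_library_present` is only illustrative. It assumes that "OpenGL installed" means libGL can be found by the system's normal library lookup, which may not match how the E@H application actually loads its graphics code.

```python
import ctypes.util


def opengl_library_present():
    """Return True if the system's library lookup can locate libGL."""
    # find_library returns a library name such as "libGL.so.1", or None.
    return ctypes.util.find_library("GL") is not None


if __name__ == "__main__":
    if opengl_library_present():
        print("libGL found on this host")
    else:
        print("libGL not found on this host")
```

A "not found" result would put the host in the same group as Host #1 below. A "found" result does not by itself mean the science application is using graphics.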

Host #1 (no libGL installed): 5 out of 6 WUs were completed (1 with an error other than "signal 11")

Host #2 (graphics disabled): 2 out of 2 WUs were completed

Host #3, while graphics were still enabled: 3 out of 3 WUs failed (with "signal 11")
Host #3, after disabling graphics: 3 out of 3 WUs were completed
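
The per-host counts above can be added up by graphics state with a few lines of Python. This is only a tally of the numbers as listed in this post (which sum to 14 WUs); the data layout and the function name `tally` are made up for illustration, and a sample this small proves nothing on its own.

```python
# Counts copied from the post: (label, graphics_enabled, completed, total)
hosts = [
    ("Host #1 (no libGL installed)", False, 5, 6),
    ("Host #2 (graphics disabled)", False, 2, 2),
    ("Host #3 (graphics enabled)", True, 0, 3),
    ("Host #3 (graphics disabled)", False, 3, 3),
]


def tally(rows, graphics_enabled):
    """Sum completed and total WUs over rows with the given graphics state."""
    completed = sum(c for _, g, c, _ in rows if g == graphics_enabled)
    total = sum(t for _, g, _, t in rows if g == graphics_enabled)
    return completed, total


if __name__ == "__main__":
    for enabled in (True, False):
        done, total = tally(hosts, enabled)
        label = "enabled" if enabled else "disabled or absent"
        print(f"graphics {label}: {done}/{total} WUs completed")
```

With graphics enabled that gives 0 of 3 completed, and with graphics disabled or absent 10 of 11.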

BRM
