Computation errors for months

free_outerrim
Joined: 24 Aug 05
Posts: 8
Credit: 556867
RAC: 0
Topic 193707

Hi folks!

I have a serious problem with einstein@home, again. With BOINC version 5.10.45 I have no problems with the current seti@home app (5.27), but einstein (S5R3_4.38) produces a computation error for each and every task it tries to process (code 38).

See the recent tasks.

It probably doesn't matter, but the system is Arch Linux with kernels from 2.6.23 to 2.6.25. BOINC is from the community repo.

How can I find out more about the reasons for that error?

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

Quote:

Hi folks!

I have a serious problem with einstein@home, again.
...
See the recent tasks.
...
How can I find out more about the reasons for that error?


I could only check this one (the link you gave is accessible only to you :-). The key to the problem seems to be: APP DEBUG: Application caught signal 8.
Perhaps you'll find more information in the BOINC Wiki or the BOINC FAQ Service. (There's also an Unofficial Wiki, but I don't have the URL :-)

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Michael Karlinsky
Joined: 22 Jan 05
Posts: 888
Credit: 23502182
RAC: 0

Heat and faulty memory are possible reasons too.

Any blue screens lately?

Regards,
Michael

free_outerrim
Joined: 24 Aug 05
Posts: 8
Credit: 556867
RAC: 0

Gundolf Jahn wrote:
Perhaps you'll find more information in the BOINC Wiki or the BOINC FAQ Service. (There's also an Unofficial Wiki, but I don't have the URL :-)

Splendid, neither error code 38 nor signal 8 is explained on this page. But I can tell that they exist... Maybe I'll find something if I take a closer look.
Sorry, I didn't know about the links. But you can find them via my nick anyway, I guess.

Michael Karlinsky wrote:
Any blue screens lately?


I've never had a BSoD on Linux, have you? I haven't had any kernel panics either. And since SETI works properly, I'd rather suspect something like a version incompatibility. Last time it was something like that.

Michael Karlinsky
Joined: 22 Jan 05
Posts: 888
Credit: 23502182
RAC: 0

Message 82055 in response to message 82054

Quote:
Michael Karlinsky wrote:
Any blue screens lately?

I've never had a BSoD on Linux, have you? I haven't had any kernel panics either. And since SETI works properly, I'd rather suspect something like a version incompatibility. Last time it was something like that.

Tried to make a snappy remark. Failed, as soon as I realised you are using Linux too....

Michael

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Message 82056 in response to message 82054

free_outerrim wrote:
Michael Karlinsky wrote:
Any blue screens lately?

I've never had a BSoD on Linux, have you? I haven't had any kernel panics either. And since SETI works properly, I'd rather suspect something like a version incompatibility. Last time it was something like that.


You could paint them blue. ;-)

Perhaps this discussion will help you out: http://www.nabble.com/CONFIG_PREEMPT-causes-corruption-of-application's-FPU-stack-td17293854.html
Let me know and I'll adapt it into the FAQs.

free_outerrim
Joined: 24 Aug 05
Posts: 8
Credit: 556867
RAC: 0

Message 82057 in response to message 82056

Ageless wrote:
Perhaps this discussion will help you out: http://www.nabble.com/CONFIG_PREEMPT-causes-corruption-of-application's-FPU-stack-td17293854.html
Let me know and I'll adapt it into the FAQs.

Well, Arch Linux uses vanilla kernels...
But if my problem depended on the issue in this thread, I would have had it since April 2007; according to DistroWatch, that is when the 2.6.20 kernel was introduced into the distro. But that is not the case. I don't know exactly when it first occurred, but it must be less than 4 months ago. I guess the maintainers hadn't used this CONFIG_PREEMPT option before. But I don't compile the kernel myself, so I'm not so sure about it.

The bug tracker doesn't show the issue yet. I'll start a thread there. If that brings any additional information, I'll post it here.

Paper Moon
Joined: 12 Apr 08
Posts: 14
Credit: 292052
RAC: 0

Message 82058 in response to message 82054

Quote:
Splendid, neither error code 38 nor signal 8 is explained on this page. But I can tell that they exist... Maybe I'll find something if I take a closer look.

Always start looking @home: man 7 signal

Signal 8 is SIGFPE (floating point exception). To find out more about the FPU status word value (0xb8c1 == 0b1011_1000_1100_0001), look here. As to the actual cause: heat (OC?) is a good suspect.
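
As a rough illustration of reading such a status word, here is a minimal sketch in POSIX shell. The bit names (IE, DE, ZE, OE, UE, PE, SF, ES, the TOP field in bits 11-13 and C1 in bit 9) follow the x87 status word layout in Intel's manuals; the function name decode_fpu_sw is made up for this example and is not part of BOINC or the Einstein@Home app.

```shell
# Decode an x87 FPU status word into its exception flags and stack fields.
# decode_fpu_sw is a hypothetical helper; variables are left global so the
# function stays portable to plain POSIX sh.
decode_fpu_sw() {
    sw=$(( $1 ))                 # accepts hex input such as 0xb8c1
    flags=""
    bit=0
    # Bits 0..7 in order: invalid op, denormal, zero divide, overflow,
    # underflow, precision, stack fault, error summary.
    for name in IE DE ZE OE UE PE SF ES; do
        if [ $(( (sw >> bit) & 1 )) -eq 1 ]; then
            flags="${flags:+$flags }$name"
        fi
        bit=$(( bit + 1 ))
    done
    top=$(( (sw >> 11) & 7 ))    # top-of-stack pointer, bits 11-13
    c1=$(( (sw >> 9) & 1 ))      # with SF set: 1 = overflow, 0 = underflow
    echo "flags=${flags} top=${top} c1=${c1}"
}

decode_fpu_sw 0xb8c1
```

For 0xb8c1 this prints flags=IE SF ES top=7 c1=0, i.e. an invalid-operation exception with the stack-fault bit set, and C1=0 marks it as a stack underflow rather than an overflow. That is at least consistent with a disturbed FPU register stack, though on its own it does not say what caused it.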

Regards,
Waldi

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686138814
RAC: 555564

Message 82059 in response to message 82057

Quote:
Ageless wrote:
Perhaps this discussion will help you out: http://www.nabble.com/CONFIG_PREEMPT-causes-corruption-of-application's-FPU-stack-td17293854.html
Let me know and I'll adapt it into the FAQs.

Well, Arch Linux uses vanilla kernels...
But if my problem depended on the issue in this thread, I would have had it since April 2007; according to DistroWatch, that is when the 2.6.20 kernel was introduced into the distro. But that is not the case. I don't know exactly when it first occurred, but it must be less than 4 months ago. I guess the maintainers hadn't used this CONFIG_PREEMPT option before. But I don't compile the kernel myself, so I'm not so sure about it.

The bug tracker doesn't show the issue yet. I'll start a thread there. If that brings any additional information, I'll post it here.

So it would be interesting to know if your kernel uses the CONFIG_PREEMPT setting.

Maybe a .config or similar ASCII file can be found in /usr/src/linux or a directory with a similar name? Some Linux distributions also provide /proc/config.gz, so you can try

cat /proc/config.gz | gunzip - | grep CONFIG_PREEMPT
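
Since the config can live in different places, a small helper that tries several candidates may be handy. This is only a sketch: the function name show_preempt_config is invented here, /proc/config.gz exists only if the kernel was built with CONFIG_IKCONFIG_PROC, and the /boot/config-$(uname -r) path in the example call is an assumption that holds on some distributions but not all.

```shell
# Print the CONFIG_PREEMPT* lines from the first readable kernel config
# among the given candidate paths. Files ending in .gz are decompressed.
# show_preempt_config is a hypothetical helper name.
show_preempt_config() {
    for cfg in "$@"; do
        [ -r "$cfg" ] || continue
        case "$cfg" in
            *.gz) gunzip -c "$cfg" | grep 'CONFIG_PREEMPT' ;;
            *)    grep 'CONFIG_PREEMPT' "$cfg" ;;
        esac
        return 0
    done
    echo "no readable kernel config found" >&2
    return 1
}

# Example call (the candidate paths are assumptions, adjust to your system):
# show_preempt_config /proc/config.gz "/boot/config-$(uname -r)" /usr/src/linux/.config
```

The function stops at the first file it can read, so list the config of the running kernel first.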

CU
Bikeman

free_outerrim
Joined: 24 Aug 05
Posts: 8
Credit: 556867
RAC: 0

Message 82060 in response to message 82059

Paper Moon wrote:
Always start looking @home: man 7 signal


Ok, I hope I'll remember this next time, thanks.

Bikeman wrote:

So it would be interesting to know if your kernel uses the CONFIG_PREEMPT setting.

Maybe a .config or similar ASCII file can be found in /usr/src/linux or a directory with a similar name?


How about that:

$ grep PREEMPT /usr/src/linux-2.6.25-ARCH/.config 
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_DEBUG_PREEMPT is not set

That may be it, right?
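
Reading such output by eye is easy to get wrong because CONFIG_PREEMPT_NOTIFIERS also matches a plain grep for PREEMPT. The following sketch, with the invented function name preempt_model, reports which of the three preemption choices of a 2.6-era kernel config is set; it assumes the usual .config syntax where an enabled option appears as NAME=y.

```shell
# Report the preemption model selected in a kernel .config file.
# preempt_model is a hypothetical helper; $1 is the path to the config.
preempt_model() {
    if grep -q '^CONFIG_PREEMPT=y$' "$1"; then
        echo "full"          # fully preemptible kernel
    elif grep -q '^CONFIG_PREEMPT_VOLUNTARY=y$' "$1"; then
        echo "voluntary"     # voluntary preemption points only
    elif grep -q '^CONFIG_PREEMPT_NONE=y$' "$1"; then
        echo "none"          # no forced preemption
    else
        echo "unknown"
    fi
}

# Example call (path taken from the post above):
# preempt_model /usr/src/linux-2.6.25-ARCH/.config
```

The patterns are anchored at both ends, so CONFIG_PREEMPT_NOTIFIERS=y does not count as CONFIG_PREEMPT=y.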

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Message 82061 in response to message 82060

Quote:
How about that:
$ grep PREEMPT /usr/src/linux-2.6.25-ARCH/.config 
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_DEBUG_PREEMPT is not set

That may be it, right?


It may, if you know how to build a kernel and this fixes the problem for you.
It may not be an option in all distro kernels; that's something that needs to be tested as well.

And the applications need to be checked, as this is only another workaround hinting at the real problem.
