Bernd Machenschalk Forum moderator Project developer
Joined: Oct 15 04 Posts: 2033 ID: 2 Credit: 21,971,104 RAC: 41,805
On Einstein@Home we have "sticky" files, i.e. files that are used for several Tasks and thus stay on the participants' machines for quite some time (some of them may be more than a year old).
Our S5R2 App suffers from various Client Errors, some of which we could trace to corrupted files, in particular the ones that are pretty old ("earth" and "sun" ephemeris files). We do some sanity checking of the files in the App, but all it can do if it finds a broken file is to terminate with a client error.
Therefore we have now enabled verify_files_on_app_start. With BOINC Core Client version 5.6 or newer the Client will check the consistency (length, md5 sum) of the files every time before starting a new task (and should download a file again if it's found to be corrupted).
So people observing mysterious client errors (especially if the first one happens after restarting in the middle of a task and every following task terminates with a client error in the very first seconds) are encouraged to upgrade to a recent Client.
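For anyone who wants to check a suspicious file by hand, the same kind of consistency test (length first, then md5 sum) can be sketched in a few lines of Python. This is only an illustration of the idea, not the Client's actual code; the function name file_matches and its parameters are made up for this example, and the expected size and checksum would have to come from whatever source you trust.

```python
import hashlib
import os


def file_matches(path, expected_size, expected_md5, chunk_size=1 << 20):
    """Return True if the file has the expected length and MD5 checksum.

    Hypothetical helper for illustration; not part of BOINC.
    """
    # Cheap check first: a wrong length already proves corruption.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large data files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

A file that fails either check would be a candidate for deleting and downloading again.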
"Signature verification failed for einstein_S5R2_4.24_windows_intelx86.exe"
All I changed was the "AuthenticAMD" to "AuthenticABC" in the file so I don't get that 30% penalty running an AMD CPU versus an Intel. What a waste of computing power, resources and energy when I lose 1/3 of it. Could you guys maybe fix that from your end, since you have tied our hands now with this md5 verification?
____________
ID: 72293 |
Bernd Machenschalk Forum moderator Project developer
Joined: Oct 15 04 Posts: 2033 ID: 2 Credit: 21,971,104 RAC: 41,805
The main reason for the "penalty" has been eliminated in App version 4.24. Manually patching an App 4.24 or newer should give an effect below 5% in speed. See the old 4.24 thread.
If you think you need to, I suggest patching the current Beta 4.30 and running it with the app_info.xml of the package on anonymous platform.
I have 10 machines running full time. All are running XP except one runs Vista. When that machine restarts (Microsoft automatic update or power failure) the current Einstein task fails with a client error, none of the others does. Is this a problem with Vista, or BOINC, or what?
____________
A combination, I think. When Vista does its fast shutdown prior to a reboot, it doesn't allow BOINC to write its complete state to disk and so results can go missing or corrupt after the reboot.
S. Cowell
Joined: Nov 22 05 Posts: 3 ID: 124514 Credit: 241,939 RAC: 701
Client Error. I have been running Einstein for some time without problems. However, it has recently stopped crediting me with units after completing W.U.'s. The Results page:- http://einstein.phys.uwm.edu/results.php?userid=124514 indicates a Client Error.
Hi,
I am running Einstein @ home on my Vista laptop, BOINC client 5.10.30, and have had my last few Einstein results fail because of client error. Checking the messages, I found that the "Output file" is "absent".
I had had some similar problem with a variety of BOINC projects so I let the work units I had finish up, installed the latest version (I had been using 5.10.28) and reset each project, hoping that would fix things.
Any ideas?
Thanks
Matt
PS -- I just installed the BOINC FAQ reg entry about preventing too-fast shutdowns, but I don't know yet whether it will make a difference.
12/14/2007 5:54:32 PM||Starting BOINC client version 5.10.30 for windows_intelx86
12/14/2007 5:54:32 PM||log flags: task, file_xfer, sched_ops
12/14/2007 5:54:32 PM||Libraries: libcurl/7.17.1 OpenSSL/0.9.8e zlib/1.2.3
12/14/2007 5:54:32 PM||Data directory: C:\Program Files\BOINC
12/14/2007 5:54:32 PM||Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor TK-53 [x86 Family 15 Model 104 Stepping 1]
12/14/2007 5:54:32 PM||Processor features: fpu tsc pae nx sse sse2 pni 3dnow mmx
12/14/2007 5:54:32 PM||OS: Microsoft Windows Vista: Home Edition, (06.00.6000.00)
12/14/2007 5:54:32 PM||Memory: 958.00 MB physical, 2.12 GB virtual
12/14/2007 5:54:32 PM||Disk: 66.42 GB total, 18.81 GB free
12/14/2007 5:54:32 PM||Local time is UTC -8 hours
12/14/2007 5:55:06 PM|Einstein@Home|Computation for task h1_0595.30_S5R2__149_S5R3a_2 finished
12/14/2007 5:55:06 PM|Einstein@Home|Output file h1_0595.30_S5R2__149_S5R3a_2_0 for task h1_0595.30_S5R2__149_S5R3a_2 absent
12/14/2007 5:56:12 PM|Einstein@Home|Sending scheduler request: To fetch work. Requesting 2858 seconds of work, reporting 1 completed tasks
12/14/2007 5:56:16 PM|Einstein@Home|Scheduler request succeeded: got 1 new tasks
12/14/2007 5:56:16 PM|Einstein@Home|Got server request to delete file h1_0595.30_S5R2
12/14/2007 5:56:16 PM|Einstein@Home|Got server request to delete file l1_0595.30_S5R2
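If the message log is long, the "absent" lines can be picked out automatically. Below is a rough Python sketch that assumes the pipe-separated layout seen in the messages above (timestamp|project|message). The function name absent_outputs is made up for this example, and the message wording may differ between client versions.

```python
import re

# Matches the message text seen in the log above; other client versions
# may word this differently (assumption).
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(lines):
    """Return (timestamp, project, task, output_file) for each 'absent' message."""
    hits = []
    for line in lines:
        # Layout assumed from the log above: timestamp|project|message
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue
        timestamp, project, message = parts
        match = ABSENT.search(message)
        if match:
            hits.append((timestamp, project, match.group(2), match.group(1)))
    return hits
```

Copying the messages into a text file and passing its lines to absent_outputs would then list every affected task.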
____________
ID: 78281 |
KSMarksPsych Forum moderator
Joined: Oct 15 05 Posts: 2349 ID: 114819 Credit: 422,629 RAC: 18
Matt,
Try moving the entire BOINC directory out of C:\Program Files\.
Completely shut down BOINC. Uninstall BOINC. Move the directory to almost anywhere else (I put it right in C:\ so that it's C:\BOINC\). Reinstall BOINC pointing the installer to the new location.
The fast shut down issue can be minimized with the registry hack you read about or by remembering to completely shut down BOINC before shutting down your computer.
____________
Kathryn :o) The BOINC FAQ Service The Unofficial BOINC Wiki The Trac System
More BOINC information than you can shake a stick of RAM at.
That seems to have worked! Thank you for your help! (I am delayed in responding as I was watching the performance for a while and looking for errors.)
As I write this, I realize that you have helped me several times before through the general Boinc help site, so thanks *again*.
~~ Matt
____________
ID: 78762 |
KSMarksPsych Forum moderator
Joined: Oct 15 05 Posts: 2349 ID: 114819 Credit: 422,629 RAC: 18
I'm getting that "Output file is absent" error message. Running Ubuntu Linux 8.04, and BOINC client/manager 5.10.45. SETI was running fine, but since attaching to Einstein, it hasn't run any files.
EDIT: Correction, SETI had run a file, but was showing only 22 seconds of computing, and .24% complete.
ID: 88097 |
Bikeman Forum moderator Volunteer developer
Joined: Aug 28 06 Posts: 2056 ID: 210833 Credit: 5,079,839 RAC: 9,661
Both of the results from your P4 crashed before completing, but for different reasons. This and the problems with SETI together seem to suggest that there might be a hardware problem with that particular PC.
CU
Bikeman
SETI has actually worked fine since this posting. I think it's completed 4 tasks. I've added my notebook to Einstein, so if the P4 can't complete any tasks for E@H, then I'll probably detach and run the notebook only. If I have further problems I will let you know.
ID: 88130 |
Clay M. Davis
Joined: Sep 8 07 Posts: 3 ID: 280348 Credit: 117,637 RAC: 379
Since late August seeing about 75% failure due to compute error. This is on 3 machines 1 Vista, 1 XP, 1 Win 2000. Upgraded Vista machine to latest BOINC in last 18 hrs. Still had 2 failed results on this machine so far.
Ideas?
Clay
Team NAVY
ID: 88190 |
Bikeman Forum moderator Volunteer developer
Joined: Aug 28 06 Posts: 2056 ID: 210833 Credit: 5,079,839 RAC: 9,661
Hi!
Are you using "CPU throttling", that is, have you configured BOINC to use only a certain percentage of idle cycles (see the web preferences, setting "Use at most XXX percent of CPU time")? BOINC seems to have difficulties with this setting; I would suggest setting it to "100%" to see if the problem goes away.
CU
Bikeman
____________
ID: 88232 |
Clay M. Davis
Joined: Sep 8 07 Posts: 3 ID: 280348 Credit: 117,637 RAC: 379
Interesting, will give that a go. I have machines set to about 75% to cut heat. TNX
Clay
ID: 88264 |
Bikeman Forum moderator Volunteer developer
Joined: Aug 28 06 Posts: 2056 ID: 210833 Credit: 5,079,839 RAC: 9,661
For multicores, you could limit the number of CPUs used per host instead if you feel heat or responsiveness is an issue.
Hi,
I am running Einstein @ home on my Vista laptop, BOINC client 5.10.30, and have had my last few Einstein results fail because of client error. Checking the messages, I found that the "Output file" is "absent".
Hi,
I have the same issue but I'm running BOINC 5.10.45 on XP SP3. Since I've installed SP3 I experience this problem of the missing output file on the Einstein project only, all other projects are running fine.
Would you recommend doing the same procedure as you've done so for the above Vista machine?
Thanks a lot!
Yes. As far as I have heard, SP3 brings the security features of Vista, which cause the problem.
Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)
ID: 88851 |
Clay M. Davis
Joined: Sep 8 07 Posts: 3 ID: 280348 Credit: 117,637 RAC: 379
Bikeman,
Looks like that was part of the fix. Ran throttles to 100% and made some other adjustments back toward default values and the computation errors ceased. Note also that Vista has too rapid a shutdown cycle. So I upped the write to disk to 1:45 sec. This permits better recovery should the system trip off line. So far so good. Temperatures on the large systems are tolerable at about 40 - 45C.
Clay
...
Note also that Vista has too rapid a shutdown cycle. So I upped the write to disk to 1:45 sec. This permits better recovery should the system trip off line...
You could also lengthen the shutdown cycle, see this FAQ.
Regards,
Gundolf
____________
Computers aren't everything in life. (Just kidding)
ID: 88874 |
Gunnar
Joined: May 18 07 Posts: 11 ID: 262986 Credit: 8,935 RAC: 1
I'm running BOINC 5.10.28 on a laptop with Windows XP OS with SP 2. Over a year, it has done its work properly, but just a few minutes ago I saw this:
05.10.2008 16:36:26|Einstein@Home|Task h1_1052.65_S5R4__940_S5R4a_1 exited with zero status but no 'finished' file
05.10.2008 16:36:26|Einstein@Home|If this happens repeatedly you may need to reset the project.
05.10.2008 16:37:09|Einstein@Home|Task h1_1052.65_S5R4__940_S5R4a_1 exited with zero status but no 'finished' file
05.10.2008 16:37:09|Einstein@Home|If this happens repeatedly you may need to reset the project.
05.10.2008 16:37:10|Einstein@Home|Restarting task h1_1052.65_S5R4__940_S5R4a_1 using einstein_S5R4 version 604
What can I do? Do I really have to reset the project - means to remove it, as far as I understand it - from my computer? What about my account then?
Kind regards,
Gunnar
____________
ID: 89545 |
Bikeman Forum moderator Volunteer developer
Joined: Aug 28 06 Posts: 2056 ID: 210833 Credit: 5,079,839 RAC: 9,661
Hi!
Please press the "Update" button, which will transmit the output from the failed jobs to the server immediately so we can have a look.
CU
Bikeman
____________
ID: 89546 |
Gunnar
Joined: May 18 07 Posts: 11 ID: 262986 Credit: 8,935 RAC: 1
Done, did you have a look already?
ID: 89552 |
Bikeman Forum moderator Volunteer developer
Joined: Aug 28 06 Posts: 2056 ID: 210833 Credit: 5,079,839 RAC: 9,661
Ah yes, it's the lockfile problem again :-(
Can't acquire lockfile - exiting
FILE_LOCK::unlock(): close failed.: No such file or directory
Is it possible that you have configured BOINC to use less than 100% of the CPU cycles?
There is a bug in BOINC itself (not in the einstein@home app) that can cause problems with this setting. Most of the time, setting the threshold to 100% CPU in your preferences (in the web frontend here) solves the problem.
The BOINC devs are working on a fix but I guess this will take some more time.
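To illustrate what the "Can't acquire lockfile" message is about in general: a process tries to take an exclusive lock, waits and retries if somebody else holds it, and finally gives up. The Python sketch below shows that pattern using exclusive file creation. It is not how BOINC or the Einstein@Home app implements its lock; the names acquire_lockfile and release_lockfile, the retry count and the wait time are all made up for this example.

```python
import os
import time


def acquire_lockfile(path, retries=1, wait_seconds=1.0):
    """Try to create `path` exclusively; return a file descriptor or None.

    Hypothetical helper: O_EXCL makes creation fail if the file already
    exists, i.e. if another process currently holds the lock.
    """
    for attempt in range(retries + 1):
        try:
            return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if attempt < retries:
                time.sleep(wait_seconds)  # wait before the next attempt
    return None  # caller should exit rather than run without the lock


def release_lockfile(fd, path):
    """Close the descriptor and remove the lockfile so others can acquire it."""
    os.close(fd)
    os.remove(path)
```

If the process that holds the lock is stopped and restarted at an unlucky moment, a second instance can find the lock still taken and give up, which is the kind of failure described above.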
CU
Bikeman
____________
ID: 89556 |
Keith
Joined: Feb 11 05 Posts: 2 ID: 16710 Credit: 59,342 RAC: 459
Hi,
I've been running einstein@home on two computers (both of which are on more or less permanently) for years now. I rarely look at detailed statistics because if I see general progress being made on the BOINC statistics tab, then I am happy. But today I just chanced to look in detail at my einstein@home results for both computers and was somewhat disturbed to see that the majority of work units on both machines seem to fail with a client error (and have been doing so for some time). Please see below for the error information, but the gist seems to be that something is locking a file, various waits occur as retries are made, and then eventually the whole thing times out.
Clearly, with all of these exit(0)s going on, something seems to be locking a file einstein@home is trying to write to. I can only imagine that it is either Norton 360 antivirus or einstein@home itself. Both machines are running Vista 32 bit on Intel Core Duo processors. Both machines have two processors and I have been allowing up to 100% CPU on each processor, so I guess there is some possibility of a deadlock condition between two running instances of einstein@home?
Anyway, for the moment I have restricted BOINC so that it can only use one processor on each machine and I will see how that plays out. Any help or advice would be much appreciated though, because this problem has resulted in the loss or waste of many weeks of CPU time, which is very disappointing.
Many thanks
Keith
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
application '..\..\projects\einstein.phys.uwm.edu\einstein_S5R5_3.05_windows_intelx86_2.exe'.
Activated exception handling...
22:14:36 (7216): Can't acquire lockfile (32) - waiting 35s
22:15:11 (7216): Can't acquire lockfile (32) - exiting
22:15:11 (7216): Error: The process cannot access the file because it is being used by another process. (0x20)
2009-07-17 22:15:13.4570 [normal]: This program is published under the GNU General Public License, version 2
2009-07-17 22:15:13.4590 [normal]: For details see http://einstein.phys.uwm.edu/license.php
2009-07-17 22:15:13.4600 [normal]: This Einstein@home App was built at: Apr 10 2009 17:21:18
<core_client_version>6.2.18</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
Detected CPU type 2
execv returned: -1
</stderr_txt>
]]>
Using client 6.2.18 on Ubuntu with an AMD dual core processor with 3G of RAM.
What am I doing wrong? I run Seti and Rosetta with no problems.
You started with process exited with code 22, which means you are running a 64bit operating system and expect Einstein to provide you with a 64bit application. Einstein only has 32bit applications, so you need to have the 32bit compatibility libraries installed to be able to run 32bit apps on your OS.
____________ Jord
Forgive me for being dense, but where does one find these 32bit compatibility libs? Einstein appears to be running fine under a stock 64bit processor using a Kubuntu (KDE) interface on another machine.
As always, different versions of Linux install different things. So some automatically install the 32bit libraries, some do not.
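To confirm whether a downloaded application binary really is a 32bit one, you can look at its ELF header: the fifth byte (EI_CLASS) is 1 for 32bit and 2 for 64bit executables. A small Python sketch follows; the function names elf_class and elf_class_of_file are made up for this example, and the path you pass in is whichever app file sits in your project directory.

```python
ELF_MAGIC = b"\x7fELF"


def elf_class(header):
    """Return 32 or 64 for an ELF header, or None if it is not recognisable."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    # EI_CLASS byte: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(header[4])


def elf_class_of_file(path):
    """Read just the first five bytes of the file and classify it."""
    with open(path, "rb") as fh:
        return elf_class(fh.read(5))
```

If this reports 32 on a 64bit Linux installation and the app fails with "execv returned: -1", missing 32bit compatibility libraries are a likely suspect, though other causes cannot be ruled out from this check alone.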
This material is based upon work supported by the National Science
Foundation (NSF) under Grant NSF-0200852 and by the Max Planck
Gesellschaft (MPG). Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the investigators
and do not necessarily reflect the views of the NSF or the MPG.