validate errors
Message boards : Cruncher's Corner : validate errors
| Author | Message |
|---|---|
|
Could someone explain what's happened? The one that's not finished yet is suspended. Should I abort? Why process WUs that already have validate errors? | |
| ID: 75973 | | |
|
They had database trouble today, and are fixing all the erroneous validate errors even as we speak. | |
| ID: 75974 | | |
|
Here is a quick summary of what happened in the past 8 hours: | |
| ID: 75979 | | |
|
Well thanks for the update Dr. Allen. | |
| ID: 75982 | | |
|
Is this one of those errors? Notice I got no credit while 2 others have some, but my results seem ok. | |
| ID: 75983 | | |
Is this one of those errors? Notice I got no credit while 2 others have some, but my results seem ok. Yes, that was my mistake. This was one of 131 results that I should have left as 'outcome=validation errors' but in my haste I changed this to 'outcome=success'. I have fixed these 131 results (including yours). Thanks for pointing it out! Cheers, Bruce | |
| ID: 75984 | | |
|
Is this one of the mistakes? http://einstein.phys.uwm.edu/workunit.php?wuid=34921280 | |
| ID: 75987 | | |
Is this one of the mistakes? http://einstein.phys.uwm.edu/workunit.php?wuid=34921280 This appears to be a genuine error in the result. Bruce | |
| ID: 75989 | | |
|
No "'finished' file"? This is a first for me--all part of the error? Bits of the log file follow: | |
| ID: 75994 | | |
Is this one of the mistakes? http://einstein.phys.uwm.edu/workunit.php?wuid=34921280 Actually this looks like a bug in the 4.07 App, probably related to the "new checkpointing code", so the 4.09 might have it, too. BM | |
| ID: 75995 | | |
Is this one of the mistakes? http://einstein.phys.uwm.edu/workunit.php?wuid=34921280 Then I am glad I brought it up. Something more for you guys to work on. Thanks for both your updates, Dr. Allen and Bernd (are you a Dr. also?). | |
| ID: 76002 | | |
Here is a quick summary of what happened in the past 8 hours: Show me someone who says they have not done that kind of thing and I'll show you a liar. Looks like you have done a good job of recovery, and also a big thanks from me for running such a stable and trouble-free project, from the cruncher's point of view. I can imagine it gives you a few headaches though. | |
| ID: 76009 | | |
|
So true. This is the first downtime I can remember for quite some time, and most participants probably didn't even notice it because of work caches. | |
| ID: 76010 | | |
Thank you very much for the kind comments. We try hard not to make mistakes, but we're human! Cheers, Bruce | |
| ID: 76014 | | |
|
It happens. Reminds me of some server mistakes I made when I was really tired. Great job getting it fixed so quickly! | |
| ID: 76018 | | |
|
You'll get to love the --i-am-a-dummy mysql client setting. Also available with a less offensive name under --safe-updates. If you do an UPDATE without a WHERE clause, it will give an error. Saved my a** a couple of times. | |
| ID: 76032 | | |
You'll get to love the --i-am-a-dummy mysql client setting. Also available with a less offensive name under --safe-updates. If you do an UPDATE without a WHERE clause, it will give an error. Saved my a** a couple of times. Good idea -- I will pass this on to our admin! Bruce | |
| ID: 76043 | | |
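The safety net described in the tip above can be illustrated with a minimal sketch. It is not how the mysql client implements --safe-updates; as far as the MySQL manual describes it, the real option is stricter than a plain WHERE check (it expects a key constraint in the WHERE clause or a LIMIT clause). The sketch only shows the idea in Python with the standard sqlite3 module. The table name `result`, the `outcome` values, and the helpers `check_safe` and `safe_execute` are assumptions made up for this example, not the project's actual schema or tooling.

```python
import re
import sqlite3


class UnsafeStatementError(Exception):
    """Raised when an UPDATE or DELETE statement has no WHERE clause."""


def check_safe(sql):
    """Reject UPDATE/DELETE without WHERE (crude text check, not a SQL parser)."""
    words = sql.strip().split(None, 1)
    verb = words[0].upper() if words else ""
    has_where = re.search(r"\bWHERE\b", sql, re.IGNORECASE) is not None
    if verb in ("UPDATE", "DELETE") and not has_where:
        raise UnsafeStatementError("refusing " + verb + " without a WHERE clause")


def safe_execute(conn, sql, params=()):
    """Run sql on conn only if it passes the check above."""
    check_safe(sql)
    return conn.execute(sql, params)


def demo():
    # Hypothetical table standing in for a results table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, outcome TEXT)")
    conn.executemany(
        "INSERT INTO result (id, outcome) VALUES (?, ?)",
        [(1, "success"), (2, "success"), (3, "success")],
    )
    try:
        # Blanket update: would touch every row, so the guard rejects it.
        safe_execute(conn, "UPDATE result SET outcome = 'validate error'")
        blocked = False
    except UnsafeStatementError:
        blocked = True
    # Targeted update: names the row to change, so it is allowed.
    safe_execute(
        conn, "UPDATE result SET outcome = 'validate error' WHERE id = ?", (2,)
    )
    rows = conn.execute("SELECT id, outcome FROM result ORDER BY id").fetchall()
    conn.close()
    return blocked, rows
```

Calling `demo()` runs one blanket UPDATE, which the guard refuses, and one UPDATE restricted to a single id, which goes through. The text check is deliberately naive (a `WHERE` inside a string literal would fool it), so for real work the client's own --safe-updates option is the better tool.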
|
One of my WU's got hit too: | |
| ID: 76046 | | |
One of my WU's got hit too: Ouch! The trouble is that with this bug there are very few workunits that can't be finished valid with the 4.07 App (probably the 4.09 Linux Beta had the same problem, which should be fixed in 4.12). BM | |
| ID: 76096 | | |
|
My first validate error: | |
| ID: 76175 | | |
|
Another one: | |
| ID: 76201 | | |
|
My intention is to manually grant credit for the validate errors that result from the bug in the 4.07 App (and only those). Please be patient with me while I find out how to do this without messing up the database. And keep reporting them! | |
| ID: 76206 | | |
|
http://einstein.phys.uwm.edu/result.php?resultid=86705754 Is this a validate error? Thanks in advance | |
| ID: 76232 | | |
|
Another one: | |
| ID: 76256 | | |
|
Based on Bernd's comments, upgrading to 4.13 should fix it. If not, he definitely needs to know. | |
| ID: 76277 | | |
|
Is RID=87685693 also one of these 4.07 validation bug problems? If so, would appreciate the manual fix thanks! | |
| ID: 76404 | | |
|
Think I might have one here :: http://einstein.phys.uwm.edu/workunit.php?wuid=35045658 | |
| ID: 76416 | | |
|
I got a hit on http://einstein.phys.uwm.edu//workunit.php?wuid=34960047 along with 4 others who have errored out. Only one person has validated. | |
| ID: 76432 | | |
|
I have begun to manually grant credit for validation errors from the 4.07 App that resulted from the "sorting bug" in 4.07. As a first step this applies to workunits with more than one validation error from 4.07, I still have 55 single results on my list that need further investigation. | |
| ID: 76504 | | |
I have begun to manually grant credit for validation errors from the 4.07 App that resulted from the "sorting bug" in 4.07. As a first step this applies to workunits with more than one validation error from 4.07, I still have a list of 55 single results on my list that need further investigation. That is certainly very considerate of you, and adds to the overall goodwill of this project! :) ____________ Regards, Bob P. | |
| ID: 76507 | | |
I have begun to manually grant credit for validation errors from the 4.07 App that resulted from the "sorting bug" in 4.07. As a first step this applies to workunits with more than one validation error from 4.07, I still have 55 single results on my list that need further investigation. I guess I got all of them that dropped in by now. I'll look for more of these results in a few days again. BM | |
| ID: 76512 | | |
I have begun to manually grant credit for validation errors from the 4.07 App that resulted from the "sorting bug" in 4.07. As a first step this applies to workunits with more than one validation error from 4.07, I still have 55 single results on my list that need further investigation. Dr. Allen, I took one box off of Einstein because I discovered it had virtually nothing but validate errors: http://einstein.phys.uwm.edu/results.php?hostid=956316 (Win2k) I've experienced a rather steep drop in my RAC lately, and the above was a contributing factor. Not like I'm losing any sleep over it, or anything (I have real concerns in my life, and this certainly isn't one of them), I'm just trying to put the pieces back together to bring my RAC back up to where it should be. Some more errors here: http://einstein.phys.uwm.edu/results.php?hostid=974305 (Ubuntu 6.06LTS) Interestingly enough, another AMD box which sits literally adjacent to the above, and is connected to my network via the same switch hub, seems to be doing fine: http://einstein.phys.uwm.edu/results.php?hostid=1049118 (WinXP Home SP2) This was the first machine I attached to this project, and, while I had some results error out initially, it has worked fine since then. The above are AMD boxes, which, in my experience, don't always play nice with Einstein. I was surprised to see an error with my "new" Intel box: http://einstein.phys.uwm.edu/results.php?hostid=1062708 (Ubuntu 6.06LTS) In the interest of full disclosure, I want to mention that I've been experiencing a LOT of communication problems lately. The router for my network has become increasingly flaky in recent weeks. Note that my network is 100% wired. If these problems could be caused by communication glitches, then no one should expend any time or effort on this unless I experience further problems after getting the router taken care of. I just need to know if this could be the problem.
Now, I don't expect to be "made whole" credit-wise (it would be nice, but I'm honestly not worried about it). There are only a few WUs that any of these boxes have spent any time on, anyway (the typical error is after only a few minutes or even seconds). If the problem could be with the apps I'm running, I suppose what I'm looking for is some guidance on what version app to run on each box, where to get them, and how to install them without causing further problems. I welcome assistance from anyone that knows these things, or knows how/where I can find out about these things. Thanks. | |
| ID: 78131 | | |
|
I'm not Bruce, but I'll respond anyway. :-) I also made your host IDs clickable. Dr. Allen, This host has a mix of errors: one is exit code 99 and the other is exit code 128. One suggested fix for exit code 99 is to reset the project so you download another data set. The suggested fix for exit code 128 is an update of DirectX. Some more errors here: http://einstein.phys.uwm.edu/results.php?hostid=974305 (Ubuntu 6.06LTS) This host was fine up until a few days ago. Now it's getting Signal 11 errors. You might want to try the beta app for that host. The above are AMD boxes, which, in my experience, don't always play nice with Einstein. I was surprised to see an error with my "new" Intel box: http://einstein.phys.uwm.edu/results.php?hostid=1062708 (Ubuntu 6.06LTS) That host looks ok other than one error (exit code 112) unless there are other errors sitting on the machine that haven't reported back yet. I'm not familiar with that particular error code and a quick Google search doesn't turn up anything BOINC related. In the interest of full disclosure, I want to mention that I've been experiencing a LOT of communication problems lately. The router for my network has become increasingly flaky in recent weeks. Note that my network is 100% wired. If these problems could be caused by communication glitches, then no one should expend any time or effort on this unless I experience further problems after getting the router taken care of. I just need to know if this could be the problem. This could be the problem with the box getting exit code 99s. If the problem could be with the apps I'm running, I suppose what I'm looking for is some guidance on what version app to run on each box, where to get them, and how to install them without causing further problems. If it were me, I'd try the Linux beta app for the box getting signal 11. Keep an eye on the Intel/Linux box with issues. And update DirectX on the 2K box with issues.
____________ Kathryn :o) The BOINC FAQ Service The Unofficial BOINC Wiki The Trac System More BOINC information than you can shake a stick of RAM at. | |
| ID: 78138 | | |
Fixed the link to the beta app thread... | |
| ID: 78139 | | |
|
DirectX 9.0c November 2007 update (Multilingual). | |
| ID: 78140 | | |
I'm not Bruce, but I'll respond anyway. :-) Kathryn, Wow! This is all quite wonderful! I appreciate your looking at this and providing such specific and detailed suggestions. Frankly, it's pretty embarrassing to ask such utterly noob questions, because I work in IT. Clearly, this isn't my area of expertise ;^) BTW, the 2K box had one or more trojans and whatnot in it (which I wasn't able to get rid of using conventional means), so I ended up reformatting the drive and starting from scratch. I was able to bring BOINC back up on it without any problem. I'll pay specific attention to the DirectX version on it. I hope to be able to get to switching the router out in the next few days; we'll see if I keep getting those error 99s. Anyway, thank you so much for responding so quickly, in such a helpful fashion. | |
| ID: 78177 | | |
DirectX 9.0c November 2007 update (Multilingual). Thank you, Ageless. I appreciate the link. | |
| ID: 78178 | | |