My Apologies To The Einstein Crunchers

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

RE: RE: Greetings

Quote:
Quote:

Greetings everyone,

*** UPDATE ***

Ok, supposedly the Windoze repair install has finished. I did see a weird thing happen where an nVidia driver was un-installed during the Win install. The un-install asked for a reboot, I clicked "No".

Now, for the past 20 minutes I have been sitting on a screen with the WinXP logo and copyright, sitting here "Please wait..."-ing, still. I believe Windoze has stalled, at this point. I'm thinking I have to hit the reset button on the i7, now. Hopefully, Windoze will recover...

Keep on BOINCing...! :)

You MUST click YES to reboot or the changes cannot be made to the OS!

Yes if it is 'hung' just push the off button and hold it down until the machine powers off, then tap it again, after about a slow count to 10, and the machine should power back on just fine.


Greetings Mikey,

You're saying that I should go ahead and let the i7 re-boot even during the WinXP install? It was the nVIDIA un-installer asking for a re-boot, not Windoze. I let Windoze re-boot after the install, naturally. :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6891649
RAC: 0

RE: Greetings Einstein

Quote:

Greetings Einstein Crunchers,

Last month, October, I decided to re-attach to Einstein@home since SETI@Home is going through, what I would like to call, a spurt of growth. Many of you know that SETI is in the process of receiving 2 new servers to better serve the project and the thousands of volunteers faithfully crunching SETI WUs.

Not long after I started crunching Einstein on my i7 PC, I noticed some strange anomalous behavior. I would see WUs that were "Waiting for memory". I would also notice that after some time, I would see that text was missing from things like web pages and even things on the i7 such as the Windows start button would no longer say "Start" and tabs in my browser and HTML editor would also be blank. When shutting down a program, it would take a couple minutes for the window to go away. Programs would no longer start. BOINC would no longer make a connection to localhost after restarting it. The i7 re-booted all on its own once. I immediately suspected Einstein.

Now, let's go back in time to May (I believe it was) when I built this i7 PC. I noticed strange things happening with my video display. My mouse cursor would suddenly jump from one point to another across the screen. Programs, like my HTML editor, would auto scroll while I would be doing my coding. When on a website, I would be reading a page and suddenly be transported back a page or 2 as if I hit the back button, which I hadn't. I suspected that there was something flakey about my new PCI-E video card. I lived with it for several months.

When I saw the problems I started having, when starting Einstein again, I figured, even though I still suspected that Einstein was at the root of the current problems, that my video card was getting worse. So, a couple days ago, I bought a new one and installed it. The problems I was seeing, before Einstein, went away. Good call on that one, the "old" card was indeed flakey. But, I was still observing the problems that manifested after starting Einstein again. My suspicion of Einstein was getting greater.

I mentioned these problems in another thread, here, and a few members gave me some suggestions to try to alleviate my problems. A few of them had no effect, but I did get a definitive explanation for my problem. It was indeed Einstein that was the cause.

But, my i7 was also at fault, sort of. I'm not running a 64 bit OS, so WinXP Pro is not seeing all of my 4GB of RAM. Einstein uses a great deal of RAM, due to the size of the WUs, and perhaps other factors as well. Since my 32 bit OS is only seeing 3GB of RAM and 8 Einstein WUs are taking up 2 of those GBs, the i7 was left with 1GB for everything else including BOINC.

And now for the apology: About 45 minutes ago, give or take, I checked on the i7 and BOINC. You see, after about 8 to 10 hours of continuous running, BOINC would start creating havoc with the i7 and I would have to re-boot the PC. When I checked on the progress of BOINC, I saw that 175 WUs were reporting "Computational error"(s). I had 5 completed WUs waiting to report. Suffice to say, there probably are many of you out there waiting for the results of those 175 WUs, and for that I do apologize.

For the time being, I have resigned myself not to run Einstein. As a matter of fact, BOINC is sitting idle right now with nothing to do. I do plan on upgrading to a 64 bit OS (WinXP Pro, not Win7). I may even decide to double my RAM to 8GB. Maybe then I will start crunching Einstein again. But, right now, I do not want what happened this morning to happen again. It is NOT fair to the rest of you waiting for results so you can get the credit for the work you have already done.

My thanks go out to Mikey, Gundolf and DanNeely for the help they gave me. I really appreciate it! :)

And now, it's time to do my normal routine(s), now that the i7 is free to do my bidding, once again. ;)

Have a great day, everyone! :)

Keep on BOINCing...! :)

I am running E@H on an AMD Phenom Quad with 4GB under Fedora 11 64bit and no problems whatsoever. That machine also runs my TV (home entertainment center to be pretentious) which is admittedly a low stress function but no problems. I have also done a full test with the latest threaded Povray rendering of a 1200 frame animation and no problem. The diff between 32 and 64 bit is 3.2 GB to 4GB. Might be worth writing home about but not to assign blame.

I did have a problem running it on a machine with a 32bit Fedora 10 with 0.5GB RAM and calculated it was trying to use about 1.2GB. No problem now that it is stand alone without graphics doing nothing but keeping my RAC up. Also E@H is unnoticeable on an Athlon II x4 with 6GB but the L3 cache on the Phenom may obviate the difference in RAM.
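The memory arithmetic in this thread can be put into a few lines of Python. This is only a rough sketch using the figures quoted above (3 GB visible to 32-bit WinXP with 8 tasks taking about 2 GB in total, and the 0.5 GB box facing about 1.2 GB of demand). The helper name `memory_headroom_gb` is made up for illustration, and the even split of memory across tasks is an assumption.

```python
def memory_headroom_gb(visible_ram_gb, total_task_demand_gb):
    """RAM left for the OS and everything else; negative means a shortfall."""
    return visible_ram_gb - total_task_demand_gb


# i7 under 32-bit WinXP: 3 GB visible, 8 Einstein tasks using about 2 GB in total.
i7_headroom = memory_headroom_gb(3.0, 2.0)
per_task_gb = 2.0 / 8  # assumes the 2 GB is split evenly across the 8 tasks

# 32-bit Fedora 10 box: 0.5 GB of RAM against roughly 1.2 GB of demand.
small_box_headroom = memory_headroom_gb(0.5, 1.2)

print(f"i7 headroom: {i7_headroom:.2f} GB ({per_task_gb:.2f} GB per task)")
print(f"0.5 GB box headroom: {small_box_headroom:.2f} GB")
```

A negative result means the machine has to swap, which fits the trouble reported on the 0.5 GB machine.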

I was having bad WU problems a while back and was told by another user that setting the option to suspend processing at the 80% level in options had worked for him. It worked for me too. No more bad work units. I had the problem on both the Phenom and Athlon machines so it was not a RAM issue.

I have no experience with what an i7 requires, but I doubt it places greater burdens on the rest of the hardware, simply out of good design policy. There may be support chips that make more of what an i7 can do, but they should not be necessary, for system pricing reasons. The main reason is that Intel CPUs start at about $100 more than AMD, and if cheaper support chips were not permitted without degradation of performance, Intel based machines would price themselves out of most markets.

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6891649
RAC: 0

RE: Another thing that

Quote:
Another thing that kinda disturbs me is the amount of storage Einstein requires and is currently using. When I had a full complement of WUs, about 215 or so, Einstein took up almost 2 GB of space. Granted, I have a 500 GB drive, but it's the fact that with all the WUs gone, Einstein is still taking over .5 GB.

You now have a modern machine. You must get over the thinking that arose with 40GB drives. It is not easy to get over but we all have to do it.

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6891649
RAC: 0

RE: RE: My feeling is

Quote:
Quote:
My feeling is that most of the symptoms you're seeing, including the problem the BOINC Manager has communicating with the client, are due to this start-up disk access for Einstein tasks. And the disk access demands of your 8 i7 cores at startup will be even more severe than the demands of my 8 E5320 cores.

This seemingly unproductive period during Einstein execution has troubled me for a long time. I just timed a startup on my E5620 host (four cores running HT, so 8 virtual CPUs). It took almost three minutes from the time it started running the apps before it reached the "useful" phase when the enormous I/O, appreciable idle time, and significant CPU time charged to System went away and normal computing began. With the flavor of WU currently on my host, Process Explorer showed just over 3 Gigabytes of I/O Read to that point. It appears that this number goes up as the frequency explored by the WU goes up--so this effect keeps getting worse as we press on to higher and higher frequencies.

I believe I've noticed that the actual wall clock time required can be materially increased by one's Antivirus application. If your AV makes it easy to pause protection, it might amuse you to compare this startup time with and without. I'm running Kaspersky on this host, and when I tried the comparison just now, there was little difference--perhaps ten or twenty percent without, and that possibly really an artifact of longer time since boot rather than really an AV side effect. Still, in the past on another host I think I've seen a much larger effect with other AV programs--probably eSet NOD32 and possibly Norton of some years ago.

From what I have read those four cores producing 8 virtual cores actually operate on the level of six real cores. That is why I opted for two refurbed AMD four cores for the same price and a lot more backup flexibility and functionality.
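To put the quoted startup figures in perspective, here is a back-of-the-envelope sketch in Python. It assumes the "just over 3 Gigabytes" and "almost three minutes" are rounded to exactly 3 GB and 180 seconds, that 1 GB is 1024 MB, and that the read total is the sum across all 8 tasks. The function name `average_read_rate_mb_s` is made up for illustration.

```python
def average_read_rate_mb_s(read_gb, seconds):
    """Average read rate in MB/s for a given volume read over a given time."""
    return read_gb * 1024 / seconds


total_rate = average_read_rate_mb_s(3.0, 180)  # all tasks together
per_task_rate = total_rate / 8                 # assumes 8 tasks share the I/O evenly

print(f"aggregate: {total_rate:.1f} MB/s, per task: {per_task_rate:.1f} MB/s")
```

Under those assumptions the average works out to roughly 17 MB/s in aggregate. That is not large for sequential reads, so if the disk really is the bottleneck during startup, the access pattern (eight tasks reading at once) may matter more than the raw volume.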

mikey
Joined: 22 Jan 05
Posts: 11978
Credit: 1834161750
RAC: 203558

RE: RE: RE: Greetings

Quote:
Quote:
Quote:

Greetings everyone,

*** UPDATE ***

Ok, supposedly the Windoze repair install has finished. I did see a weird thing happen where an nVidia driver was un-installed during the Win install. The un-install asked for a reboot, I clicked "No".

Now, for the past 20 minutes I have been sitting on a screen with the WinXP logo and copyright, sitting here "Please wait..."-ing, still. I believe Windoze has stalled, at this point. I'm thinking I have to hit the reset button on the i7, now. Hopefully, Windoze will recover...

Keep on BOINCing...! :)

You MUST click YES to reboot or the changes cannot be made to the OS!

Yes if it is 'hung' just push the off button and hold it down until the machine powers off, then tap it again, after about a slow count to 10, and the machine should power back on just fine.


Greetings Mikey,

You're saying that I should go ahead and let the i7 re-boot even during the WinXP install? It was the nVIDIA un-installer asking for a re-boot, not Windoze. I let Windoze re-boot after the install, naturally. :)

Keep on BOINCing...! :)

I am not sure, but yes probably, if it wants to reboot, during the Windows install process, I always let it. It should pick up where it left off when it restarts.

Have you tried downloading Linux and running Boinc under it? I am NOT talking about a Linux install but booting off the Linux disk and doing it 'live'. If you put Boinc on a usb stick you will not be messing with the hard drive and if Boinc runs for 24 hours you KNOW it is your Windows installation that is the problem.

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86316101
RAC: 316

All very curious

All very curious indeed...

Assuming that your system has passed ok all of:

Memtest86+ full set of tests for a few passes;
GIMPS mprime86 'torture test';
And that your disks check out ok for a disk surface scan and filesystem check...

That should show that the hardware is ok.

A last ditch test is to run a graphics display test (or just play a high fps video game) to make sure your graphics card is ok.

The only things to suggest next are:

Is your system overheating?

Do you have mains power problems?

Can you run Boinc overnight with the network PHYSICALLY DISCONNECTED (unplugged) and with all anti-virus disabled and with no 'screen savers' running?

If it passes that, then that's interesting.

If it fails, then...

The only way to eliminate whether it's Windows or the hardware is to try a different OS on that system. Linux even?

Meanwhile, if there were problems with Boinc and E@H, then I'm sure there would be many more howls than just yourself.

The 8 - 10 hours before failure is suspicious... Is that the time your cat settles on the PC to keep itself warm for a snooze?!

Or more seriously, is that when other household equipment is stirred to life and electrical noise?

Has your Windows system got a virus/trojan/malware that is waking up every few hours to do some dirty deed?

Let us know what you find!

Good luck,
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6891649
RAC: 0

RE: ... The only way to

Quote:
...
The only way to eliminate whether it's Windows or the hardware is to try a different OS on that system. Linux even?...
Good luck,
Martin

I know flame wars are sort of fun but it is not my intention to start one. I started with MS-DOS 3.0 and it was not until Win-98 SP2 that I threw in the towel.

I merely wish to observe that when it comes to Windows it is all a matter of guesswork and experience, whereas in linux it is open to anyone to analyze the problem. Talk about virus protection on Windows? Then you have to have not only installed it but sussed out how it works in detail and what that means. It is not that way with linux.

Not meaning to denigrate anyone, not many people who use Windows really know what they are using or doing. It is mainly because of proprietary software that will not tell what it is doing. Very few can reverse engineer proprietary software to the point of being able to figure out what it is really doing.

Gamers should look into Codeweavers Crossover to see if their games can be run on linux or resign themselves to eternal problems with Windows. OR decide to make the effort to learn enough about PCs and software to really understand what all the MS stuff is really doing.

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

RE: RE: RE: RE: -[

Quote:
Quote:
Quote:
Quote:
-[ snip ]-
-[ snip ]-

Greetings Mikey,

You're saying that I should go ahead and let the i7 re-boot even during the WinXP install? It was the nVIDIA un-installer asking for a re-boot, not Windoze. I let Windoze re-boot after the install, naturally. :)

Keep on BOINCing...! :)

I am not sure, but yes probably, if it wants to reboot, during the Windows install process, I always let it. It should pick up where it left off when it restarts.

Have you tried downloading Linux and running Boinc under it? I am NOT talking about a Linux install but booting off the Linux disk and doing it 'live'. If you put Boinc on a usb stick you will not be messing with the hard drive and if Boinc runs for 24 hours you KNOW it is your Windows installation that is the problem.


Greetings Mikey,

The previous "repair" install did not seem to help. The i7 ran for about 11 hours, without BOINC, without any problems. I started BOINC and I don't think it was 2 hours later when my Windoze color scheme changed to something looking like Windoze 98. Everything was that beige color.

I shut down BOINC and was going to re-boot. Somewhere in the Windows shutdown process an error message popped up saying something about some application could not write to a section of memory. So, I booted into my "Ultimate Boot CD" CD and ran a memory checker that was made by, none other than, Micro$oft. I ran the extended tests. Some of the tests that are standard run with cache turned off in the extended tests. I now see why cache is so important. Those tests were excruciatingly slow to run. At the end of it all, no errors were found, once again, with my RAM.

I know that when an application writes data, to be used, to RAM, that section of RAM is locked so that the data cannot be overwritten by another application. Now, what I'm not sure about is this: Is the application responsible for locking and unlocking the RAM, or is that the responsibility of Windoze? I'm thinking that it is the OS that is responsible. Only makes sense to me since the OS is responsible for allocating resources. Another thought on that would be that the application should be responsible for contacting the OS saying, "Hey, OS, I'm done with the RAM now."

Anyway, since I don't have to work today, I'm going to attempt another "repair" install and use your suggestion about the re-boot after the video driver uninstall. If that doesn't fix this problem, then I'm going to do a clean install. Hopefully Windoze install will fully re-format the HDD before installing the OS, again.

About your response over at VP, perhaps I will just re-attach to the projects. Makes sense that something in the BOINC folder could be of a corruptible nature. I'd hate to get the i7 going again, with a fresh install, only to have the problem re-appear.

My last task will be to save up my pennies and nickels and order Win7 Pro 64 bit. This, much sooner than I would normally do with a newly released upgraded Windoze OS. I usually wait until after SP2 is out, well, at least SP1.

Thanks Mikey! :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

RE: All very curious

Quote:

All very curious indeed...

Assuming that your system has passed ok all of:

Memtest86+ full set of tests for a few passes;
GIMPS mprime86 'torture test';
And that your disks check out ok for a disk surface scan and filesystem check...


I ran memtest86 overnight, no errors reported. I ran a CPU torture test from my "Ultimate Boot CD" CD, the CPU passed with highest honors. I ran an HDD diagnostic and the boot drive checked out with no problems.

Quote:


That should show that the hardware is ok.

A last ditch test is to run a graphics display test (or just play a high fps video game) to make sure your graphics card is ok.


I did run a diagnostic that checked both RAM and video RAM. Video checked out just fine. Once a week I am in an online chat called IMVU. It is a VR video chat with constant video changes. It runs just fine, without BOINC running. 2 weeks ago, with BOINC running, it did the same thing: I would lose text and things would start heading south. Ever try to chat when you cannot see the text you are typing?

Quote:


The only things to suggest next are:

Is your system overheating?


No overheating problem once I installed my H50 Corsair cooling system.

Quote:

Do you have mains power problems?


No problems with power.

Quote:

Can you run Boinc overnight with the network PHYSICALLY DISCONNECTED (unplugged) and with all anti-virus disabled and with no 'screen savers' running?

If it passes that, then that's interesting.

If it fails, then...


I have never run BOINC that way since switching from dial up to broadband years ago. I don't run screen savers. I'm not sure running BOINC in this manner would help since the only thing it is doing is communicating with project server(s). No crunching at this point.

Quote:

The only way to eliminate whether it's Windows or the hardware is to try a different OS on that system. Linux even?


I can and have dual-booted the i7 with Windoze and Kubuntu 64 bit. Perhaps that's not a bad idea, but that will only show that there is nothing wrong with my hardware. Worth a try though...

Quote:

Meanwhile, if there were problems with Boinc and E@H, then I'm sure there would be many more howls than just yourself.


I do not believe that there is a problem with BOINC or even Einstein. I believe that whatever caused the problems I am having would have happened whether I was re-attaching to Einstein or attaching to another project at that same moment.

Quote:

The 8 - 10 hours before failure is suspicious... Is that the time your cat settles on the PC to keep itself warm for a snooze?!


No cat involved, don't have one. Don't even want one. Don't like them. Don't tell Mark over at SETI I said this. ;) By the way, the current up time is 1 day, 11 hours and 23 minutes. Without BOINC running.

Quote:

Or more seriously, is that when other household equipment is stirred to life and electrical noise?


The only thing that kicks in, periodically, is the furnace downstairs. That has never been a problem.

Quote:

Has your Windows system got a virus/trojan/malware that is waking up every few hours to do some dirty deed?


I use Emsisoft Anti-Malware (formerly known as A-Squared Free) as my malware scanner/remover. I have run it several times since this started and it only finds tracking cookies, which I have it delete. I use Avira AntiVir for my virus/trojan/malware/rootkit scanner. It is running at all times keeping an eye on all things happening. It has not sent me any messages since this all began.

Quote:

Let us know what you find!

Good luck,
Martin


Please see my reply to Mikey, below. It tells of what my next objective is and what has happened since the Windoze "repair" install.

I will keep everyone posted.

Oh, by the way. I forgot to mention that I was using my Linux PC to use the Google machine to find information on the Internet. I discovered an article explaining the use of a Windoze utility called SFC. You hit Start/run and type sfc /scannow and it scans and verifies all Windoze system files. Must have the install CD in when doing so. There is a way around having to use the install CD, however. Anyway, it didn't tell me that anything was wrong with the system files, it just ended. I don't know much about the utility, maybe it just creates a log file of what it finds and one has to view the log. Need to check into it a little more. I just thought of that just now. :)

Anyway, thanks for the suggestions, Martin. :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings

Greetings everyone,

RATS...

I forgot to mention one crucial occurrence which happened after the Windoze "repair" install. And, I have to do this from the Linux PC since this is where all the info is that I found.

During my nVIDIA driver install, I suddenly, out of the blue, got a BSOD. The error was caused by a Windoze system file called "ks.sys". The message was: 0x000000CB: DRIVER_LEFT_LOCKED_PAGES_IN_PROCESS (not sure of the number in front of the message). I copied that message from a forum post I found on the subject. The message is the same, not sure of the preceding number. Mine could have been different, I don't know.
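For what it's worth, the number and the name in that message do belong together: in Microsoft's bug check list, 0x000000CB is DRIVER_LEFT_LOCKED_PAGES_IN_PROCESS. A small Python sketch of that kind of lookup follows. The table is a hand-picked handful of well-known codes and is nowhere near complete, and `describe_bug_check` is a made-up helper name, not part of any Windows tool.

```python
# A few well-known Windows bug check (STOP) codes; deliberately incomplete.
KNOWN_BUG_CHECKS = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x000000CB: "DRIVER_LEFT_LOCKED_PAGES_IN_PROCESS",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def describe_bug_check(stop_code):
    """Turn a STOP code string such as '0x000000CB' into 'code: NAME'."""
    code = int(stop_code, 16)  # base 16 accepts the leading '0x'
    name = KNOWN_BUG_CHECKS.get(code, "UNKNOWN (not in this small table)")
    return f"0x{code:08X}: {name}"


print(describe_bug_check("0x000000CB"))
```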

The BSOD was doing a memory dump while on screen, dumping it to, I would assume, a file on the HDD. While it was doing its thing, I came to this PC and started using the Google machine to find out what was happening. After what seemed like an eternity, I hit the reset button on the i7. Windows booted somewhat fine and my nVIDIA driver was installed and the utility usable. I say "somewhat fine" because the artifact I mentioned a few days ago, on the Windoze "cylon" splash screen was gone, but re-appeared after the BSOD.

During my time of research was when I found the article about the "sfc /scannow" utility, which, as reported earlier, I ran. On the above linked page, there is a link to a Micro$oft page giving more info on SFC. Evidently, it just does its thing and does not report back to the user anything it found or did. Kinda lame if you ask me. But then, that's kinda reminiscent of the very, very vague error messages coming out of Windoze.

Anyway, just thought I would toss this into the mix, just for GP (general principle). :)

Now, off to do another "repair" install... :(

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]
