My Apologies To The Einstein Crunchers

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6534
Credit: 284710168
RAC: 110462

RE: Here are my i7

Quote:
Here are my i7 SPECIFICATIONS:
Intel i7-860 2.80 GHz CPU
Corsair H50 CPU cooling system
Asus P7P55D-E Pro MoBo
Galaxy nVidia GeForce GT 430 PCI-E 1MB GDDR3 RAM (CUDA Enabled)
Kingston 4GB PC3 10600 DDR3 RAM (Dual Channel)
Western Digital 500GB 3G SATA HDD x 2 = 1TB Storage
Micro$oft WinXP Pro 32 bit
BOINC v6.10.58


On the face of it, with this spec I'd have to say you ought to have no trouble with E@H! It would seem that something about our WUs has caused a belly-up though. The unit listing for your rig does not indicate a preference for any particular type of WU ( GW/ABP/CUDA ) when it comes to errors. Now if one looks ( bottom of page ) at the stderr output for these units in error you'll see something like this ( with some variations ) :

[pre]6.10.58

app_version download error: couldn't get input files:

einstein_S5GC1_3.02_windows_intelx86__S5GCESSE2.exe
-120
signature verification failed

[/pre]
These go back right to the oldest error in your listing. This message pretty well hints that, whatever else is going on, BOINC can't find what it thinks ought to be there come time to fire up a WU. Also, most of those WUs in error don't run for long ( or at all ) before that mode of exit occurs. However, what I really don't understand is why a few of these units ( say this one ) go for 9K, 21K etc. seconds before falling over if indeed the executable is missing, as stderr suggests.

So now I suggest we think about why BOINC can't find what it wants, when it wants it. First thing I'd do is look in the E@H part of the BOINC directories and see if einstein_S5GC1_3.02_windows_intelx86__S5GCESSE2.exe or einsteinbinary_ABP2_3.11_windows_intelx86__ABP2cuda23.exe are there at all - heck, search the whole machine for that matter. If they are present, and where they ought to be ( BOINC has per project repositories on your drive ), then we need to think about why BOINC can't access them. That is, what does 'signature verification failed' mean etc .... :-)
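A minimal sketch of that search, assuming Python is available on the machine. The two file names are the ones mentioned above; the function name `find_files` and the search root `C:\` are placeholders chosen for illustration, not anything BOINC provides.

```python
import os

# File names taken from the stderr output and the post above.
WANTED = {
    "einstein_S5GC1_3.02_windows_intelx86__S5GCESSE2.exe",
    "einsteinbinary_ABP2_3.11_windows_intelx86__ABP2cuda23.exe",
}


def find_files(root, names):
    """Return the full paths of every file under root whose name is in names.

    Names are compared case-insensitively, since Windows file systems
    usually ignore case.
    """
    wanted = {n.lower() for n in names}
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fn in filenames:
            if fn.lower() in wanted:
                hits.append(os.path.join(dirpath, fn))
    return sorted(hits)


if __name__ == "__main__":
    # ASSUMPTION: "C:\\" is a hypothetical search root; point it at the
    # BOINC data directory to narrow the search.
    for path in find_files("C:\\", WANTED):
        print(path, os.path.getsize(path), "bytes")
```

If a file turns up with a size of zero, or does not turn up at all, that would fit a failed or incomplete download; a file that is present and non-empty would point more towards the verification step itself.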

Cheers, Mike.

( edit ) Another thought. Are any of your disks in a RAID configuration ( even RAID 0 )? For WD, it's either EADS or EARS ( I can't remember which ), that don't 'hold' very long in RAID arrays ( spin-up lag? or somesuch ) and so drop out of concurrency forcing a rebuild or the RAID becomes invalid ....

( edit ) .... and since 'RAID' 0 doesn't actually have any redundancy then you can be stuffed in such circumstances, i.e. you were better off not linking the drives in an array fashion at all.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: Sorry, but that would

Quote:
Sorry, but that would defeat the purpose that I built my i7 for to begin with.


Using an i7 with a 32bit OS is blasphemy. You expect to run with 8 threads and then you physically hamper those 8 threads by telling them they're only able to use 3 GB of memory total. You defeated the purpose of the i7 by adding XP, Rick. Go 7, go 64bit Ultimate. It's cool. :P
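The ceiling follows from pointer width. A small worked example, assuming plain 32-bit addressing; the 3 GB visible figure and the 8 threads are the numbers quoted in this thread, and the variable names are only for illustration.

```python
ADDRESS_BITS = 32
GIB = 2**30

# Total address space a 32-bit pointer can reach, in GiB.
addressable_gib = 2**ADDRESS_BITS / GIB

# Figures from the thread: about 3 GB visible to the OS, 8 hardware threads.
visible_gib = 3
threads = 8

# Even share of visible memory per thread, in MiB, before the OS and
# other programs take their cut.
per_thread_mib = visible_gib * 1024 / threads

print(addressable_gib, "GiB addressable")
print(per_thread_mib, "MiB per thread at most")
```

That works out to 4 GiB of address space in total and at most 384 MiB per thread if all 8 are loaded.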

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings everyone, Wow!

Greetings everyone,

Wow! So much information and suggestions. Not sure where to start so...

@ archae86: I went to the basement and dug out the box that my modular PSU and its cables came in and rifled through the cables. I found the 8 pin ATX cable and replaced the 4 pin with it. I'm not sure if it was my imagination or what, but the PC seemed to boot faster. Probably imagination... ;) Thanks! :)

@ Chris (Highlander): I went to the Asus website and entered my information. I even added one or 2 extra items just for GP (general principle). The result was a recommendation 100W less than my PSU is rated at. Recommendation: 450W. PSU: 550W. So, good to go there... Thanks! :)

@ MrS (ETA): I have BOINC set to 50% when in use and 90% when idle. I lowered the 'when in use' down from 75% when this all started. I will wait until I can upgrade to full 64 bit (OS and RAM) before resuming with Einstein. Thanks! :)

@ Mike Hewson: At this point the errors are moot. I'm not going to worry about them. I'm just sorry that the 175 error-ed WUs will cause other users to be delayed in getting credit for doing their work on them. My drives are not set in a RAID configuration. Winblows XP just would not see RAID even with the RAID driver(s) pre-loaded. Thanks! :)

@ Jord (Ageless): You never cease to make me laugh my friend. :D Blasphemy, huh? Yeah, I suppose so. I will go 64 bit when I get my tax return next year. It's only a few months, I believe I can handle 32 bit for the time being. ;) Thanks! :)

Well, it's almost time for dinner and TV, so I better get going. I'll check back in the morning.

Thanks again, everyone! :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Rob Windgassen
Joined: 16 Aug 10
Posts: 5
Credit: 118675094
RAC: 0

RE: At this point the

Quote:
At this point the errors are moot. I'm not going to worry about them.

In general it is good to understand what the root cause of the problem is. Wouldn't it be a pity if you upgrade your system and then it turns out that the problem isn't related to the upgraded hardware?

Quote:
I'm just sorry that the 175 error-ed WUs will cause other users to be delayed in getting credit for doing their work on them.

Don't worry, as your WUs failed early, most or all of them are reassigned and others are happy crunching on them so your wingmen are not really affected badly. [edit] After all we're all just volunteers.

Rob

mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828125138
RAC: 206221

RE: Another thing that

Quote:
Another thing that kinda disturbs me is the amount of storage Einstein requires and is currently using. When I had a full complement of WUs, about 215 or so, Einstein took up almost 2 GB of space. Granted, I have a 500 GB drive, but it's the fact that with all the WUs gone, Einstein is still taking over .5 GB. SETI is taking only 31 MB, Orbit 0 (zero) and Virtual Prairie, which I just attached to, 484K. Seems to me, unless there is some underlying logical reason for the oversized WUs, Einstein could reduce the size of them and not put so much stress on users' PCs. In my opinion, it's arrogant to assume that EVERYONE has a PC that can handle the work they bulldoze out to us. Or that users will (can) just go out and get what's needed to upgrade their PC to handle the load from Einstein.
Keep on BOINCing...! :)

I think Einstein downloads a huge file full of stuff and then references it a lot for the workunits, so instead of you and I downloading 30 files over 30 connections to the Einstein server, it sends us one big file that we reference, and then we send it back and get another big file full of stuff. I used words like 'stuff' because I do not technically know what is in the big files, but it was discussed elsewhere.

induktio
Joined: 1 Oct 10
Posts: 15
Credit: 10144774
RAC: 0

RE: Another thing that

Quote:

Another thing that kinda disturbs me is the amount of storage Einstein requires and is currently using. When I had a full complement of WUs, about 215 or so, Einstein took up almost 2 GB of space. Granted, I have a 500 GB drive, but it's the fact that with all the WUs gone, Einstein is still taking over .5 GB. SETI is taking only 31 MB, Orbit 0 (zero) and Virtual Prairie, which I just attached to, 484K. Seems to me, unless there is some underlying logical reason for the oversized WUs, Einstein could reduce the size of them and not put so much stress on users' PCs. In my opinion, it's arrogant to assume that EVERYONE has a PC that can handle the work they bulldoze out to us. Or that users will (can) just go out and get what's needed to upgrade their PC to handle the load from Einstein.

I really don't understand why people can complain of the disk usage since hard drive space is almost free nowadays. Usually computations should be optimized to minimize runtime, not memory usage. Spent computation time cannot be later 'recovered' but disk space can be regained afterwards.

Maybe it's the fact I'm running BOINC on dedicated machines, but it seems BOINC projects just will not take advantage of all the available memory. Currently the machines are running Linux and have 4-8 GB RAM. It seems E@H will only use at most 256 MB per core, no matter what. If there was an option to double memory/disk usage and reduce runtime by 10%, I would definitely use it.

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings all, I have now

Greetings all,

I have now seen something new concerning my i7, something I hadn't seen before or during the current problem. But before I get into that, let me just say...

Somewhere in all this I stated that I have 2 500 GB HDDs on my i7. I really don't care about the size of the WUs as it would pertain to my drives, and they have no real adverse effect on them. The point I was attempting to make, and what disturbed me, was the assumption that EVERYONE has a PC that can handle the load from Einstein.

The WUs are roughly 250 MB in size. I was told that with my system setup, running 4 cores with HT and 3 recognized GB of RAM, 8 Einstein WUs are taking up 2 of the 3 GB of RAM, leaving 1 GB for everything else. Perhaps my swap file is not large enough to handle the task, which is what I am about to get into that I discovered this morning...
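The arithmetic behind that estimate can be sketched as follows. The 250 MB per task and the 3 GB of recognized RAM are the figures quoted above; the function name is made up for illustration, and real per-task usage will vary over the course of a run.

```python
def memory_left_mb(total_mb, tasks, per_task_mb):
    """Memory remaining after `tasks` tasks each take `per_task_mb`."""
    return total_mb - tasks * per_task_mb


TOTAL_MB = 3 * 1024   # 3 GB recognized by the 32-bit OS
PER_TASK_MB = 250     # rough per-task figure from the post

# Compare all 8 hardware threads against 4 tasks only.
for tasks in (8, 4):
    used = tasks * PER_TASK_MB
    left = memory_left_mb(TOTAL_MB, tasks, PER_TASK_MB)
    print(f"{tasks} tasks: {used} MB used, {left} MB left")
```

With 8 tasks that is 2000 MB used and roughly 1 GB left for everything else, which matches the figure above; halving to 4 tasks would leave about 2 GB.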

I am currently trying to run Virtual Prairie. I finally got a boat load of WUs last night while I was getting my beauty sleep.;) I noticed a few symptoms had reappeared. I cleared everything up, including shutting down BOINC. No applications were running and the OS seemed to be functioning without any problems.

I ran a-squared, an anti-malware scanner, and all it found were 9 bad cookies. I deleted them and shut down a-squared. Everything appeared to be running ok. I tried to start BOINC and once again, it could not make a connection to localhost. All I had was the tiny window that stated that BOINC was communicating with the client, or whatever it says. I shut down BOINC, which took a bit of time, less than a minute, when a balloon popped up on my task bar saying something about my virtual memory was not sufficient for the operation or some such thing and that it was being re-sized.

My conclusion, in all this, is that, once again, my Windoze installation has become corrupted. I had to do at least 1 repair install several months ago. And, the funny thing about this is the fact that there were no problems with the i7 or Windoze until after I started running Einstein again. Is it any wonder that I have been suspecting Einstein ever since this problem started?

I believe that what I need to do, now, is set VP to NNT and let the cache run dry. Keep a close, watchful eye on the i7 and BOINC and since I have noticed that the crap starts happening after about 8 to 10 hours of operation, shut down BOINC and re-boot about every 6 hours. Then, do a repair install, AGAIN, and hope for the best. In the wisdom of those reading this thread, would that be a wise course of action?

BTW, I am posting this using my Linux box again, attempting to alleviate any stress on Windoze on the i7. :)

I await any recommendations... :)

Keep on BOINCing...! :) (I am certainly trying to!)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2752781592
RAC: 1441907

RE: ... My conclusion, in

Quote:
...
My conclusion, in all this, is that, once again, my Windoze installation has become corrupted. I had to do at least 1 repair install several months ago. And, the funny thing about this is the fact that there were no problems with the i7 or Windoze until after I started running Einstein again. Is it any wonder that I have been suspecting Einstein ever since this problem started?
...


You might like to look at this comment, and Bernd's response. My feeling is that most of the symptoms you're seeing, including the problem the BOINC Manager has communicating with the client, are due to this start-up disk access for Einstein tasks. And the disk access demands of your 8 i7 cores at startup will be even more severe than the demands of my 8 E5320 cores.

If you can bear to walk away from your host during the start-up phase (and find some way of not starting Einstein on all 8 cores at once), I think you'll find that Einstein's steady-state resource usage is more reasonable when you come back.

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

RE: RE: -[ snip ]- You

Quote:
Quote:
-[ snip ]-

You might like to look at this comment, and Bernd's response. My feeling is that most of the symptoms you're seeing, including the problem the BOINC Manager has communicating with the client, are due to this start-up disk access for Einstein tasks. And the disk access demands of your 8 i7 cores at startup will be even more severe than the demands of my 8 E5320 cores.

If you can bear to walk away from your host during the start-up phase (and find some way of not starting Einstein on all 8 cores at once), I think you'll find that Einstein's steady-state resource usage is more reasonable when you come back.


Greetings Richard,

Ok, that makes sense. I have Einstein on hold right now and am running VP. I just re-booted to get BOINC communicating with the client again. I knocked down the number of cores used by 4, in BOINC preferences.

After all that, I fired up task manager to observe the performance graphs. The page file usage was over 5 GB and my available physical memory was at about 4 MB, this with only 4 VP WUs running! After a few minutes everything mellowed out to just over 2 GB page file usage and 1.6 GB available physical memory.

Ok, now I'm really confused! I just looked at my VP WUs and they're barely over 100 bytes each! And still, a great deal of physical memory is being used. And not only that, I'm still getting "Waiting for memory" messages!

Now I'm really baffled! I have never had this much problem with BOINC before, not even on the past, less powerful PCs I've built. At least not that I can remember. I have 2 WUs running and 2 "Waiting for memory". I still show 1.6 GB available physical memory. Why would a 100 byte WU need to be waiting for memory!?

Keep on BOINCing...! :) (I'm trying already!!!)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

crazyrabbit1
Joined: 23 Sep 06
Posts: 34
Credit: 4207137
RAC: 0

Hi, i do not know if this

Hi,

I do not know if this helps, but I had some similar problems over the last few days and also lost WUs, both Einstein and GPUGrid. On my side it was a problem with the virus scanner ( I hope ): I updated to the newest version, which was a big mistake. The pagefile kept growing and the available memory kept shrinking, with nothing to see in Task Manager; all the processes looked OK. After uninstalling the new version it looks good again. At the moment I am running a test with an older version of the scanner and have started GPU crunching on Einstein as well. So far the pagefile is not growing and the available RAM is not shrinking.
I think by tomorrow I will know whether it was the virus scanner, but at the moment it looks like it.
