new units not downloading

kenlo

Joined: 1 Jun 05

Posts: 16

Credit: 28206

RAC: 0

27 Jun 2005 21:13:23 UTC

Topic 189428

new h1 units not downloading

kenlo

Thierry Van Dri...

Joined: 9 Feb 05

Posts: 210

Credit: 229929

RAC: 0

27 Jun 2005 22:24:40 UTC

Message 13506

Quote:

new h1 units not downloading

Please post any relevant messages from the Messages tab of BOINC; they would help narrow this down.

Greetings from Belgium
Thierry

kenlo

Joined: 1 Jun 05

Posts: 16

Credit: 28206

RAC: 0

27 Jun 2005 22:59:34 UTC

Message 13507

Quote:

new h1 units not downloading

06/27/05 19:02:34||Starting BOINC client version 4.43 for windows_intelx86
06/27/05 19:02:34||Data directory: D:\Program Files\BOINC
06/27/05 19:02:35|Einstein@Home|Computer ID: 307979; location: home; project prefs: default
06/27/05 19:02:35|orbit@home|Computer ID: 682; location: home; project prefs: default
06/27/05 19:02:35||General prefs: from Einstein@Home (last modified 2005-06-13 13:31:31)
06/27/05 19:02:35||General prefs: no separate prefs for home; using your defaults
06/27/05 19:02:35||Remote control not allowed; using loopback address
06/27/05 19:02:35|Einstein@Home|Resuming computation for result H1_0326.5__0326.9_0.1_T21_Fin1_2 using einstein version 4.79
06/27/05 19:02:35|orbit@home|Deferring communication with project for 14 hours, 48 minutes, and 26 seconds
06/27/05 19:02:35|Einstein@Home|Started download of h1_0326.5
06/27/05 19:02:35||schedule_cpus: must schedule
06/27/05 19:02:49|Einstein@Home|Temporarily failed download of h1_0326.5: 416
06/27/05 19:02:52|Einstein@Home|Started download of h1_0326.5
06/27/05 19:03:03|Einstein@Home|Temporarily failed download of h1_0326.5: 416
06/27/05 19:03:06|Einstein@Home|Started download of h1_0326.5

kenlo

Ulrich Metzner

Joined: 22 Jan 05

Posts: 113

Credit: 963370

RAC: 0

27 Jun 2005 23:46:44 UTC

Message 13508

Here is an excerpt from the Proxomitron log:

+++GET 30654+++
GET /download/38/h1_0205.0 HTTP/1.0
User-Agent: BOINC client
Host: einstein.astro.gla.ac.uk:80
Range: bytes=14736000-
Accept: */*
Connection: keep-alive

+++RESP 30654+++
HTTP/1.0 416 Requested Range Not Satisfiable
Date: Mon, 27 Jun 2005 23:41:50 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Python/2.3.5 PHP/4.3.10-15 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_perl/1.999.21 Perl/v5.8.4
Keep-Alive: timeout=15, max=89
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 30654+++

+++GET 30655+++
GET /download/38/h1_0205.0 HTTP/1.0
User-Agent: BOINC client
Host: einstein.astro.gla.ac.uk:80
Range: bytes=14736000-
Accept: */*
Connection: keep-alive

+++RESP 30655+++
HTTP/1.0 416 Requested Range Not Satisfiable
Date: Mon, 27 Jun 2005 23:41:54 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Python/2.3.5 PHP/4.3.10-15 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_perl/1.999.21 Perl/v5.8.4
Keep-Alive: timeout=15, max=88
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 30655+++

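There is a file size mismatch somewhere! The client asks to resume at byte 14736000 and the server answers 416, which means that offset is not inside the file as the server sees it.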

Aloha, Uli
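
The 416 responses in the log above follow from how HTTP byte ranges work: an open-ended request such as `bytes=N-` can only be satisfied when N is smaller than the length of the file on the server. Here is a minimal Python sketch of that rule. The function names are hypothetical and the file sizes in the usage note are made up for illustration, since the log does not show the server's actual file length; this is not BOINC or Apache code.

```python
def range_is_satisfiable(range_start: int, server_file_size: int) -> bool:
    """Return True if an open-ended 'bytes=range_start-' request can be served."""
    return 0 <= range_start < server_file_size


def status_for_resume(local_bytes: int, server_file_size: int) -> int:
    """HTTP status a server would be expected to send for a resume request.

    local_bytes is how much of the file the client believes it already has.
    """
    if range_is_satisfiable(local_bytes, server_file_size):
        return 206  # Partial Content: the remaining bytes are sent
    return 416  # Requested Range Not Satisfiable
```

For example, `status_for_resume(14736000, 14736000)` returns 416: a client that already holds as many bytes as the server has (or more) and still asks for "the rest" cannot be served, which would match the retry loop seen here if the local file is already complete or oversized.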

Ananas

Joined: 22 Jan 05

Posts: 272

Credit: 2500681

RAC: 0

27 Jun 2005 23:55:51 UTC

Message 13509

I had the same problem just now and had to reset the project on that PC.

The reason:

It had two download tasks running on exactly the same file (h1_0400.0).

One downloaded successfully with the expected file size; the other downloader "wondered where those bytes all came from" and reported a file size error too, retrying every few seconds.

BOINC 4.19, dual-CPU P3s

After the reset it did download everything successfully, but it still shows "download failed". Nothing is missing, but I guess I cannot allow BOINC to have two files with the same filename ;-)

There must be something damaged on the server/scheduler side or in the WU XML config.

Ananas

Joined: 22 Jan 05

Posts: 272

Credit: 2500681

RAC: 0

28 Jun 2005 0:08:19 UTC

Message 13510

After a reset I got an H1_501.0.

Same problem at first, but then, after a successful(!) transfer of H1_501.0, BOINC got a request to delete H1_501.0 while it was still downloading H1_501.0 on the other download thread.

Of course the client didn't like that much either. Now there's a checksum error, 2 tasks are crunching, and a few are still in the "downloading" state.

Very weird!

Ananas

Joined: 22 Jan 05

Posts: 272

Credit: 2500681

RAC: 0

28 Jun 2005 0:12:15 UTC

Message 13511

The story continues: after I manually contacted the scheduler to report the error, it tried to delete H1_501.0.

BOINC was very sad and told me it couldn't delete H1_501.0 ... but the work units are happy now and are not trying to download H1_501.0 again (as it's still there, of course).
___________

I guess it's the WU configuration that is wrong. The scheduler request that I saved after the first problem had this in it:

H1_0400.0

h1_0400.0

i.e. twice the same stuff
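
One way to check a saved scheduler request for this kind of collision is to compare the file names case-insensitively, since Windows file systems normally treat `H1_0400.0` and `h1_0400.0` as the same file. A rough Python sketch follows. `find_name_collisions` is a hypothetical helper, and reading the two entries above as a case-only collision is an assumption; the saved request itself is not shown in full here.

```python
from collections import defaultdict


def find_name_collisions(names):
    """Group names that are identical when letter case is ignored.

    Returns a dict mapping the case-folded name to the list of original
    spellings, but only for names that occur more than once.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}
```

Called as `find_name_collisions(["H1_0400.0", "h1_0400.0"])`, it reports both spellings under the single key `h1_0400.0`, which is the situation where two download tasks would end up writing to one file on disk.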

I would rate this as a critical problem

Robert Nelson

Joined: 19 Mar 05

Posts: 5

Credit: 84095502

RAC: 25056

28 Jun 2005 1:00:50 UTC

Message 13512 in response to message 13511

Quote:

I would rate this as a critical problem

Same here. I just caught one machine in an endless loop; here is an excerpt:
6/27/2005 8:16:21 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:23 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:24 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:25 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:27 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:27 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:28 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:30 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:30 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:31 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
It went on until I aborted the transfer, which appears to have cleared the problem. This machine is running the Einstein beta and BOINC 4.45 for Windows.
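
To spot a loop like this in a longer message log, it can help to count the "Temporarily failed download" lines per file. This is a small Python sketch that assumes the `date|project|message` layout seen in the excerpts in this thread; `count_failed_downloads` is a hypothetical helper, not part of BOINC.

```python
import re
from collections import Counter

# Matches e.g. "Temporarily failed download of h1_0673.0: 416"
FAILED = re.compile(r"Temporarily failed download of (?P<file>\S+): (?P<code>\d+)")


def count_failed_downloads(log_lines):
    """Count failed downloads per (file name, error code) pair."""
    counts = Counter()
    for line in log_lines:
        # The message text is the last of the '|'-separated fields.
        message = line.rsplit("|", 1)[-1]
        match = FAILED.search(message)
        if match:
            counts[(match["file"], int(match["code"]))] += 1
    return counts
```

Feeding it the lines of a saved log gives a `Counter`; a single file with a large count and the same code (416 here) within a few minutes is a good sign the client is stuck retrying.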

Walt Gribben

Joined: 20 Feb 05

Posts: 219

Credit: 1645393

RAC: 0

28 Jun 2005 1:47:52 UTC

Message 13513 in response to message 13512

Shut down BOINC and restart it. Usually "Exit" in boincmgr will do it, but the boinc process must actually end. If it doesn't, use Task Manager to kill it.

There's a bug in BOINC where temporarily failed downloads keep the file open, which can cause the problems you see. When BOINC exits, Windows closes all of the files it had open.
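
The general pattern that avoids this kind of leftover handle is to close the partial file on every exit path, including the failure path, before anything tries to delete it. The BOINC client is not written in Python, so the following is only an illustration of the idea; `download_to`, `retry_from_scratch`, `TemporaryDownloadError` and the `fetch` callback are all hypothetical names.

```python
import os


class TemporaryDownloadError(Exception):
    """Raised by the (hypothetical) fetch callback when a transfer fails."""


def download_to(path, fetch):
    """Append the bytes returned by fetch(offset) to the file at path."""
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    # The with-block closes the file even if fetch() raises.
    with open(path, "ab") as partial:
        partial.write(fetch(offset))


def retry_from_scratch(path, fetch):
    """Try a download; on a temporary failure, delete the partial file.

    Returns True on success and False if the partial file was removed.
    """
    try:
        download_to(path, fetch)
    except TemporaryDownloadError:
        # No handle is left open at this point, so the delete is not
        # blocked by our own process (which matters on Windows).
        os.remove(path)
        return False
    return True
```

On Windows, deleting a file that the same process still holds open normally fails, which is consistent with the "Couldn't delete file" messages in this thread; closing the file first is what lets the delete-and-redownload step succeed.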

kenlo

Joined: 1 Jun 05

Posts: 16

Credit: 28206

RAC: 0

28 Jun 2005 2:00:33 UTC

Message 13514

Quote:

new h1 units not downloading

All I did after the bad download was abort it, and it seems to be running OK now.

kenlo

Walt Gribben

Joined: 20 Feb 05

Posts: 219

Credit: 1645393

RAC: 0

28 Jun 2005 2:13:21 UTC

Message 13515 in response to message 13514

Quote:

Quote:
new h1 units not downloading

All I did after the bad download was abort it, and it seems to be running OK now.

That's good. But run Process Explorer, look at the handles for the BOINC process, and see if there are any for h1_0326.5, or for any other h1_* file.

It's fine for the Einstein application to use these, but BOINC shouldn't hold on to the file. It'll cause problems later, when BOINC has to delete it. That shouldn't be for a few weeks yet, when the scheduler decides it's time to work on a different set of data.

EDIT:

The "download looping" problem is in BOINC 4.43 and fixed in 4.45. I don't remember whether 4.45 fixes the "open handle" one, though.

EDIT**2:

From Robert's post, I'd say the "open handle" bug isn't fixed in 4.45. That's what happens when downloads fail like that: if BOINC leaves the file open, it can't delete the file to download it again. That's a problem for Einstein@Home, where one file is downloaded for all the WUs to use. In that case, it's probably a good idea to restart BOINC.
