Bugreport: Excessive checkpointing depending on preferences

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 691560843
RAC: 237687
Topic 192882

Hi!

As already suspected here, there's a performance issue related to E@H's interpretation of the user preferences.

Some users might be tempted to set the value

"Write to disk at most every xx seconds" (default: 60)

in their preferences to a lower value in order to relax restrictions BOINC imposes on the science client, in the hope of speeding up the computation.

If you set this value to a very low value, e.g. 1 second, this will hurt performance of Einstein@Home, because E@H will use this setting as the checkpointing interval. The smaller the value, the more often checkpointing will occur (as often as once every second). Because checkpointing involves disk IO, this will slow down the overall computation quite a bit.
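A rough back-of-the-envelope sketch of why this hurts. The 0.2 s cost per checkpoint and the 10-hour run are assumed figures chosen only for illustration, not measurements of any E@H app, and `checkpoint_overhead` is a hypothetical helper:

```python
def checkpoint_overhead(interval_s, checkpoint_cost_s, compute_s):
    """Estimate the time lost to checkpointing over one run.

    Simple model: after every `interval_s` seconds of computation the app
    pauses for `checkpoint_cost_s` seconds to write its state to disk.
    Returns (number of checkpoints, seconds lost, fraction of wall time lost).
    """
    checkpoints = int(compute_s // interval_s)
    lost_s = checkpoints * checkpoint_cost_s
    return checkpoints, lost_s, lost_s / (compute_s + lost_s)


# Assumed: 10 hours of pure computation, 0.2 s per checkpoint.
for interval in (1, 60):
    n, lost, frac = checkpoint_overhead(interval, 0.2, 10 * 3600)
    print(f"every {interval:>2} s: {n} checkpoints, {lost:.0f} s lost ({frac:.1%})")
```

Under these assumptions a 1-second interval costs roughly a sixth of the wall-clock time, while a 60-second interval costs well under one percent. The real numbers depend on the app, the disk and the size of the checkpoint file.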

Observed with the new Beta app under Windows, but I guess it affects all versions.

I guess E@H should instead use a minimum checkpointing interval of (say) 60 seconds.
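A minimal sketch of what such a floor could look like. This is not E@H or BOINC code; the function name and the constant are hypothetical, and the 60-second floor is just the "(say) 60 seconds" from above:

```python
MIN_CHECKPOINT_INTERVAL_S = 60.0  # assumed floor, not an actual E@H setting


def effective_checkpoint_interval(pref_interval_s, floor_s=MIN_CHECKPOINT_INTERVAL_S):
    """Return the interval the app would actually use.

    The user's preference is honoured when it is at or above the floor;
    smaller values are raised to the floor.
    """
    return max(float(pref_interval_s), floor_s)


for pref in (1, 60, 300):
    print(pref, "->", effective_checkpoint_interval(pref))
```

With a floor like this, a preference of 1 second would be treated as 60 seconds, while larger values would pass through unchanged.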

CU
BRM

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 691560843
RAC: 237687

Bugreport: Excessive checkpointing depending on preferences

I checked the BOINC source code and actually this is the expected behaviour. So it's the text label in the Web interface that is kind of misleading.

Instead of "Write to disk at most every xx seconds" (sounds like a resource usage limit) maybe "Write intermediate result to disk every xx seconds" would be less ambiguous?

CU

BRM

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

RE: Instead of "Write to

Message 68437 in response to message 68436

Quote:
Instead of "Write to disk at most every xx seconds" (sounds like a resource usage limit) maybe "Write intermediate result to disk every xx seconds" would be less ambiguous?


However, there are two checkpointing styles. Einstein apparently can write a checkpoint virtually as often as you allow. Therefore you would sensibly like to limit the checkpoint frequency to "Write at most every xx seconds"; that is, "no more frequently than".

OTOH, the Rosetta project writes checkpoints only at completion of a unit of work, which they call a "model". That might be once every 2 hours even though you specified a checkpoint interval of 60 seconds.

Wording like the one you propose, "Write intermediate result to disk every xx seconds", would lead one to think that the project will/must checkpoint exactly at the specified interval, rather than "no more often than". Perhaps a better phrase would be "Write to disk no more often (or frequently) than every xx seconds".
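To illustrate the "no more often than" reading, here is a minimal sketch. It is not BOINC's actual code: all names are made up for this example, and the pattern is only loosely modelled on the way a BOINC app, as far as I understand it, asks `boinc_time_to_checkpoint()` whenever it reaches a point where it could save its state.

```python
class CheckpointGate:
    """Allows a checkpoint only if at least `min_interval_s` seconds have
    passed since the previous one ("no more often than")."""

    def __init__(self, min_interval_s, start_s=0.0):
        self.min_interval_s = min_interval_s
        self.last_checkpoint_s = start_s

    def time_to_checkpoint(self, now_s):
        return now_s - self.last_checkpoint_s >= self.min_interval_s

    def checkpoint_completed(self, now_s):
        self.last_checkpoint_s = now_s


def run(safe_point_spacing_s, min_interval_s, total_s):
    """Simulate an app that reaches a safe point every `safe_point_spacing_s`
    seconds and checkpoints there whenever the gate allows it.
    Returns the times (in seconds) at which checkpoints were written."""
    gate = CheckpointGate(min_interval_s)
    written = []
    t = safe_point_spacing_s
    while t <= total_s:
        if gate.time_to_checkpoint(t):
            written.append(t)
            gate.checkpoint_completed(t)
        t += safe_point_spacing_s
    return written


# Safe point every second (Einstein-like), preference 60 s, 5 minutes of work.
print(run(1, 60, 300))
# Safe point only every 30 minutes (model-like), preference 60 s, 2 hours of work.
print(run(1800, 60, 7200))
```

In the first case the preference is what limits the app, so checkpoints land every 60 seconds. In the second case the app only gets a chance every 30 minutes, so the 60-second preference never comes into play, which is the difference between the two checkpointing styles described above.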

anders n
Joined: 29 Aug 05
Posts: 123
Credit: 1656300
RAC: 0

RE: OTH, Rosetta project

Message 68438 in response to message 68437

Quote:

OTOH, the Rosetta project writes checkpoints only at completion of a unit of work, which they call a "model". That might be once every 2 hours even though you specified a checkpoint interval of 60 seconds.

Rosetta has checkpointing within a model now.
Usually not more than 30 min between them.

Anders n

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

RE: RE: OTH, Rosetta

Message 68439 in response to message 68438

Quote:
Quote:

OTOH, the Rosetta project writes checkpoints only at completion of a unit of work, which they call a "model". That might be once every 2 hours even though you specified a checkpoint interval of 60 seconds.

Rosetta has checkpointing within a model now.
Usually not more than 30 min between them.

Anders n


Yes, I used 2 hours to exaggerate my point. Even so, I assume Rosetta will checkpoint when it wants to and not at your specified BOINC checkpoint interval (CI). I wonder: even if you specify an "at most" CI of 2 hours and Rosetta wants to checkpoint after completing a model at 30 minutes, will it not do so? Rosetta and projects like it probably do not take your specified CI into account. So every volunteer really needs to know not only how the BOINC parameters work but also how the attached projects operate and whether a given parameter has any effect in that context.

anders n
Joined: 29 Aug 05
Posts: 123
Credit: 1656300
RAC: 0

RE: Yes, I used 2 hours to

Message 68440 in response to message 68439

Quote:
Yes, I used 2 hours to exaggerate my point. Even so, I assume Rosetta will checkpoint when it wants to and not at your specified BOINC checkpoint interval (CI). I wonder: even if you specify an "at most" CI of 2 hours and Rosetta wants to checkpoint after completing a model at 30 minutes, will it not do so? Rosetta and projects like it probably do not take your specified CI into account. So every volunteer really needs to know not only how the BOINC parameters work but also how the attached projects operate and whether a given parameter has any effect in that context.

Agree :)

Anders n
