Is there a GPU version of the app in the works?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2776017938
RAC: 805554

RE: The problem is, I don't

Message 87170 in response to message 87169

Quote:
The problem is, I don't think the developers really appreciate that this new capability is a significant departure from the past, and yet my impression of the comments that have been made is that the approach treats a GPU as really just a CPU by another name. Yet I can put a GTX 295, a GTX 280, and a 9800 GT into a three-slot system, and those GPUs have vastly different capabilities for processing work. More interesting to me is Richard's continuing research, which (on a preliminary basis at least) seems to indicate that in the case of SaH the additional capabilities of the higher-end cards do not produce improved processing performance.


Paul is referring to this thread where my trusty Excel grapher has come out to play again.

I wouldn't go so far as to say that the higher-spec cards (the 2xx range) produce no improvement, just that the improvement is much smaller than you might have expected. We have one case where a GTX 295 shows a mere 10% improvement over a 9800 GTX+ in some runs, but a 40% improvement in other runs, with a clear gap between the two bands of results. I don't understand that.

Bear in mind that the SETI application is still young and inexperienced, and those percentage readings were taken in the range where it is most problematic. I don't think it would be fair or wise to base long-term purchasing decisions on the limited data so far, though I must say I don't regret my decision to start with the 9800 range.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: RE: The problem is, I

Message 87171 in response to message 87170

Quote:
Quote:
The problem is, I don't think the developers really appreciate that this new capability is a significant departure from the past, and yet my impression of the comments that have been made is that the approach treats a GPU as really just a CPU by another name. Yet I can put a GTX 295, a GTX 280, and a 9800 GT into a three-slot system, and those GPUs have vastly different capabilities for processing work. More interesting to me is Richard's continuing research, which (on a preliminary basis at least) seems to indicate that in the case of SaH the additional capabilities of the higher-end cards do not produce improved processing performance.

Paul is referring to this thread where my trusty Excel grapher has come out to play again.

I wouldn't go so far as to say that the higher-spec cards (the 2xx range) produce no improvement, just that the improvement is much smaller than you might have expected. We have one case where a GTX 295 shows a mere 10% improvement over a 9800 GTX+ in some runs, but a 40% improvement in other runs, with a clear gap between the two bands of results. I don't understand that.

Bear in mind that the SETI application is still young and inexperienced, and those percentage readings were taken in the range where it is most problematic. I don't think it would be fair or wise to base long-term purchasing decisions on the limited data so far, though I must say I don't regret my decision to start with the 9800 range.

A 10-40% improvement for a factor-of-4 increase in price is, um, not a worthwhile return ...

More importantly, what GPU Grid is showing is an increase in speed of at least a factor of two, and as much as a factor of four ... it is a little hard for me to be precise, in that there have been changes in the application and there are at least three different sizes of tasks.

But, for example:

W1, GTX 280
CS      Run time                 Time per step
2478    16,607 s  (4.61 hours)   33.214 ms
2478    17,275 s                 34.551 ms
3718    25,007 s  (6.94 hours)   33.343 ms
3718    24,788 s                 33.051 ms
3848    24,690 s  (6.84 hours)   49.382 ms
3848    24,709 s                 49.419 ms

W2, GTX 295
CS      Run time                 Time per step
2478    18,570 s  (5.18 hours)   37.140 ms
2478    18,395 s                 36.790 ms
3718    27,771 s  (7.71 hours)   37.029 ms
3718    27,905 s                 37.207 ms
3848    27,557 s  (7.65 hours)   55.115 ms
3848    27,480 s                 54.961 ms

Xeon-32a, 9800 GT
CS      Run time                 Time per step
2478    50,112 s  (13.92 hours)  100.225 ms
2478    50,263 s                 100.527 ms
3718    65,404 s  (18+ hours)    87.206 ms

From this list of tasks (limited in the case of the 9800 GT because it has not been in the system that long) you can get an idea that the GTX 280 is showing an increase in processing speed of a factor of at least 3 ...
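
To put a number on it using the 2478 tasks above: 100.225 ms per step on the 9800 GT against 33.214 ms per step on the GTX 280 works out to 100.2 / 33.2, or a factor of roughly 3.0.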

Though the GTX 295 shows slightly slower per-task numbers than the GTX 280, there are two GPU cores in the GTX 295, so you get twice as much done in the same time ...

Anyway, Richard is doing an interesting study and I am vastly interested in the results...

Though while I am on it: the lack of labels on the axes made it difficult for me to know clearly what you were plotting in each direction ...

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2776017938
RAC: 805554

RE: Though while I am on

Message 87172 in response to message 87171

Quote:
Though while I am on it: the lack of labels on the axes made it difficult for me to know clearly what you were plotting in each direction ...


X-axis: SETI 'Angle Range' - distinguishes different tasks
Y-axis: elapsed ('wall clock') time, in seconds

Gerry Rough
Joined: 1 Mar 05
Posts: 102
Credit: 1847066
RAC: 0

Just a heads up: Adam over

Just a heads up:

Adam over on the Lattice Project boards has put out the word that TLP will make a formal announcement in the next week or two about some CUDA-enabled WUs.

[sound effects of Gary, Richard and Bernd on the rack!!]

Things CUDA are moving along.

[/sound effects of Gary, Richard and Bernd on the rack!!] :-)



MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 137763410
RAC: 17654

RE: Does anyone here know

Message 87174 in response to message 87164

Quote:
Does anyone here know if there will be BOINC preferences for the use of CUDA-enabled apps that either tell BOINC not to do CUDA at all, or put time constraints on using the GPU? I certainly hope so. :-/

To answer this, as of 6.6.15 we have the following controls relating to GPU use:

1. A cc_config setting to disable GPU use entirely
2. A check box under Advanced settings to disable GPU use when the computer is in use
3. You can specify via app_info the number of GPUs to use

Using the last one unfortunately doesn't allow you to specify which of the GPUs (assuming you have two or more) to use.
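
For reference, number 1 is just a flag in cc_config.xml in the BOINC data directory. This is from memory of the cc_config documentation on the BOINC wiki, so check there before relying on it:

<cc_config>
   <options>
      <no_gpus>1</no_gpus>   <!-- 1 = don't use any GPUs; 0 (the default) = use them -->
   </options>
</cc_config>

The client reads the file at startup, so restart BOINC after changing it.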

In the case of SETI they have a setting on their web site to allow CUDA work (they currently have only one CUDA app). If you deselect it, you don't get CUDA work. Hopefully when Einstein releases a CUDA app they too will offer a setting to enable/disable CUDA work.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: 2. A check box under

Message 87175 in response to message 87174

Quote:
2. A check box under Advanced settings to disable GPU use when the computer is in use


This option is also available in the SETI computing preferences, as "Suspend GPU work while computer is in use?". The page says it needs 6.7+, but it actually works with 6.6.7 and above.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2776017938
RAC: 805554

RE: 3. You can specify via

Message 87176 in response to message 87174

Quote:

3. You can specify via app_info the number of GPUs to use

Using the last one unfortunately doesn't allow you to specify which of the GPUs (assuming you have two or more) to use.


Correction: the setting in app_info is to indicate how many GPUs to use per task. So far as I know, no-one has written an application that uses more than one GPU per instance (just as, in general, BOINC tasks only use 1 CPU core: if you have more cores, BOINC starts more instances of the application). I think the same would happen for GPUs - if you have 6 GPUs installed in the same host (as a number of SETI users have), then BOINC will use all of them, without the option, to run six BOINC GPU-enabled tasks.
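
To illustrate what that count means, here is roughly what the relevant fragment of an app_version entry in app_info.xml looks like (quoted from memory, with an illustrative file name and version number, so don't copy it blindly):

<app_version>
   <app_name>setiathome_enhanced</app_name>
   <version_num>608</version_num>
   <coproc>
      <type>CUDA</type>
      <count>1</count>   <!-- GPUs used per task, not which GPU and not how many tasks -->
   </coproc>
   <file_ref>
      <file_name>setiathome_cuda.exe</file_name>
      <main_program/>
   </file_ref>
</app_version>

Leave that count at 1 unless and until an application turns up that can genuinely spread one task across several GPUs.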

Various people have asked the BOINC developers to allow freedom of choice and finer control over their own machines than this - see for example trac ticket [trac]#842[/trac] - but so far the response from BOINC has been less than lukewarm.

Quote:
In the case of Seti they have a setting on their web site to allow cuda work (they currently only have 1 cuda app). If you deselect it then you don't get cuda work. Hopefully when Einstein releases a cuda app they too will offer a setting to enable/disable cuda work.


And if you have an app_info.xml file, then the deselect switch is overridden!

Here at Einstein, we already have:

Arecibo Binary Pulsar Search: no
Hierarchical all-sky pulsar search: yes
Hierarchical S5 all-sky GW search #5: yes


Once the final 1,412 results for S5R4 have been purged from the database, the middle line will become redundant: maybe we could then have

Arecibo Binary Pulsar Search: no
Hierarchical S5 all-sky GW search #5 (on CPU): yes
Hierarchical S5 all-sky GW search #5 (on GPU if available): yes
John Clark
Joined: 4 May 07
Posts: 1087
Credit: 3143193
RAC: 0

As the thread title relates

As the thread title relates to GPU crunching on E@H, I read it from the beginning. Daunting, and it led to a speed-up, or a skim, towards the end. I also noticed Paul D Buck posts wisdom.

There is no mention of the high-end ATI GPU cards, the HD38xx and HD48xx. These are the GPUs that are making successful dents on Milkyway, where a cruncher with an HD4850 can raise an RAC of 50K on one rig.

A considerable number of MW crunchers are heading in this direction, and using the CPUs to run other projects.

There is anecdotal evidence that mixed NV and ATI cards work, with the NV card running the system graphics. Where the ATI card also runs the system graphics, the MW crunching gives very sluggish screen refreshes. But if the machine really is only used as a MW cruncher, then that is OK.

So, to my question -

Is all the GPU coding related to NV cards and a CUDA approach, or is there also work for ATI users?

Shih-Tzu are clever, cuddly, playful and rule!! Jack Russell are feisty!

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2776017938
RAC: 805554

RE: Is all the GPU coding

Message 87178 in response to message 87177

Quote:
Is all the GPU coding related to NV cards and a CUDA approach, or is there also work for ATI users?


Yet more reading for you, I'm afraid :-)

Bernd posted in the parallel OpenCL? thread over in Einstein support:

Quote:

[re: CUDA or OpenCL?]

There are people working on both. AFAIK CUDA is somewhat further along; maybe we will get something working there in March.


As you surmise, CUDA would be nVidia only, but OpenCL should be vendor-independent.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: Is all the GPU coding

Message 87179 in response to message 87177

Quote:
Is all the GPU coding related to NV cards and a CUDA approach, or is there also work for ATI users?


OpenCL, as Richard states, should be vendor-independent. I suspect, if history is a guide, that the OpenCL versions will be slower than the "native" API versions (CUDA / Brook+), particularly in the beginning.

At this time, the ATI cards only have an application at Milky Way. The Lattice Project just announced that they will be releasing applications for their "suite" of projects with some being CUDA and others ATI ... with perhaps others OpenCL (as I read the note).

Until we actually see the applications it is hard to say how well they will work. From personal experience I can tell you that it is fun to rip through the MW tasks with an ATI card. I can also tell you that, because BOINC does not "recognize" the ATI card, the integration is less than what one could hope for. Not that Cluster P. is not doing a fantastic job; it is just that we have this kludge bolted onto BOINC ... heck, CUDA does not work all that well yet either ...
