New Improved Gravitational Wave App - Discussion

Tom M
Joined: 2 Feb 06
Posts: 5678
Credit: 7759627458
RAC: 2659863
Topic 230645

I would like to propose that we move the discussion from the News thread (https://einsteinathome.org/content/new-improved-gravitational-wave-app-happy-new-year-2024-special?page=1#comment-221365) to this new thread.

This applies especially if you are troubleshooting or trying to decide whether the app is suitable for your system(s).

HTH,

Tom M

===edit===

Trying to update the message area title so it is not so "loud".

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

Tom M

I was poking around in the very top of the Top50 listing and so far it looks like almost none of the heavy hitters up there have decided to run any All-Sky Gravity tasks.

Here is a possible solution that you can edit as needed.  It lets you run the BRP7/MeerKAT application on the anonymous platform alongside the All-Sky Gravitational Wave tasks.

George and everyone,

Here is my working app_info.xml file.

I have not stripped out the old Gamma-ray pulsar binary search #1 (hsgamma_FGRPB1G) entries either, but it is working as is, so I am not touching it.

You ALSO need the two or three parameter (.config) files listed below the app_info.xml file.

===========app_info.xml===================

 

<app_info>
  <file>
    <name>HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v1.0</name>
    <executable/>
  </file>
  <file_info>
    <name>BRP7_einsteinbinary_x86_64-pc-linux-gnu__cuda1222</name>
    <executable/>
  </file_info>
  <file>
    <name>einstein_O3AS_1.07_x86_64-pc-linux-gnu__GW-opencl-nvidia-2</name>
    <executable/>
  </file>
  <file>
    <name>O3ASHF1b_0.config</name>
  </file>
  <file>
    <name>O3ASHF1b_1.config</name>
  </file>
  <file>
    <name>O3ASHF1b_2.config</name>
  </file>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <user_friendly_name>Gamma-ray pulsar binary search #1 on GPUs</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <app>
    <name>einsteinbinary_BRP7</name>
    <user_friendly_name>Binary Radio Pulsar Search (MeerKAT) (GPU)</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <app>
    <name>einstein_O3AS</name>
    <user_friendly_name>All-Sky Gravitational Wave search on O3 (GPU)</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <app_version>
    <app_name>hsgamma_FGRPB1G</app_name>
    <version_num>128</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>1.00</avg_ncpus>
    <plan_class>FGRPopencl2Pup-nvidia</plan_class>
    <api_version>7.17.0</api_version>
    <file_ref>
      <file_name>HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v1.0</file_name>
      <main_program/>
    </file_ref>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
  </app_version>
  <app_version>
    <app_name>einsteinbinary_BRP7</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>999</version_num>
    <api_version>7.17.0</api_version>
    <coproc>
      <type>CUDA</type>
      <count>1.0</count>
    </coproc>
    <file_ref>
      <file_name>BRP7_einsteinbinary_x86_64-pc-linux-gnu__cuda1222</file_name>
      <main_program/>
    </file_ref>
    <dont_throttle/>
  </app_version>
  <app_version>
    <app_name>einstein_O3AS</app_name>
    <version_num>107</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>0.900000</avg_ncpus>
    <flops>39017110278.446121</flops>
    <plan_class>GW-opencl-nvidia-2</plan_class>
    <api_version>7.3.0</api_version>
    <file_ref>
      <file_name>einstein_O3AS_1.07_x86_64-pc-linux-gnu__GW-opencl-nvidia-2</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>O3ASHF1b_0.config</file_name>
    </file_ref>
    <file_ref>
      <file_name>O3ASHF1b_1.config</file_name>
    </file_ref>
    <file_ref>
      <file_name>O3ASHF1b_2.config</file_name>
    </file_ref>
    <coproc>
      <type>NVIDIA</type>
      <count>0.333000</count>
    </coproc>
    <dont_throttle/>
  </app_version>
</app_info>

 

===========end app_info.xml=============
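
Since a hand-edited app_info.xml is easy to get subtly wrong, here is a minimal Python sketch of a sanity check. It only verifies that every file named in a file_ref is also declared in a file (or file_info) block and that every app_version points at a declared app. The function name check_app_info and the cut-down SAMPLE text are made up for illustration, and this says nothing about what else the BOINC client may validate.

```python
import xml.etree.ElementTree as ET


def check_app_info(xml_text):
    """Return a list of problems found in an app_info.xml string."""
    root = ET.fromstring(xml_text)
    # Files are declared with <file> or <file_info>; the post above uses both.
    declared = {
        entry.findtext("name", "").strip()
        for tag in ("file", "file_info")
        for entry in root.findall(tag)
    }
    apps = {app.findtext("name", "").strip() for app in root.findall("app")}

    problems = []
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name", "").strip()
        if app_name not in apps:
            problems.append(f"app_version refers to unknown app: {app_name}")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name", "").strip()
            if file_name not in declared:
                problems.append(f"{app_name}: file_ref not declared: {file_name}")
    return problems


# Hypothetical, shortened example: the _1.config file is referenced but never declared.
SAMPLE = """<app_info>
  <file><name>einstein_O3AS_1.07_x86_64-pc-linux-gnu__GW-opencl-nvidia-2</name><executable/></file>
  <file><name>O3ASHF1b_0.config</name></file>
  <app><name>einstein_O3AS</name></app>
  <app_version>
    <app_name>einstein_O3AS</app_name>
    <file_ref><file_name>einstein_O3AS_1.07_x86_64-pc-linux-gnu__GW-opencl-nvidia-2</file_name><main_program/></file_ref>
    <file_ref><file_name>O3ASHF1b_0.config</file_name></file_ref>
    <file_ref><file_name>O3ASHF1b_1.config</file_name></file_ref>
  </app_version>
</app_info>"""

if __name__ == "__main__":
    for problem in check_app_info(SAMPLE):
        print(problem)
```

To check a real file, read its text and pass it to check_app_info; an empty list means these two checks found nothing.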

 

============O3ASHF1b_0.config===========

FreqOffset = 0

========end .config file=============

The _1 and _2 files contain FreqOffset = 1 and FreqOffset = 2 respectively.
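
If you would rather not type the files by hand, a small Python sketch like this could write them. It assumes each file really holds nothing but the single FreqOffset line shown above; the function name write_config_files and its parameters are made up for illustration. If the project server hands you the real files, prefer those.

```python
from pathlib import Path


def write_config_files(folder, count=3, prefix="O3ASHF1b"):
    """Write <prefix>_<n>.config for n = 0..count-1, each holding 'FreqOffset = <n>'.

    Assumption: one line per file, as quoted in the post above.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for n in range(count):
        path = folder / f"{prefix}_{n}.config"
        path.write_text(f"FreqOffset = {n}\n")
        written.append(path)
    return written
```

Calling write_config_files("some_folder", count=2) would produce only the _0 and _1 files, for setups that do not list a _2 file.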

 

 

 

-------------

==edit===

I do NOT remember whether I actually got 3 All-Sky tasks to run at once, but I have definitely had 2 working.
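
For reference, the 3 comes from the coproc count of 0.333 in the app_info.xml above. A rough Python sketch of the arithmetic follows. It assumes the client keeps starting tasks on a GPU as long as the claimed fractions add up to no more than 1, which is my understanding rather than something confirmed in this thread; tasks_per_gpu is a made-up helper name.

```python
import math


def tasks_per_gpu(coproc_count):
    """How many tasks fit on one GPU if each task claims coproc_count of it."""
    if not 0 < coproc_count <= 1:
        raise ValueError("expected a fraction in (0, 1]")
    # The small epsilon guards against float noise just below a whole number.
    return math.floor(1 / coproc_count + 1e-9)


if __name__ == "__main__":
    for count in (1.0, 0.5, 0.333):
        print(count, "->", tasks_per_gpu(count))
```

Under that assumption, a count of 0.5 would give 2 tasks per GPU and 0.333 would give 3.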

 

==another edit===

Set up a separate folder with a copy of the new executable and the .config files.  If something goes wrong, those files will disappear from the project folder and you will need a spare set to copy back in.
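
The backup step can be scripted. This is only a sketch in Python: copy_matching is a made-up helper, and the folder paths are placeholders you would have to fill in for your own install.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, dst_dir, patterns):
    """Copy every file in src_dir that matches one of the glob patterns into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                # copy2 keeps permission bits and timestamps, so the executable stays executable.
                shutil.copy2(path, dst / path.name)
                copied.append(path.name)
    return copied
```

For example, copy_matching(project_folder, backup_folder, ["einstein_O3AS_*", "O3ASHF1b_*.config", "app_info.xml"]) would make the spare set, and swapping the two folder arguments would restore it.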

 

If they update the version number again, you will need to get the new executable and also change the executable name in app_info.xml.
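
Swapping the executable name is a plain text replacement, since the name appears both in the file block and in the file_ref. A minimal Python sketch, with swap_executable_name as a made-up helper; any new file name you pass in is whatever the project actually ships, not something I can predict. You may also need to adjust version_num by hand, which this does not touch.

```python
def swap_executable_name(app_info_text, old_name, new_name):
    """Replace every occurrence of old_name with new_name.

    Returns the new text and the number of replacements, and raises if
    old_name is not present so a typo does not pass silently.
    """
    count = app_info_text.count(old_name)
    if count == 0:
        raise ValueError(f"{old_name!r} not found in the app_info.xml text")
    return app_info_text.replace(old_name, new_name), count
```

With the file above, the old name should be found twice; any other count is a hint that something is off.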

 

====

HTH,

Tom M


Yeti
Joined: 17 Nov 04
Posts: 59
Credit: 1295221220
RAC: 1915451

What do FreqOffset = 1 and FreqOffset = 2 do?

Supporting BOINC, a great concept !

Keith Myers
Joined: 11 Feb 11
Posts: 4756
Credit: 17730539359
RAC: 4800297

Question for the devs.  They supply the files for some unknown reason.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3722
Credit: 34958653061
RAC: 35369044

I don't think you need a "2" file at all. I only ever saw "0" and "1" versions of the file, and my anonymous platform setup does not have a "2" file. Not sure where Tom got it, or whether he just made one extra.

But I think the files have something to do with the tasks being cut in half and getting the second half of the GPU portion of the task to restart on the proper GPU device. These files weren't present in the older version of the app.


fastbunny
Joined: 20 Apr 06
Posts: 22
Credit: 91417756
RAC: 19173

Just a quick note to follow up on my previous post: I have spread the 4 GPU tasks I'm running simultaneously over the two chiplets of my 5900X CPU by setting the affinity with Process Lasso, so they have more CPU cache available, and I'm immediately seeing shorter runtimes. The total board power and utilization of the GPU have improved as well. This is with 4 other CPU tasks running at the same time from another project.

So preliminary testing seems to point to this app loving CPU cache. I wonder whether people with X3D CPUs see substantially shorter runtimes.
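
For anyone on Linux without Process Lasso, the same idea can be sketched in Python. Everything here is an assumption for illustration: the helper names, the process IDs, and above all which logical CPUs belong to which chiplet, which you would have to look up for your own machine. os.sched_setaffinity is Linux-only and needs permission over the target processes.

```python
import os


def assign_round_robin(pids, core_groups):
    """Map each pid to one core group, cycling through the groups in order."""
    return {pid: core_groups[i % len(core_groups)] for i, pid in enumerate(pids)}


def apply_affinity(assignment):
    """Pin each process to its set of logical CPUs (Linux only)."""
    for pid, cores in assignment.items():
        os.sched_setaffinity(pid, cores)


if __name__ == "__main__":
    # Hypothetical pids and core sets; nothing is pinned here, the plan is only printed.
    plan = assign_round_robin([101, 102, 103, 104], [{0, 1, 2}, {6, 7, 8}])
    for pid, cores in plan.items():
        print(pid, sorted(cores))
```

The split into a pure planning step and a separate apply step is just so the plan can be inspected before anything is pinned.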

Tom M

I predict that any user running the anonymous version of brp7/meerKat will be reluctant to switch to All Sky Gravity Wave.

So that means all the other systems may have a chance to make their marks.

So do we have non-Nvidia versions? Non-Linux versions?

Hmmm.... Let me see.. Looks like ver 1.07 is available for all three Operating systems and both Nvidia and Radeon.

Didn't notice one for Mac GPU though.

Tom M


Tom M

Has anyone been able to show whether 3x is more productive than 2x?  My short run at 4x on another system (now turned off) didn't seem to give much more total productivity per GPU.

I haven't had much luck trying to "stagger" the tasks either, so that one is processing on the CPU while the other is processing (more or less) on the GPU.

Tom M


B.I.G
Joined: 26 Oct 07
Posts: 108
Credit: 972184429
RAC: 804988

fastbunny wrote:

So preliminary testing seems to point to this app loving CPU cache. I wonder whether people with X3D CPUs see substantially shorter runtimes.

If I only knew how to compare my run times? ;)

Does anybody else have an AMD Pro W7600 GPU? I ask because I have the 7800X3D as my CPU.

I get the best results when running 3 tasks at a time. But I always had 4 CPU tasks running, that makes 7 of the 8 cores utilised.

I can't get up to 4 GPU tasks because then I run out of graphics memory - even with the new app. Board power draw of the card is at about 115 Watts, it was 10 watts less with 2 tasks and by switching from 2x to 3x my RAC increased from 450.000 to 520.000. It's specified with 130 watts peak power draw so that's already pretty close.
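
Putting those numbers side by side, as a rough Python sketch: the 105 W figure is simply 115 W minus the 10 W mentioned above, RAC is a slow-moving average so treat the result as approximate, and the helper names are made up.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


def per_watt(rac, watts):
    """RAC earned per watt of board power."""
    return rac / watts


if __name__ == "__main__":
    print(f"RAC change:   {percent_change(450_000, 520_000):.1f} %")
    print(f"Power change: {percent_change(105, 115):.1f} %")
    print(f"RAC per watt at 2x: {per_watt(450_000, 105):.0f}")
    print(f"RAC per watt at 3x: {per_watt(520_000, 115):.0f}")
```

On these figures the RAC rises faster than the power draw, so 3x also looks slightly better per watt.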

Tom M

B.I.G wrote:

fastbunny wrote:

So preliminary testing seems to point to this app loving CPU cache. I wonder whether people with X3D CPUs see substantially shorter runtimes.

If I only knew how to compare my run times? ;)

Does anybody else have an AMD Pro W7600 GPU? I ask because I have the 7800X3D as my CPU.

I get the best results when running 3 tasks at a time. But I always had 4 CPU tasks running, that makes 7 of the 8 cores utilised.

I can't get up to 4 GPU tasks because then I run out of graphics memory - even with the new app. Board power draw of the card is at about 115 Watts, it was 10 watts less with 2 tasks and by switching from 2x to 3x my RAC increased from 450.000 to 520.000. It's specified with 130 watts peak power draw so that's already pretty close.

So you are not using Hyper-threading/SMT for CPU projects?

Tom M


Tom M

B.I.G wrote:

If I only knew how to compare my run times? ;)

It looks like you're running the equivalent of about 15.35 minutes per task: (2763 / 60) / 3 ≈ 15.35.

Compare that 15.35 minutes to the run time of a single task running alone on the GPU.  If the single task takes longer than 15.35 minutes, you are golden.  If it takes less, you are running slower at 3x than you could be at 1x.

Basically, if you haven't collected data at 1x and 2x to go with your 3x results, you cannot be sure what your most productive setting is.
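
The comparison can be written down as a short Python sketch. Only the 3x figure of 2763 seconds comes from this thread; the 1x and 2x run times in the example are invented placeholders to show the shape of the calculation, and the function names are mine.

```python
def effective_minutes_per_task(wall_seconds, concurrent_tasks):
    """Wall-clock run time of one task, divided by how many ran at once."""
    return wall_seconds / 60 / concurrent_tasks


def best_setting(runs):
    """runs maps concurrency -> average wall-clock seconds per task at that concurrency."""
    return min(runs, key=lambda n: effective_minutes_per_task(runs[n], n))


if __name__ == "__main__":
    # 1x and 2x values are hypothetical; replace them with your own measurements.
    runs = {1: 1000, 2: 1900, 3: 2763}
    for n, seconds in runs.items():
        print(f"{n}x: {effective_minutes_per_task(seconds, n):.2f} min per task")
    print("most productive:", best_setting(runs))
```

Lower effective minutes per task means more tasks finished per day, so the setting with the smallest value wins.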

Which reminds me.  I need to run some 1 x on mine because I lost track of what my baseline is.

Tom M

 
