Why quota of 40 per CPU core?


Message boards : Cruncher's Corner : Why quota of 40 per CPU core?

archae86
Send message
Joined: Dec 6 05
Posts: 1065
Credit: 112,226,125
RAC: 98,928
Message 116661 - Posted 2 Apr 2012 17:36:29 UTC

    Last modified: 2 Apr 2012 18:03:59 UTC

    Just a day or two ago I completed a pretty strong effort to assess the productivity and power consumption of a host with one GTX 460 GPU card and an i5-2500K CPU (four cores, no hyperthreading). I looked at simultaneous GPU tasks from zero through three, and CPU tasks from zero through four. The general rule was that running two or three GPU tasks beat everything else by a wide margin, with little variation within that group. The higher CPU task options, however, suffered more than I expected: they degraded GPU output, required more CPU time per task completion, and had considerably inferior incremental power efficiency (in fact often negative!). Holding the GPU tasks constant at three, the peak output came with two CPU tasks running, while the four CPU task option actually gave less credit/day than the zero CPU task option, with considerably worse power consumption and probably much worse user interactive responsiveness.

    Weighing all considerations, I concluded that I wanted to run with 3 GPU tasks and one CPU task, with the GPU running BRP and the CPU running GW work.

    Within hours I got a rude awakening: this configuration can't come close to keeping itself fed with newly assigned work, as the limit of 40 new tasks per day per active CPU core throttles delivery to just 40 tasks/day when only one core is allowed. As this configuration has an indicated daily throughput of 62.2 GPU tasks and 6.8 CPU tasks (69 tasks in total), the indicated daily deficit of task assignment is about thirty tasks.

    Even stepping up to three GPU and two CPU tasks, which was actually the highest output configuration I tested, gives a very modest daily net task queue gain of 7.1 tasks (an 80-task quota against roughly 72.9 tasks of daily throughput), so a miserly 10%/day recovery rate in the case of outages at the project, lost communication anywhere in between, or certain types of trouble on my own host.
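
    To spell the arithmetic out, here is a throwaway sketch; the 40-per-core rule and the throughput figures are as quoted above, and the 3 GPU/2 CPU consumption is simply back-solved from the 7.1 task surplus, so treat it as illustrative rather than authoritative:

        # Sketch of the daily quota arithmetic described above (illustrative only).
        QUOTA_PER_CORE = 40  # tasks/day granted per CPU core the scheduler counts

        def daily_balance(cores_counted, tasks_consumed_per_day):
            """Net daily change in the task queue: quota granted minus work completed."""
            return QUOTA_PER_CORE * cores_counted - tasks_consumed_per_day

        # 3 GPU + 1 CPU: one core counted; 62.2 GPU + 6.8 CPU = 69 tasks/day consumed.
        print(daily_balance(1, 62.2 + 6.8))   # about -29, the ~30 task/day deficit

        # 3 GPU + 2 CPU: two cores counted; 80 - 7.1 = 72.9 tasks/day consumed.
        print(daily_balance(2, 72.9))         # about 7.1, the modest daily surplus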

    In casting about for solutions, I noticed that if one attempts to approximate the 3 GPU/1 CPU configuration by setting the BOINC preferences to allow 50% of the CPUs (half of four is two in my case) and at most 50% of CPU time (labelled as a CPU heat-reducing provision), one gets similar total throughput but is allowed 80 tasks/day.
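
    For what it's worth, those two preferences can also be set locally in a global_prefs_override.xml file in the BOINC data directory; a sketch, using the standard BOINC preference tags as I understand them (local overrides take precedence over the web settings):

        <!-- global_prefs_override.xml: local override of the web preferences -->
        <global_preferences>
           <max_ncpus_pct>50.0</max_ncpus_pct>      <!-- use at most 50% of the CPUs -->
           <cpu_usage_limit>50.0</cpu_usage_limit>  <!-- use at most 50% of CPU time -->
        </global_preferences>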

    Sadly this alternative is in fact a bit less productive, probably because the CPU throttling scheme slightly increases the average response time of the CPU application that supports each GPU task. It is sad to give up 2% on both output and power efficiency just to obtain a marginally adequate supply of work.

    I confess I don't know the history of this limit, nor why it gives full credit to each CPU core allowed to work, even if heavily throttled by a power reduction duty cycle limit, while giving zero credit to GPUs, which when present often absolutely dominate the real productivity.
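
    Purely for illustration, the kind of change in the calculation I have in mind would weight GPUs at something above zero; the per-GPU weight below is entirely made up and not a proposal for a specific value:

        # Hypothetical GPU-aware daily quota, for illustration only.
        def daily_quota(ncpus_allowed, ngpus, per_core=40, per_gpu_weight=4):
            # per_gpu_weight is invented for this sketch; the point is only
            # that a GPU would count for something instead of zero.
            return per_core * (ncpus_allowed + per_gpu_weight * ngpus)

        print(daily_quota(1, 1))  # 200 tasks/day instead of 40 for this host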

    While I don't for a moment think that my personal situation itself warrants a change in this project limit, I've described it in some detail just in case the window is open to reconsider the limit, the value imposed, or the method of calculation. It would be a pity to discourage GPU participants.

    [edited for clumsy wording]
    ____________

    archae86
    Send message
    Joined: Dec 6 05
    Posts: 1065
    Credit: 112,226,125
    RAC: 98,928
    Message 116665 - Posted 2 Apr 2012 19:41:39 UTC - in response to Message 116661.

      Just a day or two ago I completed a pretty strong effort to assess the productivity and power consumption of a host with one GTX 460 GPU card and an i5-2500K CPU (four cores, no hyperthreading).

      I've posted a subset of my result spreadsheet on performance of this host while varying the number of active GPU and CPU jobs in the thread I started on bringing up a GTX460 host.
      ____________

      Tom*
      Send message
      Joined: Oct 9 11
      Posts: 37
      Credit: 19,384,189
      RAC: 33,961
      Message 116666 - Posted 2 Apr 2012 20:06:56 UTC

        EfMer's Priority program, a split from TThrottle, fixed both my Einstein and
        SETI issues: running two CPU jobs and one or two GPU jobs while keeping the
        GPUs satisfied (to my satisfaction :-)) on a two-core processor.

        I guess LASSO does something similar. Since different users have different
        requirements (total crunch vs. slipping in a few cycles every so often), a
        separate utility is IMHO the easiest way to implement this.

        Just MHO.

        Bill

        Horacio
        Send message
        Joined: Oct 3 11
        Posts: 205
        Credit: 79,509,837
        RAC: 25,577
        Message 116669 - Posted 2 Apr 2012 22:14:12 UTC

          There is a workaround to dodge the limits while still using your desired setting of 3 GPU + 1 CPU (or any other, just by adjusting the numbers):

          Use an app_info.xml and set the CPU usage of the GPU apps for Einstein, the <avg_ncpus> and <max_ncpus> tags, to 1.0 (I think you'll need to set the "CUDA count" in the app_info as well, since it won't be using the server-side settings), and set the CPU usage preference to 100%. A minimal sketch of such a file follows at the end of this post.

          When there are 3 GPU tasks running, the BOINC manager will reserve 3 cores for them and you will be crunching only one CPU task. When requesting work it will tell the scheduler that you have 4 active cores, so your effective limit will be 160... In addition, if the project runs out of GPU tasks, your host will start to use all the other cores during the outage.

          Just a warning: if you do this, take care not to set the cache size too large, as the host will request CPU tasks thinking they will be crunched 4 times faster, which can lead to missed deadlines.

          Of course, using app_info will require you to keep an eye on the forums to know if some app changed and/or if there are new ones...
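
          Something like this; the app name, file name, and version number here are placeholders, so copy the real ones from your client_state.xml:

              <!-- app_info.xml sketch: names and version are placeholders,
                   copy the real ones from your client_state.xml -->
              <app_info>
                <app>
                  <name>einsteinbinary_BRP4</name>
                </app>
                <file_info>
                  <name>BRP4_cuda_app.exe</name>
                  <executable/>
                </file_info>
                <app_version>
                  <app_name>einsteinbinary_BRP4</app_name>
                  <version_num>100</version_num>
                  <avg_ncpus>1.0</avg_ncpus>  <!-- reserve a full core per GPU task -->
                  <max_ncpus>1.0</max_ncpus>
                  <coproc>
                    <type>CUDA</type>
                    <count>0.33</count>  <!-- ~1/3 of the GPU each, i.e. 3 at once -->
                  </coproc>
                  <file_ref>
                    <file_name>BRP4_cuda_app.exe</file_name>
                    <main_program/>
                  </file_ref>
                </app_version>
              </app_info>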


