Using dual GTX 590's



Message boards : Cruncher's Corner : Using dual GTX 590's

lupo
Joined: Sep 8 10
Posts: 1
Credit: 1,760,229
RAC: 0
Message 114132 - Posted 21 Sep 2011 6:16:08 UTC

    Has anyone had any luck running dual GTX 590's? This would effectively give me the equivalent of 4 cards using only two PCI-E slots.

    Any thoughts, experiences, or recommendations?

    Thanks!

    Bikeman (Heinz-Bernd Eggenstein)
    Forum moderator
    Project administrator
    Project developer
    Joined: Aug 28 06
    Posts: 3225
    Credit: 72,696,832
    RAC: 32,541
    Message 114139 - Posted 22 Sep 2011 0:46:46 UTC - in response to Message 114132.

      Last modified: 22 Sep 2011 1:03:23 UTC

      Hi!

      If you follow the Statistics -> Top Hosts link on the project home page, you'll see some comparable hosts. I even found one with two such cards:


      http://einstein.phys.uwm.edu/show_host_detail.php?hostid=4207502

      That PC is not "hidden" so you can try to send a PM to the owner.

      To get the most performance, you will want to experiment with app_info.xml files that allow you to run more than one GPU task per physical GPU. Each task then takes longer to finish, but overall throughput is higher. More info is available in this forum. This PC is an example in the top 20 that seems to use this kind of app_info.xml file on a single GTX 590:

      http://einstein.phys.uwm.edu/show_host_detail.php?hostid=4095562

      This should give you an idea of the performance to expect from this combination.

      There are also some threads here discussing cooling of PCs :-). If I'm not mistaken, your system will consume (altogether) > 1000 W (!!) and dissipate heat like a hair dryer.

      Please let us know if you have specific questions.

      HBE
      ____________

      Jeroen
      Joined: Nov 25 05
      Posts: 302
      Credit: 328,882,503
      RAC: 344,095
      Message 114140 - Posted 22 Sep 2011 1:12:59 UTC - in response to Message 114132.

        I have not tried this but have thought of doing a setup like this in the past. My main concern was bandwidth on the PCI-E bus. When running multiple work units per GPU, ideally each GPU would have a 16x slot set at 16x. With the 590, two GPUs would have to share the same slot.

        A much lower-cost alternative would be the GTX 295 dual-GPU card. It actually makes for a decent cruncher with this project. Last time I tried, I was seeing processing times of around 2000 seconds in Linux with one card installed and running a single work unit per GPU. With two cards (four GPUs) in 16x slots, in theory you could process about 172 work units per day, provided two cards are able to perform as well as a single card by itself. A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.
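
        The 172 figure follows from simple arithmetic, sketched here in Python. The function name and parameters are illustrative only; the 2000-second time and the four GPUs (two dual-GPU cards) come from the post, and the result assumes every GPU sustains the single-card time.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def work_units_per_day(seconds_per_wu, gpus, wus_per_gpu=1):
    """Ideal daily throughput if every GPU sustains the same per-task time."""
    return SECONDS_PER_DAY / seconds_per_wu * gpus * wus_per_gpu


# Two GTX 295 cards = 4 GPUs, one work unit per GPU, ~2000 s each
# (figures taken from the post; real throughput may well be lower).
estimate = work_units_per_day(seconds_per_wu=2000, gpus=4)
print(f"{estimate:.1f} work units per day")
```

        This is an upper bound: it ignores PCI-E contention, task start-up time, and any slowdown from sharing one CPU between four GPUs.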

        Stranger7777
        Joined: Mar 17 05
        Posts: 321
        Credit: 93,601,561
        RAC: 77,297
        Message 114212 - Posted 27 Sep 2011 18:13:31 UTC - in response to Message 114140.

          A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.

          And what is the mean daily RAC for the host?

          Fred J. Verster
          Joined: Apr 27 08
          Posts: 114
          Credit: 20,727,834
          RAC: 0
          Message 114213 - Posted 27 Sep 2011 18:52:20 UTC - in response to Message 114212.

            Last modified: 27 Sep 2011 18:57:40 UTC

            Most motherboards that have (at least) 2 PCIe 2.0 16x slots will run them in 8x mode if 2 (or 3)
            GPUs are used. But there are a lot of differences involved.

            My (2) ASUS P5E mobos run 2 ATI cards in PCIe 2.0 16x mode, but not 2 NVIDIA GPUs.
            My Intel DP67BG mobo runs 2 ATI 5870 GPUs in PCIe 2.0 8x mode, which is enough
            even when 2 (or more) WUs are being crunched at the same time. (Well, with 3, you'll notice
            some extra time!)

            I run 2 rigs, 1 Q6600+GTX470 and 1 X9650+GTX480, but only one WU per GPU.
            Last time I made an app_info.xml file, I made a typo....................
            (Although I used the same "names" as in the SETI app_info.xml?)

            I run SETI with 2 WUs per GPU with no problem and would like to do this on Einstein as well.
            Can someone help me with an example for these 2 Fermis? With the low GPU use
            there should be no problem, because on the 480 I did run 2 per GPU.
            ____________

            Knight who says Ni N! N!

            Jeroen
            Joined: Nov 25 05
            Posts: 302
            Credit: 328,882,503
            RAC: 344,095
            Message 114215 - Posted 27 Sep 2011 23:00:11 UTC - in response to Message 114212.

              A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.

              And what is the mean daily RAC for the host?


              This system has the potential for ~44,000 RAC with the single 295. Unfortunately I don't have enough available bandwidth to feed this system so I have it offline currently.

              Stranger7777
              Joined: Mar 17 05
              Posts: 321
              Credit: 93,601,561
              RAC: 77,297
              Message 114261 - Posted 1 Oct 2011 9:37:13 UTC - in response to Message 114215.

                A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.

                And what is the mean daily RAC for the host?


                This system has the potential for ~44,000 RAC with the single 295. Unfortunately I don't have enough available bandwidth to feed this system so I have it offline currently.


                That's great! I cannot yet make my i5 (Sandy Bridge) host do more than 16000 daily with a GTS 450 1 GB. Cannot understand why. :(

                FrankHagen
                Joined: Feb 13 08
                Posts: 102
                Credit: 60,245
                RAC: 14
                Message 114262 - Posted 1 Oct 2011 9:48:58 UTC - in response to Message 114261.

                  A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.

                  And what is the mean daily RAC for the host?


                  This system has the potential for ~44,000 RAC with the single 295. Unfortunately I don't have enough available bandwidth to feed this system so I have it offline currently.


                  That's great! I cannot yet make my i5 (Sandy Bridge) host do more than 16000 daily with a GTS 450 1 GB. Cannot understand why. :(


                  do you have the original version with 192 cores or the OEM-one with only 144?

                  oh, and a GTX295 has 2*240 cores.. ;)

                  Bikeman (Heinz-Bernd Eggenstein)
                  Forum moderator
                  Project administrator
                  Project developer
                  Joined: Aug 28 06
                  Posts: 3225
                  Credit: 72,696,832
                  RAC: 32,541
                  Message 114276 - Posted 1 Oct 2011 18:21:59 UTC - in response to Message 114262.


                    do you have the original version with 192 cores or the OEM-one with only 144?

                    oh, and a GTX295 has 2*240 cores.. ;)


                    You can check the number of cores in the log output of BRP4 tasks; it's in one of the first lines printed. In this case:


                    [20:35:28][4100][INFO ] Using CUDA device #0 "GeForce GTS 450" (192 CUDA cores / 622.08 GFLOPS)


                    I think 16k per day isn't too bad. How many GPU tasks are you running in parallel (I'd guess two), and how many CPU cores are crunching CPU tasks at the same time?
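
                    For anyone who wants to pull the core count out of the task output automatically, here is a minimal Python sketch. The regular expression is written only against the single log line quoted above; the exact format may differ between application versions, so treat the pattern and the helper name as assumptions.

```python
import re

# The line quoted in the post, used here as sample input.
LOG_LINE = (
    '[20:35:28][4100][INFO ] Using CUDA device #0 '
    '"GeForce GTS 450" (192 CUDA cores / 622.08 GFLOPS)'
)

# Pattern inferred from that one sample line (an assumption, not a spec).
DEVICE_RE = re.compile(
    r'Using CUDA device #(?P<index>\d+) "(?P<name>[^"]+)" '
    r'\((?P<cores>\d+) CUDA cores / (?P<gflops>[\d.]+) GFLOPS\)'
)


def parse_device_line(line):
    """Return device details as a dict, or None if the line does not match."""
    match = DEVICE_RE.search(line)
    if match is None:
        return None
    return {
        "index": int(match["index"]),
        "name": match["name"],
        "cores": int(match["cores"]),
        "gflops": float(match["gflops"]),
    }


info = parse_device_line(LOG_LINE)
print(info)
```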

                    CU
                    HB
                    ____________

                    Sid
                    Joined: Oct 17 10
                    Posts: 89
                    Credit: 48,005,187
                    RAC: 48,940
                    Message 114278 - Posted 1 Oct 2011 18:51:10 UTC - in response to Message 114261.



                      That's great! I cannot yet make my i5 (Sandy Bridge) host do more than 16000 daily with a GTS 450 1 GB. Cannot understand why. :(


                      I'm using a GTX 560 Ti with 2 GB of memory, so I'm running 6 tasks simultaneously. For BRP4 with 6 tasks, the best time on my computer is 2:30. So theoretically I can get 24 / 2:30 * 6 * 500 = 28800 credits per day.
                      Nevertheless, for some reason I don't.
                      I've tried:
                      1. Switching to Linux - no gain for my configuration.
                      2. Using only one PCIe x16 slot - gained only a couple of percent.

                      My guess: BOINC spends a lot of time finishing one task and starting the next one.
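
                      The theoretical figure in this post can be reproduced with a short Python sketch. It reads 2:30 as 2.5 hours per batch of six tasks and takes 500 credits per task from the formula above; the function and parameter names are illustrative only.

```python
def credits_per_day(hours_per_batch, tasks_in_parallel, credits_per_task):
    """Ideal daily credit if batches run back to back with no idle time."""
    return 24 * tasks_in_parallel * credits_per_task / hours_per_batch


# 2:30 (2.5 h) per batch, 6 tasks at once, 500 credits per task,
# all taken from the post.
theoretical = credits_per_day(hours_per_batch=2.5,
                              tasks_in_parallel=6,
                              credits_per_task=500)
print(f"{theoretical:.0f} credits per day")
```

                      Because the formula assumes the GPU is never idle and every result is credited immediately, the observed figure will normally sit below it.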

                      Bikeman (Heinz-Bernd Eggenstein)
                      Forum moderator
                      Project administrator
                      Project developer
                      Joined: Aug 28 06
                      Posts: 3225
                      Credit: 72,696,832
                      RAC: 32,541
                      Message 114284 - Posted 1 Oct 2011 23:49:05 UTC - in response to Message 114278.

                        Last modified: 1 Oct 2011 23:51:34 UTC


                        I'm using a GTX 560 Ti with 2 GB of memory, so I'm running 6 tasks simultaneously. For BRP4 with 6 tasks, the best time on my computer is 2:30. So theoretically I can get 24 / 2:30 * 6 * 500 = 28800 credits per day.
                        Nevertheless, for some reason I don't.
                        I've tried:
                        1. Switching to Linux - no gain for my configuration.
                        2. Using only one PCIe x16 slot - gained only a couple of percent.

                        My guess: BOINC spends a lot of time finishing one task and starting the next one.



                        Highly unlikely. In fact, your host is doing very well the last few days, but the RAC displayed here is effectively a running average over several weeks and will take some time to adjust.

                        This is what BOINCstats reports for that host under Windows:



                        I think that's quite OK :-) A second effect is that your pending results are not in equilibrium if you switch platforms (Windows / Linux), so that also takes some time to level out in the stats.
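
                        To get a feel for how slowly the RAC adjusts, here is a rough Python sketch. It assumes the RAC behaves like an exponentially weighted average with a one-week half-life, which is a simplification of how BOINC actually updates it; the starting value of 0 and the 28800 target are hypothetical inputs, and validation delays are ignored.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumption: one-week half-life for the average


def rac_after(days, old_rate, new_rate, half_life=HALF_LIFE_DAYS):
    """RAC after `days` of steady output at new_rate, starting at old_rate."""
    decay = 0.5 ** (days / half_life)
    return new_rate + (old_rate - new_rate) * decay


def days_to_reach(fraction, half_life=HALF_LIFE_DAYS):
    """Days until the RAC has closed `fraction` of the gap to the new rate."""
    return half_life * math.log2(1.0 / (1.0 - fraction))


# Hypothetical host that starts from zero and then earns a steady
# 28800 credits per day.
for day in (7, 14, 21, 28):
    print(day, round(rac_after(day, old_rate=0, new_rate=28800)))

print(f"{days_to_reach(0.9):.1f} days to close 90% of the gap")
```

                        Under these assumptions it takes more than three weeks to get within 10% of the new level, which fits the "several weeks" mentioned above.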

                        CU
                        HBE
                        ____________

                        Stranger7777
                        Joined: Mar 17 05
                        Posts: 321
                        Credit: 93,601,561
                        RAC: 77,297
                        Message 114298 - Posted 2 Oct 2011 10:03:10 UTC

                          I'm now running 3 CUDA tasks simultaneously. For the last 2 weeks I was running only 2 tasks in parallel, but the RAC went down to 12000-14000, so I added one more task. The i5 runs only 3 tasks at once at 80% load. I will have to watch it for a couple of weeks. If the RAC does not go up, I'll have it run 4 tasks again or raise the load to 100%. The card has 192 cores - it is a Gigabyte. The chip temperature is 68 C.

                          Jeroen
                          Joined: Nov 25 05
                          Posts: 302
                          Credit: 328,882,503
                          RAC: 344,095
                          Message 114308 - Posted 2 Oct 2011 18:38:14 UTC - in response to Message 114261.

                            Last modified: 2 Oct 2011 18:39:46 UTC

                            A decent CPU overclock also helps performance. In this case, I had my i7 set to 4.3 GHz.

                            And what is the mean daily RAC for the host?


                            This system has the potential for ~44,000 RAC with the single 295. Unfortunately I don't have enough available bandwidth to feed this system so I have it offline currently.


                            That's great! I cannot yet make my i5 (Sandy Bridge) host do more than 16000 daily with a GTS 450 1 GB. Cannot understand why. :(


                            If you can switch to Linux, you should be able to get a bit of a boost in GPU performance. Is the video card in a 16x slot that is set to 16x? If you are running any CPU work units, you could try temporarily disabling CPU work to see if that helps performance. One other thing you can try is to install Process Lasso and give the GPU work units higher priority.

                            Bikeman (Heinz-Bernd Eggenstein)
                            Forum moderator
                            Project administrator
                            Project developer
                            Joined: Aug 28 06
                            Posts: 3225
                            Credit: 72,696,832
                            RAC: 32,541
                            Message 114309 - Posted 2 Oct 2011 18:50:38 UTC

                              Hi all!

                              Just a little reminder that RAC as shown by the project is not a good metric for optimization. For example, BRP4 tasks are now being distributed to CPU-only hosts again as well, which will temporarily decrease the RAC for GPU crunchers, as the average time it takes your wingman to validate your result will increase. After some days it will go back to equilibrium.

                              CU
                              HB
                              ____________

                              hotze33
                              Joined: Nov 10 04
                              Posts: 91
                              Credit: 75,096,034
                              RAC: 5,575
                              Message 114364 - Posted 5 Oct 2011 6:21:59 UTC

                                Hi,
                                just one additional piece of information to consider: a 44k RAC machine uses roughly 80 GB of traffic per month. Just make sure your ISP has no problems with it.
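
                                As a rough cross-check of that figure, the Python sketch below works out the download size per task that an 80 GB monthly total would imply. It borrows the 500 credits per task from Sid's formula earlier in the thread and assumes a 30-day month, 1 GB = 1000 MB, and a RAC equal to the steady daily credit; all names are illustrative.

```python
def implied_mb_per_task(rac, credits_per_task, gb_per_month, days=30):
    """Average traffic per task implied by a monthly total."""
    tasks_per_month = rac / credits_per_task * days
    return gb_per_month * 1000 / tasks_per_month


def monthly_traffic_gb(rac, credits_per_task, mb_per_task, days=30):
    """Monthly traffic for a host that sustains the given RAC."""
    tasks_per_day = rac / credits_per_task
    return tasks_per_day * mb_per_task * days / 1000


# 44k RAC and 80 GB/month from the post; 500 credits per task is borrowed
# from an earlier post and may not hold for every task type.
per_task = implied_mb_per_task(rac=44000, credits_per_task=500, gb_per_month=80)
print(f"about {per_task:.0f} MB per task")

# The same helper can be pointed at a smaller host, e.g. 16000 RAC.
print(f"{monthly_traffic_gb(16000, 500, per_task):.0f} GB per month")
```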
                                ____________


                                This material is based upon work supported by the National Science Foundation (NSF) under Grants PHY-1104902, PHY-1104617 and PHY-1105572 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

                                Copyright © 2014 Bruce Allen