Panic Mode On (80) Server Problems?

Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 64545
Credit: 55,293,173
RAC: 49
United States
Message 1323289 - Posted: 1 Jan 2013, 16:52:25 UTC - in response to Message 1323277.  
Last modified: 1 Jan 2013, 16:53:14 UTC

Got my rigs setup on Einstein as well now. Can you run more than one WU at a time like with S@H?

And is there an optimized app for it?

1. Yes.
2. No.

On the 590 I process Einstein WUs 2 per GPU, and each one takes about 36-37 minutes to crunch through. You might not want to do more than 1 per GPU, since it's faster to do 1 at a time, but if you have 1.5GB to 2GB you could try it. To do so, you have to change the setting over at Einstein from 1.0 to 0.5; it can be found under the line below.

GPU utilization factor of BRP apps
DANGEROUS! Only touch this if you are absolutely sure of what you are doing!
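
As a rough illustration of the setting above, here is a minimal Python sketch. It assumes the utilization factor is the fraction of one GPU reserved per task, so that 0.5 means two tasks share a GPU; the function names are made up for this example. The only timing taken from the post is the 36-37 minutes per WU when running two at a time.

```python
import math


def tasks_per_gpu(utilization_factor):
    """Number of tasks that fit on one GPU for a given per-task factor."""
    if not 0 < utilization_factor <= 1:
        raise ValueError("utilization factor must be in (0, 1]")
    # The small epsilon guards against floating-point error in the division.
    return math.floor(1.0 / utilization_factor + 1e-9)


def effective_minutes_per_task(wall_minutes, concurrent):
    """Average wall-clock minutes per finished task when tasks run together."""
    return wall_minutes / concurrent


concurrent = tasks_per_gpu(0.5)
print(concurrent, effective_minutes_per_task(36.5, concurrent))
```

Whether two at a time actually beats one at a time depends on the single-task run time on the same card, which the post does not give.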

The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1323289 · Report as offensive
Team kizb

Joined: 8 Mar 01
Posts: 219
Credit: 3,709,162
RAC: 0
Germany
Message 1323293 - Posted: 1 Jan 2013, 16:59:48 UTC - in response to Message 1323289.  

Sounds like I should just let my 295s do 1 at a time then. Thanks for the information.
My Computers:
█ Blue Offline
█ Green Offline
█ Red Offline
ID: 1323293 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51407
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323296 - Posted: 1 Jan 2013, 17:01:34 UTC - in response to Message 1323284.  


Had to start the furnace this morning for the first time this year. (Was -6f here this morning.)

Posted: 1 Jan 2013 | 15:51:12 UTC

Well, duh!

Well, duh?
If my GPUs were not out of Seti work, I would not have had to.
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1323296 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 28538
Credit: 53,134,872
RAC: 32
United States
Message 1323300 - Posted: 1 Jan 2013, 17:05:25 UTC

You might want to check this discussion out next time there is an outage here and you are feeling lost because you can't find the post button.
http://boinc.berkeley.edu/dev/forum_thread.php?id=8105
Of course this is also on a computer at the SSL so it will be down during the Jan 4-6 power outage, but perhaps for less time than Seti is down.
ID: 1323300 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323304 - Posted: 1 Jan 2013, 17:29:10 UTC - in response to Message 1323296.  


Had to start the furnace this morning for the first time this year. (Was -6f here this morning.)

Posted: 1 Jan 2013 | 15:51:12 UTC

Well, duh!

Well, duh?
If my GPUs were not out of Seti work, I would not have had to.

But it was also the first chance you had to do it this year...
ID: 1323304 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51407
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323306 - Posted: 1 Jan 2013, 17:35:40 UTC - in response to Message 1323304.  


Had to start the furnace this morning for the first time this year. (Was -6f here this morning.)

Posted: 1 Jan 2013 | 15:51:12 UTC

Well, duh!

Well, duh?
If my GPUs were not out of Seti work, I would not have had to.

But it was also the first chance you had to do it this year...

LOL...got me there. I didn't realize I was being so punny.
I should have said it was the first time this WINTER I had to run the furnace.
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1323306 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51407
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323308 - Posted: 1 Jan 2013, 17:36:46 UTC - in response to Message 1323305.  
Last modified: 1 Jan 2013, 17:38:10 UTC

Well, duh? If my GPUs were not out of Seti work, I would not have had to.

Oh c'mon Mark, -6F is damn cold, even kitties need a bit of warmth :-)

No worries there.
They're normally snoozing on the nice warm waterbed. The temp controller is set to 82f, so they can stay as cozy as can be.
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1323308 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51407
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323349 - Posted: 1 Jan 2013, 19:03:03 UTC

Hmmmmmmmm.....
Seems to be a disturbance in the uploads on the Cricket graph.
Either something else fell over in the server closet, or..., could it be..., the Lone Ranger poking about on New Year's Day???
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1323349 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323354 - Posted: 1 Jan 2013, 19:18:00 UTC - in response to Message 1323308.  

A water bed....
I wonder if that would be a large enough thermal mass to act as a radiator for a water cooling system?
ID: 1323354 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51407
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323355 - Posted: 1 Jan 2013, 19:21:42 UTC - in response to Message 1323354.  

A water bed....
I wonder if that would be a large enough thermal mass to act as a radiator for a water cooling system?

I'm sure it could be, although you would not want the water to get up to 82f for most efficient water cooling.
Excuse me if I am hard to understand at times.......I've had a difficult few lives.

ID: 1323355 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323381 - Posted: 1 Jan 2013, 21:59:30 UTC - in response to Message 1323349.  

Hmmmmmmmm.....
Seems to be a disturbance in the uploads on the Cricket graph.
Either something else fell over in the server closet, or..., could it be..., the Lone Ranger poking about on New Year's Day???

Three hours later -- it seems like someone's in the control room but comments here have dried up. May we live in interesting times...
ID: 1323381 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1323383 - Posted: 1 Jan 2013, 22:05:17 UTC

Most of synergy's processes are not running,
gone all red,
and as for the Cricket, even the thin blue line has fallen off the page.
`we have been normalized`
Much the same as assimilated, but less work gets done.
ID: 1323383 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323390 - Posted: 1 Jan 2013, 22:24:29 UTC

I've got an experiment I'd really like to try, but I'm scared... :-)

My nVidia card has been out of work since early this morning, however, I've all these CPU 603s lying around. I checked, the files labeled 609 (cuda23) are identical to the ones labeled 603. So, I added this entry to the end of my app_info file;
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.100000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>

After restarting BOINC I received this:
1/1/2013 2:42:04 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 2:42:04 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

It would appear that this might work if I removed my CPU 603 entry from the top of my app_info file. Then again, I might lose all my 603 files, including the ones waiting to be reported. I have quite a few 603s already uploaded and waiting to be reported....

I also have a few of those nasty vlars that I've been suspending lest the nVidia card try working on one. So, what do you think? If I suspend CPU work & vlars, stop BOINC, remove the CPU entry, then restart, will I have success or will I lose all my 603 files? I was hoping to at least report the completions before trying this, however, I'm growing impatient.
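
For anyone wanting to check an app_info file before restarting, here is a minimal Python sketch that looks for the kind of collision the log complains about. It assumes entries are told apart by app name, version number and an optional plan_class element; that key is an assumption for illustration, not taken from the client source. The platform is left out because the entry above doesn't specify one, and the sample XML is a trimmed-down, hypothetical file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_app_versions(app_info_xml):
    """Return (app_name, version_num, plan_class) keys that occur more than once."""
    root = ET.fromstring(app_info_xml)
    keys = []
    for entry in root.iter("app_version"):
        # findtext returns None when the element is absent; treat that as "".
        name = (entry.findtext("app_name") or "").strip()
        version = (entry.findtext("version_num") or "").strip()
        plan_class = (entry.findtext("plan_class") or "").strip()
        keys.append((name, version, plan_class))
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical, trimmed-down app_info.xml: a CPU entry and a CUDA entry
# that both claim version 603, as in the experiment above.
SAMPLE = """
<app_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
    </app_version>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>603</version_num>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
    </app_version>
</app_info>
"""

print(duplicate_app_versions(SAMPLE))
```

In this sketch, two entries that differ only in their coproc block still collide, which is consistent with the duplicate app version error shown in the log.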
ID: 1323390 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 28538
Credit: 53,134,872
RAC: 32
United States
Message 1323405 - Posted: 1 Jan 2013, 23:16:07 UTC - in response to Message 1323400.  

The feeder and the transitioners are not running...


The Cardinal is beating the Badger at the Rose Bowl. As Eric is a Badger, you had better root for them or it is likely to stay offline for a while. ;-)

ID: 1323405 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323413 - Posted: 1 Jan 2013, 23:56:49 UTC - in response to Message 1323390.  

I've got an experiment I'd really like to try, but I'm scared... :-)

My nVidia card has been out of work since early this morning, however, I've all these CPU 603s lying around. I checked, the files labeled 609 (cuda23) are identical to the ones labeled 603. So, I added this entry to the end of my app_info file;
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.100000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>

After restarting BOINC I received this:
1/1/2013 2:42:04 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 2:42:04 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

It would appear that this might work if I removed my CPU 603 entry from the top of my app_info file. Then again, I might lose all my 603 files, including the ones waiting to be reported. I have quite a few 603s already uploaded and waiting to be reported....

I also have a few of those nasty vlars that I've been suspending lest the nVidia card try working on one. So, what do you think? If I suspend CPU work & vlars, stop BOINC, remove the CPU entry, then restart, will I have success or will I lose all my 603 files? I was hoping to at least report the completions before trying this, however, I'm growing impatient.

Success!

I chose a safer route and merely moved the CPU entry to the bottom of my app_info file, below the nVidia entry. Then suspended all remaining 603s and restarted BOINC. Once again I received;
1/1/2013 6:12:00 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 6:12:00 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

I then resumed one non-vlar 603 and the nVidia app started the task. I then resumed another non-vlar 603 to see if a CPU would start the task; it didn't. The first 603, with an estimate of 2 hours, finished in 27 minutes. It's on the 2nd 603 now. The only downside I see is that this might have a negative effect on the CPU estimated times for the 603s. The upside is, your nVidia card now has twice its imposed limit of 100 units.

There has to be something I'm missing...
:-)
ID: 1323413 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323420 - Posted: 2 Jan 2013, 0:57:27 UTC - in response to Message 1323413.  
Last modified: 2 Jan 2013, 1:00:14 UTC

There has to be something I'm missing...
:-)

Perhaps this.
ID: 1323420 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1323422 - Posted: 2 Jan 2013, 1:01:08 UTC - in response to Message 1323413.  
Last modified: 2 Jan 2013, 1:06:34 UTC

Just a note ..

If you do something like this, you (not only for these WUs, but maybe also for a few that follow) and your wingmen will get fewer credits.

The tasks are still marked online as CPU WUs, but they finish much faster (on the GPU) than CPU WUs normally do, so you get fewer credits.
Because SAH grants the lowest credit of the paired results of one WU, the wingmen on all these WUs are also granted lower credits.

Because of this, using e.g. BoincRescheduler has no longer been recommended since CreditNew.

Instead, let secondary BOINC projects crunch, leave the client_state.xml file as it is, and don't edit the app_info.xml file.
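
The lowest-of-the-pair rule described above can be sketched in a few lines of Python. This models only the rule as stated in this post, not the actual CreditNew algorithm, and the credit figures are hypothetical numbers chosen for illustration.

```python
def granted_credit(claimed_credits):
    """Credit granted to every result of one WU under the lowest-claim rule."""
    if not claimed_credits:
        raise ValueError("need at least one claimed credit")
    return min(claimed_credits)


# Hypothetical claims: a wingman's normal CPU result and a CPU-labelled
# task that actually finished quickly on the GPU.
wingman_claim = 95.0
rescheduled_claim = 40.0
print(granted_credit([wingman_claim, rescheduled_claim]))
```

Under this simplified rule both hosts receive the smaller figure, which is why the wingman is affected as well.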


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1323422 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1323430 - Posted: 2 Jan 2013, 1:18:51 UTC
Last modified: 2 Jan 2013, 1:22:40 UTC

As many of us know, the Cricket graphs tanked at about 1850 PST Monday night. Something crashed hard.

At about 1100 PST today, I got bounced out of the Cafe forum with the message "Project is temporarily down for maintenance." The SSP showed everything except the 4 main databases as "Disabled". About 1500 PST many of the server functions were back online, and I was able to return to the Forums. As of now more functions are showing green, including the upload, download, and Scheduler, but the Feeder is still off-line and the Cricket graphs are still down in the noise.

I suspect one of the guys came in to do a restart, but things are not coming back up smoothly. Hope it all gets sorted soon, but if not, I've got enough work to last until Thursday.
Donald
Infernal Optimist / Submariner, retired
ID: 1323430 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323432 - Posted: 2 Jan 2013, 1:22:20 UTC - in response to Message 1323420.  

There has to be something I'm missing...
:-)

Perhaps this.

Hummm, that sounds more complicated than simply adding that version entry to the nVidia entry, then removing it when the servers are back...
ID: 1323432 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323435 - Posted: 2 Jan 2013, 1:34:30 UTC - in response to Message 1323422.  

Just a note ..

If you do something like this, you (not only for these WUs, but maybe also for a few that follow) and your wingmen will get fewer credits.

The tasks are still marked online as CPU WUs, but they finish much faster (on the GPU) than CPU WUs normally do, so you get fewer credits.
Because SAH grants the lowest credit of the paired results of one WU, the wingmen on all these WUs are also granted lower credits.

Because of this, using e.g. BoincRescheduler has no longer been recommended since CreditNew.

Instead, let secondary BOINC projects crunch, leave the client_state.xml file as it is, and don't edit the app_info.xml file.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *

Interesting. So, using your logic, we shouldn't use the optimized Apps at all. Seeing as how they drastically reduce the work time and result in reduced credits and all. If I wasn't using the optimized Lunatics CPU app, that CPU 603 task would take close to 5 hours instead of the 2 hours it does with the optimized CPU app. Do you recommend not using the optimized Apps? How about faster video cards/CPUs, they also drastically reduce work time... I'm really not interested in any other Projects.
ID: 1323435 · Report as offensive


 
©2021 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.