Panic Mode On (80) Server Problems?

Message boards : Number crunching : Panic Mode On (80) Server Problems?

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323308 - Posted: 1 Jan 2013, 17:36:46 UTC - in response to Message 1323305.  
Last modified: 1 Jan 2013, 17:38:10 UTC

Well, duh? If my GPUs were not out of Seti work, I would not have had to.

Oh c'mon Mark, -6F is damn cold, even kitties need a bit of warmth :-)

No worries there.
They're normally snoozing on the nice warm waterbed. The temp controller is set to 82°F, so they can stay as cozy as can be.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1323308 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323349 - Posted: 1 Jan 2013, 19:03:03 UTC

Hmmmmmmmm.....
Seems to be a disturbance in the uploads on the Cricket graph.
Either something else fell over in the server closet, or..., could it be..., the Lone Ranger poking about on New Year's Day???
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1323349 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323354 - Posted: 1 Jan 2013, 19:18:00 UTC - in response to Message 1323308.  

A water bed....
I wonder if that would be a large enough thermal mass to act as a radiator for a water cooling system?
ID: 1323354 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1323355 - Posted: 1 Jan 2013, 19:21:42 UTC - in response to Message 1323354.  

A water bed....
I wonder if that would be a large enough thermal mass to act as a radiator for a water cooling system?

I'm sure it could be, although you would not want the water to get up to 82°F for the most efficient water cooling.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1323355 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323381 - Posted: 1 Jan 2013, 21:59:30 UTC - in response to Message 1323349.  

Hmmmmmmmm.....
Seems to be a disturbance in the uploads on the Cricket graph.
Either something else fell over in the server closet, or..., could it be..., the Lone Ranger poking about on New Year's Day???

Three hours later -- it seems like someone's in the control room but comments here have dried up. May we live in interesting times...
ID: 1323381 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1323383 - Posted: 1 Jan 2013, 22:05:17 UTC

Most of synergy's processes are not running,
gone all red,
and as for the Cricket graph, even the thin blue line has fallen off the page.
"We have been normalized":
much the same as assimilated, but less work gets done.
ID: 1323383 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323390 - Posted: 1 Jan 2013, 22:24:29 UTC

I've got an experiment I'd really like to try, but I'm scared... :-)

My nVidia card has been out of work since early this morning; however, I have all these CPU 603s lying around. I checked, and the files labeled 609 (cuda23) are identical to the ones labeled 603. So, I added this entry to the end of my app_info file:
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.100000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>

After restarting BOINC I received this:
1/1/2013 2:42:04 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 2:42:04 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

It appears this might work if I removed my CPU 603 entry from the top of my app_info file. Then again, I might lose all my 603 files, including the ones waiting to be reported. I have quite a few 603s already uploaded and waiting to be reported....

I also have a few of those nasty vlars that I've been suspending lest the nVidia card try working on one. So, what do you think? If I suspend CPU work & vlars, stop BOINC, remove the CPU entry, then restart, will I have success or will I lose all my 603 files? I was hoping to at least report the completions before trying this; however, I'm growing impatient.
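
For anyone wanting to check an app_info file for this kind of clash before restarting the client, here is a minimal sketch. It assumes the client treats two <app_version> entries as duplicates when they share app_name, version_num and plan_class (the log line above only shows the app name, platform and version, so the plan_class part is an assumption). The function name and the trimmed-down sample file are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_app_versions(app_info_xml: str):
    """Return the (app_name, version_num, plan_class) keys that occur more than once."""
    root = ET.fromstring(app_info_xml)
    counts = Counter(
        (
            (av.findtext("app_name", "") or "").strip(),
            (av.findtext("version_num", "") or "").strip(),
            (av.findtext("plan_class", "") or "").strip(),
        )
        for av in root.iter("app_version")
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical, trimmed-down app_info.xml: a CPU 603 entry, the added
# CUDA entry that reuses version 603, and a separate 609 (cuda23) entry.
SAMPLE = """<app_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <coproc><type>CUDA</type><count>1</count></coproc>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>609</version_num>
    <plan_class>cuda23</plan_class>
  </app_version>
</app_info>"""

print(duplicate_app_versions(SAMPLE))
```

In this sample the two 603 entries collide because neither carries a plan_class, while the 609 entry is left alone.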
ID: 1323390 · Report as offensive
Profile Gary Charpentier (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 25 Dec 00
Posts: 31505
Credit: 53,134,872
RAC: 32
United States
Message 1323405 - Posted: 1 Jan 2013, 23:16:07 UTC - in response to Message 1323400.  

The feeder and the transitioners are not running...


The Cardinal is beating the Badger at the Rose Bowl. As Eric is a Badger, you had better root for them or it is likely to stay offline for a while. ;-)

ID: 1323405 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323413 - Posted: 1 Jan 2013, 23:56:49 UTC - in response to Message 1323390.  

I've got an experiment I'd really like to try, but I'm scared... :-)

My nVidia card has been out of work since early this morning; however, I have all these CPU 603s lying around. I checked, and the files labeled 609 (cuda23) are identical to the ones labeled 603. So, I added this entry to the end of my app_info file:
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <avg_ncpus>0.040000</avg_ncpus>
    <max_ncpus>0.100000</max_ncpus>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>Lunatics_x41g_win32_cuda32.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart32_32_16.dll</file_name>
    </file_ref>
    <file_ref>
        <file_name>cufft32_32_16.dll</file_name>
    </file_ref>
</app_version>

After restarting BOINC I received this:
1/1/2013 2:42:04 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 2:42:04 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

It appears this might work if I removed my CPU 603 entry from the top of my app_info file. Then again, I might lose all my 603 files, including the ones waiting to be reported. I have quite a few 603s already uploaded and waiting to be reported....

I also have a few of those nasty vlars that I've been suspending lest the nVidia card try working on one. So, what do you think? If I suspend CPU work & vlars, stop BOINC, remove the CPU entry, then restart, will I have success or will I lose all my 603 files? I was hoping to at least report the completions before trying this; however, I'm growing impatient.

Success!

I chose a safer route and merely moved the CPU entry to the bottom of my app_info file, below the nVidia entry. I then suspended all remaining 603s and restarted BOINC. Once again I received:
1/1/2013 6:12:00 PM | SETI@home | Found app_info.xml; using anonymous platform
1/1/2013 6:12:00 PM | SETI@home | [error] State file error: duplicate app version: setiathome_enhanced windows_intelx86 603

I then resumed one non-vlar 603 and the nVidia app started the task. I then resumed another non-vlar 603 to see if a CPU would start the task; it didn't. The first 603, with an estimate of 2 hours, finished in 27 minutes. It's on the 2nd 603 now. The only downside I see is that this might have a negative effect on the CPU estimated times for the 603s. The upside is that your nVidia card now has twice its imposed limit of 100 units.

There has to be something I'm missing...
:-)
ID: 1323413 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323420 - Posted: 2 Jan 2013, 0:57:27 UTC - in response to Message 1323413.  
Last modified: 2 Jan 2013, 1:00:14 UTC

There has to be something I'm missing...
:-)

Perhaps this.
ID: 1323420 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1323422 - Posted: 2 Jan 2013, 1:01:08 UTC - in response to Message 1323413.  
Last modified: 2 Jan 2013, 1:06:34 UTC

Just a note ..

If you do something like this, you and your wingmen will get fewer Credits (not only for these WUs, but maybe also for a few following ones).

The tasks are still marked online as CPU WUs, but they finish much faster (on the GPU) than CPU WUs normally do, so you get fewer Credits.
Because SAH grants the lowest Credits of the paired results of one WU, the wingmen of all these WUs are also granted lower Credits.

Because of this, it has no longer been recommended to use e.g. BoincRescheduler since CreditNew.

Instead, e.g. let secondary BOINC projects crunch, leave the client_state.XML file as it is, and don't edit the app_info.XML file.
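
The "lowest of the pair" rule described above can be sketched like this. It is only a toy version of the rule as stated in this post, not the project's actual validator code, and the claim numbers are made up for illustration.

```python
def granted_credit(claims):
    """Toy rule from the post: every result of a WU is granted the lowest claim."""
    if not claims:
        raise ValueError("need at least one claimed credit value")
    return min(claims)


# Hypothetical claims: a wingman's normal CPU result and a rescheduled
# result that finished much faster than a CPU task normally would.
wingman_claim = 95.0
rescheduled_claim = 28.0

print(granted_credit([wingman_claim, rescheduled_claim]))
```

Under this rule, both hosts are granted the smaller value, which is why the wingman also loses out.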


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1323422 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1323430 - Posted: 2 Jan 2013, 1:18:51 UTC
Last modified: 2 Jan 2013, 1:22:40 UTC

As many of us know, the Cricket graphs tanked at about 1850 PST Monday night. Something crashed hard.

At about 1100 PST today, I got bounced out of the Cafe forum with the message "Project is temporarily down for maintenance." The SSP showed everything except the 4 main databases as "Disabled". About 1500 PST many of the server functions were back online, and I was able to return to the Forums. As of now more functions are showing green, including the upload, download, and Scheduler, but the Feeder is still off-line and the Cricket graphs are still down in the noise.

I suspect one of the guys came in to do a restart, but things are not coming back up smoothly. Hope it all gets sorted soon, but if not, I've got enough work to last until Thursday.
Donald
Infernal Optimist / Submariner, retired
ID: 1323430 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323432 - Posted: 2 Jan 2013, 1:22:20 UTC - in response to Message 1323420.  

There has to be something I'm missing...
:-)

Perhaps this.

Hummm, that sounds more complicated than simply adding that version entry to the nVidia entry, then removing it when the servers are back...
ID: 1323432 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323435 - Posted: 2 Jan 2013, 1:34:30 UTC - in response to Message 1323422.  

Just a note ..

If you do something like this, you and your wingmen will get fewer Credits (not only for these WUs, but maybe also for a few following ones).

The tasks are still marked online as CPU WUs, but they finish much faster (on the GPU) than CPU WUs normally do, so you get fewer Credits.
Because SAH grants the lowest Credits of the paired results of one WU, the wingmen of all these WUs are also granted lower Credits.

Because of this, it has no longer been recommended to use e.g. BoincRescheduler since CreditNew.

Instead, e.g. let secondary BOINC projects crunch, leave the client_state.XML file as it is, and don't edit the app_info.XML file.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *

Interesting. So, using your logic, we shouldn't use the optimized Apps at all. Seeing as how they drastically reduce the work time and result in reduced credits and all. If I wasn't using the optimized Lunatics CPU app, that CPU 603 task would take close to 5 hours instead of the 2 hours it does with the optimized CPU app. Do you recommend not using the optimized Apps? How about faster video cards/CPUs, they also drastically reduce work time... I'm really not interested in any other Projects.
ID: 1323435 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1323436 - Posted: 2 Jan 2013, 1:36:43 UTC - in response to Message 1323430.  

At which point the entire lab goes offline for a quarterly outage to rebuild fragile electrical infrastructure again.

January looks to be a rather depressing month for SETI addicts.




I suspect one of the guys came in to do a restart, but things are not coming back up smoothly. Hope it all gets sorted soon, but if not, I've got enough work to last until Thursday.

ID: 1323436 · Report as offensive
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1323437 - Posted: 2 Jan 2013, 1:44:26 UTC - in response to Message 1323436.  

At which point the entire lab goes offline for a quarterly outage to rebuild fragile electrical infrastructure again.

January looks to be a rather depressing month for SETI addicts.

I suspect one of the guys came in to do a restart, but things are not coming back up smoothly. Hope it all gets sorted soon, but if not, I've got enough work to last until Thursday.

Oh, yeah. Well, then I hope we can get work Wednesday. If not, I'll have to open the vent damper into my study, and let the furnace heat the room instead of my two Wintel crunchers. My ancient Mac G4s in the dining room have enough work to last until Monday.
Donald
Infernal Optimist / Submariner, retired
ID: 1323437 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1323443 - Posted: 2 Jan 2013, 2:19:21 UTC - in response to Message 1323435.  
Last modified: 2 Jan 2013, 2:31:53 UTC

TBar wrote:
Interesting. So, using your logic, we shouldn't use the optimized Apps at all. Seeing as how they drastically reduce the work time and result in reduced credits and all. If I wasn't using the optimized Lunatics CPU app, that CPU 603 task would take close to 5 hours instead of the 2 hours it does with the optimized CPU app. Do you recommend not using the optimized Apps? How about faster video cards/CPUs, they also drastically reduce work time... I'm really not interested in any other Projects.


No.

If your PC calculates with the stock project apps, they have their own:
Host/Application details/'Average processing rate'
.. after 10 granted results, the Credits 'should' be correct.
E.g. 'SETI@home Enhanced 6.03 windows_intelx86'.

After installation of the opt. project apps, they get their own:
Host/Application details/'Average processing rate'
.. after 10 granted results, the Credits 'should' be correct.
E.g. 'SETI@home Enhanced (anonymous platform, CPU)'.

If you rename the WUs in BOINC (CPU -> GPU, GPU -> CPU), the SAH server doesn't get any info about this.
Online, the WUs are still CPU WUs and the calculation time is much shorter than the estimated calculation time, which results in fewer Credits.

No, if you rename GPU WUs to CPU WUs, you don't get more Credits.

After introduction of the new Credit system 'CreditNew' at SAH, Fred added the info:
Warning, rescheduling may result in less credits, for you or your wing man.
Try NOT to use it, unless there is no other way.
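
A rough way to see the effect, as a sketch only: assume the claim scales with elapsed time multiplied by the processing rate the server has on record for the app version the task is booked under. This is a deliberately simplified toy model, not the real CreditNew formula, and the CPU rate below is a made-up number. The 2-hour estimate and 27-minute run time are the figures TBar reported earlier in the thread.

```python
def toy_claim(elapsed_seconds, assumed_flops):
    """Toy model: claim is proportional to elapsed time times the assumed rate."""
    return elapsed_seconds * assumed_flops


# Hypothetical processing rate on record for the CPU app version.
CPU_FLOPS = 3.0e9

# A task booked as CPU work: roughly 2 hours on the CPU, 27 minutes when
# it is actually run on the GPU but still booked under the CPU version.
cpu_claim = toy_claim(120 * 60, CPU_FLOPS)
rescheduled_claim = toy_claim(27 * 60, CPU_FLOPS)

ratio = rescheduled_claim / cpu_claim
print(f"rescheduled claim is about {ratio:.3f} of the normal CPU claim")
```

In this toy model the server never learns that the faster device did the work, so the shorter run time simply shrinks the claim.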



* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1323443 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1323452 - Posted: 2 Jan 2013, 3:23:36 UTC - in response to Message 1323443.  

TBar wrote:
Interesting. So, using your logic, we shouldn't use the optimized Apps at all. Seeing as how they drastically reduce the work time and result in reduced credits and all. If I wasn't using the optimized Lunatics CPU app, that CPU 603 task would take close to 5 hours instead of the 2 hours it does with the optimized CPU app. Do you recommend not using the optimized Apps? How about faster video cards/CPUs, they also drastically reduce work time... I'm really not interested in any other Projects.


No.

If your PC calculates with the stock project apps, they have their own:
Host/Application details/'Average processing rate'
.. after 10 granted results, the Credits 'should' be correct.
E.g. 'SETI@home Enhanced 6.03 windows_intelx86'.

After installation of the opt. project apps, they get their own:
Host/Application details/'Average processing rate'
.. after 10 granted results, the Credits 'should' be correct.
E.g. 'SETI@home Enhanced (anonymous platform, CPU)'.

If you rename the WUs in BOINC (CPU -> GPU, GPU -> CPU), the SAH server doesn't get any info about this.
Online, the WUs are still CPU WUs and the calculation time is much shorter than the estimated calculation time, which results in fewer Credits.

No, if you rename GPU WUs to CPU WUs, you don't get more Credits.

After introduction of the new Credit system 'CreditNew' at SAH, Fred added the info:
Warning, rescheduling may result in less credits, for you or your wing man.
Try NOT to use it, unless there is no other way.



* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *

This part: "Try NOT to use it, unless there is no other way."
I would consider being completely out of nVidia tasks for over half a day "no other way". This will be listed under "Anonymous Platform", which, as far as I know, is wide open. If you can construct an App that performs 10x better than the stock App, go for it. Just imagine I've spent scads of money on the latest Intel i9000 processor, I'm using a proto App, and it's cooking with gas. Seems to be a lot of concern over credits....

Just remember, Fred's comments were made before the unrealistic limits were imposed. Allotting just 100 units to the much faster GPUs is not realistic. I wouldn't be attempting this if my GPU hadn't run out of tasks in less than half a day while the CPUs have days' worth of tasks left. Maybe a more realistic allotment is in order? I would like to see at least 100 per GPU/Platform. This machine also has an AMD card which was fortunate enough to be working APs when the lights went out. It is just now approaching 50% cache. 100 for the nVidia card, 100 for the AMD, or 100 for each GPU. Then people such as myself wouldn't be involved in such endeavors. :-)
ID: 1323452 · Report as offensive
Profile Paul D Harris
Volunteer tester

Joined: 1 Dec 99
Posts: 1122
Credit: 33,600,005
RAC: 0
United States
Message 1323569 - Posted: 2 Jan 2013, 13:00:10 UTC

Well, all 200 of my WUs are ready to report, and I'm waiting for more work.
ID: 1323569 · Report as offensive
Profile S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1323600 - Posted: 2 Jan 2013, 14:22:12 UTC

Just wondering if the staff will kick-start the project a bit today after the outage (if they do it) so we can get a little chunk of WUs before it all closes down over the weekend...
ID: 1323600 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.