Panic Mode On (114) Server Problems?

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1967416 - Posted: 27 Nov 2018, 22:29:12 UTC - in response to Message 1967410.  

. . Getting nothing but http errors so I think I need to reduce it a bit.
Setting <max_tasks_reported> only helps with 'internal server error'. Check WHICH http error you're getting, and act accordingly. I'm getting 'Couldn't connect to server', which means "wait till the rush has died down a bit".
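The "check which error, then act" advice can be sketched as a small lookup. This is only an illustration: the two error strings are the ones quoted in this thread, while the function name and the fallback text are made up for the example.

```python
def suggest_action(log_line: str) -> str:
    """Map a scheduler error message to the action suggested in this thread."""
    text = log_line.lower()
    if "internal server error" in text:
        # The report is too big for the server to digest in one go.
        return "lower <max_tasks_reported> in cc_config.xml"
    if "couldn't connect to server" in text:
        # The server is swamped; a smaller report will not help.
        return "wait until the rush has died down"
    # Anything else is outside what this thread covers (hypothetical fallback).
    return "unrecognised error: read the full event log entry"


print(suggest_action("HTTP: Internal server error"))
print(suggest_action("Couldn't connect to server"))
```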
ID: 1967416
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967420 - Posted: 27 Nov 2018, 22:39:42 UTC

All my hosts have had <max_tasks_reported> set to 100 for a long while. All have been able to report except for this one, which has been forced into longer and longer backoffs. Setting it to 64 just now unplugged the dam.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1967420
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1967424 - Posted: 27 Nov 2018, 23:01:28 UTC - in response to Message 1967413.  

From the manual:

<max_tasks_reported>N</max_tasks_reported>
Report at most N tasks per scheduler RPC. Try N=1000 if your computer has lots of tasks to report and is having trouble completing a scheduler RPC.
Take no notice of the 1000 suggestion - you'd be better off starting with say 64.

If you've been setting and unsetting log flags, you'll have a fully-populated cc_config.xml file already - modify the existing line, don't add a duplicate.
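A minimal sketch of "modify the existing line, don't add a duplicate", done with Python's standard XML parser. It assumes the setting sits inside the <options> section of cc_config.xml; the sample file content and the function name are hypothetical, so check them against your own file before relying on it.

```python
import xml.etree.ElementTree as ET


def set_max_tasks_reported(xml_text: str, n: int) -> str:
    """Return cc_config.xml text with exactly one <max_tasks_reported> set to n."""
    root = ET.fromstring(xml_text)
    options = root.find("options")
    if options is None:
        # Assumption: the setting belongs under <options>.
        options = ET.SubElement(root, "options")
    entry = options.find("max_tasks_reported")
    if entry is None:
        # Only create the element when no existing line is there to modify.
        entry = ET.SubElement(options, "max_tasks_reported")
    entry.text = str(n)
    return ET.tostring(root, encoding="unicode")


# Hypothetical starting file with the value already present.
sample = (
    "<cc_config><options>"
    "<max_tasks_reported>100</max_tasks_reported>"
    "</options></cc_config>"
)
updated = set_max_tasks_reported(sample, 64)
print(updated)
```

Because the existing element is updated in place, running it twice still leaves a single entry.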


. . OK thanks Richard

Stephen

. .
ID: 1967424
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1967427 - Posted: 27 Nov 2018, 23:08:15 UTC - in response to Message 1967416.  

. . Getting nothing but http errors so I think I need to reduce it a bit.
Setting <max_tasks_reported> only helps with 'internal server error'. Check WHICH http error you're getting, and act accordingly. I'm getting 'Couldn't connect to server', which means "wait till the rush has died down a bit".

. . My laziness is chronic ... it was in fact the "HTTP: Internal server error" thanks again.

Stephen

:)
ID: 1967427
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1967469 - Posted: 28 Nov 2018, 2:18:23 UTC - in response to Message 1967412.  

Nope, 14no18aa is still stuck, even after the outage.


. . Soon it will have squatter's rights and we'll never get rid of it ...

Stephen

:)



It is gone !!
ID: 1967469
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967470 - Posted: 28 Nov 2018, 2:20:49 UTC

Ha ha. The staff had to rip it out by its roots, it was so well dug in!
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1967470
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1968036 - Posted: 30 Nov 2018, 16:42:47 UTC

Not a panic yet, just a concern. I don't think we have enough datafiles to get through the weekend. I know the seti people are amazing at putting on files over the weekend and holidays, but truthfully I'd like them to just not have to worry about it on the weekend.

It has been so nice not to have to panic about the system for a while, and it worked so well over the Thanksgiving holiday. I thought I'd just pop this thread back up in case we need to panic late Sunday or early Monday. :-)
ID: 1968036
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 1968096 - Posted: 1 Dec 2018, 2:35:47 UTC - in response to Message 1968036.  

Not a panic yet, just a concern. I don't think we have enough datafiles to get through the weekend. I know the seti people are amazing at putting on files over the weekend and holidays, but truthfully I'd like them to just not have to worry about it on the weekend.

It has been so nice not to have to panic about the system for a while, and it worked so well over the Thanksgiving holiday. I thought I'd just pop this thread back up in case we need to panic late Sunday or early Monday. :-)

No need to panic
ID: 1968096
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1968119 - Posted: 1 Dec 2018, 7:09:00 UTC - in response to Message 1968096.  

It has been so nice not to have to panic about the system for a while, and it worked so well over the Thanksgiving holiday. I thought I'd just pop this thread back up in case we need to panic late Sunday or early Monday. :-)

No need to panic


. . As in there is a whole swag of Blc12 tapes mounted now ...

Stephen

:)
ID: 1968119
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 1968211 - Posted: 1 Dec 2018, 22:05:37 UTC - in response to Message 1968119.  
Last modified: 1 Dec 2018, 22:07:43 UTC

It has been so nice not to have to panic about the system for a while, and it worked so well over the Thanksgiving holiday. I thought I'd just pop this thread back up in case we need to panic late Sunday or early Monday. :-)

No need to panic


. . As in there is a whole swag of Blc12 tapes mounted now ...

Stephen

:)

I am not saying it wouldn't happen but I would be impressed if we got through all of them in 2 days/before staff are back at work
ID: 1968211
Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1968221 - Posted: 1 Dec 2018, 22:37:25 UTC - in response to Message 1968211.  


I am not saying it wouldn't happen but I would be impressed if we got through all of them in 2 days/before staff are back at work


We are going through 1500 to 2000 channels a day (closer to 2000), so we should be good until Friday, as a rough guess. We could always hit some noisy bits that we would go through faster, or get some Arecibo data, which would mean we wouldn't run out of data then. So I'll post again on Friday in hopes they load more data then.
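The rough guess is just channels left divided by channels per day. A small sketch of that arithmetic follows; the 1500 and 2000 rates are from the post, but the channel count is a made-up placeholder, since the thread does not say how many channels were actually loaded.

```python
def days_of_work(channels_left: int, channels_per_day: int) -> float:
    """Estimate how many days the loaded data lasts at a steady rate."""
    return channels_left / channels_per_day


channels_left = 10_000  # hypothetical figure, not taken from the server status page

for rate in (1500, 2000):
    print(f"{rate} channels/day: about {days_of_work(channels_left, rate):.1f} days")
```

Noisy data that finishes early pushes the real rate above the estimate, so the lower number is the safer one to plan around.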
ID: 1968221
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1968501 - Posted: 3 Dec 2018, 10:01:45 UTC
Last modified: 3 Dec 2018, 10:18:26 UTC

I've been out of the loop for a while. Scored a good deal on RAM for Black Friday and doubled up on my main rig, so I'm running 4x8GB CL14 DDR4-3200 on my 2700X now.

Since building this rig in May, I've only managed to grab 20 APs (I know they're basically extinct these days), so I was considering wandering into the MB milieu a little bit, maybe.

What's the good CPU app for that these days? I found r3714 AVX.. is there something else I should be using? Testing things out using MBbench for the time being, using stock 8.05 as the ref app.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1968501
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968526 - Posted: 3 Dec 2018, 17:13:05 UTC

Think that is the latest optimized CPU app. You will find out whether that is the fastest in the MBbench tool. Be sure to post your results. I am curious if that one is faster than the r3711 SSE41 app.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968526
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1968535 - Posted: 3 Dec 2018, 18:14:58 UTC - in response to Message 1968526.  

Think that is the latest optimized CPU app. You will find out whether that is the fastest in the MBbench tool. Be sure to post your results. I am curious if that one is faster than the r3711 SSE41 app.


It is.
By approximately 10% on Ryzen.


With each crime and every kindness we birth our future.
ID: 1968535
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968537 - Posted: 3 Dec 2018, 18:29:47 UTC - in response to Message 1968535.  

Now that Richard has pointed me to the svn, I found the r3714 changeset.
Does anyone want to take a crack at compiling the app for Linux?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968537
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1968544 - Posted: 3 Dec 2018, 19:22:27 UTC
Last modified: 3 Dec 2018, 19:25:43 UTC

Alright, well I haven't found anything other than r3714_avx, as well as the old r3330's.

====================================== 
 4 testWU(s) found  
     (PG0009_v8.wu) 
     (PG0395_v8.wu) 
     (PG0444_v8.wu) 
     (PG1327_v8.wu) 
 1 reference science app(s) found 
     (setiathome_8.05_windows_x86_64.exe -verb -nog) 
 4 science app(s) found 
     (MB8_win_x64_AVX_VS2010_r3330.exe) 
     (MB8_win_x64_AVX_VS2017_r3714_Vlock.exe) 
     (MB8_win_x64_SSE2_VS2008_r3330.exe) 
     (MB8_win_x64_SSE3_VS2008_r3330.exe) 
====================================== 
------------ 
Quick timetable 
 
WU : PG0009_v8.wu 
setiathome_8.05_windows_x86_64.exe -verb -nog :
  Elapsed 341.547 secs
      CPU 337.727 secs
MB8_win_x64_AVX_VS2010_r3330.exe :
  Elapsed 163.544 secs, speedup: 52.12%  ratio: 2.09x
      CPU 161.367 secs, speedup: 52.22%  ratio: 2.09x
MB8_win_x64_AVX_VS2017_r3714_Vlock.exe :
  Elapsed 159.488 secs, speedup: 53.30%  ratio: 2.14x
      CPU 152.429 secs, speedup: 54.87%  ratio: 2.22x
MB8_win_x64_SSE2_VS2008_r3330.exe :
  Elapsed 170.298 secs, speedup: 50.14%  ratio: 2.01x
      CPU 168.153 secs, speedup: 50.21%  ratio: 2.01x
MB8_win_x64_SSE3_VS2008_r3330.exe :
  Elapsed 164.204 secs, speedup: 51.92%  ratio: 2.08x
      CPU 162.007 secs, speedup: 52.03%  ratio: 2.08x
 
WU : PG0395_v8.wu 
setiathome_8.05_windows_x86_64.exe -verb -nog :
  Elapsed 456.880 secs
      CPU 454.244 secs
MB8_win_x64_AVX_VS2010_r3330.exe :
  Elapsed 176.129 secs, speedup: 61.45%  ratio: 2.59x
      CPU 173.910 secs, speedup: 61.71%  ratio: 2.61x
MB8_win_x64_AVX_VS2017_r3714_Vlock.exe :
  Elapsed 283.321 secs, speedup: 37.99%  ratio: 1.61x
      CPU 279.460 secs, speedup: 38.48%  ratio: 1.63x
MB8_win_x64_SSE2_VS2008_r3330.exe :
  Elapsed 182.259 secs, speedup: 60.11%  ratio: 2.51x
      CPU 180.072 secs, speedup: 60.36%  ratio: 2.52x
MB8_win_x64_SSE3_VS2008_r3330.exe :
  Elapsed 181.214 secs, speedup: 60.34%  ratio: 2.52x
      CPU 179.089 secs, speedup: 60.57%  ratio: 2.54x
 
WU : PG0444_v8.wu 
setiathome_8.05_windows_x86_64.exe -verb -nog :
  Elapsed 412.908 secs
      CPU 410.189 secs
MB8_win_x64_AVX_VS2010_r3330.exe :
  Elapsed 171.027 secs, speedup: 58.58%  ratio: 2.41x
      CPU 168.809 secs, speedup: 58.85%  ratio: 2.43x
MB8_win_x64_AVX_VS2017_r3714_Vlock.exe :
  Elapsed 154.687 secs, speedup: 62.54%  ratio: 2.67x
      CPU 151.087 secs, speedup: 63.17%  ratio: 2.71x
MB8_win_x64_SSE2_VS2008_r3330.exe :
  Elapsed 171.145 secs, speedup: 58.55%  ratio: 2.41x
      CPU 169.121 secs, speedup: 58.77%  ratio: 2.43x
MB8_win_x64_SSE3_VS2008_r3330.exe :
  Elapsed 166.476 secs, speedup: 59.68%  ratio: 2.48x
      CPU 164.425 secs, speedup: 59.91%  ratio: 2.49x
 
WU : PG1327_v8.wu 
setiathome_8.05_windows_x86_64.exe -verb -nog :
  Elapsed 321.100 secs
      CPU 318.336 secs
MB8_win_x64_AVX_VS2010_r3330.exe :
  Elapsed 181.801 secs, speedup: 43.38%  ratio: 1.77x
      CPU 179.651 secs, speedup: 43.57%  ratio: 1.77x
MB8_win_x64_AVX_VS2017_r3714_Vlock.exe :
  Elapsed 157.291 secs, speedup: 51.01%  ratio: 2.04x
      CPU 153.755 secs, speedup: 51.70%  ratio: 2.07x
MB8_win_x64_SSE2_VS2008_r3330.exe :
  Elapsed 182.247 secs, speedup: 43.24%  ratio: 1.76x
      CPU 180.041 secs, speedup: 43.44%  ratio: 1.77x
MB8_win_x64_SSE3_VS2008_r3330.exe :
  Elapsed 176.829 secs, speedup: 44.93%  ratio: 1.82x
      CPU 174.721 secs, speedup: 45.11%  ratio: 1.82x
------------ 


So it looks like r3714 is quickest by a good bit on three of the test WUs, but far in last place for PG0395.
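For anyone reading the table, the speedup and ratio columns appear to be consistent with speedup = 1 - app/reference and ratio = reference/app. That is inferred from the numbers, not from the MBbench source. A quick check against the PG0395 elapsed times:

```python
def speedup_percent(reference_secs: float, app_secs: float) -> float:
    """Share of the reference run time that the app saves, in percent."""
    return (1 - app_secs / reference_secs) * 100


def speed_ratio(reference_secs: float, app_secs: float) -> float:
    """How many times faster the app is than the reference."""
    return reference_secs / app_secs


# Elapsed times for PG0395_v8.wu, copied from the table above.
reference = 456.880  # stock 8.05
r3714_avx = 283.321  # MB8_win_x64_AVX_VS2017_r3714_Vlock.exe

print(f"speedup: {speedup_percent(reference, r3714_avx):.2f}%")
print(f"ratio:   {speed_ratio(reference, r3714_avx):.2f}x")
```

Both values match the 37.99% and 1.61x printed for r3714 on that WU.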

And while it may not be the absolute quickest, it almost looks like r3330 SSE3 is going to end up being overall the most efficient, since it handles tasks like 0395 really well, and does all the others quite well.
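One way to put a number on "overall most efficient" is to add up the elapsed time per app across the four test WUs. The times below are copied from the table; summing them is just one possible yardstick, and it assumes real work would contain these four WU types in equal proportion, which it probably doesn't.

```python
# Elapsed seconds per app for PG0009, PG0395, PG0444, PG1327 (from the table).
elapsed = {
    "AVX_VS2010_r3330": [163.544, 176.129, 171.027, 181.801],
    "AVX_VS2017_r3714_Vlock": [159.488, 283.321, 154.687, 157.291],
    "SSE2_VS2008_r3330": [170.298, 182.259, 171.145, 182.247],
    "SSE3_VS2008_r3330": [164.204, 181.214, 166.476, 176.829],
}

totals = {app: sum(times) for app, times in elapsed.items()}

# Print the apps from lowest to highest total elapsed time.
for app, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{app:24s} {total:8.3f} secs")

best = min(totals, key=totals.get)
```

On these four WUs the r3330 SSE3 build has the lowest total, with r3330 AVX close behind and r3714 last because of PG0395.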

Is there anything else I should test?
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1968544
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968550 - Posted: 3 Dec 2018, 20:09:45 UTC
Last modified: 3 Dec 2018, 20:13:59 UTC

You really need to run the CPU apps against the current work that the project is running. Those PG test case files are not representative of what we are actually running now. They are no-signal, fast-completing test WUs meant to prove the bench works. Move some of the Arecibo and BLC tasks out of the Safe folder into the TestWu directory and run the bench again.

[Edit] Just realized you are using the old Benchmark tool and not Rick's new tool. Copy some of your current work to the TestWu directory for more realistic running.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968550
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1968566 - Posted: 3 Dec 2018, 21:51:12 UTC - in response to Message 1968537.  

Now that Richard has pointed me to the svn, I found the r3714 changeset.
Does anyone want to take a crack at compiling the app for Linux?


If I'm not mistaken, Raistmer's last changes are not committed to svn yet.


With each crime and every kindness we birth our future.
ID: 1968566
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1968570 - Posted: 3 Dec 2018, 21:56:58 UTC - in response to Message 1968550.  
Last modified: 3 Dec 2018, 22:01:53 UTC

You really need to run the CPU apps against the current work that the project is running. Those PG test case files are not representative of what we are actually running now. They are no-signal, fast-completing test WUs meant to prove the bench works. Move some of the Arecibo and BLC tasks out of the Safe folder into the TestWu directory and run the bench again.

[Edit] Just realized you are using the old Benchmark tool and not Rick's new tool. Copy some of your current work to the TestWu directory for more realistic running.

Right. I can do that. I can just pull some of the BLC tasks from Altair's cache and drop them into the TestWUs folder. Of course, they'll take longer to run, but it should give a better comparison.

Wow, I just looked... the entire cache on that machine is all VLARs right now. So that WON'T be representative. I'll still grab two at random though. And then run this again in a few days when there is something that ISN'T a VLAR.

edit: Running test against:

blc11_2bit_guppi_58406_26261_HIP20357_0101.25007.818.21.44.146.vlar.wu
blc12_2bit_guppi_58405_85309_GJ687_0026.13619.409.22.45.253.vlar.wu

(I added .wu so that mbbench would read them, as per the instructions.)
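A rough sketch of that staging step: copy the tasks out of the cache and tack .wu onto the name so the bench picks them up. The folder names and the helper are hypothetical, and the demo runs in a throwaway temp directory with dummy files. It copies rather than moves, so the live cache is left alone.

```python
import shutil
import tempfile
from pathlib import Path


def stage_tasks(cache_dir: Path, test_dir: Path, task_names: list) -> list:
    """Copy each task into test_dir with '.wu' appended to its file name."""
    test_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in task_names:
        target = test_dir / (name + ".wu")
        shutil.copy(cache_dir / name, target)  # original stays in the cache
        staged.append(target)
    return staged


# The two task names from this post, without the added extension.
task_names = [
    "blc11_2bit_guppi_58406_26261_HIP20357_0101.25007.818.21.44.146.vlar",
    "blc12_2bit_guppi_58405_85309_GJ687_0026.13619.409.22.45.253.vlar",
]

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"  # stand-in for the real project folder
    cache.mkdir()
    for name in task_names:
        (cache / name).write_text("dummy workunit")
    staged = stage_tasks(cache, Path(tmp) / "TestWUs", task_names)
    staged_names = [path.name for path in staged]

print(staged_names)
```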
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1968570
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968579 - Posted: 3 Dec 2018, 22:34:09 UTC - in response to Message 1968570.  

If you want, you can download Rick's new benchMT benchmark tool and grab the two standard AR Arecibo non-VLAR tasks he's included in the testWU directory. They are in the /Safe directory in the testWU directory. There are about a dozen reference tasks there, including Arecibo VLARs, standard ARs and BLC tasks.

https://setiathome.berkeley.edu/forum_thread.php?id=83566&postid=1968118
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968579
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.