The Server Issues / Outages Thread - Panic Mode On! (118)

Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2024979 - Posted: 25 Dec 2019, 20:48:39 UTC - in response to Message 2024978.  

I just noticed the T3500 is only asking for GPU tasks. The preferences for its location should have both GPU and CPU tasks. Any other place where a variable could override that?
Missing or broken app_info.xml entry for the CPU app?
ID: 2024979 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 38094
Credit: 261,360,520
RAC: 489
Australia
Message 2024981 - Posted: 25 Dec 2019, 21:05:58 UTC

I just noticed the T3500 is only asking for GPU tasks. The preferences for its location should have both GPU and CPU tasks. Any other place where a variable could override that?
I have no idea what your "T3500" is, but have you checked your local SETI properties? Sometimes you can be on a back-off of up to 4 days on requesting CPU work.

If you are, then once you close the window, do a manual update; that will reset the back-off.

Cheers.
ID: 2024981 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2024982 - Posted: 25 Dec 2019, 21:21:57 UTC - in response to Message 2024959.  

I have two of my rigs back on the special sauce app, a cuda90 version. One of them has both GPU and CPU tasks (T5810-Ubuntu) and the other only has GPU after quite a while (T3500-Ubuntu). All I did was restore the backed up setiathome.berkeley.edu folder. Before the server software issues, both were getting both GPU and CPU tasks. Any idea what I should check? Is it just not long enough for the server to make some CPU tasks?

Thanks and happy holidays!

Roger


. . When the splitters create 'results' they are neither CPU nor GPU tasks, they are just tasks. It is the scheduler that 'decides' to make them one or the other when they are allocated to a host (which is deceptive when you get a message saying there are ATI and Intel tasks available but none for your Nvidia cards???). So check your event log as Richards suggests and make sure you are requesting CPU work.

Stephen

. .
ID: 2024982 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2024985 - Posted: 25 Dec 2019, 21:41:54 UTC - in response to Message 2024979.  

I would recommend always using the sched_op_debug logging flag for the Event Log. That way at each scheduler connection you will get a printout of how many seconds of cpu work and gpu work you are requesting. If you don't ask for any seconds of cpu work, then you need to figure out why. Probably a configuration problem where you turned off the cpu for the host or the location venue.
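For anyone who has not set the flag before, here is a minimal sketch of what the config file would contain. It assumes the usual cc_config.xml layout (a `<cc_config>` root with a `<log_flags>` section); the helper name `build_cc_config` is made up for illustration, and the output is only the bare minimum file, not a copy of anyone's real config.

```python
import xml.etree.ElementTree as ET


def build_cc_config(log_flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    flags = ET.SubElement(root, "log_flags")
    for name in log_flags:
        ET.SubElement(flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# sched_op_debug is the flag recommended in this post.
config_text = build_cc_config(["sched_op_debug"])
print(config_text)
```

If you already have a cc_config.xml, merge the flag into the existing `<log_flags>` section instead of overwriting the file, then restart BOINC so the flag takes effect.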
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2024985 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2024986 - Posted: 25 Dec 2019, 21:51:42 UTC

The Splitters are struggling to meet demand, but so far they are. Now if they could just sort out the WU validation/deletion/assimilation issues. And the new Scheduler code issues.
A few things there for the to-do list in the new year.
Grant
Darwin NT
ID: 2024986 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2024990 - Posted: 25 Dec 2019, 21:59:22 UTC - in response to Message 2024930.  

BTW, I finally found out how to keep the Main Server from sending all those OpenCL tasks when trying to Spoof the CUDA Special App as Stock. Just add <no_opencl>1</no_opencl> to cc_config.xml, then restart BOINC, and then it will only send tasks for CUDA.
So with luck using <no_cuda>1</no_cuda> should stop any CUDA42 or CUDA50 applications being used if trying to run stock under Windows, leaving just SoG which is OpenCL (and AP for those that process them).
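A rough sketch of adding the quoted `<no_opencl>` setting to an existing config without disturbing what is already there. It assumes the setting belongs in the `<options>` section of cc_config.xml; the helper `set_option` and the sample file contents are hypothetical. The `<no_cuda>` tag is only the guess made above and has not been verified here, so the example does not use it.

```python
import xml.etree.ElementTree as ET


def set_option(config_xml, name, value):
    """Set <name>value</name> under <options>, creating nodes as needed."""
    root = ET.fromstring(config_xml)
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    node = options.find(name)
    if node is None:
        node = ET.SubElement(options, name)
    node.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical existing file that only has a log flag set.
existing = (
    "<cc_config><log_flags><sched_op_debug>1</sched_op_debug>"
    "</log_flags></cc_config>"
)
updated = set_option(existing, "no_opencl", 1)
print(updated)
```

Calling it again with the same name just overwrites the value, so it is safe to re-run. As the quoted post says, BOINC needs a restart afterwards.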
Grant
Darwin NT
ID: 2024990 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2024997 - Posted: 25 Dec 2019, 22:52:43 UTC - in response to Message 2024911.  

Hmm, managed to pick up some work (new work that is, not resends) in the last 30min or so, Ready-to-send showing 1200, but splitter output has been reported as 0 for about an hour now.
This was annoying me while I was trying to sleep last night.
The only thing that comes to mind is that the splitters were spluttering over that hour or so, spitting out a good amount of work now & then, but never when they were queried for the Server Status page numbers. Hence work was being produced, even though the splitter output was being reported as 0.
Grant
Darwin NT
ID: 2024997 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2025012 - Posted: 26 Dec 2019, 0:42:02 UTC - in response to Message 2024985.  

I would recommend always using the sched_op_debug logging flag for the Event Log. That way at each scheduler connection you will get a printout of how many seconds of cpu work and gpu work you are requesting. If you don't ask for any seconds of cpu work, then you need to figure out why. Probably a configuration problem where you turned off the cpu for the host or the location venue.

Thanks for the suggestions everyone. I do have that debugging flag set. It shows no CPU work request:
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | [sched_op] Starting scheduler request
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | Sending scheduler request: To fetch work.
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | Reporting 15 completed tasks
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | Requesting new tasks for NVIDIA GPU
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | [sched_op] NVIDIA GPU work request: 4629217.31 seconds; 0.00 devices
Wed 25 Dec 2019 07:25:02 PM EST | SETI@home | Scheduler request completed: got 15 new tasks
Wed 25 Dec 2019 07:25:02 PM EST | SETI@home | [sched_op] Server version 709
Wed 25 Dec 2019 07:25:02 PM EST | SETI@home | Project requested delay of 303 seconds
Wed 25 Dec 2019 07:25:03 PM EST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Wed 25 Dec 2019 07:25:03 PM EST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1482 seconds
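
For longer logs, a small script can pull out the request lines instead of reading them by eye. This is only a sketch: the regular expression is written against the two "[sched_op] ... work request" lines shown above and may not match other client versions, and the function name `work_requests` is made up.

```python
import re

# Matches lines such as:
# "[sched_op] CPU work request: 0.00 seconds; 0.00 devices"
WORK_REQUEST = re.compile(
    r"\[sched_op\] (?P<resource>.+?) work request: "
    r"(?P<seconds>[\d.]+) seconds; (?P<devices>[\d.]+) devices"
)


def work_requests(log_text):
    """Map each resource name to the seconds of work requested for it."""
    requests = {}
    for line in log_text.splitlines():
        match = WORK_REQUEST.search(line)
        if match:
            requests[match.group("resource")] = float(match.group("seconds"))
    return requests


sample = """\
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Wed 25 Dec 2019 07:24:59 PM EST | SETI@home | [sched_op] NVIDIA GPU work request: 4629217.31 seconds; 0.00 devices
"""

requests = work_requests(sample)
# Resources for which the client asked for no work at all.
idle = [name for name, seconds in requests.items() if seconds == 0.0]
print(idle)
```

On the two lines from the log above, `idle` contains only "CPU", which matches what the log shows: the client is not asking for CPU work.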

Not sure why this is happening on this one host. It still has no CPU jobs and it did before the server problems. I just copied back the folder, so none of that changed. I only left in the dont_check_file_sizes flag set to 1. Could that do something like this? But, it is also set on the PC with CPU tasks. The other host at "home" location is getting CPU tasks and CPU and GPU are set for that location. I checked the cc_config and app_config and app_info files. They look the same; I haven't done a diff compare, however.

Wiggo, T3500 is the PC name in case someone wanted to look it up. I have restarted the BOINC manager to no effect. I didn't quite understand what you were suggesting in your reply. Can you explain in a bit more detail?
ID: 2025012 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2025013 - Posted: 26 Dec 2019, 0:50:16 UTC - in response to Message 2025012.  

Just found it. In BOINC manager, the computing preferences were set a bit differently (since the machines are different specs). I think it was an inadvertently low "use at most % of the CPUs" limit. Should have compared that earlier!
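
For next time, a quick way to compare settings between two hosts is to diff the files and look only at the lines that changed. A minimal sketch with Python's difflib follows. The two preference snippets are made-up examples, not the real files from these machines, and `max_ncpus_pct` is my assumption for the tag behind the "use at most % of the CPUs" setting.

```python
import difflib


def changed_lines(text_a, text_b, name_a="host_a", name_b="host_b"):
    """Return only the added and removed lines between two texts."""
    diff = list(difflib.unified_diff(
        text_a.splitlines(), text_b.splitlines(),
        fromfile=name_a, tofile=name_b, lineterm="", n=0,
    ))
    # The first two entries are the ---/+++ file headers; skip them
    # and drop the @@ hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical preference files from two hosts; the values are invented.
prefs_t5810 = """<global_preferences>
  <max_ncpus_pct>100.000000</max_ncpus_pct>
  <cpu_usage_limit>100.000000</cpu_usage_limit>
</global_preferences>"""
prefs_t3500 = """<global_preferences>
  <max_ncpus_pct>25.000000</max_ncpus_pct>
  <cpu_usage_limit>100.000000</cpu_usage_limit>
</global_preferences>"""

for line in changed_lines(prefs_t5810, prefs_t3500, "T5810", "T3500"):
    print(line)
```

Only the differing preference line shows up, once with "-" for the first host and once with "+" for the second, which makes a stray low CPU percentage easy to spot.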
ID: 2025013 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2025015 - Posted: 26 Dec 2019, 1:12:41 UTC

And on to next year's wish list it would also be nice if they could sort the odd upload that takes a couple of attempts to go through, along with the occasional sticking downloads.
Grant
Darwin NT
ID: 2025015 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2025021 - Posted: 26 Dec 2019, 1:51:08 UTC

Did the new hardware upload server ever make it over to Main after its dry-run at Beta?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2025021 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2025023 - Posted: 26 Dec 2019, 1:54:27 UTC - in response to Message 2025021.  
Last modified: 26 Dec 2019, 1:56:15 UTC

Did the new hardware upload server ever make it over to Main after its dry-run at Beta?
I ask about it every so often, and... Nope. Apparently it's still there, presently out of commission.

From Eric's "Server issues" news post
The file system containing the beta project uploads directory is having problems, so beta is down until further notice.



Would be nice to have such upgraded hardware helping out here.
*shrug*
Grant
Darwin NT
ID: 2025023 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 2025034 - Posted: 26 Dec 2019, 4:44:12 UTC
Last modified: 26 Dec 2019, 4:57:33 UTC

Well, it was fun while it lasted. Looks like either work or issues in progress.
??
ID: 2025034 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2025036 - Posted: 26 Dec 2019, 5:32:55 UTC - in response to Message 2025034.  
Last modified: 26 Dec 2019, 5:37:25 UTC

Well, it was fun while it lasted. Looks like either work or issues in progress.
??
Yeah, I got a "Project is temporarily shut down for maintenance" response. On the next contact I reported 75 WUs but only got 1 back; the one after that filled in the deficit.

Edit-
Eric's post in the RX 5700 XT thread might explain the recent brief project outage.
Grant
Darwin NT
ID: 2025036 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13944
Credit: 208,696,464
RAC: 304
Australia
Message 2025043 - Posted: 26 Dec 2019, 6:33:12 UTC

Results-received-in-last-hour 171,859. Well, that's certainly going to give the servers a pounding.
Grant
Darwin NT
ID: 2025043 · Report as offensive
Profile Chris904395093209d Project Donor
Volunteer tester

Joined: 1 Jan 01
Posts: 112
Credit: 29,923,129
RAC: 6
United States
Message 2025072 - Posted: 26 Dec 2019, 14:53:11 UTC

I'm getting the 'Project has no tasks available' message this morning on my 2 Windows machines. At least I was able to get restocked last night.
~Chris

ID: 2025072 · Report as offensive
Profile B. Ahmet KIRAN

Joined: 19 Oct 14
Posts: 77
Credit: 36,140,903
RAC: 140
Turkey
Message 2025073 - Posted: 26 Dec 2019, 14:54:57 UTC

For the last 45 minutes I have been updating 5 computers, and every time the system responds "no tasks available" while the "work done" and "average work done" figures seem to be frozen... Where do these results go??? Am I the only one who is being targeted??? I see no posts on this issue (nearly an hour has passed now and still no updates on work done and average work done)
SOMEONE PLEASE RESPOND...
ID: 2025073 · Report as offensive
Oddbjornik Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2025075 - Posted: 26 Dec 2019, 15:14:40 UTC - in response to Message 2025073.  

For the last 45 minutes I have been updating 5 computers, and every time the system responds "no tasks available" while the "work done" and "average work done" figures seem to be frozen... Where do these results go??? Am I the only one who is being targeted??? I see no posts on this issue (nearly an hour has passed now and still no updates on work done and average work done)
SOMEONE PLEASE RESPOND...
I'm guessing the servers are busy with the daily statistics dump, and that the system will be back in 3-2-1...
ID: 2025075 · Report as offensive
Profile B. Ahmet KIRAN

Joined: 19 Oct 14
Posts: 77
Credit: 36,140,903
RAC: 140
Turkey
Message 2025077 - Posted: 26 Dec 2019, 15:41:49 UTC - in response to Message 2025075.  

Thanks for the update...
Is there a time schedule for this "daily statistics dump"? If so I will try to avoid those times for my uploads, and prevent my stomach acid burst...
Thanks again...
ID: 2025077 · Report as offensive
Oddbjornik Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 2025078 - Posted: 26 Dec 2019, 15:51:46 UTC - in response to Message 2025077.  

Thanks for the update...
Is there a time schedule for this "daily statistics dump"? If so I will try to avoid those times for my uploads, and prevent my stomach acid burst...
Thanks again...
I don't think the schedule is very reliable, but the files from the dump go here, and the process will have started at the first file timestamp and ended at the last one.
The time of day will normally be the same from day to day, but then it will occasionally jump to some other time.
And there are two runs a day, with approximately 12 hours between.
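
If you want to work out the window yourself, the idea above (first file timestamp is the start, last one is the end) can be sketched like this. The timestamps are invented for illustration and are not taken from the actual dump directory; `dump_window` is a made-up helper name.

```python
from datetime import datetime, timezone


def dump_window(timestamps):
    """Return (start, end, duration) spanned by the dump files' timestamps."""
    start, end = min(timestamps), max(timestamps)
    return start, end, end - start


# Hypothetical file modification times (UTC) for one dump run.
stamps = [
    datetime(2019, 12, 26, 14, 5, tzinfo=timezone.utc),
    datetime(2019, 12, 26, 14, 40, tzinfo=timezone.utc),
    datetime(2019, 12, 26, 15, 20, tzinfo=timezone.utc),
]
start, end, duration = dump_window(stamps)
print(start.time(), end.time(), duration)
```

Since there are two runs a day, you would feed it the timestamps from one run at a time; mixing both runs would give a window of roughly 12 hours instead of the real dump duration.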
ID: 2025078 · Report as offensive
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.