Panic Mode On (111) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1928272 - Posted: 6 Apr 2018, 8:13:50 UTC

The scheduler is back to random allocation of work; in-progress numbers are falling again (along with my cache).
Grant
Darwin NT
ID: 1928272
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928274 - Posted: 6 Apr 2018, 8:45:02 UTC
Last modified: 6 Apr 2018, 8:45:40 UTC

OK, I've just allowed the 'breakfast fetch' for my three NVidia GPU-only crunchers, and I'm getting the 'no work available for the devices you've selected: work is available for other devices' message. Those machines are allowed 600 GPU tasks between them, and they currently have 360 tasks onboard, so plenty of headroom.

I did a special fetch on a slow CPU-only cruncher, and got an allocation consisting exclusively of Arecibo VLARs. So that's consistent: the scheduler/feeder is working, but the available work is withheld from NVidia cards. All according to the known policy of the project.

I'll leave the GPUs requesting work as they wish, and see what happens when this block of VLARs has been plucked from the feeder.
ID: 1928274
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928277 - Posted: 6 Apr 2018, 9:18:00 UTC

Ah, here they come:

06/04/2018 10:04:31 | SETI@home | Scheduler request completed: got 36 new tasks
06/04/2018 10:04:31 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 27543 seconds
And another 27 five minutes later.

Mostly Guppies, estimated at 12:58 each - and a few shorties, at 04:59. This machine usually only crunches SETI on the 750Ti, while the 970 runs GPUGrid. So the average duration of 27543/36 = 765 seconds is what I expect - the server has adjusted to the speed it sees.

The other two machines are fetching too - up to 526 tasks across the three of them. So, no fault found so far.
ID: 1928277
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928282 - Posted: 6 Apr 2018, 9:40:48 UTC

The first of the three has now reached 200 GPU tasks onboard, so the message has changed to

06/04/2018 10:32:59 | SETI@home | Scheduler request completed: got 0 new tasks
06/04/2018 10:32:59 | SETI@home | No tasks are available for SETI@home v8
06/04/2018 10:32:59 | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
06/04/2018 10:32:59 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
06/04/2018 10:32:59 | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
06/04/2018 10:32:59 | SETI@home | This computer has reached a limit on tasks in progress
Multiple reasons, but the last one is the relevant one.

So that's another tick in the 'working as designed' report card.
ID: 1928282
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1928284 - Posted: 6 Apr 2018, 10:09:06 UTC
Last modified: 6 Apr 2018, 10:12:26 UTC

6/04/2018 18:29:59 | SETI@home | Message from task: 0
6/04/2018 18:29:59 | SETI@home | Computation for task 23no17aa.18652.684313.7.34.160_0 finished
6/04/2018 18:29:59 | SETI@home | Starting task blc02_2bit_guppi_58185_76028_Dw1_off_0033.9189.409.21.44.27.vlar_1
6/04/2018 18:30:01 | SETI@home | Started upload of 23no17aa.18652.684313.7.34.160_0_r267150381_0
6/04/2018 18:30:05 | SETI@home | Finished upload of 23no17aa.18652.684313.7.34.160_0_r267150381_0
6/04/2018 18:31:46 | SETI@home | Message from task: 0
6/04/2018 18:31:46 | SETI@home | Computation for task ap_23no17aa_B4_P1_00338_20180405_24601.wu_2 finished
6/04/2018 18:31:48 | SETI@home | Started upload of ap_23no17aa_B4_P1_00338_20180405_24601.wu_2_r443664073_0
6/04/2018 18:31:52 | SETI@home | Finished upload of ap_23no17aa_B4_P1_00338_20180405_24601.wu_2_r443664073_0
6/04/2018 18:33:43 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 18:33:43 | SETI@home | Reporting 2 completed tasks
6/04/2018 18:33:43 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 18:33:45 | SETI@home | Scheduler request completed: got 2 new tasks
6/04/2018 18:33:47 | SETI@home | Started download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.818.21.44.130.vlar
6/04/2018 18:33:47 | SETI@home | Started download of blc03_2bit_guppi_58185_64326_And_XI_0018.30008.409.22.45.125.vlar
6/04/2018 18:33:50 | SETI@home | Finished download of blc03_2bit_guppi_58185_64326_And_XI_0018.30008.409.22.45.125.vlar
6/04/2018 18:33:51 | SETI@home | Finished download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.818.21.44.130.vlar
6/04/2018 18:35:02 | SETI@home | Message from task: 0
6/04/2018 18:35:02 | SETI@home | Computation for task blc02_2bit_guppi_58185_76028_Dw1_off_0033.9189.409.21.44.27.vlar_1 finished
6/04/2018 18:35:02 | SETI@home | Starting task 23no17aa.18652.684313.7.34.161_1
6/04/2018 18:35:04 | SETI@home | Started upload of blc02_2bit_guppi_58185_76028_Dw1_off_0033.9189.409.21.44.27.vlar_1_r1496257461_0
6/04/2018 18:35:07 | SETI@home | Finished upload of blc02_2bit_guppi_58185_76028_Dw1_off_0033.9189.409.21.44.27.vlar_1_r1496257461_0
6/04/2018 18:38:30 | SETI@home | Message from task: 0
6/04/2018 18:38:30 | SETI@home | Computation for task ap_02dc17aa_B5_P1_00019_20180405_01573.wu_0 finished
6/04/2018 18:38:30 | SETI@home | Starting task 23no17aa.18652.684313.7.34.165_0
6/04/2018 18:38:32 | SETI@home | Started upload of ap_02dc17aa_B5_P1_00019_20180405_01573.wu_0_r993485843_0
6/04/2018 18:38:36 | SETI@home | Finished upload of ap_02dc17aa_B5_P1_00019_20180405_01573.wu_0_r993485843_0
6/04/2018 18:38:51 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 18:38:51 | SETI@home | Reporting 2 completed tasks
6/04/2018 18:38:51 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 18:38:54 | SETI@home | Scheduler request completed: got 20 new tasks
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.207.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.163.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.178.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.13193.1227.21.44.129.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc02_2bit_guppi_58185_75400_Dw1_0032.30037.1227.21.44.237.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc01_2bit_guppi_58185_68267_And_X_0024.30024.1227.22.45.175.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_64326_And_XI_0018.30008.1636.22.45.71.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.14757.409.21.44.103.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.226.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.170.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.227.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.211.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.180.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.210.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.182.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_58186_Bol520_0010.13193.1227.21.44.125.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.223.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_65682_And_X_0020.30586.818.21.44.156.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.224.vlar
6/04/2018 18:38:56 | SETI@home | Started download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.162.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.207.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.178.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc02_2bit_guppi_58185_75400_Dw1_0032.30037.1227.21.44.237.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc01_2bit_guppi_58185_68267_And_X_0024.30024.1227.22.45.175.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_64326_And_XI_0018.30008.1636.22.45.71.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.14757.409.21.44.103.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.226.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.170.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.180.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.210.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.14284.818.22.45.182.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.13193.1227.21.44.125.vlar
6/04/2018 18:39:00 | SETI@home | Finished download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.224.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.163.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_58186_Bol520_0010.13193.1227.21.44.129.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.227.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_63036_And_XI_0016.12629.2045.21.44.211.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_61754_And_XI_0014.12687.818.22.45.223.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_65682_And_X_0020.30586.818.21.44.156.vlar
6/04/2018 18:39:18 | SETI@home | Finished download of blc03_2bit_guppi_58185_59455_Bol520_0012.29562.2045.21.44.162.vlar
6/04/2018 18:40:19 | SETI@home | Message from task: 0
6/04/2018 18:40:19 | SETI@home | Computation for task 23no17aa.18652.684313.7.34.161_1 finished
6/04/2018 18:40:19 | SETI@home | Starting task 23no17aa.31849.405300.9.36.42_1
6/04/2018 18:40:21 | SETI@home | Started upload of 23no17aa.18652.684313.7.34.161_1_r1616963712_0
6/04/2018 18:40:25 | SETI@home | Finished upload of 23no17aa.18652.684313.7.34.161_1_r1616963712_0
6/04/2018 18:42:52 | SETI@home | Message from task: 0
6/04/2018 18:42:52 | SETI@home | Computation for task 23no17aa.31849.405300.9.36.42_1 finished
6/04/2018 18:42:52 | SETI@home | Starting task blc03_2bit_guppi_58185_56920_Bol520_0008.9675.0.21.44.194.vlar_1
6/04/2018 18:42:54 | SETI@home | Started upload of 23no17aa.31849.405300.9.36.42_1_r528860525_0
6/04/2018 18:42:58 | SETI@home | Finished upload of 23no17aa.31849.405300.9.36.42_1_r528860525_0
6/04/2018 18:43:44 | SETI@home | Message from task: 0
6/04/2018 18:43:44 | SETI@home | Computation for task 23no17aa.18652.684313.7.34.165_0 finished
6/04/2018 18:43:44 | SETI@home | Starting task blc03_2bit_guppi_58185_56920_Bol520_0008.9675.0.21.44.197.vlar_0
6/04/2018 18:43:46 | SETI@home | Started upload of 23no17aa.18652.684313.7.34.165_0_r1671766231_0
6/04/2018 18:43:50 | SETI@home | Finished upload of 23no17aa.18652.684313.7.34.165_0_r1671766231_0
6/04/2018 18:43:58 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 18:43:58 | SETI@home | Reporting 3 completed tasks
6/04/2018 18:43:58 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 18:44:00 | SETI@home | Scheduler request completed: got 0 new tasks
6/04/2018 18:44:00 | SETI@home | No tasks sent
6/04/2018 18:44:00 | SETI@home | No tasks are available for AstroPulse v7
6/04/2018 18:44:00 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
6/04/2018 18:44:00 | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
6/04/2018 18:44:00 | SETI@home | This computer has reached a limit on tasks in progress

...

6/04/2018 19:02:06 | SETI@home | Reporting 2 completed tasks
6/04/2018 19:02:06 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 19:02:08 | SETI@home | Scheduler request completed: got 0 new tasks
6/04/2018 19:02:08 | SETI@home | No tasks sent
6/04/2018 19:02:08 | SETI@home | No tasks are available for AstroPulse v7
6/04/2018 19:02:08 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
6/04/2018 19:02:08 | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
6/04/2018 19:02:08 | SETI@home | This computer has reached a limit on tasks in progress
6/04/2018 19:03:23 | SETI@home | Message from task: 0
6/04/2018 19:03:23 | SETI@home | Computation for task blc03_2bit_guppi_58185_61754_And_XI_0014.11266.0.21.44.176.vlar_1 finished
6/04/2018 19:03:23 | SETI@home | Starting task blc02_2bit_guppi_58185_75400_Dw1_0032.19800.818.22.45.32.vlar_0
6/04/2018 19:03:25 | SETI@home | Started upload of blc03_2bit_guppi_58185_61754_And_XI_0014.11266.0.21.44.176.vlar_1_r651684791_0
6/04/2018 19:03:29 | SETI@home | Finished upload of blc03_2bit_guppi_58185_61754_And_XI_0014.11266.0.21.44.176.vlar_1_r651684791_0
6/04/2018 19:03:52 | SETI@home | Message from task: 0
6/04/2018 19:03:52 | SETI@home | Computation for task blc03_2bit_guppi_58185_58186_Bol520_0010.11719.409.21.44.80.vlar_1 finished
6/04/2018 19:03:52 | SETI@home | Starting task 23no17aa.18652.688812.7.34.225_0
6/04/2018 19:03:54 | SETI@home | Started upload of blc03_2bit_guppi_58185_58186_Bol520_0010.11719.409.21.44.80.vlar_1_r745415459_0
6/04/2018 19:03:58 | SETI@home | Finished upload of blc03_2bit_guppi_58185_58186_Bol520_0010.11719.409.21.44.80.vlar_1_r745415459_0
6/04/2018 19:07:14 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 19:07:14 | SETI@home | Reporting 2 completed tasks
6/04/2018 19:07:14 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 19:07:16 | SETI@home | Scheduler request completed: got 0 new tasks
6/04/2018 19:07:16 | SETI@home | No tasks sent
6/04/2018 19:07:16 | SETI@home | No tasks are available for AstroPulse v7
6/04/2018 19:07:16 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
6/04/2018 19:07:16 | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
6/04/2018 19:07:16 | SETI@home | This computer has reached a limit on tasks in progress
6/04/2018 19:08:21 | SETI@home | Message from task: 0
6/04/2018 19:08:21 | SETI@home | Computation for task blc02_2bit_guppi_58185_75400_Dw1_0032.19800.818.22.45.32.vlar_0 finished
6/04/2018 19:08:21 | SETI@home | Starting task 23no17aa.18652.688812.7.34.119_0
6/04/2018 19:08:23 | SETI@home | Started upload of blc02_2bit_guppi_58185_75400_Dw1_0032.19800.818.22.45.32.vlar_0_r1204705471_0
6/04/2018 19:08:27 | SETI@home | Finished upload of blc02_2bit_guppi_58185_75400_Dw1_0032.19800.818.22.45.32.vlar_0_r1204705471_0
6/04/2018 19:08:40 | SETI@home | Message from task: 0
6/04/2018 19:08:40 | SETI@home | Computation for task 23no17aa.18652.688812.7.34.225_0 finished
6/04/2018 19:08:40 | SETI@home | Starting task 23no17aa.18652.688812.7.34.111_0
6/04/2018 19:08:42 | SETI@home | Started upload of 23no17aa.18652.688812.7.34.225_0_r1232591193_0
6/04/2018 19:08:46 | SETI@home | Finished upload of 23no17aa.18652.688812.7.34.225_0_r1232591193_0
6/04/2018 19:12:10 | SETI@home | Message from task: 0
6/04/2018 19:12:10 | SETI@home | Computation for task 20dc17ab.483.2541.6.33.79.vlar_1 finished
6/04/2018 19:12:10 | SETI@home | Starting task 20dc17ab.10231.2132.7.34.53.vlar_0
6/04/2018 19:12:12 | SETI@home | Started upload of 20dc17ab.483.2541.6.33.79.vlar_1_r357298170_0
6/04/2018 19:12:14 | SETI@home | Finished upload of 20dc17ab.483.2541.6.33.79.vlar_1_r357298170_0
6/04/2018 19:12:22 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 19:12:22 | SETI@home | Reporting 3 completed tasks
6/04/2018 19:12:22 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 19:12:24 | SETI@home | Scheduler request completed: got 1 new tasks
6/04/2018 19:12:26 | SETI@home | Started download of 01dc17aa.14314.7441.14.41.130.vlar
6/04/2018 19:12:30 | SETI@home | Finished download of 01dc17aa.14314.7441.14.41.130.vlar
6/04/2018 19:13:05 | SETI@home | Message from task: 0
6/04/2018 19:13:05 | SETI@home | Computation for task 23no17aa.18652.688812.7.34.119_0 finished
6/04/2018 19:13:05 | SETI@home | Starting task blc03_2bit_guppi_58185_56920_Bol520_0008.11248.1636.22.45.85.vlar_0
6/04/2018 19:13:07 | SETI@home | Started upload of 23no17aa.18652.688812.7.34.119_0_r947724587_0
6/04/2018 19:13:11 | SETI@home | Finished upload of 23no17aa.18652.688812.7.34.119_0_r947724587_0
6/04/2018 19:13:22 | SETI@home | Message from task: 0
6/04/2018 19:13:22 | SETI@home | Computation for task 23no17aa.18652.688812.7.34.111_0 finished
6/04/2018 19:13:22 | SETI@home | Starting task blc03_2bit_guppi_58185_56920_Bol520_0008.11293.1636.22.45.88.vlar_0
6/04/2018 19:13:24 | SETI@home | Started upload of 23no17aa.18652.688812.7.34.111_0_r673884030_0
6/04/2018 19:13:28 | SETI@home | Finished upload of 23no17aa.18652.688812.7.34.111_0_r673884030_0
6/04/2018 19:17:29 | SETI@home | Sending scheduler request: To fetch work.
6/04/2018 19:17:29 | SETI@home | Reporting 2 completed tasks
6/04/2018 19:17:29 | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
6/04/2018 19:17:32 | SETI@home | Scheduler request completed: got 0 new tasks
6/04/2018 19:17:32 | SETI@home | No tasks sent
6/04/2018 19:17:32 | SETI@home | No tasks are available for AstroPulse v7
6/04/2018 19:17:32 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
6/04/2018 19:17:32 | SETI@home | Tasks for Intel GPU are available, but your preferences are set to not accept them
6/04/2018 19:17:32 | SETI@home | This computer has reached a limit on tasks in progress


Even when there is no Arecibo work at all, let alone Arecibo VLARs, when the issue occurs "This computer has reached a limit on tasks in progress" is the reason given for not allocating new work, even though the system hasn't reached its cache limit or the server-side limits. This applies to CPU as well as GPU work.
Grant
Darwin NT
ID: 1928284
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928294 - Posted: 6 Apr 2018, 11:13:48 UTC - in response to Message 1928284.

"This computer has reached a limit on tasks in progress" is the reason given for not allocating new work, even though the system hasn't reached its cache limit or the server-side limits. This applies to CPU as well as GPU work.
That message is designed to appear ONLY in connection with server-side limits - it is not designed to have anything to do with task durations, cache settings, or quota (daily quota is a technical term reserved for cases where you have been reporting failed or invalid tasks).

So, we need to ask again one of the questions I asked yesterday: are you equipped to easily count exactly how many of each type of task are present on your machine? This has to be a local count - information from this website is no use here. BoincTasks is likely to be a much better tool for this purpose than BOINC Manager - I'm using a predecessor called BoincView, which is less specific than BoincTasks, but to my eye has a cleaner interface, and does the job. I can see I currently have 561 tasks across the three machines I'm monitoring, as I type this - so not many successful fetches while I've been away doing other things.
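If anyone wants a quick and dirty local count without installing anything, something along these lines would do it. A minimal sketch, assuming boinccmd is on the path and that --get_tasks prints one 'name:' line per task (the field layout may vary by client version, so check yours first):

#include <cstdio>
#include <cstring>

// Count tasks held by the local BOINC client by parsing the output of
// "boinccmd --get_tasks". Assumes each task block contains one line
// containing "name:" (and a separate "WU name:" line, which we skip).
int main() {
    FILE* pipe = popen("boinccmd --get_tasks", "r");
    if (!pipe) { perror("boinccmd"); return 1; }

    char line[512];
    int total = 0, vlar = 0;
    while (fgets(line, sizeof line, pipe)) {
        if (strstr(line, "name:") && !strstr(line, "WU name:")) {
            total++;
            if (strstr(line, ".vlar")) vlar++;  // VLAR task names carry a .vlar suffix
        }
    }
    pclose(pipe);
    printf("tasks onboard: %d (of which %d VLAR)\n", total, vlar);
    return 0;
}

Not as pretty as BoincTasks, but it answers the one question that matters here: how many tasks, counted locally.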
ID: 1928294
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1928296 - Posted: 6 Apr 2018, 11:28:25 UTC - in response to Message 1928294.  
Last modified: 6 Apr 2018, 11:32:15 UTC

My one machine is fairly easy to count. It is running just CPU and CUDA tasks and currently has 41 CPU tasks onboard. The website shows 241 tasks in total, which means it's down by about 100 CUDA tasks: https://setiathome.berkeley.edu/results.php?hostid=6796479&offset=300
All it's being told is 'No tasks sent' when it reports around 5 completed tasks and asks for NVIDIA GPU tasks every 5 minutes.

Just as I post, the server awakes and sends 104 new tasks. So, why does the server wait until the machine is down 80 to 100 tasks and then send new work? This has been going on for months; it's nothing new.
ID: 1928296
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928297 - Posted: 6 Apr 2018, 11:55:47 UTC - in response to Message 1928296.  

My final machine has just finished breakfast too, and is back up to 200 tasks for two GPU cards.

The final allocation was

06/04/2018 12:26:56 | SETI@home | Scheduler request completed: got 26 new tasks
06/04/2018 12:26:56 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 11106 seconds
- an average of 427 seconds per task. Sure enough, most of them were VHAR 'shorties', estimated at 04:45 each (again, 750Ti timings).

While there are shorties in the feeder cache, that cache drains more quickly - because work requests are for 'duration in seconds', not 'number of tasks'. There are currently no Arecibo tapes available to split, so assuming no new ones arrive, we will be working on 100% Breakthrough Listen tasks within about 5 hours (when the RTS queue has completely drained and been refilled). That will lessen the load on the feeder, and we'll get a clearer picture.
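To illustrate that duration-based accounting with a sketch (not the actual client or scheduler code - the real work-fetch logic is far more involved): the scheduler keeps sending tasks until their estimated runtimes cover the requested seconds, so a feeder full of shorties is drawn down roughly three times as fast.

#include <cstdio>

// Sketch of duration-based work fetch, using the 750Ti estimates from
// the logs above: 12:58 (778s) guppies vs 04:45 (285s) shorties.
int tasks_needed(double request_secs, double est_task_secs) {
    int n = 0;
    while (request_secs > 0) {
        request_secs -= est_task_secs;  // each task covers its estimated runtime
        n++;
    }
    return n;
}

int main() {
    double request = 27543;  // seconds, as in this morning's allocation
    printf("guppies  (778s each): %d tasks\n", tasks_needed(request, 778));  // 36
    printf("shorties (285s each): %d tasks\n", tasks_needed(request, 285));  // 97
    return 0;
}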

In the meantime, 'no tasks sent' means 'no tasks available' (in feeder cache, or possibly 'only VLAR available'), and has nothing to do with 'reached a limit in tasks in progress'.
ID: 1928297
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1928305 - Posted: 6 Apr 2018, 12:40:26 UTC - in response to Message 1928297.  

In the meantime, 'no tasks sent' means 'no tasks available' (in feeder cache, or possibly 'only VLAR available'), and has nothing to do with 'reached a limit in tasks in progress'.

I think we've been through this before. The chances of the same machine hitting an empty feeder on around 15 consecutive tries are about zero. Especially since it's been happening for about a year, even when there aren't any Arecibo VLARs to be had. Obviously something else is going on.
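To put a number on 'about zero' - a back-of-envelope sketch, where the 50% per-request miss rate is purely an illustrative assumption, not a measured feeder statistic:

#include <cmath>
#include <cstdio>

// If each scheduler request independently found the feeder empty of
// suitable work with probability p, 15 consecutive misses would occur
// with probability p^15. Even a generous p = 0.5 gives about 3 in
// 100,000 - hence the argument that this isn't just bad luck.
int main() {
    double p = 0.5;  // assumed per-request miss rate (illustrative only)
    printf("P(15 consecutive misses) = %g\n", std::pow(p, 15));
    return 0;
}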
ID: 1928305
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928307 - Posted: 6 Apr 2018, 12:51:19 UTC - in response to Message 1928305.  

In the meantime, 'no tasks sent' means 'no tasks available' (in feeder cache, or possibly 'only VLAR available'), and has nothing to do with 'reached a limit in tasks in progress'.
I think we've been through this before. The chances of the same machine hitting an empty feeder on around 15 consecutive tries are about zero. Especially since it's been happening for about a year, even when there aren't any Arecibo VLARs to be had. Obviously something else is going on.
Well, gather the evidence, read the code, and isolate the logic failure that matches both. Let me have the analysis and the faulty line number, and I'll feed it upstream.

When I get a chance, I'm going to try tracing that g_wreq->max_jobs_exceeded() we were talking about yesterday, to see if my suspicion that it bails out at the first limited resource is correct. But that, as I keep saying, is a different question.
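For anyone following along, the suspected failure mode has roughly this shape - an illustrative reconstruction with invented names, not the actual BOINC scheduler source:

#include <cstddef>

// If max_jobs_exceeded() returns true as soon as ANY resource is at its
// in-progress limit, without checking that the limited resource is the
// one being requested, a host at its CPU limit could be refused GPU
// work it is still entitled to - matching the reports in this thread.
struct ResourceRequest {
    bool requested;    // did this request ask for work for this resource?
    int  in_progress;  // tasks currently on the host for this resource
    int  limit;        // server-side per-resource limit
};

// Suspected behaviour: bail out at the first limited resource.
bool max_jobs_exceeded_suspected(const ResourceRequest* r, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (r[i].in_progress >= r[i].limit) return true;
    }
    return false;
}

// Expected behaviour: only refuse when a *requested* resource is at its limit.
bool max_jobs_exceeded_expected(const ResourceRequest* r, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (r[i].requested && r[i].in_progress >= r[i].limit) return true;
    }
    return false;
}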
ID: 1928307
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1928308 - Posted: 6 Apr 2018, 12:59:09 UTC - in response to Message 1928266.  

Since the routine is working "correctly" on two of my four crunchers, and "incorrectly" on the other two I would suggest there is something amiss in the communication between the cruncher and the calculation. It is worth noting that the two that are "incorrect" are my top two....


. . Much the same here: my strongest cruncher has not had new work for hours and is now empty, while the machine beside it has a full cache.

Stephen

ID: 1928308
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1928309 - Posted: 6 Apr 2018, 13:07:32 UTC - in response to Message 1928294.

"This computer has reached a limit on tasks in progress" is the reason given for not allocating new work, even though the system hasn't reached its cache limit or the server-side limits. This applies to CPU as well as GPU work.
That message is designed to appear ONLY in connection with server-side limits - it is not designed to have anything to do with task durations, cache settings, or quota (daily quota is a technical term reserved for cases where you have been reporting failed or invalid tasks).

So, we need to ask again one of the questions I asked yesterday: are you equipped to easily count exactly how many of each type of task are present on your machine? This has to be a local count - information from this website is no use here. BoincTasks is likely to be a much better tool for this purpose than BOINC Manager - I'm using a predecessor called BoincView, which is less specific than BoincTasks, but to my eye has a cleaner interface, and does the job. I can see I currently have 561 tasks across the three machines I'm monitoring, as I type this - so not many successful fetches while I've been away doing other things.


. . In my case it is very easy ... ZERO. No new work for the past 3 to 4 hours and all tasks have finished and been reported, none left, but still just getting the message "no tasks sent". My CPU Q's are also filled with Arecibo VLARs.

. . So I have turned the machine off ...

Stephen

:(
ID: 1928309
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928312 - Posted: 6 Apr 2018, 13:25:43 UTC - in response to Message 1928309.  

. . In my case it is very easy ... ZERO. No new work for the past 3 to 4 hours and all tasks have finished and been reported, none left, but still just getting the message "no tasks sent". My CPU Q's are also filled with Arecibo VLARs.

. . So I have turned the machine off ...
Either I'm misunderstanding, or you're contradicting yourself. How can you say "all tasks have finished" and "my CPU Q's are filled" in the same breath?

If you're talking about two different machines, please make that clear.
ID: 1928312
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928314 - Posted: 6 Apr 2018, 13:36:43 UTC - in response to Message 1928311.  
Last modified: 6 Apr 2018, 13:47:45 UTC

Maybe they implemented some changes to the scheduler too?
Possible, but in my judgement unlikely: the scheduler is fiendishly complicated, and Eric is very cautious about changing it - I think he suspects that even David doesn't understand it fully.

There's been no official change in the repository code for either BOINC or SETI recently, but that doesn't confirm things either way. When testing changes, they'll sometimes tinker with local copies of code, and only commit the final version when it's working the way they want.

But I'll search my logs, just in case.

While I've been typing, I've just seen a whole block of tasks from 21no17aa overflow in seconds - that would increase the drawdown rate from the feeder, too.

Edit - this was during my breakfast fetch yesterday:

05-Apr-2018 09:11:10 [SETI@home] Sending scheduler request: To fetch work.
05-Apr-2018 09:11:10 [SETI@home] Requesting new tasks for NVIDIA GPU
05-Apr-2018 09:11:14 [SETI@home] Scheduler request completed: got 0 new tasks
05-Apr-2018 09:11:14 [SETI@home] No tasks sent
05-Apr-2018 09:11:14 [SETI@home] No tasks are available for SETI@home v8
05-Apr-2018 09:11:14 [SETI@home] This computer has reached a limit on tasks in progress
05-Apr-2018 09:15:45 [SETI@home] Computation for task blc03_2bit_blc03_guppi_58152_85190_DIAG_PSR_J1012+5307_0008.12693.1636.22.45.99.vlar_0 finished
05-Apr-2018 09:16:19 [SETI@home] Sending scheduler request: To fetch work.
05-Apr-2018 09:16:19 [SETI@home] Reporting 1 completed tasks
05-Apr-2018 09:16:19 [SETI@home] Requesting new tasks for NVIDIA GPU
05-Apr-2018 09:16:23 [SETI@home] Scheduler request completed: got 1 new tasks
I've removed extra lines, but the sequence is accurate - two requests, minimum separation, single completed task replaced immediately. Might still have changed since then, though.
ID: 1928314
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928317 - Posted: 6 Apr 2018, 14:02:17 UTC

Well, I'll keep my eye open for that when I do my next top-up. I'm still flushing shorties through the systems, so my next fetch fills by time as well as count (may as well have the cache as long-running as possible when your predicted weekend crash happens :P). In the meantime, I'm taking a break - second sunny day in a row, I need a walk.
ID: 1928317
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1928329 - Posted: 6 Apr 2018, 15:08:57 UTC

If there is a VLAR storm it must be selective. I just upped the cache to 1.4 days and not a single one of the 14 new CPU tasks was an Arecibo VLAR. I do have about 15 Arecibo VLARs from late yesterday and early this morning, but I now have a total of 55 CPU tasks, most of which are not Arecibo VLARs.
ID: 1928329
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1928342 - Posted: 6 Apr 2018, 16:16:39 UTC - in response to Message 1928329.  

I did the same just before setting out for my walk, at 15:25 local (14:25 UTC), and got two Arecibo VLARs. Everything else that downloaded while I was out (including the first at 15:30 local) has been guppies, so we must have caught just the tail end of it.

Log at All tasks for computer 7118033
ID: 1928342
juan BFP Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1928343 - Posted: 6 Apr 2018, 16:18:21 UTC - in response to Message 1928329.  
Last modified: 6 Apr 2018, 16:25:33 UTC

If there is a VLAR storm it must be selective. I just upped the cache to 1.4 days and not a single 14 new CPU task was an Arecibo VLAR. I do have about 15 Arecibo VLARs from late yesterday, and early this morning, but I now have a total of 55 CPU tasks most of which are not Arecibo VLARs.

I have >200 in my CPU cache if you want some. LOL
After the VLAR storm, all is working as expected; the GPU caches are filling normally, at least from here.
ID: 1928343
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1928346 - Posted: 6 Apr 2018, 16:37:42 UTC - in response to Message 1928266.  

Since the routine is working "correctly" on two of my four crunchers, and "incorrectly" on the other two I would suggest there is something amiss in the communication between the cruncher and the calculation. It is worth noting that the two that are "incorrect" are my top two....

I see the greatest effect on my three fastest and most capable machines. The oldest and slowest crunchers in my farm are less affected and stay at their cache allotments the longest.

Since they have been attached to the project the longest, maybe the servers have stabilized on the "correct" identification of system parameters and performance capabilities.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1928346
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1928350 - Posted: 6 Apr 2018, 16:46:46 UTC - in response to Message 1928307.  

When I get a chance, I'm going to try tracing that g_wreq->max_jobs_exceeded() we were talking about yesterday, to see if my suspicion that it bails out at the first limited resource is correct. But that, as I keep saying, is a different question.

Look earlier in the thread where I tried to follow the code down the rabbit hole. I came up with the mechanism that assigns the number of GPU tasks based on the number of GPU cards in the host.

And that is as far as I got. I couldn't find where CPU tasks were calculated as 100 tasks per CPU.
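The piece I was looking for presumably boils down to something like this - a guess at the shape, with invented names, since I never found the actual lines:

// The observed behaviour (100 tasks per CPU, and what looks like 100 per
// GPU card) is consistent with a base limit scaled by the device count
// the host reports. Purely illustrative; not located in the real source.
int in_progress_limit(int base_per_device, int n_devices) {
    return base_per_device * n_devices;  // e.g. 100 * 2 GPUs = 200 GPU tasks
}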
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1928350