The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
TBar (Joined: 22 May 99; Posts: 5204; Credit: 840,779,836; RAC: 2,768) |
Stuck uploads... ALL my machines have many uploads waiting to retry. Something changed a few minutes ago. The machine I had placed in 'Suspend GPU' for over an hour was finally able to upload the pages of files the server had been refusing to take. So I resumed crunching, and so far only a few uploads are waiting for retry instead of pages (hundreds). The other machines look better as well. We'll see how long this lasts... ...20 minutes later the server is back to refusing uploads. I suspended the GPUs again after the retries exceeded a page in length. Now it seems to be nearly normal again. |
BetelgeuseFive (Joined: 6 Jul 99; Posts: 158; Credit: 17,117,787; RAC: 19) |
Are the validators missing stuff? https://setiathome.berkeley.edu/workunit.php?wuid=3847744137 Why is this still waiting for validation? Tom |
rob smith (Joined: 7 Mar 03; Posts: 22815; Credit: 416,307,556; RAC: 380) |
This sort of thing happens from time to time. It appears to be more common when the servers have been having problems. It is normally cleared when the tasks reach their deadlines, which triggers the validators to wake up and grant credit as appropriate. However, in the case you highlight, the job will have to be sent out to another computer because it has not been validated: two of the three computers "disagree", and the third has submitted a result that looks as if it should validate, but that needs to be confirmed by someone else. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
BetelgeuseFive (Joined: 6 Jul 99; Posts: 158; Credit: 17,117,787; RAC: 19) |
rob smith wrote: "This sort of thing happens from time to time. It appears to be more common when the servers have been having problems. It is normally cleared when the tasks reach their deadlines which triggers the validators to wake up and grant credit as appropriate." I'm not sure you are correct. I think the first two results didn't match, but the third one has not yet been compared to the first two. If either the first or the second task matches the third, the unit will be validated (and removed from the results table ...) and no new task needs to be sent. Tom |
(Joined: 26 Jan 15; Posts: 88; Credit: 280,183; RAC: 1) |
Almost a week now with no work units/tasks at all... I see others say they get them, yet all I see is this: -Project communication failed: attempting access to reference site -Internet access OK - project servers may be temporarily down. Should I be worried? |
Richard Haselgrove (Joined: 4 Jul 99; Posts: 14690; Credit: 200,643,578; RAC: 874) |
You're running a very old version of BOINC - v6.6.38 (over 10 years old!). You need to run a newer version because of updated communications security protocols. |
TBar (Joined: 22 May 99; Posts: 5204; Credit: 840,779,836; RAC: 2,768) |
BetelgeuseFive wrote: "Are the validators missing stuff ?" This is pretty common, and has been happening forever. It happens on the maintenance days when the servers are shut down. To see more of them, just go back to the Tuesdays of each week and look for them. If you were active during that period it's likely you have a few; some have more than others. It might be interesting to estimate just how many there are, since everyone gets a few every Tuesday. I just checked one of my hosts for the Tuesday Jan 7th maintenance. Looks like quite a few are sitting there, wasting space: https://setiathome.berkeley.edu/workunit.php?wuid=3826315572. If that link is dead, just go to that date in the list, as there are plenty more where that one came from: https://setiathome.berkeley.edu/results.php?hostid=6813106&offset=23800 |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
People complaining about stuck uploads, what BOINC version are you using? I haven't been suffering from this problem, so could it be version dependent? One curious thing I have observed is that when my upload fails, the client announces a backoff time of several minutes in the event log, but it will actually retry in about half a minute. And almost always successfully. |
JohnDK (Joined: 28 May 00; Posts: 1222; Credit: 451,243,443; RAC: 1,127) |
I have had stuck upload problems with both 7.14.2 & 7.16.3 for some weeks I guess. |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
JohnDK wrote: "I have had stuck upload problems with both 7.14.2 & 7.16.3 for some weeks I guess." My version is between your versions: 7.15.0, but not a release version. I downloaded the development source some time last autumn to add spoofing to it. The most visible change compared to the version I ran before it, which came from the Ubuntu repositories, was that boincmgr doesn't scroll the list to the end whenever the size of the list changes, which was extremely annoying (but it still resets the sorting columns, which is only slightly less annoying). A bigger annoyance is that boincmgr forces its windows on top whenever they gain focus. But I'm not sure if this is a BOINC bug or a Qt bug. |
(Joined: 29 Apr 01; Posts: 13164; Credit: 1,160,866,277; RAC: 1,873) |
You'll have to pull from the 7.16 branch to get rid of the jumping tasks in the tasklist. Was fixed in #3064. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
I wonder whether the results corresponding to 'Workunits waiting for assimilation' on the SSP are still counted in 'Results returned and awaiting validation'. That would explain the mismatch between 'Results waiting for db purging', which is now really low, and the number of workunits of my hosts that the website lists in 'Valid' state, which is way higher than normal. That would also explain part of why 'Results returned and awaiting validation' is very high. |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
Quote: "You'll have to pull from the 7.16 branch to get rid of the jumping tasks in the tasklist. Was fixed in #3064" It is fixed in my version. It's not the official 7.15 release but some random point in the development between 7.15 and 7.16. The remaining problem affects the sorting order only. The viewport won't jump anywhere when new tasks arrive, but if I have sorted my list by some column, this sorting order will reset to default. Has 7.16 fixed this? Looks like the jump fix was committed in April. I downloaded my source some time in September. If some version fixes the super annoying 'window brings itself on top of other windows when it receives focus' bug, then please tell me immediately! You can't observe that bug on Windows, Mac, or a Linux whose window manager tries to emulate Windows or Mac, because on those systems the system itself does the same thing. The bug is a bug even in those environments because it's doing a redundant thing. But when you have a different environment, it becomes a really serious bug. Windows/Mac window management was originally designed in the eighties for using only one application at a time. Properly multiprocessing environments like Amiga and classic X11 kept the windows stacked the way the user wanted, so that using more than one application is possible without constant alt-tabbing. That's how my desktop works. Unfortunately most Linux distributions these days use window managers emulating Windows by default because they want users migrating from Windows to find the environment familiar. With all its limitations. |
Richard Haselgrove (Joined: 4 Jul 99; Posts: 14690; Credit: 200,643,578; RAC: 874) |
Quote: "You'll have to pull from the 7.16 branch to get rid of the jumping tasks in the tasklist. Was fixed in #3064" That one was me. I didn't have a Linux machine at the time, so I was working blind. Nobody mentioned a sorting problem, so I didn't fix one. It may have been solved elsewhere - I've got a Linux machine now, so I'll try it later. |
(Joined: 29 Apr 01; Posts: 13164; Credit: 1,160,866,277; RAC: 1,873) |
I'm trying to get the sorting order to change when newly downloaded tasks arrive. Sorting order never changes for me. Whatever column I have changed the sorting order in stays put. Maybe I don't understand your problem correctly. Running the 7.16.3 branch. Just an FYI and some persuasion . . . you can also get rid of the "finish file not found" errors if you update to the 7.16 branch. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
Actually the sorting problem doesn't affect the primary sorting column. That stays selected. The problem is in sorting by multiple columns. When I click one column, then another, I get a sorting that sorts by the second clicked column, but rows that don't differ by that column are sorted by the first clicked column. I can sort by any number of columns this way. I think the manager doesn't support multi-column sorting explicitly; it is just a side effect of using a stable sort algorithm. However, this stability is lost when the list updates. The rows stay sorted by the primary column, but rows with an identical primary column go into some arbitrary order that doesn't match any column. This could be fixed by maintaining a list of column indices and, whenever some column is clicked for sorting, moving that column's index to the front of the list. Then use all columns in the list order as the sorting key. The result would be explicitly supported multi-column sorting with the same user interface as the current serendipitous multi-column sorting. |
(Joined: 29 Apr 01; Posts: 13164; Credit: 1,160,866,277; RAC: 1,873) |
Ahhh, gotcha. Yes, you are correct. It seems the Manager does not support multi-order sorting. Just single. I just tried with time remaining, status and application. The AP tasks jump around. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Ville Saari (Joined: 30 Nov 00; Posts: 1158; Credit: 49,177,052; RAC: 82,530) |
I don't want to update my BOINC client version without a good reason because updating involves quite a bit of effort. First I would have to port my spoofing hack to the new version, and then I would have to run my queue empty because, in my experience, updating the BOINC version carries a high risk of the new version deleting every task downloaded by the old version. Even the completed ones! And whenever I intentionally run my queue empty, there's a guaranteed unplanned server outage exactly when the queue becomes empty! |
(Joined: 1 Apr 13; Posts: 1859; Credit: 268,616,081; RAC: 1,349) |
JohnDK wrote: "I have had stuck upload problems with both 7.14.2 & 7.16.3 for some weeks I guess." Ditto |
Ian&Steve C. (Joined: 28 Sep 99; Posts: 4267; Credit: 1,282,604,591; RAC: 6,640) |
No real problems with stuck uploads here. I see a handful of the uploads occasionally hitting a 3-5 min backoff, but they clear on the next retry so they don't stack up, and that's on the fastest systems on the project. Doesn't seem to be a huge issue really. None of my systems had more than like 5 in this state. Also using 7.16.3. Doubt it's at all related to the BOINC version. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
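
For anyone curious about the multi-column sorting fix Ville Saari describes above (keep a list of clicked columns, move the clicked one to the front, sort by all of them in that order), here is a minimal Python sketch of the idea. It is not BOINC Manager code: the class `MultiColumnSorter`, its methods, and the sample task rows are all made up for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class MultiColumnSorter:
    """Hypothetical helper: remembers the order in which columns were clicked."""

    def __init__(self) -> None:
        # Most recently clicked column first; it acts as the primary sort key.
        self.sort_columns: List[str] = []

    def click(self, column: str) -> None:
        # Move the clicked column to the front of the list (or add it there).
        if column in self.sort_columns:
            self.sort_columns.remove(column)
        self.sort_columns.insert(0, column)

    def sort(self, rows: List[Row]) -> List[Row]:
        # Build an explicit composite key from every remembered column, so the
        # result no longer depends on the previous order of the rows.
        return sorted(rows, key=lambda r: tuple(r[c] for c in self.sort_columns))


# Made-up sample rows, purely for illustration.
tasks = [
    {"name": "t1", "application": "AP", "status": "Running"},
    {"name": "t2", "application": "MB", "status": "Ready"},
    {"name": "t3", "application": "AP", "status": "Ready"},
    {"name": "t4", "application": "MB", "status": "Running"},
]

sorter = MultiColumnSorter()
sorter.click("application")  # first click: becomes the secondary key later
sorter.click("status")       # second click: now the primary key
ordered = sorter.sort(tasks)
```

Because the key is rebuilt from the whole column list on every sort, re-sorting after the list updates gives the same order for any rows that differ in at least one clicked column, which is the property that gets lost when relying on stable-sort side effects alone.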
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.