Panic Mode On (24) Server problems

Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933481 - Posted: 15 Sep 2009, 10:57:41 UTC - in response to Message 933134.  
Last modified: 15 Sep 2009, 11:00:14 UTC


We can't blame AP?

So, BTW, how do you understand Joe's post?


Astropulse splitters producing, download bandwidth heading for saturation....
                                                              Joe



..and every time the AP splitters are on.. we have bandwidth problems, or not? Have I seen something different from the rest of you? ;-)


Since 06:05 UTC, nothing - really no chance to UL (that was the last report from my GPU cruncher)..

ID: 933481 · Report as offensive
Profile Gundolf Jahn

Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 933487 - Posted: 15 Sep 2009, 11:12:01 UTC - in response to Message 933481.  

Since 06:05 UTC, nothing - really no chance to UL (that was the last report from my GPU cruncher)..

This was at 9:40 UTC (the log below is in local time, UTC+2):
15/09/2009 11:40:37|SETI@home|Started upload of 29au09ad.11188.15200.3.10.15_1_0
15/09/2009 11:40:37||[file_xfer_debug] URL: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler
15/09/2009 11:40:39||[file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
15/09/2009 11:40:42||[file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
15/09/2009 11:40:42||[file_xfer_debug] file transfer status 0
15/09/2009 11:40:42|SETI@home|Finished upload of 29au09ad.11188.15200.3.10.15_1_0
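
To check the 9:40 UTC figure against the log, here is a minimal Python sketch. It assumes, as the post states, that the client writes local time at UTC+2; the helper name `log_time_to_utc` is made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the post: log timestamps are local time at UTC+2.
LOCAL_TZ = timezone(timedelta(hours=2))


def log_time_to_utc(line: str) -> datetime:
    """Parse the leading 'dd/mm/yyyy HH:MM:SS' stamp of a log line and convert it to UTC."""
    stamp = line.split("|", 1)[0]
    local = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S").replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc)


line = "15/09/2009 11:40:37|SETI@home|Started upload of 29au09ad.11188.15200.3.10.15_1_0"
print(log_time_to_utc(line).strftime("%H:%M:%S UTC"))  # 09:40:37 UTC
```

So the upload in the log started and finished well after the 06:05 UTC mark mentioned in the quoted post.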

Regards,
Gundolf
ID: 933487 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 933489 - Posted: 15 Sep 2009, 11:18:01 UTC - in response to Message 933481.  

..and every time the AP splitters are on.. we have bandwidth problems, or not? Have I seen something different from the rest of you? ;-)

I see the cricket graphs, and although the throughput was rising when Joe posted, it never reached saturation - in fact, we haven't been maxed out since the recovery from last week's outage.
ID: 933489 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933491 - Posted: 15 Sep 2009, 11:22:43 UTC


Hmm.. then maybe I need to uninstall BOINC DEV-V6.6.38; maybe the 'next attempt counter' of the ULs (used when 3 ULs don't go through) isn't working properly?


BTW.
After 3 failed ULs in a row, how long does it take until BOINC starts a new UL attempt?

ID: 933491 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933492 - Posted: 15 Sep 2009, 11:35:31 UTC
Last modified: 15 Sep 2009, 11:36:35 UTC


I looked..

The last report from my GPU cruncher was a VLAR-killed WU.
So no UL was needed.

It now has ~ 400 results ready for UL, but nothing goes through.

The QX6700 now has ~ 30 ULs waiting, same story.

I manually started one UL on each PC; after 3 tries per PC the ULs stopped again.


Hmm.. maybe it's time again to reboot the PCs to get a fresh TCP/UDP/URL state - or whatever this was?

ID: 933492 · Report as offensive
Cruncher-American · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 933500 - Posted: 15 Sep 2009, 12:30:51 UTC

Same problem here - cannot upload. It got progressively worse over the last few (2? 3?) days, now NO UL going through, lots of HTTP errors in msgs tab. I shut down network activity, will keep it off until at least the PM is over today - I have plenty enough WUs for that - and hopefully the issue will be fixed by then!!!
ID: 933500 · Report as offensive
Profile twister@austria-national-team.at
Volunteer tester

Joined: 26 Jan 00
Posts: 30
Credit: 60,419,551
RAC: 0
Austria
Message 933503 - Posted: 15 Sep 2009, 12:57:40 UTC - in response to Message 933491.  


Hmm.. then maybe I need to uninstall BOINC DEV-V6.6.38; maybe the 'next attempt counter' of the ULs (used when 3 ULs don't go through) isn't working properly?



I think this is really not right about this unstable project:

All the participants currently have problems with the UL and are even thinking about uninstalling BOINC. (You can see the RAC dropping worldwide in all teams.)

But the Technical News doesn't mention at all that something isn't working!!!
No, on the contrary, the servers all show a green light for Running.
ID: 933503 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933510 - Posted: 15 Sep 2009, 13:33:41 UTC
Last modified: 15 Sep 2009, 13:34:48 UTC


No no.. I will not leave SETI@home. ;-)
I'm thinking about using BOINC DEV-V6.6.37 instead of BOINC DEV-V6.6.38, or going back to V6.4.7.



No no.. I don't want to remove BOINC completely. ;-)
I'm currently "testing" BOINC DEV-V6.6.38. It is a DEV (development/test) client. After 3 consecutive unsuccessful ULs it switches uploading off. Then it waits a few minutes/hours and tries 3 ULs again. And if those fail again, it waits hours again until the next attempt.

BOINC DEV-V6.6.37 doesn't have that. But it also no longer has the GPU/EDF bug.
So maybe I should install 6.6.37.

V6.6.36 (the latest official version) unfortunately has a bug, the EDF (earliest deadline first - compute the work with the shortest/nearest deadline first) GPU bug. BOINC starts and stops many, many and even more WUs, and after a short time.. ..all the CUDA WUs stay in system RAM and further computing is no longer possible because the whole system RAM is overloaded. It happened to me twice.

So don't worry.. ;-) ..I'm staying with SETI@home.. :-)

ID: 933510 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933512 - Posted: 15 Sep 2009, 13:42:16 UTC
Last modified: 15 Sep 2009, 13:45:56 UTC


But BTW, maybe there is something wrong with the SETI@home servers?
Or with the wingmen?


At the last two fresh starts of my GPU cruncher (it had gone idle because there were no WUs from the SETI@home servers), I had a stable RAC of ~ 52,000, and maybe ~ 180,000 pending.


Now it's a RAC of ~ 41,000 and 276,627.21 pending.


Hey wingmen - when will you send your results??

ID: 933512 · Report as offensive
Profile twister@austria-national-team.at
Volunteer tester

Joined: 26 Jan 00
Posts: 30
Credit: 60,419,551
RAC: 0
Austria
Message 933518 - Posted: 15 Sep 2009, 13:48:53 UTC - in response to Message 933510.  
Last modified: 15 Sep 2009, 13:50:05 UTC


No no.. I will not leave SETI@home. ;-)
I'm thinking about using BOINC DEV-V6.6.37 instead of BOINC DEV-V6.6.38, or going back to V6.4.7.



No no.. I don't want to remove BOINC completely. ;-)
I'm currently "testing" BOINC DEV-V6.6.38. It is a DEV (development/test) client. After 3 consecutive unsuccessful ULs it switches uploading off. Then it waits a few minutes/hours and tries 3 ULs again. And if those fail again, it waits hours again until the next attempt.

BOINC DEV-V6.6.37 doesn't have that. But it also no longer has the GPU/EDF bug.
So maybe I should install 6.6.37.

V6.6.36 (the latest official version) unfortunately has a bug, the EDF (earliest deadline first - compute the work with the shortest/nearest deadline first) GPU bug. BOINC starts and stops many, many and even more WUs, and after a short time.. ..all the CUDA WUs stay in system RAM and further computing is no longer possible because the whole system RAM is overloaded. It happened to me twice.

So don't worry.. ;-) ..I'm staying with SETI@home.. :-)


Did I write anything about leaving SETI?
No! Only about trying things out and possibly uninstalling the BOINC Manager and so on. But it isn't down to our BOINC Manager versions, because I have various ones in use, from 5.4.9 to even older ones from the 4.x series.
And all of them have problems with UL. And I find it sad that nobody in charge here reports that their server isn't working properly.
I haven't heard anything from Eric K. yet either ;-) because my pending credit is still gone...
ID: 933518 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933522 - Posted: 15 Sep 2009, 13:57:48 UTC


Oh, I see.. ;-)


Well.. I only meant that with V6.4.7 or DEV-V6.6.37 I might have a better chance of "forcing/squeezing" ULs through.. ;-)


ID: 933522 · Report as offensive
Profile twister@austria-national-team.at
Volunteer tester

Joined: 26 Jan 00
Posts: 30
Credit: 60,419,551
RAC: 0
Austria
Message 933527 - Posted: 15 Sep 2009, 14:05:32 UTC - in response to Message 933510.  




V6.6.36 (the latest official version) unfortunately has a bug, the EDF (earliest deadline first - compute the work with the shortest/nearest deadline first) GPU bug. BOINC starts and stops many, many and even more WUs, and after a short time.. ..all the CUDA WUs stay in system RAM and further computing is no longer possible because the whole system RAM is overloaded. It happened to me twice.

So don't worry.. ;-) ..I'm staying with SETI@home.. :-)


With V6.6.36, which I have installed now, I help myself with the command:
SC \\1.1.1.123 stop boinc
and 1 minute later
SC \\1.1.1.123 start boinc
I run these commands every 3 hours.
That always clears the memory and it goes full throttle again ;-)

With the BOINC Manager you recommended to me, it simply never requested more work - I tried it for 2 weeks - until the cache was down to 3 hours; only then did it ask for work again.
(I had the 64-bit version.)
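
The restart workaround in this post (stop the service, start it again a minute later, repeat every 3 hours) could be scripted roughly as follows. This is only a sketch, assuming a Windows machine on which `sc` is allowed to control the remote service; the host and service name are the ones given in the post, while `sc_command`, `restart_once`, `run_forever`, and the injectable `run`/`sleep` parameters are hypothetical names chosen for illustration.

```python
import subprocess
import time

HOST = r"\\1.1.1.123"          # remote machine, as written in the post
SERVICE = "boinc"              # service name, as written in the post
PAUSE_SECONDS = 60             # "1 minute later"
INTERVAL_SECONDS = 3 * 3600    # "every 3 hours"


def sc_command(action: str) -> list:
    """Build the argument list for `sc \\\\host <action> <service>`."""
    if action not in ("stop", "start"):
        raise ValueError(f"unsupported action: {action}")
    return ["sc", HOST, action, SERVICE]


def restart_once(run=subprocess.run, sleep=time.sleep) -> None:
    """Stop the service, wait a minute, then start it again."""
    run(sc_command("stop"), check=False)
    sleep(PAUSE_SECONDS)
    run(sc_command("start"), check=False)


def run_forever() -> None:
    """Repeat the restart cycle every 3 hours (not called automatically here)."""
    while True:
        restart_once()
        time.sleep(INTERVAL_SECONDS - PAUSE_SECONDS)
```

Passing `run` and `sleep` as parameters makes it possible to dry-run the cycle without touching a real service; a scheduled task would be an equally reasonable way to get the 3-hour repetition.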
ID: 933527 · Report as offensive
cncr04s

Joined: 25 Oct 00
Posts: 6
Credit: 296,024
RAC: 0
United States
Message 933528 - Posted: 15 Sep 2009, 14:06:13 UTC - in response to Message 931615.  

I receive too many GPU units and not enough CPU units, any way to fix this?
ID: 933528 · Report as offensive
Profile twister@austria-national-team.at
Volunteer tester

Joined: 26 Jan 00
Posts: 30
Credit: 60,419,551
RAC: 0
Austria
Message 933530 - Posted: 15 Sep 2009, 14:13:04 UTC - in response to Message 933528.  

I receive too many GPU units and not enough CPU units, any way to fix this?


I think it's this:

Your finished CPU WUs are stuck in upload,
so you don't get any new ones either...

ID: 933530 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933543 - Posted: 15 Sep 2009, 15:11:57 UTC - in response to Message 933527.  
Last modified: 15 Sep 2009, 15:25:43 UTC

With V6.6.36, which I have installed now, I help myself with the command:
SC \\1.1.1.123 stop boinc
and 1 minute later
SC \\1.1.1.123 start boinc
I run these commands every 3 hours.
That always clears the memory and it goes full throttle again ;-)

With the BOINC Manager you recommended to me, it simply never requested more work - I tried it for 2 weeks - until the cache was down to 3 hours; only then did it ask for work again.
(I had the 64-bit version.)


?

Hmm.. no idea where you type that in.. ;-)


V6.4.7 32-bit worked very well for me.
Maybe it was because you had several projects on the PC?

I only had SETI@home and everything was fine.
Then I added SETI-Beta as well, and V6.4.7 started acting up too.
So I removed SETI-Beta again.


--------------------------------------


O.K., now I use BOINC DEV-V6.6.37 and the ULs go through.. of course not all, but maybe every 4th..
So I have a better chance of uploading my results.


Maybe.. I don't know how the devs intended it to work..
But it would be better like this: if a UL can't reach the UL server, stop all ULs for ~ 60 sec. and then start one UL again; if the UL server still can't be reached, pause all ULs for another ~ 60 sec., and then start one UL again. Once a UL goes through, start the next UL and follow the same procedure.
Wouldn't this be a good idea?

Because how was it possible that my BOINC DEV-V6.6.38 collected ~ 400 results in the UL overview?
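
The retry scheme proposed in this post can be sketched in a few lines of Python. This is an illustration of the poster's suggestion only, not of how any BOINC version actually schedules uploads; `upload_all` and the `try_upload` callback are hypothetical names, and the 60-second pause is the value from the post.

```python
import time
from collections import deque

RETRY_PAUSE_SECONDS = 60  # the "~ 60 sec." pause suggested in the post


def upload_all(pending, try_upload, sleep=time.sleep):
    """Upload results one at a time; after a failure, pause all uploads, then retry.

    try_upload(result) -> bool is a hypothetical callback that returns True
    when the upload server accepted the result. Returns the number of attempts.
    """
    queue = deque(pending)
    attempts = 0
    while queue:
        attempts += 1
        if try_upload(queue[0]):
            queue.popleft()             # success: move on to the next result
        else:
            sleep(RETRY_PAUSE_SECONDS)  # failure: hold everything for ~ 60 s
    return attempts


# Simulated server that rejects the first two attempts and then accepts everything.
outcomes = iter([False, False, True, True, True])
pauses = []
attempts = upload_all(["r1", "r2", "r3"], lambda result: next(outcomes), sleep=pauses.append)
print(attempts, pauses)  # 5 [60, 60]
```

Only one upload is ever in flight, so a failing server sees at most one request per client per minute under this scheme.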

ID: 933543 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 933567 - Posted: 15 Sep 2009, 16:04:01 UTC - in response to Message 933491.  


Hmm.. then maybe I need to uninstall BOINC DEV-V6.6.38; maybe the 'next attempt counter' of the ULs (used when 3 ULs don't go through) isn't working properly?

Why is this such a hard concept?

If the basic problem is "too much" then pushing more is not the solution.

"Too much" is whenever the bandwidth is pegged, or when the server can't handle (or isn't handling) the load.

Pushing harder is like throwing gasoline on a fire.

But hey, I've worked with lots of people who see the solution to every problem as just more force.
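
The argument about load can be put into numbers with a small simulation. It compares how often a single client knocks on an unreachable server when it retries at a fixed interval versus when it doubles its wait after every failure (exponential backoff, a general technique; no claim is made here about the exact schedule any BOINC version uses). The 4-hour outage, the 60-second first delay, and the 1-hour cap are assumptions picked for illustration, and `attempts_during_outage` is a hypothetical helper.

```python
def attempts_during_outage(outage_seconds, first_delay, factor=1.0, max_delay=None):
    """Count the retries one client makes while the server stays unreachable."""
    elapsed = 0.0
    delay = float(first_delay)
    attempts = 0
    while elapsed < outage_seconds:
        attempts += 1               # one failed attempt at time `elapsed`
        elapsed += delay            # wait before the next attempt
        delay *= factor             # factor 1.0 keeps the interval fixed
        if max_delay is not None:
            delay = min(delay, max_delay)
    return attempts


outage = 4 * 3600  # hypothetical 4-hour outage
fixed = attempts_during_outage(outage, first_delay=60)
backoff = attempts_during_outage(outage, first_delay=60, factor=2.0, max_delay=3600)
print(fixed, backoff)  # 240 9
```

Multiplied across every host attached to the project, the gap between 240 and 9 attempts per client is the difference between piling onto a struggling server and leaving it room to recover.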
ID: 933567 · Report as offensive
Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 933588 - Posted: 15 Sep 2009, 21:51:26 UTC

I've come to the conclusion that there is only one problem with the uploads and it has nothing to do with Seti or the Boinc version being used.

The problem is the volunteers. They have too much time on their hands and try to track each and every WU. They just can't walk away from the computer and let the software do what it was designed to do.

People, take a chill pill.

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 933588 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933592 - Posted: 15 Sep 2009, 22:01:08 UTC
Last modified: 15 Sep 2009, 22:05:36 UTC


You could say that if your PC makes a RAC of ~ 20.

But if your PC makes a RAC of ~ 52,000, it must UL ~ 600 to 3,000 results/day!
And if you use DEV-V6.6.38, which doesn't work well.. you should switch to a BOINC version which isn't so buggy.
So I now use DEV-V6.6.37, which doesn't pile up result ULs; this version actually performs the ULs.

No, I can't use V6.6.36 with the 'crazy EDF GPU BUG'.

The UL function in DEV-V6.6.38 doesn't work well.
I hope that in DEV-V6.10.x this UL function works better (as I described in my earlier post) or is disabled.
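
For a sense of scale, the figures in this post can be turned into per-result numbers. The sketch below assumes that RAC approximates credit earned per day; `per_result_figures` is a hypothetical helper, and the inputs are the ~ 52,000 RAC and the 600 to 3,000 results/day quoted above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_result_figures(rac, results_per_day):
    """Return (credit per result, average seconds between uploads)."""
    return rac / results_per_day, SECONDS_PER_DAY / results_per_day


for results in (600, 3_000):
    credit, gap = per_result_figures(52_000, results)
    print(f"{results} results/day: ~{credit:.0f} credits each, one upload every ~{gap:.0f} s")
```

Under that assumption the host needs one successful upload roughly every 29 to 144 seconds just to keep pace, which is why a long pause after failed uploads makes the backlog grow so quickly on a fast GPU cruncher.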

ID: 933592 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 933596 - Posted: 15 Sep 2009, 22:09:46 UTC - in response to Message 933592.  


You could say that if your PC makes a RAC of ~ 20.

But if your PC makes a RAC of ~ 52,000, it must UL ~ 600 to 3,000 results/day!
And if you use DEV-V6.6.38, which doesn't work well.. you should switch to a BOINC version which isn't so buggy.
So I now use DEV-V6.6.37, which doesn't pile up result ULs; this version actually performs the ULs.

No, I can't use V6.6.36 with the 'crazy EDF GPU BUG'.

The UL function in DEV-V6.6.38 doesn't work well.
I hope that in DEV-V6.10.x this UL function works better (as I described in my earlier post) or is disabled.


Crazy EDF GPU bug you say? Is that when BOINC processes a few % of a GPU task, then moves to another, processes a few % of a GPU task, then moves to another, processes a few % of a GPU task, then moves to another...

I had started seeing that over the past few weeks. I thought my GF8500 was just starting to fail or something, so I stopped doing GPU tasks, as it was slow anyway.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 933596 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 933598 - Posted: 15 Sep 2009, 22:14:24 UTC - in response to Message 933596.  

Crazy EDF GPU bug you say? Is that when BOINC processes a few % of a GPU task, then moves to another, processes a few % of a GPU task, then moves to another, processes a few % of a GPU task, then moves to another...

I had started seeing that over the past few weeks. I thought my GF8500 was just starting to fail or something, so I stopped doing GPU tasks, as it was slow anyway.


Correct..
And every suspended CUDA WU takes up system RAM.
And if you have maybe ~ 20 or 30 suspended CUDA tasks, your PC can't crunch because your system RAM is overloaded.

So you should also use DEV-V6.6.37.

ID: 933598 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.