Connection Error
Author | Message |
---|---|
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
Two or three times a week, the Boinc tray icon displays a red dot next to the Boinc logo and a balloon with text something like "Connection error, incorrect password." When I click on the Boinc icon to bring up BoincMgr and click on Select Computer, all I have to do is enter the computer name, and the correct password appears too. Then I click OK and Boinc proceeds normally. But many hours of processing may be lost. This happens on all my computers, presently three. How can I stop this connection error from happening? |
Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0 |
"But many hours of processing may be lost."
No, with the symptoms you describe, nothing is lost. Only the BOINC Manager (the GUI) is disconnected from the client, which controls the project applications. So the only effect is that you can't see that it's still working (except with Windows Task Manager).
"This happens on all my computers, presently three."
The account you posted from has no active computers attached. The last connection any of your computers made on this account was on 5 Oct 2011, 16:48:21 UTC. Did you accidentally create a new account when adding (attaching) the newest hosts?
Regards, Gundolf
Computers aren't everything in life. (Just kidding)
SETI@home classic workunits 3,758 SETI@home classic CPU time 66,520 hours |
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
But many hours of processing may be lost. |
BilBg Send message Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
"This is true for a while, but then boinc.exe exits (for lack of a heartbeat?), and the system stops processing WUs. For example, that is what happened here:"
If boinc.exe exits, you should see some lines that say Exit/Shutdown before "Starting BOINC client":
08-Dec-2011 04:24:53 [SETI@home] Computation for task 22oc11ac.9371.53591.11.10.90_1 finished
08-Dec-2011 04:24:53 [SETI@home] Starting 24oc11ad.27923.20522.15.10.215_1
08-Dec-2011 04:24:53 [SETI@home] Starting task 24oc11ad.27923.20522.15.10.215_1 using setiathome_enhanced version 528
08-Dec-2011 04:24:56 [SETI@home] Started upload of 22oc11ac.9371.53591.11.10.90_1_0
08-Dec-2011 04:25:09 [SETI@home] Finished upload of 22oc11ac.9371.53591.11.10.90_1_0
09-Dec-2011 04:25:13 [SETI@home] Sending scheduler request: To report completed tasks.
09-Dec-2011 04:25:13 [SETI@home] Reporting 1 completed tasks, not requesting new tasks
09-Dec-2011 04:25:30 [SETI@home] Scheduler request completed
[12/09/11 13:19:30] TRACE [-1323943]: ***** Win9x Monitor System Shutdown/Logoff Event Detected *****
09-Dec-2011 13:19:30 [---] Exit requested by user
09-Dec-2011 14:19:17 [---] Starting BOINC client version 6.6.38 for windows_intelx86
09-Dec-2011 14:19:17 [---] log flags: task, file_xfer, sched_ops, benchmark_debug
Is it possible that your entire remote system was hung (the OS unresponsive, nothing to do with BOINC directly)?
- ALF - "Find out what you don't do well ..... then don't do it!" :) |
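The check described in this post (look for an Exit/Shutdown line before each "Starting BOINC client" line) can be sketched in a few lines of Python. This is only an illustration: the function name `find_unclean_restarts` is made up here, and the marker strings are copied from the log excerpt above, so other client versions may word them differently.

```python
def find_unclean_restarts(lines):
    """Return start-up lines that were not preceded by an exit/shutdown line.

    The very first start-up line in the input is never reported, because
    nothing is known about what happened before the log begins.
    """
    unclean = []
    seen_start = False   # has any earlier start-up line been seen?
    saw_exit = False     # has an exit line been seen since the last start-up?
    for line in lines:
        # Marker strings are taken from the excerpt above (assumption:
        # they are the same on the client version being examined).
        if "Exit requested" in line or "Shutdown/Logoff Event Detected" in line:
            saw_exit = True
        elif "Starting BOINC client" in line:
            if seen_start and not saw_exit:
                unclean.append(line)
            seen_start = True
            saw_exit = False
    return unclean
```

A start-up line that shows up in the result suggests the client stopped without logging a clean exit (a crash, a kill, or a power loss), which is the case worth investigating further.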
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
There is no "Exit." Boinc.exe exits, BoincMgr cannot connect to it, so the latter posts a tray balloon saying "connection error, incorrect password," or similar. But the password is already there and correct, which can be demonstrated by going into BoincMgr.exe, clicking on Advanced/Select Computer, and entering the computer name; the password appears as if by magic.
I think the problem might be a timing issue caused by the following: The Beta site was not accepting requests or downloading WUs over the entire weekend of the New Year and had been like that since the Tuesday outage. When it suddenly came on about noon EST on 01/02/12, I returned about 3,000 WUs in a few minutes. That implies that the client_state.xml files were huge because of all those voluminous <stderr_txt> sections. Boinc.exe had a problem reading or writing that long file and quit.
In fact, I might have caused the problem: A program of mine looks for a change in size of the job_log_XYZ.txt file; when it sees one, it reads the client_state.xml file over the network to see what happened, and puts started and finished results in a database. Normally, that is not a problem, but with a very long (> 50 MB) client_state.xml file, my program might just be locking it out long enough for Boinc.exe to have trouble writing to it, so Boinc.exe quits. I write this, not because confession is good for the soul, but because I see this problem only occasionally, and when I do, it happens several times a week. Perhaps it only happens when I have a large number of WUs in inventory and there is an outage.
It would help if Boinc.exe and BoincMgr.exe wrote messages to the System Event Log when they encounter a problem or a significant event. This is what NTPD.exe does, and it too runs on both Windows and Linux. NTP.ORG has developed its own API for this (complete with levels of importance so one can specify the logging detail), which then channels the message to the correct logging API depending on which O/S it is running on. |
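If the monitoring program really is colliding with the client's writes (which is only a suspicion at this point), one cautious option is to wait until the state file has stopped changing before opening it. The Python sketch below shows the idea; `wait_until_stable` and `read_state_file` are hypothetical helper names, the default timings are arbitrary, and nothing here is confirmed to prevent the crash.

```python
import os
import time


def wait_until_stable(path, quiet_seconds=2.0, timeout=30.0, poll=0.5):
    """Return True once the file's size and mtime have stayed unchanged for
    quiet_seconds; return False if timeout elapses first."""
    deadline = time.monotonic() + timeout
    last = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        st = os.stat(path)
        current = (st.st_size, st.st_mtime)
        now = time.monotonic()
        if current != last:
            # The file changed (or this is the first look): restart the clock.
            last = current
            stable_since = now
        elif now - stable_since >= quiet_seconds:
            return True
        time.sleep(poll)
    return False


def read_state_file(path, **kwargs):
    """Read the whole file in one pass after it looks quiet, or return None
    if it never settled within the timeout."""
    if not wait_until_stable(path, **kwargs):
        return None
    with open(path, "rb") as f:
        return f.read()
```

Reading the bytes in one pass and parsing them afterwards keeps the file open for as short a time as possible; the client can still start a write right after the quiet period, so this narrows the window and does not close it.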
Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0 |
"There is no 'Exit.' Boinc.exe exits"
That's a contradiction! ;-)
"When it suddenly came on about noon EST on 01/02/12, I returned about 3,000 WUs in a few minutes. That implies that the client_state.xml files were huge because of all those voluminous <stderr_txt> sections."
In those cases, the <max_tasks_reported> option in cc_config.xml might help. The number of tasks might also be the cause of the manager's disconnecting from the client. Do you have "Show all tasks" selected in the Tasks tab?
"It would help if Boinc.exe and BoincMgr.exe wrote messages to the System Event Log when they encounter a problem, or a significant event."
They don't write to the system event log but to the BOINC event log (formerly known as the Messages tab), which is kept in the file stdoutdae.txt in your BOINC data directory.
Regards, Gundolf |
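For anyone who wants to try the <max_tasks_reported> suggestion, the sketch below builds a minimal cc_config.xml in Python. The value 100 is purely illustrative, the helper name `build_cc_config` is made up, and placing the element under <options> is an assumption based on the usual cc_config.xml layout; check the client configuration documentation for your BOINC version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_tasks_reported):
    """Return a minimal cc_config.xml document as a string."""
    root = ET.Element("cc_config")
    # Assumption: the option belongs in the <options> section.
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_tasks_reported").text = str(max_tasks_reported)
    return ET.tostring(root, encoding="unicode")


# 100 is an arbitrary example value, not a recommendation.
print(build_cc_config(100))
# <cc_config><options><max_tasks_reported>100</max_tasks_reported></options></cc_config>
```

The resulting text would be saved as cc_config.xml in the BOINC data directory. If a cc_config.xml already exists there, add the element to it by hand so that existing settings are not overwritten.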
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"It would help if Boinc.exe and BoincMgr.exe wrote messages to the System Event Log when they encounter a problem, or a significant event."
When Boinc.exe crashes, it will attempt to write its debug material to stderrdae.txt. When Boincmgr.exe has problems, it will write reports of them to stderrgui.txt. These files can be found in your BOINC Data directory, where client_state.xml lives. On most Windows versions this directory is hidden. You can check in your BOINC start-up messages where your data directory lives. But normally, the default places for it can be found in this FAQ.
Only when the whole shebang really crashes will something be written into Windows Event Viewer->Applications.
A case of Boinc.exe exiting but leaving Boincmgr.exe running should be recorded in the stderrdae.txt file. |
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
I found stderrdae.txt; it is fairly large. In every case, it was trying to write the state file when it failed. |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Post the appropriate trace lines. But probably best not here, as that is too much info for this thread. Use an external site, such as http://pastebin.com/ and leave the link to your crash material. |
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
Thank you for replying. It crashed again last night. The Boinc tray icon was displaying the connection error when I came down to work this morning. Here is when my Java program read the client_state.xml file on computer "Nat":
1/6/2012 3:09:42 AM Start processing Nat
1/6/2012 3:09:44 AM Start processing Phillip
The dump begins at "Dump Timestamp : 01/06/12 03:10:15." That does not prove that my program that reads the state file is interfering with Boinc.exe, but they are too close together not to be gravely suspicious. I put the entire crash dump on http://www.pastebin.com/. My user name is CELLiott and the Paste Bin is named BoincCrashDump. I can access it; please tell me if you cannot. BTW, the client_state.xml file is only 3,666,837 bytes long now. The file length may have nothing to do with the error. |
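The gap between the monitoring program's read and the crash dump can be worked out directly from the two timestamps quoted in this post. The Python sketch below assumes that both stamps are month/day/year, and that both come from clocks in the same time zone that agree with each other; `seconds_between` is an illustrative helper name.

```python
from datetime import datetime


def seconds_between(read_stamp, dump_stamp):
    """Seconds from the monitor's read to the start of the crash dump."""
    # Monitor log format, e.g. "1/6/2012 3:09:42 AM" (12-hour clock).
    read_time = datetime.strptime(read_stamp, "%m/%d/%Y %I:%M:%S %p")
    # Crash dump format, e.g. "01/06/12 03:10:15" (24-hour clock).
    dump_time = datetime.strptime(dump_stamp, "%m/%d/%y %H:%M:%S")
    return (dump_time - read_time).total_seconds()


print(seconds_between("1/6/2012 3:09:42 AM", "01/06/12 03:10:15"))  # 33.0
```

A 33-second gap fits the suspicion but does not establish cause; repeating the comparison for each earlier dump in stderrdae.txt would show whether the crashes consistently follow a read.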
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
I just noticed, you're on 6.10.58; any reason why you aren't using a 6.12? I will forward the log to the developers, but I doubt that they will do anything with it, unless you manage to have the same problem on 6.12.34 or later. |
CElliott Send message Joined: 19 Jul 99 Posts: 178 Credit: 79,285,961 RAC: 0 |
I don't use 6.12 because I hate the extra steps one has to go thru to see the messages. I don't see any reason for the change, unless it is impossible to have more than 6 tabs across the top. On the one machine I have that runs 6.12, there is nothing in stderrdae.txt since 2007, so perhaps the problem is fixed in 6.12. |