Panic Mode On (94) Server Problems?

Message boards : Number crunching : Panic Mode On (94) Server Problems?
Darth Beaver Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1632946 - Posted: 26 Jan 2015, 8:23:35 UTC - in response to Message 1632929.  

Alarmist? Bernie, this is only happening when I use BOINC. I already know I can merge them; that's not the problem. It's the fact that I've only had a problem when the client is being used, so there must be some vulnerability someone has found and is possibly trying to use. I HAVE CHANGED NO HARDWARE, only one program, and now I have an answer from the very user here that made it. I suspect something, but stuff you all: I'll keep my mouth shut, and if it happens to you, bad luck. I'll keep what I suspect to myself.
ID: 1632946 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1632948 - Posted: 26 Jan 2015, 8:34:43 UTC
Last modified: 26 Jan 2015, 8:35:17 UTC

Glenn it can happen due to lots of reasons, sometimes the user has to do nothing for it to happen.

I remember a thread a while ago about this. I never said YOU did anything; I just said that, until the facts are known, typing "Seti has been hacked" in capitals is just a bit alarmist.

If you wait I am sure someone here can explain the technicalities as to why and how this can and does happen.

I get from your post that you suspect I have hacked into your machine. Is this correct?
ID: 1632948 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1632953 - Posted: 26 Jan 2015, 9:05:30 UTC - in response to Message 1632948.  

Glenn it can happen due to lots of reasons, sometimes the user has to do nothing for it to happen.

I remember a thread a while ago about this.

I do remember seeing a thread about this, too. There was some time a few years ago where some machines were getting a new ID for... no reason that could be determined. I believe the best theory had something to do with A/V interrupting the process of updating client_state.xml every 60 seconds, and if BOINC can't write it.. it just kind of starts anew.
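
To illustrate the failure mode in that theory: a state file that is rewritten every minute is usually protected by writing to a temporary file first and then renaming it over the old one, so an interrupted write never leaves a half-written file behind. The Python sketch below shows that general pattern only. It is not BOINC's code, and `write_state_atomically` is a made-up name for this example.

```python
import os
import tempfile


def write_state_atomically(path: str, contents: str) -> None:
    """Replace the file at `path` with `contents` without exposing a partial write.

    Hypothetical helper for illustration; not taken from the BOINC client.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Swap the new file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Something interrupted the write or the rename: keep the old file
        # and clean up the temporary one.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Even with this pattern, the rename step can still fail if another process (a virus scanner, say) holds the target file open at that moment, which would fit the A/V theory. What the client does after such a failure is a separate question that this sketch does not answer.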



However, if you say both "machines" seem to be updating on their own even though there is really only one machine, the next thing you'll need to do is click "show IP address" on both of the apparently-identical machines to confirm whether or not it is the same machine or not.

So compare those addresses and report back (for security reasons, DON'T actually post your IP here, just whether they're the same or different).
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1632953 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1632967 - Posted: 26 Jan 2015, 10:12:04 UTC - in response to Message 1632948.  

Glenn it can happen due to lots of reasons, sometimes the user has to do nothing for it to happen.

I remember a thread a while ago about this. I never said YOU did anything; I just said that, until the facts are known, typing "Seti has been hacked" in capitals is just a bit alarmist.

If you wait I am sure someone here can explain the technicalities as to why and how this can and does happen.

I get from your post that you suspect I have hacked into your machine. Is this correct?

Not anything sinister.....
I have had this happen once or twice after a hardware malfunction.
Boinc gets screwed up and makes the servers think it's a new installation.
No malware involved.
If the Boinc installation gets whacked, the servers do what they do thinking it's just a new host coming online.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1632967 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1632976 - Posted: 26 Jan 2015, 11:37:12 UTC

I've had it happen twice, and I think the culprit was Steam, possibly with some help from A/V software. I cannot prove it, but Steam has been banished from the start-up group and it hasn't happened since.
ID: 1632976 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1633003 - Posted: 26 Jan 2015, 13:51:00 UTC - in response to Message 1632953.  

Glenn it can happen due to lots of reasons, sometimes the user has to do nothing for it to happen.

I remember a thread a while ago about this.

I do remember seeing a thread about this, too. There was some time a few years ago where some machines were getting a new ID for... no reason that could be determined. I believe the best theory had something to do with A/V interrupting the process of updating client_state.xml every 60 seconds, and if BOINC can't write it.. it just kind of starts anew.

However, if you say both "machines" seem to be updating on their own even though there is really only one machine, the next thing you'll need to do is click "show IP address" on both of the apparently-identical machines to confirm whether or not it is the same machine or not.

So compare those addresses and report back (for security reasons, DON'T actually post your IP here, just whether they're the same or different).

I had one machine a few years ago that was generating a new ID every few days for no reason at all. A work machine with a clean ghosted OS image loaded & only BOINC running.
Also for me BOINC only records the internal network IPs from my machines on the LAN side of my router rather than my internet IP.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1633003 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1633005 - Posted: 26 Jan 2015, 13:56:17 UTC - in response to Message 1633002.  

NO Bernie I don't . And seeing as I know what is going on stuff bionic there is a problem but ..yours . I will keep what I have found out to myself roll back the offending program to a older version and stuff you all if you get problems and find your bank account empty bad luck . I tried to warn you but now F you's


Well you seem upset when several other posters have said that this sort of thing can and does happen.

If you believe there is a problem with Boinc, you should perhaps report it to the Boinc developers.

If you believe there is a problem with SETI@Home that might involve people on this forum, then you should report it to Fred, who is in charge of the forums.
Doing nothing seems a bit mean-spirited; however, it is your choice.
ID: 1633005 · Report as offensive
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30651
Credit: 53,134,872
RAC: 32
United States
Message 1633011 - Posted: 26 Jan 2015, 14:33:57 UTC - in response to Message 1633003.  

Glenn it can happen due to lots of reasons, sometimes the user has to do nothing for it to happen.

I remember a thread a while ago about this.

I do remember seeing a thread about this, too. There was some time a few years ago where some machines were getting a new ID for... no reason that could be determined. I believe the best theory had something to do with A/V interrupting the process of updating client_state.xml every 60 seconds, and if BOINC can't write it.. it just kind of starts anew.

However, if you say both "machines" seem to be updating on their own even though there is really only one machine, the next thing you'll need to do is click "show IP address" on both of the apparently-identical machines to confirm whether or not it is the same machine or not.

So compare those addresses and report back (for security reasons, DON'T actually post your IP here, just whether they're the same or different).

I had one machine a few years ago that was generating a new ID every few days for no reason at all. A work machine with a clean ghosted OS image loaded & only BOINC running.
Also for me BOINC only records the internal network IPs from my machines on the LAN side of my router rather than my internet IP.

Yes, it's a server error where it fails to recognize that the computer is the same as another it already has in its table. Climate Prediction was generating a new ID for several of my machines nearly every time they asked for work. Most of the time the BOINC server eventually figures out the machines are one and the same; sometimes it is clueless. That's one of the reasons for the merge hosts function of the website.
ID: 1633011 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1633137 - Posted: 26 Jan 2015, 20:00:39 UTC - in response to Message 1633068.  

Keith was seen sitting in front of Synergy with a bottle of fine Scotch, saying "Good work mate, good work."
ID: 1633137 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1633168 - Posted: 26 Jan 2015, 20:55:50 UTC - in response to Message 1633003.  

the next thing you'll need to do is click "show IP address" on both of the apparently-identical machines to confirm whether or not it is the same machine or not.

Also for me BOINC only records the internal network IPs from my machines on the LAN side of my router rather than my internet IP.

Interesting. My machines have always shown me local and external. Even when I had two machines running in my office at my old job at college.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1633168 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1633169 - Posted: 26 Jan 2015, 20:57:29 UTC - in response to Message 1633168.  

outside IPs can be blocked by various methods.
ID: 1633169 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1633170 - Posted: 26 Jan 2015, 21:01:46 UTC - in response to Message 1633169.  

outside IPs can be blocked by various methods.

Or at least obfuscated in some way by use of a proxy or VPN, but I thought the external IP that shows on your view of your host pages was what the server sees you connecting from, regardless of whether it's a direct connection or via a proxy or VPN.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1633170 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1633177 - Posted: 26 Jan 2015, 21:10:15 UTC

Also, I wanted to point out that there was something I (among several others) was curious about.

Those WUs that get stuck in validation limbo.. I've been wondering "at what point exactly in the process do they get stuck?" I theorized that once a canonical result is chosen and assimilated, it moves on to the next stage of deleting the results/WU files, but then gets stuck at purging from the DB. That was my theory for a while.

However, this WU disproves that theory. Assimilation was broken and not running at that time. _1 missed the deadline, it got sent out to _2. _2 returned it before _1 did. _0 (me) and _2 got validated and awarded credit, _1 returned it late and is pending.

That happened BEFORE assimilation. So that means this problem lies somewhere in the logic for the validator, or if there is some kind of governing process that moves a WU along the pipeline, the logic fault could be in there, somewhere before assimilation actually gets called.

Now we just need to dig through that section of code and figure out what the problem could be. I mean.. it shouldn't be too difficult to tell validation to wait until all the results are in.. or "if already validated and a late task gets reported, mark it as 'too late to validate' and discard it." There are times where _1 is late, _2 gets assigned, _1 reports, and _2 is NOT late. In that situation, _2 should still get validated and awarded credit.
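
To make the proposed rule concrete, here it is as a rough Python sketch. It only models the behaviour suggested in this post; it is not the BOINC validator or transitioner, and the `Result` class, the `classify` function and the numeric timestamps are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """One task (_0, _1, _2, ...) of a workunit. Hypothetical model."""
    name: str
    deadline: float
    reported_at: Optional[float] = None  # None means not returned yet

    @property
    def late(self) -> bool:
        return self.reported_at is not None and self.reported_at > self.deadline


def classify(result: Result, wu_validated_at: Optional[float]) -> str:
    """Decide what to do with a result under the rule proposed above.

    wu_validated_at is the time the WU got its canonical result,
    or None if it has not been validated yet.
    """
    if result.reported_at is None:
        return "in progress"
    if not result.late:
        # On-time results are always validated, even if a late wingman
        # reported first and the WU is already validated.
        return "validate"
    if wu_validated_at is not None and result.reported_at > wu_validated_at:
        # Late, and the WU was already validated without it: discard.
        return "too late to validate"
    # Late, but the WU still needs results, so it may as well be used.
    return "validate"
```

With the WU described above (_1 misses its deadline, _2 reports first, _0 and _2 validate), `classify` returns "validate" for _0 and _2 and "too late to validate" for _1, instead of leaving _1 pending.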
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1633177 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1633185 - Posted: 26 Jan 2015, 21:18:56 UTC - in response to Message 1633177.  

... if there is some kind of governing process that moves a WU along the pipeline, the logic fault could be in there, somewhere before assimilation actually gets called.

Yes, there is, and it's called the transitioner. But I don't know what triggers it to pay attention to any particular task or WU, or what circumstances (exact coincidence of timing?) might cause the trigger to fail.
ID: 1633185 · Report as offensive
Darth Beaver Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1633227 - Posted: 26 Jan 2015, 22:09:28 UTC

All I will say is there is an exploit, and there is a reason the servers gave my machine an ID, but seeing as I'm just a high school idiot who knows nothing, I'll keep what I have learnt to myself. Seeing as the program was written by a user at SETI, I can see others getting problems. The user involved has not done this maliciously; his intention was just to produce a program we could use, but there is a problem with the newer version, so I'm not upset with him.
ID: 1633227 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1633239 - Posted: 26 Jan 2015, 22:25:56 UTC - in response to Message 1633185.  

... if there is some kind of governing process that moves a WU along the pipeline, the logic fault could be in there, somewhere before assimilation actually gets called.

Yes, there is, and it's called the transitioner. But I don't know what triggers it to pay attention to any particular task or WU, or what circumstances (exact coincidence of timing?) might cause the trigger to fail.

It seems that there is logic missing: nothing checks whether there are any non-expired tasks still out for that particular WU, so it goes ahead and validates as soon as two results are reported. Once a WU gets marked as validated and a canonical result is chosen, the process seems to break at that point and doesn't know what to do or how to continue.


I know a long time ago, this used to not be a problem, so it worked like.. 6 years ago, but I believe it showed up some time during the reign of _v505.

I know I keep implying that it is "just a simple logic fault," but I know there's probably nothing simple about finding it, and then fixing it.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1633239 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1633240 - Posted: 26 Jan 2015, 22:30:11 UTC - in response to Message 1633227.  

All I will say is there is an exploit, and there is a reason the servers gave my machine an ID, but seeing as I'm just a high school idiot who knows nothing, I'll keep what I have learnt to myself. Seeing as the program was written by a user at SETI, I can see others getting problems. The user involved has not done this maliciously; his intention was just to produce a program we could use, but there is a problem with the newer version, so I'm not upset with him.

It would seem you are asserting the server code is perfect & could in no way hiccup causing your machine to generate a new ID. Despite the evidence of several people informing you that it does randomly happen now & then.
I'm genuinely astonished at your faith in Dr Anderson & the BOINC devs in the matter.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1633240 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1633242 - Posted: 26 Jan 2015, 22:39:59 UTC - in response to Message 1633240.  

All I will say is there is an exploit, and there is a reason the servers gave my machine an ID, but seeing as I'm just a high school idiot who knows nothing, I'll keep what I have learnt to myself. Seeing as the program was written by a user at SETI, I can see others getting problems. The user involved has not done this maliciously; his intention was just to produce a program we could use, but there is a problem with the newer version, so I'm not upset with him.

It would seem you are asserting the server code is perfect & could in no way hiccup causing your machine to generate a new ID. Despite the evidence of several people informing you that it does randomly happen now & then.
I'm genuinely astonished at your faith in Dr Anderson & the BOINC devs in the matter.

The code for generating new host IDs is explicit and available in the BOINC source code - this is an open source project, after all.

It's in sched/handle_request.cpp, and I see four cases at lines 230, 324, 335 and 349 (search term "new host record"). The mystery is why the cases cited -

If no host ID is supplied, or if RPC seqno mismatch, ...
If the request's host ID isn't consistent with the authenticator, ...
If the seqno from the host is less than what we expect, ...
Here no hostid was given, or the ID was bad...

- should occur in normal running, when the user is confident that none of the bad things happened.
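
Read as a decision procedure, those four comments amount to something like the Python sketch below. It paraphrases the quoted comments only and is not the code in sched/handle_request.cpp; `HostRecord`, its fields and `needs_new_host_record` are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostRecord:
    """What the server is assumed to remember about a host (hypothetical)."""
    host_id: int
    owner_user_id: int
    rpc_seqno: int  # last RPC sequence number seen from this host


def needs_new_host_record(
    req_host_id: int,
    req_user_id: int,
    req_seqno: int,
    stored: Optional[HostRecord],
) -> Optional[str]:
    """Return the reason a new host record would be created, or None.

    `stored` is the record looked up for req_host_id, or None if the
    lookup found nothing.
    """
    if not req_host_id:
        return "no host ID supplied"
    if stored is None:
        return "host ID was bad"
    if stored.owner_user_id != req_user_id:
        return "host ID not consistent with the authenticator"
    if req_seqno < stored.rpc_seqno:
        return "seqno less than expected"
    return None
```

On this reading, a client that came back with an older copy of its state (for instance after a failed or rolled-back write) would presumably present a lower sequence number than the server expects and land in the fourth case. That is an assumption about one possible trigger, not something confirmed in this thread.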
ID: 1633242 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1633296 - Posted: 27 Jan 2015, 0:18:07 UTC - in response to Message 1633242.  

All I will say is there is an exploit, and there is a reason the servers gave my machine an ID, but seeing as I'm just a high school idiot who knows nothing, I'll keep what I have learnt to myself. Seeing as the program was written by a user at SETI, I can see others getting problems. The user involved has not done this maliciously; his intention was just to produce a program we could use, but there is a problem with the newer version, so I'm not upset with him.

It would seem you are asserting the server code is perfect & could in no way hiccup causing your machine to generate a new ID. Despite the evidence of several people informing you that it does randomly happen now & then.
I'm genuinely astonished at your faith in Dr Anderson & the BOINC devs in the matter.

The code for generating new host IDs is explicit and available in the BOINC source code - this is an open source project, after all.

It's in sched/handle_request.cpp, and I see four cases at lines 230, 324, 335 and 349 (search term "new host record"). The mystery is why the cases cited -

If no host ID is supplied, or if RPC seqno mismatch, ...
If the request's host ID isn't consistent with the authenticator, ...
If the seqno from the host is less than what we expect, ...
Here no hostid was given, or the ID was bad...

- should occur in normal running, when the user is confident that none of the bad things happened.

Some kind of error in the packet or an issue on the server, possibly a memory leak or hardware related, with the received host information would be my first thoughts of how it might happen.
I think it may be related to the same reason I will see messages on my MB-only machines every so often saying "Your app_info.xml file doesn't have a usable version of AstroPulse v7", even though they are in a venue that only asks for MB work. There are some 1's & 0's out of place somewhere when that happens.

The decision was made to give a host a new ID if something goes wrong on a request. That is probably better than the server keeping track and waiting for the next request to decide whether it should do something later, or ignoring the request.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1633296 · Report as offensive
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1633466 - Posted: 27 Jan 2015, 7:50:26 UTC

From the looks of SSP, the AP database is belly up again
Dave

ID: 1633466 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.