Message boards :
Number crunching :
Manually uploading client_state.xml to s@h
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
I did post it just now to him. I was away during the evening here and it kept retrying until just recently. Files that size simply can't be uploaded properly to s@h, and if Mr. Ageless here hadn't been so kind as to report it to Dave, I would have had to push the detach button and waste approximately half a million in credit. Hats off to all, and especially Ageless. Thank you!
_________________________________________________________________________
Addicted to SETI crunching! Founder of GPU Users Group
kittyman | Joined: 9 Jul 00 | Posts: 51468 | Credit: 1,018,363,574 | RAC: 1,004
> I did post it just now to him..
Yikes... hope DA can sort this one out. I had a lot of trouble getting my backed-up results reported this time around too; I was almost in the same boat as you. Don't hit that detach button just yet. Meowa meowa zoom.
"Freedom is just Chaos, with better lighting." Alan Dean Foster
Jord | Joined: 9 Jun 99 | Posts: 15184 | Credit: 4,362,181 | RAC: 3
> Hats off to all and especially Ageless..
You're welcome. Let's hope it's fixed for you before you have all those 27,000+ tasks ready to report. ;-)
Jord | Joined: 9 Jun 99 | Posts: 15184 | Credit: 4,362,181 | RAC: 3
[trac]changeset:22389[/trac] says: "scheduler: fix crashing bug when client reports a large # (1000+) of results (256KB not enough for query in this case)". Now you still need available bandwidth.
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
Well, it still hasn't been able to send it through to the servers. I sent the file to Dave; I used 7z to zip it, and it went from a staggering 31 MB down to 349 KB! How come BOINC doesn't zip the files in the core client before sending them to the servers? That would save an enormous number of resends on their congested connection to the internet! For now the file is up to 8,906 WUs to report, and since last night I have set NNT, just to be safe and not congest my side of the bandwidth. :) Hope this is sorted out in the long run. Kind regards, Vyper
Jord | Joined: 9 Jun 99 | Posts: 15184 | Credit: 4,362,181 | RAC: 3
> How come BOINC doesn't zip the files before sending them to the servers in the core?
It could, but it would add enormous overhead for the server, which would need to decompress all those files before being able to read them, plus probably recompress the reply back to you (sched_reply*.xml). It would also require a rewrite of the client, which is currently capable of (de)compressing project files only (as zip and tar.gz). This could easily be done, were it not that extra overhead on an already pounced-upon server isn't something you want to add.
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
> How come BOINC doesn't zip the files before sending them to the servers in the core?
That's true, but while servers keep gaining huge parallelism, the interconnection doesn't! So there's a battle over where the bottleneck is going to sit:
* No compression = the interconnection, with all 300,000 clients trying to resend all their data.
* Compression = the CPUs in the servers are somewhat busy decompressing data, with an added CRC to be sure the information wasn't corrupted on the way. Of course there would be resends there too, but not to this extent.
So the question is only: where do we want the bottleneck to be? Dropped packets and resends, or slower data processing within the upload/download server? Kind regards, Vyper
P.S. 9,064 WUs for the time being, and sched_request is 34 MB in size atm. D.S.
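To get a feel for the ratios being discussed, here is a minimal sketch, using toy data and made-up result names, of how well a repetitive scheduler-request-style XML compresses with Python's zlib. The numbers are illustrative only, not measurements of a real sched_request:

```python
import zlib

# Build a toy scheduler request: thousands of near-identical <result>
# blocks, mimicking the highly repetitive XML the client sends.
# The result-name pattern below is invented for illustration.
result_template = (
    "<result>\n"
    "    <name>03mr10ab.12345.6789.10.10.{n}_0</name>\n"
    "    <final_cpu_time>1234.56</final_cpu_time>\n"
    "    <exit_status>0</exit_status>\n"
    "    <state>5</state>\n"
    "</result>\n"
)
request = "<scheduler_request>\n" + "".join(
    result_template.format(n=i) for i in range(9000)
) + "</scheduler_request>\n"

raw = request.encode()
packed = zlib.compress(raw, 9)  # maximum compression level
print(f"raw: {len(raw) / 1024:.0f} KiB, compressed: {len(packed) / 1024:.0f} KiB, "
      f"ratio: {100 * len(packed) / len(raw):.1f}%")
```

On data this repetitive, the compressed size comes out at a few percent of the original, in the same ballpark as the 31 MB to 349 KB that 7z achieved on the real file.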
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
I'm starting to wonder. Is there any way I can chop it up into pieces myself: report 999 at a time, connect so the BOINC client clears those 999, then shut it down again, edit it again, do the next 999, and so forth? Is that even possible without clearing out all completed WUs? Regards, Vyper
jason_gee | Joined: 24 Nov 06 | Posts: 7489 | Credit: 91,093,184 | RAC: 0
> I'm starting to wonder.
You could test on a lower-throughput host with just a few results to report, to see what happens when you tamper with the scheduler request before it goes up. That way the consequences would be less severe if it all went horribly wrong (which is likely ;) ). How you prevent the scheduler request from going up before the test edits can be made would probably be a manual exercise, and a bit fiddly on the test machine. AFAICT, one thing that seems to take up a large proportion of the scheduler request is the in-progress results. Since we know from the ghosts situation that the server doesn't process those, removing them from the scheduler request *may not* break it. My theory is that if you remove all but a small number of completed results from the scheduler request, while maintaining a properly formed request (hard to do because it doesn't seem to have DOS-style line endings!), only those in the request will be acknowledged as reported, and the next (normally) generated request should be that much shorter. I'm glad I don't have to mess around myself to see if such a process might actually work... If it doesn't, and the server-side fix looks like being a long time coming, I'll examine the option of fiddling with the report sequence in the BOINC client (though I'd rather not!). Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions
Gundolf Jahn | Joined: 19 Sep 00 | Posts: 3184 | Credit: 446,358 | RAC: 0
> ...maintaining a properly formed request (hard to do because it doesn't seem to have DOS-style line endings!)...
Aren't there tools like unix2dos and dos2unix available for Windows? Gruß, Gundolf
jason_gee | Joined: 24 Nov 06 | Posts: 7489 | Credit: 91,093,184 | RAC: 0
> Aren't there tools like unix2dos and dos2unix available for Windows?
Sure. Since I don't generally do that kind of thing myself, I can't recommend a specific tool for the job, though I have certainly used editors that will read both kinds and preserve the style used in the file.
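For what it's worth, the conversion itself is trivial to sketch, say in Python (a rough illustration of what unix2dos/dos2unix do, not a recommendation of any particular tool; the sample bytes are invented):

```python
def convert_line_endings(data: bytes, to_dos: bool) -> bytes:
    """Normalise line endings without otherwise touching the content.

    First collapse every ending (\r\n, bare \r, bare \n) to \n, then
    optionally re-expand to DOS-style \r\n.
    """
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n") if to_dos else unified

# Toy round trip on a fragment with Unix endings:
sample = b"<scheduler_request>\n<result/>\n</scheduler_request>\n"
dos = convert_line_endings(sample, to_dos=True)
back = convert_line_endings(dos, to_dos=False)
```

The important property for hand-editing a request is exactly what Jason notes: whatever style the file uses must survive the edit, which is why the round trip above ends where it started.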
kittyman | Joined: 9 Jul 00 | Posts: 51468 | Credit: 1,018,363,574 | RAC: 1,004
I have to wonder... Is this something that internet protocols just cannot handle? I mean, they can handle dropped connections and such... Or is it just something that the SETI servers are unable to cope with? Dropped comms are something that has existed since day 1 on the internet. S*** happens on the road. Why does it result in such bad things here?
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
Well, the internet can, but remember there are things called DDoS attacks, which can more or less render whole websites inoperable. That is, in a certain way, what happens when the servers come back online: all hosts try to get through at once. It's like fitting a river into a straw; some water needs to stay put while the rest pours through. But with compression there would be much less water needing to get through to the servers: shrink the lake to about 1.1% of its original size, and even with the straw reduced to about 50% of its original size, you can quickly see the lake is not that big any longer. Letting the river flow would be far less time- and bandwidth-consuming, letting more work out of the servers instead of a lot of bad transmissions and resends. This is only my view as a consultant and technician, because we have an urge to make something good even better. I don't know how it would best be programmed, but it would be better for the small straw. :) Kind regards, Vyper
Josef W. Segur | Joined: 30 Oct 99 | Posts: 4504 | Credit: 1,414,761 | RAC: 0
The core client builds the sched_request just before sending it; I can't think of any way to pause between those two actions to allow editing it. A proxy could potentially trap the request and allow editing before sending it on, but nobody has written that kind of proxy AFAIK.
What can be successfully edited is client_state.xml. It wouldn't be easy: there's a <file_info> and <workunit> for each WU, plus the files those refer to, and a <result> for each finished (done and uploaded) task. But it would be possible to make multiple copies of the BOINC data directory hierarchy, each of which has a subset of the information. As preparation you'd want to set NNT, suspend all unstarted tasks and let those running finish and upload, then disable network activity. After that, shut BOINC down and do the editing and file moving.
To use the subsets after all that editing and moving of files, you'd restart BOINC with one of the subsets in place, allow network activity, and do an Update to report that subset. Then disable network activity and shut down again. Check what <rpc_seqno> has been updated to and put that in the next subset when you move it in.
Consider that a sketch; I could easily have missed some details. There may be possible simplifications too; for instance, I'm not totally sure the WU files need to be kept, though BOINC doesn't delete them until their results have been reported.
Joe
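The batching part of that sketch can be illustrated in a few lines of Python. This is a rough, untested outline built on toy data: it deliberately ignores the matching <file_info>/<workunit> entries and the <rpc_seqno> bookkeeping that a real edit would need, and the task names are invented:

```python
import re

def chunk_results(client_state_xml: str, chunk_size: int = 999):
    """Split the <result>...</result> blocks of a client_state.xml-style
    document into batches of at most chunk_size entries.

    Only illustrates the batching idea; a real edit would also have to
    carry the corresponding <file_info>/<workunit> entries into each
    subset and keep <rpc_seqno> consistent between uses.
    """
    results = re.findall(r"<result>.*?</result>", client_state_xml, re.DOTALL)
    return [results[i:i + chunk_size] for i in range(0, len(results), chunk_size)]

# Toy document holding 2500 finished results:
doc = "".join(f"<result><name>task_{i}</name></result>" for i in range(2500))
batches = chunk_results(doc)
print(len(batches), [len(b) for b in batches])  # 3 [999, 999, 502]
```

Each batch would then become the <result> section of one subset copy of the data directory, reported with one Update, exactly as in the procedure above.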
Josef W. Segur | Joined: 30 Oct 99 | Posts: 4504 | Credit: 1,414,761 | RAC: 0
I presume that since the limited "resend lost tasks" feature has been seen, the Scheduler code has been rebuilt and probably includes the changeset 22389 fix to allow reporting large numbers of results. I have my fingers crossed for those of you who need it!
Joe
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
It hasn't been a problem reporting at least 3,000 WUs, from what I have experienced. That hasn't been an issue so far, it seems; is there perhaps a limit at 4,095? It's the only number that adds up in my mind, since 4,096 is hex 1000. Regards, Vyper
Josef W. Segur | Joined: 30 Oct 99 | Posts: 4504 | Credit: 1,414,761 | RAC: 0
> It hasn't been a problem reporting at least 3,000 WUs, from what I have experienced.
I don't think it's an exact specific number. The reports are written as a batch update to the database; the old method had a limit of 256KB for that query. The new method builds the query in a string, and I don't know if there's any specific limit on its length. In either case each result name is included, and they vary in length. The question is whether the one with over 7,000 can be successfully transferred and handled since the changes yesterday. I can't imagine Dr. Anderson not including that bug fix when rebuilding the Scheduler code.
Joe
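A back-of-the-envelope check on Joe's point, assuming, purely for illustration, around 60 bytes of query text per reported result (the real figure depends on each result's name length and the query syntax):

```python
# Rough arithmetic behind the old fixed-size scheduler query buffer.
# BYTES_PER_RESULT is an assumption for illustration, not a measured
# figure from the real scheduler code.
BUFFER = 256 * 1024        # the old 256 KB query limit
BYTES_PER_RESULT = 60      # assumed SQL text per reported result

fits = BUFFER // BYTES_PER_RESULT
print(f"roughly {fits} results fit before the query overflows")  # ~4369
```

Under that assumption a few thousand results exhaust the buffer, which is consistent with 3,000 reports succeeding while larger batches failed, and with there being no single exact cutoff.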
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
Hi, it's working now, and apart from that, my assumption about the 4,095 limit was fairly correct: more than that made the request grow as big as 1 GB, making the scheduler go out of memory. So in a way I was correct after all. :D Man, I'm so excited my hunch was fairly spot on. For now it's chugging along, reporting in 1,000-WU chunks. Thanks, everyone, for digging out this frustrating issue. And a big warm thank you to Dave, who addressed it so quickly and painlessly. Kind regards, Vyper
kittyman | Joined: 9 Jul 00 | Posts: 51468 | Credit: 1,018,363,574 | RAC: 1,004
So, BOINC is now splitting the file up somehow and reporting smaller batches, if I understand you correctly?
-= Vyper =- | Joined: 5 Sep 99 | Posts: 1652 | Credit: 1,065,191,981 | RAC: 2,537
Yup. It doesn't split the file as such; the server just grabs the first 1,000, ticks them as reported, and reports that back to the client. The next time your client reports back, the amount has been reduced by approximately 1,000 too. Regards, Vyper
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.