Posts by Leopoldo

1) Message boards : Number crunching : Panic Mode On (34) Server Problems (Message 1007782)
Posted 24 Jun 2010 by Leopoldo
Post:
Wonder how many of those 4.6 million results out in the field will hit the upload server when it comes back online?

That number is also slowly decreasing:

As of 24 Jun 2010 14:50:09 UTC
Results ready to send				158,484		21,179
Results out in the field			4,618,405	72,475
Results returned and awaiting validation	6,473,085	75,090
Workunits waiting for validation		0		0
Workunits waiting for assimilation		947,649		5,141
Workunit files waiting for deletion		46		0
Result files waiting for deletion		140		0
Workunits waiting for db purging		937,110		21
Results waiting for db purging			1,981,623	914

As of 24 Jun 2010 15:45:08 UTC
Results ready to send				161,415		22,761
Results out in the field			4,615,486	72,475
Results returned and awaiting validation	6,385,568	75,090
Workunits waiting for validation		0		0
Workunits waiting for assimilation		903,271		5,141
Workunit files waiting for deletion		11		0
Result files waiting for deletion		98		0
Workunits waiting for db purging		981,631		21
Results waiting for db purging			2,072,158	914
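
To make the change between the two snapshots easier to see, the differences can be computed with a short script. This is only a sketch: it uses the first column of selected rows exactly as posted above, and the dictionary layout and the `deltas` helper are illustrative names, not anything the server status page provides.

```python
# First-column counts copied from the two server status snapshots above.
SNAPSHOT_1450 = {
    "Results ready to send": 158_484,
    "Results out in the field": 4_618_405,
    "Results returned and awaiting validation": 6_473_085,
    "Workunits waiting for assimilation": 947_649,
    "Workunits waiting for db purging": 937_110,
    "Results waiting for db purging": 1_981_623,
}
SNAPSHOT_1545 = {
    "Results ready to send": 161_415,
    "Results out in the field": 4_615_486,
    "Results returned and awaiting validation": 6_385_568,
    "Workunits waiting for assimilation": 903_271,
    "Workunits waiting for db purging": 981_631,
    "Results waiting for db purging": 2_072_158,
}


def deltas(before, after):
    """Return after - before for every counter present in both snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


if __name__ == "__main__":
    # Print each counter with its signed change between the two snapshots.
    for name, change in deltas(SNAPSHOT_1450, SNAPSHOT_1545).items():
        print(f"{name:45s} {change:+,d}")
```

Over those 55 minutes, "Results out in the field" dropped by 2,919.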

2) Message boards : Number crunching : Rescheduler weirdness and strange errors (Message 1005157)
Posted 17 Jun 2010 by Leopoldo
Post:
In addition I am now seeing weird stuff in the error logs that makes no sense??

17/06/2010 08:30:00 SETI@home Beta Test You used the wrong URL for this project
17/06/2010 08:30:00 SETI@home Beta Test The correct URL is http://setiweb.ssl.berkeley.edu/beta/


This isn't a Rescheduler problem. The error is in the description file inside SETI@home Beta (it has the same name as in the main project). Detach from the beta project and continue crunching for the main SETI@home; you can attach to beta again later.
3) Message boards : Number crunching : Running SETI@home on an nVidia Fermi GPU (Message 1004577)
Posted 16 Jun 2010 by Leopoldo
Post:
MadMaC,

<app_name>setiathome_enhanced</app_name> Im guessing that this is where the error is
No, this line of the <workunit> section is correct. It's a multibeam workunit.

<version_num>610</version_num>
<plan_class>cuda</plan_class>
Here, for the Fermi-oriented app 6.10, it must be

<plan_class>cuda_fermi</plan_class>

Rescheduler 1.9 doesn't know about the new version 6.10 or this new plan_class; neither existed when it was compiled...
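
The mismatch described above can be spotted mechanically. The sketch below counts places in a piece of XML text where version 610 is paired with the plain `cuda` plan class. It is illustration only: the helper name and the idea of scanning a copy of the XML as plain text are assumptions, and it only reports, it changes nothing.

```python
import re

# Matches <version_num>610</version_num> immediately followed by
# <plan_class>cuda</plan_class>, the pairing described as wrong above.
MISMATCH = re.compile(
    r"<version_num>610</version_num>\s*<plan_class>cuda</plan_class>"
)


def count_mismatched_plan_class(xml_text):
    """Count places where version 610 is paired with the plain 'cuda' plan class."""
    return len(MISMATCH.findall(xml_text))


# Made-up sample: the first pair is the problematic one, the second is fine.
SAMPLE = """
<version_num>610</version_num>
<plan_class>cuda</plan_class>
<version_num>610</version_num>
<plan_class>cuda_fermi</plan_class>
"""

if __name__ == "__main__":
    print(count_mismatched_plan_class(SAMPLE))
```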
4) Message boards : Number crunching : seti, collatz and cuda (Message 997300)
Posted 20 May 2010 by Leopoldo
Post:
In addition to Claggy's message: Dennis, have you seen your scheduling values for these projects?
The collatz scheduling priority is roughly double that of seti, can that be changed?

Dennis, you can read the article http://www.boinc-wiki.info/Work_Scheduler to better understand how BOINC chooses which projects to crunch at a given moment.
* Work Scheduler does "round-robin" scheduling among Results, attempting to honor Resource Shares. *
If I wanted to crunch 2 projects simultaneously, I would set an equal Resource Share for both of them.

Also, in your particular case, where Collatz was attached later than SETI, the "Long Term Debt" (LTD) for SETI is much lower than the LTD for Collatz.

Back to your question. To equalize BOINC's expectations for projects with equal resource shares, there is a way to change project debts with the "boinccmd" tool. But it requires advanced skills and caution; manually manipulating BOINC with service tools is not recommended for the majority of users.

I did it without problems. Back up the BOINC data directory first (just in case). The BOINC directory contains a service executable called "boinccmd.exe". To equalize the debts for the 2 projects, SETI and Collatz, you can run this executable with special parameters. The command line is:
boinccmd --set_debts setiathome.berkeley.edu 0 0 boinc.thesonntags.com/collatz 0 0


After this you can check the properties of these projects to see how the numbers changed.


Edit: but first, please look at the coproc_debug output.
5) Message boards : Number crunching : HELP!!! My son has reforrmatted my Ext-HD (Message 997162)
Posted 19 May 2010 by Leopoldo
Post:
ok. will look at these again. I have allowed the BF to try a scan with a different program, I can't remember what it is, I will check this too.

If none of the above helps, take a look at http://en.wikipedia.org/wiki/TestDisk. There you can choose the filesystem to search for. It can work with external media such as USB flash drives...
6) Message boards : Number crunching : seti, collatz and cuda (Message 997132)
Posted 19 May 2010 by Leopoldo
Post:
In addition to Claggy's message: Dennis, have you seen your scheduling values for these projects?
(use the project's "Properties" button in BOINC's "Advanced mode" and look for the "... scheduling priority" lines)

The project with the larger value should be crunched by BOINC sooner than its counterpart...
(Except when the projected crunching time is longer than the task's deadline)
___________
WBW, Leonid
7) Message boards : Number crunching : Panic Mode On (32) Server problems (Message 996116)
Posted 14 May 2010 by Leopoldo
Post:
It looks like the upload webserver and the corresponding tasks (including result verification and total credit updating) have stopped.


The upload computer is alive:
PING setiboincdata.ssl.berkeley.edu (208.68.240.16) 56(84) bytes of data.
64 bytes from setiboincdata.ssl.berkeley.edu (208.68.240.16): icmp_seq=1 ttl=53 time=343 ms

But the upload webserver isn't responding:
telnet: connect to address 208.68.240.16: Connection refused
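
The same "host answers ping, but the service refuses connections" check can be scripted. A minimal sketch, assuming a plain TCP connect is an acceptable stand-in for the telnet test; the function name is made up for illustration, and the post does not show which port telnet used.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False
```

For the situation above one would call something like `tcp_port_open("208.68.240.16", 80)`, where port 80 is an assumption; a result of False matches the "Connection refused" line.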
8) Message boards : Number crunching : libz.so.1?? (Message 994255)
Posted 5 May 2010 by Leopoldo
Post:
but I'm trying to install a new version of boinc (boinc_6.10.44_i686-pc-linux-gnu.sh) on a Fedora Core 12 machine.


Just a suggestion: if I'm correct, your FC12 machine has 2.6.31.5-127.fc12.x86_64 installed (i.e. 64-bit Linux). Perhaps the 64-bit BOINC (boinc_6.10.**_x86_64-pc-linux-gnu.sh) would be better suited to such an OS?
9) Message boards : Number crunching : Something wrong somwhere? (Message 994254)
Posted 5 May 2010 by Leopoldo
Post:
The 210 tasks dahls has are "Ready to report" so they have already been uploaded successfully. Communications with the upload handler are done, it's communicating with the Scheduler that's failing.

Thx, Josef. I missed this.

@dahls: Jørn, can you visit the scheduler URL http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi from the machine in question?

A normal answer looks like:

<scheduler_reply>
<scheduler_version>611</scheduler_version>
<master_url>http://setiathome.berkeley.edu/</master_url>
<request_delay>11.000000</request_delay>
<message priority="low">Error in request message: no start tag </message>
<project_name>SETI@home</project_name>
</scheduler_reply>

In my opinion, something seems to be wrong on the server side.

To determine this, look at the server's answer after the upload attempt (the file "sched_reply_setiathome.berkeley.edu.xml").
A normal acknowledgement for a completed task looks like:

...
<result_ack>
    <name>11dc06ag.26523.16023.4.10.112_0</name>
</result_ack>
...

Sorry, I have never seen a rejecting server answer, so I can't tell you what one looks like.
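
Looking for acknowledgements in the reply file can be scripted along these lines. This is a sketch under assumptions: the fragment is taken to be well-formed XML like the sample above (the real reply file may not parse this cleanly), and the function name is invented for illustration.

```python
import xml.etree.ElementTree as ET


def acknowledged_results(reply_fragment):
    """Return the result names listed in <result_ack> elements of the fragment."""
    # Wrap the fragment in a dummy root so several sibling elements parse cleanly.
    root = ET.fromstring(f"<root>{reply_fragment}</root>")
    return [
        ack.findtext("name", default="").strip()
        for ack in root.iter("result_ack")
    ]


# The acknowledgement sample quoted above.
SAMPLE = """
<result_ack>
    <name>11dc06ag.26523.16023.4.10.112_0</name>
</result_ack>
"""

if __name__ == "__main__":
    print(acknowledged_results(SAMPLE))
```

An empty list for a reply that should have acknowledged reported tasks would point at the server side.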

Should I delete everything on the triple core machine and then reinstall BOINC again?

IMHO, with upload troubles this big, which make crunching on this machine useless, that action seems reasonable from my point of view...
10) Message boards : Number crunching : Something wrong somwhere? (Message 994188)
Posted 5 May 2010 by Leopoldo
Post:
The triple core machine has probably not been able to get new work sets for at least a week. And it has 210 work sets to report, but it's not able to report them.

First you should report the completed tasks. You can check the availability of the upload server (208.68.240.16) by visiting the URL http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler with your favorite browser from this triple core machine.

A normal answer looks like:

<data_server_reply>
    <status>1</status>
    <message>no command</message>
</data_server_reply>

If it does, check your uploads (completed tasks are reported through the file "sched_request_setiathome.berkeley.edu.xml", so maybe this file can't be uploaded because of its size).

If it doesn't, replace the symbolic name in the URL with the numeric IP address and visit it again.
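
The browser check described above can also be scripted. A rough sketch, assuming the "no command" reply shown above is what a healthy handler returns; the function names are made up for illustration, and `fetch` simply raises if the server cannot be reached.

```python
import re
import urllib.request

# URL quoted in the post above.
UPLOAD_HANDLER_URL = "http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler"


def looks_like_normal_reply(body):
    """True if the body resembles the 'no command' reply shown above."""
    return (
        "<data_server_reply>" in body
        and re.search(r"<status>\s*1\s*</status>", body) is not None
        and "no command" in body
    )


def fetch(url, timeout=10.0):
    """Fetch a URL and return its body as text (raises on network errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be `looks_like_normal_reply(fetch(UPLOAD_HANDLER_URL))`, run from the machine that has trouble uploading.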
11) Message boards : Number crunching : Something wrong somwhere? (Message 992765)
Posted 29 Apr 2010 by Leopoldo
Post:
An explanation of why this is happened would be nice :)


To see one source of this problem, please visit the address
http://208.68.240.18/sah/download_fanout/testfile
You should see "this is a test", which means the server is working.

And the address
http://208.68.240.13/sah/download_fanout/testfile
If you see "The service is not available. Please try again later.", the 2nd SETI server isn't working, and only the trick with the "hosts" file will help your computers download correctly (don't forget to comment out those lines once the server problem is resolved).

_____________
WBW, Leopoldo
12) Message boards : Number crunching : CUDA MB V12b rebuild supposed to work with Fermi GPUs (Message 992656)
Posted 29 Apr 2010 by Leopoldo
Post:
This is curious. If I click on my own link (which worked yesterday), I get "The service is unavailable".

But if I click on the copy of the exact same link which I posted in the Beta forum a few minutes later, the app downloads OK.


Just as a reminder: in the present situation with the .13 server, we can post modified URLs. For example, instead of
http://boinc2.ssl.berkeley.edu/beta/download/setiathome_6.09_windows_intelx86__cuda_fermi.exe

a working 'hand-made' link is
http://208.68.240.18/beta/download/setiathome_6.09_windows_intelx86__cuda_fermi.exe

and it will work without 'The service is not available' ;)
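
Building such a 'hand-made' link is a simple host substitution, sketched below. The helper name `with_host` is made up for illustration; it keeps the scheme, path and query, and drops any user:password part of the original URL.

```python
from urllib.parse import urlsplit, urlunsplit


def with_host(url, new_host):
    """Return url with its host replaced by new_host, keeping everything else."""
    parts = urlsplit(url)
    # Keep an explicit port if the original URL had one.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Calling it with the boinc2 URL above and "208.68.240.18" gives the second link.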
13) Questions and Answers : Unix/Linux : Stalled tasks. (Message 981011)
Posted 19 Mar 2010 by Leopoldo
Post:
IMHO, a hang like this looks like the result of using a binary built from initially working source that relies on certain compiler defaults, but compiled with a different compiler version (with different defaults).

For example, the initial variant of the 64-bit openSUSE kernel in 11.3 milestone 3 works if compiled with GCC 4.4.3 and doesn't with 4.4.4: these GCC versions used different default alignment of "struct"s, 8 and 32, respectively.

Such tasks (looping endlessly after writing baseline smoothing to stderr.txt) are common for the stock 6.03 application of the SETI@home Beta project on 64-bit versions of Ubuntu 9.10 and openSUSE 11.2.

Tangerineboy, with such a binary I see no option other than to abort them.
14) Message boards : Number crunching : Panic Mode On (30) Server problems (Message 979149)
Posted 15 Mar 2010 by Leopoldo
Post:
Have you ever seen how a broken LAN card behaves? The on-board LEDs indicate real activity, you can connect to POP and SMTP through telnet and receive answers, and you can send and receive small pieces of data (your telnet commands and the POP or SMTP answers; LAN broadcasts; listings of shared folders)... BUT you can never say whether data larger than around 200KB will be transmitted or not: you just receive no answer after sending a command... Only a telnet session to an external POP3 server and the "list" command helped me find the broken card. Replacing the card brought the server back into working condition.

Edit: furthermore, I have a half-working 16-port switch right now. It has 8 fully working ports and 8 broken ones (ping, SMB messages, telnet and short file transfers work there, but long file transfers don't). After disassembling it and looking at the circuitry I saw 2 control chips and 2 corresponding buffers, so I covered the broken ports with Scotch tape and reconnected the 8 remaining UTP cables to another switch. My conclusion: one broken buffer. The other half of this switch is still working...

It's very hard to find such things...
15) Message boards : Number crunching : Problems... (Message 977882)
Posted 12 Mar 2010 by Leopoldo
Post:
(btw, is any reason to double mention of all files in heading section? I mean why each file_ref should have corresponding file_info? )

My interpretation is:

  • files specified in the <file_info> sections (right after the <app>) are declarations of all the files that may be needed by any version of the specified app (i.e. for a file existence check);
  • and files mentioned in the <file_ref> sections of one <app_version> are the files that must be copied to the corresponding slot, along with the workunit file and the chosen version of the app, for subsequent processing.
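
Under that interpretation, every <file_ref> should name a file that some <file_info> declares. The sketch below checks this for a made-up fragment. The sample content, the file names and the helper name are all hypothetical, and the child tags `<name>` and `<file_name>` are assumed to follow the usual app_info.xml layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical app_info.xml-style content, made up for illustration.
SAMPLE = """
<app_info>
    <app>
        <name>example_app</name>
    </app>
    <file_info>
        <name>example_app_1.00.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>example_app</app_name>
        <version_num>100</version_num>
        <file_ref>
            <file_name>example_app_1.00.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>helper.dll</file_name>
        </file_ref>
    </app_version>
</app_info>
"""


def undeclared_file_refs(app_info_xml):
    """Return file names used in <file_ref> that have no matching <file_info>."""
    root = ET.fromstring(app_info_xml)
    declared = {
        fi.findtext("name", default="").strip() for fi in root.iter("file_info")
    }
    referenced = [
        fr.findtext("file_name", default="").strip() for fr in root.iter("file_ref")
    ]
    return [name for name in referenced if name not in declared]


if __name__ == "__main__":
    print(undeclared_file_refs(SAMPLE))
```

In the sample, helper.dll is referenced by the app version but never declared, so it is the one name reported.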

16) Message boards : Number crunching : AndyK pending tool........... (Message 946930)
Posted 13 Nov 2009 by Leopoldo
Post:
I have updated to the same version that Leopaldo is using
...

Hello!

A new version has been uploaded to my mirror. It is an attempt to add some stability to the script when it runs on fast webhosting: delays were added to the line-by-line reading. I tested it with the Geek@Play, John Galt 007 and msattler accounts and found no lockups at my mirror.

Arkayn, please try it too. Just one thing: the delays for your fast webhosting must be longer than mine. The time-related settings are concentrated around line 655.
17) Message boards : Number crunching : BOINC 6.10.17 released for all users (Message 943897)
Posted 30 Oct 2009 by Leopoldo
Post:
But not if you intend to run AQUA's multi-thread CPU applications. Unfinished business.

Agreed. AQUA MT apps seriously slow down SETI CUDA crunching.
18) Message boards : Number crunching : AndyK pending tool........... (Message 943578)
Posted 29 Oct 2009 by Leopoldo
Post:
Hello!
Querry by HostID failed...........
Querry by UserID failed...........

Geek@Play, thank you! I have finally seen the script lock up too.

But the problem is much bigger than I thought. Looking deeper into the AndyK script's functions, I found no handling of a read timeout on the stream, only an increase of the time limit for the whole parsing script...

I don't understand why AndyK didn't program a reaction to a read timeout. So, before making any further recommendations, I will experiment at my mirror first.
19) Message boards : Number crunching : AndyK pending tool........... (Message 943391)
Posted 28 Oct 2009 by Leopoldo
Post:
When the querry is by User ID the waiting period is 0.50 and fails.

I just tried arkayn's mirror with Geek@Play's account ID and it works.
Hmm, I will ask arkayn to increase the timeout even more...
20) Message boards : Number crunching : AndyK pending tool........... (Message 943307)
Posted 28 Oct 2009 by Leopoldo
Post:
Hi, fellow crunchers!

As another variant, I increased the delay to 0.1 sec and the request timeout to 3x its value at my mirror.

John Galt 007 and Geek@Play: I entered your account numbers into the new version of the AndyK tool at my mirror and saw no lockups. Please check again yourselves if you wish. I hope that together we can find a working combination...

___________
WBW, Leonid




 