AMD Phenom
Author | Message |
---|---|
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
They're both on identical boards, Gigabyte AM2+ with AMD 770 chipsets, but only one shows this error, and that's the one with the older ATX 2.01 standard supply. I've just gotten back from buying a new ATX 2.2 standard one with a full 24 pin main connector, so I'll see later today. |
ChrisD Send message Joined: 25 Sep 99 Posts: 158 Credit: 2,496,342 RAC: 0 ![]() |
Best of luck :) ChrisD |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
It's back and happy again so far. Though it lasted several days before the issue occurred so only time will tell. |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
Or then again maybe not happy. Just a few hours after installing the new PSU it locked up again on the third core, same message "BUG soft lock for 11 seconds". I've switched it to the vanilla app and it has been running again for the last few hours. If it happens again the next step is to swap CPUs between the two machines and see if the issue follows the CPU. I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. |
ChrisD Send message Joined: 25 Sep 99 Posts: 158 Credit: 2,496,342 RAC: 0 ![]() |
I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. I do not think You need to worry about this one. All 3 crunchers here run the SSE2 build and have no problems. ChrisD |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. Then that suggests maybe a dud CPU. |
ChrisD Send message Joined: 25 Sep 99 Posts: 158 Credit: 2,496,342 RAC: 0 ![]() |
I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. Just remembered, You are running Linux, I am running Windows. Maybe You should ask Crunch3r if there are significant differences that might influence processing. ChrisD |
archae86 Send message Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 ![]() |
Or then again maybe not happy. Just a few hours after installing the new PSU it locked up again on the third core, same message "BUG soft lock for 11 seconds". I've switched it to the vanilla app and it's now running again for the last few hours. If it happens again the next step is to swap CPUs between the two machines and see if the issue follows the CPU. I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. I'd held off on posting graphs, hoping for a bit broader distribution of angle range on the machine running the SSE2 enabled app. But as you are on vanilla now, the picture won't be building up. This graph shows individual result CPU times for Spear's two Phenom hosts: [graph] While the currently high number of prematurely terminating noisy Work Units confuses the picture slightly, the essentials are obvious. For the (large) group of results very near 0.39 AR, the enhanced app requires only about 0.75 times as much CPU time as stock. The advantage is smaller near 0.01 AR, where it requires about 0.88 times as much CPU time. For the VHAR results from 2 to 6 AR, there appears to be an intermediate advantage, with the enhanced app requiring about 0.84 times as much. Of these, the 0.39 region should be given the strongest weighting in guessing overall performance. So, if the problem is partly that the enhanced app is somewhat more speed demanding from the point of view of your particular 9600 sample, you would still be ahead in performance (and in power productivity) even if you had to slow down the clock rate by more than 10% while using the enhanced app. Here is the graph with the same data from these two Phenoms along with the Q6600 and X9650 references I mentioned earlier. [graph] Some comments on this graph: takashi_m's X9650 has far, far lower result-to-result variation in CPU time than does my Q6600 in the VHAR range where memory contention seems most likely to affect things. 
Initial results from the stock Phenom suggest higher variability than the X9650 in this region, suggesting that memory or cache contention greater than on the X9650 is a factor. A better comparison on that point will come with more points from the enhanced app on Phenom, should those resume. As more results flow in, I'll update the graphs at the same location, so this message will display updated data. I'll put new comments in a new post, but not extra copies of the same graphs. As the graphs are currently just 8 and 11 kilobytes, I hope that displaying them inline is warranted. I've not included Mark Sattler's demon machine X9650 host here, despite his kind invitation. His aggressive cooling provisions and extremely brave voltage choice are not, I think, representative of the prudent range of overclocking practice. By contrast, I think my 3.006 GHz Q6600 and takashi_m's 3.67 GHz X9650 are representative values easily reached or exceeded by prudent participants on those models. |
![]() ![]() Send message Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 ![]() |
I'm beginning to wonder if there's something about the SSE2 enabled app that the Phenom just doesn't like. I'd say use the 32 bit optimized app for Linux on that Phenom... times will probably look way better with that one ;) |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
Regarding the enhanced app machine, I don't know what to think at this point. I've been running with vanilla for the last 6 hours then swapped to the enhanced app in an attempt to provoke it, yet not a peep from it. I'll give it some memtest and other stress tests as well as check the cooling later tonight in an attempt to narrow down the issue. The vanilla app machine is unflappable so far at least. I'll see how that responds to the enhanced app after the other machine begins to behave. |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
I hope I've finally gotten an answer now. A run of memtest86+ showed errors but not with all tests. They occurred within the first stick of RAM, so I took it out and tried the second one alone. That too showed errors but at different addresses. Thinking two dud sticks was unlikely I examined the RAM slots and found a largish speck of dust had gotten jammed in when the sticks were installed. Removed dust and it now passes memtest86+. |
ChrisD Send message Joined: 25 Sep 99 Posts: 158 Credit: 2,496,342 RAC: 0 ![]() |
I hope I've finally gotten an answer now. A run of memtest86+ showed errors but not with all tests. They occurred within the first stick of RAM, so I took it out and tried the second one alone. That too showed errors but at different addresses. Thinking two dud sticks was unlikely I examined the RAM slots and found a largish speck of dust had gotten jammed in when the sticks were installed. Removed dust and it now passes memtest86+. This is good news :) Hope this was the culprit that caused You the headache. Best of luck. ChrisD |
![]() ![]() Send message Joined: 21 Aug 02 Posts: 224 Credit: 1,809,275 RAC: 0 ![]() |
Very interesting! ![]() |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
I hope I've finally gotten an answer now. A run of memtest86+ showed errors but not with all tests. They occurred within the first stick of RAM, so I took it out and tried the second one alone. That too showed errors but at different addresses. Thinking two dud sticks was unlikely I examined the RAM slots and found a largish speck of dust had gotten jammed in when the sticks were installed. Removed dust and it now passes memtest86+. Sigh, spoke too soon. After 24 hours of the enhanced app the same soft lockup errors recurred. cpuburn runs happily without issue. A CPU swap is now planned for later tonight. |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
More possibilities have arisen. The boards aren't quite identical despite being bought simultaneously from the same supplier and being of the same revision. The problem machine has an older revision of the ITE 8718 monitoring chip. Plus the problem machine also has a Tekram 315 SCSI card sitting in it. A Google search for tekram +"soft lockup" shows a lot of hits regarding Tekram devices. On a different note, even with cheap old cases, maybe one or two fans each and the stock cooler and thermal paste, they don't go above 50 degrees. |
archae86 Send message Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 ![]() |
As more results flow in, I'll update the graphs at the same location, so this message will display updated data. I'll put new comments in a new post, but not extra copies of the same graphs. As the graphs are currently just 8 and 11 kilobytes, I hope that displaying them inline is warranted. I've updated the images displayed in message 724650 again. The main change is that luck of work fetch has finally given Spear's Phenom running the enhanced ap a decent sampling of angle ranges other than 0.39. |
kittyman ![]() ![]() ![]() ![]() Send message Joined: 9 Jul 00 Posts: 51524 Credit: 1,018,363,574 RAC: 1,004 ![]() ![]() |
I hope I've finally gotten an answer now. A run of memtest86+ showed errors but not with all tests. They occurred within the first stick of RAM, so I took it out and tried the second one alone. That too showed errors but at different addresses. Thinking two dud sticks was unlikely I examined the RAM slots and found a largish speck of dust had gotten jammed in when the sticks were installed. Removed dust and it now passes memtest86+. Think ya got one of them bum Phenoms??? "Time is simply the mechanism that keeps everything from happening all at once." ![]() |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
I've swapped the CPUs and they've now run for nearly 48 hours without a hiccup. The vanilla app machine did go a bit funny for a while. Turns out compiling a Linux kernel that uses the PIIX4 modules for use by lm-sensors screws up BOINC's ability to correctly report times. There are about 12 results that have nonsensical times that are about a thousandth of what they should be. The results were processed correctly otherwise. I'm not sure why the issue hasn't shown up after the swap. They're the same CPUs even down to the week of production. Maybe it wasn't getting full contact for the pins or something. |
kittyman ![]() ![]() ![]() ![]() Send message Joined: 9 Jul 00 Posts: 51524 Credit: 1,018,363,574 RAC: 1,004 ![]() ![]() |
Well....there have been some reports that the problem is mobo/chipset related........ "Time is simply the mechanism that keeps everything from happening all at once." ![]() |
Spear Send message Joined: 15 Nov 01 Posts: 49 Credit: 6,365,604 RAC: 0 ![]() |
They're the same boards, though the troublesome machine seems to have a slightly older revision. |
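
As a rough check on the CPU-time ratios archae86 quoted earlier in the thread (about 0.75 near 0.39 AR, 0.88 near 0.01 AR, and 0.84 for VHAR), the sketch below works out how far the clock could be lowered before the enhanced app stops beating stock. It assumes CPU time scales inversely with clock rate, which is a simplification and not something measured in the thread; the function names are made up for illustration.

```python
# Enhanced/stock CPU-time ratios as quoted in archae86's post.
RATIOS = {
    "AR ~0.01": 0.88,
    "AR ~0.39": 0.75,
    "VHAR 2-6": 0.84,
}


def relative_time(ratio, slowdown):
    """CPU time of the enhanced app at a reduced clock, relative to stock
    at full clock. ASSUMPTION: time scales as 1 / clock rate."""
    return ratio / (1.0 - slowdown)


def break_even_slowdown(ratio):
    """Fractional clock reduction at which enhanced and stock take equal time."""
    return 1.0 - ratio


if __name__ == "__main__":
    for label, ratio in RATIOS.items():
        at_10 = relative_time(ratio, 0.10)
        limit = break_even_slowdown(ratio)
        print(f"{label}: {at_10:.3f}x stock time at a 10% slower clock, "
              f"break-even at {limit:.0%} slower")
```

Under that assumption, every range stays below 1.0x stock time with a 10% clock reduction, which matches the claim that the enhanced app would still come out ahead.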
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.