Intel security flaw

Message boards : Number crunching : Intel security flaw
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1913725 - Posted: 18 Jan 2018, 14:12:54 UTC - in response to Message 1913722.  

And to add to the woes:

Intel fix causes reboots and slowdowns

The company said it had reproduced the problem and was "making progress toward identifying the root cause".
Reading further down, Intel now acknowledges:

The most significant reduction in performance involved computer servers that store and retrieve large volumes of data. For those, the slowdown could be as severe as 25%.
That's more honest - theory and reality begin to match at least.

Looks like Moore's Law took a little bit of a hit there, eh?
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1913725
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1913732 - Posted: 18 Jan 2018, 14:47:09 UTC

Ugh. So, my next question is: I presume they are incorporating the 'fix' into the silicon, at least for the upcoming generations of procs that are going to be released. But my understanding of the problem is that it was introduced when they implemented pre-fetching years and years ago, which of course boosted performance if the data was sitting in the cache instead of having to be read from memory.

So, is the fix to just disable it from now on, and is this performance 'hit' going to be the new normal? Sort of a back-to-the-future thing? Or is there a way to do the pre-fetching securely, preserving the performance gains while shielding it from the security exposures? I haven't read anything about that yet, though I would think it would be a huge concern, especially for their server-side business.

ID: 1913732
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1913738 - Posted: 18 Jan 2018, 15:44:49 UTC - in response to Message 1913732.  

I've just been out for a walk to fetch my evening read (newspaper printed on dead trees - no power supply needed), and I was musing on much the same question as I walked.

The worst hit of all is going to be rebooting Centurion from a cold start. So, by the time I got home, I'd got as far as:

Could we keep a tiny, tiny bit of independent disk storage (100 MB of SSD, say - even a flash card) to hold configuration and status information - the most critical being the list of active tape files/channels being split? (I gather each channel is interleaved along the entire length of the disk file, so the whole 50 GB has to be read (again!) for each new channel.)

Then, the cold boot loader would start a user session with the sole purpose of reading that config, and then pipelining the contents of the required disk files into memory. Could such a boot load session sleep so deep that it would effectively eliminate the kernel mode switching?

Then, once the data was loaded, the boot loader could trigger the full BOINC server working environment, start the daemons, and set to work - suffering the kernel hit, but to a much lesser extent than it would during pre-load. Can it be done? Ideas?
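A minimal sketch of that pre-load pass in Python - the JSON state-file layout and the tape paths are hypothetical, not taken from the real splitter code:

```python
import json
from pathlib import Path

def load_split_state(state_file):
    """Read the saved splitter state (active tapes and channels) from the
    small, independent SSD. The JSON layout here is hypothetical."""
    return json.loads(Path(state_file).read_text())

def prefetch(tape_path, chunk=64 * 1024 * 1024):
    """Stream one tape file sequentially so its contents land in the OS
    page cache before the daemons start. Returns bytes read."""
    total = 0
    with open(tape_path, "rb") as f:
        while block := f.read(chunk):
            total += len(block)
    return total
```

The boot-load session would call `load_split_state` once, then `prefetch` each listed tape before the daemons start; the sequential reads keep kernel transitions to a minimum compared with the splitters' seek-heavy access pattern.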
ID: 1913738
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1913744 - Posted: 18 Jan 2018, 16:32:53 UTC

ARGGGGGHHHHH - how inefficient is that :-(
Spinning the whole tape each time a new channel is to be split is just plain *p
The reason for the data being interleaved along the whole "tape" is to keep its synchronicity, but we aren't interested in that. So, using a bit of Richard's logic from his earlier post to keep things together, and adding a technique from a few years back when I was dragging data off multi-channel data loggers that recorded in much the same way:
Assuming we know how many "channels" there are on the "tape", set up the required channel files when a new tape is loaded.
Now block-read the tape - big blocks, n channels wide by x segments long - into memory, parse the blocks, and dump the split data into the per-channel files; repeat as necessary. "Wide-long" block reads are hairy to set up, but they are low on i/o count, so they avoid that overhead, and they can be very efficient if scaled properly (that's the "hairy to set up" bit).
Now the splitters only have to work on the prepared files, not the whole tape, so they have far less i/o to do.
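The block-read-and-split scheme, sketched in Python under the simplifying assumption that fixed-size records are interleaved round-robin across the channels (the real recorder format is more involved):

```python
def split_channels(tape_path, out_paths, record_size, block_records=4096):
    """One sequential pass over the 'tape': read big blocks, then
    de-interleave fixed-size records round-robin into per-channel files.
    Assumes records for n channels repeat in the order 0, 1, ..., n-1."""
    n = len(out_paths)
    outs = [open(p, "wb") for p in out_paths]
    try:
        with open(tape_path, "rb") as tape:
            rec_index = 0
            # Big block reads keep the i/o count low; the block size is
            # a whole number of records so no record straddles a read.
            while block := tape.read(record_size * block_records):
                for off in range(0, len(block), record_size):
                    outs[rec_index % n].write(block[off:off + record_size])
                    rec_index += 1
    finally:
        for f in outs:
            f.close()
```

After this pass each splitter works on its own contiguous channel file, so the 50 GB tape is read once rather than once per channel.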
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1913744
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1913749 - Posted: 18 Jan 2018, 16:42:51 UTC - in response to Message 1913720.  

I am waiting for the class action lawsuits to start. People claiming that they are no longer getting the performance levels they paid for. I am sure there are lawyers just champing at the bit.

Meow.

They've already started - three in California, in fact, in the same week as the announcement.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1913749
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1913751 - Posted: 18 Jan 2018, 16:48:55 UTC - in response to Message 1913744.  

That sounds good. We certainly do know the format and structure of the files on 'tape' (remember, really disk images) - I think we designed it, and even built the bespoke data recorder which interfaces with the telescope feed. That may even be what Matt went off to do.

Assuming that to be the case (and be a little cautious - I wasn't taking technical briefing notes), it would be worth suggesting that to Eric - now, rather than later. He's beginning to think about processing the Parkes data: we touched on that in our chat too, and he mentioned some of the differences. IIRC - and I'm less certain about this bit - the philosophy at Green Bank is to search "every channel for selected sources", but for Parkes will be "selected channels for every source" - makes sense to get an overview of an area of sky we haven't seen before. So he's got to make changes anyway - what a good time to test out a new idea as well.
ID: 1913751
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1913753 - Posted: 18 Jan 2018, 17:03:45 UTC

Maybe this would be a good addition to the server closet.

https://www.theregister.co.uk/2017/11/21/hpe_brings_amds_epyc_processor_to_mainstream_2p2u_server_box/


With each crime and every kindness we birth our future.
ID: 1913753
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1913755 - Posted: 18 Jan 2018, 17:15:18 UTC

I think that at least if the servers were AMD-based instead of Intel-based, they wouldn't be suffering the 25% I/O penalty, since they wouldn't need the Meltdown patch. I don't know if I have seen any performance degradation tests done on AMD hardware with a Spectre patch yet, so that parameter is unknown.
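On Linux kernels recent enough to carry the patches (4.15 onward), the kernel reports per-vulnerability mitigation status under sysfs, so a box can be checked directly. A small helper to read it (the directory parameter is there so it can be pointed at a test location):

```python
from pathlib import Path

def read_mitigations(sysdir="/sys/devices/system/cpu/vulnerabilities"):
    """Return {vulnerability: status} from the kernel's sysfs reporting
    interface (present since Linux 4.15). Empty dict if unavailable."""
    d = Path(sysdir)
    if not d.is_dir():
        return {}
    return {f.name: f.read_text().strip()
            for f in sorted(d.iterdir()) if f.is_file()}
```

On a patched Intel box this typically shows `meltdown: Mitigation: PTI`, while unpatched entries read `Vulnerable`.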
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1913755
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1913770 - Posted: 18 Jan 2018, 18:07:52 UTC

Any ruminations on what might be coming down the pike from the mfgs on how to mitigate this down the road, or whether it is possible without re-engineering how the basics of the CPU have functioned for well over a decade?

ID: 1913770
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1913774 - Posted: 18 Jan 2018, 18:26:31 UTC - in response to Message 1913753.  

Maybe this would be a good addition to the server closet.

https://www.theregister.co.uk/2017/11/21/hpe_brings_amds_epyc_processor_to_mainstream_2p2u_server_box/


Would be nice. No problems yet? Huh... December. I hope they were talking about last month.
ID: 1913774
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1913781 - Posted: 18 Jan 2018, 18:53:55 UTC - in response to Message 1913751.  

Thanks Richard - I'll drop Eric a note to open a dialogue.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1913781
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1913804 - Posted: 18 Jan 2018, 20:52:26 UTC - in response to Message 1913706.  

I've just sent this email round to a small discussion group.....
And it's just produced a very interesting response. Turns out the Meltdown / Spectre patches simply tipped the database over the edge of a problem which had been growing (unnoticed) anyway. Fortunately, WCG has access to hot and cold running database engineers, and after several iterations of

'We investigated multiple paths in order to determine the issue.'

'After doing research, the team concluded that...'

... their database had multiple damaged SQL indices. A quick index drop and recreate later,

After those rebuilds were done, the database server dropped to a load of between 2-5 and a CPU utilization between 150-300%. Much lower than the load of 20 and 2800% CPU utilization the server had been experiencing.
I've suggested that the report should be used to start a BOINC server administrators' Knowledge Base, and that the final diagnostic test they used should be scripted and made available to less well-endowed BOINC projects as well. We'll see. (Eric is a member of the discussion group, so he'll get the full report and my suggestion directly.)
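The drop-and-recreate remedy generalises to any engine. As an illustration only (the post doesn't name which database WCG runs; SQLite stands in here), a script can pull each secondary index's stored DDL and replay it:

```python
import sqlite3

def rebuild_indices(conn, table):
    """Drop and recreate every secondary index on `table`, using the
    CREATE INDEX statements stored in sqlite_master. Returns the names
    of the rebuilt indices. SQLite is a stand-in for whichever engine
    a project actually runs."""
    cur = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = ? AND sql IS NOT NULL",
        (table,))
    indices = cur.fetchall()
    for name, sql in indices:
        conn.execute(f"DROP INDEX {name}")  # discard the damaged index
        conn.execute(sql)                   # recreate from stored DDL
    conn.commit()
    return [name for name, _ in indices]
```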
ID: 1913804
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1913808 - Posted: 18 Jan 2018, 21:00:10 UTC - in response to Message 1913770.  

Any ruminations on what might be coming down the pike from the mfgs on how to mitigate this down the road, or whether it is possible without re-engineering how the basics of the CPU have functioned for well over a decade?

All chip manufacturers will have to engineer new silicon. That means end products won't be available until perhaps five years from now.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1913808
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1913810 - Posted: 18 Jan 2018, 21:04:44 UTC

The Administrators' Knowledge Base is a great idea, Richard. And it should be used as suggested to disseminate all the server database engineering discoveries and fixes.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1913810
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 1913853 - Posted: 19 Jan 2018, 0:46:36 UTC - in response to Message 1913732.  

Ugh. So, my next question is: I presume they are incorporating the 'fix' into the silicon, at least for the upcoming generations of procs that are going to be released. But my understanding of the problem is that it was introduced when they implemented pre-fetching years and years ago, which of course boosted performance if the data was sitting in the cache instead of having to be read from memory.


Not prefetching. Speculative execution (Spectre) and shared page table mapping (Meltdown). For a good read: https://arstechnica.com/gadgets/2018/01/meltdown-and-spectre-every-modern-processor-has-unfixable-security-flaws/

For an in-depth explanation: https://arstechnica.com/gadgets/2018/01/whats-behind-the-intel-design-flaw-forcing-numerous-patches/
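For the bounds-check-bypass variant those articles describe, the usual software fix is to clamp the array index with branch-free arithmetic, so that even a mis-speculated path can never produce an out-of-range access - the idea behind the Linux kernel's array_index_nospec(). Python has no speculative execution, so this only illustrates the arithmetic:

```python
def clamp_index_nospec(index: int, size: int, bits: int = 64) -> int:
    """Branchless index clamp in the spirit of the Linux kernel's
    array_index_nospec(): the mask is all-ones when index < size and
    zero otherwise, with no conditional branch the CPU could speculate
    past. Valid for 0 <= index, size < 2**(bits - 1)."""
    mask = (index - size) >> (bits - 1)  # -1 if index < size, else 0
    return index & mask                  # in-range index, or 0
```

In C this compiles to a subtract, shift, and AND, so the bounds enforcement holds even while the processor is speculating past the original `if`.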
ID: 1913853
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1913861 - Posted: 19 Jan 2018, 1:13:33 UTC

Thanks Ozz and Keith. 5 years, huh? Not good.. I'll check out those links for some reading with the nightcap this evening.

ID: 1913861
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24875
Credit: 3,081,182
RAC: 7
Ireland
Message 1913896 - Posted: 19 Jan 2018, 2:45:57 UTC - in response to Message 1913770.  

Any ruminations on what might be coming down the pike from the mfgs on how to mitigate this down the road, or whether it is possible without re-engineering how the basics of the CPU have functioned for well over a decade?
On June 8th this year, x86 architecture will be 40 years old. Maybe time to move beyond that.
ID: 1913896
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1913901 - Posted: 19 Jan 2018, 2:54:48 UTC

Now _that_ would be a tectonic paradigm shift, I'd have to think. Flush the x86? Maybe a new path while still doing upgrades on the old one, so as not to obsolete millions of pieces of hardware and software. But then everyone would have to write and support two versions of all new programs for who knows how long - it boggles the mind.

ID: 1913901
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24875
Credit: 3,081,182
RAC: 7
Ireland
Message 1913905 - Posted: 19 Jan 2018, 3:16:34 UTC - in response to Message 1913901.  

True, but they didn't let that stop innovation in the 60's/70's/80's.
ID: 1913905
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11358
Credit: 29,581,041
RAC: 66
United States
Message 1913907 - Posted: 19 Jan 2018, 3:22:09 UTC - in response to Message 1913905.  

True, but they didn't let that stop innovation in the 60's/70's/80's.

But there was much less to deal with.
ID: 1913907



©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.