To checkpoint or not -- the wear and tear of SSD drives

Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1877965 - Posted: 12 Jul 2017, 14:40:16 UTC - in response to Message 1877895.  
Last modified: 12 Jul 2017, 14:58:43 UTC

MLC and TLC refer to the NAND cell voltage level. SLC is one bit per cell or single voltage representing a binary bit, MLC is 2 bits per cell or 2 voltage levels and TLC is 3 bits per cell and is three distinct voltage levels. The scale is SLC, MLC and then TLC with regard to data integrity robustness and the number of P/E (program/erase) cycles a cell can endure before being taken out of service by the flash controller. SLC is also faster than MLC and TLC and is usually put in data center product lines. The rule of thumb got upset a bit when the 3D cell structures came into play because they have larger cell features and can handle more erases than even MLC at smaller feature size. They are too new in the marketplace to see where they fall out on the lifespan curve, I believe. ...

Regarding the number of voltage levels:

"SLC" one bit per cell encodes the single bit using two discrete (analog) voltage levels stored in the memory cell (high or low volts for a "1" or a "0");

"MLC" encodes 2 bits per storage cell by using 4 discrete (analog) voltage levels for the "00", "01", "10", "11" bit values;

"TLC" encodes 3 bits by using 8 discrete (analog) voltage levels...

Encoding more voltage levels in each cell enables more data bits to be stored per cell, but at the cost of slower read-write circuitry and lower noise margins... Those compromises can be offset by using a form of 'internal RAID' for parallel access, plus error correction codes, all internal to the storage system.
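
To give a feel for what an error correction code does, here is a minimal sketch of the classic Hamming(7,4) code, which stores 4 data bits in 7 and can repair any single flipped bit. This choice is an assumption made purely for illustration: the post does not say which codes SSD controllers use, and real controllers rely on far stronger codes (such as BCH or LDPC) working over much larger blocks. The function names are made up for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit (0 means no error).
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


stored = hamming74_encode([1, 0, 1, 1])
stored[5] ^= 1  # simulate one bit misread from a noisy cell
print(hamming74_decode(stored))  # prints [1, 0, 1, 1]
```

The price is three extra stored bits for every four data bits, which is the kind of trade-off the paragraph above alludes to.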

A good metal case and keeping away from PSUs and power lines is always a very good idea!


The 64-layer stacked devices soon to be available look impressive for storage density and low power. It is just a waiting game for prices to hit whatever sweet spot suits your needs.


Happy cool crunchin',
Martin

PS: Problems of SSD wear should now be a thing of the past. All recent software and good OSes use suitable data alignment to avoid the killer "write amplification" effect. Further, newer systems should be using the "discard" function to improve wear-leveling performance. Further still, in the Linux world, there are filesystems in use that are specially designed to minimize the wear on flash-based storage (memory sticks, SSDs). For example, for HDDs and SSDs, I'm using btrfs; for memory sticks and SSDs, I use F2FS. - Happy fast efficient crunchin'!
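
To illustrate why alignment matters, here is a minimal sketch that counts how many flash pages a single write touches. The 4096-byte page size and the function names are assumptions for illustration, and the model is deliberately simplified: real write amplification also depends on erase-block size and on the controller's garbage collection.

```python
def pages_touched(offset: int, length: int, page_size: int = 4096) -> int:
    """Number of pages overlapped by a write of `length` bytes at byte `offset`."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1


def simple_write_amplification(offset: int, length: int, page_size: int = 4096) -> float:
    """Bytes the flash must program divided by bytes the host asked to write."""
    return pages_touched(offset, length, page_size) * page_size / length


# An aligned 4 KiB write fits in one page; the same write shifted by
# 512 bytes straddles two pages, so twice as much flash is programmed.
print(simple_write_amplification(0, 4096))
print(simple_write_amplification(512, 4096))
```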
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1877965
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1878099 - Posted: 13 Jul 2017, 17:25:34 UTC

To add to what Martin said, basically, what happens with MLCs and TLCs is that sometimes, an electron gets stuck on the wall of a cell. When this happens, the SSD has no control over removing that electron.

On an SLC where there is one electron per cell, the SSD controller expects a charge level of between 0 and 7. If there is no electron, then 0 is a binary 0, and 1-7 is a binary 1.

If an electron gets stuck in the cell, then 0 can never be achieved ever again because there will always be an electron stuck in the cell. So 1 is now the new binary 0. 2-7 is a binary 1.

Basically, the cell was only intended to have one electron in it, but it can theoretically have upwards of 8 trapped in there at any given time because of 8 voltage levels. The SSD knows how to detect when an electron gets stuck and refuses to be released, and it can do that for every single cell.

So an SLC can end up having 7 electrons get stuck on the wall of every cell and as long as the 8th electron plays nice, that cell is still operational and the voltage can be distinguished between 6 and 7. But once the 8th electron gets stuck, that cell is no longer usable and has to be deactivated and removed from the pool.


MLC is when it is designed to have two electrons per cell. As you can imagine, this complicates the logic a bit more in the event that an electron gets stuck on a wall. It makes that 0-7 voltage level a lot more sensitive to a stuck electron.

And then TLC holds three electrons per cell, and still only operates on the 0-7 scale.

That is why TLCs are more fragile than MLCs and especially SLCs. But saying "fragile" is a bit misleading all by itself, because TLCs can do several hundred TB before the SMART data starts to show retired clusters. That's pretty good considering you can get a 128GB-512GB TLC drive and write at least 300TB before it shows any signs of wear.
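
For the checkpointing question in the thread title, a back-of-envelope estimate can be sketched as below. The 300TB figure is the one quoted above; the checkpoint size, interval and task count are hypothetical values chosen only for illustration, and the model ignores write amplification and all other disk activity.

```python
SECONDS_PER_DAY = 86_400


def checkpoint_gb_per_day(checkpoint_mb: float, interval_s: float, tasks: int = 1) -> float:
    """Daily checkpoint volume in GB (decimal units: 1 GB = 1000 MB)."""
    checkpoints_per_day = SECONDS_PER_DAY / interval_s
    return checkpoint_mb * checkpoints_per_day * tasks / 1000


def years_to_reach(endurance_tb: float, gb_per_day: float) -> float:
    """Years of writing at `gb_per_day` before `endurance_tb` has been written."""
    return endurance_tb * 1000 / gb_per_day / 365


# Hypothetical workload: 8 tasks, each writing a 10 MB checkpoint every 60 seconds.
daily = checkpoint_gb_per_day(checkpoint_mb=10, interval_s=60, tasks=8)
print(f"{daily:.1f} GB/day, about {years_to_reach(300, daily):.1f} years to reach 300 TB")
```

Lengthening the checkpoint interval scales the daily volume down in direct proportion, so doubling the interval doubles the estimated time.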

But that also leads to another problem. SLCs are very durable, but they have a low density, and they're not exactly fast at read and write because of that low density. Think of the read/write speed of a CD compared to a Blu-ray. Both are the same footprint, but the latter has 34x the data density, so at the same RPM, it's going to read 34x faster. Same concept with SLC vs TLC. TLC has more data density, and often has better read/write performance, as well. But that density comes at the cost of a shorter lifespan in the case of NAND storage.




But hey, I'm not an engineer or anything, so... interpret how you will. This is just what I've taught myself over recent years. I could have interpreted it wrong from the start.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1878099
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1878110 - Posted: 13 Jul 2017, 19:41:51 UTC - in response to Message 1878099.  
Last modified: 13 Jul 2017, 20:10:38 UTC

... one electron per cell ...

Sorry, don't follow your description there. Sounds fuzzily[*] like you're describing storage using quantum dots.

E.g., see: Ars Technica - "The perfect computer memory: quantum dots?"


In the greatly larger dimensions of insulated-gate based 'flash' storage, varying amounts of charge (very many electrons) are tunneled through an insulator to a "floating" (transistor) gate. The amount of charge forced onto the floating gate determines the static charge seen and hence the detected voltage.

In all cases, there is only one floating charged gate per storage cell.

The clever part is being able to control ever smaller differences in charge so that more bits of data can be encoded in the detected voltage levels. One bit requires two discrete voltage levels (amounts of charge on the floating gate), and so on up to sixteen discrete voltage levels for four bits per cell.
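
As a quick check of that arithmetic, the number of discrete levels a cell must hold is 2 raised to the number of bits stored. The sketch below is illustrative only; the function name `voltage_levels` is made up for this example.

```python
def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinguishable charge levels needed to store the given bits."""
    if bits_per_cell < 1:
        raise ValueError("a cell must store at least one bit")
    return 2 ** bits_per_cell


for bits in range(1, 5):
    print(f"{bits} bit(s) per cell -> {voltage_levels(bits)} discrete levels")
```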

Degradation in the operation occurs when charge gets 'trapped' in the insulators around the gates, or if the insulators become leaky. When degraded, the charge that should be there doesn't give the expected voltage for that transistor...
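
The effect of trapped charge on a multi-level cell can be sketched with a toy model. The assumptions here are illustrative and not taken from any datasheet: levels are evenly spaced across a normalized 0.0 to 1.0 voltage window, the controller reads back the nearest level, and trapped charge is modeled as a fixed offset added to the stored voltage.

```python
import math


def program_voltage(value: int, bits: int) -> float:
    """Ideal normalized voltage for `value` in a cell storing `bits` bits."""
    levels = 2 ** bits
    return value / (levels - 1)


def read_value(voltage: float, bits: int) -> int:
    """Decode a voltage to the nearest level, clamped to the valid range."""
    levels = 2 ** bits
    nearest = math.floor(voltage * (levels - 1) + 0.5)
    return max(0, min(levels - 1, nearest))


trapped_offset = 0.1  # hypothetical shift, as a fraction of the voltage window
for bits in (1, 2, 3):
    seen = program_voltage(0, bits) + trapped_offset
    print(f"{bits} bit(s) per cell: wrote 0, read back {read_value(seen, bits)}")
```

With the same offset, the one-bit and two-bit cells still read back 0 while the three-bit cell is misread as 1, because its levels sit closer together.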

There are various 'tricks' to recover or mitigate but that's for a bit more of an article!


All a good fun subject to explore!

Keep searchin',
Martin

[edit]

*: That's a bit of a fun pun on fuzzy logic and Heisenberg uncertainty for singular electrons :-)

[/edit]
ID: 1878110
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1885025 - Posted: 20 Aug 2017, 13:35:20 UTC - in response to Message 1878110.  
Last modified: 20 Aug 2017, 13:39:06 UTC

[ Multiple bits per storage cell flash memory (such as SSDs): ]


In the greatly larger dimensions of insulated-gate based 'flash' storage, varying amounts of charge (very many electrons) are tunneled through an insulator to a "floating" (transistor) gate. The amount of charge forced onto the floating gate determines the static charge seen and hence the detected voltage.

In all cases, there is only one floating charged gate per storage cell.

The clever part is being able to control ever smaller differences in charge so that more bits of data can be encoded in the detected voltage levels. One bit requires two discrete voltage levels (amounts of charge on the floating gate), and so on up to sixteen discrete voltage levels for four bits per cell.

Degradation in the operation occurs when charge gets 'trapped' in the insulators around the gates, or if the insulators become leaky. When degraded, the charge that should be there doesn't give the expected voltage for that transistor...

There are various 'tricks' to recover or mitigate but that's for a bit more of an article!...


And thus 4-bits-per-cell "QLC" becomes available sooner rather than later!


This article nicely explains the latest "QLC" announcement and gives an idea of how the 'QLC' works:

Toshiba Showcases QLC NAND Development Platform At Flash Memory Summit

Last month Toshiba announced QLC (4-bit per cell) storage using BiCS 3 technology. We've heard about QLC for around a year now but never expected to see it this soon. At Flash Memory Summit we not only saw QLC, but also saw it running in a development platform...


Note how the "L" in "QLC" is something of a misnomer. The name could be read as describing a "level-4" ('quad') development. However, the "L" very definitely does NOT mean merely 4 levels of voltage, as might commonly be assumed: in reality you must have at least 16 levels of charge to encode 4 bits of data...



Keep searchin',
Martin
ID: 1885025
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20289
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1895924 - Posted: 18 Oct 2017, 12:42:18 UTC - in response to Message 1885025.  
Last modified: 18 Oct 2017, 12:46:33 UTC

... This article nicely explains the latest "QLC" announcement and gives an idea of how the 'QLC' works:

Toshiba Showcases QLC NAND Development Platform At Flash Memory Summit

Last month Toshiba announced QLC (4-bit per cell) storage using BiCS 3 technology. We've heard about QLC for around a year now but never expected to see it this soon. At Flash Memory Summit we not only saw QLC, but also saw it running in a development platform...


Note how the "L" in "QLC" is something of a misnomer. The name could be read as describing a "level-4" ('quad') development. However, the "L" very definitely does NOT mean merely 4 levels of voltage, as might commonly be assumed: in reality you must have at least 16 levels of charge to encode 4 bits of data...

Here's a good follow-up with pretty diagrams that nicely explain the structure, compromises and endurance for examples of QLC flash:

Facebook Asks For QLC NAND, Toshiba Answers with 100TB QLC SSDs With TSV

... 100TB QLC SSDs. Toshiba plans to pioneer the use of QLC (Quad-Level Cell) technology with a new variant of its 64-layer BiCS3 NAND, but the company will also employ its TSV (Through-Silicon Via) technology to boost its die stacking capability, which effectively doubles capacity per package. QLC NAND makes several tradeoffs, particularly [with] endurance and performance ... but it will provide more density...


With only a dozen or so electrons to kick around at a time (for SLC/MLC planar NAND), spectacular and scary stuff for your data! Phew!!

Keep searchin',
Martin

SLC: 'Single'-Level Cell, two levels of charge (voltage) storing 1 bit of data;
MLC: 'Multi'-Level Cell, four levels of charge (voltage) storing 2 bits of data.
ID: 1895924


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.