Issues with SSD: rising CRC errors, freezing, sometimes read-only





















My laptop SSD is acting up, and the number of errors has soared since the last time I posted.



Is this drive dead / dying?

It is running now and I'm writing this on it. All my data is backed up, but I am still unsure whether the drive is usable.



Contacting the manufacturer didn't help much: they asked me to install Windows and run the disk check utility from there, or to connect the SSD as an external drive to a Windows host and test it there.

I did both and no errors were encountered.



I also checked it with the utility they provide (see screenshot below). I then used the image I made with Clonezilla to return to Ubuntu, and I found that the SATA PHY error count is nearing 300!
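The count mentioned above corresponds to the "ATA Error Count" line in the smartctl report. To track it between boots, the following minimal sketch pulls that figure and the SATA link speeds out of saved `smartctl -a` text. The function name `summarize_smartctl` and the two-line `sample` string are illustrative assumptions; the sample lines are copied from the report in this question.

```python
import re


def summarize_smartctl(text):
    """Extract the ATA error count and SATA link speeds from smartctl -a text."""
    summary = {"ata_error_count": None, "max_gbps": None, "current_gbps": None}

    # Example line: "ATA Error Count: 298 (device log contains only ...)"
    m = re.search(r"^ATA Error Count:\s*(\d+)", text, re.MULTILINE)
    if m:
        summary["ata_error_count"] = int(m.group(1))

    # Example line: "SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)"
    m = re.search(
        r"^SATA Version is:.*?([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)",
        text,
        re.MULTILINE,
    )
    if m:
        summary["max_gbps"] = float(m.group(1))
        summary["current_gbps"] = float(m.group(2))
    return summary


# Hypothetical input: two lines taken from the report in this question.
sample = """\
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
ATA Error Count: 298 (device log contains only the most recent five errors)
"""

print(summarize_smartctl(sample))
```

A link running at 3.0 Gb/s on a 6.0 Gb/s drive may simply be the limit of the laptop's SATA port, or it may be a speed the link fell back to after errors; this output alone does not say which.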



I've also checked the connectors, but since the SSD is in a laptop I cannot change the cable (easily).



These are the test results generated by the manufacturer's utility:



[Screenshot: Test results]



And the smartctl output on Ubuntu, later:



smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.14.0-041400-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: SPCC Solid State Disk
Serial Number: XXXXXXXXXX
Firmware Version: S9FM02.8
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 (minor revision not indicated)
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sun Feb 18 02:22:56 2018 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 30) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 2) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       6352
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       2717
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       25
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       105447539
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       77
194 Temperature_Celsius     0x0023   070   070   000    Pre-fail  Always       -       30
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
218 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       15431
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       6281157

SMART Error Log Version: 1
ATA Error Count: 298 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 298 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:08.077 [VENDOR SPECIFIC]
ca 00 80 b0 8f 12 e1 00 00:11:08.076 WRITE DMA
ca 00 80 30 8f 12 e1 00 00:11:08.076 WRITE DMA
ca 00 80 b0 8e 12 e1 00 00:11:08.075 WRITE DMA
ca 00 80 30 8e 12 e1 00 00:11:08.074 WRITE DMA

Error 297 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:08.039 [VENDOR SPECIFIC]
ca 00 80 b0 7c 12 e1 00 00:11:08.038 WRITE DMA
ca 00 80 30 7c 12 e1 00 00:11:08.038 WRITE DMA
ca 00 80 b0 7b 12 e1 00 00:11:08.037 WRITE DMA
ca 00 80 30 7b 12 e1 00 00:11:08.037 WRITE DMA

Error 296 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.974 [VENDOR SPECIFIC]
ca 00 80 b0 48 12 e1 00 00:11:07.973 WRITE DMA
ca 00 80 30 48 12 e1 00 00:11:07.972 WRITE DMA
ca 00 80 b0 47 12 e1 00 00:11:07.972 WRITE DMA
ca 00 80 30 47 12 e1 00 00:11:07.972 WRITE DMA

Error 295 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.927 [VENDOR SPECIFIC]
ca 00 80 b0 2a 12 e1 00 00:11:07.926 WRITE DMA
ca 00 80 30 2a 12 e1 00 00:11:07.925 WRITE DMA
ca 00 80 b0 29 12 e1 00 00:11:07.925 WRITE DMA
ca 00 80 30 29 12 e1 00 00:11:07.924 WRITE DMA

Error 294 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.899 [VENDOR SPECIFIC]
ca 00 80 b0 22 12 e1 00 00:11:07.898 WRITE DMA
ca 00 80 30 22 12 e1 00 00:11:07.897 WRITE DMA
ca 00 80 b0 21 12 e1 00 00:11:07.897 WRITE DMA
ca 00 80 30 21 12 e1 00 00:11:07.896 WRITE DMA

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6288 -
# 2 Conveyance offline Completed without error 00% 6285 -
# 3 Short offline Completed without error 00% 6285 -
# 4 Extended offline Completed without error 00% 6283 -
# 5 Extended offline Completed without error 00% 6283 -
# 6 Short offline Completed without error 00% 6283 -
# 7 Extended offline Completed without error 00% 6262 -
# 8 Conveyance offline Completed without error 00% 6262 -
# 9 Conveyance offline Completed without error 00% 6262 -
#10 Extended offline Completed without error 00% 6262 -
#11 Short offline Completed without error 00% 6262 -
#12 Conveyance offline Completed without error 00% 6211 -
#13 Extended offline Completed without error 00% 6211 -
#14 Short offline Completed without error 00% 6211 -
#15 Short offline Completed without error 00% 6075 -
#16 Conveyance offline Completed without error 00% 5564 -
#17 Extended offline Completed without error 00% 5564 -
#18 Short offline Completed without error 00% 5564 -
#19 Conveyance offline Completed without error 00% 5319 -
#20 Short offline Completed without error 00% 5319 -
#21 Conveyance offline Completed without error 00% 4403 -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
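Every logged error above shows the same register pair, ER = 84 and ST = 51, after a run of WRITE DMA commands. As a rough aid to reading those values, here is a small sketch that decodes the two registers and rebuilds the 28-bit LBA of a WRITE DMA entry from its SN/CL/CH/DH bytes. The bit names follow the classic ATA register layout (the names for the two lowest error bits vary between ATA revisions), and the helper names `decode_bits` and `lba28` are my own, not part of smartctl.

```python
# Classic ATA error-register bit names (bits 1 and 0 use legacy names).
ERROR_BITS = {7: "ICRC", 6: "UNC", 5: "MC", 4: "IDNF",
              3: "MCR", 2: "ABRT", 1: "NM", 0: "AMNF"}

# Classic ATA status-register bit names.
STATUS_BITS = {7: "BSY", 6: "DRDY", 5: "DF", 4: "DSC",
               3: "DRQ", 2: "CORR", 1: "IDX", 0: "ERR"}


def decode_bits(value, names):
    """Return the names of the bits set in value, highest bit first."""
    return [names[bit] for bit in range(7, -1, -1) if value & (1 << bit)]


def lba28(sn, cl, ch, dh):
    """Combine the SN, CL, CH and low nibble of DH into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


print(decode_bits(0x84, ERROR_BITS))    # error register from the log
print(decode_bits(0x51, STATUS_BITS))   # status register from the log
# First WRITE DMA row under error 298: SN=b0 CL=8f CH=12 DH=e1, SC=80
print(hex(lba28(0xB0, 0x8F, 0x12, 0xE1)))
```

With these tables, 0x84 decodes to ICRC plus ABRT and 0x51 to DRDY, DSC and ERR. ICRC is the interface CRC flag, which is commonly associated with the link between drive and controller (connector, cable, port) rather than with the flash itself, although that is a tendency and not proof. The writes around each error are 128-sector (SC = 0x80) transfers at consecutive addresses.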




























  • 1




    So you are saying that with the S.M.A.R.T. tools on Ubuntu you see error counts, but under Windows with the manufacturer's tools you see none?
    – Robert Riedl
    Feb 9 at 18:16






  • 1




    @RobertRiedl That's what I'm saying, and I don't know why. I only installed Windows because the support e-mail said to do so.
    – Muaad ElSharif
    Feb 9 at 18:31






  • 1




    Any other errors or odd behaviour under Ubuntu?
    – Robert Riedl
    Feb 9 at 18:33






  • 2




    I also think it is the laptop's connection or controller. Try attaching the drive to another PC and testing again.
    – jet
    Feb 10 at 2:48






  • 2




    @MuaadElSharif Your smartctl listing is difficult to read. Can you regenerate it and post it into your question? Rather than formatting it as a quote with the " button, format it as a code block. Thanks.
    – WinEunuuchs2Unix
    Feb 17 at 21:05














173 Unknown_Attribute 0x0000 100 100 000 Old_age Offline - 105447539
192 Power-Off_Retract_Count 0x0012 100 100 000 Old_age Always - 77
194 Temperature_Celsius 0x0023 070 070 000 Pre-fail Always - 30
196 Reallocated_Event_Count 0x0000 100 100 000 Old_age Offline - 0
218 Unknown_Attribute 0x0000 100 100 000 Old_age Offline - 15431
241 Total_LBAs_Written 0x0012 100 100 000 Old_age Always - 6281157

SMART Error Log Version: 1
ATA Error Count: 298 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 298 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:08.077 [VENDOR SPECIFIC]
ca 00 80 b0 8f 12 e1 00 00:11:08.076 WRITE DMA
ca 00 80 30 8f 12 e1 00 00:11:08.076 WRITE DMA
ca 00 80 b0 8e 12 e1 00 00:11:08.075 WRITE DMA
ca 00 80 30 8e 12 e1 00 00:11:08.074 WRITE DMA

Error 297 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:08.039 [VENDOR SPECIFIC]
ca 00 80 b0 7c 12 e1 00 00:11:08.038 WRITE DMA
ca 00 80 30 7c 12 e1 00 00:11:08.038 WRITE DMA
ca 00 80 b0 7b 12 e1 00 00:11:08.037 WRITE DMA
ca 00 80 30 7b 12 e1 00 00:11:08.037 WRITE DMA

Error 296 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.974 [VENDOR SPECIFIC]
ca 00 80 b0 48 12 e1 00 00:11:07.973 WRITE DMA
ca 00 80 30 48 12 e1 00 00:11:07.972 WRITE DMA
ca 00 80 b0 47 12 e1 00 00:11:07.972 WRITE DMA
ca 00 80 30 47 12 e1 00 00:11:07.972 WRITE DMA

Error 295 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.927 [VENDOR SPECIFIC]
ca 00 80 b0 2a 12 e1 00 00:11:07.926 WRITE DMA
ca 00 80 30 2a 12 e1 00 00:11:07.925 WRITE DMA
ca 00 80 b0 29 12 e1 00 00:11:07.925 WRITE DMA
ca 00 80 30 29 12 e1 00 00:11:07.924 WRITE DMA

Error 294 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 01 01 00 00 00

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ff d5 01 01 00 00 00 ff 00:11:07.899 [VENDOR SPECIFIC]
ca 00 80 b0 22 12 e1 00 00:11:07.898 WRITE DMA
ca 00 80 30 22 12 e1 00 00:11:07.897 WRITE DMA
ca 00 80 b0 21 12 e1 00 00:11:07.897 WRITE DMA
ca 00 80 30 21 12 e1 00 00:11:07.896 WRITE DMA

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6288 -
# 2 Conveyance offline Completed without error 00% 6285 -
# 3 Short offline Completed without error 00% 6285 -
# 4 Extended offline Completed without error 00% 6283 -
# 5 Extended offline Completed without error 00% 6283 -
# 6 Short offline Completed without error 00% 6283 -
# 7 Extended offline Completed without error 00% 6262 -
# 8 Conveyance offline Completed without error 00% 6262 -
# 9 Conveyance offline Completed without error 00% 6262 -
#10 Extended offline Completed without error 00% 6262 -
#11 Short offline Completed without error 00% 6262 -
#12 Conveyance offline Completed without error 00% 6211 -
#13 Extended offline Completed without error 00% 6211 -
#14 Short offline Completed without error 00% 6211 -
#15 Short offline Completed without error 00% 6075 -
#16 Conveyance offline Completed without error 00% 5564 -
#17 Extended offline Completed without error 00% 5564 -
#18 Short offline Completed without error 00% 5564 -
#19 Conveyance offline Completed without error 00% 5319 -
#20 Short offline Completed without error 00% 5319 -
#21 Conveyance offline Completed without error 00% 4403 -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.






ssd hardware disk ubuntu-gnome smart














edited Feb 23 at 20:36 by Robert Riedl

asked Feb 9 at 18:09 by Muaad ElSharif







  • 1




    So you are saying that with the S.M.A.R.T. tools on Ubuntu you see error counts, but under Windows with the manufacturer's tools you see none?
    – Robert Riedl
    Feb 9 at 18:16






  • 1




    @RobertRiedl that's what I'm saying, I don't know why. And I only installed Windows because the support e-mail said to do so.
    – Muaad ElSharif
    Feb 9 at 18:31






  • 1




    Any other errors or odd behaviour under Ubuntu?
    – Robert Riedl
    Feb 9 at 18:33






  • 2




    I also think it is the connection/controller of the laptop. Try attaching the drive to another PC and testing again.
    – jet
    Feb 10 at 2:48






  • 2




    @MuaadElSharif Your smartctl listing is difficult to read. Can you regenerate it and post it into your question? Rather than clicking " to format it as a quote, click to format it as a code block. Thanks.
    – WinEunuuchs2Unix
    Feb 17 at 21:05






















4 Answers

















up vote
6
down vote













Replace your SSD



People have tried a lot of things in the comments, but this SSD seems to have some issues.



Judging by the S.M.A.R.T. readouts, your drive has not seen a lot of action (about 265 power-on days, ~6 TB written), and you say it is about 2 years old. It should be well within the warranty!
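The power-on figure is easy to double-check from the report. A minimal sketch, assuming the raw value of Power_On_Hours (6352 in the smartctl output above) really is a count of hours; hours_to_days is just a helper name made up for this example:

```shell
# hours_to_days HOURS: convert a Power_On_Hours raw value to days (one decimal).
hours_to_days() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 24 }'
}

# Raw value of attribute 9 in the report.
hours_to_days 6352   # prints 264.7
```

The written-data figure is not checked the same way here, because the unit of the raw Total_LBAs_Written value differs between vendors.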



My advice is



  • back up all your data immediately (though you say you have that covered already)

  • remove / replace the SSD (depending on your budget, of course)

  • send the disk to the manufacturer for replacement

Your "Slim S70" disk should be covered under Silicon Power's 5-year warranty.



Just send them an RMA request here.


























  • It's still working, haven't made an error since, I'm backing up daily and making weekly images of the system. I need more information to decide.
    – Muaad ElSharif
    Feb 19 at 18:05






  • 2




    @MuaadElSharif, cut your losses and replace it. You've spent a lot of time on it already!
    – Robert Riedl
    Feb 20 at 12:00

















up vote
3
down vote



+500










Some time before May 11, 2017, you updated your SSD firmware. However, a new version was released in September 2017, and you should apply it using Windows.




Run fstrim to discard unused blocks in the file system:



$ sudo fstrim --verbose --all
/mnt/c: 16 EiB (18446744073709551615 bytes) trimmed
/mnt/e: 16 EiB (18446744073709551615 bytes) trimmed
/: 23.4 GiB (25132920832 bytes) trimmed


In my case the results for Windows 10 partitions /mnt/c and /mnt/e were out of this world. So I checked the files and no harm was done to the data.
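The odd figure is worth a second look: 18446744073709551615 is 2^64 - 1, the largest value an unsigned 64-bit integer can hold, so it looks like a placeholder rather than a real count of discarded bytes. A small sketch that picks such lines out of fstrim --verbose output; bogus_trim_lines is a name invented for this example, and the sample lines are the ones shown above:

```shell
# bogus_trim_lines: read `fstrim --verbose` output on stdin and print the
# mount point of every line whose byte count is 2^64 - 1.
bogus_trim_lines() {
    sed -n 's/^\(.*\): .* (18446744073709551615 bytes) trimmed$/\1/p'
}

printf '%s\n' \
    '/mnt/c: 16 EiB (18446744073709551615 bytes) trimmed' \
    '/mnt/e: 16 EiB (18446744073709551615 bytes) trimmed' \
    '/: 23.4 GiB (25132920832 bytes) trimmed' | bogus_trim_lines
```

On a real system the input would come from sudo fstrim --verbose --all instead of the printf.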




Run fsck -f on your SSD after booting from a live USB, while the partition is not mounted. Another option is running fsck -f from GRUB; see "How to fsck hard drive while hard drive is unmounted, using bootable USB stick?".
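A rough sketch of the "not mounted" precaution, for use from the live session. The function names are made up for this example, /dev/sdXN is a placeholder for your partition, and the check assumes the partition appears under that exact device name in /proc/mounts:

```shell
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE is listed as mounted.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# check_unmounted_fs DEVICE [MOUNTS_FILE]: run a forced fsck only when
# DEVICE is not mounted; otherwise print a hint and do nothing.
check_unmounted_fs() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "$1 is mounted; unmount it or boot a live USB first" >&2
        return 1
    fi
    sudo fsck -f "$1"
}

# Usage (placeholder device name):
#   check_unmounted_fs /dev/sdXN
```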




As mentioned in comments a bad SATA cable can cause errors. But as this answer points out, a loose connection can also cause errors. To rule out a bad/loose connection, remove the plugs from your SSD, blow compressed air over them and the male pins on the drive and firmly reseat the cables.




How much is your time worth?



The last question is how much your time is worth. Assuming you've spent 10 hours on this problem, it works out to $5/hour, because many brand-new 120 GB SATA III SSDs can be purchased on ebay.com for roughly $50 (10 hours × $5).




Feb 23/2018 update



I read all the other answers tonight. One answer says to return it. But if you do and they find nothing wrong they'll simply send it back and you'll be without a drive for 2 weeks to 2 months.



Another answer says smartctl reports there is nothing wrong with the drive.



In this answer I suggested running fsck -f and you responded that no errors were reported.



Run fsck every boot



As a compromise between the negative answer (return it) and the positive answer (nothing is wrong), my inclination would be to run fsck on every boot. If an error is discovered the boot is paused and you can read the error message. To summarize the link use:



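sudo tune2fs -c 1 /dev/sdXN


Note: tune2fs works on the ext filesystem's partition rather than the whole disk, so replace X with your drive letter (a, b, etc.) and N with the partition number.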



If a month passes with no errors, change the value from 1 to 30, which I believe is typical for most systems. On a typical SSD the fsck will run quickly.
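To confirm the setting took effect, the value can be read back with tune2fs -l. A small sketch: max_mount_count is a helper name invented here, and the device in the usage comment is a placeholder for your ext partition:

```shell
# max_mount_count: read `tune2fs -l` output on stdin and print the
# "Maximum mount count" value (1 means a check on every mount).
max_mount_count() {
    awk -F': *' '$1 == "Maximum mount count" { print $2 }'
}

# Usage (placeholder device name):
#   sudo tune2fs -l /dev/sdXN | max_mount_count
```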



Clean and re-seat SATA cables



Others mentioned replacing the SATA cable which is problematic for a laptop. As a compromise consider unplugging all cables on the drive side, using compressed air on male and female ends and then plugging the cables back in firmly.




























  • I checked it a few days ago, no update was issued. I updated mine in April of 2017.
    – Muaad ElSharif
    Feb 18 at 0:21










  • @MuaadElSharif Is the link in my answer different than the page you checked?
    – WinEunuuchs2Unix
    Feb 18 at 0:22











  • No, I checked using the SATA tool on Windows, and it said that the firmware is at the latest version, no update was available
    – Muaad ElSharif
    Feb 18 at 0:28






  • 1




    @MuaadElSharif After reading other answers; one saying it's broken and return it, the other saying there is nothing wrong, I added a compromise section to my answer above.
    – WinEunuuchs2Unix
    Feb 24 at 0:33






  • 1




    @MuaadElSharif Sorry for the delay; Saturday overtime at work. Glad you got fsck scheduled for every boot. Keep us posted.
    – WinEunuuchs2Unix
    Feb 24 at 22:30

















up vote
2
down vote













There is nothing wrong with your drive. All tests pass. You are simply misinterpreting the SMART data.



Firstly, the first screenshot contains raw data and you cannot draw any conclusions about it. I have no idea what use its creator thinks that data would be to anybody, but it doesn't really mean anything. Unless the meaningful columns can be reached by scrolling right in the window or something.



Let me explain the columns in the SMART report (the latter report you posted).



  • Attribute name: name of the metric

  • Value: current value, higher is better. Values are often out of 100 where 100 = best, but can use any scale as long as higher is better. Even if the metric is something like "error rate", it's normalised so higher values mean lower error rates.

  • Worst: worst observed value, higher is better.

  • Thresh: if the value drops to or below this, it's a fail condition. Above = pass.

  • Type: what a fail condition would mean for this metric.

    • Old_age: this metric is indicative of age/usage of the drive, not a specific problem.

    • Pre-fail: this metric is indicative of a potential problem with the drive, increasing chance of drive failure.


  • When_failed: When this entered failure mode, if ever

  • Raw_value: internal measurement of the drive that contributed to the value - this is not useful for end user and lower or higher values do not necessarily indicate better or worse.
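The column rules above can be applied mechanically. A minimal sketch, assuming the ten-column attribute layout shown in the report and treating a normalised value at or below the threshold as failed; failing_attributes is a name made up for this example:

```shell
# failing_attributes: read a smartctl attribute table on stdin and print the
# name of every attribute whose VALUE (column 4) is at or below THRESH (column 6).
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($4 + 0) <= ($6 + 0) { print $2 }'
}

# Two rows copied from the report; neither is failing, so nothing is printed.
printf '%s\n' \
    '170 Unknown_Attribute 0x0013 100 100 010 Pre-fail Always - 25' \
    '194 Temperature_Celsius 0x0023 070 070 000 Pre-fail Always - 30' | failing_attributes
```

On a live system the table would come from sudo smartctl -A /dev/sdX (placeholder device name).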

To address some specific areas of the report:




SMART overall-health self-assessment test result: PASSED




This reflects everything passed. None of the metrics measured has ever entered a failure state.



The log of "errors" is relatively typical for a drive. These do not necessarily indicate unrecoverable errors or even problems with the drive itself; their reports are vague, so you can't tell what actually happened from this except that it was during DMA transfer at the controller, but if anything was important it would be reflected in the overall health report. In particular, these ones could be something fairly innocent like writes that were cancelled at the controller end, or the OS requesting some feature during load that the drive doesn't support, which may be entirely normal when probing device capabilities.
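For what it's worth, the ER byte in each logged entry (84) can be broken down a little further. The sketch below assumes the usual ATA error register layout, decodes only the four bits I am confident about (ICRC, UNC, IDNF, ABRT), and uses a function name invented for this example:

```shell
# decode_ata_error HEX: print the names of the known bits set in an ATA
# error register value. Other bits are ignored.
decode_ata_error() {
    er=$(( $1 ))
    out=""
    [ $(( er & 0x80 )) -ne 0 ] && out="$out ICRC"   # interface CRC error
    [ $(( er & 0x40 )) -ne 0 ] && out="$out UNC"    # uncorrectable data
    [ $(( er & 0x10 )) -ne 0 ] && out="$out IDNF"   # sector ID not found
    [ $(( er & 0x04 )) -ne 0 ] && out="$out ABRT"   # command aborted
    echo "${out# }"
}

decode_ata_error 0x84   # prints: ICRC ABRT
```

If that reading is right, the logged entries are interface CRC errors on aborted writes, which would fit a link or connector problem at least as well as a fault in the flash itself.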



Finally, a note about CRC errors or error rates: all drives have an error rate. Drives store data at such high densities that a certain number of bit errors is expected and designed for, by using error correction code. The error correction code ensures that a certain number of bit errors per chunk of bits may occur and be 100% corrected. The drive is constantly applying the error correction code all the time, and the error correction code is designed so that the chance of an unrecoverable error occurring randomly is very low (as in, significantly less likely than winning the lottery) in a well functioning drive. If you see an error rate in any stats and it's treated like no big deal, it's because it isn't, it'll just be corrected errors.




























  • The user reported frequent issues with his system in the comments "Sometimes freezes, today it went read only on me, and a black screen with busybox on it."
    – Robert Riedl
    Feb 22 at 14:45










  • I didn't see the info about freezes or boot issues, as it was hidden in comments. Nonetheless, the physical health of the SSD does not look like the cause of those problems.
    – thomasrutter
    Feb 22 at 14:50











  • But the CRC count is rising (compare the Windows screenshot with the Linux output). You are right, normally this wouldn't be cause for concern, but the number is relatively high, keeps rising, there are issues, etc.
    – Robert Riedl
    Feb 22 at 14:58

















up vote
0
down vote













You have only WRITE DMA errors, and the short and long self-tests show no errors.



Since DMA stands for Direct Memory Access, find out whether the BIOS has a separate hardware diagnostics test, and try the memory-related tests.



If no BIOS-embedded test is available, check the manufacturer's support site for offline hardware diagnostics (e.g. a bootable ISO file to burn to a CD or USB stick).



(BTW: an Ubuntu CD also has memory diagnostics.)



Because a DMA write is I/O, I would try replacing the SATA cable and then check that no new errors are logged (the last one here is 298, but more may have been added by now).
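To see whether the counter moves after a cable or connector change, the number can be pulled out of the smartctl error log before and after. A sketch, with ata_error_count as a made-up helper name and /dev/sdX as a placeholder:

```shell
# ata_error_count: read `smartctl -l error` output on stdin and print the
# number from the "ATA Error Count:" line.
ata_error_count() {
    sed -n 's/^ATA Error Count: \([0-9][0-9]*\).*/\1/p'
}

# Usage (placeholder device name):
#   before=$(sudo smartctl -l error /dev/sdX | ata_error_count)
#   ... reseat or replace the cable, use the machine for a while ...
#   after=$(sudo smartctl -l error /dev/sdX | ata_error_count)
#   echo "new errors: $(( after - before ))"
```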
























  • 1




    OP said in the comments "no way to change the connector, and I'm not in the market shopping for new laptops"
    – Robert Riedl
    Feb 22 at 15:41










  • An alternative to replacing the cable is to take the disk out of the laptop and connect it to a SATA cable and connector in a desktop or tower PC, then run the SMART test again and see whether you still get this high number of DMA errors. Or buy a USB-to-SATA adapter so that you can connect the disk to a USB port of another computer and run the SMART test again.
    – jringoot
    Sep 21 at 11:03


















































Answer "Replace your SSD": answered Feb 19 at 14:19 by Robert Riedl























up vote
3
down vote



+500










Some time before May 11, 2017 you updated your SSD Firmware. However a new version was released in September 2017 and you should apply it using Windows.




Run fstrim to discard unused blocks in the file system:



$ sudo fstrim --verbose --all
/mnt/c: 16 EiB (18446744073709551615 bytes) trimmed
/mnt/e: 16 EiB (18446744073709551615 bytes) trimmed
/: 23.4 GiB (25132920832 bytes) trimmed


In my case the results for Windows 10 partitions /mnt/c and /mnt/e were out of this world. So I checked the files and no harm was done to the data.




Run fsck -f on your SSD after booting with a Live-USB when the partition is not mounted. Another option is running fsck -f from grub - How to fsck hard drive while hard drive is unmounted, using bootable USB stick?.




As mentioned in comments a bad SATA cable can cause errors. But as this answer points out, a loose connection can also cause errors. To rule out a bad/loose connection, remove the plugs from your SSD, blow compressed air over them and the male pins on the drive and firmly reseat the cables.




How much is your time worth?



The last question is how much is your time worth. Assuming you've spent 10 hours on this problem it works out to $5 / hour because many brand new 120GB SATA III SSDs can be purchased from ebay.com




Feb 23/2018 update



I read all the other answers tonight. One answer says to return it. But if you do and they find nothing wrong they'll simply send it back and you'll be without a drive for 2 weeks to 2 months.



Another answer says smartctl reports there is nothing wrong with the drive.



In this answer I suggested running fsck -f and you responded that no errors were reported.



Run fsck every boot



As a compromise between the negative answer (return it) and the positive answer (nothing is wrong), my inclination would be to run fsck on every boot. If an error is discovered the boot is paused and you can read the error message. To summarize the link use:



sudo tune2fs -c 1 /dev/sdX


Note: replace X with your drive letter, ie a, b, etc..



If after a month of no errors, change the value from 1 to 30 which is typical for most systems I believe. On a typical SSD the fsck will run quickly.



Clean and re-seat SATA cables



Others mentioned replacing the SATA cable which is problematic for a laptop. As a compromise consider unplugging all cables on the drive side, using compressed air on male and female ends and then plugging the cables back in firmly.






share|improve this answer






















  • I checked it a few days ago, no update was issued. I updated mine in April of 2017.
    – Muaad ElSharif
    Feb 18 at 0:21










  • @MuaadElSharif Is the link in my answer different than the page you checked?
    – WinEunuuchs2Unix
    Feb 18 at 0:22











  • No, I checked using the SATA tool on Windows, and it said that the firmware is at the latest version, no update was available
    – Muaad ElSharif
    Feb 18 at 0:28






  • 1




    @MuaadElSharif After reading other answers; one saying it's broken and return it, the other saying there is nothing wrong, I added a compromise section to my answer above.
    – WinEunuuchs2Unix
    Feb 24 at 0:33






  • 1




    @MuaadElSharif Sorry for delay Saturday overtime at work. Glad you got fsck scheduled for every boot. Keep us posted.
    – WinEunuuchs2Unix
    Feb 24 at 22:30














answered Feb 17 at 22:02, edited Feb 24 at 0:46
– WinEunuuchs2Unix











  • I checked it a few days ago; no update had been issued. I updated mine in April 2017.
    – Muaad ElSharif
    Feb 18 at 0:21










  • @MuaadElSharif Is the link in my answer different than the page you checked?
    – WinEunuuchs2Unix
    Feb 18 at 0:22











  • No, I checked using the SATA tool on Windows, and it said that the firmware is at the latest version; no update was available.
    – Muaad ElSharif
    Feb 18 at 0:28






  • @MuaadElSharif After reading the other answers (one saying the drive is broken and should be returned, the other saying nothing is wrong), I added a compromise section to my answer above.
    – WinEunuuchs2Unix
    Feb 24 at 0:33






  • @MuaadElSharif Sorry for the delay; Saturday overtime at work. Glad you got fsck scheduled for every boot. Keep us posted.
    – WinEunuuchs2Unix
    Feb 24 at 22:30


























up vote
2
down vote













There is nothing wrong with your drive. All tests pass. You are simply misinterpreting the SMART data.



First, the first screenshot contains only raw data, and you cannot draw any conclusions from it. I have no idea what use its creator thinks that data would be to anybody; it doesn't really mean anything, unless the meaningful columns can be reached by scrolling right in the window.



Let me explain the columns in the SMART report (the latter report you posted).



  • Attribute name: name of the metric

  • Value: current value, higher is better. Values are often out of 100 where 100 = best, but can use any scale as long as higher is better. Even if the metric is something like "error rate", it's normalised so higher values mean lower error rates.

  • Worst: worst observed value, higher is better.

  • Thresh: if the value drops to or below this, it's a fail condition. Above = pass.

  • Type: what a fail condition would mean for this metric.

    • Old_age: this metric is indicative of age/usage of the drive, not a specific problem.

    • Pre-fail: this metric is indicative of a potential problem with the drive, increasing chance of drive failure.


  • When_failed: When this entered failure mode, if ever

  • Raw_value: the drive's internal measurement that contributed to the value. It is not useful to the end user, and lower or higher values do not necessarily indicate better or worse.
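The column layout described above can be read mechanically. Here is a minimal Python sketch, assuming the ten-column attribute table that `smartctl -A` prints (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE); the class name, helper name and sample row are hypothetical and not taken from the question.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    attr_type: str
    when_failed: str
    raw_value: str

    @property
    def headroom(self):
        """Distance between the normalised value and the fail threshold."""
        return self.value - self.thresh

    @property
    def ever_failed(self):
        """smartctl prints '-' in WHEN_FAILED if the attribute never failed."""
        return self.when_failed != "-"


def parse_attribute_row(row):
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    parts = row.split(None, 9)
    return SmartAttribute(
        attr_id=int(parts[0]),
        name=parts[1],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        attr_type=parts[6],
        when_failed=parts[8],
        raw_value=parts[9].strip(),
    )


# Made-up row in the smartctl layout, for illustration only.
SAMPLE_ROW = "199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       12"
attr = parse_attribute_row(SAMPLE_ROW)
print(attr.name, attr.headroom, attr.ever_failed)
```

The sketch compares only the normalised columns and WHEN_FAILED, which is the reading of the report this answer argues for; the raw value is carried along as an opaque string.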

To address some specific areas of the report:




SMART overall-health self-assessment test result: PASSED




This means that everything passed: none of the metrics measured has ever entered a failure state.



The log of "errors" is fairly typical for a drive. The entries do not necessarily indicate unrecoverable errors, or even problems with the drive itself. They are vague: all you can tell is that something happened during a DMA transfer at the controller. If anything important had happened, it would be reflected in the overall health report. In particular, these entries could be something fairly innocent, such as writes that were cancelled at the controller end, or the OS requesting a feature during load that the drive doesn't support, which can be entirely normal when probing device capabilities.



Finally, a note about CRC errors and error rates: all drives have an error rate. Drives store data at such high densities that a certain number of bit errors is expected and designed for by using an error-correcting code. The code guarantees that up to a certain number of bit errors per chunk of bits can occur and still be fully corrected. The drive applies it constantly, and it is designed so that the chance of an unrecoverable error occurring randomly is very low (as in, significantly less likely than winning the lottery) in a well-functioning drive. If you see an error rate in any stats and it's treated as no big deal, that's because it isn't: these are just corrected errors.
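To make the idea of "a certain number of bit errors per chunk being fully corrected" concrete, here is a toy Python sketch of a Hamming(7,4) code, which repairs any single flipped bit in a 7-bit chunk. It is only an illustration of the principle: SSD controllers typically use much stronger codes (such as BCH or LDPC), and the function names here are my own.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity over positions 4, 5, 6, 7
    error_pos = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_pos:
        c[error_pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
stored = hamming74_encode(data)
stored[4] ^= 1  # simulate one bit going bad in storage
print(hamming74_decode(stored) == data)  # True
```

A second flipped bit in the same chunk would exceed what this small code can repair, which is the "unrecoverable error" case the paragraph above describes as very unlikely.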






answered Feb 22 at 13:58, edited Feb 22 at 14:09
– thomasrutter











  • The user reported frequent issues with his system in the comments: "Sometimes freezes, today it went read only on me, and a black screen with busybox on it."
    – Robert Riedl
    Feb 22 at 14:45










  • I didn't see the info about freezes or boot issues, as it was hidden in comments. Nonetheless, the physical health of the SSD does not look like the cause of those problems.
    – thomasrutter
    Feb 22 at 14:50











  • But the CRC count is rising (compare the Windows screenshot with the Linux output). You are right that normally this wouldn't be cause for concern, but the number is relatively high, it keeps rising, and the system has issues.
    – Robert Riedl
    Feb 22 at 14:58


























up vote
0
down vote













Since you have only WRITE DMA errors, and the short and long tests show no errors, and since DMA is about direct memory access, try to find out whether the BIOS has a separate hardware diagnostics test and run the memory-related tests.



If no BIOS-embedded test is available, check the manufacturer's support site for an offline hardware diagnostic (e.g. a bootable ISO file to burn to a CD or write to a USB stick).



(BTW: an Ubuntu CD also has memory diagnostics.)



Because a DMA write is I/O, I would also try replacing the SATA cable and then check whether any new errors are logged (the last one here is number 298, but more may have been added by now).
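Checking whether new errors keep appearing comes down to comparing the error count read at different times. A minimal Python sketch, assuming you note the count down by hand from the smartctl output after each change; the function names are my own, and every reading other than 298 is hypothetical.

```python
def error_deltas(counts):
    """Given error counts read at successive times, return the increase between readings."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def still_rising(counts):
    """True if any reading is higher than the one before it."""
    return any(delta > 0 for delta in error_deltas(counts))


# 298 is the count quoted above; the later readings are hypothetical.
after_cable_change = [298, 298, 298]
print(still_rising(after_cable_change))  # False
```

If the count stays flat over a few days of normal use after swapping or re-seating the cable, the connection was the likely culprit; if it keeps climbing, the cause lies elsewhere.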






edited Feb 23 at 13:40

























answered Feb 22 at 13:38









jringoot


















 
