PCDVD Digital Technology Forum (PCDVD數位科技討論區)
litfal
Regular Member
 

Joined: Aug 2005
Posts: 62
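Does leaving ERC disabled make RAID 5 member drives more prone to errors?

"Errors" feels like the wrong word, but I can't think of a better one right now.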

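Here is the situation:
A little less than a year ago I built a machine with
Dell PERC 6/i + Hitachi 5K3000 x4
in RAID 5.
Last Monday it suddenly froze (the OS is on a separate SSD). After rebooting, MSM warned that the RAID 5 array was degraded; it turned out one of the drives had dropped out.
I pulled it and put in a Seagate LP 2TB to rebuild. I was worried there would be problems, but the rebuild finished without any trouble...
As for the problem drive, I checked it on its own in another machine: SMART 05 = 1274, C4 = 2609, C5 = 70, but after one full read/write pass over the whole drive, C5 went back to zero. The HD Tune scan and speed test both looked normal. I even filled the whole drive and verified checksums, and that was fine too.
Anyway, since it was less than a year old, I sent it to the distributor for free warranty service.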

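At first I assumed that buying four at once simply raised my odds of getting a bad one, but yesterday another drive dropped out while in use; this time the machine did not freeze. It was another of the Hitachi drives, on a different port.
I plan to look into the cause tomorrow, once the data has been copied off.
This time I decided to copy the data to standalone drives first; if it happened twice, it can happen a third time.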

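One failure could be bad luck; two suggests something is actually wrong.
If it were the PSU, I would expect things to fail more thoroughly.
Or poor cooling causing read/write errors? Hmm, that is somewhat plausible.
Or an incompatibility between the hardware? For example, ERC not being enabled on the 5K3000.
Or a bad cable?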

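The reason ERC is off is that when I enable it on the 5K3000, it is turned off again after a reboot, so I cannot keep it on. The 6/i does not support passthrough either, so I cannot enable it with a command after boot.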
     
      

Last edited by litfal on 2012-06-26 at 03:14 AM.
2012-06-26, 03:13 AM #1
vxr
Elite Member
 
Joined: May 2002
Location: On top of the Earth..
Posts: 5,854

You need to upload the firmware log...
or use lsigetwin to generate the script output and upload that..
 
2012-06-26, 08:33 AM #2
litfal
This is the firmware log from when it happened; I will find somewhere to upload the full version later.
Quote:
06/25/12 11:00:29: EVT#50597-06/25/12 11:00:29: 44=Time established as 06/25/12 11:00:29; (942466 seconds since power on)
06/25/12 11:03:19: DM: Chip 0 Paused
06/25/12 11:03:19: Chip <0> Slots: Cur=[247]
06/25/12 11:03:19: [ b]= 1 [f9]= 2 [fa]= 1
06/25/12 11:03:20: DM: Chip 0 Paused
06/25/12 11:03:20: Chip <0> Slots: Cur=[248]
06/25/12 11:03:20: [ b]= 1 [f9]= 2 [fa]= 1
06/25/12 11:03:21: DM: Timing wheel expired - Chip 0 Slot f9
06/25/12 11:03:21: Chip <0> Slots: Cur=[249]
06/25/12 11:03:21: [ b]= 1 [f9]= 2 [fa]= 1
06/25/12 11:03:21: Timedout RDM: 8086fe00, Cmd 1 DevId[2], State 20 Flags f1400005
06/25/12 11:03:21: Timedout RDM: 808bf600, Cmd 1 DevId[2], State 20 Flags f1480005
06/25/12 11:03:21: DM_TMWheelScanAllMsg (slot f9):Scanned 4, Timeout 2 Msg mask 21000
06/25/12 11:03:21: Task Mgmt Start Addr 815cf300 DevId[2] Index 0 chip 0 type 3 numCmdIssued 5
06/25/12 11:03:21: EVT#50598-06/25/12 11:03:21: 267=Command timeout on PD 02(e0xff/s2) Path 1221000002000000, CDB: 28 00 39 1e 67 00 00 01 00 00
06/25/12 11:03:21: Raw Sense for PD 2:`d
06/25/12 11:03:21: EVT#50599-06/25/12 11:03:21: 267=Command timeout on PD 02(e0xff/s2) Path 1221000002000000, CDB: 28 00 39 1e 66 00 00 01 00 00
06/25/12 11:03:21: Raw Sense for PD 2:`d
06/25/12 11:03:21: EVT#50600-06/25/12 11:03:21: 268=PD 02(e0xff/s2) Path 1221000002000000 reset (Type 03)
06/25/12 11:03:21: MPI_EVENT_SAS_DISCOVERY: PortBitmap ff - Discovery is in progress
06/25/12 11:03:21: MPI_EVENT_SAS_PHY_LINK_STATUS - PhyNum 2 DevHandle 3 Link 08
06/25/12 11:03:21: MPI_EVENT_SAS_DISCOVERY: PortBitmap 0 - Discovery is complete
06/25/12 11:03:21: Disc-prog= 0....resetProg=0 aenCount=0 transit=0
06/25/12 11:03:22: DM: Timing wheel expired - Chip 0 Slot fa
06/25/12 11:03:22: Chip <0> Slots: Cur=[250]
06/25/12 11:03:22: [ b]= 1 [fa]= 1
06/25/12 11:03:22: Timedout RDM: 80524a00, Cmd 1 DevId[2], State 20 Flags f1480005
06/25/12 11:03:22: DM_TMWheelScanAllMsg (slot fa):Scanned 4, Timeout 1 Msg mask 21000
06/25/12 11:03:22: EVT#50601-06/25/12 11:03:22: 267=Command timeout on PD 02(e0xff/s2) Path 1221000002000000, CDB: 28 00 39 1e 68 00 00 01 00 00
06/25/12 11:03:22: Raw Sense for PD 2:h?€跿?
06/25/12 11:03:25: MPI_EVENT_SAS_DEVICE_STATUS_CHANGE
06/25/12 11:03:25: MPT_EventDeviceStatusChange: Device Removed DevId 2 Tgt 2 Sas 12210000:02000000
06/25/12 11:03:25: curQdepth 4 WaitQCount 0 path 0 flags:f1580005
06/25/12 11:03:25: DEV_REMOVAL: Issue Abort Target 2
06/25/12 11:03:25: MPT_REC:TaskTerminated DevId[2] Tgt 2 pRdm 808cfa00 RdmStatus=0 RdmFlags=2, DevState=20 DevFlags=f1580005 TimeoutCount=0 RetryCount=0
06/25/12 11:03:25: MPT_REC: Cmd 3 Ioc 800bf710, iocSts 8048 iocLogInfo 31140000
06/25/12 11:03:25: MPT_REC:TaskTerminated DevId[2] Tgt 2 pRdm 80524a00 RdmStatus=f3 RdmFlags=0, DevState=2 DevFlags=f1580201 TimeoutCount=1 RetryCount=0
06/25/12 11:03:25: MPT_REC: Cmd 1 Ioc 800bf710, iocSts 8048 iocLogInfo 31130000
06/25/12 11:03:25: MPT_REC:TaskTerminated DevId[2] Tgt 2 pRdm 8086fe00 RdmStatus=f3 RdmFlags=0, DevState=2 DevFlags=f1580201 TimeoutCount=1 RetryCount=0
06/25/12 11:03:25: MPT_REC: Cmd 1 Ioc 800bf710, iocSts 8048 iocLogInfo 31130000
06/25/12 11:03:25: MPT_REC:TaskTerminated DevId[2] Tgt 2 pRdm 808bf600 RdmStatus=f3 RdmFlags=0, DevState=2 DevFlags=f1580201 TimeoutCount=1 RetryCount=0
06/25/12 11:03:25: MPT_REC: Cmd 1 Ioc 800bf710, iocSts 8048 iocLogInfo 31130000
06/25/12 11:03:25: Task Reply Complete 0 DevID[2] IOCStatus 0 MsgAddr 815cf300 type 3
06/25/12 11:03:25: chip 0 tgt 0 numCmdIssued 1 Qdepth 1 TermCount 4 DevWQC 3 ChipWQC 1
06/25/12 11:03:25: ********FOUND CMD IN CHIP WAIT Q **** DevId 2 Count 1 Qdepth 1
06/25/12 11:03:25: MPT_TaskMgmtPostRoutine DevId 2 Msg 0 Addr 815cf300 CurDevQDepth 1, chipQcount 1 taskType=3 retry 0
06/25/12 11:03:25: DevId [2] Reduce Queue Depth to 1 from 20
06/25/12 11:03:25: DM_RECOVERY_STATE_RESET_ALT_PATH : Rdm 808cfa00 Pd 2 Status f4
06/25/12 11:03:25: RESET_ALT_PATH: devID 2 Rdm 808cfa00 path 0
06/25/12 11:03:25: DM_DevicePathRemoved devId 2 Tid 2 Path 0
06/25/12 11:03:25: PD 2 Removed eviceCount=3
06/25/12 11:03:25: Clearing only path flags...!!f0000000 pd:2
06/25/12 11:03:25: DEV_FLAGS_PULL flag set but no removal pending for pd: 2
06/25/12 11:03:25: DM_PdScsiTypeSet: Pd 2 type 1f isSata 1
06/25/12 11:03:25: EVT#50602-06/25/12 11:03:25: 112=Removed: PD 02(e0xff/s2)
06/25/12 11:03:25: EVT#50603-06/25/12 11:03:25: 248=Removed: PD 02(e0xff/s2) Info: enclPd=ffff, scsiType=0, portMap=02, sasAddr=1221000002000000,0000000000000000
06/25/12 11:03:25: EVT#50604-06/25/12 11:03:25: 114=State change on PD 02(e0xff/s2) from ONLINE(18) to FAILED(11)
06/25/12 11:03:25: EVT#50605-06/25/12 11:03:25: 81=State change on VD 00/0 from OPTIMAL(3) to DEGRADED(2)
06/25/12 11:03:25: EVT#50606-06/25/12 11:03:25: 251=VD 00/0 is now DEGRADED
06/25/12 11:03:25: EVT#50607-06/25/12 11:03:25: 114=State change on PD 02(e0xff/s2) from FAILED(11) to UNCONFIGURED_BAD(1)
06/25/12 11:03:25: LoadBalance 0
06/25/12 11:03:26: LoadBalanceCmdBlock 0
06/25/12 11:03:26: Load Balance Statistics Path0PDs 3 Path1PDs 0
06/25/12 11:04:01: MPI_EVENT_SAS_DISCOVERY: PortBitmap ff - Discovery is in progress
06/25/12 11:04:01: MPI_EVENT_SAS_DISCOVERY: PortBitmap 0 - Discovery is complete
06/25/12 11:04:01: Disc-prog= 0....resetProg=0 aenCount=0 transit=0
06/25/12 11:18:25: DM_ReInitTimer : chipId=0, numSATATargetsPerQuad=3, gCurSATAMaxQDepth=20
06/25/12 11:30:20: EVT#50608-06/25/12 11:30:20: 113=Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 4d 00 4d 00 00 00 00 00 20 00, Sense: 5/24/00
06/25/12 11:30:20: Raw Sense for PD 0: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00
06/25/12 11:30:21: EVT#50609-06/25/12 11:30:21: 113=Unexpected sense: PD 01(e0xff/s1) Path 1221000001000000, CDB: 4d 00 4d 00 00 00 00 00 20 00, Sense: 5/24/00
06/25/12 11:30:21: Raw Sense for PD 1: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00
06/25/12 11:30:21: EVT#50610-06/25/12 11:30:21: 113=Unexpected sense: PD 03(e0xff/s3) Path 1221000003000000, CDB: 4d 00 4d 00 00 00 00 00 20 00, Sense: 5/24/00
06/25/12 11:30:21: Raw Sense for PD 3: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00
06/25/12 12:00:29: EVT#50611-06/25/12 12:00:29: 44=Time established as 06/25/12 12:00:29; (946066 seconds since power on)
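For anyone reading the command-timeout events in the log above: the `CDB:` field is the raw SCSI command, and opcode `28` is READ(10). Below is a minimal Python sketch that decodes it, assuming the standard READ(10) layout (bytes 2-5 are the big-endian starting LBA, bytes 7-8 the transfer length in blocks). `decode_read10` is a hypothetical helper name for illustration only; it is not part of MSM, lsigetwin, or any Dell/LSI tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB written as space-separated hex bytes.

    Hypothetical helper for illustration; not part of any LSI or Dell tool.
    """
    cdb = bytes.fromhex(cdb_hex)  # fromhex() skips the spaces between bytes
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("expected a 10-byte READ(10) CDB (opcode 0x28)")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: transfer length in blocks
    }


# The three CDBs from the command-timeout events (EVT#50598, 50599, 50601) above.
for cdb in (
    "28 00 39 1e 67 00 00 01 00 00",
    "28 00 39 1e 66 00 00 01 00 00",
    "28 00 39 1e 68 00 00 01 00 00",
):
    info = decode_read10(cdb)
    print(f"LBA {info['lba']:#010x}, {info['blocks']} blocks")
```

Read this way, the three timed-out commands are 256-block reads starting at LBA 0x391e6600, 0x391e6700 and 0x391e6800, that is, three back-to-back ranges on PD 02. That suggests the timeouts cluster around one region of the disk, though the log alone does not say why the drive stopped answering.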
2012-06-26, 02:42 PM #3
litfal
As for the previous incident, the detailed records seem to have been cleared already; all that is left is the hard-to-read History.
2012-06-26, 03:22 PM #4
vxr

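Quote:
Originally posted by litfal
As for the previous incident, the detailed records seem to have been cleared already; all that is left is the hard-to-read History.

You can use lsigetwin to generate a compressed script file and upload it...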
http://kb.lsi.com/KnowledgebaseArticle12278.aspx
2012-06-26, 03:24 PM #5
litfal
Actually, I just wrote a few lines of code to parse the firmware log and sort it by time, and found something surprising.

Quote:
47895: 2012-4-22,17:58:58 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b3 c4 00 00 01 00 00
47896: 2012-4-22,17:58:58 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b7 05 90 00 00 70 00
47897: 2012-4-22,17:58:58 Warning: PD 00(e0xff/s0) Path 1221000000000000 reset (Type 03)
47898: 2012-4-22,17:58:59 Information: Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b3 c4 00 00 01 00 00, Sense: 6/29/00
47899: 2012-4-22,17:59:41 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 31 00 00 01 00 00
47900: 2012-4-22,17:59:41 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 32 00 00 01 00 00
47901: 2012-4-22,17:59:41 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 34 00 00 01 00 00
47902: 2012-4-22,17:59:41 Warning: PD 00(e0xff/s0) Path 1221000000000000 reset (Type 03)
47903: 2012-4-22,17:59:42 Information: Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 34 00 00 01 00 00, Sense: 6/29/00
47904: 2012-4-22,18:0:3 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 38 00 00 01 00 00
47905: 2012-4-22,18:0:3 Warning: PD 00(e0xff/s0) Path 1221000000000000 reset (Type 03)
47906: 2012-4-22,18:0:4 Information: Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b3 d0 00 00 01 00 00, Sense: 6/29/00
47907: 2012-4-22,18:0:20 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 38 00 00 01 00 00
47908: 2012-4-22,18:0:20 Warning: PD 00(e0xff/s0) Path 1221000000000000 reset (Type 03)
47909: 2012-4-22,18:0:21 Information: Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 38 00 00 01 00 00, Sense: 6/29/00
47910: 2012-4-22,18:0:34 Warning: Command timeout on PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 38 00 00 01 00 00
47911: 2012-4-22,18:0:34 Warning: PD 00(e0xff/s0) Path 1221000000000000 reset (Type 03)
47912: 2012-4-22,18:0:34 Warning: Error on PD 00(e0xff/s0) (Error f0)
47913: 2012-4-22,18:0:34 Information: State change on PD 00(e0xff/s0) from ONLINE(18) to FAILED(11)
47914: 2012-4-22,18:0:34 Information: State change on VD 00/0 from OPTIMAL(3) to DEGRADED(2)
47915: 2012-4-22,18:0:34 Critical: VD 00/0 is now DEGRADED
47916: 2012-4-22,18:0:35 Information: Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, CDB: 28 00 04 b4 39 00 00 01 00 00, Sense: 6/29/00
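A note on the `Sense:` triples in these entries: they are sense key / ASC / ASCQ in hex. The short Python sketch below looks them up, assuming the standard SCSI assignments for the two combinations that appear in this thread; the table is deliberately limited to those two, and `describe_sense` is a hypothetical helper name.

```python
import re

# Standard SCSI sense keys and ASC/ASCQ codes, limited to the values seen in this thread.
SENSE_KEYS = {0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x24, 0x00): "INVALID FIELD IN CDB",
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
}

# Matches the "Sense: key/asc/ascq" suffix of an "Unexpected sense" event.
SENSE_RE = re.compile(r"Sense: ([0-9a-fA-F]+)/([0-9a-fA-F]+)/([0-9a-fA-F]+)")


def describe_sense(line: str):
    """Return a readable description of the sense triple in a log line, or None."""
    m = SENSE_RE.search(line)
    if m is None:
        return None
    key, asc, ascq = (int(group, 16) for group in m.groups())
    key_name = SENSE_KEYS.get(key, f"sense key {key:#x}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ {asc:02x}/{ascq:02x}")
    return f"{key_name}: {detail}"


line = ("Unexpected sense: PD 00(e0xff/s0) Path 1221000000000000, "
        "CDB: 28 00 04 b3 c4 00 00 01 00 00, Sense: 6/29/00")
print(describe_sense(line))
```

On this reading, every `6/29/00` in the quote is the drive reporting that it has just been reset, which matches the `reset (Type 03)` entry logged immediately before each one. The sense entries are therefore a consequence of the controller's resets rather than a separate fault.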


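It looks like the array was already degraded back in April, but I did not get the MSM notification and start the rebuild until June 13..
I have not set up Mail Alert, but that machine is in regular use (I use it myself too), so if a warning had popped up it should have been easy to notice.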
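For reference, the parse-and-sort step can be sketched roughly as follows. This is not the code I actually ran, only a minimal Python illustration, and it assumes the raw firmware log format shown in post #3 (`EVT#<seq>-<MM/DD/YY HH:MM:SS>: <code>=<text>`), with the date order inferred from the dates in this thread. `Event`, `parse_events` and `first_degraded` are hypothetical names.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional


class Event(NamedTuple):
    when: datetime  # timestamp embedded in the event itself
    seq: int        # EVT# sequence number
    code: int       # numeric event code before the '='
    text: str       # human-readable description


# Matches e.g. "EVT#50605-06/25/12 11:03:25: 81=State change on VD 00/0 ..."
EVT_RE = re.compile(r"EVT#(\d+)-(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}): (\d+)=(.*)")


def parse_events(lines: Iterable[str]) -> List[Event]:
    """Extract EVT# entries and return them sorted by time, then sequence number."""
    events = []
    for line in lines:
        m = EVT_RE.search(line)
        if m is None:
            continue  # debug lines without an EVT# marker are skipped
        seq, stamp, code, text = m.groups()
        # Assumption: the log writes dates as MM/DD/YY.
        when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        events.append(Event(when, int(seq), int(code), text.strip()))
    return sorted(events)  # NamedTuples compare field by field: when, seq, ...


def first_degraded(events: List[Event]) -> Optional[Event]:
    """Return the earliest event that reports a change to DEGRADED, if any."""
    return next((e for e in events if "to DEGRADED" in e.text), None)


# Sample lines copied from the raw log in post #3, deliberately out of order.
SAMPLE = [
    "06/25/12 11:03:25: EVT#50605-06/25/12 11:03:25: 81=State change on VD 00/0 from OPTIMAL(3) to DEGRADED(2)",
    "06/25/12 11:03:21: DM: Timing wheel expired - Chip 0 Slot f9",
    "06/25/12 11:03:21: EVT#50598-06/25/12 11:03:21: 267=Command timeout on PD 02(e0xff/s2) Path 1221000002000000, CDB: 28 00 39 1e 67 00 00 01 00 00",
]

events = parse_events(SAMPLE)
hit = first_degraded(events)
if hit is not None:
    print(f"{hit.when:%Y-%m-%d %H:%M:%S} EVT#{hit.seq}: {hit.text}")
```

Sorting on the timestamp inside the event, rather than on file order, matters here because the saved log also contains records from before I owned the card, so the entries do not form a single clean sequence.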
2012-06-26, 04:20 PM #6
vxr

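Quote:
Originally posted by litfal
Actually, I just wrote a few lines of code to parse the firmware log and sort it by time, and found something surprising.



It looks like the array was already degraded back in April, but I did not get the MSM notification and start the rebuild until June 13..
I have not set up Mail Alert, but that machine is in regular use (I use it myself too), so if a warning had popped up it should have been easy to notice.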

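The information you have given...
to me...
only shows that a timeout occurred..
But I want to see everything, including the whole BIOS initializing information..
as well as all the information about the RAID controller.
So I need the complete firmware log, or the full output captured by the lsigetwin script...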
2012-06-26, 04:24 PM #7
litfal
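The text file exported from MSM with Save Firmware Log:
http://www.mediafire.com/?q0ps1f15jedp2rt
The lsigetwin output:
http://www.mediafire.com/?61987mkaas240oo

Password: abc

I only added a password because I do not want other people to see it; thanks for your help.
It actually also contains a lot of records from the previous owner (or the previous N owners) XD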
2012-06-26, 05:20 PM #8
vxr

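Quote:
Originally posted by litfal
The text file exported from MSM with Save Firmware Log:
http://www.mediafire.com/?q0ps1f15jedp2rt
The lsigetwin output:
http://www.mediafire.com/?61987mkaas240oo

Password: abc

I only added a password because I do not want other people to see it; thanks for your help.
It actually also contains a lot of records from the previous owner (or the previous N owners) XD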

File Blocked for Violation.

The file you requested has been blocked for a violation of our Terms of Service.

If you believe you have reached this page in error, please contact support.
Click here to view our help resources
2012-06-26, 05:28 PM #9
litfal
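MediaFire (MF) seems to check the referring URL.
Please copy the link and paste it directly into the browser's address bar to open it.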
2012-06-26, 05:55 PM #10



All times are GMT +8.

