
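Impact of the BBU Learn Cycle on MegaSAS RAID Cards

Background

Recently, some servers with MegaSAS RAID cards saw IO load suddenly spike during peak business hours, with IO performance dropping sharply. After going through the logs and various settings, the cause turned out to be that the RAID card's cache write policy had changed from WriteBack to WriteThrough. The underlying reason was that the BBU had entered a Learn Cycle, which automatically switched the cache policy to WriteThrough.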

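WriteBack and WriteThrough

Before starting, two terms need to be introduced: WriteBack and WriteThrough.

  1. WriteBack: on a write, data is written to the RAID card's cache and the call returns immediately; the RAID controller flushes the data to disk when system load is low or the cache is full. This setting greatly improves the card's write performance and in the vast majority of cases lowers system IO load. Data safety is guaranteed by the card's BBU (Battery Backup Unit).
  2. WriteThrough: writes bypass the cache and go straight to disk. The card's write performance drops, and in most cases this setting raises system IO load.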

Cache policy of MegaSAS RAID cards

For LSI MegaSAS RAID cards, the default cache policy is: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

How to check the RAID card's cache policy

root@hostname:~# ./MegaCli -LDInfo -Lall -aALL

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name:
RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
Size: 557.861 GB
Mirror Data: 557.861 GB
State: Optimal
Strip Size: 128 KB
Number Of Drives: 2
Span Depth: 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy: Disabled
Encryption Type: None
Is VD Cached: No

Exit Code: 0x00

  • Default Cache Policy: the default cache policy; each RAID volume can have its own setting.
  • Current Cache Policy: the cache policy currently in effect.

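Policy fields explained

  1. First field: WriteBack, WriteThrough
  2. Second field: ReadAheadNone, ReadAdaptive, ReadAhead.
    • ReadAheadNone: read-ahead is disabled. This is the default.
    • ReadAhead: on a read, the data that follows sequentially is preloaded into the cache. This improves sequential reads but can reduce random-read performance.
    • ReadAdaptive: adaptive read-ahead. Sequential read-ahead is used when cache memory and IO are idle, balancing sequential and random read performance at the cost of some processing power.
  3. Third field: Direct, Cached.
    • Direct: Direct IO mode. Reads are not buffered in cache memory; data is transferred to the cache and to the application at the same time, and if the same data block is read again it is served from cache memory. This is the default.
    • Cached: Cached IO mode. All reads are buffered in cache memory.
  4. Fourth field: Write Cache OK if Bad BBU, No Write Cache if Bad BBU
    • Write Cache OK if Bad BBU: keep using the write cache even when the BBU has a problem (for example a failed battery). This carries some risk of data loss.
    • No Write Cache if Bad BBU: do not use the write cache when the BBU has a problem.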
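
As a small illustration of the four-field layout described above, the sketch below pulls a single field out of a policy string. The function name `policy_field` is hypothetical (it is not part of MegaCli), and the sketch assumes the fields are comma-separated exactly as MegaCli prints them.

```shell
# Print one field of a MegaCli cache policy string.
# $1 = policy string, e.g.
#      "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
# $2 = field number (1 = write policy, 2 = read-ahead,
#      3 = IO mode, 4 = behaviour when the BBU is bad)
# policy_field is a hypothetical helper name, not a MegaCli feature.
policy_field() {
    printf '%s\n' "$1" | awk -F',' -v n="$2" '{
        f = $n
        gsub(/^[ \t]+|[ \t]+$/, "", f)   # trim surrounding whitespace
        print f
    }'
}
```

For example, `policy_field "$policy" 1` prints the write policy, which is the field that flips to WriteThrough during a learn cycle.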

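The automatic policy switch problem

Because MegaSAS RAID cards default to No Write Cache if Bad BBU, the write cache policy can change on its own (from WriteBack to WriteThrough), degrading write performance. If this automatic change happens during peak hours when system IO load is already high, it can cause unpredictable problems such as the system hanging. The following conditions trigger a write cache policy change.

  1. The RAID card enters a BBU Learn Cycle: described in detail below.
  2. A battery fault is detected, such as low battery capacity. This is usually the result of battery aging; IBM recommends replacing the RAID card battery once a year.
  3. No battery is installed. Some servers are purchased without a battery, so the policy is automatically set to WriteThrough.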

How to temporarily force the write cache on when the BBU has a problem

./MegaCli -LDSetProp CachedBadBBU -Lall -aALL
./MegaCli -LDSetProp WB -Lall -aALL
# The following command reverts the setting
./MegaCli -LDSetProp NoCachedBadBBU -Lall -aALL

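BBU Learn Cycle

The BBU consists of a lithium-ion battery and electronic control circuitry. The life of a lithium-ion battery depends on its age: from the time it leaves the factory, regardless of whether it has been charged or how many charge and discharge cycles it has gone through, its capacity slowly declines. This means an old battery cannot last as long as a new one, and it is why the BBU's Relative State of Charge will not equal its Absolute State of Charge.
To record the battery's discharge curve so that the controller knows the battery's condition (for example its maximum and minimum voltage), and to extend battery life, Auto Learn mode is enabled by default. During a learn cycle the RAID controller does not use the BBU until calibration completes. The whole process can take up to 12 hours. During it, WriteBack mode is disabled to protect data integrity, which also reduces performance. A Learn Cycle has three steps:

  1. The controller charges the BBU battery to full (this step may be a discharge followed by a charge, or a direct charge; if the battery is already full, it goes straight to the second step).
  2. Calibration begins and the BBU battery is discharged.
  3. After the discharge, calibration finishes and charging restarts. The Learn Cycle is complete only once the battery reaches maximum charge.

Note: if the second or third step is interrupted, the calibration stops and is not restarted.

IBM servers default to one Learn Cycle every 30 days, Dell servers to one every 90 days. Disabling Auto Learn mode is not recommended: calibration extends battery life, and a RAID card whose battery is never calibrated will see battery life drop from the normal 2 years to 8 months.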

Viewing the current BBU Learn settings

root@hostname:~# ./MegaCli -AdpBbuCmd -GetBbuProperties -aALL

BBU Properties for Adapter: 0

Auto Learn Period: 2592000 Sec
Next Learn time: 394618008 Sec
Learn Delay Interval: 0 Hours
Auto-Learn Mode: Enabled

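  • Auto Learn Period: interval between automatic calibrations, in seconds. IBM servers default to one Learn Cycle every 30 days, Dell servers to one every 90 days. This setting cannot be modified.
  • Next Learn time: time of the next automatic calibration, in seconds since January 1, 2000. This setting cannot be modified; it is computed from the completion time of the previous calibration plus the Auto Learn Period. When converting it to wall-clock time, the RAID card's clock error must be added; on some RAID cards the time is still wrong even after conversion to GMT.

The real time is calculated as follows (pseudocode):

# SystemUnixTime and RaidUnixTime are the system clock and the RAID card clock as Unix timestamps
RealTime=$(( NextLearnTime + (SystemUnixTime - RaidUnixTime) ))
date -d "UTC 2000-01-01 + $RealTime secs"
  • Learn Delay Interval: delay applied once an automatic calibration is due, in hours, up to a maximum of 7 days. It applies only to the next Learn Cycle; once that cycle completes, the value resets to zero.
  • Auto-Learn Mode: whether automatic calibration is enabled.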
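
The conversion described for Next Learn time can be sketched as two small shell functions. This is only an illustration: the function names and the `EPOCH_2000` variable are hypothetical, the parser assumes the `Next Learn time: N Sec` line format shown in the output above, and the clock skew between the system and the RAID card has to be measured separately and passed in.

```shell
# Seconds from the Unix epoch (1970-01-01 UTC) to 2000-01-01 00:00:00 UTC.
EPOCH_2000=946684800

# Read `MegaCli -AdpBbuCmd -GetBbuProperties` output on stdin and print the
# raw "Next Learn time" value (seconds since 2000-01-01).
# Assumes the "Next Learn time: N Sec" line format.
next_learn_raw() {
    awk -F':' '/Next Learn time/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Convert the raw value to a Unix timestamp.
# $1 = raw Next Learn time
# $2 = clock skew in seconds (system Unix time minus RAID card Unix time);
#      defaults to 0, i.e. the card clock is assumed correct.
next_learn_epoch() {
    echo $(( EPOCH_2000 + $1 + ${2:-0} ))
}
```

With GNU date, the result can be printed in readable form with `date -d "@$(next_learn_epoch "$raw" "$skew")"`, where `raw` comes from piping the MegaCli output into `next_learn_raw`.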

Viewing the current BBU status

root@hostname:~# MegaCli -AdpBbuCmd -GetBbuStatus -aALL

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 3837 mV
Current: -152 mA
Temperature: 23 C
Battery State: Operational

BBU Firmware Status:

  Charging Status: Discharging
  Voltage: OK
  Temperature: OK
  Learn Cycle Requested: Yes
  Learn Cycle Active: Yes
  Learn Cycle Status: OK
  Learn Cycle Timeout: No
  I2c Errors Detected: No
  Battery Pack Missing: No
  Battery Replacement required: No
  Remaining Capacity Low: No
  Periodic Learn Required: No
  Transparent Learn: No
  No space to cache offload: No
  Pack is about to fail & should be replaced: No
  Cache Offload premium feature required: No
  Module microcode update required: No
... (remainder omitted) ...

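  1. Charging Status: the battery's current state. Values include Charging, Discharging and None, meaning the battery is charging, discharging, or doing neither.
  2. Learn Cycle Requested: a Learn Cycle has been requested. When this is Yes and Learn Cycle Active below is No, the first step of the Learn Cycle has started; the policy switches to WriteThrough at this point, and the battery goes through either a discharge followed by a charge or a direct charge.
  3. Learn Cycle Active: whether the Learn Cycle is in its calibration phase. If Yes, the second step of the Learn Cycle has begun and the controller is calibrating the battery.
  4. Battery Replacement required: whether the battery needs to be replaced. If Yes, replace the battery as soon as possible.
  5. Remaining Capacity Low: remaining capacity is low. If Yes, the battery needs to be replaced.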
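
The way the two Learn Cycle flags combine can be expressed as a small sketch. The function names and the phase labels (`preparing`, `calibrating`, `idle`) are made up for illustration, and the parser assumes `Label: Value` lines as MegaCli prints them. Only the first two steps are distinguished, since those are the only ones the flags described above identify.

```shell
# Print the value of one "Label : Value" line from MegaCli output on stdin.
# $1 = exact label text, e.g. "Learn Cycle Active". Hypothetical helper.
bbu_field() {
    awk -F':' -v key="$1" '
        { label = $1; gsub(/^[ \t]+|[ \t]+$/, "", label) }
        label == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v; exit }'
}

# Read `MegaCli -AdpBbuCmd -GetBbuStatus` output on stdin and print:
#   calibrating  Learn Cycle Active is Yes (second step)
#   preparing    Learn Cycle Requested is Yes, Active is not (first step)
#   idle         neither flag is set
learn_cycle_phase() {
    local status requested active
    status=$(cat)
    requested=$(printf '%s\n' "$status" | bbu_field "Learn Cycle Requested")
    active=$(printf '%s\n' "$status" | bbu_field "Learn Cycle Active")
    if [ "$active" = "Yes" ]; then
        echo "calibrating"
    elif [ "$requested" = "Yes" ]; then
        echo "preparing"
    else
        echo "idle"
    fi
}
```

A monitoring check could pipe the status output into `learn_cycle_phase` and alert whenever the result is not `idle`.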

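How to force a Learn Cycle

The following command forces a calibration. It takes effect a few seconds after it is run, and the policy automatically changes to WriteThrough.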

root@hostname:~# MegaCli -AdpBbuCmd -BbuLearn -aALL

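This command can be used to roughly adjust when the next automatic calibration runs, but not with 100% accuracy:

  • The completion time of the current Learn Cycle cannot be calculated precisely; it depends on how fast the battery discharges and charges.
  • The next battery relearn may be postponed for various reasons. For example, if the battery is charging at that moment, the whole relearn is deferred until charging has finished.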

How to check whether the current cache policy has changed

Compare Default Cache Policy with Current Cache Policy; if they differ, the policy has changed.

root@hostname:~# MegaCli -LDInfo -Lall -aALL
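
The comparison can be automated with a short awk filter. This is a sketch under the assumption that the output uses `Virtual Drive:`, `Default Cache Policy:` and `Current Cache Policy:` lines as MegaCli normally prints them; the function name `cache_policy_drift` is hypothetical.

```shell
# Read `MegaCli -LDInfo -Lall -aALL` output on stdin and print one line for
# every virtual drive whose Current Cache Policy differs from its
# Default Cache Policy. Prints nothing when no policy has changed.
cache_policy_drift() {
    awk '
        /^Virtual Drive[ \t]*:/        { vd = $3 }
        /^Default Cache Policy[ \t]*:/ { sub(/^[^:]*:[ \t]*/, ""); def = $0 }
        /^Current Cache Policy[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")
            if ($0 != def)
                printf "VD %s: default=[%s] current=[%s]\n", vd, def, $0
        }'
}
```

Typical use would be `MegaCli -LDInfo -Lall -aALL | cache_policy_drift`; any output means at least one volume is no longer running its default policy.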

How to switch Learn mode to manual

echo 'autoLearnMode=1' > /tmp/megaraid.conf
MegaCli -AdpBbuCmd -SetBbuProperties -f /tmp/megaraid.conf -aALL
# 1 = Disable, 0 = Enable; when switching from Disable to Enable, a relearn runs immediately
# Confirm that the change took effect
MegaCli -AdpBbuCmd -GetBbuProperties -aALL

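Recommendations

Recommended cache policy: use No Write Cache if Bad BBU, sacrificing performance to keep data safe when the BBU has a problem.
WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

There are several possible approaches:

  • Force a Learn Cycle on the BBU during off-peak hours. Each completed Learn Cycle shifts the next automatic one later by the duration of the cycle just finished, typically about 5 hours (5-6 hours, depending on how long the whole cycle takes). Based on the drift actually observed, run a manual Learn Cycle periodically during off-peak hours (usually 02:00~05:00).
  • Switch to manual mode and trigger the Learn Cycle periodically from crontab or by other manual means. With this approach the interval must be chosen per hardware, and a wrong interval will shorten battery life. Use 30 days for IBM machines and 90 days for Dell machines.
  • Check the time of the next Learn Cycle and, just before it starts, set Write Cache OK if Bad BBU so that the write cache policy does not change during the cycle; switch back to the original setting afterwards. With this approach data is unprotected during the Learn Cycle (roughly 5 hours): a power failure in that window will cause data loss.
  • Check the time of the next Learn Cycle and trigger it 1~2 days early during off-peak hours. This approach works best and is the most convenient, but it needs a dedicated script to calculate the time of the next Learn Cycle.

Recommended practice: keep Auto Learn mode enabled and use crontab to periodically force a relearn on the RAID card: check the time of the next Learn Cycle and trigger it 1~2 days early during off-peak hours (usually 02:00~05:00).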
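
The decision at the heart of such a cron job can be sketched as a pure function, which keeps it testable without a RAID card. Everything here is illustrative: the function name is hypothetical, the 2-day lead time and the 02:00 to 05:00 window are taken from the recommendation above, and the next learn time is assumed to have already been converted to a Unix timestamp.

```shell
# Decide whether a learn cycle should be triggered now.
# $1 = current Unix time
# $2 = next automatic learn time as a Unix timestamp
# $3 = current hour of day (0-23, a leading zero is accepted)
# $4 = lead time in days (optional, default 2)
# Returns 0 (true) when the next automatic learn is due within the lead time
# and the current hour is inside the 02:00-05:00 off-peak window.
should_trigger_learn() {
    local now=$1 next=$2 hour=$3 lead_days=${4:-2}
    local remaining=$(( next - now ))
    [ "$remaining" -le $(( lead_days * 86400 )) ] &&
        [ "$hour" -ge 2 ] && [ "$hour" -lt 5 ]
}

# Possible use from an hourly cron script (next_epoch computed elsewhere):
#   should_trigger_learn "$(date +%s)" "$next_epoch" "$(date +%H)" &&
#       MegaCli -AdpBbuCmd -BbuLearn -aALL
```

Because a forced learn pushes Next Learn time forward by a full period, an hourly job built this way would normally fire at most once per period.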

References

  1. Self Q&A on RAID and BBU
  2. ServeRAID-M Series Battery Backup Unit (BBU) charge cycle behavior and cache modes – IBM System x

Original article: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5078572

ServeRAID-M Series Battery Backup Unit (BBU) charge cycle behavior and cache modes – IBM System x



Source

RETAIN tip: H174520

Symptom

Below are the most frequently asked questions about the behavior of a Battery Backup Unit (BBU) configured with IBM ServeRAID M Series Serial Attached SCSI (SAS) or Serial Advanced Technology Attachment (SATA) controllers on supported IBM System x servers:

  • What is Autolearn mode?
  • How do I manually enable Write-Back mode on a ServeRAID adapter?
  • What causes the battery “Relative State of Charge” to not reach 100 percent charge completion?
  • Why is the battery “Remaining Capacity” not the same as “Full Capacity?”
  • How long will it take to finish a battery relearn cycle after a ServeRAID M Series controller has initiated this action?
  • Why does the IBM ServeRAID M Series controller’s Write Cache remain disabled while a battery learn cycle is in process?
  • What is the proper way to store ServeRAID M batteries?

Affected configurations

The system is configured with at least one of the following:

  • MegaRAID Storage Manager, any version

The system is configured with one or more of the following IBM Options:

  • IBM MegaRAID 8480 SAS PCI-Express RAID adapter, Option part number 39R8850, replacement part numbers 39R8852 – Adapter, 39R8853 – Battery
  • ServeRAID M5014 SAS or SATA Controller, Option part number 46M0916, replacement part number 46M0918
  • ServeRAID M5015 SAS or SATA Controller, Option part number 46M0829, replacement part number 46M0851 – Adapter
  • ServeRAID M5025 SAS or SATA Controller, Option part number 46M0830, replacement part number 46M0854
  • ServeRAID-MR10M SAS or SATA Controller, Option part number 43W4339, replacement part number 43W4341 – Adapter
  • ServeRAID-MR10i SAS/SATA Controller, Option part number 43W4296, any replacement part number (CRU)
  • ServeRAID-MR10ie (CIOv) Controller for IBM BladeCenter, Option part number 46C7167, replacement part number 46C7171
  • ServeRAID-MR10is Vault SAS or SATA Controller, Option part number 44E8695, replacement part number 44E8696 – Adapter
  • ServeRAID-MR10k SAS or SATA Controller, Option part number 43W4280, replacement part number 43W4282 – Adapter

This tip is not system specific.

Note: This does not imply that the network operating system will work under all combinations of hardware and software.

Please see the compatibility page for more information: http://www.ibm.com/systems/info/x86servers/serverproven/compat/us/

Additional information

The BBU is composed of a Lithium-Ion (Li-Ion) battery and electronic control circuitry. A unique characteristic of the Li-Ion battery is that its life span is dependent upon aging (shelf life). From the time of manufacturing, regardless of whether it was charged or the number of charge or discharge cycles, the battery will decline slowly and predictably in capacity.

This means that an older battery will not last as long as a new battery solely due to its age. This is a main reason why “Relative State of Charge” of the BBU is not going to be equal to “Absolute State of Charge,” as by design batteries are consumable goods and degrade over time.

Note: IBM recommends replacing the battery after one (1) year of service.

Before a BBU can be used, it has to be calibrated. The controller will not use the BBU until the calibration is done. This can take up to 12 hours to complete. Until then, it will disable Write-Back cache on any logical drive for data integrity reasons, resulting in temporarily reduced performance. The controller identifies this fact on Power On Self Test (POST) with an error message. The calibration (Autolearn mode) is a process whereby the controller records the battery discharge curve in order to know the battery autonomy, in addition to maximum and minimum voltages. It is split into three (3) steps:

Step 1: Begin calibration. The controller charges the BBU to maximum capacity.
Step 2: The controller discharges the BBU.
Step 3: The controller recharges the BBU. When maximum capacity is reached, the process is finished.

Note: If either Step 2 or Step 3 is interrupted, the learning process stops and will not restart.

With the ServeRAID firmware, this calibration will start automatically after 30 days of battery operation (either when installed in the factory or after a BBU upgrade). For best performance, run Autolearn mode manually as soon as the BBU is put into service by using the following MegaRAID Command Line Interface (MegaCLI) command:

MegaCli -AdpBbuCmd -BbuLearn -aALL

The Write-Back cache policy can be checked or changed with the MegaRAID Storage Manager (MSM) application, WebBIOS, or the MegaCLI tool. Use the following CLI commands:

MegaCli -LDSetProp -WB -Lall -aALL
MegaCli -LDSetProp -NoCachedBadBBU -Lall -aALL

Write-Back can also be enabled regardless of the BBU status, or even when no BBU is populated. When doing this, however, keep in mind that an Uninterruptible Power Supply (UPS) is required to secure data. If no UPS is used, there is absolutely no warranty of data protection. Use the following CLI commands:

MegaCli -LDSetProp -WB -Lall -aALL
MegaCli -LDSetProp -CachedBadBBU -Lall -aALL

Once the BBU is completely charged, the controller will automatically change the cache policy back to Write-Back.

It is not recommended to disable the “Learn Cycle” of the BBU as it will shorten the service life of the battery to one-third (approximately eight (8) months) of its original useful life of two (2) years.

It is not an uncommon event for the “Relative state of charge” of the BBU not to reach 100%. Relative state of charge is an indication of full charge capacity percentage in relation to the design capacity. Multiple battery learn cycles may move this state upward or downward.

For more information on ServeRAID batteries, refer to RETAIN tip H001648 at the following URL:


ServeRAID batteries have limited useful life – IBM ServeRAID Controller



Source

RETAIN tip: H001648

Symptom

This is an official IBM statement release for IBM ServeRAID Controllers with backup battery options.

This RETAIN Tip is designed to assist users when ServeRAID Controller batteries have reached end of life.

Affected configurations

The system is configured with one or more of the following IBM Options:

  • IBM MegaRAID 8480 SAS PCI-Express RAID adapter, Option part number 39R8850, replacement part number (CRU) 39R8852 – Adapter
  • MegaRAID 8480 SAS PCI-Express RAID Controller – Battery, any replacement part number (CRU)
  • ServeRAID M5000 Series Battery Kit, Option part number 46M0917, any replacement part number (CRU)
  • ServeRAID M5015 SAS/SATA Controller, Option part number 46M0829, replacement part number (CRU) 46M0851
  • ServeRAID-3H Ultra2 SCSI Controller – Battery Backup 32MB Cache, Option part number 28L1003, any replacement part number (CRU)
  • ServeRAID-3HB Ultra2 SCSI Controller – Battery Backup 32MB Cache, any replacement part number (CRU)
  • ServeRAID-4H Ultra160 SCSI Controller – Cache Battery, any replacement part number (CRU)
  • ServeRAID-4H Ultra160 SCSI Controller, Option part number 37L6889, replacement part number (CRU) 37L6892 – Adapter
  • ServeRAID-4M Ultra160 SCSI Controller (Japan), Option part number 19K0565, replacement part number (CRU) 00N9543 – Adapter
  • ServeRAID-4M Ultra160 SCSI Controller, Option part number 37L6080, replacement part number (CRU) 37L7258 – Adapter
  • ServeRAID-4Mx Ultra160 SCSI Controller, Option part number 06P5736, replacement part number (CRU) 06P5737 – Adapter
  • ServeRAID-5i Controller, Option part number 25P3492, replacement part number (CRU) 02R0970 – Adapter
  • ServeRAID-6M Controller (128 MB Cache), Option part number 32P0033, replacement part number (CRU) 39R8821 – Adapter
  • ServeRAID-6M Controller (256 MB Cache), Option part number 02R0988, replacement part number (CRU) 39R8822 – Adapter
  • ServeRAID-6i Controller, Option part number 39R8793, replacement part number (CRU) 71P8627 – Adapter
  • ServeRAID-6i+ Controller – Cache Battery for 6M and 6i, replacement part number (FRU) 71P8628, any replacement part number (CRU)
  • ServeRAID-6i+ Controller, Option part number 13N2190, replacement part number (CRU) 13N2195 – Adapter
  • ServeRAID-7k Controller – Battery pack, any replacement part number (CRU)
  • ServeRAID-7k Controller, Option part number 39R8800 replaces 71P8642, replacement part number (CRU) 71P8644 – Adapter
  • ServeRAID-8i Controller, Option part number 13N2227, replacement part number (CRU) 39R8731 – Adapter
  • ServeRAID-8k SAS Controller, Option part number 25R8064, replacement part number (CRU) 25R8076 – Adapter
  • ServeRAID-8s SAS PCIe Controller, Option part number 39R8765, replacement part number (CRU) 39R8785 – Adapter
  • ServeRAID-MR10M SAS/SATA Controller, Option part number 43W4339, replacement part number (CRU) 43W4341 – Adapter
  • ServeRAID-MR10i SAS/SATA Controller, Option part number 43W4296, replacement part number (CRU) 43W4297 – Adapter
  • ServeRAID-MR10ie (CIOv) Controller – Battery, Option part number 46M0800, any replacement part number (CRU)
  • ServeRAID-MR10is Vault SAS/SATA Controller, Option part number 44E8695, replacement part number (CRU) 44E8696 – Adapter
  • ServeRAID-MR10k SAS/SATA Controller, Option part number 43W4280, replacement part number (CRU) 43W4282 – Adapter

This tip is not system specific.

This tip is not software specific.

The system has the symptom described above.

Solution

None, this is a permanent restriction, there will be no solution.

Additional information

IBM strongly recommends checking the battery’s health as part of a scheduled maintenance plan. IBM ServeRAID batteries can be consumable items, depending on the machine type, model of the host server, and the replacement part number.

Prior to sending out the replacement part, IBM employees will check each affected host server to determine whether or not the battery is a consumable part and advise the customer accordingly.

Users should install and monitor battery conditions using the IBM ServeRAID Manager application for the following IBM ServeRAID controller models: 4M, 4H, 4Mx, 5i, 6i, 6i+, 6M, 7k, 8i, 8k, and 8s. All have cache backup batteries of which the battery status can be checked on the controller properties Status tab.

Users should install and monitor battery conditions using the IBM MegaRAID Storage Manager (MSM) application for the following controller models: MegaRAID 8480, IBM ServeRAID MR10i, MR10is, MR10ie, MR10M, MR10k, and M5015. If installed, the cache backup battery status can be checked on the battery backup unit (BBU) properties tab.

Once batteries have reached their maximum lifespan, users can purchase a new replacement battery at their convenience.

Based upon availability, batteries will be shipped on the next business day.

Users can order the IBM ServeRAID batteries as a Customer Replaceable Unit (CRU) from one of the following:

By telephone, in the US:

1 (800)388-7080, press option 2, press option 1 and provide the IBM FRU part number.

By web, in the US:

http://www.ibm.com/shop/americas/content/home/store_IBMPublicUSA/en_US/parts/parts_main.html

By telephone, in the UK:

01475 897202, press option 2, press option 1 and provide the IBM FRU part number.

By web, in the UK:

http://www-304.ibm.com/shop/europe/webapp/wcs/stores/servlet/default/CategoryDisplay?catalogId=-826&storeId=826&langId=826&dualCurrId=20&categoryId=3063814

As a convenience to our users, listed below are battery CRU/Field Replaceable Unit (FRU) replacement part numbers.

  • ServeRAID 4x Controller Battery (FRU 37L6903)
  • ServeRAID 5i Controller Battery (FRU 25P3482)
  • ServeRAID 6i Controller Battery (FRU 39R8799)
  • ServeRAID 7k Controller Battery (FRU 39R8804)
  • ServeRAID 8k Controller Battery (FRU 25R8088)
  • ServeRAID 8i and 8s Controller Battery (FRU 25R8118)
  • ServeRAID MR10is and MR10M Controller Battery (FRU 46C9040)
  • ServeRAID MR10i Li-Ion Controller Battery (FRU 46C9040)
  • ServeRAID MR10i NiMH Controller Battery (FRU 43W4301)
  • ServeRAID MR10ie Controller Battery (FRU 90Y9406)
  • ServeRAID MR10k Controller Battery (FRU 43W4283)
  • ServeRAID M5015 Controller Battery (FRU 46C9040)
  • MegaRAID 8480 Controller Battery (FRU 39R8853)

Refer to RETAIN tip H19142 (MIGR-5077853) for details on the transition from NiMH to Li-Ion batteries.