Configuring Platform Event Filters

This chapter includes the following sections:

Platform Event Filters

A platform event filter (PEF) can trigger an action and generate an alert when a critical hardware-related event occurs. For each PEF, you can choose the action to be taken (or take no action) when a platform event occurs. You can also choose to generate and send an alert when a platform event occurs. Alerts are sent as an SNMP trap, so you must configure an SNMP trap destination before the alerts can be sent.

You can globally enable or disable the generation of platform event alerts. When disabled, alerts are not sent even if PEFs are configured to send them.

Enabling Platform Event Alerts

SUMMARY STEPS

  1. Server# scope fault
  2. Server /fault # set platform-event-enabled yes
  3. Server /fault # commit
  4. Server /fault # show [detail]

DETAILED STEPS

  Command or Action Purpose

Step 1

Server# scope fault

Enters the fault command mode.

Step 2

Server /fault # set platform-event-enabled yes

Enables platform event alerts.

Step 3

Server /fault # commit

Commits the transaction to the system configuration.

Step 4

Server /fault # show [detail]

(Optional) Displays the platform event alert configuration.

Example

This example enables platform event alerts:

Server# scope fault
Server /fault # set platform-event-enabled yes
Server /fault *# commit
Server /fault # show
Platform Event Enabled
-----------------------
yes                   

Server /fault #         

Disabling Platform Event Alerts

SUMMARY STEPS

  1. Server# scope fault
  2. Server /fault # set platform-event-enabled no
  3. Server /fault # commit
  4. Server /fault # show [detail]

DETAILED STEPS

  Command or Action Purpose

Step 1

Server# scope fault

Enters the fault command mode.

Step 2

Server /fault # set platform-event-enabled no

Disables platform event alerts.

Step 3

Server /fault # commit

Commits the transaction to the system configuration.

Step 4

Server /fault # show [detail]

(Optional) Displays the platform event alert configuration.

Example

This example disables platform event alerts:

Server# scope fault
Server /fault # set platform-event-enabled no
Server /fault *# commit
Server /fault # show
Platform Event Enabled
-----------------------
no

Server /fault #         

Configuring Platform Event Filters

You can configure actions and alerts for the following platform event filters:

ID Platform Event Filter

1

Temperature Critical Assert Filter

2

Temperature Warning Assert Filter

3

Voltage Critical Assert Filter

4

Processor Assert Filter

5

Memory Critical Assert Filter

6

Drive Slot Assert Filter

7

LSI Critical Assert Filter

8

LSI Warning Assert Filter

SUMMARY STEPS

  1. Server# scope fault
  2. Server /fault # scope pef id
  3. Server /fault/pef # set action {none | reboot | power-cycle | power-off}
  4. Server /fault/pef # commit

DETAILED STEPS

  Command or Action Purpose

Step 1

Server# scope fault

Enters the fault command mode.

Step 2

Server /fault # scope pef id

Enters the platform event filter command mode for the specified event.

See the Platform Event Filter table for event ID numbers.

Step 3

Server /fault/pef # set action {none | reboot | power-cycle | power-off}

Selects the desired system action when this event occurs. The action can be one of the following:

  • none —No system action is taken.

  • reboot —The server is rebooted.

  • power-cycle —The server is power cycled.

  • power-off —The server is powered off.

Step 4

Server /fault/pef # commit

Commits the transaction to the system configuration.

Example

This example configures the platform event alert for an event:

Server# scope fault
Server /fault # scope pef 1
Server /fault/pef # set action reboot
Server /fault/pef *# commit
Server /fault/pef # show
Platform Event Filter Event                               Action      Send Alert         
--------------------- ----------------------------------- ----------- ----------- 
1                     Temperature Critical Assert Filter  reboot      yes              

Server /fault/pef #      

What to do next

If you configure any PEFs to send an alert, complete the following tasks:

  • Enable platform event alerts

  • Configure SNMP trap settings

Interpreting Platform Event Traps

A CIMC platform event alert sent as an SNMP trap contains an enterprise object identifier (OID) in the form 1.3.6.1.4.1.3183.1.1.0.event. The first ten fields of the OID represent the following information: iso(1).org(3).dod(6).internet(1).private(4).enterprises(1).wired_for_management(3183).PET(1).version(1).version(0), indicating an IPMI platform event trap (PET) version 1.0 message. The last field is an event number, indicating the specific condition or alert being notified.

Platform Event Trap Descriptions

The following table provides a description of the event being notified in a platform event trap message, based on the event number in the trap OID.

Event Number [Note 1] Platform Event Description
0 0h Test Trap
65799 010107h Temperature Warning
65801 010109h Temperature Critical
131330 020102h Under Voltage, Critical
131337 020109h Voltage Critical
196871 030107h Current Warning
262402 040102h Fan Critical
459776 070400h Processor related (IOH-Thermalert/Caterr sensor) – predictive failure deasserted
459777 070401h Processor related (IOH-Thermalert/Caterr sensor) – predictive failure asserted
460032 070500h Processor Power Warning – limit not exceeded
460033 070501h Processor Power Warning – limit exceeded
524533 0800F5h Power Supply Critical
524551 080107h Power Supply Warning
525313 080401h Discrete Power Supply Warning
527105 080B01h Power Supply Redundancy Lost
527106 080B02h Power Supply Redundancy Restored
552704 086F00h Power Supply Inserted
552705 086F01h Power Supply Failure
552707 086F03h Power Supply AC Lost
786433 0C0001h Correctable ECC Memory Errors, Release 1.3(1) and later releases, filter set to accept all reading types [Note 4]
786439 0C0007h DDR3_INFO sensor LED - RED bit asserted (Probable ECC error on a DIMM), Generic Sensor [Notes 2,3]
786689 0C0101h Correctable ECC Memory Errors, Release 1.3(1) and later releases
818945 0C7F01h Correctable ECC Memory Errors, Release 1.2(x) and earlier releases
818951 0C7F07h DDR3_INFO sensor LED - RED bit asserted (Probable ECC error on a DIMM), 1.2(x) and earlier releases [Note 3]
851968 0D0000h HDD sensor indicates no fault, Generic Sensor [Note 2]
851972 0D0004h HDD sensor indicates a fault, Generic Sensor [Note 2]
854016 0D0800h HDD Absent, Generic Sensor [Note 2]
854017 0D0801h HDD Present, Generic Sensor [Note 2]
880384 0D6F00h HDD Present, no fault indicated
880385 0D6F01h HDD Fault
880512 0D6F80h HDD Not Present
880513 0D6F81h HDD is deasserted but not in a fault state
884480 0D7F00h Drive Slot LED Off
884481 0D7F01h Drive Slot LED On
884482 0D7F02h Drive Slot LED fast blink
884483 0D7F03h Drive Slot LED slow blink
884484 0D7F04h Drive Slot LED green
884485 0D7F05h Drive Slot LED amber
884486 0D7F01h Drive Slot LED blue
884487 0D7F01h Drive Slot LED read
884488 0D7F08h Drive Slot Online
884489 0D7F09h Drive Slot Degraded

Note

 

When the event filter is set to accept all reading types, bits 15:8 of the hex event number are masked to 0. For example, event number 786689 (0C0101h) becomes 786433 (0C0001h).