• Sun Fire X4100 M2 GRASP board CR1 blinking

    From leam hall@21:1/5 to All on Thu Jan 4 10:44:50 2018
    And the server won't boot. I have two and tried swapping out the PSUs, the GRASP board, and the BR 2032 battery. While doing this the top cover is off so I can see what is blinking.

    So far the GRASP board CR1 comes up green, then after a bit goes to slow blinking. No error lights on the front or back, but pushing the "start her up" button does nothing.

    Thoughts?

    Leam

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From leam hall@21:1/5 to All on Sun Jan 14 05:28:16 2018
    I've gotten a Minicom session connected to the Mgmt port. Trying:

    cd /SYS
    start

    Gives me:

    start: Failed to start /SYS

    Pushing the button on front gives nothing. No amber light. Both power supplies present.

    How do I interpret the logs for issues?

    show list

    /SP/logs/event/list
    Targets:

    Properties:

    Commands:
    cd
    show

    ID Date/Time Class Type Severity
    ----- ------------------------ -------- -------- --------
    5415 Thu Jan 4 04:02:13 2018 Audit Log minor
    root : Set : object = /SYS/power_state : value = on : error
    5414 Thu Jan 4 04:02:09 2018 Audit Log minor
    KCS Command : Set ACPI Power State : system power state = 0x0 : device power state = no change : success
    5413 Thu Jan 4 03:59:00 2018 Audit Log minor
    root : Set : object = /SYS/power_state : value = on : error
    5412 Thu Jan 4 03:58:55 2018 Audit Log minor
    KCS Command : Set ACPI Power State : system power state = 0x0 : device power state = no change : success
    5411 Thu Jan 4 03:56:21 2018 Audit Log minor
    root : Set : object = /SYS/power_state : value = on : error
    5410 Thu Jan 4 03:56:16 2018 Audit Log minor
    KCS Command : Set ACPI Power State : system power state = 0x0 : device power state = no change : success
    5409 Thu Jan 4 03:50:39 2018 Audit Log minor
    root : Open Session : object = /session/type : value = shell : success
    5408 Thu Jan 4 03:50:20 2018 IPMI Log critical
    ID = e80 : 01/04/2018 : 03:50:20 : Entity Presence : io.id1.prsnt : Device Present
    5407 Thu Jan 4 03:40:18 2018 IPMI Log critical
    ID = e7d : 01/04/2018 : 03:40:18 : Entity Presence : io.id1.prsnt : Device Present
    5406 Thu Jan 4 03:41:49 2018 IPMI Log critical
    ID = e7a : 01/04/2018 : 03:41:49 : Power Supply : ps1.pwrok : State Deasserted
    5405 Thu Jan 4 03:30:21 2018 IPMI Log critical
    ID = e79 : 01/04/2018 : 03:30:21 : Entity Presence : io.id1.prsnt : Device Present
    5404 Thu Jan 4 03:36:37 2018 IPMI Log critical
    ID = e76 : 01/04/2018 : 03:36:37 : Power Supply : ps1.vinok : State Deasserted
    5403 Thu Jan 4 03:20:32 2018 IPMI Log critical
    ID = e75 : 01/04/2018 : 03:20:22 : Entity Presence : io.id1.prsnt : Device Present
    5402 Thu Jan 4 03:20:22 2018 IPMI Log critical
    ID = e74 : 01/04/2018 : 03:20:22 : Voltage : mb.v_bat : Lower Non-critical going low : reading 2.59 < threshold 2.62 Volts
    5401 Thu Jan 4 03:22:56 2018 IPMI Log critical
    ID = e71 : 01/04/2018 : 03:22:56 : Power Supply : ps1.vinok : State Deasserted
    5400 Thu Jan 4 03:22:55 2018 IPMI Log critical
    ID = e70 : 01/04/2018 : 03:22:55 : Power Supply : ps1.pwrok : State Deasserted
    5399 Thu Jan 4 03:21:02 2018 IPMI Log critical
    ID = e6f : 01/04/2018 : 03:21:02 : Power Supply : ps0.pwrok : State Asserted
    5398 Thu Jan 4 03:20:58 2018 IPMI Log critical
    ID = e6e : 01/04/2018 : 03:20:58 : Power Supply : ps0.vinok : State Asserted
    5397 Thu Jan 4 03:20:32 2018 IPMI Log critical
    ID = e6d : 01/04/2018 : 03:20:22 : Entity Presence : io.id1.prsnt : Device Present
    5396 Thu Jan 4 03:20:32 2018 IPMI Log critical
    ID = e6c : 01/04/2018 : 03:20:22 : Power Supply : ps0.vinok : State Deasserted
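
One way to get a first read on a pasted event list like the one above is to split it into entries and keep only the ones that mention a deasserted signal, a voltage "going low", or a failed Set. This is only a rough sketch: the function names, the `SAMPLE` text, and the choice of keywords are assumptions made for illustration, not anything ILOM provides. The Severity column is ignored because, in this log, even "Device Present" events are tagged critical.

```python
import re

# One event header: ID, timestamp, class, type, severity (layout as pasted in this thread).
HEADER = re.compile(
    r"(?P<id>\d+)\s+"
    r"(?P<when>[A-Z][a-z]{2} [A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<cls>\w+)\s+(?P<type>\w+)\s+(?P<severity>\w+)"
)


def parse_events(text):
    """Split pasted event-list output into a list of dicts, newest first as pasted."""
    matches = list(HEADER.finditer(text))
    events = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # The console wraps long detail lines mid-word, so the pieces are
        # rejoined without a separator (a wrap that fell on a space loses it).
        detail = "".join(part.strip() for part in text[m.end():end].splitlines())
        events.append({**m.groupdict(), "detail": detail})
    return events


def worth_a_look(event):
    """Hypothetical filter: keywords chosen from the entries seen in this thread."""
    d = event["detail"]
    return "Deasserted" in d or "going low" in d or d.endswith(": error")


SAMPLE = """
5415 Thu Jan 4 04:02:13 2018 Audit Log minor
root : Set : object = /SYS/power_state : value = on : error
5409 Thu Jan 4 03:50:39 2018 Audit Log minor
root : Open Session : object = /session/type : value = shell : success 5408 Thu Jan 4 03:50:20 2018 IPMI Log critical
ID = e80 : 01/04/2018 : 03:50:20 : Entity Presence : io.id1.prsnt : Devi
ce Present
5406 Thu Jan 4 03:41:49 2018 IPMI Log critical
ID = e7a : 01/04/2018 : 03:41:49 : Power Supply : ps1.pwrok : State Deas
serted
"""

events = parse_events(SAMPLE)
for ev in events:
    if worth_a_look(ev):
        print(ev["id"], ev["when"], "|", ev["detail"])
```

Because the header pattern is searched for anywhere in the text, an entry that got fused onto the end of the previous detail line (as with 5408 in the sample) is still picked up as its own event.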

  • From leam hall@21:1/5 to All on Sun Jan 14 09:20:34 2018
    Reset the server and got the following in the event logs.


    5420 Thu Jan 4 07:43:28 2018 Audit Log minor
    root : Open Session : object = /session/type : value = shell : success
    5419 Thu Jan 4 07:40:19 2018 IPMI Log critical
    ID = e84 : 01/04/2018 : 07:40:18 : Entity Presence : io.id1.prsnt : Device Present
    5418 Thu Jan 4 07:41:06 2018 IPMI Log critical
    ID = e81 : 01/04/2018 : 07:41:06 : Power Supply : ps1.vinok : State Deasserted

  • From Scott Packard@21:1/5 to All on Mon Jan 15 10:55:52 2018
    On another vendor’s x64 server if it can’t see RAM then it won’t start. It’s been a while since I’ve used Sun x64-based hardware. In the old days a blade chassis needed power for about 1 hour before I could power on a blade; I’m wondering how many minutes power must be applied to an x4100 before attempting power on. I’d use a DMM to measure voltage into the power supplies, to make sure it met the minimum spec for the power supply, then try reseating RAM.

    I’d put the cover back on; docs mention a chassis intrusion switch.

    You’ve Googled for sun fire x4100m2 service manual, I assume.

    Verifying cause of NO chassis power:

    Visually inspect each power supply for the status of the AC Present, Power OK, and Fault LEDs. If the Fault LED is illuminated on any of the PSUs then further troubleshooting will be required.

    If AC Present is NOT illuminated, ensure the AC power cords are securely plugged into the server and connected to working AC power outlet(s).

    If Power OK is NOT illuminated, but AC Present IS, then further troubleshooting will be required. Refer to the system Servers Service Manual and Servers Diagnostics Guide for additional troubleshooting steps.
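
The three LED checks above boil down to a small decision table. A minimal sketch follows, with a made-up function name and return strings; the last branch (all LEDs as expected) is an assumption, since the checklist only says what to do when something looks wrong.

```python
def psu_next_step(ac_present: bool, power_ok: bool, fault: bool) -> str:
    """Map one PSU's LED states to the next step from the checklist above."""
    if fault:
        return "Fault LED lit: further troubleshooting required"
    if not ac_present:
        return "AC Present dark: check power cords and the AC outlet"
    if not power_ok:
        return "AC Present lit but Power OK dark: see Service Manual / Diagnostics Guide"
    # Assumption: nothing in the checklist covers this case explicitly.
    return "PSU LEDs look normal: look elsewhere"


# Example: both green LEDs lit on the rear, no fault LED.
print(psu_next_step(ac_present=True, power_ok=True, fault=False))
```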


    Display System Event Logs, and sensor & fault indicator information:

    ILOM:
    show /SP/logs/event/list
    show -d properties -level all /SYS
    show -o table -level all /SP/faultmgmt (Not available in all ILOM versions).

    Regards, Scott

  • From DoN. Nichols@21:1/5 to Scott Packard on Mon Jan 15 23:45:10 2018
    On 2018-01-15, Scott Packard <spackard@gmail.com> wrote:

    On another vendor's x64 server if it can't see RAM then it won't
    start. It's been awhile since I've used Sun x64-based hardware. In the
    old days a blade chassis needed power for about 1 hour before I could
    power on a blade; I'm wondering how many minutes power must be applied
    to an x4100 before attempting power on.

    O.K. I haven't observed that with mine. It may be an artifact
    of a tired configuration battery. (See man page indicated below about that.)

    I'd use a DMM to measure voltage into the power supplies, to make sure
    it was the minimum spec for the power supply, then try reseating RAM.

    I'd put the cover back on; docs mention a chassis intrusion switch.

    Yes -- IIRC, it is a magnet in the cover which actuates a reed
    switch along one of the sides. The top *must* be on to allow the system
    to power on. IIRC, there is partial power to some diagnostic circuits
    to allow LEDs to indicate bad RAM DIMMs and such.

    Aside from that, one of the quoted log entries said that the
    battery voltage was below threshold, so you should replace that. I find
    this information (for the X4100M2) in the 819-1157-23 service manual, on
    PDF page 105.

    Without that, configuration settings will be lost when power
    goes away.

    You've Googled for sun fire x4100m2 service manual, I assume.

    A search for "819-1157-23.pdf" should lead you to it.

    Verifying cause of NO chassis power:

    Visually inspect each power supply for the status of the AC Present,
    Power OK, and Fault LEDs. If the Fault LED is illuminated on any of the
    PSUs then further troubleshooting will be required.

    If AC Present is NOT illuminated, ensure the AC power cords are
    securely plugged into the server and connected to working AC power
    outlet(s).

    If Power OK is NOT illuminated, but AC Present IS, then further troubleshooting will be required. Refer to the system Servers Service
    Manual and Servers Diagnostics Guide for additional troubleshooting
    steps.

    Good Luck,
    DoN.

    --
    Remove oil spill source from e-mail
    Email: <BPdnicholsBP@d-and-d.com> | (KV4PH) Voice (all times): (703) 938-4564
    (too) near Washington D.C. | http://www.d-and-d.com/dnichols/DoN.html
    --- Black Holes are where God is dividing by zero ---

  • From leam hall@21:1/5 to DoN. Nichols on Wed Jan 24 08:02:22 2018
    On Monday, January 15, 2018 at 6:45:57 PM UTC-5, DoN. Nichols wrote:
    On 2018-01-15, Scott Packard <> wrote:


    Hadn't set the group to send mails, sorry for the delay in replying.

    The servers were in the garage during the cold front, so a weak battery makes sense. I swapped out the one I could with a brand new one, but no luck. There was another thing that looked like a multi-battery pack in shrinkwrap but it didn't want to come off the board. Need to study it more when warmer.

    What I was looking at is Item 12 on page 21:

    https://docs.oracle.com/cd/E19121-01/sf.x4200m2/819-1157-23/819-1157-23.pdf

  • From DoN. Nichols@21:1/5 to leam hall on Thu Jan 25 03:11:47 2018
    On 2018-01-24, leam hall <leamhall@gmail.com> wrote:
    On Monday, January 15, 2018 at 6:45:57 PM UTC-5, DoN. Nichols wrote:
    On 2018-01-15, Scott Packard <> wrote:


    Hadn't set the group to send mails, sorry for the delay in replying.

    The servers were in the garage during the cold front so a weak battery
    makes sense.

    Also, the environment values on PDF page 184 of the document
    below may be your problem if you were trying to run it in the cold.

    ======================================================================
    Temperature 41 - 95 Deg F
    (operating) 5 - 35 Deg C

    Temperature -40 - 158 Deg F
    (storage) -40 - 70 Deg C
    ======================================================================
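
For a quick check of a garage reading against the limits in the table above, something like the following would do. It is only a sketch: the function names are made up, and the 25 Deg F example reading is hypothetical, not a measurement from this thread.

```python
# Limits taken from the table above, in Deg C.
OPERATING_C = (5.0, 35.0)
STORAGE_C = (-40.0, 70.0)


def f_to_c(deg_f):
    """Convert Fahrenheit to Celsius."""
    return (deg_f - 32) * 5 / 9


def classify_temp(deg_c):
    """Say whether a temperature is fine for running, only for storage, or neither."""
    if OPERATING_C[0] <= deg_c <= OPERATING_C[1]:
        return "operating"
    if STORAGE_C[0] <= deg_c <= STORAGE_C[1]:
        return "storage only"
    return "out of spec"


# Hypothetical cold-garage reading of 25 Deg F.
garage_c = f_to_c(25)
print(round(garage_c, 1), classify_temp(garage_c))
```

The Fahrenheit and Celsius columns of the table agree with each other: 41 and 95 Deg F convert to 5 and 35 Deg C, and -40 and 158 Deg F convert to -40 and 70 Deg C.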


    I swapped out the one I could with a brand new, but no
    luck. There was another thing that looked like a multi-battery pack in shrinkwrap but it didn't want to come off the board. Need to study it
    more when warmer.

    What I was looking at is Item 12 on page 21:

    https://docs.oracle.com/cd/E19121-01/sf.x4200m2/819-1157-23/819-1157-23.pdf

    PDF page 106 has a photo of the cell in its holder and
    instructions on replacing it.

    You might want to look up the section on resetting the CMOS
    memory (pages 81 and 87) as it may have been corrupted by the low
    voltage in the previous cell.

    Good Luck,
    DoN.

  • From leam hall@21:1/5 to DoN. Nichols on Tue Jan 30 11:36:54 2018
    Thanks! I re-replaced the battery and made sure it was turned the right way this time. Don't have a jumper so I used a small screwdriver blade to short between the two poles of the jumper. Plugged her back in and still no go.

    My bet is that the temperature was the issue. I *assume* the screwdriver made a good enough contact to serve as a jumper. Hmm...I wonder if I have any old hard drives laying around with jumpers on them...

    <some time later>
    Found a jumper. Went through four batteries, including the one from a once-working server that now doesn't want to work. :(

    Couple batteries were very close to the tolerance, 2.61 measured with 2.62 minimal. The server said it wasn't a critical error but still didn't come up.

  • From DoN. Nichols@21:1/5 to leam hall on Wed Jan 31 01:00:08 2018
    On 2018-01-30, leam hall <leamhall@gmail.com> wrote:
    On Wednesday, January 24, 2018 at 9:51:39 PM UTC-5, DoN. Nichols wrote:
    On 2018-01-24, leam hall <> wrote:
    On Monday, January 15, 2018 at 6:45:57 PM UTC-5, DoN. Nichols wrote:
    On 2018-01-15, Scott Packard <> wrote:


    Hadn't set the group to send mails, sorry for the delay in replying.

    The servers were in the garage during the cold front so a weak battery
    makes sense.

    Also, the environment values on PDF page 184 of the document
    below may be your problem if you were trying to run it in the cold.

    ======================================================================
    Temperature 41 - 95 Deg F
    (operating) 5 - 35 Deg C

    Temperature -40 - 158 Deg F
    (storage) -40 - 70 Deg C
    ======================================================================


    I swapped out the one I could with a brand new, but no
    luck. There was another thing that looked like a multi-battery pack in
    shrinkwrap but it didn't want to come off the board. Need to study it
    more when warmer.

    What I was looking at is Item 12 on page 21:

    https://docs.oracle.com/cd/E19121-01/sf.x4200m2/819-1157-23/819-1157-23.pdf

    PDF page 106 has a photo of the cell in its holder and
    instructions on replacing it.

    You might want to look up the section on resetting the CMOS
    memory (pages 81 and 87) as it may have been corrupted by the low
    voltage in the previous cell.

    Good Luck,
    DoN.

    Thanks! I re-replaced the battery and made sure it was turned the
    right way this time. Don't have a jumper so I used a small screwdriver
    blade to short between the two poles of the jumper. Plugged her back in
    and still no go.

    My bet is that the temperature was the issue. I *assume* the
    screwdriver made a good enough contact to serve as a jumper. Hmm...I
    wonder if I have any old hard drives laying around with jumpers on
    them...

    Most modern drives use smaller jumpers, though old enough ones
    might supply what you need.

    <some time later>

    Found a jumper. Went through four batteries, including the one from a once-working server that now doesn't want to work. :(

    Couple batteries were very close to the tolerance, 2.61 measured with
    2.62 minimal. The server said it wasn't a critical error but still
    didn't come up.

    I seem to remember that you have to have the cover in place
    during the power up to reset the data. You don't say whether it is
    closed or not, but there is a sensor (Magnet & Reed switch) to tell
    whether the cover is in place or not.

    Good Luck,
    DoN.

  • From leam hall@21:1/5 to DoN. Nichols on Wed Jan 31 13:38:28 2018
    Most of my stuff is old. Drives too. ;)

    Bought a new pack of batteries and have gone through a couple on the one server.

    5515 Thu Jan 4 11:23:26 2018 Audit Log minor
    root : Open Session : object = /session/type : value = shell : success
    5514 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef3 : 01/04/2018 : 11:20:32 : Voltage : mb.v_bat : Lower Non-recoverable going low : reading 0.62 < threshold 2.34 Volts
    5513 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef2 : 01/04/2018 : 11:20:27 : Voltage : mb.v_bat : Lower Critical going low : reading 0.62 < threshold 2.53 Volts
    5512 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef1 : 01/04/2018 : 11:20:22 : Entity Presence : io.id1.prsnt : Device Present


    Did the "jumper on", close lid, power up thing multiple times. The other server came back around, this one didn't. Is the "mb.v_bat" the CMOS battery or the one next to it?

    http://reuel.net/images/v4100_batteries.jpg

    Thanks!

    Leam

  • From DoN. Nichols@21:1/5 to leam hall on Thu Feb 1 03:14:45 2018
    On 2018-01-31, leam hall <leamhall@gmail.com> wrote:

    Most of my stuff is old. Drives too. ;)

    Bought a new pack of batteries and have gone through a couple on the one server.

    5515 Thu Jan 4 11:23:26 2018 Audit Log minor
    root : Open Session : object = /session/type : value = shell : success
    5514 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef3 : 01/04/2018 : 11:20:32 : Voltage : mb.v_bat : Lower Non-recoverable going low : reading 0.62 < threshold 2.34 Volts
    5513 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef2 : 01/04/2018 : 11:20:27 : Voltage : mb.v_bat : Lower Critical going low : reading 0.62 < threshold 2.53 Volts
    5512 Thu Jan 4 11:20:33 2018 IPMI Log critical
    ID = ef1 : 01/04/2018 : 11:20:22 : Entity Presence : io.id1.prsnt : Device Present


    Did the "jumper on", close lid, power up thing multiple times. The other server came back around, this one didn't. Is the "mb.v_bat" the CMOS battery or the one next to it?

    http://reuel.net/images/v4100_batteries.jpg

    Thanks!

    Leam


    Looking at that photo, and comparing it with PDF page 106 in 819-1157-23.pdf (printed page # 3-16), I think that you have the battery
    in the holder backwards.

    ======================================================================
    Note: Install the new battery in the holder with the same
    orientation (polarity) as the battery that you removed. The positive
    polarity, marked with a "+" symbol, should be facing toward the chassis
    center.
    ======================================================================

    and the '+' side is the larger flat side. Your photo shows it facing
    the handle, which is near the outer edge, not the chassis center.

    The black heat-shrink enclosed part I believe to be a very high
    value, low voltage capacitor, to maintain info in the memory when you
    are replacing the battery (if it was not so low that it had already lost
    data.)

    I've got an X4200, and an X4100M2, and I think that you have the
    X4100. I'm not going to shut down the two systems (which are both doing
    things which I need to keep running, given power) just so I can examine
    them.

    Good Luck,
    DoN.

  • From leam hall@21:1/5 to leam hall on Thu Feb 1 05:59:50 2018
    On Thursday, February 1, 2018 at 8:54:44 AM UTC-5, leam hall wrote:
    Don, I really appreciate all the help!

    The server self-identifies:

    product_name = SUN FIRE X4100 M2
    product_part_number = 602-4492-01


    I did the full 'jumper, power on, close lid' reset bit with a third battery. This time I let it sit for a few minutes. The low battery seems to be during the reset time given the gap in /SP/logs/event/list.

    At present "start /SYS" fails. It shows a PSU_FAULT and there are amber lights. However, both PSUs have two green lights on the rear.

    Next step is to try resetting PSUs and see what happens. Will keep you updated.

    Thanks!

    Leam

    Hrmph. Maybe I was wrong about the battery issue just being during the reset. This is with the third brand new CR2032 battery with the positive side facing the center of the motherboard. I pulled the plugs, reseated one PSU and pulled the other. Here's
    the /SP/logs/event/list:


    5539 Thu Jan 4 14:33:17 2018 Audit Log minor
    root : Open Session : object = /session/type : value = shell : success
    5538 Thu Jan 4 14:30:30 2018 IPMI Log critical
    ID = f0a : 01/04/2018 : 14:30:30 : Voltage : mb.v_bat : Lower Non-recoverable going low : reading 0.62 < threshold 2.34 Volts
    5537 Thu Jan 4 14:30:30 2018 IPMI Log critical
    ID = f09 : 01/04/2018 : 14:30:25 : Voltage : mb.v_bat : Lower Critical going low : reading 0.62 < threshold 2.53 Volts
    5536 Thu Jan 4 14:30:30 2018 IPMI Log critical
    ID = f08 : 01/04/2018 : 14:30:19 : Entity Presence : io.id1.prsnt : Device Present
    5535 Thu Jan 4 14:30:19 2018 IPMI Log critical
    ID = f07 : 01/04/2018 : 14:30:19 : Voltage : mb.v_bat : Lower Non-critical going low : reading 0.62 < threshold 2.62 Volts
    5534 Thu Jan 4 14:36:37 2018 IPMI Log critical
    ID = f05 : 01/04/2018 : 14:36:37 : Power Supply : ps1.pwrok : State Deasserted
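
The mb.v_bat entries in this thread show three lower thresholds (2.62, 2.53 and 2.34 Volts), so a reading can be sorted into a band with a few comparisons. A small sketch, assuming those three values are the only limits that matter; the names `THRESHOLDS` and `classify_vbat` are made up for illustration.

```python
# Lower thresholds for mb.v_bat as they appear in the event log in this thread,
# most severe first.
THRESHOLDS = [
    (2.34, "lower non-recoverable"),
    (2.53, "lower critical"),
    (2.62, "lower non-critical"),
]


def classify_vbat(reading):
    """Return the most severe band whose threshold the reading falls below."""
    for limit, label in THRESHOLDS:
        if reading < limit:  # the log reports "reading < threshold"
            return label
    return "ok"


# 0.62 and 2.59 come from the logs, 2.61 was measured by hand, 3.0 is a nominal fresh cell.
for volts in (0.62, 2.59, 2.61, 3.0):
    print(volts, classify_vbat(volts))
```

A reading of 0.62 V falls below all three limits at once, which matches the three "going low" events logged within the same few seconds.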

  • From leam hall@21:1/5 to All on Thu Feb 1 05:54:42 2018
    Don, I really appreciate all the help!

    The server self-identifies:

    product_name = SUN FIRE X4100 M2
    product_part_number = 602-4492-01


    I did the full 'jumper, power on, close lid' reset bit with a third battery. This time I let it sit for a few minutes. The low battery seems to be during the reset time given the gap in /SP/logs/event/list.

    At present "start /SYS" fails. It shows a PSU_FAULT and there are amber lights. However, both PSUs have two green lights on the rear.

    Next step is to try resetting PSUs and see what happens. Will keep you updated.

    Thanks!

    Leam

  • From DoN. Nichols@21:1/5 to leam hall on Fri Feb 2 04:21:31 2018
    On 2018-02-01, leam hall <leamhall@gmail.com> wrote:
    On Thursday, February 1, 2018 at 8:54:44 AM UTC-5, leam hall wrote:
    Don, I really appreciate all the help!

    The server self-identifies:

    product_name = SUN FIRE X4100 M2
    product_part_number = 602-4492-01


    I did the full 'jumper, power on, close lid' reset bit with a third
    battery. This time I let it sit for a few minutes. The low battery seems
    to be during the reset time given the gap in /SP/logs/event/list.

    At present "start /SYS" fails. It shows a PSU_FAULT and there are
    amber lights. However, both PSUs have two green lights on the rear.

    Next step is to try resetting PSUs and see what happens. Will keep
    you updated.

    Thanks!

    Leam

    Hrmph. Maybe I was wrong about the battery issue just being during the
    reset. This is with the third brand new CR2032 battery with the positive
    side facing the center of the motherboard. I pulled the plugs, reseated
    one PSU and pulled the other. Here's the /SP/logs/event/list:


    CR2032? I seem to remember that the X4100M2 and X4200 use a
    somewhat smaller cell. I went upstairs to see if I could find a saved
    low cell to refresh my memory on the size used. I had to go to
    Batteries Plus, and they had a few (not on display).

    If I am right, the larger diameter CR2032 can't make contact to
    both contacts in the holder at the same time.

    Do you have the original cells which you pulled from the system?
    Is it possible that someone previous to you managed to force in the
    CR2032 cell where it did not belong?

    But the photo in the manual does look like about CR2032 size.
    Maybe the weird size was in the T2000 instead.

    O.K. CR1225 in the T2000 -- that was what I was remembering.

    Sorry about the mis-remembered value.

    If that black heat-shrink wrapped cylinder is really a very high
    value low voltage capacitor -- the reverse cell mounting (if it makes
    proper contact with both sides of the cell, which it may not do) could
    have put a reverse charge on it. If so, that may take hours to recharge
    the capacitor.

    Good Luck,
    DoN.



  • From leam hall@21:1/5 to All on Sat Feb 3 09:11:26 2018
    Hadn't thought about the cap discharge. Both servers are now flaky. Plugged in one and will let it sit for the rest of the day. Will let you know how it comes out.

  • From leam hall@21:1/5 to leam hall on Tue Feb 6 03:11:59 2018
    On Saturday, February 3, 2018 at 12:11:27 PM UTC-5, leam hall wrote:
    Hadn't thought about the cap discharge. Both servers are now flaky. Plugged in one and will let it sit for the rest of the day. Will let you know how it comes out.

    Didn't help. Server powers up fine but won't start the OS.

  • From DoN. Nichols@21:1/5 to leam hall on Wed Feb 7 02:47:29 2018
    On 2018-02-06, leam hall <leamhall@gmail.com> wrote:
    On Saturday, February 3, 2018 at 12:11:27 PM UTC-5, leam hall wrote:
    Hadn't thought about the cap discharge. Both servers are now flaky. Plugged in one and will let it sit for the rest of the day. Will let you know how it comes out.

    Didn't help. Server powers up fine but won't start the OS.

    Hmm ... pretty much all that I can do. Is it possible that you
    have some bad RAM DIMMs in there? IIRC, there are LEDs to indicate
    every bad DIMM (and you need a lot of DIMMs to make a valid bank, IIRC).
    Also, there are LEDs to indicate bad CPUs.

    I only have the two systems (X4100M2 and X4200) and since they
    are in service I can't pull them down to check other things.

    All the fans are good, I hope?

    Good Luck,
    DoN.

  • From leam hall@21:1/5 to DoN. Nichols on Thu Feb 15 15:49:41 2018
    Hey Don, sorry it took me a bit to respond. Here's the funny part. I used a totally new battery, got the same results. Pulled the battery out and fired up the system, same results. So the battery doesn't matter at all.

    *sigh*

    Wish the RAM fit into my Dell box, at least.
