• Jammed (dead?) M4000 without SLA

    From Kay-Uwe Loebel@21:1/5 to All on Wed Sep 11 07:58:22 2019
    Hi,

    our professorship's SPARC Enterprise M4000, heavily used for student labs, suddenly went down (powered off) with the amber LED lit. :-(((

    Here are the relevant outputs:


    XSCF> poweron -d 0
    DomainIDs to power on:00
    Continue? [y|n] :y
    00 :Not powering on :Poweron canceled due to missing component.

    XSCF> showstatus
    * MBU_A Status:Faulted;
    * CPUM#0-CHIP#0 Status:Deconfigured;
    * CPUM#0-CHIP#1 Status:Deconfigured;
    ...

    XSCF> showboards -va
    XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
    ---- - -------- ----------- ---- ---- ---- ------- -------- ----
    00-0 * 00(00)   Assigned    n    n    n    Unknown Faulted  n

    XSCF> showlogs error
    Date: Aug 27 08:37:47 CEST 2019 Code: 60002108-7e010000-0508090810ff1017
    Status: Warning Occurred: Aug 27 08:37:43.968 CEST 2019
    FRU: /MBU_A
    Msg: MBC internal fatal error
    Date: Aug 27 08:38:07 CEST 2019 Code: 80006108-7b010000-0508071610ff1005
    Status: Alarm Occurred: Aug 27 08:38:06.113 CEST 2019
    FRU: /MBU_A
    Msg: MBC internal serious error

    XSCF> fmdump
    TIME UUID MSG-ID
    Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
    Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS

    XSCF> version -c xcp -v
    XSCF#0 (Active )
    XCP0 (Current): 1081
    OpenBoot PROM : 02.08.0000
    XSCF : 01.08.0004
    XCP1 (Reserve): 1081
    OpenBoot PROM : 02.08.0000
    XSCF : 01.08.0004
    OpenBoot PROM BACKUP
    #0: 02.08.0000
    #1: --.--.----
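
    For anyone who wants to keep such logs in a searchable form, here is a minimal Python sketch that splits `showlogs error` text like the above into one record per entry. The function name parse_showlogs and the list of field labels are my own assumptions, derived only from the output shown here and not from any XSCF specification.

```python
import re

# Field labels as they appear in the `showlogs error` output quoted above.
# Assumption: other XSCF firmware versions may print additional fields.
FIELD = re.compile(r"\b(Date|Code|Status|Occurred|FRU|Msg):\s*")


def parse_showlogs(text):
    """Split `showlogs error` text into one dict per log entry."""
    records = []
    for line in text.splitlines():
        # re.split with a capture group yields ['', key, value, key, value, ...]
        parts = FIELD.split(line.strip())
        for key, value in zip(parts[1::2], parts[2::2]):
            if key == "Date":  # every entry starts with a Date field
                records.append({})
            if records:
                records[-1][key] = value.strip()
    return records


# Sample copied from the output in this post.
SAMPLE = """\
Date: Aug 27 08:37:47 CEST 2019 Code: 60002108-7e010000-0508090810ff1017
Status: Warning Occurred: Aug 27 08:37:43.968 CEST 2019
FRU: /MBU_A
Msg: MBC internal fatal error
Date: Aug 27 08:38:07 CEST 2019 Code: 80006108-7b010000-0508071610ff1005
Status: Alarm Occurred: Aug 27 08:38:06.113 CEST 2019
FRU: /MBU_A
Msg: MBC internal serious error
"""

for entry in parse_showlogs(SAMPLE):
    print(entry["Occurred"], entry["Status"], entry["FRU"], entry["Msg"])
```

    With the two entries above this makes it easy to see that the "fatal" warning preceded the "serious" alarm by about 20 seconds.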


    It seems that the motherboard is broken, but I've read in 3 or 4
    sources that these MBC faults are sometimes spurious, i.e.
    software-related, so I tried hard to reset the error status.

    Unfortunately, re-testing the PSB fails:

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.


    Clearing the error condition manually (clearstatus /MBU_A) fails due to
    the lack of a service password:

    XSCF> enableservice
    Service Password:
    ...
    XSCF> service
    Account is not enabled for service mode.


    Even a factory reset didn't solve the problem:

    XSCF> dumpconfig -v file:///media/usb_msd/config.txt
    ...
    XSCF> restoredefaults -c factory
    ...
    XSCF> restoreconfig -v file:///media/usb_msd/config.txt

    The fltlog is now empty (fmdump prints nothing), but the amber LED lights
    up again as soon as the console shows

    start /scf/sbin/fmd (pid=719)

    being executed.


    You see the huge problem: What can I do to get the M4000 back in service?

    Many thanks in advance for any hints.

    Kind regards
    Kay-Uwe Loebel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Wed Sep 11 08:42:02 2019
    On 11/09/2019 06:58, Kay-Uwe Loebel wrote:
    Hi,

    our professorship's Sparc Enterprise M4000 heavily used for student labs suddenly went down (power off) with lighting amber LED. :-(((

    Here the related outputs:
    <snip>



    You see the huge problem: What can I do to get the M4000 back in service?


    In all honesty, the quickest (and cheapest in terms of time) method would be
    to get on eBay and buy another one, then swap the disks etc. (They are quite
    cheap).

    Then think about your DR policy :-)

    If you have another M series handy, get the disks in that and take flash archives of the OS and backup of the data.

    Then, assuming other kit is available, re-install.


    Many thanks in advance for every hint.

    Kind regards
    Kay-Uwe Loebel



    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

  • From Kay-Uwe Loebel@21:1/5 to All on Wed Sep 11 12:01:49 2019
    On 11.09.2019 at 09:42, YTC#1 wrote:

    In all honesty, the quickest (and time cheapest) method would be to get
    on Ebay and buy another one, then swap the disks etc. (They are quite
    cheap).

    A pragmatic solution, of course, but I fear that the university will not approve the purchase of a second obsolete piece of hardware.
    Moreover, the licence server for our CAD software is bound to the current
    hostid until the end of the year.

    Perhaps it's possible to install a more recent firmware (we have 1081)
    that allows the status to be reset by a simple power on/off with the key in
    the service position?
    Does anyone know why a further test of the PSB fails?

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    Then thing about your DR policy :-)

    ;-)
    I do, and some functions are already ported to a Linux server.
    Nevertheless, the M4000 has 2 PSUs and 2 system disks, which covers the main causes of disasters ...

    If you have another M series handy, get the disks in that and take flash archives of the OS and backup of the data.

    We have only the one Mxxx in the university.

    Kind regards
    Kay-Uwe Loebel

  • From Kay-Uwe Loebel@21:1/5 to All on Wed Sep 11 13:11:22 2019
    On 11.09.2019 at 12:26, YTC#1 wrote:

    The hostid is probably an easy solution. Many ways to skin a cat.

    I know the LD_PRELOAD disguise. ;-)

    Perhaps it's possible to install a more recent firmware (we have 1081),
    which allows the status reset by a simple power on/off with the key in
    the service position?

    And that is an old FW at that.

    Okay, so the 1050 should enable this feature?
    But where can I get it from, and is it possible to install firmware in the
    faulted state?

    Do you /someone know, why a further test of the PSB fails?

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    This thread may help, as you have been reseting the system <url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>

    Unfortunately, I can't access the thread without a Support Identifier ...

    I do, and some functions are already ported to a linux server.

    Booo, hisss !!!!

    Don't blame me, for at least 5 years I've been the last SPARC / Solaris user
    here ... ;-)

    We have the only one Mxxx in the university.

    What about any other Sparc boxes ? T series ? V series ?

    UltraSPARC 45 and SunBlade 2000 are still working excellently.

    Kind regards
    Kay-Uwe Loebel

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Wed Sep 11 11:26:14 2019
    On 11/09/2019 11:01, Kay-Uwe Loebel wrote:
    On 11.09.2019 at 09:42, YTC#1 wrote:

    In all honesty, the quickest (and time cheapest) method would be to get
    on Ebay and buy another one, then swap the disks etc. (They are quite
    cheap).

    A pragmatic solution of course, but I fear, that the university will not advocate the purchase of a second obsolete piece of hardware.
    Moreover the licence server for our CAD software is bound till the end
    of the year to the current hostid.

    The hostid is probably an easy solution. Many ways to skin a cat.

    Perhaps it's possible to install a more recent firmware (we have 1081),
    which allows the status reset by a simple power on/off with the key in
    the service position?

    And that is an old FW at that.

    Do you /someone know, why a further test of the PSB fails?

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    This thread may help, as you have been resetting the system <url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>



    Then thing about your DR policy :-)

    ;-)
    I do, and some functions are already ported to a linux server.

    Booo, hisss !!!!

    Nevertheless the M4000 has 2 PSUs and 2 system disks, the main causes of desasters ...

    If you have another M series handy, get the disks in that and take flash
    archives of the OS and backup of the data.

    We have the only one Mxxx in the university.


    What about any other Sparc boxes ? T series ? V series ?

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Wed Sep 11 16:17:08 2019
    On 11/09/2019 12:11, Kay-Uwe Loebel wrote:
    On 11.09.2019 at 12:26, YTC#1 wrote:

    The hostid is probably an easy solution. Many ways to skin a cat.

    I know the LD_PRELOAD disguise. ;-)

    Perhaps it's possible to install a more recent firmware (we have 1081),
    which allows the status reset by a simple power on/off with the key in
    the service position?

    And that is an old FW at that.

    Okay, the 1050 should enable this feature?
    But where get from and is it possible to install firmware in the faulted state?

    I take it you don't have any Oracle support?


    Do you /someone know, why a further test of the PSB fails?

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    This thread may help, as you have been reseting the system
    <url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>


    Unfortunately I can't access the thread without Support Identifier ...

    Oh, looks like I attached the wrong link anyway :-( <url:https://community.oracle.com/community/support/oracle_sun_technologies/sparc_m-series_servers>

    It is suggesting no CPU is attached to the domain


    I do, and some functions are already ported to a linux server.

    Booo, hisss !!!!

    Don't blame me, since at least 5 years I'm the last SPARC / Solaris user
    here ... ;-)

    Sounds like you really are the last now :-(


    We have the only one Mxxx in the university.

    What about any other Sparc boxes ? T series ? V series ?

    UltraSPARC 45 and SunBlade 2000 are still working excellently.


    I don't think that will roll.



  • From Keith Thompson@21:1/5 to Kay-Uwe Loebel on Wed Sep 11 14:24:04 2019
    Kay-Uwe Loebel <loebel@etit.tu-chemnitz.de> writes:
    On 11.09.2019 at 09:42, YTC#1 wrote:
    In all honesty, the quickest (and time cheapest) method would be to get
    on Ebay and buy another one, then swap the disks etc. (They are quite
    cheap).

    A pragmatic solution of course, but I fear, that the university will not advocate the purchase of a second obsolete piece of hardware.
    Moreover the licence server for our CAD software is bound till the end
    of the year to the current hostid.
    [...]

    Can you talk to your vendor about changing the hostid for the license
    server? I would hope they have provisions for dying hardware.

    --
    Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Will write code for food.
    void Void(void) { Void(); } /* The recursive call of the void */

  • From YTC#1@21:1/5 to Keith Thompson on Thu Sep 12 08:20:30 2019
    On 11/09/2019 22:24, Keith Thompson wrote:
    Kay-Uwe Loebel <loebel@etit.tu-chemnitz.de> writes:
    On 11.09.2019 at 09:42, YTC#1 wrote:
    In all honesty, the quickest (and time cheapest) method would be to get
    on Ebay and buy another one, then swap the disks etc. (They are quite
    cheap).

    A pragmatic solution of course, but I fear, that the university will not
    advocate the purchase of a second obsolete piece of hardware.
    Moreover the licence server for our CAD software is bound till the end
    of the year to the current hostid.
    [...]

    Can you talk to your vendor about changing the hostid for the license
    server? I would hope they have provisions for dying hardware.


    Bet they have not got support for that either :-)


  • From Kay-Uwe Loebel@21:1/5 to All on Thu Sep 12 14:57:26 2019
    On 11.09.2019 at 17:17, YTC#1 wrote:

    Okay, the 1050 should enable this feature?
    But where get from and is it possible to install firmware in the faulted
    state?

    I take it you don't any Oracle support ?

    Right, it wouldn't be an easy thing of course.

    Oh, looks like I attached the wrong link anyway :-( <url:https://community.oracle.com/community/support/oracle_sun_technologies/sparc_m-series_servers>

    It is suggesting no CPU is attached to the domain

    Thanks for the hint - I guess you meant the motherboard (XSB) is not
    assigned to the domain.
    It is, but I de- and re-assigned it just to be sure:

    XSCF> showboards -v -d 0
    XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
    ---- - -------- ----------- ---- ---- ---- ------- -------- ----
    00-0 * 00(00)   Assigned    n    n    n    Unknown Faulted  n

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    XSCF> deleteboard -c unassign 00-0
    XSB#00-0 will be unassigned from domain immediately. Continue?[y|n] :y

    XSCF> showboards -v -d 0
    XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
    ---- - -------- ----------- ---- ---- ---- ------- -------- ----
    00-0   SP       Unavailable n    n    n    Unknown Faulted  n

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    XSCF> addboard -c assign -d 0 00-0
    XSB#00-0 will be assigned to DomainID 0. Continue?[y|n] :y

    It seems that one condition is still missing to re-test the board.
    Perhaps the XSB must be _connected_ to the domain, but there's no
    special option / command to perform this. :-(
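
    To compare the two states above side by side, here is a small Python sketch that reads a `showboards` row and lists every column that differs from what a usable board would presumably report. The expected values (Assigned, y/y/y, Passed, Normal) are an assumption taken from typical XSCF output, not from anything shown in this thread, and the function names are made up for illustration.

```python
# Assumption: a healthy, running board reports these values in `showboards`.
EXPECTED = {
    "assignment": "Assigned",
    "pwr": "y",
    "conn": "y",
    "conf": "y",
    "test": "Passed",
    "fault": "Normal",
}


def parse_showboards_row(row):
    """Parse one data row of `showboards` output.

    The R and DID(LSB) columns can be empty or shortened (e.g. 'SP'),
    so the status columns are taken from the right-hand side.
    """
    fields = row.split()
    pwr, conn, conf, test, fault, cod = fields[-6:]
    return {
        "xsb": fields[0],
        "assignment": fields[-7],
        "pwr": pwr,
        "conn": conn,
        "conf": conf,
        "test": test,
        "fault": fault,
        "cod": cod,
    }


def blockers(board):
    """Return 'column=value' for every column that is not in its expected state."""
    return [
        f"{name}={board[name]}"
        for name, wanted in EXPECTED.items()
        if board[name] != wanted
    ]


for row in (
    "00-0 * 00(00)   Assigned    n    n    n    Unknown Faulted  n",
    "00-0   SP       Unavailable n    n    n    Unknown Faulted  n",
):
    board = parse_showboards_row(row)
    print(board["xsb"], blockers(board))
```

    Both rows still carry test=Unknown and fault=Faulted, so re-assigning the board alone changes nothing about the fault state.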


    Alternatively, I'm looking for a way to perform a "clearstatus /MBU_A" ...
    or to prevent the XSCF from starting the /scf/sbin/fmd daemon ...

    There must be a way for _my_ purchased machine. ;-)

    Kind regards
    Kay-Uwe Loebel

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Thu Sep 12 20:36:59 2019
    On 12/09/2019 13:57, Kay-Uwe Loebel wrote:
    On 11.09.2019 at 17:17, YTC#1 wrote:

    Okay, the 1050 should enable this feature?
    But where get from and is it possible to install firmware in the faulted state?

    I take it you don't any Oracle support ?

    Right, it wouldn't be an easy thing of course.

    Oh, looks like I attached the wrong link anyway :-(
    <url:https://community.oracle.com/community/support/oracle_sun_technologies/sparc_m-series_servers>


    It is suggesting no CPU is attached to the domain

    Thanks for the hint - I guess you meant the motherboard (XSB) is not
    assigned to the domain.
    It is, but I de- and re-assigned it just to be sure:

    XSCF> showboards -v -d 0
    XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
    ---- - -------- ----------- ---- ---- ---- ------- -------- ----
    00-0 * 00(00)   Assigned    n    n    n    Unknown Faulted  n

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    XSCF> deleteboard -c unassign 00-0
    XSB#00-0 will be unassigned from domain immediately. Continue?[y|n] :y

    XSCF> showboards -v -d 0
    XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
    ---- - -------- ----------- ---- ---- ---- ------- -------- ----
    00-0   SP       Unavailable n    n    n    Unknown Faulted  n

    XSCF> testsb 0
    Initial diagnosis is about to start, Continue?[y|n] :y
    The current configuration does not support this operation.

    XSCF> addboard -c assign -d 0 00-0
    XSB#00-0 will be assigned to DomainID 0. Continue?[y|n] :y

    It seems that one condition is still missing to re-test the board.
    Perhaps the XSB must be _connected_ the the domain, but there's no
    special option / command to perform this. :-(


    Alternatively I look for a way to perform a "clearstatus /MBU_A" ...
    or to prevent the XSCF from starting the /scf/sbin/fmd daemon ...

    It has been a while since I touched an Mx000, never mind troubleshot
    one :-(

    Essentially you appear to have a major issue, it may be that the reset
    has cleared it. However it may not.

    Ok, going back to the fmdumps

    SCF-8003-HA
    This fault can occur on the MBC chip that resides on the Motherboard.

    The fault may affect the entire MBC chip or it may affect just a
    single XSB; this can be determined by looking at the FMRI of the fault.

    The recommended service action for this event is to schedule the
    replacement of the affected FRU.

    What does

    fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb

    Show ?

    I still say your best bet is to buy one off eBay.


    There must be a way for _my_ purchased machine. ;-)

    Yeah, but it is old and broken.

    I bought a MacBook in 2013. It is still in use, but I expect it to break
    at some point.

  • From Kay-Uwe Loebel@21:1/5 to All on Fri Sep 13 07:51:49 2019
    On 12.09.2019 at 21:36, YTC#1 wrote:

    It has been a while since I touched an Mx000, never mind trouble shot
    one :-(

    All the more I appreciate your help!

    What does

    fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb

    Show ?

    Now: nothing, because I carried out a factory reset. ;-)

    TIME UUID MSG-ID
    fmdump: /var/opt/sun/fm/fmd/fltlog is empty


    Of course I saved the outputs before:

    XSCF> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
    TIME UUID MSG-ID
    Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
    100% fault.chassis.SPARC-Enterprise.asic.mbc.fe

    Problem in: hc:///chassis=0/cmu=0/mbc=0
    Affects: hc:///chassis=0/cmu=0/xsb=0
    FRU: hc://:product-id=SPARC Enterprise M4000:chassis-id=BC********:server-id=******:serial=BC********:part=CF00541-0893
    06 \541-0893-06:revision=0101/component=/MBU_A
    Location: /MBU_A

    XSCF> fmdump -v -u 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    TIME UUID MSG-ID
    Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS
    100% fault.chassis.SPARC-Enterprise.asic.mbc.se

    Problem in: hc:///chassis=0/cmu=0/mbc=0
    Affects: hc:///chassis=0/cmu=0
    FRU: hc://:product-id=SPARC Enterprise M4000:chassis-id=BCF092404K:server-id=******:serial=BC********:part=CF00541-0893
    06 \541-0893-06:revision=0101/component=/MBU_A
    Location: /MBU_A

    XSCF> fmdump -m -M
    MSG-ID: SCF-8003-LS, TYPE: Fault, VER: 1, SEVERITY: Critical
    EVENT-TIME: Tue Aug 27 08:38:07 CEST 2019
    PLATFORM: SPARC Enterprise M4000, CSN: BC********, HOSTNAME: ******
    SOURCE: sde, REV: 1.16
    EVENT-ID: 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    DESC: A non-fatal uncorrectable error was detected within a MBC chip.
    Refer to http://www.sun.com/msg/SCF-8003-LS for more information.
    AUTO-RESPONSE: No immediate action is taken by XSCF software due to this
    fault.
    Resources associated with the faulty FRU will be deconfigured after the
    platform is power cycled or after the domain reboots or after a Dynamic
    Reconfiguration operation is performed. This resource deconfiguration
    may cause the platform to become unbootable. Please consult the detail
    section of the knowledge article for additional information.
    IMPACT: The non-fatal uncorrectable error trap may cause the domain to
    panic.
    REC-ACTION: Schedule a repair action to replace the affected Field
    Replaceable Unit (FRU), the identity of which can be determined using
    fmdump -v -u EVENT_ID.
    Please consult the detail section of the knowledge article for
    additional information.
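
    Since the scope of the fault is supposed to be readable from the FMRI, here is a short Python sketch that pulls the "Affects:" lines out of saved `fmdump -v` text and reports the deepest component each one names. The helper names are hypothetical, and the parsing only covers the simple hc:///name=value/... form seen in the output above, not the longer FRU form with its authority part.

```python
def fmri_components(fmri):
    """Split 'hc:///chassis=0/cmu=0/xsb=0' into [('chassis', '0'), ...]."""
    prefix = "hc://"
    if not fmri.startswith(prefix):
        raise ValueError(f"not an hc FMRI: {fmri!r}")
    # Everything up to the first '/' is the (here empty) authority part.
    _authority, _, path = fmri[len(prefix):].partition("/")
    return [tuple(part.split("=", 1)) for part in path.split("/") if "=" in part]


def affected_scope(fmri):
    """Return the last (name, instance) pair, i.e. the deepest component named."""
    return fmri_components(fmri)[-1]


def affects_lines(text):
    """Collect the FMRIs from all 'Affects:' lines of `fmdump -v` output."""
    return [
        line.split(":", 1)[1].strip()
        for line in text.splitlines()
        if line.strip().startswith("Affects:")
    ]


# Lines copied from the two saved fmdump outputs in this post.
SAVED = """\
Problem in: hc:///chassis=0/cmu=0/mbc=0
Affects: hc:///chassis=0/cmu=0/xsb=0
Problem in: hc:///chassis=0/cmu=0/mbc=0
Affects: hc:///chassis=0/cmu=0
"""

for fmri in affects_lines(SAVED):
    print(fmri, "->", affected_scope(fmri))
```

    For the outputs above, the SCF-8003-HA event ends at xsb=0 while the SCF-8003-LS event ends at cmu=0, i.e. the second one names the whole board rather than a single XSB.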


    One reason I keep looking for a software-related cause is the "Current Issues
    Page" in the "Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000
    (OPL) Servers" from 2012:

    M5000 - MBC failures SCF-8003-LS and/or SCF-8003-HA
    Specifically looking for cases that have a fatal error immediately
    before the serious error. ->
    Still under investigation by engineering. Current action is to replace
    the faulted MBU.


    Perhaps the engineers have found a solution in the meantime ...
    The above-mentioned web page http://www.sun.com/msg/SCF-8003-LS has unfortunately been moved behind the paywall. :-(


    There must be a way for _my_ purchased machine. ;-)

    Yeah, but it is old and broken.

    I meant, for a machine without any service contract I should have access
    to all functions / features IMHO (also to execute a "clearstatus /MBU_A").

    Kind regards
    Kay-Uwe Loebel

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Fri Sep 13 09:29:00 2019
    On 13/09/2019 06:51, Kay-Uwe Loebel wrote:
    On 12.09.2019 at 21:36, YTC#1 wrote:

    It has been a while since I touched an Mx000, never mind trouble shot
    one :-(

    a fortiori I appreciate your help!

    What does

    fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb

    Show ?

    Now: nothing, because I did carry out a factory reset. ;-)

    TIME                 UUID                                 MSG-ID
    fmdump: /var/opt/sun/fm/fmd/fltlog is empty


    Of course I saved the outputs before:

    XSCF> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
    TIME                 UUID                                 MSG-ID
    Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
      100%  fault.chassis.SPARC-Enterprise.asic.mbc.fe

            Problem in: hc:///chassis=0/cmu=0/mbc=0
               Affects: hc:///chassis=0/cmu=0/xsb=0
                   FRU: hc://:product-id=SPARC Enterprise M4000:chassis-id=BC********:server-id=******:serial=BC********:part=CF00541-0893
    06   \541-0893-06:revision=0101/component=/MBU_A
              Location: /MBU_A

    XSCF> fmdump -v -u 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    TIME                 UUID                                 MSG-ID
    Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS
      100%  fault.chassis.SPARC-Enterprise.asic.mbc.se

            Problem in: hc:///chassis=0/cmu=0/mbc=0
               Affects: hc:///chassis=0/cmu=0
                   FRU: hc://:product-id=SPARC Enterprise M4000:chassis-id=BCF092404K:server-id=******:serial=BC********:part=CF00541-0893
    06   \541-0893-06:revision=0101/component=/MBU_A
              Location: /MBU_A

    XSCF> fmdump -m -M
    MSG-ID: SCF-8003-LS, TYPE: Fault, VER: 1, SEVERITY: Critical
    EVENT-TIME: Tue Aug 27 08:38:07 CEST 2019
    PLATFORM: SPARC Enterprise M4000, CSN: BC********, HOSTNAME: ******
    SOURCE: sde, REV: 1.16
    EVENT-ID: 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    DESC: A non-fatal uncorrectable error was detected within a MBC chip.
    Refer to http://www.sun.com/msg/SCF-8003-LS for more information.
    AUTO-RESPONSE: No immediate action is taken by XSCF software due to this
    fault.
    Resources associated with the faulty FRU will be deconfigured after the
    platform is power cycled or after the domain reboots or after a Dynamic
    Reconfiguration operation is performed. This resource deconfiguration
    may cause the platform to become unbootable. Please consult the detail
    section of the knowledge article for additional information.
    IMPACT: The non-fatal uncorrectable error trap may cause the domain to
    panic.
    REC-ACTION: Schedule a repair action to replace the affected Field
    Replaceable Unit (FRU), the identity of which can be determined using
    fmdump -v -u EVENT_ID.
    Please consult the detail section of the knowledge article for
    additional information.


    One reason to insist on a software-related trial is the "Current Issues
    Page" in the "Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000
    (OPL) Servers" from 2012:

    M5000 - MBC failures SCF-8003-LS and/or SCF-8003-HA
    Specifically looking for cases that have a fatal error immediately
    before the serious error.        ->
    Still under investigation by engineering.  Current action is to replace
    the faulted MBU.


    As you can see, it is a H/W fault. The part needs replacing.
    Sometimes you have to give up and accept defeat :-(


    Perhaps the guys found a solution meanwhile ...

    It is broken.

    The above mentioned web page http://www.sun.com/msg/SCF-8003-LS has been unfortunately moved behind the pay wall. :-(


    It says it is broken, contact your support provider and organise a
    replacement.


    There must be a way for _my_ purchased machine. ;-)

    Yeah, but it is old and broken.

    I meant, for a machine without any service contract I should have access
    to all functions / features IMHO (also to execute a "clearstatus /MBU_A").

    Why ? Even Sun would not have agreed to that. Fujitsu must have had good
    reason to hold end users back from that command level, probably because
    it can completely fubar the server. To get to escalation mode you need
    to get a password ... via a support call.

    When you buy a car, does the manufacturer supply you with free fixes for
    life ? There is a warranty period and when that is over you pay for an
    extended one, or take your chances and pay per breakage.


  • From Chris@21:1/5 to Chris on Fri Sep 13 13:37:32 2019
    On 09/13/19 13:19, Chris wrote:
    [...]


    Don't know if this would help, but had a similar problem on an M3000 in
    April this year. That needed a password to clear the fault, but searched around for a bit and found a procedure to reset the service processor by moving a jumper. Here are the notes made at the time:

    * remove service processor and fit jumper to J505, external terminal to serial management port, 9600,n,8,1
    * Plug in power cord
    * To interrupt the sp boot process, type in xyzzy when you see the line: Booting linux in n seconds. May take a couple of attempts.
    * At the preboot prompt, Preboot > type in reset all
    * The preboot menu exits, SP restarts, erases flash, sets defaults and reboots SP
    * At the login prompt, root, pwd, changeme

    Might be worth a try...

    Chris



    The fault here was reported as: SCF XSCF watchdog timeout

    To get into service mode, must have mode privs, which needs a password.
    so the command, enableservice, asks for one. But, service mode
    can be entered via the keyswitch.

    Don't remember more details, but assume defaults have been set via the
    jumper J505, when rebooted with keyswitch in diags mode, you can access
    service mode without a password.

    In my case, a spurious fault message, which disappeared after the above...

    Chris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris@21:1/5 to All on Fri Sep 13 13:19:26 2019
    On 09/13/19 09:29, YTC#1 wrote:
    On 13/09/2019 06:51, Kay-Uwe Loebel wrote:
    Am 12.09.2019 um 21:36 schrieb YTC#1:

    It has been a while since I touched an Mx000, never mind trouble shot
    one :-(

    a fortiori I appreciate your help!

    What does

    fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb

    Show ?

    Now: nothing, because I did carry out a factory reset. ;-)

    TIME UUID MSG-ID
    fmdump: /var/opt/sun/fm/fmd/fltlog is empty


    Of course I saved the outputs before:

    XSCF> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
    TIME UUID MSG-ID
    Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
    100% fault.chassis.SPARC-Enterprise.asic.mbc.fe

    Problem in: hc:///chassis=0/cmu=0/mbc=0
    Affects: hc:///chassis=0/cmu=0/xsb=0
    FRU: hc://:product-id=SPARC Enterprise
    M4000:chassis-id=BC********:server-id=******:serial=BC********:part=CF00541-0893
    06 \541-0893-06:revision=0101/component=/MBU_A
    Location: /MBU_A

    XSCF> fmdump -v -u 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    TIME UUID MSG-ID
    Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS
    100% fault.chassis.SPARC-Enterprise.asic.mbc.se

    Problem in: hc:///chassis=0/cmu=0/mbc=0
    Affects: hc:///chassis=0/cmu=0
    FRU: hc://:product-id=SPARC Enterprise
    M4000:chassis-id=BCF092404K:server-id=******:serial=BC********:part=CF00541-0893
    06 \541-0893-06:revision=0101/component=/MBU_A
    Location: /MBU_A

    XSCF> fmdump -m -M
    MSG-ID: SCF-8003-LS, TYPE: Fault, VER: 1, SEVERITY: Critical
    EVENT-TIME: Tue Aug 27 08:38:07 CEST 2019
    PLATFORM: SPARC Enterprise M4000, CSN: BC********, HOSTNAME: ******
    SOURCE: sde, REV: 1.16
    EVENT-ID: 8852b041-6bc0-4fe9-b972-d373dfd0f10c
    DESC: A non-fatal uncorrectable error was detected within a MBC chip.
    Refer to http://www.sun.com/msg/SCF-8003-LS for more information.
    AUTO-RESPONSE: No immediate action is taken by XSCF software due to this
    fault.
    Resources associated with the faulty FRU will be deconfigured after the
    platform is power cycled or after the domain reboots or after a Dynamic
    Reconfiguration operation is performed. This resource deconfiguration
    may cause the platform to become unbootable. Please consult the detail
    section of the knowledge article for additional information.
    IMPACT: The non-fatal uncorrectable error trap may cause the domain to
    panic.
    REC-ACTION: Schedule a repair action to replace the affected Field
    Replaceable Unit (FRU), the identity of which can be determined using
    fmdump -v -u EVENT_ID.
    Please consult the detail section of the knowledge article for
    additional information.
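
    For anyone who saved such output to a file before a factory reset, here
    is a minimal Python sketch that pulls the MSG-ID, fault class and FRU
    location out of "fmdump -v" style text. The function name and the regular
    expressions are assumptions based only on the layout of the records shown
    above, not on any documented output format, so treat it as a starting
    point rather than a reference parser.

```python
import re

# Record header as it appears above, e.g.
# "Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA"
# (layout assumed from the pasted output only).
RECORD_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msg_id>\S+)$"
)
# Diagnosis line, e.g. "100% fault.chassis.SPARC-Enterprise.asic.mbc.fe"
FAULT_RE = re.compile(r"^(?P<pct>\d+)%\s+(?P<fault_class>\S+)$")


def parse_fmdump_verbose(text):
    """Return one dict per fault record found in saved `fmdump -v` text."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = RECORD_RE.match(line)
        if header:
            # Start a new record; detail fields are filled in as they appear.
            current = dict(header.groupdict(),
                           certainty=None, fault_class=None, location=None)
            records.append(current)
            continue
        if current is None:
            continue  # prompt lines or the column header before any record
        fault = FAULT_RE.match(line)
        if fault:
            current["certainty"] = int(fault.group("pct"))
            current["fault_class"] = fault.group("fault_class")
        elif line.startswith("Location:"):
            current["location"] = line.split(":", 1)[1].strip()
    return records


SAMPLE = """\
XSCF> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
TIME UUID MSG-ID
Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
100% fault.chassis.SPARC-Enterprise.asic.mbc.fe

Problem in: hc:///chassis=0/cmu=0/mbc=0
Location: /MBU_A
"""

if __name__ == "__main__":
    for rec in parse_fmdump_verbose(SAMPLE):
        print(rec["msg_id"], rec["fault_class"], rec["location"])
```

    Run against both saved records, this makes it easy to see that the two
    events (SCF-8003-HA and SCF-8003-LS) point at the same FRU, /MBU_A.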


    One reason to insist on a software-related trial is the "Current Issues
    Page" in the "Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000
    (OPL) Servers" from 2012:

    M5000 - MBC failures SCF-8003-LS and/or SCF-8003-HA
    Specifically looking for cases that have a fatal error immediately
    before the serious error. ->
    Still under investigation by engineering. Current action is to replace
    the faulted MBU.


    As you can see, it is a H/W fault. The part needs replacing.
    Sometimes you have to give up and accept defeat :-(


    Perhaps the guys found a solution meanwhile ...

    It is broken.

    The above mentioned web page http://www.sun.com/msg/SCF-8003-LS has been
    unfortunately moved behind the pay wall. :-(


    It says it is broken, contact your support provider and organise a replacement.


    There must be a way for _my_ purchased machine. ;-)

    Yeah, but it is old and broken.

    I meant, for a machine without any service contract I should have access
    to all functions / features IMHO (also to execute a "clearstatus /MBU_A").

    Why ? Even Sun would not have agreed to that. Fujitsu must have had good
    reason to hold end users back from that command level, probably because
    it can completely fubar the server. To get to escalation mode you need
    to get a password .... using a support call.

    When you buy a car, does the manufacturer supply you with free fixes for
    life ? There is a warranty period and when that is over you pay for an
    extended one, or take your chances and pay per breakage.



    Don't know if this would help, but had a similar problem on an M3000 in
    April this year. That needed a password to clear the fault, but searched
    around for a bit and found a procedure to reset the service processor by
    moving a jumper. Here are the notes made at the time:

    * remove service processor and fit jumper to J505, external terminal to
    serial management port, 9600,n,8,1
    * Plug in power cord
    * To interrupt the sp boot process, type in xyzzy when you see the line:
    Booting linux in n seconds. May take a couple of attempts.
    * At the preboot prompt, Preboot > type in reset all
    * The preboot menu exits, SP restarts, erases flash, sets defaults and
    reboots SP
    * At the login prompt, root, pwd, changeme
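
    The order of the steps above can be sketched as a small state machine
    over console lines. This is only an illustration in Python: the function
    name is made up, the "login:" pattern is an assumption, and the other
    prompt strings are copied from the notes, so the exact wording on a given
    service processor may differ. It does not talk to a serial port; it only
    decides what would be typed when each prompt is seen.

```python
import re

# Prompt wording taken from the notes above; real firmware may differ.
BOOT_COUNTDOWN_RE = re.compile(r"Booting linux in \d+ seconds", re.IGNORECASE)
PREBOOT_PROMPT_RE = re.compile(r"Preboot\s*>", re.IGNORECASE)
# Assumption: the login prompt contains the text "login:".
LOGIN_PROMPT_RE = re.compile(r"login:", re.IGNORECASE)


def plan_console_replies(console_lines):
    """Yield (line_seen, reply) pairs in the order the notes describe."""
    state = "wait_countdown"
    for line in console_lines:
        if state == "wait_countdown" and BOOT_COUNTDOWN_RE.search(line):
            # Interrupt the SP boot during the countdown.
            yield line, "xyzzy"
            state = "wait_preboot"
        elif state == "wait_preboot" and PREBOOT_PROMPT_RE.search(line):
            # Restore defaults; the SP then erases flash and reboots.
            yield line, "reset all"
            state = "wait_login"
        elif state == "wait_login" and LOGIN_PROMPT_RE.search(line):
            # Default account per the notes; password is "changeme".
            yield line, "root"
            return


if __name__ == "__main__":
    captured = [
        "Booting linux in 5 seconds",
        "Preboot > ",
        "Booting linux in 5 seconds",  # second boot after the reset, ignored
        "hostname login:",
    ]
    for seen, reply in plan_console_replies(captured):
        print(repr(seen), "->", reply)
```

    The second countdown is deliberately ignored, because after "reset all"
    the SP boots again and should be left to reach the login prompt.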

    Might be worth a try...

    Chris

  • From Kay-Uwe Loebel@21:1/5 to All on Mon Sep 16 08:57:34 2019
    Am 12.09.2019 um 09:20 schrieb YTC#1:

    On 11/09/2019 22:24, Keith Thompson wrote:

    Can you talk to your vendor about changing the hostid for the license
    server? I would hope they have provisions for dying hardware.

    Bet they have not got support for that either :-)

    We have, but compared to Oracle's hardware support for the M4000 it's
    quite cheap. ;-)

    Kay

  • From Kay-Uwe Loebel@21:1/5 to All on Mon Sep 16 08:34:21 2019
    Am 13.09.2019 um 14:37 schrieb Chris:

    * remove service processor and fit jumper to J505, external terminal to
    serial management port, 9600,n,8,1
    * Plug in power cord
    * To interrupt the sp boot process, type in xyzzy when you see the line:
    Booting linux in n seconds. May take a couple of attempts.
    * At the preboot prompt, Preboot > type in reset all
    * The preboot menu exits, SP restarts, erases flash, sets defaults and
    reboots SP
    * At the login prompt, root, pwd, changeme

    Might be worth a try...

    Chris



    The fault here was reported as: SCF XSCF watchdog timeout

    To get into service mode, must have mode privs, which needs a password.
    so the command, enableservice, asks for one. But, service mode
    can be entered via the keyswitch.

    Don't remember more details, but assume defaults have been set via the
    jumper J505, when rebooted with keyswitch in diags mode, you can access
    service mode without a password.

    In my case, a spurious fault message, which disappeared after the above...

    Chris

    Sounds like a very hot tip!
    I pulled the XSCF Unit (FFSCFB) but found only a very small 4x jumper
    block labelled CN21 (no jumper fitted).
    One could say 4 jumpers - 4 chances (or 3 to brick the unit). ;-)

    Kay

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Mon Sep 16 09:50:57 2019
    On 16/09/2019 07:57, Kay-Uwe Loebel wrote:
    Am 12.09.2019 um 09:20 schrieb YTC#1:

    On 11/09/2019 22:24, Keith Thompson wrote:

    Can you talk to your vendor about changing the hostid for the license
    server?  I would hope they have provisions for dying hardware.

    Bet they have not got support for that either :-)

    We have, but compared to Oracle's hardware support for the M4000 it's
    quite cheap. ;-)


    Is there a cost to your department to the service not being available to
    users ? (as in they only pay when it is available).

    As time drags on the buying a 2nd unit off Ebay will look better and
    better value.


    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Mon Sep 16 09:49:34 2019
    On 16/09/2019 07:34, Kay-Uwe Loebel wrote:
    Am 13.09.2019 um 14:37 schrieb Chris:

    * remove service processor and fit jumper to J505, external terminal to
    serial management port, 9600,n,8,1
    * Plug in power cord
    * To interrupt the sp boot process, type in xyzzy when you see the line:
    Booting linux in n seconds. May take a couple of attempts.
    * At the preboot prompt, Preboot > type in reset all
    * The preboot menu exits, SP restarts, erases flash, sets defaults and
    reboots SP
    * At the login prompt, root, pwd, changeme

    Might be worth a try...

    Chris



    The fault here was reported as: SCF XSCF watchdog timeout

    To get into service mode, must have mode privs, which needs a password.
    so the command, enableservice, asks for one. But, service mode
    can be entered via the keyswitch.

    Don't remember more details, but assume defaults have been set via the
    jumper J505, when rebooted with keyswitch in diags mode, you can
    access service mode without a password.

    In my case, a spurious fault message, which disappeared after the
    above...

    Chris

    Sounds like a very hot tip!
    I pulled the XSCF Unit (FFSCFB) but found only a very small 4x jumper
    block labelled CN21 (no jumper fitted).
    One could say 4 jumpers - 4 chances (or 3 to brick the unit). ;-)


    It is a brick already.... what could you do to make it worse ? :-)

    But bear in mind, his issue was a spurious one. Yours defo indicates a
    H/W issue.

    *if* you do manage to clear it (which I have my doubts without the
    escalated password) you may end up corrupting something. After all the
    system has done its job, spotted an issue and stopped things going from
    bad to worse.


    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

  • From Kay-Uwe Loebel@21:1/5 to All on Tue Sep 17 09:10:05 2019
    Am 16.09.2019 um 10:50 schrieb YTC#1:

    Is there a cost to your department to the service not being available to users ? (as in they only pay when it is available).

    Of course _not_ at the university. ;-)
    I moved the few current users to the Linux server, but the term will
    start soon and many students want / must use several programs on the
    M4000 ...

    Meanwhile I found a company that could possibly swap the mainboard.
    That raises the question of how the Service Processor notices that
    the MBU has been repaired (replaced). It's not the replacefru command.
    I removed and reinserted the current board, but the fault status was not
    cleared (perhaps because the serial number is unchanged).

    Kay

  • From Chris@21:1/5 to All on Wed Sep 18 01:12:15 2019
    On 09/16/19 09:49, YTC#1 wrote:
    On 16/09/2019 07:34, Kay-Uwe Loebel wrote:
    Am 13.09.2019 um 14:37 schrieb Chris:

    * remove service processor and fit jumper to J505, external terminal to
    serial management port, 9600,n,8,1
    * Plug in power cord
    * To interrupt the sp boot process, type in xyzzy when you see the line:
    Booting linux in n seconds. May take a couple of attempts.
    * At the preboot prompt, Preboot> type in reset all
    * The preboot menu exits, SP restarts, erases flash, sets defaults and
    reboots SP
    * At the login prompt, root, pwd, changeme

    Might be worth a try...

    Chris



    The fault here was reported as: SCF XSCF watchdog timeout

    To get into service mode, must have mode privs, which needs a password.
    so the command, enableservice, asks for one. But, service mode
    can be entered via the keyswitch.

    Don't remember more details, but assume defaults have been set via the
    jumper J505, when rebooted with keyswitch in diags mode, you can
    access service mode without a password.

    In my case, a spurious fault message, which disappeared after the
    above...

    Chris

    Sounds like a very hot tip!
    I pulled the XSCF Unit (FFSCFB) but found only a very small 4x jumper
    block labelled CN21 (no jumper fitted).
    One could say 4 jumpers - 4 chances (or 3 to brick the unit). ;-)


    It is a brick already.... what could you do to make it worse ? :-)

    But bear in mind, his issue was a spurious one. Yours defo indicates a
    H/W issue.

    *if* you do manage to clear it (which I have my doubts without the
    escalated password) you may end up corrupting something. After all the
    system has done its job, spotted an issue and stopped things going from
    bad to worse.



    I would do a bit more digging, as some of the jumpers may erase the
    flash completely, for example. Or, could be for remote jtag debug.
    Found the info for mine on one of the Oracle forums, but plug any
    error message strings directly into google. Should turn up something
    relevant. M4000 are real cheap now, trading time against money on Ebay
    and something will most likely turn up that you can maybe fund yourself...

    Chris

  • From Kay-Uwe Loebel@21:1/5 to All on Thu Sep 19 10:13:18 2019
    The machine is back in service!

    The solution in the end was to install a firmware update (amazingly
    possible despite the faulted MBU), execute the clearfault command and
    perform a power cycle.
    Many thanks for all the help in this seemingly abandoned group!

    Kay

  • From YTC#1@21:1/5 to Kay-Uwe Loebel on Thu Sep 19 11:49:55 2019
    On 19/09/2019 09:13, Kay-Uwe Loebel wrote:
    The machine is back in service!

    The solution in the end was to install a firmware update (amazingly
    possible despite the faulted MBU), execute the clearfault command and
    perform a power cycle.
    Many thanks for all the help in this seemingly abandoned group!


    Good that it is up and working.
    But bear in mind, you had a fault. You have cleared that fault. But the
    fault may re-occur.

    Now would be a good time to hunt out another Sparc server and get a backup/migration to it. If a nice T series is available, run up some
    LDoms :-)


    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

  • From Udo Tödter@21:1/5 to All on Mon Sep 23 16:57:23 2019
    On 19.09.19 12:49, YTC#1 wrote:
    On 19/09/2019 09:13, Kay-Uwe Loebel wrote:
    The machine is back in service!

    The solution in the end was to install a firmware update (amazingly
    possible despite the faulted MBU), execute the clearfault command and
    perform a power cycle.
    Many thanks for all the help in this seemingly abandoned group!


    Good that it is up and working.
    But bear in mind, you had a fault. You have cleared that fault. But the
    fault may re-occur.

    Now would be a good time to hunt out another Sparc server and get a backup/migration to it. If a nice T series is available, run up some
    LDoms :-)



    Well here is another M4000 still in service, it runs and runs and runs.
    And the old gem is still certified to run Solaris 11.

    We are currently migrating our application (SAM/FS with 600TB active
    data) to a LDOM running on a S7.

    Udo

    --
    +----------------------------------------------------------------------+
    |Udo Toedter |FSU Jena     |Email:                 |Phone +493641940532|
    |Bereich ZSB |Rechenzentrum|Udo.Toedter@uni-jena.de|FAX +493641940632  |
    +----------------------------------------------------------------------+

  • From YTC#1@21:1/5 to All on Mon Sep 23 18:55:18 2019
    On 23/09/2019 15:57, Udo Tödter wrote:
    On 19.09.19 12:49, YTC#1 wrote:
    On 19/09/2019 09:13, Kay-Uwe Loebel wrote:
    The machine is back in service!

    The solution in the end was to install a firmware update (amazingly
    possible despite the faulted MBU), execute the clearfault command and
    perform a power cycle.
    Many thanks for all the help in this seemingly abandoned group!


    Good that it is up and working.
    But bear in mind, you had a fault. You have cleared that fault. But the
    fault may re-occur.

    Now would be a good time to hunt out another Sparc server and get a
    backup/migration to it. If a nice T series is available, run up some
    LDoms :-)



    Well here is another M4000 still in service, it runs and runs and runs.
    And the old gem is still certified to run Solaris 11.

    We are currently migrating our application (SAM/FS with 600TB active
    data) to a LDOM running on a S7.


    When you have migrated, I think I know someone who might want the old
    server off you :-)




    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

  • From larbob@21:1/5 to All on Thu Dec 16 20:27:38 2021
    I've got an M3000 with the same issue. Chris, if you still read here, "xyzzy" doesn't seem to want to work for me as that password -- it doesn't really acknowledge that I made any input and then keeps going after 5 seconds.

  • From larbob@21:1/5 to larbob on Thu Dec 16 20:34:16 2021
    On Thursday, December 16, 2021 at 11:27:41 PM UTC-5, larbob wrote:
    I've got an M3000 with the same issue. Chris, if you still read here, "xyzzy" doesn't seem to want to work for me as that password -- it doesn't really acknowledge that I made any input and then keeps going after 5 seconds.
    https://pastebin.com/9xM6Zwbp

    I found this but I'm not sure what the root password is either.

  • From chris@21:1/5 to larbob on Fri Dec 17 17:12:55 2021
    On 12/17/21 04:34, larbob wrote:
    On Thursday, December 16, 2021 at 11:27:41 PM UTC-5, larbob wrote:
    I've got an M3000 with the same issue. Chris, if you still read here, "xyzzy" doesn't seem to want to work for me as that password -- it doesn't really acknowledge that I made any input and then keeps going after 5 seconds.
    https://pastebin.com/9xM6Zwbp

    I found this but I'm not sure what the root password is either.

    I bought a T4-1 completely bricked, but after a bit of web search, found
    a page that involves removing the internal service processor
    card, fit jumper to J505, refit card, repower up with a terminal to
    the ilom port.

    SP reboots, type in xyzzy when you see the message:

    Booting linux in N seconds (may take several tries)

    Reinstall and re-power up. At the preboot prompt:

    reset all

    Then at the login prompt:

    Login root
    Password changeme

    Must be something similar for the M4000

    Chris

    M3000 for 4 years+ now...
