• Sun Fire V1280 - seeing VxWorks prompt after modding TOD chip

    From YTC#1@21:1/5 to All on Wed Sep 27 09:16:22 2017
    XPost: comp.unix.solaris

    Copied to comp.sys.sun.hardware, maybe someone there can help ?

    There's a cold spare Sun Fire V1280 that's recently been pressed into service because another V1280's TOD chip's battery depleted its charge.

    The TOD chip is a ST Microelectronics M48T59Y-70PC1D.
    It has a lithium primary battery in it that goes dead after, um, 15 years.

    A youtube video inspired a tech to grind off the arse end of the chip and solder a CR2032 battery to it. He did it to two chips, one that still had a good internal battery, say GB, and one whose internal battery died, say DB.

    With a GB chip in this V1280, it completes its POST fine.

    With a DB chip in this V1280, it dumps to what looks like a VxWorks prompt. I can "ls" around in /sc/flash and see some Java stuff and a vxworks.init file.

    So, plug in power. Unit starts self-tests.
    (All the debug that follows was hand-typed after being handwritten to paper, so forgive typos and omitted lines.)

    TOD(M48T59) self-test is good (previously had been bad).
    Later:
    POST Complete.
    ERI Device Present
    Getting MAC address for SSC1
    Using SSC MAC address
    MAC address is 0:3:ba:ca:7a:da
    Using DHCP to configure network interface
    Attached TCP/IP interface to eri unit 0
    Attaching interface lo0...done
    Initializing DHCP libraries
    Timeout waiting for network driver (flags=0x8062)

    (stops here)

    version
    VxWorks (for Sun Fire System Controller) version 5.4.
    Kernel: WIND version 2.5.
    Made on Aug 18 2004, 09:32:05.
    Boot line:
    eri(0,0)bootHost: e=0.0.0.0 u=target f=0xe0 tn=noname.example.com
    value = 0 = 0x0 <- seems to be kicked out at the end of any command.
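
    For anyone reading the boot line above: it follows the usual VxWorks layout, device(unit,processor)host:file followed by key=value fields. Here is a minimal Python sketch that splits it up and decodes the f= flags. The flag meanings are the ones documented for stock VxWorks 5.x boot ROMs; that Sun's SC build keeps them is an assumption, and the names parse_boot_line and decode_boot_flags are made up for illustration.

```python
import re

# Boot-flag bit meanings as documented for stock VxWorks 5.x boot ROMs.
# ASSUMPTION: Sun's System Controller build keeps the same meanings.
BOOT_FLAGS = {
    0x02: "load local system symbols",
    0x04: "don't autoboot",
    0x08: "quick autoboot (no countdown)",
    0x20: "disable login security",
    0x40: "use BOOTP/DHCP to get boot parameters",
    0x80: "use TFTP to get boot image",
    0x100: "use proxy ARP",
}


def parse_boot_line(line):
    """Split 'dev(unit,proc)host:file key=value ...' into a dict."""
    m = re.match(r"(\w+)\((\d+),(\d+)\)(\S*?):(\S*)\s*(.*)", line)
    if m is None:
        raise ValueError("not a VxWorks-style boot line")
    dev, unit, proc, host, path, rest = m.groups()
    fields = {
        "device": dev,
        "unit": int(unit),
        "processor": int(proc),
        "host": host,
        "file": path,
    }
    # Remaining fields are space-separated key=value pairs (e=, u=, f=, tn=, ...).
    for key, value in re.findall(r"(\w+)=(\S+)", rest):
        fields[key] = value
    return fields


def decode_boot_flags(value):
    """Return the descriptions of every known bit set in value."""
    return [name for bit, name in sorted(BOOT_FLAGS.items()) if value & bit]


line = "eri(0,0)bootHost: e=0.0.0.0 u=target f=0xe0 tn=noname.example.com"
fields = parse_boot_line(line)
print(fields)
print(decode_boot_flags(int(fields["f"], 16)))
```

    Under that assumption, f=0xe0 would include the BOOTP/DHCP bit (0x40), which would be consistent with the "Using DHCP to configure network interface" line in the log.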

    whoami
    target

    ifShow
    eri (unit number 0):
    Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING
    Type: ETHERNET_CSMACD
    Internet address: 0.0.0.0
    Broadcast addresses: 0.255.255.255
    Netmask 0xff000000 Subnetmask 0xff000000
    Ethernet address is 00:03:ba:ca:7a:da
    Metric is 0
    Maximum transfer unit size is 1500
    0 packets received; 1 packets sent
    0 multicast packets received
    0 multicast packets sent
    0 input errors; 0 output errors
    0 collisions; 0 dropped
    lo (unit number 0):
    Flags: (0x8069) UP BROADCAST MULTICAST ARP RUNNING
    Type: SOFTWARE_LOOPBACK
    Internet address: 127.0.0.1
    Broadcast addresses: 0.255.255.255
    Netmask 0xff000000 Subnetmask 0xff000000
    Metric is 0
    Maximum transfer unit size is 32768
    0 packets received; 0 packets sent
    0 multicast packets received
    0 multicast packets sent
    0 input errors; 0 output errors
    0 collisions; 0 dropped
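
    The timeout message reports flags=0x8062 while ifShow reports 0x8063 for the same eri interface. A small Python sketch for comparing the two, assuming the SC's network stack uses the traditional BSD <net/if.h> flag values (the function names are made up for illustration):

```python
# Interface flag bits from the traditional BSD <net/if.h>.
# ASSUMPTION: the SC's VxWorks 5.4 network stack uses these same values.
IFF_BITS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0004: "DEBUG",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0020: "NOTRAILERS",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x8000: "MULTICAST",
}


def decode_if_flags(flags):
    """Return the names of every known bit set in flags, lowest bit first."""
    return [name for bit, name in sorted(IFF_BITS.items()) if flags & bit]


def flag_difference(a, b):
    """Return the names of the bits that differ between two flag words."""
    return decode_if_flags(a ^ b)


print(decode_if_flags(0x8062))          # value in the timeout message
print(decode_if_flags(0x8063))          # value shown by ifShow
print(flag_difference(0x8062, 0x8063))  # the bit that changed
```

    With those values the only differing bit is UP, which would suggest the interface had not yet been marked up when the timeout fired. That is a reading of the flags, not a confirmed diagnosis.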

    boot
    Copyright 2001-2004 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Sun Fire System Firmware
    RTOS version: 41
    ScApp version: 5.18.1 Build_01
    SC POST diag level: off

    The date is Tuesday, May 6, 2003, 7:27:41 AM PDT.
    <date> noname.example.com lom: Boot: ScApp 5.18.1, RTOS 41
    <date> noname.example.com lom: Boot: SBBC Reset Reason(s): Power On Reset
    <date> noname.example.com lom: Boot: Initializing the SC SRAM
    <date> noname.example.com lom: Boot: Caching IO information
    <date> noname.example.com lom: Boot: Clock Source: 75MHz
    <date> noname.example.com lom: Boot: /N0/PS0: Status is OK
    <date> noname.example.com lom: Boot: /N0/PS1: Status is OK
    <date> noname.example.com lom: Boot: /N0/PS2: Status is OK
    <date> noname.example.com lom: Boot: /N0/PS3: Status is OK
    <date> noname.example.com lom: Boot: Chassis is in single partition mode.
    <date> noname.example.com lom: Boot: NOTICE: /N0/FT0 is powered off
    <date> noname.example.com lom: Boot: Cold boot detected: recovering active domains
    ...
    setdate 0926191717 mmddHHMM[[cc]yy][.SS]
    resetsc
    ...
    Timeout waiting for network driver (flags=0x8062)

    (stops here)
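
    The setdate argument used above follows the mmddHHMM[[cc]yy][.SS] pattern. A minimal Python sketch for building that string from a timestamp, so it doesn't have to be assembled by hand (setdate_arg is a made-up helper name, not part of any Sun tool):

```python
from datetime import datetime


def setdate_arg(when, with_century=False, with_seconds=False):
    """Format a datetime as mmddHHMM[[cc]yy][.SS].

    The year is always included here, although the pattern marks it optional.
    """
    text = when.strftime("%m%d%H%M")
    text += when.strftime("%Y" if with_century else "%y")
    if with_seconds:
        text += when.strftime(".%S")
    return text


# 26 Sep 2017, 19:17, matching the value typed in the session above.
print("setdate " + setdate_arg(datetime(2017, 9, 26, 19, 17)))
```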

    I thought it was stopping because DHCP failed, so I reconfigured
    the SSC to disable it (you can tell it it's not connected to a network,
    at the lom> prompt).
    It still comes up after plugging in, and stops with the -> prompt.

    I can just type "boot", and it continues. I've set the SC POST diag
    level to min and tried. No change.
    Any idea to try so I don't have to hook up a serial console every
    time and type boot?


    More info, mainly for those who are curious enough to have read this far.

    devs
    drv name
    0 /null
    1 /tyCo/0
    1 /tyCo/1
    1 /tyCo/2
    1 /tyCo/3
    5 /sc/flash
    pwd
    /sc/flash
    ls
    Info
    JVM.zip
    evxworks.init
    lib
    libNativ.o
    libSSH.o
    moduli
    vxAddOn.o
    vxworks.init
    vxworks.jdb
    vxworks.norm
    copy "vxworks.init"
    #
    # JVM configurator/startup script
    #
    ld 1,0,"/sc/flash/vxAddOn.o"
    # lw8 needs a to run scapp from ram. Those are no-ops for serengeti
    copyFileToRam("/sc/flash/lib/rt.jar","/sc/flash/lib/rt.jar");
    copyFileToRam("/sc/flash/lib/scapp.jar","/sc/flash/lib/scapp.jar");
    # uncompress the JVM
    uncompressJVM("/sc/flash/JVM.zip","/sc/flash/JVM");
    kernelTimeSlice 5
    # load the JVM
    javaConfig
    javaClassPathSet "/sc/flash/lib/scapp.jar:/sc/flash/lib/jdmkrt.jar"
    javaLoadLibraryPathSet "/sc/flash"
    # delete temporary files created during uncompression.
    deleteTmpFiles
    ...
    # Set in order to accept packets on interfaces only if the
    # destination in ip hdr matches ip address on the receiving interface
    ipStrongEnded = 1
    startJava

    copy "Info"
    Sun_Fire_Version_5.18.1-Build_01




    --
    Bruce Porter
    "The internet is a huge and diverse community but mainly friendly" http://ytc1.blogspot.co.uk/
    There *is* an alternative! http://www.openoffice.org/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Packard@21:1/5 to All on Wed Sep 27 14:57:10 2017
    FWIW, I applied the latest update, 114527-18, which
    patches ScApp, RTOS, and SC POST, and it still happens.
    I ran the highest POST level, mem2, and all tests pass.

    Regards, Scott

  • From Chris@21:1/5 to Scott Packard on Thu Sep 28 23:30:57 2017
    Might be trying to default boot from the network, which is
    why it's looking for a dhcp server. Looks like it sent out
    a request packet for that and got no reply. That is
    also settable from OBP console, or the eeprom command once
    the system is booted...

    Regards,

    Chris

  • From Chris@21:1/5 to Scott Packard on Thu Sep 28 23:15:41 2017
    There should be an option in the OBP to enable or disable
    autoboot from power on. From sc>, type console <return>,
    which gives you a console prompt. Then type help for info.

    Alternatively, boot the system and type eeprom <return>,
    to get current settings...
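
    On Solaris, eeprom prints one name=value line per OBP variable. A minimal Python sketch for pulling out the two settings relevant here, auto-boot? and boot-device. The sample text is hypothetical, not captured from the V1280 in this thread, and parse_eeprom is a made-up helper name.

```python
def parse_eeprom(text):
    """Turn 'name=value' lines into a dict; skip anything else."""
    settings = {}
    for raw in text.splitlines():
        name, sep, value = raw.strip().partition("=")
        if sep and name:
            settings[name] = value
    return settings


# HYPOTHETICAL sample output, for illustration only.
sample = """auto-boot?=true
boot-device=disk net
diag-level=min
"""

settings = parse_eeprom(sample)
print(settings["auto-boot?"], settings["boot-device"].split())
```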

    Regards,

    Chris

  • From YTC#1@21:1/5 to Chris on Fri Sep 29 10:30:18 2017
    Except by typing just boot it would try and boot from the same device
    (net) if that was the case.


  • From Chris@21:1/5 to All on Fri Sep 29 22:52:45 2017
    Have you tried boot disk <return>, from obp ?...

    Regards,

    Chris

  • From YTC#1@21:1/5 to Chris on Sat Sep 30 09:34:11 2017
    If you follow the OP's thread, you will see he has typed "boot" and it
    booted from disk.

    The system appears to halt at the LOM configuration of its network
    management port.

    He says he has disabled that, but the issue still exists.





  • From Chris@21:1/5 to All on Sat Sep 30 12:03:36 2017
    Why is this such hard work ? :-)...

    No, it's not booting from disk, according to the log shown. It's also not
    clear if the boot command has been tried at all from obp. If from sc, it
    will just reboot sc, as appears to be happening, not the system. If from
    the obp prompt, it will boot from the default device as setup in obp.

    Regards,

    Chris

  • From YTC#1@21:1/5 to Chris on Sat Sep 30 13:35:20 2017
    On 30/09/2017 13:03, Chris wrote:
    Why is this such hard work ? :-)...

    Because we are working from memory, and CBA plugging in any old servers
    with LOMs :-)



    No, it's not booting from disk, according to the log shown. It's also not
    clear if the boot command has been tried at all from obp. If from sc, it
    will just reboot sc, as appears to be happening, not the system. If from
    the obp prompt, it will boot from the default device as setup in obp.

    And I admit, I failed to notice he was typing boot at SC> , not OK> :-(

  • From Scott Packard@21:1/5 to Chris on Sat Sep 30 11:39:23 2017
    You guessed correct in an earlier post.
    I'm a few minutes earlier in the boot sequence than the lom> prompt, and
    the lom> prompt is a few minutes (or hours! in the case of mem2) earlier than the OBP's ok prompt.
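
    Since much of this thread turned on which prompt "boot" was typed at, here is a small Python sketch that maps a console line to the stage it belongs to. The ordering and descriptions paraphrase what the posters say in this thread, not Sun documentation, and identify_stage is a made-up helper name.

```python
# Stages as described in this thread, earliest first.
# ASSUMPTION: descriptions paraphrase the posters, not Sun documentation.
STAGES = [
    ("->", "VxWorks shell on the system controller (SC)"),
    ("lom>", "LOM shell, reached once the SC application is running"),
    ("ok", "OpenBoot PROM (OBP) prompt"),
]


def identify_stage(console_line):
    """Return (order, description) for a line that starts with a known prompt."""
    stripped = console_line.strip()
    for index, (prompt, description) in enumerate(STAGES):
        if stripped == prompt or stripped.startswith(prompt + " "):
            return index, description
    return None


for example in ("-> boot", "lom> resetsc", "ok boot disk"):
    print(example, "=>", identify_stage(example))
```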

    This -> prompt is something I've never seen before, and seems to be a VxWorks prompt. I Googled a little and the commands and syntax are a subset of VxWorks.
    Checking around in support.oracle.com, the only time they mention VxWorks is for the Net-Net 4250 that Acme Packet used to make.

    What my intuition tells me is the VxWorks was supposed to autoload a JVM and I'd guess a running JVM is what gives the lom> prompt. Something is interfering with the autoload, I'm guessing, and VxWorks drops to its command line.
    If I could guess what the defect is and fix it then I figure I'd never see this -> command line again.

    I thought applying the latest firmware patch would fix what was wrong with it, but it didn't. I can type boot, then it'll continue on, and I was able to stick a HDD into it and install Solaris 9.



    I guess this isn't a high-priority issue. What I heard is the repair of the TOD chip worked in the field unit (nobody mentioned they had the same drop to VxWorks that I've seen here, which is strange, but I'm getting the info 4th hand).

    I searched pretty hard and came across a Sun Fire™ Midrange Server Maintenance SM-340. That's interesting. It says the LOM shell runs under VxWorks, the system controller (SC) operating environment.

    I'm guessing I'd have to troubleshoot the VxWorks autoload process. I'll ask around next week to see if they were dropped to the -> prompt or not on the fielded unit.

    Regards, Scott

  • From Chris@21:1/5 to Scott Packard on Sat Sep 30 21:39:27 2017
    Scott,

    I haven't seen the VxWorks prompt either, but perhaps you need to set
    defaults in sc to reinitialise everything if you have done a firmware
    update. The other thing is that the firmware update / reflash may
    have failed or become corrupted. You could try reverting to the
    previous version that worked, or reflash the later version again.

    As for OBP, if you replace the battery, you need to do a set defaults
    there, in OBP as well. Fwir, OBP itself is Forth based, or used to be,
    so dunno where the Java comes from or what it's for. Have done a fair
    number of rebattery fixes here, as older machines are what I do for fun
    these days. You grind away the epoxy, but must also cut the tags
    going to the cells at the top of the package, to prevent discharge
    through the old cells. I also use larger cells epoxied to the top
    of the package. Should outlive the machine and perhaps me as well.

    Good to see it's working anyway. See man eeprom for info on the OBP
    settings from Solaris. Usenet still has its uses :-)...

    Regards,

    Chris

  • From Chris@21:1/5 to All on Sat Sep 30 21:44:55 2017
    Right, never had any sort of formal support here, so the firmware in
    use is what it arrives with, but seems to work ok. Oldest machine here
    is a Sun 3, newest is an M3000, both running, but not seriously in
    use. V245 is writing this post...

    Regards,

    Chris
