Mark6 troubleshooting

    General

    If you run into trouble with the Mark6 recorders, first try restarting the cplane and dplane services.

    You can download a service script (to /home/oper/bin) that reliably shuts down and restarts the services in the proper order. The script also displays the relevant log files (dplane-daemon.log and cplane-daemon.log), which can help you debug the issue.

    As user oper:

    m6service_restart
    

    If restarting the services does not resolve the issue, try rebooting the Mark6 machine.

    Procedure for replacing a broken module when running the schedule

    1. Stop the schedule (end M6_CC)
    2. log into the recorder and start da-client
      1. record=off
      2. group=close:1234
      3. group=unmount:1234
    3. remove the broken module
    4. insert the new module
    5. reboot the Mark6 machine (this step is necessary at the moment due to problems in Mark6 software 1.3c)
    6. in da-client (VERY IMPORTANT: Don't use the "new" option in the following mod_init commands)
      1. mod_init:1:8:{VSN1}:sg
      2. mod_init:2:8:{VSN2}:sg
      3. mod_init:3:8:{VSN3}:sg
      4. mod_init:4:8:{VSN4}:sg
      5. group=new:1234
      6. redefine/commit your input_streams
      7. group=open:1234
    7. restart the schedule

      This procedure has been verified to work at PV30M: the previously recorded scans remained readable, and recording on the new group of modules worked. Note that although list? no longer shows the previously recorded scans, they are still physically on the module disks.

    Host name

    The hostname as reported by the hostname command should not be fully qualified, e.g.

    > hostname
    recorder1                OK
    recorder1.iram.es        Not OK
    

    Fully qualified hostnames prevent communication between cplane and dplane. One symptom is that the recording state always remains "pending" even while data is being recorded.
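
The hostname check can also be scripted. The following is a minimal sketch and not part of the Mark6 software: the function name is_short_hostname is made up for illustration, and it simply treats any name containing a dot as fully qualified.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given host name is short (no dots),
# 1 if it is empty or looks fully qualified.
is_short_hostname() {
    case "$1" in
        "")  return 1 ;;   # empty name: treat as not OK
        *.*) return 1 ;;   # contains a dot, e.g. recorder1.iram.es
        *)   return 0 ;;   # e.g. recorder1
    esac
}

# Check the name reported by this machine.
if is_short_hostname "$(hostname)"; then
    echo "hostname OK"
else
    echo "hostname Not OK: configure the short name"
fi
```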

    Timezone

    The configured timezone of the Mark6 machine must be UTC; otherwise, scheduled recordings will not start.

    To check the timezone of the Mark6 machine, run:

    date
    > Tue Mar 17 08:12:54 UTC 2015
    

    Make sure that the timezone is UTC. If it is not, run (as root):

    dpkg-reconfigure tzdata
    > Etc > UTC
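
A scripted version of the timezone check might look like the following. It is only a sketch: is_utc is a hypothetical helper name, and it assumes that the zone abbreviation printed by date +%Z reads UTC when the machine is configured as described above.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given zone abbreviation is UTC.
is_utc() {
    [ "$1" = "UTC" ]
}

# %Z prints the abbreviation of the currently configured timezone.
zone=$(date +%Z)
if is_utc "$zone"; then
    echo "timezone OK: $zone"
else
    echo "timezone is $zone, expected UTC: run dpkg-reconfigure tzdata"
fi
```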
    

    Input_stream error

    Before you enter input streams, the group of disk modules has to be in a closed state. If the group is open, you may see the following problem when you commit the input_stream:

    <<  !mstat?0:0:1234:1:BHC%0029/48008/4/8:8:8:47991:48008:open:ready:sg:1234:2:BHC%0030/48008/4/8:8:8:47991:48008:open:ready:sg:1234:3:BHC%0031/48008/4/8:8:8:47991:48008:open:ready:sg:1234:4:BHC%0032/48008/4/8:8:8:47991:48008:open:ready:sg;
    >> input_stream=add:FILA10G-L:vdif:8224:50:42:eth3:172.16.3.1:0:12
    <<  !input_stream=0:0;
    >> input_stream=add:FILA10G-H:vdif:8224:50:42:eth5:172.16.5.1:0:34
    <<  !input_stream=0:0;
    >> input_stream=commit
    <<  'BHC%0031/48008/4/8'
    >>

    In this case, input_stream=commit returns a disk module status instead of the expected response.

    **If you issue an input_stream? query after hitting this bug, it may look as if the input_streams were committed.**

    Restart cplane and make sure the disks are in a closed state. The expected input_stream=commit response is:

    >> input_stream=add:FILA10G-L:vdif:8224:50:42:eth3:172.16.3.1:0:12
    <<  !input_stream=0:0;
    >> input_stream=add:FILA10G-H:vdif:8224:50:42:eth5:172.16.5.1:0:34
    <<  !input_stream=0:0;
    >> input_stream=commit
    <<  !input_stream=0:0;
    >>
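
When da-client sessions are scripted, the good and bad replies shown in the transcripts can be told apart with a simple string test. The helper names below (group_is_open, commit_ok) are hypothetical and not part of da-client. The sketch assumes the reply text is passed in exactly as printed in the transcripts, and that an open group appears as an :open: field in the mstat? reply, as in the example.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if an mstat? reply contains an ":open:" field.
group_is_open() {
    case "$1" in
        *:open:*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: return 0 only if the reply to input_stream=commit
# contains the expected "!input_stream=0:0;" string.
commit_ok() {
    case "$1" in
        *'!input_stream=0:0;'*) return 0 ;;
        *)                      return 1 ;;
    esac
}

# Example using the faulty reply from the first transcript.
reply="<<  'BHC%0031/48008/4/8'"
if commit_ok "$reply"; then
    echo "commit accepted"
else
    echo "unexpected reply: restart cplane and close the group first"
fi
```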
    

     

    cplane missing disk/diskmodule after diskmodule swap

    Occasionally a disk or disk module is not detected after new modules are swapped in.

    If the Mark6 OS doesn't see a disk, this may be due to a kernel bug triggered when swapping disks in and out.

    #log in as root
    su -l 
    # use fdisk -l command to see disks recognized by the Mark6 OS
    fdisk -l 
    .
    .
    .
    # There should be 33 disks (32 module disks plus the OS disk).
    # The last device listed should be /dev/sdag (sda through sdag = 33 disks).

    If fewer than 33 disks are seen, you may have to restart the Mark6 unit to clear the kernel bug.
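
Counting disks by eye in a long fdisk -l listing is error-prone, so the count can be sketched as a small filter. The function names and the EXPECTED_DISKS variable are assumptions for illustration. The pattern assumes that fdisk prints one line of the form "Disk /dev/sdX: ..." per disk.

```shell
#!/bin/sh
# Expected number of disks, including the OS disk (from the text above).
EXPECTED_DISKS=33

# Hypothetical helper: count "Disk /dev/sdX:" header lines read from stdin.
count_sd_disks() {
    grep -c '^Disk /dev/sd[a-z]*:'
}

# Hypothetical helper: read fdisk -l output from stdin and report whether
# the expected number of disks is present. Returns 1 on a mismatch.
check_disk_count() {
    found=$(count_sd_disks)
    if [ "$found" -eq "$EXPECTED_DISKS" ]; then
        echo "OK: $found disks seen"
    else
        echo "only $found of $EXPECTED_DISKS disks seen"
        return 1
    fi
}
```

As root, the check would be run as: fdisk -l 2>/dev/null | check_disk_count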

    If all disks are seen by the Mark6 OS but not by cplane, restart the dplane and cplane services:

    #log in as root
    su -l 
    # restart dplane and cplane services
    /etc/init.d/dplane restart
    /etc/init.d/cplane restart

    Attached file: m6service_restart (2.91 kB), attached 09:06, 19 Mar 2015 by Helge Rottmann.