Using the Serial Port Interface to Access the SDB "v" commands for EPC Micros

Last modified on Wednesday, July 26, 2006

The Problem

The Embedded PC (EPC) micro cards have an Ethernet interface which we have been trying to use with limited success. RMX has never had an embeddable TCP/IP stack, i.e. one that can be used in a nucleus-only environment like ours. Mark Crane therefore made many modifications to another stack, Fusion, so that it would run under RMX. This mostly works, but under heavy load in PR02 and PR04 the micro hangs completely after a relatively short period of running Ethernet. We think that both SLCnet and Ethernet become unavailable; at the very least, you can't use MD386 to see what's going on. A further obstacle to debugging is that the EPC BIOS trashes parts of memory on reset, so even a crash dump is mostly useless.

Therefore it would be nice to have a third way into the micro. Enter the serial port. In configuring the EPC card, we usurp the COM port interrupts but the COM2 port I/O addresses are still accessible. We have terminal servers by which we can access an EPC serial port remotely. We also have a callable interface to the RMX System Debugger (SDB) by which we can access all of the "v" or view commands to examine RMX data structures and information.

That's nice if the system is still functional enough to schedule a task that can respond to the user's commands and send the result to the serial port. The question is, how dead is it? What if it's hung in some ISR or interrupt task? What if RMX itself is not functional? The clock interrupt is the most basic element that must be operational for the system to be functional at all. In addition to the SDB commands, we need a way to access the system at the most basic level, so we also extend our hook into the Clock ISR.

The Solution

We take a two-pronged approach to providing a window into the dead system. When the EPC micro starts, a task called crashmain is created at the highest user (i.e. non-interrupt) task priority. It works as follows:

Do Forever
    Once/second see if there is an input char on the COM2 port
    If there is then
        read and echo chars until input buffer is full or a CR is entered
        send the entered command to SDB and get its output
        send the SDB output to the COM2 port
    end if
end do

It's about as simple as it gets. The SDB callable interface will give results back for any valid "v" command and reject any garbage input with a "syntax error". The "v" commands are documented in the iRMX System Debugger Reference manual. We have several copies here or you can see the online version at http://www.tenasys.com/irmx_manuals.htm
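
The loop above can be sketched as a small host-side simulation. This is only an illustration, not the crashmain source: FakeCom2, fake_sdb, poll_once and the 80-character buffer size are all hypothetical names and values chosen for the example, and the real task talks to the COM2 I/O addresses and the SDB callable interface instead.

```python
from collections import deque

MAX_CMD = 80  # assumed input buffer size; the real limit is not stated


class FakeCom2:
    """Stand-in for the COM2 port: a receive queue and a transmit log."""

    def __init__(self, typed):
        self.rx = deque(typed)
        self.tx = []

    def char_ready(self):
        return bool(self.rx)

    def read_char(self):
        return self.rx.popleft()

    def write(self, text):
        self.tx.append(text)


def fake_sdb(command):
    """Hypothetical SDB stub: accepts only commands that start with 'v'."""
    if command.startswith("v"):
        return "<output of %s>" % command
    return "syntax error"


def poll_once(port, sdb=fake_sdb):
    """One pass of the loop body; the real task sleeps 1 s between passes."""
    if not port.char_ready():
        return None
    buf = []
    # The real task would keep waiting for more characters; the simulation
    # simply stops when the queue is empty, the buffer is full, or CR arrives.
    while len(buf) < MAX_CMD and port.char_ready():
        ch = port.read_char()
        port.write(ch)  # echo every character, CR included
        if ch == "\r":
            break
        buf.append(ch)
    command = "".join(buf)
    port.write(sdb(command))  # send the SDB output back out the port
    return command
```

Calling poll_once(FakeCom2("vk\r")) echoes the three typed characters and then writes the stubbed SDB reply; garbage input such as "xx\r" produces "syntax error".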

Additionally we provide a routine in crashmain that is called by the Clock ISR. This routine initially just writes a 'C' character to the serial port every few seconds to show that the Clock is still running. The function of this routine can be expanded arbitrarily depending on the results of our initial debug efforts.
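
As a rough sketch of that heartbeat routine, the following counts clock ticks and emits a 'C' at a fixed interval. The tick rate, the 3-second interval and the Heartbeat class itself are assumptions made for illustration; the suspended flag models the fact that the heartbeat pauses while a command is being processed.

```python
TICKS_PER_SECOND = 100   # assumed clock tick rate
HEARTBEAT_SECONDS = 3    # assumed value for "every few seconds"


class Heartbeat:
    """Hypothetical model of the routine called from the Clock ISR."""

    def __init__(self, write, interval_ticks=TICKS_PER_SECOND * HEARTBEAT_SECONDS):
        self.write = write                # callable that outputs one character
        self.interval_ticks = interval_ticks
        self.ticks = 0
        self.suspended = False            # set while a command is in progress

    def on_clock_tick(self):
        self.ticks += 1
        if self.ticks >= self.interval_ticks:
            self.ticks = 0
            if not self.suspended:
                self.write("C")           # proof that the clock still runs
```

Keeping the routine to a counter and a single character write reflects the goal of doing as little as possible at interrupt level.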

The Usage

The first thing to do is to telnet into the terminal server and port connected to the micro you want to debug. The checkpr.com file in slccom: does the telnet for you based on the micro name, or you can do it manually. The names, ports, userids and passwords can change over time, so you may need to check with the appropriate people to get this information. Here's the current configuration:

Micro	Server Name	Port	Userid
LI34	b5as	4	pepii
PR04	tty04pep00	33	eoic
PR02	tty02pep00	1	eoic
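
For scripted use, the table above could be held as a small lookup keyed by micro name. This is a hypothetical helper, not a description of what checkpr.com does internally. The Port value is returned exactly as listed; how it maps to the TCP port passed to telnet is not stated here, so the sketch does not try to derive it.

```python
# Values copied from the configuration table; they can change over time.
TERMINAL_SERVERS = {
    "LI34": ("b5as", 4, "pepii"),
    "PR04": ("tty04pep00", 33, "eoic"),
    "PR02": ("tty02pep00", 1, "eoic"),
}


def lookup_micro(name):
    """Return the terminal server details for a micro (case-insensitive)."""
    server, port, userid = TERMINAL_SERVERS[name.upper()]
    return {"server": server, "port": port, "userid": userid}
```

An unknown micro name raises KeyError, which is probably what you want for a table that has to be kept current by hand.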

The following example is from a UNIX session connecting to terminal server b5as on port number 2005, which is used with LI34 in the test closet. The VMS command format is telnet b5as/port=2005. Your input is the telnet command, the username and the password.

rcs@slcsun1 $ telnet b5as 2005
Trying 134.79.48.219...
Connected to b5as.SLAC.Stanford.EDU.
Escape character is '^]'.

User Access Verification

Username: pepii
Password:
Password OK

So now you're in. The crashmain program doesn't spontaneously send anything and only scans for new input once/second to keep its system impact to a minimum. It echoes whatever you type, so when you enter the first character, wait for it to echo back before you enter the rest of the command. Once it detects the first character it scans for further characters at 100 Hz, so you can type as fast as you want. Press Enter when you've finished a valid command or just want to terminate your input. The output from SDB is then sent to the terminal.

A word about output speed. Again, to minimize system impact and to be able to use this program on the running system, characters are output only at 100 Hz so you'll need to be a bit patient and wait a few seconds for the command results.
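
To get a feel for the wait, the arithmetic is simply the number of output characters divided by 100. A minimal sketch, where output_seconds is a hypothetical helper and the 500-character reply is an illustrative size rather than a measured one:

```python
OUTPUT_RATE_HZ = 100  # characters per second, as stated in the text


def output_seconds(char_count, rate_hz=OUTPUT_RATE_HZ):
    """Approximate time to print char_count characters at rate_hz."""
    return char_count / rate_hz


# A reply of about 500 characters takes about 5 seconds to appear.
print(output_seconds(500))
```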

Another niggling detail is that the SDB output overwrites your input command because Enter just sends <CR> and the SDB output doesn't have a leading line feed. If you want to see the command you typed as well as the output, you can type <CTRL J> which is a line feed before you type Enter. Cheap trick but it works.
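
If you drive the session from a script rather than by hand, the typing rules above (first character, wait for the echo, then the rest, with an optional line feed before the CR) might look like the sketch below. The send and wait_for_echo callables are hypothetical; they stand for whatever telnet or socket wrapper you use and are not part of any existing tool here.

```python
def send_command(send, wait_for_echo, command, keep_command_visible=True):
    """Send one command the way the text describes typing it.

    send(data) transmits a string; wait_for_echo(ch) blocks until ch has
    been echoed back. Both are supplied by the caller (assumed interfaces).
    """
    if not command:
        raise ValueError("empty command")
    first, rest = command[0], command[1:]
    send(first)
    wait_for_echo(first)   # crashmain only notices new input once per second
    tail = rest
    if keep_command_visible:
        tail += "\n"       # Ctrl-J (line feed) so the output starts on a new line
    tail += "\r"           # Enter sends a bare CR
    send(tail)
```

With keep_command_visible=False the line feed is omitted and the SDB output overwrites the command, exactly as when you just press Enter.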

Here's some sample output. The vk command displays all ready and sleeping tasks. The leading and trailing 'C's are the heartbeat of the clock ISR. When a command is entered, the heartbeat is suspended until the command is complete.

CCCCCCCCvk
Ready tasks:    15d0    0268

Sleeping tasks: 0270    0fa8    1058    1070    10a0    10b0    10e8    1100
                       1110    1128    1168    1258    1268    1290    12c0    12e8
                       13b8    13e0    1408    1430    1458    1480    14a8    14d0
                       14f8    1520    1548    1570    1598    15c0    1718    1be8
                       1bf8    1c10    1d38    1d50    1d60    1f10    2108    2660
                       2b58    2b68    2d70    3138    3630    3640    3848    3c10
                       4190    41a0    43a8
CCCC
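
If you capture a session to a file, output like the above can be split into ready and sleeping task tokens with a few lines of code. This is a sketch based only on the sample shown: parse_vk is a hypothetical helper, and it assumes tokens are four lowercase hex digits and that heartbeat lines consist solely of 'C' characters.

```python
import re

TOKEN = re.compile(r"\b[0-9a-f]{4}\b")  # assumed token format, per the sample


def parse_vk(text):
    """Return {'ready': [...], 'sleeping': [...]} of task tokens."""
    tasks = {"ready": [], "sleeping": []}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"C"}:
            continue  # clock heartbeat, not part of the SDB output
        if stripped.startswith("Ready tasks:"):
            section, stripped = "ready", stripped[len("Ready tasks:"):]
        elif stripped.startswith("Sleeping tasks:"):
            section, stripped = "sleeping", stripped[len("Sleeping tasks:"):]
        if section is None:
            continue  # still in the echoed command or leading heartbeat
        tasks[section].extend(TOKEN.findall(stripped))
    return tasks
```

Continuation lines carry no label, so they are attributed to whichever section was seen last.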

The vd command displays a job's object directory; 258 is the root job.

vd 258
Directory size: 00c8    Entries used: 0049

ATM_LI34LOOP  2b68
ACT_LI34LOOP  2d70
SlcnetPorts   1128
BitbusInt     2108
MD386         1150
CTL_LI34LOP2  3630
ATM_LI34LOP2  3640
CTL_LI34LOP3  4190
ATM_LI34LOP3  41a0
MES_LI34LOP2  3138
ACT_LI34LOP2  3848
MES_LI34LOP3  3c10
ACT_LI34LOP3  43a8
CRATTASK      13e0
STATTASK      14a8
FBCKTASK      1520
CRATMAIN      13c8
STATMAIN      1490
MICROMAIN     1020
MSG_TASK      1290
FBCKMAIN      1508
MBEMAIN       15a8
MSGMAIN       1278
CRASHMAIN     15d0
ANLGTASK      14d0
CAMVTASK      13b8
TIMETASK      1408
ANLGMAIN      14b8
MSGBWT        12c0
DBMAIN        12d0
CAMVMAIN      13a0
ERRMAIL       1268
RQMONITOR     0fb0
ERRMTIM       1258
MD386Login    1168
RQSYSINFO     0fb8
DBS_TASK      12e8
EEPROINT      1070
FNSTIMER      1058
TCPeSERV      10b0
PORT_REQUEST  1118
NET_REQUEST   10f0
TIMEMAIN      13f0
HTTPSERV      10a0
MGNTMAIN      1418
BPMOMAIN      1440
TESTMAIN      1468
SLCnetIntTsk  10e8
KLYSMAIN      14e0
MCOM_MAIN     1530
MPSCMAIN      1558
BCOMMAIN      1580
TESTTASK      1480
MPSCTASK      1570
SLCnetDriver  10d0
SLCnetServer  1100
SLCnetTimer   1110
KISTTASK      1548
MBE_TASK      15c0
KLYSTASK      14f8
BPMOTASK      1458
MGNTTASK      1430
BCOMTASK      1598
MBCDEXER      1718
TIMERRSND     1be8
TIMENONMIS    1bf8
TIM360HZ      1c10
BPMPROC       1d38
BPMFDBK       1d50
BPTO          1d60
TIMDWNLD      1f10
MES_LI34LOOP  2660
CTL_LI34LOOP  2b58
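
A directory listing like this is handy for turning the bare tokens reported by other "v" commands back into names. The sketch below assumes the layout of the sample above (one name and one four-digit lowercase hex token per line); parse_vd and names_for_token are hypothetical helpers.

```python
import re

HEX4 = re.compile(r"[0-9a-f]{4}")  # assumed token format, per the sample


def parse_vd(text):
    """Return {catalogued name: token} from captured vd output."""
    directory = {}
    for line in text.splitlines():
        parts = line.split()
        # Entry lines have exactly two fields; the echoed command and the
        # "Directory size" header do not fit that pattern and are skipped.
        if len(parts) == 2 and HEX4.fullmatch(parts[1]):
            directory[parts[0]] = parts[1]
    return directory


def names_for_token(directory, token):
    """Return every name catalogued for the given token, sorted."""
    return sorted(name for name, tok in directory.items() if tok == token)
```

For example, token 15d0, which the vk sample lists under "Ready tasks", is catalogued here as CRASHMAIN.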

See http://www.slac.stanford.edu/grp/cd/soft/pepii/slaconly/how-to/use_term.html for the control sequences required to exit telnet, which vary depending on how you connected to the terminal server.

That's all there is to it. Have fun in the new Millennium.