PDP-11/45: RK11 II

Mon 20 February 2017 by Fritz Mueller

Okay, moving on with the RK11-C debug, the following bit of test code is modeled after that part of the ZRKJE0 diagnotic that is trapping out:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
        177404                          RKCS=177404
000000                                  .ASECT
        001000                          .=1000
001000  012706  000770          START:  MOV     #770,SP         ;INIT STACK POINTER
001004  005000                          CLR     R0              ;INIT TRAP BASE
001006  012701  000002                  MOV     #2,R1           ;INIT TRAP DEST
001012  010120                  L1:     MOV     R1,(R0)+        ;STORE TRAP DEST
001014  005020                          CLR     (R0)+           ;AND STORE HALT AT TRAP DEST
001016  062701  000004                  ADD     #4,R1           ;UPDATE TRAP DEST
001022  020600                          CMP     SP,R0           ;ALL TRAPS INITD?
001024  001372                          BNE     L1              ;IF NOT, KEEP GOING
001026  005002                          CLR     R2              ;INIT MAIN ITERATION COUNT
001030  005202                  L2:     INC     R2              ;INC MAIN ITERATION COUNT
001032  010237  177570                  MOV     R2,@#177570     ;UPDATE DISPLAY REG
001036  012700  177404                  MOV     #RKCS,R0        ;GET RKCS ADDRESS
001042  012710  007560                  MOV     #7560,@R0       ;SET RESET CTRLR CMD
001046  005210                          INC     @R0             ;SET GO BIT
001050  005005                          CLR     R5              ;INIT CHECK COUNT
001052  105710                  L3:     TSTB    @R0             ;CHECK DONE BIT
001054  100765                          BMI     L2              ;IF SET, NEXT MAIN LOOP
001056  005205                          INC     R5              ;OTHERWISE INC CHECK COUNT
001060  001374                          BNE     L3              ;IF NOT EXPIRED GO CHECK AGAIN
001062  000000                          HALT                    ;OTHERWISE, HALT HERE
        001000                          .END    START

Running this code, the error is easily reproduced -- the machine traps on a bus timeout and halts after anywhere from a few dozen to a few hundred iterations. Put the logic analyzer on MSYN and SSYN at the back M105 address decode module on the RK11-C backplane and set up a trigger for long bus cycles, but surprisingly this was not triggering even though the processor was taking a trap 4. Verified that the trigger itself was working fine by accessing a non-existent memory location from the front panel. Hmmm...

Next step then was to move back to the CPU, and throw the UBC card out on extenders to get more visibility into the trap. A further surprise here -- the problem went away when the UBC was on the extender! I was able to run the test code above for hundreds of thousands of iterations without timeouts, and the original ZRKJE0 diagnostic ran for over half an hour this way.

Took the UBC back off the extender, and the problem re-occurred, so apparently not just a bad seat. Hooked the logic analyzer up to BUS A MSYN L, BUS A SSYN L, and UBCB TIMEOUT (1) H on the 11/45 backplane. With this, I was able to capture lots of traces of the failure mode, which looks like this:

Here a glitch on the timeout signal is clearly visible, even though the MSYN/SSYN interval is well under the bus timeout. The interesting thing is that bus cycles that result in a glitch all have a MSYN/SSYN interval of 568 nanosceconds, to within a nanosecond. Cycles with a slightly different interval do not timeout. This jibes with what I saw with the card extender also. As a further verification, replaced the M920 bus jumper I'd been using with a 2-foot BC11, and the problem disappeared again.

At this point, Don over on the VCFED forum pointed out that the M920 I had been using was discontinued early on due to negative effects on bus signal integrity, and was replaced with the M9202 (which itself contains 2 feet of BC11). The issue with the M920 is apprently that it provides so little separation that the connected loads appear to the bus overall as a single lumped load. The M9202 separates the loads on the bus to smear out reflections and ringing and avoid false triggers. I have tracked down an M9202 on eBay, and have also put an inexpensive digital storage scope on order so I can start to investigate signal integrity issues like this that are not apparent on a logic analyzer.