Problems running two or more MSCP controllers in the simulator #439

Open
davepl opened this issue Jan 7, 2025 · 10 comments

davepl commented Jan 7, 2025

  • Drives from second and third MSCP controller do not work under 211BSD, but do on real hardware

Expect: To be able to mount two MSCP controllers, each with 1-3 drives. I expect those drives to show up as ra0-ra3 for the first controller, and ra8-ra11 for the second controller, and so on. And so they do on my actual hardware, which I'm trying to mirror in the simulator. But in the simulator, only ra0-3 work, and ra8 produces IO errors.

For example, with this INI file, and any BSD image I can find, I notice three things:
(1) BSD autoconfig detects the controller CSRs at 17772150 and 17772154, and vectors of 154 and 254 per the dtab
(2) The KDA50 primary controller works as expected, as does the RL. All those disks are mountable
(3) The RQDX3 drives are present, but when BSD tries to access them, it gets an I/O error from the controller

set rq KDA50
set rq enabled address=17772150
set rq0 ra92 noautosize
att rq0 211bsd-root.dsk

set rqb RQDX3
set rqb enabled address=17772154
set rqb1 rx50 noautosize
att rqb1 rx50-01.dsk
set rqb2 rx50 noautosize
att rqb2 rx50-02.dsk

set rl enabled
set rl0 rl02 noautosize
att rl0 rl02-01.dsk
set rl1 rl02 noautosize
att rl1 rl02-02.dsk

... and I see this from dmesg during boot:
ra1: Ver 3 mod 3
ra9 st=3 sb=0 fl=0 en=9
ra1 st=3 sb=1 fl=0 en=9
rl1a is entire disk: no disk label
ra10 st=3 sb=0 fl=0 en=9
rl0a is entire disk: no disk label
ra8 st=3 sb=0 fl=0 en=9

  • the output of "sim> SHOW VERSION" while running the simulator which is having the issue

sim> show version
PDP-11 simulator Open SIMH V4.1-0 Current
Simulator Framework Capabilities:
32b data
32b addresses
Threaded Ethernet Packet transports:PCAP:VDE:NAT:UDP
Idle/Throttling support is available
Virtual Hard Disk (VHD) support
RAW disk and CD/DVD ROM support
Asynchronous I/O support (Lock free asynchronous event queue)
Asynchronous Clock support
FrontPanel API Version 12
Host Platform:
Compiler: GCC Apple LLVM 16.0.0 (clang-1600.0.26.4)
Simulator Compiled as C (Release Build) on Dec 16 2024 at 11:19:31
Build Tool: simh-makefile
Memory Access: Little Endian
Memory Pointer Size: 64 bits
Large File (>2GB) support
SDL Video support: SDL Version 2.30.10, PNG Version 1.6.44, zlib: (Compiled: 1.3.1, Runtime: 1.2.12)
No RegEx support for EXPECT commands
OS clock resolution: 1ms
Time taken by msleep(1): 1ms
Ethernet packet info: libpcap version 1.10.1
OS: Darwin M2MacPro.local 24.1.0 Darwin Kernel Version 24.1.0: Thu Oct 10 21:03:11 PDT 2024; root:xnu-11215.41.3~2/RELEASE_ARM64_T6020 arm64
Processor Name: Apple M2 Ultra
tar tool: bsdtar 3.5.3 - libarchive 3.5.3 zlib/1.2.12 liblzma/5.4.3 bz2lib/1.0.8
curl tool: curl 8.7.1 (x86_64-apple-darwin24.0) libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.62.0
git commit id: 2437b13
git commit time: 2024-09-04T20:10:48-0400

  • how you built the simulator or that you're using prebuilt binaries

I did not save the make output, but it completed without errors

  • the simulator configuration file (or commands) which were used when the problem occurred.

set cpu 4m
set tto 7b

set tq enabled
set tq TK50
attach tq0 bsd.tap

set dz lines=8 localhost:2323

#set dz disabled
#set vh enabled
#set vh address=17760500, vector=300, lines=8
#attach vh 5040

# RA92: Big friggin disk, or it was in the old days for DEC!
# If you want even more emulated space, just "set rq rauser=size_in_megabytes",
# but this is the largest DEC disk that existed in real hardware.

set rq KDA50
set rq enabled address=17772150
set rq0 ra92 noautosize
att rq0 211bsd-root.dsk

set rqb RQDX3
set rqb enabled address=17772154
set rqb1 rx50 noautosize
att rqb1 rx50-01.dsk
set rqb2 rx50 noautosize
att rqb2 rx50-02.dsk

set rl enabled
set rl0 rl02 noautosize
att rl0 rl02-01.dsk
set rl1 rl02 noautosize
att rl1 rl02-02.dsk

set xq type=delqa
set xq mac=00-55-56-01-02-26
attach xq en1

boot rq0

  • the expected behavior and the actual behavior

On the physical hardware with this configuration BSD will respond to "disklabel ra8" with the disk label.
On the simulator, BSD responds as follows:

disklabel ra8

ra8 st=3 sb=0 fl=0 en=9
disklabel: /dev//rra8a: Input/output error

The "st=3 sb=0..." line leads me to believe that BSD is trying to talk to the device, but is failing. But only on the simulator.
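The error lines can be pulled apart mechanically to compare units. This is only a rough sketch: the field names (st = status, sb = subcode, fl = flags, en = end code), the decimal radix, and the meanings attached to status 3 and its subcodes are my assumptions about the MSCP status encoding and should be checked against the MSCP specification and the 2.11BSD ra driver source. `parse_ra_error` is a hypothetical helper, not part of BSD or SIMH.

```python
import re

# Matches the driver error lines quoted above, e.g. "ra8 st=3 sb=0 fl=0 en=9".
LINE_RE = re.compile(r"^(ra\d+) st=(\d+) sb=(\d+) fl=(\d+) en=(\d+)$")

# ASSUMED meanings, not taken from this issue: verify against the MSCP spec.
ASSUMED_STATUS = {3: "unit offline"}
ASSUMED_OFFLINE_SUBCODE = {
    0: "unit unknown to this controller",
    1: "no volume mounted",
}


def parse_ra_error(line):
    """Return the fields of one error line, or None if it is another kind of line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # Radix is assumed to be decimal; the kernel may print something else.
    st, sb, fl, en = (int(v, 10) for v in m.groups()[1:])
    guess = ASSUMED_STATUS.get(st, "unknown status")
    if st == 3:
        guess += ": " + ASSUMED_OFFLINE_SUBCODE.get(sb, "unknown subcode")
    return {"dev": m.group(1), "st": st, "sb": sb, "fl": fl, "en": en, "guess": guess}


if __name__ == "__main__":
    for text in ("ra9 st=3 sb=0 fl=0 en=9",
                 "ra1 st=3 sb=1 fl=0 en=9",
                 "ra8 st=3 sb=0 fl=0 en=9"):
        print(parse_ra_error(text))
```

If those assumed meanings hold, ra1 (no image attached) and ra8 to ra10 fail with different subcodes, which would suggest the controller does not recognize the unit numbers BSD is asking for.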

During autoconfig you can see the controllers are detected and connected:

ra 0 csr 172150 vector 154 vectorset attached
ra 1 csr 172154 vector 254 vectorset attached

The kernel was compiled for two MSCP controllers and 5 MSCP disks. But I expect this reproduces with any working 211BSD image, like the one from the PiDP.

Devices info:

RQ address=17772150-17772153, vector=154, BR5, KDA50, 4 units
RQB address=17772154-17772157, vector=254, BR5, RQDX3, 4 units
RQC disabled
RQD disabled

sim> show rq
RQ address=17772150-17772153, vector=154, BR5, KDA50, 4 units
RQ0 1505MB, attached to 211bsd-root.dsk, write enabled
RA92, UNIT=0, noautosize
RAW format
RQ1 159MB, not attached, write enabled
RD54, UNIT=1, autosize
AUTO detect format
RQ2 159MB, not attached, write enabled
RD54, UNIT=2, autosize
AUTO detect format
RQ3 409KB, not attached, write enabled
RX50, UNIT=3, autosize
AUTO detect format
sim> show rqb
RQB address=17772154-17772157, vector=254, BR5, RQDX3, 4 units
RQB0 159MB, not attached, write enabled
RD54, UNIT=4, autosize
AUTO detect format
RQB1 409KB, attached to rx50-01.dsk, write enabled
RX50, UNIT=5, noautosize
RAW format
RQB2 409KB, attached to rx50-02.dsk, write enabled
RX50, UNIT=6, noautosize
RAW format
RQB3 159MB, not attached, write enabled
RD54, UNIT=7, autosize
AUTO detect format

  • you may also need to provide specific pointers to data files that may be necessary to demonstrate the problem

Member

pkoning2 commented Jan 7, 2025

Why would you expect unit 8 to work? As the output from "show rqb" indicates, the unit numbers assigned to the drives on that controller are 4 through 7. You can change that, but it's a bit messy because the MSCP emulation supports lots of drives per controller, though it defaults to 4. Those additional drives are assigned unit numbers 4 and up, so if you just try to set the unit number of rqb0 to 8 it complains that rqb8 already has that unit number.

The simplest is to go with the unit numbers SIMH assigns, so ra4 and up for the second controller.

Some day we need to fix the way unit numbers are set on MSCP controllers so those silly error messages don't appear.

Author

davepl commented Jan 7, 2025 via email

Author

davepl commented Jan 7, 2025 via email

Author

davepl commented Jan 7, 2025

One other thing I've observed that might be relevant: in the SHOW DEV and SHOW RQB output, you'll notice that RQB has NO VECTOR set, whereas BSD believes it to be (and confirms finding it at) 254.

I would expect that RQB shows 254 just as RQ shows 154, but it does not.

Member

pkoning2 commented Jan 7, 2025

MSCP vectors are programmable; 154 is the conventional first value but the real rule is "pick any free vector". BSD will have to program the vector as part of initializing the device, and until it does, "not set" would be the value I'd expect to see displayed.

If BSD wants to assume that the second controller has units 8 and up, you'll have to override that manually in the SIMH device config. As I said, it's a bit of a nuisance because of the invisible additional controller slots. Proceed as follows:
set rqb drives=12
set rqb8 unit=0
set rqb0 unit=8
set rqb9 unit=1
set rqb1 unit=9
set rqb10 unit=2
set rqb2 unit=10
set rqb11 unit=3
set rqb3 unit=11
set rqb drives=4

Now "show rqb" should tell you that its unit numbers are 8..11 and that should make the OS happier.

Author

davepl commented Jan 7, 2025 via email

Member

pkoning2 commented Jan 7, 2025

Oh wait... I was assuming that "ra8" means the MSCP driver will ask for unit 8 on the controller. That may not be true.
Looking at the definition of RAUNIT, if that is the unit number sent to the controller it obviously is a 3 bit value, so it can't be 8. You might do ls -l /dev/ra8 to see what the minor number is, to see what RAUNIT would be for that device.
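To make the 3-bit argument concrete, here is a minimal sketch of how a drive index such as the 8 in "ra8" could split into a controller number and a per-controller unit. The bit layout is purely an assumption for illustration (low 3 bits = unit sent to the controller, remaining bits = controller index); it is not copied from the 2.11BSD driver, and `split_drive_index` is a hypothetical name.

```python
UNIT_BITS = 3                      # assumed width of the per-controller unit field
UNIT_MASK = (1 << UNIT_BITS) - 1   # 0b111


def split_drive_index(n):
    """Split the N of 'raN' into (controller, unit) under the assumed layout."""
    return n >> UNIT_BITS, n & UNIT_MASK


if __name__ == "__main__":
    for n in (0, 3, 8, 11):
        ctrl, unit = split_drive_index(n)
        print(f"ra{n}: controller {ctrl}, unit {unit}")
```

Under that assumption ra8 through ra11 would reach the second controller as units 0 through 3, which is why a 3-bit field can never carry the value 8.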

I wonder if it's 0. If so, you don't need to do the magic dance I described earlier, just do:
set rqb0 unit=0
set rqb1 unit=1
set rqb2 unit=2
set rqb3 unit=3

That means you have two MSCP controllers with matching unit numbers, but that's fine hardware-wise. It is not allowed in some OSes (like RSTS) but is probably fine in others.

Author

davepl commented Jan 7, 2025 via email

Contributor

markpizz commented Jan 7, 2025

There are several different things going on here.

  1. The default unit numbers for the separate MSCP controllers (RQ, RQB, RQC and RQD) start at 0, 4, 8, and 12 respectively when built into the PDP11 simulator. This reflects that some of the PDP11 operating systems want unique unit numbers for each drive connected to the system. When these devices exist in a VAX simulator, the default unit numbers start at 0 FOR EACH of the RQ, RQB, RQC and RQD devices. The VAX operating systems (certainly starting with VMS) had different device names for drives connected to each controller (DUA, DUB, DUC and DUD), and therefore completely unique unit numbers don't add anything.
  2. Unit numbers on each of the RQ, RQB, RQC and RQD controllers must be unique within that controller. This corresponds to the original hardware with large drives, each of which had a unit plug with the same requirement. Smaller MSCP controllers (RQDX1, 2, 3) didn't have physical unit plugs, so the defaults reflect the behavior there. Each drive can have a logical unit number plug set in the simh setup commands with a "SET RQ(,B,C,D)n UNIT=plug-value" command, where the plug-value can be any currently unused value from 0 through 65534. Some operating systems can't tolerate large plug values.

Beyond the above defaults and setting mechanisms, EACH of the 4 MSCP controllers can have a minimum of 4 drives or a maximum of 254 drives connected. This maximum is set with a "SET RQ(,B,C,D) DRIVES=n" command. You then have drives known via simh commands as RQ(,B,C,D)n where n is from 0 up through the specified number of drives minus 1.
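The rules above can be summarized in a toy model. This is only a sketch of the behavior as described in this comment (PDP11 defaults of 0, 4, 8 and 12, per-controller uniqueness, plug values 0 through 65534, 4 to 254 drives); it is not SIMH code, the `Controller` class and `pdp11_defaults` function are hypothetical names, and it deliberately does not model what unit numbers drives beyond the first four receive.

```python
MIN_DRIVES, MAX_DRIVES = 4, 254   # drive count limits quoted above
MAX_PLUG = 65534                  # largest plug value quoted above


class Controller:
    """Toy model of one MSCP controller's unit plug table."""

    def __init__(self, name, first_unit, drives=MIN_DRIVES):
        if not MIN_DRIVES <= drives <= MAX_DRIVES:
            raise ValueError("drive count out of range")
        self.name = name
        # Assumed default: consecutive plugs starting at first_unit.
        self.units = [first_unit + i for i in range(drives)]

    def set_unit(self, drive, plug):
        """Model of 'SET <ctrl><drive> UNIT=<plug>' with the uniqueness check."""
        if not 0 <= plug <= MAX_PLUG:
            raise ValueError("plug value out of range")
        if plug in self.units and self.units[drive] != plug:
            raise ValueError(f"unit {plug} already in use on {self.name}")
        self.units[drive] = plug


def pdp11_defaults():
    """Four controllers with the PDP11 default starting units 0, 4, 8, 12."""
    names = ("RQ", "RQB", "RQC", "RQD")
    return {n: Controller(n, 4 * i) for i, n in enumerate(names)}


if __name__ == "__main__":
    ctrls = pdp11_defaults()
    print(ctrls["RQB"].units)     # matches UNIT=4..7 in the "show rqb" output
    ctrls["RQB"].set_unit(0, 0)   # allowed: 0 is unused on RQB
    print(ctrls["RQB"].units)
```

Because uniqueness is checked per controller, giving RQB0 the plug value 0 is accepted even though RQ0 also uses 0; a second drive on RQB asking for 0 would be rejected.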

Member

pkoning2 commented Jan 8, 2025

It seems we need some fixes to this machinery.

  1. The drive count per controller should either be fixed at what that model supports, or limited to no more than that.
  2. Drives beyond the (configured or fixed) limit should not exist at all, not be phantom drives that have unit numbers assigned to them which can cause conflicts.
  3. If drive count can be increased, any unit number conflicts at that time can be either auto-fixed or handled by disabling the offending later units.

Are there any MSCP controllers (other than HSC, which I don't think SIMH supports) that support more than 4 drives? If so, which ones and what are their limits?
