Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hard sector support to infrastructure and HFE #412

Closed
wants to merge 2 commits into from

Conversation

ejona86
Copy link
Collaborator

@ejona86 ejona86 commented Jan 2, 2021

Alternative to #401. This drops the VGI support that it used. I have VGI read support with this version in my own repo but removed it because it did not support writes and it seemed there was a preference to go with the general-purpose HFE instead of many custom image formats. It proved to be very helpful during development though.

This is more invasive than #401, although the infrastructure enhancements seem fair. I considered many The HFE changes were the trickier ones.

There are two FIXMEs in the code commenting out code for fake_fired. The code was causing trouble, and it was unclear what the intention was since it was in the index_suppression == FALSE case. I assume that will be resolved before merging.

HFE index pulses do not have to come at the beginning of the track, so normal pulses are disabled when custom pulses are being used. Syncing the custom pulses immediately when they are changed is required because HFE tracks don't have to be aligned with each other and because seeking ignores opcodes which causes temporary skew.

The index pulses with this change and index-suppression=no are very consistent during reading/seeking. I still have not tested with real hardware, but pulses during writes with all three write-drain settings behave as expected and realtime and eot are promising for actual hardware compatibility.

The hard-sectored HFE files that I could find were clearly recordings and not all that pretty. I did test them and they did work, but I also created my own 16 sector files. I'll post the script I used to generate it to #400.

Fixes #393 #400

CC @teiram

@keirf
Copy link
Owner

keirf commented Jan 2, 2021

Okay, seems a good approach. I think polishing too much without testing on real hardware is risking making more work: The design decisions taken are not validated, and this means not only may it not work off the bat, but you may also be striving to replicate real behaviour that doesn't actually matter in practice.

Consider indexes. In soft-sectored FDC, the index is used only for:

  1. Laying down a new track (runs index to index, this will be a WriteTrack or FormatTrack command)
  2. Timing out sector ID searches (see two indexes means we've seen at least one full pass over the track)

Hence later drives suppress index during seek, and FlashFloppy extends that to other times it is busy (specifically writeback to Flash). Some hosts however use index for another purpose:
3. Simulate /READY by checking for regular INDEX.

This was first seen on BBC B with original single-density controller, and actually hasn't been seen much elsewhere. Here we must have pulses not delayed by more than a few % than expected or READY is deasserted and in-progress host operations fail. Unfortunately keeping properly regular index pulses can't work: 2 is in conflict with 3 and we can end up timing out host operations because an index pulse was generated while RDATA still inactive. Hence the index-suppression=no shenanigans: Issue index pulses in the 'shadow' of seek and write commands. Maximises time to seek or writeback.

On Micropolis and other hard-sectored systems, I wonder whether index pulses are used for anything more than incrementing and clearing the sector id? Because if not, is there any reason not to run with the defaults:

track-change = instant
write-drain = instant
index-suppression = yes

This is by very far the most common FlashFloppy behaviour, and you would save yourself pain if you can roll with it on HS systems.

Another thing I need to check is that HFEv3 images generated by HxC tools always explicitly issue Index opcodes, and don't require an implicit index at track start. This is simply not something I checked when I hacked in HFEv3 support. Apart from that, I suspect a bunch of your HFEv3 logic is going to be good to pull in to master.

@keirf
Copy link
Owner

keirf commented Jan 2, 2021

Ah, of course I forgot that at least the extra index hole preceding track start must be time-critical I'm guessing. This gets hard I think, in practice. You don't want to delay index pulses, or at least not the pair leading into track start. But if you don't, and hardware sees index pulses counting in a sector it wants, and FlashFloppy is busy for 100+ms writing back a sector to Flash, and the sector you need is not cached, what happens?

You will really want to stress test on hard-sector hardware, reads and writes, as it will be all too easy to make a 95% solution. 100% may require hardware more suited to the real-time, like Pi + hat. Such that you can cache the image in RAM, and plenty of cores to bitbang, manage drive emulation, manage host UI, writeback, etc. That would be a fun project. One I've considered myself.

EDIT:
Perhaps I should give more details on my future thoughts for FlashFloppy. The basic firmware on STM32F1 is pretty much baked. There isn't much Flash space left and, within its pretty generous parameters of use, it works well and as plug'n'play as possible. I'd say the biggest potential features for it now are improving the buffer cache (stream both heads; write-back on seek), and maybe two-drive emulation. Neither of these actually matter for I'd guess 99% of users.

To substantially improve emulation or add big new features (snazzy new UI perhaps?) would need different hardware. One thought is to go incremental on what we have already. I have used F7 MCU for Greaseweazle, and I could make a solution based on that with more Flash, RAM, and pins. But it's still incremental, someone has to build the boards, and they still won;t come with the 3.5"-sized casing of a Gotek. Probably this solution falls between two stools: plug'n'play good-enough Gotek, and perfect emulation of a completely different approach like Pi. Looking at say Pi1541, there is an open solution that could be cribbed from, some FlashFloppy code could be pulled in, the hat could be largely passive perhaps with level shifting/buffering. And it's more fun ;)

I guess whether this sort of thing is of interest to you does depend how well you get hard sectoring going on a real host.

EDIT2:
Ordered a brace of Pi 4 2GB boards. I'll set one up for PXE and have a play :)

@teiram
Copy link

teiram commented Jan 2, 2021

I was able to test the changes on my Northstar Advantage with the images provided in #400 and the computer boots from the Gotek properly. Really nice!
There were some access errors though I was not yet able to look into.

I'm also not sure if I'm able to create proper hard-sector HFEv3 images. I've tried with HxcFloppyEmulator to convert NSI images to HFEv3 but they don't even boot. Not sure if the hard sector information is being properly encoded. Example:

WORDSTAR_NSI_2.zip

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 3, 2021

I think polishing too much without testing on real hardware is risking making more work

That's fair. This can sit until I or others are getting comfortable with it on real hardware. My access to hardware is off-and-on, so I was doing things I can now to give me more time to do other things with the hardware.

Ah, of course I forgot that at least the extra index hole preceding track start must be time-critical I'm guessing.

Yeah, that's the issue. For "normal" sectors we should be able to extend the time between the pulses. But for the last sector we can only extend the time in the last half; we need to keep the double pulse in the first half of the last sector. We also can't go around generating fake pulses, as they will either be interpreted as a sector or index pulse and both cause problems; that's why I disabled fake_fired.

But if you don't, and hardware sees index pulses counting in a sector it wants, and FlashFloppy is busy for 100+ms writing back a sector to Flash, and the sector you need is not cached, what happens?

instant has trouble. Let's assume this is a problem for realtime. eot is still on the table. eot repeats the last sector pulses so the controller would be properly synchronized on resume, as long as it didn't want the immediately-next sector.

Going back to realtime. Reads seem easy, as if we just drop-out the read flux the OS will retry. And between writes there should generally be a read (although that wouldn't be the case for systems with sector skew). So I think this is mostly a concern for large contiguous writes. But those would be absorbed by the 32 K/500 ms buffer dedicated to write_bc. And there's still the option of just using better flash. Once the worst-case flash write is within 10-20 ms, it becomes less of a big deal although still not worry-free.

Another option is to make the writeback asynchronous. Assuming I figured out a way that doesn't cause havoc to the code (ha!), it would open options. Since write_bc consumption could continue in parallel to writeback the write_bc buffer wouldn't need to be near as large. We could then store the entire track in memory, although HD HFE might be tight. IMG files have a space advantage here, as they don't have the 2x factor due to MFM nor the 1.2-1.5x factor for formatting overhead. If push comes to shove, I'd at least give this a serious look.

I have used F7 MCU for Greaseweazle, and I could make a solution based on that with more Flash, RAM, and pins.

Flash we can get on the F105 and Flash and RAM we can also get on the F4x1. I need to dig in more where all the pins have gone to, but it seems there's still some to spare. If we need more pins, then it seems any of the options work but we have to go to a larger package. The neat thing about F7 to me is high-speed USB, otherwise I don't see much different to F1/F4.

A Pi hat sounds fine. Doesn't fully excite me because I see it as not much more than a really big RAM chip and I feel like "surely there is another way." But it could be a small departure from what is here. It's probably easiest to treat the Pi mostly as a storage device, but dropping the FAT stuff. The Pi could do format conversion, so maybe every format but HFE could be dropped. The Pi would control the process, choosing the images and the like, but that's mostly just a "disk changed" signal. Could initially team it up with a stm32f401 black pill.

I guess whether this sort of thing is of interest to you does depend how well you get hard sectoring going on a real host.

I'm really not all that pessimistic about it; there seem to be a lot of options still available. But I do have some other hopes and dreams where it may become useful (*cough* st506 *cough*), but that's still too early to tell.

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 3, 2021

@teiram, make sure to include index-suppression=no in ff.cfg. If you do any writes, you'll want to change write-drain to either "realtime" or "eot"; you'll just have to play with it. If you have a logic analyzer it'd be great to get a recording of the write access pattern, but I assume you don't. Since Northstar uses a 10 sector disk eot may not be quite right. It's important that the restart position includes all of the last track, which is 20 ms long. You'll want to increase the eot restart position to, say, 30 ms. (21 ms would be enough, but their's little benefit in pushing the envelope.)

There were some access errors though I was not yet able to look into.
I'm also not sure if I'm able to create proper hard-sector HFEv3 images.

It looks mostly right. II see problems when reading WORDSTAR_NSI_2.hfe on my end, but that applies also to northstar_nst.hfe. The index pulses can sometimes get wonky; I didn't retest northstar very hard after I got the micropolis file worked out. I'm seeing pulse skews over 2 ms, which is a ton, where I expect there are none in the HFE file. If they were less than 2 ms they would merge together and not cause a problem. Both northstar and wordstar files have lots of ops that set the bitrate which can cause seeking alignment problems. Seeing those ops makes a lot of sense (I had been wondering how changing tracks worked with bitrates...) but there's only enough of them to cause 200 us of skew. So this needs further investigation.

I will say that the northstar_nst.hfe has sector pulses nicely aligned with the start of the sector preface except for the first sector. In the wordstar file all the sector pulses are in the middle of the preface. But that is a secondary issue; without the right pulses things will be messed up.

@keirf
Copy link
Owner

keirf commented Jan 3, 2021

I don't think HD HFE is likely to succumb to full caching, as ~50kB would be incredibly tight, especially since we would still need some ring buffering. But just about everything else is plausible, except perhaps ED-rate IMG, but that could be okay treated side-at-a-time. Whereas everything else could prefetch/stream both heads for current cylinder, writeback on change of cylinder, etc. So yes that's where I'd like to go. Thinking about how to get the asynchrony, the cleanest would be a simple process scheduler and move the Flash access into a process behind an extended buffer-cache interface. Probably. Trying to think how to do it using interrupts only gets confusing.

@hharte
Copy link
Contributor

hharte commented Jan 3, 2021

Thanks for your work in this, I’m enjoying the discussion. I’d be interested to use this on both a Northstar Horizon and a Vector MZ. One request I’d like to make, if new hardware is being considered is to add a 2SIDE output for FlashFloppy to allow better compatibility with 8” controllers. I’m about to start imaging a bunch of Vector/Micropolis 16-hard-sector floppies. I may be able to try the hard sector firmware with FlashFloppy at that time.

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 3, 2021

I don't think HD HFE is likely to succumb to full caching, as ~50kB would be incredibly tight

I was thinking of HD formats at 360 RPM, which are ~42K. Yeah, I agree HD at 300 RPM formats seem out-of-reach for HFE and full-track buffering. And ED wasn't on my mind at all; that would be demanding.

@teiram
Copy link

teiram commented Jan 3, 2021

@teiram, make sure to include index-suppression=no in ff.cfg. If you do any writes, you'll want to change write-drain to either "realtime" or "eot"; you'll just have to play with it. If you have a logic analyzer it'd be great to get a recording of the write access pattern, but I assume you don't. Since Northstar uses a 10 sector disk eot may not be quite right. It's important that the restart position includes all of the last track, which is 20 ms long. You'll want to increase the eot restart position to, say, 30 ms. (21 ms would be enough, but their's little benefit in pushing the envelope.)

Thanks a lot for your support and hints. I will try disabling the index-supression and see what is the outcome.
I do have a logic analyzer and can try to get the signals during a write attempt. What signals do you think would be relevant to check?

It looks mostly right. II see problems when reading WORDSTAR_NSI_2.hfe on my end, but that applies also to northstar_nst.hfe. The index pulses can sometimes get wonky; I didn't retest northstar very hard after I got the micropolis file worked out. I'm seeing pulse skews over 2 ms, which is a ton, where I expect there are none in the HFE file. If they were less than 2 ms they would merge together and not cause a problem. Both northstar and wordstar files have lots of ops that set the bitrate which can cause seeking alignment problems. Seeing those ops makes a lot of sense (I had been wondering how changing tracks worked with bitrates...) but there's only enough of them to cause 200 us of skew. So this needs further investigation.

This wordstar image was generated directly from an NSI file with HxCFloppyEmulator. My understanding here (maybe wrong) is that the timing for the index pulses should be the same for every NSI image, since the format is fixed and only holds the data itself. I don't really know how the images that are working were generated though.

I will say that the northstar_nst.hfe has sector pulses nicely aligned with the start of the sector preface except for the first sector. In the wordstar file all the sector pulses are in the middle of the preface. But that is a secondary issue; without the right pulses things will be messed up.

Is there any software to easily check to the generated HFE files, timings, index pulses and such stuff?

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 3, 2021

I do have a logic analyzer and can try to get the signals during a write attempt. What signals do you think would be relevant to check?

@teiram, I've been using:

pin signal
8 IDX
18 DIR (not essential)
20 STP
22 WD (less important than WG, but still helpful)
24 WG
26 TK0 (not essential)
30 RD
32 SS

My understanding here (maybe wrong) is that the timing for the index pulses should be the same for every NSI image, since the format is fixed and only holds the data itself.

Ideally. Although I'm seeing sectors (padding) start before the sector pulses, which means the data and sector pulses are mis-aligned. I'm not familiar with NSI files, but I assume those are like IMG files which means it is HxCFloppyEmulator putting in the indexes at a not-quite-ideal spot.

Is there any software to easily check to the generated HFE files, timings, index pulses and such stuff?

HxCFloppyEmulator's "Disk view mode" is useful to verify the tracks are aligned. But it doesn't include the index pulses from what I've seen. I've just been viewing the file with a hex editor. If you search for 0x8F those are the index pulses. The track data starts at offset 0x400 and every 256 bytes alternate between the sides.

Here's a section from wordstar_sni_2.hfe. You can see 8f is in the middle of a bunch of Us:

00000e00: 2222 2222 2222 2222 2222 2222 2222 2222  """"""""""""""""
00000e10: 2222 2222 224f 1222 2222 2222 2222 2222  """""O."""""""""
00000e20: 2222 2222 2222 2222 2222 2222 2222 2222  """"""""""""""""
00000e30: 2222 2222 2222 2222 2222 2255 5555 5555  """""""""""UUUUU
00000e40: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e50: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e60: 5555 5555 5555 5555 8f55 5555 5555 5555  UUUUUUUU.UUUUUUU # index
00000e70: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e80: 5555 aaa2 5495 4aa5 8888 aaa8 4aa5 528a  UU..T.J.....J.R.
00000e90: aaa8 4aa5 9494 aaa8 4f12 4aa5 94a2 aaa8  ..J.....O.J.....
00000ea0: 4aa5 aaaa aa22 4aa5 5225 aaa8 a84a aa8a  J...."J.R%...J..
00000eb0: 5495 5495 4a92 5455 5555 4aa5 a2a4 2aaa  T.T.J.TUUUJ...*.
00000ec0: 5455 a9aa 5455 5555 2995 5289 5455 5555  TU..TUUU).R.TUUU
00000ed0: 29a2 aa52 25aa aa28 5555 5555 4a52 5555  )..R%..(UUUUJRUU
00000ee0: 5555 5555 5555 aaaa aaaa aaaa aaaa aaaa  UUUUUU..........
00000ef0: aaaa aaaa aaaa aaaa aaaa 2452 49a5 2492  ..........$RI.$.

We'd expect it to be nearer where the data swaps from " to U. It doesn't have to be exact. What's most striking is that the 8f is after the start of the Us. We'd expect it to be before, as that's how the controller knows it is safe to start writing those Us. Here's from the northstar_nst.hfe file:

00000e00: 2222 2222 2222 2222 2222 2222 2222 2222  """"""""""""""""
00000e10: 2222 2222 224f 1222 2222 2222 2222 2222  """""O."""""""""
00000e20: 2222 2222 2222 2222 2222 2222 2222 2222  """"""""""""""""
00000e30: 2222 2222 2222 2222 228f 5555 5555 5555  """"""""".UUUUUU # index
00000e40: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e50: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e60: 5555 5555 5555 5555 5555 5555 5555 5555  UUUUUUUUUUUUUUUU
00000e70: 5555 5555 5555 5555 5555 5555 aaa2 5495  UUUUUUUUUUUU..T.
00000e80: 4aa5 28a2 2a22 4aa5 a2a8 2a22 4aa5 24a9  J.(.*"J...*"J.$.
00000e90: 2a22 4aa5 a494 2a22 4f12 4aa5 2852 2a2a  *"J...*"O.J.(R**
00000ea0: 4aa5 92a2 2a22 a94a aa54 5595 5495 4a92  J...*".J.TU.T.J.
00000eb0: 5455 5555 4aa5 a228 2a25 5555 a9aa 5455  TUUUJ..(*%UU..TU
00000ec0: 5555 2995 5289 5455 5555 5252 2aa2 4895  UU).R.TUUURR*.H.
00000ed0: 2a92 5455 5555 5555 4a52 5555 5555 5555  *.TUUUUUJRUUUUUU
00000ee0: 5555 5555 5555 5555 a9aa 5425 5555 5555  UUUUUUUU..T%UUUU
00000ef0: 5555 5555 5555 5555 a9aa 5425 2aaa 5455  UUUUUUUU..T%*.TU

Those 4f12 bytes set the bitrate, which I had mentioned I hadn't noticed they were so plentiful. Note the " and U is bit-level data, so you may see a different character.

Again though, this index problem isn't a big problem by itself. It just reduces our tolerances and so may make other problems stand out more.

@teiram
Copy link

teiram commented Jan 3, 2021

Hi @ejona86. Thanks a lot for the info. I think you're right and there is some sort of missalignment in the images generated by HxcFloppyEmulator from NSI files. I so far was sampling the attempt to boot of the Northstar Advantage for a working image nortstar_nst.hfe and a non-working one (my wordstar_sni_2.hfe). The working one reads from track0 and then seems to jump to track2, where the CP/M directory is located:

image

For the wordstar one, seems that the readings from track0 are not properly done, because there is no attempt to jump to any other track and the computer returns to the boot system prompt. It's also impossible to read anything from this image even swapping to this disk after a proper CP/M boot with a working disk image.

image

I'm pretty sure the NSI file just contains the sector data. As example the wordstar one I was using:
WORDSTAR.zip
It is 350Kbytes in size, what matches the size of 35 tracks and 2 sides with 10 512 byte sectors per track.

I will try to figure out how HxcFloppyEmulator does the conversion from NSI to HFE and why those sector marks seem to be misplaced.

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 4, 2021

I figured it out! The horrible alignment is due to the setbitrate opcodes. I had forgotten a factor of 8x since the opcode contains 8 bits and a factor of 2x since there's two bytes for that specific opcode. I estimated around 100 setbitrate opcodes; that can contribute 3.2 ms of skew! Puzzle solved.

At some point we'll probably need to enhance the HFE support to seek more accurately. Until then, we can just remove the useless setbitrate opcodes, and reindex if we want to as well. I've done so for your wordstar image; try it out: WORDSTAR_NSI_2_reopcoded.zip. I suggest using the "nosetbitrate" file which preserves the original index opcodes, but if have trouble try the "nosetbitrate_reindexed" to compare. (Edit: My script fails to update the track size, so the images are sort of dubious. I don't think they cause much trouble with the earlier version, but they behave poorly with my recent "automatically compensate" commit. Edit2: Script updated to fix track size.)

Script I used. Pass HFE file as stdin, it will output HFE file to stdout. Probably breaks on Windows. You can change strip_index as you want. start_offset and end_offset would ideally be 0, and using 0 still improves the image, but those values gave very good results for the specific image you provided. The sector size was clearly 1252 bytes, but the track size was longer than 12520 bytes and the offsets account for that. Dunno if the image should be trimmed some or if the first or last sector just happen to be longer on the real hardware. (Edit: updated script to fixup track length, although that's only necessary if stripping setbitrate.)

#!/bin/env python3
import math
import sys


def split_sides(bs):
    s1 = bytearray()
    s2 = bytearray()
    for pos in range(0, len(bs), 512):
        s1.extend(bs[pos:pos+256])
        s2.extend(bs[pos+256:pos+512])
    return s1, s2


def join_sides(s1, s2):
    track = bytearray()
    for pos in range(0, len(s1), 256):
        track.extend(s1[pos:pos+256])
        track.extend(s2[pos:pos+256])
    return track


def strip_opcodes(bs, bitrate=True, index=True):
    obs = bytearray()
    i = 0
    while i < len(bs):
        b = bs[i]
        if b == 0x4F and bitrate:
            i += 2
            continue
        if b == 0x8F and index:
            i += 1
            continue
        obs.append(b)
        i += 1
    return obs


def reformat_side(bs):
    obs = strip_opcodes(bs, bitrate=strip_bitrate, index=strip_index)

    if strip_index:
        start_offset = strip_bitrate and 64 or 66
        end_offset = strip_bitrate and -80 or -85
        sector_len = (len(obs) - start_offset - end_offset)/sector_count
        obs.insert(len(obs) - int(sector_len)//2 - end_offset, 0x8F)
        for sector in range(sector_count-1, -1, -1):
            obs.insert(start_offset + int(sector_len * sector), 0x8F)

    return obs


sector_count = 10
strip_bitrate = True
strip_index = False
f = sys.stdout.buffer
s = sys.stdin.buffer

offset = 0
header = s.read(512)
offset += 1
f.write(header)
track_count = header[9]
track_list_offset = (header[19] << 8) | header[18]
assert track_list_offset == 1

tracklist = bytearray(s.read(512))
offset += 1

tracks = []
for track_num in range(track_count):
    tracklist_entry = tracklist[track_num*4:track_num*4+4]
    track_offset = (tracklist_entry[1] << 8) | tracklist_entry[0]
    assert offset <= track_offset
    if offset < track_offset:
        f.read(512 * (track_offset - offset))
    orig_track_size = ((tracklist_entry[3] << 8) | tracklist_entry[2]) // 2
    orig_track_cap = math.ceil(orig_track_size / 256) * 256
    track = s.read(2*orig_track_cap)
    offset += orig_track_cap / 256
    (s1, s2) = split_sides(track)
    s1 = reformat_side(s1[:orig_track_size])
    s2 = reformat_side(s2[:orig_track_size])
    if len(s1) != len(s2):
        print("Warning: side length mismatch", file=sys.stderr)
    track_size = max(len(s1), len(s2))
    s1.extend(b'\x0F' * (orig_track_cap - len(s1)))
    s2.extend(b'\x0F' * (orig_track_cap - len(s2)))
    tracklist_entry[2] = (track_size*2) & 0xFF
    tracklist_entry[3] = ((track_size*2) >> 8) & 0xFF
    tracklist[track_num*4:track_num*4+4] = tracklist_entry
    tracks.append(join_sides(s1, s2))

f.write(tracklist)
for track in tracks:
    f.write(track)
f.flush()

This reduces the worst-case observed skew from 3 ms to 50 us. There is a
risk that in the future an opcode is added that is numerous and not
evenly distributed, but that seems unlikely and we'd need to swap to a
more complicated/advanced scheme.
@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 4, 2021

@teiram, I just pushed a commit that seems to work well for unmodified HFE files that contain many setbitrate opcodes. But it behaves poorly with the HFE files generated from my script, because my script didn't update the track length.

@keirf
Copy link
Owner

keirf commented Jan 4, 2021

Backing up a little, is it worthwhile trying to handle 'poor' HFE images? I mean, a good HFEv3 image should keep the two sides in pretty good sync. That's one thing the skip opcode is for aiui: to allow a side which has "got behind" to catch up?

Specifically looking at hard sectored HFEv3, the set-bitrate opcodes are surely superfluous as I'm sure the tracks are constant rate. Would it make more sense, if recordings of real disks are wanted, to clean them up with an external script:

  1. Remove all opcodes and carve each cylinder up by index opcode into sectors
  2. Match pairs of sectors, add a skip opcode at the end of each sector to make the sectors on each side match exactly in length
  3. Reconstruct cylinder, inserting index opcodes at the exact same offsets on both sides.

Being able to assume your HFE is clean has got to be a nice simplification right?

EDIT: I haven't played with HFEv3 much myself though, to be fair. So I'm not up on what all the snags can be.

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 4, 2021

Backing up a little, is it worthwhile trying to handle 'poor' HFE images?

I don't think they are worth spending a considerable amount of effort. heath_kf.hfe, for example, I've mostly ignored because the tracks are clearly misaligned.

I mean, a good HFEv3 image should keep the two sides in pretty good sync. That's one thing the skip opcode is for aiui: to allow a side which has "got behind" to catch up?

Yes, it should. The no-op opcode is for keeping the sides in sync. An opcode on one side should always be mirrored by an opcode on the other (same opcode, or a noop). The skipbtis opcode is just to handle non-divisible-by-8 bitcell counts. There's no "catching up" involved; the two sides go in lock-step, modulo less than 8 bitcells in some cases.

That said, different tracks don't have to align as well. But I'm fine assuming they do.

Specifically looking at hard sectored HFEv3, the set-bitrate opcodes are surely superfluous as I'm sure the tracks are constant rate.

From what I can infer, the point of the frequent set-bitrate opcodes is for track changes. If any one track is a different bitrate then when you swap to/from that track you need to learn of the different bitrate quickly. But yes, in these images all the tracks have the same rate. So while the many bitrate ops are extraneous, it doesn't really make the HFE image "poor." I'll also note that the only reason these opcodes are a problem is because we use ticks for setup_track and not bc; if HFE was the core format for the system these opcodes would be less of a problem (not to say we should make any changes there, but that HFE is not the only source of the problem).

But I'm quite practical here: stripping out the bitrates with that python script is not that bad of a solution in my mind for hard sector support. (Other than I need to fix it so the track lengths are semi-correct.) But I didn't see this setbitrate issue as being solely a hard sector problem; I think it could cause trouble during writing with soft sectors as well.

Would it make more sense, if recordings of real disks are wanted

To be clear, the images I'm messing with here have perfectly aligned tracks. The skews and issues were due to our processing of the file (mainly seeking with hfe_setup_track), not the file itself. If the tracks need alignment, I'm fine telling someone to do that manually with the HxC software.

EDIT: I haven't played with HFEv3 much myself though, to be fair. So I'm not up on what all the snags can be.

Definitely the same for me. I originally was not wild about HFEv3. But it has been growing on me as I've been figuring out its mental model.

@teiram
Copy link

teiram commented Jan 4, 2021

I figured it out! The horrible alignment is due to the setbitrate opcodes. I had forgotten a factor of 8x since the opcode contains 8 bits and a factor of 2x since there's two bytes for that specific opcode. I estimated around 100 setbitrate opcodes; that can contribute 3.2 ms of skew! Puzzle solved.

At some point we'll probably need to enhance the HFE support to seek more accurately. Until then, we can just remove the useless setbitrate opcodes, and reindex if we want to as well. I've done so for your wordstar image; try it out: WORDSTAR_NSI_2_reopcoded.zip. I suggest using the "nosetbitrate" file which preserves the original index opcodes, but if have trouble try the "nosetbitrate_reindexed" to compare.~ (Edit: My script fails to update the track size, so the images are sort of dubious. I don't think they cause much trouble with the earlier version, but they behave poorly with my recent "automatically compensate" commit.)

Very nice. I've just tested the provided images. The nosetbitrate doesn't work either but the nosetbitrate_reindexed mostly does: I was able to boot with that disk and run wordstar. There is an error while loading that doesn't happen on retrying the operation.

I will next give a try to your new commit with the original HxcFloppyEmulator generated hfe files and will provide feedback.

@teiram
Copy link

teiram commented Jan 4, 2021

@teiram, I just pushed a commit that seems to work well for unmodified HFE files that contain many setbitrate opcodes. But it behaves poorly with the HFE files generated from my script, because my script didn't update the track length.

Just checked this. Results are different in the following way:

  • Reference images still work.
  • The images generated with HxcFloppyEmulator don't work. Some of them nearly boot but then I get some hardware failure error.
  • The wordstar images filtered with your script work more or less like before. The nosetbitrate doesn't boot, the nosetbitrate-reindexed does, but I'm getting more errors that get sorted out on retry.

@keirf
Copy link
Owner

keirf commented Jan 4, 2021

From what I can infer, the point of the frequent set-bitrate opcodes is for track changes. If any one track is a different bitrate then when you swap to/from that track you need to learn of the different bitrate quickly. But yes, in these images all the tracks have the same rate. So while the many bitrate ops are extraneous, it doesn't really make the HFE image "poor." I'll also note that the only reason these opcodes are a problem is because we use ticks for setup_track and not bc; if HFE was the core format for the system these opcodes would be less of a problem (not to say we should make any changes there, but that HFE is not the only source of the problem).

But I'm quite practical here: stripping out the bitrates with that python script is not that bad of a solution in my mind for hard sector support. (Other than I need to fix it so the track lengths are semi-correct.) But I didn't see this setbitrate issue as being solely a hard sector problem; I think it could cause trouble during writing with soft sectors as well.

Yes it's a shame there's not a per-track default bitrate. Still, for nearly all platforms, bitrate is consistent across and within all tracks, so the single bitrate reported at the top of the image is good.

I think the original reason for set-bitrate was to support variable rate protection tracks. For example Copylock and Speedlock on Commodore Amiga. And I also think the HxC HFEv3 converter is a bit over keen to spew set-bitrate opcodes, at least when presented with a raw dump.

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 4, 2021

@teiram wrote:

The wordstar images filtered with your script work more or less like before. The nosetbitrate doesn't boot, the nosetbitrate-reindexed does, but I'm getting more errors that get sorted out on retry.

That flakiness is probably caused by the length bug my script had, since retry fixes it. It looks like "reindexed" is really all you need.

I think I've fixed my script, and re-generated the HFE files. This time there's also a plain "reindexed" which leaves the setbitrate in-place. With my latest commit it shouldn't matter whether the setbitrates are present or not. WORDSTAR_NSI_2_reopcoded2.zip

@teiram
Copy link

teiram commented Jan 5, 2021

@teiram wrote:

The wordstar images filtered with your script work more or less like before. The nosetbitrate doesn't boot, the nosetbitrate-reindexed does, but I'm getting more errors that get sorted out on retry.

That flakiness is probably caused by the length bug my script had, since retry fixes it. It looks like "reindexed" is really all you need.

I think I've fixed my script, and re-generated the HFE files. This time there's also a plain "reindexed" which leaves the setbitrate in-place. With my latest commit it shouldn't matter whether the setbitrates are present or not. WORDSTAR_NSI_2_reopcoded2.zip

The plain reindexed one works perfectly for reading. Nice work!
Could you provide the updated script so that I can test other HFE files?
Regarding writing, I did the following modification:
image

But I'm getting errors on write attempts. Seems that some of the index pulses are missing:

image

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 5, 2021

@teiram, that's great news! I updated the script in the comment I posted it. For the missing index pulses during write, have you set write-drain in ff.cfg like I mentioned before?

@teiram
Copy link

teiram commented Jan 5, 2021

@teiram, that's great news! I updated the script in the comment I posted it. For the missing index pulses during write, have you set write-drain in ff.cfg like I mentioned before?

Forgot that part, sorry. :)
Tried again with write-drain = realtime and still getting the error. Signals are now different though:

image

After the first WGATE activation timing of the index pulses is not consistent. Normally the time between pulses is 18ms but here we have one of 16.96ms, then 18, 20 and seems that there are either many pulses for this track spin or something weird has happened?

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 10, 2021

@teiram, I've been able to reproduce similar behavior. What I'm seeing seems to be triggered by slow reads immediately following writes. Reads normally take 2 ms for me, but after some writes they take 10-20 ms. The code expects all the reading and processing at that point to be within 10 ms. It's possible that inaccurate stk_per_rev could also contribute to a similar failure, but making stk_per_rev more accurate didn't significantly change the behavior I saw.

It's possible to work around that by increasing two time values (e.g., to 25 ms and 30 ms for my system). But that will cause all reads to be delayed and you have to know how slow these rare reads are. This is probably a better workaround for now:

diff --git a/src/floppy.c b/src/floppy.c
index d6f7abb..0ca54fe 100644
--- a/src/floppy.c
+++ b/src/floppy.c
@@ -365,6 +365,9 @@ static void floppy_sync_flux(void)
     if (!drv->index_suppressed) {
         ticks = time_diff(time_now(), sync_time) - time_us(1);
         if (ticks > time_ms(15)) {
+            printk("Early. Retrying\n");
+            dma_rd->state = DMA_inactive;
+            return;
             /* Too long to wait. Immediately re-sync index timing. */
             drv->index_suppressed = TRUE;
             printk("Trk %u: skip %ums\n",
@@ -378,6 +381,9 @@ static void floppy_sync_flux(void)
             /* If we're out of sync then forcibly re-sync index timing. */
             ticks = time_diff(time_now(), sync_time);
             if (ticks < -100) {
+                printk("Late. Retrying\n");
+                dma_rd->state = DMA_inactive;
+                return;
                 drv->index_suppressed = TRUE;
                 printk("Trk %u: late %uus\n",
                        drv->image->cur_track, -ticks/time_us(1));

If you still see weird timing, you can try this as well. This is more of a bug fix instead of a workaround, but I'm not making it in a commit because I've not been able to see any difference in behavior:

diff --git a/src/image/hfe.c b/src/image/hfe.c
index a23e32e..01a75f8 100644
--- a/src/image/hfe.c
+++ b/src/image/hfe.c
@@ -271,6 +271,7 @@ static uint16_t hfe_rdata_flux(struct image *im, uint16_t *tbuf, uint16_t nr)
         if (im->cur_bc >= im->tracklen_bc) {
             ASSERT(im->cur_bc == im->tracklen_bc);
             im->tracklen_ticks = im->cur_ticks;
+            im->stk_per_rev = stk_sysclk(im->tracklen_ticks / 16);
             im->cur_bc = im->cur_ticks = 0;
             /* Skip tail of current 256-byte block. */
             bc_c = (bc_c + 256*8-1) & ~(256*8-1);

@teiram
Copy link

teiram commented Jan 12, 2021

Hello @ejona86
Thanks for the workarounds. I applied the first patch and seeems that writes still fail but on retry the operation completes. Unfortunately there are issues with the written sectors (some seem to be missing). The second fix seems to not change the behaviour at all for this concern.

I didn't have the chance yet to look at the signals. I was just able to do this test:

  • Copy a file with PIP: PIP WS2.COM=WS.COM. Here the Northstar throws an error but on retrying the operation it eventually success.
  • Try to execute WS2.COM under CP/M. Fails.
  • Reimport back the file to NSI and extract the WS2.COM to check its content. Seems that the first 6 sectors are missing in the file. So it seems that they directory entry was properly written and there were failures during the other writes.

I have also noticed that even the reindexed images with the more up to date script fail sometimes while reading. It happens less than before but still there.

@teiram
Copy link

teiram commented Jan 19, 2021

Was trying to build with debug=y and seems it doesn't fit anymore:

make[3]: Leaving directory '/mnt/c/Users/mteira/Source/flashfloppy-hard-sectors/bootloader/usb'
CPP Bootloader.ld
LD Bootloader.elf
/usr/lib/gcc/arm-none-eabi/7.3.1/../../../arm-none-eabi/bin/ld: Bootloader.elf section `.text' will not fit in region `FLASH'
/usr/lib/gcc/arm-none-eabi/7.3.1/../../../arm-none-eabi/bin/ld: region `FLASH' overflowed by 908 bytes
collect2: error: ld returned 1 exit status

Any hint about what to change to reduce those 908 bytes? I was trying to tune it a bit and try to remove some font, but no luck. Seems the size of the bootloader is always the same.

@keirf
Copy link
Owner

keirf commented Jan 19, 2021

Size of bootloader shouldn't have grown at all. If it has it's probably a mistake / something dumb and should be easy to claw back.

@teiram
Copy link

teiram commented Jan 19, 2021

Just tried in master and same behavior compiling with:
make debug=y

@keirf
Copy link
Owner

keirf commented Jan 19, 2021

Debug build, Ubuntu 20.04 LTS:

viper:FlashFloppy$ ls -l bootloader/Bootloader.bin 
-rw-rw-r-- 1 keir keir 30328 Jan 19 19:58 bootloader/Bootloader.bin
viper:FlashFloppy$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (15:9-2019-q4-0ubuntu1) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599]

@teiram
Copy link

teiram commented Jan 19, 2021

Debian GNU Linux 10:

mteira@carbon:~/src/flashfloppy-keirf$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (15:7-2018-q2-6) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907]

@keirf
Copy link
Owner

keirf commented Jan 19, 2021

I guess it doesn't build on that one then.

@teiram
Copy link

teiram commented Jan 22, 2021

Well, this is really weird. I have tried on Debian 10 and got that failure with the mentioned version of the compiler.
Then updated debian to testing what updated the compiler to (15:8-2019-q3-1+b1) 8.3.1 20190703 (release) [gcc-8-branch revision 273027]. same error.
Then installed Ubuntu 20.04 (note that I'm running under WSL) what got me compiler version (15:9-2019-q4-0ubuntu1) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599] and surprisingly same error.

Finally installed Ubuntu 20.04 on a virtual machine. Did:

git clone https://github.com/keirf/FlashFloppy.git ff-test
cd ff-test
make debug=y

Same error

@keirf
Copy link
Owner

keirf commented Jan 22, 2021

How interesting. If I debug=y make then it works. If I make debug=y as you do, it fails. So do the former instead ;) I'll have to investigate why the latter doesn't work. I'm sure it should!

@keirf
Copy link
Owner

keirf commented Jan 22, 2021

Oh it'll be because I override debug=y when building bootloader precisely because it otherwise is too large. But debug=y on command line (after MAKE) overrides environment. So I need to update the makefile to put variables overrides after MAKE on submake invocations.

@teiram
Copy link

teiram commented Jan 22, 2021

Thanks @keirf. That did it. :)

Regarding the write issues, I was able to get this serial log on a write operation. After this log the OS returns an error. Maybe it helps to troubleshoot it:

** FlashFloppy v3.22 for Gotek
** Keir Fraser <[email protected]>
** https://github.com/keirf/FlashFloppy

Build: Jan 22 2021 18:55:35
Board: Standard
Config: Flash Slot 0 (ver 2, size 69)
Interface: Shugart (pin2=chg, pin34=rdy)
I2C: OLED found at 0x3c
OLED: SSD1306
Display: LCD/OLED

> USBH_USR_Init
> USBH_USR_ResetDevice
> USBH_USR_DeviceAttached
> USBH_USR_ResetDevice
> USBH_USR_DeviceSpeedDetected
> Device speed: Full
> USBH_USR_DeviceDescAvailable
 VID : 0204
 PID : 6025
> USBH_USR_DeviceAddressAssigned
> USBH_USR_ConfigurationDescAvailable
> Class connected: 08 (MSC)
 Manufacturer : USB2.0
 Product : Flash Disk
 Serial Number : 091602003A7D7B03
> USBH_USR_EnumerationDone
> USBH_USR_UserInput
Cache 104 items
0:F: 'WORDSTAR_NSI.hfe'
Cache 102 items
Mode: Native
Current slot: 26/28
Name: 'WORDSTAR_NSI' Type: hfe
Attr: 20 Clus: 00000c50 Size: 897024
Fast Seek: 1 frags
Cache 27 items
Write 0-4095 (4096)... 49700 us
Write 12800-16895 (4096)... 10452 us
Early. Retrying
Write 5120-9215 (4096)... 7722 us
Early. Retrying
Write 17920-22015 (4096)... 12180 us
Early. Retrying
Write 5120-9215 (4096)... 7734 us
Early. Retrying
Write 17920-22015 (4096)... 12044 us
Early. Retrying
Write 7680-11775 (4096)... 8664 us
Early. Retrying
Write 20480-24575 (4096)... 13003 us
Late. Retrying
Write 10240-14335 (4096)... 9528 us
Early. Retrying
Write 23040-25599 (2560)... 11936 us
Write 0-4095 (4096)... 51272 us
Early. Retrying
Write 0-4095 (4096)... 14117 us
Early. Retrying
Write 12800-16895 (4096)... 19258 us
Early. Retrying
Write 2560-6655 (4096)... 15475 us
Early. Retrying
Write 15360-19455 (4096)... 19760 us
Early. Retrying
Write 5120-9215 (4096)... 16358 us
Early. Retrying
Write 17920-22015 (4096)... 20820 us
Early. Retrying
Write 0-4095 (4096)... 22809 us
Write 7680-11775 (4096)... 30230 us
Write 0-4095 (4096)... 22626 us
Early. Retrying
Write 0-4095 (4096)... 22643 us
Early. Retrying

@teiram
Copy link

teiram commented Jan 23, 2021

I'm also adding a signal capture of a write error using the patches provided by @ejona86. Seems that sometimes the index pulse comes still too late:

image

Could this be an issue of the pendrive? Can it affect the generation of index pulses or are USB writes and pulse generation detached?

@ejona86
Copy link
Collaborator Author

ejona86 commented Jan 25, 2021

@teiram Oh! You have debug logs now! Great! That gives me a few more options.

That is a lot of "Early"s. Something is clearly broken there. I've suspected your troubles may be caused by the staggered writes caused by sector skew, but I've not hacked something together to produce similar results. But the writes in your recording look fine; I guess the retries are working.

I have still seen some issues on my end with messed up pulses, and had trouble tracking it down. It looks similar to what you see: the reads pretty far after a write gets screwy. I've tried to get rid of every place that changes the start of the track and am missing something. IIRC, I think I've seen the issue with my custom-made Micropolis HFE image, so at least some issue isn't related to setbitrate op, but I still think the op is asking for trouble and has some kinks to be worked out.

@keirf
Copy link
Owner

keirf commented Mar 2, 2021

Should we leave this open for now? As is it's not going to get pulled. I assume at some point it will be moved atop of the new asyncio work. But there's good discussion here, made less visible if we close it.

@ejona86
Copy link
Collaborator Author

ejona86 commented Mar 2, 2021

I'm fine either way. I'd like to pull out the writing fixes to observe HFE v3 opcodes and some of the misc v3 opcode handling fixes merged in the short-term. Basically anything that is useful outside the scope of hard sectors, which should be most things other than index opcode handling. Then, yeah, at some point I'll loop back to this and see how well the hard sector stuff works on top of asyncio (in either this PR, or a new one).

My goal is get the immediately-useful stuff ready and merged and stable before swapping back to the speculative stuff (like hard sectors). Hopefully it won't be too long though.

@keirf keirf closed this Apr 25, 2021
@ejona86 ejona86 deleted the hfe-hard-sectors branch June 13, 2021 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Micropolis support and hard sectors
4 participants