Add hard sector support to infrastructure and HFE #412
Okay, seems a good approach. I think polishing too much without testing on real hardware risks making more work: the design decisions taken are not validated, which means not only that it may not work off the bat, but also that you may be striving to replicate real behaviour that doesn't actually matter in practice. Consider indexes. In a soft-sectored FDC, the index is used only for:
Hence later drives suppress index during seek, and FlashFloppy extends that to other times it is busy (specifically writeback to Flash). Some hosts however use index for another purpose: This was first seen on BBC B with original single-density controller, and actually hasn't been seen much elsewhere. Here we must have pulses not delayed by more than a few % than expected or READY is deasserted and in-progress host operations fail. Unfortunately keeping properly regular index pulses can't work: 2 is in conflict with 3 and we can end up timing out host operations because an index pulse was generated while RDATA still inactive. Hence the On Micropolis and other hard-sectored systems, I wonder whether index pulses are used for anything more than incrementing and clearing the sector id? Because if not, is there any reason not to run with the defaults:
This is by far the most common FlashFloppy behaviour, and you would save yourself pain if you can roll with it on HS systems. Another thing I need to check is that HFEv3 images generated by HxC tools always explicitly issue Index opcodes, and don't require an implicit index at track start. This is simply not something I checked when I hacked in HFEv3 support. Apart from that, I suspect a bunch of your HFEv3 logic is going to be good to pull in to master. |
Ah, of course I forgot that at least the extra index hole preceding track start must be time-critical I'm guessing. This gets hard I think, in practice. You don't want to delay index pulses, or at least not the pair leading into track start. But if you don't, and hardware sees index pulses counting in a sector it wants, and FlashFloppy is busy for 100+ms writing back a sector to Flash, and the sector you need is not cached, what happens? You will really want to stress test on hard-sector hardware, reads and writes, as it will be all too easy to make a 95% solution. 100% may require hardware more suited to the real-time, like Pi + hat. Such that you can cache the image in RAM, and plenty of cores to bitbang, manage drive emulation, manage host UI, writeback, etc. That would be a fun project. One I've considered myself. EDIT: To substantially improve emulation or add big new features (snazzy new UI perhaps?) would need different hardware. One thought is to go incremental on what we have already. I have used F7 MCU for Greaseweazle, and I could make a solution based on that with more Flash, RAM, and pins. But it's still incremental, someone has to build the boards, and they still won't come with the 3.5"-sized casing of a Gotek. Probably this solution falls between two stools: plug'n'play good-enough Gotek, and perfect emulation of a completely different approach like Pi. Looking at say Pi1541, there is an open solution that could be cribbed from, some FlashFloppy code could be pulled in, the hat could be largely passive perhaps with level shifting/buffering. And it's more fun ;) I guess whether this sort of thing is of interest to you does depend how well you get hard sectoring going on a real host. EDIT2: |
I was able to test the changes on my Northstar Advantage with the images provided in #400 and the computer boots from the Gotek properly. Really nice! I'm also not sure if I'm able to create proper hard-sector HFEv3 images. I've tried with HxcFloppyEmulator to convert NSI images to HFEv3 but they don't even boot. Not sure if the hard sector information is being properly encoded. Example: |
That's fair. This can sit until I or others are getting comfortable with it on real hardware. My access to hardware is off-and-on, so I was doing things I can now to give me more time to do other things with the hardware.
Yeah, that's the issue. For "normal" sectors we should be able to extend the time between the pulses. But for the last sector we can only extend the time in the last half; we need to keep the double pulse in the first half of the last sector. We also can't go around generating fake pulses, as they will either be interpreted as a sector or index pulse and both cause problems; that's why I disabled
Going back to Another option is to make the writeback asynchronous. Assuming I figured out a way that doesn't cause havoc to the code (ha!), it would open options. Since
Flash we can get on the F105 and Flash and RAM we can also get on the F4x1. I need to dig in more where all the pins have gone to, but it seems there's still some to spare. If we need more pins, then it seems any of the options work but we have to go to a larger package. The neat thing about F7 to me is high-speed USB, otherwise I don't see much different to F1/F4. A Pi hat sounds fine. Doesn't fully excite me because I see it as not much more than a really big RAM chip and I feel like "surely there is another way." But it could be a small departure from what is here. It's probably easiest to treat the Pi mostly as a storage device, but dropping the FAT stuff. The Pi could do format conversion, so maybe every format but HFE could be dropped. The Pi would control the process, choosing the images and the like, but that's mostly just a "disk changed" signal. Could initially team it up with a stm32f401 black pill.
I'm really not all that pessimistic about it; there seem to be a lot of options still available. But I do have some other hopes and dreams where it may become useful (*cough* st506 *cough*), but that's still too early to tell. |
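The constraint discussed above (ordinary sector pulses may be stretched, but the pulse pair inside the last sector may not) is easier to see with the pulse layout written out. This is a minimal sketch, assuming a 200 ms revolution (300 RPM), evenly spaced sector holes, and an index hole midway through the last sector, which is where the conversion script later in this thread places the index opcode. The function names and parameters are hypothetical and are not FlashFloppy code.

```python
def hard_sector_pulses(sector_count, rev_ms=200.0):
    """Return sorted (time_ms, kind) pulses for one revolution.

    Assumption: sector holes are evenly spaced and the single index hole
    sits halfway through the last sector.
    """
    sector_ms = rev_ms / sector_count
    pulses = [(n * sector_ms, "sector") for n in range(sector_count)]
    # Index hole: midway between the last sector pulse and the end of the track.
    pulses.append((rev_ms - sector_ms / 2, "index"))
    return sorted(pulses)


def min_gap_ms(pulses, rev_ms=200.0):
    """Smallest spacing between consecutive pulses, including wrap-around."""
    times = [t for t, _ in pulses]
    gaps = [b - a for a, b in zip(times, times[1:])]
    gaps.append(rev_ms - times[-1] + times[0])
    return min(gaps)


if __name__ == "__main__":
    for t, kind in hard_sector_pulses(10):
        print(f"{t:7.2f} ms  {kind}")
```

With 10 sectors the ordinary gaps are a full sector time, while the two gaps around the index pulse are half that, so any delay there has half the margin.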
@teiram, make sure to include
It looks mostly right. I see problems when reading WORDSTAR_NSI_2.hfe on my end, but that applies also to northstar_nst.hfe. The index pulses can sometimes get wonky; I didn't retest northstar very hard after I got the micropolis file worked out. I'm seeing pulse skews over 2 ms, which is a ton, where I expect there are none in the HFE file. If they were less than 2 ms they would merge together and not cause a problem. Both northstar and wordstar files have lots of ops that set the bitrate which can cause seeking alignment problems. Seeing those ops makes a lot of sense (I had been wondering how changing tracks worked with bitrates...) but there's only enough of them to cause 200 us of skew. So this needs further investigation. I will say that the northstar_nst.hfe has sector pulses nicely aligned with the start of the sector preface except for the first sector. In the wordstar file all the sector pulses are in the middle of the preface. But that is a secondary issue; without the right pulses things will be messed up. |
I don't think HD HFE is likely to succumb to full caching, as ~50kB would be incredibly tight, especially since we would still need some ring buffering. But just about everything else is plausible, except perhaps ED-rate IMG, but that could be okay treated side-at-a-time. Whereas everything else could prefetch/stream both heads for current cylinder, writeback on change of cylinder, etc. So yes that's where I'd like to go. Thinking about how to get the asynchrony, the cleanest would be a simple process scheduler and move the Flash access into a process behind an extended buffer-cache interface. Probably. Trying to think how to do it using interrupts only gets confusing. |
Thanks for your work in this, I’m enjoying the discussion. I’d be interested to use this on both a Northstar Horizon and a Vector MZ. One request I’d like to make, if new hardware is being considered is to add a 2SIDE output for FlashFloppy to allow better compatibility with 8” controllers. I’m about to start imaging a bunch of Vector/Micropolis 16-hard-sector floppies. I may be able to try the hard sector firmware with FlashFloppy at that time. |
I was thinking of HD formats at 360 RPM, which are ~42K. Yeah, I agree HD at 300 RPM formats seem out-of-reach for HFE and full-track buffering. And ED wasn't on my mind at all; that would be demanding. |
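The ~50 kB and ~42K figures above follow from the raw bitcell count per revolution. This is a rough sketch, assuming HD MFM at 1,000,000 bitcells per second (500 kbit/s data rate) and one stored bit per bitcell; it ignores opcode bytes and 256-byte block padding, and the function name is made up for illustration.

```python
def track_buffer_bytes(bitcell_rate_hz, rpm, sides=2):
    """Bytes needed to hold one cylinder of raw bitcells (one bit per bitcell)."""
    rev_seconds = 60.0 / rpm
    bitcells_per_side = bitcell_rate_hz * rev_seconds
    return sides * bitcells_per_side / 8


if __name__ == "__main__":
    for rpm in (300, 360):
        size = track_buffer_bytes(1_000_000, rpm)
        print(f"HD at {rpm} RPM: about {size / 1000:.1f} kB for both sides")
```

Both heads of the current cylinder are counted because the prefetch/stream idea above buffers both sides at once.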
Thanks a lot for your support and hints. I will try disabling index-suppression and see what the outcome is.
This wordstar image was generated directly from an NSI file with HxCFloppyEmulator. My understanding here (maybe wrong) is that the timing for the index pulses should be the same for every NSI image, since the format is fixed and only holds the data itself. I don't really know how the images that are working were generated though.
Is there any software to easily check the generated HFE files: timings, index pulses and so on? |
@teiram, I've been using:
Ideally. Although I'm seeing sectors (padding) start before the sector pulses, which means the data and sector pulses are mis-aligned. I'm not familiar with NSI files, but I assume those are like IMG files which means it is HxCFloppyEmulator putting in the indexes at a not-quite-ideal spot.
HxCFloppyEmulator's "Disk view mode" is useful to verify the tracks are aligned. But it doesn't include the index pulses from what I've seen. I've just been viewing the file with a hex editor. If you search for Here's a section from wordstar_sni_2.hfe. You can see
We'd expect it to be nearer where the data swaps from
Those Again though, this index problem isn't a big problem by itself. It just reduces our tolerances and so may make other problems stand out more. |
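As an alternative to searching in a hex editor, the index opcodes can be listed per track and side. This is a sketch only, and it assumes the HFE layout used by the conversion script later in this thread: a 512-byte header with the track count at byte 9 and the track-list block number at bytes 18-19, 4-byte track-list entries (little-endian block offset, then byte length covering both sides), sides interleaved in 256-byte chunks, 0x8F as the index opcode and 0x4F as setbitrate followed by one operand byte. Other opcodes are not handled, and the function names are hypothetical.

```python
import sys

INDEX_OP = 0x8F       # index opcode as it appears in the file
SETBITRATE_OP = 0x4F  # followed by one operand byte


def split_sides(track):
    """De-interleave a track stored as alternating 256-byte side chunks."""
    s0, s1 = bytearray(), bytearray()
    for pos in range(0, len(track), 512):
        s0.extend(track[pos:pos + 256])
        s1.extend(track[pos + 256:pos + 512])
    return s0, s1


def index_offsets(side):
    """Byte offsets of index opcodes within one side of a track."""
    offsets = []
    i = 0
    while i < len(side):
        if side[i] == SETBITRATE_OP:
            i += 2  # skip the opcode and its operand
            continue
        if side[i] == INDEX_OP:
            offsets.append(i)
        i += 1
    return offsets


def report(image):
    """Yield (track, side, offsets) for every track in an HFE image."""
    track_count = image[9]
    list_block = image[18] | (image[19] << 8)
    table = image[list_block * 512:]
    for t in range(track_count):
        entry = table[t * 4:t * 4 + 4]
        start = (entry[0] | (entry[1] << 8)) * 512
        side_len = (entry[2] | (entry[3] << 8)) // 2
        cap = -(-side_len // 256) * 256  # each side is padded to 256 bytes
        sides = split_sides(image[start:start + 2 * cap])
        for s, data in enumerate(sides):
            yield t, s, index_offsets(data[:side_len])


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        for t, s, offsets in report(fh.read()):
            print(f"track {t} side {s}: {len(offsets)} index opcodes at {offsets}")
```

The offsets are byte positions in the opcode-bearing stream, so they are only approximate indicators of time; comparing them against where the sector preface starts is the kind of check described above.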
Hi @ejona86. Thanks a lot for the info. I think you're right and there is some sort of misalignment in the images generated by HxcFloppyEmulator from NSI files. So far I have been sampling the Northstar Advantage's boot attempts with a working image (northstar_nst.hfe) and a non-working one (my wordstar_sni_2.hfe). The working one reads from track0 and then seems to jump to track2, where the CP/M directory is located: For the wordstar one, it seems that the reads from track0 are not done properly, because there is no attempt to jump to any other track and the computer returns to the boot system prompt. It's also impossible to read anything from this image, even when swapping to this disk after a proper CP/M boot with a working disk image. I'm pretty sure the NSI file just contains the sector data. As an example, the wordstar one I was using: I will try to figure out how HxcFloppyEmulator does the conversion from NSI to HFE and why those sector marks seem to be misplaced. |
I figured it out! The horrible alignment is due to the setbitrate opcodes. I had forgotten a factor of 8x since the opcode contains 8 bits and a factor of 2x since there's two bytes for that specific opcode. I estimated around 100 setbitrate opcodes; that can contribute 3.2 ms of skew! Puzzle solved. At some point we'll probably need to enhance the HFE support to seek more accurately. Script I used. Pass HFE file as stdin, it will output HFE file to stdout. Probably breaks on Windows. You can change

#!/bin/env python3
import math
import sys

def split_sides(bs):
    # Track data interleaves the two sides in 256-byte chunks.
    s1 = bytearray()
    s2 = bytearray()
    for pos in range(0, len(bs), 512):
        s1.extend(bs[pos:pos+256])
        s2.extend(bs[pos+256:pos+512])
    return s1, s2

def join_sides(s1, s2):
    track = bytearray()
    for pos in range(0, len(s1), 256):
        track.extend(s1[pos:pos+256])
        track.extend(s2[pos:pos+256])
    return track

def strip_opcodes(bs, bitrate=True, index=True):
    obs = bytearray()
    i = 0
    while i < len(bs):
        b = bs[i]
        if b == 0x4F and bitrate:
            # setbitrate: opcode byte plus one operand byte
            i += 2
            continue
        if b == 0x8F and index:
            # index opcode
            i += 1
            continue
        obs.append(b)
        i += 1
    return obs

def reformat_side(bs):
    obs = strip_opcodes(bs, bitrate=strip_bitrate, index=strip_index)
    if strip_index:
        start_offset = strip_bitrate and 64 or 66
        end_offset = strip_bitrate and -80 or -85
        sector_len = (len(obs) - start_offset - end_offset)/sector_count
        # Index pulse halfway through the last sector.
        obs.insert(len(obs) - int(sector_len)//2 - end_offset, 0x8F)
        # Insert back-to-front so earlier insert positions stay valid.
        for sector in range(sector_count-1, -1, -1):
            obs.insert(start_offset + int(sector_len * sector), 0x8F)
    return obs

sector_count = 10
strip_bitrate = True
strip_index = False

f = sys.stdout.buffer
s = sys.stdin.buffer
offset = 0  # current input position, in 512-byte blocks

header = s.read(512)
offset += 1
f.write(header)
track_count = header[9]
track_list_offset = (header[19] << 8) | header[18]
assert track_list_offset == 1

tracklist = bytearray(s.read(512))
offset += 1

tracks = []
for track_num in range(track_count):
    tracklist_entry = tracklist[track_num*4:track_num*4+4]
    track_offset = (tracklist_entry[1] << 8) | tracklist_entry[0]
    assert offset <= track_offset
    if offset < track_offset:
        # Skip unused blocks on the input (stdin, not stdout).
        s.read(512 * (track_offset - offset))
        offset = track_offset
    orig_track_size = ((tracklist_entry[3] << 8) | tracklist_entry[2]) // 2
    orig_track_cap = math.ceil(orig_track_size / 256) * 256
    track = s.read(2*orig_track_cap)
    offset += orig_track_cap // 256
    (s1, s2) = split_sides(track)
    s1 = reformat_side(s1[:orig_track_size])
    s2 = reformat_side(s2[:orig_track_size])
    if len(s1) != len(s2):
        print("Warning: side length mismatch", file=sys.stderr)
    track_size = max(len(s1), len(s2))
    # Pad both sides back to the original block-aligned capacity.
    s1.extend(b'\x0F' * (orig_track_cap - len(s1)))
    s2.extend(b'\x0F' * (orig_track_cap - len(s2)))
    # Record the new track length (both sides) in the track list.
    tracklist_entry[2] = (track_size*2) & 0xFF
    tracklist_entry[3] = ((track_size*2) >> 8) & 0xFF
    tracklist[track_num*4:track_num*4+4] = tracklist_entry
    tracks.append(join_sides(s1, s2))

f.write(tracklist)
for track in tracks:
    f.write(track)
f.flush()
|
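The skew estimate above is simple enough to check by hand. A small worked version, assuming a 2 us bitcell (inferred from the earlier figure of about 100 opcodes giving 200 us when each opcode was counted as a single bitcell); the function name is hypothetical.

```python
def setbitrate_skew_us(opcode_count, bitcell_us=2.0, opcode_bytes=2, bits_per_byte=8):
    """Time represented by opcode bytes if they are counted as data bitcells."""
    return opcode_count * opcode_bytes * bits_per_byte * bitcell_us


if __name__ == "__main__":
    naive = setbitrate_skew_us(100, opcode_bytes=1, bits_per_byte=1)
    actual = setbitrate_skew_us(100)
    print(f"naive estimate: {naive:.0f} us, corrected: {actual / 1000:.1f} ms")
```

The corrected figure is 16 times the naive one (8 bits per byte times 2 bytes per opcode), which accounts for the observed skews of more than 2 ms.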
This reduces the worst-case observed skew from 3 ms to 50 us. There is a risk that in the future an opcode is added that is numerous and not evenly distributed, but that seems unlikely and we'd need to swap to a more complicated/advanced scheme.
@teiram, I just pushed a commit that seems to work well for unmodified HFE files that contain many setbitrate opcodes. But it behaves poorly with the HFE files generated from my script, because my script didn't update the track length. |
Backing up a little, is it worthwhile trying to handle 'poor' HFE images? I mean, a good HFEv3 image should keep the two sides in pretty good sync. That's one thing the skip opcode is for aiui: to allow a side which has "got behind" to catch up? Specifically looking at hard sectored HFEv3, the set-bitrate opcodes are surely superfluous as I'm sure the tracks are constant rate. Would it make more sense, if recordings of real disks are wanted, to clean them up with an external script:
Being able to assume your HFE is clean has got to be a nice simplification right? EDIT: I haven't played with HFEv3 much myself though, to be fair. So I'm not up on what all the snags can be. |
I don't think they are worth spending a considerable amount of effort.
Yes, it should. The no-op opcode is for keeping the sides in sync. An opcode on one side should always be mirrored by an opcode on the other (same opcode, or a noop). The skipbits opcode is just to handle non-divisible-by-8 bitcell counts. There's no "catching up" involved; the two sides go in lock-step, modulo less than 8 bitcells in some cases. That said, different tracks don't have to align as well. But I'm fine assuming they do.
From what I can infer, the point of the frequent set-bitrate opcodes is for track changes. If any one track is a different bitrate then when you swap to/from that track you need to learn of the different bitrate quickly. But yes, in these images all the tracks have the same rate. So while the many bitrate ops are extraneous, it doesn't really make the HFE image "poor." I'll also note that the only reason these opcodes are a problem is because we use ticks for setup_track and not bc; if HFE was the core format for the system these opcodes would be less of a problem (not to say we should make any changes there, but that HFE is not the only source of the problem). But I'm quite practical here: stripping out the bitrates with that python script is not that bad of a solution in my mind for hard sector support. (Other than I need to fix it so the track lengths are semi-correct.) But I didn't see this setbitrate issue as being solely a hard sector problem; I think it could cause trouble during writing with soft sectors as well.
To be clear, the images I'm messing with here have perfectly aligned tracks. The skews and issues were due to our processing of the file (mainly seeking with hfe_setup_track), not the file itself. If the tracks need alignment, I'm fine telling someone to do that manually with the HxC software.
Definitely the same for me. I originally was not wild about HFEv3. But it has been growing on me as I've been figuring out its mental model. |
Very nice. I've just tested the provided images. The nosetbitrate doesn't work either but the nosetbitrate_reindexed mostly does: I was able to boot with that disk and run wordstar. There is an error while loading that doesn't happen on retrying the operation. I will next give a try to your new commit with the original HxcFloppyEmulator generated hfe files and will provide feedback. |
Just checked this. Results are different in the following way:
|
Yes it's a shame there's not a per-track default bitrate. Still, for nearly all platforms, bitrate is consistent across and within all tracks, so the single bitrate reported at the top of the image is good. I think the original reason for set-bitrate was to support variable rate protection tracks. For example Copylock and Speedlock on Commodore Amiga. And I also think the HxC HFEv3 converter is a bit over keen to spew set-bitrate opcodes, at least when presented with a raw dump. |
@teiram wrote:
That flakiness is probably caused by the length bug my script had, since retry fixes it. It looks like "reindexed" is really all you need. I think I've fixed my script, and re-generated the HFE files. This time there's also a plain "reindexed" which leaves the setbitrate in-place. With my latest commit it shouldn't matter whether the setbitrates are present or not. WORDSTAR_NSI_2_reopcoded2.zip |
The plain reindexed one works perfectly for reading. Nice work! But I'm getting errors on write attempts. Seems that some of the index pulses are missing: |
@teiram, that's great news! I updated the script in the comment I posted it. For the missing index pulses during write, have you set |
Forgot that part, sorry. :) After the first WGATE activation timing of the index pulses is not consistent. Normally the time between pulses is 18ms but here we have one of 16.96ms, then 18, 20 and seems that there are either many pulses for this track spin or something weird has happened? |
@teiram, I've been able to reproduce similar behavior. What I'm seeing seems to be triggered by slow reads immediately following writes. Reads normally take 2 ms for me, but after some writes they take 10-20 ms. The code expects all the reading and processing at that point to be within 10 ms. It's possible that inaccurate It's possible to work around that by increasing two time values (e.g., to 25 ms and 30 ms for my system). But that will cause all reads to be delayed and you have to know how slow these rare reads are. This is probably a better workaround for now:

diff --git a/src/floppy.c b/src/floppy.c
index d6f7abb..0ca54fe 100644
--- a/src/floppy.c
+++ b/src/floppy.c
@@ -365,6 +365,9 @@ static void floppy_sync_flux(void)
     if (!drv->index_suppressed) {
         ticks = time_diff(time_now(), sync_time) - time_us(1);
         if (ticks > time_ms(15)) {
+            printk("Early. Retrying\n");
+            dma_rd->state = DMA_inactive;
+            return;
             /* Too long to wait. Immediately re-sync index timing. */
             drv->index_suppressed = TRUE;
             printk("Trk %u: skip %ums\n",
@@ -378,6 +381,9 @@ static void floppy_sync_flux(void)
     /* If we're out of sync then forcibly re-sync index timing. */
     ticks = time_diff(time_now(), sync_time);
     if (ticks < -100) {
+        printk("Late. Retrying\n");
+        dma_rd->state = DMA_inactive;
+        return;
         drv->index_suppressed = TRUE;
         printk("Trk %u: late %uus\n",
                drv->image->cur_track, -ticks/time_us(1));

If you still see weird timing, you can try this as well. This is more of a bug fix instead of a workaround, but I'm not making it in a commit because I've not been able to see any difference in behavior:

diff --git a/src/image/hfe.c b/src/image/hfe.c
index a23e32e..01a75f8 100644
--- a/src/image/hfe.c
+++ b/src/image/hfe.c
@@ -271,6 +271,7 @@ static uint16_t hfe_rdata_flux(struct image *im, uint16_t *tbuf, uint16_t nr)
         if (im->cur_bc >= im->tracklen_bc) {
             ASSERT(im->cur_bc == im->tracklen_bc);
             im->tracklen_ticks = im->cur_ticks;
+            im->stk_per_rev = stk_sysclk(im->tracklen_ticks / 16);
             im->cur_bc = im->cur_ticks = 0;
             /* Skip tail of current 256-byte block. */
             bc_c = (bc_c + 256*8-1) & ~(256*8-1);
|
Hello @ejona86 I didn't have the chance yet to look at the signals. I was just able to do this test:
I have also noticed that even the reindexed images made with the more up-to-date script sometimes fail while reading. It happens less often than before, but it is still there. |
Was trying to build with debug=y and seems it doesn't fit anymore:
Any hint about what to change to reduce those 908 bytes? I was trying to tune it a bit and try to remove some font, but no luck. Seems the size of the bootloader is always the same. |
Size of bootloader shouldn't have grown at all. If it has it's probably a mistake / something dumb and should be easy to claw back. |
Just tried in master and same behavior compiling with: |
Debug build, Ubuntu 20.04 LTS:
|
Debian GNU Linux 10:
|
I guess it doesn't build on that one then. |
Well, this is really weird. I have tried on Debian 10 and got that failure with the mentioned version of the compiler. Finally installed Ubuntu 20.04 on a virtual machine. Did:
Same error |
How interesting. If I |
Oh it'll be because I override debug=y when building bootloader precisely because it otherwise is too large. But debug=y on command line (after MAKE) overrides environment. So I need to update the makefile to put variable overrides after MAKE on submake invocations. |
Thanks @keirf. That did it. :) Regarding the write issues, I was able to get this serial log on a write operation. After this log the OS returns an error. Maybe it helps to troubleshoot it:
|
I'm also adding a signal capture of a write error using the patches provided by @ejona86. It seems that sometimes the index pulse still comes too late: Could this be an issue with the pendrive? Can it affect the generation of index pulses, or are USB writes and pulse generation decoupled? |
@teiram Oh! You have debug logs now! Great! That gives me a few more options. That is a lot of "Early"s. Something is clearly broken there. I've suspected your troubles may be caused by the staggered writes caused by sector skew, but I've not hacked something together to produce similar results. But the writes in your recording look fine; I guess the retries are working. I have still seen some issues on my end with messed up pulses, and had trouble tracking it down. It looks similar to what you see: the reads pretty far after a write get screwy. I've tried to get rid of every place that changes the start of the track and am missing something. IIRC, I think I've seen the issue with my custom-made Micropolis HFE image, so at least some issue isn't related to setbitrate op, but I still think the op is asking for trouble and has some kinks to be worked out. |
Should we leave this open for now? As is it's not going to get pulled. I assume at some point it will be moved atop of the new asyncio work. But there's good discussion here, made less visible if we close it. |
I'm fine either way. I'd like to pull out the writing fixes to observe HFE v3 opcodes and some of the misc v3 opcode handling fixes merged in the short-term. Basically anything that is useful outside the scope of hard sectors, which should be most things other than index opcode handling. Then, yeah, at some point I'll loop back to this and see how well the hard sector stuff works on top of asyncio (in either this PR, or a new one). My goal is get the immediately-useful stuff ready and merged and stable before swapping back to the speculative stuff (like hard sectors). Hopefully it won't be too long though. |
Alternative to #401. This drops the VGI support that it used. I have VGI read support with this version in my own repo but removed it because it did not support writes and it seemed there was a preference to go with the general-purpose HFE instead of many custom image formats. It proved to be very helpful during development though.
This is more invasive than #401, although the infrastructure enhancements seem fair. I considered many The HFE changes were the trickier ones.
There are two FIXMEs in the code commenting out code for fake_fired. The code was causing trouble, and it was unclear what the intention was since it was in the index_suppression == FALSE case. I assume that will be resolved before merging.

HFE index pulses do not have to come at the beginning of the track, so normal pulses are disabled when custom pulses are being used. Syncing the custom pulses immediately when they are changed is required because HFE tracks don't have to be aligned with each other and because seeking ignores opcodes which causes temporary skew.

The index pulses with this change and index-suppression=no are very consistent during reading/seeking. I still have not tested with real hardware, but pulses during writes with all three write-drain settings behave as expected and realtime and eot are promising for actual hardware compatibility.

The hard-sectored HFE files that I could find were clearly recordings and not all that pretty. I did test them and they did work, but I also created my own 16 sector files. I'll post the script I used to generate it to #400.
Fixes #393 #400
CC @teiram