
Roadmap for the Apache Daffodil VS Code Extension

Davin Shearer edited this page Jun 7, 2022 · 12 revisions

Now that the Apache Daffodil VS Code Extension version 1.0.0 has been released and published to the Microsoft® VS Code Marketplace, it is time to consider features and improvements for the next releases.


Important ideas on the future of the Apache Daffodil VS Code Extension

While v1.0.0 focused on the schema and the XML infoset, the next version will place additional emphasis on the input data. The input data could be any kind of file, so a robust hex editing capability is important. It is also important to be able to set breakpoints not only in the schema but also in the data, and to manipulate the data and watch how the change affects the parse. In other words: what happens to the parse when the data changes in some way? While stepping through the debugger, the schema, the XML infoset, and the data views all need to be kept in sync.


Desired Features of the Input Data Editor

For organizational purposes, the desired features are broken down into 8 functional areas.

1. File Type Support (FTS)

1.1 The data editor needs to support any fixed-length (non-streaming) file that Daffodil is capable of opening. Generally, any file type can be opened and displayed by a hex editor. The file type and extension do not influence the rendering of the file in hex or binary formats.

2. User Interface (UI)

2.1 The data editor needs to be responsive and provide a good Visual Studio Code user experience. Existing third-party VS Code hex editors become less responsive when rendering medium to large files. The editor will handle file sizes common to Daffodil without impacting overall usability.

2.2 The data editor needs to be designed as a composition of display panels that allow multiple data representations to be rendered on the same screen. A data file may be segmented into multiple representations of data that differ in anything from byte boundaries to endianness. The editor will render the differing representations within the same user interface.

2.3 The data editor needs to allow individual display panels to maintain their own position in the data to allow viewing different segments of data in different display panels. The editor will manage each composable view as a separate Viewport capable of displaying a view into the data at a specified offset and capacity.
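
As an illustration of requirement 2.3, a viewport can be modeled as an offset and a capacity over shared data. This is only a sketch: the `Viewport` type and the `visibleBytes` and `scrollViewport` helpers are hypothetical names, not part of the extension.

```typescript
// Hypothetical viewport model: a window of `capacity` bytes starting at `offset`.
interface Viewport {
  offset: number;   // index of the first byte shown
  capacity: number; // maximum number of bytes shown
}

// Returns the bytes visible in the viewport, clamped to the bounds of the data.
function visibleBytes(data: Uint8Array, vp: Viewport): Uint8Array {
  const start = Math.min(Math.max(vp.offset, 0), data.length);
  const end = Math.min(start + Math.max(vp.capacity, 0), data.length);
  return data.subarray(start, end);
}

// Moves a viewport by `delta` bytes without leaving the data.
function scrollViewport(vp: Viewport, delta: number, dataLength: number): Viewport {
  const maxOffset = Math.max(dataLength - 1, 0);
  const offset = Math.min(Math.max(vp.offset + delta, 0), maxOffset);
  return { offset, capacity: vp.capacity };
}

// Two panels looking at different segments of the same data.
const sample = Uint8Array.from([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]);
const headerPanel: Viewport = { offset: 0, capacity: 4 };
const tailPanel: Viewport = { offset: 8, capacity: 4 };
```

Because each panel only holds its own offset and capacity, several panels can share one copy of the data while scrolling independently.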

2.4 The data editor viewports need to be interactive to allow mouse and keyboard interactions such as scrolling and context menus. User interaction will drive the function of the editor, so the ability to interpret keyboard and mouse actions on individual and block data selections is critical.

2.5 The data editor needs to include a Properties View component. The Properties View will provide a static region on the display in which to place file and selection metadata. The Properties View is not associated with a specific region in the file, so it is not a viewport component. It is tied to events such as selection events and is updated when it is notified that those events have occurred.

2.6 The data editor needs to include a property display mode for a single unit selection. The Properties View will allow multiple representations of a single unit, e.g., a byte, to be displayed simultaneously.

2.7 The data editor should include a property display mode for multiple unit selection. Selecting up to some limit of bytes, for example 4, could still be rendered in the properties view. For example, selecting four bytes could render a 32-bit integer value.
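
To make the four-byte example in 2.7 concrete, the sketch below interprets a small selection as an unsigned integer in either byte order using the standard `DataView` API. The `selectionAsUnsigned` function and the `ByteOrder` type are hypothetical names chosen for illustration, and the supported selection sizes (1, 2, or 4 bytes) are an assumption.

```typescript
type ByteOrder = "big" | "little";

// Interprets a selection of 1, 2, or 4 bytes as an unsigned integer.
// Returns undefined for selection sizes this sketch does not handle.
function selectionAsUnsigned(selection: Uint8Array, order: ByteOrder): number | undefined {
  const view = new DataView(selection.buffer, selection.byteOffset, selection.byteLength);
  const littleEndian = order === "little";
  switch (selection.byteLength) {
    case 1:
      return view.getUint8(0);
    case 2:
      return view.getUint16(0, littleEndian);
    case 4:
      return view.getUint32(0, littleEndian);
    default:
      return undefined;
  }
}

const selectedBytes = Uint8Array.from([0x00, 0x00, 0x01, 0x02]);
const asBigEndian = selectionAsUnsigned(selectedBytes, "big");       // 0x00000102 = 258
const asLittleEndian = selectionAsUnsigned(selectedBytes, "little"); // 0x02010000 = 33619968
```

The same four bytes yield two different values, which is why the byte order would need to be shown alongside the value in the Properties View.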

3. Persisting Edits (PER)

3.1 The data editor needs to allow edits to be saved as a new file. The editor will not attempt to write the file that is held open by Daffodil. Instead, a copy of the file will be written to disk.

3.2 The data editor needs to provide an auto-incremented file revision number so that edits can be saved without prompting the user. When saving edits to a file, it may be preferable for the save-as-new-file step to be transparent to the user. In this case the user will not be prompted for a file name; an autogenerated name will be used instead.

3.3 The data editor needs to provide a save-as option to name a new file. When saving edits to a file, the user may want to specify where the edited file should be saved. In this case a file picker dialog or something similar can be used to let the user specify the location of the saved file.

3.4 The data editor should provide a convenient way of restarting the Daffodil debugger with the specified edits. After the edits are saved to a file, the debugger can be restarted and automatically set to use the new file's path as the input. This convenience allows the user to avoid editing their launch profile to point to the new file.
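
One possible way to autogenerate the name described in 3.2 is to pick the first unused revision number. The `.revN` naming scheme and the `nextRevisionName` function below are assumptions made for illustration; the roadmap does not specify a format.

```typescript
// Returns the first unused revision name for `original`, for example
// "input.rev1.bin" for "input.bin". `existing` lists the file names already
// present in the target directory. The ".revN" scheme is an assumption.
function nextRevisionName(original: string, existing: string[]): string {
  const dot = original.lastIndexOf(".");
  const stem = dot > 0 ? original.slice(0, dot) : original;
  const ext = dot > 0 ? original.slice(dot) : "";
  const taken = new Set(existing);
  let revision = 1;
  while (taken.has(`${stem}.rev${revision}${ext}`)) {
    revision += 1;
  }
  return `${stem}.rev${revision}${ext}`;
}
```

Since the original file is never a candidate name, the file that Daffodil holds open is not overwritten.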

4. Data Representations (DATAREP)

Hex and binary representations for both viewing and editing.

4.1 The data editor needs to implement support for multiple data representations. The editor will use the viewport component design to deliver a composable multiple representation rendering capability.

4.2 The data editor needs to provide a viewport for viewing byte delimited data. The viewport will display hex bytes in a way similar to a common hex editor display.

4.3 The data editor needs to provide a viewport for viewing data as individual bits. The viewport will render binary 1-0 display. The details of the rendering such as unit length can be modified using properties associated with the viewport.
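
The hex and bit renderings described in 4.2 and 4.3 could look roughly like the following. The `toHex` and `toBits` helpers are hypothetical, and the `bitsPerUnit` parameter stands in for the configurable unit length mentioned above.

```typescript
// Renders bytes as space-separated, two-digit lowercase hex.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// Renders bytes as a string of 1s and 0s, grouped into units of `bitsPerUnit` bits.
function toBits(bytes: Uint8Array, bitsPerUnit: number = 8): string {
  if (!Number.isInteger(bitsPerUnit) || bitsPerUnit < 1) {
    throw new RangeError("bitsPerUnit must be a positive integer");
  }
  const bits = Array.from(bytes, (b) => b.toString(2).padStart(8, "0")).join("");
  const units: string[] = [];
  for (let i = 0; i < bits.length; i += bitsPerUnit) {
    units.push(bits.slice(i, i + bitsPerUnit));
  }
  return units.join(" ");
}
```

For the bytes `0x0a 0xff`, `toHex` gives `0a ff`, `toBits` gives `00001010 11111111`, and `toBits` with a unit length of 4 gives `0000 1010 1111 1111`.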

4.4 The data editor needs to provide configurable rendering properties for any given representation. The UI will allow the user to view and edit viewport properties.

4.5 The data editor needs to provide configurable endianness properties for viewport rendering. A viewport can be configured as big or little endian.

5. Editing (EDT)

5.1 The data editor needs to implement inline editing within a viewport. The viewport will support mouse and keyboard interaction to initiate editing a value.

5.2 The data editor needs to default to editing in the same representation as the view. The editor will allow a value to be edited in the representation in which the viewport renders it (e.g., hex in a hex view, binary in a binary view), using the native rendering logic of the viewport.

5.3 The data editor needs to provide undo / redo capability related to edits. Users commonly expect an editor of this kind to provide commands to undo and redo edits that have been made.

5.4 The data editor should provide editing in a representation that differs from the view. The editor could provide something like a pop-out component that allows editing a value in a format that differs from the viewport representation, e.g., editing binary from the hex view.
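
Undo and redo as described in 5.3 are commonly built on two stacks of reversible changes. The sketch below shows that general technique; `Change`, `overwriteAt`, and `EditHistory` are hypothetical names, and this is not a description of how the extension implements it.

```typescript
// A hypothetical reversible edit: it knows how to apply and how to revert itself.
interface Change {
  apply(data: number[]): void;
  revert(data: number[]): void;
}

// Overwrites one byte and remembers the previous value so it can be restored.
function overwriteAt(offset: number, value: number): Change {
  let previous = 0;
  return {
    apply(data: number[]): void {
      previous = data[offset];
      data[offset] = value;
    },
    revert(data: number[]): void {
      data[offset] = previous;
    },
  };
}

class EditHistory {
  private data: number[];
  private done: Change[] = [];
  private undone: Change[] = [];

  constructor(data: number[]) {
    this.data = data;
  }

  // Applies a new change and discards anything that could have been redone.
  execute(change: Change): void {
    change.apply(this.data);
    this.done.push(change);
    this.undone = [];
  }

  // Reverts the most recent change. Returns false when there is nothing to undo.
  undo(): boolean {
    const change = this.done.pop();
    if (change === undefined) return false;
    change.revert(this.data);
    this.undone.push(change);
    return true;
  }

  // Reapplies the most recently undone change. Returns false when there is nothing to redo.
  redo(): boolean {
    const change = this.undone.pop();
    if (change === undefined) return false;
    change.apply(this.data);
    this.done.push(change);
    return true;
  }
}
```

Insert and delete edits would fit the same `Change` interface as long as each records enough state to reverse itself.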

6. Debugger integration (DBG)

6.1 The debugger needs to provide extension points which allow executing debug commands from the editor. There are certain non-standard operations such as setting breakpoints on data locations that are to be supported. This will require the debugger to provide extension points that allow the editor to pass instructions that augment the debugger flow.

6.2 The debugger should support setting breakpoints at data positions in the input file. A breakpoint on a data location tells the debugger to break execution when the input stream reaches that point in the file, as if it had hit a code breakpoint.

6.3 The data editor should allow breakpoints to be set at data positions in the input file. The data editor should allow creation of and then render data breakpoints in a similar way to how code breakpoints are set and rendered.
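
A minimal sketch of the bookkeeping behind 6.2 and 6.3 follows. The `DataBreakpoints` class is hypothetical. It assumes the debugger reports the input position before and after each step and that a breakpoint counts as hit when its byte was consumed during that step; the actual debugger protocol is not defined here.

```typescript
class DataBreakpoints {
  private offsets = new Set<number>();

  // Sets the breakpoint if absent, clears it if present.
  // Returns true when the breakpoint is set after the call.
  toggle(offset: number): boolean {
    if (this.offsets.delete(offset)) {
      return false;
    }
    this.offsets.add(offset);
    return true;
  }

  // Sorted offsets, for example for rendering markers in a viewport.
  list(): number[] {
    return Array.from(this.offsets).sort((a, b) => a - b);
  }

  // A parse step may consume several bytes, so a breakpoint is treated as hit
  // when it lies in the half-open range [previous, current).
  firstHit(previous: number, current: number): number | undefined {
    return this.list().find((offset) => offset >= previous && offset < current);
  }
}
```

Checking a range rather than a single position matters because the parser rarely advances one byte at a time.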

6.4 The data editor should support starting debug from a specified position. The editor provides a function via a context menu that indicates a starting point in the file for the input stream. This will drop all bytes prior to this location when starting the debug.

6.5 The data editor should support stopping debug at a specified position. The editor provides a function via a context menu that indicates the stopping point in the input stream such that all data after this point will be ignored by the input stream, ending the debug at the specified point.
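
The start and stop positions in 6.4 and 6.5 amount to handing the debugger a slice of the input. In the sketch below, `debugWindow` is a hypothetical helper, and treating the stop position as exclusive is an assumption.

```typescript
// Returns the bytes the debugger would see: everything before `start` is dropped
// and everything from `stop` onward is ignored (stop is exclusive by assumption).
function debugWindow(data: Uint8Array, start: number = 0, stop: number = data.length): Uint8Array {
  if (start < 0 || stop > data.length || start > stop) {
    throw new RangeError(`invalid window [${start}, ${stop}) for ${data.length} bytes`);
  }
  return data.slice(start, stop);
}
```

Using `slice` returns a copy, so the original input is left untouched.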

6.6 The debugger should support the latest released version of Daffodil. The extension should be kept up to date with the latest version of Daffodil.

7. Editing Commands (CMD)

In this section a “block” is defined as a range that has been selected by the user.

7.1 The data editor needs to support adding individual bytes. The editor will provide a function to insert a single byte at a position in the file.

7.2 The data editor needs to support adding blocks of bytes. The editor will provide a function to insert multiple bytes starting at a position in the file.

7.3 The data editor needs to support deleting individual bytes. The editor will provide a function to delete a single byte from the file.

7.4 The data editor needs to support deleting blocks of bytes. The editor will provide a function to delete a selection of multiple bytes from the file.

7.5 The data editor needs to support modifying the value of an individual byte. The editor will provide a function to overwrite the value of a byte in the file.

7.6 The data editor needs to support modifying the value of a block of bytes. The editor will provide a function to overwrite the value of a range of bytes in the file.
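
The insert, delete, and overwrite primitives in 7.1 through 7.6 can be sketched as pure functions over a byte array. The names `insertBytes`, `deleteBytes`, and `overwriteBytes` are hypothetical, and an editor that handles large files would likely track changes rather than copy the whole buffer on every edit.

```typescript
function checkOffset(data: Uint8Array, offset: number): void {
  if (!Number.isInteger(offset) || offset < 0 || offset > data.length) {
    throw new RangeError(`offset ${offset} is outside 0..${data.length}`);
  }
}

// Inserts `bytes` at `offset`, shifting the rest of the data to the right.
function insertBytes(data: Uint8Array, offset: number, bytes: Uint8Array): Uint8Array {
  checkOffset(data, offset);
  const out = new Uint8Array(data.length + bytes.length);
  out.set(data.subarray(0, offset), 0);
  out.set(bytes, offset);
  out.set(data.subarray(offset), offset + bytes.length);
  return out;
}

// Deletes up to `length` bytes starting at `offset`.
function deleteBytes(data: Uint8Array, offset: number, length: number): Uint8Array {
  checkOffset(data, offset);
  const end = Math.min(offset + Math.max(length, 0), data.length);
  const out = new Uint8Array(data.length - (end - offset));
  out.set(data.subarray(0, offset), 0);
  out.set(data.subarray(end), offset);
  return out;
}

// Overwrites bytes in place starting at `offset`. The data length stays fixed,
// so replacement bytes that would run past the end are dropped.
function overwriteBytes(data: Uint8Array, offset: number, bytes: Uint8Array): Uint8Array {
  checkOffset(data, offset);
  const out = data.slice();
  out.set(bytes.subarray(0, data.length - offset), offset);
  return out;
}
```

The single-byte commands are the same operations with a one-element array, for example `insertBytes(data, 3, Uint8Array.from([0xff]))`.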

7.7 The data editor needs to support copying byte(s). The editor may provide the ability to select and copy a range of bytes to the clipboard for convenience and interoperability. The number of bytes that can be copied may need an upper limit depending on the file size and system memory availability.

7.8 The data editor needs to support pasting byte(s). The editor may provide the ability to paste bytes from the system clipboard into the file at a specified position for convenience and interoperability.

7.9 The data editor needs to support searching for patterns. The search function works like a text editor's find, using a literal pattern. The pattern is searched for literally in each given representation.

7.10 The data editor needs to support replacing search results with new patterns. The function works like a text editor's find-and-replace, using literal text. The pattern is searched for literally in each given representation and replaced using text that is valid within that representation.
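
For the hex representation, the literal search and replace in 7.9 and 7.10 could be sketched as below. The helpers `hexToBytes`, `findAll`, and `replaceAll` are hypothetical; the search is a plain linear scan that reports non-overlapping matches, which is an assumption about the desired behavior.

```typescript
// Parses a hex pattern such as "de ad" into bytes. Whitespace is ignored.
function hexToBytes(hex: string): Uint8Array {
  const digits = hex.replace(/\s+/g, "");
  if (digits.length % 2 !== 0 || /[^0-9a-fA-F]/.test(digits)) {
    throw new Error(`not a valid hex pattern: ${hex}`);
  }
  const out = new Uint8Array(digits.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(digits.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

// Returns the start offsets of all non-overlapping occurrences of `pattern`.
function findAll(data: Uint8Array, pattern: Uint8Array): number[] {
  const hits: number[] = [];
  if (pattern.length === 0) return hits;
  let i = 0;
  while (i + pattern.length <= data.length) {
    let match = true;
    for (let j = 0; j < pattern.length; j++) {
      if (data[i + j] !== pattern[j]) {
        match = false;
        break;
      }
    }
    if (match) {
      hits.push(i);
      i += pattern.length;
    } else {
      i += 1;
    }
  }
  return hits;
}

// Replaces every occurrence of `pattern` with `replacement`, which may differ in length.
function replaceAll(data: Uint8Array, pattern: Uint8Array, replacement: Uint8Array): Uint8Array {
  const hits = findAll(data, pattern);
  const out = new Uint8Array(data.length + hits.length * (replacement.length - pattern.length));
  let src = 0;
  let dst = 0;
  for (const hit of hits) {
    out.set(data.subarray(src, hit), dst);
    dst += hit - src;
    out.set(replacement, dst);
    dst += replacement.length;
    src = hit + pattern.length;
  }
  out.set(data.subarray(src), dst);
  return out;
}
```

Validating the pattern in `hexToBytes` corresponds to the requirement that replacement text be valid within the representation being edited.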

7.11 The data editor should use the native clipboard provided by the operating system for interoperability with other applications. The editor may use the operating system clipboard for copy and paste operations to improve interoperability with other applications.

7.12 The data editor should support applying a bit mask to an individual byte. The editor may provide a function to apply a mask to a byte at a position in the file.

7.13 The data editor should support applying a bit mask to a block of bytes. The editor may provide a function to apply a mask to a selection of bytes in the file.
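
Applying a mask as in 7.12 and 7.13 could be sketched as follows. The roadmap does not say which operations a mask supports, so the AND, OR, and XOR choices here, along with the `applyMask` name, are assumptions.

```typescript
type MaskOp = "and" | "or" | "xor";

// Applies a one-byte mask to `length` bytes starting at `offset` and returns a new array.
// Positions outside the data are ignored.
function applyMask(data: Uint8Array, offset: number, length: number, mask: number, op: MaskOp): Uint8Array {
  const out = data.slice();
  const start = Math.max(offset, 0);
  const end = Math.min(offset + length, out.length);
  for (let i = start; i < end; i++) {
    switch (op) {
      case "and":
        out[i] &= mask;
        break;
      case "or":
        out[i] |= mask;
        break;
      case "xor":
        out[i] ^= mask;
        break;
    }
  }
  return out;
}
```

The single-byte case is a block of length 1, for example `applyMask(data, 5, 1, 0x0f, "and")` to clear the high nibble of the byte at offset 5.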

8. Test Data Markup Language integration (TDML)

8.1 All external files needed by the TDML file shall be incorporated as relative paths into the TDML file.

8.2 Test Data Markup Language features shall be as modular as possible. Modularization allows for their future removal from the DFDL extension’s repository and addition to a library that can be shared by the DFDL repository.

8.3 Test Data Markup Language features shall be written in Scala and shall read/write XML by using XML bindings (e.g., JAXB/scalaxb).

8.4 The extension shall provide an item in the command palette (ctrl + shift + p) for ‘Generate TDML File’.

Selecting this command shall bring up menus allowing the user to select the following:

  • TDML File Name
  • Name for the test case
  • Description for the test case
  • DFDL Schema
  • Data Document

This selection shall work in the same way as the DFDL debugger where if you select the command from a DFDL Schema, it will automatically use that in place of a selection. The TDML File shall be created in the workspace directory. The DFDL Schema and Document files shall also be file names only.

  • These file names shall be relative to the workspace directory. It will be the user’s responsibility to organize everything when creating a TDML file and to package the files up for distribution.
  • The name of the TDML file shall be the name of the DFDL schema used with ‘.tdml’ appended to the end.
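
The naming rules above can be illustrated with a small sketch. It reads "the name of the DFDL schema used with '.tdml' appended" literally, so `jpeg.dfdl.xsd` becomes `jpeg.dfdl.xsd.tdml`; that reading, and the `fileNameOnly` and `planTdml` helpers, are assumptions. TypeScript is used here only for illustration and does not change the requirement in 8.3 that the TDML features be written in Scala.

```typescript
// Returns the file name only: everything after the last "/" or "\".
function fileNameOnly(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts[parts.length - 1];
}

// Hypothetical summary of the names that would go into a generated TDML file.
interface TdmlPlan {
  tdmlFile: string; // created in the workspace directory
  schema: string;   // file name only, relative to the workspace directory
  document: string; // file name only, relative to the workspace directory
}

function planTdml(schemaPath: string, dataPath: string): TdmlPlan {
  const schema = fileNameOnly(schemaPath);
  return {
    tdmlFile: `${schema}.tdml`,
    schema,
    document: fileNameOnly(dataPath),
  };
}
```

With a schema at `/ws/jpeg.dfdl.xsd` and data at `/ws/photo.jpg`, the plan names `jpeg.dfdl.xsd.tdml`, `jpeg.dfdl.xsd`, and `photo.jpg`.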

8.6 The extension shall provide an item in the command palette (ctrl + shift + p) for ‘Add Test Case to TDML File’.

Selecting this command shall bring up menus allowing the user to select the following:

  • TDML File Name
  • Name for the test case
  • Description for the test case
  • DFDL Schema
  • Data Document

This selection shall work in the same way as the DFDL debugger where if you select the command from a DFDL Schema, it will automatically use that in place of a selection.

8.7 The extension shall provide an item in the command palette (ctrl + shift + p) for ‘Run Test Case in TDML File’.

Selecting this command shall bring up menus allowing the user to select the following:

  • TDML File Name
  • Test Case to run (This list shall be populated with data in the selected TDML File)

This command shall start the Daffodil process in run mode. This command should provide an option to start the Daffodil process in debug mode. The location of the DFDL Schema shall be expected to be relative to the location of the TDML File. It shall be the responsibility of the user who created the TDML file to ensure that the TDML file is packaged correctly.


UI Mockups

Initial UI wireframes

[Wireframe images: Schema Editor, Live Infoset, Data Editor]

Initial UI mockup in VS Code

[Image: DFDL-hex-mockup]

Release Plan (Proposed)

The goal is to get these VS Code extension capabilities released and published to the Marketplace by Q4 2022.

The table below should be updated as new releases come out, or the themes/emphasis of a release change.

Of course this is all highly subject to change based on what the user community needs, and what community developers choose to work on.

The release numbering is also subject to change.

| Release | Description |
|---|---|
| 1.1.0 (Target: June, 2022) | UI wireframes showing a vision of the data editor have been posted for discussion and feedback. The main editing viewport now has support for the delete and insert editing primitives in addition to overwrite. Support for multiple viewports, being able to undo and redo changes, cut and paste, and file saving are implemented. |
| 1.2.0 (Target: July, 2022) | Search and replace is implemented. Wireframe adjustments based on feedback have been made. |
| 1.3.0 (Target: July, 2022) | Editing is permitted in any of several viewports. Each viewport is capable of displaying data in different formats (e.g., binary, hex, ASCII, big and little endian integers). Support for a properties view component. |
| 1.4.0 (Target: August, 2022) | Support for transformations of a byte range and checkpoints. Initial "MVP" integration of TDML. |
| 1.5.0 (Target: August, 2022) | Initial DFDL Debugger integration and DFDL language support. |
| 1.6.0 (Target: September, 2022) | Breakpoints can be set at data offsets and debugging can start and stop at specified offsets. Additional TDML support and refinements. |
| 1.7.0 (Target: October, 2022) | Test coverage, user testing, packaging, documentation, CI release process, 2.0.0 release candidates / technical previews. |
| 2.0.0 (Target: November, 2022) | Outstanding blockers are done, documentation published, release vote is approved, version 2.0.0 released and the extension is published to the marketplace. |