treewide: Update docs and tutorial #196

Merged (1 commit) on Oct 2, 2024
1 change: 1 addition & 0 deletions Makefile
@@ -73,6 +73,7 @@ docs: doc-srcs doxygen-docs
clean-docs:
rm -rf $(GENERATED_DOCS_DIR)
rm -rf $(DOXYGEN_DOCS_DIR)
rm -rf site

$(GENERATED_DOCS_DIR):
mkdir -p $@
95 changes: 52 additions & 43 deletions README.md
@@ -40,55 +40,59 @@ licenses. See the respective folder for the licenses used.

## Publications

<!--start-publications-->

If you use the Snitch cluster or its extensions in your work, you can cite us:

<details>
<summary><b>Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads</b></summary>
<summary><b><a href="https://doi.org/10.1109/TC.2020.3027900">Snitch: A Tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads</a></b></summary>
<p>

```
@article{zaruba2020snitch,
title={Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads},
@ARTICLE{zaruba2021snitch,
author={Zaruba, Florian and Schuiki, Fabian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
year={2020},
publisher={IEEE}
journal={IEEE Transactions on Computers},
title={Snitch: A Tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads},
year={2021},
volume={70},
number={11},
pages={1845-1860},
doi={10.1109/TC.2020.3027900}
}
```

</p>
</details>

<details>
<summary><b>Stream semantic registers: A lightweight risc-v isa extension achieving full compute utilization in single-issue cores</b></summary>
<summary><b><a href="https://doi.org/10.1109/TC.2020.2987314">Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores</a></b></summary>
<p>

```
@article{schuiki2020stream,
title={Stream semantic registers: A lightweight risc-v isa extension achieving full compute utilization in single-issue cores},
@ARTICLE{schuiki2021ssr,
author={Schuiki, Fabian and Zaruba, Florian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
journal={IEEE Transactions on Computers},
title={Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores},
year={2021},
volume={70},
number={2},
pages={212--227},
year={2020},
publisher={IEEE}
pages={212-227},
doi={10.1109/TC.2020.2987314}
}
```

</p>
</details>

<details>
<summary><b>Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra</b></summary>
<summary><b><a href="https://doi.org/10.1109/TPDS.2023.3322029">Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra</a></b></summary>
<p>

```
@article{scheffler2023sparsessr,
@ARTICLE{scheffler2023sparsessr,
author={Scheffler, Paul and Zaruba, Florian and Schuiki, Fabian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Parallel and Distributed Systems},
title={Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra},
journal={IEEE Transactions on Parallel and Distributed Systems},
title={Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra},
year={2023},
volume={34},
number={12},
@@ -101,52 +105,54 @@ If you use the Snitch cluster or its extensions in your work, you can cite us:
</details>

<details>
<summary><b>A High-performance, Energy-efficient Modular DMA Engine Architecture</b></summary>
<summary><b><a href="https://doi.org/10.1109/TC.2023.3329930">A High-Performance, Energy-Efficient Modular DMA Engine Architecture</a></b></summary>
<p>

```
@ARTICLE{benz2023idma,
@ARTICLE{benz2024idma,
author={Benz, Thomas and Rogenmoser, Michael and Scheffler, Paul and Riedel, Samuel and Ottaviano, Alessandro and Kurth, Andreas and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
title={A High-performance, Energy-efficient Modular DMA Engine Architecture},
year={2023},
volume={},
number={},
pages={1-14},
doi={10.1109/TC.2023.3329930}}
journal={IEEE Transactions on Computers},
title={A High-Performance, Energy-Efficient Modular DMA Engine Architecture},
year={2024},
volume={73},
number={1},
pages={263-277},
doi={10.1109/TC.2023.3329930}
}
```

</p>
</details>

<details>
<summary><b>MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores</b></summary>
<summary><b><a href="https://doi.org/10.1109/ARITH54963.2022.00010">MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores</a></b></summary>
<p>

```
@inproceedings{bertaccini2022minifloat,
@INPROCEEDINGS{bertaccini2022minifloat,
author={Bertaccini, Luca and Paulin, Gianna and Fischer, Tim and Mach, Stefan and Benini, Luca},
booktitle={2022 IEEE 29th Symposium on Computer Arithmetic (ARITH)},
title={MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores},
booktitle={2022 IEEE 29th Symposium on Computer Arithmetic (ARITH)},
title={MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores},
year={2022},
volume={},
number={},
pages={1-8}
pages={1-8},
doi={10.1109/ARITH54963.2022.00010}
}
```

</p>
</details>

<details>
<summary><b>Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters</b></summary>
<summary><b><a href="https://doi.org/10.1109/ISVLSI54635.2022.00021">Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters</a></b></summary>
<p>

```
@inproceedings{paulin2022softtiles,
@INPROCEEDINGS{paulin2022softtiles,
author={Paulin, Gianna and Cavalcante, Matheus and Scheffler, Paul and Bertaccini, Luca and Zhang, Yichao and Gürkaynak, Frank and Benini, Luca},
booktitle={2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)},
title={Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters},
booktitle={2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)},
title={Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters},
year={2022},
volume={},
number={},
@@ -159,20 +165,23 @@ If you use the Snitch cluster or its extensions in your work, you can cite us:
</details>

<details>
<summary><b>SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-V Compute Clusters with Indirect Stream Registers</b></summary>
<summary><b><a href="https://doi.org/10.48550/arXiv.2404.05303">SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-V Compute Clusters with Indirect Stream Registers</a></b></summary>
<p>

```
@misc{scheffler2024saris,
title={SARIS: Accelerating Stencil Computations on Energy-Efficient
RISC-V Compute Clusters with Indirect Stream Registers},
author={Paul Scheffler and Luca Colagrande and Luca Benini},
year={2024},
eprint={2404.05303},
archivePrefix={arXiv},
primaryClass={cs.MS}
title={SARIS: Accelerating Stencil Computations on Energy-Efficient
RISC-V Compute Clusters with Indirect Stream Registers},
author={Paul Scheffler and Luca Colagrande and Luca Benini},
year={2024},
eprint={2404.05303},
archivePrefix={arXiv},
primaryClass={cs.MS},
url={https://arxiv.org/abs/2404.05303}
}
```

</p>
</details>

<!--end-publications-->
145 changes: 6 additions & 139 deletions docs/publications.md
@@ -1,141 +1,8 @@
# Publications

If you use the Snitch cluster or its extensions in your work, you can cite us:

<!--start-publications-->

<details>
<summary><b>Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads</b></summary>
<p>

```
@article{zaruba2020snitch,
title={Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads},
author={Zaruba, Florian and Schuiki, Fabian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
year={2020},
publisher={IEEE}
}
```

</p>
</details>

<details>
<summary><b>Stream semantic registers: A lightweight risc-v isa extension achieving full compute utilization in single-issue cores</b></summary>
<p>

```
@article{schuiki2020stream,
title={Stream semantic registers: A lightweight risc-v isa extension achieving full compute utilization in single-issue cores},
author={Schuiki, Fabian and Zaruba, Florian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
volume={70},
number={2},
pages={212--227},
year={2020},
publisher={IEEE}
}
```

</p>
</details>

<details>
<summary><b>Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra</b></summary>
<p>

```
@article{scheffler2023sparsessr,
author={Scheffler, Paul and Zaruba, Florian and Schuiki, Fabian and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Parallel and Distributed Systems},
title={Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra},
year={2023},
volume={34},
number={12},
pages={3147-3161},
doi={10.1109/TPDS.2023.3322029}
}
```

</p>
</details>

<details>
<summary><b>A High-performance, Energy-efficient Modular DMA Engine Architecture</b></summary>
<p>

```
@ARTICLE{benz2023idma,
author={Benz, Thomas and Rogenmoser, Michael and Scheffler, Paul and Riedel, Samuel and Ottaviano, Alessandro and Kurth, Andreas and Hoefler, Torsten and Benini, Luca},
journal={IEEE Transactions on Computers},
title={A High-performance, Energy-efficient Modular DMA Engine Architecture},
year={2023},
volume={},
number={},
pages={1-14},
doi={10.1109/TC.2023.3329930}}
```

</p>
</details>

<details>
<summary><b>MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores</b></summary>
<p>

```
@inproceedings{bertaccini2022minifloat,
author={Bertaccini, Luca and Paulin, Gianna and Fischer, Tim and Mach, Stefan and Benini, Luca},
booktitle={2022 IEEE 29th Symposium on Computer Arithmetic (ARITH)},
title={MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores},
year={2022},
volume={},
number={},
pages={1-8}
}
```

</p>
</details>

<details>
<summary><b>Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters</b></summary>
<p>

```
@inproceedings{paulin2022softtiles,
author={Paulin, Gianna and Cavalcante, Matheus and Scheffler, Paul and Bertaccini, Luca and Zhang, Yichao and Gürkaynak, Frank and Benini, Luca},
booktitle={2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)},
title={Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters},
year={2022},
volume={},
number={},
pages={44-49},
doi={10.1109/ISVLSI54635.2022.00021}
}
```

</p>
</details>

<details>
<summary><b>SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-V Compute Clusters with Indirect Stream Registers</b></summary>
<p>

```
@misc{scheffler2024saris,
title={SARIS: Accelerating Stencil Computations on Energy-Efficient
RISC-V Compute Clusters with Indirect Stream Registers},
author={Paul Scheffler and Luca Colagrande and Luca Benini},
year={2024},
eprint={2404.05303},
archivePrefix={arXiv},
primaryClass={cs.MS}
}
```

</p>
</details>

<!--end-publications-->
{%
include-markdown '../README.md'
start="<!--start-publications-->"
end="<!--end-publications-->"
comments=false
%}
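
As a rough illustration of what the `start` and `end` arguments in the directive above select, the following Python sketch returns the text between two marker strings. It is not the include-markdown plugin's implementation; the function name `extract_between`, the choice to exclude the markers themselves, and the error handling for missing markers are assumptions made for this example.

```python
def extract_between(text: str, start: str, end: str) -> str:
    """Return the part of `text` between the `start` and `end` markers.

    Hypothetical helper for illustration only; the markers themselves
    are excluded from the result.
    """
    start_idx = text.find(start)
    if start_idx == -1:
        raise ValueError(f"start marker not found: {start!r}")
    start_idx += len(start)

    end_idx = text.find(end, start_idx)
    if end_idx == -1:
        raise ValueError(f"end marker not found: {end!r}")

    return text[start_idx:end_idx]


# A stand-in for the README; only the marked region is returned.
readme = (
    "## Publications\n"
    "<!--start-publications-->\n"
    "If you use the Snitch cluster ...\n"
    "<!--end-publications-->\n"
)
section = extract_between(
    readme, "<!--start-publications-->", "<!--end-publications-->"
)
print(section)
```

Keeping the publication list in `README.md` and selecting it by markers means the list is maintained in one place, which is what lets this change shrink `docs/publications.md` to a single directive.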
@@ -37,7 +37,7 @@ The FREP instruction has the following signature:
| max_inst | max_rpt | stagger_max | stagger_mask | 0 | OP-CUSTOM1 | FREP.I |
| max_inst | max_rpt | stagger_max | stagger_mask | 1 | OP-CUSTOM1 | FREP.O |

FREP.I and FREP.O repeat the *max_inst + 1* instructions following the FREP instruction for *max_rpt + 1* times. The FREP.I instruction (*I* stands for inner) repeats every instruction the specified number of times and moves on to executing and repeating the next. The FREP.O instruction (*O* stands for outer) repeats the whole sequence of instructions *max_rpt + 1* times. Register staggering can be enabled and configured via the *stagger_mask* and *stagger_max* immediates. A detailed explanation of their use can be found in the Snitch [paper](/publications).
FREP.I and FREP.O repeat the *max_inst + 1* instructions following the FREP instruction for *max_rpt + 1* times. The FREP.I instruction (*I* stands for inner) repeats every instruction the specified number of times and moves on to executing and repeating the next. The FREP.O instruction (*O* stands for outer) repeats the whole sequence of instructions *max_rpt + 1* times. Register staggering can be enabled and configured via the *stagger_mask* and *stagger_max* immediates. A detailed explanation of their use can be found in the Snitch [paper](../../publications.md).
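
The difference between the two variants is easiest to see as an issue order. The Python sketch below is a behavioural model based only on the paragraph above, not on the hardware sequencer: `frep_trace` and the instruction names in `body` are hypothetical, and register staggering (`stagger_mask`, `stagger_max`) is deliberately not modelled.

```python
from typing import List


def frep_trace(body: List[str], max_rpt: int, outer: bool) -> List[str]:
    """Return the order in which the instructions in `body` are issued.

    `body` stands for the max_inst + 1 instructions following the FREP
    instruction; each of them is issued max_rpt + 1 times in total.
    """
    repetitions = max_rpt + 1
    if outer:
        # FREP.O: repeat the whole sequence, one full pass after another.
        return [inst for _ in range(repetitions) for inst in body]
    # FREP.I: repeat each instruction before moving on to the next one.
    return [inst for inst in body for _ in range(repetitions)]


body = ["fmadd", "fadd"]  # two instructions, i.e. max_inst = 1

print(frep_trace(body, max_rpt=2, outer=True))
# ['fmadd', 'fadd', 'fmadd', 'fadd', 'fmadd', 'fadd']

print(frep_trace(body, max_rpt=2, outer=False))
# ['fmadd', 'fmadd', 'fmadd', 'fadd', 'fadd', 'fadd']
```

Both variants issue the same number of instructions, *(max_inst + 1) × (max_rpt + 1)*; only the interleaving differs.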

The assembly instruction signature follows:

5 changes: 4 additions & 1 deletion docs/rm/peripherals.md → docs/rm/hw/peripherals.md
@@ -2,4 +2,7 @@

This section documents the registers exposed by the Snitch cluster to interface with various cluster-level peripherals, including the performance counters.

{% include-markdown '../generated/peripherals.md' %}
{%
include-markdown '../../generated/peripherals.md'
rewrite-relative-urls=false
%}
@@ -1,3 +1,6 @@
!!! warning
This page is no longer maintained, and may contain outdated information.

# Reqrsp Interface

The `reqrsp_interface` (request and response) is a custom interface based on
9 changes: 6 additions & 3 deletions hw/snitch/doc/index.md → docs/rm/hw/snitch.md
@@ -1,3 +1,6 @@
!!! warning
This page is no longer maintained, and may contain outdated information.

# Snitch

Snitch is a single-stage, single-issue, in-order RISC-V core (RV32I or RV32E)
@@ -7,7 +10,7 @@ configurable and can be used in a plethora of different applications.
The core has an optional accelerator interface which can be used to control and
off-load RISC-V instructions. The load/store interface is a dual-channel
interface with a separately handshaked request and response channel. More
information can be found [here](../../rm/reqrsp_interface).
information can be found [here](reqrsp_interface.md).

This folder contains the main Snitch core, incl. L0 translation lookaside buffer
(TLB), register file and load store unit (LSU).
@@ -84,8 +87,8 @@ the Snitch core and a file list you can:
| acc_prsp_i | `bits(acc_resp_t)` | In | Accelerator response information. |
| acc_pvalid_i | `1` | In | Accelerator response is valid. *AXI-style handshake.* |
| acc_pready_o | `1` | Out | Accelerator response has been accepted by the core. |
| data_req_o | `bits(dreq_t)` | Out | Load/store request. See [reqrsp interface](../../rm/reqrsp_interface). |
| data_rsp_i | `bits(drsp_t)` | In | Load/store response. See [reqrsp interface](../../rm/reqrsp_interface). |
| data_req_o | `bits(dreq_t)` | Out | Load/store request. See [reqrsp interface](reqrsp_interface.md). |
| data_rsp_i | `bits(drsp_t)` | In | Load/store response. See [reqrsp interface](reqrsp_interface.md). |
| wake_up_sync_i | `1` | In | Deprecated. Tie-low. |
| ptw_valid_o | `2` | Out | Instruction or data TLB missed. Page table walking request. |
| ptw_ready_i | `2` | In | Instruction or data miss has been accepted. |
@@ -1,3 +1,6 @@
!!! warning
This page is no longer maintained, and may contain outdated information.

# Snitch Cluster

This IP contains a cluster of Snitch cores, arranged in a specific (but
1 change: 0 additions & 1 deletion docs/rm/reqrsp_interface.md

This file was deleted.
