Open questions/assumptions:

Q: Impose a minimum vector length of 4 for speed?
A:
Q: Trig functions range
A: 0..2pi
Q: Work with matrix objects or on arrays?
A: Raw arrays as default
Q: Work with explicit C99 native types for DSP library?
A: Yes, no savings in hidden structures, except for types
Q: Library vs API portability?
A: ??
Q: 32/64 bit portability
A:
Q: Argument order: "natural", or compatible with the "other" library?
A: Natural order, except when we want to implement an exact API
Q: Saturation/overflow policy
A: GIGO
Q: Include scatter gather in basic memcpy operations
A: Yes, fundamental to DMAs, should be in the HAL
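
As a rough illustration of what "scatter gather in basic memcpy operations" could look like, here is a minimal sketch in plain C. The names `sg_entry` and `sg_memcpy` are hypothetical and not part of any existing API. A real HAL would presumably hand the descriptor list to a DMA engine rather than loop over `memcpy`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: one contiguous chunk of a scattered transfer. */
struct sg_entry {
    void       *dst;     /* where this chunk lands */
    const void *src;     /* where this chunk comes from */
    size_t      nbytes;  /* chunk length in bytes */
};

/* Copy every chunk in list order; returns the total number of bytes copied.
 * Chunks must not overlap, since each one is moved with memcpy. */
static size_t sg_memcpy(const struct sg_entry *list, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(list[i].dst, list[i].src, list[i].nbytes);
        total += list[i].nbytes;
    }
    return total;
}
```

A plain contiguous copy is then just the special case of a list with one entry.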
Q: Include barrier?
A: Yes, fundamental to all models, based on absolute address like mutex
Q: Perfect API compatibility?
A: The ability to run the exact same code on multiple platforms is magic!
Strive for same-name, no-change compatibility with MPI and POSIX.
Q: What DSP/math APIs were researched?
A:
- SAL: http://opensal.net/
- VSIPL: http://portals.omg.org/hpec/files/vsipl/software3.html
- LIQUID: http://liquidsdr.org/
- NUMERIX: http://www.numerix-dsp.com/siglib.html
- CMSIS: http://www.keil.com/pack/doc/CMSIS/DSP/html/index.html
- OPENCV: http://docs.opencv.org/
- OPENVX: https://www.khronos.org/openvx/
- CUDA: http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf
- MKL:
- EIGEN:
- BLAS:
- FFTW:
Q: Which POSIX starting point
A: Bionic, newlib, uClibc, other?
Q: int dd or explicit structure type?
A: file descriptor..
Q: Should we have the ability to open device as stream, what does it mean?
A: No, that is one level up (fifo = memory + state).
Q: Use argc/argv for passing variables?
A: Make the argv of type void* args[], not just chars
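
A minimal sketch of what a `void *args[]` calling convention could look like for a compute kernel. The kernel name `saxpy_kernel` and the argument layout are assumptions made for illustration only. The caller and callee have to agree on the position and type of every entry.

```c
#include <stddef.h>

/* Hypothetical kernel entry point. Assumed layout of args[]:
 *   args[0] -> size_t  n   (element count)
 *   args[1] -> float   a   (scale factor)
 *   args[2] -> float   x[] (input)
 *   args[3] -> float   y[] (input/output)
 * Returns 0 on success, -1 if the argument count is wrong. */
static int saxpy_kernel(int argc, void *args[])
{
    if (argc != 4)
        return -1;

    size_t       n = *(const size_t *)args[0];
    float        a = *(const float *)args[1];
    const float *x = args[2];
    float       *y = args[3];

    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
    return 0;
}
```

A caller would build the array with something like `void *args[] = { &n, &a, x, y };` and pass `4` as `argc`.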
Q: NAN, denormal numbers in math
A: pass through (gigo)
Q: Error checking in library
A: How about a "debug" #define flag that gets turned off in production.
(would prefer not having it at all...)
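
One way the "debug" flag idea could be sketched is below. `PAL_DEBUG`, `P_CHECK`, and `vec_add` are hypothetical names, not existing symbols. With the flag undefined, the check compiles to nothing, so production builds pay no cost.

```c
#include <stddef.h>

/* Hypothetical debug switch: build with -DPAL_DEBUG to enable checks. */
#ifdef PAL_DEBUG
#define P_CHECK(cond, err) do { if (!(cond)) return (err); } while (0)
#else
/* sizeof keeps the expression type-checked without evaluating it. */
#define P_CHECK(cond, err) do { (void)sizeof(cond); } while (0)
#endif

/* Element-wise add; argument validation exists only in debug builds. */
static int vec_add(const float *a, const float *b, float *c, size_t n)
{
    P_CHECK(a != NULL && b != NULL && c != NULL, -1);
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
    return 0;
}
```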
Q: Do we need a level of abstraction above the memory pointer?
A: Yes
Q: Do we still need a fast memcpy like interface?
A: Yes, but it's niche for shared memory systems
Q: What should the processor id be?
A: A simple integer
Q: At what level should we be doing work?
A: At the team level
Q: What memory operations to support?
A: host writing to epiphany registers-->p_write
epiphany writing to other epiphany-->p_write
host to shared memory-->p_write, assign alloc to a team
Q: Assume contiguous memory in all buffers?
A: Yes
Q: Support no-holds-barred direct memory reads/writes to addresses?
A:
1. declare a team that holds all the cores in the system
2. create a 2D array of pal_mem_t structs, each one 32KB
Q: Is "remote malloc" on processor 'n' a bad idea?
A: No, but implementation might be tricky:-) Can it work in a request/signal fashion?
Q: Make all data movement calls non-blocking?
A: Yes, you can always add the blocking part in wrapper above.
Non-blocking is the lego block.
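
The layering described here could be sketched as follows. All names (`xfer`, `copy_start`, `copy_wait`, `copy_blocking`) are hypothetical. There is no DMA engine in this stand-in, so `copy_start` finishes the copy inline before returning; on real hardware it would only queue the transfer and something else would set the flag later.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical transfer handle; on real hardware the DMA engine or an
 * interrupt handler would set 'done' when the transfer completes. */
struct xfer {
    volatile int done;
};

/* Non-blocking primitive: start a copy and return immediately.
 * Stand-in only: the copy is performed inline here. */
static void copy_start(void *dst, const void *src, size_t n, struct xfer *x)
{
    x->done = 0;
    memcpy(dst, src, n);
    x->done = 1;
}

/* Spin until the transfer behind the handle has completed. */
static void copy_wait(const struct xfer *x)
{
    while (!x->done) {
        /* busy-wait */
    }
}

/* Blocking wrapper built on top of the two primitives above. */
static void copy_blocking(void *dst, const void *src, size_t n)
{
    struct xfer x;
    copy_start(dst, src, n, &x);
    copy_wait(&x);
}
```

The blocking call needs nothing beyond start and wait, which is why only the non-blocking form has to live in the core API.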
Q: pal_* or p_*
A: Clarity vs brevity?
Q: The building blocks.
A:
1. A current processor "X", ie "me".
2. Other processors "Y(s)", ie "not me"
3. A set of named memory buffers belonging to processor "X", AND "Y"
4. A team of N processors, at least one X and one Y
5. A startup program on processor X, ie "root processor".
Q: Make data structures completely opaque?
A: Yes (a processor could be almost anything)
Q: Return value as return argument or in argument list?
A: Prefer simple functions with short argument lists.
We know what functional programming would suggest..
Q: Typedefs for some structs
A: Yes, because the intent is to make these completely opaque
Q: Include events in argument list?
A: No. Complicated, can't think of a clean solution
Q: Include event information in memory/team objects
A: Read/write ordering and "done" is too tricky for everyone.
Q: Argument ordering in functions: inputs, flags, output
A: Having flags last does make some sense
Q: Argument passing to run
A: Enable passing arguments in "main style" and function-kernel style.
Contract between caller/callee program. Short cuts allowed
Q: Is it practical to run the same executable on bare metal and in Linux?
A:
Q: Use of "node" to represent a processor?
A: The reason I chose this name was to abstract away as much as possible the biases people have. The word "core" has been completely bastardized by the industry and spans anything from a multiply unit up to a CPU running Linux. The word processor could work, but also carries with it a heavy history and suggests a microprocessor. Node is non-descript which is a good thing. I also like that a node is often represented by a dot, which ironically is what compute hardware has become. Programmers should think about memory, dots, and lines when architecting algorithms and programs. Anything beyond that is too much mental overhead.
Q: Is there a need for a new memory object?
A: Yes, memory is not the same thing as a file and files are not appropriate as something to work with when doing HPC. We need very fast and direct manipulations of array data.
Q: Make ssize_t a first class citizen for memory manipulation
A: Yes, we can no longer assume that all writes succeed. The fault domain of clusters is making its way down to chips.
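
A minimal sketch of a write routine that reports its outcome through `ssize_t`, in the style of POSIX `write()`. The function name `mem_write` and the capacity parameter are assumptions for illustration. The caller gets either the number of bytes actually written, which may be fewer than requested, or -1 on failure.

```c
#include <string.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

/* Hypothetical write into a destination buffer of 'cap' bytes.
 * Returns bytes written (possibly a short write), or -1 on bad arguments. */
static ssize_t mem_write(void *dst, size_t cap, const void *src, size_t n)
{
    if (dst == NULL || src == NULL)
        return -1;
    if (n > cap)
        n = cap;  /* short write: only what fits */
    memcpy(dst, src, n);
    return (ssize_t)n;
}
```

Callers then compare the return value against the requested length instead of assuming success.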
Q: Why not make the PAL memory objects true file descriptors?
A:
Q: Math arg name output, change to 'res' for clarity
A:
Q: Include consts for input pointers to functions, any other keyword?
A:
