rtps-fpga/src/TODO.txt

* https://github.com/osrf/ros_dds/issues/7 tries to determine the feasibility of an FPGA-only DDS implementation
    It was suggested to use an 'openMSP430' soft microcontroller to run the DDS middleware on top.
    - Compare resource utilization and performance with this approach
* DDS FPGA Vendors
    - TwinOaks Computing Inc (CoreDX)
    - OpenVPX
* The implementation makes unnecessary transitions that are ignored in later stages.
    This was a design decision to reduce the complexity of each stage (and probably improve FMAX), but it increases power consumption.
* Is the Timestamp used for anything other than ordering by source? If not, does it have to be "sane"?
    2.2.3.16
    This QoS relies on the sender and receiving applications having their clocks sufficiently synchronized. If this is not the case
    and the Service can detect it, the DataReader is allowed to use the reception timestamp instead of the source timestamp in its
    computation of the ‘expiration time.’
    2.2.3.17
    The mechanism to set the source timestamp is middleware dependent.
    -   Well, it has to be synchronized with every writer updating the same instance.
* Are the Builtin Endpoints part of the DDS Specification or the RTPS specification?
    - Both
* ALL Data Fields are sent on each change:
    https://community.rti.com/forum-topic/sending-only-fields-have-changed
* OFFERED_INCOMPATIBLE_QOS, REQUESTED_INCOMPATIBLE_QOS? (DDS-1.4, 2.2.4.1 Communication Status)
* Does a RTPS Reader subscribe and receive ALL Instances of a keyed Topic?
    i.e. where are the instances differentiated?
    S.29 - Topic Kind
        - When a reader "subscribes" to a topic, it receives all the instances of the matched writers (May not be ALL instances of the topic in general)
* Does a RTPS reader subscribe to more than one Writer (If not using Groups)?
    - Yes
* Since only positive Sequence Numbers are valid, why is the type signed?
* Store only one locator? Select the first supported multicast? Check for active ping response? Verify with source address?
* What is the purpose of "manualLivelinessCount" in "SPDPdiscoveredParticipantData"? Since the liveliness is asserted by use of "ParticipantMessageData", what is the purpose of also sending an updated "SPDPdiscoveredParticipantData"? (Duplicates is out of the question, since it is not sent with the actual liveliness assertion)
* What does that mean?:
    2.2.3.19
    If reliability is BEST_EFFORT then the Service is allowed to drop samples. If the reliability is
    RELIABLE, the Service will block the DataWriter or discard the sample at the DataReader in order not to lose existing
    samples.
* What is now the valid Parameter List length?
    According to DDSI-RTPS 9.4.2.11
        The length encodes the number of octets following the length to reach the ID of the next parameter (or the ID of the sentinel). Because every parameterId starts on a 4-byte boundary, the length is always a multiple of four.
    According to DDS XTypes 7.4.1.2
        Unlike it is stated in [RTPS] Sub Clause 9.4.2.11 “ParameterList”, the value of the parameter length is the exact length of the serialized member. It does not account for any padding bytes that may follow the serialized member. Padding bytes may be added in order to start the next parameterID at a 4 byte offset relative to the previous parameterID.
* 32-bit Timestamps? Seriously? Ever heard of Y2k38?
* Use generic package with unconstrained arrays (VHDL-2008), and assert bounds/length inside the package.
* Count repository lines
    git ls-files | grep '\.vhd$' | xargs wc -l
* Count Field in Heartbeat/Acknack
    The following sentence is quite self-explanatory:
    "A counter that is incremented each time a new message is sent. Provides the means to detect
    duplicate messages that can result from the presence of redundant communication paths."
    But then, in 8.4.15.7 it says:
    "So, an implementation should ensure that same logical HEARTBEATs are tagged with the same Count."
    Does that mean there are cases where I have to use the same count? What is a logical HEARTBEAT?
* Should a "Keyed" Endpoint communicate with a "Non-Keyed"? (In the sense of Entity Kind)
* Is the empty String a valid Topic and Type Name?
* We can determine if an Endpoint is a Reader or Writer via the Entity ID. Is it illegal to get an SEDP message with an incompatible source (e.g. a Reader Entity ID from the Publications Announcer)?
* Can we make an array of records of unconstrained strings? That way we could make an array of variable sized strings...
* Should I also check for Minor_Version >= 4?
    - Yes: 8.6 Implementations of this version of the RTPS protocol should be able to process RTPS Messages not only with the same major version but possibly higher minor versions.
* If a DATA Submessage is invalid in any way, the Sequence Number is never marked as received, and thus processing of remote Endpoints could stall on corrupt Messages.
* Can a Participant unmatch an Endpoint by marking its announcing sequence number in a GAP message?
* Is DEADLINE per-INSTANCE or per-INSTANCE-and-WRITER?
    - Since the matching is per-WRITER the assumption would be per-INSTANCE-and-WRITER
* Only a sub-part of the DDS QOS are actually relevant for the RTPS. Should I remove the QoS Specifications from the RTPS Package?
* What happens if we get a sample with a source timestamp earlier than the last sample that was accessed by the DataReader when using DESTINATION ORDER BY_SOURCE_TIMESTAMP? Is the sample dropped?
* The spec does not define the serialized Key (KEY=1 DATA MESSAGE)
    - fast-rtps assumes it is the Key Hash
    - opendds sends Payload Encapsulation with a Key Holder Object (As defined in XType 7.6.8)
    - opensplice seems to do the same as opendds
* Currently the builtin-endpoint only acknowledges SNs, but does not negatively acknowledge any SN (the Bitmap is always empty).
  A writer usually responds with repairs only to negative acknowledgements.
* Currently an RTPS Writer with DURABILITY TRANSIENT_LOCAL sends historical data to all matched readers, regardless of whether they are VOLATILE or TRANSIENT_LOCAL.
* Assert Heartbeat period > Heartbeat Suppression Period
* Can I request (NACK) SNs that were NOT announced by the writer (> last_sn in Heartbeat)?
* Does AUTOMATIC Liveliness QoS also update the lease on write/assert_liveliness operations?
* The Lease Duration is also updated if the Cache Change is not accepted by the DDS/HC. This in effect "skews" the "correctness" of the Writer Liveliness Protocol until the reader has no pending request from the Writer.
* If an Instance is DISPOSED, but later has no active writers, the Instance STAYS in the NOT_ALIVE_DISPOSED state.
* Is a Writer that is Disposing an Instance also Unregistering that instance? (Currently only Unregistering removes the remote Writer)
    - No
* Since Lifespan is a duration, there is an inherent difference in the expiration time between writer and reader. This in addition to the fact that the reader may use the Reception time for the expiration time calculation could lead to an actual expiration duration almost double in length (If sent right before expiring locally in the writer).
* The current implementation will send a second unregister/dispose Sample if the user performs the unregister/dispose operation a second time. Should we handle that specially?
* If a Keyed Reader receives a DATA Message with no Key hash and no Payload, it will drop it since there is no way to determine the instance (And the SN will never be accepted).
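
For the Parameter List length question above (DDSI-RTPS 9.4.2.11 vs. DDS XTypes 7.4.1.2), a parser could
tolerate both readings by rounding the stated length up to the next 4-byte boundary before looking for the
next parameterId. A minimal Python sketch of that idea; the function names are hypothetical and not part of
this repository:

```python
def padded_length(length: int) -> int:
    """Round a parameter length up to the next multiple of four."""
    return (length + 3) & ~3


def next_parameter_offset(offset: int, length: int) -> int:
    """Offset of the next parameterId, given the offset of the current one.

    A parameter header is 4 octets (2 octets parameterId, 2 octets length).
    """
    return offset + 4 + padded_length(length)
```

A length that is already a multiple of four (the RTPS reading) is left unchanged, while an exact member
length (the XTypes reading) is padded, so both encodings lead to the same next offset.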


* Fast-RTPS does not follow DDSI-RTPS Specification
    - Open Github Issue
        https://github.com/eProsima/Fast-RTPS/issues/1221
    - Seems that Fast-RTPS is also not checking the Validity of Submessages according to Spec

* DDSI-RTPS 2.3 ISSUES
    - 8.2.2 The History Cache
        The 'get_change()' operation depicted in 8.3 is not documented.
    - 8.2.2.4 get_seq_num_min
      8.2.2.5 get_seq_num_max
        These assume a history cache with duplicate-free sequence numbers, but because sequence numbers are
        generated per writer, and a reader can be matched with multiple writers, we can get duplicate
        sequence numbers of different changes from different writers.
        Ergo the functions are non-deterministic.
    - 8.3.7.7 InfoDestination
        'This message is sent from an RTPS Writer to an RTPS Reader to modify the GuidPrefix used to
        interpret the Reader entityIds appearing in the Submessages that follow it.'
        But state is changed as follows 'Receiver.destGuidPrefix = InfoDestination.guidPrefix'.
        Isn't Reader -> Writer also valid? Does it have a specific direction?
    - 9.4.5.3 Data Submessage
        writerSN is incorrectly shown as only 32 bits in width
    - 8.2.3 The RTPS CacheChange
        Add IDL Specification for CacheChange_t
    - 8.3.4 The RTPS Message Receiver, Table 8.16 - Initial State of the Receiver
        Port of UnicastReplyLocatorList should be initialized to Source Port.
    - 8.3.7.4.3 Validity
        gapList.Base >= gapStart
    - 8.3.7.10.3 Validity
        'This Submessage is invalid when the following is true:
            submessageLength in the Submessage header is too small'
        But if InvalidateFlag is set, Length can be Zero. Since the length is unsigned, there cannot be an invalid length.
    - 8.4.7 RTPS Writer Reference Implementation
        According to 8.2.2 the History Cache (HC) is the interface between RTPS and DDS, and can be invoked
        by both RTPS and DDS Entities.
        8.2.9 further states 'A DDS DataWriter, for example, passes data to its matching RTPS Writer through
        the common HistoryCache.', implying that a DDS Writer adds changes directly to the HC, which is then
        read by the RTPS writer and handled accordingly. This means that the DDS Writer is responsible for
        assigning Sequence Numbers.
        This goes against 8.4.7, which states that the RTPS Writer is adding the Cache Changes to the HC
        and is responsible for assigning Sequence Numbers.
    - 8.7.2.2.1 DURABILITY
        'While volatile and transient-local durability do not affect the RTPS protocol'
        But in case of Durability TRANSIENT_LOCAL the writer has to send historical Data.
    - 8.7.2.2.3 LIVELINESS
        'If the Participant has any MANUAL_BY_PARTICIPANT Writers, implementations must check periodically
        to see if write(), assert_liveliness(), dispose(), or unregister_instance() was called for any of
        them.'
        Elaborate if "any of them" does specify all Writers of the Participant, or only the Writers with
        MANUAL_BY_PARTICIPANT Liveliness.
    - 8.7.3.2 Indicating to a Reader that a Sample has been filtered
        Text refs 8.3.7.2.2 for DataFlag, but should also ref 8.7.4 for FilteredFlag
    - 9.4.5.1.2 Flags
        Clarify from where the endianness begins.
        One might think it would begin after the Submessage Header, but the length is also endian dependent.
    - 9.4.5.3.1 Data Flags
        "D=1 and K=1 is an invalid combination in this version of the protocol."
        Does this invalidate the Submessage? Does 8.3.4.1 apply (Invalidate rest of Message)?
    - 9.4.5.1.3 octetsToNextHeader
        Similarly to "9.4.2.11" state that this is always a multiple of four.
    - 9.6.2.2.2 Table 9.14
        States that the builtinEndpointQos has no Default value, but according to
        8.4.13.3
            If the ParticipantProxy::builtinEndpointQos is included in the SPDPdiscoveredParticipantData, then the
            BuiltinParticipantMessageWriter shall treat the BuiltinParticipantMessageReader as indicated by the flags. If
            the ParticipantProxy::builtinEndpointQos is not included then the BuiltinParticipantMessageWriter shall treat
            the BuiltinParticipantMessageReader as if it is configured with RELIABLE_RELIABILITY_QOS.
        which means that the default value is 0.
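
The get_seq_num_min / get_seq_num_max item (8.2.2.4, 8.2.2.5) above can be illustrated with a reader-side
cache fed by two writers. This is only a sketch: CacheChange here is a hypothetical stand-in with just two
fields, and the per-writer variant is one possible reading, not something the specification defines.

```python
from collections import namedtuple

# Hypothetical minimal stand-in for a CacheChange; only the fields needed here.
CacheChange = namedtuple("CacheChange", ["writer_guid", "sequence_number"])

# Reader-side cache holding changes from two matched writers.
reader_cache = [
    CacheChange("writer_A", 1),
    CacheChange("writer_A", 2),
    CacheChange("writer_B", 1),
    CacheChange("writer_B", 2),
]


def get_seq_num_min(cache):
    """Smallest sequence number in the cache, ignoring the writer."""
    return min(c.sequence_number for c in cache)


def changes_with_sn(cache, sn):
    """All changes that carry the given sequence number."""
    return [c for c in cache if c.sequence_number == sn]


def seq_num_min_per_writer(cache):
    """Smallest sequence number for each writer GUID."""
    result = {}
    for c in cache:
        current = result.get(c.writer_guid)
        if current is None or c.sequence_number < current:
            result[c.writer_guid] = c.sequence_number
    return result
```

Here the minimum sequence number is 1, but two different changes carry it, so the number alone does not
identify a single change unless it is paired with the writer GUID.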


* DDS 1.4 ISSUES
    - 2.2.3 Supported QoS
        Partition is marked as RxO=No, but should be RxO=Yes? Or not?
        -   Existing Issue: https://issues.omg.org/issues/DDS15-245

*   Source Port of SPDP is irrelevant, since it is BEST EFFORT and we do not reply (only Destination Port is of significance)


DESIGN DECISIONS
================

*   !REJECTED!
    In order to save memory GUID should only be saved once.
    Decision was made to replace GUID with internal reference index.
    Discovery module is responsible for saving the GUID and mapping it to a reference index, which can then be used by other entities.
    Writer Endpoints may need access to the real GUID for message fields.
    2 options exist:
        - All Endpoints have access to the central memory where the real GUID is saved (needs Arbiter, handle starvation)
        - Writer Endpoints fill the fields with the reference index as placeholder, and a separate Entity will access the central memory and replace the actual values
    The Second option was chosen (Less resources)
    RTPS Handler should lookup received message GUID in central memory (The lookup should happen in parallel with the actual message handling):
        - If not stored, and message not for Built-in Endpoints, drop message
        - If in memory, replace with reference index
    The central memory is accessed by 3 Entities:
        - RTPS Handler (READ, GUID Lookup)
        - Placeholder Handler (READ, GUID Lookup)
        - Discovery Module (WRITE, GUID Save) [Need initial Lookup? RTPS Handler should have already handled it. How does DM know if actual GUID or reference index?]
    Use a 2-port RAM with an arbiter for READ operations (Give Placeholder Handler priority to prevent DoS starvation)

*   !REJECTED! (Use the unused extra flags in the stored participant data)
    Use the lowest bit of the Heartbeat/Acknack Deadline stored in the Participant Data to differentiate
    between Delay and Suppression. This reduces the resolution from 0.23 ns to 0.47 ns
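
The resolution figures above can be checked with a short calculation. This sketch assumes the stored
deadline uses a 32-bit fraction field counting units of 2**-32 seconds; the names are hypothetical.

```python
FRACTION_BITS = 32  # assumption: fraction field counts units of 2**-32 s


def resolution_ns(usable_bits: int) -> float:
    """Smallest representable time step in nanoseconds."""
    return 1e9 / (1 << usable_bits)


full = resolution_ns(FRACTION_BITS)         # all 32 fraction bits used for time
reduced = resolution_ns(FRACTION_BITS - 1)  # lowest bit repurposed as a flag
```

Giving up the lowest bit halves the number of usable fraction bits' range, which doubles the step size.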

*   Originally we stored the mask of local matching endpoints in the memory frame of the remote endpoint
    in order to be able to send MATCH frames only to new matches, and UNMATCH frames only to previously
    matched local endpoints. This decision was reverted, and we now just send MATCH frames to the currently
    matched local endpoints (regardless of whether they are already matched) and UNMATCH frames to the
    rest of the local endpoints (regardless of whether they were previously matched).
    So we basically push the responsibility to the local endpoints, which have to handle these situations
    accordingly. Since META traffic is not supposed to be generated often, this should not produce
    any significant overhead. As an optimization, UNMATCH frames can be ignored for newly matched
    remote endpoints.
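
The consequence of the decision above is that local endpoints must treat MATCH and UNMATCH frames as
idempotent. A minimal Python model of that behaviour (not the VHDL implementation; all names are
hypothetical):

```python
def meta_frames(local_endpoints, matching):
    """Frame sent to every local endpoint for one remote endpoint update."""
    return {name: ("MATCH" if name in matching else "UNMATCH")
            for name in local_endpoints}


class LocalEndpoint:
    """Keeps the set of remote endpoints it is currently matched with."""

    def __init__(self):
        self.matched_remotes = set()

    def on_frame(self, remote, frame):
        if frame == "MATCH":
            # Adding an already matched remote changes nothing.
            self.matched_remotes.add(remote)
        else:
            # Removing a remote that was never matched changes nothing.
            self.matched_remotes.discard(remote)
```

Repeating the same update any number of times leaves each local endpoint in the same state, which is what
makes the stored match mask unnecessary.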

*   The HEARTBEATs are sent out together with the liveliness assertions. This adds a 96-Byte overhead
    to the output RTPS Message. This was done to prevent having to loop through the memory to find
    remote participant destination more than once.

*   The Publisher, Subscriber, and Message Data are written to separate RTPS Messages, even though they are
    sent simultaneously. This decision was made to support as many local Endpoints as possible. We could
    make a compile-time check and send them in the same RTPS Message/UDP Packet, but the overhead is
    quite small and not worth the hassle.

*   Even though the Reader does not need to keep track of received SN with respect to each Writer with
    exception of the Highest/Last received (since it only keeps the SN in order and does only need to
    request from the last stored SN on), the writer does need to keep track of the requested SN (and
    possibly also the acknowledgements).
    This could be solved by either storing the SN in a bitmap in the endpoint data, or by storing the
    requester bitmap (endpoint data address) in the change data.
    But since the writer might drop SNs in any order, the highest and lowest SN inside the cache history
    are unbounded. We can thus only reference still available SNs, and not GAPs.
    In order to accommodate that, we could store the lowest (and possibly highest) SN of a requested
    lost SN and always send ALL GAPs in that range.

*   The meta_data (sample info) of a cache change is fixed size, and a cache change may be connected to
    data (payload), which may be variable in size. For this reason, we store the cache change and
    payload in separate memories. The payload size may either be fixed (in which case the memory frame
    sizes are adjusted according to that), or may be variable, in which case the payload is stored in
    a linked list of predefined sized memory frames. The first word of a payload contains the address
    of the next linked memory frame. If this is the last frame (or if the payload is static and there
    are no linked frames), the address is MAX_ADDRESS.
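
The linked-frame layout described above can be modelled in a few lines of Python. This is a sketch under
assumed parameters (frame size, memory size, and MAX_ADDRESS being the highest word address), not the VHDL
implementation; all names are hypothetical.

```python
FRAME_SIZE = 4                  # words per frame (hypothetical); word 0 holds the link
MEMORY_WORDS = 16               # total words in the payload memory (hypothetical)
MAX_ADDRESS = MEMORY_WORDS - 1  # assumed terminator: the highest word address


def write_payload(memory, free_frames, payload):
    """Store payload words in linked frames; return the address of the first frame."""
    data_per_frame = FRAME_SIZE - 1
    chunks = [payload[i:i + data_per_frame]
              for i in range(0, len(payload), data_per_frame)] or [[]]
    frames = [free_frames.pop(0) for _ in chunks]
    for idx, (addr, chunk) in enumerate(zip(frames, chunks)):
        is_last = idx == len(frames) - 1
        # First word of each frame: address of the next frame, or the terminator.
        memory[addr] = MAX_ADDRESS if is_last else frames[idx + 1]
        memory[addr + 1:addr + 1 + len(chunk)] = chunk
    return frames[0]


def read_payload(memory, first_frame, length):
    """Follow the links and collect `length` payload words."""
    words, addr = [], first_frame
    while len(words) < length:
        take = min(FRAME_SIZE - 1, length - len(words))
        words.extend(memory[addr + 1:addr + 1 + take])
        if memory[addr] == MAX_ADDRESS:
            break
        addr = memory[addr]
    return words
```

The frames of one payload do not need to be contiguous or in ascending order, since each frame names its
successor explicitly.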

*   !REJECTED! The last bit of this address is the "occupied" bit. This bit signifies if the memory
    frame is used or free, and is used for the insert operation to find a new empty slot. This in
    effect means that all frame sizes have to be a multiple of 2 (all frame addresses have to be
    aligned to 2).

*   If the last payload slot of a variable sized payload is not aligned with the actual end of the
    Payload slot, we mark this via a bit in the sample info memory, and store the last address of the
    actual payload in the last address of the payload slot.

*   !REJECTED! The History Cache (HC) is the interface between RTPS and DDS. The History Cache contains
    the Sample Info and Payload memories. The HC has two input "sides", one is connected to the DDS
    and one to the RTPS entity. Housing the memories inside the HC entity and abstracting the direct
    memory address via opcode requests allows the memory interface to be replaced in future (e.g. AXI
    Lite). Since all memory operations are handled by the same entity, this allows some state keeping
    to improve memory bandwidth. More specifically the "linked list" paradigm can be extended to also
    reference empty slots (and next unread slots), to allow selecting empty slots without iterating
    through the whole memory. Originally the memory was to be implemented in a true dual port fashion,
    and two separate processes would each satisfy the requests from one input side. This would allow
    concurrent RTPS and DDS requests to be handled. The write concurrency (add and remove change) does
    not allow for state keeping (first empty slot address), since it is reset by the "adding" side, but
    set by the "removing" side. Because of this, it was decided against concurrent input handling in
    light of the fact that the history cache will be most commonly quite large in size, and iterating
    through all...

*   Since most of the DDS QoS need information that is directly available to the History Cache (HC),
    it makes sense to integrate most of the DDS functionality directly into the HC to save space
    and improve performance. Furthermore, the stored information needed for a DDS Entity is different enough
    from the generic HC defined in the RTPS Specification to warrant a separate entity for both.
    The DDS Entity will directly connect to the RTPS Endpoint. A separate generic HC will be
    implemented, that follows the RTPS Specification.
    The RTPS Endpoint will have to output multiple versions of Changes, depending on the connected
    Entity, in order to facilitate this design decision.

*   Since the "reading" side needs to have consistent state during its processing, it does not make
    sense to implement dual port RAMs for the History Cache.

*   Since the RTPS Writer only gets ACKNACK Messages from the matched Readers, and these Messages are
    dropped by the rtps_handler if smaller than expected, we do not need a "READ GUARD" in the RTPS
    Writer.

*   Because "Once Acknowledged, Always Acknowledged" the Base of an ACKNACK can only be bigger or
    equal to the SN of the last ACKNACK. It is also reasonable, that the Reader will always request
    ALL missing segments each time it sends an ACKNACK (i.e. does not assume once requested, always
    requested until reception). This means that during the ACKNACK response delay, we can just parse
    the new request bitmap and overwrite the last old one.
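
The overwrite strategy above can be sketched as follows. The bit numbering (bit 0 is the most significant
bit of the first 32-bit word) is assumed here, and ignoring an ACKNACK whose base moves backwards is an
assumption of this sketch rather than something stated above; all names are hypothetical.

```python
def requested_sns(base, num_bits, bitmap_words):
    """Sequence numbers marked as missing in a SequenceNumberSet.

    Assumption: bit i (0 = most significant bit of the first 32-bit word)
    stands for sequence number base + i.
    """
    missing = []
    for i in range(num_bits):
        word = bitmap_words[i // 32]
        if (word >> (31 - (i % 32))) & 1:
            missing.append(base + i)
    return missing


class PendingRequest:
    """Request state kept for one remote reader during the response delay."""

    def __init__(self):
        self.base = 0
        self.missing = []

    def on_acknack(self, base, num_bits, bitmap_words):
        # Once acknowledged, always acknowledged: ignore a base that moves back.
        if base < self.base:
            return
        # Overwrite instead of merging: the reader is assumed to repeat every
        # still-missing sequence number in each ACKNACK.
        self.base = base
        self.missing = requested_sns(base, num_bits, bitmap_words)
```

Because the newest bitmap replaces the previous one, only a single base and bitmap per remote reader need
to be stored while the response delay is running.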

PROTOCOL NON-COMPLIANCE
=======================
* Partition QoS
* Coherent Sets
* Built-in Endpoint is NOT the same as a normal Endpoint
    -> No User access to Data
* Known but unused Submessage IDs are treated as unknown
    -> No validity check
* Inline QoS validated in Endpoint
    -> Cannot invalidate Rest of Message/Packet
* RESOURCE_LIMITS applies also to "empty" samples (Samples with no valid data).
* Write/Dispose/Unregister Operations do not return (TIMEOUT), i.e. the MAX_BLOCKING_TIME is not used.


RTPS ENDPOINT
=============

* 8.2.6
    topicKind   Used to indicate whether the Endpoint supports instance lifecycle management operations (see 8.7.4).