I have been trying to get SAMBA4 running on ROS but I believe I have run into a bug in AFD. Based on reviewing the code and adding instrumentation to trace it I believe the issue stems from mishandling the buffer structures sent in as parameters via winsock. WSASendTo accepts a 'buffer array' for its data source, which arrives as a pointer to the base of the array and a count of buffer items. The array itself is a simple list of structures containing sizes and pointers to data blocks (plural). Winsock validates the parameters and passes them down to AFD which is responsible for locking and duplicating the buffers for the driver layer. In order to work properly and minimize the use of pool memory, AFD would need to perform a 'gather' operation on the buffers bringing them into a single new MDL allocation making the data contiguous and stripping away slack space (the source buffers need not be contiguous, that is not a documented requirement for the source buffers).
Unfortunately the code in AFD does not seem handle the process of cloning the 'buffer array' properly, and the Datagram function in AFD's write.c call TdiSendDatagram() using the size and address of the first buffer in the array and simply ignore the remaining elements. As a result of this combination the AFD code works fine for buffer arrays with a single element, but fails badly with any buffer arrays that actually do contain multiple elements sending only the first element which the application detects as an incomplete packet and discards. the code does not attempt to send the remaining buffers separately, it simply discards them.
Samba4 uses this method for assembling its control messages building them from several pieces stored in different places the first of which is a short ID cookie. The receiver gets that but discards the packet as invalid because there is no payload. As a result even the unit test application for the messaging library in Samba4 fails to run on ROS even though it works correctly on Windows and does not make use any newer API functions not yet implemented in ROS.
I'm pretty sure this condition results from two separate bugs, but the biggest issue is the mishandling of buffers in the AFD LockBuffers() function. When combined with the erroneous use of a single buffer in AfdConnectedSocketWriteData() and AfdPacketSocketWriteData() which never check the buffer count parameter passed as an input parameter yields broken datagram send functions.
I created a small hack to properly determine the array length before calling TdiSendDatagram(), but that did not prove to work because the buffers are allocated individually and teh slack space is not removed so copying the proper number of data bytes is not enough.
This second failure is what gives me the impression that LockBuffers needs significant revision as well. Either the allocations are not consecutive in memory, or the 'slack' space between them is not being eliminated resulting in sending the slack space rather than the data. I am doing some testing to see if sending the entire allocation area resolves the issue temporarily, but that is far from efficient since the buffers appear to be pre-allocated large enough to house an entire packet payload, and their combined length makes for a big wasteful datagram, and may even exceed the allowable payload size for a datagram packet.