Discussions about active development projects
#1205 by Nils Roos
Fri Nov 28, 2014 7:49 pm
OK, time for an update.

I teamed up with BrandonKinman to build a network streamer for ADC data. The system has three components:
  1. a DMA capable extension to the scope
  2. a driver that allocates suitable memory and provides hardware access to userland
  3. a command line application that sends data from the driver over a network socket
On the other end of the network cable, we plugged a PC with GNU radio companion, running a small design that takes data from a TCP serversocket and plots it on a scope GUI.
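For readers who want a feel for component 3, here is a minimal sketch of the idea in Python. It is not the actual application: the function name `stream_samples` and the chunk size are made up for illustration, and the source is simply any file-like object that returns raw bytes.

```python
import io
import socket

CHUNK_SIZE = 64 * 1024  # bytes per read; an arbitrary choice for this sketch


def stream_samples(source, sock, chunk_size=CHUNK_SIZE):
    """Copy raw bytes from a file-like source to a connected socket.

    Stops when the source reports end-of-file and returns the number
    of bytes that were handed to the socket.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sock.sendall(chunk)  # blocks until the whole chunk is queued
        total += len(chunk)
    return total
```

On a real board, `source` would be the driver's device node opened in binary mode and `sock` the result of `socket.create_connection((host, port))`; both the node name and the address depend on your setup. Note that every `read`/`sendall` pair copies the data through the kernel, which is exactly the overhead the zero-copy discussion below is about.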

This worked quite well up to 34MSps, with one flaw: about 2% of the data looked like it came from an older acquisition cycle. Closer examination of these shadow samples revealed that 2% of the time the data was read from a stale cache line instead of DDR.
Forcibly disabling the cache on the DMA memory region resolved this issue, but as a consequence the maximum sustainable data rate dropped to about 8MSps. It seems that there is significant handling of the data being done by the kernel while it's in transit.

Following up on SampsaRanta's suggestions (see above), we are now looking into integrating netmap on the RedPitaya, and wiring it into a purpose-built network stack to achieve zero-copy transmission.
My next step is to get acquainted with the IP stack and to see what it would take to get it to not do anything with the buffers that it processes (apart from preparing and inserting headers and dividing it into chunks, obviously).

Any help with or advice on these matters is highly welcome.

At some point, a web front-end to the whole thing would also be required. Any takers ? 8-)
#1206 by bexizuo
Fri Nov 28, 2014 8:44 pm
Well ... at the moment I have only one test case in mind regarding the "2%" ...
Could you try running it slower, for example at 25MSps? Maybe there is a rare collision of speeds ... something like a "resonance" ...
#1209 by SampsaRanta
Sat Nov 29, 2014 5:41 pm
Hi Nils,

good work!

I used netmap as an example; it is not the one and only way to do this. It depends on how much power the CPU is able to give you..

There are other options that might or might not be easier..

You could look at something like:
http://stackoverflow.com/questions/2000 ... o-a-socket
(There seems to be some paper reference on DMA buffer reusing if I read correctly.. )

Or there is also the tun network driver, which in recent kernels seems to give userland a kind of zero-copy TX support; you might be able to send an skb directly and route it out through the network driver.. In fact, netmap also has something similar for virtual machine use..
"tun: experimental zero-copy tx support (commit)."

There are also ways to make the standard network stack more lightweight by optimizing away features you don't need..
http://www.linuxplumbersconf.org/2011/o ... rmance.pdf
(Some of this is more x86 related but might give you some idea of the network stack and features present and their possible impact on performance.)

I use netmap for 10G on already supported drivers, so it is easy for me. It is hard to compete with the performance of putting something directly into the NIC queue. But on the Red Pitaya this also requires quite some understanding of the driver internals, as I guess no ready-to-use solution is provided.

I dunno if this provides real help but seemed like a good idea while writing.. :)

#1301 by bexizuo
Thu Dec 18, 2014 5:52 pm
Hi Nils,

I have a different project on ARM (Cortex-A15), and I found that all USB traffic (USB-ETH gigabit) generates softirqs only on CPU0. Of course this depends on many configuration and hardware details (for example, Ethernet not attached via USB), but the kernel I used is configured so that only CPU0 handles softirqs, and I cannot set the affinity to use other CPUs.

maybe you can try to play with affinity ...
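
One way to experiment with this from userland is to pin the streaming process to a core other than the one that handles the network softirqs. The sketch below uses Python's `os.sched_setaffinity` (Linux only); the helper name `pin_to_cpu` and the choice of CPU 1 as the default are assumptions for illustration, not something taken from the project.

```python
import os


def pin_to_cpu(preferred=1):
    """Pin the calling process to a single CPU and return its number.

    Falls back to the lowest-numbered allowed CPU when `preferred`
    is not available (for example on a single-core system).
    """
    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    cpu = preferred if preferred in allowed else min(allowed)
    os.sched_setaffinity(0, {cpu})
    return cpu
```

This only moves the userland process. Where hardware interrupts are delivered is controlled separately through `/proc/irq/<n>/smp_affinity`, and whether that can be changed depends on the kernel and interrupt controller in use.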
#1559 by Zurd0
Thu Feb 19, 2015 2:26 pm
Your work has been very useful, but I would like to know if there is a possibility to increase the number of samples taken. I'm new to this environment and I want to change the number of samples taken from 1MS to 1.25MS so that it matches the frequency of 125MHz.

Thanks for all the great work; I'll be waiting for your answer.
#1717 by defreng
Tue Mar 17, 2015 4:33 pm
Hi Nils!

Thanks a lot for posting this project! This looks like a great starting point for epic speed enhancements regarding ADC readouts.

I just had a quick peek at the driver sources, and as you said it is still pretty basic and doesn't allow for user configurable settings (trigger, decimation, length etc.) yet. However, to me this looked not too hard to implement... Are you planning on doing so or have you already done this in a private branch?

Keep up the great work and thx!

#1718 by Nils Roos
Tue Mar 17, 2015 7:18 pm
We are currently in the process of wrapping the core functionality into a bazaar application. The first release will allow you to generate a network stream of raw sample data.
Options will be
  • channel A or B
  • decimation 8, 16, 32, 64, ..., 65536 (and all powers of 2 in between)
  • shaping filter and HV gain adjustment on/off
  • client or server
  • target address and port in case of client, listening port in case of server
  • acquisition length

Set the options, press the start button and your Red Pitaya will establish a connection to your data-sink and flood it with fresh samples.

We won't support triggering yet, because our focus is on continuous acquisition. It might be added later.

The lowest decimation is 8 because our network performance still tops out at 50MByte/s which is enough to stream 15MS/s comfortably. I don't expect to be able to push that boundary further anytime soon.
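
The arithmetic behind that limit can be sketched as follows. The 125MHz ADC clock and the 50MByte/s ceiling come from this thread; the 2 bytes per sample (one 14-bit sample stored in a 16-bit word) and a single streamed channel are assumptions of this sketch.

```python
ADC_CLOCK_HZ = 125_000_000      # ADC sampling clock
BYTES_PER_SAMPLE = 2            # assumption: 14-bit sample stored in 16 bits
LINK_LIMIT_BYTES = 50_000_000   # network ceiling quoted above, in bytes/s


def sample_rate(decimation):
    """Samples per second after decimating the ADC clock."""
    return ADC_CLOCK_HZ / decimation


def byte_rate(decimation):
    """Bytes per second for one streamed channel."""
    return sample_rate(decimation) * BYTES_PER_SAMPLE


def fits_link(decimation):
    """True if one channel at this decimation stays under the ceiling."""
    return byte_rate(decimation) <= LINK_LIMIT_BYTES


# Walk the powers of two from 4 to 65536 and show which ones fit.
for exponent in range(2, 17):
    d = 2 ** exponent
    print(f"{d:6d}  {sample_rate(d):14.1f} S/s  "
          f"{byte_rate(d) / 1e6:8.3f} MB/s  fits={fits_link(d)}")
```

Under these assumptions decimation 4 would need 62.5MB/s, which is over the ceiling, while decimation 8 needs 31.25MB/s for 15.625MS/s. At the other end, decimation 65536 gives roughly 1.9kS/s.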

If lower rates than ~2kS/s are desired, we can expand the range of allowed decimation. Just say so ;O)
#1719 by fisafisa
Tue Mar 17, 2015 10:15 pm
Very good job!
Is it going to be UDP or TCP?

I am involved in a major international project where we plan to build a large distributed control system over multicast UDP.
Non-blocking or cut-through switches will be employed, and 10G/40G network aggregation will be put in place.
I am interested in your application as a potential source of acquisition data for this project, to be used in prototypes or test suites.

So I am hoping that you are employing UDP ;)

Time stamping is another major issue that one will encounter in any serious application of the technology.
How do we time stamp the samples in order to correlate with other sources?

PTP could be used, just as a mechanism to calibrate an internal high-resolution counter from an external source of PTP packets. Or, even easier, use an external clock and a trigger (counter reset) connected via one of the FPGA pins to an FPGA-based counter.
The first method gives absolute time with a certain jitter (a hardware implementation I have seen achieved 50ns). The other gives relative time from a start point (very precise, and typically used in non-continuous experiments).

If you guys are interested in exploring this specific use case, I can explain in more detail and maybe somewhat help.

In any case, very good job!

#1729 by Nils Roos
Thu Mar 19, 2015 1:55 pm
Thanks for taking an interest. I forgot to list it, but another option in the settings will be "TCP or UDP".

We hadn't really thought about it until now, but you are right, timestamping is crucial for sample-transfer over UDP.

We would very much appreciate your input on that, please let me know what would be most convenient for you.
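
As a starting point for that discussion, here is one possible way to make UDP datagrams self-describing. Nothing here is decided: the header layout (a 32-bit sequence number followed by the 64-bit index of the first sample in the packet) and all function names are hypothetical. The sample index corresponds to the "relative time from a start point" approach, with the counter reset at a known instant `t0`.

```python
import struct

# Hypothetical header: u32 packet sequence number, u64 index of first sample,
# both in network byte order (12 bytes, no padding).
HEADER = struct.Struct("!IQ")


def make_packet(seq, first_sample_index, payload):
    """Prefix raw sample bytes with the sequence/index header."""
    return HEADER.pack(seq, first_sample_index) + payload


def parse_packet(packet):
    """Split a datagram into (seq, first_sample_index, payload)."""
    seq, first_index = HEADER.unpack_from(packet)
    return seq, first_index, packet[HEADER.size:]


def sample_time(first_index, offset, sample_rate_hz, t0=0.0):
    """Time in seconds of the sample at `offset` within a packet,
    relative to the counter reset at t0."""
    return t0 + (first_index + offset) / sample_rate_hz
```

With a header like this the receiver can detect lost or reordered datagrams from gaps in the sequence number and still place every surviving sample on the time axis. An absolute time source such as PTP would then only be needed to pin down `t0`.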
