Non-stop data write on a storage device

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Non-stop data write on a storage device

Post by melko » Mon Dec 15, 2014 8:54 pm

TL;DR: I'm having quite a hard time writing a non-stop data flow to my USB stick, so I'm looking for ideas :)

What I did so far:

I've implemented a design that uses Xilinx's DMA IP to transfer data from the PL to the OCM SRAM through the AXI HP bus.
The test design is quite simple:
within the PL, using the 125MHz clock, I increment a 64-bit counter that feeds an AXIS FIFO connected to the DMA IP (I've added a shared register, accessible from the PS, to reduce the data rate so that it is 125/2^(N+1) MHz);
then, with a C program, the PS starts the DMA transfers and writes the data from the SRAM to a file in an endless loop.
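For reference, the PS-side loop looks roughly like this (a simplified sketch, not the actual dma.c from the repo: the /dev/mem address, the transfer size and the output path are placeholders, and the AXI DMA register accesses are left out because they depend on the IP configuration):

/*
 * Sketch of the PS side: map the OCM buffer via /dev/mem, then loop
 * forever writing each completed DMA block to the output file.
 * BUF_PHYS and BUF_SIZE are hypothetical values.
 */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define BUF_PHYS 0xFFFC0000ULL   /* hypothetical OCM buffer address */
#define BUF_SIZE 0x10000UL       /* hypothetical transfer size in bytes */

int main(void)
{
    int memfd = open("/dev/mem", O_RDONLY | O_SYNC);
    FILE *out = fopen("/mnt/usb/capture.bin", "wb");
    if (memfd < 0 || out == NULL)
        return 1;

    uint8_t *buf = mmap(NULL, BUF_SIZE, PROT_READ, MAP_SHARED, memfd, BUF_PHYS);
    if (buf == MAP_FAILED)
        return 1;

    for (;;) {
        /* here the real program programs the AXI DMA registers,
         * starts a transfer and waits for it to complete */
        fwrite(buf, 1, BUF_SIZE, out);   /* this is the call that can stall */
    }
}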

Results so far:

Since the data sent from PL to PS is an incrementing counter, it's easy to check the correctness of the result: I just take the output file and verify that the sequence isn't broken (so after N comes N+1, and so on).
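A quick checker for that can be as simple as this (assuming the file is nothing but consecutive little-endian 64-bit counts):

/* Walk the capture file and report every gap in the 64-bit count sequence. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    FILE *f = fopen(argv[1], "rb");
    if (f == NULL)
        return 1;

    uint64_t value, expected = 0;
    while (fread(&value, sizeof value, 1, f) == 1) {
        if (value != expected)
            printf("hole: expected %" PRIu64 ", got %" PRIu64 " (%" PRIu64 " counts missing)\n",
                   expected, value, value - expected);
        expected = value + 1;
    }
    fclose(f);
    return 0;
}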

Now.. when the output file is located on the RAM filesystem, the results are impressive: I've measured the throughput to be around 200MB/s. But at that rate I fill the available RAM in 2s, so it's not of much use :D

The troubles come when the output file destination is the SD card, a USB stick, an NFS filesystem or whatever. Let's take the case of the USB stick (the same conclusion holds for the other, faster or slower, media).
I have a 64GB USB stick formatted with ext2 (previously I tried fat32 too). The stick seems to have a write speed of ~10MB/s (measured from the Red Pitaya itself), so I hoped I could reach at least half of that speed.

In the end, though, even though the speed at which I see the file growing matches the one I set (125MHz / 2^8 × 8 bytes ≈ 3.7MB/s), the file doesn't always contain incremental counts: it has some holes in it.

That means the PS sometimes doesn't manage to write the data fast enough, so the FIFO in the PL gets full and some data is dropped (I've attached the full signal of the FIFO to an LED, so I can actually see when the FIFO gets full).
The same happens even if I decrease the data rate from ~4MB/s to 1MB/s (it just takes longer to happen than at the higher data rate).

What I wonder now is: why can't I sustain a data flow of 1MB/s when the USB stick can achieve 10MB/s?

Initially I thought that maybe the FIFO in the PL was too small and couldn't hold the data while the PS was emptying the buffer. Since I couldn't make that FIFO any bigger (I'm already using almost all of the available BRAM; the FIFO is 64 bits × 24576 deep), as a quick hack I used Linux named pipes. I split the C software into two programs: one does the DMA transfers and writes the data from the SRAM to the named pipe, the other retrieves that data and writes it to the file (see the sketch below).
Since I made the pipe as big as I could (16MB), I hoped this would help.
Indeed it helped, but the issue is still there, just delayed in time. Now I can write about 1GB of data (at the 4MB/s rate) before seeing the cursed LED blinking, meaning the FIFO got full and data is being lost. Since 16MB is four seconds' worth of acquisition at 4MB/s, I don't think the too-small-FIFO hypothesis holds anymore, so I really wonder where the problem lies.
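The listener side of that hack is roughly this (just a sketch, not the actual listener.c from the repo; the FIFO path, output path and chunk size are made up):

/* Drain the named pipe and append everything to the output file.
 * The DMA program writes its blocks into the same FIFO. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

#define CHUNK (64 * 1024)

int main(void)
{
    mkfifo("/tmp/dma_pipe", 0666);                 /* ignore the error if it already exists */

    FILE *in  = fopen("/tmp/dma_pipe", "rb");      /* blocks until the DMA program opens it */
    FILE *out = fopen("/mnt/usb/capture.bin", "wb");
    char *chunk = malloc(CHUNK);
    if (in == NULL || out == NULL || chunk == NULL)
        return 1;

    size_t n;
    while ((n = fread(chunk, 1, CHUNK, in)) > 0)   /* returns 0 once the writer closes the pipe */
        fwrite(chunk, 1, n, out);

    free(chunk);
    fclose(in);
    fclose(out);
    return 0;
}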

As a desperate hypothesis I even thought some background task was the culprit, stealing precious CPU time from my task; I tried killing the nginx daemon, but nothing changed, and I don't really think this is the issue anyway, since the FIFO gets full sooner or later depending on the data rate I set (if a background task were the culprit, the data rate probably wouldn't matter that much).

In the end I'm running out of ideas, so I hope someone has the right hint to enlighten me.
The whole code is in the dma branch of my fork on github ( https://github.com/melko/RedPitaya ). Specifically, the code that generates the data is in the custom.v verilog file, the C software is split into dma.c and listener.c, and the block design is inside the system design, so you'll need to open it with Vivado to see it.

I probably missed quite a few important details, but it's late now; I'll recheck the post tomorrow and fix any errors ^^

john k2ox
Posts: 39
Joined: Sun Oct 05, 2014 6:47 pm
Location: New York
Contact:

Re: Non-stop data write on a storage device

Post by john k2ox » Mon Dec 15, 2014 11:30 pm

Have you tried sending to a hard disk via ethernet? SD cards are designed for different purposes. Some are very fast for small bursts, some are designed for long transfers.

Just a thought. Happy to see what you have so far.

Regards,
John

Nils Roos
Posts: 1441
Joined: Sat Jun 07, 2014 12:49 pm
Location: Königswinter

Re: Non-stop data write on a storage device

Post by Nils Roos » Tue Dec 16, 2014 12:00 am

When you say holes, what exactly are you talking about? Are there regions that are filled with zeroes, or are there sequences of incrementing values that just don't fit?

I am asking this, because I have a similar project (this one), where I would see blocks of data that seemed to be out of sequence.
This turned out to be a caching issue, and went away when I disabled the cache on my DMA region.

All in all, quite nice work you have done there. Congrats.
Nils

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Re: Non-stop data write on a storage device

Post by melko » Tue Dec 16, 2014 11:43 am

john k2ox wrote:Have you tried sending to a hard disk via ethernet? SD cards are designed for different purposes. Some are very fast for small bursts, some are designed for long transfers.

Just a thought. Happy to see what you have so far.

Regards,
John
Yeah, I tried with NFS; with that I managed to write a few GB at 7MB/s, but then the same thing happened (at 14MB/s the FIFO seems to get full quite soon, after just a few hundred MB).

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Re: Non-stop data write on a storage device

Post by melko » Tue Dec 16, 2014 12:03 pm

Nils Roos wrote:When you say holes, what exactly are you talking about? Are there regions that are filled with zeroes, or are there sequences of incrementing values that just don't fit?

I am asking this, because I have a similar project (this one), where I would see blocks of data that seemed to be out of sequence.
This turned out to be a caching issue, and went away when I disabled the cache on my DMA region.

All in all, quite nice work you have done there. Congrats.
Nils
The data I send is a sequential count, so what I see in the output file (with hexdump, or after decoding it) are numbers starting from 0 and incrementing by one (so 0 1 2 3 ... etc);
by holes I mean that I see something like 1000, 1001, 1002, 5000, which means the FIFO stayed full while the counter was in the missing range.
As a more specific example, these holes can be quite big: e.g. I see 15841878 and then 15931352, so this hole is 89474 counts long, which, given the clock rate I was using, means data was dropped for about 23ms. That's quite a long time.

Thanks for your suggestion, I'll check your thread to see how to disable caching on my DMA region.
Speaking about caching, I noticed a weird behavior when writing data to the file:
checking with the "free" command, I see that memory usage increases as I write data to the disk. That is expected, since the kernel is caching it in case it turns out to be useful later, but the strange thing is that the second line given by free, which should show the RAM used without counting the cache, grows as well, until it reaches about 450MB used. Why is that?
On my desktop PC this doesn't happen, so maybe it's due to some kernel configuration? Or maybe the free command provided by busybox works differently from the one in a standard GNU/Linux distribution?

Nils Roos
Posts: 1441
Joined: Sat Jun 07, 2014 12:49 pm
Location: Königswinter

Re: Non-stop data write on a storage device

Post by Nils Roos » Tue Dec 16, 2014 5:29 pm

In the meantime I had a closer look at your logic design, as well as your program.

After some more deliberation, I do not think that stale cache data is the cause of your problem. You said you observe the PL FIFO level getting full on the external LED, which is the point at which you start losing data. That is before caches come into play.

I'd suggest you put some profiling code into your listener and observe the performance of writing to the thumbdrive. Does it achieve a consistent write speed, or are there spikes or prolonged periods of slower writing which can't be bridged by the pipe buffer? If stalled writes to flash are not the problem, work backwards from there.

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Re: Non-stop data write on a storage device

Post by melko » Tue Dec 16, 2014 6:17 pm

Nils Roos wrote:In the meantime I had a closer look at your logic design, as well as your program.

After some more deliberation, I do not think that stale cache data is the cause of your problem. You said you observe the PL FIFO level getting full on the external LED, which is the point at which you start losing data. That is before caches come into play.

I'd suggest you put some profiling code into your listener and observe the performance of writing to the thumbdrive. Does it achieve a consistent write speed, or are there spikes or prolonged periods of slower writing which can't be bridged by the pipe buffer? If stalled writes to flash are not the problem, work backwards from there.
There's one detail I forgot to mention yesterday:
together with the LED, I put the FIFO count on a shared register, so I can actually see how many items are in the FIFO (if you had a look at the code, I mean the non-AXIS one).
Looking at this value during a run (with the monitor tool), it always stays below the PACKET_SIZE I've set, so I think that most of the time the listener manages to write fast enough, but sometimes (how often depends on the data rate) it can't.
That said, I think this is due to spikes of slower writing (what also makes me think that is that in a 5GB file, where the full LED turned on after the first GB, I see just 100MB of missing data, so most of the time the data was retrieved), but what puzzles me is why a 16MB pipe can't absorb it...

Speaking of the pipe, it seems I was wrong about the 16MB limit: that's the limit for unprivileged users, root can ask for more space. I'm running now with 32MB and I'll see if I can ask for more, hoping that beyond a certain size the issue won't happen anymore.
Probably I should start thinking about a less dirty solution than named pipes.
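For reference, asking for a bigger pipe from the program is just an fcntl() call (F_SETPIPE_SZ is Linux-specific and needs _GNU_SOURCE; capacities above /proc/sys/fs/pipe-max-size require root or raising that sysctl). A minimal sketch, with a made-up FIFO path:

/* Enlarge the capacity of an already-created named pipe. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    int fd = open("/tmp/dma_pipe", O_RDONLY | O_NONBLOCK);  /* hypothetical FIFO path */
    if (fd < 0)
        return 1;

    int size = fcntl(fd, F_SETPIPE_SZ, 32 * 1024 * 1024);   /* ask for 32MB */
    if (size < 0)
        perror("F_SETPIPE_SZ");
    else
        printf("pipe capacity is now %d bytes\n", size);
    return 0;
}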

About the profiler, could you explain a bit more what I could try?

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Re: Non-stop data write on a storage device

Post by melko » Tue Dec 16, 2014 7:57 pm

some updates:

with the 32MB pipe, the missing data is just 5MB (while it was 100MB with a 16MB pipe);
I tried a run with a 200MB pipe and after an hour (15GB worth of data) the LED is still off :D

Nils Roos
Posts: 1441
Joined: Sat Jun 07, 2014 12:49 pm
Location: Königswinter

Re: Non-stop data write on a storage device

Post by Nils Roos » Tue Dec 16, 2014 10:59 pm

Well, it looks as if occasionally there are very long stalls in writing to the thumbdrive. Profiling the writes was meant to give a clear picture of what kind of delays you are dealing with.

For example, count the outgoing data and measure the time it takes to write out each megabyte. To keep it simple, collect timestamps with gettimeofday(); the precision is ideal for the purpose. Write the times out to a file on /tmp, and maybe mark all values significantly above 0.2s (the expected write time of 1MB at a theoretical write speed of 5MB/s). This way you can easily spot long periods of slow writes, which would lead to over-filling the pipe.
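Roughly along these lines (just a sketch of the idea, not tied to your actual listener; the input/output paths and the 1MB chunk are placeholders):

/* Time each 1MB write to the thumbdrive and log the durations to /tmp,
 * flagging anything slower than 0.2s. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define CHUNK (1024 * 1024)

static double elapsed(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_usec - a.tv_usec) / 1e6;
}

int main(void)
{
    FILE *in  = fopen("/tmp/dma_pipe", "rb");        /* hypothetical data source */
    FILE *out = fopen("/mnt/usb/capture.bin", "wb");
    FILE *log = fopen("/tmp/write_times.txt", "w");
    char *chunk = malloc(CHUNK);
    if (in == NULL || out == NULL || log == NULL || chunk == NULL)
        return 1;

    size_t n;
    while ((n = fread(chunk, 1, CHUNK, in)) > 0) {
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        fwrite(chunk, 1, n, out);
        fflush(out);                                 /* so the timing covers the write() syscall */
        gettimeofday(&t1, NULL);

        double dt = elapsed(t0, t1);
        fprintf(log, "%.4f%s\n", dt, dt > 0.2 ? "  <-- slow" : "");
    }

    free(chunk);
    fclose(log);
    fclose(out);
    fclose(in);
    return 0;
}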

melko
Posts: 19
Joined: Fri Aug 08, 2014 7:13 pm

Re: Non-stop data write on a storage device

Post by melko » Wed Dec 17, 2014 7:43 pm

So I made some plots based on your suggestion.
What I did was write a 1MB array 1024 times (so 1GB of data in total), measuring the time taken by each write() call.

I did this using both the USB stick and an NFS-mounted filesystem as the destination, then I repeated it on the USB stick but from my desktop PC instead of the Red Pitaya.

Time is expressed in seconds (sorry for the bad style of the plots, but I did them quickly).

So here's the one for the USB stick (it took 130s to write 1GB, so an average of 7.88MB/s):
[attachments: time.png, speed.png]
the one for NFS (41s, so an average of 25MB/s):
[attachments: time-nfs.png, speed-nfs.png]
and the one for the USB stick from the desktop PC (73s, so an average of 14MB/s):
[attachment: time-pc.png]
