Luminary Micro Forums

rocksoft

Expert Boarder

2008/08/12 16:43

Re:For sharing: Memory copy routine

Good results.

I also saw good improvements in my TCP/IP stack, particularly with IP fragment reassembly and TCP segment handling. Even though I try to avoid memory copies and keep everything long-word aligned, non-aligned transfers are sometimes hard to avoid, and that's where most of the speed gain comes from.

You might also want to try renaming the MEM_DataCopy label to "memcpy". I found a number of routines in the standard library I am using that rely on memcpy themselves.

With the correct link order you can replace that too so the libraries use the version you supply. In my environment "v/f/s/printf" had a speed increase by doing this.
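
For a routine to stand in for the library's memcpy it has to match the standard prototype and return the destination pointer, since library callers may rely on both. A minimal C sketch of that contract follows. It is not the MEM_DataCopy assembly from this thread: the name `my_memcpy` is hypothetical, the word-wide accesses assume a toolchain that tolerates the pointer casts, and a misaligned source simply falls back to byte copies, which is the very case the assembly routine is said to speed up. Whether the linker actually picks your version over the library's depends on the toolchain and link order, so it is worth checking the map file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with memcpy's exact prototype. Renaming it to
 * "memcpy" is what would let library code pick it up at link time. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Copy leading bytes until the destination is word aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Fast path: word copies, only if the source is now aligned too.
     * Assumption: the toolchain tolerates these casts. */
    if (((uintptr_t)s & 3u) == 0) {
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)s;
        while (n >= 4) {
            *dw++ = *sw++;
            n -= 4;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* Remaining bytes (or everything, if the source was misaligned). */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dst; /* memcpy must return the original destination */
}
```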

Regards,

Liam.


ravaz

Platinum Boarder

2008/08/13 07:30

Re:For sharing: Memory copy routine

I didn't really measure the time improvement, but reading the code you wrote I can see how it is optimized.

Thanks for sharing...


Riveywood

Fresh Boarder

2008/08/28 04:09

Re:For sharing: Memory copy routine

That's a nice bit of code, which should be useful to most readers.

I always recommend spending some time optimizing the block copy code in the library. It's particularly important on the ARM, where you have load and store multiple instructions available. The ARM tools have similar routines in the library already.

People might also want to look at zero-initialization routines like memset() in the same way - no point spending thousands of cycles writing zeroes to memory using LDRB and STRB when you can write 40+ bytes with a single STM instruction.
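
As a rough illustration of the zero-fill idea, here is a C sketch that clears memory a word at a time with an unrolled loop. The name `zero_words` and the 8-word unroll factor are assumptions made for the example, and it only handles word-aligned regions. Whether the compiler turns the unrolled stores into an STM is toolchain dependent; a hand-written assembly routine would do that explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Zero `nwords` 32-bit words starting at the word-aligned pointer `dst`.
 * The 8-way unrolled body gives the compiler a chance to emit a
 * store-multiple; whether it does is toolchain dependent. */
void zero_words(uint32_t *dst, size_t nwords)
{
    /* Main loop: eight word stores per iteration. */
    while (nwords >= 8) {
        dst[0] = 0; dst[1] = 0; dst[2] = 0; dst[3] = 0;
        dst[4] = 0; dst[5] = 0; dst[6] = 0; dst[7] = 0;
        dst += 8;
        nwords -= 8;
    }

    /* Leftover words, one at a time. */
    while (nwords > 0) {
        *dst++ = 0;
        nwords--;
    }
}
```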

Also, if memcpy() performance is important in your system, it's a good idea to understand the bus interface of the memory you're dealing with. Rocksoft's code copies in blocks of 40 bytes (10 words). If your flash or SDRAM plays nicely with lines of 32 bytes (8 words) starting on an 8 word boundary, for example, you may find you get better results by modifying memcpy() to do the same.
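
To make the line-alignment point concrete, the sketch below works out how a copy would split into a head that reaches the next 32-byte boundary, a run of whole 32-byte lines, and a tail. The 32-byte line size is the example figure from above, and `plan_copy` and `struct copy_plan` are hypothetical names for illustration only; a real routine would do this arithmetic inline before its block-copy loop.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 32u  /* assumed line size: 8 words */

struct copy_plan {
    size_t head;   /* bytes before the first 32-byte boundary */
    size_t lines;  /* whole 32-byte lines */
    size_t tail;   /* bytes left after the last whole line */
};

/* Split an n-byte transfer to destination address dst_addr into
 * head / whole-line / tail parts. */
struct copy_plan plan_copy(uintptr_t dst_addr, size_t n)
{
    struct copy_plan p;
    size_t misalign = (size_t)(dst_addr & (LINE_BYTES - 1u));
    size_t to_boundary = misalign ? LINE_BYTES - misalign : 0;

    /* The head cannot be longer than the whole transfer. */
    p.head = to_boundary < n ? to_boundary : n;
    n -= p.head;

    p.lines = n / LINE_BYTES;
    p.tail = n % LINE_BYTES;
    return p;
}
```

For a destination at 0x1004 and a length of 100 bytes, this gives 28 head bytes, two whole lines, and 8 tail bytes.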


rocksoft

Expert Boarder

2008/08/28 05:48

Re:For sharing: Memory copy routine

> People might also want to look at zero-initialization routines like memset() in the same way...

Absolutely, I do this too, though I have separate routines rather than an integrated one like the copy routine.

> If your flash or SDRAM plays nicely with lines of 32 bytes...

A very good point indeed. With the SDRAM burst length and cache lines set to match the copy routine, it should really fly. Even better if the device has an instruction cache too.

It will be interesting to see this on a Cortex-M3 device in the future when someone builds one with cache and an SDRAM controller.

PS: I recently noticed that I push/pop r12 in several places in my routine. This is not actually required, so you can save another couple of cycles by removing it.

Post edited by: rocksoft, at: 2008/08/28 05:55
