mirror of
https://github.com/bbulkow/FastLED-idf.git
synced 2025-07-30 02:38:18 +02:00
Description of how I think the threading is working in this module
This commit is contained in:
92
THREADING.md
Normal file
92
THREADING.md
Normal file
@ -0,0 +1,92 @@
|
|||||||
|
# How to think about Threads in FastLED with ESP-IDF
|
||||||
|
|
||||||
|
I believe I understand how this is supposed to work.
|
||||||
|
|
||||||
|
## Parallel output, or multiple controller, mode
|
||||||
|
|
||||||
|
FastLED talks a bit about what they call 'multi channel'. This is
|
||||||
|
the very cool feature which allows parallel channels to all be blasted
|
||||||
|
at the same time, because you have hardware support.
|
||||||
|
|
||||||
|
The ESP32 has the RMT driver, which is a bit of hardware assist
|
||||||
|
which is used to blast out the waveform from a very small interrupt handler.
|
||||||
|
|
||||||
|
The examples in the Wiki don't match the way the code seems to be written
|
||||||
|
for this platform --- and it's a little weird.
|
||||||
|
|
||||||
|
For this platform, when you do a showLeds(), it keeps track of the number
|
||||||
|
of calls, and when you reach the same number of calls as the number of allocated
|
||||||
|
controllers, it actually spins up all of them together.
|
||||||
|
|
||||||
|
To say that's "peculiar" stretches the point.
|
||||||
|
|
||||||
|
You'd prefer an interface called something like "flushLeds" which
|
||||||
|
acts on all the known controllers, or takes the list of controllers
|
||||||
|
you'd like to operate over.
|
||||||
|
|
||||||
|
I think this API is supposed to work that you bang all the pixel arrays,
|
||||||
|
then call showLeds() which hits all the controllers in turn, the last
|
||||||
|
of which actually kicks off the transfers for all the controllers.
|
||||||
|
|
||||||
|
## using the individual showLeds() functions
|
||||||
|
|
||||||
|
If you set up individual showLeds() functions on the different controller, the
|
||||||
|
code looks like the early ones will be very quick, and the final controller will
|
||||||
|
do all the transfers.
|
||||||
|
|
||||||
|
I'd have to do some timing work and see if that's really happening --- but I'm at least 50% certain.
|
||||||
|
|
||||||
|
## Protecting the LED array, or using the external RMT driver
|
||||||
|
|
||||||
|
The API contract of showLEDs appears to be that showLeds blocks while
|
||||||
|
bytes are being pushed, which means its safe to molest the led array after
|
||||||
|
showLeds() finishes, thus setting you up for the next call.
|
||||||
|
|
||||||
|
Which is well and good, and allows you to do things like have a semaphore or
|
||||||
|
signal that fires off when the frame is blasted.
|
||||||
|
|
||||||
|
There are two peculiarities with this concept.
|
||||||
|
|
||||||
|
If you are using the EXTERNAL_RMT, then each controller allocates a buffer 3x larger
|
||||||
|
than the number of pixels. It does a convert() on the pixel array into the big memory
|
||||||
|
buffer, which is the RMT buffer of signal transitions, and then doesn't molest your
|
||||||
|
pixel buffer.
|
||||||
|
|
||||||
|
While this a lot of memory, you've just achieved double-buffering. The actual pixel array
|
||||||
|
is clear to be modified right after that 'build' happens, and it would be super nice to
|
||||||
|
know if that was true.
|
||||||
|
|
||||||
|
If you use the internal system, then the conversion happens in the interrupt routine.
|
||||||
|
The amount of work is the same ( roughly ), but most of it is happening in the interrupt
|
||||||
|
instead of out in "regular" time. Depending on your use, it might be better
|
||||||
|
to have the work going on in regular time as part of the double-buffer.
|
||||||
|
|
||||||
|
While you are sure that your Pixel arrays are safe after showPixels finishes, you've
|
||||||
|
covered a lot of time in the middle.
|
||||||
|
|
||||||
|
## How much CPU is really used?
|
||||||
|
|
||||||
|
If you 'do the math' on the number of pixels you can support, and you have 8 channels ( the number of RMT channels ),
|
||||||
|
you'll think you don't have much time to actually update your models. But, in fact, you do,
|
||||||
|
because the CPU is taking a fraction of its time feeding the RMT buffer, and a majority
|
||||||
|
of its time hanging on the semaphore waiting to be done.
|
||||||
|
|
||||||
|
The right thing to do, then, is to create another task to do whatever set of interpolation
|
||||||
|
you'd like to do.
|
||||||
|
|
||||||
|
If you measure the amount of time spent waiting on the Semaphore you'd be part of the way there,
|
||||||
|
but you should also measure the amount of CPU spent in the IRQ handler. Minus those out,
|
||||||
|
and you'd learn how much CPU you've really got left.
|
||||||
|
|
||||||
|
## Rumination on double buffering
|
||||||
|
|
||||||
|
The use of the RMT buffer for external systems means you'd like to have some kind of control interface - like
|
||||||
|
a semaphore - on each individual pixel buffer. That would allow a higher level task to simply bang
|
||||||
|
away on the array and get held off when it was unsafe.
|
||||||
|
|
||||||
|
In the case of using the Internal RMT, the amount of time is almost the same as ShowPixels, so you
|
||||||
|
could easily have showPixels signal outside code through the mechanism of your choice.
|
||||||
|
|
||||||
|
I'm not aware, at the moment, of how you would measure the amount of CPU spent in the IRQ system,
|
||||||
|
which is the bulk of the time.
|
||||||
|
|
Reference in New Issue
Block a user