mirror of
https://github.com/bbulkow/FastLED-idf.git
synced 2025-07-29 18:28:23 +02:00
Description of how I think the threading is working in this module
This commit is contained in:
92
THREADING.md
Normal file
92
THREADING.md
Normal file
@ -0,0 +1,92 @@
|
||||
# How to think about Threads in FastLED with ESP-IDF
|
||||
|
||||
I believe I understand how this is supposed to work.
|
||||
|
||||
## Parallel output, or multiple controller, mode
|
||||
|
||||
FastLED talks a bit about what they call 'multi channel'. This is
|
||||
the very cool feature which allows parallel channels to all be blasted
|
||||
at the same time, because you have hardware support.
|
||||
|
||||
The ESP32 has the RMT driver, which is a bit of hardware assist
|
||||
which is used to blast out the waveform from a very small interrupt handler.
|
||||
|
||||
The examples in the Wiki don't match the way the code seems to be written
|
||||
for this platform --- and it's a little weird.
|
||||
|
||||
For this platform, when you do a showLeds(), it keeps track of the number
|
||||
of calls, and when you reach the same number of calls as the number of allocated
|
||||
controllers, it actually spins up all of them together.
|
||||
|
||||
To say that's "peculiar" stretches the point.
|
||||
|
||||
You'd prefer an interface called something like "flushLeds" which
|
||||
acts on all the known controllers, or takes the list of controllers
|
||||
you'd like to operate over.
|
||||
|
||||
I think this API is supposed to work that you bang all the pixel arrays,
|
||||
then call showLeds() which hits all the controllers in turn, the last
|
||||
of which actually kicks off the transfers for all the controllers.
|
||||
|
||||
## using the individual showLeds() functions
|
||||
|
||||
If you set up individual showLeds() functions on the different controller, the
|
||||
code looks like the early ones will be very quick, and the final controller will
|
||||
do all the transfers.
|
||||
|
||||
I'd have to do some timing work and see if that's really happening --- but I'm at least 50% certain.
|
||||
|
||||
## Protecting the LED array, or using the external RMT driver
|
||||
|
||||
The API contract of showLEDs appears to be that showLeds blocks while
|
||||
bytes are being pushed, which means its safe to molest the led array after
|
||||
showLeds() finishes, thus setting you up for the next call.
|
||||
|
||||
Which is well and good, and allows you to do things like have a semaphore or
|
||||
signal that fires off when the frame is blasted.
|
||||
|
||||
There are two peculiarities with this concept.
|
||||
|
||||
If you are using the EXTERNAL_RMT, then each controller allocates a buffer 3x larger
|
||||
than the number of pixels. It does a convert() on the pixel array into the big memory
|
||||
buffer, which is the RMT buffer of signal transitions, and then doesn't molest your
|
||||
pixel buffer.
|
||||
|
||||
While this a lot of memory, you've just achieved double-buffering. The actual pixel array
|
||||
is clear to be modified right after that 'build' happens, and it would be super nice to
|
||||
know if that was true.
|
||||
|
||||
If you use the internal system, then the conversion happens in the interrupt routine.
|
||||
The amount of work is the same ( roughly ), but most of it is happening in the interrupt
|
||||
instead of out in "regular" time. Depending on your use, it might be better
|
||||
to have the work going on in regular time as part of the double-buffer.
|
||||
|
||||
While you are sure that your Pixel arrays are safe after showPixels finishes, you've
|
||||
covered a lot of time in the middle.
|
||||
|
||||
## How much CPU is really used?
|
||||
|
||||
If you 'do the math' on the number of pixels you can support, and you have 8 channels ( the number of RMT channels ),
|
||||
you'll think you don't have much time to actually update your models. But, in fact, you do,
|
||||
because the CPU is taking a fraction of its time feeding the RMT buffer, and a majority
|
||||
of its time hanging on the semaphore waiting to be done.
|
||||
|
||||
The right thing to do, then, is to create another task to do whatever set of interpolation
|
||||
you'd like to do.
|
||||
|
||||
If you measure the amount of time spent waiting on the Semaphore you'd be part of the way there,
|
||||
but you should also measure the amount of CPU spent in the IRQ handler. Minus those out,
|
||||
and you'd learn how much CPU you've really got left.
|
||||
|
||||
## Rumination on double buffering
|
||||
|
||||
The use of the RMT buffer for external systems means you'd like to have some kind of control interface - like
|
||||
a semaphore - on each individual pixel buffer. That would allow a higher level task to simply bang
|
||||
away on the array and get held off when it was unsafe.
|
||||
|
||||
In the case of using the Internal RMT, the amount of time is almost the same as ShowPixels, so you
|
||||
could easily have showPixels signal outside code through the mechanism of your choice.
|
||||
|
||||
I'm not aware, at the moment, of how you would measure the amount of CPU spent in the IRQ system,
|
||||
which is the bulk of the time.
|
||||
|
Reference in New Issue
Block a user