diff --git a/README.md b/README.md
index 2c94f20..fc1edb2 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,11 @@
 
 This port of FastLED 3.3 runs under the 4.x ESP-IDF development environment. Enjoy.
 
-HUGE UPDATE July 18, 2020. I have ported over Sam Guyer's branch, and now I can do network traffic
-without having glitches. This is a MASSIVE IMPROVEMENT and you owe Sam huge props.
+MASSIVE UPDATE Sept 4, 2020. Even after porting Sam Guyer's branch in July, I still
+had a large number of visual artifacts. After a lot of analysis I've licked the issue to my
+satisfaction, and can say the system simply doesn't glitch.
+
+There are some new tunables, and if you're also fighting glitches, you need to read `components/FastLED-idf/ESP-IDF.md`.
 
 Note you must use the ESP-IDF environment, and the ESP-IDF build system. That's how the paths and whatnot are created.
 
@@ -81,17 +84,25 @@ people for LED control. 8 channels is basically still a ton of LEDs, even
 if the FastLED ESP32 module is even fancier and multiplexes the use of these
 channels.
 
+With the 800 kHz WS2811 and similar LEDs that are now common, the end result of the
+math and the buffers is that you need to refill the RMT hardware buffer about every 35 microseconds.
+Even with an RTOS, this seems to be problematic when using C code. For this reason,
+the default settings use two "memory buffers", which doubles the depth of the
+RMT hardware buffer and means that interrupt jitter of up to about 60us can be absorbed
+without visual artifacts. However, this means getting hardware acceleration on
+only 4 channels instead of 8. This can be changed back to 8.
+
+Please see the lengthy discussion in `components/FastLED-idf/ESP-IDF.md` to learn how to
+enable timing traces, measure your interrupt jitter, and tune these settings.
+
 The FastLED ESP32 RMT use has two modes: one which uses the "driver", and
 one which doesn't, and claims to be more efficient due to when it's converting
 between LED RGB and not. 
 
 Whether you can use the "direct" mode or not depends on whether you have other
-users of the RMT driver within ESP-IDF. 
-
-It also depends on the version of ESP-IDF you're using. I've found that the "direct"
-driver works perfectly well in ESP-IDF 4.0, but with higher versions, there
-are incompatibilities. Since I haven't found solutions yet, the built-in driver
-is used with ESP-IDF v 4.1 and above.
+users of the RMT driver within ESP-IDF. However, *using the ESP-IDF supplied driver is
+not currently working or supported*. I've grown tired of trying to figure out the
+differences between the ESP-IDF versions.
 
 Essentially, if you have the Driver turned on, you shouldn't use the direct mode,
 and if you want to use the direct mode, you should turn off the driver.
diff --git a/components/FastLED-idf/ESP-IDF.md b/components/FastLED-idf/ESP-IDF.md
index 647c04a..8648f6d 100644
--- a/components/FastLED-idf/ESP-IDF.md
+++ b/components/FastLED-idf/ESP-IDF.md
@@ -1,9 +1,130 @@
 # Port to ESP-IDF
 
-Describe what had to be done, when we did it
+This is based on the 3.3 version of FastLED. The 3.3 version is where a lot of development paused, although there are also
+a lot of good small fixes. This is actually a port of Sam Guyer's awesome ESP32-focused fork, https://github.com/samguyer/FastLED .
+
+If one is re-porting, there are a few bits here and there to be done, but most of the work is in the platform directory.
+
+I have not tested the "4 wire" I2S based LEDs or code.
+
+# see section below about glitches!
 
 # Environment
 
 This port is to be used with ESP-IDF version 4.x, which went GA on about Feb, 2020.
 
-The FastLED code is vintage 3.3, which includes sophisticated ESP32 support.
+I have tested it most extensively, at this point, with the 4.2 release, as it's more stable than master
+but far closer to master than 4.0, which has some breaking changes.
+
+In a number of cases, the code can't easily be made to support all versions of ESP-IDF.
+
+In more recent versions of ESP-IDF, there is a new `_ll_` interface for the RMT system. Other posters
+have re-tooled to that interface instead of writing directly to hardware addresses.
+
+# menuconfig
+
+I prefer running my code -O3 optimized. I haven't changed any of the stack depths.
+
+# Differences
+
+This code defaults to using two memory buffers instead of 1. There are tradeoffs, and you can change the values. See below.
+
+It is not clear that using the ESP-IDF interrupt handler works anymore, although it should be tried. With larger
+memory buffers, and using the translate function, it should work no better or worse than any other LEVEL3 interrupt.
+
+Recent RMT driver code also includes setting the "ownership" of the shared DRAM. This was overlooked in the FastLED
+driver code, but has now been implemented. It seemed to make no difference to the stability of the system.
+
+# Difficulties
+
+## Timing and glitches
+
+The greatest difficulty with controlling any WS2811 system with the ESP32 / ESP8266 RMT system is timing.
+
+The WS2811 single wire protocols require two transitions to make a bit.
+
+The definition of the RMT interface means you specify the time between transitions ( in divided 80 MHz intervals, fit into a 15 bit
+field with the high bit being the value to emit ). A single RMT buffer has 64 values, but we use a "double buffer" strategy
+(recommended by the documentation). This means that 32 values, of 32 bits each, require re-feeding the buffer about every 35 us.
+The buffer won't fully run dry until 70us, but the refill is due at 35us.
+
+With a 400 kHz LED, the RMT buffer lasts 2x longer in time.
+
+Interrupts can be run in C only at "medium priority", which means that there is a class of activities - such as WiFi - which can
+create enough jitter to disturb the 35us timing requirement. I have observed this with a very simple REST web service using
+the ESP-IDF supplied web server, which doesn't use Flash in any noticeable way, other than executing from it - and still, 50us
+interrupt jitter was observed.
+
+The RMT interface allows using more buffering, which will overcome this latency. This is controlled by the `MEM_BLOCK_NUM` parameter,
+which is now configurable at the beginning of `clockless_rmt_esp32.h`. To absorb the latencies that I've seen, I need
+two memory buffers instead of 1. If you're not using WiFi, you can perhaps get away with 1. If you're doing other things
+on the CPUs, 2 may not be enough and you may have to use 4.
+
+Increasing this value means you can't use as many RMT hardware channels at the same time. If you use a value of 2, which
+works in my environment, the code will only use 4 hardware channels in parallel. If you create 8 LED strings, what should
+happen is 4 run in parallel, and the other 4 get picked up as those finish, so you'll end up using as much parallelism
+as you have available.
+
+In order to tune this variable, you'll find another configuration, `FASTLED_ESP32_SHOWTIMING`, also in `clockless_rmt_esp32.h`.
+If you enable this, for each show call, the number of microseconds between buffer refills will be emitted, along with a prominent message
+when a potential underflow is detected. This will allow you to stress the system, look at the interrupt jitter, and decide
+what setting you'd like for `MEM_BLOCK_NUM`.
+
+Please note also that I've been testing with the fairly common 800 kHz WS2811s. If you're using 400 kHz, you can almost certainly
+go back to a `MEM_BLOCK_NUM` of 1. Likewise, if you've got faster LEDs, you might have to go even higher. The choice is yours.
+
+## Reproducing the issue on other systems
+
+I consider 10 us of jitter in the RTOS, at the highest level of interrupt available to the common user, even with 2 CPUs
+in an idle system, to be something of a bug. At first, I blamed the FastLED code, but have spent quite a bit
+of energy exonerating FastLED and placing the blame on ESP-IDF.
+
+In order to determine the problem, I also did a quick port of the NeoPixelBus interface, and I also used the sample
+code for WS2811 LEDs which I found in the ESP-IDF examples directory. All exhibited the same behavior.
+
+Thus the issue is not in the FastLED library, but simply a jitter issue in the RTOS. It seems that people
+using Arduino do not use the same TCP/IP stack, and do not suffer from these issues. Almost certainly, they are running at
+lower priorities or with a different interrupt structure.
+
+A simple test, using 99% Espressif code, would open a simple HTTP endpoint and REST service, connect to WiFi and maintain an IP address,
+then use the existing WS2811 sample code provided by Espressif. I contend that a web server which is simply returning "404", and
+is being hit by multiple requests per second ( I used 4 windows with a refresh interval of 2 seconds, but with cached content ) will
+exhibit this latency.
+
+# Todo: Running at a higher priority
+
+ESP-IDF has a page on running at higher priority: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/hlinterrupts.html
+
+This page says (somewhat incorrectly) that C code may not be run from high priority interrupts. The document then
+disagrees with itself: the example presented calls C, saying some C code is safe, while being a bit cagey about
+why.
+
+Given this problem happens with all drivers ( custom and ESP-IDF provided ), writing a high-priority driver in assembly
+which packs the RMT buffer from a "pixel" format buffer seems a very reasonable optimization. Another option is to take the current
+interrupt driver and simply throw the disassembly into a .S file.
+
+I would greatly hope that ESP-IDF improves the documentation around higher priority interrupts, and about manipulating the interrupts
+in the system. I'd like to be able to profile and find what's causing these long delays, and drop their priority a little.
+The FreeRTOS documentation ( on which ESP-IDF is built ) clearly says that high priority interrupts are required for motor control,
+and LED control falls into that category. Placing the unnecessary barrier of assembly language isn't a terrible thing,
+but it's not the right way - show how to write safe C code and allow raising the priority, even if some people will
+abuse the privilege.
+
+
+# Todo - why the jitter?
+
+The large glitches ( 50us and up ) at the highest possible priority level, even with 2 cores at 240 MHz, are almost implausible.
+
+Possible causes might be linked to the Flash subsystem. However, disabling all flash reads and writes ( the SPI calls ) doesn't change the glitching behavior.
+
+Increasing network traffic does. Thus, the Wifi system is implicated. Work to do would be to decrease, somehow, the priority of 
+those interrupts. One forum poster said they found a way. As TCP and Wifi are meant to be unreliable, this would cause 
+performance issues, and that would have to be weighed against the nature of the poor LED output.
+
+# Todo - why visual artifacts?
+
+I haven't put a scope on it, but I'm a little surprised that if you only throw an "R" and not a "G" and "B" on the wire, a pixel changes.
+This appears to be the reason you get a single pixel flash, and there's nothing one can do about that other than these deeper buffers.
+I would hope that other WS2811 type LEDs are a bit more robust and would only change color when they get a full set of R, G, and B.
+
+
diff --git a/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.cpp b/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.cpp
index 2721efa..fcce78c 100644
--- a/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.cpp
+++ b/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.cpp
@@ -3,9 +3,8 @@
 #define FASTLED_INTERNAL
 #include "FastLED.h"
 
-static const char *TAG = "FastLED";
+//static const char *TAG = "FastLED";
 
-#define FASTLED_ESP32_SHOWTIMING 1
 
 // -- Forward reference
 class ESP32RMTController;
@@ -30,18 +29,123 @@ static intr_handle_t gRMT_intr_handle = NULL;
 //    Semaphore is not given until all data has been sent
 static xSemaphoreHandle gTX_sem = NULL;
 
-// -- Make sure we can't call show() too quickly
+// -- Make sure we can't call show() too quickly (fastled library)
 CMinWait<50>   gWait;
 
 static bool gInitialized = false;
 
-// -- SZG: For debugging purposes
+/*
+** general DRAM system for printing during faster IRQs
+** be careful not to set the size too large, because code that prints
+** has the tendency to do a stack alloc of the same size...
+*/
+
+// -- BB: For debugging purposes
 #if FASTLED_ESP32_SHOWTIMING == 1
-static uint32_t gLastFill[8];
-static int gTooSlow[8];
-static uint32_t gTotalTime[8];
+
+#define MEMORYBUF_SIZE 256
+DRAM_ATTR char g_memorybuf[MEMORYBUF_SIZE] = {0};
+DRAM_ATTR char *g_memorybuf_write = g_memorybuf;
+
+void IRAM_ATTR memorybuf_add( char *b ) {
+
+    int buflen = strlen(b);
+
+    // don't overflow
+    int bufRemain = sizeof(g_memorybuf) - ( g_memorybuf_write - g_memorybuf );
+    if ( bufRemain == 0 ) return;
+    if (bufRemain < buflen) buflen = bufRemain;
+
+    memcpy(g_memorybuf_write, b, buflen);
+    g_memorybuf_write += buflen;
+}
+
+void IRAM_ATTR memorybuf_add( char c ) {
+
+    // don't overflow
+    int bufRemain = sizeof(g_memorybuf) - ( g_memorybuf_write - g_memorybuf );
+    if ( bufRemain < 1 ) return;
+
+    *g_memorybuf_write = c;
+    g_memorybuf_write++;
+}
+
+void IRAM_ATTR memorybuf_insert( char *b, int buflen ) {
+    // don't overflow
+    int maxbuf = sizeof(g_memorybuf) - ( g_memorybuf_write - g_memorybuf );
+    if ( maxbuf == 0 ) return;
+    if (maxbuf < buflen) buflen = maxbuf;
+
+    memcpy(g_memorybuf_write, b, buflen);
+    g_memorybuf_write += buflen;
+}
+
+// often one wants a separator and an integer, do a helper
+void IRAM_ATTR memorybuf_int( int i, char sep) {
+
+    // am I full already?
+    int maxbuf = sizeof(g_memorybuf) - ( g_memorybuf_write - g_memorybuf );
+    if ( maxbuf == 0 ) return;
+
+    // for speed, just make sure I have 12 bytes, even though maybe I need fewer
+    // 12 is the number because I need a null which I will fill with sep, and 
+    // there's always the chance of a minus
+    if (maxbuf <= 12) return;
+
+    // prep the buf and find the length ( can't copy)
+    itoa(i, g_memorybuf_write, 10);
+    int buflen = strlen(g_memorybuf_write);
+    g_memorybuf_write[buflen] = sep;
+    g_memorybuf_write += (buflen + 1);
+
+}
+
+// get from the front... requires a memmove because overlaps.
+// *len input is the size of the buf, return is the length you got
+
+// this will always be the most efficient if you ask for a buffer that's as large as the
+// capture buffer
+void memorybuf_get(char *b, int *len) {
+    // amount in the buffer
+    int blen = g_memorybuf_write - g_memorybuf ;
+    if ( blen == 0 ) {
+        *len = 0;
+        return;
+    }
+    if (blen > *len) {
+        memcpy(b, g_memorybuf, *len);
+        int olen = blen - *len;
+        memmove(g_memorybuf, g_memorybuf_write - olen, olen);
+        g_memorybuf_write = g_memorybuf + olen;
+    }
+    else {
+        memcpy(b, g_memorybuf, blen);
+        g_memorybuf_write = g_memorybuf;
+        *len = blen;
+    }
+    return;
+}
+
+#endif /* FASTLED_ESP32_SHOWTIMING == 1 */
+
+/*
+** in later versions of the driver, they very carefully set the "mem_owner"
+** flag before copying over. Let's do the same.
+*/
+
+// probably already defined.
+#ifndef RMT_MEM_OWNER_SW
+#define RMT_MEM_OWNER_SW 0
+#define RMT_MEM_OWNER_HW 1
 #endif
 
+static inline void rmt_set_mem_owner(rmt_channel_t channel, uint8_t owner)
+{
+    RMT.conf_ch[(uint16_t)channel].conf1.mem_owner = owner;
+}
+
+
+
 ESP32RMTController::ESP32RMTController(int DATA_PIN, int T1, int T2, int T3)
     : mPixelData(0), 
       mSize(0), 
@@ -70,6 +174,13 @@ ESP32RMTController::ESP32RMTController(int DATA_PIN, int T1, int T2, int T3)
     gControllers[gNumControllers] = this;
     gNumControllers++;
 
+    // -- Expected number of CPU cycles between buffer fills
+    mCyclesPerFill = (T1 + T2 + T3) * PULSES_PER_FILL;
+
+    // -- If there is ever an interval greater than 1.75 times
+    //    the expected time, then bail out.
+    mMaxCyclesPerFill = mCyclesPerFill + ((mCyclesPerFill * 3)/4);
+
     mPin = gpio_num_t(DATA_PIN);
 }
 
@@ -91,15 +202,13 @@ void ESP32RMTController::init()
 {
     if (gInitialized) return;
 
-    ESP_LOGW(TAG, "controller init");
-
     for (int i = 0; i < FASTLED_RMT_MAX_CHANNELS; i++) {
         gOnChannel[i] = NULL;
 
         // -- RMT configuration for transmission
         rmt_config_t rmt_tx = RMT_DEFAULT_CONFIG_TX((gpio_num_t)0, rmt_channel_t(i));
         rmt_tx.gpio_num = gpio_num_t(0);  // The particular pin will be assigned later
-        rmt_tx.mem_block_num = 1;
+        rmt_tx.mem_block_num = MEM_BLOCK_NUM; 
         rmt_tx.clk_div = DIVIDER;
         rmt_tx.tx_config.loop_en = false;
         rmt_tx.tx_config.carrier_level = RMT_CARRIER_LEVEL_LOW;
@@ -108,7 +217,12 @@ void ESP32RMTController::init()
         rmt_tx.tx_config.idle_output_en = true;
 
         // -- Apply the configuration
-        rmt_config(&rmt_tx);
+        // warning: using more than MEM_BLOCK_NUM 1 means sometimes this might fail because
+        // we don't have enough MEM_BLOCKs. Todo: add code to track and only allocate as many channels
+        // as we have memblocks.
+        ESP_ERROR_CHECK(
+            rmt_config(&rmt_tx)
+        );
 
         if (FASTLED_RMT_BUILTIN_DRIVER) {
             rmt_driver_install(rmt_channel_t(i), 0, 0);
@@ -131,8 +245,11 @@ void ESP32RMTController::init()
         //    interrupt handler must work for all different kinds of
         //    strips, so it delegates to the refill function for each
         //    specific instantiation of ClocklessController.
-        if (gRMT_intr_handle == NULL)
-            esp_intr_alloc(ETS_RMT_INTR_SOURCE, ESP_INTR_FLAG_IRAM | ESP_INTR_FLAG_LEVEL3, interruptHandler, 0, &gRMT_intr_handle);
+        if (gRMT_intr_handle == NULL) {
+            ESP_ERROR_CHECK(
+                esp_intr_alloc(ETS_RMT_INTR_SOURCE, ESP_INTR_FLAG_IRAM | ESP_INTR_FLAG_LEVEL3, interruptHandler, 0, &gRMT_intr_handle)
+            );
+        }
     }
 
     gInitialized = true;
@@ -142,6 +259,7 @@ void ESP32RMTController::init()
 //    This is the main entry point for the pixel controller
 void ESP32RMTController::showPixels()
 {
+
     if (gNumStarted == 0) {
         // -- First controller: make sure everything is set up
         ESP32RMTController::init();
@@ -156,7 +274,7 @@ void ESP32RMTController::showPixels()
     gNumStarted++;
 
     // -- The last call to showPixels is the one responsible for doing
-    //    all of the actual worl
+    //    all of the actual work
     if (gNumStarted == gNumControllers) {
         gNext = 0;
 
@@ -202,15 +320,19 @@ void ESP32RMTController::showPixels()
 #endif
 
 #if FASTLED_ESP32_SHOWTIMING == 1
-        // uint32_t expected = (2080000L / (1000000000L/F_CPU));
-        for (int i = 0; i < gNumControllers; i++) {
-            if (gTooSlow[i] > 0) {
-                printf("Channel %d total time %d too slow %d\n",i,gTotalTime[i],gTooSlow[i]);
-            }
-        }
-#endif
+        // the interrupts may have dumped things to the buffer. Print it.
+        // warning: this does a fairly large stack allocation. 
+        char mb[MEMORYBUF_SIZE+1];
+        int mb_len = MEMORYBUF_SIZE;
+        memorybuf_get(mb, &mb_len);
+        if (mb_len > 0) {
+           mb[mb_len] = 0;
+           printf(" rmt irq print: %s\n",mb);
+       }
+#endif /* FASTLED_ESP32_SHOWTIMING == 1 */
 
     }
+
 }
 
 // -- Start up the next controller
@@ -230,6 +352,7 @@ void ESP32RMTController::startNext(int channel)
 //    for it to finish.
 void ESP32RMTController::startOnChannel(int channel)
 {
+
     // -- Assign this channel and configure the RMT
     mRMT_channel = rmt_channel_t(channel);
 
@@ -254,7 +377,7 @@ void ESP32RMTController::startOnChannel(int channel)
         mCur = 0;
         mWhichHalf = 0;
 
-        // -- Fill both halves of the RMT buffer (a totaly of 64 bits of pixel data)
+        // -- Fill both halves of the RMT buffer (a total of 2 * PULSES_PER_FILL bits of pixel data)
         fillNext();
         fillNext();
 
@@ -264,6 +387,7 @@ void ESP32RMTController::startOnChannel(int channel)
         // -- Kick off the transmission
         tx_start();
     }
+
 }
 
 // -- Start RMT transmission
@@ -271,12 +395,8 @@ void ESP32RMTController::startOnChannel(int channel)
 void ESP32RMTController::tx_start()
 {
     rmt_tx_start(mRMT_channel, true);
+    mLastFill = __clock_cycles();
 
-#if FASTLED_ESP32_SHOWTIMING == 1
-    gLastFill[mRMT_channel] = __clock_cycles();
-    gTooSlow[mRMT_channel] = 0;
-    gTotalTime[mRMT_channel] = 0;
-#endif
 }
 
 // -- A controller is done 
@@ -288,7 +408,6 @@ void ESP32RMTController::tx_start()
 void ESP32RMTController::doneOnChannel(rmt_channel_t channel, void * arg)
 {
 
-
     // -- Turn off output on the pin
     // SZG: Do I really need to do this?
     //  ESP32RMTController * pController = gOnChannel[channel];
@@ -321,9 +440,6 @@ void ESP32RMTController::doneOnChannel(rmt_channel_t channel, void * arg)
 //    next half of the RMT buffer with data.
 void IRAM_ATTR ESP32RMTController::interruptHandler(void *arg)
 {
-#if FASTLED_ESP32_SHOWTIMING == 1
-    int64_t now = __clock_cycles();
-#endif
 
     // -- The basic structure of this code is borrowed from the
     //    interrupt handler in esp-idf/components/driver/rmt.c
@@ -331,64 +447,124 @@ void IRAM_ATTR ESP32RMTController::interruptHandler(void *arg)
     uint8_t channel;
 
     for (channel = 0; channel < FASTLED_RMT_MAX_CHANNELS; channel++) {
-        int tx_done_bit = channel * 3;
-        int tx_next_bit = channel + 24;
 
         ESP32RMTController * pController = gOnChannel[channel];
         if (pController != NULL) {
+
+            int tx_done_bit = channel * 3;
+            int tx_next_bit = channel + 24;
+
             if (intr_st & BIT(tx_next_bit)) {
                 // -- More to send on this channel
                 RMT.int_clr.val |= BIT(tx_next_bit);
-                pController->fillNext();
 
-#if FASTLED_ESP32_SHOWTIMING == 1
-                uint32_t delta = (now - gLastFill[channel]);
-                if (delta > C_NS(50500)) {
-                    gTooSlow[channel]++;
-                }
-                gTotalTime[channel] += delta;
-                gLastFill[channel] = now;
-#endif
-            } else {
-                // -- Transmission is complete on this channel
-                if (intr_st & BIT(tx_done_bit)) {
-                    RMT.int_clr.val |= BIT(tx_done_bit);
-#if FASTLED_ESP32_SHOWTIMING == 1
-                    uint32_t delta = (now - gLastFill[channel]);
-                    gTotalTime[channel] += delta;
-#endif
-                    doneOnChannel(rmt_channel_t(channel), 0);
+                // if timing's NOT ok, have to bail
+                if (true == pController->timingOk()) {
+
+                    pController->fillNext();
+
                 }
+            } // -- Transmission is complete on this channel
+            else if (intr_st & BIT(tx_done_bit)) {
+
+                RMT.int_clr.val |= BIT(tx_done_bit);
+                doneOnChannel(rmt_channel_t(channel), 0);
+
             }
         }
     }
 }
 
+DRAM_ATTR char g_bail_str[] = "_BAIL_";
+
+// Check whether the refill interrupt arrived too late. If it did, we are
+// behind the necessary timing and should bail out of this 'show'.
+//
+// returns true if the timing is OK, false if bad
+
+bool IRAM_ATTR ESP32RMTController::timingOk() {
+
+    // last time is always delayed, don't check that one
+    if (mCur >= mSize)   return(true);
+
+    uint32_t delta = __clock_cycles() - mLastFill;
+
+    // interesting test - what if we only write 4? will nothing else light?
+    if ( delta > mMaxCyclesPerFill) {
+
+#if FASTLED_ESP32_SHOWTIMING == 1
+        memorybuf_add('!');
+        memorybuf_int( CYCLES_TO_US(delta), '-' );
+        memorybuf_int( CYCLES_TO_US(mMaxCyclesPerFill), '-');
+        memorybuf_int( mCur, ':' );
+        memorybuf_int( mSize, ':' );
+        memorybuf_add( g_bail_str );
+#endif /* FASTLED_ESP32_SHOWTIMING == 1 */
+
+        // how do we bail out? It seems if we simply call rmt_tx_stop, 
+        // we'll still flicker on the end. Setting mCur to mSize has the side effect
+        // of triggering the other code that says "we're finished"
+
+        // Old code also set this, hoping it wouldn't send garbage bytes
+        mCur = mSize;
+
+        // other code also set some zeros to make sure there wasn't anything bad.
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_SW);
+        for (uint32_t j = 0; j < PULSES_PER_FILL; j++) {
+            * mRMT_mem_ptr++ = 0;
+        }
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_HW);
+
+        return false;
+    }
+
+#if FASTLED_ESP32_SHOWTIMING == 1
+    else {
+        memorybuf_int( CYCLES_TO_US(delta), '-' );
+    }
+#endif /* FASTLED_ESP32_SHOWTIMING == 1 */
+
+    return true;
+}
+
 // -- Fill RMT buffer
 //    Puts 32 bits of pixel data into the next 32 slots in the RMT memory
 //    Each data bit is represented by a 32-bit RMT item that specifies how
 //    long to hold the signal high, followed by how long to hold it low.
 void IRAM_ATTR ESP32RMTController::fillNext()
 {
+
     if (mCur < mSize) {
+
         // -- Get the zero and one values into local variables
+        // each one is a "rmt_item32_t", which contains two values, which is very convenient
         register uint32_t one_val = mOne.val;
         register uint32_t zero_val = mZero.val;
 
         // -- Use locals for speed
         volatile register uint32_t * pItem =  mRMT_mem_ptr;
 
-        // -- Get the next four bytes of pixel data
-        register uint32_t pixeldata = mPixelData[mCur];
-        mCur++;
+        // set the owner to SW --- current driver does this but it's not clear it matters
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_SW);
             
         // Shift bits out, MSB first, setting RMTMEM.chan[n].data32[x] to the 
         // rmt_item32_t value corresponding to the buffered bit value
-        for (register uint32_t j = 0; j < PULSES_PER_FILL; j++) {
-            *pItem++ = (pixeldata & 0x80000000L) ? one_val : zero_val;
-            // Replaces: RMTMEM.chan[mRMT_channel].data32[mCurPulse].val = val;
 
-            pixeldata <<= 1;
+        for (int i=0; i < PULSES_PER_FILL / 32; i++) {
+            if (mCur < mSize) {
+                register uint32_t thispixel = mPixelData[mCur];
+                for (int j = 0; j < 32; j++) {
+
+                    *pItem++ = (thispixel & 0x80000000L) ? one_val : zero_val;
+                    // Replaces: RMTMEM.chan[mRMT_channel].data32[mCurPulse].val = val;
+                    thispixel <<= 1;
+                }
+                mCur++;
+            }
+            else {
+                // if you hit the end, add 0 for signal
+                *pItem++ = 0;
+            }
         }
 
         // -- Flip to the other half, resetting the pointer if necessary
@@ -400,11 +576,20 @@ void IRAM_ATTR ESP32RMTController::fillNext()
 
         // -- Store the new pointer back into the object
         mRMT_mem_ptr = pItem;
+
+        // set the owner back to HW
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_HW);
+
+        // update the time I last filled
+        mLastFill = __clock_cycles();
+
     } else {
         // -- No more data; signal to the RMT we are done
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_SW);
         for (uint32_t j = 0; j < PULSES_PER_FILL; j++) {
             * mRMT_mem_ptr++ = 0;
         }
+        rmt_set_mem_owner(mRMT_channel, RMT_MEM_OWNER_HW);
     }
 }
 
diff --git a/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.h b/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.h
index 9fdd6ee..660351c 100644
--- a/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.h
+++ b/components/FastLED-idf/platforms/esp/32/clockless_rmt_esp32.h
@@ -1,5 +1,6 @@
 /*
  * Integration into FastLED ClocklessController
+ * Copyright (c) 2020 Brian Bulkowski brian@bulkowski.org
  * Copyright (c) 2018,2019,2020 Samuel Z. Guyer
  * Copyright (c) 2017 Thomas Basler
  * Copyright (c) 2017 Martin F. Falatic
@@ -138,26 +139,32 @@ __attribute__ ((always_inline)) inline static uint32_t __clock_cycles() {
 #define FASTLED_HAS_CLOCKLESS 1
 #define NUM_COLOR_CHANNELS 3
 
-// NOT CURRENTLY IMPLEMENTED:
-// -- Set to true to print debugging information about timing
-//    Useful for finding out if timing is being messed up by other things
-//    on the processor (WiFi, for example)
-//#ifndef FASTLED_RMT_SHOW_TIMER
-//#define FASTLED_RMT_SHOW_TIMER false
-//#endif
 
 // -- Configuration constants
 #define DIVIDER             2 /* 4, 8 still seem to work, but timings become marginal */
-#define MAX_PULSES         64 /* A channel has a 64 "pulse" buffer */
-#define PULSES_PER_FILL    32 /* Half of the channel buffer */
+                                /* there is no point in higher dividers, as this parameter only needs to make
+                                   sure the scaling factors of the RMT intervals fit in 15 bits. */
+
+#define MEM_BLOCK_NUM       2 /* the number of memory blocks. There are 8 for the entire RMT system, and nominally
+                                1 per channel. Using a larger number reduces the number of hardware channels that can be used
+                                at one time, but increases the resistance to RTOS interrupt jitter. 1 seems to be good enough,
+                                but jitter created by wifi might still cause glitches and 2 or more may be required. */
+#define PULSES_PER_CHANNEL  (64 * MEM_BLOCK_NUM) /* A channel has a 64 "pulse" buffer of 32 bits (aka Items in RMT interface) */
+#define PULSES_PER_FILL     (PULSES_PER_CHANNEL / 2)     /* Half of the channel buffer */
+                            // PULSES_PER_FILL must be a multiple of 32 or fillNext must be re-coded
 
 // -- Convert ESP32 CPU cycles to RMT device cycles, taking into account the divider
+// -- according to https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/rmt.html
+//    the RMT clock is taken at 80 000 000 
 #define F_CPU_RMT                   (  80000000L)
 #define RMT_CYCLES_PER_SEC          (F_CPU_RMT/DIVIDER)
 #define RMT_CYCLES_PER_ESP_CYCLE    (F_CPU / RMT_CYCLES_PER_SEC)
 #define ESP_TO_RMT_CYCLES(n)        ((n) / (RMT_CYCLES_PER_ESP_CYCLE))
 
+#define CYCLES_TO_US(n)             ( (n) / (F_CPU / 1000000L ))
+
 // -- Number of cycles to signal the strip to latch
+// in RMT cycles
 #define NS_PER_CYCLE                ( 1000000000L / RMT_CYCLES_PER_SEC )
 #define NS_TO_CYCLES(n)             ( (n) / NS_PER_CYCLE )
 #define RMT_RESET_DURATION          NS_TO_CYCLES(50000)
@@ -174,8 +181,17 @@ __attribute__ ((always_inline)) inline static uint32_t __clock_cycles() {
 
 // -- Number of RMT channels to use (up to 8)
 //    Redefine this value to 1 to force serial output
+// -- todo: this is wrong if MEM_BLOCK_NUM is 3, but at least it's safe
 #ifndef FASTLED_RMT_MAX_CHANNELS
-#define FASTLED_RMT_MAX_CHANNELS 8
+#define FASTLED_RMT_MAX_CHANNELS ( 8 / MEM_BLOCK_NUM )
+#endif
+
+// use this if you want to try the flash lock
+// doesn't seem to make any positive difference
+//#define FASTLED_ESP32_FLASH_LOCK 1
+
+#ifndef FASTLED_ESP32_SHOWTIMING
+#define FASTLED_ESP32_SHOWTIMING 0
 #endif
 
 class ESP32RMTController
@@ -192,6 +208,12 @@ private:
     rmt_item32_t   mZero;
     rmt_item32_t   mOne;
 
+    // -- Total expected time to send PULSES_PER_FILL bits
+    //    Each strip should get an interrupt roughly at this interval
+    uint32_t       mCyclesPerFill;
+    uint32_t       mMaxCyclesPerFill;
+    uint32_t       mLastFill;
+
     // -- Pixel data
     uint32_t *     mPixelData;
     int            mSize;
@@ -215,6 +237,9 @@ public:
     //    member variables.
     ESP32RMTController(int DATA_PIN, int T1, int T2, int T3);
 
+    // -- Get max cycles per fill
+    uint32_t IRAM_ATTR getMaxCyclesPerFill() const { return mMaxCyclesPerFill; }
+
     // -- Get or create the pixel data buffer
     uint32_t * getPixelBuffer(int size_in_bytes);
 
@@ -254,6 +279,16 @@ public:
     //    next half of the RMT buffer with data.
     static void IRAM_ATTR interruptHandler(void *arg);
 
+    // -- Determine if there was a long pause
+    //    Especially on ESP32, it seems hard to guarantee that interrupts fire on time.
+    //    This function checks to see if there has been a long period since the last fill,
+    //    so you can abort an RMT send without sending bits that cause flashes
+    //
+    // SIDE EFFECT: Triggers stop of the channel
+    //
+    //    return FALSE means one needs to abort
+    bool IRAM_ATTR timingOk();
+
     // -- Fill RMT buffer
     //    Puts 32 bits of pixel data into the next 32 slots in the RMT memory
     //    Each data bit is represented by a 32-bit RMT item that specifies how