OpenWRT on Zyxel NBG6716 (ar71xx nand) - upgrading to Chaos Calmer final

Well, I've had an interesting day...

I wrote before about using OpenWRT on the Zyxel NBG6716. I've been using it ever since; I updated to newer snapshot Chaos Calmer builds a couple of times to fix a couple of minor wifi bugs, but I'd been sitting on a fairly old snapshot (from March 2015) for a while now.

Today I figured it was time to update to the final stable release of Chaos Calmer, which came out not too long ago. Unfortunately it turned out to be quite the experience!

When I upgraded to newer snapshots, the sysupgrade method worked fine. Because of some issue or other in their build scripts, upstream doesn't actually produce images for the NBG6716 right now, but it's easy enough to build them with the image generator: you just grab the ar71xx-nand image builder and run make image PROFILE=NBG6716 PACKAGES="nano htop luci luci-ssl" (or whatever package loadout you want), and you get images in bin/ar71xx. I'd just been copying the sysupgrade.tar file across to the router and running sysupgrade -v on it, and it was fine.
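
For anyone following along, the routine above can be sketched roughly as below. This is only a sketch: the helper names build_cmd and upgrade_steps are made up for illustration, the router address in the usage line is an assumption, and /tmp is just a convenient place to drop the image on the router. The functions print the commands rather than running them, so they can be looked over first.

```shell
#!/bin/sh
# Sketch of the build-and-sysupgrade routine. Nothing here is run for real;
# each function only prints the command line it would use.

# Print the image builder command for the NBG6716 profile.
# $1: package list (defaults to the loadout mentioned in the post).
build_cmd() {
    printf 'make image PROFILE=NBG6716 PACKAGES="%s"\n' "${1:-nano htop luci luci-ssl}"
}

# Print the copy-and-upgrade steps.
# $1: path to the sysupgrade.tar produced under bin/ar71xx
# $2: router login, e.g. root@192.168.1.1 (assumed address, adjust to taste)
upgrade_steps() {
    image="$1"
    router="$2"
    name="${image##*/}"   # strip the directory part of the path
    echo "scp $image $router:/tmp/$name"
    echo "ssh $router sysupgrade -v /tmp/$name"
}
```

Something like upgrade_steps bin/ar71xx/your-sysupgrade.tar root@192.168.1.1 prints the two commands to run, where your-sysupgrade.tar stands in for whatever filename the image builder actually produced.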

However, when I tried that for CC final, it just did not want to work at all. At first it was failing on a pre-flight check:

 /sbin/sysupgrade: eval: line 1: nand_do_platform_check: not found
 Image check 'platform_check_image' failed.

I worked out that the check it was looking for was in a file /lib/upgrade/nand.sh which didn't seem to be in my current firmware at all. So I copied it over from the image builder root onto the router and tried again, but now it would fail when trying to actually do the upgrade, with Command failed: Method not found. Near as I could tell this was suggesting that some expected capability in OpenWRT's ubus thing is not present in the firmware version I was running. So basically it looks like the upgrade process was simply broken for upgrading from the firmware I had installed to CC final.
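
A quick way to spot the first of those failures before starting is to check whether the running firmware has the helper at all. The snippet below is a minimal sketch, with has_nand_check being a made-up name; it only looks for the function name in /lib/upgrade/nand.sh (or another path passed as an argument). It says nothing about the later ubus "Method not found" failure, so passing this check is no guarantee the upgrade will go through.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given script mentions nand_do_platform_check,
# fails otherwise. With no argument it checks /lib/upgrade/nand.sh, the file
# the pre-flight check was looking for.
has_nand_check() {
    script="${1:-/lib/upgrade/nand.sh}"
    [ -f "$script" ] && grep -q 'nand_do_platform_check' "$script"
}
```

On the router, has_nand_check || echo "nand.sh helper missing" would flag an old firmware like the one described here.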

Crap.

So I figured, OK, guess I'll back up my config, flash the new firmware clean, and restore my config. So I copied the factory.bin firmware file over to the router and tried to flash it with mtd...and it was just not having it at all. Every attempt would fail with [e]Failed to get erase block status, which near as I can tell indicates that for some reason mtd was hitting an error when it tried to check if the flash blocks were bad (it raises that error when it tries to use the kernel MEMGETBADBLOCK interface to check, and gets back a negative result). I tried various stupid things to try and work around that, but no joy.
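
Before pointing mtd at anything, it can help to see which partitions the running kernel actually exposes. The helper below is a small sketch (mtd_partitions is a made-up name) that pulls the quoted partition names out of /proc/mtd. It does nothing about the erase block error itself; it only shows what mtd has to work with.

```shell
#!/bin/sh
# List partition names from a /proc/mtd-style table (default: /proc/mtd).
# Data lines look like:  mtd3: 00200000 00020000 "kernel"
# The header line ("dev: size erasesize name") does not match and is skipped.
mtd_partitions() {
    sed -n 's/^mtd[0-9]*: [^"]*"\([^"]*\)"$/\1/p' "${1:-/proc/mtd}"
}
```

The config backup itself can be taken with sysupgrade -b /tmp/backup.tar.gz (the path is just an example), and an mtd flash generally takes the form mtd -r write <image> <partition>, with the partition name taken from the list above rather than guessed.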

So I thought screw it, I'll flash it over tftp. Read the instructions, set up a box as instructed, rebooted the router with the WPS button held down, and...nada. It just booted apparently normally.

So I kept trying, with the rabbit's foot, fiddling with network cables, and chasing after suspicious-looking tftp error messages for hours...finally by running a different TFTP server I could observe that it was definitely sending the image to the router, but for some reason the router wasn't flashing it.

Finally I twigged: it won't flash an OpenWRT image via TFTP, only a stock firmware!

So I had to grab a stock Zyxel firmware, flash that over TFTP, hard reset it (because for some reason it came up out of the flash with non-default admin credentials...whatever), then flash the clean OpenWRT CC final image from the stock firmware using mtd (whatever mtd is in the stock firmware had no problem doing it...), and finally boot into CC final and reload my configuration backups.

Yeeeeesh.

Comments

Bad Joker wrote on 2015-11-27 11:23:
May you upload a prebuild final CC image somewhere for Zyxel NBG6716? I'd like to test it on one of my devices, but my linux machine is in a box somewhere, I recently moved to a new home. I would appreciate it very much, thanks for the hint how to upgrade ;)