Stage 09

Embedded Filesystem Types

Choose and use the right filesystem for embedded Linux — ext4, squashfs read-only rootfs, tmpfs for RAM, overlayfs for layered writes, UBIFS/JFFS2 for raw NAND flash, and image creation tools.


ext4

{:.gc-basic}

Basic

ext4 (Fourth Extended Filesystem) is the workhorse block filesystem of Linux. It introduced extents (replacing indirect block pointers), delayed allocation, persistent pre-allocation, and journal checksums. For embedded systems it is the right choice when you have a managed flash device (eMMC, SD card, SATA SSD) that handles its own wear-leveling.

Key Features

| Feature | Details |
|---|---|
| Journaling | Metadata journal (default) or data+metadata (data=journal) |
| Extents | Contiguous block ranges replace indirect block maps — better for large files |
| Max file size | 16 TiB |
| Max volume size | 1 EiB |
| Online resize | resize2fs can grow a live filesystem |
| Directory indexing | HTree (hash tree) for fast lookups in large directories |

Creating an ext4 Image for Embedded

# 128 MB rootfs image
dd if=/dev/zero of=rootfs.ext4 bs=1M count=128

# -b 4096  : block size — 4096 is default; use 1024 for many small files
# -i 4096  : bytes-per-inode ratio — lower = more inodes (more files)
# -L rootfs: volume label
# -O ^has_journal : disable journal for read-mostly partitions (saves space)
mkfs.ext4 -b 4096 -i 4096 -L rootfs rootfs.ext4

# Mount as loop device and populate
sudo mount -o loop rootfs.ext4 /mnt/target
sudo cp -a $ROOTFS/* /mnt/target/
sudo umount /mnt/target

# Check and repair
e2fsck -f rootfs.ext4

# Shrink to minimum size
resize2fs -M rootfs.ext4
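The -i ratio fixes the inode budget at mkfs time and cannot be raised later, so it is worth checking the arithmetic up front. For the 128 MB image and -i 4096 used above:

```shell
# One inode is allocated per -i bytes of filesystem capacity.
image_bytes=$((128 * 1024 * 1024))        # 128 MiB image from the dd step
bytes_per_inode=4096                      # the -i value passed to mkfs.ext4
echo $((image_bytes / bytes_per_inode))   # → 32768 inodes available
```

If the rootfs holds more files than that (lots of tiny scripts, locale data), lower -i: running out of inodes fails with "No space left on device" even while free blocks remain.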

tune2fs — Adjust Parameters After Creation

# Show filesystem parameters
tune2fs -l rootfs.ext4

# Reduce mount count before auto-fsck (set to 0 to disable count-based check)
tune2fs -c 0 -i 0 /dev/mmcblk0p2

# Set reserved block percentage to 0% (default is 5% — wasteful on embedded)
tune2fs -m 0 /dev/mmcblk0p2

Mount Options for Embedded Flash

# /etc/fstab for an embedded eMMC partition
/dev/mmcblk0p2  /  ext4  noatime,nodiratime,data=writeback,errors=remount-ro  0 1

| Option | Purpose |
|---|---|
| noatime | Do not update access timestamps — reduces write amplification significantly |
| nodiratime | Do not update directory access timestamps (implied by noatime) |
| data=writeback | Journal metadata only — fastest, but file data can be stale after a crash |
| data=ordered | Default — data is flushed to disk before the metadata journal commit |
| errors=remount-ro | Remount read-only on error instead of panicking — important for embedded |
| commit=60 | Raise the journal commit interval (default 5 s) to reduce writes |
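fstab entries only express intent; the kernel's view of the active options lives in /proc/mounts. A quick way to confirm what was actually applied (shown here for the root mount — substitute your mount point):

```shell
# Print the options the kernel applied to the root filesystem.
# Compare against the fstab entry to catch ignored or remapped options.
awk '$2 == "/" { print $4; exit }' /proc/mounts
```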

squashfs — Read-Only Compressed

{:.gc-basic}

Basic

SquashFS is a compressed, read-only filesystem designed for compact system images. It compresses data and metadata into a single image while still allowing random access via per-block decompression. A read-only rootfs has significant embedded advantages:

  • No filesystem corruption — power can be cut at any time
  • Atomic OTA updates — swap out the image file or partition
  • Smaller storage — 40–60% compression ratio typical with zstd/xz
  • Faster reads — decompression from flash is often faster than uncompressed reads from slow NAND

Creating SquashFS Images

# Basic — default gzip compression
mksquashfs $ROOTFS rootfs.squashfs

# Production — zstd (best speed/ratio balance)
mksquashfs $ROOTFS rootfs.squashfs \
    -comp zstd \
    -Xcompression-level 15 \
    -b 131072 \
    -noappend \
    -no-progress

# Maximum compression — xz (best ratio, slower decompress)
mksquashfs $ROOTFS rootfs.squashfs \
    -comp xz \
    -b 262144 \
    -noappend

# Embedded optimised — lz4 (fastest decompress, lowest CPU on boot)
mksquashfs $ROOTFS rootfs.squashfs \
    -comp lz4 \
    -Xhc \
    -b 65536 \
    -noappend

Compression Comparison

| Algorithm | Ratio | Decompress speed | Compress speed | Best for |
|---|---|---|---|---|
| gzip | Good | Moderate | Moderate | Default / compatibility |
| lzo | Moderate | Very fast | Fast | Low-power CPUs |
| lz4 | Lower | Fastest | Fastest | Boot speed critical |
| zstd | Excellent | Fast | Fast | Recommended default |
| xz | Best | Slow | Very slow | Smallest image size |

# Mount a squashfs image
sudo mount -t squashfs -o loop rootfs.squashfs /mnt/target

# Inspect contents without mounting
unsquashfs -l rootfs.squashfs | head -30

# Extract to directory
unsquashfs -d extracted/ rootfs.squashfs

Typical Embedded Usage Pattern

NOR/NAND Flash Layout:
  Partition 0: U-Boot bootloader      (512 KB)
  Partition 1: Kernel + dtb           (8 MB)
  Partition 2: SquashFS rootfs (ro)   (32 MB)   ← atomic OTA target
  Partition 3: ext4 data (rw)         (rest)
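A raw-flash layout like this is usually described to the kernel on the command line with the mtdparts= parameter. A sketch matching the sizes above (the mtd-id nand0 is board-specific — check /proc/mtd on your target):

```text
mtdparts=nand0:512k(u-boot),8m(kernel),32m(rootfs),-(data)
```

Each (name) becomes an MTD partition; '-' assigns the remaining space to the last one.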

tmpfs — RAM Filesystem

{:.gc-mid}

Intermediate

tmpfs is a virtual filesystem that lives entirely in kernel memory (and swap, if available). Unlike ramfs (which grows without bound and cannot be swapped), tmpfs enforces a configurable size limit and can push idle pages out to swap under memory pressure.

Characteristics

  • Files live in kernel memory — lost on reboot or unmount
  • Memory is only consumed when files are actually written (sparse)
  • Backed by anonymous memory, with swap as overflow
  • Extremely fast — no disk I/O at all
  • Supports extended attributes and POSIX ACLs

Mount Options

# Mount a 128 MB tmpfs on /tmp
mount -t tmpfs -o size=128m,mode=1777 tmpfs /tmp

# /run — runtime data (PID files, sockets)
mount -t tmpfs -o size=32m,mode=755 tmpfs /run

# /dev/shm — POSIX shared memory
mount -t tmpfs -o size=64m tmpfs /dev/shm

| Option | Description |
|---|---|
| size=N | Maximum size in bytes, KiB (k), MiB (m), GiB (g), or % of RAM |
| mode=OCTAL | Permissions of the mount-point root directory |
| uid=N | Owner UID of the mount-point root |
| gid=N | Owner GID of the mount-point root |
| nr_inodes=N | Maximum number of inodes (default: half of RAM pages) |
| nr_blocks=N | Maximum number of blocks |

# /etc/fstab entries for typical embedded tmpfs mounts
tmpfs  /tmp          tmpfs  size=64m,mode=1777,nosuid,nodev   0 0
tmpfs  /run          tmpfs  size=16m,mode=755,nosuid,nodev    0 0
tmpfs  /var/volatile tmpfs  size=32m,mode=755                 0 0
tmpfs  /dev/shm      tmpfs  size=32m,mode=1777,nosuid,nodev   0 0

Use Cases on Embedded Systems

  • /tmp — temporary files (always volatile)
  • /run — PID files, Unix sockets, runtime state
  • /var/volatile — when /var needs to be writable but is on a read-only rootfs
  • /var/log — log files on systems without persistent storage
  • Build-system convention: mount a tmpfs on /var/volatile and symlink volatile /var subdirectories (e.g., /var/log) into it
# Check current tmpfs usage
df -h -t tmpfs

# Verify size limit is respected
mount | grep tmpfs
# tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=65536k,mode=1777)
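The "memory only consumed when written" property is sparse allocation: apparent size and allocated blocks are tracked separately. A small demonstration (uses a temp file — the same accounting applies inside a tmpfs mount):

```shell
# Reserve 10 MiB of apparent size without writing any data.
f=$(mktemp)
truncate -s 10M "$f"
ls -l "$f" | awk '{ print $5 }'   # apparent size: 10485760 bytes
du -k "$f" | awk '{ print $1 }'   # allocated: typically 0 (no data written)
rm -f "$f"
```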

overlayfs — Layered Filesystem

{:.gc-mid}

Intermediate

overlayfs merges two directory trees: a lower (read-only) layer and an upper (read-write) layer. Reads come from upper if the file exists there, otherwise from lower. Writes always go to upper. The merged directory is the union view presented to users.

Directory Structure

lower/     ← read-only (e.g., squashfs rootfs)
upper/     ← read-write (e.g., tmpfs or ext4 partition)
workdir/   ← overlayfs internal use (must be on same fs as upper)
merged/    ← the unified view (mounted here)

Mounting overlayfs

# Create the required directories
mkdir -p /overlay/{upper,work}
mkdir -p /overlay/merged

# Mount — lower is read-only squashfs already at /ro
mount -t overlay overlay \
    -o lowerdir=/ro,upperdir=/overlay/upper,workdir=/overlay/work \
    /overlay/merged

# Multiple lower layers — the leftmost directory is the topmost (highest priority)
mount -t overlay overlay \
    -o lowerdir=/layer2:/layer1:/base,upperdir=/upper,workdir=/work \
    /merged

Practical Embedded OTA Pattern

Boot with squashfs (ro) + overlayfs (rw tmpfs upper):

  /lower    ← mount squashfs here (read-only, compressed)
  /upper    ← tmpfs (writable, lost on reboot)  OR  ext4 partition (persistent writes)
  /workdir  ← same filesystem as upper
  /         ← overlayfs merged view (read-write from user perspective)
# Typical rcS sequence for overlayfs rootfs
mount -t squashfs /dev/mtdblock2 /lower
mount -t tmpfs tmpfs /upper -o size=64m
mkdir -p /upper/data /upper/work
mount -t overlay overlay \
    -o lowerdir=/lower,upperdir=/upper/data,workdir=/upper/work \
    /mnt/newroot
exec switch_root /mnt/newroot /sbin/init

Docker’s Use of overlayfs

Docker’s overlay2 storage driver is built on overlayfs. Each image layer contributes to lowerdir; the running container gets its own upperdir for writes. Deleting the container discards the upperdir, leaving the image layers untouched.

overlayfs Limitations

| Limitation | Detail |
|---|---|
| Hard links | Cannot create hard links that span the lower and upper layers |
| fsync propagation | fsync on the merged view does not guarantee lower-layer durability |
| Rename across layers | Renaming a lower-layer file becomes a copy-up plus a whiteout |
| NFS upper | NFS cannot be used as upperdir; an NFS lower works, but remote changes while mounted are undefined |
| Copy-up overhead | The first write to a lower-layer file triggers a full copy-up to upper |

JFFS2 and UBIFS for Raw NAND

{:.gc-adv}

Advanced

Raw NAND flash differs fundamentally from managed flash (eMMC, SD, SSD):

  • No built-in FTL (Flash Translation Layer)
  • Must handle erase blocks (128 KB – 2 MB typical)
  • Subject to write endurance (10,000–100,000 cycles per block)
  • Prone to bit errors — requires ECC
  • Cannot overwrite — must erase entire block before writing
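The erase-before-write rule is what drives wear: modifying a single page can force a read-erase-rewrite of the whole block. Quick numbers for a common geometry (2 KiB pages, 128 KiB erase blocks — adjust for your part):

```shell
page_size=2048                      # bytes per NAND page
erase_block=$((128 * 1024))         # bytes per erase block
# Worst-case write amplification for a one-page update equals
# the number of pages sharing the erase block.
echo $((erase_block / page_size))   # → 64 pages per erase block
```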

Linux exposes raw NAND through the MTD (Memory Technology Device) subsystem:

# View MTD partitions
cat /proc/mtd
# dev:    size   erasesize  name
# mtd0: 00080000 00020000 "u-boot"
# mtd1: 00500000 00020000 "kernel"
# mtd2: 07a80000 00020000 "rootfs"

# Character devices: /dev/mtd0 (raw), /dev/mtdblock0 (block emulation)
mtdinfo /dev/mtd2

JFFS2 (Journalling Flash File System 2)

JFFS2 was the original Linux flash filesystem. It stores files as linked lists of nodes written sequentially across the flash.

Characteristics:

  • Built-in wear leveling via log-structured writes
  • Transparent compression (zlib, rtime, LZO)
  • Power-fail safe — nodes are written atomically
  • Slow mount time on large partitions (must scan all nodes at boot)
  • Recommended for NOR flash and small NAND partitions (<64 MB)
# Create JFFS2 image (requires mtd-utils)
# -e 128KiB      : erase block size — must match the hardware
# --pad=0x7a80000: pad the image to the full partition size
# -n             : no cleanmarkers in the image (NAND stores them in OOB)
mkfs.jffs2 -r $ROOTFS -o rootfs.jffs2 -e 128KiB --pad=0x7a80000 -n

# Flash to MTD partition
flashcp -v rootfs.jffs2 /dev/mtd2

# Mount
mount -t jffs2 /dev/mtdblock2 /mnt

UBIFS (Unsorted Block Image File System)

UBIFS runs on top of UBI (Unsorted Block Images) — a volume-management layer that handles wear leveling, bad-block management, and bit-flip scrubbing so the filesystem does not have to.

Application
    │
  UBIFS      ← filesystem with sorted B-tree index
    │
   UBI       ← volume manager, wear leveling, bad block handling
    │
   MTD       ← raw NAND access with ECC
    │
  NAND       ← physical flash

UBIFS vs JFFS2:

| Feature | JFFS2 | UBIFS |
|---|---|---|
| Mount time | O(n) — scans all nodes | Near-constant — indexed B-tree |
| Write performance | Moderate | High |
| Suitable flash size | < 64 MB | Any size |
| Compression | Yes (zlib/lzo) | Yes (lzo/zstd) |
| Power-fail safety | Yes | Yes |
| Wear leveling | Built-in | Via UBI layer |
| ECC | Via MTD | Via MTD (UBI scrubs bit-flips) |

# Step 1: Attach MTD device to UBI
ubiattach /dev/ubi_ctrl -m 2 -d 0   # mtd2 → ubi0

# Step 2: Create UBI volume
ubimkvol /dev/ubi0 -n 0 -N rootfs -s 120MiB

# Step 3: Create UBIFS image
# -m 2048  : minimum I/O unit (NAND page size)
# -e 126976: logical erase block size (PEB minus two pages of UBI headers)
# -c 1000  : maximum number of logical erase blocks (caps filesystem size)
mkfs.ubifs -r $ROOTFS -o rootfs.ubifs -m 2048 -e 126976 -c 1000

# Step 4: Package into a UBI image (for flashing)
ubinize -o ubi.img -m 2048 -p 131072 ubinize.cfg

# ubinize.cfg:
# [ubifs]
# mode=ubi
# image=rootfs.ubifs
# vol_id=0
# vol_size=120MiB
# vol_type=dynamic
# vol_name=rootfs
# vol_flags=autoresize

# Flash the UBI image (erase the whole partition first)
flash_erase /dev/mtd2 0 0
nandwrite -p /dev/mtd2 ubi.img

# Mount UBIFS
mount -t ubifs ubi0:rootfs /mnt
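The -e 126976 passed to mkfs.ubifs is derived, not arbitrary: UBI stores two headers (erase counter and volume ID) at the start of every physical erase block, each occupying one min-I/O page, so the logical erase block is the PEB minus two pages. With the geometry used above:

```shell
peb=$((128 * 1024))      # physical erase block — the ubinize -p value
page=2048                # min I/O unit (NAND page) — the -m value
echo $((peb - 2 * page)) # → 126976, the -e value for mkfs.ubifs
```

On NAND with sub-page write support the overhead can be smaller; ubinfo (mtd-utils) reports the exact logical eraseblock size for an attached device.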

Filesystem Image Creation

{:.gc-adv}

Advanced

Creating ext2/ext3/ext4 Without Root (genext2fs)

# genext2fs creates ext2 images without root privileges or loop mounts
# -b 65536: image size in 1 KiB blocks (= 64 MiB)
# -d DIR  : source directory to populate the image from
# -i 4096 : bytes per inode
# -U      : squash owners — all files become uid/gid 0
genext2fs -b 65536 -d $ROOTFS -i 4096 -U rootfs.ext2

# Convert to ext4
tune2fs -O extents,uninit_bg,dir_index rootfs.ext2
e2fsck -f rootfs.ext2
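genext2fs counts -b in 1 KiB blocks regardless of the filesystem block size, which is an easy unit to trip over. Checking the values used above:

```shell
blocks=65536                 # the -b value, in 1 KiB units
bytes=$((blocks * 1024))
echo "$bytes"                # → 67108864 bytes (64 MiB)
echo $((bytes / 4096))       # → 16384 inodes at -i 4096
```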

fakeroot — Correct Permissions Without Root

# fakeroot intercepts file ownership calls and tracks them in a database
# Allows creating images with root-owned files as a non-root user

fakeroot -- bash -c '
    cp -a $ROOTFS /tmp/staging
    chown -R root:root /tmp/staging
    chmod 4755 /tmp/staging/bin/su
    mksquashfs /tmp/staging rootfs.squashfs -comp zstd
'

Complete Image Creation Table

| Format | Command | Read-write | Compression | Flash type | Root required |
|---|---|---|---|---|---|
| ext4 | mkfs.ext4 | Yes | No | eMMC/SD/HDD | Yes (or loop) |
| ext2 (no root) | genext2fs | Yes | No | eMMC/SD | No |
| SquashFS | mksquashfs | No | Yes | Any | No (with fakeroot) |
| cpio initramfs | find \| cpio | RAM only | Via gzip/xz | Any | No |
| JFFS2 | mkfs.jffs2 | Yes | Yes | NOR/NAND (MTD) | No |
| UBIFS | mkfs.ubifs + ubinize | Yes | Yes | NAND (UBI) | No |
| SquashFS + overlayfs | combined | Apparent yes | Partial (lower only) | Any | Partial |

Buildroot Image Generation Pipeline

# Buildroot handles all of this automatically — shown here for understanding
# output/images/ contains the final artifacts:
ls output/images/
# rootfs.ext4      ← ext4 for eMMC targets
# rootfs.squashfs  ← squashfs for read-only targets
# rootfs.tar.gz    ← tarball for NFS or container base
# sdcard.img       ← full disk image with partition table

Interview Q&A

{:.gc-iq}

Interview Q&A

Q1 — Why is SquashFS preferred for a production embedded rootfs?

SquashFS provides three key properties for production: (1) power-fail safety — because it is read-only, there is no risk of filesystem corruption from unexpected power loss; (2) atomic OTA — you replace or swap the entire image atomically rather than patching individual files; (3) compression — it reduces flash consumption and can improve read throughput on slow NAND by spending CPU cycles to decompress rather than waiting for I/O. The combination of overlayfs for a writable upper layer gives back the ability to modify files at runtime.

Q2 — In overlayfs, what is the difference between lowerdir and upperdir, and what happens on a write?

lowerdir is the read-only base layer (typically a SquashFS mount). upperdir is the read-write layer (typically a tmpfs or ext4 partition). When a file in lowerdir is written for the first time, overlayfs performs a copy-up: it copies the entire file from lowerdir to upperdir, then modifies the copy. Subsequent writes go directly to upperdir. Deletion is handled with a whiteout file in upperdir that masks the lower-layer entry.

Q3 — What is the default size of a tmpfs mount if no size= option is given?

By default, tmpfs is limited to half of physical RAM. On a system with 512 MB RAM, an unconfigured tmpfs can grow to 256 MB. This is why /etc/fstab entries for /tmp, /run, and similar should always specify an explicit size= limit, especially on memory-constrained embedded systems.
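The effective default can be read straight off the running system; a sketch that computes MemTotal/2 from /proc/meminfo (Linux-only):

```shell
# Unconfigured tmpfs mounts are capped at half of physical RAM.
awk '/^MemTotal:/ { printf "%d kB\n", $2 / 2 }' /proc/meminfo
```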

Q4 — What are the three ext4 journal modes, and when would you use each?

data=journal — both data and metadata go through the journal. Safest, slowest. Rarely used. data=ordered — data is written to disk before its metadata is committed to the journal. Default mode. Good balance of safety and performance. data=writeback — metadata is journaled but data write-ordering is not guaranteed. Fastest, but data can be stale after a crash (though never corrupt). Preferred for embedded flash where write endurance matters more than ordered-write guarantees.

Q5 — When should you choose UBIFS over JFFS2?

UBIFS should be chosen for any raw NAND partition larger than ~64 MB. JFFS2’s mount time scales linearly with the number of nodes in the filesystem — on a 256 MB NAND partition it can take 30–60 seconds to mount at boot. UBIFS keeps a B-tree index on the UBI volume, so mount time is near-constant regardless of size. JFFS2 remains appropriate for small NOR flash chips (typically < 16 MB) where the UBI overhead is not worth the setup complexity.

Q6 — Why should embedded systems use noatime in their mount options?

Every file read updates the atime (access time) field in the inode, which requires a write to flash. On a system that reads files frequently but does not need access time tracking (virtually all embedded applications), this causes pointless write amplification — shortening flash lifespan and reducing performance. noatime disables these writes entirely. relatime (the Linux default since 2.6.30) is a compromise that only updates atime if it is older than mtime or once per day, but noatime is still preferred for flash-heavy embedded targets.

Q7 — How do you create a disk image with a partition table for an eMMC target?

# Create a 1 GB disk image with MBR partition table
dd if=/dev/zero of=sdcard.img bs=1M count=1024

# Partition: 64 MB boot (FAT32) + rest as rootfs (ext4)
parted -s sdcard.img \
    mklabel msdos \
    mkpart primary fat32 4MiB 68MiB \
    mkpart primary ext4 68MiB 100% \
    set 1 boot on

# Format partitions via loop device
sudo losetup -fP sdcard.img
LOOP=$(losetup -j sdcard.img | cut -d: -f1)
sudo mkfs.vfat -F32 -n BOOT ${LOOP}p1
sudo mkfs.ext4 -L rootfs    ${LOOP}p2
# Copy content, then:
sudo losetup -d $LOOP
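On systems where losetup -P is unavailable, individual partitions can also be mounted straight from the image with a byte offset. The offsets follow from the parted commands above (partition 1 at 4 MiB, partition 2 at 68 MiB):

```shell
# Byte offsets of the two partitions created above.
p1_off=$((4 * 1024 * 1024))       # boot (FAT32) start
p2_off=$((68 * 1024 * 1024))      # rootfs (ext4) start
echo "$p1_off $p2_off"            # → 4194304 71303168
# Then, for example:
#   sudo mount -o loop,offset=$p2_off sdcard.img /mnt/target
```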

References

{:.gc-ref}

References

| Resource | Link |
|---|---|
| Linux kernel — SquashFS documentation | kernel.org/doc/html/latest/filesystems/squashfs.html |
| Linux kernel — overlayfs documentation | kernel.org/doc/html/latest/filesystems/overlayfs.html |
| Linux kernel — UBIFS documentation | kernel.org/doc/html/latest/filesystems/ubifs.html |
| MTD (Memory Technology Devices) | linux-mtd.infradead.org |
| mtd-utils project | git.infradead.org/mtd-utils.git |
| ext4 wiki | ext4.wiki.kernel.org |
| Buildroot filesystem generation | buildroot.org/downloads/manual/manual.html#_filesystem_images |
| man 8 mkfs.ext4 | ext4 filesystem creation |
| man 8 mksquashfs | SquashFS image creation |
| man 8 mkfs.ubifs | UBIFS image creation |
| Bootlin embedded Linux slides | bootlin.com/doc/training/embedded-linux/ |