https://i.sstatic.net/72CJR.png
To understand this, you need to read the excerpt you quoted together with the paragraph that precedes it. Together, they say:
A floated box is shifted to the left or right until its outer edge touches the containing block edge or the outer edge of another float. If there is a line box, the outer top of the floated box is aligned with the top of the current line box.
If there is not enough horizontal room for the float, it is shifted downward until either it fits or there are no more floats present.
This essentially means:
If there is a line box ...
First, the inline-block image creates a line box. Its height is that of the image once the image is vertically aligned with the line box's strut, so the line box is marginally taller than the image itself. Because the block formatting context also contains other elements (the floats), the line box is wrapped in an anonymous block box.
The first floated image fits alongside the inline-block image, so its top is aligned with the top of that line box.
There is not enough horizontal room, however, for the second floated image to sit alongside the inline-block image with its top aligned with the top of the line box. The second floated image therefore has to go after the line box and the anonymous block box that wraps it. Since there is no subsequent line box,
it is shifted downward until it fits ...
In a block formatting context, being placed after a block box means being placed immediately below it on the vertical axis. The float fits there, so that is where it ends up.
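The situation described above can be reproduced with a small sketch. The markup below is not the question's code: the class names, colours, and sizes (a 220px container and three 100px boxes, with left floats) are assumptions chosen so that two boxes fit side by side and a third does not.

```html
<!DOCTYPE html>
<!-- Hypothetical repro: sizes, colours and class names are assumptions. -->
<style>
  /* Wide enough for two 100px boxes side by side, but not three. */
  .container { width: 220px; border: 1px solid #999; }
  .box { width: 100px; height: 100px; }
  /* Stands in for the inline-block image; it creates the line box. */
  .inline { display: inline-block; background: steelblue; }
  /* Stand in for the two floated images. */
  .float { float: left; background: tomato; }
  .float + .float { background: gold; }
</style>
<div class="container">
  <span class="box inline"></span>
  <span class="box float"></span>
  <span class="box float"></span>
</div>
```

With these assumed sizes, the first float (red) should sit on the same row as the inline-block box, with its top at the top of the line box, while the second float (yellow) should drop to just below the line box, since 300px of boxes cannot share a 220px row.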
... or there are no more floats present.
Your example doesn't exercise this clause. It covers cases where a float is wider than its containing block, for example a float whose width is more than 100% of the containing block's width. Such a float cannot fit alongside anything that precedes it, so it is placed directly beneath the last float.
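As a rough illustration of that last clause, here is a minimal sketch with an oversized float. Again, the class names and dimensions are assumptions, not taken from the question.

```html
<!DOCTYPE html>
<!-- Hypothetical example: a float wider than its containing block. -->
<style>
  .container { width: 200px; border: 1px solid #999; }
  .float { float: left; height: 50px; }
  .narrow { width: 100px; background: tomato; }
  /* 150% of the 200px container is 300px, so this float can never fit in it. */
  .wide { width: 150%; background: gold; }
</style>
<div class="container">
  <div class="float narrow"></div>
  <div class="float wide"></div>
</div>
```

Here the wide float should be shifted down until it has cleared the narrow float, then placed at the left edge, overflowing the container to the right.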