Yes, since most comments have nothing to do with the image itself but are just amateur attempts at being funny, that would make sense. But then why even read comments on ESR at all?