7 votes

How can you have the img's src attribute point to a web page itself instead of an image?

Posted July 2, 2023 by pyeri (edited July 2, 2023)

Consider the strange case of this reddit preview page for example:

https://preview.redd.it/uhomipyb8kp71.jpg?width=575&auto=webp&v=enabled&s=b0e044ddd8a83774e0453cb7607ef681444c4c37

If you inspect the primary <img> element on the page, you'll find its src attribute not pointing to any image file but (behold!) that link itself!

Through this mechanism, they've effectively hidden the direct link to that image, haven't they? How is this even possible? Is this a new technique in web development?

5 comments

[5]
teaearlgraycold
July 2, 2023
The server can detect what the client expects (an image or a web page?) and respond accordingly. A request triggered by an img tag tells the server, through its Accept header, that the browser wants an image rather than HTML.
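To make the idea concrete, here is a minimal sketch of a server that returns HTML or an image from the same URL depending on the Accept header. It uses Python's standard http.server module; the names `wants_html` and `PreviewHandler`, the `/photo.jpg` path, and the placeholder payloads are made up for illustration, and this is not a claim about how Reddit's servers actually work.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical placeholder payloads; a real server would read actual files.
HTML_PAGE = b'<html><body><img src="/photo.jpg"></body></html>'
IMAGE_BYTES = b"\xff\xd8\xff"  # placeholder bytes, not a valid JPEG


def wants_html(accept_header: str) -> bool:
    """Crude check: is text/html listed anywhere in the Accept header?"""
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    return "text/html" in media_types


class PreviewHandler(BaseHTTPRequestHandler):
    """Serves the same URL as HTML or as an image, depending on Accept."""

    def do_GET(self):
        if wants_html(self.headers.get("Accept", "")):
            body, content_type = HTML_PAGE, "text/html; charset=utf-8"
        else:
            body, content_type = IMAGE_BYTES, "image/jpeg"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        # Tell caches that the response depends on the Accept header.
        self.send_header("Vary", "Accept")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, one could start it with `HTTPServer(("127.0.0.1", 8000), PreviewHandler).serve_forever()` (importing `HTTPServer` from the same module) and open the same URL once in the address bar and once from an img tag.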

11 votes
1. [4]
  talklittle
  July 2, 2023
  That's right. The Accept request header for the first page starts with text/html and has a few image/* entries afterward. The server therefore serves HTML because it takes precedence. (Sometimes, depending on the server software, the server will ignore the text/html and just serve an image always.)
  
  On the other hand, the request made for the <img> sends a different Accept header that prefers image types (image/* and similar) and does not list text/html. So the server knows to send the image and not HTML.
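The precedence logic described above can be sketched as a small Accept parser that honors q-values and picks the best match among the types a server is willing to send. The function names and the two sample header strings are assumptions for illustration; the headers only approximate what browsers send, and exact values vary by browser and version.

```python
def parse_accept(header):
    """Return (media_range, q) pairs in the order they are listed."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    return entries


def matches(media_range, media_type):
    """True if a range such as image/* or */* covers the concrete type."""
    if media_range == "*/*":
        return True
    range_main, _, range_sub = media_range.partition("/")
    type_main, _, type_sub = media_type.partition("/")
    return range_main == type_main and range_sub in ("*", type_sub)


def specificity(media_range):
    """Rank ranges: */* is least specific, type/subtype is most."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def quality_for(entries, media_type):
    """q-value of the most specific range that matches, or 0.0 if none."""
    best = None
    for media_range, q in entries:
        if matches(media_range, media_type):
            rank = specificity(media_range)
            if best is None or rank > best[0]:
                best = (rank, q)
    return best[1] if best else 0.0


def pick_type(accept, offered):
    """Choose which offered type to serve; ties go to the earlier offer."""
    entries = parse_accept(accept)
    if not entries:
        return offered[0]  # no Accept header: any type is acceptable
    best = max(offered, key=lambda t: quality_for(entries, t))
    return best if quality_for(entries, best) > 0 else None


# Assumed, browser-like sample headers (not captured from a real session).
NAVIGATION_ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
IMG_ACCEPT = "image/avif,image/webp,image/*,*/*;q=0.8"
```

With `offered = ["text/html", "image/jpeg"]`, the navigation-style header selects text/html and the img-style header selects image/jpeg, which is the behavior described in this comment.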
  
  8 votes
  1. [3]
    pyeri (OP)
    July 2, 2023
    "Sometimes, depending on the server software, the server will ignore the text/html and just serve an image always"
    
    Actually, that's the conventional, widely used way of serving image URLs (ones that end with .JPG, .PNG, etc.). Won't this non-standard approach of varying the response by request headers break most archiving and crawling tools, such as HTTrack, archive.org, and search engine crawlers?
    
    3 votes
    
    talklittle
    July 2, 2023
    Right. I'm certainly not a fan of MIME types not matching file extensions. Worse yet when a site also checks User-Agent, and to some browsers/devices serves an image, and to others serves HTML.
    
    Edit: To address the archiving/crawling issue: crawlers can probably send a narrower Accept header when the URL has an image file extension. Then the server must serve the image and not HTML; otherwise the server would be considered broken.
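A rough sketch of that crawler-side idea, using only the Python standard library: guess the media type from the URL's file extension and narrow the Accept header when it looks like an image. The helper names `accept_for_url` and `build_request` and the `DEFAULT_ACCEPT` value are hypothetical, and no real crawler is claimed to work this way.

```python
import mimetypes
from urllib.parse import urlsplit
from urllib.request import Request

# Assumed fallback for ordinary pages; real crawlers choose their own value.
DEFAULT_ACCEPT = "text/html,application/xhtml+xml,*/*;q=0.8"


def accept_for_url(url):
    """Return a narrow Accept value when the URL path looks like an image."""
    path = urlsplit(url).path  # drops the query string, e.g. ?width=575
    guessed_type, _ = mimetypes.guess_type(path)
    if guessed_type and guessed_type.startswith("image/"):
        return "image/*"
    return DEFAULT_ACCEPT


def build_request(url):
    """Build (but do not send) a request carrying the chosen Accept header."""
    return Request(url, headers={"Accept": accept_for_url(url)})
```

Passing the result of `build_request(url)` to `urllib.request.urlopen` would then fetch the URL with the narrowed header, so a server that negotiates on Accept has no reason to answer with HTML.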
    
    3 votes
    
    teaearlgraycold
    July 2, 2023
    I don't see why that would break crawlers.