10 votes

Coding Noob Needs Help/Guidance on Small Project

Hi,

There's a certain site which hosts media files and has a player that depends on a lot of third-party resources to play, while browsers have native support for those file types. Those 3rd-party resources are often blocked by ad blockers and I have no desire to white-list them. I would like to extract the direct link to the media file and make it playable on my custom web page.

The link to the media file is present in the page source of each page, always on the same line. It's not anchored in HTML but present in the JavaScript for the player, like so:

    $(document).ready(function(){
      $("#jquery_jplayer_1").jPlayer({
        ready: function () {
          $(this).jPlayer("setMedia", {
            [ext]: "https://[domain]/[filename.ext]"
          });
        },

In this example it's on line #5. [ext] = the file extension.

I want to build the following:

  • A web page with a form with a single input field meant to receive links from that specific file host
  • [Something] that extracts the file link from the source of the host's page
  • Present the linked file as playable in an embedded native player

So far I've managed to create a form with an input box and a submit button, but it doesn't do anything yet. What is the best way to build the actual functionality? I know HTML/CSS. I have some rudimentary understanding of JavaScript/jQuery and Python3, so those would be my preferred tools.

For those worried about piracy: The files in question are not copyrighted and I'm not looking to make copies. I just want to make them playable. This is for personal use.

Thank you for reading this far. Any and all advice is welcome!

17 comments

  1. [3]
    Pilgrim
    Link
    There are a lot of folks on this site that know a lot more than I do, but you're basically combining several elements here: Form to enter the URL Submit button to start the action Way to play the...

    There are a lot of folks on this site that know a lot more than I do, but you're basically combining several elements here:

    1. Form to enter the URL
    2. Submit button to start the action
    3. Way to play the video

    You have #1. Below is a link that explains how to play the video using HTML5, so that's #3. You'll need to work with others or on your own to come up with the code for the submit button that actually goes and gets the URL for the media. That's as much time as I have today (but maybe I can come back to this...).

    W3 does a good job of explaining how to play the video using HTML5:
    https://www.w3schools.com/html/html5_video.asp

    NOTE: To everyone, please don't rip me for referring someone to w3. They're much improved from where they were 10 years ago.

    EDIT: HTML5 assumes you want to play mp4 video files. If you want to do something else, please provide more info :)

    4 votes
    1. [2]
      ourari
      Link Parent
      Thank you for your reply. It's the bit between step 2 and 3 as you listed them that I most need help with. If I were to build it in Python3, I've read about how I could use BeautifulSoup or...

      Thank you for your reply. It's the bit between step 2 and 3 as you listed them
      that I most need help with. If I were to build it in Python3, I've read about how I could use BeautifulSoup or html.parser and re (regular expression) to extract the file url. But how to get the input from the form into that Python script and get the output of that script back on the HTML page remains out of my reach :)

      HTML5 assumes you want to play mp4 video files. If you want to do something else, please provide more info :)

      There's audio support too! https://www.w3schools.com/html/html5_audio.asp

      1 vote
      1. teaearlgraycold
        Link Parent
        The easiest way would probably be to make it a flask web app. Create a flask route that calls your script. Provide the URL as a query or post parameter.

        But how to get the input from the form into that Python script and get the output of that script back on the HTML page remains out of my reach

        The easiest way would probably be to make it a flask web app. Create a flask route that calls your script. Provide the URL as a query or post parameter.

        1 vote
  2. [4]
    Crespyl
    Link
    The most straightforward way to get at the resource URL is probably to use a regex. People will, wisely, tell you that you shouldn't try to parse HTML with a regex, but in this case, the data...

    The most straightforward way to get at the resource URL is probably to use a regex. People will, wisely, tell you that you shouldn't try to parse HTML with a regex, but in this case, the data you're looking for (the URL) has a pretty specific form, and any higher level approach is probably going to be more trouble than it's worth.

    A full URL spec-compliant regex is also going to be a rabbit hole, you're (presumably) only interested in a fairly restricted set of URLs, all from the same host (or a small set) and having the same extension.

    Let me know if you need any help understanding how it works, and there's always a few quirks between implementations in different languages, but they're pretty fun to learn and a great tool to have in your toolbox.

    4 votes
    1. [2]
      ourari
      (edited )
      Link Parent
      I managed to figure it out! #!/usr/bin/env python3 import re import requests link = requests.get('https://domain.tld/folder/page') result = re.findall(r'(https?://domain.tld/folder/[^\s"]+)',...

      I managed to figure it out!

      #!/usr/bin/env python3
      import re
      import requests
      
      link = requests.get('https://domain.tld/folder/page')
      result = re.findall(r'(https?://domain.tld/folder/[^\s"]+)', link.text)
      print(result)
      

      Now I need to find a way to get the input from the HTML5/jQuery form into link and get the output of result onto that page to play it with <audio controls><source src="[result output]" type="audio/mp4"></audio>

      Thank you for* sending me in the right direction.

      3 votes
      1. Crespyl
        Link Parent
        Glad I could help! I love regular expressions (probably too much for my own good), so I'm always happy to help someone get started. You may already have some recommendations for simple Python web...

        Glad I could help! I love regular expressions (probably too much for my own good), so I'm always happy to help someone get started.

        You may already have some recommendations for simple Python web frameworks, and I'm not super familiar with the territory, but if it were me I'd probably start with something minimal like Flask.

        You'll want to set up a route to display the main page with the form, and a second one that the browser will POST to when the user submits the form.

        2 votes
    2. ourari
      Link Parent
      Thank you for your advice. I'll dig into it before bothering you with any questions :)

      Thank you for your advice. I'll dig into it before bothering you with any questions :)

      1 vote
  3. [3]
    Vadsamoht
    Link
    This is just basic HTML for the form, and something to catch it on the backend. If you want to use Python, for the simplest approach you might want to look at a Flask application which you can...

    A web page with a form with a single input field meant to receive links from that specific file host

    This is just basic HTML for the form, and something to catch it on the backend. If you want to use Python, for the simplest approach you might want to look at a Flask application which you can then host on PythonAnywhere or similar.

    [Something] that extracts the file link from the source of the host's page

    I assume that there's something like BeautifulSoup for JS, which would let you parse the code to extract what you want. I doubt BS can extract from JS itself, but that's the kind of thing you want. Just googling there seem to be a few options, but I don't want to suggest any because I have no experience with them.

    Present the linked file as playable in an embedded native player

    Simple enough to embed something like that into a template within the Flask site. However, I don't have nearly enough info to provide a suggestion for the player itself, so I'll leave that up to you.

    1 vote
    1. [2]
      ourari
      Link Parent
      This is the road I'm going to take. Thank you for your advice!

      If you want to use Python, for the simplest approach you might want to look at a Flask application which you can then host on PythonAnywhere or similar.

      This is the road I'm going to take. Thank you for your advice!

      1. Vadsamoht
        Link Parent
        No worries! I use python/Flask a bit myself (though I'm by no means an expert), so if you run into any problems let me know and I might be able to point you in the right direction.

        No worries! I use python/Flask a bit myself (though I'm by no means an expert), so if you run into any problems let me know and I might be able to point you in the right direction.

        1 vote
  4. [7]
    bme
    Link
    I know you want help, and not judgement, but effectively you are asking for help to skim bandwidth while denying revenue. That's a dick move. The minimum you should do here is rehost it, and if...

    I know you want help, and not judgement, but effectively you are asking for help to skim bandwidth while denying revenue. That's a dick move. The minimum you should do here is rehost it, and if you don't have the rights to do that, don't consume the video.

    1. [6]
      ourari
      Link Parent
      Wow, you're making a lot of assumptions! The site in question does not have ads nor paid subscriptions or anything. It is completely free. I would listen to their content on their site if their...

      Wow, you're making a lot of assumptions!

      1. The site in question does not have ads nor paid subscriptions or anything. It is completely free.
      2. I would listen to their content on their site if their player wasn't dependent on third-party resources. All I'm doing is making it playable, using the exact same bandwidth I'd be using anyway.
      3. It's not video.

      If you're planning on judging and lecturing people, make sure you have the facts first. Start with asking questions.

      Now, do you have anything constructive to add to help me solve my problem?

      5 votes
      1. [5]
        bme
        (edited )
        Link Parent
        I do. I would ask them directly if they would be willing to expose an api to consume the media directly. I wouldn't try to parse or scrape the js. It's fragile and flaky. How about I'll continue...

        Now, do you have anything constructive to add to help me solve my problem?

        I do. I would ask them directly if they would be willing to expose an api to consume the media directly. I wouldn't try to parse or scrape the js. It's fragile and flaky.

        If you're planning on judging and lecturing people, make sure you have the facts first. Start with asking questions.

        How about I'll continue forming my own judgements and opinions based on the information I have to hand, and you can continue begging for help.

        1. [4]
          cfabbro
          Link Parent
          ಠ_ಠ https://docs.tildes.net/overall-goals#the-golden-rule

          ಠ_ಠ

          If people treat each other in good faith and apply charitable interpretations, everyone's experience improves.

          https://docs.tildes.net/overall-goals#the-golden-rule

          6 votes
          1. [3]
            bme
            (edited )
            Link Parent
            Sure. I'll consider myself advised. I'd also advise the OP that the easiest way to demonstrate the accuracy of their claims would be to point to the site with free, hosted, and copyright-free...

            Sure. I'll consider myself advised.

            I'd also advise the OP that the easiest way to demonstrate the accuracy of their claims would be to point to the site with free, hosted, and copyright-free videos that is absolutely fine with the type of scraping the OP wants to implement. I suggested what they were doing was bad, not that they were bad. The fact that got a huffy denial rather than the obvious objective refutation (a link to the site in question) leaves me confident that I am not wrong, but I am open to the fact that I might be.

            1. [2]
              ourari
              Link Parent
              The raison d'être for this project is protecting my privacy. Why would I publicly disclose something that I wish to keep private - so much so that I'm willing to learn how to code to do it? You...

              The raison d'être for this project is protecting my privacy. Why would I publicly disclose something that I wish to keep private - so much so that I'm willing to learn how to code to do it?

              You are essentially saying that because I wish to keep something private, I must be doing something wrong. Yes, that presses one of my buttons because I'm a privacy activist; One of the main reasons I'm here on Tildes is to support a project that takes a privacy-by-design approach. It's not a place where I expected to encounter a 'nothing to hide, nothing to fear' mentality. Perhaps I should've known better.

              If you want to keep assuming bad faith and bad intentions that's fine. I'm not going to be bullied into disclosing anything more than what I already have.

              1. bme
                Link Parent
                I didn't say any of the things that you are imputing to me. I am not trying to bully you into doing anything. I merely laid out my standard. You don't have to meet it or agree with it. Other...

                I didn't say any of the things that you are imputing to me. I am not trying to bully you into doing anything. I merely laid out my standard. You don't have to meet it or agree with it. Other people on this site certainly do not agree with me. That's fine.