tomf's recent activity

  1. Comment on What was the best job you ever had? in ~life

    tomf
    Link
    I was a BA on contract with this big corporation. I wrapped my projects up early but still had a good four months to go. I pretty much shopped myself around to different teams and took on smaller projects and some larger projects. It was fun and I did a whole range of stuff from change management stuff, SharePoint nightmares, streamlining data input, prepping materials for a long overdue government audit, and more.

    It wasn’t particularly challenging or anything, but the variety was great.

  2. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    yeah, fingerprinting is okay but still not perfect. It works remarkably well with the Vogue podcast, but nothing else.

    1 vote
  3. Comment on What's your dream job? in ~life

    tomf
    Link
    designing casual dresses or tailored men's fashion

    2 votes
  4. Comment on Do you prefer chunky or smooth peanut butter? in ~food

    tomf
    Link
    60/40 Adam’s smooth and Adam’s roasted chunky is the way.

    1 vote
  5. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    Ok! This method is luck-based at best. For the main podcast I’m wanting it for, the Albanian one actually added more ads!

    Back to the drawing board. I really don’t want to have to use Whisper.

    I’m wondering if I can simply pull both, have them compared, then have the identical segments ripped and concat’d… I don’t totally trust the Whisper approach.
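    A minimal sketch of that "compare both pulls and keep what matches" idea, assuming each file has already been decoded and cut into fixed-length windows with one fingerprint per window (that step is not shown, and the fingerprint strings below are made up). It uses difflib from the standard library to find runs that appear in both pulls; anything outside those runs is a candidate ad.

```python
from difflib import SequenceMatcher


def common_runs(a, b, min_len=3):
    """Return (start_a, start_b, length) for runs present in both sequences."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len  # also drops the zero-length sentinel block
    ]


# Hypothetical per-window fingerprints for two pulls of the same episode.
direct = ["i1", "i2", "adA", "adB", "s1", "s2", "s3", "s4", "o1"]
proxy = ["i1", "i2", "adX", "s1", "s2", "s3", "s4", "o1"]

runs = common_runs(direct, proxy, min_len=2)
print(runs)
```

    The runs give window offsets to cut and concatenate from either file. Comparing raw MP3 bytes probably won't work, since the two pulls can be encoded or stitched differently, so the fingerprinting step is where the real effort would go.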

    1 vote
  6. Comment on Tildes Survey #2: What country do you live in? (Results) in ~talk

    tomf
    Link Parent
    Svenborgia was me :) it’s one of a few secret countries for the ultra wealthy from 30 Rock

    3 votes
  7. Comment on Formula 1 Miami Grand Prix 2026 - Race Weekend Discussion in ~sports.motorsports

    tomf
    Link
    pretty good race. I like this new set and the tweaks... at least so far. Great to see two miraculous recoveries, too.

    I wonder how George is really handling Kimi's success. It's got to be tough. I really thought this year would be Oscar's year, but now I'm genuinely concerned he'll never get there.

    I came away with six points in my pool

    • Kimi Antonelli - Pole, First Place (3pts)
    • Lando Norris - Sprint (1pt)
    • Nico Hulkenberg - Weekly Bet (most time in pit lane, 1pt)
    • Carlos Sainz - F1.5 Second (1pt)
    4 votes
  8. Comment on What are you reading these days? in ~books

    tomf
    Link Parent
    wicked. I'll add them to the queue. Sometimes I feel like I have nothing to read, then suddenly I have thirty books set

  9. Comment on What Google thinks you're worth in ~tech

    tomf
    Link Parent
    Half off topic, why don’t you block ads? I can’t think of the last time I saw an ad for something

    3 votes
  10. Comment on Reddit reports 69% jump in revenue, topping analyst estimates in ~tech

    tomf
    Link Parent
    spez and co all use old Reddit. I really get the sense that they’re keeping old Reddit for the smart people and then sh/mobile for regular people to generate income, have the crazy patterns, etc that the investors love.

    13 votes
  11. Comment on What are you reading these days? in ~books

    tomf
    Link Parent
    It's pretty great, especially for its length. He packs so much in there without it feeling overwritten or bloated.

    This is my first of his. I think I'll do some more, though. Any must-reads?

  12. Comment on What are you reading these days? in ~books

    tomf
    Link
    I’m finishing Conrad’s Heart of Darkness today.

    I also read Much Ado About Nothing. Fun, quick read. I’m not big into Shakespeare, but I think I should run through the hits.

    Before that I did Don Winslow’s The Power of the Dog — good book, but very much ‘this then this then that’

    Lots of short books.

    2 votes
  13. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    Oh nice. I was just going to do a jq and count sponsor segments

    It’s surprising there isn’t a big fancy web app for this.
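    For reference, a rough Python equivalent of the jq count, assuming the input is a JSON array of segment objects that each carry a "category" field (the sample payload below is invented for illustration, not real data):

```python
import json
from collections import Counter


def count_segments(payload):
    """Count segments per category in a JSON array of segment objects."""
    segments = json.loads(payload)
    return Counter(seg.get("category", "unknown") for seg in segments)


# Invented sample payload; the field names are an assumption.
sample = """
[
  {"category": "sponsor", "segment": [12.0, 45.5]},
  {"category": "sponsor", "segment": [300.0, 330.0]},
  {"category": "intro", "segment": [0.0, 5.0]}
]
"""

counts = count_segments(sample)
print(counts["sponsor"])
```

    A Counter returns 0 for categories that never appear, so a "no sponsor segments yet" check is just counts["sponsor"] == 0.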

    1 vote
  14. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    For the main, it's literally --sponsorblock-remove sponsor and a bunch of other stuff in a big config. Nothing crazy. For the RSS one I was going to incorporate something like https://github.com/xenova/sponsorblock-ml or whisper to find ads, but it's so much computing power when I can simply skip.

    Back to youtube, all you need to do is have a text file with the channels and -a urls.txt and --download-archive ~/downloaded.txt
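    Roughly, those flags fit together like this. It's a sketch that only builds the argument list, using the file names mentioned above; the function name and defaults are made up, and the rest of the big config is left out.

```python
import os


def build_ytdlp_cmd(batch_file="urls.txt",
                    archive="~/downloaded.txt",
                    categories=("sponsor",)):
    """Build a yt-dlp argument list from the flags mentioned above."""
    return [
        "yt-dlp",
        # Cut the listed SponsorBlock categories out of the download.
        "--sponsorblock-remove", ",".join(categories),
        # One channel or video URL per line.
        "-a", batch_file,
        # IDs already fetched are recorded here and skipped on later runs.
        "--download-archive", os.path.expanduser(archive),
    ]


cmd = build_ytdlp_cmd()
print(" ".join(cmd))
```

    Passing the list to subprocess.run(cmd) would start the download; it's kept as a list so no shell quoting is involved.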

    The next step for my yt one will be a count of the sponsorblock channels. If there isn't a count, it will leave it for the next run.

    https://i.imgur.com/1J0kRjV.jpeg

    I rarely see that view, but it shows the previous download for the day, current downloaded file name, size, etc.

    1 vote
  15. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    so far the three or four are all megaphone / spotify --- but I'm looking for some really ad-heavy podcasts. Nothing related to youtube.

    So far I've tested with This American Life, The Lakers Film Room Podcast, and Real Crime Profile. I don't listen to Real Crime Profile anymore (their OJ series was great), but it's from Wondery so it'll have a shitload of ads.

    • Proxy: 1479.2 seconds = 24:39.2
    • Direct: 1418.2 seconds = 23:38.2
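    A small helper to check those conversions, as a sketch (the function name is made up). For this pair the proxy pull works out to about 61 seconds longer than the direct one.

```python
def mmss(seconds):
    """Format a duration in seconds as M:SS.s."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}:{secs:04.1f}"


proxy_s = 1479.2
direct_s = 1418.2

print(mmss(proxy_s), mmss(direct_s))
# Positive means the proxy copy is the longer one.
diff = round(proxy_s - direct_s, 1)
print(diff)
```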

    I scrubbed through the episode and it seems to be ad-free. This proxy sucks, though. I might just use something like Hola to see if it's just my provider that isn't totally reliable.

    Laker Film Room podcast --

    • Proxy: 31:43.2
    • Direct: 33:11.2

    This one starts with an ad in Albanian for The Boys, but the rest is ad-free. So far so good.

    One thing I've noticed is that some podcasts send an ad-free stream for the first pull but have ads in the subsequent pulls. Probably anecdotal, though. I can't think of a better way to tackle this if the podcast isn't on youtube. I might simply do two pulls from every location available to see if that helps.
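    If it does come to several pulls per location, one simple rule is to keep whichever copy is shortest and ignore any pull whose duration couldn't be read. This is a sketch with made-up labels; the first two durations are the Laker Film Room numbers above converted to seconds.

```python
def pick_shortest(durations):
    """Return the label of the shortest pull, ignoring failed (<= 0) reads."""
    valid = {label: secs for label, secs in durations.items() if secs > 0}
    if not valid:
        return None
    return min(valid, key=valid.get)


pulls = {
    "proxy-1": 31 * 60 + 43.2,   # 31:43.2
    "direct-1": 33 * 60 + 11.2,  # 33:11.2
    "direct-2": 0.0,             # hypothetical failed probe
}

best = pick_shortest(pulls)
print(best)
```

    Shortest is only a proxy for "fewest ads", so this still wants a scrub through the result now and then.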

    1 vote
  16. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link Parent
    My main one is like you with yt-dlp and sponsorblock --- and that works really well. I've got two or three podcasts that have the worst ads... and the same ads all the time. I don't even care about skipping ads normally, but I don't ever want to hear this one about newspapers ever again.

    Anyway, this is day two of the other and it seems to work. I really need a good ad-heavy podcast to test it with, though. This was mostly written by chatgpt with some tweaks --- very rushed.

    The proxy comes free with my usenet provider. I have no endorsement or anything for them.

    potential rss ad-avoider
    #!/Users/sw/Projects/venv/bin/python
    
    import os
    import re
    import subprocess
    import requests
    import xml.etree.ElementTree as ET
    
    from datetime import datetime
    from email.utils import parsedate_to_datetime
    from concurrent.futures import ThreadPoolExecutor, as_completed
    
    # ---------------- PATHS ----------------
    
    DOWNLOAD_DIR = "/Users/xx/Documents/Projects/podcast/"
    FEEDS_FILE = "/Users/xx/Documents/Projects/podcast/feeds.txt"
    LOG_FILE = os.path.join(DOWNLOAD_DIR, "rss.log")
    ARCHIVE_FILE = os.path.join(DOWNLOAD_DIR, "downloaded.txt")
    FFMPEG = "/opt/homebrew/bin/ffmpeg"
    FFPROBE = "/opt/homebrew/bin/ffprobe"
    
    os.makedirs(DOWNLOAD_DIR, exist_ok=True)
    
    # ---------------- PROXY ----------------
    
    PROXY = "socks5h://username:password@tia.socks.privado.io:1080"
    PROXIES = {"http": PROXY, "https": PROXY}
    
    # ---------------- LOGGING ----------------
    
    def log(msg):
        ts = datetime.now().isoformat(timespec="seconds")
        line = f"{ts} | {msg}"
        print(line)
        with open(LOG_FILE, "a") as f:
            f.write(line + "\n")
    
    # ---------------- ARCHIVE ----------------
    
    def load_archive():
        if not os.path.exists(ARCHIVE_FILE):
            return set()
        with open(ARCHIVE_FILE) as f:
            return set(line.strip() for line in f if line.strip())
    
    ARCHIVE_SET = load_archive()
    
    def mark_done(key):
        with open(ARCHIVE_FILE, "a") as f:
            f.write(key + "\n")
        ARCHIVE_SET.add(key)
    
    # ---------------- RSS ----------------
    
    def parse_items(xml):
        try:
            root = ET.fromstring(xml)
            return root.findall(".//item")
        except Exception as e:
            log(f"[XML FAIL] {e}")
            return []
    
    def get_guid(item):
        g = item.find("guid")
        if g is not None and g.text:
            return g.text.strip()
        enc = item.find("enclosure")
        return enc.attrib.get("url") if enc is not None else None
    
    def get_enclosure(item):
        enc = item.find("enclosure")
        return enc.attrib.get("url") if enc is not None else None
    
    def get_pub(item):
        p = item.find("pubDate")
        if p is None or not p.text:
            return None
        try:
            return parsedate_to_datetime(p.text)
        except (TypeError, ValueError):
            return None
    
    # ---------------- UTIL ----------------
    
    def safe_name(s):
        return re.sub(r"[^a-zA-Z0-9_-]", "_", s)
    
    def get_date(item):
        pub = get_pub(item)
        return pub.strftime("%Y-%m-%d") if pub else datetime.now().strftime("%Y-%m-%d")
    
    # ---------------- MEDIA ----------------
    
    def download(url, path, use_proxy=False):
        try:
            r = requests.get(
                url,
                stream=True,
                timeout=60,
                proxies=PROXIES if use_proxy else None
            )
            r.raise_for_status()
    
            with open(path, "wb") as f:
                for chunk in r.iter_content(1024 * 128):
                    if chunk:
                        f.write(chunk)
    
            return True
    
        except Exception as e:
            log(f"[DOWNLOAD FAIL] proxy={use_proxy} | {e}")
            return False
    
    def duration(file):
        try:
            out = subprocess.check_output([
                FFPROBE, "-v", "error",
                "-show_entries", "format=duration",
                "-of", "default=noprint_wrappers=1:nokey=1",
                file
            ]).decode().strip()
    
            return float(out) if out else 0.0
        except (subprocess.CalledProcessError, OSError, ValueError):
            return 0.0
    
    def convert_mp3_to_m4a(src, dst):
        cmd = [
            FFMPEG, "-y",
            "-i", src,
            "-vn",
            "-c:a", "aac",
            "-b:a", "192k",
            dst
        ]
    
        p = subprocess.run(cmd, capture_output=True, text=True)
    
        if p.returncode != 0:
            log(f"[FFMPEG FAIL]\n{p.stderr}")
            return False
    
        return True
    
    def upload_file(path):
        try:
            cmd = [
                "scp",
                "-i", os.path.expanduser("~/.ssh/dosf"),
                "-o", "StrictHostKeyChecking=no",
                path,
                f"sw@boring.party:/home/xx/podcasts/{os.path.basename(path)}"
            ]
    
            p = subprocess.run(cmd, capture_output=True, text=True)
    
            if p.returncode != 0:
                log(f"[UPLOAD FAIL]\n{p.stderr}")
                return False
    
            return True
    
        except Exception as e:
            log(f"[UPLOAD ERROR] {e}")
            return False
    
    # ---------------- CORE ----------------
    
    def process_feed(url, name):
        try:
            log(f"[FEED] {name}")
    
            r = requests.get(url, timeout=30)
            r.raise_for_status()
    
            items = parse_items(r.content)
            if not items:
                log(f"[EMPTY] {name}")
                return
    
            latest = items[0]
    
            guid = get_guid(latest)
            if not guid:
                log(f"[NO GUID] {name}")
                return
    
            archive_key = f"{name}|{guid}"
    
            if archive_key in ARCHIVE_SET:
                log(f"[SKIP DONE] {name}")
                return
    
            mp3 = get_enclosure(latest)
            if not mp3:
                log(f"[NO ENCLOSURE] {name}")
                return
    
            date = get_date(latest)
            base = f"{date}_{safe_name(name)}"
    
            proxy_file = os.path.join(DOWNLOAD_DIR, base + "_proxy.mp3")
            direct_file = os.path.join(DOWNLOAD_DIR, base + "_direct.mp3")
            final_m4a = os.path.join(DOWNLOAD_DIR, base + ".m4a")
    
            # 1. proxy REQUIRED
            if not download(mp3, proxy_file, use_proxy=True):
                log(f"[SKIP NO PROXY] {name}")
                return
    
            # 2. direct optional
            download(mp3, direct_file, use_proxy=False)
    
            # 3. compare
            d_proxy = duration(proxy_file)
            d_direct = duration(direct_file)
    
            log(f"[DURATION] proxy={d_proxy:.1f}s direct={d_direct:.1f}s")
    
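            # Keep the proxy copy only when it is at least 60s shorter than
            # the direct copy; if the proxy duration could not be read,
            # fall back to the direct copy.
            chosen = proxy_file
            if d_direct > 0 and (d_proxy <= 0 or (d_direct - d_proxy) < 60):
                chosen = direct_file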
    
            # 4. convert
            if not convert_mp3_to_m4a(chosen, final_m4a):
                log(f"[CONVERT FAIL] {name}")
                return
    
            # 5. upload
            if not upload_file(final_m4a):
                log(f"[UPLOAD FAIL] {name}")
                return
    
            # 6. mark done
            mark_done(archive_key)
    
            # 7. cleanup
            for f in [proxy_file, direct_file, final_m4a]:
                try:
                    if os.path.exists(f):
                        os.remove(f)
                except OSError:
                    pass
    
            log(f"[DONE] {name}")
    
        except Exception as e:
            log(f"[ERROR] {name} | {e}")
    
    # ---------------- RUN ----------------
    
    def main():
        feeds = []
    
        with open(FEEDS_FILE) as f:
            for line in f:
                if "|" in line:
                    url, name = line.strip().split("|", 1)
                    feeds.append((url, name))
    
        with ThreadPoolExecutor(max_workers=4) as ex:
            futures = [ex.submit(process_feed, url, name) for url, name in feeds]
    
            for _ in as_completed(futures):
                pass
    
    if __name__ == "__main__":
        main()
    
    1 vote
  17. Comment on What is your favorite dinosaur? in ~talk

    tomf
    Link
    Irritator! the best!

    4 votes
  18. Comment on What programming/technical projects have you been working on? in ~comp

    tomf
    Link
    I'm trying to fetch podcasts without ads. I've got a script that connects to a proxy in Albania, fetches the mp3 and then fetches the mp3 again without the proxy. If the Albanian one is 1m+ shorter, it goes over to the php script that generates the RSS feed; but if it isn't shorter, it still has the same shitty ads or the same duration as the normal feed in Pocketcasts.

    This is only for podcasts that aren't on youtube. For those, I fetch them with yt-dlp and sponsorblock.

    Not the best logic with the first one, but I'm not sure how else I can handle it.

    4 votes
  19. Comment on Ted Lasso | Season 4 official teaser in ~tv

    tomf
    Link
    Ted Lasso: The New Class — i wonder if they’ll try a backdoor pilot or two.

    2 votes
  20. Comment on TV Tuesdays Free Talk in ~tv

    tomf
    (edited )
    Link
    Widow's Bay (2026) is a new comedy horror show that is definitely worth watching... at least so far.

    Directed by Hiro Murai (Atlanta, Barry, Mr & Mrs Smith), written by Katie Dippold (Mad TV, Parks and Rec), and starring Matthew Rhys (The Americans, Perry Mason).

    A skeptical mayor leads the superstitious residents of a cursed New England island.

    oh! and yes, of course it has Stephen Root like everything does and should.

    wednesday edit: this episode of The Boys is so good.

    1 vote