11 votes

Secrets of an Excel spreadsheet esports player

4 comments

  1. [4]
    Merry
    Link
    I have a lot of respect from people who are highly proficient in Excel but some of these takes are kind of... woof. Going from saying this: To this: Which is kind of some cognitive dissonance at...

    I have a lot of respect from people who are highly proficient in Excel but some of these takes are kind of... woof.

    Going from saying this:

    You write numerous Python scripts that scrape tenant lists from the property websites each night (property, tenant, unit, square feet), uploading them to a SQL database. It’s at this stage that Excel becomes a critical differentiator. The data is messy! You’d be surprised how many ways Best Buy is spelled, or how often shopping centers change their name. Sure, Python can attempt to handle it with some fuzzy math algos, or a developer can build a GUI in C++… but I’m not a coder and good luck making either of those transparent to the end user (a financial analyst who works in Excel all day).

    So we built spreadsheets that interact with the database (using macros), allowing a person to spot-check and correct various abnormalities once per week in just 10 minutes for all 3,000 shopping centers. Another macro assembles the cleansed data for each landlord’s portfolio, quantifying various metrics over time, compares it to company-reported data and our internal models, then sends an email with summary charts & tables and a link to the raw data should the analyst wish to dig deeper.

    P.S., Excel can also scrape the tenant lists from the websites.

    To this:

    PCW: There’s a belief that Excel doesn’t need much hardware to run, can you describe the computer, or PCs you use for Excel? Such as what kind of CPU and how many RAM and storage?

    Jason Moore: The latest build (June 2020) is a Core i9-10900X with 256GB of Samsung DDR4-2666 RAM and a 2TB Samsung 983 Enterprise SSD. Some spreadsheets run the CPU at 100 percent for hours, so it has an Asetek 650LS 120mm CPU Cooler. Having as many cores as possible and at least 64GB of RAM is key.

    PCW: How large do your spreadsheets get?

    Jason Moore: My current record is a 2.3GB stock-picking algorithm. It’s 95 percent calculations, not just a giant warehouse. I’d say my average size is 50 MBs. We had to upgrade everyone’s office PCs so my colleagues could open my spreadsheets; 16GB of RAM isn’t enough if you’re a heavy Excel user (or know someone who is).

    Which is kind of some cognitive dissonance at play. I do stuff between Python and Excel at work. Python for when I'm doing something personal that doesn't need to be shared with others (I'm the only one on my team who knows Python) and Excel for distribution. The times I have to work in Excel can be kind of awful depending on the task I'm doing. For example, merging 10-20 data sets together using Power Query and then building out Power Pivot data models can be extremely resource intensive. My computer chugs and if Excel crashes and I forgot to save 10 minutes ago, that's it. I just lost 10 minutes worth of work and I have to figure out exactly what happened. Whereas on my old 10+ year old desktop at work, I was able to build a turnover predictive model using XGboost and sure, it sometimes chugged, but if Python crashed (it never did), I still had my code. Plus on the Excel side, I have found debugging and troubleshooting errors kind of painful compared to what is available for Python modules.

    Also, I could not imagine how much a pain in the ass it is to set up web scraping in Excel. Like can you? Yeah. Should you? Probably not.

    In all, this is someone who has bought into the ecosystem so much that they might be overstating how much better Excel is than SQL or Python or R. It is a powerful tool, but once you start getting past basic formulas and pivot tables, you might as well be creating tech debt equivalent to one of my one-off Python scripts.

    8 votes
    1. Greg
      Link Parent
      Excel absolutely has its place, it's a great point and click way to play around with data up to the few thousand rows by tens of columns kind of size, but anyone who takes it beyond that strikes...

      Excel absolutely has its place, it's a great point and click way to play around with data up to the few thousand rows by tens of columns kind of size, but anyone who takes it beyond that strikes me as the extreme power user version of this.

      Some of the things they come up with are genuinely impressive in their way, but the effort they've invested learning to do things the slow, hard, compute-intensive way could absolutely have been better spent learning R or Python. We've got notebooks now, and have done for a good while, so even the CLI learning curve isn't required - and while not everyone is suited to coding, anyone who can force themselves to understand VBA or a deep Excel if/then stack can definitely handle it.

      Leave Excel to what it's good at: relatively quick reporting and exploration, and ad hoc semi structured data input for not-specifically-technical office workers, and invest the up front time to learn the right tools when you find yourself going beyond that. It's the "when all you have is a hammer..." logic, but they're doubling down and insisting that it's easier for them to keep hammering screws rather than learning to use a screwdriver.

      6 votes
    2. [2]
      arghdos
      (edited )
      Link Parent
      Or just like… use pandas and: heck, you can even dump to a csv if you really wanna open it up in Excel and poke around. Woof, that’s a strong nope from me. Last time I touched VBA I found it a...

      So we built spreadsheets that interact with the database (using macros), allowing a person to spot-check and correct various abnormalities once per week in just 10 minutes for all 3,000 shopping centers.

      Or just like… use pandas and:

      print(df.name.unique())

      heck, you can even dump to a csv if you really wanna open it up in Excel and poke around.

      In my opinion, VBA is what makes Excel the best programming tool in the world

      Woof, that’s a strong nope from me. Last time I touched VBA I found it a horrible, undocumented mess. Maybe it’s gotten better?

      5 votes
      1. Weldawadyathink
        Link Parent
        Nope, VBA is as bad as ever. Although I have heard that Microsoft is adding python into the office suite eventually. It is supposed to act as an alternative to VBA, so it would still be embedded...

        Nope, VBA is as bad as ever.

        Although I have heard that Microsoft is adding python into the office suite eventually. It is supposed to act as an alternative to VBA, so it would still be embedded into the spreadsheet file. I will be super excited if/when this happens.

        6 votes