6 votes

JPMorgan's Athena has 35 million lines of Python code, and won't be updated to Python 3 in time

11 comments

  1. Moonchild
    Python 2.7.17 is 1.25 million lines of code. That's about 3.6% of the size of Athena's codebase. It would be easier to maintain Python 2 indefinitely than to migrate their codebase, and they would be less likely to run into bugs as part of the migration process.
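
    A quick sanity check of that ratio, as a sketch that uses only the two approximate line counts quoted in this thread (the variable names are just for illustration):

```python
# Both figures are the approximate counts quoted in the thread, not measurements.
python_27_loc = 1_250_000   # Python 2.7.17
athena_loc = 35_000_000     # JPMorgan's Athena

# Size of the interpreter's codebase as a percentage of Athena's.
ratio_percent = python_27_loc / athena_loc * 100
print(f"{ratio_percent:.1f}%")
```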

    11 votes
  2. tindall
    JP Morgan had ten years to mitigate a major, obvious risk and chose not to do so - and, indeed, chose to open this codebase in a language they knew was going to no longer be supported quite soon. If they would prefer to fund the continued maintenance of Python 2.7, they should do so - otherwise, this is entirely their own fault.

    10 votes
  3. [2]
    skybrian
    Although it's good to keep up with what's going on outside the company, I doubt the deadline itself matters all that much to them. They could put a few people on maintaining their own fork of Python 2, backporting any security fixes needed.

    6 votes
    1. The-Toon
      Maybe they can update Tauthon? It seems to be good for something like this but it doesn't seem to be active anymore.

      3 votes
  4. [3]
    Akir
    It's software in the financial industry. They are famous for only needing to update and upgrade when absolutely necessary. This news doesn't surprise me at all.

    That being said, launching this in 2018 on Python 2 seems like they were asking for problems.

    6 votes
    1. [2]
      sigma
      I'm honestly more surprised they wrote a trading platform primarily in Python as opposed to C++ or C or Java like any of the major algo trading firms would have.

      5 votes
      1. sqew
        Yeah, fascinating that they either don't need more speed than Python can provide or have figured out some way to optimize the interpreter to get what they need. Wish they had a good engineering blog so we could find out what's going on.

        2 votes
  5. [4]
    thundergolfer
    35 million lines of Python 2?! Jesus.

    Dropbox likely has much less than 35 million LoC in Python, but still a lot, and they've done all kinds of advanced work on their Python codebase to have it scale.

    I wonder what technologies JP Morgan is using or built to handle such a massive Python codebase.

    5 votes
    1. [3]
      sigma
      They were apparently counting modules and everything, which I am sure was loaded into the system and frozen to make sure random updates didn't break a live trading platform.

      Other than the main trading engine, there are regulatory checks, risk checks, feasibility checks, interacting with bank APIs, interacting with exchange APIs, database maintenance, etc. Dropbox is a complex platform for sure, but it doesn't operate with a bajillion different APIs and doesn't have regulations that it has to comply with and risk checks to carry out.

      4 votes
      1. [2]
        thundergolfer
        Dropbox is a complex platform for sure, but...

        Oh yeah I recognise that, what I'm saying is that even while having a far smaller Python codebase with probably lower complexity, they've still been advancing the Python state-of-the-art with technologies like MyPy and Bazel. Dropbox had the creator of Python himself working to make their massive Python codebase good.

        If Dropbox is doing all that, what is JP Morgan doing to stop their codebase turning into a massive ball of mud?

        1. sigma
          Because they have to interact with different vendors' APIs. Dropbox's product is all their own stuff; they don't have to go to another company outside their own to get stuff, process it, put it in a database, etc.

          For a trading platform, even basic things like depositing or withdrawing money on an account require writing an entire backend that interfaces with the banking system. I also need to write in regulatory checks, because the SEC actually has a backbone and will smack JPM with a bill if they don't, and those checks aren't trivial either.

          Multiply this by 100, as I need to interact with the exchanges (multiple), each with their own API, their own way of consuming and emitting data. I need to clean data from every single one, and I need to transform data into their preferred way of consuming it. I need to do regulatory checks, legal checks, etc etc etc etc.

          By the by, most of the companies doing this aren't as tech savvy as JPM, let alone some of the high class algo traders, so their data formatting is non-trivially garbage in ways that make Google/Facebook/Dropbox/Twitter APIs look amazing. FIX processing alone is a nightmare, and you have to build an entire engine just for that. Add on to that some soft real time constraints and interfacing with components that have soft real time refresh rates like Bloomberg Terminal and bam, 35 million lines of code. And because these standards change all the time, you need to rewrite stuff all the time.
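
          To give a sense of what even the simplest slice of FIX handling looks like, here is a minimal sketch that splits a tag=value message and verifies its checksum. The helper names (`fix_checksum`, `parse_fix`, `build_sample`) and the sample order are made up for illustration and say nothing about how Athena does it. A real engine also has to handle sessions, sequence numbers, repeating groups, and every venue's quirks, which is where the line count comes from.

```python
SOH = "\x01"  # FIX separates fields with the ASCII SOH control character


def fix_checksum(head: str) -> str:
    """Sum of all bytes before the CheckSum field, modulo 256, as three digits."""
    return f"{sum(head.encode('ascii')) % 256:03d}"


def parse_fix(raw: str) -> dict:
    """Split a tag=value message into {tag: value} and verify tag 10 (CheckSum).

    Simplification: a plain dict cannot represent repeating groups,
    where the same tag appears more than once.
    """
    idx = raw.rfind(SOH + "10=")
    if idx == -1 or not raw.endswith(SOH):
        raise ValueError("missing or malformed CheckSum trailer")
    head = raw[: idx + 1]          # everything up to and including the SOH before 10=
    expected = raw[idx + 4 : -1]   # the three checksum digits
    if fix_checksum(head) != expected:
        raise ValueError("checksum mismatch")
    fields = {}
    for pair in raw.rstrip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields


def build_sample() -> str:
    """Hypothetical, deliberately incomplete order message for demonstration."""
    # 35=D (NewOrderSingle), 55=Symbol, 54=Side (1 = buy), 38=OrderQty
    body = SOH.join(["35=D", "55=JPM", "54=1", "38=100"]) + SOH
    head = f"8=FIX.4.2{SOH}9={len(body)}{SOH}{body}"
    return f"{head}10={fix_checksum(head)}{SOH}"
```

          Calling `parse_fix(build_sample())` returns the fields keyed by integer tag, and changing any byte of the message without recomputing tag 10 raises `ValueError`.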

          I am fairly confident, although not certain, that JP Morgan preferentially routes their orders to their in-house OTC desk. Because the trading desk existed before this trading platform, they have to write code to transform data into forms that conform with the existing JP Morgan framework on that desk, and sometimes from desk to desk depending on the security.

          Dropbox has it easier in this regard because they just do their own thing. I presume most of the data they use are data they produce, so they can set engineering standards internally to avoid massive databases and data transformations and ingestion engines and output engines. Dropbox doesn't really work in real time in any sense, and isn't latency sensitive in ways that would produce extremely obtuse code for the sake of optimization. I'm sure Dropbox interfaces with Google/Facebook/Twitter/Apple APIs, but honestly at least they are somewhat sanely engineered.

          1 vote