14 votes

Tesla recalls cars with eMMC failures, calling chips soldered to a motherboard a "wear item"

8 comments

  1. [8]
    spit-evil-olive-tips
    Link
    Previous discussion

    This has some new facepalm-y details:

    Due to the car’s Linux operating system logging excessively to its 8 GB eMMC storage, the flash modules have been wearing out. This leads to widespread failures in the car, typically putting it into limp mode and disabling many features controlled via the touchscreen.

    With the issue affecting important subsystems such as the heater, defroster, and warning systems, the NHTSA wrote to the automaker in January requesting a recall. Tesla’s response acquiesced to this request with some consternation, downplaying the severity of the issue. Now they are claiming that the eMMC chip, ball-grid soldered to the motherboard, inaccessible without disassembling the dash, and not specifically mentioned in the owner’s manual, should be considered a “wear item”, and thus should not be subject to such scrutiny.

    Tesla’s assertion that the eMMC chip should be considered a ‘wear item’ is a dubious one at best. Flash memory does wear out, it’s true, as Tesla points out when discussing the limits of the technology. Many parts on a modern car wear out over time – brake pads, belts, and air filters are all common examples. The difference is that these parts are all designed to be replaced by the end user or a typical mechanic.

    Trying to claim that a ball-grid array chip, permanently soldered onto a PCB and buried inside the dashboard, is a wear item is patently ludicrous. If it were, we’d expect to see several things. There’d be a recommended time and mileage at which the eMMC would be changed to avoid surprise failures, and this would be listed in the manual. Additionally, Tesla’s repair process would involve desoldering the eMMC chip from the board and replacing it directly. That Tesla is instead replacing the computers as a whole indicates that the part is not being treated as a wear item by anyone, anywhere.

    15 votes
    1. [7]
      streblo
      Link Parent
      Due to the car’s Linux operating system logging excessively to its 8 GB eMMC storage, the flash modules have been wearing out. This leads to widespread failures in the car, typically putting it into limp mode and disabling many features controlled via the touchscreen.

      What the actual hell? Using tmpfs for /var/log is pretty standard in embedded linux.

      8 votes
      1. [6]
        AugustusFerdinand
        Link Parent
        What the actual hell? Using tmpfs for /var/log is pretty standard in embedded linux.

        Can you expand on this? I'm not sure I understand the point you're trying to make.

        5 votes
        1. [3]
          streblo
          Link Parent
          When you're dealing with flash memory, you only have so many write cycles across the lifetime of the module. Every write to flash has to be budgeted; when you absolutely need to write to flash, you usually queue batches of writes and flush them periodically.

          Traditionally, you have some sort of rotating log system to store log files from the kernel and your applications, usually found in /var/log. Unless you're very cautious about what you're logging, this is usually a bad idea with flash memory. And even if you are careful, logs aren't that useful to the user in an embedded Linux product -- they may have some use in debugging an RMA, but you can generally find the problem without stored logs.

          So in general most people opt to mount /var/log as tmpfs, temporary storage in the device's memory. You lose all logs on a reboot, but the writes to /var/log are just writes to your RAM so it has no effect on the eMMC lifecycle.
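
          To make the "queue batches of writes and flush them periodically" idea concrete, here is a minimal sketch in Python. The class name, the thresholds, and the in-memory sink are all made up for illustration; nothing here is known to reflect Tesla's software, and the number of physical flash writes also depends on the filesystem and the flash controller.

```python
import io
import time


class BatchedLogWriter:
    """Buffer log lines in RAM and write them to the sink in batches.

    Hypothetical helper for illustration only.
    """

    def __init__(self, sink, max_lines=100, max_age_s=60.0, clock=time.monotonic):
        self.sink = sink            # file-like object, e.g. a log file on flash
        self.max_lines = max_lines  # flush once this many lines are queued
        self.max_age_s = max_age_s  # or once this much time has passed
        self.clock = clock
        self.buffer = []
        self.last_flush = clock()
        self.flush_count = 0        # number of batched writes issued so far

    def log(self, line):
        """Queue one line; flush only when a threshold is reached."""
        self.buffer.append(line)
        too_many = len(self.buffer) >= self.max_lines
        too_old = self.clock() - self.last_flush >= self.max_age_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Write all queued lines in one call and reset the timer."""
        if self.buffer:
            self.sink.write("\n".join(self.buffer) + "\n")
            self.sink.flush()
            self.flush_count += 1
        self.buffer.clear()
        self.last_flush = self.clock()


sink = io.StringIO()  # stand-in for a file on flash storage
writer = BatchedLogWriter(sink, max_lines=100, max_age_s=3600.0)
for i in range(1000):
    writer.log(f"event {i}")
writer.flush()  # write out anything still queued
print(writer.flush_count, "batched writes for 1000 log lines")
```

          The trade-off is the same one as with tmpfs, only smaller: anything still sitting in the buffer is lost if power drops before the next flush.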

          21 votes
          1. [2]
            AugustusFerdinand
            Link Parent
            Thank you to you and @whbboyd both!

            I knew about the write cycle life and gathered that logs are usually written to RAM instead of storage, however I'm not sure I'm completely gathering how big of a blunder this is or if it was avoidable. And without knowing what Tesla is writing in those logs/to this specific eMMC this may not be a question that can be answered here.

            By and large modern cars, and Tesla especially, have quite a lot of "black box" monitoring going on, storing states, changes, etc. These are generally used for data recovery of what happened during a crash, but I would imagine that much more than a dump-to-storage-upon-crash is being done at Tesla. So I guess my question/lack of understanding has two parts:

            1. Is /var/log Linux only info or also logs pertaining to the car's state?

            2. Is /var/log even being written to this chunk of eMMC or is it the other logging that's the issue?

            Tesla is known to store videos while the car is being driven and while parked/detecting motion, but as I understand it those are stored on an external thumb drive you have to provide and I can't find other info on how much space is actually in the car's computer from the factory.

            4 votes
            1. streblo
              Link Parent
              And without knowing what Tesla is writing in those logs/to this specific eMMC this may not be a question that can be answered here.

              I mean we definitely don't have the complete story and I'm not sure where Hackaday is sourcing their claim from -- perhaps they popped a failed chip onto another board and were able to view its contents, but I didn't see where that was specified.

              It's certainly possible that Tesla felt they needed to have this excessive logging -- however, as @whbboyd points out, it's possible to meet those requirements if they were planned for. It's also possible that they fully anticipated their flash requirements but have a software bug that's causing more writes to flash than they anticipated. However, I think any way you slice it there is some level of incompetence or negligence on their behalf if flash is burning out after a few years.

              Is /var/log Linux only info or also logs pertaining to the car's state?

              Don't read too much into /var/log -- it's just the most likely location of their log files. They could be anywhere, but in general they aren't going to be stored with pertinent state. In general, estimating your flash lifetime is an achievable task -- it should be a basic part of the requirements engineering for a product at this level.
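
              As a rough illustration of what "estimating your flash lifetime" can look like, here is a back-of-the-envelope sketch in Python. Only the 8 GB capacity comes from the article; the program/erase cycle rating, the daily write volume, and the write amplification factor are made-up inputs, not Tesla's actual figures, and the model assumes ideal wear leveling across the whole chip.

```python
def estimate_flash_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                                  write_amplification=1.0):
    """Estimate years until the rated write endurance is used up.

    Simplified model: total endurance = capacity * rated P/E cycles,
    consumed at (host writes per day * write amplification).
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write volume and amplification must be positive")
    total_endurance_gb = capacity_gb * pe_cycles
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    days = total_endurance_gb / flash_writes_gb_per_day
    return days / 365.0


# Hypothetical inputs: 8 GB part (from the article), 3000 P/E cycles,
# 5 GB/day of host writes, write amplification of 4 (all three assumed).
years = estimate_flash_lifetime_years(
    capacity_gb=8,
    pe_cycles=3000,
    host_writes_gb_per_day=5,
    write_amplification=4,
)
print(f"estimated lifetime: {years:.1f} years")
```

              Plugging in measured write volumes and the endurance rating from the part's datasheet turns this into a budget you can check logging changes against.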

              9 votes
        2. [2]
          whbboyd
          Link Parent
          If you are putting Linux on an embedded-ish system, and your development team does not consist entirely of rank amateurs, you will be aware of this issue and put most of your logs into memory rather than burning flash write cycles on them. (Or possibly make the writable medium removable and easily replaceable, or massively overprovision flash so write exhaustion takes decades rather than years, or make sure your logging volume is low enough that write exhaustion takes decades rather than years. There are plenty of options. The fact that this would be a problem is basic knowledge in the field, and so either Tesla knew about it and chose not to do anything, or their entire engineering staff actually are rank amateurs.)

          edit: To explain the technical terms, "tmpfs" is a filesystem which is backed by memory, not any real storage, and so can be used to make "logging to memory" transparent to programs; and /var/log is the directory to which logs are traditionally written.
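
          If you want to see which filesystem backs a directory like /var/log on a Linux box, one way is to read the mount table. This is a small sketch in Python; the function name and the sample mount table are invented for illustration, and on a real system you would pass in the contents of /proc/mounts instead.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` is text in /proc/mounts format:
    device, mount point, fs type, options, dump, pass.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # The longest matching mount point wins; on a tie, the later
            # entry wins because it shadows the earlier mount.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Made-up mount table for illustration: root on eMMC, /var/log in RAM.
SAMPLE_MOUNTS = """\
/dev/mmcblk0p2 / ext4 rw,noatime 0 0
tmpfs /var/log tmpfs rw,nosuid,nodev,size=16384k 0 0
"""

print(filesystem_type("/var/log/messages", SAMPLE_MOUNTS))
print(filesystem_type("/etc/hostname", SAMPLE_MOUNTS))
```

          On a live system the call would look like filesystem_type("/var/log", open("/proc/mounts").read()); a result of "tmpfs" means log writes land in RAM rather than on flash.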

          15 votes
          1. Akir
            Link Parent
            Precisely. This kind of mistake is what I would consider to be "defective by design". This whole regulatory issue could have been circumvented if they just used a microSD card instead of an eMMC module.

            8 votes