20 votes

As of Python 3.7, dictionary order is guaranteed to be insertion order

20 comments

  1. [2]
    teaearlgraycold

    I like how Go's maps not only don't guarantee that insertion order is maintained, but also randomize the hashing order on each run, so you can't depend on hash order either.

    11 votes
    1. spit-evil-olive-tips

      Almost all languages do randomized hashing now; it's a mitigation against a large class of denial-of-service attacks: http://ocert.org/advisories/ocert-2011-003.html

      (tl;dr is that if you insert user-provided data as keys into a hashtable, a malicious user can craft keys that they know will cause hash collisions and cause the hashtable performance to degrade to essentially the same as a linked list)

      eg:

      ~> python3
      Python 3.7.5 (default, Oct 14 2019, 23:08:55) 
      [GCC 8.3.0] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> hash('a')
      -5276623004311757763
      >>> 
      ~> python3
      Python 3.7.5 (default, Oct 14 2019, 23:08:55) 
      [GCC 8.3.0] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> hash('a')
      -6615835358204783878
      >>> 
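
      To make the collision problem itself concrete, here is a minimal sketch. It assumes a made-up key class (CountingKey) whose hash can be forced to a constant; a real attacker would instead craft strings that happen to share a hash, but the effect on the table should be similar. The helper name comparisons_for_inserts is also just for illustration.

```python
class CountingKey:
    """Hypothetical dict key that counts how often it is compared."""

    eq_calls = 0  # shared counter, reset before each experiment

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # A constant hash stands in for attacker-crafted colliding keys.
        return 0 if self.collide else hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return self.value == other.value


def comparisons_for_inserts(n, collide):
    """Insert n distinct keys into a dict and return how many == checks ran."""
    CountingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CountingKey(i, collide)] = i
    return CountingKey.eq_calls


if __name__ == "__main__":
    print("well-spread hashes:", comparisons_for_inserts(500, collide=False))
    print("all-colliding hashes:", comparisons_for_inserts(500, collide=True))
```

      With well-spread hashes the dict should need few or no equality checks, while the all-colliding case has to compare each new key against the ones already stored, so the work grows roughly with the square of the number of keys.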
      
      10 votes
  2. [15]
    vakieh

    What is there to get used to? It's still kinda silly to rely on, as it makes your application weaker against old versions and flavours other than CPython, and any developer who isn't Python-only is going to be in the habit of not relying on an ordered map, since maps usually aren't ordered.

    This appears to be more about having Python be more robust against programmers who don't know what they're doing, which is a losing game since they're just going to screw up someplace else.

    2 votes
    1. [2]
      mrbig

      This appears to be more about having Python be more robust against programmers who don't know what they're doing

      Isn’t that one of the main reasons why Python was created in the first place?

      9 votes
      1. blitz

        I used to work at the company where Python was invented. If I've learned anything from talking to the greybeards who worked there at the time (they're still a fairly close-knit group!), it's that the history behind Python is fairly complex. Maybe simplicity was one of the goals, but it certainly wasn't the only one; I would be careful ascribing motives like that to the language.

        Aside: Python was used as the first client-side executed server script through a browser called Grail (as in Holy, as in Monty Python's). We almost had Python in the browser instead of JavaScript! What a different world we developers would be in. Unfortunately, due to circumstance and internal disagreements at CNRI, it was not to be.

        13 votes
    2. [12]
      blitz

      There’s a faction in the Python community that fights any change to Python that would make it “harder for newbies to learn”. This faction was a large opponent of the walrus operator because they felt that Python was moving away from being a newbie-friendly language.

      I kind of understand where they’re coming from, but as a person whose income comes primarily from writing Python, I feel like my interests are diametrically opposed to theirs. I think that if Python is supposed to be used professionally, it can’t make concessions for newbies, and if Python targets newbies then it won’t have the things I need for professional use. Thankfully it seems that Python is gaining features that are geared more towards professionals (static types, walrus op, etc).

      I definitely feel like ordered dicts by default is a win for the newbie camp, and I agree with you that the people who rely on this will likely encounter other problems down the line.

      3 votes
      1. [2]
        Deimos

        Through total coincidence, I was actually just reading this over-6-months-old Twitter thread by Hynek Schlawack on this exact subject a few minutes ago: https://threadreaderapp.com/thread/1155434973807218696.html

        5 votes
        1. blitz

          Yep, I've been going to PyCon for the past 5 years and I've noticed a shift in the kinds of people who attend. I've even mostly stopped going to talks; I just try to find insightful people like Hynek and pick their brains in the hallway. I hope I'm not annoying them! D:

          3 votes
      2. [9]
        vaddi

        I understand your opinion, but I often wonder why people always want stuff to grow infinitely regarding features. Why do programming languages always need to try to implement their own version of feature x that was copied from language y? Wouldn't it be much better if each language focused on its use case domains and people used languages more like unix tools? Everybody seems to want to reinvent the wheel instead of trying to maintain and improve old implementations.

        Take for instance Numpy. OK, I understand that when it was invented and/or started gaining popularity, there wasn't a free alternative to it. (Matlab is not free and it isn't really a programming language. Octave is a free, worse Matlab. GNU R is more statistics oriented.) However, if we really look into Numpy code, it is super verbose and ugly and in my opinion defeats the purpose of Python. I understand that sometimes projects have to piggyback on other projects. But I'm tired of seeing people do horrific stuff in Python because they learned Numpy and Matplotlib and Dataframes before really learning Python.

        I see Python as a glue language, maybe a shell on steroids, to be used when writing a shell script would end up with lots of lines but we still don't want to write C or stuff like that. Also as a good language for prototyping stuff.

        I don't think that, for example, Dropbox is moving from Python because the language is not evolving; I think that they are moving from it because they already "prototyped" their product and now want a really stable and efficient thing.

        Another thing that I'm against in this industry is that implementing new features always seems to take priority over optimizing what is already done. And by optimizing I don't just mean computing resource consumption. Take, for instance, the official Python documentation: many pages need improvement but nobody seems to care. Also, old code in the standard library doesn't follow PEP 8 and is not even idiomatic Python. It is a real pain in the ass for newcomers to dig into stuff when they are having trouble with things that aren't the flavour of the month, like whatever package was released last year.

        2 votes
        1. [2]
          cge

          However, if we really look into Numpy code, it is super verbose and ugly and in my opinion defeats the purpose of Python. I understand that sometimes projects have to piggyback on other projects. But I'm tired of seeing people do horrific stuff in Python because they learned Numpy and Matplotlib and Dataframes before really learning Python.

          Numpy is useful for types of programming that are distinct from what many people do, with different priorities. I think that there can be clashes and frustration when these different worlds intersect poorly. From my perspective: likely the majority of the Python code I write with Numpy (at least by volume), once it is finished and debugged, will be run once. Writing code that is very maintainable and readable isn't just unimportant in such situations, but wasteful. The algorithms also need to be fast, and, while I mention some problems with Numpy in that regard below, for run-once code, that often means the code is going to look a bit horrific. It's just a different use of the language.

          But for me, one of the best things about Numpy was that it was built on a well-supported general purpose programming language with an established ecosystem, rather than going the common research route of insisting on implementing something new and specific. I came from Matlab and Mathematica, but R, Octave, and others all have the same problem: for whatever support they have for numerical work, they aren't great programming languages. They make odd, idiosyncratic choices and limitations built around assumptions of how they're going to be used, often in the name of convenience, that end up being frustrating and confusing (eg, with variable scope, indexing, etc). They usually have great libraries for numerical work, but for nothing else. I can easily make nice plots in Mathematica. But what happens when I want to write a program that processes some data and has a convenient CLI? Or when a machine unhelpfully provides its data as a webpage in something that's supposedly HTML? Or when I want to take some of my code, clean it up, and make it into something others can easily install and use?

          To some extent, the research community tried the "each language focused on their use case domains and people used languages more like unix tools" approach. It was awful. It's hard to guess what the bounds of those domains are, and the guesses were usually wrong, giving us many tools, each of which was poorly suited to what we were doing.

          Python is, as you note, a bit like a "shell on steroids", and that's often what people doing numerical algorithms need. Numpy lets us use it that way. And yes, that means people are going to write some bad code, because they're writing quickly, and their code doesn't need to be good.

          There are some fundamental frustrations with Numpy, though. Python is an enormously sluggish language for the sort of tight-loop work that is common in numerical algorithms. Numpy tries to address the problem through a very vectorized API, but this means that, in order for algorithms to be fast, they need to be wrangled into a completely vectorized form: a single tight loop will make everything hundreds of times slower. For algorithms that are complicated, while they often can be wrangled into Numpy-compatible forms, the code can end up being extremely hard to understand. In some cases, I have baffling series of Numpy array operations with comments along the lines of "this code is actually equivalent to this loop". In others, when I've replaced Python+Numpy code with Numpy C code, the C code has been clearer.

          5 votes
          1. vaddi

            I agree with everything you said. However, I can't help but feel that having a language inside another language, each with its own documentation that requires the user to go to a distinct place (because almost everyone reads documentation on the web instead of man pages, and that's fine), helps perpetuate the idea that

            people are going to write some bad code, because they're writing quickly, and their code doesn't need to be good.

            Scientific code does not have to be bad, and should not be. Academics should start worrying about sharing their code alongside their research papers. That requires the code to be well written, and another person should be able to run it. But if you only learn Numpy and Matplotlib, you cannot write good Python scripts. To write good Python scripts, you have to learn Python. So in the end you are learning two languages, a domain-specific one and a general-purpose one.

            2 votes
        2. [2]
          blitz

          Why do programming languages always need to try to implement their own version of feature x that was copied from language y? Wouldn't it be much better if each language focused on its use case domains and people used languages more like unix tools? Everybody seems to want to reinvent the wheel instead of trying to maintain and improve old implementations.

          I think the biggest reason for this is that programming is still a very new activity. We really don't know how to do things well yet. New ideas are introduced in one language and other languages adopt them if they can see a value. It's way too soon to announce that a language is "done," because we don't really know what they can do yet.

          The zeitgeist also changes. Static typing and formal type systems have been around since the 80's, but only now are they really gaining mindshare with most developers.

          I see Python as glue language, maybe a shell on steroids, to be used when writing a shell script would end up with lots of lines but we still don't want to write C or stuff like that. Also as a good language for prototyping stuff.

          That's fine, and many people use it this way, but languages are general things that have more than the one use case. Many people implement complex software in Python because it's at an abstraction level they're comfortable with. It's not "wrong" to use Python for complex software if it meets your requirements.

          2 votes
          1. vaddi

            Many people implement complex software in Python because it's at an abstraction level they're comfortable with. It's not "wrong" to use Python for complex software if it meets your requirements.

            I agree with that. It is not "wrong" to use any language; we use the one that is simpler and does the job. But I think that people who become advanced at simple languages tend to try to add complexity to that language because they dedicated a lot of time to it. However, when newcomers arrive, what was once a simple language is now a complicated one.

            This is, of course, a theoretical opinion taken to the extreme. I don't think that current Python is nearly as complicated as, let's say, C++. But I fear that as new ways of writing the same logic are added, we might get there.

            1 vote
        3. [2]
          Micycle_the_Bichael

          Why do programming languages always need to try to implement their own version of feature x that was copied from language y?

          Because learning new languages fucking sucks. Right now for my role I need to know/learn

          1. 2 frameworks of JS for different GUIs
          2. Python 2 for our scripts that are monitors on CentOS servers, because Python 2.7 is what CentOS uses as system Python
          3. Python 3, because Python 2 is at EOL and we need to prepare to shift.
          4. Golang, because of the Kubernetes tooling we are building and because Python sucks for a lot of tasks.
          5. Perl, because a lot of Icinga/Nagios checks are in Perl or Python
          6. Bash, because shell scripts
          7. Ruby (ish), because Puppet.
          8. PHP, because we have backends and webapps built long enough ago that they use PHP 5.4

          It's insane. The spin-up time for me was long, and it's even longer for all the guys on our team who have a sysadmin background and little-to-no programming experience. One of my biggest goals for our team is to hack down this list. Every programming language has its own pros and cons. But the more general purpose a language is, the more tasks a single developer can do, because then at least they only need to get domain knowledge, not domain and language knowledge. Is every language right for every task? No. Do I understand why people who like a language want to be able to do more things in that language? Hell yeah.

          2 votes
          1. vaddi

            Because learning new languages fucking sucks.

            What about the case where someone learns a language, stops using it for some reason, and on returning finds that the idiomatic way of writing it is completely different?

            1 vote
        4. [2]
          arghdos

          I see Python as glue language, maybe a shell on steroids, to be used when writing a shell script would end up with lots of lines but we still don't want to write C or stuff like that. Also as a good language for prototyping stuff.

          You seem to be missing that Numpy is exactly the glue language you like Python for, but for all scientific computing written in Python. It makes something that would be many C/C++ lines quick to write and easy to prototype.

          1 vote
          1. vaddi

            Maybe... I see it more like an interface.

  3. [2]
    The-Toon

    I'm not an experienced programmer, but why does this matter? A large number of uses are unaffected, and if it does matter it's easy enough to randomize the dictionary. Then again, I don't see why this change was necessary either, with collections.OrderedDict existing.
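
    For what it's worth, here is a rough sketch of both points, assuming Python 3.7 or later. The helper shuffled_dict and the sample data are made up for illustration; random.shuffle, OrderedDict equality and move_to_end are standard library behaviour.

```python
import random
from collections import OrderedDict


def shuffled_dict(d, rng=None):
    """Return a new dict with the same items in a random insertion order."""
    items = list(d.items())
    (rng or random).shuffle(items)
    return dict(items)


scores = {"alice": 3, "bob": 1, "carol": 2}  # made-up sample data
mixed = shuffled_dict(scores)

# Plain dicts ignore order when compared, even though they remember it.
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}

# OrderedDict still differs: its equality is order-sensitive, and it has
# reordering helpers such as move_to_end().
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
od = OrderedDict(a=1, b=2)
od.move_to_end("a")
```

    So the shuffled copy still compares equal to the original dict, while OrderedDict remains the tool for code that cares about order as part of the value.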

    2 votes
    1. Deimos

      It was initially more of a side effect than something they were doing deliberately. Python 3.6 had a major re-implementation of the dictionary type that reduced its memory usage significantly, and the new approach happened to maintain insertion order. It may also have had some relation to PEP 468, maintaining the ordering of **kwargs (which is a dict of the keyword arguments to a function), but I'm not sure if it was intended or just usable for both.

      From the "What's new in Python 3.6" docs:

      The dict type now uses a “compact” representation based on a proposal by Raymond Hettinger which was first implemented by PyPy. The memory usage of the new dict() is between 20% and 25% smaller compared to Python 3.5.

      The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5).

      (Contributed by INADA Naoki in bpo-27350. Idea originally suggested by Raymond Hettinger.)

      Then for 3.7, they decided to formalize it as something you could rely on. That email thread is discussing the comparison of the new dicts and OrderedDict, if you're interested in reading more into it.
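
      A minimal sketch of what the guarantee looks like in practice, assuming Python 3.7 or later; the function keyword_names and the keys are made up for illustration:

```python
def keyword_names(**kwargs):
    """Return keyword argument names in the order the caller wrote them."""
    return list(kwargs)


d = {}
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3
first_order = list(d)  # keys come back in insertion order, not sorted

d["zebra"] = 10  # updating an existing key keeps its position
del d["apple"]
d["apple"] = 20  # deleting and re-adding moves the key to the end
second_order = list(d)

kw_order = keyword_names(z=1, a=2, m=3)  # PEP 468: call order is preserved
```

      Note that "insertion order" means the order in which keys were first added, so overwriting a value does not move its key.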

      9 votes