7 votes

Fortnightly Programming Q&A Thread

General Programming Q&A thread! Ask any questions about programming, answer the questions of other users, or post suggestions for future threads.

Don't forget to format your code using the triple backticks or tildes:

Here is my schema:

```sql
CREATE TABLE article_to_warehouse (
  article_id   INTEGER
, warehouse_id INTEGER
)
;
```

How do I add a `UNIQUE` constraint?

21 comments

  1. [21]
    mtset
    Link
    Have you ever faced the problem of taking a float or widely-varying int value and constraining it to a handful of discrete values? What did you use to do that? Did you like it? I'm working on a...

    Have you ever faced the problem of taking a float or widely-varying int value and constraining it to a handful of discrete values? What did you use to do that? Did you like it?

    I'm working on a project that does this right now and I'm having a heck of a time designing an API.

    5 votes
    1. [15]
      teaearlgraycold
      Link Parent
      So a function that just maps from a float to an enum?

      So a function that just maps from a float to an enum?

      3 votes
      1. [14]
        mtset
        Link Parent
        Or from a float or int to a restricted set of floats/ints. A great example is a Eurorack module that quantizes voltage to a scale, so you're only playing notes in key. The use case at work is...

        Or from a float or int to a restricted set of floats/ints. A great example is a Eurorack module that quantizes voltage to a scale, so you're only playing notes in key. The use case at work is binning ages into "child", "adolescent", "young adult", etc.

        A big if statement is fine if you're only going to do it for one value but it gets tedious and brittle for many.

        2 votes
        1. [11]
          teaearlgraycold
          Link Parent
          Do you intend to always go with the naive process of evenly chunking the domain and mapping each chunk to a value in the range? If not then there is no generic way to implement this and I would...

          Do you intend to always go with the naive process of evenly chunking the domain and mapping each chunk to a value in the range? If not then there is no generic way to implement this and I would just write a function for each use case. But if you do go with the naive way then I would recommend something like:

          function bin<T>(input: number, input_max: number, input_min: number, outputs: T[]): T {
            const normalized_input =  input / (input_max - input_min) - input_min; // In range of 0..1
            return outputs[Math.floor(normalized_input * outputs.length)];
          }
          

          Note that in the above code input_max is a non-inclusive boundary. If input were to equal input_max then you would index past the last element of outputs.

          4 votes
          1. [10]
            mtset
            Link Parent
            This is a really good approach for that case, thank you! I'm thinking also about cases where there are overlapping ranges, though - I was going to use something like an interval tree internally,...

            This is a really good approach for that case, thank you! I'm thinking also about cases where there are overlapping ranges, though - I was going to use something like an interval tree internally, but then I need an API for specifying precedence.

            1 vote
            1. [9]
              teaearlgraycold
              (edited )
              Link Parent
              I think in that case iterating over a vector of ranges will work just fine. The order in the vector defines precedence. Edit: For example: /// Bin an input (of the D/domain type) into one of many...

              I think in that case iterating over a vector of ranges will work just fine. The order in the vector defines precedence.

              Edit:

              For example:

              /// Bin an input (of the D/domain type) into one of many buckets, returning the
              /// bucket's corresponding value (of the R/range type).
              fn bin<D: PartialOrd, R>(input: D, buckets: Vec<(Range<D>, R)>) -> Option<R> {
                  for (bucket, value) in buckets {
                      if bucket.contains(&input) {
                          return Some(value);
                      }
                  }
                  
                  None
              }
              
              2 votes
              1. [8]
                mtset
                Link Parent
                Yeah, absolutely. What I'm interested in is designing the API - would it be better to produce a closure from these ranges? A mutable collection of ranges you can change over time? Etc.

                Yeah, absolutely. What I'm interested in is designing the API - would it be better to produce a closure from these ranges? A mutable collection of ranges you can change over time? Etc.

                1 vote
                1. [7]
                  teaearlgraycold
                  Link Parent
                  I don’t know enough about your use case to really say - but using closures for this seems like a weird solution, especially so in rust. My recommendation is use the solution that is the most...

                  I don’t know enough about your use case to really say - but using closures for this seems like a weird solution, especially so in rust. My recommendation is use the solution that is the most boring and least impressive.

                  4 votes
                  1. [6]
                    mtset
                    Link Parent
                    Huh. I think I should capture some context around this (so to speak), maybe I'll write a blog post. Thank you for the help!

                    Huh. I think I should capture some context around this (so to speak), maybe I'll write a blog post.

                    Thank you for the help!

                    3 votes
                    1. [5]
                      teaearlgraycold
                      Link Parent
                      That would be helpful.

                      That would be helpful.

                      2 votes
                      1. [4]
                        mtset
                        Link Parent
                        Okay, threw together an example in the Playground. https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ecbcccda9a0a3995cec9457d7366cfb7
                        2 votes
                        1. [3]
                          teaearlgraycold
                          (edited )
                          Link Parent
                          The name "bucketizer" made me laugh. So it looks like they'd both work. But with the closure the API is obtuse at the calling site. If you're not the person that created the closure there's no...

                          The name "bucketizer" made me laugh.

                          So it looks like they'd both work. But with the closure the API is obtuse at the calling site. If you're not the person that created the closure there's no obvious explanation for how this domain/range map works. The type is impl Fn(&T) -> Option<R> which could be implemented in one of an infinite number of ways. With the struct option you can quickly go to the implementation and notice the buckets struct field.

                          Edit:

                          Additionally, if you ever want a second way to interact with these buckets then you're SOL with a closure unless you want to introduce a ton of code smell. A struct with methods can be easily extended.

                          2 votes
                          1. [2]
                            mtset
                            Link Parent
                            Heh, yeah, I considered Quantizer (what they're called in the audio world) but it is overloaded a bit. And, yeah, the closure API does seem a little opaque. I would definitely add some...

                            Heh, yeah, I considered Quantizer (what they're called in the audio world) but it is overloaded a bit.

                            And, yeah, the closure API does seem a little opaque. I would definitely add some documentation to both, though.

                            1 vote
                            1. teaearlgraycold
                              Link Parent
                              Maybe you don’t plan to have this used in such a way where this matters, but there’s no way to attach documentation to a stray closure variable. Imagine you’re in a function far far away from the...

                              Maybe you don’t plan to have this used in such a way where this matters, but there’s no way to attach documentation to a stray closure variable. Imagine you’re in a function far far away from the bucketizing closure’s heap allocation. All you have is a closure. It’s passed in as a parameter. This code was written by someone else so you don’t know where the closure comes from without going up the call stack. There’s not a great way to know from a single source of truth what this closure is and how it works. A modern text editor using a language server could point you directly to a variable’s struct type definition though.

                              2 votes
        2. [2]
          streblo
          Link Parent
          Depends on your requirements but chucking the mantissa and just using a map of exponents and sign value would be simple to bin by powers of ten.

          Depends on your requirements but chucking the mantissa and just using a map of exponents and sign value would be simple to bin by powers of ten.

          1 vote
          1. mtset
            Link Parent
            Oh, good call! Could be very useful for that special case.

            Oh, good call! Could be very useful for that special case.

            1 vote
    2. [5]
      FluffyKittens
      Link Parent
      Need more info on your stack, data scale, and app-specific context to give a good answer. But some basic approaches I would use on my preferred stack as a data engineer: Pandas’ Series.cut, or...

      Need more info on your stack, data scale, and app-specific context to give a good answer. But some basic approaches I would use on my preferred stack as a data engineer: Pandas’ Series.cut, or db-side, stored procedure to create/update a table of balanced Postgres Range types to map onto. “Binning” is the term to search for in your respective ecosystem to find existing solutions.

      2 votes
      1. [4]
        mtset
        Link Parent
        I'll look into those APIs and see if there's anything I can pull for the design, thank you! In this case, it's Rust, and there's no good library function for it at the moment.

        I'll look into those APIs and see if there's anything I can pull for the design, thank you! In this case, it's Rust, and there's no good library function for it at the moment.

        2 votes
        1. [3]
          FluffyKittens
          Link Parent
          To clarify, are we talking about a library API to be consumed by other programs, or a REST-style web API that you’re designing?

          To clarify, are we talking about a library API to be consumed by other programs, or a REST-style web API that you’re designing?

          1 vote
          1. [2]
            mtset
            Link Parent
            The former! The latter would be very wasteful, most likely.

            The former! The latter would be very wasteful, most likely.

            2 votes
            1. FluffyKittens
              Link Parent
              Gotcha, in that case I’m out of my element - I don’t know the Rust ecosystem well enough to comment on design standards.

              Gotcha, in that case I’m out of my element - I don’t know the Rust ecosystem well enough to comment on design standards.

              2 votes