5 votes

Effective Scala

12 comments

  1. [12]
    archevel
    I used to work with Scala quite a bit a few years back. Nowadays Java has come quite far in features. Lombok annotations can be used to reduce a lot of boilerplate when creating DTOs. Not as nice as case classes, but pretty nice still. The Stream API makes it possible to work in a more functional fashion when dealing with collections, etc. Scala, on the other hand, still had a lot of warts the last time I used it (v2.10 maybe?). Compilation times were poor. Implicit resolution, while possible to understand, was quite hard to grok in practice. Some patterns, e.g. "the cake pattern", caused bloat and boilerplate that outstripped even Java-based Spring.

    The language has some nice features, but I don't think it provides enough practical benefits today compared to other JVM languages...

    1 vote
    1. [11]
      stu2b50
      To be honest, I'm not exactly sure why, but Scala 100% has a concrete niche in the market: it is the de facto language for #BigData. From what I've seen, pretty much every large tech company uses Scala to interface with Spark, for instance. I suppose the language features shown above make it well suited for map reduce. Either way, it's not going away anytime soon.

      2 votes
      1. [5]
        Micycle_the_Bichael
        I was about to ask about this. It's been 5 years since I was last deeply invested in ML/Big Data, but I'm starting to dip my toes back in and it seems like Scala is the suggestion now (back when I left it was mostly about Python and sometimes Matlab). I was about to start teaching myself Scala as a win-win-win of (1) wanting to get back into maths, (2) wanting to learn functional programming, (3) wanting to start leveling up skills to try to transition back into a more math/data heavy job. Not sure what level of expertise/confidence you have giving advice, but would you suggest learning Scala to someone getting back into big data?

        1 vote
        1. stu2b50
          That depends on what you mean by ML/Big Data. Scala's job is mainly on the plumbing side. You have petabytes of data on one or more Spark instances, and you need to do some transformations on that data and take subsets to populate, say, a normal SQL database, or just to use as raw data. So, not really mathy? Or not inherently.

          For modeling, analysis, and all that kind of stuff, Python is very much the de facto language.

          So yes if you want to get into the world of Spark, no if you want to be a cool quant making advanced models.

          1 vote
        2. [3]
          archevel
          Not sure about (1) or (3), but for (2) why not try Haskell if you want type safety, or maybe Clojure if you want a functional language on the JVM?

          Personally getting a bit disillusioned by the programming profession. Most stuff is just shuffling bytes from one place to another with some transformation in between... Not much interesting stuff happening... Then again, could just be my current mood :/

          1 vote
          1. [2]
            Micycle_the_Bichael
            I’ll agree with you. I’ve accepted that I’ll be moderately content and dissatisfied in any software job I get (+/- some degrees of interest/disinterest). For me, especially in my free time, it’s just toying with things that interest me. Literally the entire reason I looked at Scala is because my old ML advisor posted on LinkedIn about Scala on the same day I was feeling like learning functional programming and missing Big Data research projects. If I decide to fall away from Big Data and only care about functional programming, I’ll keep Haskell in mind as the frontrunner :)

            Idk if this will help or resonate with you, but one thing I’ve found that helps me enjoy programming more is taking concepts from a language or tech stack and applying them to architecture decisions. My team just figured out that a problem we are facing is much easier if you think about the solution in terms of Go interfaces. They hadn’t thought of it before because the team working on it was all Python guys. The second I described Go interfaces and shifted the lens, we found an entirely new way of thinking that made the solution so much easier. To be clear, we aren’t using Go or interfaces in the solution, but we are using the concept of interfaces as a lens through which to view and think about the problem. Implementation will still be little more than shifting bits, but knowing a bit of Go helped us a ton in the planning stage.

            Similarly, I’ve found a lot of infra tools are best when they view the world the way Kubernetes does. In one place you have what you want the world to look like (be it dev, production, anything), in another you have what the world actually looks like, and then you write code to help you get from point A to point B. We find this mindset helps with faster iteration on problems, and it really helped us unearth areas for automation. Again, none of this stuff uses k8s, but it builds on the models and mindsets we got from learning k8s.

            I think those kinds of problems are where the most fun in programming is for me these days. I love being handed a problem and then just sitting in it for weeks, really finding all the pain points and all the micro-automations you can make. Just a little step forward here, a little skip forward there. I don’t find nearly as much joy in implementation as I used to, but I can still find joy in trying to find new fun ways to solve the puzzle that is sitting in front of me.

            Look around for the small things inside and outside of implementation that you like and try to find ways to move towards those, and don’t feel bad for straying away from coding if you don’t feel like it’s something you like anymore. Hell, don’t feel bad if tech isn’t something you want to be in anymore. Things change, people grow in directions they thought they might not. I’d also like to add that it’s ok to find your job boring. I feel like people feel a pressure to love their job and have a deep passion for the problems they’re solving. I don’t agree with that. I work for a travel company. My family barely had money to travel out of state, let alone internationally like many of my coworkers. Ya know what? That’s fine. I don’t give a shit about travel, and my job is boring. But I know when I get home (or I guess leave my office since we’re at home these days) that I have my games, I have my painting stuff, I have my old math textbooks, I have my volunteer work. I’m fine with my job being meh because I don’t expect it to be more than meh.
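For anyone who wants to see the "what you want the world to look like" versus "what the world does look like" idea from the comment above in code, here is a minimal sketch in plain Python. It does not use Kubernetes or any real infra tool, and the names `plan_actions` and `apply_actions`, as well as the resource names in the example, are made up for illustration.

```python
def plan_actions(desired, actual):
    """Compare desired and actual state and return the steps that close the gap.

    Both arguments map a resource name to its configuration (any comparable value).
    """
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(desired, actual, actions):
    """Apply planned actions to a copy of the actual state and return the result."""
    result = dict(actual)
    for verb, name in actions:
        if verb == "delete":
            result.pop(name)
        else:  # "create" and "update" both end with the desired configuration
            result[name] = desired[name]
    return result


if __name__ == "__main__":
    # Hypothetical resources, purely for illustration.
    desired = {"web": {"replicas": 3}, "worker": {"replicas": 2}}
    actual = {"web": {"replicas": 1}, "cron": {"replicas": 1}}
    steps = plan_actions(desired, actual)
    print(steps)
    print(apply_actions(desired, actual, steps) == desired)
```

Keeping the planning step separate from the applying step is one possible design choice: the plan can be printed and reviewed before anything changes, which is roughly where the "areas for automation" tend to show up.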

            Sorry, this turned into a huge rant where I DEFINITELY pushed a lot of my recent thoughts and anxieties about coding/tech onto your post. I know there is a psychological term for this and I’ll figure out what it is when I don’t have what I suspect is the flu. Disillusionment in coding is just something I’ve been thinking about a lot and dealing with a lot personally.

            1 vote
            1. archevel
              Thanks for this. It is nice just to hear echoes of sentiments similar to my own. For me I think a lot of the disillusionment is also due to the attitudes around innovation and the glorification of it. Don't get me wrong, true innovation is great! However, I don't think that is what most software development is about, yet a lot of regular improvements are touted as such... Hmm, this probably deserves its own write-up at some point.

              Thanks for the suggestions though, I'll keep those in mind!

              1 vote
      2. [5]
        parsley
        Not my experience at all. All BigData(tm) I have worked with has moved to python.

        1 vote
        1. [4]
          archevel
          This is my experience as well. Granted, I haven't worked at a large tech company, so maybe that's the reason. But Python with pandas/numpy is a solid toolset for analysis. YMMV if you need to do a lot of parallelization, but even that is doable at the process level with Python if you can chunk up your data appropriately.
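A minimal sketch of what process-level parallelism over chunked data might look like, using only the standard library's `multiprocessing.Pool`. The function names, chunk size, and worker count are illustrative assumptions, and the per-chunk work here is just a count and a sum standing in for a real pandas/numpy computation.

```python
from multiprocessing import Pool


def chunked(values, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_stats(chunk):
    """Per-chunk work: return (count, total) so results can be merged later."""
    return len(chunk), sum(chunk)


def merge_stats(partials):
    """Combine per-chunk (count, total) pairs into an overall mean."""
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count if count else 0.0


def parallel_mean(values, chunk_size=1000, workers=4):
    """Compute the mean with one chunk per task, spread across worker processes."""
    with Pool(processes=workers) as pool:
        partials = pool.map(partial_stats, chunked(values, chunk_size))
    return merge_stats(partials)


if __name__ == "__main__":
    # The guard is needed on platforms that start workers in fresh interpreters.
    print(parallel_mean(list(range(10_000)), chunk_size=2_500))
```

This only works cleanly when each chunk can be processed independently and the partial results are cheap to merge, which is the "if you can chunk up your data appropriately" caveat.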

          1 vote
          1. [3]
            stu2b50
            The kind of scale where numpy and/or pandas still operate properly is not "big data". Python is indeed the de facto data analytics language now, but that's not what Scala is used for.

            Although I know that PySpark exists, afaik it's uncommon in data infrastructure as opposed to scala.

            2 votes
            1. [2]
              archevel
              Interesting. Why do you think that is the case? If you're processing petabytes of data you're probably going to be processing it via some streaming mechanism anyway... Is it just that Python is so inefficient that it becomes a problem?

              1. stu2b50
                First, the strengths of Python are not needed. You don't need developer agility in your data pipeline: you need stability so that it can chug along as your business produces more data. You're not setting up all the data processing for your company in a jupyter notebook, whereas that is a big deal for when you're playing with data.

                Scala has a stronger static type system, language primitives which work better for map reduce, first class library support by Spark and other BigData engines, and as a language mostly inspired by functional programming, the lack of mutability (which presents zero problems when you're doing map reduce anyway) prevents entire classes of bugs.
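To illustrate why a lack of mutability costs nothing in map reduce, here is a small sketch of the shape in plain Python rather than Scala or Spark. The function names and sample data are invented for the example: each partition is mapped independently, and the reduce step builds a new value instead of modifying its inputs.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words in one partition without touching shared state."""
    return Counter(word.lower() for line in lines for word in line.split())


def merge_counts(left, right):
    """Reduce step: Counter + Counter returns a new Counter, leaving inputs unchanged."""
    return left + right


def word_count(partitions):
    """Run map over every partition, then fold the partial results together."""
    return reduce(merge_counts, map(map_partition, partitions), Counter())


if __name__ == "__main__":
    # Tuples stand in for immutable partitions of a larger dataset.
    partitions = (
        ("scala spark scala",),
        ("python spark",),
    )
    print(word_count(partitions))
```

Because no step mutates shared state, the partitions could in principle be processed in any order or on different machines, which is the property an engine like Spark relies on.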

                It's also faster, although that's not that big of a deal, since the engine is doing most of the computational work and the language is just gluing the pieces together. But Python's multithreading support (the infamous GIL) is famously godawful.

                1 vote