• Activity
  • Votes
  • Comments
  • New
  • All activity
  • Showing only topics in ~comp with the tag "software engineering". Back to normal view / Search all groups
    1. Any other developers also strongly resistant to adding secondary data stores to their software?

      I'm currently building an MVP for a startup, solo. We've got Postgres pulling triple duty as the go-to database for all normal relational data, a vector database with pgvector, and a job queue...

      I'm currently building an MVP for a startup, solo. We've got Postgres pulling triple duty as the go-to database for all normal relational data, a vector database with pgvector, and a job queue (With the magic of SELECT ... FROM "Jobs" WHERE ... FOR UPDATE SKIP LOCKED LIMIT 1). Every time I go out looking for solutions to problems it feels like the world really wants me to get a dedicated vector store or to use Redis as a job queue.

      Back when I was a Rails developer a good majority of the ActiveJob implementers used Redis. Now that I'm doing NodeJS the go-to is Bull which can only serialize jobs to Redis. They back this with claims that I can scale to thousands of jobs per second! I have to assume this theoretical throughput benefit from using Redis is utilized by 0.01% of apps running Bull.

      So I ended up implementing a very simple system. Bull wouldn't have been a good fit anyway as we have both Python and Typescript async workers, so a simple system that I fully understand is more useful at the moment. I'm curious who else shares my philosophy.

      Edit: I'll try to remember to update everyone in a year with the real world consequences of my design choices.

      16 votes
    2. How do you plan or outline a program?

      I’m currently studying Python Object Oriented programming and got to a point where logic and syntax are the least of my problems. I always get to a stage where I’m completely lost between modules,...

      I’m currently studying Python Object Oriented programming and got to a point where logic and syntax are the least of my problems. I always get to a stage where I’m completely lost between modules, classes, objects and a sea of “selfs”.

      I’m not doing anything too complicated, just small projects for practice, but I think I would benefit from planning. My mental processes are highly disorganized (ADHD) and I need all the help I can get with that.

      I don’t need an automated tool (even though it might come in handy) -- sketching things out on paper is probably enough.

      I only know about UML, which seems fine. Can anyone recommend a tutorial about this and other tools?

      Edit to link my last attempt at following a tutorial:

      This is the last tutorial I tried to follow, a Pygame project from the book Python Crash Course 2ed. Following tutorials is frequently mostly typing, so what I achieved there is not a real representation of my abilities -- I would not be able to do something like that on my own. In fact, I failed to answer the latest exercises, which were basically smaller versions of this project.

      My problem is not with syntax and the basics of how OOP works, but rather with memory and organization of information.

      10 votes
    3. Ask Tildes: Design practices for retrieving dozens (or hundreds) of related records over a RESTful API

      I'm looking for some feedback on a feasible mechanism for structuring a few API endpoints where a purely RFC-spec compliant REST API wouldn't suffice. I have an endpoint which returns $child...

      I'm looking for some feedback on a feasible mechanism for structuring a few API endpoints where a purely RFC-spec compliant REST API wouldn't suffice.

      I have an endpoint which returns $child entries for a $parent resource, let's call it: /api/parent/:parentId/children. There could be anywhere from a dozen to several hundred children returned from this call. From here, a child entity is related to a single userOrganization, which itself is a pivoting entity on a single user. The relationship between a child and user is not strictly transitive, but can each child only has one userOrganization which only has one user, so it is trivial to reach a user from a child resource.

      Given this, the data I need for the particular request involves retrieving all user's for a parent. The obvious, and incorrect solution to the problem is to make the request mentioned above, and then iterate through and make an API request to retrieve each user. This is less than very good as this would obviously be up to several hundred API calls.

      There's a few more scalable solutions that could solve this problem, so any input on these ideas is great; but if you have a better proposal that also works, I'm keen to explore that!

      Include user relationships in the call by default.

      This certainly does solve the problem, but it's also pumping down a load of data I don't necessarily need. This would probably 2x the amount of bytes travelling along the wire, and in 8 out of 10 calls, that extra data isn't needed.

      Have a separate /api/parent/:parentId/users call.

      Another option that partially solves the issue: I need data from both the child and the user to format this view, so I'd still need to make the initial call I documented earlier. Semantically, it feels a bit odd to have this as a resource because I don't consider a user to be nested under a parent in terms of database topology.

      Keep the original call, but add a query parameter to fetch the extra data

      This comes across as the 'least worst' idea objectively, in terms of flexibility and design. Through the addition of the query parameter, you could optionally retrieve the relationship's data. This seems brittle and doesn't scale well to other endpoints where it could be useful though.

      Utilize a Stripe expands-style query parameter.

      Stripe implements the ability to retrieve all related records from an API endpoint by specifying the relations as strings. This is essentially the same as the above answer, but is scaled to all available API endpoints. I love this idea, but implementing it in a secure way seems fraught with disaster. For example, this is a multi-tenancied application, and it would be trivial to request userOrganization.user.organizations.users. This would retrieve all other organisations for the user, and their users! This is because my implementation of expands simply utilises the ORM of my choice to perform a database join, and of course the database has no knowledge about application tenancy!


      Now, I do realise this problem could easily be solved by implementing a GraphQL API server, which I have done in the past, but unfortunately time and workload constraints dictate implementing a GraphQL-based solution is infeasible. As much as I like GraphQL, I'm not as proficient in that area as compared to implementing high quality traditional APIs, and the applications I'm working on at the moment are focusing on choosing boring technology, and not using excessive innovation tokens.

      Furthermore, I do consider the conceptuals around REST APIs to be more of an aspirational sliding scale, rather than a well defined physical entity, because let's face it, the majority of popular APIs today aren't REST-compliant, even Stripe's isn't, and it's usually both financially healthier and feature-rich to choose a development path that results in a rough product that can be refined later, than aiming for a perfect initial release. All this said, I don't mind proposals or solutions to my problem that are "good enough". As long as they aren't too hacky! :)

      10 votes
    4. Programming/software design practice?

      So, I've been going through Project Euler and solving problems as a way to brush up on my programming abilities, but it's mostly a math-focused set of problems. Which is cool..they're nice little...

      So, I've been going through Project Euler and solving problems as a way to brush up on my programming abilities, but it's mostly a math-focused set of problems. Which is cool..they're nice little puzzles that get the gears turning...

      BUT I'm wondering if anyone here has suggestions for a website/course that teaches software design in a piece-wise way. Like... each problem is a nugget of software design that builds off previous problems and eventually you're creating an entire application utilizing different algorithms/design patterns/data structures/etc.

      I'd appreciate any resources similar to that idea. Thanks!

      7 votes
    5. Let's talk best-practice Jenkins on AWS ECS

      [seen on reddit but no discussion - if it's not okay to seek out better discussion here after seeing something fall flat on reddit, I am very sorry and I'll delete promptly] I've had some...

      [seen on reddit but no discussion - if it's not okay to seek out better discussion here after seeing something fall flat on reddit, I am very sorry and I'll delete promptly]

      I've had some experience in this realm for a while now, but I'm having a little trouble with one issue in particular. Before I divulge, I'll present my thoughts on best practice and and what I've been able to implement:

      • Terraform everything (in accordance to terragrunt's "style guide" i.e. organization)
        THIS IS A BIG ONE: for the jenkins master task, make sure to use the following args to make sure jenkins jobs aren't super slow as hell to start:
      -Djava.awt.headless=true -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85
      

      THIS IS A GAME CHANGER (more-so on k8s clusters when the ecs plugin isn't used... hint, it's shit).

      • Create an EFS (in a separate terraform module) and mount it to the jenkins ECS cluster at /var/jenkins_home. Makes jenkins much more reliable through outages and easier to upgrade.
      • Run a logging agent (via docker container) like logspout or newrelic or whatever IN USER_DATA and not as a task - that way you get logs if there are issues during user_data/cloud_init... this I'm actually not sure about. Running a container outside the context of an ECS task means the ECS agent can't really track it and allocate mem/cpu properly... but it does help with user_data triage.
      • Use pipelines and git plugins to drive jobs. All jenkins jobs should be in source control!
      • Make sure you setup docker cleanup jobs on DAY 1! If you hace limited access to your cluster and you run out of disk due to docker cache, networks, volumes, etc... you're screwed till the admin ssh's in and runs a prune. Get a docker system prune going or the equivalent for each docker resource with appropriate filters... i.e. filter for anything older than a few days and is dangling.
      • Use Jenkins Global Libraries to make Jenkinsfiles cleaner (I always just use vars instead of groovy/java style packages because it's easier and less ugly)
        Jenkinsfiles should mostly call other bash files, make files, python scripts to generate and load prop files, etc. The less logic you put in a Jenkinsfile (which is just modified groovy) the better. String interpolation, among other things, is a fuckery that we don't have time to triage.
      • (out-of-scope) Move to using k8s/EKS instead of ECS asap because the ECS plugin for jenkins is absolute shit and it doesn't use priority correctly (sorry whoever developed it and... oh wait abandoned it and hasn't merged anything for years... for for real it's cool, just give admin to someone else).
      • (cultural) Stop calling them slaves. "Hey @eng, we're rotating slaves due to some cache issues. If you have been affected by race conditions in that past, our new update and slave rotation should fix that. Our update may have killed your job that was running on an old slave, just wait a few and the new slaves will be ready" <--This just doesn't look good.
        Hope that was some good stuff for you guys. Maybe I'm preaching to the choir, but I've seen some pretty shit jenkins setups.

      NOW FOR MY QUESTION!

      Has ANYONE actually been able to setup a proper jenkins user on ECS that actually works for both a master and ephemeral jenkins-agents so that they can mount and use the docker.sock for builds without hitting permission issues? I'm talking using the ecs plugin and mounting docker.sock via that.

      I have always resorted to running jenkins master and agents as root, which means you have to chmod files (super expensive time and cpu for services with tons of files). Running microservices as root is obviously bad practice, and chmod-ing a zilliion files is shit for docker cache and time... so I want to get jenkins users able to utilize the docker.sock. THIS IS SPECIFICALLY FOR THE AWS ECS AMI! I don't care about debian or old versions of docker where you could use DOCKER_OPTS. That doesn't work on the AWS Linux image.

      Thanks! And happy Friday!

      5 votes