• Activity
  • Votes
  • Comments
  • New
  • All activity
  • Showing only topics with the tag "python". Back to normal view
    1. best way to go about with a script that seems to need both bash and python functionality

      Gonna try and put this into words. I am pretty familiar with bash and python. used both quite a bit and feel more or less comfortable with them. My issue is I often do a thing where if I want to...

      Gonna try and put this into words.

      I am pretty familiar with bash and python. used both quite a bit and feel more or less comfortable with them.

      My issue is I often do a thing where if I want to accomplish a task that is maybe a bit complex, I feel like I have to wind up making a script, let's call it hello_word.sh but then I also make a script called .hello_world.py

      and basically what I do is almost the first line of the bash script, I call the python script like ./hello_world.py $@ and take advtange of the argparse library in python to determine what the user wants to do amongst other tasks that are easier to do in python like for loops and etc.

      I try to do the meat of the logic in the python scripts before I write to an .env file from it and then in the bash script, I will do

      set -o allexport
      source "${DIR}"/"${ENV_FILE}"
      set +o allexport
      

      and then use the variable from that env file to do the rest of the logic in bash.

      why do I do anything in bash?

      cause I very much prefer being able to see a terminal command being executed in real-time and see what it does and be able to Ctrl+c if I see the command go awry.

      in python, you can run a command with subprocess or other similar system libraries but you can't get the output in real-time or terminate a command preemptively and I really hate that. you have to wait for the command to end to see what happened.

      But I feel like there is something obvious I am missing (like maybe bash has an argparse library I don't know about and there is some way to inject the concept of types into it) or if there is another language entirely that fits my needs?

      6 votes
    2. Dealing with databases, inserts, updates, etc. in Python

      Current Library: built in sqlite Current db: sqlite (but will have access to Snowflake soon for option 1 below) Wondering if anyone here has some advise or a good place to learn about dealing with...

      Current Library: built in sqlite
      Current db: sqlite (but will have access to Snowflake soon for option 1 below)

      Wondering if anyone here has some advise or a good place to learn about dealing with databases with Python. I know SQL fairly well for pulling data and simple updates, but running into potential performance issues the way I've been doing it. Here are 2 examples.

      1. Dealing with Pandas dataframes. I'm doing some reconciliation between a couple of different datasources. I do not have a primary key to work with. I have some very specific matching criteria to determine a match (5 columns specifically - customer, date, contract, product, quantity). The matching process is all built within Python. Is there a good way to do the database commits with updates/inserts en masse vs. line by line? I've looked into upsert (or inserts with clause to update with existing data), but pretty much all examples I've seen rely on primary keys (which I don't have since the data has 5 columns I'm matching on).

      2. Dealing with JSON files which have multiple layers of related data. My database is built in such a way that I have a table for header information, line level detail, then third level with specific references assigned to the line level detail. As with a lot of transactional type databases there can be multiple references per line, multiple lines per header. I'm currently looping through the JSON file starting with the header information to create the primary key, then going to the line level detail to create a primary key for the line, but also include the foreign key for the header and also with the reference data. Before inserting I'm doing a lookup to see if the data already exists and then updating if it does or inserting a new record if it doesn't. This works fine, but is slow taking several seconds for maybe 100 inserts in total. While not a big deal since it's for a low volume of sales. I'd rather learn best practice and do this properly with commits/transactions vs inserting an updating each record individually within the ability to rollback should an error occur.

      11 votes
    3. [Python] Trouble fetching checkbox and radio fields with PyPDF2

      My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows: reader =...

      My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows:

      reader = PdfReader("data/test.pdf")
      cnt = len(reader.pages)
      print("reading pdf (%d pages)" % cnt)
      page = reader.pages[cnt-1]
      lines = page.extract_text().splitlines()
      print("%d lines extracted..." % len(lines))
      

      However, this text doesn't contain the checked statuses of the radio and checkboxes. I just get normal text (like "Yes No" for example) instead of these values.

      I also tried the reader.get_fields() and reader.get_form_text_fields() methods as described in their documentation but they return empty values. I also tried reading it through annotations but no "/Annots" found on the page. When I open the PDF in a notepad++ to see its meta data, this is what I get:

      %PDF-1.4
      %²³´µ
      %Generated by ExpertPdf v9.2.2
      

      It appears to me that these checkboxes aren't usual form fields used in PDF but appear similar to HTML elements. Is there any way to extract these fields using python?

      2 votes
    4. Python Loops for JSON

      How the heck do these work? I've cobbled together the script below for my bot (using Limnoria) to return the F1 standings in a line. Right now it returns the first value perfectly, but I can't...

      How the heck do these work? I've cobbled together the script below for my bot (using Limnoria) to return the F1 standings in a line. Right now it returns the first value perfectly, but I can't figure out the loop to get the other drivers at all.

      The script
      import supybot.utils as utils
      from supybot.commands import *
      import supybot.plugins as plugins
      import supybot.ircutils as ircutils
      import supybot.callbacks as callbacks
      import supybot.ircmsgs as ircmsgs
      import requests
      import os
      import collections
      import json
      
      try:
          from supybot.i18n import PluginInternationalization
      
          _ = PluginInternationalization("F1")
      except ImportError:
          _ = lambda x: x
      
      
      class F1(callbacks.Plugin):
          """Uses API to retrieve information"""
      
          threaded = True
      
          def f1(self, irc, msg, args):
              """
              F1 Standings
              """
      
              channel = msg.args[0]
              data = requests.get("https://ergast.com/api/f1/2022/driverStandings.json")
              data = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["DriverStandings"][0]
      
              name = data["Driver"]["code"]
              position = data["positionText"].zfill(2)
              points = data["points"]
      
              output = ", ".join(['\x0306\x02' + name + '\x0303' + " [" + position + ", "+ points + "]"])
              irc.reply(output)
      
          result = wrap(f1)
      
      Class = F1
      

      The output should be

      VER [1, 125], LEC [2, 116], PER [3, 110], RUS [4, 84], SAI [5, 83], HAM [6, 50], NOR [7, 48], BOT [8, 40], OCO [9, 30], MAG [10, 15], RIC [11, 11], TSU [12, 11], ALO [13, 10], GAS [14, 6], VET [15, 5], ALB [16, 3], STR [17, 2], ZHO [18, 1], MSC [19, 0], HUL [20, 0], LAT [21, 0]
      

      ...but it only returns VER [1, 125]

      I'm in that state where I can read the stuff, but when I put it together, it doesn't always work.

      Final script
      # this accepts @champ or @constructor with an optional year
      # and also @gp with an optional race number for the current season
      
      
      import supybot.utils as utils
      from supybot.commands import *
      import supybot.plugins as plugins
      import supybot.ircutils as ircutils
      import supybot.callbacks as callbacks
      import supybot.ircmsgs as ircmsgs
      import requests
      import os
      import collections
      import json
      
      try:
          from supybot.i18n import PluginInternationalization
      
          _ = PluginInternationalization("F1")
      except ImportError:
          _ = lambda x: x
      
      
      class F1(callbacks.Plugin):
          """Uses API to retrieve information"""
      
          threaded = True
      
          def champ(self, irc, msg, args, year):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get("https://ergast.com/api/f1/current/driverStandings.json")
              if year:
                  data = requests.get(
                      "https://ergast.com/api/f1/%s/driverStandings.json" % (year)
                  )
              driver_standings = json.loads(data.content)["MRData"]["StandingsTable"][
                  "StandingsLists"
              ][0]["DriverStandings"]
              string_segments = []
      
              for driver in driver_standings:
                  name = driver["Driver"]["code"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"\x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          champ = wrap(champ, [optional("int")])
      
          def gp(self, irc, msg, args, race):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get("https://ergast.com/api/f1/current/last/results.json")
              if race:
                  data = requests.get(
                      "https://ergast.com/api/f1/current/%s/results.json" % (race)
                  )
              driver_result = json.loads(data.content)["MRData"]["RaceTable"]["Races"][0][
                  "Results"
              ]
              string_segments = []
      
              for driver in driver_result:
                  name = driver["Driver"]["code"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"{position} \x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          gp = wrap(gp, [optional("int")])
      
          def constructor(self, irc, msg, args, year):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get(
                  "https://ergast.com/api/f1/current/constructorStandings.json"
              )
              if year:
                  data = requests.get(
                      "https://ergast.com/api/f1/current/constructorStandings.json" % (year)
                  )
              driver_result = json.loads(data.content)["MRData"]["StandingsTable"][
                  "StandingsLists"
              ][0]["ConstructorStandings"]
              string_segments = []
      
              for driver in driver_result:
                  name = driver["Constructor"]["name"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"{position} \x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          constructor = wrap(constructor, [optional("int")])
      
      
      Class = F1
      
      3 votes
    5. How would you write a GUI? Seeking opinions, recommendations, and what to avoid.

      Hi all. I am asking this open-ended question (bottom of this post) because I am considering making contributions to an open-source project that would directly benefit me and other users. Some...

      Hi all. I am asking this open-ended question (bottom of this post) because I am considering making contributions to an open-source project that would directly benefit me and other users.

      Some background:

      I have worked with an engineering simulation software called Ansys MAPDL basically everyday for the last 4 years, in both an academic and a professional capacity. It's not necessarily relevant whether you are familiar to that program to participate in this discussion. The relevant thing is that the GUI for MAPDL is written in Tcl/Tk and I don’t imagine it is going to be modernized (because of more modern, but distinctly different, replacements). This is a screenshot of the GUI for reference.

      Why do people put up with such an old interface?

      The power of the program is not its GUI, but the scripting language that can be run to setup and solve simulations. The program name is really the scripting language name, Ansys Parametric Design Language (APDL). It's somewhat like Matlab. The program also offers an enormous amount of control when compared to the more modern GUI that's been released, since the modern GUI holds a totally different philosophy.

      The older GUI is really helpful in certain circumstances because it will spit out a file containing commands that were used in the session. This is a great demonstration of how to run a command or use a setting/config command, but a lot of newer features are buried in the documentation and aren't available in the older GUI.

      My coding experience

      I know the MAPDL language very intimately, but my experience beyond it is limited to some Perl scripting, and a bit of Python exposure.

      Motivation

      Open-Source Ansys API

      Recently, Ansys started supporting an open-source Python project called PyAnsys. MAPDL is otherwise fully closed source, and this is really the only public-facing API. PyAnsys has basically converted a lot of MAPDL script commands to a pythonic format, hence Python can now be used to interact with MAPDL. This is great for several reasons, but is limited regarding interactivity. Interacting with MAPDL via Python is basically happening in a fancy console via Jupyter notebook or IDE like Spyder. Certain commands will bring up Python-based graphics displays of solid models and results plots, but there isn't a dedicated GUI open all the time.

      The Question(s)

      My question is whether it is feasible to write a frontend GUI to a bunch of python commands. If you were going to do it, how would you do it? What might you write it with? Would you even do it? Is this a stupid endeavor?

      7 votes
    6. What is a class in Python?

      I've been learning a bit more Python, going through a Udemy course to expand my skills a little. One of the programs the course guides you to make is a little dictionary, but it currently only...

      I've been learning a bit more Python, going through a Udemy course to expand my skills a little. One of the programs the course guides you to make is a little dictionary, but it currently only runs once and then quits.
      I'd like to adapt it to use a nice TUI that keeps itself open until the user specifies they want to quit, using something along the lines of npyscreen. However, this library uses classes, and that's not something I'm yet familiar with. I'd rather have an understanding of what classes are, how they work, and why to use them before I take the plunge and start fiddling around with npyscreen (although I'd be interested to hear if you think that I should Just Do It instead).
      Can anyone give or point me towards a good explanation of the what, how, and why of Python classes? Or better yet, a tutorial that will give me something to write and play with to figure out how it all fits together?
      Thanks!

      9 votes
    7. Input from a text file, pull from multiple APIs, formatting output, etc. in Python

      I don't need answers so much as an idea of where to start. Essentially, I have a Google Sheet that uses importjson.gs to pull from the following APIs OMDB (IMDB) TheMovieDB TVMaze I also use...

      I don't need answers so much as an idea of where to start.

      Essentially, I have a Google Sheet that uses importjson.gs to pull from the following APIs

      • OMDB (IMDB)
      • TheMovieDB
      • TVMaze

      I also use another script to scrape Letterboxd for ratings.

      This works well, but sometimes it'll time out or I'll hit urlFetch limits that Google has in place.

      Basically, I'd like to have a text file (input.txt) where I pop in a bunch of titles and year or IMDB IDs, then the script runs and pulls set endpoints from all of these, outputting everything on one line (a pipe as a delimiter.)

      My thinking is that I can then pull that info a sheet and run all of the formatting, basic math, and whatever else so it suits my Sheet.

      I have a feeling I'll be using requests for the JSON and beautifulsoup for letterboxd -- or maybe a module.

      Can anyone point me in the right direction? I don't think it'll be too difficult and should work well for a first python project.

      7 votes
    8. How to design a database?

      I'm working on an application that allows a user to view playlists belonging to a particular radio show and stream/download/favourite the tracks in them. It has 4 core entities: User, Show,...

      I'm working on an application that allows a user to view playlists belonging to a particular radio show and stream/download/favourite the tracks in them. It has 4 core entities: User, Show, Playlist and Track.

      • Each show has multiple playlists (one-to-many)
      • Each playlist has multiple tracks (one-to-many)

      To be able to reference a playlist belonging to a particular show. I gave those playlists the same uuid as the show they belong to. A few questions though.

      1. Is this the right/best way to associate data?
      2. As a track could potentially belong to multiple playlists, I can't take the same approach as I do for (show/playlist) How would be best to handle this? Ideally I would like to have a single "Track" table containing all tracks for all playlists.

      For any experienced database designers out there, how would you structure this data? What would you consider in designing the schema and why? If I did go with 4 tables only, presumably there would be performance implications given the potential amount of data in any one of those tables, particularly tracks. If that is the case, how best to structure this kind of thing with performance in mind? Thanks in advance for any help :)

      For reference, in case it's of importance, I'm using sqlite3.

      5 votes