• Activity
  • Votes
  • Comments
  • New
  • All activity
  • Showing only topics with the tag "python". Back to normal view
    1. [Python] Trouble fetching checkbox and radio fields with PyPDF2

      My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows: reader =...

      My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows:

      reader = PdfReader("data/test.pdf")
      cnt = len(reader.pages)
      print("reading pdf (%d pages)" % cnt)
      page = reader.pages[cnt-1]
      lines = page.extract_text().splitlines()
      print("%d lines extracted..." % len(lines))
      

      However, this text doesn't contain the checked statuses of the radio and checkboxes. I just get normal text (like "Yes No" for example) instead of these values.

      I also tried the reader.get_fields() and reader.get_form_text_fields() methods as described in their documentation but they return empty values. I also tried reading it through annotations but no "/Annots" found on the page. When I open the PDF in a notepad++ to see its meta data, this is what I get:

      %PDF-1.4
      %²³´µ
      %Generated by ExpertPdf v9.2.2
      

      It appears to me that these checkboxes aren't usual form fields used in PDF but appear similar to HTML elements. Is there any way to extract these fields using python?

      2 votes
    2. Python Loops for JSON

      How the heck do these work? I've cobbled together the script below for my bot (using Limnoria) to return the F1 standings in a line. Right now it returns the first value perfectly, but I can't...

      How the heck do these work? I've cobbled together the script below for my bot (using Limnoria) to return the F1 standings in a line. Right now it returns the first value perfectly, but I can't figure out the loop to get the other drivers at all.

      The script
      import supybot.utils as utils
      from supybot.commands import *
      import supybot.plugins as plugins
      import supybot.ircutils as ircutils
      import supybot.callbacks as callbacks
      import supybot.ircmsgs as ircmsgs
      import requests
      import os
      import collections
      import json
      
      try:
          from supybot.i18n import PluginInternationalization
      
          _ = PluginInternationalization("F1")
      except ImportError:
          _ = lambda x: x
      
      
      class F1(callbacks.Plugin):
          """Uses API to retrieve information"""
      
          threaded = True
      
          def f1(self, irc, msg, args):
              """
              F1 Standings
              """
      
              channel = msg.args[0]
              data = requests.get("https://ergast.com/api/f1/2022/driverStandings.json")
              data = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["DriverStandings"][0]
      
              name = data["Driver"]["code"]
              position = data["positionText"].zfill(2)
              points = data["points"]
      
              output = ", ".join(['\x0306\x02' + name + '\x0303' + " [" + position + ", "+ points + "]"])
              irc.reply(output)
      
          result = wrap(f1)
      
      Class = F1
      

      The output should be

      VER [1, 125], LEC [2, 116], PER [3, 110], RUS [4, 84], SAI [5, 83], HAM [6, 50], NOR [7, 48], BOT [8, 40], OCO [9, 30], MAG [10, 15], RIC [11, 11], TSU [12, 11], ALO [13, 10], GAS [14, 6], VET [15, 5], ALB [16, 3], STR [17, 2], ZHO [18, 1], MSC [19, 0], HUL [20, 0], LAT [21, 0]
      

      ...but it only returns VER [1, 125]

      I'm in that state where I can read the stuff, but when I put it together, it doesn't always work.

      Final script
      # this accepts @champ or @constructor with an optional year
      # and also @gp with an optional race number for the current season
      
      
      import supybot.utils as utils
      from supybot.commands import *
      import supybot.plugins as plugins
      import supybot.ircutils as ircutils
      import supybot.callbacks as callbacks
      import supybot.ircmsgs as ircmsgs
      import requests
      import os
      import collections
      import json
      
      try:
          from supybot.i18n import PluginInternationalization
      
          _ = PluginInternationalization("F1")
      except ImportError:
          _ = lambda x: x
      
      
      class F1(callbacks.Plugin):
          """Uses API to retrieve information"""
      
          threaded = True
      
          def champ(self, irc, msg, args, year):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get("https://ergast.com/api/f1/current/driverStandings.json")
              if year:
                  data = requests.get(
                      "https://ergast.com/api/f1/%s/driverStandings.json" % (year)
                  )
              driver_standings = json.loads(data.content)["MRData"]["StandingsTable"][
                  "StandingsLists"
              ][0]["DriverStandings"]
              string_segments = []
      
              for driver in driver_standings:
                  name = driver["Driver"]["code"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"\x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          champ = wrap(champ, [optional("int")])
      
          def gp(self, irc, msg, args, race):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get("https://ergast.com/api/f1/current/last/results.json")
              if race:
                  data = requests.get(
                      "https://ergast.com/api/f1/current/%s/results.json" % (race)
                  )
              driver_result = json.loads(data.content)["MRData"]["RaceTable"]["Races"][0][
                  "Results"
              ]
              string_segments = []
      
              for driver in driver_result:
                  name = driver["Driver"]["code"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"{position} \x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          gp = wrap(gp, [optional("int")])
      
          def constructor(self, irc, msg, args, year):
              """<year>
              Call standings by year
              F1 Standings
              """
      
              data = requests.get(
                  "https://ergast.com/api/f1/current/constructorStandings.json"
              )
              if year:
                  data = requests.get(
                      "https://ergast.com/api/f1/current/constructorStandings.json" % (year)
                  )
              driver_result = json.loads(data.content)["MRData"]["StandingsTable"][
                  "StandingsLists"
              ][0]["ConstructorStandings"]
              string_segments = []
      
              for driver in driver_result:
                  name = driver["Constructor"]["name"]
                  position = driver["positionText"]
                  points = driver["points"]
                  string_segments.append(f"{position} \x035{name}\x0F {points}")
      
              irc.reply(", ".join(string_segments))
      
          constructor = wrap(constructor, [optional("int")])
      
      
      Class = F1
      
      3 votes
    3. How would you write a GUI? Seeking opinions, recommendations, and what to avoid.

      Hi all. I am asking this open-ended question (bottom of this post) because I am considering making contributions to an open-source project that would directly benefit me and other users. Some...

      Hi all. I am asking this open-ended question (bottom of this post) because I am considering making contributions to an open-source project that would directly benefit me and other users.

      Some background:

      I have worked with an engineering simulation software called Ansys MAPDL basically everyday for the last 4 years, in both an academic and a professional capacity. It's not necessarily relevant whether you are familiar to that program to participate in this discussion. The relevant thing is that the GUI for MAPDL is written in Tcl/Tk and I don’t imagine it is going to be modernized (because of more modern, but distinctly different, replacements). This is a screenshot of the GUI for reference.

      Why do people put up with such an old interface?

      The power of the program is not its GUI, but the scripting language that can be run to setup and solve simulations. The program name is really the scripting language name, Ansys Parametric Design Language (APDL). It's somewhat like Matlab. The program also offers an enormous amount of control when compared to the more modern GUI that's been released, since the modern GUI holds a totally different philosophy.

      The older GUI is really helpful in certain circumstances because it will spit out a file containing commands that were used in the session. This is a great demonstration of how to run a command or use a setting/config command, but a lot of newer features are buried in the documentation and aren't available in the older GUI.

      My coding experience

      I know the MAPDL language very intimately, but my experience beyond it is limited to some Perl scripting, and a bit of Python exposure.

      Motivation

      Open-Source Ansys API

      Recently, Ansys started supporting an open-source Python project called PyAnsys. MAPDL is otherwise fully closed source, and this is really the only public-facing API. PyAnsys has basically converted a lot of MAPDL script commands to a pythonic format, hence Python can now be used to interact with MAPDL. This is great for several reasons, but is limited regarding interactivity. Interacting with MAPDL via Python is basically happening in a fancy console via Jupyter notebook or IDE like Spyder. Certain commands will bring up Python-based graphics displays of solid models and results plots, but there isn't a dedicated GUI open all the time.

      The Question(s)

      My question is whether it is feasible to write a frontend GUI to a bunch of python commands. If you were going to do it, how would you do it? What might you write it with? Would you even do it? Is this a stupid endeavor?

      7 votes
    4. What is a class in Python?

      I've been learning a bit more Python, going through a Udemy course to expand my skills a little. One of the programs the course guides you to make is a little dictionary, but it currently only...

      I've been learning a bit more Python, going through a Udemy course to expand my skills a little. One of the programs the course guides you to make is a little dictionary, but it currently only runs once and then quits.
      I'd like to adapt it to use a nice TUI that keeps itself open until the user specifies they want to quit, using something along the lines of npyscreen. However, this library uses classes, and that's not something I'm yet familiar with. I'd rather have an understanding of what classes are, how they work, and why to use them before I take the plunge and start fiddling around with npyscreen (although I'd be interested to hear if you think that I should Just Do It instead).
      Can anyone give or point me towards a good explanation of the what, how, and why of Python classes? Or better yet, a tutorial that will give me something to write and play with to figure out how it all fits together?
      Thanks!

      9 votes
    5. Input from a text file, pull from multiple APIs, formatting output, etc. in Python

      I don't need answers so much as an idea of where to start. Essentially, I have a Google Sheet that uses importjson.gs to pull from the following APIs OMDB (IMDB) TheMovieDB TVMaze I also use...

      I don't need answers so much as an idea of where to start.

      Essentially, I have a Google Sheet that uses importjson.gs to pull from the following APIs

      • OMDB (IMDB)
      • TheMovieDB
      • TVMaze

      I also use another script to scrape Letterboxd for ratings.

      This works well, but sometimes it'll time out or I'll hit urlFetch limits that Google has in place.

      Basically, I'd like to have a text file (input.txt) where I pop in a bunch of titles and year or IMDB IDs, then the script runs and pulls set endpoints from all of these, outputting everything on one line (a pipe as a delimiter.)

      My thinking is that I can then pull that info a sheet and run all of the formatting, basic math, and whatever else so it suits my Sheet.

      I have a feeling I'll be using requests for the JSON and beautifulsoup for letterboxd -- or maybe a module.

      Can anyone point me in the right direction? I don't think it'll be too difficult and should work well for a first python project.

      7 votes
    6. How to design a database?

      I'm working on an application that allows a user to view playlists belonging to a particular radio show and stream/download/favourite the tracks in them. It has 4 core entities: User, Show,...

      I'm working on an application that allows a user to view playlists belonging to a particular radio show and stream/download/favourite the tracks in them. It has 4 core entities: User, Show, Playlist and Track.

      • Each show has multiple playlists (one-to-many)
      • Each playlist has multiple tracks (one-to-many)

      To be able to reference a playlist belonging to a particular show. I gave those playlists the same uuid as the show they belong to. A few questions though.

      1. Is this the right/best way to associate data?
      2. As a track could potentially belong to multiple playlists, I can't take the same approach as I do for (show/playlist) How would be best to handle this? Ideally I would like to have a single "Track" table containing all tracks for all playlists.

      For any experienced database designers out there, how would you structure this data? What would you consider in designing the schema and why? If I did go with 4 tables only, presumably there would be performance implications given the potential amount of data in any one of those tables, particularly tracks. If that is the case, how best to structure this kind of thing with performance in mind? Thanks in advance for any help :)

      For reference, in case it's of importance, I'm using sqlite3.

      5 votes
    7. Architecture for untrained software engineers (Python)

      Hey everyone, I've been programming for some time now but notice without any formalized education in CS I often get lost in the weeds when it comes to developing larger applications. I'm familiar...

      Hey everyone,

      I've been programming for some time now but notice without any formalized education in CS I often get lost in the weeds when it comes to developing larger applications. I'm familiar with the principles of TDD and SOLID - which have helped with maintainability - however still feel that I'm lacking in the ability to architect a properly structured system. As an example, I'm currently developing a flask REST API for a website (just for learning purposes). This involves parsing a html response and serializing the result as JSON. I'm still quite unclear as to structuring this sort of thing. If any more experienced developers could point me in the right direction/offer up their opinion I'd be very appreciative. Currently I have something like this (based - I hope correctly? - on uncle bob's clean architecture).

      Firstly, I'm defining the domain model. i.e the structure of the API response. Then, from outside in.

      1. Infrastructure (Flask): User makes request via interface (in my case a request to some endpoint)
      2. Adapters: request object checks if the request is valid (on the way back it checks if the response is valid) - Is this layer only for error handling?
      3. Repository: I'm struggling a bit here, AFAIUI this layer is traditionally a database. In my case however, where the request is valid, is this where I should handle the networking layer? i.e all the requests to return the website source? I'm also confused given at this stage I should be returning the relevant domain model, like an ORM, but as my data is unstructured, in order to do this I need to transform the response first. Where would it be best to handle this?
      4. Use Cases: Here I transform the domain model depending on the request. For example, filter all objects by id. Have I understood this correctly?
      5. Serializers: Encode the domain model as JSON to return from flask route.

      If you got this far, thanks so much for reading. I really hope to hear the opinions of more experienced devs who can steer me in the right direction/correct me should I have misunderstood anything.

      8 votes
    8. What is the best way to teach Python for my 11-year-old sister that lives in another state?

      This may seem an obvious question, but not as much as it seems. She uses Windows, I’m currently using Linux/macOS. How to instruct her to install her Python environment? Should I use Zoom, Skype,...

      This may seem an obvious question, but not as much as it seems. She uses Windows, I’m currently using Linux/macOS. How to instruct her to install her Python environment? Should I use Zoom, Skype, Google Hangouts, or another solution? Is there and easy way for live-drawing (online blackboard) to explain things to her visually? And, perhaps most importantly, how can I do that for free?

      13 votes
    9. What's the current state-of-the-art in Python package creation/distribution?

      I've been thinking on and off about packaging up a few simple Python utilities I've written to stick up on Github for people to use if they want, but, every time I go to check out how one goes...

      I've been thinking on and off about packaging up a few simple Python utilities I've written to stick up on Github for people to use if they want, but, every time I go to check out how one goes about managing dependencies and all that for a project, I run into a whole wall of options. Does anyone better versed in all of this have any recommendations for me?

      11 votes