Things learned serving on the board of the Python Software Foundation
24 votes -
best way to go about with a script that seems to need both bash and python functionality
Gonna try and put this into words.
I am pretty familiar with bash and Python. I've used both quite a bit and feel more or less comfortable with them.
My issue is that when I want to accomplish a task that is maybe a bit complex, I feel like I have to wind up making a script, let's call it hello_world.sh, but then I also make a script called hello_world.py, and basically almost on the first line of the bash script I call the python script like

./hello_world.py $@

and take advantage of the argparse library in Python to determine what the user wants to do, among other tasks that are easier to do in Python, like for loops and so on. I try to do the meat of the logic in the Python script before I write to an .env file from it, and then in the bash script I will do

set -o allexport
source "${DIR}"/"${ENV_FILE}"
set +o allexport

and then use the variables from that env file to do the rest of the logic in bash.
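A minimal sketch of the Python side of that handoff, assuming a hypothetical hello_world.py (the flag names and the "meat" are placeholders, not taken from the original scripts):

#!/usr/bin/env python3
# hello_world.py -- parse the CLI arguments in Python and write the results
# to an .env file so the calling bash script can source them (sketch only).
import argparse

parser = argparse.ArgumentParser(description="do the complex part in Python")
parser.add_argument("action", choices=["build", "deploy"],
                    help="what the user wants to do")
parser.add_argument("--count", type=int, default=1,
                    help="a typed argument that bash can't easily validate")
args = parser.parse_args()

# ... the meat of the logic would go here ...

with open(".env", "w") as f:
    f.write(f"ACTION={args.action}\n")
    f.write(f"COUNT={args.count}\n")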
Why do I do anything in bash? Because I very much prefer being able to see a terminal command being executed in real time, see what it does, and Ctrl+C if I see the command go awry. In Python you can run a command with subprocess or other similar system libraries, but you can't get the output in real time or terminate a command preemptively, and I really hate that: you have to wait for the command to end to see what happened.

But I feel like there is something obvious I am missing (like maybe bash has an argparse library I don't know about and there is some way to inject the concept of types into it), or maybe there is another language entirely that fits my needs?
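For reference, a minimal sketch of streaming a command's output in real time with subprocess.Popen and stopping it early on Ctrl+C (the ping command is just an arbitrary example):

import subprocess

# Stream a long-running command's output line by line; Ctrl+C raises
# KeyboardInterrupt, which we use to terminate the child process early.
proc = subprocess.Popen(
    ["ping", "-c", "10", "example.com"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)
try:
    for line in proc.stdout:
        print(line, end="")   # shown as soon as the child flushes it
    proc.wait()
except KeyboardInterrupt:
    proc.terminate()          # stop the command preemptively
    proc.wait()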
6 votes -
Preventing the worst supply chain attack you can imagine in the Python ecosystem
28 votes -
Cybercriminals pose as "helpful" Stack Overflow users to push malware
19 votes -
A very subtle bug
16 votes -
An oral history of Bank Python
15 votes -
Scalene, an open-source tool for dramatically speeding up the programming language Python
12 votes -
Interview of Samuel Colvin, founder and lead maintainer of Pydantic
11 votes -
Python in Excel: Combining the power of Python and the flexibility of Excel
34 votes -
An introduction to Statistical Learning with applications in R and Python
16 votes -
CLI tools hidden in the Python standard library
14 votes -
Dealing with databases, inserts, updates, etc. in Python
Current library: built-in sqlite3
Current db: SQLite (but will have access to Snowflake soon for option 1 below)

Wondering if anyone here has some advice or a good place to learn about dealing with databases with Python. I know SQL fairly well for pulling data and simple updates, but I'm running into potential performance issues with the way I've been doing it. Here are two examples.
1. Dealing with Pandas dataframes. I'm doing some reconciliation between a couple of different data sources. I do not have a primary key to work with. I have some very specific matching criteria to determine a match (5 columns specifically - customer, date, contract, product, quantity). The matching process is all built within Python. Is there a good way to do the database commits with updates/inserts en masse vs. line by line? I've looked into upsert (or inserts with a clause to update existing data), but pretty much all examples I've seen rely on primary keys (which I don't have, since the data has 5 columns I'm matching on). (See the sketch after this list.)
2. Dealing with JSON files which have multiple layers of related data. My database is built in such a way that I have a table for header information, line-level detail, then a third level with specific references assigned to the line-level detail. As with a lot of transactional-type databases, there can be multiple references per line and multiple lines per header. I'm currently looping through the JSON file, starting with the header information to create the primary key, then going to the line-level detail to create a primary key for the line (while also including the foreign key for the header), and doing the same with the reference data. Before inserting I'm doing a lookup to see if the data already exists, then updating if it does or inserting a new record if it doesn't. This works fine, but it is slow, taking several seconds for maybe 100 inserts in total. While that's not a big deal since it's a low volume of sales, I'd rather learn best practice and do this properly with commits/transactions vs. inserting and updating each record individually, with the ability to roll back should an error occur.
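For the first example, one commonly suggested pattern is a batch upsert: put a UNIQUE index over the five matching columns so the insert has something to conflict on, then let executemany run the whole batch inside a single transaction. A minimal sketch, assuming SQLite 3.24 or newer and illustrative table/column names (the amount column is made up):

import sqlite3

conn = sqlite3.connect("recon.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS recon (
    customer TEXT, date TEXT, contract TEXT, product TEXT, quantity REAL,
    amount REAL
);
-- the unique index plays the role of a primary key for upserts
CREATE UNIQUE INDEX IF NOT EXISTS idx_recon_match
    ON recon (customer, date, contract, product, quantity);
""")

rows = [
    ("ACME", "2023-01-05", "C-100", "widget", 10, 99.50),
    ("ACME", "2023-01-06", "C-100", "widget", 5, 49.75),
]

# one transaction and one executemany call per batch instead of per row
with conn:
    conn.executemany("""
        INSERT INTO recon (customer, date, contract, product, quantity, amount)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT (customer, date, contract, product, quantity)
        DO UPDATE SET amount = excluded.amount
    """, rows)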
11 votes -
Please help me become a python backend developer!
I am looking for an effective roadmap to become a Python backend developer. I am from a non-CS background. I can buy a couple of courses on Udemy if the need arises. TIA!
15 votes -
Infinite AI Array
3 votes -
A crash course in python packaging
2 votes -
[Python] Trouble fetching checkbox and radio fields with PyPDF2
My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows:
reader = PdfReader("data/test.pdf")
cnt = len(reader.pages)
print("reading pdf (%d pages)" % cnt)
page = reader.pages[cnt-1]
lines = page.extract_text().splitlines()
print("%d lines extracted..." % len(lines))
However, this text doesn't contain the checked statuses of the radio and checkboxes. I just get normal text (like "Yes No" for example) instead of these values.
I also tried the reader.get_fields() and reader.get_form_text_fields() methods as described in their documentation, but they return empty values. I also tried reading it through annotations, but no "/Annots" is found on the page. When I open the PDF in notepad++ to see its metadata, this is what I get:

%PDF-1.4
%²³´µ
%Generated by ExpertPdf v9.2.2
It appears to me that these checkboxes aren't usual form fields used in PDF but appear similar to HTML elements. Is there any way to extract these fields using python?
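If get_fields() and "/Annots" both come back empty, the boxes are probably drawn as plain vector graphics rather than AcroForm widgets. One way to check that theory (a sketch, assuming pdfplumber is an acceptable extra dependency) is to list the rectangles the page draws and compare their positions against the extracted words:

import pdfplumber

# If the "checkboxes" are plain graphics, they show up here as rects with
# coordinates rather than as form fields.
with pdfplumber.open("data/test.pdf") as pdf:
    page = pdf.pages[-1]
    for rect in page.rects:
        print("rect at", round(rect["x0"]), round(rect["top"]),
              "size", round(rect["x1"] - rect["x0"]), "x",
              round(rect["bottom"] - rect["top"]))
    for word in page.extract_words():
        print("word:", word["text"], "at", round(word["x0"]), round(word["top"]))

page.lines and page.curves can be inspected the same way to spot the tick marks drawn inside a checked box.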
2 votes -
Crimes with Python's Pattern Matching
7 votes -
eCharts for Python
3 votes -
Python data visualisation
6 votes -
Pretty Maps in Python
6 votes -
Making Heatmaps
5 votes -
Python Loops for JSON
How the heck do these work? I've cobbled together the script below for my bot (using Limnoria) to return the F1 standings in a line. Right now it returns the first value perfectly, but I can't figure out the loop to get the other drivers at all.
The script
import supybot.utils as utils
from supybot.commands import *
import supybot.plugins as plugins
import supybot.ircutils as ircutils
import supybot.callbacks as callbacks
import supybot.ircmsgs as ircmsgs
import requests
import os
import collections
import json

try:
    from supybot.i18n import PluginInternationalization
    _ = PluginInternationalization("F1")
except ImportError:
    _ = lambda x: x


class F1(callbacks.Plugin):
    """Uses API to retrieve information"""
    threaded = True

    def f1(self, irc, msg, args):
        """
        F1 Standings
        """
        channel = msg.args[0]
        data = requests.get("https://ergast.com/api/f1/2022/driverStandings.json")
        data = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["DriverStandings"][0]
        name = data["Driver"]["code"]
        position = data["positionText"].zfill(2)
        points = data["points"]
        output = ", ".join(['\x0306\x02' + name + '\x0303' + " [" + position + ", " + points + "]"])
        irc.reply(output)

    result = wrap(f1)


Class = F1
The output should be
VER [1, 125], LEC [2, 116], PER [3, 110], RUS [4, 84], SAI [5, 83], HAM [6, 50], NOR [7, 48], BOT [8, 40], OCO [9, 30], MAG [10, 15], RIC [11, 11], TSU [12, 11], ALO [13, 10], GAS [14, 6], VET [15, 5], ALB [16, 3], STR [17, 2], ZHO [18, 1], MSC [19, 0], HUL [20, 0], LAT [21, 0]
...but it only returns
VER [1, 125]
I'm in that state where I can read the stuff, but when I put it together, it doesn't always work.
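The core change, shown here as a standalone sketch mirroring the final script below: keep the whole DriverStandings list instead of indexing [0], then build one segment per driver.

import json
import requests

data = requests.get("https://ergast.com/api/f1/2022/driverStandings.json")
standings = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["DriverStandings"]

segments = []
for driver in standings:  # every driver in the standings, not just the first
    segments.append(f"{driver['Driver']['code']} [{driver['positionText']}, {driver['points']}]")

print(", ".join(segments))  # irc.reply(...) inside the plugin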
Final script
# this accepts @champ or @constructor with an optional year
# and also @gp with an optional race number for the current season
import supybot.utils as utils
from supybot.commands import *
import supybot.plugins as plugins
import supybot.ircutils as ircutils
import supybot.callbacks as callbacks
import supybot.ircmsgs as ircmsgs
import requests
import os
import collections
import json

try:
    from supybot.i18n import PluginInternationalization
    _ = PluginInternationalization("F1")
except ImportError:
    _ = lambda x: x


class F1(callbacks.Plugin):
    """Uses API to retrieve information"""
    threaded = True

    def champ(self, irc, msg, args, year):
        """<year>

        Call standings by year F1 Standings
        """
        data = requests.get("https://ergast.com/api/f1/current/driverStandings.json")
        if year:
            data = requests.get("https://ergast.com/api/f1/%s/driverStandings.json" % (year))
        driver_standings = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["DriverStandings"]
        string_segments = []
        for driver in driver_standings:
            name = driver["Driver"]["code"]
            position = driver["positionText"]
            points = driver["points"]
            string_segments.append(f"\x035{name}\x0F {points}")
        irc.reply(", ".join(string_segments))

    champ = wrap(champ, [optional("int")])

    def gp(self, irc, msg, args, race):
        """<year>

        Call standings by year F1 Standings
        """
        data = requests.get("https://ergast.com/api/f1/current/last/results.json")
        if race:
            data = requests.get("https://ergast.com/api/f1/current/%s/results.json" % (race))
        driver_result = json.loads(data.content)["MRData"]["RaceTable"]["Races"][0]["Results"]
        string_segments = []
        for driver in driver_result:
            name = driver["Driver"]["code"]
            position = driver["positionText"]
            points = driver["points"]
            string_segments.append(f"{position} \x035{name}\x0F {points}")
        irc.reply(", ".join(string_segments))

    gp = wrap(gp, [optional("int")])

    def constructor(self, irc, msg, args, year):
        """<year>

        Call standings by year F1 Standings
        """
        data = requests.get("https://ergast.com/api/f1/current/constructorStandings.json")
        if year:
            # note the %s placeholder so the optional year is actually substituted
            data = requests.get("https://ergast.com/api/f1/%s/constructorStandings.json" % (year))
        driver_result = json.loads(data.content)["MRData"]["StandingsTable"]["StandingsLists"][0]["ConstructorStandings"]
        string_segments = []
        for driver in driver_result:
            name = driver["Constructor"]["name"]
            position = driver["positionText"]
            points = driver["points"]
            string_segments.append(f"{position} \x035{name}\x0F {points}")
        irc.reply(", ".join(string_segments))

    constructor = wrap(constructor, [optional("int")])


Class = F1
3 votes -
Building an OpenTable bot
4 votes -
How would you write a GUI? Seeking opinions, recommendations, and what to avoid.
Hi all. I am asking this open-ended question (bottom of this post) because I am considering making contributions to an open-source project that would directly benefit me and other users.
Some background:
I have worked with an engineering simulation software called Ansys MAPDL basically every day for the last 4 years, in both an academic and a professional capacity. It's not necessarily relevant whether you are familiar with that program to participate in this discussion. The relevant thing is that the GUI for MAPDL is written in Tcl/Tk and I don’t imagine it is going to be modernized (because of more modern, but distinctly different, replacements). This is a screenshot of the GUI for reference.
Why do people put up with such an old interface?
The power of the program is not its GUI, but the scripting language that can be run to set up and solve simulations. The program name is really the scripting language name, Ansys Parametric Design Language (APDL). It's somewhat like Matlab. The program also offers an enormous amount of control when compared to the more modern GUI that's been released, since the modern GUI holds a totally different philosophy.
The older GUI is really helpful in certain circumstances because it will spit out a file containing commands that were used in the session. This is a great demonstration of how to run a command or use a setting/config command, but a lot of newer features are buried in the documentation and aren't available in the older GUI.
My coding experience
I know the MAPDL language very intimately, but my experience beyond it is limited to some Perl scripting, and a bit of Python exposure.
Motivation
Open-Source Ansys API
Recently, Ansys started supporting an open-source Python project called PyAnsys. MAPDL is otherwise fully closed source, and this is really the only public-facing API. PyAnsys has basically converted a lot of MAPDL script commands to a pythonic format, hence Python can now be used to interact with MAPDL. This is great for several reasons, but is limited regarding interactivity. Interacting with MAPDL via Python basically happens in a fancy console, via a Jupyter notebook or an IDE like Spyder. Certain commands will bring up Python-based graphics displays of solid models and results plots, but there isn't a dedicated GUI open all the time.
The Question(s)
My question is whether it is feasible to write a frontend GUI to a bunch of python commands. If you were going to do it, how would you do it? What might you write it with? Would you even do it? Is this a stupid endeavor?
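It is feasible, and one low-friction way to prototype it is tkinter from the standard library. A minimal sketch, where send_command is a hypothetical stand-in for whatever PyAnsys call would actually run the command:

import tkinter as tk
from tkinter import scrolledtext

def send_command(command: str) -> str:
    # Hypothetical stand-in: a real frontend would forward the command to a
    # PyAnsys/MAPDL session and return its response.
    return f"(pretend MAPDL output for: {command})"

root = tk.Tk()
root.title("MAPDL command frontend (sketch)")

entry = tk.Entry(root, width=60)
entry.pack(fill="x", padx=8, pady=4)

log = scrolledtext.ScrolledText(root, height=15, width=80)
log.pack(fill="both", expand=True, padx=8, pady=4)

def on_run():
    cmd = entry.get().strip()
    if cmd:
        log.insert("end", f"> {cmd}\n{send_command(cmd)}\n")
        log.see("end")
        entry.delete(0, "end")

tk.Button(root, text="Run", command=on_run).pack(pady=4)
root.mainloop()

A Qt binding (PySide) or a browser-based frontend may scale better for a real GUI; tkinter just keeps the prototype dependency-free.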
7 votes -
Fun and dystopia with AI-based Python code generation using GPT-J-6B
7 votes -
An exploration of the types of subclassing and when inheritance is a good choice compared to composition, through the lens of Python
4 votes -
Obfuscating "Hello world!" in Python
7 votes -
New major versions released for the six core Pallets projects - Flask 2.0, Werkzeug 2.0, Jinja 3.0, Click 8.0, ItsDangerous 2.0, and MarkupSafe 2.0
7 votes -
Pyodide is now an independent project - The CPython 3.8 interpreter compiled to WebAssembly which allows Python to run in the browser, originally developed at Mozilla
9 votes -
Exploiting machine learning models distributed as Python pickle files, and introducing Fickling: a new tool for analyzing and modifying pickle bytecode
3 votes -
What is a class in Python?
I've been learning a bit more Python, going through a Udemy course to expand my skills a little. One of the programs the course guides you to make is a little dictionary, but it currently only runs once and then quits.
I'd like to adapt it to use a nice TUI that keeps itself open until the user specifies they want to quit, using something along the lines of npyscreen. However, this library uses classes, and that's not something I'm yet familiar with. I'd rather have an understanding of what classes are, how they work, and why to use them before I take the plunge and start fiddling around with npyscreen (although I'd be interested to hear if you think that I should Just Do It instead).
Can anyone give or point me towards a good explanation of the what, how, and why of Python classes? Or better yet, a tutorial that will give me something to write and play with to figure out how it all fits together?
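As a quick, concrete illustration of the "what" (the names here are made up for the example): a class bundles some data together with the functions that operate on that data, and each instance carries its own copy of the data.

class Dictionary:
    """A toy lookup table: the data (entries) and the behaviour (lookup) live together."""

    def __init__(self, entries):
        # __init__ runs when you create an instance; self is that instance.
        self.entries = entries

    def lookup(self, word):
        return self.entries.get(word, "no definition found")

# Each instance has its own state:
en = Dictionary({"python": "a large snake, or a programming language"})
print(en.lookup("python"))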
Thanks!
9 votes -
[Python] Buffer overflow in PyCArg_repr
5 votes -
Input from a text file, pull from multiple APIs, formatting output, etc. in Python
I don't need answers so much as an idea of where to start.
Essentially, I have a Google Sheet that uses importjson.gs to pull from the following APIs
- OMDB (IMDB)
- TheMovieDB
- TVMaze
I also use another script to scrape Letterboxd for ratings.
This works well, but sometimes it'll time out or I'll hit urlFetch limits that Google has in place.
Basically, I'd like to have a text file (input.txt) where I pop in a bunch of titles and years (or IMDb IDs), then the script runs and pulls set endpoints from all of these, outputting everything on one line (with a pipe as a delimiter).
My thinking is that I can then pull that info into a sheet and run all of the formatting, basic math, and whatever else so it suits my Sheet.
I have a feeling I'll be using requests for the JSON and beautifulsoup for Letterboxd -- or maybe a module.

Can anyone point me in the right direction? I don't think it'll be too difficult and it should work well for a first Python project.
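A rough sketch of the overall shape, assuming OMDb as the first source (the API key, the input format, and the chosen fields are placeholders to adapt):

import requests

OMDB_KEY = "YOUR_OMDB_API_KEY"  # placeholder

def omdb_fields(title, year=None):
    # Query OMDb by title (and optional year) and pull out a few fields.
    params = {"apikey": OMDB_KEY, "t": title}
    if year:
        params["y"] = year
    data = requests.get("https://www.omdbapi.com/", params=params, timeout=10).json()
    return [data.get("Title", ""), data.get("Year", ""),
            data.get("imdbID", ""), data.get("imdbRating", "")]

with open("input.txt") as src, open("output.txt", "w") as dst:
    for line in src:
        line = line.strip()
        if not line:
            continue
        title, _, year = line.partition(",")   # e.g. "Heat,1995"
        fields = omdb_fields(title.strip(), year.strip() or None)
        # ...append fields from TheMovieDB, TVMaze, Letterboxd here...
        dst.write("|".join(fields) + "\n")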
7 votes -
Python has accepted the proposal for a new pattern-matching structure, will be added in version 3.10
26 votes -
Ten awesome, rigorous, and curated Python interview questions
5 votes -
Installing and analyzing every package in PyPI to look for malicious activity
6 votes -
makesite.py - Simple, lightweight, and magic-free static site/blog generator
7 votes -
How to design a database?
I'm working on an application that allows a user to view playlists belonging to a particular radio show and stream/download/favourite the tracks in them. It has 4 core entities: User, Show, Playlist and Track.
- Each show has multiple playlists (one-to-many)
- Each playlist has multiple tracks (one-to-many)
To be able to reference a playlist belonging to a particular show, I gave those playlists the same uuid as the show they belong to. A few questions though:
- Is this the right/best way to associate data?
- As a track could potentially belong to multiple playlists, I can't take the same approach as I do for show/playlist. How would it be best to handle this? Ideally I would like to have a single "Track" table containing all tracks for all playlists.
For any experienced database designers out there, how would you structure this data? What would you consider in designing the schema and why? If I did go with 4 tables only, presumably there would be performance implications given the potential amount of data in any one of those tables, particularly tracks. If that is the case, how best to structure this kind of thing with performance in mind? Thanks in advance for any help :)
For reference, in case it's of importance, I'm using sqlite3.
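One common way to model it, sketched with sqlite3 and illustrative column names: give every table its own primary key, point playlists at their show with a foreign key, and use junction tables for the many-to-many relationships (playlist/track, user favourites) instead of sharing uuids.

import sqlite3

conn = sqlite3.connect("radio.db")
conn.executescript("""
CREATE TABLE show     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE playlist (id INTEGER PRIMARY KEY, show_id INTEGER NOT NULL,
                       title TEXT,
                       FOREIGN KEY (show_id) REFERENCES show(id));
CREATE TABLE track    (id INTEGER PRIMARY KEY, title TEXT, artist TEXT);

-- junction table: a track can appear in many playlists and vice versa
CREATE TABLE playlist_track (
    playlist_id INTEGER NOT NULL REFERENCES playlist(id),
    track_id    INTEGER NOT NULL REFERENCES track(id),
    position    INTEGER,
    PRIMARY KEY (playlist_id, track_id)
);

-- favourites are another junction table between user and track
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE favourite (
    user_id  INTEGER NOT NULL REFERENCES user(id),
    track_id INTEGER NOT NULL REFERENCES track(id),
    PRIMARY KEY (user_id, track_id)
);
""")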
5 votes -
Architecture for untrained software engineers (Python)
Hey everyone,
I've been programming for some time now but notice that without any formalized education in CS I often get lost in the weeds when it comes to developing larger applications. I'm familiar with the principles of TDD and SOLID - which have helped with maintainability - however I still feel that I'm lacking the ability to architect a properly structured system. As an example, I'm currently developing a Flask REST API for a website (just for learning purposes). This involves parsing an HTML response and serializing the result as JSON. I'm still quite unclear as to how to structure this sort of thing. If any more experienced developers could point me in the right direction/offer up their opinion I'd be very appreciative. Currently I have something like this (based - I hope correctly? - on Uncle Bob's clean architecture).
Firstly, I'm defining the domain model, i.e. the structure of the API response. Then, from the outside in:
- Infrastructure (Flask): User makes request via interface (in my case a request to some endpoint)
- Adapters: request object checks if the request is valid (on the way back it checks if the response is valid) - Is this layer only for error handling?
- Repository: I'm struggling a bit here. AFAIUI this layer is traditionally a database. In my case however, where the request is valid, is this where I should handle the networking layer, i.e. all the requests to return the website source? I'm also confused because at this stage I should be returning the relevant domain model, like an ORM would, but as my data is unstructured I need to transform the response first. Where would it be best to handle this?
- Use Cases: Here I transform the domain model depending on the request. For example, filter all objects by id. Have I understood this correctly?
- Serializers: Encode the domain model as JSON to return from flask route.
If you got this far, thanks so much for reading. I really hope to hear the opinions of more experienced devs who can steer me in the right direction/correct me should I have misunderstood anything.
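One way to see that layering in miniature (a sketch with hypothetical names, not a definitive reading of the clean architecture; assumes Flask 2+): the domain model and the use case are plain Python that know nothing about Flask, the repository is a swappable function that fetches and parses the HTML, and the route only adapts HTTP to the use case and serializes the result.

from dataclasses import dataclass, asdict
from typing import List

from flask import Flask, jsonify

# domain model: plain data, no framework imports
@dataclass
class Article:
    id: int
    title: str

# "repository": fetches/parses the raw HTML; easy to swap out in tests
def fetch_articles() -> List[Article]:
    # real code would request the page and parse it with e.g. beautifulsoup
    return [Article(id=1, title="example")]

# use case: pure logic over domain objects
def list_articles(repo=fetch_articles, article_id=None) -> List[Article]:
    articles = repo()
    if article_id is not None:
        articles = [a for a in articles if a.id == article_id]
    return articles

# infrastructure/adapter: Flask only translates HTTP <-> use case
app = Flask(__name__)

@app.get("/articles")
def articles_endpoint():
    return jsonify([asdict(a) for a in list_articles()])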
8 votes -
Passing the Same Parameters to Multiple Functions
6 votes -
What is the best way to teach Python for my 11-year-old sister that lives in another state?
This may seem an obvious question, but not as much as it seems. She uses Windows, I’m currently using Linux/macOS. How do I instruct her to install her Python environment? Should I use Zoom, Skype, Google Hangouts, or another solution? Is there an easy way to do live drawing (an online blackboard) to explain things to her visually? And, perhaps most importantly, how can I do that for free?
13 votes -
Write a Simple Steganography Program using Python
5 votes -
Variations on the Death of Python 2
8 votes -
JetBrains Academy - Learn to program in Python, Java, or Kotlin by creating working projects - Currently available for free
17 votes -
What's the current state-of-the-art in Python package creation/distribution?
I've been thinking on and off about packaging up a few simple Python utilities I've written to stick up on Github for people to use if they want, but, every time I go to check out how one goes about managing dependencies and all that for a project, I run into a whole wall of options. Does anyone better versed in all of this have any recommendations for me?
11 votes -
Python Web Scraping with Virtual Private Networks
3 votes -
JPMorgan's Athena has 35 million lines of Python code, and won't be updated to Python 3 in time
6 votes -
Jupyter Notebooks in the IDE: Visual Studio Code versus PyCharm
4 votes -
(ESR) Notes on the Go translation of Reposurgeon
8 votes