Which technical/technological issues or needs do you think should have been sorted out by now?
20 years ago I saw a computer scientist on TV saying that operating systems should come up with a better way to organize and present files, something that took into consideration the files we used the most and the ones we were likely to use again. Not just a recent files menu, but some form of AI prediction that would prepare our desktops with little intervention. This, of course, didn't happen, but I think about it from time to time. I would love to have an AI that would understand my workflow and do a bunch of things for me.
This is obviously way too advanced as an answer to this thread, but I'm curious: what did you expect to already exist in the field of computer science, but simply didn't pan out?
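The "files we used the most and are likely to use again" idea can be roughly approximated without any AI by combining frequency and recency into one score. The sketch below is only an illustration of that idea, not how any operating system actually does it: the function names, the `access_log` structure, and the 7-day half-life are all assumptions made up for this example.

```python
import time


def frecency_score(access_times, now, half_life_days=7.0):
    """Sum one exponentially decayed weight per past access.

    An access made right now counts as 1.0; one made half_life_days ago
    counts as 0.5. The half-life value is an arbitrary assumption.
    """
    half_life_s = half_life_days * 86400
    return sum(0.5 ** ((now - t) / half_life_s) for t in access_times)


def rank_files(access_log, now=None, top=5):
    """Return the paths most likely to be wanted again, best first.

    access_log is a hypothetical mapping: path -> list of Unix timestamps.
    """
    now = time.time() if now is None else now
    scored = {path: frecency_score(times, now) for path, times in access_log.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top]
```

A file opened five times two months ago ends up below a file opened once yesterday, which is roughly the behavior a plain "recent files" menu cannot express.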
After 2FA online banking came around 1999, I'm flabbergasted that a single global clearing house for transactions isn't even being talked about 20 years later.
Why does it take days to make many international money transfers in the 21st century? How is tax evasion through regular digital money transfers still possible? Why in the world is it easier to send someone a picture or a movie than money?
Why is airplane ticketing still based on platforms that run on both software and hardware that's decades old?
Paper checks are still a thing. Some places even require that you use them, or even digital checks. That boggles the mind. Why haven't modern solutions taken over this space?
How hasn't speech to text become good enough to be mainstream yet? People spend silly amounts of time typing things out slowly. Sure, there's the issue of multiple languages and accents, but the interfaces and the software around the algorithms are still so bad that speech to text isn't mainstream practically anywhere.
In terms of pure coding, why isn't everything that hits an external network always encrypted? That seemed like it'd be a no-brainer possibly even months after https started being a thing.
Huh. Just realized Brazil is more advanced than America when it comes to payment.
It doesn't.
It takes you days to get access to that money. The banks have access to it the entire time, and are collecting interest on it.
That's why it still takes you days to transfer money internationally.
If it works, why "fix" it?
I can give a little insight here as a computational linguist. Basically, any technology involving human language will only be robust if it leverages both rules (that must be hard-coded, e.g. phonotactic constraints like sonority hierarchies) and statistical language models (what is the likelihood that someone said “forgotten” vs. “four got in”). For any given natural language, the former requires a lot of linguistic expertise and the ability to encode the rules in a maintainable data format. The latter requires lots of manually annotated training data (across many domains, linguistic registers, and language communities).
The only entities that are capable of robustly attacking these problems from the statistical side are Microsoft, Google, Apple, Amazon, etc (often leveraging signal processing and models from smaller companies such as Nuance etc.). And, even then, they are statistical, so they won’t work all the time and when they do fail, they’ll fail in unexpected ways. And even more than that, when disambiguation requires world knowledge, they are more likely to fail. So, not only do you need a statistical model of language, you need statistical knowledge of the state of the world of your users, and possibly access to historical states.
In essence, natural language understanding is something that only healthy human brains are capable of at this point, and in broad terms, we really haven’t spent much time or effort on these problems relative to how difficult they are. They’re likely several orders of magnitude more difficult than you imagine, because you are a human who developed natural language proficiency naturally through social interaction with other humans. Computers are at a severe disadvantage there, so it will take time for linguists and software engineers to build the resources and technologies to make systems capable of learning natural language proficiency and world knowledge as robustly as humans have evolved to.
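To make the statistical side concrete, here is a toy bigram model that chooses between "forgotten" and "four got in". Every count in it is invented for illustration; a real system would estimate them from the large annotated corpora described above, and would combine this with acoustic scores and rules. The names `BIGRAM_COUNTS`, `log_prob`, and `pick_transcription` are hypothetical.

```python
import math

# Hypothetical counts, made up for this example only.
BIGRAM_COUNTS = {
    ("<s>", "i"): 50,
    ("i", "have"): 30,
    ("have", "forgotten"): 12,
    ("have", "four"): 1,
    ("four", "got"): 1,
    ("got", "in"): 4,
    ("forgotten", "</s>"): 10,
    ("in", "</s>"): 6,
}
CONTEXT_COUNTS = {"<s>": 50, "i": 50, "have": 30, "forgotten": 12,
                  "four": 3, "got": 8, "in": 20}
VOCAB_SIZE = 8  # distinct tokens above, including <s> and </s>


def log_prob(words):
    """Log-probability of a word sequence under an add-one-smoothed bigram model."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        count = BIGRAM_COUNTS.get((prev, cur), 0)
        context = CONTEXT_COUNTS.get(prev, 0)
        total += math.log((count + 1) / (context + VOCAB_SIZE))
    return total


def pick_transcription(candidates):
    """Return the candidate word sequence the model considers most likely."""
    return max(candidates, key=log_prob)
```

With these made-up counts the model prefers "i have forgotten" over "i have four got in", but it only knows what its counts tell it. That is the sense in which such systems fail in unexpected ways once the input drifts away from the training data.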
This one I can probably answer. Because powerful people doing the tax evading want it to be possible.
Airlines compete a lot and cut costs regularly. They don't want to shell out for a new system if the old one still works half of the time. And they don't want downtime as they install the new system.
I don't want to look like an idiot talking to my phone as I walk around, I guess that's a phone call though but you get the point. And I don't want to talk to my computer at all. Also consider these:
And what if I actually wanted to use the words equals or apostrophe?
Speech AI requires training, and that usually results in my voice being sent off to some company so that they can use it to train the AI, as Google has just been shown to be doing.
I make fewer typos than the speech AI will make mistakes, but I guess it will improve with time.
Gaming: you can't say "w w w w a a a mouse1 mouse1 mouse1". Perhaps speech to text for chat would actually be kinda cool, but it needs to be accurate, and voice chat already exists.
I like my mechanical keyboard
keyboard shortcuts and macros won't work anymore
imagine offices with everyone talking to their computers all the time
There are probably better educated people than me to answer this, but I think one of the main reasons is legacy support. Old websites from before https was common still exist and the owners are unlikely to bother to update them to support it. If you require https, these sites disappear (well I guess there is the web archive). Plus most of these sites don't really need https.
And I think the other part is that companies cut costs and are incompetent sometimes. I don't think they care too much about security until money gets stolen or networks go down.
I guess you did say "In terms of pure coding" though, which I assume means "is there a method to do this securely" in which case I think pretty much anything on the internet has the capability to be securely transmitted. Could be wrong though, if anyone can point out some examples.
Purely on the use of voice in gaming, I use a macro program called Voice Attack, and you can trigger your macros by voice command. It's not so great for movement controls, but it's been a wristsaver for me.
A lot of this is just personal preference. I care about looking silly on my phone and people overhearing what I say. You don't.
True that most people are not programmers, but I am (kinda) so that's how I look at it. For most this is not an issue. And as for the blind programmers, they use dictation because they are blind, if they were sighted they would almost certainly use a keyboard.
I think you are looking at this from more of a mobile perspective. And I can see it being more useful there for some people, and they are welcome to use it, but it just isn't for me.
A new POSIX. We're hamstrung by POSIX being just good enough which means there's not quite enough of an incentive to make something more modern, more elegant, and more generalized. The Redox project is one which looks promising among others but it's far from usable right now.
That's Plan 9. It was elegant, broke the mould, brought UTF-8, procfs, and 9P into existence, and… totally failed as a popular OS.
Plan 9 is a great example of what I'm trying to say. It was evidently better than POSIX, but POSIX was just good enough that the transition cost of moving to Plan 9 was too much of a hurdle.
Nothing more permanent than a temporary solution. I think that explains a lot of these.
We have a joke in my small software shop that a permanent solution lasts around 5 years and a temporary one 12.
POSIX is quite generalized... Purposefully so. It's a portable operating systems standard.
Perhaps I didn't convey myself well. I think a new POSIX would likely feature better abstractions than simply 'everything is a file'. A system designed from the ground up with structured filetypes as the default would likely allow a far richer and more intuitive command line ecosystem too.
This has already been tried (WinFS among many others). It's a cool idea that never pans out in reality.
Filesystems are already complicated enough when you just need "sequence of bytes" semantics from a file. Adding in structure / pseudo-database features at the OS / filesystem level makes it way too complicated. The "right way" to do it is have the structured data layer at the userspace level, using SQLite for example, or JSON/XML/etc for simpler files.
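As a rough sketch of what "structured data layer at the userspace level" can look like, the example below stores a document as rows in a SQLite database, which the filesystem still sees as an ordinary sequence of bytes. The table layout and function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def open_document(path=":memory:"):
    """Open (or create) a document stored as a SQLite database file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paragraphs ("
        " position INTEGER PRIMARY KEY,"
        " style TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def add_paragraph(conn, position, style, body):
    """Insert one paragraph; the 'with' block commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO paragraphs (position, style, body) VALUES (?, ?, ?)",
            (position, style, body),
        )


def headings(conn):
    """Query the structure directly instead of parsing text."""
    rows = conn.execute(
        "SELECT body FROM paragraphs WHERE style = 'heading' ORDER BY position"
    )
    return [body for (body,) in rows]
```

The application gets queries and transactions, while the operating system only needs plain byte semantics to copy, email, or sync the file.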
That also allows portability - a MS Office .docx file is a structured ZIP file following a certain standard. If it was instead a structured file using NTFS features, you couldn't open it on a Mac or Linux machine. Can't email it to someone (unless there's a way to export/import it to a "plain" byte-oriented file), can't save it to a Dropbox folder unless Dropbox also understands the NTFS structure.
I do understand all the complexities. But text is such a clumsy way of passing data. It just about does the trick, if you're good enough with sed and grep to pull out the bits that you want. But wouldn't it be much nicer if simple, does-one-thing-well utilities fed into each other in an elegant, semantically meaningful, intuitively scriptable waterfall of data?
What happens when someone comes up with a new structure of file type? How do you propose all future structures be accommodated for? Can you even possibly imagine what file structures would be 30 years from now?
POSIX managed that, fwiw.
And really, what sort of better abstraction do you think could work?
A wild competing standard appears.
Where do I start?
Bookmarks. Tabs. They are more often than not used as a "read later" list, but the former can break at any time, while the latter use memory and CPU time. A better browser history could help here, saving not just the time of the last visit, but also a snapshot of the page and the text it contains.
The snapshots could be limited to the most recent ${N-100}, while saving the text would allow for full-text search over one's browsing history - why not allow bookmarking of searches, at that point?
Then, a lot of the ideas from OLPC are very interesting - the Journal, for example, which makes the activity the basic unit of information, instead of the file. An activity can be a photo, a document, a browsing session, a note, but it's not just a blob of bytes - it contains the state of the app where it was left, it can be assigned a name, a note, and tags. Activities rise up in the journal when you open them again, or slowly fall to the bottom otherwise, which makes deciding which ones to prune or archive easier.
I don't think I would work better with this model, but I can see it working pretty well on phones and similar devices. And yet, nope: hierarchical file systems pretty much everywhere.
Why is there no real unified interface, even just for the command line? Everyone parses things like command line options in a different manner. Short options can be collated (-ax) or not (-a -x), and some programs (tar, for example) don't even require a hyphen in front of them. Long options can be there or not, sometimes they require a double hyphen and sometimes only a single one. Some programs use subcommands, some just split their executables.
Sure, every decent programming language will have an argument parser in its stdlib, but their implementations will usually behave differently, and why do they even need to provide an argument parser in the first place?
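To illustrate one such stdlib parser, here is a small sketch using Python's argparse, which accepts the collated form, the separate form, and the long form of the same flags. The flag names and the `demo` program name are made up for this example.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical boolean flags."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-a", "--all", action="store_true")
    parser.add_argument("-x", "--extended", action="store_true")
    return parser


# Three spellings of the same request.
collated = build_parser().parse_args(["-ax"])
separate = build_parser().parse_args(["-a", "-x"])
long_form = build_parser().parse_args(["--all", "--extended"])
```

All three calls produce the same namespace here. A tar-style bare `ax` with no hyphen would not be accepted by this parser, which is exactly the kind of divergence between tools that the complaint is about.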
Window focus. I absolutely hate it when I'm launching a program while trying to type something. Some of them have little splash screens, then updating screens, then opening for good. Each time it hijacks my focus and sometimes causes me to just trigger random things bc of my inputs.
What's your OS? On Linux KDE/KWin (and presumably others) have good support for "focus stealing prevention" options, allowing you to configure the rules for when and how windows can gain or lose input focus.
It's even possible to set application- or window-specific options to control placement and focus rules or exceptions individually.
There might be something similar on Windows, I know there's at least an option for focus-follows-mouse behavior, but it's a bit flaky IIRC.
Windows, unfortunately. After my old laptop died I figured I should try windows again to see if it's still just as triggering. I've got a really stripped down version of LTSB and after a lot of tweaking, got something quite usable. But now I'm just left with all those minor "windowsy" bugs that aren't major enough to get me to switch back, but are really starting to pile up.
I really ought to just dive in and re-install though. I'm just so unsure of what I'd need to ferry over and whatnot at this point, and there's quite a few utilities I have running which are unique enough to not have linux equivalents.
Thanks for letting me know about KDE/KWin though, it'll make it a lot easier to choose a starting distro. I'm quite a novice in that regard so never too sure what the pros and cons are between em.
Dude, I totally agree. It is honestly the most infuriating thing I experience day to day computing. I don't understand the behavior either. I can't think of a single reason I would ever want an application to automatically steal focus from me. Flash a notification, sure, I'll click on it when I get around to it, but why would you just assume I now want to interact with your stupid ass program just because it's finished loading, or has an update, or literally any reason other than physically clicking on it?
If I made a list of things I hate about windows, there would be a lot of equal firsts and this would be one of them. It's much better (or at least controllable) on GNU / Linux
I imagine some of these are windows specific, but I remember hearing a complaint about how windows deals with errors for certain tasks. Say you're trying to delete or move a lot of files, you select a set of folders and click delete. If windows runs into a problem deleting a file the whole show comes to a stop. But why? Shouldn't windows at least skip that file and continue deleting all the other files? Then you can tell it how to deal with the problem files.
There's a thing you can get on ninite.com that lets you do unsupervised transfers, also 90% of the time it's faster than windows at doing batch operations too.
That makes sense. Linux is the same. IDK if there's a technical explanation for that behavior.
Fail quickly, and fail noisily. It's a UNIX principle.
But why does the entire string of operations fail? The single operation on that file should fail quickly and noisily but the rest should continue.
If I were to guess, I'd say it's intended to be (or at least emulate) a database-style transactional behavior, where either the operation (delete everything in this folder) completes completely successfully or else fails entirely and returns the system to precisely the state it was in before starting the operation.
This is generally a good thing, since it avoids leaving things in an inconsistent state, for example when attempting to remove an installed program, it should either succeed by removing everything or fail by removing nothing (and thus leave the program still in a valid runnable state with its dependencies and/or uninstaller scripts intact).
In practice though, I agree that the more common case of "delete these random things in my Downloads folder that I don't care about anymore" should probably be treated as a series of individual operations that can fail or succeed separately.
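The "series of individual operations that can fail or succeed separately" approach might look something like the sketch below. It is a minimal illustration, not how any file manager is actually implemented, and `delete_each` is a hypothetical name.

```python
import os


def delete_each(paths):
    """Try to delete every path independently.

    Returns (deleted, failures), where failures is a list of
    (path, exception) pairs, so the caller can report or retry
    the problem files after everything else has been handled.
    """
    deleted, failures = [], []
    for path in paths:
        try:
            os.remove(path)
            deleted.append(path)
        except OSError as exc:
            # Fail noisily for this one file, but keep going.
            failures.append((path, exc))
    return deleted, failures
```

Returning the failures alongside the successes also answers the "which files were deleted?" objection: the state after the operation is fully described by the two lists.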
So I just did some experimentation on a folder with some items I couldn't delete and windows is kind of interesting. Folders are treated somewhat like a transaction; if anything within a folder can't be deleted nothing will. However, if you select multiple items and folders, windows will actually go ahead and delete everything it can and let you know about the items it couldn't delete at the end.
Because it would leave the operation in an indeterminate state (Which files were deleted?)
That makes sense too.
I'm continually amazed that the operating system situation on ARM platforms is still such an utter disaster compared to x86. In spite of minimal hardware variation in mainstream devices, it's rarely, if ever, possible to install arbitrary operating systems without manufacturer support and device specific images, and tens, if not hundreds of millions of devices annually are dropped by manufacturers and left running outdated software. It's insanity. I don't need to get device specific updates or install images from Microsoft or some Linux distro for my PC, and am free to install the latest version of their OSes on whatever hardware I like, usually without issue, even though my system could be made of a hodgepodge of any combinations of parts from the past fifteen years. Somehow that isn't possible on a platform that usually consists of millions of devices sharing the exact same hardware configuration.
Also, smartphones have been used by billions of people for over ten years, how is there still only one general purpose OS and barely a custom ROM scene anymore?
I feel like these problems are slowly being addressed with true GNU Linux phones like Librem 5 and the PinePhone coming out, as well as efforts by Google to take device manufacturers out of the OS update loop like Project Treble, but it sure fucking took long enough. It's not like phones were invented in a vacuum, but they really took their sweet time reinventing the wheel and getting to the point desktop operating systems were at twenty or thirty years ago.
Around the time of Dragon NaturallySpeaking becoming a thing, I was expecting a systems-level voice API that would allow complete control of the OS and applications on top. Something like "Steam: Download Borderlands 2", or "Word: Save document" with a simple audible command.
We're almost there with mobile and voice assistants, but desktop systems are still much behind. I still feel like this could be a huge boon to productivity.
The comment from @HanakoIsBestGirl best summarizes the problem with speech-controlled interfaces:
And I feel each point is enough to be skeptical about this technology, but I am already on the side of deliberately not using the Google assistant for usability and especially for privacy reasons.
Sorry, but I think that comment is really missing the point of the suggestion. I wasn't talking about using voice for typing, but issuing commands. It wouldn't make sense to use voice input for typing unless you're a particularly slow typist, or for accessibility reasons.
However, having your computer perform a specific task without changing contexts could be a significant efficiency boon. I gave some suggestions above like saving documents or downloading a game through Steam. These could be performed without interrupting what you're already doing. How about launching a program, or finding a file?
Why not? I talk to my home assistant in just the same way and it feels natural and convenient. I check the time in the mornings. I play music when doing dishes. I set reminders and ask basic questions throughout my day. I don't see why it'd be weird to talk to your computer. Cultural norms adapt quickly to quality-of-life improvements.
I mean, who is even suggesting this? That's definitely not the point. If it even made sense to implement such an API in games it would look like "walk forward" or "open chest", not entering raw button inputs. But even that would only make sense in cases of extreme accessibility issues and would not be applicable to all games.
Hi I'm the guy who fandegw quoted.
I did originally consider replying to you with the message that I posted elsewhere, but I left a different message instead as I don't think this one quite fit what you were asking for (see your first paragraph).
But now that we are here, I may as well bring up a few things.
That's my personal preference, if you want to that's fine. I actually mutter at it a lot and if it could hear those profanities it might just lock me out. What I mean to say is that it would misinterpret my muttering as commands. I simply wouldn't use it when macros already exist. (I'm repeating myself here, I already made this argument to you in my other comment). This is very much personal preference and it just isn't for me.
And as for gaming, I seem to have forgotten to write it, but I meant for existing titles that are unlikely to receive an update to support such voice functionality. I agree with :
I have to admit I overlooked the exact point and arguments you made, because I projected my own preconceptions about the subject while reading between the lines of the original comment.
I also want to admit that this was surely a lazy comment on my part, which is exactly what Tildes is trying to prevent and why I keep coming back here regularly. So double sorry on my part.
What I felt were good points were the parts about shortcuts and macros. I feel the whole point about issuing commands could be handled with good keyboard shortcuts. The example of saving a document in Word could be done by Alt-Tabbing to Word, pressing Ctrl+S to save, Enter for the eventual dialog, and Alt-Tabbing back to the original window.
And while this is a bit much to learn when you start down the shortcut route,
I feel that remembering a list of commands to do precise things with precise apps could easily become more of a hurdle than using the same shortcuts set by your OS.
These eventually become automatic: for example, all context switching on any OS that I know of (Windows and Linux, I don't know about Mac) is doable with the same simple shortcuts.
Remembering these is enough to speed up any task on any computer.
Speech interaction could be made better, like what Google Assistant is trying to do by supporting all kinds of phrases for the same action. But I feel that taking this route at the moment poses all kinds of privacy problems, and even working-conditions problems for the human workers who have to listen to these phrases.
There are also problems with the shortcuts approach, such as apps not implementing them (like your example of downloading Borderlands 2 on Steam; I don't know if there are any shortcuts in Steam), or questionable choices of key combinations, like using Ctrl+S for something other than Save.
But I feel it's a small price to pay compared with the complex problems arising from the Google approach.
That would be awesome.
All fun and games until someone walks up and says "terminal: sudo rm -rf --no-preserve-root /" or perhaps just says "close without saving" or "shut down". Is the solution voice recognition? It isn't very good yet and would either make the computer unresponsive as it falsely thinks I'm not me, or would be too responsive and falsely think others are me.
And how would it differentiate normal speech from speech directed at the computer? If I tell someone to "close that window" (meaning a real window) and my computer closes whatever I'm doing, I'm going to be annoyed. And if you suggest a mic toggle button, I'm going to suggest a macro tied to that button instead.
I see the use but I don't want a 24/7 microphone + macros / keyboard shortcuts already exist.
These are genuine concerns, but I'd argue not unsolvable problems.
To the question of someone else saying "close without saving", voice confirmation is actually already used in voice assistant devices. While anybody may be able to play music or do a quick lookup, sensitive tasks like accessing reminders do require verifying your voice. I've run into this occasionally when running water which might distort my voice.
I'd like to think that highly-destructive commands like rm -rf --no-preserve-root would probably not be added to such an API regardless. :)
My suggestion would be a wake word (see Star Trek's "Computer:"), a quick invocation shortcut (button on mouse/keyboard), or more interestingly a different tone of voice to denote a command. It's a new world and there's still plenty to explore on that topic.
I make heavy use of macros and keyboard shortcuts as well, but most people aren't "power users" like us Tildes readers.
The rm -rf thing was more of a joke, just to demonstrate the potential for annoyance from other people who think they are funny, even just the command "minimise window" would get annoying if someone yells it at you. But people will be people. And if the voice commands are restricted then they become less useful. Keyboards can type powerful and useful but dangerous commands and can use a password for security... voice not so much, I don't want to say my password out loud.
I already mentioned voice recognition in my previous comment. It just isn't reliable enough. It'll either be too strict and be annoying or too lax and be insecure.
Wake word maybe. But it doesn't fix the 24/7 mic. Tone of voice? In the future when computers can accurately detect it.
It'll be a good feature for those that want it. But it's not for me at all and I (perhaps selfishly?) don't want it taking focus or usability away from the keyboard and mouse.
Phone battery life. If only there were some way of swapping batteries on the go. oh wait...
Not in CS but toilets.
What about them?
So many problems:
1 - 5 are issues of economics, not technical or technological issues. Toilets don't pay for themselves except through positive externalities (like not having human shit on the sidewalks, or having stairwells and elevators that don't reek of piss), so few organizations will pay for enough and pleasant enough toilets.
There are also already toilets where you don't urinate or defecate into a bowl of water, so there's no splash. In those, you get a jet of water after the fact to wash everything away. The lack of implementation in the West is due to cultural resistance paired with a large existing infrastructure base of the other sort, not for technical or technological reasons.
Is there something in the "..." that is a technical or technological issue, as opposed to an economic or cultural one?
Go on...