What's an achievable technological, scientific, or computational breakthrough that you're really looking forward to in the next fifteen years?
Title! Anything goes, both minor and major developments, as long as they can conceivably happen in the next 15 years.
There are three things I'm looking forward to.
High-Refresh e-ink displays
I've been using some e-ink displays with higher refresh rates, like the Remarkable 2, and it's great. But in 15 years we might have full-color e-ink displays with 60 Hz refresh rates, and that would be a game changer, even if you're only thinking of animated art displays hanging on your wall. I'd be interested in seeing what other uses people can think of.
Cheaper Improved Batteries
This one's pretty basic. I've been wearing a Bangle.js 2 for a couple of months, and it's been really interesting to have a device that only needs charging every 3 weeks or so. If battery technology improves and we can extend that kind of lifespan to laptops, phones, etc., I think we'll begin to see interesting changes in how people use their devices. Also, if batteries were cheaper, I might be able to buy an electric car some day.
Solving the vergence-accommodation problem in VR
This is a big one for me because I've been avoiding purchasing VR because of it. I believe it also marks a milestone for VR where the technology crosses from in-development stage to a finished product that just needs iterative improvements.
Fully homomorphic encryption: the ability to send totally encrypted data to a remote server, and have it do calculations and processing on the data while still encrypted. I expect it’ll be an absolute game changer for privacy and security as it moves from current prototype stage to real-world application.
https://en.wikipedia.org/wiki/Homomorphic_encryption
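To make the idea concrete, here's a toy sketch. It uses textbook RSA, which is only multiplicatively homomorphic (and not at all secure as written) rather than the lattice-based schemes real FHE uses, but it shows the "compute on ciphertexts without decrypting" trick:

```python
# Toy illustration only: textbook RSA with tiny primes, NOT secure.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (Python 3.8+ modular inverse)

def enc(m): return pow(m, e, n)
def dec(c): return pow(c, d, n)

a, b = 7, 6
c = (enc(a) * enc(b)) % n      # multiply the ciphertexts, never decrypting
assert dec(c) == (a * b) % n   # ...yet the result decrypts to a * b = 42
```

Fully homomorphic schemes extend this so that both addition and multiplication, and therefore arbitrary circuits, can be evaluated on ciphertexts.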
I've heard about this. IIRC IBM had an implementation of this they were touting a few years ago. I'm having difficulty understanding what a real-world application that would benefit from this would look like. I.e., I have some personal data, say my health records from my PCP. I homomorphically encrypt this data. Now who do I share it with, and why is it useful for them to perform calculations and processing on this encrypted form? And why should anyone else offer to run such calculations and processing on my data instead of me running it myself? It also raises the question of how you would verify that an application processing such data performed correctly/accurately. Won't you still need unencrypted data to validate this stuff anyway?
I’m sure it would enable plenty of governments and businesses to use cloud computing resources for sensitive data?
Why can’t they use regular encryption for that?
I'll give you a highly relevant example where this would be useful.
There are certain engineering simulation models (military, nuclear, etc.) that cannot be viewed by individuals working outside of the country the model originated in. So, if a colleague in France receives a model that gets solved on hardware in France that American employees also have access to, that's a potential issue if the organization has an embedded threat actor. I think they call it corporate sabotage or something like that. It's better than encrypting at-rest. Sure, you can encrypt a model when it's not being solved, but at some point it needs to be decrypted to solve and postprocess.
But another example that's more likely is trade secrets, especially in the current cloud computing landscape. Let's say a company has secret materials for a tire, and they regularly simulate it - something like Formula 1 tires, where this wouldn't be outrageous. Except now they want to simulate a really detailed model and need a cloud HPC system. The tire company has to trust that the cloud provider won't be breached and their material data won't be leaked. With today's technology we can encrypt the material data, but when the model is solved we have to temporarily decrypt it to allow the solution to proceed, then re-encrypt afterwards.
A similar situation occurs if the tire company sends the model out for some consulting service. Perhaps they don't want the consultant to see the material data, and don't trust them. Encrypting the model would be ideal.
I understand that material data can be reverse engineered in simulation, so this isn't a watertight example, but it's relevant to what the industry is asking for right now.
The need for encryption would have to be stronger than the need for computational speed, though.
Ok so this is useful for cloud service providers who want to convince customers to send their data to the provider for processing while remaining encrypted. For individual people, then, this seems like it isn’t interesting. Really only useful for situations where your data is proprietary and you don’t own your own machine big enough to do useful things with it yourself.
From an abstract, theoretical view (which is to say, knowing the basics of encryption, but not the details of homomorphic encryption), I would hazard the guess that the result of the computation in its encrypted form may not betray anything about its decrypted contents, or the decrypted inputs. Basically, if you got your hands on encrypted tyre simulation results, you shouldn't be able to tell what experiments they performed in simulation, nor what tyre compounds they tried; otherwise the encryption would be brittle.
Coincidentally, that comes (if my assumptions are correct) with a few more or less nasty corollaries. I'm assuming here further that the encryption is symmetric, as asymmetric opens other (very interesting) cans of worms. But if I can get my hands on the simulation code and the simulation data, knowing the code shouldn't compromise the data. Since the computer running the code in the encrypted form is not trusted, it stands to reason that anyone could perform that computation. And if that simulation code has nasty little effects such as stopping early if a condition is met, I would be able to observe that by just tracking how long the simulation runs. For example, if the simulation stops in case of tyre blowout, then I should be able to detect if your tyre compound is shit by seeing how long the encrypted simulation takes. I have not broken the encryption, but I still know something about what's been encrypted. That's real bad.
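To make the leak concrete, here's a toy sketch; the "simulation" and the numbers are made up, and the only thing that matters is that the runtime depends on the secret input:

```python
import time

def simulate(tyre_strength, max_steps=2_000_000):
    # Hypothetical stand-in for a solver that stops early on "tyre blowout".
    total = 0.0
    for step in range(max_steps):
        total += step * 1e-9            # pretend work
        if step >= tyre_strength:       # data-dependent early exit
            return "blowout", total
    return "survived", total

for strength in (50_000, 1_500_000):    # weak vs. strong (secret) compound
    t0 = time.perf_counter()
    simulate(strength)
    print(strength, round(time.perf_counter() - t0, 4))
# The weak compound finishes visibly sooner, so an observer who sees only
# wall-clock time learns something about the (notionally secret) input.
```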
To compensate, evaluation would have to continue until every possible execution path has completed, which means you have to follow every possible path all the way to the end irrespective of whether the data actually takes it. In general, that's an exponential number of paths of difficult-to-grasp depth. That would be seriously intractable. That can't be it. I must have taken a wrong turn somewhere.
Anyone around who knows how homomorphic encryption should work who can sort out my assumptions?
You're hinting at something that is the real heart of the issue. But, I'll note a few things for context.
Simulation time is basically meaningless, since many factors (hardware, product version, etc.) could change the runtime. But so can mundane factors, like how dense the mesh is. You may have 3 models that all run for different lengths of time, but only due to mesh density. Also, if an unstable model has been created and the solver fails to converge on specific hardware, therefore stopping early, there's still nothing to glean from this simulation unless you can read the program output - and even then, you may not know much. So, in that way, you may know something, but the tangible effect of what you know is not actually important. However, if you could circumvent the encryption to reverse engineer encrypted material data, then the encryption is obviously not that effective. But then, what's stopping you from doing a cleanroom reverse-engineering process of the exact same thing?
With that in mind, I don't think it's worth surmounting the complexity of tracking execution paths for the example I gave. However, I'm keen to know if there are better examples!
I have also always wondered this - sure, being unable to differentiate 1 + 2 = 3 and 5 + 2 = 7 is great, but how is it useful if you're doing analysis of the execution of a larger program? Are conditionals encoded using functions like min or max, or is the control flow clear with only data being encrypted?
Realistic, specific, and game changing. Great!
There are two innovations I really want to see in the video game world, particularly for Bethesda-style RPGs: dynamically generated, emergent quests and narratives, and real-time speech synthesis for NPC dialogue.
These are big problems to solve but I think there have been good advancements toward both already. Bethesda's been dynamically generating endless quests for years, not main quests but still. The nemesis system introduced in Middle-Earth: Shadow of Mordor is an attempt at something like this. And of course there are deep-simulation games like Dwarf Fortress and Crusader Kings that we can look to for examples of compelling emergent narratives. To the speech synthesis point, home assistants (Siri/Alexa/Google) have made great strides. I've also heard unbelievable voice results from neural networks but those are still computationally intensive. Anyway, 10-15 years may not be an unreasonable timeframe for both ideas to mature, and I hope they do.
Yeah, as far as speech synthesis goes, the neural text-to-speech (NTTS) models I've seen, especially from Microsoft, make me think there will be a lot more YouTube channels autogenerating content from text and stock photos/videos.
Some of MS's previews here are really good.
Narrative engines are something I think a lot about for sure, but not just for fictional worlds. We'll eventually need only to define certain parameters for the machine to write an entire thesis or dissertation that is both logical and persuasive.
We'll be able to prototype arguments before dedicating our entire lives to them. The machine will point out the logical errors in our reasoning even before we commit it to paper.
I think that narrative cohesiveness will come far before logical coherency displayed in a human language. Just talking with an NPC or following a quest has a much lower barrier to acceptability than any sort of logic checking.
De-extinction: with the advances in genetics over the last 20 years, I’m very interested in the potential for de-extinction of various species (my personal candidate: Carolina Parakeets). We already technically de-extincted the Pyrenean Ibex (a subspecies of Spanish ibex) for seven minutes, and there are proposals to de-extinct and release passenger pigeons by 2030.
Fusion: ITER is supposed to have first plasma in 2025, with D-T operations beginning in 2035. Even if it isn’t the basis for a future power plant, ITER will still give us more insight into net-positive fusion, and will be important for validating the engineering and operational challenges of future reactors and designs.
Nuclear power: The Chinese already have their HTR-PM pebble bed reactors running, but more generally I’m very excited and curious to see which next-gen designs (pebble bed, molten salt, etc) are built, operate the most smoothly, are most economic, etc.
I’m also interested in a potential spent fuel reprocessing method that avoids organic solvents entirely; it uses crystallization to separate the actinides (including uranium) from the fission products, eliminating the risk of a red-oil explosion, simplifying the flow sheet, and being more proliferation resistant than standard PUREX methods.
Lab-grown meat at commercial scale.
I'm on board with this one. Both the animal-cell-petri-dish variety and the pseudo-meats like Impossible and Beyond Meat. People are not going to give up meat voluntarily until the climate crisis has literally flooded them out of their house. So I'm hoping either of those two meat alternatives reaches price parity soon, and then out-competes "real" meat.
Flat panel displays that are fast enough, high resolution enough (filters could get REALLY sophisticated), and have black levels good enough to approximate both the quirks and the genuine advantages of CRT monitors. I swear I'm not holding on to these things for nostalgic value. I'm hoping OLEDs with black frame insertion or something get there eventually.
Alternatively the revival of CRT technology in a modern form would do what I want, but that's highly unlikely to happen regardless of how desirable they are for enthusiasts. The expertise is quickly disappearing...I've seen some interviews with experts and honestly the people who could make this happen are all either out of the industry or dead.
I'm super disappointed that everybody working on display tech seems to have just straight up forgotten SED technology. It was a flat-panel display tech that showed a lot of promise in the late 2000s, essentially operating as a grid of tiny CRTs used as pixels, and it was better than LCDs in essentially every way: power efficiency, response time, viewing angle, contrast ratio, etc. It's only with the development of micro LED tech that we're beginning to see something that even vaguely competes, but with the extreme difficulty of manufacturing, I honestly still think of SEDs as having more potential.
Having discrete pixels eliminates many of the key advantages CRTs had, so I'm not sure SED displays would be all that exciting for retro gaming. I think the resulting product would be similar to a plasma TV. I'm guessing research fell off a cliff because OLED was on the horizon.
That's true, I do like that it has all of the other advantages and lacks OLED's burn in issues. I do wonder how the power efficiency compares to OLED and uLED. That said I keep some CRTs around for retro gaming myself for the resolution thing.
Unfortunately, SED's reliance on phosphors would give it the same burn-in problems experienced by CRTs and plasmas.
I wish Hisense's ULED TVs had taken off; 2021 was their first and last model year. It's a shame their first model had a minimum of some 60 ms of input lag, since that ruins them for games.
CRTs burn in? Huh, didn't know that. Did it require more time than plasmas or OLEDs?
Yep, they burned in. The phosphors wear out with use, just like they do in plasma TVs. It's why screen savers used to be popular on computers; running 15 minutes of moving images on the screen could help limit the effects of burn-in and clear temporary image retention.
I think burn-in on CRTs is just about as bad as on a plasma, however it is probably less noticeable because your typical CRT is a lot smaller than your typical plasma.
I'm looking forward to a better understanding of how our gut flora affect our health - from mental health to obesity and auto-immune disorders, the next medical frontier is our stomach and intestines.
There's a company called Colossal Biosciences that is aiming to de-extinct the Woolly Mammoth by 2027. It was founded by George Church (who has his hand in just about every bio-* related venture under the sun), so take it with a grain of salt. I don't know if it's actually achievable or not, but it doesn't seem like this should be impossible with sufficient funding and effort by genetic engineers.
Let's talk a bit about AI and machine learning.
I expect the current scaling up of neural networks can't continue for much longer. Current models consume more and more resources to train, in terms of data, power, and hardware. There's a variety of things that stand to break this "monopoly".
For one, I could see gradientless optimization taking off. Instead of following the derivative toward your objective, some kind of approach that skips all the nastiness of convergence, ideally using only information locally available in your computation, could help things along a good bit.
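As a minimal sketch of what I mean, here's a plain random-search loop standing in for the fancier evolutionary or zeroth-order methods; the objective is a made-up example:

```python
import numpy as np

def objective(w):                        # hypothetical loss to minimize
    return float(np.sum((w - 3.0) ** 2))

rng = np.random.default_rng(0)
w = rng.normal(size=5)                   # current candidate parameters
best = objective(w)
for _ in range(2000):
    candidate = w + 0.1 * rng.normal(size=5)  # local perturbation, no derivative
    score = objective(candidate)
    if score < best:                     # keep the candidate only if it improves
        w, best = candidate, score
print(best)                              # approaches 0 without any gradient
```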
For another, exploiting symmetries: a lot of the tasks we ask AIs to do have enormous amounts of symmetry that is ripe for exploiting. The general principle here is "if you treat entity X using computation Y, then you should treat X' using Y as well." The complicated part is figuring out that X and X' are related. A naive neural network approach might try to learn Y twice (call it Y and Y'), but that way you split your training data: if you know about the symmetry, Y is trained on both X and X'; if you don't, Y is trained on X and Y' is trained on X'. The consequence is larger networks supported by less data.
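A toy sketch of that principle is weight sharing under translation symmetry: the same small "patch scorer" Y is applied to every patch of every image, instead of learning a separate Y' per location from a fraction of the data (the scorer here is a made-up stand-in):

```python
import torch
import torch.nn as nn

# One shared scorer Y, applied to every non-overlapping 8x8 patch.
shared_patch_scorer = nn.Conv2d(1, 1, kernel_size=8, stride=8)

x = torch.randn(4, 1, 32, 32)            # batch of hypothetical 32x32 images
patch_scores = shared_patch_scorer(x)    # shape (4, 1, 4, 4): one score per patch
# Every patch's gradient updates the same weights, so all the data supports one Y.
```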
As a third one, inducing structure and/or known computations. We often have a good concept of how we want our AI to compute the target at a high level; we just can't express the details using the available tools. What we do is shove X into a numerical swamp and get Y'. We want Y instead, so we coax the numerical swamp accordingly. If instead we could write down how the swamp is supposed to work, we could understand better what's going on and what's going wrong. Consider a problem such as the following: given a photo of a street scene, estimate the total fuel consumption of all the vehicles in it.
To a human, the computational structure is clear: identify all the vehicles; classify them according to the best heuristic of fuel consumption we have (e.g. "SUV vs compact car" or "Ford F150 vs Renault Zoe"; the level of detail depends); sum all the estimated fuel consumptions. There's a very clear algorithm here that we could expect the AI to obey. If we can induce that structure onto e.g. a neural network, we stand to profit.
It's not clear from your post ... do you expect the current trends in AI development will, with better algorithms and techniques, ultimately get us to real AGI? Or do you feel (as I do), that the research will eventually dead-end in some kind of local maxima, requiring a substantially different approach to get us there?
I feel the current techniques and trends in techniques will not achieve AGI. As I alluded, the scaling problem is too big: We can't build or train neural networks big enough to achieve what we need. (Side note, I'd be interested in the limits of the universal approximation theorem as far as "how many neurons do we actually need" goes. The scaling factors here are hardly talked about, but I suspect they're punishingly high.)
I feel some amount of the above could help in pushing AI research much farther than we are right now, while making smaller models feasible again. Though I don't think we'll actually see downscaling of current models barring some truly disruptive new techs that make scale detrimental: Now that we have such big models, the genie is out of the bottle. However, I don't think anyone is qualified enough to say whether the above improvements I outlined will actually make AGI a reality by themselves. Some of them certainly are a substantially different approach that could hypothetically be the missing piece for AGI.
Based on current research with transformers, they seem to scale linearly with data set size and the number of parameters. That said, they tend to memorize a lot from their training sets, and both training and inference on models with too many parameters becomes limited by hardware and availability of clean data.
The large language models (where petabytes of text are not infeasible to collect) that are currently being developed are scaling up with no limits in sight except for dollar costs for compute. Practically speaking, those limits are real and important, but theoretically with unlimited time/space budgets, the sky’s the limit.
The problem is that “summing up some numerical properties over N records” is trivial for computers. You don’t need AI for that. You need an Excel formula, or equivalent. The “how many cars are in this photo, and what is the estimated fuel efficiency of each one” is a completely different kind of problem, fraught with all kinds of weedy, ill-defined rabbit holes that humans can skip over due to having common sense, but single-purpose machine learning models can’t. Like, a human tasked with this isn’t going to count the car on a poster or billboard that happened to be captured in the background of the photo, but a computer vision model very well might. A human who recognizes the photo is a screenshot from a movie is going to reject the input as out of domain. A computer vision model isn’t necessarily capable of doing that (even if it can emit low confidence scores on detection and classification tasks). A human might see a motorcycle in the photo and count that (or not, based on context), too. A computer vision model, even if capable of detecting and classifying these vehicle types, won’t have the common sense to include or exclude them in the downstream steps of the task.
What you need for this kind of task is AGI capable of understanding the world and learning common sense, which I think we’re further than 15 years away from. It’s possible there will be end-to-end multimodal models that might be capable of this within 15 years, but I think the sheer amount of resources needed to train them and validate them will not pay off the debt of creating them. Like, it might just be cheaper, on balance, holistically, to pay humans to do this work. Or maybe just invest the money you would have spent in a fund so you can pay more for the fuel. (And I’m of the opinion that if we do achieve AGI it probably won’t be any more interested in this kind of work than humans are, so it would probably be immoral to force an AGI agent to spend all its time on tasks like this.)
I think you might be conflating a few things here. For one, AGI does not necessitate free will or consciousness; those are closely related but ultimately distinct phenomena imo.
I also feel like you might have missed my point here: the car example shows that this kind of structural bias could be a viable way to improve the scaling of existing techniques. We can even go your route and make the summing of the N records explicitly algorithmic. The point remains that you try to break down a certain problem for the neural network into smaller subproblems that either (A) you've seen before or (B) you can solve relatively simply. I could, for example, easily build from existing datasets models that (1) detect and highlight individual cars in an image and (2) classify an image of one car. With the tools I'm thinking of here, I can just glue them together because I know the algorithmic structure. Ideally, the tiniest bit of fine-tuning will give you an excellent pipeline. But right now, this glue doesn't exist.
And yes, the model I talk about will not necessarily be capable of solving hard tasks such as contextual processing (which I'm not sure is AGI-complete; far from it). But it will be orders of magnitude simpler to build than the from-first-principles model you would otherwise need, in terms of the power, data and computation involved. With that simplification, maybe we can use the scale we just freed up to reason our way through the context problems you posed. For example, I could build a classifier that detects billboards and other depictions within an image. In the algorithmic glue connecting my models I could then write something like the following:
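(A rough sketch of what that glue might look like; the detector model and the box format are hypothetical stand-ins.)

```python
def inside(inner, outer):
    """Axis-aligned boxes as (x1, y1, x2, y2); True if inner lies within outer."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def filter_direct(image, detections, detect_indirect_regions):
    """Drop detections that sit inside a billboard/screen/poster region, so only
    directly depicted objects get passed to downstream models."""
    indirect = detect_indirect_regions(image)   # neural model: billboards, screens, ...
    return [d for d in detections
            if not any(inside(d["box"], region) for region in indirect)]
```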
And any model that wants to use only directly depicted objects of any kind could use the above code to filter the detections of any other model. If you want only those images that contain lawnmower billboards, use the above code and a lawnmower detector.
The point of all this is abstraction, which imo fundamentally underpins all our advances in hardware and software design, because humans tend to intuitively abstract away unnecessary detail. We should do the same when building AIs, but our current workflows make that prohibitively hard.
That’s just one example within an infinite space of additional classifiers you’d end up requiring in your glue/pseudocode, though. My point is that trying to build common sense, bottom-up, from if blocks is a fool’s errand. Even people who are intimately familiar with a given problem are usually incapable of unpacking all the potential externalities that they just gloss over because they are intelligent human beings who possess common sense. What your pseudo-code is doing is basically a start to creating task-specific guidelines. And as someone who has experience writing such guidelines, they are usually at minimum 50 or so pages—sometimes hundreds—to handle edge cases.
I don’t think the workflows make it prohibitively hard. If you align on reasonable APIs and data formats, Python makes excellent “glue”. The actual formulation of well-defined problems is the prohibitively hard part from my perspective. I think actual successes with machine learning end up being those that are applied to problem spaces where humans have actually been able to specify tasks with sufficient depth and breadth so as to make all the detail explicit.
Yeah, I can't say I disagree. But I don't think that's what I'm attempting. I think there's a lot of value in training such a pipeline end-to-end and getting lots of useful building blocks out of it. With all the other things in place, such a classifier should basically train itself. Basically, with the entire pipeline being end-to-end differentiable and split into several distinct blocks, I can train it all as if it was one network, rearrange the blocks to cope with a different task and train on a different task.
I'm not really talking about APIs and data so much as latent representations. Or rather, in my vision, not so latent representations. Basically, what I envision is in large part about forcing a certain communication protocol within a pipeline, such that the parts of the pipeline can be reused. For example, in the vehicle example I could just build one big pipeline. And things would work out reasonably well. But If I now need a different pipeline, I'm stuck with (a) retraining the wheel or (b) fine-tuning the wrong wheel. With the methods of abstraction that I mentioned in place, I can reuse the parts of the wheel that I need, patching in new parts where I need them. However, I retain compatibility of the reused parts. For example, I could reuse the car detector from above and slap a different pipeline at the end. Now I can train on a different dataset, and my mileage pipeline benefits from it too. If this sounds like embeddings, that's because those are a special case of this kind of reuse. However, this kind of reuse is currently associated with a fair bit of engineering effort.
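As a rough sketch of what I mean by reusing blocks across pipelines (all the modules here are tiny made-up stand-ins):

```python
import torch.nn as nn

# A tiny "car detector" backbone shared by two pipelines.
car_detector = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # -> 16 features per image
)

fuel_head = nn.Linear(16, 1)                    # original task
damage_head = nn.Linear(16, 1)                  # new task, new dataset

fuel_pipeline = nn.Sequential(car_detector, fuel_head)
damage_pipeline = nn.Sequential(car_detector, damage_head)  # same detector object

# Training damage_pipeline updates the shared detector's weights,
# so fuel_pipeline benefits from the new dataset too.
```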
Without assigning blame to either of us, I feel like you're not understanding what I'm trying to explain. Maybe you can rephrase what you picked up so far about what I'm proposing and why it's cool. That way, I can fix more directly any misunderstandings, rather than trying to infer misunderstandings from your opinion on the utility of my ideas.
I guess I’m not convinced that problems that end-to-end models can currently solve can always be decomposed like you are aiming for. I don’t disagree that getting reusable components out of a project would be worthwhile. I just think that most problems aren’t necessarily decomposable in a way that the parts would make any sense to humans who might want to reuse them.
The domain I’m most familiar with is NLP. The approaches that are going out of style currently are more in line with what you’re asking for. You would train a segmenter (or use rules), you would set up sequence tagging models like POS taggers, you’d have normalization or morphological analyzers (usually FSTs), etc. But these tasks require a lot of linguistic knowledge and somehow humans seem to understand natural language without knowing how to do all those sub tasks themselves. Thus you have the newer style end-to-end models that can solve a lot of tasks fairly well even though they aren’t decomposable like the older style pipelines (unless you train them on the subtasks explicitly). There is a lot of research (“BERTology” etc.) that is making the case that these transformer models may actually be learning to do something like the pipelines you’re interested in, but it’s really difficult to probe them in the right ways to suss that out.
Have you seen any work on probing end-to-end models to extract reusable components in other domains? If nobody’s working on it, it’s not likely to materialize in the next 15 years. 😝
Sorry for the long wait. Life happened.
Anyway, of course the decomposition isn't always possible. Some things are best kept as number soup. I do beg to differ, though, on the frequency of such decomposable problems. These need not always be perfectly clean, semantic decompositions: Consider the previous code example. From the outside, this code cares not whether it's a billboard, a screen, a printout, a photo. These are all lumped in as indirect, and we can use that info for our pipeline. Ideally, the training data for our fuel economy example will basically give us at least a little nudge towards classifying these properly as "indirect" depictions of cars.
From my NLP knowledge, let me tell you that what I'm proposing is nothing like the symbolic methods of e.g. NLTK. It's much more about synergies between neural and symbolic methods: for example, if we know certain things for sure about how language works, maybe even based on hard-to-acquire data, we can use that. Vague example: we can deduce, from the way different parts of speech refer back and forth to one another, what content might or might not be encoded. Say you have a data set of texts and true facts encoded by those texts (I think this is called Entailment?). For example, the news article from the other day with the fact "Ukraine convicted a Russian soldier." I can write an algorithm that deduces - from a parse tree that identifies what the different parts of speech refer to - whether a statement is entailed or not. Let's assume we have that algorithm. Maybe you need a few "fuzzy" parts like soft matching; sure, let's put a neural network there to do the matching. Maybe you reject the notion that this algorithm could exist, in which case I'd say that we can tolerate some faultiness in the algorithm, and we can also find different examples where it does exist.
So, you write this algorithm, plug all the holes that are too fuzzy for algorithms with neural networks, and you go for it. You assemble a pipeline that computes "entailment(fact, parse_and_find_coreference(text))". If you built the system right, what you get out of a completely irrelevant dataset (Entailment) is a neural network that can parse a text and find coreferences. Again, ideally (remember, we're talking about future scientific breakthroughs here!) this system is as robust as a neural network and the resulting information is as interpretable and rich as that of an algorithmic solution. The goal isn't to go back to the 60s, it's to use what we have now to bring back the good parts of the 60s, to overcome some of the shortfalls of NNs.
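In rough pseudocode, the assembled pipeline might look like this; every name is a placeholder for components we'd have to build, and only soft_match is a neural network:

```python
def entailed(fact, text, parse_and_find_coreference, soft_match, threshold=0.5):
    tree = parse_and_find_coreference(text)          # algorithmic, interpretable stage
    for statement in tree.candidate_statements():    # inspectable intermediate results
        if soft_match(statement, fact) > threshold:  # neural net plugs the fuzzy hole
            return True
    return False
```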
As for probing an existing black-box model such as BERT, no thank you. The idea is to build a pipeline around NNs that will coax these NNs into naturally assembling useful blocks.
Another interesting example is computer vision: we could do an operation on sets of detected objects in the scene. Well, if we know that our operation is invariant to the order in which we detect these things, we can enforce that. In fact, let's take the cars example. We can build a pipeline like so:
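(A sketch with hypothetical component names; in the real thing the detector and the per-car estimator would be differentiable modules rather than plain functions.)

```python
import random

def total_fuel_consumption(image, detect_cars, fuel_per_car):
    slices = detect_cars(image)                    # Image -> list of ImageSlice
    random.shuffle(slices)                         # later stages can't rely on order
    estimates = [fuel_per_car(s) for s in slices]  # same weights for every slice
    return sum(estimates)                          # order-independent reduction -> Float
```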
If you look at that, what it's doing is exactly what we can suppose the NN would do anyway if we trained it end-to-end. And we can train it end-to-end, on data of (Image, Float). The shuffle is just there to make sure the network doesn't carry any information forward that still needs to be put into context with the rest of the image. In this case, I'm not sure that would be a problem, but it can be a neat trick if you want to process the entire list of ImageSlices later without permitting the latter stages to rely on ordering. I hope you can see how this architecture wouldn't adversely affect performance in a big way, but we're getting a lot of insight out of it into how the NN does things. We can use that both by reusing the blocks in similar pipelines and as part of Explainable AI, where I think the intermediate results are of great interest. I hope the last example gave you a bit of insight into how exactly these pipelines are supposed to work.
Anyway, as for active research: That's very much a thing. Maybe put "Neuro-Symbolic AI" or even "Neuro-Symbolic NLP" into Google Scholar, see what comes up. It's quite the active sub field these days. I'm basically outlining my vision of where NSAI will go in the future.
Medical advances in technology post RNA vaccines are going to be something to behold. Be it nanobots, printed organs on demand, or better management of chronic issues, anything that improves lives at scale is a win in my book.