Apparently most of the weights in deep learning networks aren't actually needed. From the article: "The lottery ticket hypothesis states that a randomly initialized, dense, feed-forward network contains a pool of subnetworks and among them only a subset are 'winning tickets' which can achieve the optimal performance when trained in isolation."
This seems to explain why networks can be pruned so aggressively after training with little loss in accuracy: most of the weights belong to subnetworks that never won the initialization lottery, so they're effectively dead weight left over from failed attempts.
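For intuition, here's a minimal sketch of the pruning step in PyTorch, using global magnitude pruning (roughly the technique the lottery ticket paper builds on); the model architecture and the 80% sparsity level are just placeholders, not anything specific from the article:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small dense feed-forward network, as in the hypothesis statement.
model = nn.Sequential(
    nn.Linear(784, 300), nn.ReLU(),
    nn.Linear(300, 100), nn.ReLU(),
    nn.Linear(100, 10),
)

# ... train the model here as usual ...

# Globally remove the 80% of weights with the smallest magnitudes,
# keeping the top 20% as the candidate "winning ticket".
parameters_to_prune = [
    (module, "weight") for module in model if isinstance(module, nn.Linear)
]
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.8,
)

# Report what fraction of weights survived.
total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
nonzero = sum(int(torch.count_nonzero(m.weight)) for m, _ in parameters_to_prune)
print(f"remaining weights: {nonzero / total:.1%}")
```

The full lottery ticket procedure goes a step further: it rewinds the surviving weights to their original initialization and retrains, iterating the prune/rewind cycle to isolate the winning ticket.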
Very interesting, thanks.