phoenixk's recent activity

  1. Comment on How we rebuilt Next.js with AI in one week in ~comp

    phoenixk
    • Exemplary

    I wrote a mild response to this on my blog: On vinext. Excerpts follow.

    Please do not take the length of this comment as its authoritativeness. In fact, the opposite is probably wise.


    Going into this, I thought state-of-the-art AI models were generally quite good. I've heard many success stories, but I've also tried all the major models' free tiers and they haven't been much of a help---[^1]although I can already feel myself getting sucked into relying on AIs to solve problems.

    Let's review the code! I only have time for one file, so let's start with packages/vinext/src/server/prod-server.ts, which states:

    /**
     * Production server for vinext.
     *
     * Serves the built output from `vinext build`. Handles:
     * [...]
     */
    

    That sounds pretty fundamental to me, so let's dive in.

    The file starts out with the definition of a function, readNodeStream:

    /** Convert a Node.js IncomingMessage into a ReadableStream for Web Request body. */
    function readNodeStream(req: IncomingMessage): ReadableStream<Uint8Array> {
    

    Now, I didn't know this, but those of you who know NodeJS may have realized that IncomingMessage extends NodeJS's stream.Readable, which already has a static toWeb method to do exactly this. Why didn't vinext just use that instead, you may ask? Well, they do use it. In the same file.
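    For reference, the built-in conversion looks like this (a sketch; requestBody is my own name, and Readable.toWeb requires Node 17+):

    ```typescript
    import { Readable } from "node:stream";
    import type { IncomingMessage } from "node:http";

    // IncomingMessage extends stream.Readable, so Node's static converter
    // already produces a Web ReadableStream; no hand-rolled bridge needed.
    function requestBody(req: IncomingMessage): ReadableStream<Uint8Array> {
    	return Readable.toWeb(req) as ReadableStream<Uint8Array>;
    }

    // The same converter works on any Readable:
    const web = Readable.toWeb(Readable.from(["hello"]));
    ```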

    Next, prod-server.ts defines an options interface.

    export interface ProdServerOptions {
    	/** Port to listen on */
    	port?: number;
    	/** Host to bind to */
    	host?: string;
    	/** Path to the build output directory */
    	outDir?: string;
    	/** Disable compression (default: false) */
    	noCompression?: boolean;
    }
    

    There are two really weird choices I see: negating a boolean like noCompression (I'd prefer the positive compress?: boolean), and why is everything optional? One potential corollary of the parse-don't-validate mindset is doing, uh, anything but what vinext has done here.

    [But this feels pretty user-facing, so this may just be a holdover from NextJS.] [...]
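    To make the complaint concrete, here's one way the parse-don't-validate version could look (a sketch; the default values here are mine, not vinext's):

    ```typescript
    export interface ProdServerOptions {
    	/** Port to listen on */
    	port?: number;
    	/** Host to bind to */
    	host?: string;
    	/** Path to the build output directory */
    	outDir?: string;
    	/** Enable compression (default: true) — positive flag instead of noCompression */
    	compress?: boolean;
    }

    /** Fully-resolved options: everything is required past this point. */
    interface ResolvedOptions {
    	port: number;
    	host: string;
    	outDir: string;
    	compress: boolean;
    }

    /** Normalize once at the boundary; the rest of the server never sees undefined. */
    function resolveOptions(opts: ProdServerOptions = {}): ResolvedOptions {
    	return {
    		port: opts.port ?? 3000,
    		host: opts.host ?? "localhost",
    		outDir: opts.outDir ?? "dist",
    		compress: opts.compress ?? true,
    	};
    }
    ```

    The user-facing surface stays optional, but the internal type has no question marks, so nothing downstream has to re-check.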

    [...]

    Up next, we have some utility functions to compress a response:

    /** Content types that benefit from compression. */
    const COMPRESSIBLE_TYPES = new Set([
    	"text/html",
    	// [...]
    	"application/wasm",
    ]);
    
    /** Minimum size threshold for compression (in bytes). Below this, compression overhead isn't worth it. */
    const COMPRESS_THRESHOLD = 1024;
    
    function createCompressor(encoding: "br" | "gzip" | "deflate"): zlib.BrotliCompress | zlib.Gzip | zlib.Deflate {
    	// [...]
    }
    function sendCompressed(
    	req: IncomingMessage,
    	// [...]
    	compress: boolean = true,
    ): void {
    	// [...]
    }
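    The decision these helpers add up to can be sketched as a single predicate (my own composition of the constants above, not vinext's actual code, and with the type list abbreviated):

    ```typescript
    /** Content types that benefit from compression (abbreviated). */
    const COMPRESSIBLE_TYPES = new Set(["text/html", "text/css", "application/json", "application/wasm"]);

    /** Minimum size threshold for compression (in bytes). */
    const COMPRESS_THRESHOLD = 1024;

    /** True when the client accepts a supported encoding, the body is big enough,
     * and the content type is worth compressing. */
    function shouldCompress(acceptEncoding: string, contentType: string, size: number): boolean {
    	const baseType = contentType.split(";")[0].trim();
    	return (
    		/\b(br|gzip|deflate)\b/.test(acceptEncoding) &&
    		size >= COMPRESS_THRESHOLD &&
    		COMPRESSIBLE_TYPES.has(baseType)
    	);
    }
    ```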
    

    Those compression helpers seem completely fine to me. Then we have this:

    /** Content-type lookup for static assets. */
    const CONTENT_TYPES: Record<string, string> = {
    	".js": "application/javascript",
    	".mjs": "application/javascript",
    	".css": "text/css",
    	".html": "text/html",
    	".json": "application/json",
    	".png": "image/png",
    	".jpg": "image/jpeg",
    	".jpeg": "image/jpeg",
    	".gif": "image/gif",
    	".svg": "image/svg+xml",
    	".ico": "image/x-icon",
    	".woff": "font/woff",
    	".woff2": "font/woff2",
    	".ttf": "font/ttf",
    	".eot": "application/vnd.ms-fontobject",
    	".webp": "image/webp",
    	".avif": "image/avif",
    	".map": "application/json",
    };
    

    This is extremely not fine to me. It has .eot, (apparently) a font format I've never heard of, while leaving out:

    • all audio formats
    • all video formats
    • some niche image formats (like .pdf)
    • WebAssembly files, which will at least be compressed before being sent as application/octet-stream
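    A lookup-with-fallback sketch that covers those gaps (the names and the exact type list here are mine, for illustration only):

    ```typescript
    import path from "node:path";

    /** A fuller (but still incomplete) asset-type table, including the missing media types. */
    const ASSET_TYPES: Record<string, string> = {
    	".js": "application/javascript",
    	".wasm": "application/wasm",
    	".mp3": "audio/mpeg",
    	".ogg": "audio/ogg",
    	".mp4": "video/mp4",
    	".webm": "video/webm",
    	".pdf": "application/pdf",
    };

    /** Unknown extensions fall back to a generic binary type instead of being special-cased. */
    function contentTypeFor(file: string): string {
    	return ASSET_TYPES[path.extname(file).toLowerCase()] ?? "application/octet-stream";
    }
    ```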

    And then we have tryServeStatic, which looks like this:

    /**
     * Try to serve a static file from the client build directory.
     * Returns true if the file was served, false otherwise.
     */
    function tryServeStatic(
    	req: IncomingMessage,
    	res: ServerResponse,
    	clientDir: string,
    	pathname: string,
    	compress: boolean,
    ): boolean {
    

    Okay, compress isn't optional anymore, weird. And I'd like to see some Apps Hungarian on clientDir to note it's already been path.resolve()d, as it is in every codepath that gets here. Because it wasn't clearly marked, they're going to immediately resolve it again ([tag]s mine):

    	// Resolve the path and guard against directory traversal (e.g. /../../../etc/passwd)
    	const resolvedClient = [1] path.resolve(clientDir);
    	let decodedPathname: string;
    	try {
    		decodedPathname = [2] decodeURIComponent(pathname);
    	} catch {
    		return false;
    	}
    	const staticFile = [3] path.resolve(clientDir, "." + decodedPathname);
    	if (!staticFile.startsWith(resolvedClient + path.sep) && staticFile !== resolvedClient) { [4]
    		return false;
    	}
    	if ( [5]
    		pathname === "/" ||
		!fs.existsSync(staticFile) ||
    		!fs.statSync(staticFile).isFile()
    	) {
    		return false;
    	}
    

    First off, they're resolving clientDir twice ([1], [3])? And then this try/catch is really scary to me. I definitely would try to split this up into a few separate functions. An incomplete sketch follows:

    • getRequestPathname, which takes an IncomingMessage, calls decodeURIComponent [2], and sanitizes the path.

      Sanitizing here should fail if the path contains any characters that any platform considers special (like :, \, or ?), and additionally locally resolve ..s (but without access to the filesystem, unlike path.resolve.) This would change behavior. Current vinext will bail out the instant it touches a symlink, but resolving the path this early means we don't know where we'll end up looking.

      I'd argue that following symlinks is correct when you're serving static files, but I'm not Cloudflare, either.

    • resolveToStaticFile which takes a path and the static file directory and returns some sort of byte stream,

    • serveStream, which takes a content type and a stream, heuristically compresses the response, and sends the response body.
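    A rough sketch of the first of these, with rejection rules I made up for illustration (real ones would need more care):

    ```typescript
    import path from "node:path";

    /** Decode and lexically sanitize a request pathname; null means "reject". */
    function getRequestPathname(rawPathname: string): string | null {
    	let decoded: string;
    	try {
    		decoded = decodeURIComponent(rawPathname);
    	} catch {
    		return null; // malformed percent-encoding
    	}
    	// Reject characters some platform treats specially (NUL, Windows-reserved).
    	if (/[\0<>:"\\|?*]/.test(decoded)) return null;
    	// Lexical normalization: collapses "." and ".." without touching the filesystem.
    	const normalized = path.posix.normalize(decoded);
    	// Anything still escaping upward is rejected outright.
    	if (normalized.startsWith("..")) return null;
    	return normalized;
    }
    ```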

    At [4] they're just checking that this file is within the static file root, which is at clientDir. Calling startsWith and then !== is totally fine here---if the static file root is /var/www/my-site, we don't want to allow /var/www/my-site-2/foo/bar (which would resolve from a pathname of "../my-site-2".) The worst part is that this should be its own function, as this pattern is used again later on in the same file, and this kind of function feels like it might have nonobvious edge cases.
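    Extracted, that pattern might look like this (the name is mine; both arguments are assumed already path.resolve()d):

    ```typescript
    import path from "node:path";

    /** True iff resolvedFile is the root itself or strictly inside it.
     * A plain startsWith alone would wrongly accept sibling dirs like /var/www/my-site-2. */
    function isWithinRoot(resolvedRoot: string, resolvedFile: string): boolean {
    	return resolvedFile === resolvedRoot || resolvedFile.startsWith(resolvedRoot + path.sep);
    }
    ```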

    And then we get to the really big issue: [5] we're calling synchronous functions! These will block the event loop in Node.js, and the only way to get around this is to run a fleet of microservers at the edge that will automatically scale up and down based on load. Running at the edge will also reduce latency, so your customers will be happier. (For the record, this is a joke---I do not believe Cloudflare is writing worse code to make you start using Cloudflare-or-equivalent services.)

    Thankfully we're only running existsSync and statSync. These probably won't take very long to complete. And while we're on the topic, another minor nit is that I'd prefer to skip existsSync and catch the potential error from statSync. Fewer syscalls are always good.
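    The single-stat version of that suggestion (a sketch; the name is mine):

    ```typescript
    import fs from "node:fs";

    /** One syscall instead of two: stat and treat any error (ENOENT, EACCES, ...) as "not servable". */
    function isServableFile(staticFile: string): boolean {
    	try {
    		return fs.statSync(staticFile).isFile();
    	} catch {
    		return false;
    	}
    }
    ```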

    [...]


    Personally, I would not run this server in production. (To be fair, I'd say the same about NextJS for entirely unrelated reasons.)

    There is a lot of code here, and it seems like everywhere I look, I see duplicate and/or wasted code. I am fairly confident the rest of the codebase is similar; one other thing that stuck out to me when I was skimming was a regular expression parser to lint against malicious double-repeats. Surely this doesn't belong in a web server?

    The texture of the code is utterly alien to me, and I no longer think state-of-the-art AI models are good.

    It's actually a little baffling that these issues are here. Every issue I pointed out is trivial to fix, even by LLM standards; I'm not sure anyone has looked at this code.

    [^1]: I wrote this em-dash. Trust me or don't.

    13 votes
  2. Comment on The last PCB you'll ever buy — rapid PCB prototyping with pre-made vias in ~tech

    phoenixk

    But there's always been one thing I haven't been able to figure out. Vias.
    Vias are super difficult and time consuming to make yourself, with a bunch of acid baths, electroplating, and chemicals.

    What if you had a [circuit] board, a template PCB with a standard array of vias already baked into it?

    This sort of garage(-ish) fabrication is incredibly inspiring. From just copper-on-fiberglass blanks and probably somewhere around $10k-$20k of equipment (hence the -ish), you can automatically:

    • Etch traces into copper plating
    • Cut through-holes into circuit board substrate
    • Etch solder mask
    • Place solder paste
    • Place surface-mount parts

    The video mentions using rivets for vias, and these are described as “super manual to install”. It seems like these are generally installed as, well, rivets, using a press to install ([1], [2]). I wonder if it'd be possible to instead use pick-and-placeable rivets over solder paste, such that they can simply be reflowed into place? The hardest part would probably be getting solder to bridge the gap on the back of the board, but this seems plausible, if finicky.

    Also, I now really want to design/engineer/build an omni-CNC machine that can do everything. That this is obvious to me means that many, many people have tried and failed to make an omni-CNC machine that can do everything!

    8 votes