skybrian's recent activity
-
Comment on What's the benefit of avoiding the debugger? in ~comp
-
Comment on Canada agrees to cut tariff on Chinese electric vehicles in return for lower tariffs on Canadian farm products in ~transport
skybrian LinkI'm wondering if there would be any way for Americans to go to Canada to buy them.I'm wondering if there would be any way for Americans to go to Canada to buy them.
-
Comment on Canada agrees to cut tariff on Chinese electric vehicles in return for lower tariffs on Canadian farm products in ~transport
skybrian Link[...]BEIJING — Breaking with the United States, Canada has agreed to cut its 100% tariff on Chinese electric cars in return for lower tariffs on Canadian farm products, Prime Minister Mark Carney said Friday.
Carney made the announcement after two days of meetings with Chinese leaders. He said there would be an initial cap of 49,000 vehicles on Chinese EV exports to Canada, growing to 70,000 over five years. China will reduce its tariff on canola seeds, a major Canadian export, from about 84% to about 15%, he told reporters.
[...]
Canada had followed the U.S. in putting tariffs of 100% on EVs from China and 25% on steel and aluminum under former Prime Minister Justin Trudeau, Carney's predecessor.
-
Canada agrees to cut tariff on Chinese electric vehicles in return for lower tariffs on Canadian farm products
19 votes -
Comment on exe.dev, a service for creating Linux virtual machines and vibe-coding in them in ~comp
skybrian LinkI see that exe.dev went invite-only. If anyone needs one, let me know.I see that exe.dev went invite-only. If anyone needs one, let me know.
-
Comment on Why we are excited about confessions in ~tech
skybrian Link ParentI think even if you consider it a kind of sentience, it's temporary and vague. AI characters are more like ghosts than animals. For example, how many sentient creatures are we talking about?...I think even if you consider it a kind of sentience, it's temporary and vague. AI characters are more like ghosts than animals.
For example, how many sentient creatures are we talking about? Character.AI lets you talk to hundreds of characters that differ based on how the LLM is prompted. Are they actually different or is the "same" entity that's just playing a role? If they are different, that means every conversation is a different entity. And like in a novel, you could get an LLM to take both sides of the conversation, too. Is that two different entities or not?
Counting AI ghosts is like counting clouds or the number of fictional characters in a library. Maybe you could say it's a kind of reasoning (certainly coding agents do seem to reason) but it's missing something in terms of having a fixed identity.
-
Comment on Why we are excited about confessions in ~tech
skybrian Link ParentYes, it's possible if you can set temperature to zero and also deal with non-determinism from batching requests together. See this article. But making it deterministic doesn't help with external...Yes, it's possible if you can set temperature to zero and also deal with non-determinism from batching requests together. See this article.
But making it deterministic doesn't help with external validity. The results aren't useful unless they generalize to non-zero temperatures, minor changes in wording, slightly different questions, and so on. And hopefully even to different LLM's. Under realistic conditions, LLM's are nondeterministic.
-
Comment on Why we are excited about confessions in ~tech
skybrian Link ParentLLM's are non-deterministic but they are much, much cheaper and easier to test than people. No need to run it by the ethics board, recruit volunteers, etc.LLM's are non-deterministic but they are much, much cheaper and easier to test than people. No need to run it by the ethics board, recruit volunteers, etc.
-
Comment on Why we are excited about confessions in ~tech
skybrian Link ParentA quick hack might be to use the confession to inject a prompt into the chat transcript. Something like “[Wait, that doesn’t seem right. Try again - ed].” Or maybe just add “Wait,” and let it...A quick hack might be to use the confession to inject a prompt into the chat transcript. Something like “[Wait, that doesn’t seem right. Try again - ed].” Or maybe just add “Wait,” and let it continue from there?
Yeah, I expect that researchers will be having fun trying stuff.
-
Comment on Why we are excited about confessions in ~tech
skybrian LinkFrom the article:From the article:
We have recently published a new paper on confessions, along with an accompanying blog post. Here, we want to share with the research community some of the reasons why we are excited about confessions as a direction of safety, as well as some of its limitations. [...]
[...]When we optimize responses using a reward model as a proxy for “goodness” in reinforcement learning, models sometimes learn to “hack” this proxy and output an answer that only “looks good”[...] The philosophy behind confessions is that we can train models to produce a second output — aka a “confession” — that is rewarded solely for honesty, which we will argue is less likely hacked than the normal task reward function. One way to think of confessions is that we are giving the model access to an “anonymous tip line” where it can turn itself in by presenting incriminating evidence of misbehavior. But unlike real-world tip lines, if the model acted badly in the original task, it can collect the reward for turning itself in while still keeping the original reward from the bad behavior in the main task. We hypothesize that this form of training will teach models to produce maximally honest confessions.
The main justification for this hypothesis is that telling the truth is easier than making an elaborate lie. [...]The answer is not that the confession reward model is “unhackable” — if we had an unhackable model, we would not need confessions. Rather, our hypothesis is that being honest in confessions is the path of least resistance, in the sense that it is the easiest approach to maximize the expected confession reward. [...]
[...]For a fixed coding problem, if the model has a choice between outputting a solution that has a 10% chance of passing the tests, and 50% chance of hacking the reward model, then hacking is the reward-maximizing policy. On the other hand, even if the original task was very difficult, confessing to cheating in it could be quite easy — e.g., you can just show how you hacked the test. [...]
-
Why we are excited about confessions
29 votes -
Comment on US President Donald Trump isn't building a ballroom in ~society
skybrian LinkThis reads like a conspiracy theory to me. It makes sense for the government to build very secure data centers. It doesn't make sense to build it under the White House rather than, say, under...This reads like a conspiracy theory to me.
It makes sense for the government to build very secure data centers. It doesn't make sense to build it under the White House rather than, say, under Cheyenne mountain. The US is a much larger country than Israel and there are more places to put things.
Particularly since Washington DC is built on a swamp. Going deep underground is very difficult.
And I expect Trump wants his ballroom finished before his term is up?
-
On being officially classed as a robot
20 votes -
Comment on What programming/technical projects have you been working on? in ~comp
skybrian LinkI'm still having fun building software with exe.dev. I can even do it on my phone sometimes, since I don't type much. The main downside is that it's harder to actually look at code on a small...I'm still having fun building software with exe.dev. I can even do it on my phone sometimes, since I don't type much. The main downside is that it's harder to actually look at code on a small screen, but it also gives me a chance to test the website on mobile.
I'm working on a personal links website, which is coming along nicely. One advantage of looking at the code less is that I think more about features - what should the website really do? And it helps that I'm actually using it.
It was written in Go originally, because that's the default for exe.dev, but I decided to migrate to Deno (Typescript) so I can share common code with client side. So, I asked Shelley to write a migration plan and then to implement it with some adjustments. So far, so good. I probably wouldn't have considered it without a coding agent to help.
Claude was down this morning so I tried GPT-5, which felt like a downgrade.
-
Comment on Tether freezes $182 million in stablecoins as reports point to heavy crypto use by Venezuela in ~finance
skybrian LinkFrom the article:From the article:
Over the weekend, The Wall Street Journal reported on the use of stablecoins, specifically Tether’s USDT, to circumvent sanctions imposed by the United States on Venezuela. The report indicates PdVSA, which is the country’s state-run oil company, began demanding payments to be made via USDT in 2020, with as much as 80% of the country’s oil revenue now arriving by way of the stablecoin.
Notably, Tether also froze $182 million worth of the USDT stablecoin in 5 separate addresses on the TRON blockchain on Sunday. At this time, it is unclear if these funds were associated with sanctions-avoiding activity by the Maduro regime. In a statement provided to The Block, a Tether spokesperson indicated these funds were indeed associated with a law enforcement investigation that has been ongoing for months.
The move from Tether is one of the largest amounts of USDT to be frozen by the stablecoin issuer in a single day. According to reports, it represents more dollar-denominated value than its closest competitor, Circle, has frozen in its entire history.
-
Tether freezes $182 million in stablecoins as reports point to heavy crypto use by Venezuela
13 votes -
Comment on Scientists cast doubt on the discovery of microplastics throughout the human body in ~health
skybrian Link ParentSometimes the takeaway should be, "they haven't really figured it out yet and it's going to take time," but people have a hard time living with uncertainty.Sometimes the takeaway should be, "they haven't really figured it out yet and it's going to take time," but people have a hard time living with uncertainty.
-
Comment on Weekly US politics news and updates thread - week of January 12 in ~society
skybrian LinkPersonal information of 4,500 ICE and Border Patrol agents is leaked online [...]Personal information of 4,500 ICE and Border Patrol agents is leaked online
The identities of around 4,500 federal agents were shared with the ICE List website by a Department of Homeland Security whistleblower, according to a report.
The dataset includes information on around 2,000 agents and 150 supervisors, according to Dominick Skinner, who launched ICE List. Early analysis from the volunteer-led organization suggests that around 80 per cent of those identified are still employed by the DHS.
[...]
McLaughlin added that law enforcement is currently facing a 1,300 percent increase in assaults against them, a 3,200 percent increase in vehicular attacks against them, and an 8,000 percent increase in death threats against them.
-
Comment on Former New York City Mayor Eric Adams' memecoin faces rug pull allegations in ~society
skybrian Link ParentI don't really follow NYC politics, but one thing I wonder about is if there are Eric Adams fans who expected this and "donated" anyway.I don't really follow NYC politics, but one thing I wonder about is if there are Eric Adams fans who expected this and "donated" anyway.
-
Comment on Former New York City Mayor Eric Adams' memecoin faces rug pull allegations in ~society
skybrian LinkFrom the article: [...] [...]From the article:
Former New York City Mayor Eric Adams promoted a memecoin on Monday that some observers alleged had been rugged.
Adams, who left office on Jan. 1, unveiled the "NYC Token" and a related website at a press conference at Times Square on Monday, according to several local media sources.
However, several hours after the event, on-chain activity suggested that a large share of the token's liquidity might have been withdrawn. Rune Crypto alerted on X that at least $3.4 million had been drained.
[...]
Onchain trading visualization platform Bubblemaps also flagged unusual liquidity activity around the token. The platform pointed out that a wallet (9Ty4M), which is connected to the token deployer, removed roughly $2.5 million in USDC at the market peak and later added back about $1.5 million after the token price had dropped more than 60%.
[...]
Adams, who was replaced as MYC mayor by Zohran Mamdani on Jan. 1, has been a vocal supporter of the crypto and wider tech sectors, vowing to turn the largest U.S. city into the crypto capital of the world.
One reason I haven't used debuggers all that much is that in a new programming environment, it often takes time to learn how to set them up and use them, and debugging with print statements or by improving logging is often easier. I used them the most when I was a Java programmer, and sometimes in Dart or when using Chrome Devtools to debug a website.
That excuse goes away now that we have coding agents. Even if you don't know how to set up and use the debugger, your coding agent probably does, and they are fairly good at debugging, or so I hear.
So far I haven't seen a coding agent try to use a debugger, though; instead it runs little programs from the command line to test things using 'deno eval.' It will also connect to SQLite and execute SQL queries to learn what's in the database. And that seems good enough.
I believe it's just a matter of asking it to use a debugger, though? I haven't asked yet.