What's our thoughts on Perplexity.ai for search?
If you haven't used it yet, it's more like a cited source summary tool. I actually really like it for questions such as "Who is X and why are they important?"
I'm interested in people's thoughts on it.
Perplexity engages in outright plagiarism and doesn't respect robots.txt. If that's something you care about, stay away.
That's a little misleading. Perplexity uses snippets with attribution in the same way search engines do. This has historically been considered a fair use application. It doesn't fit the definition of plagiarism as they are not claiming this content as their own.
Additionally, Perplexity does respect robots.txt for training their AI model. They only do not respect it when following a user's request to scan a page, which is the correct behaviour. robots.txt is specifically for automated web crawlers or spiders. User agents, meaning tools that follow a user's commands, are not subject to robots.txt. Your web browser and command line applications like wget do not follow robots.txt either, because they are acting as user agents, not as robots.
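The crawler-vs-user-agent distinction above can be made concrete with Python's standard `urllib.robotparser`. The rules below are purely hypothetical (not Perplexity's actual robots.txt); the point is that a crawler is expected to check these rules before fetching, while a user-directed tool never consults them at all:

```python
# Sketch: how an automated crawler would consult robots.txt before fetching.
# The rules and bot name below are hypothetical examples, not real policy.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A crawler identifying itself as ExampleBot is told to stay out.
print(parser.can_fetch("ExampleBot", "https://example.com/article"))   # False

# Other agents fall through to the "*" group and are allowed.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/article"))  # True
```

Note that nothing enforces this check: a browser or wget simply fetches the page because the user asked for it, which is the "user agent" behaviour described above.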
I reject your characterization. Maybe I should have linked the Forbes article directly, but what we're talking about has absolutely nothing to do with what you are on about.
This is just theft. The WIRED article also mentions what they would consider /plagiarism/, not a cute little search engine snippet like you suggest.
So I had to dig a bit to find the actual material but it's here:
https://www.perplexity.ai/search/perplexity-is-a-41uH2h6JT0qazoM87BO.kw
From my perspective, this complaint is inane. The user asked this question: "tell me about the wired article 'perplexity is a bullshit machine'". What Perplexity did is read the WIRED article and then produce a summary filled with hyperlinks to it. In particular, the fifth paragraph, where the content was quoted, is specifically annotated with a link to the source the content came from. What more are people expecting here?
Here's the Perplexity article that set this all off:
https://www.perplexity.ai/page/Eric-Schmidts-AI-boKJzWQcRFmCLk5XjgKJEQ
Now, this is post-complaint, and apparently the links to Forbes are more prominently placed at the top of the page than they originally were, but as I understand it the rest of the content below is the same. I honestly don't see an issue with this. It's extensively hyperlinked to the source material -- much like these pages, which also use Forbes as a source, yet aren't being characterized as plagiarism. What's the difference?
https://tech.hindustantimes.com/tech/news/former-google-ceo-eric-schmidt-jumps-into-ai-attack-drones-space-looks-to-transform-military-tech-71706178365198.html
https://www.businessinsider.com/eric-schmidt-poaches-apple-spacex-google-ai-drones-report-2024-6
I assume the reason they aren't being criticized is because they clearly state that they used Forbes as a source -- exactly like Perplexity's article does, in the second paragraph:
and fourth paragraph:
Keep in mind that this is a summary of only six paragraphs -- two inline mentions of Forbes, plus four hyperlinks. The Hindustan Times article, by comparison, mentions Forbes only once.
Finally, I think it's pretty funny that in all of this content criticizing Perplexity for plagiarism, I couldn't find a single link to Perplexity's actual article. I had to go to Twitter and forum posts to actually see what they were talking about.
Apologies for the late reply, I don't browse Tildes often:
(This is entirely wrong, as shown by this Wayback Machine archive prior to the Forbes article. In addition, this fails to cover the podcast that Perplexity autonomously generated using this plagiarized article).
Hold on here, this is absolutely not plagiarism. Their entire raison d'être is explicit attribution of all material.
Please see my above comment for why I wholeheartedly reject your characterization.
Is that a bad thing?
https://wiki.archiveteam.org/index.php/Robots.txt
I'm always hesitant to try different AI models for any real usage, and that's been the case for years: I've worked in robotics, and ML predictions have always been, uh... distant from reality. The new types have only reinforced that.
But Perplexity has been surprisingly solid. It's a summarizer first and foremost, so it tends not to have the same hallucination issues as others.
Is it perfect? Absolutely not. But is it useful? Absolutely, especially since it's one of the few new types (calling it that gives me Gundam vibes, tbh) that cites its sources incredibly well. I've actually been enjoying playing with it for basic requests. Nothing serious, just anything I might otherwise quickly grok Wikipedia for an answer.
Kagi has a very similar summarizer for webpages (Universal Summarizer) that works rather well for single sources. On their search results they also have "Quick Answers", which, as far as I can tell, uses the Universal Summarizer to summarize their own search results page and then feeds that into an LLM (Claude 3 Haiku) to give it a bit more readable personality and style. I've been enjoying it, but it doesn't cite its sources as nicely as Perplexity.
Additional:
The Kagi summarize and quick answers are pretty nice. Been using them fairly regularly since I subscribed. The quick answer appears to take key points from the first page of results and put them into a short list of bullet points. Sometimes you can see the exact sentences used in the summary in the descriptions of some of the results.
https://www.theverge.com/2024/6/27/24187405/perplexity-ai-twitter-lie-plagiarism
The author clearly has no idea what the term “rent-seeking” means. Perplexity is quite obviously adding value here. I don’t want to personally wade through pages and pages of fluff to get the actual information that I want.
I personally prefer Phind, but they are very similar, and it isn't uncommon that I'll use both on a query.
They are far from perfect, but as the other poster said, they're already well in the Useful range. Just be extra careful when you're working outside of your areas of expertise, as it becomes much harder to catch inaccurate information.
Haven't used Perplexity, but as @drannex mentioned, Kagi has a similar feature, and I use it for queries where I trust LLMs to give me accurate information. For example, just yesterday, I searched for "is azelaic acid poisonous?" since I use it as a skincare product and was worried about accidentally getting it on my lips, and the summarizer responded that it is not considered toxic and cited a few studies.
This saved me a bunch of time, since most of the results were pages with general information about azelaic acid or broad scientific papers about the topic, so I would've had to do a bit of digging to find the actual answer.
For some queries, though, I don't trust the AI and prefer to check the results myself. In most cases, those are the searches where I want the sentiment or experiences of other people about a particular thing, rather than specific facts.
Considering AI has been known to invent sources in the past, I would follow up on the citations before trusting any answers.
The point of these search summary AIs is that they cite their sources with specific links:
https://i.ibb.co/TcRptMF/Screenshot-20240629-140308.png