I understand the very clear concern around AI that, in audio format, feels much more real and human, and therefore more persuasive, than it does in visual media. It's also harder to tell that it's synthetic than with visual media. However, I can't help but get stuck on the idea that there seems to be no purpose for this technology other than to create misleading content that is, as far as I can tell, very clearly trained on the work of popular podcasts. I noticed a distinct similarity between the speaking style and word choice of these episodes and the podcasts I listen to regularly.
While the ability to create convincing audio content is spooky, especially as a goal, since I can think of no more likely reason for this technology's development, I also wonder if this sort of tech is making a statement in the discussion of AI regulation and creative work. If you'll allow me to take out my brush that blurs the line between analysis and conspiracy theory, keep in mind that it's what I am painting with. One important argument in this discussion is that artists are not directly harmed by AI, and, more importantly, that this argument reflects the idea that art is somehow a luxury career, less legitimate and less worthy of protection than less creative options. Many believe, explicitly or implicitly, that the value of art depends on the proficiency of the artist, and that proficiency is measured by how difficult it would be for someone else to replicate that art. A look at the history of modern art proves this false in a variety of ways, but that's something I don't want to elaborate on here.
The point being that, even if you look at other comments, people aren't exactly jumping to defend podcasting as an art form relative to, say, painting. It's a joke, one that many people have made, me included, that podcasts are so easy to create, that there are so many of them, that everyone you know makes one and listens to an astronomically small number of them. It's funny to say that a system that generates an incomprehensible amount of content, frequently makes factual errors, spends the majority of its run-time saying nothing of consequence, and, most importantly, feels more authentically human, is no different from ordinary podcasting. The joke works because podcasting makes an art form out of something that theoretically requires less proficiency than other art forms, even though there is likely far more text or visual art out there. This is what makes the idea that podcasting is or can be art questionable for most; the idea that someone would have a career in it is even more bothersome. The high visibility of podcasting at every level of skill makes it easy to ignore that there are podcasts that are important components of people's careers and livelihoods. That is to say nothing of the consequences of seeing art as an economic commodity, or of the critical misunderstandings surrounding the "point" of creative work.
Whether intended or not, moving the discussion of AI art as theft to a medium that many do not consider art serves to strip legitimacy from the question of whether such theft is justified.
To comment on how human this feels: even if I had known ahead of time that this was AI, it still would have bothered me. It speaks with all the conventions of a structurally crafted podcast, but without placing those shifts in pacing where they belong. The only thing that matters is that it is convincingly human for some, or may be convincingly human enough for others in the future. It doesn't need to fool people; it only needs to make them feel as though the speakers are human. That would make a more effective, more interesting and entertaining, and more humanized synthetic podcast more risky, not less. That is to say, the issue here is not the low quality of information and entertainment being produced rapidly at high volume, but that this would be compelling information and entertainment produced rapidly and at high volume. It would be a technology that is inauthentic in a medium prized for its appearance of authenticity. This is concerning because such audio has more in common with a deepfake than with LLM text or visual art in terms of its ability to persuade or manipulate people, with fewer visible indications of such manipulation than deepfake videos.
Well said! “And yet, if I am being honest, the AI hosts aren’t that different from many human voices to be found in podcast space and on social media, where an engaging format, compelling storytelling and a superficial and somewhat cavalier attitude toward the truth, leads to similar content.
In this respect, NotebookLM is scarily good at emulating very human online content which is equally untrustworthy.”
I’m glad you acknowledged a very basic error commonly made in relation to AI hallucinations: treating the way AI makes things up as uniquely bad, when it’s so obviously the case that human beings often lie and make things up too. It’s a bit ridiculous how many hold AI to a standard of honesty that humans rarely, if ever, meet.
And indeed, to your point, this is exactly what makes NotebookLM quite compelling: not only how human-like it sounds and how engaging it is (I have to admit I actually feel things and laugh at it), but also, ironically, the small liberties it takes and the things it makes up. Arguably those are not classic AI mistakes so much as things that make it even more like us, because humans are pretty awful about making things up and deceiving ourselves, if we are honest, which some people have a bit of difficulty being. But I’m glad you are, at least.
In the near future, it may be that there will be few commodities more valuable to humans than genuine, certified, artisanal, human-generated content. Everything else will be suspect, potentially subtly full of influences to which one did not intend to expose oneself.
Carson Johnston's recent post on user agency picks up on some of the same themes you do regarding anthropomorphism: https://carsonjohnston.substack.com/p/user-agency-is-at-stake

Truly thought-provoking.
Thanks Mark -- I agree. And great connection with Carson's post, which I think is essential reading here!