cwbe coordinatez:
101
63533
63608
8771344
9103894

ABSOLUT
KYBERIA
permissions
you: r,
system: public
net: yes

neurons

stats|by_visit|by_K
source
tiamat
commanders
polls

total descendants::0
total children::0
show[ 2 | 3] flat


We evaluated GPT-4 in a public online Turing Test. The best-performing GPT-4 prompt passed in 41% of games, outperforming baselines set by ELIZA (27%) and GPT-3.5 (14%), but falling short of chance and the baseline set by human participants (63%). Participants' decisions were based mainly on linguistic style (35%) and socio-emotional traits (27%), supporting the idea that intelligence is not sufficient to pass the Turing Test. Participants' demographics, including education and familiarity with LLMs, did not predict detection rate, suggesting that even those who understand systems deeply and interact with them frequently may be susceptible to deception. Despite known limitations as a test of intelligence, we argue that the Turing Test continues to be relevant as an assessment of naturalistic communication and deception. AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness.
https://arxiv.org/abs/2310.20216
https://arxiv.org/pdf/2310.20216.pdf