Propose your own Turing tests, and optionally, also AIs failing them!

I haven't played with it in a while, but chatgpt was fascinatingly bad at attempting to write blank verse - this is a bit under a year ago:

https://pastebin.com/0y7jdAs8

― woof, Thursday, 16 May 2024 14:17 (two months ago) link

I tried it with maybe Claude or the Google one lately and it blew out in just the same way. I find it fascinating that it's got these statistical clusters that it almost can't escape, like that's its ontology - a poem is a rhymed object.

'make a list of random words' is also a fascinating test - it's profoundly non-random and seems (for semi-obvious reasons) drawn to the kind of words that are 'interesting', or would turn up in a 'post the most random word' thread on a forum. This was basically 'list 200 random words':

Giraffe scamper elastic drizzle twist sapphire plummet parchment sing broccoli wander lighthouse chirp kangaroo evaporate avalanche button juxtapose oregano swivel nebula whisper ukulele float flabbergasted oscillate xylophone sparkle turquoise recite sycamore zipper leap quartz trust bucket kaleidoscope weave ninja resonate harpoon meander meadow scuttle zephyr collide translucent illuminate fracture trot mosaic glisten orbit fracture resonate cobalt bellow hammock shiver elastic quiver obsidian wobble twirl kite shimmer ponder telescope shuffle octopus blink synchronize amble citrus zigzag whirlpool vault harmonica quench elastic glow lighthouse chatter kangaroo slurp fractal wriggle sapphire gallop blizzard intertwine wrench yawn nebula embrace zipper squawk meadow twinkle xylophone hibernate quartz scribble turbine chortle kangaroo elope parchment dive oregano contemplate lighthouse baffle whirlpool ambulate turquoise skip nebula ripple wrench slumber giraffe convolute parchment frolic mosaic ponder bucket navigate mosaic daydream sycamore chuckle telescope pirouette hammock snicker citrus glide zephyr mesmerize lighthouse murmur ukulele skedaddle oregano froth zipper undulate sapphire rustle giraffe interlace avalanche shimmer bucket tiptoe broccoli daydream kangaroo jumble xylophone.

note the repeats too, eg xylophone
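A quick way to check the repeats, as a rough sketch rather than anything rigorous: count how often each word turns up and keep the ones that appear more than once. The function name `repeated_words` and the short `sample` string are made up for illustration (the sample is a handful of words from the list, rearranged, not a verbatim excerpt); paste the full list in to check all of it.

```python
import re
from collections import Counter


def repeated_words(text: str) -> dict[str, int]:
    """Return each word that appears more than once, with its count."""
    # Lowercase first so "Giraffe" and "giraffe" count as the same word.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n > 1}


# Illustrative sample only (hypothetical arrangement of words from the list).
sample = "Giraffe scamper elastic drizzle xylophone sparkle elastic giraffe xylophone"
print(repeated_words(sample))
```

A truly random draw of 200 words from a large dictionary would almost never repeat, so a long output here is itself a sign the list isn't random.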

― woof, Thursday, 16 May 2024 14:30 (two months ago) link

If you asked a human to name 200 random words I doubt it would be particularly different from that list though, randomness is not something people are actually good at, so in that sense AI is approximating human thinking pretty well.

― silverfish, Thursday, 16 May 2024 14:35 (two months ago) link

but to get back to something mentioned a bit earlier, crossword clues are an example of something that takes a lot of creative thinking and still seems a way off for an LLM, and I doubt any current version of AI can solve an average crossword puzzle other than through brute force

― silverfish, Thursday, 16 May 2024 14:41 (two months ago) link

if you correct chatgpt on a mistake, it tends to give a serviceable apology and try again. if you don't correct it and just leave it be, it says nothing

but what if they gave the AI an anxious attachment personality type and if you leave the chat open for more than 15 minutes without acknowledging its answer, it gets all antsy and starts asking you if it was up to standard, then emailing "second thoughts" and asking for feedback that night

― your mom goes to limgrave (dog latin), Thursday, 16 May 2024 14:56 (two months ago) link

Deliberately simulating fatigue could be a diabolical way to induce more in-game purchases of virtual coffee for your AI personal assistant!

In that vein, I think all the pieces are in place for a fully autonomous bot to search for, apply for, interview for, and perform a gig/job meant for a human, but I don't think this has been demonstrated yet? People have been "outsourcing" their own work to programs for a while, but there's always been some human tailoring and intervention involved at some stage of the scam. Maybe an extended version of this test is for the package to set up its own bank account for payment (likely with forged credentials), effectively attaining true recognized worker-hood, if not personhood.

Turing Taskrabbit Test?

― Philip Nunez, Saturday, 18 May 2024 14:35 (two months ago) link

four weeks pass...

Apple co-founder Steve Wozniak suggested the coffee test, whereby a robot would be challenged to enter your home, find the kitchen and brew a cup of coffee. The programme should be able to walk into any kitchen, find the ingredients required and then perform the task of making a coffee.

I might and did fail this test.

― Philip Nunez, Saturday, 15 June 2024 15:23 (one month ago) link

Beating a reverse Turing test of sorts:
https://petapixel.com/2024/06/12/photographer-disqualified-from-ai-image-contest-after-winning-with-real-photo/

― Philip Nunez, Thursday, 20 June 2024 18:23 (one month ago) link

the songwriting turing test

https://suno.com/song/6d8d4b98-1d13-4224-9cf5-5b0c0108df33

― | (Latham Green), Thursday, 20 June 2024 19:26 (one month ago) link
