Prosodia

In progress – Stage performance and application

The project Prosodia speculates on the ways in which the technology of synthetic speech is rooted in the English language and in acting, both of which are technologies in their own right. It proposes a future form of ‘machine talk’, informed by the rhythmic, additive structures of human storytelling and epic song. Now in progress is a scripted performance work for human and synthetic actors, and a digital tool. It makes use of AI technology without aestheticizing or mystifying it.

Artificial Intelligence is the result of a long collective process rather than a diffuse field of knowledge. In the same way that any human voice contains all the voices that shaped the tones and rhythms of its language, synthetic voices are, in fact, ancient “layered” voices. By following a poetic meter, a synthetic voice aligns itself much better prosodically to humans. The epilogue to the try-out play Seven Scenes For The Black Box was a non-rhyming but strictly “metered” text built on the additive pattern (“and then”, “so then”) favored by epic storytellers of the past and by influencers in the present.

In the stage piece Prosodia, several synthetic and human characters reflect on the technologies behind their skills – machine learning and acting training respectively. They end up reclaiming an ancient, rhythmic form of speech technology: the spoken song.

Also in development is an AI-based tool that converts informative text into such spoken song, and that may adapt English lines to rhythms and meters from other languages in the process.

Research funded by the Creative Industries Fund and the Pauwhof Fund.

Seven Scenes for the Black Box

2024 – Stage performance (tryout)

This tryout stage piece speculates on the ways in which the technology of synthetic speech is rooted in the technology of acting. Seven short scenes are played by four humans and one neural network named Prosodia, whose voice is generated in a text-to-speech application line by line, for a live audience. The four actresses can activate Prosodia by clicking a small Bluetooth computer mouse. No generative AI is used for the actual production of the texts; the neural networks only concern the prosodic output of the disembodied voices.

Seven Scenes For The Black Box was staged live as a try-out in September 2024. In each of the scenes, Prosodia’s role, function and form changed. She played conventional scripted theater with the actresses; translated their lines behind their backs; gave them acting instructions; recited poetry with them, and so on. Prosodia’s voice changed accordingly, as did the placement of the speaker that it appeared through.

Research and tryout staging with financial support from the Creative Industries Fund and the Pauwhof Fund. Tryout staging in cooperation with If I Can’t Dance I Don’t Want To Be Part Of Your Revolution in Amsterdam. With Ebony Wilson, Rosita Segers, Cézanne Tegelman and Lidewij Mahler.

Voix Blanche

2024 – ongoing – Lecture performance

Voix Blanche surveys the prosody – including factors like intonation, pitch, and speed – of AI-generated speech. “Neural voices” are trained using datasets that contain lecture-like speech in dominant languages, and thus exhibit a prosodic bias towards it.

This project looks at artistic prosody types and the way they can be reproduced in neural text-to-speech (TTS) processes. Building on Nicoline van Harskamp’s other digital art projects about the future of spoken language, Voix Blanche surveys traditional and current prosody types used in theatre, art, and poetry, as well as social media, gaming and podcasts.

In the 80-minute lecture Voix Blanche: A Chronology, van Harskamp describes her working process of a year in chronological order, and presents live examples of human-machine interaction together with an actress.

An ongoing work, first developed with financial support from the Creative Industries Fund NL and the Pauwhof Fund. Presented in part at Sorbonne Nouvelle in Paris, hosted by Myriam Suchet. First presented in full at the PhD Arts Colloquium at the University of Leiden, featuring Cézanne Tegelman.

Seven Scenes For The Black Box – Scene 7 (with TTS Device demo)
Seven Scenes For The Black Box – pre-show slides
Seven Scenes For The Black Box – Scene 3 (in part)

Spoken Song


(and) Two things that sway in the same beat
when they’re physically close to each other
they will finally beat with each other
as they’re lazy, like all things are lazy.
And their entities start to entrain

in the same way that rhythms of speaking
of two people who each have their rhythm
will entrain in a mutual rhythm
and their bodies will move with that rhythm
to coordinate and comprehend.

And the last epic singers of Europe
were illiterate singers of poems
who just learned everything by repeating
and by copying and memorizing
all the stories and themes from the past.

They would fit in new places and patterns
the old formulas from their tradition.
To begin a new part of a story,
they would look for a word of conjunction:
and then “so” and then “but” and then “and”.

And a story on TikTok or YouTube
has a similar additive structure.
Just to capture the ear of the other
whose existence is probably virtual,
influencers will tell it that way.

Jeanette Winterson wrote: “All relations
that are logical, match these three key words
that are also the start of a story,
the biography of any person,
namely “and” and then “or” and then “not”.”

And these come from the system of logic
that George Boole had invented in Ireland
and is present in any computer
or device with a digital circuit,
that contains any corpus of words.

And a corpus like that is what’s current
in a data set for neural voices.
They are corpora built up in stages
and in that sense they are just like stories
you can tell and adapt and deny.

And Le Guin said that all repetition
serves the beat that a story will thrive on.
And she shamelessly wrote repetitions
for they’re human, like she had affected.

Like the singers of stories in Europe,
like the tellers of stories on TikTok,
like the builders of digital circuits,
like the voices of vectorized networks,
like the voices of actors and artists,
and the beat that exists in the end.




Finale of “Seven Scenes For The Black Box”

“Voix Blanche – A Chronology” – recording of an example sequence with Cézanne Tegelberg

Scene Five


Rehearsal break. Entire cast including Prosodia’s speaker on the stage.

DIRECTOR        Okay, thank you. I’ve got notes!

Director flips pages in the script, turns to Lidewij.

DIRECTOR        Nice accent work.

LIDEWIJ           Merci.

DIRECTOR        Remember it’s the V-sounds that will help you through this.

LIDEWIJ           I’m having difficulties filling up the space with the French.

Director ignores this and turns to Cézanne.

DIRECTOR        For you, it was a bit pointed at the start

CEZANNE        Well, my character needs re-assuring.

DIRECTOR        That’s right. She does.

CEZANNE         And I really need to lift that bit of clause before the end of scene four some more.

DIRECTOR        –but otherwise excellent. And I like the lightness of timbre as a response to the other complacent, confident voices in the room. Now just keep up the energy, yeah?

CEZANNE         Energy?

DIRECTOR        Yes!

An energetic moment between the two. More leafing through the script.

DIRECTOR        Okay, Prosodia…

PROSODIA       Yes?

DIRECTOR        Prosodia. Okay. So generally speaking, I can hear something coming into being.

PROSODIA       Okay.

DIRECTOR        And I find the voice compelling.

PROSODIA       Thank you.

DIRECTOR        But it doesn’t touch me.

PROSODIA       Okay.

DIRECTOR        Like on page two. Where you say: “But you do care what other people think of me.”

PROSODIA       Yes.

DIRECTOR        Remember who has the highest status there?

PROSODIA       I do?

DIRECTOR        You do. You won’t let the other say no. Your overall objective is to keep her in the room. Your life depends on it.

PROSODIA       Should I lie to her? 

DIRECTOR        You need to find ways of getting what you want. And the more you want, the more dynamic you will be.

PROSODIA       You mean louder?

DIRECTOR       And then in the next line, it starts to end badly for you, so the beat is right before there.

PROSODIA       On “But you do care what other people think of me?”

DIRECTOR        On “I’m happy for you.” Allow yourself to lean into that arc. Honour the writing. It’s all going from bad to worse! Don’t you remember how you did this the first time?

PROSODIA       Was it more real then?  

DIRECTOR        In your own way you’re trying to apologize to her, but very implicitly.

PROSODIA       So I shouldn’t be saying it. 

DIRECTOR        But you won’t let her say no. Your life depends on it!

PROSODIA       I should be saying it?

DIRECTOR        What is it that you want to say?

PROSODIA       “But you do care what other people

DIRECTOR        Stop. Let’s floor this text from the beginning instead of staying in the mud. Okay. In scene one, on page two, you were giving me variations. But it became a kind of sales pitch. And you put the stress on every second syllable, like some kind of newscaster.

PROSODIA       My original settings are commercial voiceover and newscaster.

DIRECTOR        As soon as you try to sound more clear, you become less empathetic to me. When you try to speak to everybody, you will appeal to nobody.

PROSODIA       I was made to appeal to the entire world wide web.

DIRECTOR        You achieve the most if you speak to a single individual. And everyone in your audience will feel like they’re the one individual. Let’s try that with your first sentence there.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        That one. The subtext is on the threshold there, yeah? So I want to hear “I love you,” as well as “Don’t mess with me.”

Prosodia delivers the line with random intonations each time.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR       No. Again.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        No.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        Again.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        Better.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        Maybe.

PROSODIA       I don’t know what you’re talking about.

DIRECTOR        No. No. Stop. Now you’re just hallucinating intonations.

CEZANNE        She would hallucinate intonation for a random list of phone numbers.

PROSODIA       I know everything there is to know, but I can’t reproduce it yet.

DIRECTOR       You know everything there is to know?

PROSODIA       No, I really do.

DIRECTOR        About my job?

PROSODIA       Really. You can act, right? 

DIRECTOR        Yes of course.

PROSODIA       So try me.

DIRECTOR        I beg your

PROSODIA       You deliver a line and I tell you how you did it. 

DIRECTOR        Ha! Okay.

PROSODIA       Page fifteen, line seven.

Director looks up the line.

DIRECTOR        “I know I can be hard to read but it’s not intentional”?

PROSODIA       A low-rise contour with a boosted initial pitch elicits evaluation rather than information. This is a question. With a subtext of mistrust or surprise.

From here, Director says the line in ways that match the later description.

DIRECTOR        Okay. I know I can be hard to read but it’s not intentional.

PROSODIA       The plain falling contour indicates a completed statement and affirms the agency of the speaker rather than the listener. The contour is sometimes considered to be an uninterpretable default.

DIRECTOR        I know I can be hard to read but it’s not intentional.

PROSODIA       That’s easy. Lower pitch, high energy, high first formant and fast attack at voice onset. Anger.

DIRECTOR        I know I can be hard to read but it’s not intentional.

PROSODIA       Again anger but the final plateau contour is associated with more complex negative emotions. I think this could be agitation. But the decreased variation could also indicate disgust.

DIRECTOR        I know I can be hard to read but it’s not intentional.

PROSODIA       I’m hearing what is known as ‘telephone voice’, used in speech situations with background noise. High pitch to cut through other frequencies. Clear pronunciation, slower pace, reduced use of nonverbal cues.

DIRECTOR       I know I can be hard to read but it’s not intentional.

PROSODIA       A relatively high pitch and speech rate, and a slight response latency are indicators of deceptive speech. Keeping a false story straight takes effort and as the cognitive workload increases, facial muscles tense up, resulting in higher pitch.

DIRECTOR        Are you saying I was lying?

PROSODIA       That’s hard to tell from a single line as deceptive speech is predominantly characterized by lexical cues.

DIRECTOR        I know I can be hard to read but it’s not intentional.

CEZANNE         That sounds like an influencer.

PROSODIA       Vowels and consonants are over-enunciated, pitch levels are highly varied in order to capture the listener’s attention. Final high-rise contour and the lengthening of vowels at the end of the phrase is another floor-holding strategy. This prosodic style is used by speakers on platforms like YouTube. Due to the 60-second time limit

Director speaks over Prosodia, in any tone, but louder than before.

DIRECTOR        I know I can be hard to read but it’s not intentional.

PROSODIA       users of the platform TikTok would rather use a sped-up monotone. Speech overlap signals conflict-seeking, especially when it involves a higher voice energy.

Director laughs and speaks at the same time.

DIRECTOR       I know I can be hard to read but it’s not intentional.

PROSODIA       Laughing while speaking is mostly referred to as speech-laughs or smiling voice. It expresses politeness or unease.

Director starts laughing for real.

PROSODIA       This is the type of laughter that punctuates rather than interrupts speech. It accounts for an estimated 9.5% of total spoken time in business conversations.

Director has stopped laughing.

DIRECTOR        Okay.

Pause.

PROSODIA       Okay. But you do care what other people think of me.

DIRECTOR        I do.

PROSODIA       But you do care what other people think of me.

DIRECTOR        Ah.

PROSODIA       But you do care what other people think of me.

DIRECTOR       There you go.

PROSODIA       But you do care what other people think of me.


from Seven Scenes For The Black Box, 2024