Here’s an uncontroversial statement: vocals are the most important musical element in commercial music. And they come in many forms.

Think about how different the controlled soul of Beyoncé is from Chris Stapleton’s rugged croon or Future’s pitch-locked autotune rap sound. 

They all have a completely different effect on the listener. So, knowing what kind of vocals a track has – if any – can help listeners navigate large catalogues more effectively. Whether your users are looking for the perfect instrumental sync track or similar female-vocal-led tracks for a playlist, having granular tags that refine search results can turn hours of thankless catalogue searches into mere minutes of musical cherry-picking.

In this blog post, you’ll learn more about our vocal classifiers and how they can elevate the catalogue navigation experience.

We have four vocal classifiers:

  1. Instrumentation
  2. Vocal Presence
  3. Vocal Register
  4. Autotune Presence

If you’ve used the Musiio Tag Demo, they’re everything in the grey ‘Vocal Tags’ box.


Here’s a brief overview of how the AI tags your track:


Instrumentation

This classifier has only two tag options: Vocal and Instrumental. Knowing whether a track has a vocal is key in video production. Maybe one of your users is the next Marques Brownlee filming tech reviews with voiceovers and to-camera reactions. For them, instrumental tracks are the only way to go. The instrumentation classifier simplifies this catalogue filtering.

Conversely, if you have a database of track submissions and you’re a hot-shot A&R looking for the next Beyoncé or Sam Smith, you can use the tech to immediately show only tracks with a vocal. There’s no need to waste time skipping through instrumentals any more.

Vocal Presence

Curators don’t usually have access to such detailed tools. With this set of tags, playlist curators can filter by vocal presence to maintain consistency from one track to the next. With the filter set to high vocal presence, they can bank on tracks that feature vocals for most of their running time, not just the occasional ad-lib. That means fewer tracks are skipped.

Vocal presence is a measure of how much of the track’s duration features vocals, not the vocal's volume compared to the instrumental parts. 
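To make the distinction concrete, here is a minimal sketch of a duration-based measure. It is not Musiio’s implementation: the function names, the input format (a list of start and end times in seconds where vocals were detected) and the 25% and 75% thresholds are all assumptions made for illustration, as is the ‘medium’ label.

```python
from typing import List, Tuple


def vocal_presence_ratio(vocal_segments: List[Tuple[float, float]],
                         track_duration: float) -> float:
    """Return the fraction of the track (0.0 to 1.0) that contains vocals.

    vocal_segments is a hypothetical input: (start, end) times in seconds.
    Note that loudness is not an input at all, only time.
    """
    if track_duration <= 0:
        raise ValueError("track_duration must be positive")

    # Merge overlapping segments so that no second is counted twice.
    merged: List[List[float]] = []
    for start, end in sorted(vocal_segments):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    vocal_seconds = sum(end - start for start, end in merged)
    return min(vocal_seconds / track_duration, 1.0)


def presence_label(ratio: float,
                   low_max: float = 0.25,
                   high_min: float = 0.75) -> str:
    """Map a ratio to a label. The thresholds are illustrative assumptions."""
    if ratio < low_max:
        return "low"
    if ratio >= high_min:
        return "high"
    return "medium"


# A three-minute track with two short ad-libs versus one sung almost throughout.
adlib_track = vocal_presence_ratio([(10, 14), (95, 99)], 180)
sung_track = vocal_presence_ratio([(15, 170)], 180)
print(presence_label(adlib_track), presence_label(sung_track))
```

Under these assumed thresholds, a loud four-second ad-lib still lands in the low bucket, because only the time covered by vocals counts.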

Video editors looking to maintain interest in a scenic drone shot may want a track with a little je ne sais quoi – maybe a cheeky ooh or aah – without distracting attention from the visuals. In this case, the low vocal presence tag is ideal.

Vocal Gender

Musiio’s AI-powered tagging can determine whether a vocal is in a lower register (Barry White to Ben Howard), a higher register (predominantly female) or a mix of the two. 

A mix of the two could mean a duet like Lady Gaga and Bradley Cooper’s ‘Shallow’. It also applies to falsetto kings Bruno Mars and 80s Bon Jovi, who both have wide vocal ranges.

Knowing whether the vocal sounds male or female to the AI can streamline playlisting and recommendations. 

For playlisting, you could filter for the ‘sounds like female’ tag, the heartfelt mood tag and the soul genre to get candidate tracks for a ‘Women of Soul’ playlist.
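As a rough sketch, a filter like that could look something like this once tags are stored against each track. The `Track` structure, its field names and the exact tag strings are hypothetical, chosen for this example rather than taken from the Musiio API.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Track:
    # Hypothetical record: field names and tag strings are assumptions.
    title: str
    vocal_gender: str
    genres: Set[str] = field(default_factory=set)
    moods: Set[str] = field(default_factory=set)


def women_of_soul_candidates(catalogue: List[Track]) -> List[Track]:
    """Keep tracks that match all three tags: vocal, genre and mood."""
    return [
        track for track in catalogue
        if track.vocal_gender == "sounds like female"
        and "soul" in track.genres
        and "heartfelt" in track.moods
    ]


# Placeholder catalogue entries, not real tagging results.
catalogue = [
    Track("Track A", "sounds like female", {"soul"}, {"heartfelt"}),
    Track("Track B", "sounds like male", {"soul"}, {"heartfelt"}),
    Track("Track C", "sounds like female", {"rock"}, {"heartfelt"}),
    Track("Track D", "sounds like female", {"soul", "pop"}, {"heartfelt", "calm"}),
]

candidates = women_of_soul_candidates(catalogue)
print([track.title for track in candidates])
```

The result is a list of candidates rather than a finished playlist, so a curator still makes the final call.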

Interestingly, Justin Bieber’s voice at 15 ‘sounds like female’ according to the AI. We wrote a blog post about it. While you may not want young male singers on your playlist, you may still find these curveball picks fit your chosen vibe.

For recommendations, if a listener mostly listens to music with female singers, streaming services can use that data to serve them more Adele than Harry Styles.
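One simple way a service might turn that listening data into a signal is to work out the share of plays per vocal tag. The sketch below assumes a listening history that has already been reduced to one vocal tag per play. The function name and the ‘sounds like male’ string are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def vocal_tag_shares(listening_history: Iterable[str]) -> Dict[str, float]:
    """Return each vocal tag's share of a listener's plays (shares sum to 1)."""
    counts = Counter(listening_history)
    total = sum(counts.values())
    if total == 0:
        return {}  # no plays yet, so no preference to report
    return {tag: plays / total for tag, plays in counts.items()}


# Placeholder history: three plays tagged one way, one the other.
history = ["sounds like female"] * 3 + ["sounds like male"]
shares = vocal_tag_shares(history)
print(shares)
```

A recommender could then weight its suggestions by those shares, alongside whatever other signals it already uses.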

Autotune Presence

This powerful classifier, launched in 2022, can determine how processed a vocal is. The applications for playlisting are huge. Imagine a user assembling an old-school hip-hop playlist based on a sound like A Tribe Called Quest. They put out The Low End Theory before the invention of autotune, so an easy way to filter out inappropriate tracks would be to set the autotune presence to ‘None’.

In the sync context, consider how a tech brand like Razer might position itself with music. The sound of an autotuned rapper evokes futurism and tech culture in a way that a melodic, natural voice doesn’t. Equally, a high-quality food brand like Marks & Spencer is more likely to sync Bill Withers than T-Pain. Autotune presence is an important filter to help users find the vibe they want quicker.

Autotune presence is also meaningful for A&R. If you want to find tracks in a database of submissions in the vein of ElyOtto’s ‘SugarCrash!’ that are more likely to pop on TikTok, medium and high levels of autotune give you that digital edge. But if it’s the next Shawn Mendes that you seek, low and medium are the way to go.

Want to understand more about our technology? Check out our previous explainers on standard moods, enhanced moods, choosing which mood is right for your audience, and energy.

If you’d like to reach out for a tech walkthrough or learn more about how Musiio tech can help you maximise your music catalogue, email us or fill in the contact form.
