Today’s Lesson: An Luu – Pourquoi tu me fous plus des coups?

I have not been spending enough time on my blog lately, and it’s time for that to change with “Today’s Lesson:” a weekly segment in which I review a song or an album that I’ve recently been listening to and find myself interested in. Unlike a traditional review, I won’t be assigning scores. Instead, I focus on writing about music that I think is worth writing about, so in some tacit sense you can assume these are all recommendations. The real purpose of this segment, though, is to use a song as a jumping-off point to talk about whatever else is on my mind. We’re too quick to separate music from the other things in life, and this is my own small way of questioning the merit of doing that.

 

Just bask in the glory of this unabashedly 80s pop single by An Luu, a French actress. I don’t know how or why Spotify recommended this track to me, but I sure am grateful they did. I love it. There is absolutely no pretense, posturing, or even showing off. The song sets up a basic, spartan groove as An’s breathy voice floats over top. There is a certain innocence to the vocal delivery, too. A vulnerability. I’ve listened to the song maybe dozens of times over the past week or so. I’m utterly fascinated by it, partially because I challenged myself to understand the song without looking up a translation of the lyrics.

In writing this post, though, I decided to lift that curtain and track down a translation of the lyrics. Even mangled by Google Translate, the song’s lyrics are immediately understandable: she’s asking a lover why he stopped beating her – maybe he doesn’t love her anymore? That’s heavy stuff, to put it mildly. The scenes are immediately evocative of Lou Reed’s Berlin concept album: domestic, bluntly laying out the cruelty, and confronting the hard-to-understand inner workings of the abused who stay in the relationship. Though it might just be coincidence, the lover in An’s song is referred to as LouLou…

I won’t lie, I’m kind of bowled over right now. And this is the kind of musical experience I really cherish: being drawn into a complicated beauty of a song. Listen to it. It’s just pretty. But it’s simultaneously so baldly ugly too. This is something special because it gives us a slice of life. A bad slice, but a slice nonetheless. Music can contain truths that are lost in mere words and, disgusting as they may be, we mustn’t run from them. This is us, this is who we are. This is reality.

On social media, we often find ourselves framing our lives in a certain light, highlighting the good and erasing the bad – hiding it away, putting only our best selves forward. But it’s a lie, isn’t it? And even when someone tries to show a more complete picture, we look down our noses at them for sharing drama or just being too sad. Something media scholars study at length is this phenomenon of how we present ourselves and how we act when (we think) someone is watching: we change. We pretend to be something else. An Luu strips away all of that with this song, and it’s arresting in its beauty.

Eno: Music is political

Recently, an interview with Brian Eno appeared in Pitchfork about ambient music, what it means, and where it might be going. As much as I generally find Pitchfork annoying with its condescending attitude, this is a great interview, done by Philip Sherburne. I debated what I wanted to highlight from it, but one quote from Eno resonates with me the most right now:

You can’t really make apolitical art. We started out talking about ways of composing; ways of composing are political statements. If your concept of how something comes into being goes from God, to composer, conductor, leader of the orchestra, section principals, section sub-principals, rank and file, that’s a picture of society, isn’t it? It’s a belief that things work according to that hierarchy. That’s still how traditional armies work; the church still works like that. Nothing else does, really. We’ve largely abandoned that as an idea of how human affairs work. We have more sophisticated ways of looking at things. [Emphasis mine. – JDS]

Do be sure to read the full interview, linked above.

What ever happened to surround sound music?

This post is based on a presentation I gave at the inaugural Indiana University Media School Graduate Student Conference.

About ten years ago, it seemed like there was a new game in town: surround sound music! Of course, those of you old enough can recall that this wasn’t the first time such a promise was made. But this time, by golly, it was going to work! And if you believe that, then I have a 3D TV to sell you, too. Surround sound music seems like such a natural evolution – much like 3D TV – but time and again it has failed to launch.

Back in my undergrad I took a course where, for one assignment, we had to produce a 5.1 mix of a project one of our peers had recorded. Even while explaining the assignment, the professor seemed doubtful about surround sound music really taking off and this being a relevant skill to build. It sure was fun to play with for the assignment, even if my mix was terrible.

As much as I try to avoid jargon, this post is going to have some. So, before I really dive in here, I’m going to hit you with some definitions:

  • Mono: one channel of audio information. It might come out of one speaker or several, but when every speaker is playing the exact same signal, it’s mono.
  • Stereo: two channels of audio information.
  • Surround: more than 2 channels of audio, where some number of channels are positioned in such a way that the sound that comes out of them is coming from the sides, behind, above, or below the listener.

Also, I think to contextualize my argument properly, I need to give a (painfully!) brief history of recorded music, too.

  • Sheet music: circa 2000 BCE, cuneiform tablets had musical notation on them
  • Mechanical reproduction: circa 9th century (!!!), a hydro-powered organ that performed music etched into interchangeable cylinders by the Banū Mūsā brothers

A diagram of the hydro-powered organ.

  • Phonograph: 1877, Thomas Edison. Wax cylinders that could have audio waveforms etched into them and played back later.
    • Recorded live; recording and playback on one mechanism.
  • Disc phonograph: 1889, Emile Berliner. Platters instead of cylinders.
    • 33 ⅓ rpm (the LP): 1948, Columbia Records
  • (Practical) stereo sound: Bell Labs, 1937
  • Surround first attempted by Disney’s premiere of Fantasia in 1940
  • First “big” consumer format was Quadraphonic in the very early 1970s
    • Actually 3 competing and not cross-compatible formats
    • Could be done on tape or vinyl
    • CDs hypothetically could contain quadraphonic sound and it is allowed (but under-specified) in the “Red Book” but this was never commercially attempted 
  • Once the DVD and home theater setups became largely ubiquitous, DVD-A was attempted (among others)

OK, now to the good stuff!

The Case Against Stereo

It’s kind of hard to imagine given its ubiquity and seemingly obvious design, but stereo music was not met with a resounding embrace. Perhaps most understandably, the public needed to be convinced that it was more than a mere gimmick. But even musical luminaries like Brian Wilson of the Beach Boys and Phil Spector – architect of the famous “Wall of Sound” production aesthetic (and, much later, a convicted murderer) – spoke out against stereo. Spector thought that stereo would take control away from the producer, and power away from his Wall of Sound. It was an issue of scale: the Wall of Sound didn’t seem to work in stereo.

Wilson’s concerns were similar, but focused more on the fact that stereo required trusting the public to set up their systems correctly. If the speakers weren’t placed right, the stereo image would be strange, and the balance between the left and right sides of the music would be bizarre or, at the least, transformative to the recording. Contrast this with mono systems: you just plug one in and turn it on. There’s nothing to calibrate.

To make matters worse, when companies were pushing stereo, they needed stereo records to sell. As such, lots of recordings that were designed for mono were reprocessed as stereo. Back to Spector’s concerns: these recordings were not conceptualized for stereo, so even on a well-set-up system the result is ultimately a perversion of what the recording was meant to be. Even more damning, audiences had mixed reactions to “stereoized” recordings.

Surround Sound: more channels = more music?

It was only about 10 or 15 years after the initial foray into stereo music that surround sound first came to the consumer market, in the form of quadraphonic sound: four speakers positioned around the perimeter with the listener in the middle. Just think about the physical reality of that for a moment! A few years earlier, a mono system could be plunked down wherever convenient: no wires running every which way, it sounded pretty good in a large portion of the room, and it was cheap. Then came two speakers, but the sweet spot was still pretty large and the wires were at least limited to one side of the room. But quad? This required an entire room dedicated to the listening of music, and you couldn’t stray far from the sweet spot and have it still sound “good.” Wires had to run the perimeter of the room, too. Add the cost of four speakers and the specialized playback systems. Yes, systems: there were several competing quad formats that were not cross-compatible. Yikes. Couple that with the quad-ized recordings and it was a bit of a mess.

All of that aside, there is a certain parity between the mono-to-stereo move and the stereo-to-surround move. But one worked and the other didn’t. Why?

Affordances of the Medium

Every medium is unique: Van Gogh’s Starry Night rendered in watercolor would be a different work, because watercolor and oil do different things. The same applies to music formats: each has a unique set of strengths and weaknesses. Things tend to be most interesting, it seems to me, when artists leverage these affordances of the medium to create something that only works in that medium. The concerns about surround sound delivery are becoming less and less pronounced: modern surround emulations work on headphones, and even home theater soundbars can more or less fake surround sound. But where’s the music?

Starry Night as painted by Van Gogh

A watercolor re-interpretation.

I think that it has to do, largely, with the fact that not many artists need (or want) a surround sound space to do their work. In the West, our music listening traditions are deeply rooted in musicians being collected together in one area and the audience paying attention to them. (It hasn’t always been this way, but it has been for a few hundred years for the most part.) With our two ears in any physical space, we will hear stereo sound. So between our cultural practices of music and our built-in stereo receiver, stereo music works nicely.

Let’s go back to Spector’s Wall of Sound. The Wall of Sound didn’t scale well to stereo because it was built on the idea that by layering copy after copy of a single part together, he could create an all-encompassing assault of music. Splitting this into stereo meant he would need to double what were already some of the largest, most complex recording sessions. It just couldn’t be done effectively. Now recall that surround at least doubles the number of channels yet again.

What does surround sound music even sound like?

Ever listen to early stereo recordings? You might hear the drums all the way in the left, the bass all the way in the right, and so on. Maybe there would be extra reverb added to fill the space. It was a bit extreme, but it was a necessity: those sources weren’t recorded to be stereo, so all engineers could do was put individual mono signals in different spots in the mix. And by golly, if people are paying for stereo, let’s make sure they hear it! This was also due to limitations of early stereo recording consoles, where panning (placing things in the stereo field) was reduced to “L-C-R:” a three-way toggle for left, center, or right. But back to surround… what should go in those additional channels?

“This 5.1 mix of Megadeth is so going to be worth it.”

The answer to this question is similar to the answer for early stereo: grab elements from recordings conceptualized for stereo and distribute them across the additional channels. The result is an emaciated surround mix, spread thin around the room. Crucial pieces are excised and hung out in the periphery. Even worse, sounds from beside or behind the listener have very different psychological meanings than sounds from in front of you. On a fundamental, animal level, sounds from sources we can’t see are startling.

Another approach was to take a stereo recording and make it sound as if you were listening in some idealized listening environment – an emulation of a space. It’s an interesting idea, but there’s no way to account for what the listener’s room already sounds like. Once more, this is ultimately noise. The signal is the music!

#NotAllSurround

I don’t mean to universalize. There are some wonderful examples of surround sound music out there, but it’s very niche, and that’s because it necessitates the entire process of recording the music (if not the conceptualization of the music itself!) being done, from the ground up, for surround sound. And it’s hard. It’s very, very hard to do because there is so little basis for comparison. Part of successful artistic endeavor is pushing against the boundaries of the possible; in surround sound, those boundaries are so much more distant than in stereo or mono that it’s hard to even find them. It’s for these reasons that I think surround sound music will never leave the niche. If the content is good, people will find reasons to jump through the hoops to listen to it.

A Recommendation

Even though I’ve been dumping on surround sound music, I don’t want you to think that I dislike it or think it’s dumb. Far from it! It’s just hard to find examples of surround sound music that sound like they should be surround or that they are doing something that can only be done in surround. But those examples do exist, and I’d like to recommend one:

The Flaming Lips: Yoshimi Battles The Pink Robots 

I recommend this one in particular because it’s a reasonably well-known recording in its own right, but also because the stereo and surround versions allow for a compare-and-contrast: the ‘Lips didn’t just release a surround rendering of the stereo mixes – they’re different versions of the songs, with different elements and different vibes. The Flaming Lips have long played around with surround sound, so it only seems fitting that they knocked this one out of the park. And despite its age, it still sounds like the future – and that’s what surround sound is all about, right?

I come not to praise the album, but to bury it

Ok, not really. The album has been, and probably will remain, a viable and vibrant means of artful expression for the foreseeable future. But like the vinyl medium, I think its halcyon days are over. All because of this little bastard:

shuffle-icon-614x460

Damn you!

An album is, at least, a loose collection of related songs. Typically the recordings contained within are from one band or artist, from a similar time period, and contain sonic cues to relate the songs to one another. In short, they sound like they belong together. But now if you were to open up your music player of choice, I’d put even money on you having some kind of playlist or aggregate view that contains works from multiple artists, times, genres, and so forth. And you probably (gasp!) mix these together to organize them into ad hoc compilations that suit some purpose or setting.

As a music consumer, I find making playlists an endlessly fun and fruitful way to explore connections between my life and music. I find new ways to connect and relate to artists and music when I have free rein to build playlists. Was Friendly Rich‘s “Mr. Skin’s Hymn” meant to be put in conversation with Scott Walker’s “30th Century Man?” I don’t know, but now that I’ve put them on a playlist together, I quite enjoy it.

scottwalker-horzb

Please don’t confuse Scott Walker’s “30th Century Man” with this Scott Walker, a 17th-century man

I’m sure you have your own similar experiences with making playlists. Even the simple act of putting music of a similar tempo together for the purpose of a workout playlist is destructive to the concept of the album. So why, then, this disconnect between the way people listen to music and the way music is released?

Putting on my musician hat, something I’ve grappled with for a long time is “should I record an album?” Aside from my legions of adoring fans demanding such a release, why should I? My music listening habits inform my music creation habits. I don’t have the material for an album per se, because my collection of recordings is more like a playlist. There is some kind of implicit thread through them all, but an album implies a genre, a mood, a production sense… something. And I don’t have it.

empty-theater

Pictured: the wild throngs of fans

I don’t think it’s a matter of discipline, either. This is a pointed effort: I want each song to exist in what I think is its best and truest form. I want to celebrate diverse inspirations. I want my music to reflect the way I listen to music. That means sacrificing the obvious sonic cues that these recordings belong together. And I know I’m not alone. In fact, composers have been playing with this idea for a long time. The likes of Stockhausen and Cage challenged us to question what sounds belong together in music. Moving up a level: what songs belong together on an album?

There’s something coming over the horizon – a new way to think about a collection of recordings that belong together – and it isn’t an album as we know it. It’ll be some new way to approach the underlying logic of how and why songs belong together, and what it means for them to exist in one release. I’m excited to find out what it is.

Once More, with (Less) Feeling: artificialized vocals

 

This semester has been challenging and fun. One class, in particular, really pushed me. It’s a class on music information processing. In other words, it’s a class on how computers interpret and process music as audio. I’ll spare you a lot of the technical stuff, but generally speaking we were treating audio recordings as vectors, with each value of the vector corresponding to the amplitude of one sample. This allowed us to do all sorts of silly and interesting things to the audio files.

The culmination of the class is an independent project that utilizes principles learned in the class. This presented a unique opportunity to design an effect that I’ve wanted but couldn’t find: a way to make my voice sound like a machine. Sure, there are vocoders, pitch quantizers, ring modulators, choruses, and more… but they don’t quite do what I want. The vocoder gets awfully close, but having to speak the vocals while also performing the melody on a keyboard is no fun. iZotope’s VocalSynth actually gets very close, but even then it’s hard to blend the real and the artificial. There had to be something different!

And now there is. Before I can explain what I did, here’s a little primer on some stuff:

Every sound we hear can be broken down into a combination of sine waves. Each wave has 3 parameters: frequency (pitch), amplitude (loudness), and phase. You’ll note that phase doesn’t have an everyday analog like frequency does with pitch. That’s probably because our hearing isn’t sensitive to phase (with some exceptions not covered here). Below is a picture of a sine wave.

zxfec

See how the wave starts at the horizontal line that bisects the wave? This sine wave has a phase of 0 degrees. If it started at the peak and went down, it would have a phase of 90 degrees. If it started in the middle and went down, it would have a phase of 180, and so forth.
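To make phase concrete, here’s a quick sketch (in Python with numpy, purely for illustration) of the same 1 Hz sine wave started at the three phases described above:

```python
import numpy as np

sr = 100                      # samples per second (illustrative, not audio-rate)
t = np.arange(sr) / sr        # one second of time stamps
f = 1.0                       # a 1 Hz sine

deg0   = np.sin(2 * np.pi * f * t)               # phase 0: starts at zero, rising
deg90  = np.sin(2 * np.pi * f * t + np.pi / 2)   # phase 90: starts at the peak
deg180 = np.sin(2 * np.pi * f * t + np.pi)       # phase 180: starts at zero, falling
```

Played on their own, all three would sound identical – the phase differences only matter once waves are combined.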

As I said, we don’t really hear phase, but it’s a crucial part of a sound because multiple sine waves are added together to make complex sounds. Some of them reinforce each other, others cancel each other out. All in all, they have a very complex relationship to each other.

This notion of a complex wave represented as a sum of sine waves comes from a guy named Fourier. (He’s French, so it’s “Four-E-ay.”) There are a lot of different flavors of the Fourier transform, but the one relevant here is the Discrete Fourier Transform, usually computed with the Fast Fourier Transform (FFT). It deals only with finite sequences of numbers, which are very computer friendly.

There’s a technique built on the FFT called the STFT (Short-Time Fourier Transform) that maintains phase information in a way that’s easy to play with. One of the simplest tricks is to set all of the phases to 0, which turns the voice into a monotone, robotic one. Hm! That’s fun, but not very musical.
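That zero-phase trick is easy to sketch. Here’s a minimal version (in Python with numpy for illustration – my project itself is in R, and this toy uses non-overlapping frames, which real implementations don’t): keep each frame’s FFT magnitudes, discard the phases, and transform back.

```python
import numpy as np

def robotize(x, N=64):
    """Zero the phase of every analysis frame: keep |FFT|, discard the angles."""
    out = np.zeros(len(x))
    for start in range(0, len(x) - N + 1, N):           # non-overlapping frames
        mag = np.abs(np.fft.rfft(x[start:start + N]))   # magnitudes only
        out[start:start + N] = np.fft.irfft(mag, n=N)   # rebuild with phase = 0
    return out
```

A zero-phase frame is symmetric about its start, which is part of what gives the effect its buzzy, mechanical character.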

STFTs, as the name implies, analyze very short segments of audio, then jump forward and analyze another short segment. Short, in this case, means something like 0.023 seconds (1024 samples at 44.1k) of audio at a time. Here’s where the musical robot voice comes in: instead of jumping ahead to the next unread segment, I tell it to jump ahead only, say, a quarter of a window, grab another 0.023 seconds, jump another quarter, and so on. This imposes a sort of periodicity on the sound, and periodicity is pitch!
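Putting the small hop together with the zero-phase trick gives the imposed pitch. Here’s a minimal sketch (Python with numpy for illustration; the function and parameter names are my own, and my real R implementation appears at the end of the post): each Hann-windowed frame has its phases zeroed, then gets overlap-added at a hop of H = N/ratio samples, so the output repeats every H samples.

```python
import numpy as np

def robot_pitch(x, N=1024, ratio=4.0):
    """Overlap-add zero-phase frames every H = N/ratio samples to impose a pitch."""
    H = int(N / ratio)
    v = np.arange(N) * 2 * np.pi / N
    win = (1 + np.cos(v - np.pi)) / 2                 # Hann window, as in the R code
    out = np.zeros(len(x))
    for start in range(0, len(x) - N + 1, H):
        frame = x[start:start + N] * win
        mag = np.abs(np.fft.rfft(frame))              # keep magnitudes, drop phases
        out[start:start + N] += np.fft.irfft(mag, n=N) * win
    return out
```

With a steady input, every frame’s contribution is nearly identical, so the output is periodic with period H – and periodicity is pitch.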

By manipulating the distance I am jumping ahead, I can impose different pitches on the audio. This is essentially what I did in my project. More specifically, I:

  1. Made a sample-accurate score of the desired pitches
  2. Made a bunch of vectors for start time, end time, and desired pitches (expressed as a ratio)
  3. Made a loop to step through these vectors
  4. Grabbed a chunk of sound from a WAV file
  5. Performed an STFT using the pitches I plugged in
  6. Did an inverse STFT to turn it back into a vector with just amplitude values for samples
  7. Turned that back into a WAV file

(See the end of the post for a copy of my code.)

Here’s what I ended up with!

And here’s what it started as:

Please be forgiving of the original version. It’s not great… I was trying to perform in such a way that would make this process easier. It did, but the trade off was a particularly weak vocal performance. Yeesh. My pitch, vowels, and timbre were all over the place!

Anyway, here’s the code. You’ll need R (or RStudio!) and the tuneR package. Oh, and the solo vocal track.

setWavPlayer("/Library/Audio/playRWave")

stft = function(y,H,N) {
 v = seq(from=0,by=2*pi/N,length=N) 
 win = (1 + cos(v-pi))/2
 cols = floor((length(y)-N)/H) + 1
 stft = matrix(0,N,cols)
 for (t in 1:cols) {
 range = (1+(t-1)*H): ((t-1)*H + N)
 chunk = y[range]
 stft[,t] = fft(chunk*win)
 } 
 stft
}

istft = function(Y,H,N) {
 v = seq(from=0,by=2*pi/N,length=N) 
 win = (1 + cos(v-pi))/2
 y = rep(0,N + H*ncol(Y))
 for (t in 1:ncol(Y)) {
 chunk = fft(Y[,t],inverse=T)/N
 range = (1+(t-1)*H): ((t-1)*H + N)
 y[range] = y[range] + win*Re(chunk)
 }
 y
}

spectrogram = function(y,N) {
 power = .2
 bright = seq(0,1,by=.01)^power
 grey = rgb(bright,bright,bright) # this will be our color palette --- all grey
 frames = floor(length(y)/N) # number of "frames" (like in movie)
 spect = matrix(0,frames,N/2) # initialize frames x N/2 spectrogram matrix to 0
 # N/2 is # of freqs we compute in fft (as usual)
 v = seq(from=0,by=2*pi/N,length=N) # N evenly spaced pts 0 -- 2*pi
 win = (1 + cos(v-pi))/2 # Our Hann window --- could use something else (or nothing)
 for (t in 1:frames) {
 chunk = y[(1+(t-1)*N):(t*N)] # the frame t of audio data
 Y = fft(chunk*win)
 # Y = fft(chunk)
 spect[t,] = Mod(Y[1:(N/2)]) 
 # spect[t,] = log(1+Mod(Y[1:(N/2)])/1000) # log(1 + x/1000) transformation just changes contrast
 }
 image(spect,col=grey) # show the image using the color map given by "grey"
}


library(tuneR) 
N = 1024
w = readWave("VoxRAW.wav")
y = w@left
full_length = length(y)


bits = 16
i = 1
# this is a vector containing all of the pitch change onsets, in samples
start = c(0,131076,141117,152552,241186,272557,292584,329239,402666,
 459154,474012,491649,697317,786623,804970,824932,900086,924171,
 944914,968743,984086,1082743,1088571,1120457,1132371,1151571,
 1335171,1476343,1614943,1643400,1666886,1995600,2133514,2274429,
 2300571,2325686,3332571,3412114,3437400,3451800,3526457,3540343,
 3569314,3581657,3600943,3610371,3681086,3694800,3745200,3763371,
 3990000,4072371,4091143,4113000,4195286,4216200,4233429,4254000,
 4286743,4380771,4407701,4422086,4443686,4630114,4750886,4768029,
 4906371,4934829,4958914,5286171,5409686,5428714,5565943,5595086,
 5618829,5944543,6068829,6086057,6223714,6250543,6275057)

#this is a vector containing all of the last samples necessary for pitch changes. in samples
end = c(131075,141116,152551,241185,272556,292583,329238,402665,459153,
 474011,491648,697316,786622,804969,824931,900085,924170,944913,
 968742,984085,1082742,1088570,1120456,1132370,1151570,1335170,
 1476342,1614942,1643399,1666885,1995599,2133513,2274428,2300570,
 2325685,3332570,3412113,3437399,3451799,3526456,3540342,3569313,
 3581656,3600942,3610370,3681085,3694799,3745199,3763370,3989999,
 4072370,4091142,4112999,4195285,4216199,4233428,4253999,4286742,
 4380770,4407700,4422085,4443685,4630113,4750885,4768028,4906370,
 4934828,4958913,5286170,5409685,5428713,5565942,5595085,5618828,
 5944542,6068828,6086056,6223713,6250542,6275056, full_length)

#this ratio determines the pitch we hear by manipulating the window size
ratio = c(4.18128465,3.725101135,3.318687826,3.725101135,4.693333333,
 4.972413456,4.693333333,3.132424191,4.18128465,3.725101135,
 3.318687826,3.132424191,4.18128465,3.725101135,3.318687826,
 3.725101135,4.693333333,4.972413456,4.693333333,3.725101135,
 3.318687826,4.18128465,4.18128465,3.725101135,3.318687826,
 3.132424191,4.972413456,5.581345393,3.725101135,4.18128465,
 4.972413456,4.972413456,5.581345393,3.725101135,4.18128465,
 4.972413456,4.18128465,3.725101135,3.318687826,3.725101135,
 4.18128465,4.693333333,4.972413456,4.693333333,3.725101135,
 3.318687826,2.486206728,4.18128465,3.725101135,3.132424191,
 4.18128465,3.725101135,3.318687826,3.725101135,4.693333333,
 4.972413456,4.693333333,3.725101135,3.318687826,4.18128465,
 3.725101135,3.318687826,3.132424191,4.972413456,3.725101135,
 5.581345393,3.725101135,4.18128465,4.972413456,4.972413456,
 3.725101135,5.581345393,3.725101135,4.18128465,4.972413456,
 4.972413456,3.725101135,5.581345393,3.725101135,4.18128465,
 4.972413456)

w = readWave("VoxRAW.wav")
sr = w@samp.rate
y = w@left
ans = 0

for (i in 1:81) {
#the loop steps through each of the 3 above vectors 
frame = y[start[i]:end[i]] #take a bit of the wave from start to end
 
H = N/ratio[i] #make the window this size to change the perceived pitch
Y = stft(frame,H,N)


Y = matrix(complex(modulus = Mod(Y), argument = rep(0,length(Y))),nrow(Y),ncol(Y)) # robotization
ybar = istft(Y,H,N)
ans = c(ans,ybar) #concatenate all of the steps along the way; the for loop handles incrementing i
}

ans = (2^14)*ans/max(abs(ans)) #scale so the loudest peak fits comfortably in 16 bits
u = Wave(round(ans), samp.rate = sr, bit=bits) # make wave struct
writeWave(u, "robotvox.wav") #save the robot version (read back below)
o = readWave("VoxRAW.wav")
o = o@left
spectrogram(o, 1024) #what does the original recording look like?
r = readWave("robotvox.wav")
r = r@left
spectrogram(r, 1024) #what does the robot version look like?
#play(u) #listen to the robot version

MP3s don’t matter (until they do)

I’ve written before on some of the differences in MP3s vs WAVs, specifically how MP3s seem to invoke more negativity than WAVs in a blind test. I don’t know about you, but I thought those results were interesting and weird. So, I thought it made sense to kind of zoom out and try and get a bigger picture of this phenomenon.

A logical first step was to ask: can people even hear the difference between WAVs and MP3s in their day-to-day life? If so, in what circumstances? As the title implies, people generally can’t tell in most circumstances – but once compression gets heavy enough that they can, it is a very pronounced shift.

The Experiment

I made an online experiment, asking people to listen to 16 different pairs of song segments and select the one they thought sounded better. There were 4 levels of MP3 compression: 320k, 192k, 128k, and 64k.

‘Why those levels of compression?’ you might be wondering. Amazon and Tidal deliver at 320k, Spotify premium does 192k, YouTube does 128k, and Pandora’s free streaming is 64k.

For each pair, one version of the segment was a WAV and the other was an MP3. (See below for more detail.) I also asked for basic demographic information, how participants usually listen to music, and how they were listening during the experiment. For example, a lot of people use Spotify regularly for music listening on their phones, and a lot of people used their phones to do the experiment. Running the experiment online gave up a lot of control over how and where people listened, but the goal was to capture a realistic listening environment.

The Songs

I selected songs that are generally considered to be good recordings capable of offering a kind of audiophile experience. Also, I tried to choose “brighter” sounding recordings because they are particularly susceptible to MP3 artifacts. The thought behind this was to maximize the chance for identification of sonic differences, because I was doubtful there would be any difference until a very high level of compression.

I also split the songs into eras: Pre and Post MP3. I thought that maybe music production techniques might change to accommodate the MP3 medium, and maybe MP3s would be easier to detect in recordings that were not conceived for the medium.

The Song List by Era

Pre MP3 (pre 1993):

  1. David Bowie – Golden Years (1999 remaster)
  2. NIN – Terrible Lie
  3. Cowboy Junkies – Sweet Jane
  4. U2 – With Or Without You
  5. Lou Reed – Underneath the Bottle
  6. Lou Reed & John Cale – Style It Takes
  7. Yes – You and I
  8. Pink Floyd – Time

Post MP3:

  1. Buena Vista Social Club – Chan Chan
  2. Lou Reed – Future Farmers of America
  3. Air – Tropical Disease
  4. David Bowie – Battle for Britain
  5. Squarepusher – Ultravisitor
  6. The Flaming Lips – Race for the Prize
  7. Daft Punk – Giving Life Back to Music
  8. Nick Cave & The Bad Seeds – Jesus Alone

The Song List by Compression Level

320k

  1. Cowboy Junkies – Sweet Jane
  2. Lou Reed – Underneath the Bottle
  3. Squarepusher – Ultravisitor
  4. Daft Punk – Giving Life Back to Music

192k

  1. David Bowie – Golden Years (1999 remaster)
  2. NIN – Terrible Lie
  3. The Flaming Lips – Race for the Prize
  4. Air – Tropical Disease

128k

  1. U2 – With Or Without You
  2. Lou Reed & John Cale – Style It Takes
  3. Buena Vista Social Club – Chan Chan
  4. Nick Cave & The Bad Seeds – Jesus Alone

64k

  1. Pink Floyd – Time
  2. Bowie – Battle for Britain
  3. Lou Reed – Future Farmers of America
  4. Yes – You and I

The Participants

I had a total of 17 participants complete the experiment (and 1 more do part of the listening task), plus a whole lot of bogus entries from bots… sigh. Here’s some info on the real humans who did the experiment:

Pie Charts2.png

Note: options with 0 responses are not shown

Pie Charts3.png

Pie Charts4.png

“Which best describes your favorite way to listen to music that you have regular access to?” was the full question. I didn’t want everyone to think back to that one time they heard a really nice stereo!

Pie Charts5.png

Pie Charts6.png

Pie Charts7.png

“This includes informal or self-taught training. Examples of this include – but are not limited to – musicians, audio engineers, and audiophiles.”

 

Unfortunately, the sample size wasn’t big enough to do any interesting statistical analyses with this demographic info, but it’s still informative to help understand who created this data set.

The Results

Participants reliably (meaning, with a statistically significant binomial test) selected WAVs as higher fidelity when the MP3s were 64k. Other than that, there was no statistically significant difference.

[Graphs: WAV vs MP3 fidelity judgments at 320k, 192k, 128k, and 64k]

11 to 57 in favor of WAV, p < 0.001
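A quick aside for the statistically curious: the binomial test behind that headline number is easy to reproduce. Here’s a minimal sketch in Python, using only the 11-to-57 split reported above; this is an illustration, not the actual analysis script:

```python
from math import comb

def binom_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: total probability of every outcome
    no more likely than the observed count k under the null (chance = p)."""
    pmf = [comb(n, i) * (p ** i) * ((1 - p) ** (n - i)) for i in range(n + 1)]
    return sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9))

# 57 of 68 fidelity judgments favored the WAV over the 64k MP3
print(binom_two_sided(57, 68))  # well below 0.001
```

The same calculation is available off the shelf as `scipy.stats.binomtest`; the hand-rolled version above just makes the logic visible.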

When I first looked at the Pre/Post MP3 comparison, I was flummoxed. There is a statistical difference in the Post MP3 category… favoring WAVs.

[Graph: WAV vs MP3 preference, Pre vs Post MP3 songs]

That’s pretty counter-intuitive. That would be like finding that people preferred listening to the Beatles on CD instead of vinyl. It just doesn’t make sense. Why would recordings sound worse in the new hip medium that everyone’s using?

They don’t. My categorization was clumsy. Yes, I selected 8 songs that were recorded after MP3s were invented, but what I didn’t consider is that the MP3 was not a cultural force until about a decade later, and not a force in the music industry until even later than that. So I went back to the Post MP3 category and split it again. Figuring out when the MP3 became a major force in the recording industry was a rabbit hole I didn’t want to go down, so I used a proxy: Jonathan Sterne, a scholar who studies recording technology, published an article in 2006 discussing the MP3 as a cultural artifact. Luckily, using 2006 ended up being fruitful because, of my 8 songs in the Post MP3 category, none were released on or even near 2006. I had 5 released before and 3 released after, and when I analyzed those groups, there was a strong preference for WAV in the older recordings but not in the newest recordings. This suggests that recordings made after a certain date are generally produced to sound just as good as MP3s of a certain quality or as WAVs. Here’s the analysis:

[Graph: Post MP3 songs released before 2006]

25 to 60 in favor of WAV, p < 0.001

[Graph: Post MP3 songs released after 2006]

So, to sum up: for these participants, the debate between WAV and MP3 doesn’t matter for identifying fidelity differences in real-world situations UNTIL the compression gets extreme. And recordings designed for CDs rather than MP3s sound better as CDs than as MP3s, but it doesn’t matter for older recordings. If I had to guess, it could be because some of the limitations of the vinyl medium are similar to MP3 (gasp! Heresy!), so recordings designed for vinyl work kinda well as MP3s, too.

Let’s define music!


Goodness, I have written lots of words about music, but I’m not sure if I have ever thoroughly defined what I mean by “music.” In this post you’ll find my definition, of course, but I want to clarify right up front that this may read as slightly antagonistic. In a sense it is meant to be, but ultimately it is about how to define music in the context of communication. I’m trying to push boundaries, not hurt feelings.

I don’t claim all of these thoughts as my own, but this may be a unique synthesis of standing ideas. I’ve also touched on some of these ideas in previous posts, but I wanted to put them all together.

Music describes a way of thinking about sound.

Music is a bit like the infamous Supreme Court ruling on pornography: it’s hard to define but when you’re presented with an example, you recognize it immediately. Once you start leaving the very obvious examples, it gets kind of hard to find the boundary between music and regular sound. That’s because music describes a way of thinking about sound, not a specific kind of sound.

I think the most famous example of pushing the boundaries of music in the western world might be John Cage’s 4’33”. A pianist sits down, prepares to play, then does nothing for 4 minutes and 33 seconds. Is that music? Well, Cage would certainly say so, but the audience in the music hall is split. Some say yes, some say no. Who is right?

I would argue that 4’33” in that example is definitively music, and here is why: the context. In his autobiography, Frank Zappa argued that context is key. He called it “putting a frame around it.” Let’s explore this a bit. The audience in my example above is at a music hall to hear music. A performer sits at an instrument, prepares to play, then plays silence for 4’33”. While it is certainly up to audience members to decide how much they enjoy the performance, they can’t really argue about whether or not music happened because the context clearly articulated that music happened.

Here’s another example: you’re walking in the woods alone, and you come to a clearing to find a pianist sitting at a piano. As you approach, she hops up and says “ah! I just finished my performance of 4’33”! What did you think?” Did you hear music for the last 4 minutes and 33 seconds? I don’t think so. There was no contextual clue to encourage you to think about sounds as music for the previous four and a half minutes. (Unless, of course, you just so happened to be doing it of your own free will, but the odds of that are remote.)

Another way to think about it is the old paradox: don’t think about an elephant. It’s impossible to not think about an elephant when you are given this prompt. Similarly, the people in the music hall are thinking about music and thinking about sound as music. Even if they’re thinking “ugh, this is stupid, this isn’t music,” they are still thinking about sound as music.

Music is communication.

When we hear sound as music, we are interpreting and processing it. Music is inherently more vague in its meaning than language, but there is still meaning. Music has emotional impacts, triggers memories, and causes physiological responses. Language does all of these things, too.

I think a lot of people get hung up on the idea of “music is communication” because music isn’t specific or declarative. I agree wholly that music is non-specific and non-declarative. I can’t play you a tune on a recorder to ask you to get me a beer (I would if I could, though!). And if you ask 10 people to listen to the same song, they’ll each tell you something different when asked what it means.

However, language suffers some of the same faults. Has anyone ever misunderstood you? Or have you ever said something that came out wrong? Of course you have. Language is specific, but the interpretation is difficult. I think music suffers a somewhat similar fate: a composer can intend to convey a scene or a feeling, but different audience members will have different responses.

Also, I’m blogging right now. (Duh.) But why? Well, blogging has a certain set of affordances that other kinds of communication lack. I could say this out loud, but only the other people near my desk would hear me. And once I’ve said it, it’s gone forever. I could write a book, but that means people need to buy it to read my thoughts. I could write a poem, but my poetry is terrible. The point is that I’m writing this in blog form because it seems to be the best way for me to share these specific ideas in the way that I want to share them. Music is no different. I can express things that are difficult or impossible to express outside of music.

I think a more complete analysis of the affordances of music would be a swell thing to do, but here’s a short sketch: musical expression has no substitute mode of expression. I can’t accurately tell you about a piece of music, I can only approximate it in words. Information is lost when I talk about it compared to you experiencing it first hand. I think what is lost is the thrill and the emotion. Not only am I sharing words, but I’m sharing my interpretation of it. I’ve taken the experience out of it. It’s like baby food: the nutrition is there, but the experience of texture is lost in the processing.

Music is interesting.

Unlike language, music is inherently interesting. Language is designed to convey specific ideas. The goal is clarity and meeting expectations of normal patterns of communication. Sentences have at least a noun and a verb. Normal communication is utilitarian and functional. Musical communication is impressionistic and fanciful.

Part of the joy of listening to music is the blend of having your expectations met and defied in unexpected but carefully constructed ways. A piece of music establishes or implies a set of rules, but then defies those rules for your enjoyment. For example, a common thing to do in a pop song is to modulate up part of the way through the song. This defies expectations because the song has clearly established itself to exist in a given key, but then everything suddenly shifts upwards. The foundation the song was built on just got pushed upward a little bit. It’s startling, but it can be pleasant when done artfully. Another example is establishing a phrase (a pattern) by repeating the structure, but then unexpectedly stopping the pattern short. Again, this can be quite exhilarating and pleasant when done carefully. Imagine that happening in a conversation, though. Someone is talking to you and they just stop right in the

… Language doesn’t work that way, does it? Language is meant to inform and music is meant to challenge and entertain you, in a broad sense. Attempts to describe music in terms of musical forces (like physical forces) sometimes stumble because music does unexpected things. A thrown ball will always obey physical forces. In that sense, it is uninteresting. Music, however, will only sometimes obey musical forces and that’s part of the point.

Music is important.

Music is a means of expression for both performers and listeners. It is therapeutic. Music helps build identity both for individuals and groups. These are concrete, real psychological benefits. Music helps us survive, and it helps shape societies.

And now, I think a brief explanation of what music is not would be useful.

Sheet music is a lie.

Sheet music is not music, nor is it an accurate representation of music. It is a shorthand expression and a necessary means of preserving musical ideas in the era before recording audio was possible. It is a useful guide for memorization and performance. Systems that explicitly or implicitly rely on sheet music as if it were real music are faulty. Sheet music captures onsets and durations in an abstract and imperfect way, and makes little to no attempt to capture feeling.

Schenkerian analysis is a way to analyze music, but it is not the way.

Schenkerian analysis is a useful tool for analyzing music of a certain type when asking certain questions. However, since it is by far the dominant (heh) method of musical analysis, it is often applied to situations where it is not relevant or meaningful. Schenkerian analysis also presumes that sheet music is an accurate representation of music: it is performed on sheet music, not actual music. It also produces a tautological result: each piece of music can be reduced to simpler and simpler versions, eventually ending in a descending pattern of notes. On the surface this is a stunning revelation about how music works, but the problem is that Schenkerian analysis demands this outcome.

When studying the psychological implications of music, it is important to ask questions about the music that most people actually experience.

Remember, music is a phenomenon that exists in the mind. It then follows that it is important to study the kinds of music found in most minds. And I think it’s safe to say that Schubert isn’t it. It’s time to roll up our sleeves and dig into the music of the now.

Music perception and cognition research largely limits itself to SERIOUS CLASSICAL MUSIC and maybe jazz when feeling cheeky. This is a problem! And please don’t think I’m knocking serious classical music or jazz, or the study of that music. It’s very important and relevant, and I am grateful that people do it, because both forms of music profoundly influence our current popular music.

What I am advocating is that music be studied in a way that better matches how most people experience it. Artificiality is a challenge in any line of research, but this stumbling block seems easy enough to avoid. The barriers to studying popular music come from institutional elitism, not practical issues.

Anyway, I hope you enjoyed this or at the very least found it provocative. I know it helped me a lot to codify all of these thoughts in one place, so I thank you for the indulgence.

Metaphors, music, and learning from the absurd


It finally happened. I think every graduate student gets one, and I got mine: a reading assigned for class that is completely blowing my mind. Steve Larson’s Musical Forces is provocative, funny, and controversial. Larson argues that, like the physical world, music has forces that govern (or, in the case of music, “influence” might be appropriate) its motion through time. Music has forces that are similar to the physical forces because of the one thing common to every human: the experience of having a body and existing in the physical world. And we base all of our knowledge in metaphor for the physical world. (Note: “base,” “in,” etc.)

 

Larson even says he can quantify the musical forces. You’ll have to read it yourself to see if you agree. I have yet to make up my mind.

Anyway, time to pivot:

[Photo: a pawn shop sign]

… says the pawn shop, without a hint of irony.

I’m finally starting to gain some perspective on what truly interests me and the conceptual continuity that connects all of my expression. From a personal perspective, I see little distinction between my identities as a scientist and a creative. Research, to me, is a fundamentally creative endeavor and despite the stereotypes about creative types, I think scientists and creatives face very similar problems:

 

  • What hasn’t been done yet?
  • How can I synthesize things that have been done to produce new things?
  • How do I know if it’s good?
  • When is it done?
  • What do I do with it when it’s done?
  • What value does this create?
  • What else could I have been doing if this fails?

The threads that I see more and more connecting these aspects of my life are all about levels of abstraction. Cast in another light, it might be described as metaphor in the same way that Hofstadter and Larson mean it: cross-domain mapping. (As well as allegory, which is intra-domain mapping). Now, before you recoil in horror at that jargon, let me clarify this idea a bit while also making it more opaque.

Cross-domain mapping is about making an association between two unrelated things. First of all, think of domains as categories. The classic example is “the legs of a chair.” Chairs don’t have legs. Not really. Animals have legs, and a chair is not an animal. We call those sturdy vertical protuberances on the bottom of a chair “legs” because their function and form are evocative of actual legs. An example of intra-domain mapping is something like saying “[song a] starts the same way as [song b].” They don’t literally start the same way, but we choose to relate them. Surely the notes played, arrangement, tempo, etc. might be highly, highly similar but they aren’t literally identical. Larson calls this kind of comparison “hearing as.” Going back to the legs of a chair, that would be an example of “seeing as.”

Right about now, if you’re still with me, you might be thinking “oh, well this isn’t so hard.” But there’s that sense of something lurking in the depths, isn’t there? A sense of unease. An ugly question rears its head: what exactly qualifies as a domain? The short answer is that there is no answer. There are big, obvious domains that would be hard to argue as being part of the same domain like cars vs dogs, South Indian cuisine vs Southern Indiana cuisine, blogs vs good sources of information, and so on. Got it? Good.

For your consideration, what is this pictured below?

[Photo: the USS Enterprise NCC-1701-A]

Depending on your individual knowledge, possible answers range from “that Star Wars thing” to “the Enterprise NCC-1701-A, a refit Constitution class cruiser, under command of Admiral James T. Kirk.” Now, given the disparity between those descriptions, and not even considering everything in between, can you see how it would be hard to define universal, concrete domains? Let’s go further. Is the ship below the same or different from the one above?

[Photo: the USS Enterprise]

Very quickly, you’ve probably come to the conclusion that “it depends – it’s complicated.” You’d be right. Domain mapping gets complicated quickly because domains are highly context driven as well as individualized.

There’s good news, though. Metaphors and allegories can organize nicely into hierarchies depending on your level of analysis: human vs animal -> animal kingdom vs plant kingdom -> multicellular life vs single cellular life -> … Whatever the context or individualized knowledge you possess, we all have hierarchies of abstraction.

[Animated GIF]

Inevitably, you end up with this trope.

And at least right now, that’s the thing that interests me: how do we, as humans, manipulate these hierarchies of abstraction to communicate effectively? Music, to me, is a primary example of this. I could orate, paint, or even write all I want to try and have you understand a piece of music and it wouldn’t matter one bit if you haven’t actually heard it. The music-ness of the abstraction of thought is part of communication itself, and it can’t be expressed in any other way. At least, I don’t think so.

Furthermore, when creating music, how do we manipulate levels of abstraction to communicate something? What does it mean to strum a guitar? When I’m working with my bandmates on a new song, what do we talk about and why? How does it influence what we play? And when assembling a song for dissemination as a piece of media, what does it mean to put the guitar in the mix one way or another?

Brian Eno talks at length about some absurdities he uses when working with other musicians to provoke and evoke certain moods, vibes, or styles of play. One of my favorites is Oblique Strategies, which was originally a deck of cards meant to be a guide through abstract ideas and commands when stuck on some sort of creative task. Follow that link, check out a few cards.

You draw a card and read it, then put it back down in a huff. What the hell does “Change nothing and continue with immaculate consistency” mean? Well, it’s up to you whether or not that prompt relates something meaningful to you. It’s a pointedly absurd way to provoke someone into thinking about different levels of abstraction, but nonetheless it’s a tool that people (myself included) swear by.

I don’t think there’s any one answer to any of the questions I’ve raised about manipulating levels of abstraction. I do think that if I constrain myself to one type of communication (recorded music), there are probably commonalities in what it means to experienced listeners on some basic level, since we have far more in common than not, grounded as we all are in the same physical reality.

Beauty and the Beast: is there any difference between listening to MP3 vs CD quality?


TL;DR: yes. But come on! There’s a bunch of graphs and some lame jokes if you actually read the post.

Preface

As I sit here at my desk, I am surrounded by audio equipment and CDs. Spotify is open right now (streaming quality set to “Extreme,” thank you very much). My favorite pair of headphones is within arm’s reach. My studio monitors are effortlessly reproducing a lovely Terry Riley piece. Clearly, I am spoiled. But wait, let’s rewind a moment: I’ve got a stack of CDs next to me, but I’m streaming compressed audio when I could be enjoying clean, uncompressed audio from my CDs? Why would I do that? (I also have a record player and a few choice vinyls, but vinyl is an obviously inferior format to CD, so it’s not part of the comparison.)

I do it because it’s convenient. And there’s a massive amount of diversity on Spotify that simply isn’t legally accessible to me given my grad student budget. And I’m not alone: a whole heck of a lot of people in the US use streaming services. But all of them, save one, stream in what are called lossy formats. In fact, unless you’re listening to a CD or vinyl, the music you listen to is probably in a lossy format. That means the previously uncompressed, pristine digital audio of a CD is reduced not just in file size but in the information it contains. WAVs, by comparison, are lossless. It’s kind of bonkers to think about, but MP3s and other lossy formats throw away a LOT of sound. That’s partially why they’re so small. The goal, of course, is to only throw away things you can’t hear.
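For a sense of scale, here’s the back-of-envelope arithmetic on how much smaller a lossy file is. The CD figures are the standard 44.1 kHz / 16-bit / stereo spec, and 128 kbps matches the MP3s used later in this post:

```python
# How big is uncompressed CD audio compared to a 128 kbps MP3?
sample_rate = 44_100   # samples per second, per channel (CD standard)
bit_depth = 16         # bits per sample
channels = 2           # stereo

cd_kbps = sample_rate * bit_depth * channels / 1000  # 1411.2 kbps
mp3_kbps = 128

ratio = cd_kbps / mp3_kbps
print(round(ratio, 1))  # a 128k MP3 holds roughly 1/11th the bits
```

In other words, roughly 90% of the bitstream is gone, which is why the “you can’t hear it anyway” claim is worth testing.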

It might sound kind of like science fiction (or the fantasy of scared parents of metal fans): unheard sounds in recordings? It’s true, though. In fact, our cognitive systems are really excellent at filtering out unwanted noise. It’s called the cocktail party effect. So why not automate the process and only save the parts that we hear anyway? It might not be that simple. I, along with a classmate and our advisor, decided to test if there was a difference in the subjective enjoyment of music listening between WAVs and MP3s.

The Experiment

We selected eight songs: four recorded before MP3s were even a glimmer in the Fraunhofer Institute’s eye, and four very recent songs. We did this because there’s an idea floating around audio engineering and audiophile circles that, for example, the Beatles sound better on vinyl than CD because the albums were recorded with the idiosyncrasies of vinyl in mind. The easiest way to control for this was to have two “early” songs and two “recent” songs as MP3s and another set of two and two as WAVs.

The Song List

  • Aretha Franklin – RESPECT
  • Michael Jackson – Thriller *
  • The Eagles – Hotel California
  • The Beatles – Help! *
  • Carly Rae Jepsen – Call Me Maybe
  • Sia – Chandelier *
  • Rihanna – We Found Love
  • Daft Punk – Get Lucky *

* = MP3, 128k, LAME encoder

Note: the oldest available CD mastering was used for the pre-MP3 songs to eliminate / reduce the chance that some modern mastering techniques would be used to make it more MP3 friendly. For example, “Hotel California” was sourced from the original CD release in 1989.

We had people come in, put on headphones we provided, and listen to all 8 songs, presented to each person in a random order. After each song, they rated how positive it made them feel, how negative it made them feel, and how much they enjoyed it. We asked about positive and negative feelings separately because we conceptualize them as activations of appetitive and aversive systems, respectively. They can activate separately or together.

Keep in mind, we told the participants nothing about the sound quality, MP3s or WAVs. As far as they knew, they just had to listen to 8 songs and respond to those 3 questions for each.
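For the curious, generating a fresh random order for each participant is trivial. Here’s a minimal sketch; the song titles come from the list above, but `presentation_order` is a hypothetical helper for illustration, not our actual experiment code:

```python
import random

SONGS = ["RESPECT", "Thriller", "Hotel California", "Help!",
         "Call Me Maybe", "Chandelier", "We Found Love", "Get Lucky"]

def presentation_order(seed=None):
    """Return a freshly shuffled order of the 8 songs for one participant."""
    rng = random.Random(seed)          # seedable for reproducibility
    return rng.sample(SONGS, k=len(SONGS))

order = presentation_order()
print(order)
```

Randomizing per participant washes out order effects (fatigue, contrast with the previous song) across the sample.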

Results

I instigated this experiment because I didn’t think there would be a difference. We ended up hypothesizing that there would be a difference between the formats, such that people would like WAVs more. But to be honest I was skeptical, even if I had a theory-driven rationalization as to why I thought it would come out this way. (More on that later.) I thought people might even prefer MP3s since our participants are young and have probably been listening to MP3s their whole lives, give or take.

[Graph: mean positivity ratings by format]

F(1, 17) = 2.162, p = 0.16

The graph above shows the mean positivity results by Format. It’s not statistically significant, but it is in the direction we predicted. Admittedly, this one result alone isn’t convincing. But wait — there’s more!

[Graph: mean negativity ratings by format]

F(1, 17) = 5.224, p < 0.05

And this is a prime example of why we split out positivity and negativity into two measurements: the negative scores are significant, and support our hypothesis that people would like MP3s less.

[Graph: mean enjoyment ratings by format]

F(1, 17) = 1.7, p = 0.21

Again, the findings here are not statistically significant, but the data are trending in the direction we predicted.

[Graph: mean negativity ratings by era and format]

F(1,17) = 5.285, p < 0.05

And here’s the kicker: people rated early era songs as MP3s more negatively than anything else. And this finding is statistically significant.

Discussion

So what gives? Well, it could be as simple as our participants just hating “Thriller” and “Help!” as songs. But more than they hated the Eagles’ “Hotel California”? I sincerely doubt it. But it is possible, I’ll admit that openly.

Here’s what I think went on, though: remember how I said that MP3s strip out a lot of information, most of which you can’t hear anyway? I bet that process is flawed. It clearly works very well, but I bet that it is imperfect and listening to MP3s is actually MORE work for your brain than uncompressed audio (like WAVs). Our minds are very lazy and, under most circumstances, seek the path of least resistance when hit with a task. If MP3s tax the cognitive systems more than WAVs because we need to actively fill in some of the missing gaps or work harder to do our usual filtering, then it seems logical that we would rate the experience more negatively.

Moving Forward

This study isn’t perfect. I would prefer to have run it with a counterbalanced design where some participants heard Song A as MP3 and others heard Song A as a WAV. That would help remove unwanted effects of the song itself. That, and while I have some ideas as to why these results came about, this experiment doesn’t prove or even directly support my ideas. I need more information before I can put that claim forward more strongly.

The good news is that we have a lot more research in the pipeline regarding audio compression and how it impacts the listening experience.

Who recorded the first power ballad?


This summer I had the opportunity to spend a couple weeks in Japan for a conference and vacation. While there, I met up with my uncle who is presently living in Korea. My uncle is a pretty cool dude: an avid cyclist, musician, and long time veteran of the music industry.

Over some mind-bogglingly good sushi, my uncle posed this question to me: “Who recorded the first power ballad?”

We both agreed that Elton John (and Bernie Taupin) wrote a prime example of the power ballad: Don’t Let The Sun Go Down On Me. It has all the qualities we decided to be crucial to the power ballad:

  • slow tempo
  • sincerity
  • a sense of yearning
  • it’s personal
  • a highly sing-able melody
  • a big chorus that pulls the listener in
  • easy to remember
  • a quiet sort of rage that threatens to become unhinged
  • it needs to be at least somewhat popular
    • What good is a sing-along when there’s no one to sing it?
  • only duple meters
    • For example, if it’s in 3/4 then it’s a waltz

The power ballad isn’t a happy song. It’s about loss or lack. It’s sincere and brutally emotionally honest without being overly intellectualized.  There’s a sense of anger, but it’s buried under the sing-song melody. In short: it’s a highly relatable song that invites everyone to sing along and feel with and through it. That’s the definition we put together for the power ballad.

It turns out a scholar by the name of Charles Aaron has a slightly different take on it. From wikipedia (I know, I know, but I don’t want to link to a paywall):

According to Charles Aaron, power ballads came into existence in the early 1970s, when rock stars attempted to convey profound messages to audiences.[14]

Aaron argues that the power ballad broke into the mainstream of American consciousness in 1976 as FM radio gave a new lease of life to earlier songs such as Led Zeppelin‘s “Stairway to Heaven” (1971), Aerosmith‘s “Dream On” (1973), and Lynyrd Skynyrd‘s “Free Bird” (1974).[14]

A journalist named Pierre Peronne has argued that The Carpenters created the power ballad with “Goodbye To Love,” which was released in 1972.

It’s hard to believe that the power ballad began in 1972. (I’m not buying that “Stairway” is a power ballad – it’s a story told from a third-person perspective, and it’s awfully goofy.) Even then, the Carpenters don’t exactly push hard enough to get that driven sound. “Goodbye To Love” might be a little too pretty. That’s hardly a crime, but it might disqualify it from being a power ballad.

So where does that leave us? What examples are there of popular western music pre-1974 that fit the requirements laid out above? Let’s explore a few options.

 

1970: The Velvet Underground – Oh! Sweet Nuthin’

[Photo: The Velvet Underground]

People often forget that they’re an early example of women in rock bands. Maureen Tucker is a highly influential drummer with a unique style.

This is a B-side off of the Velvet Underground’s album Loaded. It’s got the tempo, the longing, the sincerity, the sing-along chorus… nearly everything! Is that it? Do we have a winner? No, of course not. That would be anti-climactic. Aside from the fact that you can guess that this list is longer than one entry, the biggest problem facing this song is that it wasn’t popular. Loaded never charted.

During their existence, the Velvet Underground didn’t sell many albums or attain much commercial success of any kind. I mean, I guess they are one of the founding voices in alt rock, punk, and so much more… but not popular. C’est la vie. At least everyone that bought an album started a band.

1969: David Bowie – Space Oddity

[Photo: still from the “Space Oddity” video]

Smoking in a helmet seems like a bad idea, Major Tom.

I’ve made no secret about my Bowie fandom, so it shouldn’t be a surprise that he’s on this list. Not only is “Space Oddity” an earlier example than the Velvet Underground’s entry on this list, it was even popular! But again, there are problems. While this is an emotive and compelling song that invites the listener to sing along, and speaks to fundamental human emotions such as loneliness and disconnection, it is still a bit abstract and intellectualized. That, and it’s missing a big structural feature of the power ballad: the sing-along chorus.

1969: Frank Sinatra – My Way

[Photo: Frank Sinatra]

“There’s no way anyone will ever make these hats uncool.”

Sinatra is certainly outside of the rock tradition, but his popularity speaks for itself. And this song has it all: the big chorus, the emotion, the drive, the passion, the quiet rage… I can’t find a single flaw with it. “My Way” is undeniably a power ballad, even if the instrumentation is a bit different than other entries on this list. But is it the first?

1968: Claude François – Comme d’habitude

[Photo: Claude François]

François flipped out when he first saw Dumb & Dumber, claiming they copped his style.

Well, if “My Way” is a power ballad, would the French song it’s based on be a power ballad also? François doesn’t quite have the same gravitas to his voice as Sinatra, but it’s all still there. In fact, this version pushes even harder than Sinatra’s version to get the power. I’d be willing to say yes, yes it is a power ballad. It’s got it all!

 

1967: Hervé Vilard – Comme d’habitude

[Photo: Hervé Vilard]

“Ouch, shoulder pain got you down?” Vilard seen here in his failed Icy Hot ad campaign.

Turns out François wasn’t the first person to record this song. In fact, François recorded his own version after being displeased with the original recording by Vilard. So does Vilard’s version stand up to scrutiny? Well, it’s hard to say: I can’t find a copy of it. Vilard did release a recording of the song, but not until 1984. Given the criteria, there’s no need to speculate about the qualities of the 1967 version, since it wasn’t popular (or even released?).

[GIF: Zoolander]

Zoolander: best movie ever, or bestest movie ever?

 

1966: Beach Boys – God Only Knows

[Photo: The Beach Boys]

Look at these nice young men. So clean cut, and a good influence on our youth!

“God Only Knows” is a pretty close call. It has the sing-able melody. It’s sincere and profound. It’s beautiful. It even pushes towards a build at the end! I think there’s a problem, though. The ending never quite reaches that quiet rage, and it resolves into a round as opposed to a big chorus. Like the Carpenters before them, The Beach Boys might be too pretty for a power ballad.

1966: Them – How Long Baby


Them showing off some early wireless / amp-less music tech.

This track is from Them’s second album, blandly titled Them Again. While not a runaway success, the album did chart – peaking at #138 in the US – and the song has all the key features. It’s a hell of a track with a powerful and frank vocal take. That’s it, close up shop and go home. We have a winner! Except wait, there’s a problem: it’s in 6/8, a compound duple meter. The pulse is still duple, but each of those beats subdivides into threes, and in this song the triple feel is very prominent.

 

 

1965: Nina Simone – I Put A Spell On You


“I’m sure if someone makes a movie of my life after I die, it’ll be with taste and restraint, as well as full cooperation of my estate. Anything else would be massively disrespectful of my contributions to popular music and society as a whole.”

Coincidentally, there’s another song on Them Again that would be a good fit for a power ballad. Except Nina Simone did it first and much more famously. However, I am not sure Nina’s version qualifies as a power ballad. It is an incredible, slinky vocal performance. The arrangement skulks along with her. It’s brooding and imposing, but it never lets go and just blows the doors down. The good news is that this, too, is a cover. And it’s hard to write a list like this without invoking some of the earliest and most influential voices in the formulation of rock music.

1956: Screamin’ Jay Hawkins – I Put A Spell On You


“I’m sorry, I can’t hear you over my suit.”

Screamin’ Jay Hawkins originally released the song a long while before Nina Simone made it a huge hit. It’s also in 6/8 meter, though the triple feel is less pronounced. Because 6/8 is a compound duple meter, there’s wiggle room, and I’m going to give this one a pass since the triple aspect isn’t as emphasized as in “How Long Baby.” Jay brings the power, but his bellicose performance takes away from the sing-along possibilities. Even so, despite his over-the-top vocal take (in all its glory), the song still begs to be sung; in fact, singing along with the recording is mandatory. So this doesn’t quite fit, but it’s awfully close!

Thanks to the state of race relations in the US in the mid 1950s, the song never charted despite its respectable sales. The sales alone speak to its popularity, however. All in all, I think this is the proto-power ballad. It doesn’t fit the mold quite right, but it fits the spirit perfectly. The granddaddy of them all. This is the holy grail!

 

So where does this leave us? 

The goal was to find the first power ballad, and we’ve done that: Claude François wins with “Comme d’habitude” in 1968. But stopping there does a disservice to the antecedents of the power ballad. By relaxing the requirements just a little, it’s easy to see that the power ballad is a much older song form, at least as old as Screamin’ Jay Hawkins’ “I Put A Spell On You” from 1956. It’s all too easy to forget the contributions earlier artists made to music, and doubly so when there are complex issues at play like race relations.

Reflecting back on the two songs, the connections become more apparent: the raw emotion, the yearning, the anger, and the build of power. I’m not sure anyone has connected Claude François to Screamin’ Jay Hawkins before, but there you have it.