19

Is there (somewhere) a command-line program for Windows which will create PNG/JPEG visual from MP3/WAV?

EDIT: This is a good example of how the image should look: [image]

Alex G

5 Answers

38

Sox, "the Swiss Army knife of audio manipulation", can generate accurate PNG spectrograms from sound files. It plays pretty much anything, and binaries are available for Windows. At the most basic level, you'd use something like this:

sox my.wav -n spectrogram

If you want a spectrogram with no axes, titles, or legends, a light background, and a target height of 130px:

sox "Me, London.mp3" -n spectrogram -Y 130 -l -r -o "Me, London.png"

Sox accepts many more options, for example if you only want to analyze a single channel. If you need your visuals to be even cooler, you could post-process the resulting PNG.

Here is a short overview, taken from the command line, of all available parameters; the manpage has more details:

-x num  X-axis size in pixels; default derived or 800
-X num  X-axis pixels/second; default derived or 100
-y num  Y-axis size in pixels (per channel); slow if not 1 + 2^n
-Y num  Y-height total (i.e. not per channel); default 550
-z num  Z-axis range in dB; default 120
-Z num  Z-axis maximum in dBFS; default 0
-q num  Z-axis quantisation (0 - 249); default 249
-w name Window: Hann (default), Hamming, Bartlett, Rectangular, Kaiser
-W num  Window adjust parameter (-10 - 10); applies only to Kaiser
-s  Slack overlap of windows
-a  Suppress axis lines
-r  Raw spectrogram; no axes or legends
-l  Light background
-m  Monochrome
-h  High colour
-p num  Permute colours (1 - 6); default 1
-A  Alternative, inferior, fixed colour-set (for compatibility only)
-t text Title text
-c text Comment text
-o text Output file name; default `spectrogram.png'
-d time Audio duration to fit to X-axis; e.g. 1:00, 48
-S time Start the spectrogram at the given time through the input
Benjamin
Wander Nauta
  • 12
    Note that this isn't technically a waveform. It is, however, a visual. – Wander Nauta Mar 31 '12 at 14:57
  • I already use SOX, but never thought it can do spectrogram. =))) Amazing. – Alex G Mar 31 '12 at 14:58
  • 1
    Thanks, this was really helpful! I used this to generate a bunch of spectrograms for all .wav files in my folder: for %f in (*.wav) do ("c:\Program Files (x86)\sox-14-4-0\sox.exe" %~nf%~xf -n spectrogram -o %~nf.png -r -m -y 100) – Filip Skakun Aug 29 '12 at 22:43
  • great, thanks for sharing it. few corrections. it's a spectrogram not a waveform. and it's -Y 130 minimum. – OWADVL Feb 23 '14 at 18:11
  • It seems the height restriction is new - I've edited the example. Thanks! (I mentioned the output is not a waveform in the first comment.) – Wander Nauta Feb 24 '14 at 13:03
  • 3
    This is not a waveform, but a spectrogram. It is not a valid answer to the question being asked. – Maciej Jankowski Apr 03 '14 at 08:58
  • 1
    @MaciejJankowski Like I said in my first comment, the question asks for a PNG/JPEG visual. This is a PNG visual, and, therefore, a valid answer. – Wander Nauta Apr 03 '14 at 09:42
  • Just found that, the output spectrogram width default value is 800 pixels, with min [100 to max 200000], so it is better check audio duration and map duration to a suitable width, for my case duration in seconds x100 looks good. eg (4.5s *100)= 450 pixels – Steven Du Aug 20 '14 at 02:03
  • 1
    `sox --help-effect spectrogram` for more options – Sanya_Zol Dec 28 '14 at 18:17
  • I was looking for a way to produce graphs similar to those on [Infinite Wave](http://src.infinitewave.ca/), and it looks like Sox is exactly the tool they have used for this. – paddy Feb 02 '17 at 00:50
  • did they change the parameter meanings in the newer versions? For example the linked man page states -x is endian swap and -X is reverse-bits opposed to size in pixels or pixels per second as @WanderNauta states. – MetalSnake Feb 20 '17 at 15:25
  • 1
    @Diskutant You are likely talking about `sox` options in general. The parameters listed in the answer are specifically for (and should be listed after) the spectrogram subcommand. – Wander Nauta Feb 21 '17 at 10:58
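
The width rule from the comments (duration in seconds times 100, kept within the 100 to 200000 pixel range reported there) can be sketched in Python. This is only an illustration: the function name `spectrogram_width` is made up, the limits are taken from the comment rather than verified against a specific Sox version, and the `-X` option in the list above (pixels per second) may make the calculation unnecessary.

```python
def spectrogram_width(duration_s: float, px_per_s: int = 100,
                      min_px: int = 100, max_px: int = 200000) -> int:
    """Map an audio duration to a value for the spectrogram -x option.

    The 100..200000 limits come from the comment thread and are an
    assumption, not something checked against the Sox source.
    """
    width = round(duration_s * px_per_s)
    # Clamp to the range the comment says Sox accepts.
    return max(min_px, min(max_px, width))
```

The result would be passed after the spectrogram effect, for example `-x 450` for a 4.5 second clip.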
11

A real waveform is possible with ffmpeg, which you can download here.

Install it somewhere and use the following command line as an example:

ffmpeg.exe -i "filename.mp3" -lavfi showwavespic=split_channels=1:s=1024x800 waveform.png

or the following to match your example picture color, or other colors:

ffmpeg.exe -i "filename.mp3" -lavfi showwavespic=s=1024x800:colors=0971CE waveform.png

Documentation of FFmpeg showwavespic

KoalaBear
3

I've created a small PHP library that does this: https://github.com/jasny/audio


It works as follows. It gets the samples using

sox TRACK.mp3 -t raw -r 4000 -c 1 -e floating-point -L -

This downsamples the track to 4 kHz and mixes everything into 1 channel.

Next I take chunks of samples (one chunk per pixel of width) and calculate the min and max of each. I use them to draw the waveform.
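
The chunking step described above can be sketched in Python. This is not the code from the PHP library, only a minimal illustration: the function names `read_raw_floats` and `peaks` are made up, and it assumes the raw stream is 32-bit little-endian floats, which is what `-e floating-point -L` should produce when no bit depth is given.

```python
import array
import sys


def read_raw_floats(data: bytes) -> array.array:
    """Decode raw little-endian 32-bit floats (assumed sox output format)."""
    samples = array.array("f")
    # Ignore a trailing partial sample, if any.
    usable = len(data) - len(data) % samples.itemsize
    samples.frombytes(data[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return samples


def peaks(samples, width: int):
    """Return one (min, max) pair per pixel column of the waveform."""
    if width <= 0:
        raise ValueError("width must be positive")
    n = len(samples)
    result = []
    for x in range(width):
        start = x * n // width
        # Always take at least one sample, even if there are fewer samples than pixels.
        end = max((x + 1) * n // width, start + 1)
        chunk = samples[start:end]
        if len(chunk) == 0:
            result.append((0.0, 0.0))  # no audio at all: draw a flat line
        else:
            result.append((min(chunk), max(chunk)))
    return result
```

Each pair can then be drawn as a vertical line from min to max at column x, which gives the familiar waveform shape.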

Arnold Daniels
1

I found this quite nice (from a web archive; the original is gone): http://web.archive.org/web/20140715171716/http://andrewfreiday.com/2011/12/04/optimizing-the-php-mp3-waveform-generator/

It's PHP-based and uses lame through the shell.

Update: the site seems dead from time to time; however, here is the repo: https://github.com/afreiday

markasoftware
xamiro
0

A batched version of Wander Nauta's command, which generates a spectrogram for every WAV file in a folder (Bash/Dash):

for i in *.wav; do ./sox "$i" -n spectrogram -y 130 -l -r -o "${i%%.wav}.png"; done
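
For Windows users without a Bash shell, the same batch loop can be sketched in Python. This is only an illustration: the helper names `spectrogram_command` and `render_folder` are made up, the flags are copied from the loop above, and it assumes a `sox` executable is on the PATH (otherwise pass its full path).

```python
import subprocess
from pathlib import Path


def spectrogram_command(wav: Path, sox: str = "sox") -> list[str]:
    """Build the sox argument list for one file, writing <name>.png next to it."""
    png = wav.with_suffix(".png")
    # Passing a list avoids quoting problems with spaces or commas in file names.
    return [sox, str(wav), "-n", "spectrogram",
            "-y", "130", "-l", "-r", "-o", str(png)]


def render_folder(folder: Path, sox: str = "sox") -> None:
    """Run sox once for every .wav file in the folder."""
    for wav in sorted(folder.glob("*.wav")):
        subprocess.run(spectrogram_command(wav, sox), check=True)
```

Calling `render_folder(Path("."))` would process the current directory; `check=True` stops at the first file sox fails on.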
Andrea Leganza