INSTRUCTIONS FOR THE SPECTRATUNEPLUS (The example screenshots on the screenshots page, and the text around them, elaborate on these instructions, so you should refer to them. For convenience, I have also placed a screenshot of the SpectratunePlus Control Panel below on this page.)
Further, I have put letters (A) thru (X) on various sections of the Control Panel, to help you find the various controls referenced in the instructions. The demo videos, on the demo video page, may also help you. (Note there is one on .spf creation, which you can look at if you have trouble.)
Three Basic Modes of Operation. The program always is doing one of three things, that is, it has 3 different Operation Modes (ignoring one called "Programmer Diagnostic Mode" which is just for me and can be accessed only in my version). The options are "live spiral", "Create analysis .spf file", and "Play .spf + .wav". These are set in section (I) of the Control Panel.
Live Mode = Operation Mode 1 (You can get the Spiral Panel Only in Live Mode--no Over Time View):
When you start the program, you are in live mode. The live device that grabs sound is selectable at the left of section (A) on the Control Panel (in case the one chosen automatically isn't what you want). (You can actually do two separate live devices simultaneously with different spectrograms/single pitches, if you want to.) In this live mode, the checkbox on the spiral panel will add the single-pitch (and overtone-level histograms if the spectrogram is kept on). You also have gain and range controls for the spectrogram on the spiral panel. Key/potential key overlays for both live and playback and over time modes are in Control Panel section (F).
Lowering the sample rate for the live input device (section (A) of the Control Panel) may increase display speed, though frequencies higher than about half the sample rate will not be shown (though these are not usually of much musical interest). You may be able to further increase display speed by setting an undersample (also section (A) of the Control Panel). However, a disadvantage of the latter (undersample) is that it can cause the spectrograms to mix up certain high with certain low frequencies (technical: called "aliasing"), so you should avoid it, if possible. (In the case of both low sample rate and undersample, to alert the user, I yellow out the frequency areas that will either not be detected, or are subject to being misrepresented as the wrong frequency. This warning by yellowing only applies to the program versions released after 7/1/14.)
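As a rough illustration of the two effects just described, here is a small Python sketch. It computes the highest frequency a given sample rate can represent (half the sample rate), and the wrong frequency at which a too-high tone would show up after undersampling. The function names are made up for this example, and the assumption that undersampling simply keeps every Nth sample with no filtering is mine; this is not a description of how SpectratunePlus does it internally.

```python
def nyquist_limit(sample_rate_hz, undersample=1):
    """Highest frequency representable when only every `undersample`-th sample is kept."""
    return sample_rate_hz / undersample / 2.0


def aliased_frequency(freq_hz, sample_rate_hz, undersample=1):
    """Apparent frequency of a pure tone after naive undersampling (assumed: no anti-alias filter)."""
    effective_rate = sample_rate_hz / undersample
    folded = freq_hz % effective_rate
    # Frequencies above half the effective rate fold back below it.
    return min(folded, effective_rate - folded)


rate = 44100
for n in (1, 2, 4):
    print("undersample", n,
          "limit", nyquist_limit(rate, n), "Hz,",
          "a 9000 Hz tone appears at", aliased_frequency(9000, rate, n), "Hz")
```

In this sketch, a 9000 Hz tone is shown correctly at undersample 1 and 2, but at undersample 4 it lands above the limit and is displayed at a lower, wrong frequency. That is the kind of region the yellow warning is meant to flag.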
Separately, note that, when working from a recording (as below), there is no issue of any misrepresented or cut off frequencies. All frequencies recorded will be displayed. (However, of course, if the sample rate is <40,000 per second for each channel, frequencies greater than 1/2 the sample rate will not have made it to the recording itself.)
.SPF creation = Operation Mode 2 (Needed for Over-Time View):
To avoid confusion, let me point out that all of the functionality in the screenshots is in the current version of the software, but I may have changed colors of some things, for better visibility with new features.
To make an .spf spectral information file, needed to get the over-time view, set the "Create analysis .spf file" checkbutton above section (L). (This will also turn off any live-mode spectrogram and single-pitch processing, which would be a waste to keep on because you need the program to do different work than that.) Select the .wav music file you have (must be standard 44100, 22050, or 11025 Hz, 16 bit, 2 channel or 1 channel .wav), and select a name for the .spf.
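If you are not sure whether a .wav file meets these requirements, you can check it outside of SpectratunePlus. The following is a minimal sketch using Python's standard `wave` module; the function name `wav_problems` is just a name I picked for this example, and the check only mirrors the requirements listed above.

```python
import wave

ALLOWED_RATES = {44100, 22050, 11025}  # Hz, per the requirements stated above


def wav_problems(path):
    """Return a list of reasons the file does not meet the stated .wav requirements."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() not in ALLOWED_RATES:
            problems.append("sample rate %d Hz not supported" % wav.getframerate())
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bit
            problems.append("%d-bit samples, need 16-bit" % (8 * wav.getsampwidth()))
        if wav.getnchannels() not in (1, 2):
            problems.append("%d channels, need 1 or 2" % wav.getnchannels())
    return problems
```

An empty list means the file should be acceptable; otherwise, convert the file with an audio editor before making the .spf.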
You have an option of making a mono or stereo .spf. A stereo .spf contains spectrum information for both channels separately, which can be displayed when you play it back. However, it takes about twice as long to make, and twice as much space, so if you find you don't use it, avoid it. Anyway, to get stereo, check the "st" box in section (J) of the Control Panel.
After doing the above, hit start. (The processing here will get split among multiple processors. While the .spf is being made, the Main Panel will show the spectral information from only one processor. No need to look at that -- it's just if you wonder if the program is doing anything. If you do want to look, you may need to adjust gain.) (Note: On a multi-processor PC, the time it takes to make the .spf file is roughly the time of the .wav recording. Further, the size of the .spf file is roughly the same as that of the .wav file you process.)
.SPF + .WAV replay = Operation Mode 3 (Resulting in Over-Time View, with a few other more minor view options).
Note that this operation mode has a lot of options. These options are described in most of the instructions from here on down.
To Play Back and look at the overtime analysis, select the .wav and created .spf files, and hit the start button. (This works for me without restarting the program. In certain orders of operations, there may be a software glitch, and you may have to stop and restart my program.) The "over time" window, which is where the over-time spectral looks at the music are shown -- the panel in most of my screenshots -- pops up automatically.
A red section on the OT window means the part-ii analysis is in process (and therefore data is not quite ready to be displayed). Part ii is relatively quick, and occurs when you start a playback of an .spf+.wav, or when you reset any part-ii analysis (peak/note detection parameter) in (T) or (V). The part-ii analysis fitting progress is tracked in "pct fitted" in (J). In some cases, owing to minor issues with multiple threads and locks that I may not have fully resolved, you may need to unpause/pause to get the part-ii analysis to recommence.
While Playing Back .SPF + .WAV:
Controlling What is Displayed (Peaks and/or All-Power) and whether there is also an Octave-Overlaid Display:
To bring up the octave-overlaid display (which comes always in addition to the non-overlaid display), in section (Q), choose the 3rd option ("Center-of-stripe note center + overlaid pitch"). Note that the octave-overlaid display can only show peaks (detail: all peaks are calculated and then overlaid). It can not show all power (as I can't think of a way to display "all-power" in a tonally-relevant way). Also note that, above (M2), the checkbox "Show Note-Fit or Peaks (as below, OTW)" controls what is displayed on the non-overlaid display only. The Note-Fit or Peaks is always displayed on the overlaid display, regardless of that checkbox. (And further, on the overlaid display, it always satisfies the selections and adjustments set in the rest of (M2).) The reason I did it that way is to allow peaks to be turned off on the non-overlaid display (leaving just all-power) while still retaining the octave-overlaid display.
What, exactly, is displayed on the non-octave-overlaid display, and on any octave-overlaid display (i.e. all power and/or peaks, format of these, darkness and width of these, etc.) is controlled in areas (M1) and (M2) and the checkboxes of area (M) just above (M1) and (M2). Each of (M1+checkboxes above) and (M2+checkboxes above) turns on a separate set of things to be displayed, and also controls width and darkness of those things. (M1+) is used for displaying the all-power representation, and (M2+) is used for peaks (or sometimes, my deprecated primitive note recognition). (M1+) and (M2+) items can be turned on at the same time, resulting in both an all-power and a peak-type display.
Selecting "show all power" in area (M) above area (M1) shows the gray all powers. Selecting "Show Note Fit or Peaks" (above (M2)) with "All Peaks in gray" (in M2) is the standard way to get the peaks AND all-powers displayed, and this is also the default when replay is started. (Actually, despite "in gray", when peaks and all-powers are both set, the peaks come up clear, not in gray -- which I did to optimize visibility.) NOTE1: Some other selectable (M2) options besides "all peaks" are based on my earlier experimental efforts to automatically separate peaks which represent notes from those which represent only overtones. These are deprecated because they don't work too well, and are based on certain misconceptions I had about relative power of fundamentals vs overtones. The default mode for the program is NOT to attempt to separate fundamentals from overtones, and further, all screenshots on this page are made with the SpectratunePlus set to not attempt the separation (that is, in section M2, "all peaks in gray" is set).
Stereo Left to Right Information (You can get this only if you made the .spf in stereo, regardless of whether the .wav you recorded from was stereo. To see whether an .spf you are currently playing back was made in stereo, look at what "stspf" is in the info about the .spf file at the bottom of area (L). 1 is stereo, 0 is not stereo.)
For all power displays, you can switch between the displayed powers being of left, right, and center, in section (M1).
For any peaks (when "all peaks in gray" is selected in (M2)), checking "color code L-R on peaks" gives you, instead of gray, greener for peaks with most power to the left, gray if near center, and red for right. Colors are brighter the more sidewise the sound.
Also, when you middle-mouse click to get a power, the dB difference, left minus right, is shown as well. (Norm, you've thought of everything!)
Adjusting Darkness and Line Width of Spectrograms: For the gray-shade "all power", there are "dB gain" and "dB Range" display adjustments in area (M1). "dB gain" and "dB range" determine what power levels are visible in the varying shades of gray -- without being so low as to be invisible, or so high as to be over-range (that is, displayed in the single "overrange indicator" shade of green). "Auto Set Gain: scr. max" is a way to set the aforementioned gain automatically, based on making the highest power currently on the screen show at the darkest shade of gray, using the range you have set to determine the lowest power that will not be invisible. The ": all pwr mrkr" does an alternative automatic gain, based on centering the power at the place of the last "all power" measurement (done with middle mouse button while z key is down).
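To make the roles of gain and range concrete, here is a small Python sketch of one way such a mapping could work. The exact formula SpectratunePlus uses is not given in these instructions, so the straight-line-in-dB mapping below, along with the function names, is purely an assumed model for illustration.

```python
def shade_for_power(power_db, gain_db, range_db):
    """Map a power level to a display value under an assumed linear-in-dB model.

    Returns None if the power is too weak to be visible, "overrange" if it is
    above the top of the range, else a gray level from 0.0 (lightest) to
    1.0 (darkest).
    """
    level = power_db + gain_db  # assumed: 0 dB after gain is the darkest shade
    if level > 0:
        return "overrange"
    if level < -range_db:
        return None
    return 1.0 + level / range_db


def auto_gain_from_screen_max(powers_db):
    """Gain that puts the strongest power on screen exactly at the darkest shade."""
    return -max(powers_db)


powers = [-62.0, -45.0, -30.0]
gain = auto_gain_from_screen_max(powers)
print([shade_for_power(p, gain, 40.0) for p in powers])
```

In this model, raising the gain darkens everything (and pushes the strongest powers into overrange), while widening the range lets weaker powers become visible as light gray.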
"dB gain" and "dB range" for anything displayed by selecting anything in an area just over or in area (M2) is controlled by separate analogous controls in area (M2). Thus, the power-to-display-shade correspondence of peaks and powers of the peaks requested in ((M2)+checkboxes just above (M2)) are controlled by adjustments in area (M2).
"Octave-overlaid display and Chord/Key Assistance Windows for attempting chord or chord-soundingness or key recognition or other tonality tasks: I have described this in the explanation of screenshots 2b(ii) and 2b(iii) some distance above.
Optional Moveable Narrow Vertical Striped Scale: Designed for use with the "all-power" display, that little horizontally slideable narrow striped scale as in screenshot (1B) above is toggled on or off (default is off) with the "n" key (while the over-time window has focus--i.e. click the mouse on it first). (Note: To slide this scale, left click the mouse over it and drag it. Additionally, when the scale is showing, the mouse wheel, which the program uses to raise or lower the vertical range of notes shown, is deactivated by the Windows OS and I don't seem to be able to stop that, so if you want to use the mouse wheel, you need to temporarily hide that striped scale.)
Optional Plot of Powers at All Frequencies at Any One Time: You get this by right clicking, while holding down the m key, over the time you want the powers of. (The plot comes up on the side of the slideable striped scale discussed just above. You can remove it by hitting n, or by left-clicking with the n key down below the area of the spectral display.)
Sound Resynthesis from Spectrogram: You can test the level of ear-meaningfulness of the power peaks shown by the program, and learn other things, by switching from playback to a synthesis based on sines generated from peaks of the spectrogram only (check "all peaks" in area (U)), using a synthesis divisor in area (P) to remove the unmusical noisy sound of overload.
Some control over the frequency-range used in the resynthesis is also in area (H). Note that you can get very fine control over resynthesis frequency range by right clicking on a frequency height on the spectrogram, with either "C" (lower limit) or "V" pressed. (When you do this, there are corresponding markers on the left that show you what you selected, half triangles on the left, as in screenshot (12c) below.) This lets you figure out what instruments, voices, etc. are coming from particular partials (i.e. sines). If you want to try to hear all parts of a particular note (i.e. to figure out what instrument it is, or who is singing the note), I have the checkbox "or range overtone 1 to 16", which, if you select a tight resynthesis range around a fundamental, gives you a resynthesis using that range and the first 16 overtone zones.
I've found that on musical sounds, as well as voice, the resynthesis works pretty well as a demonstration that my spectrogram is very meaningful. (That is, the resynthesis is a generation in a simple, direct way -- by generating sines of the peak frequencies and amplitudes in the spectrogram.) The regeneration isn't perfect -- in particular there are some "artifacts", but you get the musically recognizable sound, with recognizable timbres, and usually recognizable singers. (NOTE: some software products (vocoders, pitch shifters, spectral editors, etc.) have methods of resynthesizing perfectly; however, as far as I know, none of them use my more musical "constant Q" scaling. In any case, my goal with the resynthesis is not sound editing, but rather allowing the accuracy of my spectral representation to be tested, and further, to allow the user to learn some things about sound, timbre, etc.)
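The idea of resynthesis from peaks, that is, adding up one sine per peak at that peak's frequency and amplitude, can be sketched in a few lines of Python. This is only a bare-bones illustration for a single instant's set of peaks held steady over a short time; the function name, the (frequency, dB) input format, and the way the divisor is applied are my assumptions for the example, not the program's actual code.

```python
import math


def resynthesize(peaks, duration_s, sample_rate_hz=44100, divisor=1.0):
    """Sum of sines. `peaks` is a list of (frequency_hz, power_db) pairs.

    `divisor` scales the result down (assumed role: avoiding overload).
    """
    n = int(duration_s * sample_rate_hz)
    samples = [0.0] * n
    for freq_hz, power_db in peaks:
        amplitude = 10 ** (power_db / 20.0) / divisor  # dB to linear amplitude
        step = 2.0 * math.pi * freq_hz / sample_rate_hz
        for i in range(n):
            samples[i] += amplitude * math.sin(step * i)
    return samples


# A 220 Hz fundamental with two weaker overtones, scaled down by 2.
tone = resynthesize([(220.0, -6.0), (440.0, -12.0), (660.0, -18.0)], 0.05, divisor=2.0)
print(len(tone), max(abs(s) for s in tone))
```

Restricting the list of peaks to a frequency range before calling the function corresponds to the resynthesis range limits described above.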
The other options in section (U) are experimental, and many did not work out that well. You might find "listen filter lasted marked freq" useful, which filters the actual .wav sound (not the resynthesized sound) keeping only sounds around the last marked frequency.
To overlay a sung pitch, reselect a sound input device in (A), and hit superimpose in (M). You should probably keep the sampling rate low (to avoid sluggish response from the computer needing to do too many things at once). (To make sure you've got audio input devices all correctly set in Windows OS, don't do it here -- too much going on at this point -- rather, do it in live mode when you first start the program, where you can see a spectrogram and, when that is turned on, a single-pitch detection.)
Note range shown vertically moves up or down with middle mouse scroll. Range height and time range width adjust with (G).
Navigation Through Music: When playback is paused (via controls in section (K)), arrows (K) will make small or larger moves. Also, when over a spectrogram, the left mouse button moves you express back to a time, or plays forward to a time. You can skip forward silently if you wish by having the "X" keyboard key down while you left-click to a point ahead. The right mouse button (over a spectrogram) plays midi at the pitch-height where you click. When your mouse is at the bottom of the OT window, not over a spectrogram but below any spectrogram, R/L mouse buttons control markers for loop between markers (which looping goes into effect only when that box is checked). Further, in this below-spectrogram area, the middle mouse button centers the spectrogram to display on wherever the current-play-time cursor is.
There is additional MIDI sound feedback, sort of a little piano keyboard (relative to the tonic of the selected key): touching roughly the first 13 keys of each of the first 3 rows of the keyboard (i.e. the rows starting with "1", "q", and "a") gets you all notes starting with the tonic of the set key. These rows are, respectively, 1 octave above the "working midi octave", the working midi octave itself, and an octave below the working midi octave. "1", "q", and "a" get you the tonic, and you go up a half-step each key across each row. (The "working midi octave" is just the octave where you last got a midi sound by right-clicking over the spectrogram.) This works only when the over-time window has "focus". You can even have more than 1 key down at once to generate chords, etc.
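The key-to-note layout just described can be written out as a small Python lookup. This is a sketch under the assumption of a standard US keyboard layout; the exact set of keys the program responds to in each row is not listed in these instructions, and the names `ROWS` and `midi_note_for_key` are made up for the example.

```python
# Each row maps to an octave shift relative to the working MIDI octave.
# Key strings assume a US layout (an assumption for this sketch).
ROWS = {
    "1234567890-=": 12,    # one octave above the working MIDI octave
    "qwertyuiop[]": 0,     # the working MIDI octave itself
    "asdfghjkl;'": -12,    # one octave below
}


def midi_note_for_key(key, tonic_midi):
    """MIDI note number for a typed key, or None if the key is not mapped.

    `tonic_midi` is the MIDI number of the tonic within the working octave.
    """
    if len(key) != 1:
        return None
    for row, octave_shift in ROWS.items():
        pos = row.find(key)
        if pos >= 0:
            return tonic_midi + octave_shift + pos  # one half-step per key
    return None


# With the tonic at middle C (MIDI 60): q, t, i give a C major triad.
print([midi_note_for_key(k, 60) for k in "qti"])
```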
If you choose, MIDI notes can be adjusted to any tuning you make of the spectrogram away from 220 Hz = A. (This tuning adjustment is done in section (D) of the SpectratunePlus Control Panel, and you would typically do it if you noticed from the default A=220 Hz spectrogram that the musicians were playing instruments tuned to other than A=220 Hz.) To do this MIDI tuning compensation, you need to check the appropriate line in "MIDI adjust to non-220hz A0" down by the bottom of section (X). The two options have to do with how your MIDI hardware or software handles "pitch bends": usually I think it is "1/3 Oct Bend". You test that you have the correct box checked by pushing "test bend correct". If correct, the tones that you will hear will be of constant pitch. (NOTE: You only need to worry about this if you adjust the A tuning frequency in section (D) and you want your MIDI reference sounds to reflect this tuning.)
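For the curious, here is roughly what a tuning compensation by pitch bend involves, as a Python sketch. The cents formula is standard; the 14-bit pitch-bend value with 8192 meaning "no bend" is how MIDI pitch bend is encoded. The bend range, however, depends on your MIDI hardware or software, so `bend_range_semitones` is an assumed parameter you would have to match to your synth, and the function names are hypothetical. I am not claiming this is exactly what either checkbox option does.

```python
import math


def cents_offset(tuned_a_hz, reference_a_hz=220.0):
    """How far the tuned A is from the reference A, in cents (100 cents = 1 half-step)."""
    return 1200.0 * math.log2(tuned_a_hz / reference_a_hz)


def pitch_bend_value(cents, bend_range_semitones=2.0):
    """14-bit MIDI pitch-bend value (8192 = no bend) for an offset in cents.

    `bend_range_semitones` is the synth's full-scale bend in one direction
    (an assumption; it must match the receiving device).
    """
    value = 8192 + round(8192 * (cents / 100.0) / bend_range_semitones)
    return max(0, min(16383, value))  # clamp to the legal 14-bit range


offset = cents_offset(221.5)
print(offset, pitch_bend_value(offset))
```

If the assumed bend range does not match the device, the compensated notes come out at the wrong pitch, which is presumably why a test button is provided.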
Overtone-Position Markers: When you have the Z key down over the non-overlaid-spectrum part of the OT-window, L mouse button adds a fundamental/overtones marker-set that uses little arrows (to 15th overtone in current software version), R removes it.
When you have the B key down over the non-overlaid-spectrum part of the OT-window, L and R mouse buttons add and remove a marker of the same fundamentals and overtones, but based on small bright dots instead of arrows. (I have found this more suitable for use on the gray-scale "all-power" display, and with some practice you can often deduce all fundamentals, that is, notes played, in simpler music.)
With both the Z-down and B-down harmonic markers described above (gotten by mouse-button action over the non-octave-overlaid display), you get also, if you have the octave-overlaid spectrum showing, a marker of the fundamental on the octave-overlaid position. (My idea is to help deduce chord, also key, when individual notes are deduced with the harmonic markers.)
With Z-down, hitting the L mouse button over the octave-overlaid display gives markers of the fundamental and first 7 overtone positions (totaling just 4 markers due to some overtones repeating "pitch-class"). This displays on the octave-overlaid section only, with nothing corresponding on the non-overlaid section.
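To see why the fundamental plus first 7 overtones collapse to just 4 markers on the octave-overlaid display, you can compute where each harmonic falls in half-steps above the fundamental and then fold the result into one octave. The Python sketch below does this; the function names are made up for the example, and rounding to the nearest half-step is a simplification (the real harmonics are slightly off the equal-tempered notes).

```python
import math


def harmonic_offsets_half_steps(count):
    """Distance of harmonics 1..count above the fundamental, in half-steps."""
    return [12.0 * math.log2(n) for n in range(1, count + 1)]


def harmonic_pitch_classes(count):
    """Distinct pitch classes (0 = fundamental's note) among harmonics 1..count."""
    return sorted({round(h) % 12 for h in harmonic_offsets_half_steps(count)})


print(harmonic_pitch_classes(8))
```

Harmonics 1, 2, 4, and 8 all land on the fundamental's own pitch class, and harmonics 3 and 6 share another, which is where the repeats come from.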
To get the actual dB number of power and exact frequency: for a point on a peak at a specific time, point to it and click the middle mouse button. (In some display modes, the X key down is needed to get power and frequency at the peak.) For any point (not just a peak), point to it and click the middle mouse button while holding the Z key down.
Copying Spectral Peaks at a Time to the Clipboard (in a format that can then be read by my ToneGen to generate that timbre and play with its overtones): With "<" key down, hit the right mouse button over the time of the part of the spectrum whose peaks you want. (The peaks selected will be precisely those defined as peaks with the adjustments in section (T) of the Control Panel. These adjustments control the peaks determined here and in all other features accessed from, or shown on, the Over Time display. So, for instance, to get rid of low-power peaks, which may be just noise, raise the level of "Min Absolute dB for Peak (OT Only)".)
Spiral Display During Replay (if Desired): You can also display the spiral spectral information (back on the main panel) if you choose by deselecting near (C) "Skip Spiral Window for SPF Replay". This will slow down playback, depending on how many cores your cpu has, so it may be most useable when paused and stepping through little sections of music. One thing that it does also display, you may wish to note, is the original unspectralized sound (above the spiral). I've also hooked in left and right mouse button functions over that unspectralized sound to let you determine frequency when periodic waves are identifiable. Over the spiral, left and right mouse buttons give you overtone markers.
Parameters Settable During Replay: (The defaults are fine, so you don't have to set any of these. The information here is for advanced users.) Peak fit parameters in section (T): In certain more complicated music, you get some "snow" as peaks, due to odd combinations of spectral powers, if you detect all peaks -- that is, just require the spectral energy to be higher than the spectrogram just above and below the frequency (math terminology: require just a local maximum). (Snow: see screenshot 12a below.) You can reduce the snow by looking at just substantial peaks: requiring the peak to be at least x dB above the energy on at least one frequency each above and below, with the required frequencies being within so many half-steps away. The first 2 settings in (T) control this. An example of the effect is screenshot 12b below. You can also adjust the minimum power required for a detected peak -- this is the 3rd setting in (T), and is scaled in the same relative dB levels that you get when you use the middle mouse button to get a power printout. (NOTE: when you change any of these parameters, peak detection gets redone for all times, starting at the times you are looking at. While this is being done, you get a red over-time window wherever the peaks are not yet done. (And progress is indicated by a little "pct fitted" reading in section (J).) Sometimes left or right clicking on the OT window is ultimately needed to get the red to go away.)
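The difference between "all peaks" (any local maximum) and "substantial peaks" can be sketched in Python as follows. This is my own minimal reading of the three (T) settings described above, operating on a plain list of dB values, one per frequency bin at a single time; the function and parameter names, and the `bins_per_half_step` resolution, are assumptions for the example rather than the program's actual code.

```python
def find_peaks(spectrum_db, min_drop_db=0.0, max_half_steps=1.0,
               bins_per_half_step=4, min_power_db=float("-inf")):
    """Return indices of peaks in `spectrum_db`.

    A peak must be a local maximum, must be at least `min_power_db`, and must
    stand at least `min_drop_db` above some bin on each side within
    `max_half_steps` half-steps.
    """
    reach = max(1, int(max_half_steps * bins_per_half_step))
    peaks = []
    for i in range(1, len(spectrum_db) - 1):
        p = spectrum_db[i]
        if p < min_power_db:
            continue
        if not (p > spectrum_db[i - 1] and p > spectrum_db[i + 1]):
            continue  # not a local maximum
        below = spectrum_db[max(0, i - reach):i]
        above = spectrum_db[i + 1:i + 1 + reach]
        if min(below) <= p - min_drop_db and min(above) <= p - min_drop_db:
            peaks.append(i)
    return peaks


spectrum = [0, 5, 0, 1, 0, 20, 0]
print(find_peaks(spectrum, bins_per_half_step=1))                    # every local maximum
print(find_peaks(spectrum, min_drop_db=3.0, bins_per_half_step=1))   # substantial peaks only
```

In the example, the tiny bump in the middle counts as a peak under the local-maximum rule but is dropped once a 3 dB drop is required on each side, which is the "snow" reduction effect.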
Most parameters that govern the spectrogram, etc. are set when the .spf file is played back, not when it is created. (For example, parameters affecting how peaks are detected, the shades corresponding to each power level, and the parameters affecting sound resynthesis.) However, a few are set at the time of .spf file creation, described below. (Though except for advanced uses, you should just ignore them, and accept the defaults, which are tested to work pretty well.)
------------------------------
SPF-creation Analysis Parameters: (The defaults are fine, so you don't have to set any of these. The information here is for advanced users.) When you create the analysis .spf file, several parameters affecting the analysis can be set. These are on the bottom of section (J). (These have defaults, so you don't need to adjust them. Further, they have an effect only when you create the analysis .spf file -- not when you play one back, and also have no effect on live spiral mode.) In any case, the last of these is "Q_erb". This is a parameter defining the trade-off between quickness of detection vs pitch discrimination sharpness (called generally "Q" in engineering circles) that exists in all mechanical, electrical, computer-based, or any other systems that detect pitch in a way that will give similar results to the resonant process on the basilar membrane of the ear (and thus represent similar perception to a person hearing with an ear). Anyway, this trade-off must exist in all spectrograms which will pick up what the ear hears, whether the spectrogram is made by a Fourier Transform Method, or a filterbank method. (This program uses a filterbank method -- in engineering parlance, I am using constant Q filters.)
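The trade-off itself can be illustrated with the textbook relations for a simple resonator: bandwidth is the center frequency divided by Q, and the response time constant is Q divided by (pi times the center frequency). The Python sketch below uses those generic relations only; exactly how Q_erb enters the filters in SpectratunePlus is not spelled out here, so treat the functions (whose names are made up) as an assumed simplified model.

```python
import math


def bandwidth_hz(center_hz, q):
    """Approximate bandwidth of a resonator with quality factor q."""
    return center_hz / q


def response_time_s(center_hz, q):
    """Approximate amplitude time constant of the same resonator."""
    return q / (math.pi * center_hz)


def bandwidth_half_steps(q):
    """The bandwidth expressed as a musical interval; the same at every pitch for constant Q."""
    return 12.0 * math.log2(1.0 + 1.0 / q)


for q in (16, 32, 64):
    print("Q", q,
          "bandwidth at 440 Hz:", bandwidth_hz(440.0, q), "Hz,",
          "response time:", response_time_s(440.0, q), "s")
```

Doubling Q halves the bandwidth (sharper pitch discrimination) but doubles the response time (slower detection), and with constant Q the bandwidth measured in half-steps is the same for low and high notes.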
Here are some details on the exact discrimination shape with the default setting Q_erb=32:
Frequency discrimination:
1 half-step discrimination 17 dB,
2 half-step discrimination 20 dB,
3 half-step discrimination 23 dB,
4 half-step discrimination 27 dB,
5 half-step discrimination 28 dB,
8 half-step discrimination 30 dB.
Above: SpectratunePlus Control Panel.
Release notes for most recent versions (and useable for determining latest version)
2016.08.07: In .spf+.wav replay mode, all of the following:
live-single-hummed-pitch can now be shown on spiral display of .spf+.wav (as well as over-time view, which it always could);
undersample for live-single-hummed-pitch switches to 4 automatically if lower than 4 (to keep display speed fast--i.e. prevent jerkiness in replay);
In the spiral display, the shading can be made the same as the all-power over-time (grayshade) display;
In the spiral display panel (whether live or in .spf + .wav replay mode), there is an option to replace the spiral display with a display inspired by the Terhardt virtual pitch.
2015.09.25: Fixed one lockup issue.
2015.09.24: Added all-power stereo visualization; peak/chain synthesis adjustable for stereo based on selectable L-R balance; added additional controls for from-chain synthesis; knocked out a few lockup cases.
2015.09.06: Fixed some lockup issues.
2015.04.25: Fixed issue causing program crash when midi notes are played while at the same time chordkey help is on.