Tuesday, April 24, 2007

Factor Analysis using All 55 Features

Factor analysis is a statistical data reduction technique used to explain variability among observed random variables in terms of fewer unobserved random variables called factors. The observed variables are modeled as linear combinations of the factors, plus "error" terms. Factor analysis originated in psychometrics, and is used in behavioral sciences, social sciences, marketing, product management, operations research, and other applied sciences that deal with large quantities of data.

In this case, the factor analysis includes three factors, so the biplot is three-dimensional. Each of the 55 features is represented in this plot by a vector, and the direction and length of each vector indicate how that feature depends on the underlying factors.

For example, you have seen that after promax rotation, features 31~35 and 50~55 (except 53) have positive loadings on the first factor and negligible loadings on the other two. That first factor, interpreted as a Loudness effect, is represented in this biplot as one of the horizontal axes, and the dependence of those 10 features on it corresponds to the 10 vectors directed approximately along that axis. Similarly, features 7, 8, 10, 11, 14, 15, and 36 depend primarily on the second factor, interpreted as a Timbre effect, and are represented by vectors directed approximately along that axis. Each of the 2251 observations is represented in this plot by a point, whose location indicates that observation's scores on the three factors. For example, points near the top of the plot have the highest scores on the third factor, interpreted as a Pitch effect. The points are scaled to fit within the unit square, so only their relative locations can be read from the plot.
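The factor scores and loadings described above can be computed with off-the-shelf tools. A minimal sketch with scikit-learn is shown below; note that scikit-learn only offers orthogonal rotations such as varimax, not the oblique promax rotation used in this post, and random data stands in here for the real 2251×55 feature matrix.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic stand-in for the real 2251 x 55 feature matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((2251, 55))

# Three factors with a varimax rotation (promax is not available in
# scikit-learn, so this is only an approximation of the post's setup).
fa = FactorAnalysis(n_components=3, rotation="varimax")
scores = fa.fit_transform(X)   # (2251, 3): one row of factor scores per clip
loadings = fa.components_.T    # (55, 3): each row is one feature's biplot vector

print(scores.shape, loadings.shape)
```

The `scores` array gives the point locations in the biplot and `loadings` gives the feature vectors.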

Sunday, April 22, 2007

Selection of Dominant Features

Why do we select a smaller feature set from larger data sets? By removing the most irrelevant and redundant features from the data, feature selection helps improve the performance of learning models by:
1. Alleviating the effect of the curse of dimensionality
2. Enhancing generalization capability
3. Speeding up the learning process
4. Improving model interpretability

My approach to this issue is to observe how each feature's normalized density distribution discriminates among the classes: if a feature shows an obvious global pattern, I select it. Following the famous Pareto principle (for many phenomena, 80% of the consequences stem from 20% of the causes), I choose 9 dominant features from the original 55. In particular, 3 of them are the most dominant because of their clearly distinct appearance; the other 6 are less important. These lower-dimensional sets will be the basis for each classifier.


Tuesday, April 17, 2007

Musical Rule of Thumb and Related Features

According to the rules of thumb listed in "The Influence of Musical Structure on Emotional Expression", 4 main factors are emphasized here: Dynamics (or Loudness), Rhythm (or Beat), Pitch, and Harmony. Compared with the feature sets extracted before, features with similar meaning but different terminology should be considered together. The musical feature table can be checked in the past study; briefly, Dynamics is related to F31~F35 and F50~F55, Rhythm to F1~F6, Pitch to F26~F30 and F42~F49, and Harmony to F40~F43.

As the figure here shows, we separate our data sets into new groups with sub-set labels, for example Dynamics+, Dynamics-, and unknown, and similarly by analogy for the other factors. The super-groups illustrated here are mutually dependent, because there is no clear on-off boundary for any label. Also, the rules of thumb use descriptive, qualitative terms such as Loud and Soft, which limits their use in quantitative research. However, these cues are convenient for finding a local optimum within some plausible sub-set. I claim that each class has its own representative attribute, easily perceived by humans, but it must still be explored and reconstructed by me. These boundaries are nonlinear and may differ case by case. Thus, I think 200 musical samples are still too few to see an obvious tendency, and the feature extraction process should be reexamined.

Binary Classification: Class A or Not

During the feature-selection period, representing the data with all 55 features in binary classification makes the issues clear.

1. Is the specific class representative? Does it have an idiosyncrasy that distinguishes it from the others? As the figures here show, different subsets of the feature set must be combined to discriminate a specific class; for an arbitrary example, rhythmic information for separating class 1 from class 2.

2. In statistics, an outlier is an observation that is numerically distant from the rest of the data. Statistics derived from data sets that include outliers are often misleading, so I need to cut off some bad data by a relatively standard procedure. After browsing these 8 figures, sample #1353 has been dropped from the original set. This process is necessary for building a robust classifier for each class.
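The visual screening that flagged #1353 can also be done automatically. A minimal sketch of one standard procedure, a per-feature z-score cutoff, is below; the threshold of 4 standard deviations is an assumption, not the criterion actually used in this post.

```python
import numpy as np

def drop_outliers(X, z_max=4.0):
    """Keep rows whose every feature lies within z_max standard deviations
    of that feature's mean; a simple automated analogue of dropping a
    clip like #1353 after visual inspection."""
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    keep = (z < z_max).all(axis=1)
    return X[keep], np.flatnonzero(~keep)

rng = np.random.default_rng(2)
X = rng.standard_normal((300, 4))
X[42] = 50.0                        # plant one obvious outlier
X_clean, dropped = drop_outliers(X)
print(dropped)                      # [42]
```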

Thus, further trials for refining my classifiers proceed!

Friday, April 13, 2007

Issues to be solved


Music-Emotion Data Distribution in 2-D Feature Space

All 1200 music clips, using all 55 features, mapped onto the first two canonical axes.

Currently, the music-emotion database contains 1200 clips, labeled by over 300 people based on a modified Hevner checklist. Each clip has 55 musical features (covering dynamics, rhythm, pitch, harmony, and others), and the average number of emotion labels per clip is 1.88 (multi-label classification, ranging from 1 dominant label to 2 overlapping ones by some heuristic criteria). 8 single binary classifiers, each using a specific feature set with high discriminability and interpretation power, are still being tuned. A Support Vector Machine-based classifier built with cross-validation, with precision/recall performance results, is under construction.
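One of the 8 binary "class A or not" classifiers can be sketched with scikit-learn as below. The synthetic data, the 9-feature dimensionality, and the RBF-kernel settings are assumptions for illustration; the post does not specify the kernel or hyperparameters actually used.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_validate

# Toy stand-in for one binary "class A or not" problem; the real setup
# uses the selected musical features and the crowd-sourced clip labels.
rng = np.random.default_rng(3)
X = rng.standard_normal((400, 9))            # 9 selected features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

svm = SVC(kernel="rbf", C=1.0, gamma="scale")
cv = cross_validate(svm, X, y, cv=5, scoring=("precision", "recall"))
print(cv["test_precision"].mean(), cv["test_recall"].mean())
```

Cross-validated precision and recall per class are exactly the performance numbers described as under construction above.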

Issues addressed here to be solved:
1. Feature selection by trial and error.
2. A more efficient model computed from entropy using a decision tree.

More figures can be found in the PowerPoint file.

Wednesday, April 11, 2007

This Webpage

For my current research and the final project in the course MULTIMEDIA ANALYSIS AND INDEXING, an emotion-based media player show will be discussed and constructed here.

ABSTRACT

Nowadays, media player software often features visual effects while music plays, but most of these are patterns unrelated to the musical content. A novel method is presented here for a fancy media player show that integrates auditory and visual perception. A database of 400 nature photos and 200 soundtrack clips with emotion labels was constructed by nearly five hundred users through the web. A complete piece of music is temporally analyzed and segmented, and each segment is assigned a suitable emotion label by a multi-label SVM (support vector machine). Images with the same emotion then accompany the music to present the show. As the music proceeds, the color contrast of the images is also calculated to align with the spectral flux of the musical clips frame by frame. The foregoing sequence-matching problem is formulated, and the Viterbi algorithm is used for optimization. Subjective feedback shows that the show adds extra emotional impact and is evaluated well.