Emotion-Based Music Player: A Python Project in Machine Learning

Music is a universal language that connects emotions and brings people together across cultures. With machine learning, you can now personalize your listening experience based on your current emotional state.

This article shows you how to build an emotion-based music player using Python. The idea is to recognize a user's emotion through facial expression analysis and serve a playlist that matches their mood.

Project Overview

An emotion-based music player uses machine learning algorithms to recognize emotional patterns and suggest songs that fit the user's current state. The system combines computer vision for emotion detection with recommendation algorithms for music selection.

The main components include:

  • Emotion Detection − recognizing the user's mood from facial expressions

  • Music Classification − categorizing songs by mood and genre

  • Recommendation System − matching detected emotions to appropriate music

Datasets Required

For this project, we'll use two essential datasets −

  • FER-2013 − a facial expression dataset of 48×48 face images labeled with seven emotions, used to train the emotion detector

  • data_moods.csv − a list of songs with name, artist, mood (e.g. Happy, Calm, Energetic) and popularity, used for recommendations
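The music dataset is just a CSV of songs tagged with a mood label. Its structure can be sketched with a small stand-in table; the song rows below are invented for illustration, and in the real project you would load the actual data_moods.csv instead −

```python
import pandas as pd

# Stand-in for data_moods.csv: the real file has the same columns,
# but these rows are made up purely for illustration
music_data = pd.DataFrame({
    'name': ['Song A', 'Song B', 'Song C', 'Song D'],
    'artist': ['Artist 1', 'Artist 2', 'Artist 3', 'Artist 4'],
    'mood': ['Happy', 'Calm', 'Energetic', 'Happy'],
    'popularity': [75, 60, 82, 90],
})

# The recommendation step filters on the mood column, so check its values
print(music_data['mood'].unique())   # e.g. ['Happy' 'Calm' 'Energetic']
print(music_data[music_data['mood'] == 'Happy'])
```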

Data Preprocessing

First, let's set up the basic configuration for our emotion detection model −

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Configuration
img_shape = 48
batch_size = 64

# Dataset paths (adjust according to your setup)
train_data_path = '/path/to/fer2013/train/'
test_data_path = '/path/to/fer2013/test/'

Creating Data Generators

# Data preprocessing with augmentation for training
train_preprocessor = ImageDataGenerator(
    rescale=1/255.,
    rotation_range=10,
    zoom_range=0.2,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Simple rescaling for test data
test_preprocessor = ImageDataGenerator(rescale=1/255.)

# Load training data
train_data = train_preprocessor.flow_from_directory(
    train_data_path,
    class_mode="categorical",
    target_size=(img_shape, img_shape),
    color_mode='rgb',
    shuffle=True,
    batch_size=batch_size
)

# Load test data
test_data = test_preprocessor.flow_from_directory(
    test_data_path,
    class_mode="categorical", 
    target_size=(img_shape, img_shape),
    color_mode="rgb",
    shuffle=False,
    batch_size=batch_size
)

This produces the following output −

Found 28709 images belonging to 7 classes.
Found 7178 images belonging to 7 classes.

Building the CNN Model

We'll create a Convolutional Neural Network to classify facial emotions into seven categories (angry, disgust, fear, happy, neutral, sad, and surprise) −

def create_cnn_model():
    model = Sequential()
    
    # First CNN block
    model.add(Conv2D(32, (3,3), activation='relu', input_shape=(img_shape, img_shape, 3)))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))
    
    # Second CNN block
    model.add(Conv2D(64, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))
    
    # Third CNN block
    model.add(Conv2D(128, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))
    
    # Dense layers
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    
    model.add(Dense(512, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    
    # Output layer (7 emotions)
    model.add(Dense(7, activation='softmax'))
    
    return model

# Create and compile the model
cnn_model = create_cnn_model()
cnn_model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
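With the model compiled, training reduces to a fit call on the generators. The sketch below substitutes a tiny random batch and a deliberately small model so it runs anywhere; the epoch count and callback settings are illustrative assumptions, and in the real project you would call cnn_model.fit(train_data, validation_data=test_data, ...) instead −

```python
import numpy as np
import tensorflow as tf

# Stand-in for the FER-2013 generators: 8 random 48x48 RGB "images"
# and 7-class one-hot labels (random values, illustration only)
x_fake = np.random.rand(8, 48, 48, 3).astype('float32')
y_fake = tf.keras.utils.to_categorical(np.random.randint(0, 7, size=8), 7)

# A deliberately tiny model so this sketch runs in seconds;
# in the real project, substitute cnn_model from create_cnn_model()
model = tf.keras.Sequential([
    tf.keras.Input(shape=(48, 48, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(7, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# A typical callback: stop when the monitored loss plateaus; a
# ModelCheckpoint callback would additionally save the best weights
callbacks = [tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)]

# Real call: cnn_model.fit(train_data, validation_data=test_data,
# epochs=30, callbacks=callbacks); the epoch count is illustrative
history = model.fit(x_fake, y_fake, epochs=2, verbose=0, callbacks=callbacks)
print(sorted(history.history.keys()))
```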

Music Recommendation System

Now let's create the music recommendation component. First, load the music dataset −

# Load music dataset
music_data = pd.read_csv("/path/to/data_moods.csv")
music_data = music_data[['name', 'artist', 'mood', 'popularity']]

# Define emotion classes
emotion_classes = ['Angry', 'Disgust', 'Fear', 'Happy', 'Neutral', 'Sad', 'Surprise']

def recommend_songs(predicted_emotion):
    """
    Recommend songs based on predicted emotion
    """
    if predicted_emotion in ['Angry', 'Disgust', 'Fear', 'Sad']:
        # Calming music for negative emotions
        playlist = music_data[music_data['mood'] == 'Calm']
    elif predicted_emotion == 'Happy':
        playlist = music_data[music_data['mood'] == 'Happy']
    elif predicted_emotion in ['Surprise', 'Neutral']:
        playlist = music_data[music_data['mood'] == 'Energetic']
    else:
        # Fallback for any unrecognized label
        playlist = music_data[music_data['mood'] == 'Happy']
    
    # Sort by popularity and return top 5
    recommended = playlist.sort_values(by="popularity", ascending=False)
    return recommended[:5].reset_index(drop=True)

# Example usage
sample_emotion = 'Happy'
recommendations = recommend_songs(sample_emotion)
print(f"Recommended songs for {sample_emotion} mood:")
print(recommendations[['name', 'artist', 'mood']])

Real-time Emotion Detection

For real-time emotion detection using a webcam, we can integrate OpenCV −

import cv2

def detect_emotion_from_camera(model):
    """
    Real-time emotion detection using webcam
    """
    # Load face cascade classifier
    face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    
    # Start video capture
    cap = cv2.VideoCapture(0)
    
    while True:
        ret, frame = cap.read()
        if not ret:
            break
            
        # Convert to grayscale
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        
        # Detect faces
        faces = face_cascade.detectMultiScale(gray, 1.1, 4)
        
        for (x, y, w, h) in faces:
            # Extract the face region and match the training preprocessing
            face_roi = frame[y:y+h, x:x+w]
            face_roi = cv2.cvtColor(face_roi, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR; the generators loaded RGB
            face_roi = cv2.resize(face_roi, (48, 48))
            face_roi = face_roi.astype('float32') / 255.0
            face_roi = np.expand_dims(face_roi, axis=0)
            
            # Predict the emotion (verbose=0 silences per-frame progress logs)
            prediction = model.predict(face_roi, verbose=0)
            emotion_idx = np.argmax(prediction)
            emotion = emotion_classes[emotion_idx]
            
            # Draw rectangle and label
            cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
            cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)
        
        # Display frame
        cv2.imshow('Emotion Detection', frame)
        
        # Break on 'q' key press
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    cap.release()
    cv2.destroyAllWindows()

Complete Music Player Application

Here's a simplified version of the complete emotion-based music player −

class EmotionMusicPlayer:
    def __init__(self, model_path, music_dataset_path):
        self.model = tf.keras.models.load_model(model_path)
        self.music_data = pd.read_csv(music_dataset_path)
        self.emotion_classes = ['Angry', 'Disgust', 'Fear', 'Happy', 'Neutral', 'Sad', 'Surprise']
    
    def predict_emotion(self, face_image):
        """Predict emotion from a BGR face crop"""
        face_image = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)  # match the training color order
        face_image = cv2.resize(face_image, (48, 48))
        face_image = face_image.astype('float32') / 255.0
        face_image = np.expand_dims(face_image, axis=0)
        
        prediction = self.model.predict(face_image, verbose=0)
        emotion_idx = np.argmax(prediction)
        return self.emotion_classes[emotion_idx]
    
    def get_playlist(self, emotion):
        """Get playlist based on emotion"""
        mood_mapping = {
            'Happy': 'Happy',
            'Sad': 'Calm', 
            'Angry': 'Calm',
            'Fear': 'Calm',
            'Surprise': 'Energetic',
            'Neutral': 'Energetic',
            'Disgust': 'Calm'
        }
        
        target_mood = mood_mapping.get(emotion, 'Happy')
        playlist = self.music_data[self.music_data['mood'] == target_mood]
        return playlist.sort_values(by='popularity', ascending=False)[:10]
    
    def start_player(self):
        """Start the emotion-based music player"""
        print("Starting Emotion-Based Music Player...")
        print("Press 'q' to quit")
        
        cap = cv2.VideoCapture(0)
        face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
        
        current_playlist = None
        last_emotion = None
        
        while True:
            ret, frame = cap.read()
            if not ret:
                break
                
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = face_cascade.detectMultiScale(gray, 1.1, 4)
            
            for (x, y, w, h) in faces:
                face_roi = frame[y:y+h, x:x+w]
                
                # Predict emotion
                emotion = self.predict_emotion(face_roi)
                
                # Update playlist if emotion changed
                if emotion != last_emotion:
                    current_playlist = self.get_playlist(emotion)
                    print(f"\nEmotion detected: {emotion}")
                    print("Recommended songs:")
                    print(current_playlist[['name', 'artist']].head())
                    last_emotion = emotion
                
                # Display emotion on frame
                cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
                cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)
            
            cv2.imshow('Emotion-Based Music Player', frame)
            
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        
        cap.release()
        cv2.destroyAllWindows()

# Usage
# player = EmotionMusicPlayer('emotion_model.h5', 'music_data.csv')
# player.start_player()
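The player above prints recommendations but never starts playback. One lightweight way to hand off to an actual player is to open a streaming-service search for the top recommended track. The sketch below only builds such a URL (the helper name and the YouTube search-URL format are assumptions, not part of the original project) −

```python
from urllib.parse import quote_plus

def playback_url(song_name, artist):
    """Build a YouTube search URL for a recommended track.

    Opening it is then one call away, e.g. webbrowser.open(url);
    URL building is kept separate so it can be tested without a browser.
    """
    query = quote_plus(f"{song_name} {artist}")  # spaces become '+', special chars are escaped
    return f"https://www.youtube.com/results?search_query={query}"

url = playback_url("Here Comes the Sun", "The Beatles")
print(url)
```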

Additional Features

You can enhance your emotion-based music player with these features:

  • User Preferences − learn from user feedback to improve recommendations

  • Playlist Sharing − allow users to share emotion-based playlists

  • Music Discovery − suggest new songs based on listening history

  • Multi-language Support − include songs from different languages

  • Offline Mode − download songs for offline listening
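The first of these, learning from user feedback, can start very simply: keep a per-song score that thumbs-up/thumbs-down adjusts, and blend it into the popularity ranking. A minimal sketch, where the scoring scheme and the weight value are invented for illustration −

```python
import pandas as pd

# Toy playlist; in the project this would be the mood-filtered slice of music_data
playlist = pd.DataFrame({
    'name': ['Song A', 'Song B', 'Song C'],
    'popularity': [90, 70, 80],
})

# Feedback log: +1 per thumbs up, -1 per thumbs down (illustrative scheme)
feedback = {'Song B': 3, 'Song A': -2}

def rerank(playlist, feedback, weight=10):
    """Blend accumulated feedback into the popularity ranking."""
    scored = playlist.copy()
    # Songs with no feedback get a neutral adjustment of 0
    scored['score'] = scored['popularity'] + weight * scored['name'].map(feedback).fillna(0)
    return scored.sort_values('score', ascending=False).reset_index(drop=True)

ranked = rerank(playlist, feedback)
print(ranked['name'].tolist())  # Song B (70+30=100) now outranks Song A (90-20=70)
```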

Conclusion

An emotion-based music player combines computer vision, machine learning, and music recommendation to create a personalized listening experience. The system uses facial emotion recognition to detect user moods and suggests appropriate music automatically. This technology has potential applications in therapy, stress management, and enhancing overall well-being through music.

Updated on: 2026-04-02T17:12:24+05:30
