Columbia University

Technology Ventures

Identification of similar songs within a large database

Technology #cu12223

Inventor: Daniel P.W. Ellis
Managed By: Satish Rao

Plagiarism and piracy in the music industry have become increasingly common in the digital age. Unauthorized use of part or all of a song costs the original artist revenue that would otherwise have been earned through licensing fees. This technology is an algorithm that analyzes digital song samples and identifies similar samples within a large database. The method is designed for computational speed and can quickly match songs based on musical similarity. As such, it may be useful for finding unauthorized copies or cover versions of a song within a large database.

Computationally-efficient song comparison using spectral analysis

Current technologies for comparing song samples rely on elements of dynamic programming, which are computationally expensive and time-consuming. These approaches are less suited to large databases and can take hours or days to analyze thousands of songs. This technology instead uses spectral analysis to generate a 2-D Fourier spectrum for each song; two spectra can then be compared quickly using a simple distance computation. The technology is particularly well suited to identifying cover songs, which may differ in tempo and instrumentation from the original. Other applications include suggesting similar songs to users of internet radio or digital music stores.
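The general idea of comparing 2-D Fourier spectra with a simple distance can be illustrated with a minimal Python sketch. This is not the inventor's implementation. The 12 x 64 feature patch, the random stand-in data, and the helper names `fourier_fingerprint` and `nearest_songs` are all hypothetical, and Euclidean distance is assumed as the "simple distance computation". The sketch relies on one well-known property: the magnitude of a 2-D Fourier transform does not change when the input is circularly shifted along either axis.

```python
import numpy as np


def fourier_fingerprint(features: np.ndarray) -> np.ndarray:
    """Flattened magnitude of the 2-D Fourier transform of a feature patch."""
    return np.abs(np.fft.fft2(features)).ravel()


def nearest_songs(query: np.ndarray, database: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k database fingerprints closest to the query (Euclidean)."""
    distances = np.linalg.norm(database - query, axis=1)
    return np.argsort(distances)[:k]


rng = np.random.default_rng(0)

# Hypothetical stand-in data: 5 "songs", each a 12 x 64 feature patch
# (for example, pitch classes by time frames). Real features are an assumption.
songs = rng.random((5, 12, 64))
database = np.stack([fourier_fingerprint(s) for s in songs])

# A toy "cover" of song 2: circularly shifted along both axes, plus small noise.
cover = np.roll(songs[2], shift=(3, 10), axis=(0, 1))
cover = cover + 0.01 * rng.standard_normal(cover.shape)

best_match = int(nearest_songs(fourier_fingerprint(cover), database, k=1)[0])
print("closest database entry:", best_match)
```

Each database lookup here is a single vectorized distance computation over fixed-length vectors, which is the kind of operation that scales to large collections far more easily than a pairwise alignment of two songs.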

The algorithm was developed and tested using the Million Song Dataset, a collection of audio features and metadata for one million contemporary popular songs.

Lead Inventor:

Daniel P.W. Ellis, Ph.D.


Applications:

  • Identification of cover songs or other unauthorized use of audio content within a large database
  • Discovery of musically-similar songs for use in internet radio or digital music stores
  • Song recognition software
  • Classification system for audio samples


Advantages:

  • Well-studied and highly efficient method of analysis
  • Fast comparison enables searching of large databases
  • Can identify cover songs that differ in tempo and instrumentation from the original

Patent information:

Patent Pending (US 20130226957)

Tech Ventures Reference: IR CU12223

Related Publications: