topK Algorithm (To Be Improved)

Data: https://gist.githubusercontent.com/tyrchen/32c50aadca48aee3da10a77a18479517/raw/3aa07629e61239cd26cf514584c949a98aa38d67/movies.csv
Task: sort by rank and take the top K.

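/**
 * Logs the num rows with the highest rank, sorted by rank from small to large.
 * @param {string} data Fetched data (raw CSV text)
 * @param {number} num Take top num items
 */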
let top = (data, num) => {
    let res = [];
    let dataArr = data.split('\n');
    let dataLength = dataArr.length;

    // Start at 1 to skip the header row
    for (let i = 1; i < dataLength; i++) {
        // Handle dirty data: a quoted field may itself contain commas
        // example: "27, most popular",127.6053254325764,6.03051434574081
        if (dataArr[i].includes('"')) {
            dataArr[i] = dataArr[i].replace(/".+"/, i);
        }

        dataArr[i] = dataArr[i].split(',');

        dataArr[i][2] = parseFloat(dataArr[i][2]);

        // Skip blank or malformed lines whose rank is not a number
        if (Number.isNaN(dataArr[i][2])) {
            continue;
        }

        if (res.length < num) {
            // First fill the data up to num items
            res.push(dataArr[i]);
        } else if (num > 0 && dataArr[i][2] > res[0][2]) {
            // If larger than the smallest rank in res, replace that smallest entry
            res[0] = dataArr[i];
        } else {
            continue;
        }

        // Re-sort by rank from small to large so that res[0] is always the smallest
        res.sort((a, b) => a[2] - b[2]);
    }
    console.log(res);
}

// `data` must hold the raw CSV text from the link above; fetching it is not implemented here
top(data, 20);
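
For comparison, here is a minimal sketch of the simpler alternative: parse every row, sort once by rank, and slice off the first num rows. The names parseLine and topBySort are hypothetical, and the parsing assumes three columns where only the first one may contain commas (as in the example row above), so the two numeric columns can be read from the end of the line.

```javascript
// Hypothetical helper: split one CSV line into [title, secondColumn, rank].
// Assumption: only the first column may contain commas, so the last two
// comma-separated parts are the numeric columns. Quotes around the title are kept.
const parseLine = (line) => {
    const parts = line.split(',');
    const rank = parseFloat(parts.pop());
    const second = parts.pop();
    const title = parts.join(',');
    return [title, second, rank];
};

// Sort once, then take the first num rows (largest rank first).
const topBySort = (data, num) =>
    data
        .split('\n')
        .slice(1)                               // skip the header row
        .map(parseLine)
        .filter(row => !Number.isNaN(row[2]))   // drop blank or malformed lines
        .sort((a, b) => b[2] - a[2])            // rank from large to small
        .slice(0, num);
```

This costs one O(n log n) sort over all rows instead of a sort per row, which is usually fine for a file of this size.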

This approach calls Array.sort on every iteration. Looking back, that is not quite right: sorting the data once and taking the top K would have been enough, and ideally I would write the Quick Sort myself. The dirty-data handling also does not explicitly cover values in exponent notation such as 10e-1, although parseFloat appears to handle those as well.

Summary of improvements needed:
1. Write a request to fetch the raw data.
2. Rewrite the sort with Quick Sort.
3. Try to implement something like Java's PriorityQueue in JS.
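
As a rough sketch of the third item, a binary min-heap can play the role of Java's PriorityQueue: keep at most k rows in the heap, and whenever a new row has a larger rank than the heap's smallest, swap it in. The names MinHeap and topKWithHeap are made up for illustration, and the rows are assumed to be already parsed, with the rank as a number at index 2 as in top().

```javascript
// Minimal binary min-heap ordered by a numeric key (a sketch, not a full PriorityQueue API).
class MinHeap {
    constructor(keyOf) {
        this.keyOf = keyOf;   // function that extracts the numeric key from an item
        this.items = [];
    }

    get size() {
        return this.items.length;
    }

    // Smallest item, or undefined when the heap is empty
    peek() {
        return this.items[0];
    }

    push(item) {
        this.items.push(item);
        this.siftUp(this.items.length - 1);
    }

    // Remove and return the smallest item
    pop() {
        const top = this.items[0];
        const last = this.items.pop();
        if (this.items.length > 0) {
            this.items[0] = last;
            this.siftDown(0);
        }
        return top;
    }

    siftUp(i) {
        while (i > 0) {
            const parent = (i - 1) >> 1;
            if (this.keyOf(this.items[parent]) <= this.keyOf(this.items[i])) break;
            [this.items[parent], this.items[i]] = [this.items[i], this.items[parent]];
            i = parent;
        }
    }

    siftDown(i) {
        const n = this.items.length;
        for (;;) {
            const left = 2 * i + 1;
            const right = left + 1;
            let smallest = i;
            if (left < n && this.keyOf(this.items[left]) < this.keyOf(this.items[smallest])) smallest = left;
            if (right < n && this.keyOf(this.items[right]) < this.keyOf(this.items[smallest])) smallest = right;
            if (smallest === i) break;
            [this.items[smallest], this.items[i]] = [this.items[i], this.items[smallest]];
            i = smallest;
        }
    }
}

// rows: parsed rows such as ['title', '127.6', 6.03]; returns the k rows with the largest rank.
const topKWithHeap = (rows, k) => {
    const heap = new MinHeap(row => row[2]);
    for (const row of rows) {
        if (Number.isNaN(row[2])) continue;       // skip rows without a numeric rank
        if (heap.size < k) {
            heap.push(row);                       // fill up to k rows first
        } else if (k > 0 && row[2] > heap.peek()[2]) {
            heap.pop();                           // evict the current smallest
            heap.push(row);
        }
    }
    // Pops come out in ascending order; reverse to get the largest rank first
    const result = [];
    while (heap.size > 0) result.push(heap.pop());
    return result.reverse();
};
```

Each row costs at most O(log k) heap work, so the whole pass is O(n log k) and never holds more than k rows, in contrast to re-sorting res on every iteration.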