RankingMetrics#
- class pyspark.mllib.evaluation.RankingMetrics(predictionAndLabels)[source]#
- Evaluator for ranking algorithms.
- New in version 1.4.0.
- Parameters
- predictionAndLabels : pyspark.RDD
- an RDD of (predicted ranking, ground truth set) pairs, or of (predicted ranking, ground truth set, relevance values of the ground truth set) triplets. Since 3.4.0, NDCG evaluation with relevance values is supported (see the sketch below).
 
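For the triplet form, a minimal sketch (assuming a running SparkContext sc; the items and relevance values are illustrative, and each relevance list is aligned with its ground truth set):

>>> predictionAndLabelsWithRelevance = sc.parallelize([
...     ([1, 6, 2], [1, 2, 3], [3.0, 2.0, 1.0]),
...     ([4, 1, 5], [1, 2], [2.0, 1.0])])
>>> metricsWithRelevance = RankingMetrics(predictionAndLabelsWithRelevance)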
Examples

>>> predictionAndLabels = sc.parallelize([
...     ([1, 6, 2, 7, 8, 3, 9, 10, 4, 5], [1, 2, 3, 4, 5]),
...     ([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3]),
...     ([1, 2, 3, 4, 5], [])])
>>> metrics = RankingMetrics(predictionAndLabels)
>>> metrics.precisionAt(1)
0.33...
>>> metrics.precisionAt(5)
0.26...
>>> metrics.precisionAt(15)
0.17...
>>> metrics.meanAveragePrecision
0.35...
>>> metrics.meanAveragePrecisionAt(1)
0.3333333333333333...
>>> metrics.meanAveragePrecisionAt(2)
0.25...
>>> metrics.ndcgAt(3)
0.33...
>>> metrics.ndcgAt(10)
0.48...
>>> metrics.recallAt(1)
0.06...
>>> metrics.recallAt(5)
0.35...
>>> metrics.recallAt(15)
0.66...

Methods

- call(name, *a): Call a method of the underlying java_model.
- meanAveragePrecisionAt(k): Returns the mean average precision (MAP) over the first k ranking positions of all the queries.
- ndcgAt(k): Compute the average NDCG value of all the queries, truncated at ranking position k.
- precisionAt(k): Compute the average precision of all the queries, truncated at ranking position k.
- recallAt(k): Compute the average recall of all the queries, truncated at ranking position k.

Attributes

- meanAveragePrecision: Returns the mean average precision (MAP) of all the queries.

Methods Documentation

- call(name, *a)#
- Call a method of the underlying java_model.
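The public metric methods are thin wrappers over this passthrough; for instance, the following should be equivalent to metrics.precisionAt(5) from the Examples section above:

>>> metrics.call("precisionAt", 5)
0.26...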
 - meanAveragePrecisionAt(k)[source]#
- Returns the mean average precision (MAP) over the first k ranking positions of all the queries. If a query has an empty ground truth set, its average precision will be zero and a log warning is generated.
- New in version 3.0.0.
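To make the formula concrete, here is a plain-Python sketch of per-query average precision at k (a local illustration mirroring the documented behavior, not the distributed Spark implementation); averaged over the three doctest queries it reproduces meanAveragePrecisionAt(2):

def average_precision_at(pred, ground_truth, k):
    # Average precision at cutoff k for a single query; an empty ground
    # truth set contributes zero (Spark also logs a warning).
    if not ground_truth:
        return 0.0
    labels = set(ground_truth)
    hits, score = 0, 0.0
    for i, item in enumerate(pred[:k]):
        if item in labels:
            hits += 1
            score += hits / (i + 1)      # precision at this hit's 1-based position
    return score / min(k, len(labels))   # normalize by min(k, |ground truth|)

# The three queries from the Examples section.
queries = [
    ([1, 6, 2, 7, 8, 3, 9, 10, 4, 5], [1, 2, 3, 4, 5]),
    ([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3]),
    ([1, 2, 3, 4, 5], []),
]
print(sum(average_precision_at(p, g, 2) for p, g in queries) / len(queries))  # 0.25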
 - ndcgAt(k)[source]#
- Compute the average NDCG value of all the queries, truncated at ranking position k. The discounted cumulative gain at position k is computed as DCG@k = \sum_{i=1}^{k} (2^{rel_i} - 1) / \log(i + 1), where rel_i is the relevance of the i-th item, and the NDCG is obtained by dividing the DCG value by the DCG of the ground truth set (the ideal DCG). In the current implementation, the relevance value is binary. If a query has an empty ground truth set, zero will be used as NDCG together with a log warning.
- New in version 1.4.0.
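A plain-Python sketch of the binary-relevance case described above (an illustration, not the Spark implementation; the log base cancels in the DCG/IDCG ratio, so natural log matches a log2 implementation):

import math

def ndcg_at(pred, ground_truth, k):
    # NDCG at cutoff k with binary relevance; an empty ground truth set
    # contributes zero (Spark also logs a warning).
    if not ground_truth:
        return 0.0
    labels = set(ground_truth)
    # Binary relevance: (2^1 - 1) = 1 per hit, so each term is 1 / log(position + 1).
    dcg = sum(1.0 / math.log(i + 2) for i, item in enumerate(pred[:k]) if item in labels)
    ideal = sum(1.0 / math.log(i + 2) for i in range(min(k, len(labels))))
    return dcg / ideal

# Second query from the Examples section: only item 1 (position 2) is relevant.
print(ndcg_at([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3], 3))  # ~0.296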
 - precisionAt(k)[source]#
- Compute the average precision of all the queries, truncated at ranking position k.
- If, for a query, the ranking algorithm returns n (n < k) results, the precision value will be computed as #(relevant items retrieved) / k. This formula also applies when the size of the ground truth set is less than k.
- If a query has an empty ground truth set, zero will be used as precision together with a log warning.
- New in version 1.4.0.
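A plain-Python sketch of the per-query precision described above (an illustration, not the Spark implementation):

def precision_at(pred, ground_truth, k):
    # Precision at cutoff k for a single query. The divisor is k even when
    # fewer than k results are returned, so short result lists are penalized.
    if not ground_truth:
        return 0.0  # empty ground truth contributes zero (Spark also logs a warning)
    labels = set(ground_truth)
    return sum(1 for item in pred[:k] if item in labels) / k

# Second query from the Examples section: items 1 and 2 appear in the top 5.
print(precision_at([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3], 5))  # 0.4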
 - recallAt(k)[source]#
- Compute the average recall of all the queries, truncated at ranking position k.
- If, for a query, the ranking algorithm returns n results, the recall value will be computed as #(relevant items retrieved) / #(ground truth set). This formula also applies when the size of the ground truth set is less than k.
- If a query has an empty ground truth set, zero will be used as recall together with a log warning.
- New in version 3.0.0.
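A plain-Python sketch of the per-query recall described above (an illustration, not the Spark implementation):

def recall_at(pred, ground_truth, k):
    # Recall at cutoff k for a single query: hits in the top k divided by
    # the full ground truth size, regardless of k.
    if not ground_truth:
        return 0.0  # empty ground truth contributes zero (Spark also logs a warning)
    labels = set(ground_truth)
    return sum(1 for item in pred[:k] if item in labels) / len(labels)

# Second query from the Examples section: 2 of the 3 relevant items in the top 5.
print(recall_at([4, 1, 5, 6, 2, 7, 3, 8, 9, 10], [1, 2, 3], 5))  # 0.666...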
Attributes Documentation

- meanAveragePrecision#
- Returns the mean average precision (MAP) of all the queries, computed over each full prediction list without truncation. If a query has an empty ground truth set, its average precision will be zero and a log warning is generated.
- New in version 1.4.0.
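This untruncated MAP corresponds to the average-precision sketch under meanAveragePrecisionAt with k set to each prediction list's full length:

# Reusing average_precision_at and queries from the meanAveragePrecisionAt
# sketch above; each prediction list here is at least as long as its ground
# truth set, so the min(k, |ground truth|) normalizer equals |ground truth|.
print(sum(average_precision_at(p, g, len(p)) for p, g in queries) / len(queries))  # ~0.355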