References

https://en.wikipedia.org/wiki/Precision_and_recall

https://gab41.lab41.org/recommender-systems-its-not-all-about-the-accuracy-562c7dceeaff#.4533v326j

https://yanirseroussi.com/2015/11/23/the-hardest-parts-of-data-science/

Even if we decide to use multiple metrics to evaluate our solution, our troubles aren’t over yet. Using multiple metrics often means that there are trade-offs between the different metrics. For example, with the precision and recall measures that are commonly used to evaluate the performance of search engines, it is rare to be able to increase both precision and recall at the same time. Precision is the percentage of relevant items out of those that have been returned, while recall is the percentage of relevant items that have been returned out of the overall number of relevant items. Hence, it is easy to artificially increase recall to 100% by always returning all the items in the database, but this would mean settling for near-zero precision. Similarly, one can increase precision by always returning a single item that the algorithm is very confident about, but this means that recall would suffer. Ultimately, the best balance between precision and recall depends on the application.
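The two extremes described above can be checked with a small sketch. The function name `precision_recall`, the toy database of 1000 items, and the choice of 10 relevant items are all assumptions made for illustration, not part of the quoted article. Treating precision as 0.0 when nothing is returned is also just a convention adopted here.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for the returned items against the relevant items."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)  # relevant items that were actually returned
    # Convention (assumption): report 0.0 when the denominator would be zero.
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy database: 1000 items, of which items 0..9 are relevant.
database = range(1000)
relevant = set(range(10))

# Strategy 1: always return every item in the database.
p_all, r_all = precision_recall(database, relevant)

# Strategy 2: return a single item the algorithm is confident about
# (assumed here to be a relevant one).
p_one, r_one = precision_recall([0], relevant)

print(f"return everything: precision={p_all:.2f}, recall={r_all:.2f}")
print(f"return one item:   precision={p_one:.2f}, recall={r_one:.2f}")
```

Under these assumptions, returning everything gives perfect recall with precision equal to the share of relevant items in the database, while returning one correct item gives perfect precision but recovers only one tenth of the relevant items.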