How does this compare with a search ranking system?
Why do we usually build a recommendation system from implicit data instead of explicit data?
What is a data stream?
It’s a constant stream of data, for example Twitter messages and online news articles.
Usually, a data stream problem has an add function which, each time it is called, adds the next item from the stream into a data structure you choose. So, to start with, we need a good data structure as a container for the past items of the stream.
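As a rough illustration of the add-function pattern described above, here is a minimal sketch. The class name `RecentStream`, its `capacity` parameter, and the choice of a bounded `deque` as the container are assumptions for illustration, not something the original post prescribes.

```python
from collections import deque


class RecentStream:
    """Keeps only the most recent `capacity` items seen on a stream."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item automatically when full.
        self.items = deque(maxlen=capacity)

    def add(self, item):
        """Called once for each incoming item from the stream."""
        self.items.append(item)

    def recent(self):
        """Return the stored items, newest first."""
        return list(reversed(self.items))


stream = RecentStream(capacity=3)
for message in ["a", "b", "c", "d"]:
    stream.add(message)
```

After these four calls the container holds only the three newest messages; which container is actually the right one depends on the queries the problem asks for.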
Some patterns for split string/array problems. Usually, the naive solution is to generate all possible splits, which takes exponential running time, and in most cases the optimal solution is to split greedily. This is tricky, so I try to summarize what I learned in this post.
This is a classic problem, the so-called K-th problem.
Basically, given a user, the problem boils down to finding the K most recent items, or the top K items by some priority, in a data stream.
Sorting + get top K data points
So, the naive approach is to use sorting: find all tweets from a user’s followers, sort them by creation time in descending order, and return the top K.
The advantage of this approach is that it is intuitive and easy to implement. The disadvantage is that it is not very efficient: the time complexity is O(N log N). …
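The sorting approach above can be sketched as follows. The representation of a tweet as a `(created_at, text)` tuple and the function name `top_k_by_sorting` are assumptions made for this example.

```python
def top_k_by_sorting(tweets, k):
    """Return the k newest tweets from a list of (created_at, text) tuples."""
    # Sort the whole list by creation time, newest first: O(N log N).
    ordered = sorted(tweets, key=lambda tweet: tweet[0], reverse=True)
    return ordered[:k]


tweets = [(3, "c"), (1, "a"), (5, "e"), (2, "b")]
newest = top_k_by_sorting(tweets, 2)
```

For comparison, Python's `heapq.nlargest(k, tweets, key=lambda tweet: tweet[0])` returns the same items while keeping only K candidates at a time, which is one common way to avoid sorting everything.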
Case study: you’re asked to design a search relevance system for a search engine.
Relevance comes from ground truth.
Besides metrics such as AUC, precision, and recall, we also need to make sure we meet the capacity and performance requirements.
Performance and capacity are the most important things to think about when designing an ML system. A performance-based SLA ensures that we return results within a given time frame (e.g. 500 ms) for 99% of queries. Capacity refers to the load that our system can handle, e.g. the system can support 1000 QPS (queries per second).
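To make the "500 ms for 99% of queries" requirement concrete, here is a small sketch that checks a batch of measured latencies against such an SLA. The function names, the nearest-rank percentile definition, and the sample numbers are all assumptions for illustration.

```python
def p99_latency_ms(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(latencies_ms)
    # 1-based nearest rank: ceil(0.99 * n), computed with integers.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_sla(latencies_ms, budget_ms=500):
    """True if 99% of the queries finished within budget_ms."""
    return p99_latency_ms(latencies_ms) <= budget_ms


# Hypothetical measurements: 99 fast queries and one slow outlier.
samples = [120] * 99 + [900]
ok = meets_sla(samples)
```

With one slow query out of 100 the SLA still holds; with two slow queries out of 100 it would not.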
This article is here so that when you use a heap library in a real interview and the interviewer asks you for details about it, you won’t fail.
A PQ (priority queue) is implemented using a binary heap.
Binary heap tree is
There are mainly two methods:
1. Insert, using heapify-up.
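A minimal sketch of insert with heapify-up, assuming a min-heap stored in a Python list where the parent of index `i` is at `(i - 1) // 2`. The class and method names are made up for this example and are not taken from any library.

```python
class MinHeap:
    """Array-backed binary min-heap supporting insert only."""

    def __init__(self):
        self.data = []

    def insert(self, value):
        # Place the new value at the end, then restore the heap property.
        self.data.append(value)
        self._heapify_up(len(self.data) - 1)

    def _heapify_up(self, i):
        # Swap the value with its parent while it is smaller than the parent.
        while i > 0:
            parent = (i - 1) // 2
            if self.data[parent] <= self.data[i]:
                break
            self.data[parent], self.data[i] = self.data[i], self.data[parent]
            i = parent


heap = MinHeap()
for value in [5, 3, 8, 1]:
    heap.insert(value)
```

Each insert climbs at most the height of the tree, so it costs O(log N), and the smallest value always ends up at index 0.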
I’m a Taiwanese expat. I worked in Singapore as a data scientist after graduating in Taiwan, and I currently work in Amsterdam as a machine learning engineer.