Repetition in News Aggregator

service that organises news from different sources by topic
published: (updated: )
by Harshvardhan J. Pandit
is part of: news aggregator
machine-learning news

Content curation can be carried out either manually or automatically. In the first case, it’s done by specially designated curators. In the second case, it’s done using one or more of the following: collaborative filtering, semantic analysis, and social rating. [Wikipedia]

What does a user want to do when they open a news aggregator app? They want to read the news. That’s exactly the approach every news aggregator app has taken until now. The user wants to read the news from various sources, and may be looking for a particular source, so the app provides filters, folders, tags etc. and leaves the hard work to the user. Apps like Flipboard rely on users to curate articles into Magazines, which mostly will not satisfy other users looking for their own version of news reading. In such a case, an algorithm that curates the news by topic is ideal, as it gives the user the freedom to then pick which of the currently popular topics they are interested in.

Let’s take the example of Amazon’s Fire Phone launch a few days back (Jun 18th). Virtually every news source published post after post about the many wonders brought on by Jeff Bezos. Translate this to what the user saw: a list full of Fire Phone articles repeated again and again. Compare that with what the user wanted: to simply know about the Fire Phone. There are two things to learn and understand here:

  • The user wanted to know more about the Fire Phone without reading about it again and again
  • The repeated Fire Phone articles overshadowed other news articles

Which brings us to our solution, which for now simply addresses the two things we’ve just learnt: repetition and preference. Repeated articles can be grouped together under a label such as “Amazon launches Fire Phone”, and since there were more articles about it than about any other topic, it floats right to the top. Other related articles are grouped in the same way and ordered by how frequently the different sources wrote about them.
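As a rough illustration, the grouping and ranking step could look something like the sketch below. It assumes each article already carries a topic label assigned by some upstream step (the `topic` field, the `Article` class and `group_by_topic` are hypothetical names, and the sample feed is made up); how that label is actually detected is outside the scope of this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    source: str
    topic: str  # assumption: a topic label has already been assigned upstream


def group_by_topic(articles):
    """Return (topic, articles) pairs, most-covered topic first."""
    groups = defaultdict(list)
    for article in articles:
        groups[article.topic].append(article)
    # Topics with more articles float to the top of the feed.
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


# Made-up sample feed, for illustration only.
feed = [
    Article("Hands-on with the Fire Phone", "Source A", "Amazon launches Fire Phone"),
    Article("An unrelated story", "Source B", "Other topic"),
    Article("Fire Phone: what we know", "Source B", "Amazon launches Fire Phone"),
    Article("Bezos unveils Fire Phone", "Source C", "Amazon launches Fire Phone"),
]

for topic, items in group_by_topic(feed):
    print(f"{topic} ({len(items)})")
```

Ranking purely by article count is the simplest reading of "more articles floats to the top"; a real app might also weigh recency or the user's own preferences.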

To tell the user how many articles are currently available about the Amazon Fire Phone, a small number could be shown in the corner of the label. Selecting the label would open the feeds page for the articles related to the topic. The user can then browse the articles by time or by source. Once the user has finished reading, marking the entire label as read cleans up the feed and lets the user focus on other topics. This allows the reading of more news, and makes better use of time.
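The label behaviour described above could be modelled roughly as follows. This is only a sketch: `FeedItem`, `Label` and their method names are hypothetical, and the sample articles and timestamps are made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class FeedItem:
    title: str
    source: str
    published: datetime
    read: bool = False


@dataclass
class Label:
    name: str
    items: list = field(default_factory=list)

    def unread_count(self):
        """Number shown in the corner of the label."""
        return sum(1 for item in self.items if not item.read)

    def by_time(self):
        """Articles under this label, newest first."""
        return sorted(self.items, key=lambda item: item.published, reverse=True)

    def by_source(self):
        """Articles under this label, ordered by source name."""
        return sorted(self.items, key=lambda item: item.source)

    def mark_all_read(self):
        """Mark the entire label as read to clear it from the feed."""
        for item in self.items:
            item.read = True


# Made-up sample data, for illustration only.
label = Label("Amazon launches Fire Phone", [
    FeedItem("Fire Phone: what we know", "Source B", datetime(2014, 6, 18, 10, 0)),
    FeedItem("Hands-on with the Fire Phone", "Source A", datetime(2014, 6, 18, 12, 0)),
])
print(label.name, label.unread_count())
```

Keeping the read flag on each article rather than on the label means individual articles can still be marked read one at a time, which fits with other features being built on top of the same structure.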

Other existing features (if any) can still be built on top of this structure, as the feeds are merely categorized under a label and remain available. The UI would not be drastically different from other apps that already use folders, filters, streams etc. This means that existing apps can easily adopt this approach as added functionality without much UI change.