26.03.2014 7:29

How to predict the popularity of content in social networks

Want as many people as possible to share your pictures on Facebook? Science can help!

One of the main attractions of social networks is the ability to share your favorite pictures, photos, videos, and texts with your friends or with anyone who happens to drop by your profile. Some content always ends up more popular than the rest, which leads to a cascade effect, where more and more people share that particular photo or video.

It is a safe bet that a great many people would like to be able to predict what will become popular. Does that seem impossible, because too many hard-to-measure factors would have to be taken into account, such as the nature of the content and the connections between people?

Examples of how cascades spread. Yes, these are the bizarre paths your photos take across social networks.

Yet studies claiming to have found such a method appear regularly. Typically, right after a photo is published, they measure public interest over a short period and extrapolate from it to predict the content's future popularity. Clearly, this method is somewhat abstract and looks more like collecting statistics.

Justin Cheng of Stanford University and his colleagues at Cornell University and Facebook (all in the USA) proposed looking at the problem in a new way. The researchers showed why it is so hard to predict popularity by studying the early stages of a publication. At certain stages, however, the "popularity cascade" can indeed be predicted with surprising accuracy, and those data can be used to estimate the publication's future.

Mr. Cheng came to this conclusion by analyzing how photos were reshared on Facebook within 28 days of their initial posting in June 2013. The team tracked 150 thousand photos that other users had shared more than 9,000,000 times. For each photo they identified the people (nodes) who shared it and the time of each reshare after the original publication, which made it possible to build a network of how the content spread.
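
As a rough illustration of how such a distribution network could be assembled, here is a minimal Python sketch. The record format (who reshared, from whom, and how many hours after the original post) and all function names are assumptions made for this example, not the study's actual data format or code.

```python
from collections import defaultdict

def build_cascade(reshares):
    """Build a cascade tree from reshare records.

    Each record is (user, source, hours): `user` reshared the photo from
    `source`, `hours` after the original post. This record format is a
    hypothetical one chosen for illustration.
    """
    children = defaultdict(list)   # source -> users who reshared from it
    shared_at = {}                 # user -> time of that user's reshare
    for user, source, hours in sorted(reshares, key=lambda r: r[2]):
        children[source].append(user)
        shared_at[user] = hours
    return dict(children), shared_at

def depth(children, root):
    """Length of the longest reshare chain starting at `root`."""
    return 1 + max((depth(children, c) for c in children.get(root, [])),
                   default=-1)
```

Here the original poster is the root of the tree, each reshare adds an edge from the source to the resharer, and `depth` measures how far the photo travelled from the original post.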

Until now, researchers tracked how, for example, an already popular video had spread, and then tried to reproduce the same sequence of events with other content. The results were ... mixed.

Mr. Cheng and company took a different approach. They took a photo that had already been shared a certain number of times and estimated the probability that it would go on to be shared twice as many times. In other words, the task is to predict whether the photo's cascade will double in size.


Justin Cheng, a young face of modern science (photo courtesy of Justin Cheng).

The scientists did not choose this method by chance: the growth of cascades obeys a certain law. Half of the cascades of a given size will double, while the other half will not. So a random guess is correct in half of the cases.
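
The doubling task and its coin-flip baseline can be illustrated with a short Python sketch. The function names and the sample cascade sizes are made up for this example and are not data from the study.

```python
def doubling_labels(final_sizes, k):
    """For cascades that reached at least k shares, label whether each
    one went on to reach at least 2 * k shares."""
    observed = [size for size in final_sizes if size >= k]
    return [size >= 2 * k for size in observed]

def baseline_accuracy(labels):
    """Accuracy of always guessing the more common outcome."""
    if not labels:
        raise ValueError("no cascades reached the starting size")
    doubled = sum(labels)
    return max(doubled, len(labels) - doubled) / len(labels)

# Made-up final sizes: three cascades stall below 10 shares, three double.
labels = doubling_labels([5, 7, 9, 10, 14, 30], k=5)
print(baseline_accuracy(labels))
```

When the classes are split evenly, as in this toy sample, no constant guess can do better than 50%, so any accuracy above that has to come from real information about the cascade.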

Clearly, half is not a great result for a predictor. So the question is how to improve on it with the help of machine learning, or artificial intelligence (AI). Justin Cheng and his teammates used part of the data they had collected manually to train the AI and improve the prediction step by step. Whether the picture is a close-up, whether it shows a street or a person, whether it carries a caption, how many people shared the original image, how fast it spreads: all of this matters in determining the future shape of the cascade. Oh yes, these shapes can vary: the simplest is a star, in which popularity gradually fades as you move out along the rays ...
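
To make the idea of such inputs concrete, here is a minimal Python sketch that turns the first few reshares of a photo into a handful of features. The record format, the feature names, and the function itself are hypothetical and chosen only for illustration; the study's actual feature set is not reproduced here.

```python
def early_features(events, root, k, has_caption):
    """Compute simple features from the first k reshares of one photo.

    `events` is a list of (user, source, hours) records, and `root` is
    the original poster. Both the format and the feature names are
    assumptions made for this example.
    """
    first = sorted(events, key=lambda e: e[2])[:k]
    if len(first) < k:
        raise ValueError("cascade has fewer than k reshares")
    hours_to_k = first[-1][2]
    return {
        # content feature: does the picture carry a caption?
        "has_caption": has_caption,
        # reshares taken directly from the original post
        "direct_reshares": sum(1 for _, source, _ in first if source == root),
        # hours needed to collect the first k reshares
        "hours_to_k": hours_to_k,
        # average reshares per hour over that window
        "reshares_per_hour": k / hours_to_k if hours_to_k > 0 else float("inf"),
    }
```

A dictionary like this, computed for every photo, is the kind of input a learning algorithm could be trained on to predict whether a cascade will double.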

Once the scientists had trained the AI on the various data, it was time to put it to the test. They started small, taking images that had been shared by five people at the start of the test. The task was to predict whether these images would be shared 10 times or more. This turned out to be quite predictable: the algorithm was correct in 79.5% of cases.

However, different characteristics of the cascade are predicted with varying accuracy. The AI does best with the speed of propagation. But then why use AI at all, when any heavy Internet user will tell you the same thing: the faster something spreads at the start, the more likely its spread is to accelerate?

Justin Cheng notes that forecast accuracy also depends on the initial number of reshares. This, too, is understandable: more information is always better, and the more people have shared the picture, the better the forecast. According to the scientists, this is why previous studies failed: they started with too little data.

Of course, one can point to some limitations of this work, since it used only Facebook data and only photographs. It may well be that Twitter users, for example, behave differently, and that videos spread differently from pictures, not to mention ordinary links to, say, "Kompyulenta" articles.

Justin Cheng and his colleagues do not claim that their study is fundamental, but they believe it will help other scientists. "Despite the limited results, we believe that the work gives a general idea that will be useful in the future," says Mr. Cheng.