Guest Writer Spotlight: Fake news can’t fool new algorithm

From Holly Ober, University of California, Riverside/UCRToday

RIVERSIDE, Calif. (www.ucr.edu) — A University of California, Riverside, computer scientist has received reinforcements in his battle against fake news.

Snap Research, the research division of Snap, Inc., has made a $7,000 donation for Evangelos Papalexakis, an assistant professor of computer science and engineering in the Bourns College of Engineering, to continue improving an algorithm that can already detect fake news stories with 75 percent accuracy.

A new algorithm puts news articles into “data cubes,” breaks them down into clusters of data, and links articles that are similar across different contexts to put them in categories and determine if they are fake news.

The gift formalizes an ongoing project between Papalexakis’ Multi-Aspect Data Lab and Snap Research scientist Neil Shah to create an automated algorithm that sorts news stories into categories based on clusters of words and contextual information and flags them as potentially fake news. The algorithm could be used by social media platforms to help users make more informed decisions about the news they click on and share.

Most attempts to automate fake news detection rely on locating particular words, identifying URLs, or consulting fact-checking websites such as Snopes. All require human input and evaluation.

Most research to date has focused on carefully handcrafted features that predict an article’s legitimacy. The methods require specialists to extract those features and depend on a large library of examples already labeled as fake news.

Papalexakis and Shah start with the hypothesis that news articles appearing frequently near each other across a wide variety of contexts are more likely to belong to the same category.

Papalexakis’ group developed a two-tiered algorithm using a method called “tensor decomposition” that exploits articles’ structure to avoid reliance on human expertise. Tensors are multi-dimensional cubes. They excel at modeling and analyzing data with many different components, called multi-aspect data.

For instance, in online social networks, people interact with one another in a variety of ways: they message each other, they post on one another’s pages, and so on. All these interactions are parts of the same social network, and can be modeled as a tensor “data cube” composed of person, person, and means of interaction.
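As a rough illustration of the “data cube” idea, the sketch below builds a small person-by-person-by-interaction tensor with NumPy. The names, channels, and events are made up for the example and do not come from the researchers’ data.

```python
import numpy as np

people = ["ana", "ben", "cal"]          # hypothetical users
channels = ["message", "wall_post"]     # hypothetical means of interaction

# (sender, receiver, channel) events; invented data for illustration only
events = [
    ("ana", "ben", "message"),
    ("ana", "ben", "message"),
    ("ben", "cal", "wall_post"),
    ("cal", "ana", "message"),
]

p_idx = {name: i for i, name in enumerate(people)}
c_idx = {name: i for i, name in enumerate(channels)}

# tensor[i, j, k] = number of times person i contacted person j via channel k
tensor = np.zeros((len(people), len(people), len(channels)), dtype=int)
for sender, receiver, channel in events:
    tensor[p_idx[sender], p_idx[receiver], c_idx[channel]] += 1
```

Each slice along the last axis is an ordinary person-by-person table for one kind of interaction; stacking the slices is what turns the tables into a cube.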

The researchers use a tensor to model the content of the article and map words spatially within the article. For each article they count how many times two particular words occur within a window of five to 10 words.
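The counting step can be sketched in a few lines of Python. This is a minimal sketch, not the researchers’ code: the function name `cooccurrence_counts`, the whitespace tokenization, and the choice to count unordered pairs are all assumptions made for the example.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=5):
    """Count unordered word pairs that appear within `window` tokens of each other."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at most `window` positions from token i.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts

# Invented sentence standing in for an article
article = "the senator denied the report and the report was retracted".split()
counts = cooccurrence_counts(article, window=5)
```

One plausible way to form the tensor, again an assumption here, is to treat each article’s counts as a word-by-word table and stack the tables for all articles along a third axis.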

Tensor decomposition uncovers patterns by breaking the tensor into elementary pieces of data, each one representing a pattern, or topic. The group’s previous work showed that these topics successfully cluster misinforming articles.
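The article does not say which decomposition the group uses. A common choice is the CP (CANDECOMP/PARAFAC) decomposition fitted by alternating least squares, and the sketch below assumes that technique. The function names, the rank, and the random test tensor are all illustrative.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])

def cp_als(tensor, rank, n_iter=200, seed=0):
    """Rank-`rank` CP decomposition of a 3-way array by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Unfold along `mode`: rows index that mode, columns the other two.
            unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
            kr = khatri_rao(others[0], others[1])
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Least-squares update for this mode with the other two held fixed.
            factors[mode] = unfolded @ kr @ np.linalg.pinv(gram)
    return factors

# Synthetic tensor built from two hidden "topics" (made-up data)
rng = np.random.default_rng(1)
true = [rng.random((n, 2)) for n in (6, 6, 4)]
tensor = np.einsum("ir,jr,kr->ijk", *true)

a, b, c = cp_als(tensor, rank=2)
approx = np.einsum("ir,jr,kr->ijk", a, b, c)
rel_error = np.linalg.norm(tensor - approx) / np.linalg.norm(tensor)
```

Each column of the three factor matrices corresponds to one pattern. If the third axis indexes articles, each row of `c` gives one article’s weight on every pattern, which can serve as a compact description of that article.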

In the first tier of the algorithm, tensor decomposition represents the data compactly in a space that brings possibly fake articles close together. The second tier builds a graph that connects two articles if they are close to each other in the space computed by the tensor decomposition.
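The article does not define “close.” One simple reading, assumed here, is to link each article to its k nearest neighbors by Euclidean distance. The function name `knn_graph`, the value of k, and the four sample points are hypothetical.

```python
import numpy as np

def knn_graph(embeddings, k=2):
    """Return a symmetric 0/1 adjacency matrix linking each row to its k nearest rows."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dist, np.inf)             # an article is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros(dist.shape, dtype=int)
    rows = np.repeat(np.arange(len(embeddings)), k)
    adj[rows, nearest.ravel()] = 1
    return np.maximum(adj, adj.T)              # keep an edge if either side chose it

# Four made-up article embeddings forming two tight pairs
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
adj = knn_graph(embeddings, k=1)
```

With these points, the only edges join the two articles within each pair, so the graph splits into two small groups.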

Next, “semi-supervised” machine learning is applied to the resulting graph of articles. The method requires a small base of articles labeled by people, from which it learns and sorts other articles, but it requires far fewer human-annotated articles than current methods.
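The article does not name the semi-supervised method. To show the general idea, the sketch below swaps in plain label propagation, in which each unlabeled article repeatedly takes the average score of its neighbors while the human-labeled articles stay fixed. The label coding (+1 fake, -1 real, 0 unknown) and the six-article graph are assumptions for the example.

```python
import numpy as np

def propagate_labels(adj, labels, n_iter=50):
    """Spread known labels (+1 fake, -1 real, 0 unknown) over graph neighbors."""
    known = labels != 0
    scores = labels.astype(float)
    degree = np.maximum(adj.sum(axis=1), 1)    # avoid dividing by zero for isolated nodes
    for _ in range(n_iter):
        scores = adj @ scores / degree         # each node takes the mean of its neighbors
        scores[known] = labels[known]          # clamp the human-labeled articles
    return np.sign(scores).astype(int)         # 0 means the article stays undecided

# Two made-up groups of three mutually linked articles, one label per group
adj = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
labels = np.array([1, 0, 0, -1, 0, 0])
predicted = propagate_labels(adj, labels)
```

Here two human labels are enough to sort all six articles, which is the appeal of a semi-supervised approach: the graph structure does most of the work.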

The team members put three sets of articles (two public datasets and their own collection of 63,000 news articles) through their algorithm and found that it accurately sorted articles into fake news categories 75 percent of the time. The result compares favorably to approaches that require a large number of human-labeled articles.

The gift from Snap Research enhances the team’s efforts to develop more robust and eventually fully automated techniques for identifying misinformation.

Social media companies could use the finished algorithm to filter misinformation out of user newsfeeds. Papalexakis would prefer that social media platforms flag articles rather than omit them, so that users can make more informed decisions about which articles to read and share.

Tensor decomposition is a major research thrust at the Multi-Aspect Data Lab, which has received funding from Naval Sea Systems Command, Naval Engineering Education Consortium, the National Science Foundation, and Adobe.

In addition to Papalexakis and Shah, UC Riverside computer science master’s student Gisel Bastidas Guacho and doctoral student Sara Abdali are working on the project. The group’s latest paper, “Semi-supervised Content-based Detection of Misinformation via Tensor Embeddings,” is available on the arXiv.org pre-print server and will appear at the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining in August this year, in Barcelona, Spain.

 


Trevor Montgomery, 47, recently moved to the Intermountain area of Shasta County from Riverside County and runs Riverside County News Source and Shasta County News Source. Additionally, he writes for several other news organizations, including Riverside County based newspapers, Valley News, The Valley Chronicle and Anza Valley Outlook, as well as Bonsall/Fallbrook Village News in San Diego County and The Mountain Echo in Shasta County.

Trevor spent 10 years in the U.S. Army as an Orthopedic Specialist before joining the Riverside County Sheriff’s Department in 1998. He was medically retired after losing his leg, breaking his back and suffering both spinal cord and brain injuries in an off-duty accident.

During his time with the sheriff’s department, Trevor worked at several different stations, including Robert Presley Detention Center, Southwest Station in Temecula, Hemet/Valle Vista Station, Ben Clark Public Safety Training Center and Lake Elsinore Station, along with other locations.

Trevor’s assignments included Corrections, Patrol, DUI Enforcement, Boat and Personal Water-Craft based Lake Patrol, Off-Road Vehicle Enforcement, Problem Oriented Policing Team and Personnel/Background Investigations. He finished his career while working as a Sex Crimes and Child Abuse Investigator and was a court-designated expert in child abuse and child sex-related crimes.

Trevor has been married for more than 27 years and was a foster parent to more than 60 children over 13 years. He is now an adoptive parent and his “fluid family” boasts 13 children and 14 – soon to be 16 – grandchildren.