T&S #1: A Case of Spam in UGVC
The Significance of User Generated Video Content Platforms
Two things that we as people value are {social connections} and {entertainment & education}. As mobile, personal internet communicator devices¹ and mobile, personal internet connectivity have become more affordable, user generated video content (UGVC) platforms offer people a unified experience that brings {social connections} and {entertainment & education} together in a single, accessible, and convenient moment.
Three platforms deliver this delightful experience best: YouTube, Instagram and TikTok. As of the end of calendar year 2024, each of these platforms has over 1 billion users and over 25 billion USD in advertising revenue².
The Attractiveness for Spammers, Scammers and Fraudsters
However, these delightful experiences also invite the focused efforts of spammers, scammers and fraudsters due to these 3 platform properties:
- A concentrated ecosystem of personally sensitive information due to the social connections
- A massively scaled ecosystem due to the billions of users and their activities
- A valuable ecosystem due to the billions of USD in advertising revenue
The existence of these actors in the ecosystem poses a threat to a north-star situation where “content becomes community”.
The Deterrence of Spammers, Scammers and Fraudsters
To deter these undesirable behaviors, each of the 3 platforms defines policies and enforces those policies through computer-program-driven approaches (e.g. classifiers) and/or people-driven approaches (e.g. investigations). Focusing on the policies aspect, each of these platforms communicates its policies through publicly available community guidelines³. In this sentence are links to YouTube’s, Instagram’s and TikTok’s community guidelines.
Community guidelines serve to define what is allowable, what is restricted, and what is not allowed. For these UGVC platforms, the guidelines define policies related to (1) the video content, (2) the interactions and activities associated with that video content, and (3) some properties related to the video author.
The platforms’ community guidelines share commonly covered areas of concern, which include, but are not limited to, violent or dangerous content, regulated goods and services, inauthentic behaviors, and deceptive behaviors. Focusing on inauthentic and deceptive behaviors, one common issue across these platforms is spam. In this sentence are links to YouTube’s, Instagram’s and TikTok’s spam-specific guidelines.
A Basis for Spam Identification
To start, one needs a basis for identifying spam. Spam can present the following characteristics:
- Inauthentic
- Deceptive
- Repetitive
- Promotional
In addition to these characteristics, one needs to raise confidence in the likelihood that something is spam, and this can be done with a UGVC spam evaluation framework which assesses:
- Content
- Interactions and Activities
- Account
Before reviewing an example, readers of this article should know that the vast majority of content, interactions and activities, and accounts on any of these 3 UGVC platforms come from real people seeking to add to and/or enjoy the delightful experience of each platform. With this in mind, each example reviewed is labeled “likely spam” when the analysis yields a higher level of confidence, “possibly spam” when it shows some spam-like behavior, “likely not spam”, or “not spam”.
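To make the framework easier to apply consistently, an analyst could record each observation as a small structured entry. The following is a minimal sketch, not a platform's actual tooling: the names `Dimension`, `Answer`, `Finding` and `summarize` are hypothetical and introduced here only for illustration, and the sample questions are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    # The three dimensions of the evaluation framework described above.
    CONTENT = "Content"
    INTERACTIONS_AND_ACTIVITIES = "Interactions and Activities"
    ACCOUNT = "Account"


class Answer(Enum):
    # "Unknown" is kept explicit: an outside observer often cannot tell.
    YES = "Yes"
    NO = "No"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class Finding:
    dimension: Dimension
    question: str
    answer: Answer
    reason: str = ""


def summarize(findings):
    """Count Yes/No/Unknown answers per dimension."""
    summary = {d: {a: 0 for a in Answer} for d in Dimension}
    for finding in findings:
        summary[finding.dimension][finding.answer] += 1
    return summary


# Illustrative (made-up) observations for a single comment.
findings = [
    Finding(Dimension.CONTENT, "Does the comment divert from the video topic?", Answer.YES),
    Finding(Dimension.CONTENT, "Does the comment point off the platform?", Answer.NO),
    Finding(Dimension.ACCOUNT, "Does the username resemble others in the thread?", Answer.UNKNOWN),
]
summary = summarize(findings)
```

Keeping “Unknown” as its own answer, rather than folding it into “No”, preserves the difference between evidence of absence and absence of evidence.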
An example of spam in the comments section
Let’s assume the following:
- We are using UGVC platform U
- We come upon a post P on U by video author A
- We click into P because we are interested in A’s viewpoints on the world’s economic landscape recorded in the video content V
- We see that P also comes with comments section C, and we want to find community members who agree and/or disagree with our opinions
- We scroll through C, and we find two very similar looking threads CT1 and CT2
- We wonder, “Oh no, is this spam?”
Comments thread #1 (CT1)
Comments thread #2 (CT2)
Reviewing CT1, let’s apply the UGVC spam evaluation framework:
- Content
- Interactions and Activities
- Account
Here is a walkthrough of how the evaluation framework could be applied, beginning with Comment #1 in CT1.
| Framework Dimension | Dimension Consideration | Evaluation Outcome |
|---|---|---|
| Content | Does the comment topic divert from the content topic? | Yes, because although economics and finance are related, the first comment in the thread is about an individual’s finances and does not tie back to the video’s topic of the economic landscape of the world. |
| | Does the comment direct other users to another piece of content? | No |
| | … to another author’s contact information? | No |
| | … to another author’s channel? | No |
| | … off the platform? | No |
| Interactions and Activities | Does the comment present duplicated text from the same comment author’s own comments? | Unknown |
| | … from a different comment author’s comments? | Unknown |
| | Does the comment present nearly duplicated text from the same comment author’s own comments? | Unknown |
| | … from a different comment author’s comments? | Yes, because a comparison between comments threads #1 and #2 reveals that the first comments in both threads are nearly identical. |
| | Does the comment have more engagement (in the form of likes, replies or other publicly available counts) as compared to other comments? | Yes, because a comparison between comments threads #1 and #2 reveals that the first comments in both threads are nearly identical. |
| | … and does the engagement have short time differences in between interactions? | Unknown |
| | Does the comment have replies which direct other users to another piece of content? | Yes, because several comments mention a <named individual>, and at least one of those comments suggests visiting <named individual’s> website. |
| Account | Does the comment author’s username look similar to other usernames in the same thread in the same video? | Unknown |
| | Does the comment author’s username look similar to other usernames in a different thread in the same video? | Unknown |
| | Does the comment author’s username look similar to other usernames in a different thread in a different video? | Unknown |
| | Does the comment author have a recent account creation timestamp in comparison to the video’s publication timestamp? | No |
| | Does the comment author have a recent account creation timestamp in comparison to the accounts which interacted with the comment author’s post? | No |
| | Does the comment author have a profile picture? | No |
| | … and does that profile picture have matches on Google Image Search results? | Unknown |
| | … and do the matches link to reported cases of spam or scam or fraud? | Unknown |
| | … or do the matches link to 2 or more unique identities? | Unknown |
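The duplicated and nearly duplicated text questions in the table can be approximated in code. The sketch below uses Python's standard-library `difflib.SequenceMatcher`. The helper names, the 0.8 threshold and the sample comments are all assumptions made for illustration. They are not taken from the threads above or from any platform's detection system.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two normalized comments."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify_pair(a: str, b: str, near_threshold: float = 0.8) -> str:
    """Label a pair of comments as duplicate, near-duplicate, or distinct."""
    if normalize(a) == normalize(b):
        return "duplicate"
    if similarity(a, b) >= near_threshold:
        return "near-duplicate"
    return "distinct"


# Made-up sample comments, not quotes from CT1 or CT2.
first = "I lost my savings last year, but my advisor helped me recover."
second = "I lost my savings last year but my adviser helped me recover!!"
unrelated = "Inflation will fall next quarter."

pair_label = classify_pair(first, second)
other_label = classify_pair(first, unrelated)
```

Character-level comparison is only one possible choice. It catches light edits such as swapped spellings or added punctuation, but it would miss paraphrased text that keeps the meaning while changing the wording.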
Applying the evaluation framework to these threads leads to a conclusion that they exhibit multiple characteristics commonly associated with spam. Applying a specific conclusion such as “Likely spam”, “Possibly spam”, “Likely not spam” and/or “Not spam” would be up to an individual analyst’s own weighting of each dimension/characteristic combination.
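One way an analyst might make that weighting explicit is a weighted share of spam-like answers. This is a sketch under stated assumptions: the per-dimension weights, the thresholds and the function names are hypothetical, and a real analyst could reasonably choose different ones. Each finding is reduced to a signal of `True` (spam-like), `False` (not spam-like) or `None` (unknown).

```python
# Hypothetical weights per framework dimension; an analyst would tune these.
WEIGHTS = {
    "Content": 1.0,
    "Interactions and Activities": 2.0,
    "Account": 1.5,
}


def spam_score(findings):
    """Weighted share of known answers that are spam-like.

    `findings` is an iterable of (dimension, signal) pairs. Unknown
    signals (None) are skipped. Returns None if nothing is known.
    """
    total = 0.0
    spam_like = 0.0
    for dimension, signal in findings:
        if signal is None:
            continue
        weight = WEIGHTS[dimension]
        total += weight
        if signal:
            spam_like += weight
    return spam_like / total if total else None


def verdict(score: float) -> str:
    """Map a 0.0-1.0 score to one of the four labels (assumed thresholds)."""
    if score >= 0.75:
        return "Likely spam"
    if score >= 0.5:
        return "Possibly spam"
    if score >= 0.25:
        return "Likely not spam"
    return "Not spam"


# Made-up findings for illustration only.
example = [
    ("Content", True),
    ("Content", False),
    ("Interactions and Activities", True),
    ("Account", None),
    ("Account", False),
]
score = spam_score(example)
label = verdict(score)
```

Skipping unknown answers keeps missing information from counting for or against a comment. The trade-off is that a score built from only one or two known answers can look more certain than it is.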
What comes next after this blogpost?
After walking through one application of the UGVC spam evaluation framework, one may begin to scroll other UGVC posts and look for potential cases of spam. After looking at enough cases, one may wonder, “Why are there ‘so many’⁴ uncaught cases of spam? I thought <UGVC platform> was a software company? If I can see this is spam, AI(-powered computer programs) should be able to see it too! <UGVC platform> are a bunch of <derogatory term to comment on a collection of people’s capability and integrity>.”
To address this possible curiosity, moving forward, the following blogposts will further focus on spam in UGVC platforms.
- T&S #2: A Rationale for Uncaught Spam in UGVC
- T&S #3: A Person-driven Evaluation Framework for UGVC Spam
- T&S #4: A Computer-program-driven Evaluation Framework for UGVC Spam
- T&S #5: More Examples of Uncaught Spam on UGVC
- T&S #6: Open Source Projects for Detecting Spam in UGVC
- T&S #7: An Overview of Companies Detecting Spam in UGVC