
Depolarizing Polarity: Data Mining Shared Likes on Twitter to Uncover Political Gateway Groups

September 23, 2020
(*Please email us at for more information or to get access to the full article.)


Abstract: This project applies a new theory in the field of intergroup conflict known as "Gateway group theory," which posits that to decrease conflict between two groups, a third group with specific characteristics that appeal to both sides needs to be identified, enabling it to act as a medium. This group is known as a "Gateway group." Against the background of the bitter digital divide and echo chambers plaguing the United States’ current political discourse, this paper sought to find the Gateway group between polar Democrats and Republicans on Twitter. The project data mined and examined the shared “likes” of these two populations using originally developed code and definitional parameters. The study then analyzed the profiles of the authors of these liked Tweets to compile an aggregated Gateway group profile that can be used to find Gateway group individuals on Twitter who have the ability to decrease conflict between Democrats and Republicans. The study found that Gateway group members exist and that they are, by and large, Moderate Democrats. Every post that was liked by both a Democrat and a Republican was also tagged and analyzed for similarities in content. It was found that 55% of all posts referenced “Trump” and 92% of those posts had a negative sentiment. Additional similarities in content were found, for example a keen interest in elections and certain Democratic candidates. This project develops an effective methodology that can be applied to any conflict on Twitter to find the Gateway group for that conflict and decrease polarity between polar groups.

Keywords: Gateway group theory, Democrat and Republican, political discourse, Twitter

I. Introduction

In 2016, The New York Times opinion columnist Lee Drutman penned an op-ed titled “The Divided States of America.” He commented on the fact that local elections were blowouts for one party or another, and that instead of being one two-party nation, America had devolved into two one-party nations. Current elected officials are more polar than they have been in decades because voter bases are more polar. In September of 2017, eight months into the Trump presidency, the Pew Research Center released a report which found that 75% of Republicans had negative views of Democrats and 70% of Democrats had negative views of Republicans. This was a large increase from the mid-1990s, when about 20% of the members of each party held unfavorable views of the other. These sentiments came to a boil during the confirmation hearing of Brett Kavanaugh to the Supreme Court. According to a CNN national survey, 91% of Democrats opposed Kavanaugh’s confirmation while 89% of Republicans supported it. This project attempts to find a middle ground between the two groups -- perhaps a way back to “We” the People -- using applications of social psychology and computer science. All conflict is a result of two or more groups disagreeing with each other on a fact, principle, or idea. The more divided the groups, the greater the conflict. One of the guiding theories in intergroup contact and conflict is the Common Ingroup Identity Model. Broadly defined, an ingroup consists of members with similar beliefs, and an outgroup is a group with a conflicting ideology. At its core, the model states that if members of conflicting groups would think of themselves as belonging to a single larger group with shared values and perceptions, they would have more positive beliefs, feelings, and behaviors toward one another.
The key notion and guiding principle of the Common Ingroup Identity Model is that successfully integrating ingroup and outgroup members into one group through a shared identity can reduce feelings of racism or friction between the two groups (Gaertner and Dovidio, 2012). One study found that if individuals faced certain similar stressors, they were more likely to develop “we feelings,” which caused them to act as a single common group (Dovidio and Morris, 1975). The Common Ingroup Identity Model, and the theory that increased contact leads to a decrease in prejudice, has been quantified. Pettigrew and Tropp (2006) conducted a meta-analysis of 515 studies from 38 countries and found that 94% of them showed this negative correlation. Studies in Italy, Germany, Northern Ireland and the U.S. demonstrate that simply having ingroup friends who have outgroup friends can diminish prejudice (Pettigrew et al., 2011). This is interesting because it indicates that even indirect contact can have an effect on the polarization of two groups, an idea that will be explored further in this paper. The key idea of the Common Ingroup Identity Model is that the characteristics that ingroup members use to connect with other ingroup members need to be expanded to the outgroup to create a connection between the two and form a larger unitary group (Gaertner et al., 1993). Thus, more positive beliefs, feelings, and behaviors, which are usually reserved for ingroup members, are extended or redirected to former outgroup members because of their recategorized ingroup status. Gaertner et al. (1993) add that recategorization consequently changes the intergroup dynamic from an “us” versus “them” orientation to a more inclusive “we”. Once people regard former outgroup members as ingroup members, the conflict can start to diminish and even be resolved because the behavior of the two groups changes and they become, in effect, one large ingroup (Gaertner and Dovidio, 2012).
This last finding has been reexamined in recent years and has evolved into the Gateway Group theory. Gateway groups are members of both an ingroup and an outgroup. They are defined as groups that can be characterized by “unique social categorizations that enables them to be categorized as and identified with more than one group in the context of intergroup relations” (Levy et al., 2017). In other words, they share key characteristics of each group, and recent research suggests their existence and role can be crucial. Having multiple identities could bridge the gap between two completely separate groups without shared identities. In the United States, for example, the rise of a multiracial (African American and Caucasian) group poses the interesting question of whether their existence can help repair race relations (Levy et al., 2017). Gateway groups can exist in multiple forms and capacities. Currently, the most important and relevant work on Gateway groups explores the issue of dual identity. A dual-identity group is a subgroup of a population that also identifies with another group (Dovidio et al., 2009). In the context of intergroup conflict, people with dual identities share characteristics with both an ingroup and an outgroup. Combined with the Common Ingroup Identity Model, the existence of common groups could imply that those people could be used to create common identities while still preserving two distinct groups. In other words, a Gateway group could bring groups together to decrease conflict, but the additional step of creating one “we” group wouldn’t be necessary. The most important part of Gateway Group theory is that Gateway groups have been shown to decrease conflict. According to research by Hornsey and Hogg (2000), different groups with different identities have more bias towards one another.
However, upon the introduction of Gateway groups, the amount of intergroup bias between the two groups actually decreased because the two groups felt more related, which supports the Common Ingroup Identity Model. The majority of research so far has been theoretical, as this theory is relatively new. Moreover, when Gateway groups have been involved in experimentation, they have been defined before the experiment. This paper will take the opposite approach. First, it will define the two ingroups, and then it will examine who the Gateway group is and what its defining characteristics are. There is another key difference between past explorations of Gateway groups and the focus of this paper, and it relates to the arena of the conflict. Previous studies involving Gateway groups have only examined the introduction of Gateway groups into real-world, corporeal conflicts. However, conflict has increasingly been migrating from the physical world to the virtual. Platforms like Facebook and Twitter have been the battlegrounds of current ideological wars, so that is where this research chooses to focus. With large amounts of data available, and many recent studies into the use of data to gather information about people, social media is an optimal source of information for this paper. Several recent studies have deduced traits and characteristics of people in the physical world from their online behavior: their likes and follows. For example, an analysis of people's “likes” on social media predicted certain characteristics accurately. One study correctly identified homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and Democrats and Republicans in 85% of cases, all based on a thumbs-up “like” (Kosinski et al., 2013).
Other factors that could be identified, with relative degrees of accuracy, were Openness, Agreeableness, Extroversion, and Density of Friendship Network (on Facebook). On Twitter, one way of determining political polarity is by comparing the number of Democrats versus the number of Republicans that users follow (Demszky et al., 2019). The analysis of likes and tweets demonstrates the relative ease with which researchers can create profiles of random users with a high degree of accuracy. So, extrapolating from that, how would one define an ingroup online? On social media, people with similar views tend to comment on and share the same content. They serve as amplifiers for the same viewpoints, forming, in essence, an echo chamber. Another term for this would be an ingroup. Bessi (2016) argues that these echo chambers are problematic because discussion with like-minded peers only increases polarization towards an outgroup (Zollo et al., 2015). His key finding was that users would undergo a positive selection bias, by which they partook in groups that were aligned with their own beliefs, joining polarized virtual communities in the process. Additionally, in these insular bubbles, blatantly or deliberately false information is received as fact. This can be extremely dangerous: Bessi found that as ideas became more conspiratorial and radical, more people shared them in the same social media group, and more of them were taken as fact. The online world has created some of the most rigidly defined polar ingroups and outgroups. In the context of this study, the echo chambers will be formed by two conflicting polar political groups who have negative attitudes towards each other on Twitter. The overall hypothesis of this study is that by analyzing the shared Tweet likes of these two groups, a profile of an online Gateway group on Twitter can be synthesized and used to decrease the effect of the online political echo chamber in the future.

II. Methods

This analysis will answer two different questions that have not been scientifically answered in the realm of intergroup conflict studies. The target population is a group of Democrats and Republicans on the social media platform Twitter. The end goal is to apply social psychology -- specifically Gateway Group Theory as defined below -- to identify what factors contribute to a decrease in the effect of political echo chambers on social media and beyond. A new theory in the intergroup contact branch of social psychology called Gateway Group (GG) Theory modifies the received notion that polarization could only be decreased through direct contact between the two groups. GGs are people with connections to both groups, and GG theory states that their mere existence leads to decreased polarization. Given that this is a relatively new theory, the few previous experimental applications looked at a specific predefined GG and explored its effects on the target populations. This experiment differs in two key ways. First, it explores GGs in social media and determines whether their effects in the digital world parallel those in the physical world; and second, its goal is to define the GG based on parameters derived from social media activity (Twitter). This contrasts with previous studies, which started out knowing what their GG was and then measured its effects on a target population. This experiment will have four separate parts. The first is data collection round one, the second is analysis, the third is data collection round two, and the fourth is overall analysis. Every step of this project was completed with originally developed Python code created specifically for this research. The online setting will be preserved, as it is crucial that the observed natural interactions are untampered. A prerequisite to data collection is the definition of operational variables. A random sampling of Twitter users will be taken using the Twitter API.
The users will be selected based on the number of current members of the 116th United States Congress they follow. Users who follow more than 25 Congresspeople will be included in the set. To determine their level of political polarity, the number of politicians they follow from one party will be divided by the number they follow from the other to determine the net polarity. For example, if a person follows ten Democrats but only two Republicans, they will have a net ratio of 5 Dem. This is important because it sets the context for the rest of the study, as all users must be labeled beforehand. The next part of the study analyzes shared likes. The data set will be divided into quintiles based on users’ polarity ratios. The fifth quintile will have the highest “Democrat” ratio, or the most polar Democrats, and the first quintile will have the highest “Republican” ratio, or the most polar Republicans. Using this, the last 200 likes of 400 users in the first quintile will be compared to the last 200 likes of 400 users in the fifth quintile. A list of “political” terms was created in order to filter tweet content so that only political tweets are evaluated. Political terms include (but are not limited to): “impeach”, “vote”, “President”, “immigration”, and “election”. Every time users from both quintiles like the same post, the post and the user who posted it will be recorded. The author's profile is treated as a Gateway person because it attracted likes from opposite ends of the spectrum. First, the profiles will be collected using the Twitter API, and their important characteristics will be put into the RapidMiner TurboPrep and Auto Modeler software, which will determine the most prevalent characteristics of a Gateway group. The characteristics that will be evaluated are user location, how many followers the user has, how many friends the user has, the number of Tweets the user has liked, and how many times the user has Tweeted or retweeted.
Second, the posts that receive likes from both Democrats and Republicans will be tagged and matched against the list of political words, most of which are topic areas (e.g., elections, immigration, impeach, metoo), to see what subjects appear most often. Additionally, the profiles of the Gateway groups will be evaluated in the same manner in which the original Democrats and Republicans were classified -- by using the politicians they followed as the metric for determining polarity. The purpose of this is to better understand who these people are in the context of Twitter.
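The selection steps described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual code: the function names, the data shapes, the clamping of a zero follow count to 1, the treatment of equal counts as neutral, and the use of simple substring matching are all hypothetical choices made for the example, and the term list contains only the five terms named in the text.

```python
def net_polarity(dem_follows, rep_follows):
    """Signed follow ratio: positive leans Democrat, negative leans Republican.

    Assumption: the smaller count is clamped to 1 so that a user who follows
    nobody from one party does not cause a division by zero.
    """
    if dem_follows == rep_follows:
        return 0.0  # assumption: equal counts are treated as neutral
    if dem_follows > rep_follows:
        return dem_follows / max(rep_follows, 1)
    return -(rep_follows / max(dem_follows, 1))


def split_into_quintiles(polarity_by_user):
    """Return five lists of users, ordered from most Republican to most Democrat."""
    ranked = sorted(polarity_by_user, key=polarity_by_user.get)
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]


# Only the terms named in the text; the study's full list is not reproduced here.
POLITICAL_TERMS = ("impeach", "vote", "president", "immigration", "election")


def is_political(text):
    """Case-insensitive substring match against the term list."""
    lowered = text.lower()
    return any(term in lowered for term in POLITICAL_TERMS)


def gateway_candidates(likes_by_user, group_a, group_b, tweets):
    """Map each author to the political tweets liked by both groups.

    likes_by_user: user -> set of liked tweet ids
    tweets: tweet id -> (author, text)
    """
    liked_a = set().union(*(likes_by_user.get(u, set()) for u in group_a))
    liked_b = set().union(*(likes_by_user.get(u, set()) for u in group_b))
    candidates = {}
    for tweet_id in sorted(liked_a & liked_b):
        author, text = tweets[tweet_id]
        if is_political(text):
            candidates.setdefault(author, []).append(tweet_id)
    return candidates
```

With the example from the text, `net_polarity(10, 2)` gives 5.0, matching the "5 Dem" ratio. Passing the first and fifth quintiles to `gateway_candidates` yields the authors whose political Tweets were liked from both ends of the spectrum.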

III. Results

The analysis of the shared likes was fruitful in gathering data on Gateway groups. First and foremost, the 800 polar users analyzed had thousands of shared likes with each other, and 155 active “Gateway people” were identified. The individual numerical characteristics (friends, followers, favorites (Tweets liked), Tweets) of the profiles are listed in Table 1 below. For each of these, the data column was broken into quintiles and the upper and lower bounds of the third quintile were taken as the range of values. The Tweets that received the shared likes were analyzed for content. The two most prevalent Tweet topics were “Trump” and “election”/“vote”. “Trump” was mentioned in 55% of the Tweets, and “election”/“vote” was mentioned in 13% of all Tweets. In the case of “Trump” Tweets, 92% had a negative sentiment. Additional recurring topics included “impeach” (10%), “war” (9%), and “kurds”/“turkey”/“syria” (4%).
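The two summary statistics used in these results, the third-quintile range of a profile characteristic and the share of Tweets mentioning a topic, could be computed as in the sketch below. It is illustrative only: the paper does not say how the quintile bounds were calculated, so the 40th and 60th percentile cut points from Python's standard `statistics.quantiles` are an assumption, and the function names and substring matching are hypothetical.

```python
import statistics


def third_quintile_range(values):
    """Lower and upper bounds of the middle quintile of a numeric column.

    Assumption: the bounds are the 40th and 60th percentile cut points
    returned by statistics.quantiles (default 'exclusive' method).
    """
    cuts = statistics.quantiles(values, n=5)  # four cut points
    return cuts[1], cuts[2]


def keyword_share(tweet_texts, keywords):
    """Fraction of Tweets mentioning at least one keyword (case-insensitive).

    Note: substring matching also counts longer words, e.g. "vote"
    matches "voter".
    """
    if not tweet_texts:
        return 0.0
    hits = sum(
        any(keyword in text.lower() for keyword in keywords)
        for text in tweet_texts
    )
    return hits / len(tweet_texts)
```

For example, `keyword_share(texts, ["election", "vote"])` treats the two terms as one topic, which is how the combined “election”/“vote” figure is reported above.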

Contrary to the narrative of a hopelessly divided America, especially online, this paper finds that the most polarized Democrats and Republicans do share likes on Twitter of certain third-party accounts. In other words, they won’t see or share or like each other’s Tweets but will both see, share, and occasionally like the Tweets of a third subset of people on the platform. The most important finding of this project is therefore that Gateway groups on Twitter do, in fact, exist. This is thanks to the novel application of weak link theory (Goyal, 2005) based on the methodology of using shared likes to determine the existence of a Gateway group. Every previous study of Gateway groups has known who the Gateway group was in advance of the study. They have all studied the effect of predefined Gateway groups on predefined target populations. This is also the first time Gateway groups are explored within the virtual, rather than physical, world. Those previous studies established that Gateway groups deescalate conflict. This is valuable information because a follow-up experiment to this study should use this aggregated profile to find users who fit it on Twitter, introduce them into the feeds of polar Republicans and Democrats, and observe the effect on polarization. So what is the profile of a Gateway group that can bridge polarized Republicans and Democrats on Twitter? The data reveals certain recurring characteristics, starting with location. Approximately one-third of the users were from the Northeastern United States (including Washington, D.C.). On Twitter, these users have amassed a large following -- the median range of followers was in the hundreds of thousands. They have many more followers than people they follow. They are prolific Tweeters -- the median number of Tweets they generate is in the tens of thousands. Combined, this suggests that in order to appeal to multiple sides, they need to populate the feeds of their many followers constantly.
Probably the most important finding in the context of the characteristics is the political affiliation and relative degree of polarity of Gateway group members in comparison to the overall political spectrum on Twitter. To provide context, the polarity of the sample Democrats and Republicans overall (i.e., not the Gateway group) was in the high tens to low hundreds, which meant that for roughly every 100 members of their own party they followed, they followed one member of the opposite party. The Gateway groups in comparison were much more moderate. The average follow ratio of the Gateway groups was 3 Democrats for every 1 Republican. The breakdown of the Gateway profiles’ political affiliation was also interesting: 87.74% of the profiles were either Democratic or Neutral. Only 12.26% of the users were Republicans. This places these Gateway group members in a political context and clarifies who they are in relation to Democrats and Republicans: by and large, they are Moderate Democrats. This data can have an interesting real-world application: Moderate Democrats evoke more positive reactions from staunch liberals and staunch conservatives than do Moderate Republicans; the average “Republican” Gateway profile only followed 2 Republicans for every 1 Democrat (even more moderate and closer to the center than the average Gateway). In the context of the current political reality, the fact that Moderate Democrats specifically are able to bridge the divide between the extremes with political content is a critical finding. One application of this finding is that it supports the argument that in 2020, the Democratic Party should strongly consider nominating a Moderate Democrat to run for President because such a candidate has the best chance of building a coalition of Democrats and Republicans. The idea of Moderate Democrats as a Gateway group is also borne out through the examination of the actual names of the Twitter account holders.
While the names of Gateway profiles were not sought in the data, the code written to extract profile information also provided the names of the Gateway people. While most of them were not household names, there were a few known personalities. Not surprisingly, they were politicians. The two Democratic candidates who appeared the most in the shared likes pool were Tulsi Gabbard and Joe Biden. Biden has claimed the role of the Moderate Democratic candidate in the 2020 Democratic Primary race. Gabbard, who is more controversial, is running on a platform that mixes ideas from the right and the left. So while not a classical “Moderate,” her overall aggregated profile is that of a Centrist. Both these candidates received more than double the shared likes of the next Democratic candidate, Elizabeth Warren, who is perceived to be much farther to the left. While this paper is not endorsing a Gabbard or Biden candidacy, it does suggest that whoever runs against President Trump should not skew too far to the left on policy, but rather stay centrist and build a large enough support base from both sides to win the election. Within the vast number of people who follow Gateway groups, there are polar Democrats and Republicans who actively like the Gateway group’s content, and because only political Tweets were examined in this study, that opens the door to more discourse and exposure to content that is not necessarily within the confines of their respective echo chambers. The content Tweeted about and liked spanned many topics. To better understand what was Tweeted, the data was sorted by keywords. The highest-occurring keyword in the data was “Trump”, appearing in 55% of the Tweets. But more telling was the fact that 92% of these Tweets were negative. This is partially expected and partially surprising.
That polar liberals are not enamored with President Trump is not surprising, but it is interesting that many polar conservatives like content that is critical of the President: these negative Tweets Tweeted by Moderate Democrats were also liked by polar Republicans. This raises an interesting question: if a Moderate Democrat does run, would some polar Republicans abandon President Trump to vote for that candidate? The point of this paper is not to dole out political advice to any one party, but rather to explore ways to reduce the digital divide that plagues our political culture and national discourse. Now that we have established the existence and profile of Gateway group members on Twitter, we must explore their role and how to amplify it. First of all, according to Gateway Group Theory, their mere organic existence suggests the ability to reduce conflict. There are also a few steps that could be taken proactively to increase their influence. Currently, the majority of the Gateway group members are Democrats. These Democrats have not actively become Gateway persons (they don’t think of themselves as such or even know they belong to a unique group that decreases polarization on Twitter); rather, their online activity makes them so. But there is actually an ingroup and an outgroup dynamic occurring within the Gateway group, where Democrats are the ingroup and Republicans the outgroup. And while this paper deals with Gateway Group theory, the Common Ingroup Identity Model still holds true. As explained in the Introduction, a key idea of the Common Ingroup Identity Model is that the characteristics that ingroup members use to connect with other ingroup members need to be expanded to the outgroup to create a connection between the two and form a larger unitary group, changing the dynamic from “us” vs “them” to “we.” In this case, the “we” is the Gateway group.
If these Gateway Democrats actively followed more Republicans and started to increase the similarities in Tweet content, more Republicans could be made a part of the Gateway group and consequently increase the scope of its influence. These Gateway groups could ultimately weaken the echo chambers. Unfortunately, this cannot be mandated or achieved artificially. But one idea is to encourage the platform to design algorithms that would recommend Gateway group members to polar Republican or Democratic users (platforms routinely recommend additional accounts to follow). As this study shows, Twitter, which is often seen as the source of so much of the polarity in contemporary US political discourse, can actually become the vehicle for reducing that polarity. This could happen through Gateway Twitter members deliberately reaching out to those beyond their current sphere of shared followers, for example by evening out the ratio of Republican to Democratic users they themselves follow. To truly evaluate the long-term effect of Gateway groups on the echo chamber, a further study would have to observe the same polar profiles over time as they interacted with more Gateway group members and use sentiment analysis on their Tweets at different points in time to see if they became less polarized. Using this study’s profile, that should happen. And there is wide applicability to this idea well beyond politics on Twitter. While the methodology of the study was based in computer science, and the study examined social media, it was all through the lens of social psychology. The guiding principle behind social psychology is that human behavior is constant, and so theories are made and applied to a variety of different situations. Because of that, this study has two key impacts and implications. First, it successfully defines Gateway groups on social media through its unique methodology. It establishes two conflicting ingroups.
It then uses a metric (shared “likes”) to find the people that evoke positive reactions from both ingroups. Finally, it extracts profile data on these people to make a composite Gateway profile. The second implication is crucial in the context of social media, which is making people more and more polarized because of the echo chambers that are so prevalent on these platforms. This method of identifying a Gateway group online via shared likes can be applied to any “conflictual” echo chamber on social media. The originally developed code from this project can be adapted and used to this end. It is the first step in combating the polarity of users on social media because researchers can now build on this idea to find the Gateway groups and then introduce them into polarized feeds to decrease intergroup conflict. Social media has been the scapegoat for the political polarization plaguing this country. But if the influence of the Gateway groups this study identified could be increased, then social media could become a social medium and decrease the polarity of our political discourse.


Bessi, A. (2016). Personality traits and echo chambers on Facebook. Computers in Human Behavior, 65, 319-324. doi:10.1016/j.chb.2016.08.016

Demszky, D., Garg, N., Voigt, R., Zou, J., Shapiro, J., Gentzkow, M., & Jurafsky, D. (2019). Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings. Proceedings of the 2019 Conference of the North.

Dovidio, J. F., & Morris, W. N. (1975). Effects of stress and commonality of fate on helping behavior. Journal of Personality and Social Psychology, 31(1), 145-149.

Gaertner, S. L., & Dovidio, J. F. (2012). The Common Ingroup Identity Model. Handbook of Theories of Social Psychology, 2, 439-457.

Gaertner, S. L., Dovidio, J. F., Anastasio, P. A., Bachman, B. A., & Rust, M. C. (1993). The Common Ingroup Identity Model: Recategorization and the Reduction of Intergroup Bias. European Review of Social Psychology, 4(1), 1-26.

Goyal, S. (2005). Strong and Weak Links. Journal of the European Economic Association, 3(2/3), 608-616.

Hornsey, M. J., & Hogg, M. A. (2000). Subgroup Relations: A Comparison of Mutual Intergroup Differentiation and Common Ingroup Identity Models of Prejudice Reduction. Personality and Social Psychology Bulletin, 26(2), 242-256. doi:10.1177/0146167200264010

Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802-5805. doi:10.1073/pnas.1218772110

Levy, A., Saguy, T., Halperin, E., & Zomeren, M. V. (2017). Bridges or Barriers? Conceptualization of the Role of Multiple Identity Gateway Groups in Intergroup Relations. Frontiers in Psychology, 8. doi:10.3389/fpsyg.2017.01097

Pettigrew, T. F., Tropp, L. R., Wagner, U., & Christ, O. (2011). Recent advances in intergroup contact theory. International Journal of Intercultural Relations, 35, 271-280.

Zollo, F., Novak, P. K., Vicario, M. D., Bessi, A., Mozetič, I., Scala, A., . . . Quattrociocchi, W. (2015). Emotional Dynamics in the Age of Misinformation. Plos One, 10(9). doi:10.1371/journal.pone.0138740