Commit 33833888 authored by Jeremy Singer-Vine

Initial commit

# Fact-Checking Facebook Politics Pages
This repository contains the data and analysis for the BuzzFeed News article, "[Hyperpartisan Facebook Pages Are Publishing False And Misleading Information At An Alarming Rate](https://www.buzzfeed.com/craigsilverman/partisan-fb-pages-analysis)," published October 20, 2016.
## Data
You can find a spreadsheet of all the posts, fact-check ratings, and Facebook engagement figures [__here__](data/facebook-fact-check.csv). The methodology for collecting and rating the pages can be found at the beginning and end of [the main article](https://www.buzzfeed.com/craigsilverman/partisan-fb-pages-analysis).
The Facebook engagement figures were obtained from the Facebook API on October 11, 2016.
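As a rough sketch of how the spreadsheet can be loaded and sanity-checked with pandas: the column names below are the ones the analysis notebook refers to, while the two inline rows (and the page names in them) are made-up stand-ins so the snippet runs without the repository checked out.

```python
import io

import pandas as pd

# Columns the analysis notebook relies on.
EXPECTED_COLS = [
    "Category", "Page", "Post Type", "Rating", "Date Published",
    "share_count", "reaction_count", "comment_count",
]

# Hypothetical stand-in rows (NOT real data). To load the actual
# spreadsheet, pass "data/facebook-fact-check.csv" to read_csv instead.
sample_csv = io.StringIO(
    "Category,Page,Post Type,Rating,Date Published,"
    "share_count,reaction_count,comment_count\n"
    "left,Example Page A,link,mostly true,2016-09-19,10,20,3\n"
    "right,Example Page B,photo,no factual content,2016-09-20,,5,1\n"
)
posts = pd.read_csv(sample_csv)

# Report any expected column that is absent, then count missing figures.
missing = [c for c in EXPECTED_COLS if c not in posts.columns]
print("missing columns:", missing)
print(posts[["share_count", "reaction_count", "comment_count"]].isnull().sum())
```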
## Analysis
The full set of cross-tabulations [can be found here](notebooks/facebook-fact-check.ipynb).
## Questions / Comments?
Please contact Jeremy Singer-Vine at jeremy.singer-vine@buzzfeed.com.
%% Cell type:markdown id: tags:
# Fact-Checking Facebook Politics Pages — Analysis
See [this page](https://github.com/BuzzFeedNews/2016-10-facebook-fact-check) for context.
%% Cell type:markdown id: tags:
## Prepare data
%% Cell type:code id: tags:
``` python
import pandas as pd
```
%% Cell type:code id: tags:
``` python
# Format a Series/DataFrame of fractions as percentage strings, e.g. 0.125 -> "12.5%".
percentify = lambda x: (x * 100).round(1).astype(str) + "%"
```
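%% Cell type:markdown id: tags:
A quick illustration of what `percentify` returns. The input fractions are made-up values chosen only to show the formatting; the helper is repeated so the cell runs on its own.
%% Cell type:code id: tags:
```python
import pandas as pd

# Same helper as defined in the notebook.
percentify = lambda x: (x * 100).round(1).astype(str) + "%"

# Made-up fractions, for illustration only.
fractions = pd.Series([0.125, 0.5, 1.0], index=["a", "b", "c"])
labels = percentify(fractions)
print(labels.tolist())
```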
%% Cell type:code id: tags:
``` python
posts = pd.read_csv("../data/facebook-fact-check.csv")
```
%% Cell type:code id: tags:
``` python
len(posts)
```
%%%% Output: execute_result
2282
%% Cell type:code id: tags:
``` python
ENGAGEMENT_COLS = [
"share_count",
"reaction_count",
"comment_count"
]
```
%% Cell type:code id: tags:
``` python
RATINGS = ["mostly false", "mixture of true and false", "mostly true", "no factual content"]
FACTUAL_RATINGS = ["mostly false", "mixture of true and false", "mostly true"]
```
%% Cell type:code id: tags:
``` python
category_grp = posts.groupby("Category")
page_grp = posts.groupby([ "Category", "Page" ])
type_grp = posts.groupby([ "Category", "Page", "Post Type" ])
```
%% Cell type:markdown id: tags:
## Rating by category
%% Cell type:markdown id: tags:
Counts:
%% Cell type:code id: tags:
``` python
rating_by_category = category_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_category["total"] = rating_by_category.sum(axis=1)
rating_by_category
```
%%%% Output: execute_result
Rating      mostly false  mixture of true and false  mostly true  no factual content  total
Category
left                  22                         68          265                 116    471
mainstream             0                          8         1085                  52   1145
right                 86                        167          318                  95    666
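%% Cell type:markdown id: tags:
The count table above comes from chaining `value_counts()`, `unstack()`, a column selection, and `fillna(0)`. As a minimal sketch of what each step contributes, the toy frame below uses hypothetical rows and a shortened `TOY_RATINGS` list; both are assumptions for illustration and not part of the dataset.
%% Cell type:code id: tags:
```python
import pandas as pd

# Hypothetical toy rows (not from the dataset).
toy = pd.DataFrame({
    "Category": ["left", "left", "left", "mainstream", "mainstream"],
    "Rating": ["mostly true", "mostly false", "mostly true",
               "mostly true", "mostly true"],
})
TOY_RATINGS = ["mostly false", "mostly true"]

counts = (
    toy.groupby("Category")["Rating"]
    .value_counts()   # Series indexed by (Category, Rating)
    .unstack()        # ratings become columns; absent pairs become NaN
    [TOY_RATINGS]     # fix the column order
    .fillna(0)        # a rating that never occurs counts as 0
)
counts["total"] = counts.sum(axis=1)
print(counts)
```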
%% Cell type:markdown id: tags:
Percentages, of all posts:
%% Cell type:code id: tags:
``` python
(rating_by_category[RATINGS].T / rating_by_category[RATINGS].sum(axis=1)).T\
.pipe(percentify)
```
%%%% Output: execute_result
            mostly false  mixture of true and false  mostly true  no factual content
Category
left                4.7%                      14.4%        56.3%               24.6%
mainstream          0.0%                       0.7%        94.8%                4.5%
right              12.9%                      25.1%        47.7%               14.3%
%% Cell type:markdown id: tags:
Percentages, of posts not rated "no factual content":
%% Cell type:code id: tags:
``` python
(rating_by_category[FACTUAL_RATINGS].T / rating_by_category[FACTUAL_RATINGS].sum(axis=1)).T\
.pipe(percentify)
```
%%%% Output: execute_result
            mostly false  mixture of true and false  mostly true
Category
left                6.2%                      19.2%        74.6%
mainstream          0.0%                       0.7%        99.3%
right              15.1%                      29.2%        55.7%
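%% Cell type:markdown id: tags:
To make the row normalization explicit, the sketch below rebuilds the factual-rating counts from the "Rating by category" table and divides each row by its own total. `DataFrame.div(..., axis=0)` is used here as an alternative spelling of the notebook's double transpose; the variable names are illustrative.
%% Cell type:code id: tags:
```python
import pandas as pd

# Counts copied from the "Rating by category" table above.
factual = pd.DataFrame(
    {
        "mostly false": [22, 0, 86],
        "mixture of true and false": [68, 8, 167],
        "mostly true": [265, 1085, 318],
    },
    index=pd.Index(["left", "mainstream", "right"], name="Category"),
)

# Divide every row by its own total (row-wise shares).
shares = factual.div(factual.sum(axis=1), axis=0)

# The notebook's double-transpose form yields the same shares.
shares_transposed = (factual.T / factual.sum(axis=1)).T

pct = (shares * 100).round(1)
print(pct)
```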
%% Cell type:markdown id: tags:
## Rating by page
%% Cell type:markdown id: tags:
Counts:
%% Cell type:code id: tags:
``` python
rating_by_page = page_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_page["total"] = rating_by_page.sum(axis=1)
rating_by_page
```
%%%% Output: execute_result
Rating                        mostly false  mixture of true and false  mostly true  no factual content  total
Category   Page
left       Addicting Info                8                         25           96                  11    140
           Occupy Democrats              9                         33          102                  65    209
           The Other 98%                 5                         10           67                  40    122
mainstream ABC News Politics             0                          2          172                  26    200
           CNN Politics                  0                          4          385                  20    409
           Politico                      0                          2          528                   6    536
right      Eagle Rising                 30                         54          121                  81    286
           Freedom Daily                26                         26           56                   4    112
           Right Wing News              30                         87          141                  10    268
%% Cell type:markdown id: tags:
Percentages, of all posts:
%% Cell type:code id: tags:
``` python
(rating_by_page[RATINGS].T / rating_by_page[RATINGS].sum(axis=1)).T\
.pipe(percentify)
```
%%%% Output: execute_result
                              mostly false  mixture of true and false  mostly true  no factual content
Category   Page
left       Addicting Info             5.7%                      17.9%        68.6%                7.9%
           Occupy Democrats           4.3%                      15.8%        48.8%               31.1%
           The Other 98%              4.1%                       8.2%        54.9%               32.8%
mainstream ABC News Politics          0.0%                       1.0%        86.0%               13.0%
           CNN Politics               0.0%                       1.0%        94.1%                4.9%
           Politico                   0.0%                       0.4%        98.5%                1.1%
right      Eagle Rising              10.5%                      18.9%        42.3%               28.3%
           Freedom Daily             23.2%                      23.2%        50.0%                3.6%
           Right Wing News           11.2%                      32.5%        52.6%                3.7%
%% Cell type:markdown id: tags:
Percentages, of posts not rated "no factual content":
%% Cell type:code id: tags:
``` python
(rating_by_page[FACTUAL_RATINGS].T / rating_by_page[FACTUAL_RATINGS].sum(axis=1)).T\
.pipe(percentify)
```
%%%% Output: execute_result
                              mostly false  mixture of true and false  mostly true
Category   Page
left       Addicting Info             6.2%                      19.4%        74.4%
           Occupy Democrats           6.2%                      22.9%        70.8%
           The Other 98%              6.1%                      12.2%        81.7%
mainstream ABC News Politics          0.0%                       1.1%        98.9%
           CNN Politics               0.0%                       1.0%        99.0%
           Politico                   0.0%                       0.4%        99.6%
right      Eagle Rising              14.6%                      26.3%        59.0%
           Freedom Daily             24.1%                      24.1%        51.9%
           Right Wing News           11.6%                      33.7%        54.7%
%% Cell type:markdown id: tags:
## Number of posts by date
%% Cell type:markdown id: tags:
Counts:
%% Cell type:code id: tags:
``` python
posts_by_date_by_category = category_grp["Date Published"].value_counts().unstack()
posts_by_date_by_category["Avg. Per Day"] = posts_by_date_by_category.mean(axis=1).round(0)
posts_by_date_by_category
```
%%%% Output: execute_result
Date Published  2016-09-19  2016-09-20  2016-09-21  2016-09-22  2016-09-23  2016-09-26  2016-09-27  Avg. Per Day
Category
left                    55          70          58          54          66          80          88            67
mainstream             154         156         151         146         135         223         180           164
right                   97          91          97          93          93         100          95            95
%% Cell type:code id: tags:
``` python
posts_by_date_by_page = page_grp["Date Published"].value_counts().unstack()
posts_by_date_by_page["Avg. Per Day"] = posts_by_date_by_page.mean(axis=1).round(0)
posts_by_date_by_page
```
%%%% Output: execute_result
Date Published                2016-09-19  2016-09-20  2016-09-21  2016-09-22  2016-09-23  2016-09-26  2016-09-27  Avg. Per Day
Category   Page
left       Addicting Info             22          18          17          21          22          17          23            20
           Occupy Democrats           20          30          20          19          29          47          44            30
           The Other 98%              13          22          21          14          15          16          21            17
mainstream ABC News Politics          36          22          23          21          22          47          29            29
           CNN Politics               54          61          53          62          48          66          65            58
           Politico                   64          73          75          63          65         110          86            77
right      Eagle Rising               41          41          42          41          41          41          39            41
           Freedom Daily              19          16          17          15          15          15          15            16
           Right Wing News            37          34          38          37          37          44          41            38
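%% Cell type:markdown id: tags:
The "Avg. Per Day" column is the row mean over the seven dates that appear in the data, rounded to a whole number. The sketch below re-derives it from the per-category counts shown above and checks that the row sums agree with the totals in the "Rating by category" table; the variable names are illustrative.
%% Cell type:code id: tags:
```python
import pandas as pd

dates = ["2016-09-19", "2016-09-20", "2016-09-21", "2016-09-22",
         "2016-09-23", "2016-09-26", "2016-09-27"]

# Daily post counts copied from the per-category table above.
by_date = pd.DataFrame(
    [
        [55, 70, 58, 54, 66, 80, 88],
        [154, 156, 151, 146, 135, 223, 180],
        [97, 91, 97, 93, 93, 100, 95],
    ],
    index=["left", "mainstream", "right"],
    columns=dates,
)

totals = by_date.sum(axis=1)                 # posts per category
avg_per_day = by_date.mean(axis=1).round(0)  # mean over the 7 dates

print(totals.tolist())       # [471, 1145, 666]
print(avg_per_day.tolist())  # [67.0, 164.0, 95.0]
```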
%% Cell type:markdown id: tags:
## Rating by post type
%% Cell type:code id: tags:
``` python
rating_by_post_type = type_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_post_type["total"] = rating_by_post_type.sum(axis=1)
rating_by_post_type
```
%%%% Output: execute_result
Rating                                  mostly false  mixture of true and false  mostly true  no factual content  total
Category   Page              Post Type
left       Addicting Info    link                  8                         25           94                   7    134
                             photo                 0                          0            1                   3      4
                             video                 0                          0            1                   1      2
           Occupy Democrats  link                  7                         26           60                   1     94
                             photo                 2                          2           25                  49     78
                             video                 0                          5           17                  15     37
           The Other 98%     link                  1                          7           40                   3     51
                             photo                 4                          0           11                  26     41
                             video                 0                          3           16                  11     30
mainstream ABC News Politics link                  0                          2          104                   2    108
                             photo                 0                          0           13                   0     13
                             text                  0                          0            1                   0      1
                             video                 0                          0           54                  24     78
           CNN Politics      link                  0                          4          316                  10    330
                             photo                 0                          0            3                   5      8
                             text                  0                          0            1                   0      1
                             video                 0                          0           65                   5     70
           Politico          link                  0                          2          458                   3    463
                             photo                 0                          0            5                   2      7
                             text                  0                          0            1                   0      1
                             video                 0                          0           64                   1     65
right      Eagle Rising      link                 27                         50          117                  37    231
                             photo                 3                          3            3                  37     46
                             video                 0                          1            1                   7      9
           Freedom Daily     link                 26                         25           56                   4    111
                             text                  0                          1            0                   0      1
           Right Wing News   link                 29                         86          139                   4    258
                             photo                 1                          1            2                   6     10
%% Cell type:markdown id: tags:
# Engagement
%% Cell type:markdown id: tags:
Count of missing engagement figures:
%% Cell type:code id: tags:
``` python
posts[ENGAGEMENT_COLS].isnull().sum()
```
%%%% Output: execute_result
share_count       70
reaction_count     2
comment_count      2
dtype: int64
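%% Cell type:markdown id: tags:
pandas' groupby `mean()` and `median()` skip missing values by default, so the posts without an engagement figure presumably drop out of the statistics that follow rather than counting as zero. A minimal sketch of the difference, using made-up numbers and a hypothetical two-page frame:
%% Cell type:code id: tags:
```python
import numpy as np
import pandas as pd

# Made-up numbers, for illustration only.
toy = pd.DataFrame({
    "Page": ["A", "A", "A", "B", "B"],
    "share_count": [10, np.nan, 30, 5, 7],
})

n_missing = toy["share_count"].isnull().sum()

# Default behaviour: the NaN row is ignored when averaging.
skipped = toy.groupby("Page")["share_count"].mean()

# Treating missing as zero instead pulls page A's average down.
as_zero = toy.fillna({"share_count": 0}).groupby("Page")["share_count"].mean()

print(n_missing)
print(skipped)
print(as_zero)
```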
%% Cell type:markdown id: tags:
## Median engagement by page
%% Cell type:code id: tags:
``` python
page_grp[ENGAGEMENT_COLS].median().round()
```
%%%% Output: execute_result
                              share_count  reaction_count  comment_count
Category   Page
left       Addicting Info             563            2230            271
           Occupy Democrats         10931           22360           1205
           The Other 98%             3942           12083            521
mainstream ABC News Politics           13              80             28
           CNN Politics                50             340            194
           Politico                    33             314             95
right      Eagle Rising                92             186             22
           Freedom Daily              947            2245            214
           Right Wing News            266             913             91
%% Cell type:markdown id: tags:
## Average engagement by page
%% Cell type:code id: tags:
``` python
page_grp[ENGAGEMENT_COLS].mean().round()
```
%%%% Output: execute_result
                              share_count  reaction_count  comment_count
Category   Page
left       Addicting Info            1270            3120            392
           Occupy Democrats         29205           34669           2858
           The Other 98%            18007           20971            915
mainstream ABC News Politics           44             177             71
           CNN Politics               183             678            322
           Politico                   182             900            170
right      Eagle Rising               616             520             79
           Freedom Daily             2474            3685            516
           Right Wing News           1398            2454            360
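%% Cell type:markdown id: tags:
For every page, the averages above are higher than the corresponding medians (for Occupy Democrats shares, 29205 versus 10931). One pattern that produces such a gap is a small number of very widely shared posts. The toy sketch below uses made-up numbers to show how a single large value moves the mean but not the median; it is an illustration, not a claim about any specific post in the dataset.
%% Cell type:code id: tags:
```python
import pandas as pd

# Made-up share counts: four ordinary posts and one viral one.
shares = pd.Series([100, 120, 150, 180, 20000])

median_val = shares.median()  # middle value, unaffected by the outlier
mean_val = shares.mean()      # pulled upward by the outlier

print(median_val)  # 150.0
print(mean_val)    # 4110.0
```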
%% Cell type:markdown id: tags:
## Engagement by truthfulness
%% Cell type:code id: tags:
``` python
grp = posts.groupby([ "Category", "Page", "Rating" ])
```
%% Cell type:markdown id: tags:
Counts:
%% Cell type:code id: tags:
``` python
grp[ENGAGEMENT_COLS].size().unstack().fillna(0)
```
%%%% Output: execute_result
Rating                        mixture of true and false  mostly false  mostly true  no factual content
Category   Page
left       Addicting Info                            25             8           96                  11
           Occupy Democrats                          33             9          102                  65
           The Other 98%                             10             5           67                  40
mainstream ABC News Politics                          2             0          172                  26
           CNN Politics                               4             0          385                  20
           Politico                                   2             0          528                   6
right      Eagle Rising                              54            30          121                  81
           Freedom Daily                             26            26           56                   4
           Right Wing News                           87            30          141                  10
%% Cell type:markdown id: tags:
Medians:
%% Cell type:code id: tags:
``` python
grp[ENGAGEMENT_COLS].median().round()
```
%%%% Output: execute_result
share_count \
Category Page Rating
left Addicting Info mixture of true and false 1132
mostly false 285
mostly true 523
no factual content 399
Occupy Democrats mixture of true and false 10654
mostly false 5541
mostly true 7755
no factual content 18345
The Other 98% mixture of true and false 4749
mostly false 11571
mostly true 2896
no factual content 10337
mainstream ABC News Politics mixture of true and false 76
mostly true 12
no factual content 38
CNN Politics mixture of true and false 270
mostly true 48
no factual content 64
Politico mixture of true and false 7325
mostly true 33
no factual content 48
right Eagle Rising mixture of true and false 110
mostly false 534
mostly true 46
no factual content 250
Freedom Daily mixture of true and false 342
mostly false 1623
mostly true 908
no factual content 2025
Right Wing News mixture of true and false 457
mostly false 713
mostly true 87
no factual content 3917
reaction_count \
Category Page Rating
left Addicting Info mixture of true and false 3087
mostly false 1910
mostly true 1966
no factual content 2351
Occupy Democrats mixture of true and false 17085
mostly false 17525
mostly true 15951
no factual content 37326
The Other 98% mixture of true and false 9040
mostly false 19682
mostly true 7082
no factual content 25951
mainstream ABC News Politics mixture of true and false 479
mostly true 78
no factual content 78
CNN Politics mixture of true and false 1374
mostly true 343
no factual content 245
Politico mixture of true and false 20344