Visualizing a Movie Dataset

These are notes from working through "A quick look at Bechdel test data (& an awtools update)". They cover a few data-wrangling techniques and how to draw line charts.

Data preparation:

R
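library(tidyverse)
library(awtools)

# Download every movie from the Bechdel test API and flatten the JSON into a data frame
movies <- jsonlite::read_json('http://bechdeltest.com/api/v1/getAllMovies',
                              simplifyVector = TRUE) %>%
  data.frame() %>%
  # convert year and id to numeric so they can be used on a continuous axis
  mutate(year = as.numeric(year),
         id = as.numeric(id))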
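To see what the `mutate()` step does without calling the API, here is a minimal sketch on a made-up sample. The data frame `sample_movies` and its values are hypothetical; it assumes the API delivers `year` and `id` as strings, which is what the `as.numeric()` calls above suggest.

```r
library(dplyr)

# Hypothetical rows that mimic the shape of the API response (values are invented)
sample_movies <- data.frame(
  id     = c("1", "2", "3"),
  title  = c("Movie A", "Movie B", "Movie C"),
  year   = c("1999", "2005", "2005"),
  rating = c("3", "1", "3"),
  stringsAsFactors = FALSE
)

# Same conversion as above: year and id become numeric, rating is left untouched
cleaned <- sample_movies %>%
  mutate(year = as.numeric(year),
         id   = as.numeric(id))

str(cleaned)
```

Leaving `rating` as a character column means ggplot2 treats it as a discrete variable when it is mapped to color.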

Plotting:

R
# Count movies per year for each Bechdel rating; draw the counts as points joined by lines
ggplot(data = movies,
       aes(x = year)) +
  geom_point(stat = 'count',
             aes(color = rating)) +
  geom_line(stat = 'count',
            aes(color = rating)) +
  # font family used in the original post; swap in any family installed on your system
  hrbrthemes::theme_ipsum(base_family = 'STSongti-SC-Bold') +
  a_step_color('Rating') +
  theme(legend.position = 'right') +
  labs(title = 'Bechdel rating',
       subtitle = 'Number of movies per year at each Bechdel rating',
       caption = 'Source: http://bechdeltest.com',
       x = 'Year', y = 'Count')
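The plot relies on `stat = 'count'` to tally rows per year within each rating group. As a rough sketch of the equivalent explicit aggregation, `dplyr::count()` produces the same numbers on a small made-up data frame (`toy` and its values are hypothetical):

```r
library(dplyr)

# Hypothetical data: five movies across two years
toy <- data.frame(
  year   = c(2000, 2000, 2000, 2001, 2001),
  rating = c("3", "3", "1", "3", "0"),
  stringsAsFactors = FALSE
)

# One row per (year, rating) pair, with n = number of movies in that pair
counts <- toy %>%
  count(year, rating, name = "n")

print(counts)
```

With the counts precomputed like this, the same chart could be drawn by mapping `y = n` and dropping `stat = 'count'`.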

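Next, join the Bechdel data with the IMDB ratings from the ggplot2movies package and plot the average IMDB rating per year for each Bechdel rating:

R
ratings <- ggplot2movies::movies

# Match on title only; title is not a unique key, so repeated titles can produce extra rows.
# Shared column names get suffixes: .x for the Bechdel data, .y for the IMDB data.
movies_with_rating <- left_join(movies, ratings, by = 'title')

# Keep only movies that found a match in the IMDB data
movies_with_rating <- movies_with_rating %>%
  filter(!is.na(votes))

# Average IMDB rating per year and Bechdel rating
avg.test.rate <- movies_with_rating %>%
  group_by(year.x, rating.x) %>%
  summarise(imdb = mean(rating.y), n = n())

# Restrict to 1967 onward
avg.test.rate <- avg.test.rate %>%
  filter(year.x >= 1967)

ggplot(avg.test.rate, aes(x = year.x,
                          y = imdb,
                          color = rating.x)) +
  geom_point(alpha = 0.35) +
  geom_smooth(se = FALSE) +
  hrbrthemes::theme_ipsum(base_family = 'STSongti-SC-Bold') +
  a_step_color('Bechdel rating') +
  labs(title = 'IMDB rating',
       subtitle = 'Average IMDB rating by year and Bechdel rating',
       x = 'Year', y = 'IMDB rating')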
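Because the join above matches on `title` alone, films that share a title (remakes, for example) are paired with each other. The sketch below illustrates the effect on made-up data and shows one possible alternative, joining on both `title` and `year`. This is not what the original post does; the data frames `bechdel_toy` and `imdb_toy` are hypothetical.

```r
library(dplyr)

# Hypothetical data: "Remake" exists as two different films
bechdel_toy <- data.frame(
  title  = c("Remake", "Remake", "Unique"),
  year   = c(1960, 2010, 1995),
  rating = c("1", "3", "2"),
  stringsAsFactors = FALSE
)
imdb_toy <- data.frame(
  title  = c("Remake", "Remake", "Unique"),
  year   = c(1960, 2010, 1995),
  rating = c(7.1, 5.4, 6.8),
  stringsAsFactors = FALSE
)

# Title-only join: each "Remake" row matches both IMDB "Remake" rows.
# Recent dplyr versions warn about this many-to-many match, hence suppressWarnings().
by_title <- suppressWarnings(left_join(bechdel_toy, imdb_toy, by = "title"))

# Joining on title and year pairs each film with its own IMDB row
by_title_year <- left_join(bechdel_toy, imdb_toy, by = c("title", "year"))

nrow(by_title)       # more rows than bechdel_toy
nrow(by_title_year)  # same number of rows as bechdel_toy
```

Note that joining on `year` as well leaves a single `year` column instead of `year.x` and `year.y`, so any downstream code that refers to `year.x` would need adjusting.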

Average IMDB ratings appear to decline over time. Next, break the movies down by genre:

R
# Reshape the seven genre indicator columns from ggplot2movies (Action ... Short)
# into long format, keeping only the genres each movie belongs to
genres <- movies_with_rating %>%
  gather(genres, gys, Action:Short) %>%
  filter(gys > 0)

# Number of movies (and average IMDB rating) per year, genre and Bechdel rating
genre.rate <- genres %>%
  group_by(year.x, genres, rating.x) %>%
  summarise(imdb = mean(rating.y), n = n())

# y is the movie count n, one panel per genre
ggplot(genre.rate,
       aes(x = year.x, y = n, color = rating.x)) +
  geom_point(alpha = 0.3) +
  geom_smooth(se = FALSE) +
  facet_wrap(~genres, ncol = 1, scales = 'free_y') +
  hrbrthemes::theme_ipsum(base_family = 'STSongti-SC-Bold') +
  a_step_color('Bechdel rating') +
  labs(title = 'Bechdel test spread of movies by genre over the years',
       x = 'Year', y = 'Number of movies')
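The `gather()` step turns one 0/1 column per genre into one row per movie and genre. A minimal sketch on a made-up wide table shows the reshaping in isolation (`wide` and its values are hypothetical):

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one indicator column per genre
wide <- data.frame(
  title  = c("Movie A", "Movie B"),
  Action = c(1, 0),
  Comedy = c(1, 1),
  Drama  = c(0, 0),
  stringsAsFactors = FALSE
)

# Stack the genre columns into key/value pairs, then keep only genres that apply
long <- wide %>%
  gather(genres, gys, Action:Drama) %>%
  filter(gys > 0)

print(long)
```

A movie that belongs to several genres appears once per genre, so it is counted in each of the corresponding facets. In current tidyr, `pivot_longer()` is the newer function for the same reshaping.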
