Data from

author: ‘Akash’ date: ‘15 December 2017’

介紹

美國2017年發生了大規模的槍擊案件,此分析可能為移民,槍支法,精神健康問題和吸毒問題提供信息。但這個分析僅限於大規模射擊,其他犯罪,如武器,殺人和爆炸等都沒有討論。

In this Project, The USA mass shooting dataset is used to analyze information about the shooter with respect to Race, States, Causes and Targets.

本次目的-藉由分析美國相關州之間,槍手犯案的各類因素

environment

library(data.table)
library(shiny) 
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:data.table':
## 
##     hour, isoweek, mday, minute, month, quarter, second, wday,
##     week, yday, year
## The following object is masked from 'package:base':
## 
##     date
library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(tidyverse)
## -- Attaching packages -------------------------------------------- tidyverse 1.2.1 --
## √ tibble  1.4.2     √ purrr   0.2.5
## √ tidyr   0.8.1     √ dplyr   0.7.6
## √ readr   1.1.1     √ stringr 1.3.1
## √ tibble  1.4.2     √ forcats 0.3.0
## -- Conflicts ----------------------------------------------- tidyverse_conflicts() --
## x lubridate::as.difftime() masks base::as.difftime()
## x dplyr::between()         masks data.table::between()
## x lubridate::date()        masks base::date()
## x dplyr::filter()          masks plotly::filter(), stats::filter()
## x dplyr::first()           masks data.table::first()
## x lubridate::hour()        masks data.table::hour()
## x lubridate::intersect()   masks base::intersect()
## x lubridate::isoweek()     masks data.table::isoweek()
## x dplyr::lag()             masks stats::lag()
## x dplyr::last()            masks data.table::last()
## x lubridate::mday()        masks data.table::mday()
## x lubridate::minute()      masks data.table::minute()
## x lubridate::month()       masks data.table::month()
## x lubridate::quarter()     masks data.table::quarter()
## x lubridate::second()      masks data.table::second()
## x lubridate::setdiff()     masks base::setdiff()
## x purrr::transpose()       masks data.table::transpose()
## x lubridate::union()       masks base::union()
## x lubridate::wday()        masks data.table::wday()
## x lubridate::week()        masks data.table::week()
## x lubridate::yday()        masks data.table::yday()
## x lubridate::year()        masks data.table::year()
library(readr)
library(tidyr)
library("leaflet")

資料讀取

#格式化日期 

Dateformat = cols(Date = col_date(format = "%m/%d/%Y"))

#讀取csv檔

Mass_Shootings_Dataset_Ver_5 <- read_csv("../week3/us-mass-shootings-last-50-years/Mass Shootings Dataset Ver 5.csv"
                              , col_types = Dateformat)

#將數據分配到數據框
Crime <- data.table(Mass_Shootings_Dataset_Ver_5)

#Dimenstions of the dataset
dim(Crime)
## [1] 323  21
#Headers of the dataset
head(Crime)
##    S#                               Title               Location
## 1:  1          Texas church mass shooting Sutherland Springs, TX
## 2:  2 Walmart shooting in suburban Denver           Thornton, CO
## 3:  3     Edgewood businees park shooting           Edgewood, MD
## 4:  4       Las Vegas Strip mass shooting          Las Vegas, NV
## 5:  5          San Francisco UPS shooting      San Francisco, CA
## 6:  6   Pennsylvania supermarket shooting        Tunkhannock, PA
##          Date                               Incident Area
## 1: 2017-11-05                                      Church
## 2: 2017-11-01                                    Wal-Mart
## 3: 2017-10-18                            Remodeling Store
## 4: 2017-10-01 Las Vegas Strip Concert outside Mandala Bay
## 5: 2017-06-14                                UPS facility
## 6: 2017-06-07                                Weis grocery
##    Open/Close Location    Target     Cause
## 1:               Close    random   unknown
## 2:                Open    random   unknown
## 3:               Close coworkers   unknown
## 4:                Open    random   unknown
## 5:               Close coworkers      <NA>
## 6:               Close coworkers terrorism
##                                                                                                                                                                                                                                                                                                                                          Summary
## 1:                                                                                                                                                                                     Devin Patrick Kelley, 26, an ex-air force officer, shot and killed 26 people and wounded 20 at a church in Texas. He was found dead later in his vehicle.
## 2:                  Scott Allen Ostrem, 47, walked into a Walmart in a suburb north of Denver and fatally shot two men and a woman, then left the store and drove away. After an all-night manhunt, Ostrem, who had financial problems but no serious criminal history, was captured by police after being spotted near his apartment in Denver.
## 3: Radee Labeeb Prince, 37, fatally shot three people and wounded two others around 9am at Advance Granite Solutions, a home remodeling business where he worked near Baltimore. Hours later he shot and wounded a sixth person at a car dealership in Wilmington, Delaware. He was apprehended that evening following a manhunt by authorities.
## 4:                                                                                                                                     Stephen Craig Paddock, opened fire from the 32nd floor of Manadalay Bay hotel at Last Vegas concert goers for no obvious reason. He shot himself and died on arrival of law enforcement agents. He was 64
## 5:                                                                                                                                                             Jimmy Lam, 38, fatally shot three coworkers and wounded two others inside a UPS facility in San Francisco. Lam killed himself as law enforcement officers responded to the scene.
## 6:                                                                             Randy Stair, a 24-year-old worker at Weis grocery fatally shot three of his fellow employees. He reportedly fired 59 rounds with a pair of shotguns before turning the gun on himself as another co-worker fled the scene for help and law enforcement responded.
##    Fatalities Injured Total victims Policeman Killed Age Employeed (Y/N)
## 1:         26      20            46                0  26              NA
## 2:          3       0             3                0  47              NA
## 3:          3       3             6                0  37              NA
## 4:         59     527           585                1  64              NA
## 5:          3       2             5                0  38               1
## 6:          3       0             3               NA  24               1
##              Employed at Mental Health Issues  Race Gender Latitude
## 1:                  <NA>                   No White      M       NA
## 2:                  <NA>                   No White      M       NA
## 3: Advance Granite Store                   No Black      M       NA
## 4:                  <NA>              Unclear White      M 36.18127
## 5:                  <NA>                  Yes Asian      M       NA
## 6:          Weis grocery              Unclear White      M       NA
##    Longitude
## 1:        NA
## 2:        NA
## 3:        NA
## 4: -115.1341
## 5:        NA
## 6:        NA

前期先處理的數據.

#Preprocessing Race
Crime[Race=="unclear",Race:="Unknown"]
Crime[is.na(Race),Race:="Other"]
Crime[ Race=="black" | Race=="Black American or African American" | Race=="Black American or African American/Unknown",Race:="Black"]
Crime[Race=="white" | Race=="White American or European American" | Race=="White American or European American/Some other Race" ,Race:="White"]
Crime[Race=="Asian American"| Race=="Asian American/Some other race" ,Race:="Asian"]
Crime[Race=="Unknown"|Race=="Two or more races" | Race=="Some other race",Race:="Other"]

#Preprocessing Date
Crime[,Month:=as.factor(month(Date))]
Crime[,Day:=as.factor(day(Date))]
Crime[,Year:=as.factor(year(Date))]

#Preprocessing Gender
Crime[Gender=='M',Gender:="Male"]
Crime[Gender=='M/F',Gender:="Male/Female"]
Crime[is.na(Gender),Gender:="Unknown"]

#Preprocessing Cause
Crime[is.na(Cause)|Cause=="unknown",Cause:="Unknown"]
Crime[Cause=='domestic dispute'|Cause=='domestic disputer', Cause:='domestic dispute']
Crime[Cause=="anger"|Cause=="frustration",Cause:="Anger and Fustration"]
Crime[Cause=="failing exams" | Cause=="suspension", Cause:="Failing exams and suspension"]

#Preprocessing Target
Crime[is.na(Target),Target:="Unknown"]
Crime[Target=='Family'|Target=='Family/Neighbors'|Target=='Family+students'|Target=='Coworkers'|Target=='coworkers'
|Target=='Students+Parents'|Target=='Family+random'|Target=="partner's family"|Target=="Coworker's Family"
|Target=='neighbors'|Target=='Girlfriend'|Target=="House Owner"
|Target=="Friends",Target:="Family,Girlfriend(Relationships at that time)"]

Crime[Target=='Children'|Target=='school girls'|Target=='Students'
|Target=='Students+Teachers'|Target=='Teachers',Target:="School(Teachers or Students)"] 

Crime[Target=='Ex-Girlfriend'|Target=='Ex-Wife & Family'|Target == 'Ex-Girlfriend & Family' |Target == 'Ex-GirlFriend'
|Target=='Ex-Girlfriend+random'|Target=='Ex-girlfriend'|Target=="Ex-Coworkers"|Target=='Ex-Wife'|Target=='Ex-Girlfriend', Target:="Girlfriend/Wife/coworkers ( past relationships)"]

Crime[Target=='Sikhs'|Target=='monks'|Target=='prayer group',Target:="Motives against religion"] 

Crime[Target=="Marines"|Target=="Policeman+Council Member"| Target=="police"|
Target=="Policeman"|Target=="Trooper"|Target=="Social Workers", Target:="Marines,Police and Social Workers(Police Brutality)"]

Crime[Target=="birthday party bus"|Target=="party guests"|Target=="uninvited guests", Target:="Parties"]

#Preprocessing Mental Health Issues
Crime[`Mental Health Issues`=="unknown",`Mental Health Issues`:="Unknown"]

#Preprocessing Age
Crime[is.na(Age),Age:=0]
NewAge <-
sapply(Crime$Age,function(x){
if(x>=10 && x<20){
"10+ Teen" }
else if(x>=20 && x<30){
"20's"}
else if(x>=30 && x<40){
"30's"}
else if(x>=40 && x<50){
"40's"}
else if(x>=50 && x<60){
"50's"}
else if(x>=60 && x<70){
"60's"}
else if(x>=70 && x<80){
"70's"}
else if(x>80){
"Multiple Shooters"}
else{
"Age Unknown"}
})

#Preprocessing State
Crime$State <- sapply(Crime$Location, function(parts)
        {
            temp <- strsplit(parts, split = ",")
            sapply(temp, function(new)
                {   
                    new[2]
    
                }
                  )
        })

Crime$City <- sapply(Crime$Location, function(parts)
        {
            temp <- strsplit(parts, split = ",")
            sapply(temp, function(new)
                {   
                    new[1]
    
                }
                  )
        })

Crime[is.na(`State`), `State`:="Unknown"]
Crime[`State`==' CA'| `State`== ' San Diego'| `State`==" LA" |`State`== " California", `State`:="California"]
Crime[`State`==' NV'| `State`==" Nevada",`State`:="Nevada"]
Crime[`State`==' CO'| `State` == " Colorado",`State`:="Colorado"] 
Crime[`State`=='  Virginia'|`State`==" Virginia",`State`:="Virginia"]
Crime[`State`==" TX"| `State` == " Texas",`State`:="Texas"]
Crime[`State`==" MD",`State`:="Maryland"]
Crime[`State`==" PA"|`State`==" Lancaster"|`State`==" Souderton",`State`:="Pennsylvania"]
Crime[`State`==" WA"|`State`==" Washington",`State`:="Washington"]


#Preprocessing Map Data
KnownLocations = Crime %>% drop_na(Latitude) %>% drop_na(Longitude)

colmap <- colorFactor(topo.colors(9), Crime$Race)
colgen <- colorFactor(topo.colors(6), Crime$Gender)

Data after preprocessing

Modified Gender, Race and Mental Health Issues

table(Crime$Race)
## 
##                            Asian                            Black 
##                               18                               85 
##                           Latino Native American or Alaska Native 
##                                5                                3 
##                            Other                            White 
##                               68                              144
table(Crime$`Mental Health Issues`)
## 
##      No Unclear Unknown     Yes 
##      93      13     111     106
table(Crime$Gender)
## 
##      Female        Male Male/Female     Unknown 
##           5         292           5          21

Frequency in Day Month and Year

table(Crime$Day)
table(Crime$Month)
table(Crime$Year)

Modified Age and State

table(NewAge)
## NewAge
##          10+ Teen              20's              30's              40's 
##                38                43                36                41 
##              50's              60's              70's       Age Unknown 
##                11                 3                 1               145 
## Multiple Shooters 
##                 5
table(Crime$State)
## 
##         Alabama          Alaska     Albuquerque         Arizona 
##               9               1               1              11 
##        Arkansas     Connecticut         Florida         Georgia 
##               2               3              20              13 
##          Hawaii           Idaho        Illinois         Indiana 
##               1               1               9               1 
##            Iowa          Kansas        Kentucky       Louisiana 
##               1               4               5               4 
##           Maine   Massachusetts        Michigan       Minnesota 
##               1               4               6               4 
##     Mississippi        Missouri         Montana        Nebraska 
##               4               3               2               3 
##      New Jersey      New Mexico        New York  North Carolina 
##               3               2              10              11 
##            Ohio        Oklahoma          Oregon    Pennsylvania 
##              10               5               4               6 
##  South Carolina    South Dakota       Tennessee            Utah 
##               5               1               5               2 
##         Vermont   West Virginia       Wisconsin         Wyoming 
##               1               1               9               1 
##      California        Colorado        Maryland          Nevada 
##              32               7               1               6 
##    Pennsylvania           Texas         Unknown        Virginia 
##               3              18              46               6 
##      Washington 
##              15

Modified Cause and Target columns

table(Crime$Cause)
## 
##         Anger and Fustration                      breakup 
##                           62                            1 
##             domestic dispute                        drunk 
##                           13                            1 
## Failing exams and suspension                       psycho 
##                            4                           68 
##                       racism         religious radicalism 
##                            6                            1 
##                      revenge                      robbery 
##                           10                            1 
##                    terrorism                unemployement 
##                           65                           10 
##                      Unknown 
##                           81
table(Crime$Target)
## 
##                                  basketball players 
##                                                   1 
##                                           black men 
##                                                   1 
##                                        club members 
##                                                   1 
##                                       Congresswoman 
##                                                   1 
##                                          Contestant 
##                                                   1 
##                                         drug dealer 
##                                                   1 
##       Family,Girlfriend(Relationships at that time) 
##                                                  73 
##     Girlfriend/Wife/coworkers ( past relationships) 
##                                                  25 
##                                             hunters 
##                                                   1 
##                                             lawyers 
##                                                   1 
## Marines,Police and Social Workers(Police Brutality) 
##                                                  13 
##                            Motives against religion 
##                                                   3 
##                                             Parties 
##                                                  10 
##                                          postmaster 
##                                                   1 
##                                          protestors 
##                                                   1 
##                           psychologist+psychiatrist 
##                                                   1 
##                                              random 
##                                                 140 
##                                       rapper+random 
##                                                   1 
##                        School(Teachers or Students) 
##                                                  39 
##                                         TSA Officer 
##                                                   1 
##                                             Unknown 
##                                                   5 
##                              welding shop employees 
##                                                   1 
##                                               women 
##                                                   1

分析開始

州之間的事故分析

有事故的州

根據資料,加州、德州及佛羅里達州都有大規模槍擊,作者認為這些州的共通點都有海岸及炎熱的天氣,「可能」與槍擊案發生的原因有關

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`) %>% 
layout(title = "Number of incidents by State"
         , xaxis = list(title = "Year"))

州之中的受害者

資料中沒有出現內華達州,表示該州並沒有發生太多事件

plot_ly(data = Crime[,.('Total victims'= sum(`Total victims`)),by=.(`State`)]
        ,type = 'bar'
        ,x = ~`State`
        ,y = ~`Total victims`) %>% 
layout(title = "Total victims by State"
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , yaxis = list(title = "Victims")
         , legend = list(x = 0, y = 1)
         , hovermode = 'compare')

聚類圖

KnownLocations %>%
                 leaflet() %>%
                 addTiles() %>%
                 addProviderTiles("OpenStreetMap.HOT")%>%
                 addMarkers(lng = ~Longitude, lat = ~Latitude,clusterOptions = markerClusterOptions(), label=~Location) 

基於性別和心理健康問題的分析

性別州

加州,伊利諾伊州和阿拉巴馬州有女性射手。

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`Gender`) %>% 
  layout(title = "Total incidents by State with respect to Gender"
         , showlegend = T
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , legend = list(x = 1, y = 1)
         , hovermode = 'compare')

有精神健康問題的州

資料顯示,槍手之中有精神疾病的在加州最多,其次是華盛頓。

心理健康問題的原因很多,在學校欺凌,沒有達到社會標準,容易獲得影響大腦獎勵制度的藥物,以及社交媒體(當人們停止互相交流並失去與現實的聯繫時)。

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`Mental Health Issues`) %>% 
  layout(title = "Total incidents by State with respect to Mental Health"
         , showlegend = T
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , legend = list(x = 1, y = 1)
         , hovermode = 'compare')

#### 精神健康問題的散點圖

plot_ly(data = Crime
        ,type = 'scatter'
        ,mode = 'markers' 
        ,hoverinfo = 'text'
        ,x = ~Year
        ,y = ~State
        ,size = ~`Total victims`
        ,color = ~`Mental Health Issues`
        ,colors = c('Green','Black','Blue','Red')
        ,text = ~paste("Location: ", Location
                       ,'\n Date: ', Date 
                       ,'\n Total victims : ', `Total victims` 
                       ,'\n Race: ', Race)) %>% 
    layout(title = "Mass Shootings in US with year and state"
           , xaxis = list(title = "Years")
           , yaxis = list(title = "States"))
## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

在種族和年齡方面的分析

種族

例如,加州的白人射手數量最多

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`Race`) %>% 
  layout(title = "Total victims by State with respect to Race"
         , showlegend = T
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , legend = list(x = 1, y = 1)
         , hovermode = 'compare')

年齡

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`NewAge`) %>% 
  layout(title = "Total incidents by State with respect to Age"
         , showlegend = T
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , legend = list(x = 1, y = 1)
         , hovermode = 'compare')

原因和目標的分析

原因

例如,加州擁有最多精神病患者以及會犯下恐怖行動的槍手

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`Cause`) %>% 
  layout(title = "Total incidents by State with respect to Cause"
         , showlegend = T
         , barmode = 'stack'
         , xaxis = list(title = "State")
         , legend = list(x = 1, y = 1)
         , hovermode = 'compare')

目標

學校和家庭,是恐怖槍手所鎖定的最佳目標之一

作者認為可能有牽涉到心理健康問題,當時位在喬治亞的7名槍手犯下槍擊案的目標也是鎖定家人、女朋友以及同事等等。

plot_ly(data = Crime
        ,type = 'histogram'
        ,x = ~`State`
        ,color =~`Target`) %>% 
  layout(title = "Total incidents by State with respect to Target"
         , showlegend = F
         , barmode = 'stack'
         , xaxis = list(title = "State")
    , hovermode = 'compare')

心得

藉由這次的學習,第一次透過資料來達到分析數據的體驗