Complaint CodedMonth DateOccur FlagCrime FlagUnfounded
1: 19-064672 2020-01 1/1/2016 0:01 Y
2: 20-001342 2020-01 1/1/2017 15:01 Y
3: 20-000276 2020-01 1/1/2020 0:01 Y
4: 20-000390 2020-01 1/1/2020 0:01 Y
5: 20-000025 2020-01 1/1/2020 0:01 Y
6: 20-000274 2020-01 1/1/2020 0:01 Y
FlagAdministrative Count FlagCleanup Crime District
1: 1 NA 21000 6
2: 1 NA 115400 1
3: 1 NA 41011 3
4: 1 NA 64701 1
5: 1 NA 64701 4
6: 1 NA 142320 4
Description ILEADSAddress
1: RAPE -- FORCIBLE 6000
2: STLG BY DECEIT/IDENTITY THEFT REPORT 622
3: AGG.ASSAULT-FIREARM/CITIZEN ADULT 1ST DEGREE 1609
4: LARCENY-FROM MTR VEH UNDER $500 4677
5: LARCENY-FROM MTR VEH UNDER $500 0
6: DESTRUCTION OF PROPERTY-MALICIOUS/PRIV PROP 720
ILEADSStreet Neighborhood LocationName LocationComment CADAddress
1: HARNEY AVE 76 6000
2: W COURTOIS ST 2 622
3: S 13TH ST 33 1609
4: ROSA AVE 6 4677
5: S 11TH ST / SPRUCE ST 35
6: OLIVE ST 35 GALLERY 720 @LACLEDE GAS HQ 720
CADStreet XCoord YCoord
1: HARNEY 0.0 0.0
2: COURTOIS 887302.4 988957.3
3: 13TH 904149.1 1012541.0
4: ROSA 883704.8 999396.9
5: 906542.5 1016191.0
6: OLIVE 908135.8 1017699.0
The STL Metropolitan Police produces a monthly crime update.
It is stored in CSV format and can be downloaded.
It is located at https://www.slmpd.org/Crimereports.shtml.
The file provides all crime details collected from the preceding month.
It contains locations, neighborhoods, precincts, map coordinates and times of crimes in the St Louis Metropolitan Area.
Complaint CodedMonth DateOccur FlagCrime
Length:51982 Length:51982 Length:51982 Length:51982
Class :character Class :character Class :character Class :character
Mode :character Mode :character Mode :character Mode :character
FlagUnfounded FlagAdministrative Count FlagCleanup
Length:51982 Length:51982 Min. :-1.0000 Mode:logical
Class :character Class :character 1st Qu.: 1.0000 NA's:51982
Mode :character Mode :character Median : 1.0000
Mean : 0.9783
3rd Qu.: 1.0000
Max. : 1.0000
Crime District Description ILEADSAddress
Min. : 10000 Min. :0.000 Length:51982 Length:51982
1st Qu.: 64601 1st Qu.:2.000 Class :character Class :character
Median : 71013 Median :4.000 Mode :character Mode :character
Mean :117941 Mean :3.557
3rd Qu.:151130 3rd Qu.:5.000
Max. :266999 Max. :6.000
ILEADSStreet Neighborhood LocationName LocationComment
Length:51982 Min. : 0.00 Length:51982 Length:51982
Class :character 1st Qu.:16.00 Class :character Class :character
Mode :character Median :36.00 Mode :character Mode :character
Mean :38.03
3rd Qu.:59.00
Max. :88.00
CADAddress CADStreet XCoord YCoord
Length:51982 Length:51982 Min. : 0 Min. : 0
Class :character Class :character 1st Qu.:884689 1st Qu.:1001325
Mode :character Mode :character Median :892100 Median :1015854
Mean :785051 Mean : 894353
3rd Qu.:898641 3rd Qu.:1028030
Max. :911332 Max. :1093318
Again, some fields are irrelevant to our analysis.
We will remove these elements using a tidyverse library called dplyr.
We will also have to restructure certain date/time variables.
The Flag fields are not needed.
The Count field does not appear to be significant to the analysis.
Rows: 261
Columns: 15
$ Complaint <chr> "20-000005", "20-000030", "20-000083", "20-000204", "2~
$ CodedMonth <chr> "2020-01", "2020-01", "2020-01", "2020-01", "2020-01",~
$ DateOccur <chr> "1/1/2020 0:18", "1/1/2020 2:40", "1/1/2020 10:57", "1~
$ Crime <int> 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000~
$ District <int> 3, 6, 5, 3, 1, 4, 2, 4, 5, 6, 5, 5, 5, 2, 4, 6, 6, 4, ~
$ Description <chr> "HOMICIDE", "HOMICIDE", "HOMICIDE", "HOMICIDE", "HOMIC~
$ ILEADSAddress <chr> "3004", "5470", "1219", "4114", "4401", "1517", "4000"~
$ ILEADSStreet <chr> "S JEFFERSON AVE", "GENEVIEVE AVE", "N EUCLID AVE", "M~
$ Neighborhood <int> 22, 72, 53, 16, 16, 36, 27, 60, 38, 74, 48, 54, 50, 28~
$ LocationName <chr> "", "", "", "SOUTH GANGWAY", "", "", "", "", "BERNARD ~
$ LocationComment <chr> "", "", "", "", "OUTSIDE", "IN STREET", "REAR ALLEY", ~
$ CADAddress <chr> "", "5406", "1219", "4114", "", "", "4011", "2507", "4~
$ CADStreet <chr> "", "GENEVIEVE", "EUCLID", "MINNESOTA", "", "", "SHAW"~
$ XCoord <dbl> 899171.6, 892799.8, 888944.8, 895415.3, 891839.7, 9056~
$ YCoord <dbl> 1007325.0, 1043342.0, 1028564.0, 1000658.0, 1000169.0,~
I wanted to select a specific crime. In this case we will look at homicides.
Some data fields are not relevant to the analysis, so I dropped the four Flag fields and the Count field, which leaves 15 columns.
Homicides are UCR coded as 10000.
Although the STLMPD website states rows are unique, they are NOT, so duplicates are dropped by complaint number.
During this phase I also wanted to determine data types.
The mix is a combination of character strings, integers and doubles.
I will have to convert some elements so they are easier to manipulate later.
“CodedMonth” and “DateOccur” are not date/time elements, so they need to be changed.
Classes 'data.table' and 'data.frame': 261 obs. of 15 variables:
$ Complaint : chr "20-000005" "20-000030" "20-000083" "20-000204" ...
$ CodedMonth : Date, format: "2020-01-28" "2020-01-28" ...
$ DateOccur : POSIXct, format: "2020-01-01 00:18:00" "2020-01-01 02:40:00" ...
$ Crime : int 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 ...
$ District : int 3 6 5 3 1 4 2 4 5 6 ...
$ Description : chr "HOMICIDE" "HOMICIDE" "HOMICIDE" "HOMICIDE" ...
$ ILEADSAddress : chr "3004" "5470" "1219" "4114" ...
$ ILEADSStreet : chr "S JEFFERSON AVE" "GENEVIEVE AVE" "N EUCLID AVE" "MINNESOTA AVE" ...
$ Neighborhood : int 22 72 53 16 16 36 27 60 38 74 ...
$ LocationName : chr "" "" "" "SOUTH GANGWAY" ...
$ LocationComment: chr "" "" "" "" ...
$ CADAddress : chr "" "5406" "1219" "4114" ...
$ CADStreet : chr "" "GENEVIEVE" "EUCLID" "MINNESOTA" ...
$ XCoord : num 899172 892800 888945 895415 891840 ...
$ YCoord : num 1007325 1043342 1028564 1000658 1000169 ...
- attr(*, ".internal.selfref")=<externalptr>
Need to use some R libraries to convert data types.
Used the stringr and lubridate libraries to change data types.
Changed “CodedMonth” to a string value that resembles a year/month/day field.
Used 28 as the day value so I do not have to worry about the varying number of days in each month.
Since the data is collected as of the last day of the month, it will not affect the monthly crime perspective.
Next I converted the concatenated string into a date and converted “DateOccur” into a “POSIX” date/time variable.
Reporting.diff YCoord XCoord CADStreet CADAddress LocationComment
1: 164 days 0 0.0 BELLERIVE 112 <NA>
2: 150 days 0 0.0 WELLS 5203 BOARDING HOUSE
3: 127 days 0 0.0 VANDEVENTER 2822 <NA>
4: 125 days 0 0.0 <NA> <NA>
5: 51 days 0 0.0 DELMAR 5453 <NA>
---
257: -3 days 1027233 905683.5 HEBERT 1922 <NA>
258: -3 days 1025961 906344.3 <NA> <NA> <NA>
259: -3 days 1043190 886571.2 STRATFORD 6335 RESIDENCE
260: -3 days 1024302 907688.3 MADISON 1306 <NA>
261: -3 days 1026529 907935.3 10TH 2712 <NA>
LocationName Neighborhood ILEADSStreet ILEADSAddress Description
1: <NA> 1 BELLERIVE 112 HOMICIDE
2: 51 <NA> 5203 HOMICIDE
3: ZX GAS STATION 56 VANDEVENTER 2821 HOMICIDE
4: BP GAS STATION 64 GRAND 209 HOMICIDE
5: <NA> 49 DELMAR 5453 HOMICIDE
---
257: <NA> 63 HEBERT ST 1922 HOMICIDE
258: <NA> 63 ST LOUIS AVE 1420 HOMICIDE
259: <NA> 70 STRATFORD AVE 6339 HOMICIDE
260: <NA> 63 MADISON ST 1306 HOMICIDE
261: <NA> 64 N 10TH ST 2712 HOMICIDE
District Crime DateOccur CodedMonth Complaint
1: 1 10000 2020-02-15 22:30:00 2020-07-28 20-007630
2: 5 10000 2020-07-01 00:01:00 2020-11-28 20-039980
3: 5 10000 2020-05-24 01:14:00 2020-09-28 20-021821
4: 6 10000 2020-07-26 02:40:00 2020-11-28 20-032905
5: 5 10000 2020-05-08 14:00:00 2020-06-28 20-019553
---
257: 4 10000 2020-03-31 05:00:00 2020-03-28 20-014426
258: 4 10000 2020-07-31 22:30:00 2020-07-28 20-033932
259: 6 10000 2020-08-31 08:25:00 2020-08-28 20-039270
260: 4 10000 2020-08-31 18:26:00 2020-08-28 20-039382
261: 4 10000 2020-05-31 02:43:00 2020-05-28 20-023001
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\jim_PC_dell\Desktop\Crime-master\St Louis Shape files\nbrhds_wards\BND_Nhd88_cw.shp", layer: "BND_Nhd88_cw"
with 88 features
It has 6 fields
Integer64 fields read as strings: NHD_NUM
Collected US Census data to bring in geospatial polygons that represent St Louis Neighborhoods.
Transformed the neighborhood shape data into the WGS84 structure.
Check to make sure data is a geospatial object.
Use census geospatial data to generate a map.
Rows: 88
Columns: 6
$ NHD_NUM <chr> "43", "29", "28", "40", "41", "42", "39", "44", "36", "37",~
$ NHD_NAME <chr> "Franz Park", "Tiffany", "Botanical Heights", "Kings Oak", ~
$ ANGLE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
$ NHD_NUMTXT <chr> "43 Franz Park", "29 Tiffany", "28 Botanical Heights", "40 ~
$ SHAPE_area <dbl> 11012014, 5887342, 11586012, 4706723, 9245751, 9771242, 179~
$ SHAPE_len <dbl> 14740.430, 10467.847, 14700.023, 9239.956, 12357.106, 12518~
We have 88 neighborhoods; their names and numbers are stored as character types in R.
The polygon shapes are not part of this data frame; it holds only the six attribute fields.
# A tibble: 12 x 3
# Groups: CodedMonth [12]
CodedMonth Crime n
<date> <int> <int>
1 2020-07-28 10000 47
2 2020-06-28 10000 31
3 2020-08-28 10000 30
4 2020-11-28 10000 24
5 2020-09-28 10000 21
6 2020-05-28 10000 20
7 2020-12-28 10000 20
8 2020-04-28 10000 18
9 2020-10-28 10000 15
10 2020-01-28 10000 14
11 2020-03-28 10000 11
12 2020-02-28 10000 10
Group data by coded month.
Count the number of homicides per month.
Data presented in a bar graph with totals displayed above the bar.
I added a smoothing line to get a better view of the crime movement.
Note that October 2018 was the peak.
It was when Channel 5 reported the severe increase in carjackings. It looks like homicides increased too.
It was also the timeframe when they reported establishing a task force.
# A tibble: 61 x 6
NHD_NAME Crime n cumulative total cumul.percent
<chr> <int> <int> <int> <int> <dbl>
1 Baden 10000 15 15 261 5.75
2 Hamilton Heights 10000 14 29 261 11.1
3 Jeff Vanderlou 10000 14 43 261 16.5
4 Walnut Park West 10000 11 54 261 20.7
5 Carondelet 10000 10 64 261 24.5
6 Dutchtown 10000 10 74 261 28.4
7 Walnut Park East 10000 10 84 261 32.2
8 Greater Ville 10000 9 93 261 35.6
9 Wells Goodfellow 10000 8 101 261 38.7
10 Mount Pleasant 10000 7 108 261 41.4
# ... with 51 more rows
Had to adjust the factor variable (NHD_NAME) to account for missing values (NA).
Count by crime and put in descending order.
This is a display of the highest crime neighborhoods.
70% of the homicides are committed in the top 21 neighborhoods (roughly 24% of the 88 neighborhoods).
Group by Neighborhood Name.
Chart puts data in descending order and shows neighborhoods with more than 5 homicides.
# A tibble: 261 x 4
# Groups: CodedMonth [12]
CodedMonth DateOccur hr.day day.cat
<chr> <dttm> <int> <fct>
1 2020-01-28 2020-01-01 00:18:00 0 night
2 2020-01-28 2020-01-01 02:40:00 2 night
3 2020-01-28 2020-01-01 10:57:00 10 morning
4 2020-01-28 2020-01-02 02:10:00 2 night
5 2020-01-28 2020-01-02 12:25:00 12 afternoon
6 2020-01-28 2020-01-03 21:21:00 21 evening
7 2020-01-28 2020-01-09 13:00:00 13 afternoon
8 2020-01-28 2020-01-14 03:00:00 3 night
9 2020-01-28 2020-01-14 12:23:00 12 afternoon
10 2020-01-28 2020-01-18 18:14:00 18 afternoon
# ... with 251 more rows
Create an hour-of-day field with mutate from the date/time value parsed by lubridate.
This adds a new field to the crimeA data frame that categorizes each day into 6-hour blocks.
Used logic functions to segment the day categories.
Create a data frame that thins out and groups the variables that focus on time of day.
Reporting.diff YCoord XCoord CADStreet
Length:261 Min. : 0 Min. : 0 Length:261
Class :difftime 1st Qu.:1003364 1st Qu.:886653 Class :character
Mode :numeric Median :1026723 Median :893117 Mode :character
Mean : 918686 Mean :801786
3rd Qu.:1033512 3rd Qu.:897716
Max. :1060370 Max. :909769
CADAddress LocationComment LocationName Neighborhood
Length:261 Length:261 Length:261 Length:261
Class :character Class :character Class :character Class :character
Mode :character Mode :character Mode :character Mode :character
ILEADSStreet ILEADSAddress Description District
Length:261 Length:261 Length:261 Min. :0.000
Class :character Class :character Class :character 1st Qu.:3.000
Mode :character Mode :character Mode :character Median :5.000
Mean :4.157
3rd Qu.:6.000
Max. :6.000
Crime DateOccur CodedMonth
Min. :10000 Min. :2020-01-01 00:18:00 Min. :2020-01-28
1st Qu.:10000 1st Qu.:2020-05-21 17:21:00 1st Qu.:2020-05-28
Median :10000 Median :2020-07-14 20:47:00 Median :2020-07-28
Mean :10000 Mean :2020-07-15 09:48:42 Mean :2020-07-29
3rd Qu.:10000 3rd Qu.:2020-09-20 15:07:00 3rd Qu.:2020-09-28
Max. :10000 Max. :2020-12-28 01:03:00 Max. :2020-12-28
Complaint NHD_NAME
Length:261 Length:261
Class :character Class :character
Mode :character Mode :character
We will use the data we restructured earlier in the analysis.
We will use the crimeD data frame.
Check the structure of the data frame we selected.
XCoord and YCoord coordinates are based on the State Plane North American Datum 1983 (NAD83) format.
This data will have to be converted to lat/long values.
Some of the XCoords and YCoords have values of 0. This will need to be accounted for later in the analysis.
Reporting.diff YCoord XCoord CADStreet CADAddress LocationComment
1: 164 days 0 0 BELLERIVE 112 <NA>
2: 150 days 0 0 WELLS 5203 BOARDING HOUSE
3: 127 days 0 0 VANDEVENTER 2822 <NA>
4: 125 days 0 0 <NA> <NA>
5: 51 days 0 0 DELMAR 5453 <NA>
6: 50 days 0 0 <NA> <NA> <NA>
7: 33 days 0 0 MAFFITT 5376 <NA>
8: 21 days 0 0 ALICE 4561 <NA>
9: 21 days 0 0 <NA> 1 <NA>
10: 20 days 0 0 <NA> 4949 <NA>
11: 19 days 0 0 <NA> <NA> <NA>
12: 15 days 0 0 CABANNE 5811 <NA>
13: 15 days 0 0 <NA> <NA> <NA>
14: 14 days 0 0 VISTA 3635 <NA>
15: 12 days 0 0 <NA> <NA> <NA>
16: 11 days 0 0 <NA> <NA> <NA>
17: 8 days 0 0 <NA> <NA> <NA>
18: 6 days 0 0 ALICE 2145 <NA>
19: 5 days 0 0 <NA> <NA> <NA>
20: 4 days 0 0 BROADWAY 8105 <NA>
21: 4 days 0 0 ALDINE 4578 <NA>
22: 4 days 0 0 RIO TINTO 7837 <NA>
23: 3 days 0 0 <NA> <NA> <NA>
24: 1 days 0 0 NEWBERRY 4544 <NA>
25: 1 days 0 0 BROADWAY 8200 <NA>
26: 0 days 0 0 SELBER 5831 <NA>
27: -2 days 0 0 BROADWAY 8551 <NA>
Reporting.diff YCoord XCoord CADStreet CADAddress LocationComment
LocationName Neighborhood ILEADSStreet ILEADSAddress
1: <NA> 1 BELLERIVE 112
2: 51 <NA> 5203
3: ZX GAS STATION 56 VANDEVENTER 2821
4: BP GAS STATION 64 GRAND 209
5: <NA> 49 DELMAR 5453
6: <NA> 60 DODIER 1929
7: <NA> 50 MAFFITT 5372
8: <NA> 68 ALICE 4561
9: <NA> 0 UNKNOWN 20-050582 0
10: <NA> 0 UNKNOWN 0
11: <NA> 71 LILLIAN 5210
12: <NA> 48 CABANNE 5811
13: PARKING LOT 56 NORTH MARKET 3905
14: <NA> 0 UNKNOWN CITY OF ST LOUIS 0
15: <NA> 78 BLACKSTONE 1474
16: <NA> 67 LEE 3844
17: <NA> 35 WASHINGTON 405
18: <NA> 68 ALICE 2144
19: <NA> 64 MOUND 120
20: <NA> 2 BROADWAY 8105
21: <NA> 56 ALDINE 4576
22: <NA> 1 RIO SILVA 7859
23: <NA> 65 <NA> <NA>
24: <NA> 54 NEWBERRY 4544
25: Circle K 2 BROADWAY 8200
26: <NA> 50 GOODFELLOW 3401
27: <NA> 74 BROADWAY 8608
LocationName Neighborhood ILEADSStreet ILEADSAddress
Description District Crime DateOccur CodedMonth Complaint
1: HOMICIDE 1 10000 2020-02-15 22:30:00 2020-07-28 20-007630
2: HOMICIDE 5 10000 2020-07-01 00:01:00 2020-11-28 20-039980
3: HOMICIDE 5 10000 2020-05-24 01:14:00 2020-09-28 20-021821
4: HOMICIDE 6 10000 2020-07-26 02:40:00 2020-11-28 20-032905
5: HOMICIDE 5 10000 2020-05-08 14:00:00 2020-06-28 20-019553
6: HOMICIDE 4 10000 2020-05-09 16:20:00 2020-06-28 20-019670
7: HOMICIDE 5 10000 2020-07-26 20:20:00 2020-08-28 20-033059
8: HOMICIDE 6 10000 2020-12-07 20:08:00 2020-12-28 20-055207
9: HOMICIDE 0 10000 2020-11-07 09:31:00 2020-11-28 20-050582
10: HOMICIDE 0 10000 2020-04-08 21:00:00 2020-04-28 20-015525
11: HOMICIDE 6 10000 2020-12-09 13:26:00 2020-12-28 20-055459
12: HOMICIDE 5 10000 2020-12-13 03:30:00 2020-12-28 20-056007
13: HOMICIDE 5 10000 2020-08-13 01:03:00 2020-08-28 20-035970
14: HOMICIDE 0 10000 2020-07-14 13:12:00 2020-07-28 20-030853
15: HOMICIDE 5 10000 2020-03-16 20:45:00 2020-03-28 20-012642
16: HOMICIDE 6 10000 2020-12-17 23:00:00 2020-12-28 20-056813
17: HOMICIDE 4 10000 2020-12-20 21:26:00 2020-12-28 20-057230
18: HOMICIDE 6 10000 2020-12-22 09:54:00 2020-12-28 20-057462
19: HOMICIDE 4 10000 2020-12-23 23:30:00 2020-12-28 20-057726
20: HOMICIDE 1 10000 2020-12-24 05:10:00 2020-12-28 20-057741
21: HOMICIDE 5 10000 2020-12-24 12:25:00 2020-12-28 20-057800
22: HOMICIDE 1 10000 2020-12-24 20:15:00 2020-12-28 20-057833
23: HOMICIDE 4 10000 2020-12-25 01:39:00 2020-12-28 20-057853
24: HOMICIDE 5 10000 2020-12-27 14:32:00 2020-12-28 20-058118
25: HOMICIDE 1 10000 2020-12-27 18:07:00 2020-12-28 20-058138
26: HOMICIDE 5 10000 2020-12-28 01:03:00 2020-12-28 20-058163
27: HOMICIDE 6 10000 2020-03-30 12:55:00 2020-03-28 20-014351
Description District Crime DateOccur CodedMonth Complaint
NHD_NAME
1: Carondelet
2: Academy
3: Greater Ville
4: Near North Riverfront
5: Visitation Park
6: St. Louis Place
7: Wells Goodfellow
8: O'Fallon
9: <NA>
10: <NA>
11: Mark Twain
12: West End
13: Greater Ville
14: <NA>
15: Hamilton Heights
16: Fairground Neighborhood
17: Downtown
18: O'Fallon
19: Near North Riverfront
20: Patch
21: Greater Ville
22: Carondelet
23: Hyde Park
24: Lewis Place
25: Patch
26: Wells Goodfellow
27: Baden
NHD_NAME
Collect those records whose X/Y values are zeros.
These records will need a different type of processing.
Reporting.diff YCoord XCoord CADStreet CADAddress LocationComment
1: 47 days 1001653 884160.5 <NA> <NA> <NA>
2: 37 days 1007223 883071.9 PARKER 5274 <NA>
3: 27 days 1007325 899171.6
4: 27 days 1043342 892799.8 GENEVIEVE 5406
5: 27 days 1028564 888944.8 EUCLID 1219
---
230: -3 days 1027233 905683.5 HEBERT 1922 <NA>
231: -3 days 1025961 906344.3 <NA> <NA> <NA>
232: -3 days 1043190 886571.2 STRATFORD 6335 RESIDENCE
233: -3 days 1024302 907688.3 MADISON 1306 <NA>
234: -3 days 1026529 907935.3 10TH 2712 <NA>
LocationName Neighborhood ILEADSStreet ILEADSAddress Description
1: <NA> 5 CHRISTY BLVD 4934 HOMICIDE
2: <NA> 14 PARKER AVE 5274 HOMICIDE
3: 22 S JEFFERSON AVE 3004 HOMICIDE
4: 72 GENEVIEVE AVE 5470 HOMICIDE
5: 53 N EUCLID AVE 1219 HOMICIDE
---
230: <NA> 63 HEBERT ST 1922 HOMICIDE
231: <NA> 63 ST LOUIS AVE 1420 HOMICIDE
232: <NA> 70 STRATFORD AVE 6339 HOMICIDE
233: <NA> 63 MADISON ST 1306 HOMICIDE
234: <NA> 64 N 10TH ST 2712 HOMICIDE
District Crime DateOccur CodedMonth Complaint
1: 1 10000 2020-06-11 23:51:00 2020-07-28 20-025203
2: 2 10000 2020-09-21 05:14:00 2020-10-28 20-042812
3: 3 10000 2020-01-01 00:18:00 2020-01-28 20-000005
4: 6 10000 2020-01-01 02:40:00 2020-01-28 20-000030
5: 5 10000 2020-01-01 10:57:00 2020-01-28 20-000083
---
230: 4 10000 2020-03-31 05:00:00 2020-03-28 20-014426
231: 4 10000 2020-07-31 22:30:00 2020-07-28 20-033932
232: 6 10000 2020-08-31 08:25:00 2020-08-28 20-039270
233: 4 10000 2020-08-31 18:26:00 2020-08-28 20-039382
234: 4 10000 2020-05-31 02:43:00 2020-05-28 20-023001
NHD_NAME
1: Bevo Mill
2: North Hampton
3: Benton Park
4: Walnut Park East
5: Fountain Park
---
230: Old North St. Louis
231: Old North St. Louis
232: Mark Twain I-70 Industrial
233: Old North St. Louis
234: Near North Riverfront
These records are in much better shape.
They have both X and Y coordinates.
The function transforms all the State Plane Coordinate values into WGS84 lat/long coordinates.
WGS84 is the more modern mapping structure used for GPS mapping.
Used the censusxy library to pull latitude/longitude.
The geocode function from the library requires a street address and number, city, and zip code (if available).
It goes to the US Census Bureau to look up the address reported on the police record and returns a lat/long.
It creates an sf object and allows plotting of locations on a map.
Can only convert 22 instances with censusxy since some address locations are missing.
** cxy_geocode changed. class id function not output **
Add neighborhoods.
From https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html
These records are overlaid on the neighborhood polygons.
They have both X and Y coordinates.
***
Peaks illustrate the highest crime numbers for that area.
Contours indicate similar occurrences.
It uses cluster counts to illustrate homicide numbers in selected city areas.
As you drill down it recalculates the numbers over city areas.
From intersection of Goodfellow and MLK.
North along Goodfellow to W. Florissant.
Then southeast along W. Florissant to Prairie.
Then southwest along Prairie/Vandeventer to MLK.
Back to MLK and Goodfellow.
This is how it plots out with homicides.
A better prediction here, but the box still misses the south side hotspot.
Also, note the area running west along Interstate 55 and Northwest along Interstate 70.
And the mayor said she would give him an A?
Established in 2014.
These are the 6 police districts.
Now they are considering restructuring them again.
They want to increase the number.
Improvement or just more overhead?
Need to collect more data for greater understanding of crime parameters.
This data set has close to 8,000 instances of “FIREARM” defined crime. Where are the locations?
Need to plot heroin and cocaine locations to see overlaps.
There is no gang data available since 2012. St Louis does not have a Gang Division. Does it need one?
The UCR reporting structure is poorly constructed for the nation as a whole. How could it be improved?
---
title: "Homicides"
output:
flexdashboard::flex_dashboard:
storyboard: true
source_code: embed
theme: cerulean
---
```{r, echo=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(echo = FALSE,
include = FALSE,
eval = TRUE,
message = FALSE,
warning = FALSE,
fig.retina = 1,
tidy = TRUE)
```
```{r echo=FALSE}
# load all the library packages
library(tidyverse)
library(rgdal)
library(sp)
library(sf)
# library(raster) # raster conflicts with dplyr and is probably not needed (dplyr issue 3893)
library(leaflet)
library(leafpop)
library(mapview)
# Had to reload the old version of mapview (2.7.8) to run correctly. The new version does not support .fgb files
library(censusxy)
library(tidycensus)
library(ggplot2)
library(ggmap)
library(plotly)
library(RColorBrewer)
library(data.table)
library(fasttime)
library(sparklyr)
library(lubridate)
library(maps)
library(stringr)
library(readr)
library(knitr)
```
### 1. Begin by collecting crime data from the STL Metropolitan Police Website
```{r, include=TRUE}
# Collect St Louis City crime UCR statistics
# pull in state coordinate system files from st louis police reports using data.table
crime <- fread("data/Group2018_2020.csv", stringsAsFactors=FALSE)
head(crime)
```
***
- The STL Metropolitan Police produces a monthly crime update.
- It is stored in CSV format and can be downloaded.
- It is located at <https://www.slmpd.org/Crimereports.shtml>.
- The file provides all crime details collected from the preceding month.
- It contains locations, neighborhoods, precincts, map coordinates and times of crimes in the St Louis Metropolitan Area.
### 2. Look at the Data Values
```{r, include=TRUE}
summary(crime)
```
***
- Again, some fields are irrelevant to our analysis.
- We will remove these elements using a tidyverse library called *dplyr*.
- We will also have to restructure certain date/time variables.
- The Flag fields are not needed.
- The Count field does not appear to be significant to the analysis.
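
The column removal described above can be sketched on a small made-up table. The values below are placeholders, not SLMPD data, and the use of the tidyselect helper `starts_with()` is an assumption offered as an alternative to naming each Flag field one by one.

```r
library(dplyr)

# Toy stand-in for the crime table; the values are invented for illustration
toy <- data.frame(
  Complaint = c("20-000001", "20-000002", "20-000003"),
  FlagCrime = c("Y", "Y", "Y"),
  FlagUnfounded = c("", "", ""),
  Count = c(1L, 1L, -1L),
  Crime = c(10000L, 21000L, 10000L)
)

# Drop every column whose name starts with "Flag", plus Count
toy_trim <- toy %>%
  dplyr::select(-starts_with("Flag"), -Count)

names(toy_trim)
```

Matching on the prefix keeps the call short, but it would also drop any future column that happens to start with "Flag", so naming the fields explicitly may be the safer choice.
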
### 3. Adjust Data Structures to Match that Needed for Analysis
```{r, include=TRUE}
crimeA <- crime %>%
dplyr::select(-FlagCrime, -FlagUnfounded, -FlagAdministrative, -Count, -FlagCleanup) %>%
filter(Crime == 10000) %>%
distinct(Complaint, .keep_all = TRUE)
glimpse(crimeA)
```
***
- I wanted to select a specific crime. In this case we will look at homicides.
- Some data fields are not relevant to the analysis, so I dropped the four Flag fields and the Count field, which leaves 15 columns.
- Homicides are UCR coded as *10000*.
- Although the STLMPD website states rows are unique, they are *NOT*, so duplicates are dropped by complaint number.
- During this phase I also wanted to determine data types.
- The mix is a combination of character strings, integers and doubles.
- I will have to convert some elements so they are easier to manipulate later.
- "CodedMonth" and "DateOccur" are not date/time elements, so they need to be changed.
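
Since the rows are not unique, it can help to see which complaint numbers repeat before dropping them. This is a minimal sketch on invented complaint numbers; the names `toy`, `dupes` and `toy_unique` are hypothetical.

```r
library(dplyr)

# Toy complaint numbers (invented); "20-000002" appears twice
toy <- data.frame(
  Complaint = c("20-000001", "20-000002", "20-000002", "20-000003"),
  Crime = c(10000L, 10000L, 10000L, 10000L)
)

# List the complaint numbers that occur more than once
dupes <- toy %>%
  count(Complaint) %>%
  filter(n > 1)

# Keep the first row for each complaint number
toy_unique <- toy %>%
  distinct(Complaint, .keep_all = TRUE)

dupes
nrow(toy_unique)
```

`distinct()` keeps the first occurrence of each complaint number, so any differences between the repeated rows are lost. Inspecting `dupes` first shows how many records are affected.
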
### 4. Prepare Data for Manipulating Date/time Fields
```{r, include=FALSE}
crimeA$CodedMonth <- str_c(crimeA$CodedMonth, "28", sep = "-") # use stringr to append a day to the y/m structure
crimeA$CodedMonth <- as_date(crimeA$CodedMonth) # use lubridate to convert to actual y/m/d
crimeA$DateOccur <- mdy_hm(crimeA$DateOccur) # use lubridate to change string to date/time structure
```
```{r, include=TRUE}
### Result of Changing String Value {data-background=#fae5e3}
# - "CodedMonth" is now a date format and "DateOccur" is now a POSIX date time data type.
# - Check structures of the data.
str(crimeA)
```
***
- Need to use some R libraries to convert data types.
- Used the *stringr* and *lubridate* libraries to change data types.
- Changed "CodedMonth" to a string value that resembles a year/month/day field.
- Used 28 as the day value so I do not have to worry about the varying number of days in each month.
- Since the data is collected as of the last day of the month, it will not affect the monthly crime perspective.
- Next I converted the concatenated string into a date and converted "DateOccur" into a "POSIX" date/time variable.
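
The two conversions can be sketched on a pair of standalone values shaped like the raw fields. The vector names are hypothetical; the calls are the same `str_c()`, `as_date()` and `mdy_hm()` functions this step uses.

```r
library(stringr)
library(lubridate)

# Values in the same shape as the raw CodedMonth and DateOccur fields
coded_month <- c("2020-01", "2020-02")
date_occur <- c("1/1/2020 0:18", "2/15/2020 22:30")

# Append a fixed day so the year-month string parses as a date
coded_date <- as_date(str_c(coded_month, "28", sep = "-"))

# Parse month/day/year hour:minute strings into POSIXct (UTC by default)
occur_time <- mdy_hm(date_occur)

class(coded_date)
class(occur_time)
```

`mdy_hm()` assigns UTC unless a `tz` argument is supplied, which is fine for counting by hour as long as every record is parsed the same way.
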
```{r}
### Check Final Data Structure {data-background=#fae5e3}
summary(crimeA)
```
```{r}
### Make Date Structures Compatible and Calculate Reporting Delays {data-background=#fae5e3}
# - An interesting side note is to see the differences between the reporting date and the actual incident date.
# - Some of the records are reported significantly more than 30 days after the incident.
crimeB <- crimeA %>% mutate(Reporting.diff = CodedMonth - as_date(DateOccur)) %>%
dplyr::select(Reporting.diff:Complaint) %>%
arrange(desc(Reporting.diff))
crimeB$Neighborhood <- as_factor(crimeB$Neighborhood) # change to factor for later join
```
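
The reporting delay is a plain date subtraction. The sketch below reuses two incident and coded-month pairs that appear in the rows this dashboard prints; the vector names are hypothetical.

```r
library(lubridate)

# Two coded-month / incident pairs taken from rows printed in this dashboard
coded_month <- as_date(c("2020-07-28", "2020-03-28"))
date_occur <- ymd_hms(c("2020-02-15 22:30:00", "2020-03-31 05:00:00"))

# Subtracting two Date vectors yields a difftime measured in days
reporting_diff <- coded_month - as_date(date_occur)

as.numeric(reporting_diff)
```

The second value is negative because the coded month is pinned to the 28th. An incident on the 29th, 30th or 31st that is coded in the same month therefore shows a delay of -1 to -3 days, which is an artifact of the fixed day and not an early report.
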
### 5. Review Reporting Delays
```{r, include=TRUE}
crimeB
```
### **6. Bring in the Neighborhood Details**
```{r, include=TRUE}
### Now join neighborhoods with names
#add neighborhood shapes to a data frame
# From https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html
hoods.sf <- readOGR("St Louis Shape files/nbrhds_wards/BND_Nhd88_cw.shp")
hoods.sf <- spTransform(hoods.sf, CRS("+proj=longlat +datum=WGS84"))
#mapviewOptions(fgb = FALSE)
hoods <- mapview(hoods.sf, map.types = c("OpenStreetMap"),
layer.name = c("Neighborhoods"),
alpha.regions = 0.1,
alpha = 2,
legend = FALSE,
zcol = c("NHD_NAME"))
hoods
```
***
- Collected US Census data to bring in geospatial polygons that represent St Louis Neighborhoods.
- Transformed the neighborhood shape data into the *WGS84* structure.
- Check to make sure data is a geospatial object.
- Use census geospatial data to generate a map.
```{r}
### Convert Neighborhood Details {data-background=#fae5e3}
# - Change SF file into a data frame.
# collect neighborhood details from shape file
hoods.df <- as(hoods.sf, "data.frame")
class(hoods.df) # check class
```
### 7. Look at the data frame after adding in Neighborhood data
```{r, include=TRUE}
glimpse(hoods.df)
```
***
- We have 88 neighborhoods; their names and numbers are stored as character types in R.
- The polygon shapes are not part of this data frame; it holds only the six attribute fields.
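
Because the neighborhood number arrives as text in the shape file and as an integer in the crime data, the join key needs one common type. The sketch below is one hedged way to do that, converting the crime-side key to character. The two small tables are hypothetical, with neighborhood names taken from rows shown in this dashboard.

```r
library(dplyr)

# Toy tables: Neighborhood is an integer code, NHD_NUM is a character code
homicides <- data.frame(Neighborhood = c(22L, 72L, 0L))
hoods <- data.frame(
  NHD_NUM = c("22", "72"),
  NHD_NAME = c("Benton Park", "Walnut Park East")
)

# Convert the key to character so both sides of the join share one type
joined <- homicides %>%
  mutate(Neighborhood = as.character(Neighborhood)) %>%
  left_join(hoods, by = c("Neighborhood" = "NHD_NUM"))

joined
```

Neighborhood code 0 has no match in the shape file, so its name comes back as `NA`. That is the source of the missing neighborhood names seen later.
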
```{r}
### Clean Up Data - Trim Neighborhoods and Prepare for Joins {data-background=#fae5e3}
# - Bring in the neighborhood name with their respective number codes.
# - Create a new data frame.
crimeC <- hoods.df %>% dplyr::select(NHD_NUM, NHD_NAME)
# crimeC$NHD_NUM <- as.integer(crimeC$NHD_NUM) # convert to integer
# join homicide table with hoods table to get neighborhood names
crimeD <- left_join(crimeB, crimeC, by = c("Neighborhood" = "NHD_NUM"))
```
```{r}
### See the Final Data Frame
glimpse(crimeD)
```
### 8. Group by Month and Count Number of Homicides per Month
```{r, include=TRUE}
crimeA %>%
group_by(CodedMonth) %>%
count(Crime) %>%
arrange(desc(n))
```
***
- Group data by coded month.
- Count the number of *homicides per month*.
- Data presented in a bar graph with totals displayed above the bar.
- I added a smoothing line to get a better view of the crime movement.
- Note that October 2018 was the peak.
- It was when Channel 5 reported the severe increase in carjackings. It looks like homicides increased too.
- It was also the timeframe when they reported establishing a task force.
```{r, include=FALSE}
### Plot the count by month
crime.month <- crimeA %>%
group_by(CodedMonth) %>%
count(Crime) %>%
arrange(desc(n))
xx = ggplot(crime.month, aes(x = CodedMonth, y = n)) +
geom_text(aes(label = n, y = n), size = 5, position = position_stack(vjust = 1.2)) +
geom_col(color = "cornflowerblue") +
geom_point() +
stat_smooth() + # add a smoothing regression for the time series
scale_x_date(date_breaks = "4 weeks", date_labels = "%m") +
theme(axis.text.x = element_text(angle = 90)) + # change text to vertical
labs(title = "Homicides Per Month", x = "Month", y = "Homicide Count")
```
### **9. Plot Homicides per Month Using _ggplot2_ Library**
```{r, include=TRUE}
### Homicides by Month
xx
```
### 10. Look at Neighborhoods by Name and Count Numbers {data-background=#fae5e3}
```{r, include=TRUE}
### Neighborhood By Name
### Group by Neighborhood and count
crimeD %>%
mutate_if(is.factor,
fct_explicit_na,
na_level = "to_impute") %>%
group_by(NHD_NAME) %>%
count(Crime, sort = TRUE) %>%
arrange(desc(n)) %>%
ungroup()%>%
mutate (cumulative = cumsum(n), total = sum(n), cumul.percent = cumsum(c(n/total *100)))
```
***
- Had to adjust the factor variable (NHD_NAME) to account for missing values (NA).
- Count by crime and put in descending order.
- This is a display of the highest crime neighborhoods.
- 70% of the homicides are committed in the top 21 neighborhoods (roughly 24% of the 88 neighborhoods).
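
The cumulative percentage is the running sum of counts divided by the overall total. This sketch reproduces the first three rows of the table this step prints. The total of 261 is typed in by hand because the toy table is only partial, and the names `top` and `total_homicides` are hypothetical.

```r
library(dplyr)

# First three neighborhood counts from the table this step prints
top <- data.frame(
  NHD_NAME = c("Baden", "Hamilton Heights", "Jeff Vanderlou"),
  n = c(15L, 14L, 14L)
)

# Total number of homicide records, supplied by hand for the partial table
total_homicides <- 261

top <- top %>%
  mutate(
    cumulative = cumsum(n),
    cumul.percent = cumulative / total_homicides * 100
  )

round(top$cumul.percent, 2)
```

Dividing the running sum by the total gives the same result as summing the individual percentages, and reads a little more directly.
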
```{r}
### 11. Neighborhoods Count by Month
# - Group by Neighborhood Name.
# - Chart puts data in descending order and shows neighborhoods with more than 5 homicides.
### Plot the count by month
hood.number <- crimeD %>%
mutate_if(is.factor,
fct_explicit_na,
na_level = "to_impute") %>%
group_by(NHD_NAME) %>%
count(Crime) %>%
filter(n > 5) %>%
arrange(desc(n))
```
```{r}
xy = ggplot(hood.number, aes(x = reorder(NHD_NAME, +n), y = n)) +
geom_bar(stat = "identity") +
geom_col(color = "cornflowerblue") +
coord_flip() +
theme(axis.text.x = element_text(angle = 90)) + # change text to vertical
labs(title = "Homicides by Neighborhood", x= "Neighborhood", y = "Homicide Count")
```
### **11. Homicides by Neighborhood**
```{r, include=TRUE}
xy
```
***
- Group by Neighborhood Name.
- Chart puts data in descending order and shows neighborhoods with more than 5 homicides.
```{r, echo=FALSE, include=FALSE}
### 12. Time of Day Homicides
## create and mutate an hour of day field using lubridate
hour.day <- as.integer(format(crimeA$DateOccur, "%H"))
crimeA <- crimeA %>% as_tibble() %>%
mutate(hr.day = as.integer(format(crimeA$DateOccur, "%H")))
## This adds a new field to the crimeA data frame to categorize a day into 6-hour blocks
## used logic functions to segment the day categories
## adds field to crimeA
crimeA$day.cat <- ifelse(crimeA$hr.day > 0 & crimeA$hr.day < 6, "night",
ifelse(crimeA$hr.day >= 6 & crimeA$hr.day < 12, 'morning',
ifelse(crimeA$hr.day > 12 & crimeA$hr.day <= 18, "afternoon",
ifelse(crimeA$hr.day > 18 & crimeA$hr.day < 24, "evening",
ifelse(crimeA$hr.day == 0, "night",
ifelse(crimeA$hr.day == 12, "afternoon", NA ))))))
## arrange as factors
day.lvls <- c("morning", "afternoon", "evening", "night")
crimeA$day.cat <- factor(crimeA$day.cat, levels = day.lvls)
```
### **12. Time of Day Homicides**
- Look at the time of day that the homicides occurred.
```{r, echo=FALSE, include=TRUE}
homicide_tod <- crimeA %>% select(c(2:3,16:17)) %>%
group_by(CodedMonth)
homicide_tod$CodedMonth <- as.character(homicide_tod$CodedMonth)
homicide_tod
ggplot(homicide_tod) +
geom_bar(mapping = aes(x = CodedMonth, fill = day.cat), position = "dodge") +
scale_fill_discrete(name = "Time of Day", labels = c("Morning 6-12", "Afternoon 12-18", "Evening 18-24", "Night 0-6")) +
theme(axis.text.x = element_text(angle = 90)) +
labs(title = "Monthly Homicides by Time of Day", x = "Month", y = "Homicide Count")
```
***
- Derive an hour-of-day field from `DateOccur`.
- Add a new field to the `crimeA` data frame that sorts each record into a 6-hour block of the day.
- Use nested `ifelse()` logic to assign the blocks.
- Build a thinned-out data frame, grouped by month, that keeps only the time-of-day variables.
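
As an alternative to nested conditionals, the 6-hour blocks can be produced with base R's `cut()`. This is a minimal sketch, not the dashboard's own code: the vector `hr` stands in for the hour-of-day field, and the names `hr` and `day_cat` are introduced here for illustration.

```r
# Hours 0-23 stand in for the hour-of-day field
hr <- 0:23

# right = FALSE closes each interval on the left: [0,6), [6,12), [12,18), [18,24)
day_cat <- cut(hr,
               breaks = c(0, 6, 12, 18, 24),
               labels = c("night", "morning", "afternoon", "evening"),
               right = FALSE)

# Reorder the levels to match the order used in the chart legend
day_cat <- factor(day_cat, levels = c("morning", "afternoon", "evening", "night"))
print(table(day_cat))
```

Stating the break points once makes it easy to confirm that every hour lands in exactly one block and that each block holds six hours.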
### 13. Let's Look at the Geospatial Aspects of the Homicide Analysis
```{r, include=TRUE}
### Summary of the Characteristics of the Crime Data {data-background=#fae5e3}
summary(crimeD)
```
***
- We will use the data we restructured earlier in the analysis.
- We will use the `crimeD` data frame.
- Check the structure of the data frame we selected.
### 14. Important to understanding the geospatial structures of the data
- `XCoord` and `YCoord` are State Plane coordinates on the North American Datum of 1983 (NAD83).
- They must be converted to lat/long values before mapping.
- Some records have `XCoord` and `YCoord` values of 0. These are handled separately later in the analysis.
```{r}
### Let's Review the Basic Data Structure {data-background=#fae5e3}
str(crimeD)
```
### 15. Must Account For Inconsistent Coordinate Data
```{r}
crimeD.zeros <- crimeD %>% filter(XCoord < 1)
```
```{r, include=TRUE}
### Missing Coordinates {data-background=#fae5e3}
crimeD.zeros # there are 20 homicide records that cannot be processed directly
```
***
- Collect those records whose X/Y values are zeros.
- These records will need a different type of processing.
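
The split into usable and unusable records can be sketched as a single logical flag. The three rows below copy coordinates from the first lines of the crime file purely as sample values; the names `recs`, `has_xy`, `recs_complete`, and `recs_zeros` are introduced here for illustration.

```r
# Sample rows: Complaint and coordinate values taken from the head of the crime file
recs <- data.frame(
  Complaint = c("19-064672", "20-001342", "20-000276"),
  XCoord = c(0.0, 887302.4, 904149.1),
  YCoord = c(0.0, 988957.3, 1012541.0),
  stringsAsFactors = FALSE
)

# A record is directly mappable only when both coordinates are present and non-zero
has_xy <- !is.na(recs$XCoord) & !is.na(recs$YCoord) &
  recs$XCoord != 0 & recs$YCoord != 0

recs_complete <- recs[has_xy, ]   # can be projected to lat/long
recs_zeros <- recs[!has_xy, ]     # need address-based geocoding instead
```

Deriving both subsets from one flag guarantees that every record ends up in exactly one of them.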
```{r}
### Records That Can Be Directly Converted to Lat/Long {data-background=#fae5e3}
crimeD.complete <- crimeD %>% filter(XCoord > 1)
```
### 16. Complete Records
```{r, include=TRUE}
crimeD.complete
```
***
- These records are in much better shape.
- They have both X and Y coordinates.
### 17. Now we need to convert the NAD83 Coordinates to WGS84 Structure
```{r, echo=TRUE}
nad83_coords <- data.frame(x=crimeD.complete$XCoord, y=crimeD.complete$YCoord) # My coordinates in NAD83
nad83_coords <- nad83_coords *.3048 ### Feet to meters
coordinates(nad83_coords) <- c('x', 'y')
proj4string(nad83_coords)=CRS("+init=epsg:2815")
coordinates_deg <- spTransform(nad83_coords,CRS("+init=epsg:4326"))
coordinates_deg
#str(coordinates_deg)
#class(coordinates_deg)
# add the converted lon/lat values as numeric columns
crimeD.complete$lon <- as.numeric(coordinates_deg$x)
crimeD.complete$lat <- as.numeric(coordinates_deg$y)
#class(crimeD.complete)
```
***
- `spTransform()` converts the State Plane coordinate values into WGS84 lat/long coordinates.
- WGS84 is the coordinate system used by GPS and by most web mapping tools.
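
The conversion scales the coordinates from feet to meters by 0.3048 before projecting. That factor is the international foot; State Plane grids published in feet often use the US survey foot (1200/3937 m) instead. Which unit the SLMPD file uses is not stated here, so the sketch below is only a check, under that uncertainty, of how far apart the two conversions are for a coordinate of this size. The variable names are introduced for illustration.

```r
# One XCoord value from the head of the crime file, in feet
x_feet <- 887302.4

intl_foot_m   <- 0.3048        # international foot, exact
survey_foot_m <- 1200 / 3937   # US survey foot, exact ratio

x_m_intl   <- x_feet * intl_foot_m
x_m_survey <- x_feet * survey_foot_m

# Difference in meters between the two conversions
shift_m <- x_m_survey - x_m_intl
print(shift_m)
```

The gap is well under a meter at this magnitude, which is unlikely to matter for a city-scale map but is worth knowing before matching points to parcels or street segments.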
```{r}
### Review Characteristics of Downloaded Crime Data {data-background=#fae5e3}
glimpse(crimeD.complete)
```
### 18. Get Incomplete Data Missing Coordinates {data-background=#fae5e3}
- The _censusxy_ library is used to look up latitude/longitude for these records.
- Its geocode function requires a street number and name, a city, and a zip code (if available).
- It sends the address from the police record to the US Census Bureau, which returns a lat/long.
- It returns an _sf_ object, so the locations can be plotted on a map.
- Only 22 records can be converted with _censusxy_ because some address locations are missing.
**Note:** `cxy_geocode` has changed; the return type is now set with `class`, not `output`.
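
Before calling the geocoder it can help to build the address string and drop rows that have nothing to match on. This is a minimal base R sketch; the rows are modeled on the `CADAddress` and `CADStreet` columns, and the names `recs`, `can_geocode`, and `to_geocode` are introduced here for illustration.

```r
# Sample rows modeled on the CADAddress / CADStreet columns (one row left blank)
recs <- data.frame(
  CADAddress = c("6000", "", "720"),
  CADStreet = c("HARNEY", "", "OLIVE"),
  stringsAsFactors = FALSE
)

# Build the single-line street address plus the fixed city and state
recs$address.comb <- trimws(paste(recs$CADAddress, recs$CADStreet))
recs$city <- "St Louis"
recs$state <- "MO"

# Rows missing a street number or a street name are unlikely to be matched
can_geocode <- !is.na(recs$CADAddress) & nzchar(recs$CADAddress) &
  !is.na(recs$CADStreet) & nzchar(recs$CADStreet)
to_geocode <- recs[can_geocode, ]
print(to_geocode$address.comb)
```

Counting `sum(!can_geocode)` up front gives the number of records that will be absent from the map regardless of how the geocoder performs.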
```{r}
data <- mutate(crimeD.zeros, address.comb = paste(CADAddress, CADStreet, sep = " "), city = "St Louis", state = "MO")
crimeD_sf <- cxy_geocode(data, street = 'address.comb', city = 'city', state = 'state', class = "sf")
STL_homicides.small <- mapview(crimeD_sf,
map.types = c("OpenStreetMap"),
legend = FALSE,
popup = popupTable(data,zcol = c("Complaint",
"CodedMonth",
"NHD_NAME",
"District",
"Crime",
"Description")))
```
```{r}
### Locations Obtained From US Census With Addresses Only ...
STL_homicides.small
```
```{r}
### Larger Grouping that Contained Coordinates
#- These records contain the X/Y plotted locations.
### create an sf file that will map coordinates
data.one <- mutate(crimeD.complete, address.comb = paste(CADAddress, CADStreet, sep = " "), city = "St Louis", state = "MO")
crimeD_one.sf <- st_as_sf(data.one, coords = c("lon", "lat"), crs = 4326, agr = "constant")
STL_homicides <- mapview(crimeD_one.sf, map.types = c("OpenStreetMap"),
legend = FALSE,
popup = popupTable(data.one, zcol = c("Complaint",
"CodedMonth",
"NHD_NAME",
"District",
"Crime",
"Description")))
```
### 19. Combine Map Sets to View the Entire Picture of Homicide Location in St Louis
```{r, include=TRUE}
total_homicides <- STL_homicides + STL_homicides.small
total_homicides
```
```{r}
### Bring Up Neighborhood Map {data-background=#fae5e3}
hoods
```
***
- Add neighborhoods.
- From
### **20. Final Map of Homicides with Neighborhood Overlays**
```{r, include=TRUE}
#- Combine all the maps.
total_homicides <- STL_homicides + STL_homicides.small + hoods
total_homicides
```
***
- These records are overlaid on the neighborhood polygons.
- They have both X and Y coordinates.
```{r, echo=FALSE}
### Now We Look at Some Plots Targeting the Intensity of the Crime Area {data-background=#fae5e3}
# - Start with a quick plot of the homicides locations.
### reduce crime to violent crimes in downtown
violent_crimes <- crimeD.complete %>%
filter(
Crime == 10000,
-90.3238 <= lon & lon <= -90.1794334,
38.0 <= lat & lat <= 39.0 )
# use qmplot to make a scatterplot on a map
qmplot(lon, lat, data = violent_crimes,
maptype = "toner-lite", color = I("red"), zoom = 12)
```
### **25. Now We Look at These Homicide Plots with Density Contours**
```{r, include=TRUE}
### Density contour plots
qmplot(lon, lat, data = violent_crimes, maptype = "toner-lite",
geom = "density2d", color = I("red"), zoom = 12)
```
***
- The peaks (innermost contours) mark the areas with the highest homicide counts.
- Each contour line joins locations with a similar density of occurrences.
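
Contour plots of this kind are drawn from a 2-D kernel density estimate. The sketch below computes one directly with `MASS::kde2d()` and finds its peak. The points are synthetic (one tight cluster plus scattered background), not real crime data, and all variable names are introduced for illustration.

```r
library(MASS)  # kde2d(): 2-D kernel density estimate on a regular grid

set.seed(1)
# Synthetic points: 200 clustered around (-90.25, 38.65) plus 50 scattered ones
lon <- c(rnorm(200, mean = -90.25, sd = 0.01), runif(50, -90.32, -90.18))
lat <- c(rnorm(200, mean = 38.65, sd = 0.01), runif(50, 38.55, 38.75))

# Estimate the density on a 50 x 50 grid
dens <- kde2d(lon, lat, n = 50)

# Locate the grid cell with the highest estimated density
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)[1, ]
peak_lon <- dens$x[peak["row"]]
peak_lat <- dens$y[peak["col"]]
```

Calling `contour(dens)` on the result draws the contour lines; the peak found here sits inside the innermost one.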
### **21. Another View Using the Same Data Set Gives Us a Heat Map**
```{r, include=TRUE}
### This provides a good look at the density of homicides in the city
qmplot(lon, lat, data = violent_crimes, geom = "blank",
zoom = 14, maptype = "toner-background", legend = FALSE) +
stat_density_2d(aes(fill = ..level..), geom = "polygon", alpha = .35, colour = NA) +
scale_fill_gradient2("Homicides\nHeatmap", low = "white", mid = "yellow", high = "red", midpoint = 20)
```
***
- Redder areas indicate a higher density of homicides.
```{r}
### Another View of Crime Area Numbers {data-background=#fae5e3}
# - Use clusters to illustrate numbers in an area
zz <- leaflet(data=crimeD.complete) %>%
addTiles() %>%
setView(-90.222, 38.608, zoom = 11) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addCircleMarkers(lng = ~lon,
lat = ~lat,
fillColor = blues9,
stroke = FALSE, fillOpacity = 0.8,
clusterOptions = markerClusterOptions(),
popup = ~DateOccur) %>%
addPolygons(data= hoods.sf, label = ~NHD_NAME,
color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.005,
highlightOptions = highlightOptions(color = "white",
weight = 2,
bringToFront = TRUE))
```
### **22. Here is a Very Interesting View Called a Cluster Map**
```{r, include=TRUE}
zz
```
***
- It uses cluster counts to show homicide numbers in selected city areas.
- As you zoom in, it recalculates the counts for smaller areas.
```{r}
#### Task force focus
### Created database that defines the crime focus area
police_crime_focus <- fread("police_crime_focus.csv", stringsAsFactors=FALSE)
### Create a spatial file of the police crime focus
# police_crime_focus
police_point.sf <- st_as_sf(police_crime_focus,
coords = c("lon", "lat"),
crs = 4326, agr = "constant")
###police points
police_point.sf
### Create a matrix of lon/lat pairs
df <- data.frame(police_crime_focus$lon, police_crime_focus$lat)
# You need first to close your polygon
# (first and last points must be identical)
df <- rbind(df, df[1,])
### Create a polygon of the area of the police box
police.polygon <- st_sf(st_sfc(st_polygon(list(as.matrix(df)))), crs = 4326)
# police.polygon
police.box <- mapview(police.polygon, map.types = c("OpenStreetMap"),
layer.name = c("Police Box"),
legend = FALSE,
alpha.regions = 0.3,
alpha = 6,
label = NULL,
color = "red",
col.regions = "red")
## Show police box in red
```
### 23. This Illustrates the "Hayden Rectangle" Plotted Out
```{r, include=TRUE}
police.box
```
***
- From the intersection of Goodfellow and MLK.
- North along Goodfellow to W. Florissant.
- Then southeast along W. Florissant to Prairie.
- Then southwest along Prairie/Vandeventer to MLK.
- Back along MLK to Goodfellow.
```{r}
# Add in Police Box
STLtotal_homicides <- STL_homicides + STL_homicides.small + police.box
```
### **24. This is the Chief's Box Overlaid with Homicides**
```{r, include=TRUE}
STLtotal_homicides
```
***
- This is how it plots out with homicides.
- A better prediction here, but the box still misses the south side hotspot.
- Also, note the area running west along Interstate 55 and Northwest along Interstate 70.
- And the mayor said she would give him an *A*?
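
One way to put a number on how much of the homicide pattern the box captures is a point-in-polygon count. The sketch below uses a hand-written ray-casting test in base R. The box corners and the points are hypothetical placeholders, not the real rectangle or real records, and `point_in_polygon()` is a helper defined here for illustration.

```r
# Ray casting: count how many polygon edges a ray running east from the point crosses
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does edge j -> i straddle the horizontal line through the point?
    if ((vy[i] > py) != (vy[j] > py)) {
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical box corners (lon, lat); the ring is closed implicitly
box_lon <- c(-90.29, -90.29, -90.22, -90.22)
box_lat <- c(38.66, 38.70, 38.70, 38.66)

# Hypothetical homicide locations
pts <- data.frame(lon = c(-90.25, -90.23, -90.28),
                  lat = c(38.68, 38.58, 38.67))

pts$in_box <- mapply(point_in_polygon, pts$lon, pts$lat,
                     MoreArgs = list(vx = box_lon, vy = box_lat))
share_in_box <- mean(pts$in_box)
```

With _sf_ objects, as used elsewhere in this analysis, `sf::st_within()` should give the same answer without the hand-written helper.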
```{r, message=FALSE}
#add police district shapes to a data frame
police_district.sf <- readOGR("police-districts/GIS.STL.POLICE_DISTRICTS_2014.shp")
police_district.sf <- spTransform(police_district.sf, CRS("+proj=longlat +datum=WGS84"))
police_district <- mapview(police_district.sf, map.types = c("OpenStreetMap"),
layer.name = c("DISTNO"),
alpha.regions = 0.1,
alpha = 7,
legend = FALSE,
zcol = c("DISTNO"))
```
### **25. View Crime based on Police Districts**
```{r, include=TRUE}
police_district
```
***
- Established in 2014.
- These are the 6 police districts.
- Now they are considering restructuring them again.
- They want to increase the number.
- Improvement or just more overhead?
```{r}
# combine total crimes and police districts
district_homicides <- police_district + STL_homicides + STL_homicides.small
```
### **26. This Overlays Homicides Within the Police Districts**
```{r, include=TRUE}
district_homicides
```
```{r, echo=FALSE}
# Provide a cluster view with the current police districts using leaflet
xxx <- leaflet(data=crimeD.complete) %>%
addTiles() %>%
setView(-90.222, 38.608, zoom = 11) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addCircleMarkers(lng = ~lon,
lat = ~lat,
fillColor = blues9,
stroke = FALSE, fillOpacity = 0.8,
clusterOptions = markerClusterOptions(),
popup = ~DateOccur) %>%
addPolygons(data=police_district.sf, label = ~DISTNO,
color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.005,
highlightOptions = highlightOptions(color = "white",
weight = 3))
```
### **27. Finally We Look at Police Districts with Crime Clustering**
```{r, include=TRUE}
xxx
```
***
- Review crimes in each of the 6 police districts.
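
A per-district tally to accompany the map can be sketched with `table()`. The district values below are copied from the first rows of the crime file only as sample input, so the counts are not homicide totals; the names `district` and `district_counts` are introduced here for illustration.

```r
# Sample District values from the head of the crime file
district <- c(6, 1, 3, 1, 4, 4)

# Fix the levels at 1-6 so districts with no records still appear with a zero
district_counts <- table(factor(district, levels = 1:6))
print(district_counts)
```

Fixing the factor levels keeps empty districts in the output, which matters when comparing months.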
### **28. Food for Thought**
- Need to collect more data for greater understanding of crime parameters.
- This data set has close to 8,000 crimes whose description includes "FIREARM". Where are the locations?
- Need to plot heroin and cocaine locations to see overlaps.
- There is no gang data available since 2012. St Louis does not have a Gang Division. Does it need one?
- The UCR reporting structure is poorly constructed for the nation as a whole. How could it be improved?
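
Picking out the firearm-related records comes down to a text match on the `Description` column. This is a minimal sketch using four description strings from the first rows of the crime file; the names `desc`, `is_firearm`, and `n_firearm` are introduced here for illustration.

```r
# Description strings taken from the head of the crime file
desc <- c("RAPE -- FORCIBLE",
          "STLG BY DECEIT/IDENTITY THEFT REPORT",
          "AGG.ASSAULT-FIREARM/CITIZEN ADULT 1ST DEGREE",
          "LARCENY-FROM MTR VEH UNDER $500")

# fixed = TRUE treats the pattern as plain text rather than a regular expression
is_firearm <- grepl("FIREARM", desc, fixed = TRUE)
n_firearm <- sum(is_firearm)
print(n_firearm)
```

The same logical vector can subset the full data frame, after which the records can go through the coordinate conversion and mapping steps used for homicides.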
```{r}
sessionInfo()
```