This project aims to design a dashboard for the marine dataset using shiny.semantic.

The marine dataset has 3,102,887 rows and 20 variables with the following structure:

ID | Description |
---|---|
LAT | Ship’s latitude |
LON | Ship’s longitude |
SPEED | Ship’s speed in knots |
COURSE | Ship’s course as angle |
HEADING | Ship’s compass direction |
DESTINATION | Ship’s destination (reported by the crew) |
FLAG | Ship’s flag |
LENGTH | Ship’s length in meters |
SHIPNAME | Ship’s name |
SHIPTYPE | Ship’s type |
SHIPID | Ship’s unique identifier |
WIDTH | Ship’s width in meters |
DWT | Ship’s deadweight in tonnes |
DATETIME | Date and time of the observation |
PORT | Current port reported by the vessel |
Date | Date extracted from DATETIME |
Weeknb | Week number extracted from date |
Shiptype | Ship’s type from SHIPTYPE |
Port | Current port assigned based on the ship’s location |
Isparked | Indicator of whether the ship is moving or not |
longestdistance | Longest distance between vessel observations |

The data is a log of AIS signals, which report each vessel's position at a given time frequency.
- Some SHIP_ID and SHIPNAME values conflict. In theory, each SHIP_ID is unique and should map to a single SHIPNAME, and each SHIPNAME should map to a single SHIP_ID. Examples of SHIP_ID values with more than one name:

SHIPID | n | LastSHIPNAME |
---|---|---|
4666609 | 6 | BLACKPEARL 7.3V |
315731 | 2 | BBAS |
315950 | 2 | WLA-311 |
316404 | 2 | KM ,TAN BORCHARDT |
316482 | 2 | WXA A SZCZESCIA |

The same happens for SHIPNAME, where a single name is associated with more than one SHIP_ID:

SHIPNAME | n | SelectID |
---|---|---|
SATAIS | 19 | 2.866114e+14 |
ALANA | 2 | 3.484650e+05 |
AMANDA | 2 | 3.233550e+05 |
ARGO | 2 | 3.653787e+06 |
AURA | 2 | 3.460220e+05 |

These inconsistencies were left untreated and should be tracked ahead of deployment.
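
One way to surface such conflicts is to count distinct names per identifier and distinct identifiers per name. The sketch below uses base R on toy data; the column names SHIP_ID and SHIPNAME follow the text above, and the toy values are invented for illustration rather than taken from the marine dataset.

```r
# Toy data (hypothetical values); column names assumed from the text above.
ships <- data.frame(
  SHIP_ID  = c(1, 1, 2, 3, 4),
  SHIPNAME = c("ALPHA", "ALPHA II", "BETA", "GAMMA", "GAMMA"),
  stringsAsFactors = FALSE
)

# Keep distinct (SHIP_ID, SHIPNAME) pairs so repeated signals count once.
pairs <- unique(ships[, c("SHIP_ID", "SHIPNAME")])

# Number of distinct names per ID and of distinct IDs per name.
names_per_id <- table(pairs$SHIP_ID)
ids_per_name <- table(pairs$SHIPNAME)

# IDs carrying more than one name, and names shared by more than one ID.
conflicting_ids   <- names(names_per_id)[names_per_id > 1]
conflicting_names <- names(ids_per_name)[ids_per_name > 1]

print(conflicting_ids)
print(conflicting_names)
```

On the toy data, ID 1 is reported because it carries two names, and GAMMA is reported because two IDs share it.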
- The relationship between IS_PARKED and SPEED holds beyond a speed of 3 knots. It is therefore safe to assume that IS_PARKED can be used to filter out vessels that are not in movement.
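
A quick way to check a relationship like this is to cross-tabulate the parked flag against a speed threshold. The sketch below is only an illustration on toy data: it assumes the claim means that no observation faster than 3 knots is flagged as parked, and the column names IS_PARKED and SPEED are taken from the text above.

```r
# Toy observations (hypothetical values, not from the marine dataset).
obs <- data.frame(
  SPEED     = c(0, 1, 2, 5, 12, 0, 8),
  IS_PARKED = c(1, 1, 0, 0, 0, 1, 0)
)

# Cross-tabulate the parked flag against the 3-knot threshold.
above_3_knots <- obs$SPEED > 3
cross_tab <- table(IS_PARKED = obs$IS_PARKED, above_3_knots = above_3_knots)
print(cross_tab)

# Observations flagged as parked while moving faster than 3 knots.
# Under the assumed reading of the claim, this count should be zero.
parked_but_fast <- sum(obs$IS_PARKED == 1 & obs$SPEED > 3)
print(parked_but_fast)
```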
To run in Shiny, the data was cleaned as follows:
- Kept only the observations of vessels in movement (IS_PARKED = 0),
leading to 333,188 observations, a reduction of 90% relative to the raw data.
- Removed 6 cases of vessels that had only one signal despite being in movement, which would cause issues in the max distance calculation.
- Calculated the max distance beforehand and returned to the app only the data needed for the dashboard, leading to 2,020 observations, a reduction of 99% relative to the silver data (Parked data).
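
The cleaning steps above can be sketched in base R as follows. This is a minimal illustration on toy data, not the project's actual pipeline. It assumes that "max distance" means the longest distance between two consecutive observations of the same vessel, measured with the haversine formula. The names `haversine_m`, `longest_leg` and `longest_distance_m` are hypothetical.

```r
# Haversine great-circle distance in meters (Earth radius assumed 6,371 km).
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Toy signals (hypothetical values); column names assumed from the text above.
signals <- data.frame(
  SHIP_ID   = c(1, 1, 1, 2, 3, 3),
  LAT       = c(54.0, 54.0, 54.1, 55.0, 56.0, 56.0),
  LON       = c(18.0, 18.1, 18.1, 19.0, 20.0, 20.0),
  DATETIME  = as.POSIXct(c("2020-01-01 10:00:00", "2020-01-01 10:02:00",
                           "2020-01-01 10:04:00", "2020-01-01 10:00:00",
                           "2020-01-01 10:00:00", "2020-01-01 10:02:00"),
                         tz = "UTC"),
  IS_PARKED = c(0, 0, 0, 0, 0, 1)
)

# Step 1: keep only observations of vessels in movement.
moving <- signals[signals$IS_PARKED == 0, ]

# Step 2: drop vessels with a single signal (no distance can be computed).
counts <- table(moving$SHIP_ID)
moving <- moving[moving$SHIP_ID %in% names(counts)[counts >= 2], ]

# Step 3: per vessel, find the longest leg between consecutive observations.
longest_leg <- function(d) {
  d <- d[order(d$DATETIME), ]
  n <- nrow(d)
  dist <- haversine_m(d$LAT[-n], d$LON[-n], d$LAT[-1], d$LON[-1])
  i <- which.max(dist)
  data.frame(SHIP_ID = d$SHIP_ID[1], longest_distance_m = dist[i],
             start_time = d$DATETIME[i], end_time = d$DATETIME[i + 1])
}
result <- do.call(rbind, lapply(split(moving, moving$SHIP_ID), longest_leg))
print(result)
```

Precomputing one row per vessel in this way means the app only has to load the small summary table instead of the full signal log.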