Overview

Dataset statistics

Number of variables: 5
Number of observations: 13986
Missing cells: 70
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 546.5 KiB
Average record size in memory: 40.0 B
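
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Values copied from the dataset statistics above.
n_variables = 5
n_observations = 13986
missing_cells = 70
total_size_kib = 546.5

# Assumption: the percentage is computed over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (0.1% and 40.0 B).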

Variable types

Numeric: 2
DateTime: 1
Categorical: 1
Boolean: 1

Alerts

computed has constant value "" (Constant)
movie has a high cardinality: 818 distinct values (High cardinality)
rating is highly correlated overall with count (High correlation)
count is highly correlated overall with rating (High correlation)
timestamp has unique values (Unique)
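
The alerts above can be read as simple rules over each column's summary. The sketch below is illustrative only: the `ColumnSummary` type, both functions, and the thresholds (50 distinct values for high cardinality, 0.5 for high correlation) are assumptions made for this example, not settings confirmed by the report.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    name: str
    kind: str      # "numeric", "datetime", "categorical" or "boolean"
    distinct: int  # number of distinct non-missing values


def column_alerts(col, n_rows, cardinality_threshold=50):
    """Return alert labels for one column (hypothetical rules)."""
    if col.distinct == 1:
        return ["Constant"]
    if col.distinct == n_rows:
        return ["Unique"]
    if col.kind == "categorical" and col.distinct > cardinality_threshold:
        return ["High cardinality"]
    return []


def correlation_alerts(correlations, threshold=0.5):
    """Flag both directions of every pair whose |correlation| exceeds the threshold."""
    flagged = []
    for (a, b), value in correlations.items():
        if abs(value) > threshold:
            flagged.extend([(a, b), (b, a)])
    return flagged


N_ROWS = 13986  # number of observations from the overview
columns = [
    ColumnSummary("rating", "numeric", 221),
    ColumnSummary("count", "numeric", 9194),
    ColumnSummary("timestamp", "datetime", 13986),
    ColumnSummary("movie", "categorical", 818),
    ColumnSummary("computed", "boolean", 1),
]

alerts = {c.name: column_alerts(c, N_ROWS) for c in columns}
high_corr = correlation_alerts({("rating", "count"): 0.547})

for name, labels in alerts.items():
    print(name, labels)
print(high_corr)
```

With the distinct counts and the 0.547 coefficient reported in this document, these rules flag the same five alerts listed above.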

Reproduction

Analysis started: 2023-04-10 10:19:11.779062
Analysis finished: 2023-04-10 10:19:17.642619
Duration: 5.86 seconds
Software version: ydata-profiling v4.1.2
Download configuration: config.json
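
The reported duration is the difference between the two timestamps above. A small sketch of that check, using only the Python standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-04-10 10:19:11.779062", FMT)
finished = datetime.strptime("2023-04-10 10:19:17.642619", FMT)

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```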

Variables

rating
Real number (ℝ)

Distinct: 221
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8993522
Minimum: 2.01
Maximum: 4.63
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 109.4 KiB

Quantile statistics

Minimum: 2.01
5-th percentile: 3.09
Q1: 3.74
Median: 3.99
Q3: 4.18
95-th percentile: 4.44
Maximum: 4.63
Range: 2.62
Interquartile range (IQR): 0.44

Descriptive statistics

Standard deviation: 0.41720531
Coefficient of variation (CV): 0.10699349
Kurtosis: 1.7726651
Mean: 3.8993522
Median Absolute Deviation (MAD): 0.22
Skewness: -1.2127862
Sum: 54536.34
Variance: 0.17406027
Monotonicity: Not monotonic
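
Several of the rating statistics are derived from one another. The sketch below recomputes them from the reported values, assuming the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum). Because the inputs are already rounded, the results only match the report to within rounding.

```python
# Values copied from the rating statistics in this report.
n = 13986
total = 54536.34
mean = 3.8993522
std = 0.41720531
q1, q3 = 3.74, 4.18
minimum, maximum = 2.01, 4.63

derived = {
    "mean": total / n,             # sum divided by number of observations
    "variance": std ** 2,          # square of the standard deviation
    "cv": std / mean,              # coefficient of variation
    "iqr": q3 - q1,                # interquartile range
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value:.8f}")
```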
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
4.18                431     3.1%
4.08                418     3.0%
4.29                416     3.0%
3.88                371     2.7%
3.94                330     2.4%
4.23                319     2.3%
3.77                314     2.2%
4                   302     2.2%
4.06                294     2.1%
3.57                271     1.9%
Other values (211)  10520   75.2%

Minimum 10 values

Value   Count   Frequency (%)
2.01    5       < 0.1%
2.03    1       < 0.1%
2.07    4       < 0.1%
2.08    6       < 0.1%
2.12    5       < 0.1%
2.27    2       < 0.1%
2.28    1       < 0.1%
2.3     4       < 0.1%
2.31    1       < 0.1%
2.32    3       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
4.63    1       < 0.1%
4.58    5       < 0.1%
4.57    143     1.0%
4.56    1       < 0.1%
4.55    1       < 0.1%
4.54    146     1.0%
4.52    4       < 0.1%
4.51    3       < 0.1%
4.5     7       0.1%
4.49    40      0.3%

count
Real number (ℝ)

Distinct: 9194
Distinct (%): 65.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 723678.15
Minimum: 159
Maximum: 1809060
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 109.4 KiB

Quantile statistics

Minimum: 159
5-th percentile: 58627
Q1: 297988.5
Median: 800336
Q3: 1059577
95-th percentile: 1372210.8
Maximum: 1809060
Range: 1808901
Interquartile range (IQR): 761588.5

Descriptive statistics

Standard deviation: 436894.36
Coefficient of variation (CV): 0.60371363
Kurtosis: -1.0256216
Mean: 723678.15
Median Absolute Deviation (MAD): 363135
Skewness: 0.072813566
Sum: 1.0121363 × 10^10
Variance: 1.9087668 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
807145               7       0.1%
1336334              5       < 0.1%
1050058              5       < 0.1%
1073076              5       < 0.1%
800336               5       < 0.1%
826408               5       < 0.1%
1045269              5       < 0.1%
1338234              5       < 0.1%
876934               5       < 0.1%
101772               5       < 0.1%
Other values (9184)  13934   99.6%

Minimum 10 values

Value   Count   Frequency (%)
159     1       < 0.1%
192     5       < 0.1%
200     2       < 0.1%
202     1       < 0.1%
206     1       < 0.1%
213     1       < 0.1%
216     1       < 0.1%
219     1       < 0.1%
221     1       < 0.1%
222     2       < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
1809060   1       < 0.1%
1783020   1       < 0.1%
1764034   4       < 0.1%
1763443   1       < 0.1%
1763442   2       < 0.1%
1763441   1       < 0.1%
1762431   1       < 0.1%
1762429   1       < 0.1%
1762428   2       < 0.1%
1761960   4       < 0.1%

timestamp
DateTime

Distinct: 13986
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 109.4 KiB
Minimum: 2023-02-24 23:49:18.424099
Maximum: 2023-04-09 12:31:32.423742
Histogram with fixed size bins (bins=50)

movie
Categorical

Distinct: 818
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Memory size: 109.4 KiB

Value               Count
m3gan               161
la-la-land          158
infinity-pool       154
knives-out-2019     153
get-out-2017        152
Other values (813)  13208

Length

Max length: 82
Median length: 45
Mean length: 15.805162
Min length: 2

Characters and Unicode

Total characters: 221051
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
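
As an illustration of how those character properties can be read, the sketch below counts Unicode general categories for the five sample values of movie, using Python's standard `unicodedata` module. The helper name `category_counts` is made up for this example, and this is not necessarily how the profiling tool computes its tables. The two-letter codes map to the category names used in this report: `Ll` is Lowercase Letter, `Pd` is Dash Punctuation, `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter

# The five sample values listed for the movie variable.
sample = ["the-quiet-girl", "inside-2023", "of-an-age", "gods-time", "m3gan"]


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(sample)
for code, count in counts.most_common():
    print(code, count)
```

The standard library exposes the general category but not the script or block of a character, so the script and block breakdowns in this report would need a different data source.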

Unique

Unique: 36
Unique (%): 0.3%

Sample

1st row: the-quiet-girl
2nd row: inside-2023
3rd row: of-an-age
4th row: gods-time
5th row: m3gan

Common Values

Value                              Count   Frequency (%)
m3gan                              161     1.2%
la-la-land                         158     1.1%
infinity-pool                      154     1.1%
knives-out-2019                    153     1.1%
get-out-2017                       152     1.1%
glass-onion-a-knives-out-mystery   151     1.1%
gone-girl                          149     1.1%
spirited-away                      149     1.1%
the-shining                        149     1.1%
everything-everywhere-all-at-once  148     1.1%
Other values (808)                 12462   89.1%

Length

Histogram of lengths of the category

Most occurring characters

Value              Count   Frequency (%)
-                  26539   12.0%
e                  20527   9.3%
a                  15976   7.2%
t                  15713   7.1%
n                  13611   6.2%
i                  12966   5.9%
o                  12248   5.5%
r                  11865   5.4%
s                  11014   5.0%
l                  9509    4.3%
Other values (27)  71083   32.2%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  180694   81.7%
Dash Punctuation  26539    12.0%
Decimal Number    13818    6.3%

Most frequent character per category

Lowercase Letter

Value              Count   Frequency (%)
e                  20527   11.4%
a                  15976   8.8%
t                  15713   8.7%
n                  13611   7.5%
i                  12966   7.2%
o                  12248   6.8%
r                  11865   6.6%
s                  11014   6.1%
l                  9509    5.3%
h                  9159    5.1%
Other values (16)  48106   26.6%

Decimal Number

Value   Count   Frequency (%)
2       5998    43.4%
0       3340    24.2%
1       1993    14.4%
9       1102    8.0%
7       527     3.8%
3       372     2.7%
4       218     1.6%
6       133     1.0%
8       84      0.6%
5       51      0.4%

Dash Punctuation

Value   Count   Frequency (%)
-       26539   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   180694   81.7%
Common  40357    18.3%

Most frequent character per script

Latin

Value              Count   Frequency (%)
e                  20527   11.4%
a                  15976   8.8%
t                  15713   8.7%
n                  13611   7.5%
i                  12966   7.2%
o                  12248   6.8%
r                  11865   6.6%
s                  11014   6.1%
l                  9509    5.3%
h                  9159    5.1%
Other values (16)  48106   26.6%

Common

Value   Count   Frequency (%)
-       26539   65.8%
2       5998    14.9%
0       3340    8.3%
1       1993    4.9%
9       1102    2.7%
7       527     1.3%
3       372     0.9%
4       218     0.5%
6       133     0.3%
8       84      0.2%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   221051   100.0%

Most frequent character per block

ASCII

Value              Count   Frequency (%)
-                  26539   12.0%
e                  20527   9.3%
a                  15976   7.2%
t                  15713   7.1%
n                  13611   6.2%
i                  12966   5.9%
o                  12248   5.5%
r                  11865   5.4%
s                  11014   5.0%
l                  9509    4.3%
Other values (27)  71083   32.2%

computed
Boolean

Distinct: 1
Distinct (%): < 0.1%
Missing: 70
Missing (%): 0.5%
Memory size: 109.4 KiB

Value      Count   Frequency (%)
False      13916   99.5%
(Missing)  70      0.5%

Interactions

[Figure: interaction plots (4 images)]

Correlations

        rating  count
rating  1.000   0.547
count   0.547   1.000
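
The report does not state which coefficient produced the 0.547 above. As a hedged illustration, the sketch below computes a Spearman rank correlation (Pearson correlation of average ranks) in plain Python; choosing Spearman is an assumption, and the function names are made up for this example. It is applied to the ten first rows from the Sample section, so its result is not expected to reproduce the full-dataset value.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# rating and count from the ten first rows shown in the Sample section.
ratings = [4.00, 3.32, 3.75, 3.47, 3.12, 3.31, 3.94, 3.42, 3.13, 3.99]
counts = [17896, 306, 2711, 192, 262036, 86358, 433537, 239177, 159658, 1146452]

rho = spearman(ratings, counts)
print(f"Spearman correlation on 10 sample rows: {rho:.3f}")
```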

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

       rating  count    timestamp                   movie                     computed
0      4.00    17896    2023-02-24 23:49:18.424099  the-quiet-girl            None
1      3.32    306      2023-02-24 23:49:18.744766  inside-2023               None
2      3.75    2711     2023-02-24 23:49:19.072004  of-an-age                 None
3      3.47    192      2023-02-24 23:49:19.366313  gods-time                 None
4      3.12    262036   2023-02-24 23:49:19.697683  m3gan                     None
5      3.31    86358    2023-02-24 23:49:20.057960  infinity-pool             None
6      3.94    433537   2023-02-24 23:49:20.381691  how-to-train-your-dragon  None
7      3.42    239177   2023-02-24 23:49:20.733004  raya-and-the-last-dragon  None
8      3.13    159658   2023-02-24 23:49:21.065949  robots                    None
9      3.99    1146452  2023-02-24 23:49:21.442832  la-la-land                None

Last rows

       rating  count    timestamp                   movie                     computed
13976  4.13    204004   2023-04-09 12:31:30.498380  drive-my-car              False
13977  3.79    391383   2023-04-09 12:31:30.749797  palm-springs-2020         False
13978  3.34    362994   2023-04-09 12:31:30.975680  oceans-eight              False
13979  4.21    331015   2023-04-09 12:31:31.178286  children-of-men           False
13980  3.58    426351   2023-04-09 12:31:31.402401  wreck-it-ralph            False
13981  3.58    607936   2023-04-09 12:31:31.586157  bullet-train              False
13982  2.40    530433   2023-04-09 12:31:31.802569  thor-the-dark-world       False
13983  4.21    1190     2023-04-09 12:31:31.983872  past-lives                False
13984  4.10    774288   2023-04-09 12:31:32.199486  blade-runner-2049         False
13985  4.15    608562   2023-04-09 12:31:32.423742  little-miss-sunshine      False