Missing Data
In Julia, missing values in data are represented using the special object missing, which is the single instance of the type Missing.
julia> missing
missing
julia> typeof(missing)
Missing
The Missing type lets users create Vectors and DataFrame columns with missing values. Here we create a vector that contains a missing value; the element type of the resulting vector is Union{Missing, Int64}.
julia> x = [1, 2, missing]
3-element Array{Union{Missing, Int64},1}:
1
2
missing
julia> eltype(x)
Union{Missing, Int64}
julia> Union{Missing, Int}
Union{Missing, Int64}
julia> eltype(x) == Union{Missing, Int}
true
missing values can be excluded when performing operations by using skipmissing, which returns a memory-efficient iterator over the non-missing elements without copying the data.
julia> skipmissing(x)
Base.SkipMissing{Array{Union{Missing, Int64},1}}(Union{Missing, Int64}[1, 2, missing])
The output of skipmissing can be passed directly into functions as an argument. For example, we can find the sum of all non-missing values or collect the non-missing values into a new missing-free vector.
julia> sum(skipmissing(x))
3
julia> collect(skipmissing(x))
2-element Array{Int64,1}:
1
2
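To see why skipmissing is needed, the following sketch compares a reduction over x with and without it. It assumes the Statistics standard library for mean, and the variable names other than x are chosen here for illustration only.

```julia
using Statistics  # standard library, provides mean

x = [1, 2, missing]

# A reduction over a vector that contains missing propagates the missing value.
total_with_missing = sum(x)

# Wrapping the vector in skipmissing drops the missing entries lazily.
total = sum(skipmissing(x))
average = mean(skipmissing(x))

println(ismissing(total_with_missing))  # true
println(total)                          # 3
println(average)                        # 1.5
```

Because skipmissing is lazy, no intermediate vector is allocated unless collect is called on it.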
The function coalesce can be used to replace missing values with another value (note the dot, indicating that the replacement should be applied to all entries in x):
julia> coalesce.(x, 0)
3-element Array{Int64,1}:
1
2
0
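The role of the dot can be illustrated with a short sketch. coalesce returns its first non-missing argument, so calling it without broadcasting on a vector simply returns the vector itself; the variable names below are illustrative.

```julia
x = [1, 2, missing]

# coalesce returns its first non-missing argument.
first_value = coalesce(missing, missing, 3)

# Without the dot, x itself is not missing, so it is returned unchanged.
unchanged = coalesce(x, 0)

# With the dot, each element of x is checked individually.
filled = coalesce.(x, 0)

println(first_value)           # 3
println(unchanged === x)       # true
println(filled == [1, 2, 0])   # true
println(eltype(filled) == Int) # true
```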
The Missings.jl package provides a few convenience functions to work with missing values.
The function Missings.replace returns an iterator which replaces missing elements with another value:
julia> using Missings
julia> Missings.replace(x, 1)
Missings.EachReplaceMissing{Array{Union{Missing, Int64},1},Int64}(Union{Missing, Int64}[1, 2, missing], 1)
julia> collect(Missings.replace(x, 1))
3-element Array{Int64,1}:
1
2
1
julia> collect(Missings.replace(x, 1)) == coalesce.(x, 1)
true
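Since Missings.replace returns an iterator, it can also be passed straight to a reduction without calling collect first. A minimal sketch, assuming the Missings.jl package is installed and that a replacement value of 0 is appropriate for the sum:

```julia
using Missings  # assumes the Missings.jl package is installed

x = [1, 2, missing]

# Consume the iterator directly; the missing entry is treated as 0.
total = sum(Missings.replace(x, 0))

println(total)  # 3
```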
The function Missings.T returns the element-type T in Union{T, Missing}.
julia> eltype(x)
Union{Missing, Int64}
julia> Missings.T(eltype(x))
Int64
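Recent Julia versions also provide a similar function in Base, nonmissingtype, which may be preferable where it is available. The sketch below assumes a Julia version that includes it and is an alternative shown for comparison, not the function described above.

```julia
x = [1, 2, missing]

# nonmissingtype strips Missing from a Union type.
T = nonmissingtype(eltype(x))

# Types that do not include Missing are returned unchanged.
S = nonmissingtype(Float64)

println(T == Int)      # true
println(S == Float64)  # true
```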
The missings function constructs Vectors and Arrays filled with missing values. The optional first argument specifies an element type T, so that the result has element type Union{Missing, T}; the remaining arguments give the dimensions.
julia> missings(1)
1-element Array{Missing,1}:
missing
julia> missings(3)
3-element Array{Missing,1}:
missing
missing
missing
julia> missings(1, 3)
1×3 Array{Missing,2}:
missing missing missing
julia> missings(Int, 1, 3)
1×3 Array{Union{Missing, Int64},2}:
missing missing missing
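A typical use of missings is to preallocate a container and fill in values as they become known. A minimal sketch, assuming the Missings.jl package is installed; the values 10 and 30 are made up for illustration.

```julia
using Missings  # assumes the Missings.jl package is installed

# Allocate a vector that can hold Int or missing; every entry starts as missing.
v = missings(Int, 3)

# Fill in the values that are known.
v[1] = 10
v[3] = 30

println(eltype(v) == Union{Missing, Int})     # true
println(count(ismissing, v))                  # 1
println(collect(skipmissing(v)) == [10, 30])  # true
```

Specifying the element type up front matters here: a vector created with missings(3) has element type Missing and cannot store integers.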
See the Julia manual for more information about missing values.