Functions

Base.append!
Base.copy
Base.delete!
Base.filter
Base.filter!
Base.first
Base.get
Base.hcat
Base.issorted
Base.keys
Base.last
Base.length
Base.names
Base.ndims
Base.pairs
Base.parent
Base.propertynames
Base.push!
Base.repeat
Base.show
Base.similar
Base.size
Base.sort
Base.sort!
Base.sortperm
Base.unique
Base.unique!
Base.vcat
CategoricalArrays.categorical
Compat.eachcol
Compat.eachrow
DataAPI.describe
DataFrames.DataFrame!
DataFrames.allowmissing!
DataFrames.antijoin
DataFrames.categorical!
DataFrames.combine
DataFrames.completecases
DataFrames.crossjoin
DataFrames.disallowmissing!
DataFrames.dropmissing
DataFrames.dropmissing!
DataFrames.flatten
DataFrames.groupby
DataFrames.groupcols
DataFrames.groupindices
DataFrames.innerjoin
DataFrames.insertcols!
DataFrames.leftjoin
DataFrames.mapcols
DataFrames.mapcols!
DataFrames.ncol
DataFrames.nonunique
DataFrames.nrow
DataFrames.order
DataFrames.outerjoin
DataFrames.rename
DataFrames.rename!
DataFrames.repeat!
DataFrames.rightjoin
DataFrames.select
DataFrames.select!
DataFrames.semijoin
DataFrames.stack
DataFrames.transform
DataFrames.transform!
DataFrames.unstack
DataFrames.valuecols
Missings.allowmissing
Missings.disallowmissing

Joining, Grouping, and Split-Apply-Combine

DataFrames.innerjoin — Function

innerjoin(df1, df2; on, makeunique = false,
          validate = (false, false))
innerjoin(df1, df2, dfs...; on, makeunique = false,
          validate = (false, false))

Perform an inner join of two or more data frame objects and return a DataFrame containing the result. An inner join includes rows with keys that match in all passed data frames.

Arguments

df1, df2, dfs...: the AbstractDataFrames to be joined

Keyword Arguments

on : A column name to join df1 and df2 on. If the join columns have different names in df1 and df2, pass a left=>right pair. To join on multiple columns, pass a vector of column names or column name pairs (names and pairs may be mixed). If more than two data frames are joined, only a column name or a vector of column names is allowed. on is a required argument.
makeunique : if false (the default), an error is raised if duplicate names are found among the columns not joined on; if true, duplicate names are suffixed with _i (i starting at 1 for the first duplicate).
validate : whether to check that the columns passed as the on argument define unique keys in each input data frame (according to isequal). Can be a tuple or a pair, with the first element indicating whether to run the check for df1 and the second element for df2. By default no check is performed.
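
The following is a minimal sketch of how makeunique and validate behave. The tables people and jobs are hypothetical and are not part of the examples on this page. The text above does not say which exception type a failed validate check raises, so the sketch catches any exception.

```julia
using DataFrames

# Hypothetical inputs: both tables carry a non-key column called Note,
# and jobs repeats the key ID = 2.
people = DataFrame(ID = [1, 2, 3], Note = ["a", "b", "c"])
jobs = DataFrame(ID = [1, 2, 2], Note = ["x", "y", "z"])

# Note appears in both inputs, so makeunique = true is needed here;
# the second Note column is renamed to Note_1.
joined = innerjoin(people, jobs, on = :ID, makeunique = true)

# validate = (true, false) checks key uniqueness in people only, which holds.
checked = innerjoin(people, jobs, on = :ID, makeunique = true,
                    validate = (true, false))

# validate = (false, true) checks jobs, where ID = 2 occurs twice,
# so the call is expected to throw.
right_check_failed = try
    innerjoin(people, jobs, on = :ID, makeunique = true,
              validate = (false, true))
    false
catch
    true
end
```

ID = 2 matches two rows of jobs, so joined has three rows. Setting validate is a cheap way to catch this kind of unintended row duplication.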

When merging on categorical columns that differ in the ordering of their levels, the ordering of the left data frame takes precedence over the ordering of the right data frame.

If more than two data frames are passed, the join is performed recursively with left associativity: df1 and df2 are joined first, and the result is then joined with each remaining data frame in turn. The validate keyword argument is applied in the same way at each step.
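
As a small illustration of the left-associative behavior, the sketch below compares a three-way join with the equivalent nested two-way joins. The name and job tables mirror the examples on this page; city is a hypothetical third table added only for this sketch.

```julia
using DataFrames

name = DataFrame(ID = [1, 2, 3], Name = ["John Doe", "Jane Doe", "Joe Blogs"])
job = DataFrame(ID = [1, 2, 4], Job = ["Lawyer", "Doctor", "Farmer"])
# Hypothetical third table, introduced for illustration only.
city = DataFrame(ID = [2, 3, 4], City = ["Oslo", "Lima", "Kyiv"])

# Join all three data frames in one call.
all_at_once = innerjoin(name, job, city, on = :ID)

# The same join written out step by step, left to right.
step_by_step = innerjoin(innerjoin(name, job, on = :ID), city, on = :ID)
```

Only ID = 2 is present in all three tables, so both results contain that single row.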

Examples

julia> name = DataFrame(ID = [1, 2, 3], Name = ["John Doe", "Jane Doe", "Joe Blogs"])
3×2 DataFrame
│ Row │ ID    │ Name      │
│     │ Int64 │ String    │
├─────┼───────┼───────────┤
│ 1   │ 1     │ John Doe  │
│ 2   │ 2     │ Jane Doe  │
│ 3   │ 3     │ Joe Blogs │

julia> job = DataFrame(ID = [1, 2, 4], Job = ["Lawyer", "Doctor", "Farmer"])
3×2 DataFrame
│ Row │ ID    │ Job    │
│     │ Int64 │ String │
├─────┼───────┼────────┤
│ 1   │ 1     │ Lawyer │
│ 2   │ 2     │ Doctor │
│ 3   │ 4     │ Farmer │

julia> innerjoin(name, job, on = :ID)
2×3 DataFrame
│ Row │ ID    │ Name     │ Job    │
│     │ Int64 │ String   │ String │
├─────┼───────┼──────────┼────────┤
│ 1   │ 1     │ John Doe │ Lawyer │
│ 2   │ 2     │ Jane Doe │ Doctor │

julia> job2 = DataFrame(identifier = [1, 2, 4], Job = ["Lawyer", "Doctor", "Farmer"])
3×2 DataFrame
│ Row │ identifier │ Job    │
│     │ Int64      │ String │
├─────┼────────────┼────────┤
│ 1   │ 1          │ Lawyer │
│ 2   │ 2          │ Doctor │
│ 3   │ 4          │ Farmer │

julia> innerjoin(name, job2, on = :ID => :identifier)
2×3 DataFrame
│ Row │ ID    │ Name     │ Job    │
│     │ Int64 │ String   │ String │
├─────┼───────┼──────────┼────────┤
│ 1   │ 1     │ John Doe │ Lawyer │
│ 2   │ 2     │ Jane Doe │ Doctor │

julia> innerjoin(name, job2, on = [:ID => :identifier])
2×3 DataFrame
│ Row │ ID    │ Name     │ Job    │
│     │ Int64 │ String   │ String │
├─────┼───────┼──────────┼────────┤
│ 1   │ 1     │ John Doe │ Lawyer │
│ 2   │ 2     │ Jane Doe │ Doctor │
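
The examples above join on a single key. The sketch below joins on two columns, mixing a plain column name with a left=>right pair as the on keyword allows. The sales and targets tables and their column names are hypothetical.

```julia
using DataFrames

# Hypothetical inputs: the year column is called Year on the left and yr on the right.
sales = DataFrame(ID = [1, 1, 2], Year = [2019, 2020, 2019], Amount = [10, 20, 30])
targets = DataFrame(ID = [1, 2, 2], yr = [2020, 2019, 2020], Target = [25, 35, 45])

# Match rows where ID is equal in both tables and sales.Year equals targets.yr.
result = innerjoin(sales, targets, on = [:ID, :Year => :yr])
```

Only the key pairs (1, 2020) and (2, 2019) occur in both tables, so result has two rows. The key columns keep their names from the left data frame.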
