| Type: | Package |
| Title: | Necessary Condition Analysis |
| Version: | 5.0.0 |
| Date: | 2026-03-20 |
| Description: | Performs a Necessary Condition Analysis (NCA). (Dul, J. 2016. Necessary Condition Analysis (NCA). ”Logic and Methodology of 'Necessary but not Sufficient' causality." Organizational Research Methods 19(1), 10-52) <doi:10.1177/1094428115584005>. NCA identifies necessary (but not sufficient) conditions in datasets, where x causes (e.g. precedes) y. Instead of drawing a regression line ”through the middle of the data” in an xy-plot, NCA draws the ceiling line. The ceiling line y = f(x) separates the area with observations from the area without observations. (Nearly) all observations are below the ceiling line: y <= f(x). The empty zone is in the upper left hand corner of the xy-plot (with the convention that the x-axis is ”horizontal” and the y-axis is ”vertical” and that values increase ”upwards” and ”to the right”). The ceiling line is a (piecewise) linear non-decreasing line: a linear step function or a straight line. It indicates which level of x (e.g., an effort, a characteristic) is necessary but not sufficient for a (desired or undesired) level of y (e.g., good performance or disease). A quick start guide for using this package can be found here: https://repub.eur.nl/pub/78323/ or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2624981. |
| URL: | https://www.erim.eur.nl/necessary-condition-analysis/ |
| License: | GPL (≥ 3) |
| Depends: | R (≥ 3.5.0) |
| Imports: | gplots, quantreg, KernSmooth, lpSolve, ggplot2, doParallel, foreach, iterators, plotly, truncnorm, DBI, RSQLite |
| NeedsCompilation: | no |
| Repository: | CRAN |
| Suggests: | testthat |
| Packaged: | 2026-03-20 11:10:41 UTC; govert |
| Author: | Jan Dul [aut], Govert Buijs [cre] |
| Maintainer: | Govert Buijs <buijs@rsm.nl> |
| Date/Publication: | 2026-03-20 13:00:02 UTC |
Necessary Condition Analysis
Description
The NCA package implements Necessary Condition Analysis (NCA) as developed by Dul (2016). For running the NCA package a data file (e.g., mydata.csv, which contains the input data) must be available. An example data file (presented in above article) is included in the package. The user must load the data and call the nca function.
Details
| Package: | NCA |
| Type: | Package |
| Version: | 5.0.0 |
| Date: | 2026-03-20 |
| License: | GPL (>= 3) |
Author(s)
Author: Jan Dul jdul@rsm.nl
Maintainer: Govert Buijs buijs@rsm.nl
References
Dul, J. 2016. Necessary Condition Analysis (NCA).Logic and methodology of 'necessary but not sufficient' causality.
Organizational Research Methods 19(1), 10-52. doi:10.1177/1094428115584005
Dul, J. (2020). Conducting Necessary Condition Analysis. Sage publishers. ISBN: 9781526460141.
https://uk.sagepub.com/en-gb/eur/conducting-necessary-condition-analysis-for-business-and-management-students/book262898
Dul, J., van der Laan, E., & Kuik, R. (2020). A statistical significance test for Necessary Condition Analysis. Organizational Research Methods, 23(2), 385-395.
doi:10.1177/1094428118795272
See Also
Examples
# A more detailed guide can be found here : https://repub.eur.nl/pub/78323/
# or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2624981
# Load data from a CSV file with header and row names:
data <- read.csv('mydata.csv', row.names=1)
# Or load and rename the example dataset
data(nca.example)
data <- nca.example
# Run NCA with the dataset and name the analysis 'model'.
# Specify the conditions and outcomes by column index or name
# More than 1 condition can be specified with a vector
model <- nca_analysis(data, c(1, 2), 3)
# A quick summary of the analysis can be displayed by 'model'
model
# A full summary of the analysis is shown by nca_output (see documentation for more options)
nca_output(model)
# The results of the analysis is a list of 6 items :
# - plots (1 for each condition)
# - summaries (1 for each condition)
# - bottleneck tables (1 for each ceiling technique)
# - peers (1 dataframe for each condition)
# - tests (1 list for each condition)
# - test.time (total time to run all tests)
names(model)
# The first item contains the graphical outputs for each condition
# This is not really useful to humans
model$plots[[1]]
# The second item contains a list with the summaries for the conditions
# Extract individual elements with nca_extract
model$summaries[[1]]
# The third item contains a list with the bottleneck tables, one for each ceiling technique
model$bottlenecks$cr_fdh
# The fourth item shows the peers, for each condition
model$peers$Individualism
# For the fifth and sixth item, the test.rep needs to be larger than 0
# for performing the statistical test
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1, 2), 3, test.rep=100)
# The fifth item shows the tests for each condition
# This is not really useful to humans
model$tests$Individualism
# The last item shows the total time needed to perform the analysis.
# For large values of test.rep the test may take long.
model$test.time
a set of all available ceiling techniques
Description
Ceilings to use for the nca or nca_analysis methods
> nca(data, c(1, 2), 3, ceilings=c('ols', 'ce_fdh', 'cr_fdh'))
Note that the ols regression line is not a ceiling but is included as a reference.
| Ceiling Technique | Name |
| cols | Corrected Ordinary Least Squares |
| qr | Quantile Regression |
| ce_vrs | Ceiling Envelopment with Varying Return to Scale |
| cr_vrs | Ceiling Regression with Varying Return to Scale |
| ce_fdh | Ceiling Envelopment with Free Disposal Hull |
| cr_fdh | Ceiling Regression with Free Disposal Hull |
| c_lp | Ceiling Linear Programming |
Note: The SFA and LH ceiling lines are deprecated (discontinued) from version 3.2.0
a set defining the line colors for the plots
Description
Set before calling nca_output
> line.colors['ce_fdh'] <- 'blue'
Reset one line color by setting it to NULL
> line.colors['ce_fdh'] <- NULL
Reset all line colors by setting line.colors to NULL
> line.colors <- NULL
Format
This is a list with default line colors for each ceiling technique
| ols | 'green' | c_lp | 'blue' | |
| cols | 'darkgreen' | qr | 'lightpink' | |
| ce_vrs | 'orchid4' | cr_vrs | 'violet' | |
| ce_fdh | 'red' | cr_fdh | 'orange' |
a set defining the line types for the plots
Description
Set before calling nca_output
> line.types['ce_fdh'] <- 1
Reset one line type by setting it to NULL
> line.types['ce_fdh'] <- NULL
Reset all line types by setting line.types to NULL
> line.types <- NULL
Format
This is a list with default line types for each ceiling technique
| ols | 1 | c_lp | 2 | |
| cols | 3 | qr | 4 | |
| ce_vrs | 5 | cr_vrs | 1 | |
| ce_fdh | 6 | cr_fdh | 1 |
parameter defining the width of the lines in the plots
Description
This will be used for the lwd parameter of the plot, default is 1.5.
Set before calling nca_output
> line.width <- 5
Run a basic NCA analyses on a data set
Description
Run a basic NCA analyses on a data set
Usage
nca(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'))
Arguments
data |
dataframe with columns of the variables |
x |
collection of the columns with the conditions |
y |
index or name of the column with the outcomes |
ceilings |
vector with the ceiling techniques to include in this analysis |
Value
Returns a list with 3 items (see examples for further explanation):
plots |
A list of plot-data for each x-y combination |
summaries |
A list of dataframes with the summaries for each x-y combination |
bottlenecks |
A list of dataframes with a bottleneck table for each ceiling technique |
Examples
# Load the data
data(nca.example)
data <- nca.example
# Basic NCA analysis
# Conditions in the first 2 columns, outcome in the third column
# This shows scatter plot(s) with the ceiling lines and the effect size(s) on the console
nca(data, c(1, 2), 3)
# Columns can be selected by name as well
nca(data, c('Individualism', 'Risk taking'), 'Innovation performance')
# Define the ceiling techniques via the ceilings parameter
nca(data, c(1, 2), 3, ceilings=c('ols', 'ce_vrs'))
# These are the available ceiling techniques
print(ceilings)
NCA example data with 2 conditions and 1 output
Description
This data set has Individualism and Risk taking as conditions, and Innovation performance as the output for 28 countries.
Usage
data(nca.example)
Format
A matrix containing 28 observations, incl. headers and row names.
NCA example data with 3 conditions and 1 outcome
Description
This data set has Contractual detail, Goodwill trust, and Competence trust as conditions, and Innovation as outcome for 48 buyer-supplier relationships. See Table 2 in: Van der Valk, W., Sumo, R., Dul, J., & Schroeder, R. G. (2016). When are contracts and trust necessary for innovation in buyer-supplier relationships? A necessary condition analysis. Journal of Purchasing and Supply Management, 22(4), 266-277.
Usage
data(nca.example2)
Format
A matrix containing 48 observations, incl. headers.
Run NCA analyses on a data set
Description
Run multiple types of NCA analyses on a dataset
Usage
nca_analysis(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'), custom=NULL,
corner=NULL, flip.x=FALSE, flip.y=FALSE, scope=NULL,
bottleneck.x='percentage.range', bottleneck.y='percentage.range',
steps=10, step.size=NULL, cutoff=0, qr.tau=0.95,
effect_aggregation = 1, test.rep=0,
test.p_confidence=0.95, test.p_threshold=0.05)
Arguments
data |
dataframe with columns of the variables |
x |
index or name (or a vector of those) with condition(s) x |
y |
index or name of the column with the outcome y |
ceilings |
vector with the ceiling techniques to include in this analysis |
custom |
custom ceiling line defined by a intercept-slope pair per condition. If only 1 pair given it will be used for all conditions. |
corner |
either an integer or a vector of integers, indicating the corner to analyze, see Details |
flip.x |
reverse the direction of the conditions |
flip.y |
reverse the direction of the outcomes, boolean |
scope |
a theoretical scope in list format : (x.low, x.high, y.low, y.high), see Details |
bottleneck.x |
options for displaying the conditions in the bottleneck table |
bottleneck.y |
options for displaying the outcomes in the bottleneck table. |
steps |
this argument accepts 2 types : |
step.size |
define the step size in the bottleneck table. |
cutoff |
display calculated x,y values that are lower/higher than lowest/highest observed x,y values in the bottleneck table as: |
qr.tau |
define the qr tau (between 0 and 1) for the quantile regression ceiling technique, default 0.95 |
effect_aggregation |
define the corners to aggregate into the effect size. 1 is upper-left and is always selected, 2 is upper-right, 3 is lower-left and 4 is lower-right |
test.rep |
number of resamples in the statistical approximate permutation test. For test.rep = 0 no statistical test is performed |
test.p_confidence |
confidence level of the estimated p-value. |
test.p_threshold |
define the threshold significance level in the returned plot of the statistical test, default 0.05 |
Details
Corners
Corner 1 is the upper-left corner and corner 2 is the upper-right corner.
These two corners are used for an analysis of the necessity of the presence/high level
if x (corner = 1 ) or the absence/low level if x (corner = 2) for the presence/high level
of y, respectively.
Corner 3 is the lower-left corner and corner 4 is the lower-right corner.
These two corners are used for an analysis of the necessity of the presence/high level
of x (corner = 3 ) or the absence/low level if x (corner = 4) for the absence/low level
of y, respectively.
By default the upper left corner is analysed for all conditions and corner
is not defined. If corner is defined, flip.x and flip.y are ignored.
Scope
By default, the theoretical scope is not defined and the empirical scope is used based on the minimum and maximum observed values of x and y.
Value
Returns a list of 6 items (see examples for further explanation):
plots |
A list of plot-data for each x-y combination |
summaries |
A list of dataframes with the summaries for each x-y combination |
bottlenecks |
A list of dataframes with a bottleneck table for each ceiling technique |
peers |
A list of ceilings, with a list of peers for each condition. Peers are corner points of the CE-FDH ceiling line (e.g., the northwest-corners points for corner = 1) |
tests |
The results of the test for each condition (not human friendly, use nca_output) |
test.time |
The total time needed to run the tests for all conditions |
Examples
# Load the data
data(nca.example)
data <- nca.example
# Basic NCA analysis, with conditions in the first 2 columns
# and the outcome in the third column
model <- nca_analysis(data, c(1, 2), 3)
# Use nca_output to show the summaries (see nca_output documentation for more options)
nca_output(model)
# Columns can be selected by name as well
model <- nca_analysis(data, c('Individualism', 'Risk taking'), 'Innovation performance')
# Define the ceiling techniques via the ceilings parameter, see 'ceilings' for all types
model <- nca_analysis(data, c(1, 2), 3, ceilings=c('ce_fdh', 'ce_vrs'))
# These are the available ceiling techniques
print(ceilings)
# Define a custom ceiling line by providing the custom parameter, pairs of intercept-slope
model <- nca_analysis(data, c(1, 2), 3, custom=c(0, 1, 0, 2))
# If a single pair is provided, it will be used for all conditions
model <- nca_analysis(data, c(1, 2), 3, custom=c(0, 1))
# By default, the upper-left corner is analysed. With the corner argument for each
# condition a different corner can be selected. Select corner 1 or 2
# for an analysis of necessary conditions for the presence/high level of the
# outcome, and corner 3 or 4 for an analysis of necessary conditions for
# the absence/low level of the outcome. It is not possible to combine
# corner 1 or 2 with corner 3 or 4 in the same analysis as different outcomes are analysed.
# This analyses the upper right corner for the first condition
# and the upper left corner for the second condition:
model <- nca_analysis(data, c(1, 2), 3, corner=c(2, 1))
# Alternatively, for using the upper right corner(s), 'flip' the x variables
model <- nca_analysis(data, c(1, 2), 3, flip.x=TRUE)
# It is also possible to flip a single x variable
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE))
# Flip the y variable if the lower corners need analysing
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE), flip.y=TRUE)
# Use a theoretical scope instead of the (calculated) empirical scope
model <- nca_analysis(data, c(1, 2), 3, scope=c(0, 120, 0, 240))
# Display the peers for a ceiling and a condition
print(model$peers$ce_fdh$Individualism)
# By default, the bottleneck tables use percentages of the range for the x and y values.
# Using the percentage of the max value is also possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentage.max')
# Use the actual values, in this case the x-value
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual')
# Use percentile, in this case for the y-values
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentile')
# Any combination is possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual', bottleneck.y='percentile')
# The number of steps is adjustible via the steps parameter
model <- nca_analysis(data, c(1, 2), 3, steps=20)
# The steps parameter also accepts a list of values
# These are interpreted as actual or percentage / percentile depending on bottleneck.y
model <- nca_analysis(data, c(1, 2), 3, steps=seq(50, 120, 10))
# Or via the step.size parameter, this ignores the steps parameter
model <- nca_analysis(data, c(1, 2), 3, step.size=5)
# If the ceiling line crosses the X = Xmax line at a point C below Y = Ymax,
# for Y < Yc < Ymax, the corresponding X in the bottleneck table is displayed as 'NA'
# It is also possible to display them as Xmax
model <- nca_analysis(data, c(1, 2), 3, cutoff=1)
# or as the calculated value on the ceiling line
model <- nca_analysis(data, c(1, 2), 3, cutoff=2)
# To run tests, the test.rep needs to be larger than 0
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1), 3, test.rep=1000,
test.p_confidence=0.9, test.p_threshold=0.05)
# The output of the tests can be shown via nca_output with test=TRUE
nca_output(model, test=TRUE)
# If reproducible outputs are needed, set the seed to a fixed number
set.seed(42)
model <- nca_analysis(data, c(1), 3, test.rep=1000)
Function for permutation tests
Description
Function for permutation tests
Usage
nca_difference(data1, data2 = NULL, x, y,
ceilings = c("ce_fdh", "cr_fdh"), scope = NULL, test.rep = 1000,
test.type = c("contrast", "independent", "paired"))
Arguments
data1 |
Dataframe 1. |
data2 |
Dataframe 2, defaults to NULL. |
x |
Index or name (or a vector of those) with condition(s) x |
y |
Index or name of the column with the outcome y |
ceilings |
Vector with the ceiling techniques to test |
scope |
A theoretical scope in list format : (x.low, x.high, y.low, y.high), see nca_analysis |
test.rep |
Number of permutations to use, defaults to 1000. |
test.type |
Name of the test type: "contrast", "independent" or "paired". |
Examples
ceilings <- "cr_fdh"
y <- "Y"
test.rep <- 1000
# Contrast test: effect size difference between two conditions
test.type <- "contrast"
data1 <- nca_random(100, c(0.2,0.4), c(1,1))
x <- c("X1", "X2")
difference <- nca_difference(data1 = data1, x = x, y = y,
ceilings = ceilings, test.rep = test.rep,
test.type = test.type)
print(difference)
# Independent test: effect size difference between two datasets
test.type <- "independent"
data1 <- nca_random(100, 0.2, 1)
data2 <- nca_random(100, 0.4, 1)
x <- "X"
difference <- nca_difference(data1 = data1, data2 = data2, x = x, y = y,
ceilings = ceilings, test.rep = test.rep,
test.type = test.type)
print(difference)
# paired test: effect size differences between 2 measurements
test.type <- "paired"
data1 <- nca_random(100, 0.2, 1)
data2 <- nca_random(100, 0.4, 1)
x <- "X"
difference <- nca_difference(data1 = data1, data2 = data2, x = x, y = y,
ceilings = ceilings, test.rep = test.rep,
test.type = test.type)
print(difference)
get model parameters
Description
Get the model parameters.
Usage
nca_extract(model, x=NULL, ceiling=NULL, param='Effect size')
Arguments
model |
The model produced by nca_analysis |
x |
Name of the condition, defaults to the first |
ceiling |
Name of the ceiling, defaults to the first |
param |
Name of the parameter, default 'Effect size' |
Examples
data(nca.example)
model <- nca_analysis(nca.example, c('Individualism', 'Risk taking'), 3)
# The names of the parameters can be found in the summaries
nca_output(model, plots=FALSE, summaries=TRUE)
# Get the Ceiling zone
extract <- nca_extract(model, 'Individualism', 'ce_fdh', 'Ceiling zone')
print(extract)
# Get the Scope
extract <- nca_extract(model, 'Individualism', 'ce_fdh', 'Scope')
print(extract)
# For multiple values: vectorize one argument and pin the others
names <- c('Individualism', 'Risk taking')
extracts <- sapply(names, nca_extract,
model=model, ceiling='ce_fdh', param='Ceiling zone')
print(extracts)
params <- c('Effect size', 'Ceiling zone')
extracts <- sapply(params, nca_extract,
model=model, x='Individualism', ceiling='ce_fdh')
print(extracts)
# It is also possible to get the extract directly from the model
# Get the Ceiling zone
print(model$`Ceiling zone`$Individualism$ce_fdh)
# Get the Scope, global parameters are independent of the ceiling
print(model$Scope$Individualism)
Outlier detection
Description
Detect outliers on the dataset.
Usage
nca_outliers(data, x, y, ceiling = NULL,
corner = NULL, flip.x = FALSE, flip.y = FALSE, scope=NULL,
k = 1, min.dif = 1e-2, max.results = 25, plotly=FALSE, condensed = FALSE)
Arguments
data |
Dataframe with columns of the variables |
x |
Index or name of the column with the condition |
y |
Index or name of the column with the outcome |
ceiling |
Name of the ceiling technique to be used. If not provided, the default ceilings (CE_FDH) will be used |
corner |
either an integer or a vector of integers, indicating the corner to analyze, see Details |
flip.x |
reverse the direction of the conditions |
flip.y |
reverse the direction of the outcomes, boolean |
scope |
a theoretical scope in list format : (x.low, x.high, y.low, y.high), see nca_analysis |
k |
use combinations of observations, default is 1 (single observation) |
min.dif |
set the threshold for the minimum dif.rel to be considered as outlier, default is 1e-2 |
max.results |
only show the first 'max.results' outliers, default is 25 |
plotly |
If true shows the interactive scatter plot(s), one for each condition. |
condensed |
If true and k > 1, hide outlier combinations for which the effect size is not increased compared to the effect size of single elements of that combination. (default FALSE) |
Details
Outliers
The potential outliers are displayed with the original effects size and the
effect size if this outlier is removed. The absolute and relative differences
between both effect sizes is also shown.
The table also displays if a point is a ceiling zone outlier or a scope outlier.
Plotly
The plot highlights the potential outliers.
The names, relative difference and XY coordinates of all points pop up when moving the pointer over the plot.
The toolbar allows several actions such as zoom and selection of parts of the plot.
Examples
# A basic example of the nca_outliers command:
data(nca.example)
outliers <- nca_outliers(nca.example, 1, 3)
# This prints the outlier table
print(outliers)
# Plotly displays a scatterplot with the outliers
nca_outliers(nca.example, 1, 3, plotly = TRUE)
# Test for combinations of observations
# Useful to detect clusters of observations as possible outliers
nca_outliers(nca.example, 1, 3, k = 2)
# Just like the nca_analysis command, nca_outliers accept both flip and corner arguments
nca_outliers(nca.example, 1, 3, corner=3)
# It is possible to define the maximum number of results (default is 25)
nca_outliers(nca.example, 1, 3, max.results=5)
# Do no show possible outliers where the abs(dif.rel) is smaller than min.dif
nca_outliers(nca.example, 1, 3, min.dif=10)
# If k > 1, the effect size of a single observation might not change
# when paired with another observation, e.g. dif.rel of Obs1 == dif.rel of Obs1+Obs2.
# The example below hides combinations of Japan with Portugal, Greece, etc.
nca_outliers(nca.example, 1, 3, k = 2, condensed = TRUE)
display the result of the NCA analysis
Description
Show the plots, NCA summaries and bottleneck tables of a NCA analysis.
Usage
nca_output(model, plots=TRUE, plotly=FALSE, bottlenecks=FALSE,
summaries=TRUE, test=FALSE, pdf=FALSE, path=NULL, selection = NULL,
plot_bottlenecks=0)
Arguments
model |
Displays output of the nca or nca_analysis command |
plots |
If true (default) show the scatter plot(s), one for each condition. |
plotly |
If true shows the interactive scatter plot(s), one for each condition. |
bottlenecks |
If true displays the bottleneck table(s) in the Console window, one table for each ceiling line |
summaries |
If true shows the summaries for each condition in the Console window, see Details |
test |
If true shows the result of the statistical sigificance test (if present), see Details |
pdf |
If true exports the output to a pdf file, except for the plotly plot |
path |
Optional path for the output file(s) |
selection |
Optionally selects the conditions for inclusion in the output. |
plot_bottlenecks |
Optionally show the bottleneck lines |
Details
Plotly
The plot highlights the points that construct the ceiling line ('peers').
The names and XY coordinates of all points pop up when moving the pointer over the plot.
The toolbar allows several actions such as zoom and selection of parts of the plot. Optionally subgroups of points can be labeled.
Summaries
The output starts with 6 lines of basic information ("global") about the dataset ("Number of observations", "Scope", "Xmin", "Xmax", "Ymin", and "Ymax").
"Scope" refers to the empirical area of possible x-y combinations, given the minimum and maximum observed x and y values.
The next 11 lines present the NCA parameters ("param", see below) for each of the selected ceiling techniques (the defaults techniques are ce_fdh and cr_fdh).
The 11 printed NCA parameters are:
- Ceiling zone, which is the size of the "empty" area in the upper-left corner
- Effect size, which is the ceiling zone divided by the scope
- # above, which is the number of observations that are above the ceiling line and hence in the "empty" ceiling zone
- Ceiling accuracy, which is the number of observations on or below the ceiling line divided by the total number of observations and multiplied by 100 percent
- Fit, which relates to the "closeness" of the selected ceiling line to the ce_fdh ceiling line
- Slope and Intercept, which are the slope and the intercept of the straight ceiling line (no values are printed if the ceiling line is not a straight line, but a step function)
- Abs. ineff., which is the total xy-space where x does not constrain y, and y is not constrained by x
- Rel. ineff., which is the total xy-space where x does not constrain y, and y is not constrained by x as percentage of the scope
- Condition ineff., which is the condition inefficiency that indicates for which range of x (as a percentage of the total range) x does not constrain y (i.e., there is no ceiling line in that x-range)
- Outcome ineff., which is the outcome efficiency that indicates for which range of y (as a percentage of the total range of y) y is not constrained by x (i.e., there is no ceiling line in that y-range)
Test
NCA's statistical test is a randomness test to evaluate if the observed effect size may be a random result of unrelated X and Y variables
Examples
# Use the result of the nca command:
data(nca.example)
data <- nca.example
model <- nca_analysis(data, c(1, 2), 3)
# Show the full summaries in the Console window
nca_output(model)
# Suppress the summaries and display the plots
nca_output(model, plots=TRUE, summaries=FALSE)
# Display the plots via Plotly
nca_output(model, plotly=TRUE, summaries=FALSE)
# Label the observation of the Plotly plot by using a vector of names (no more than 5).
# For example label the observations in nca.example
labels <- c('Australia', 'Europe', 'Europe', 'North America', 'Europe', 'Europe',
'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Asia',
'North America', 'Europe', 'Australia', 'Europe', 'Europe', 'Europe', 'Europe',
'Asia', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'North America')
nca_output(model, plotly=labels, summaries=FALSE)
# Suppress the summaries and display the bottlenecks
nca_output(model, bottlenecks=TRUE, summaries=FALSE)
# Show the results of the statistical significance test (p-value)
# Make sure to set test.rep in nca_analysis
nca_output(model, test=TRUE)
# Show all five
nca_output(model, plots=TRUE, plotly=TRUE, bottlenecks=TRUE, test=TRUE)
# Per condition, export plots and summaries to PDF files,
# and export all the bottleneck tables to a single PDF file
nca_output(model, plots=TRUE, bottlenecks=TRUE, pdf=TRUE)
# Use the path option to export to an existing directory
outdir <- '/tmp'
nca_output(model, plots=TRUE, pdf=TRUE, path=outdir)
# Limit the output to a selection of conditions by name
nca_output(model, plots=TRUE, selection=c("Individualism"))
# Or by column index, in both cases the order matters
nca_output(model, plots=TRUE, selection=c(2, 1))
# Show the bottleneck lines in the plots or plotly
# 1 shows the outcome lines, 2 also shows the lines for the condition
nca_output(model, plots = TRUE, bottlenecks = TRUE, summaries = FALSE, plot_bottlenecks = 1)
Function to evaluate power
Description
Function to evaluate power, test if a sample size is large enough to detect necessity.
Usage
nca_power(n = c(20, 50, 100), effect = 0.10, slope = 1, ceiling = "ce_fdh",
corner = 1, p = 0.05, distribution.x = "uniform", distribution.y = "uniform",
rep = 100, test.rep = 200)
Arguments
n |
Number of datapoints to generate, either an integer or a vector of integers. |
effect |
Effect size of the generated datasets (single value or vector of values). |
slope |
Slope of the line (single value or vector of values). |
ceiling |
Ceiling technique to use for this analysis (single value or vector of values). |
corner |
Integer, indicating the corner to analyze, see nca_analysis. |
p |
Significance level. |
distribution.x |
Distribution type(s) for X, "uniform" (default) or "normal" (single value or vector of values). |
distribution.y |
Distribution type(s) for Y, "uniform" (default) or "normal" (single value or vector of values). |
rep |
Number of analyses done per iteration. |
test.rep |
Number of resamples in the statistical approximate permutation test.
For test.rep = 0 no statistical test is performed. See |
Examples
# Simple example
## Not run: results <- nca_power()
print(results)
Function to show the powerplot
Description
Function to show the powerplot, optional over various variables.
Usage
nca_powerplot(df_power, x.variable = 'n', x.name = 'Sample size',
x.min = 0, x.max = NULL,
line.variable = 'ES', line.name = 'Effect size')
Arguments
df_power |
Dataframe as returned by the nca_power function. |
x.variable |
The variable to use for the X scale, default is sample size. |
x.name |
The name for the X scale. |
x.min |
The minimum value for the X scale, defaults to 0. |
x.max |
The maximum value for the X scale, defaults to the maximum X. |
line.variable |
Variable that represented the line. |
line.name |
The name of the variable that represented the line. |
Examples
# Simple example, showing a normal plot with the power as a function of the sample size
# A line for each effect size will be shown
## Not run: results <- nca_power(effect = c(.1, .2, .3))
nca_powerplot(results)
# This example shows the power for different ceiling lines
## Not run: results <- nca_power(ceiling = c("ce_fdh", "cr_fdh"))
powerplot <- nca_powerplot(results, line.variable='ceiling', line.name='Ceiling')
generating random data that meets necessity
Description
Generate N datapoints, with 'normal' or 'uniform' distributions for X and Y
Usage
nca_random(n, intercepts, slopes, corner=1,
distribution.x = "uniform", distribution.y = "uniform",
mean.x = 0.5, mean.y = 0.5, sd.x = 0.2, sd.y = 0.2)
Arguments
n |
Number of observations to generate, should be an integer > 1. |
intercepts |
The intercept or a vector of intercepts of the line. |
slopes |
The slope or a vector if slopes of the line. |
corner |
Define which corner should be empty, default is 1 (upper left). |
distribution.x |
Type of the distribution for X, "uniform" (default) or "normal". |
distribution.y |
Type of the distribution for Y, "uniform" (default) or "normal". |
mean.x |
Distribution Mean of X (default 0.5), ignored distribution.x == "uniform". |
mean.y |
Distribution Mean of Y (default 0.5), ignored distribution.y == "uniform". |
sd.x |
Distribution SD of X (default 0.2), ignored distribution.x == "uniform". |
sd.y |
Distribution SD of Y (default 0.2), ignored distribution.y == "uniform". |
Examples
# Generate a uniform dataset, default for X and Y
data <- nca_random(100, 0, 1)
# It is also possible to generate a dataset with multiple conditions,
# by supplying vectors for the intercepts and slopes
data <- nca_random(100, c(0, 0.25), c(1, 0.75))
# Single values will be repeated to complement a vector
data <- nca_random(100, c(0, 0.25), 1)
# The default is an empty space in the upper left corner.
# A different corner can be selected with the corner argument
data <- nca_random(100, 0, 1, corner=4)
# Generate a dataset with a normal distribution for X and a uniform distribution for Y
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "uniform")
# Generate a dataset with a normal distribution for X and Y, with adjusted MEAN
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "normal",
mean.x = 0.75, mean.y = 0.75)
# Generate a dataset with a normal distribution for X and Y, with adjusted SD
data <- nca_random(100, 0, 1, distribution.x = "normal",
distribution.y = "normal", sd.x = 0.1, sd.y = 0.1)
Normalizes numeric data
Description
(Min-max) Normalize the columns of the data
Usage
nca_util_normalize(data, scale = NULL, min_max = NULL)
Arguments
data |
Input dataframe or matrix. |
scale |
Set(s) of output Min-Max values. |
min_max |
Set(s) of (theoretical) Min-Max values. |
Details
scale
If scale is NULL (default), all columns will be normalized between 0 and 1.
Use c(0, 100) to normalize between 0 and 100.
Replicate for each column for different scale, e.g. c(0, 1, 0, 1, 0, 100).
Use NA in the min-max pair if a column does not need to be normalized.
Examples
# Normalize the data between 0 and 1
data(nca.example)
data.normalized <- nca_util_normalize(nca.example)
# Equivalent to this
data.normalized <- nca_util_normalize(nca.example, c(0, 1))
# Normalize the data between 0 and 100
data.normalized <- nca_util_normalize(nca.example, c(0, 100))
# Normalize each column with different values
data.normalized <- nca_util_normalize(nca.example, c(0, 1, 0, 10, 0, 100))
# Use NA if a column doesn't need to be normalized
data.normalized <- nca_util_normalize(nca.example, c(0, 1, NA, NA, 0, 100))
# It is also possible to use a theoretic range for the normalization
data.normalized <- nca_util_normalize(nca.example, min_max = c(0, 100, 0, 150, 0, 300))
parameter defining the point color in the plots
Description
This will be used for the col parameters of the plots, default is blue.
Set before calling nca_output
> point.color <- red
parameter defining the plotting symbol in the plots
Description
This will be used for the pch parameter of the plots, default is 21.
Set before calling nca_output
> point.type <- 22
See
http://www.sthda.com/english/wiki/r-plot-pch-symbols-the-different-point-shapes-available-in-r for more symbols