PINE LIBRARY

DataCleaner

Library "DataCleaner"
Functions for computing outlier levels and producing a cleaned version of a series.

outlierLevel(src, len, level) Gets the (standard deviation) outlier level for a given series.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
  Returns: The average of the series plus the multiple of the standard deviation.
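
The return value described above is simple arithmetic: the average of the window plus a signed multiple of its standard deviation. The following is a minimal Python sketch of that arithmetic, not the library's Pine source. The name outlier_level, the list-based window, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import fmean, pstdev

def outlier_level(values, length, level):
    """Mean of the last `length` values plus `level` standard deviations.

    Assumption: population standard deviation; the library may differ.
    """
    window = values[-length:]
    return fmean(window) + level * pstdev(window)

closes = [10.0, 12.0, 11.0, 13.0, 14.0]
upper = outlier_level(closes, 4, 2.0)   # positive level: upper boundary
lower = outlier_level(closes, 4, -2.0)  # negative level: lower boundary
```

With the same absolute level, the two boundaries sit symmetrically around the window average.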

cleanUsing(src, result, len, maxDeviation) Returns an array representing the result series with the values removed wherever the source series is an outlier.
  Parameters:
    src: The source series to read from.
    result: The result series.
    len: The maximum size of the resultant array.
    maxDeviation: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
  Returns: An array containing the cleaned series.

clean(src, len, maxDeviation) Returns an array representing the source series with outliers removed.
  Parameters:
    src: The source series to read from.
    len: The maximum size of the resultant array.
    maxDeviation: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
  Returns: An array containing the cleaned series.
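
The difference between cleanUsing and clean is only which series decides what counts as an outlier. The Python sketch below is one way to read the descriptions, assuming the boundary is one-sided as the maxDeviation text states (positive means an upper boundary, negative a lower one) and that both series have the same length. The names clean_using and clean and the sample data are hypothetical.

```python
from statistics import fmean, pstdev

def clean_using(src, result, length, max_deviation):
    """Keep result values at positions where src is inside the boundary."""
    src_window = src[-length:]
    result_window = result[-length:]
    if not src_window:
        return []
    boundary = fmean(src_window) + max_deviation * pstdev(src_window)
    if max_deviation >= 0:
        keep = [s <= boundary for s in src_window]  # upper boundary
    else:
        keep = [s >= boundary for s in src_window]  # lower boundary
    return [r for r, ok in zip(result_window, keep) if ok]

def clean(src, length, max_deviation):
    """Same as clean_using, with the source acting as its own result."""
    return clean_using(src, src, length, max_deviation)

volume = [100.0, 110.0, 90.0, 105.0, 500.0, 95.0]
change = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
cleaned_change = clean_using(volume, change, 6, 2.0)  # drops the entry at the volume spike
cleaned_volume = clean(volume, 6, 2.0)
```

Because removed entries are dropped rather than replaced, the returned array can be shorter than len.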

outlierLevelAdjusted(src, level, len, maxDeviation) Gets the (standard deviation) outlier level for a given series after a single pass of removing any outliers.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
    len: The number of bars to measure.
    maxDeviation: The optional standard deviation level to use when cleaning the series. The default is the value of the provided level.
  Returns: The average of the series plus the multiple of the standard deviation.
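
A single extreme value inflates both the average and the standard deviation, which pushes the plain outlier level far from the bulk of the data. The "single pass" described above can be sketched in Python as follows. This is an illustration under assumptions (population standard deviation, one-sided removal matching the sign of maxDeviation, hypothetical function name), not the library's code.

```python
from statistics import fmean, pstdev

def outlier_level_adjusted(values, length, level, max_deviation=None):
    """Outlier level recomputed after one pass of outlier removal."""
    if max_deviation is None:
        max_deviation = level  # documented default: reuse the level
    window = values[-length:]
    boundary = fmean(window) + max_deviation * pstdev(window)
    if max_deviation >= 0:
        kept = [v for v in window if v <= boundary]
    else:
        kept = [v for v in window if v >= boundary]
    return fmean(kept) + level * pstdev(kept)

volume = [100.0, 110.0, 90.0, 105.0, 500.0, 95.0]
raw = fmean(volume) + 2.0 * pstdev(volume)          # level before cleaning
adjusted = outlier_level_adjusted(volume, 6, 2.0)   # level after one cleaning pass
```

In this sample the spike of 500 is removed in the first pass, so the adjusted level lands close to the remaining values instead of several hundred units above them.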
Release Notes
v2 Added simple utility for cleaning arrays.

Added:
cleanArray(src, maxDeviation) Returns an array representing the source array with outliers removed.
  Parameters:
    src: The source array to read from.
    maxDeviation: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
  Returns: An array containing the cleaned series.
Release Notes
v3 Bugfix: cleanArray should avoid empty arrays.
Release Notes
v4 Exposes simpler "naOutliers" method.

Added:
naOutliers(src, len, maxDeviation) Returns only values that are within the maximum deviation.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    maxDeviation: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
  Returns: The source value when it is within the maximum deviation; otherwise na.
Release Notes
v5 Better doc comments.
Release Notes
v6 Added naArrayOutliers for keeping the original array indexes while setting values to na when they are outliers.

Added:
naArrayOutliers(src, maxDeviation) Sets any outliers in the source array to na while keeping the original array indexes.
  Parameters:
    src: The array to set outliers to N/A.
    maxDeviation: The maximum deviation before considered an outlier.
  Returns: True if there were any outliers; otherwise false.
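
Unlike cleanArray, this function keeps every index in place, which matters when array positions correspond to bars. A minimal Python sketch of that behaviour follows, with None standing in for na. The two-sided test, the population standard deviation, and the function name are assumptions, and the input is assumed to contain no missing values beforehand.

```python
from statistics import fmean, pstdev

def na_array_outliers(values, max_deviation):
    """Replace outliers with None in place; return True if any were found."""
    if not values:
        return False
    mean = fmean(values)
    limit = abs(max_deviation) * pstdev(values)
    found = False
    for i, v in enumerate(values):
        if abs(v - mean) > limit:
            values[i] = None  # index is preserved, value is blanked
            found = True
    return found

data = [100.0, 110.0, 90.0, 105.0, 500.0, 95.0]
had_outliers = na_array_outliers(data, 2.0)
```

The array length is unchanged after the call; only the flagged positions are blanked.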
Release Notes
v7 Added normalize function and improved naOutliers.

Added:
normalize(src, len, maxDeviation, baseline) Returns the source value adjusted by its standard deviation.
  Parameters:
    src: The series to measure.
    len: The number of bars to measure the standard deviation.
    maxDeviation: The maximum deviation before considered an outlier.
    baseline: The value considered to be at center. Typically zero.
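
The description does not spell out the formula, so the following Python sketch is only one plausible reading: drop outliers from the window, measure the standard deviation of what remains, and express the latest value's distance from the baseline in units of that deviation. Every name here is hypothetical, and the library's actual calculation may differ.

```python
from statistics import fmean, pstdev

def normalize(values, length, max_deviation, baseline=0.0):
    """Latest value's offset from baseline, in cleaned standard deviations."""
    window = values[-length:]
    mean = fmean(window)
    limit = abs(max_deviation) * pstdev(window)
    kept = [v for v in window if abs(v - mean) <= limit]
    if not kept:
        return None  # nothing left to measure
    spread = pstdev(kept)
    if spread == 0:
        return None  # a flat window has no scale
    return (values[-1] - baseline) / spread

oscillator = [2.0, -2.0, 2.0, -2.0, 50.0, 1.0]
score = normalize(oscillator, 6, 2.0)
```

Removing the spike before measuring keeps one extreme bar from shrinking every other normalized value.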
Release Notes
v8 Fix normalize function.
Release Notes
v9 Major Update:
  • Add a standard deviation function that returns both the mean and the standard deviation, avoiding a double calculation.
  • Expose the option to use WMA instead of SMA when averaging.
  • Expose the option to use SMA as a smoothing function.


Added:
stdev(src)
  Calculates and returns both the mean and the standard deviation.
  Parameters:
    src: The array to use for the calculation.

Deviation
  Contains the mean (average) and the value of the standard deviation.
  Fields:
    mean: The mean (average).
    stdev: The standard deviation.
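
Returning the mean together with the standard deviation saves callers from averaging the same data twice. A Python sketch of the idea follows. The dataclass mirrors the two documented fields, while the population formula and the None result for an empty input are assumptions.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class Deviation:
    mean: float
    stdev: float

def stdev(values):
    """Return the mean and the standard deviation from one computation."""
    n = len(values)
    if n == 0:
        return None
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return Deviation(mean, sqrt(variance))

d = stdev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
upper = d.mean + 2.0 * d.stdev  # both fields reused without recomputing
```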

Updated:
outlierLevel(src, len, level, useWma, smoothing)
  Gets the (standard deviation) outlier level for a given series.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
    useWma
    smoothing
  Returns: The average of the series plus the multiple of the standard deviation.

naOutliers(src, len, maxDeviation, useWma, smoothing)
  Returns only values that are within the maximum deviation.
  Parameters:
    src: The series to filter results from.
    len: The number of bars to measure.
    maxDeviation: The maximum deviation before considered an outlier.
    useWma: When true, the calculation is done using an improved weighted moving average to avoid giving significance to older values.
    smoothing

normalize(src, len, maxDeviation, baseline, useWma, smoothing)
  Returns the source value adjusted by its standard deviation.
  Parameters:
    src: The series to measure.
    len: The number of bars to measure the standard deviation.
    maxDeviation: The maximum deviation before considered an outlier.
    baseline: The value considered to be at center. Typically zero.
    useWma
    smoothing

outlierLevelAdjusted(src, len, level, maxDeviation, useWma, smoothing)
  Gets the (standard deviation) outlier level for a given series after a single pass of removing any outliers.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
    maxDeviation: The optional standard deviation level to use when cleaning the series. The default is the value of the provided level.
    useWma
    smoothing
  Returns: The average of the series plus the multiple of the standard deviation.
Release Notes
v10 Allow for custom input if used as an indicator.
Release Notes
v11 Documentation update

Updated:
outlierLevel(src, len, level, useWma, smoothing)
  Gets the (standard deviation) outlier level for a given series.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
    useWma: When true, the calculation is done using an improved weighted moving average to avoid giving significance to older values.
    smoothing: The number of extra bars used to smooth the result so that recent data does not cause large spikes.
  Returns: The average of the series plus the multiple of the standard deviation.
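
To illustrate what weighting by recency does to the average and the deviation, here is a Python sketch that uses plain linear weights (the oldest value gets weight 1, the newest gets weight n). What makes the library's weighted average "improved" is not stated, so this only shows the general technique; the function name is hypothetical.

```python
from math import sqrt

def weighted_deviation(values):
    """Linearly weighted mean and standard deviation; values are oldest first."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    weights = range(1, n + 1)
    total = n * (n + 1) / 2
    mean = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, sqrt(variance)

prices = [10.0, 10.0, 10.0, 20.0]
w_mean, w_stdev = weighted_deviation(prices)
simple_mean = sum(prices) / len(prices)
```

The weighted mean sits closer to the newest value than the simple mean does, so older bars contribute less to the resulting outlier level.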

naOutliers(src, len, maxDeviation, useWma, smoothing)
  Returns only values that are within the maximum deviation.
  Parameters:
    src: The series to filter results from.
    len: The number of bars to measure.
    maxDeviation: The maximum deviation before considered an outlier.
    useWma: When true, the calculation is done using an improved weighted moving average to avoid giving significance to older values.
    smoothing: The number of extra bars used to smooth the result so that recent data does not cause large spikes.

normalize(src, len, maxDeviation, baseline, useWma, smoothing)
  Returns the source value adjusted by its standard deviation.
  Parameters:
    src: The series to measure.
    len: The number of bars to measure the standard deviation.
    maxDeviation: The maximum deviation before considered an outlier.
    baseline: The value considered to be at center. Typically zero.
    useWma: When true, the calculation is done using an improved weighted moving average to avoid giving significance to older values.
    smoothing: The number of extra bars used to smooth the result so that recent data does not cause large spikes.

outlierLevelAdjusted(src, len, level, maxDeviation, useWma, smoothing)
  Gets the (standard deviation) outlier level for a given series after a single pass of removing any outliers.
  Parameters:
    src: The series to average and add a multiple of the standard deviation to.
    len: The number of bars to measure.
    level: The positive or negative multiple of the standard deviation to apply to the average. A positive number will be the upper boundary and a negative number will be the lower boundary.
    maxDeviation: The optional standard deviation level to use when cleaning the series. The default is the value of the provided level.
    useWma: When true, the calculation is done using an improved weighted moving average to avoid giving significance to older values.
    smoothing: The number of extra bars used to smooth the result so that recent data does not cause large spikes.
  Returns: The average of the series plus the multiple of the standard deviation.
