Kernelestimation — Indicators and Strategies

Kernels
©2024, GoemonYae; copied from @jdehorty's "KernelFunctions" on 2024-03-09 to ensure future dependency compatibility. More functions will be added to this script.
Library "KernelFunctions"
This library provides non-repainting kernel functions for Nadaraya-Watson estimator implementations. This allows for easy substitution and comparison of different kernel functions in indicators. Furthermore, kernels can easily be combined with other kernels to create newer, more customized kernels.
rationalQuadratic(_src, _lookback, _relativeWeight, startAtBar)
Rational Quadratic Kernel - An infinite sum of Gaussian Kernels of different length scales.
Parameters:
_src (float) : The source series.
_lookback (simple int) : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_relativeWeight (simple float) : Relative weighting of time frames. Smaller values result in a more stretched-out curve, and larger values result in a more wiggly curve. As this value approaches zero, the longer time frames will exert more influence on the estimation. As this value approaches infinity, the behavior of the Rational Quadratic Kernel will become identical to the Gaussian kernel.
startAtBar (simple int)
Returns: yhat The estimated values according to the Rational Quadratic Kernel.
gaussian(_src, _lookback, startAtBar)
Gaussian Kernel - A weighted average of the source series. The weights are determined by the Radial Basis Function (RBF).
Parameters:
_src (float) : The source series.
_lookback (simple int) : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
startAtBar (simple int)
Returns: yhat The estimated values according to the Gaussian Kernel.
periodic(_src, _lookback, _period, startAtBar)
Periodic Kernel - The periodic kernel (derived by David MacKay) allows one to model functions that repeat themselves exactly.
Parameters:
_src (float) : The source series.
_lookback (simple int) : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_period (simple int) : The distance between repetitions of the function.
startAtBar (simple int)
Returns: yhat The estimated values according to the Periodic Kernel.
locallyPeriodic(_src, _lookback, _period, startAtBar)
Locally Periodic Kernel - The locally periodic kernel is a periodic function that slowly varies with time. It is the product of the Periodic Kernel and the Gaussian Kernel.
Parameters:
_src (float) : The source series.
_lookback (simple int) : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_period (simple int) : The distance between repetitions of the function.
startAtBar (simple int)
Returns: yhat The estimated values according to the Locally Periodic Kernel.
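To make the periodic and locally periodic kernels more concrete, here is a minimal Python sketch of the per-bar weights. It assumes the standard textbook forms of these kernels, with i as the number of bars back from the current bar; the function names are hypothetical and this is not the library's Pine Script source.

```python
import math


def gaussian_weight(i: int, lookback: float) -> float:
    """RBF weight for the bar i steps back from the current bar."""
    return math.exp(-(i ** 2) / (2 * lookback ** 2))


def periodic_weight(i: int, lookback: float, period: float) -> float:
    """Periodic weight: returns to 1 every `period` bars (assumed textbook form)."""
    return math.exp(-2 * math.sin(math.pi * i / period) ** 2 / lookback ** 2)


def locally_periodic_weight(i: int, lookback: float, period: float) -> float:
    """Product of the periodic and Gaussian weights, so repeats fade with distance."""
    return periodic_weight(i, lookback, period) * gaussian_weight(i, lookback)


if __name__ == "__main__":
    # Compare the two weights at a few bar offsets (lookback 8, period 10).
    for i in (0, 5, 10, 20):
        print(i, round(periodic_weight(i, 8, 10), 4),
              round(locally_periodic_weight(i, 8, 10), 4))
```

The periodic weight gives a bar exactly one period back the same weight as the current bar, while the locally periodic weight discounts that bar by the Gaussian term.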

Bandwidth Volatility - Silverman Rule of thumb Estimator
Overview
This indicator calculates volatility using the Rule of Thumb bandwidth estimator and incorporates the standard deviation of returns to get historical volatility. There are two options: one for the original rule of thumb bandwidth estimator, and another for the modified rule of thumb estimator. The indicator plots the bandwidth as color gradient columns, which are colored by a percentile of the bandwidth, and the moving average of the bandwidth as the dark shaded area.
The rule of thumb bandwidth estimator is a simple and quick method for estimating the bandwidth parameter in kernel density estimation (KDE) or kernel regression. It provides a rough approximation of the bandwidth without requiring extensive computational resources or fine-tuning. One common rule of thumb estimator is Silverman's rule, which is given by
h = 1.06*σ*n^(-1/5)
where
h is the bandwidth
σ is the standard deviation of the data
n is the number of data points
This rule of thumb is based on assuming a Gaussian kernel and aims to strike a balance between over-smoothing and under-smoothing the data. It is simple to implement and usually provides reasonable bandwidth estimates for a wide range of datasets. However, it is important to note that this rule of thumb may not always give optimal results, especially for non-Gaussian or multimodal distributions. In such cases, a different bandwidth selection method, such as cross-validation, or even applying a log transformation (if the data is right-skewed), may be preferable.
How it works:
This indicator computes the bandwidth volatility using returns, which are used in the standard deviation calculation. It then estimates the bandwidth based on either the Silverman rule of thumb or a modified version considering the interquartile range.
The percentile ranks of the bandwidth estimate are then used to visualize the volatility levels, identify high and low volatility periods, and show them with colors.
Modified Rule of thumb Bandwidth:
The modified rule of thumb bandwidth formula combines the standard deviation and the interquartile range (IQR), scaled by a multiplier of 0.9 and by the number of periods raised to the power -1/5. This modification aims to provide a more robust and adaptable bandwidth estimation method, particularly suitable for financial time series with potentially skewed or heavy-tailed data.
Formula for Modified Rule of Thumb Bandwidth:
h = 0.9 * min(σ, (IQR/1.34))*n^(-1/5)
This modification introduces the IQR divided by 1.34 as an alternative to the standard deviation. It aims to improve the estimation, mainly when the underlying distribution deviates from a perfect Gaussian distribution.
Analysis
Rule of thumb Bandwidth: Provides a broader perspective on volatility trends, smoothing out short-term fluctuations and focusing more on the overall shape of the density function.
Historical Volatility: Offers a more granular view of volatility, capturing day-to-day or intra-period fluctuations in asset prices and returns.
Pros of Bandwidth as a volatility measure
Robust to Data Distribution: Bandwidth volatility, especially when estimated using robust methods like Silverman's rule of thumb or its modifications, can be less sensitive to outliers and non-normal distributions than some other measures of volatility.
Flexibility: It can be applied to a wide range of data types and can adapt to different underlying data distributions, making it versatile for various analytical tasks.
How can traders use this indicator?
In finance, volatility is thought to be a mean-reverting process. When volatility is at an extreme low, a volatility expansion is expected, which comes with bigger movements in price. When volatility is at an extreme high, it is expected to eventually decrease, leading to smaller price moves, and many traders view this as an area to take profit in. In the context of this indicator, low volatility is shown by the green color, which indicates a low percentile value, together with the bandwidth being below the moving average. High volatility is shown by the yellow color, possibly with the bandwidth above the moving average, suggesting that volatility can eventually be expected to decrease.
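The two bandwidth formulas from this description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses log returns, the sample standard deviation, and the default quartile method of Python's statistics module, none of which the description specifies. The price list and function names are made up for the example.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def silverman_bandwidth(data):
    """Original rule of thumb: h = 1.06 * sigma * n^(-1/5)."""
    n = len(data)
    sigma = statistics.stdev(data)  # assumption: sample standard deviation
    return 1.06 * sigma * n ** (-1 / 5)


def modified_bandwidth(data):
    """Modified rule of thumb: h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sigma = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # assumption: default quartile method
    return 0.9 * min(sigma, (q3 - q1) / 1.34) * n ** (-1 / 5)


# Made-up prices, for illustration only.
prices = [100.0, 101.5, 100.8, 102.3, 103.1, 102.0, 104.2, 103.7, 105.0, 104.1, 106.3]
returns = log_returns(prices)
h = silverman_bandwidth(returns)
h_mod = modified_bandwidth(returns)
print(h, h_mod)
```

Because 0.9 * min(σ, IQR/1.34) can never exceed 0.9 * σ, the modified bandwidth is always smaller than the original one on the same data.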

Bandwidth Bands - Silverman's rule of thumb
What are Bandwidth Bands?
This indicator uses the Silverman Rule of Thumb Bandwidth to estimate the width of bands around a rolling moving average. The calculation takes the log transformation of price to remove most of the price skewness for the rest of the volatility calculations, and an exp() function is then applied to convert the result back to a right-skewed distribution. These bandwidth bands could offer insights into price volatility and trading extremes.
Silverman rule of thumb bandwidth:
The Silverman Rule of Thumb Bandwidth is a heuristic method used to estimate the optimal bandwidth for kernel density estimation, a statistical technique for estimating the probability density function of a random variable. In the context of financial analysis, such as in this indicator, it helps determine the width of bands around a moving average, providing insights into the level of volatility in the market. This method is particularly useful because it offers a quick and straightforward way to estimate bandwidth without requiring extensive computational resources or complex mathematical calculations.
The bandwidth estimator automatically adjusts to the characteristics of the data, providing a flexible and dynamic measure of dispersion that can capture variations in volatility over time. Standard deviations alone may not be as adaptive to changes in data distributions. The bandwidth considers the overall shape and structure of the data distribution rather than just focusing on the spread of data points.
Settings
Source
Sample length
1-4 SD options to disable or enable each band
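The band construction described above can be sketched roughly as follows. The description does not give the exact formula, so this Python sketch assumes the bands sit at integer multiples of the Silverman bandwidth around the mean of log prices, and that everything is mapped back with exp(). The function name and the default multipliers are hypothetical.

```python
import math
import statistics


def bandwidth_bands(prices, length, multipliers=(1, 2, 3, 4)):
    """Return (basis, {k: (lower, upper)}) for the last `length` prices."""
    window = [math.log(p) for p in prices[-length:]]  # log transform to reduce skew
    basis = statistics.fmean(window)                  # rolling mean in log space
    sigma = statistics.stdev(window)
    h = 1.06 * sigma * len(window) ** (-1 / 5)        # Silverman rule of thumb
    bands = {}
    for k in multipliers:
        # Assumption: band k lies k bandwidths away from the basis in log space.
        bands[k] = (math.exp(basis - k * h), math.exp(basis + k * h))
    return math.exp(basis), bands


# Made-up prices, for illustration only.
prices = [100.0, 101.5, 100.8, 102.3, 103.1, 102.0, 104.2, 103.7, 105.0, 104.1, 106.3]
mid, bands = bandwidth_bands(prices, 10)
print(mid, bands[1], bands[2])
```

Since the offsets are symmetric in log space, each pair of bands is geometrically symmetric around the basis after the exp() step: the upper band is as many percent above the basis as the basis is above the lower band.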

WaveTrend 3D █ OVERVIEW WaveTrend 3D (WT3D) is a novel implementation of the famous WaveTrend (WT) indicator and has been completely redesigned from the ground up to address some of the inherent shortcomings associated with the traditional WT algorithm. █ BACKGROUND The WaveTrend (WT) indicator has become a widely popular tool for traders in recent years. WT was first ported to PineScript in 2014 by the user @LazyBear, and since then, it has ascended to become one of the Top 5 most popular scripts on TradingView. The WT algorithm appears to have origins in a lesser-known proprietary algorithm called Trading Channel Index (TCI), created by AIQ Systems in 1986 as an integral part of their commercial software suite, TradingExpert Pro. The software’s reference manual states that “TCI identifies changes in price direction” and is “an adaptation of Donald R. Lambert’s Commodity Channel Index (CCI)”, which was introduced to the world six years earlier in 1980. Interestingly, a vestige of this early beginning can still be seen in the source code of LazyBear’s script, where the final EMA calculation is stored in an intermediate variable called “tci” in the code. █ IMPLEMENTATION DETAILS WaveTrend 3D is an alternative implementation of WaveTrend that directly addresses some of the known shortcomings of the indicator, including its unbounded extremes, susceptibility to whipsaw, and lack of insight into other timeframes. In the canonical WT approach, an exponential moving average (EMA) for a given lookback window is used to assess the variability between price and two other EMAs relative to a second lookback window. Since the difference between the average price and its associated EMA is essentially unbounded, an arbitrary scaling factor of 0.015 is typically applied as a crude form of rescaling but still fails to capture 20-30% of values between the range of -100 to 100. 
Additionally, the trigger signal for the final EMA (i.e., TCI) crossover-based oscillator is a four-bar simple moving average (SMA), which further contributes to the net lag accumulated by the consecutive EMA calculations in the previous steps. The core idea behind WT3D is to replace the EMA-based crossover system with modern Digital Signal Processing techniques. By assuming that price action adheres approximately to a Gaussian distribution, it is possible to sidestep the scaling nightmare associated with unbounded price differentials of the original WaveTrend method by focusing instead on the alteration of the underlying Probability Distribution Function (PDF) of the input series. Furthermore, using a signal processing filter such as a Butterworth Filter, we can eliminate the need for consecutive exponential moving averages along with the associated lag they bring. Ideally, it is convenient to have the resulting probability distribution oscillate between the values of -1 and 1, with the zero line serving as a median. With this objective in mind, it is possible to borrow a common technique from the field of Machine Learning that uses a sigmoid-like activation function to transform our data set of interest. One such function is the hyperbolic tangent function (tanh), which is often used as an activation function in the hidden layers of neural networks due to its unique property of ensuring the values stay between -1 and 1. By taking the first-order derivative of our input series and normalizing it using the quadratic mean, the tanh function performs a high-quality redistribution of the input signal into the desired range of -1 to 1. Finally, using a dual-pole filter such as the Butterworth Filter popularized by John Ehlers, excessive market noise can be filtered out, leaving behind a crisp moving average with minimal lag. 
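As a rough illustration of the tanh normalization step described above, the following Python sketch takes a one-bar difference as the first-order derivative, divides it by the quadratic mean (root mean square) of recent differences, and passes the result through tanh. The one-bar difference, the rolling window, and the function name are assumptions made for this sketch; the script's own derivative and averaging choices may differ.

```python
import math


def tanh_normalize(src, lookback):
    """Map bar-to-bar changes of `src` into the range -1 to 1."""
    derivs = []
    out = []
    for t in range(1, len(src)):
        derivs.append(src[t] - src[t - 1])      # assumed one-bar derivative
        window = derivs[-lookback:]
        rms = math.sqrt(sum(d * d for d in window) / len(window))  # quadratic mean
        out.append(math.tanh(derivs[-1] / rms) if rms > 0 else 0.0)
    return out


# Made-up input series, for illustration only.
series = [100.0, 101.0, 99.0, 102.0, 102.5, 101.0, 104.0]
normalized = tanh_normalize(series, 3)
print([round(v, 3) for v in normalized])
```

Dividing by the quadratic mean puts typical moves near ±1 before tanh is applied, so ordinary bars land in the roughly linear part of the curve and only unusually large moves are compressed toward the -1 and 1 limits.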
Furthermore, WT3D expands upon the original functionality of WT by providing:
First-class support for multi-timeframe (MTF) analysis
Kernel-based regression for trend reversal confirmation
Various options for signal smoothing and transformation
A unique mode for visualizing an input series as a symmetrical, three-dimensional waveform useful for pattern identification and cycle-related analysis
█ SETTINGS
This is a summary of the settings used in the script, listed in roughly the order in which they appear. By default, all colors are from Google's TensorFlow framework and are considered to be colorblind safe.
Source: The input series. Usually, it is the close or average price, but it can be any series.
Use Mirror: Whether to display a mirror image of the source series; for visualizing the series as a 3D waveform similar to a soundwave.
Use EMA: Whether to use an exponential moving average of the input series.
EMA Length: The length of the exponential moving average.
Use COG: Whether to use the center of gravity of the input series.
COG Length: The length of the center of gravity.
Speed to Emphasize: The target speed to emphasize.
Width: The width of the emphasized line.
Display Kernel Moving Average: Whether to display the kernel moving average of the signal. This is similar to PCA, an unsupervised Machine Learning technique whereby neighboring vectors are projected onto the Principal Component.
Display Kernel Signal: Whether to display the kernel estimator for the emphasized line. Like the Kernel MA, it can show underlying shifts in bias within a more significant trend by the colors reflected on the ribbon itself.
Show Oscillator Lines: Whether to show the oscillator lines.
Offset: The offset of the emphasized oscillator plots.
Fast Length: The length scale factor for the fast oscillator.
Fast Smoothing: The smoothing scale factor for the fast oscillator.
Normal Length: The length scale factor for the normal oscillator.
Normal Smoothing: The smoothing scale factor for the normal frequency.
Slow Length: The length scale factor for the slow oscillator.
Slow Smoothing: The smoothing scale factor for the slow frequency.
Divergence Threshold: The number of bars for the divergence to be considered significant.
Trigger Wave Percent Size: How big the current wave should be relative to the previous wave.
Background Area Transparency Factor: Transparency factor for the background area.
Foreground Area Transparency Factor: Transparency factor for the foreground area.
Background Line Transparency Factor: Transparency factor for the background line.
Foreground Line Transparency Factor: Transparency factor for the foreground line.
Custom Transparency: Transparency of the custom colors.
Total Gradient Steps: The maximum number of steps supported for a gradient calculation is 256.
Fast Bullish Color: The color of the fast bullish line.
Normal Bullish Color: The color of the normal bullish line.
Slow Bullish Color: The color of the slow bullish line.
Fast Bearish Color: The color of the fast bearish line.
Normal Bearish Color: The color of the normal bearish line.
Slow Bearish Color: The color of the slow bearish line.
Bullish Divergence Signals: The color of the bullish divergence signals.
Bearish Divergence Signals: The color of the bearish divergence signals.
█ ACKNOWLEDGEMENTS
@LazyBear - For authoring the original WaveTrend port on TradingView
@PineCoders - For the beautiful color gradient framework used in this indicator
@veryfid - For the inspiration of using mirrored signals for cycle analysis and using multiple lookback windows as proxies for other timeframes

KernelFunctions
Library "KernelFunctions"
This library provides non-repainting kernel functions for Nadaraya-Watson estimator implementations. This allows for easy substitution and comparison of different kernel functions in indicators. Furthermore, kernels can easily be combined with other kernels to create newer, more customized kernels. Compared to Moving Averages (which are really just simple kernels themselves), these kernel functions are more adaptive and afford the user an unprecedented degree of customization and flexibility.
rationalQuadratic(_src, _lookback, _relativeWeight, _startAtBar)
Rational Quadratic Kernel - An infinite sum of Gaussian Kernels of different length scales.
Parameters:
_src : The source series.
_lookback : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_relativeWeight : Relative weighting of time frames. Smaller values result in a more stretched-out curve, and larger values will result in a more wiggly curve. As this value approaches zero, the longer time frames will exert more influence on the estimation. As this value approaches infinity, the behavior of the Rational Quadratic Kernel will become identical to the Gaussian kernel.
_startAtBar : Bar index on which to start regression. The first bars of a chart are often highly volatile, and omitting these initial bars often leads to a better overall fit.
Returns: yhat The estimated values according to the Rational Quadratic Kernel.
gaussian(_src, _lookback, _startAtBar)
Gaussian Kernel - A weighted average of the source series. The weights are determined by the Radial Basis Function (RBF).
Parameters:
_src : The source series.
_lookback : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_startAtBar : Bar index on which to start regression. The first bars of a chart are often highly volatile, and omitting these initial bars often leads to a better overall fit.
Returns: yhat The estimated values according to the Gaussian Kernel.
periodic(_src, _lookback, _period, _startAtBar)
Periodic Kernel - The periodic kernel (derived by David MacKay) allows one to model functions that repeat themselves exactly.
Parameters:
_src : The source series.
_lookback : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_period : The distance between repetitions of the function.
_startAtBar : Bar index on which to start regression. The first bars of a chart are often highly volatile, and omitting these initial bars often leads to a better overall fit.
Returns: yhat The estimated values according to the Periodic Kernel.
locallyPeriodic(_src, _lookback, _period, _startAtBar)
Locally Periodic Kernel - The locally periodic kernel is a periodic function that slowly varies with time. It is the product of the Periodic Kernel and the Gaussian Kernel.
Parameters:
_src : The source series.
_lookback : The number of bars used for the estimation. This is a sliding value that represents the most recent historical bars.
_period : The distance between repetitions of the function.
_startAtBar : Bar index on which to start regression. The first bars of a chart are often highly volatile, and omitting these initial bars often leads to a better overall fit.
Returns: yhat The estimated values according to the Locally Periodic Kernel.
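The relationship between the Rational Quadratic and Gaussian kernels described for _relativeWeight can be checked numerically. The Python sketch below assumes the standard textbook forms of both kernels, with i as the number of bars back from the current bar; the function names are hypothetical and this is not the library's Pine Script source.

```python
import math


def gaussian_weight(i: int, lookback: float) -> float:
    """RBF weight for the bar i steps back from the current bar."""
    return math.exp(-(i ** 2) / (2 * lookback ** 2))


def rational_quadratic_weight(i: int, lookback: float, relative_weight: float) -> float:
    """Rational Quadratic weight (assumed textbook form)."""
    return (1 + i ** 2 / (2 * relative_weight * lookback ** 2)) ** (-relative_weight)


if __name__ == "__main__":
    # A distant bar (i = 30) with lookback 8, for several relative weights.
    for alpha in (0.5, 8.0, 1e6):
        print(alpha, rational_quadratic_weight(30, 8, alpha))
    print("gaussian", gaussian_weight(30, 8))
```

With a small relative weight, distant bars keep a noticeably larger weight than under the Gaussian kernel, which is one way to read "longer time frames will exert more influence". With a very large relative weight, the two weights become practically indistinguishable.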

Nadaraya-Watson: Rational Quadratic Kernel (Non-Repainting)
What is Nadaraya–Watson Regression?
Nadaraya–Watson Regression is a type of Kernel Regression, which is a non-parametric method for estimating the curve of best fit for a dataset. Unlike Linear Regression or Polynomial Regression, Kernel Regression does not assume any underlying distribution of the data. For estimation, it uses a kernel function, which is a weighting function that assigns a weight to each data point based on how close it is to the current point. The computed weights are then used to calculate the weighted average of the data points.
How is this different from using a Moving Average?
A Simple Moving Average is actually a special type of Kernel Regression that uses a Uniform (Rectangular) Kernel function. This means that all data points in the specified lookback window are weighted equally. In contrast, the Rational Quadratic Kernel function used in this indicator assigns a higher weight to data points that are closer to the current point. This means that the indicator will react more quickly to changes in the data.
Why use the Rational Quadratic Kernel over the Gaussian Kernel?
The Gaussian Kernel is one of the most commonly used Kernel functions and is used extensively in many Machine Learning algorithms due to its general applicability across a wide variety of datasets. The Rational Quadratic Kernel can be thought of as a Gaussian Kernel on steroids; it is equivalent to adding together many Gaussian Kernels of differing length scales. This allows the user even more freedom to tune the indicator to their specific needs.
The formula for the Rational Quadratic function is:
K(x, x') = (1 + ||x - x'||^2 / (2 * alpha * h^2))^(-alpha)
where x and x' are data points, alpha is a hyperparameter that controls the smoothness (i.e. overall "wiggle") of the curve, and h is the band length of the kernel.
Does this Indicator Repaint?
No, this indicator has been intentionally designed to NOT repaint.
This means that once a bar has closed, the indicator will never change the values in its plot. This is useful for backtesting and for trading strategies that require a non-repainting indicator.
Settings:
Bandwidth. This is the number of bars that the indicator will use as a lookback window.
Relative Weighting Parameter. The alpha parameter for the Rational Quadratic Kernel function. This is a hyperparameter that controls the smoothness of the curve. A lower value of alpha will result in a smoother, more stretched-out curve, while a higher value will result in a more wiggly curve with a tighter fit to the data. As this parameter approaches 0, the longer time frames will exert more influence on the estimation, and as it approaches infinity, the curve will become identical to the one produced by the Gaussian Kernel.
Color Smoothing. Toggles the mechanism for coloring the estimation plot between rate of change and cross over modes.
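The comparison with a Simple Moving Average and the non-repainting property can both be illustrated with a small Python sketch. It assumes the estimate at bar t is a weighted average of bar t and the bars before it, with a uniform weight reproducing the SMA and the Rational Quadratic formula above supplying the alternative weights. The function names, the separate window argument, and the sample series are hypothetical.

```python
def kernel_estimate(series, t, window, weight_fn):
    """Weighted average at bar t using only bars t, t-1, ... (no future bars)."""
    num = den = 0.0
    for i in range(min(window, t + 1)):
        w = weight_fn(i)            # i is the distance back from bar t
        num += w * series[t - i]
        den += w
    return num / den


def uniform_weight(i):
    """Uniform (rectangular) kernel: every bar in the window counts equally."""
    return 1.0


def make_rational_quadratic(h, alpha):
    """Build a weight function from K = (1 + i^2 / (2 * alpha * h^2))^(-alpha)."""
    return lambda i: (1 + i ** 2 / (2 * alpha * h ** 2)) ** (-alpha)


# Made-up steadily rising series, for illustration only.
series = [float(v) for v in range(1, 21)]
sma = kernel_estimate(series, 19, 10, uniform_weight)
rq = kernel_estimate(series, 19, 10, make_rational_quadratic(h=8, alpha=1.0))
print(sma, rq)
```

On a rising series the Rational Quadratic estimate sits above the SMA because recent bars carry more weight. The estimate for a closed bar also stays the same when later bars are appended, since bars after t are never read.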

STD-Filtered, Gaussian-Kernel-Weighted Moving Average [Loxx]
STD-Filtered, Gaussian-Kernel-Weighted Moving Average is a moving average that weights price by using a Gaussian kernel function to calculate data points. This indicator also allows for filtering both the source input price and the output signal using a standard deviation filter.
Purpose
The purpose of this indicator is to take the concept of Kernel estimation and apply it so that, instead of predicting past values, the weighted function predicts the current bar value at each bar to create a moving average that is suitable for trading. Normally this method is used to create an array of past estimators to model past data, but that approach is not useful for trading because the past values will repaint. This moving average does NOT repaint; however, you must allow signals to close on the current bar before taking the signal. You can compare this to the Nadaraya-Watson Estimator, which uses the Nadaraya-Watson estimator method with a normalized kernel-weighted function to model price.
What are Kernel Functions?
A kernel function is used as a weighting function to develop a non-parametric regression model. Below, the properties of kernel functions and the steps to build kernels around data points are briefly presented.
Kernel Function
In non-parametric statistics, a kernel is a weighting function which satisfies the following properties.
A kernel function must be symmetrical. Mathematically this property can be expressed as K(-u) = K(+u). The symmetric property of the kernel function enables its maximum value (max(K(u))) to lie in the middle of the curve.
The area under the curve of the function must be equal to one. Mathematically, this property is expressed as: ∫_{−∞}^{+∞} K(u) du = 1
The value of the kernel function cannot be negative, i.e. K(u) ≥ 0 for all −∞ < u < ∞.
Kernel Estimation
In this article, the Gaussian kernel function is used to calculate kernels for the data points.
The equation for the Gaussian kernel is:
K(u) = (1 / sqrt(2π)) * e^(-0.5 * u^2), with u = j / bw
where xi is the observed data point, j is the value where the kernel function is computed, and bw is called the bandwidth. Bandwidth in kernel regression is called the smoothing parameter because it controls variance and bias in the output. The effect of bandwidth value on model prediction is discussed later in this article.
Included
Loxx's Expanded Source types
Signals
Alerts
Bar coloring
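To show how the bandwidth acts as a smoothing parameter, here is a minimal Python sketch of a Gaussian-kernel-weighted moving average built from the equation above. It assumes j counts bars back from the current bar and that the average is normalized by the sum of the weights. The standard deviation filter is left out, and the function names and prices are made up for the example.

```python
import math


def gaussian_kernel(j: int, bw: float) -> float:
    """K = (1 / sqrt(2*pi)) * exp(-0.5 * (j / bw)^2) for the bar j steps back."""
    return math.exp(-0.5 * (j / bw) ** 2) / math.sqrt(2 * math.pi)


def gaussian_kernel_ma(prices, length, bw):
    """Kernel-weighted average of the last `length` prices; j = 0 is the current bar."""
    recent = prices[-length:][::-1]                 # most recent price first
    weights = [gaussian_kernel(j, bw) for j in range(len(recent))]
    return sum(w * p for w, p in zip(weights, recent)) / sum(weights)


# Made-up prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0, 110.0]
tight = gaussian_kernel_ma(prices, 8, bw=0.1)    # very small bandwidth
loose = gaussian_kernel_ma(prices, 8, bw=1e6)    # very large bandwidth
print(tight, loose)
```

A very small bandwidth puts almost all of the weight on the current bar, so the average hugs price (low bias, high variance). A very large bandwidth makes the weights nearly equal, so the result approaches a simple moving average (smoother, but slower to react). The 1 / sqrt(2π) constant cancels in the normalization and does not affect the result.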