Example: Polynomial Regression for Spread Analysis
Example of applying a polynomial regression channel to spreads or hedges between two assets.
Polynomial Regression Bands + Channel [DW]
This is an experimental study designed to calculate polynomial regression for any order polynomial that TV is able to support.
This study aims to educate users on polynomial curve fitting, and the derivation process of Least Squares Moving Averages (LSMAs).
I also designed this study with the intent of showcasing some of the capabilities and potential applications of TV's fantastic new array functions.
Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modeled as a polynomial of nth degree (order).
For clarification, linear regression can also be described as a first order polynomial regression. The process of deriving linear, quadratic, cubic, and higher order polynomial relationships is all the same.
In addition, although deriving a polynomial regression equation results in a nonlinear output, the process of solving for polynomials by least squares is actually a special case of multiple linear regression.
So, just like in multiple linear regression, polynomial regression can be solved in essentially the same way through a system of linear equations.
In this study, you are first given the option to smooth the input data using the 2 pole Super Smoother Filter from John Ehlers.
I chose this specific filter because I find it provides superior smoothing with low lag and fairly clean cutoff. You can, of course, implement your own filter functions to see how they compare if you feel like experimenting.
Filtering noise prior to regression calculation can be useful for providing a more stable estimation since least squares regression can be rather sensitive to noise.
This is especially true on lower sampling lengths and higher degree polynomials since the regression output becomes more "overfit" to the sample data.
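For reference, here is a minimal Python sketch of Ehlers' 2 pole Super Smoother recursion as it is commonly published. This is not the Pine code from this study, and the warm-up handling for the first two bars is my own assumption:

```python
import math

def super_smoother(src, period):
    """Ehlers 2-pole Super Smoother: smooths `src` with low lag and a fairly clean cutoff."""
    a1 = math.exp(-math.sqrt(2.0) * math.pi / period)
    b1 = 2.0 * a1 * math.cos(math.sqrt(2.0) * math.pi / period)
    c2, c3 = b1, -a1 * a1
    c1 = 1.0 - c2 - c3
    out = []
    for i, x in enumerate(src):
        if i < 2:
            out.append(x)  # warm-up: pass the raw value through for the first two samples
        else:
            out.append(c1 * (x + src[i - 1]) / 2.0 + c2 * out[-1] + c3 * out[-2])
    return out
```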
Next, data arrays are populated for the x-axis and y-axis values. These are the main datasets utilized in the rest of the calculations.
To keep the calculations more numerically stable for higher periods and orders, the x array is filled with integers 1 through the sampling period rather than using current bar numbers.
This process can be thought of as shifting the origin of the x-axis as new data emerges.
This keeps the axis values significantly lower than the 10k+ bar values, thus maintaining more numerical stability at higher orders and sample lengths.
The data arrays are then used to create a pseudo 2D matrix of x power sums, and a vector of x power*y sums.
These matrices are a representation of the system of equations that needs to be solved in order to find the regression coefficients.
Below, you'll see some examples of the pattern of equations used to solve for our coefficients represented in augmented matrix form.
For example, the augmented matrix for the system of equations required to solve a second order (quadratic) polynomial regression by least squares is formed like this:
(∑x^0 ∑x^1 ∑x^2 | ∑(x^0)y)
(∑x^1 ∑x^2 ∑x^3 | ∑(x^1)y)
(∑x^2 ∑x^3 ∑x^4 | ∑(x^2)y)
The augmented matrix for the third order (cubic) system is formed like this:
(∑x^0 ∑x^1 ∑x^2 ∑x^3 | ∑(x^0)y)
(∑x^1 ∑x^2 ∑x^3 ∑x^4 | ∑(x^1)y)
(∑x^2 ∑x^3 ∑x^4 ∑x^5 | ∑(x^2)y)
(∑x^3 ∑x^4 ∑x^5 ∑x^6 | ∑(x^3)y)
This pattern continues for any nth order polynomial regression, in which the coefficient matrix is an (n + 1)-wide square matrix with the last term being ∑x^2n, and the last term of the result vector being ∑(x^n)y.
Thanks to this pattern, it's rather convenient to solve for the regression coefficients of any nth degree polynomial by a number of different methods.
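To make the pattern concrete, here is a hedged Python sketch (not the study's Pine arrays; the names are illustrative) that assembles the coefficient matrix of power sums and the result vector for an arbitrary degree:

```python
import numpy as np

def normal_equations(x, y, degree):
    """Build the (degree+1)x(degree+1) matrix of x power sums and the vector of x^k * y sums."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = degree + 1
    A = np.empty((n, n))
    b = np.empty(n)
    for r in range(n):
        for c in range(n):
            A[r, c] = np.sum(x ** (r + c))   # entry (r, c) holds sum(x^(r+c)), matching the matrices above
        b[r] = np.sum((x ** r) * y)          # right-hand side holds sum(x^r * y)
    return A, b
```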
In this script, I utilize a process known as LU Decomposition to solve for the regression coefficients.
Lower-upper (LU) Decomposition is a neat form of matrix manipulation that expresses a 2D matrix as the product of lower and upper triangular matrices.
This decomposition method is incredibly handy for solving systems of equations, calculating determinants, and inverting matrices.
For a linear system Ax=b, where A is our coefficient matrix, x is our vector of unknowns, and b is our vector of results, LU Decomposition turns our system into LUx=b.
We can then factor this into two separate matrix equations and solve the system using these two simple steps:
1. Solve Ly=b for y, where y is a new vector of unknowns that satisfies the equation, using forward substitution.
2. Solve Ux=y for x using backward substitution. This gives us the values of our original unknowns - in this case, the coefficients for our regression equation.
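As an illustration of those two steps, here is a compact Python sketch of a Doolittle-style LU solve with forward and backward substitution. It is a simplified stand-in for the script's routine, kept without pivoting purely for brevity:

```python
import numpy as np

def lu_solve(A, b):
    """Solve A·coeffs = b via Doolittle LU decomposition (no pivoting, illustration only)."""
    n = len(b)
    L, U = np.eye(n), np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):                  # build row i of U
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for j in range(i + 1, n):              # build column i of L (unit diagonal)
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    y = np.zeros(n)
    for i in range(n):                         # step 1: forward substitution, L·y = b
        y[i] = b[i] - L[i, :i] @ y[:i]
    x = np.zeros(n)
    for i in reversed(range(n)):               # step 2: backward substitution, U·x = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x                                   # the regression coefficients a0..an
```

Chained with the previous sketch, `lu_solve(*normal_equations(x, y, degree))` returns the coefficients in ascending order.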
After solving for the regression coefficients, the values are then plugged into our regression equation:
Y = a0 + a1*x + a2*x^2 + ... + an*x^n, where a(i) is the ith coefficient in ascending order and n is the polynomial degree.
From here, an array of curve values for the period based on the current equation is populated, and standard deviation is added to and subtracted from the equation to calculate the channel high and low levels.
The calculated curve values can also be shifted to the left or right using the "Regression Offset" input.
Changing the offset parameter moves the curve left for negative values and right for positive values.
This offset parameter shifts the curve points within our window while using the same equation, allowing you to use offset datapoints on the regression curve to calculate the LSMA and bands.
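A rough Python sketch of that step, illustrative only; the exact deviation measure, offset sign convention, and names here are assumptions rather than the script's code:

```python
import numpy as np

def regression_channel(x, y, coeffs, offset=0, mult=1.0):
    """Evaluate the fitted polynomial over the window, optionally shifted, and build stdev bands."""
    x = np.asarray(x, float)
    xs = x + offset                                         # shift the evaluation points by `offset` bars
    curve = sum(c * xs ** k for k, c in enumerate(coeffs))  # a0 + a1*x + a2*x^2 + ... + an*x^n
    dev = mult * np.std(np.asarray(y, float) - curve)       # deviation of the data around the fit (assumed measure)
    return curve, curve + dev, curve - dev                  # regression curve, channel high, channel low
```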
The appearance of the curve and channel is optionally approximated using Pine's v4 line tools to draw segments.
Since there is a limitation on how many lines can be displayed per script, each curve consists of 10 segments with lengths determined by a user defined step size. In total, there are 30 lines displayed at once when active.
By default, the step size is 10, meaning each segment is 10 bars long. This is because the default sampling period is 100, so this step size will show the approximate curve for the entire period.
When adjusting your sampling period, be sure to adjust your step size accordingly when curve drawing is active if you want to see the full approximate curve for the period.
Note that when you have a larger step size, you will see more seemingly "sharp" turning points on the polynomial curve, especially on higher degree polynomials.
The polynomial functions that are calculated are continuous and differentiable across all points. The perceived sharpness is simply due to our limitation on available lines to draw them.
The approximate channel drawings also come equipped with style inputs, so you can control the type, color, and width of the regression, channel high, and channel low curves.
I also included an input to determine if the curves are updated continuously, or only upon the closing of a bar for reduced runtime demands. More about why this is important in the notes below.
For additional reference, I also included the option to display the current regression equation.
This allows you to easily track the polynomial function you're using, and to confirm that the polynomial is properly supported within Pine.
There are some cases that aren't supported properly due to Pine's limitations. More about this in the notes on the bottom.
In addition, I included a line of text beneath the equation to indicate how many bars left or right the calculated curve data is currently shifted.
The display label comes equipped with style editing inputs, so you can control the size, background color, and text color of the equation display.
The Polynomial LSMA, high band, and low band in this script are generated by tracking the current endpoints of the regression, channel high, and channel low curves respectively.
The output of these bands is similar in nature to Bollinger Bands, but with an obviously different derivation process.
By displaying the LSMA and bands in tandem with the polynomial channel, it's easy to visualize how LSMAs are derived, and how the process that goes into them is drastically different from a typical moving average.
The main difference between LSMA and other MAs is that LSMA is showing the value of the regression curve on the current bar, which is the result of a modelled relationship between x and the expected value of y.
Other MA / filter types typically just average or frequency-filter the samples. This is an important distinction in interpretation. However, both can be applied similarly when trading.
An important distinction with the LSMA in this script is that since we can model higher degree polynomial relationships, the LSMA here is not limited to only linear as it is in TV's built in LSMA.
Bar colors are also included in this script. The color scheme is based on disparity between source and the LSMA.
This script is a great study for educating yourself on the process that goes into polynomial regression, as well as one of the many processes computers utilize to solve systems of equations.
Also, the Polynomial LSMA and bands are great components to try implementing into your own analysis setup.
I hope you all enjoy it!
--------------------------------------------------------
NOTES:
- Even though the algorithm used in this script can be implemented to find any order polynomial relationship, TV has a limit on the significant figures for its floating point outputs.
This means that as you increase your sampling period and / or polynomial order, some higher order coefficients will be output as 0 due to floating point round-off.
There is currently no viable workaround for this issue since there isn't a way to calculate more significant figures than the limit.
However, in my humble opinion, fitting a polynomial higher than cubic to most time series data is "overkill" due to bias-variance tradeoff.
Although, this tradeoff is also dependent on the sampling period. Keep that in mind. A good rule of thumb is to aim for a nice "middle ground" between bias and variance.
If TV ever chooses to expand its significant figure limits, then it will be possible to accurately calculate even higher order polynomials and periods if you feel the desire to do so.
To test if your polynomial is properly supported within Pine's constraints, check the equation label.
If you see a coefficient value of 0 in front of any of the x values, reduce your period and / or polynomial order.
- Although this algorithm has less computational complexity than most other linear system solving methods, this script itself can still be rather demanding on runtime resources - especially when drawing the curves.
In the event you find your current configuration is throwing back an error saying that the calculation takes too long, there are a few things you can try:
-> Refresh your chart or hide and unhide the indicator.
The runtime environment on TV is very dynamic and the allocation of available memory varies with collective server usage.
By refreshing, you can often get it to process since you're basically just waiting for your allotment to increase. This method works well in a lot of cases.
-> Change the curve update frequency to "Close Only".
If you've tried refreshing multiple times and still have the error, your configuration may simply be too demanding of resources.
v4 drawing objects, most notably lines, can be highly taxing on the servers. That's why Pine has a limit on how many can be displayed in the first place.
By limiting the curve updates to only bar closes, this will significantly reduce the runtime needs of the lines since they will only be calculated once per bar.
Note that doing this will only limit the visual output of the curve segments. It has no impact on regression calculation, equation display, or LSMA and band displays.
-> Uncheck the display boxes for the drawing objects.
If you still have troubles after trying the above options, then simply stop displaying the curve - unless it's important to you.
As I mentioned, v4 drawing objects can be rather resource intensive. So a simple fix that often works when other things fail is to just stop them from being displayed.
-> Reduce sampling period, polynomial order, or curve drawing step size.
If you're having runtime errors and don't want to sacrifice the curve drawings, then you'll need to reduce the calculation complexity.
If you're using a large sampling period, or high order polynomial, the operational complexity becomes significantly higher than lower periods and orders.
When you have larger step sizes, more historical referencing is used for x-axis locations, which does have an impact as well.
By reducing these parameters, the runtime issue will often be solved.
Another important detail to note with this is that you may have configurations that work just fine in real time, but struggle to load properly in replay mode.
This is because the replay framework also requires its own allotment of runtime, so that must be taken into consideration as well.
- Please note that the line and label objects are reprinted as new data emerges. That's simply the nature of drawing objects vs standard plots.
I do not recommend or endorse basing your trading decisions on the drawn curve. That component is merely there to serve as a visual reference of the current polynomial relationship.
No repainting occurs with the Polynomial LSMA and bands though. Once the bar is closed, that bar's calculated values are set.
So when using the LSMA and bands for trading purposes, you can rest easy knowing that history won't change on you when you come back to view them.
- For those who intend on utilizing or modifying the functions and calculations in this script for their own scripts, I included debug dialogues in the script for all of the arrays to make the process easier.
To use the debugs, see the "Debugs" section at the bottom. All dialogues are commented out by default.
The debugs are displayed using label objects. By default, I have them all located to the right of current price.
If you wish to display multiple debugs at once, it will be up to you to decide on display locations at your leisure.
When using the debugs, I recommend commenting out the other drawing objects (or even all plots) in the script to prevent runtime issues and overlapping displays.
My BTC log curve
Logarithmic regression of the USD price of Bitcoin, calculated according to the equation:
y=A*exp(beta*x^lambda + c) + m*x + b
where x is the number of days since the genesis block. All parameters are editable in the script options.
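As a quick illustration, the equation can be evaluated like this in Python. The parameter values below are purely hypothetical placeholders, not the script's fitted defaults:

```python
import math

def btc_log_curve(days, A, beta, lam, c, m, b):
    """Evaluate y = A*exp(beta*x^lambda + c) + m*x + b, with x = days since the genesis block."""
    return A * math.exp(beta * days ** lam + c) + m * days + b

# Example call with purely hypothetical parameters; the script's defaults are user-editable.
y = btc_log_curve(4000, A=1.0, beta=3.0, lam=0.4, c=-18.0, m=0.0, b=0.0)
```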
Function Polynomial Regression
Description:
A function that returns a polynomial regression and deviation information for a data set.
Inputs:
_X: Array containing x data points.
_Y: Array containing y data points.
Outputs:
_predictions: Array with adjusted _Y values.
_max_dev: Max deviation from the mean.
_min_dev: Min deviation from the mean.
_stdev/_sizeX: Average deviation from the mean.
Resources:
en.wikipedia.org
rosettacode.org
Function - Linear Regression
Description:
A Function that returns a linear regression channel using (X,Y) vector points.
Inputs:
_X: Array containing x data points.¹
_Y: Array containing y data points.¹
Note:
¹: _X and _Y size must match.
Outputs:
_predictions: Array with adjusted _Y values at _X.
_max_dev: Max deviation from the mean.
_min_dev: Min deviation from the mean.
_stdev/_sizeX: Average deviation from the mean.
Resources:
www.statisticshowto.com
en.wikipedia.org
Trend Following with Moving Averages
Hello Traders,
With the saying "Trend is Your Friend" in mind, you should not take positions against the trend. This script checks multiple moving averages to see if they are above/below the closing price and tries to find the trend. Moving averages with lengths of 8, 13, 21, 34, 55, 89, 144, 233, and 377 are used. These are Fibonacci numbers, but optionally you can change the length of each moving average. While it's green, you're better off taking long positions; while it's red, you're better off taking short positions, according to other indicators or tools.
Optionally, you have a "smoothing" option to get rid of whipsaws. It's enabled by default.
You have the option to use the following moving average types: EMA, SMA, RMA, WMA, VWMA. By default it's EMA.
The script also has a "Resolution" option. With this option you can get the trend for other time frames; in the following example, 1h was set as the higher time frame on a 15m chart:
This should not be used as a buy/sell signal indicator, as it tries to find the trend but not entry points. You should use other indicators (such as RSI or Momentum) or other tools to find buy/sell signals.
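For readers who want the gist in code, here is a hedged Python/pandas sketch of the core idea (EMAs only, no smoothing or resolution handling; the aggregation rule is my assumption, not the script's exact logic):

```python
import pandas as pd

def ma_trend(close: pd.Series, lengths=(8, 13, 21, 34, 55, 89, 144, 233, 377)):
    """Return +1 when the close is above every EMA, -1 when below every EMA, else 0."""
    emas = [close.ewm(span=n, adjust=False).mean() for n in lengths]
    above = all(close.iloc[-1] > e.iloc[-1] for e in emas)
    below = all(close.iloc[-1] < e.iloc[-1] for e in emas)
    return 1 if above else -1 if below else 0
```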
Enjoy!
Optimized Linear Regression Channel
Returns a linear regression channel with a window size within the range (min, max) such that the R-squared is maximized. This allows a better estimate of the underlying linear trend, better detection of significant historical support and resistance points, and avoids having to find a good window size manually.
Settings
Min : Minimum window size value
Max : Maximum window size value
Mult : Multiplicative factor for the rmse, controls the channel width.
Src : Source input of the indicator
Details
The indicator displays the specific window size that maximizes the R-squared at the bottom of the lower channel.
When optimizing, we want to find parameters that maximize or minimize a certain function, here the R-squared. The R-squared is given by 1 minus the ratio between the sum of squares (SSE) of the linear regression and the sum of squares of the mean. We know that the mean will always produce an SSE greater than or equal to that of the linear regression, so the R-squared will always be in a (0,1) range. In the case where our data has a linear trend, the linear regression will have a better fit, thus having a lower SSE than the SSE of the mean; as such, the ratio between the linear regression SSE and the mean SSE will be low, and 1 minus this ratio will return a greater result. A lower R-squared tells you that your linear regression produces a fit similar to the one produced by the mean. The R-squared is also given by the square of the correlation coefficient between the dependent and independent variables.
In pinescript, optimization can be done by running a function inside a loop: we run the function for each setting and keep the one that produces the maximum or minimum result. However, it is not possible to do that with most built-in functions, including the function of interest, correlation. As such, we must recreate a rolling correlation function that can be used inside loops. Such functions are generally loop-free, meaning they are not computed using a loop in the first place. Fortunately, the rolling correlation function is simply based on moving averages and standard deviations, both of which can be computed without using a loop by using cumulative sums; this is what is done in the code.
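A loop-free rolling R-squared built from those moving-average identities might look like this in Python/pandas. This is a sketch of the math rather than the Pine code, and the names are illustrative:

```python
import pandas as pd

def rolling_r2(x: pd.Series, y: pd.Series, length: int) -> pd.Series:
    """Rolling R-squared from a loop-free rolling correlation built out of moving averages."""
    sma = lambda s: s.rolling(length).mean()
    cov = sma(x * y) - sma(x) * sma(y)            # COV(x, y) = SMA(xy) - SMA(x)*SMA(y)
    var_x = sma(x * x) - sma(x) ** 2              # VAR(x)  = SMA(x^2) - SMA(x)^2
    var_y = sma(y * y) - sma(y) ** 2
    corr = cov / (var_x * var_y) ** 0.5
    return corr ** 2                              # R-squared = correlation squared
```

The optimization itself is then just a loop over candidate window sizes, keeping the length whose latest R-squared value is highest.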
Note that because the R-squared is based on the SSE of the linear regression, maximizing the R-squared also minimizes the linear regression SSE, another thing that is minimized is the horizontality of the fit.
In the example above we have a total window size of 27. The script will try to find the setting that maximizes the R-squared; we must avoid all data points before the volatile bearish candle, since using any of these data points will produce a poor fit. We see that the script avoids them, thus running as expected. Another interesting thing is that the best R-squared is not always associated with the lowest window size.
Note that optimization does not fix core problems in a model. With the linear regression we assume that our data set possesses a linear trend; if that's not the case, then no matter how many settings you use, you will still have a model that is not adapted to your data.
Linear Regression (All Data)
The tool plots a linear regression line using the entire history of an instrument on the chart. There may be issues on intraday timeframes of less than 1h. On daily, weekly and monthly charts it works without problems.
If an instrument has a lot of data points, you may not see the line (this is TV feature):
To fix that you need to scroll your chart to the left and find the starting point of the line:
And then do an auto-scroll to the last bar:
Quadratic Regression
Fit a quadratic polynomial (parabola) to the last length data points by minimizing the sum of squares between the data and the fitted results. The script can extrapolate the results into the future and can also display the R-squared of the model. Note that this script is subject to some limitations (more in the "Notes" section).
Settings
Length : Number of data points to use as input.
Offset : Determine the number of past fitted values to be displayed, if 0 only the extrapolated values are displayed, if 55 only the past fitted values are displayed.
Src : Input data of the indicator
Show R2 : Determine if the value of the R-squared must be displayed, by default true.
Usage
When the underlying trend in the price is not linear, we might use more advanced models to estimate it. This is where a higher-degree regression model might be required; as such, a quadratic model (second-degree) is appropriate when the underlying trend is parabolic.
Here we can see that the quadratic regression (in blue) offers a better fit than a linear one.
Another advantage of the quadratic regression is that a linear one will always have the same direction; that's not the case with the quadratic regression, and as such it is possible to forecast reversals.
Above, a linear regression (in red) and two quadratic regressions (in blue), both with length = 54. Note that for the sake of clarity, the above image uses one quadratic regression to show all the past fitted values and another one to show all the forecasted values.
The R-Squared is also extremely useful when it comes to measuring the accuracy of the model, with values closer to 1 indicating that the model is appropriate, and thus suggesting that the underlying trend in the price is parabolic. The R-squared can also measure the strength of the trend.
Notes
The script uses the function line.new; as such, only a maximum of 54 observations are displayed. Getting more observations can be done by using an additional quadratic regression like we did in the previous section. Another thing is that line.new uses xloc.bar_time; as such, it is possible to observe some errors with the displayed results of the indicator, such as:
This will happen when applying the indicator to symbols with session breaks, I apologize for this inconvenience and I'll try to find solutions. Note however that the indicator will work perfectly on cryptos.
Summary
That's an indicator I really wanted to make. It is important to note that such models are rarely useful in stock markets; however, it is more than possible to create a quadratic regression (with severe limitations) with pinescript.
Today I turn 21, while I should be celebrating I still wanted to share something with the community, it's also some kind of present to myself that tells me that I am a bit better at using pinescript than last year, and I am glad I could progress (instead of regress, regression , got it?). Thx a lot for reading!
Computing The Linear Regression Using The WMA And SMA
Plot a linear regression channel through the last length closing prices, with the possibility to use another source as input. The line is fit by using linear combinations between the WMA and SMA, thus providing both an interesting and efficient method. The results are the same as the ones provided by the built-in linear regression; only the computation differs.
Settings
length : Number of inputs to be used.
src : Source input of the indicator.
mult : Multiplication factor for the RMSE, determines the distance between the upper and lower level.
Usage
In technical analysis, a linear regression can provide an estimate of the underlying trend in the price. This result can be extrapolated to get an estimate of the future evolution of the trend, while the upper and lower levels can be used as support and resistance levels.
The slope of the fitted line indicates both the direction and strength of the trend, with a positive slope indicating an up-trending market and a negative slope indicating a down-trending market; a steeper line indicates a stronger trend.
We can see that the trend of the S&P500 in this chart is approximately linear, the upper and lower levels were previously tested and might return accurate support and resistance points in the future.
By using a linear regression we are making the following assumptions:
The trend is linear or approximately linear.
The cycle component has an approximately constant amplitude (this allows the upper and lower level to be more effective)
The underlying trend will have the same evolution in the future
In the case where the growth of a trend is non-linear, we can use a logarithmic scale to have a linear representation of the trend.
Details
In a simple linear regression, we want to find the slope and intercept parameters that minimize the sum of squared residuals between the data points and the fitted line:
intercept + x*slope
Both the intercept and slope have a simple solution; you can find both in the calculations of the lsma. In fact, the last point of the lsma with period length is equal to the last point of a linear regression fitted through the same length data points. We have seen many times that the lsma is an FIR filter with a series of coefficients representing a linearly decaying function with the last coefficients having a negative value. As such, we can calculate the lsma more easily by using a linear combination between a WMA and SMA: 3WMA - 2SMA. This linear combination gives us the last point of our linear regression, denoted point B.
Now we need the first point of our linear regression. Using the calculations of the lsma, we get this point with:
intercept + (x-length+1)*slope
If we get the impulse response of such an lsma, we get:
In blue, the impulse response of a standard lsma; in red, the impulse response of the lsma using the previous calculation. We can see that both are the same except that the red one appears time-inverted: the first coefficients are negative values, and as such we also have a linear operation involving the WMA and SMA but with inverted terms and different coefficients. Therefore, the first point of our linear regression, denoted point A, is given by 4SMA - 3WMA. We then only need to join these two points thanks to "line.new".
The levels are simply equal to the fitted line plus/minus the root mean squared error between the fitted line and the data points. Right now we only have two points; we need to find all the points of the fitted line. As such, we first need to find the slope, which can be calculated by dividing the vertical distance between B and A (the rise) by the horizontal distance between B and A (the run), that is:
(A - B)/(length-1)
Once done we can find each point of our line by using
B + slope*i
where i is the position of the point starting from B; i = 0 gives B since B + slope*0 = B, then we continue for every i. We then only need to sum the squared distance between each closing price at position i and the point found at that same position, divide by length - 1, and take the square root of the result in order to get the RMSE.
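Putting the whole derivation together, here is a minimal Python sketch, assuming chronological input with the most recent close last. It is an illustration of the math, not the Pine implementation:

```python
import numpy as np

def linreg_channel_wma_sma(close, length, mult=2.0):
    """Fit the last `length` closes with a line using only WMA and SMA combinations."""
    y = np.asarray(close[-length:], float)
    w = np.arange(1, length + 1, dtype=float)
    wma = np.dot(y, w) / w.sum()                 # weighted moving average (linear weights, newest heaviest)
    sma = y.mean()
    B = 3 * wma - 2 * sma                        # last point of the fitted line (the lsma)
    A = 4 * sma - 3 * wma                        # first point of the fitted line
    slope = (A - B) / (length - 1)               # rise over run
    line = B + slope * np.arange(length)         # i = 0 gives B, i = length - 1 gives A
    rmse = np.sqrt(np.sum((y[::-1] - line) ** 2) / (length - 1))
    return line[::-1], rmse * mult               # fitted line in chronological order, channel half-width
```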
In Summary
This post has shown that it is possible to compute a linear regression by using a linear combination between the WMA and SMA. Since both have extremely efficient computations (see link at the end of the post), we can have a calculation for the linear regression where the number of operations is independent of length.
This post took me eons to make because it's related to the lsma, and I am rarely short on words when it comes to anything related to the lsma. Thx to LucF for the feedback and everything.
NSDT ES Midline Zones
**DESIGNED FOR ES/MES** This script provides an easy visualization of potential reversion zones for taking trades back to the intraday midline. A common use would be to enter a position once price reaches the outer yellow zones and retreats to either the red zone (for a short toward the midline) or a green zone (for a long back to the midline).
NSDT NQ Midline Zones
**DESIGNED FOR NQ/MNQ** This script provides an easy visualization of potential reversion zones for taking trades back to the intraday midline. A common use would be to enter a position once price reaches the outer yellow zones and retreats to either the red zone (for a short toward the midline) or a green zone (for a long back to the midline).
Linear Regression ++
Due to public demand
Linear Regression Formula
Scraped Calculation With Alerts
Here is the Linear Regression Script For traders Who love rich features
Features
++ Multi time frame -> Source Regression from a different Chart
++ Customized Colors -> This includes the pine lines
++ Smoothing -> Allow Filtered Regression; Note: Using 1 Defaults to the original line. The default is 1
++ Alerts On Channel/Range Crossing
Usage
++ Use this for BreakOuts and Reversals
++ This Script is not to be used Independently
Risks
Please note, this script is similar to Bollinger Bands and poses a risk of getting caught in a trending range.
Signals may keep running in the same direction while the market is reversing.
Requests
If you have any feature requests, comment below or DM me. I will answer when I can.
Feel free to utilize this on your chart and share your ideas
For developers who want to use this on their chart, Please use this script
The original formula for calculation is posted there
❤❤❤ I hope you love this. From my heart! ❤❤❤
Linear Regression ++
Here is another amazing script for you guys
Target Audience
++ Programmers
++ Linear Regression Enthusiasts
Please use this indicator if you understand the risks posed by linear regression; I'll explain some below
Features
++ Raw Formulae for the linear regression
--I understand that TradingView's explanation of how the linreg function works is not clear to many of you, and therefore I included this for developers
--Yes, it's much simpler than you thought. Do enjoy
++ Alerts
--You can get alerts when the lower band is crossed/touched based on your settings
--These alerts are not repainting at all.
Linear Regression Limits
As you traders know, the market changes over time and new levels will get drawn
The alerts are based on these new levels, and once we have new ones, we keep updating
Risk
This script is similar to the Bollinger Bands style of alerts. If the market moves continuously in one direction after the break of a band, the levels change and you may receive a new signal confirmation.
Cheers!! Enjoy!! Feel free to ask me for any improvements
Leavitt Convolutions Multicator - Jay Leavitt, Ph.D.
Hot off the press, I present this next generation "Leavitt Convolutions Multicator" employing PSv4.0, originally formulated by Jay Leavitt, Ph.D. for TASC - January 2020 Traders Tips. Basically it's an all-in-one combination of three Leavitt indicators. This triplet indicator, being less than a 60 line implementation at initial release, is a heavily modified version of the original indicator using novel techniques, surpassing Leavitt's original intended design.
Utilizing the "Power of Pine", I included the maximum amount of features I could surmise in an ultra small yet powerful package. Configurations are displayed above in multiple scenarios that should be suitable for most traders.
Features List Includes:
Dark Background - Easily disabled in indicator Settings->Style for "Light" charts or with Pine commenting
AND much, much more... You have the source!
For those of you who are new to Pine Script, this script may also help you understand advanced programming techniques in Pine and how they may be utilized in a most effective manner. Most notably, the script shows how to potentially combine three indicators in one with Pine. This is commonly what my dense intricate code looks like behind the veil, and if you are wondering why there is no notes, that's because the notation is in the variable naming.
The comments section below is solely just for commenting and other remarks, ideas, compliments, etc... regarding only this indicator, not others. When available time provides itself, I will consider your inquiries, thoughts, and concepts presented below in the comments section, should you have any questions or comments regarding this indicator. When my indicators achieve more prevalent use by TV members, I may implement more ideas when they present themselves as worthy additions. As always, "Like" it if you simply just like it with a proper thumbs up, and also return to my scripts list occasionally for additional postings. Have a profitable future everyone!
Quadratic Least Squares Moving Average - Smoothing + Forecast
Introduction
Technical analysis often makes use of classical statistical procedures, one of them being regression analysis. Since fitting polynomial functions that minimize the sum of squares can be achieved with the use of the mean, variance, covariance, etc., technical analysts only needed to replace the mean in all those calculations with a moving average; we then end up with a low-lag filter called the least squares moving average (lsma).
The least squares moving average could be classified as a rolling linear regression. Although this sounds really bad, it is useful for understanding the relationship between both methods: both have the same form, that is ax + b, where a and b are coefficients of the model. However, in a simple linear regression a and b are constants, while the lsma uses variables instead.
In a simple lsma we model the relationship of the closing price (dependent variable) with a linear sequence (independent variable), therefore x = 1,2,3,4...etc. However, we can use polynomials of higher degree to model such a relationship; this is required if we want more reactivity. Therefore we can use a quadratic form, that is ax^2 + bx + c, where a, b and c are variables.
This is the quadratic least squares moving average (qlsma), a not-so-official term, but we'll stick with it because it still represents the aim of the filter quite well. In this indicator I make the calculations of the qlsma less troublesome, so one might understand how it works; note that in general the coefficients of a polynomial regression model are found using matrix calculus.
The Indicator
A qlsma, unlike the classic lsma, will fit the price better and will be more reactive. This is the advantage of using higher degrees for its calculation: we can model more complex relationships.
lsma in green, qlsma in red, with both length = 200
However, the over/undershoots are greater; I'll explain why in the next sections, but this is one of the drawbacks of using higher degrees.
The indicator allows forecasting future values; the ahead period of the forecast is determined by the forecast setting. The value for this setting should be lower than length, otherwise the forecasts can easily over/undershoot, which heavily damages the forecast. In order to get a view of how well the forecast is performing, you can check the option "Show past predicted values".
Of course, understanding the logic behind the forecast is important. In short, regression models best fit a certain curve to the data; this curve can be a line (linear regression), a parabola (quadratic regression) and so on, and the type of curve is determined by the degree of the polynomial used, here 2, which is a parabola. Let's use a linear regression model as an example:
ax + b where x is a linear sequence 1,2,3...and a/b are constants. Our goal is to find the values for a and b that minimize the sum of squares of the line with the dependent variable y, here the closing price, so our hypothesis is that :
closing price = ax + b + ε
where ε is white noise, a component that the model couldn't forecast. The forecast of the closing price 14 steps ahead would be equal to:
closing price 14 steps ahead = a(x+14) + b
Since x is a linear sequence we only need to sum it with the forecasting horizon period, the same is done here with :
a*(n+forecast)^2 + b*(n + forecast) + c
Note that the forecast proposed in the indicator is more for teaching purposes than anything else; this indicator can't possibly forecast future values, even at a meh rate.
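In the same teaching spirit, here is a hedged Python sketch of that forecast using np.polyfit rather than the indicator's variance/covariance formulation; names and conventions are mine:

```python
import numpy as np

def qlsma_forecast(close, length, forecast):
    """Fit a parabola to the last `length` closes and extrapolate `forecast` steps ahead."""
    y = np.asarray(close[-length:], float)
    x = np.arange(1, length + 1, dtype=float)
    a, b, c = np.polyfit(x, y, 2)                             # y ≈ a*x^2 + b*x + c
    n = length
    return a * (n + forecast) ** 2 + b * (n + forecast) + c   # a*(n+forecast)^2 + b*(n+forecast) + c
```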
Low-lag filters have been used to provide noise-free crosses with a slow moving average, a bad practice in my opinion due to the ability low-lag filters have to overshoot/undershoot; more interesting use cases might be to use the qlsma as input for other indicators.
On The Code
Some of you might know that I posted a "quadratic regression" indicator long ago. The original calculations came from a forum, but because the calculation was ugly as hell as well as extra inefficient (dogfood level), I had to do something about it; the name was also terribly misleading.
We can see in the code that we make heavy use of the variance and covariance, both estimated with :
VAR(x) = SMA(x^2) - SMA(x)^2
COV(x,y) = SMA(xy) - SMA(x)SMA(y)
Those elements are then combined; we can easily recognize the intercept element c, which doesn't change much from the classical lsma.
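As a sketch of how far those two estimates get you, here is the linear case in Python/pandas, where slope = COV(x,y)/VAR(x) and the intercept follows from the rolling means. The quadratic case combines the same building blocks, but the exact combination lives in the script:

```python
import pandas as pd

def lsma_from_var_cov(close: pd.Series, length: int) -> pd.Series:
    """Linear lsma built from the rolling VAR/COV estimates described in the post."""
    x = pd.Series(range(len(close)), index=close.index, dtype=float)
    sma = lambda s: s.rolling(length).mean()
    var_x = sma(x * x) - sma(x) ** 2               # VAR(x)  = SMA(x^2) - SMA(x)^2
    cov_xy = sma(x * close) - sma(x) * sma(close)  # COV(x,y) = SMA(xy) - SMA(x)*SMA(y)
    slope = cov_xy / var_x
    intercept = sma(close) - slope * sma(x)        # the intercept element
    return intercept + slope * x                   # evaluate the fit at the current bar (the lsma endpoint)
```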
As Digital Filter
The frequency response of the qlsma is similar to that of the lsma: those filters amplify certain frequencies in the passband and have ripples in the stop band. There is something interesting about those filters. First, using higher degrees allows a greater boost of the frequencies in the passband, which results in greater over/undershoots. Another funny thing is that the peak/valley of the ripples is equal to the peak or valley in the ripples of another lsma of a different degree.
The transient response of those filters, that is, the impulse response, step response, etc., is related to the degree of the polynomial used. Therefore, let's denote an lsma of degree p as lsma(p): the impulse response of lsma(p) is a polynomial of degree p, and the step response is simply a polynomial of order p+1.
This is why it was more interesting to estimate the qlsma using convolution; however, we can no longer forecast future values.
Conclusion
I proposed a more usable quadratic least squares moving average, with more options, as well as cleaner and more efficient code. The process of shrinking the original code is made easier when you know about the estimations of both variance and covariance.
I hope the proposed indicator/calculation is useful.
Thx for reading !
Regression Channel [DW]
This is an experimental study which calculates a linear regression channel over a specified period or interval using custom moving average types for its calculations.
Linear regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables.
In linear regression, the relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data.
The regression channel in this study is modeled using the least squares approach with four base average types to choose from:
-> Arnaud Legoux Moving Average (ALMA)
-> Exponential Moving Average (EMA)
-> Simple Moving Average (SMA)
-> Volume Weighted Moving Average (VWMA)
When using VWMA, if no volume is present, the calculation will automatically switch to tick volume, making it compatible with any cryptocurrency, stock, currency pair, or index you want to analyze.
There are two window types for calculation in this script as well:
-> Continuous, which generates a regression model over a fixed number of bars continuously.
-> Interval, which generates a regression model that only moves its starting point when a new interval starts. The number of bars for calculation cumulatively increases until the end of the interval.
The channel is generated by calculating standard deviation multiplied by the channel width coefficient, adding it to and subtracting it from the regression line, then dividing it into quartiles.
To observe the path of the regression, I've included a tracer line, which follows the current point of the regression line. This is also referred to as a Least Squares Moving Average (LSMA).
For added predictive capability, there is an option to extend the channel lines into the future.
A custom bar color scheme based on channel direction and price proximity to the current regression value is included.
I don't necessarily recommend using this tool as a standalone, but rather as a supplement to your analysis systems.
Regression analysis is far from an exact science. However, with the right combination of tools and strategies in place, it can greatly enhance your analysis and trading.
Forecasting - Quadratic Regression
This script is written totally thanks to Alex Grover (). Here it is implemented in conjunction with the seasonal forecast I showed in one of my previous posts. It takes the calculated QReg curve and extends its last section (Season) into the future (Forecasted periods).
Forecasting - Locally Weighted Regression (rescaled)
UPDATE: the original version works only with BTC. Here's a general version with rescaling.
Forecasting - Vanilla Locally Weighted Regression
There is not much to say - just vanilla locally weighted regression in PineScript 4.
see: medium.com
also: cs229.stanford.edu
Forecasting - Least Squares Regression
Tested on 5m TF with EURUSD. Settings should be modified appropriately for other TFs, lookbacks and securities. This indicator does not repaint.
Linear Regression Trend Channel
This is my first public release of indicator code and my PSv4.0 version of "Linear Regression Channel", as it is more commonly known. It replicates TV's built-in "Linear Regression" without the distraction of heavy red/blue fill bleeding into other indicators. We can't fill() line.new() at this time in Pine Script anyways. I entitled it Linear Regression Trend Channel simply because it seems more accurate as a proper description. I nicely packaged this to the size of an ordinary napkin within 20 lines of compact code, simplifying the math to the most efficient script I could devise that fits in your pocket. This is commonly what my dense intricate code looks like behind the veil, and if you are wondering why there are no notes, that's because the notation is in the variable naming. I excluded Pearson correlation because it doesn't seem very useful to me, and it would require additional lines of code I would rather avoid in this public release. Pearson correlation is included in my invite-only advanced version of "Enhanced Linear Regression Trend Channel", where I have taken Linear Regression Channeling to another level of fully featured novel attainability using this original source code.
Features List Includes:
"Period" adjustment
"Deviation(s)" adjustment
"Extend Method" option to extend or not extend the upper, medial, and lower channeling
Showcased in the chart below is my free to use "Enhanced Schaff Trend Cycle Indicator", having a common appeal to TV users frequently. If you do have any questions or comments regarding this indicator, I will consider your inquiries, thoughts, and ideas presented below in the comments section, when time provides it. As always, "Like" it if you simply just like it with a proper thumbs up, and also return to my scripts list occasionally for additional postings. Have a profitable future everyone!
Time Series Forecast
Introduction
Forecasting is a blurry science that deals with a lot of uncertainty. Most of the time, forecasting is made with the assumption that past values can be used to forecast a time series; the accuracy of the forecast depends on the type of time series, the pre-processing applied to it, the forecast model, and the parameters of the model.
In TradingView we don't have many forecasting models apart from the linear regression, which is definitely not adapted to forecasting financial markets; instead we mainly use it as a support/resistance indicator. So I wanted to try making a forecasting tool based on the lsma that might provide something at least interesting; I hope you find a use for it.
The Method
Remember that the regression model and the lsma are closely related; both share the same equation ax + b, but the lsma uses running parameters while a and b are constants in a linear regression. The last point of the lsma of period p is the last point of the linear regression that fits a line to the price at time p to 1; try adding a linear regression with count = 100 and an lsma of length = 100 and you will see. This is why the lsma is also called an "end point moving average".
The forecast of the linear regression is the linear extrapolation of the fitted line. However, the proposed indicator's forecast is the linear extrapolation between the value of the lsma at time length and the last value of the lsma when short term extrapolation is false; when short term extrapolation is checked, the forecast is the linear extrapolation between the lsma value prior to the last point and the last lsma value.
long term extrapolation, length = 1000
short term extrapolation, length = 1000
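A minimal Python sketch of both extrapolation modes; the exact index convention for "the value of the lsma at time length" is my assumption:

```python
import numpy as np

def lsma_extrapolation(lsma, length, steps_ahead, short_term=False):
    """Linearly extrapolate an lsma series: long term spans `length` bars, short term the last two points."""
    lsma = np.asarray(lsma, float)
    run = 1 if short_term else length
    a, b = lsma[-1 - run], lsma[-1]                   # older reference point and the last lsma value
    slope = (b - a) / run
    return b + slope * np.arange(1, steps_ahead + 1)  # forecasted values, 1..steps_ahead bars ahead
```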
How To Use
Intervals are created from the running mean absolute error between the price and the lsma. Those intervals can be interpreted as possible support and resistance levels when using long term extrapolation; make sure that the intervals have been previously tested, as this means the intervals are more significant.
The short term extrapolation is made with the assumption that the price will follow the direction of the last two lsma points; the forecast tends to become inaccurate during a trend change or when noise heavily affects the lsma.
You can test both method accuracy with the replay mode.
Comparison With The Linear Regression
Both methods share similarities, but they have different results; let's compare them.
In blue, the indicator; in red, a linear regression, both with period 200. The linear regression is always extremely conservative since it fits a line using the least squares method; on the contrary, the indicator is less conservative, which can be an advantage as well as a problem.
Conclusion
Linear models are good when what we want to forecast is approximately linear; that's not the case with market prices, and this is why other methods are used. But the use of the lsma to provide a forecast is still an interesting method that might require further studies.
Thanks for reading !