
Using full limit order book for price jump prediction

04.11.2019

In high frequency financial markets, the trading information is contained in the Limit Order Book (LOB). The main purpose of the paper is to investigate how full information about the LOB can help in predicting various events of interest to investors. Normally, a full LOB contains total volumes of orders for hundreds of prices. Using the full information runs into the curse of dimensionality which manifests itself in multicollinearity, insignificant coefficients, inflated estimate variances and high computation time. Due to these problems, order volumes for prices that are distant from ask and bid prices are usually not used in prediction procedures. For this reason we call such information a silent crowd. Here we propose a summary measure of the silent crowd and quantify its influence on trade jump prediction. We use a realistically simulated LOB as a vehicle for experiments and logistic regression as the prediction tool.


\documentclass[12pt]{article}

\usepackage{lmodern}

\usepackage{amssymb,amsmath,amsthm,natbib,graphicx,anysize,epstopdf,hyperref,dsfont,pdfsync,comment,color}

\usepackage{geometry}

\geometry{height=8.5in,width=6.5in,letterpaper}

\DeclareGraphicsRule{.tif}{png}{.png}{`convert #1 `dirname #1`/`basename #1 .tif`.png}

\hypersetup{colorlinks=false}

\newtheorem{theorem}{Theorem}

\newtheorem{acknowledgement}[theorem]{Acknowledgement}

\newtheorem{condition}{Assumption}[section]

\newtheorem{lemma}{Lemma}

\begin{document}

\pagestyle{empty}

\begin{center} \large  \sc Using  full limit order book for price jump prediction \normalsize \rm  \\[0.2in]

\begin{tabular}{c}

\multicolumn{1}{c}{ \sc Kairat Mynbaev}\\[.1in]

New School of Economics\\

Satbayev University\\

22a Satpaev str.\\

Almaty 050013, Kazakhstan\\

email: kairat\_mynbayev@yahoo.com\\

\end{tabular}\\[.2in]

\today

%\begin{tabular}{lcl}

%\multicolumn{3}{c}{ \sc Carlos Martins-Filho}\\[.1in]

%Department of Economics &  &IFPRI \\

%University of Colorado &  & 2033 K Street NW  \\

%Boulder, CO 80309-0256, USA& \&&Washington, DC 20006-1002, USA\\

%email: carlos.martins@colorado.edu& & email: c.martins-filho@cgiar.org\\

%Voice: + 1 303 492 4599 & &Voice: + 1 202 862 8144\\

%\end{tabular}\\[.1in]

%\begin{center}

%and\\[.2in]

%\end{center}

%\begin{tabular}{l}

%\multicolumn{1}{c}{ \sc Aziza Aipenova}\\[.1in]

%Department of Mechanics and Mathematics\\

%Kazakh National University\\

%Al-Farabi 71\\

%Almaty 050040, Kazakhstan\\

%email: a.aipenova@mail.ru\\

%Voice: + 7 727 303 7004\\

%\end{tabular}\\[.1in]

%October, 2014\\[.1in]

\end{center}

\noindent \bf Abstract. \rm In high-frequency financial markets, trading information is contained in the Limit Order Book (LOB). The main purpose of the paper is to investigate how full information about the LOB can help in predicting various events of interest to investors. Normally, a full LOB contains total order volumes for hundreds of prices. Using the full information runs into the curse of dimensionality, which manifests itself in multicollinearity, insignificant coefficients, inflated estimate variances and high computation time. Because of these problems, order volumes at prices that are distant from the ask and bid prices are usually not used in prediction procedures. For this reason we call such information a silent crowd. Here we propose a summary measure of the silent crowd and quantify its influence on trade jump prediction. We use a realistically simulated LOB as a vehicle for experiments and logistic regression as the prediction tool. The full Matlab code consists of 18 programs.\\[.1in]

 

\noindent \bf Keywords \rm Simulation, trade jump prediction, high frequency trading, logistic regression, limit order book. \rm \\[.1in]

 

%\noindent \bf AMS-MS Classification. \rm 62F12, 62G07, 62G20.

 

\clearpage

\setlength{\baselineskip}{24pt}

\pagestyle{plain}

\setcounter{page}{1}

\setcounter{footnote}{0}

\section{Introduction}

The advent of information technologies made possible the transition from quote-driven markets to order-driven trading platforms. On many stock exchanges, including NYSE, NASDAQ, and the London Stock Exchange, trade orders are submitted and executed electronically \citep{PS}. Outstanding orders are recorded in what is called a Limit Order Book (LOB). For a fee, clients can have access to either partial or full information contained in the LOB. High-speed communications, fast computers and computer algorithms have enabled high-frequency trading, in which orders are submitted every millisecond. Analysing the LOB and predicting possible market moves in real time is essential for participants in this market.

 

One direction of research focuses on mathematical models of the LOB \citep{Rosu,Shek,CST,Cont,HeK}. Such models provide a kind of common denominator for financial phenomena, but they typically impose restrictions that are hard to validate in practice \citep{BMP,FKK}.

 

On the other hand, machine learning methods do not impose any a priori conditions and attempt to reveal the regularities that are present in the data \citep{CS,TNY,JPR,LR}. In particular, statistical methods are used to predict quantities that can be exploited profitably. \citet{BC} present a non-parametric model for trade sign inference. \citet{ZMA} use logistic regression to predict the occurrence of price jumps. \citet{FHST} and \citet{KZ} employ support vector machines to capture the dynamics of price movements.

 

The above references use real-world data. We work with a simulated LOB. The two approaches have different focuses.

 

The main value of real-world data is that it contains traces of investors' decisions, which are influenced by the shape of the LOB, among other things. The challenge is to infer investors' decisions and to use that inference to predict future price movements successfully. This is complicated by many realities: different investors react differently to the market signals contained in the LOB, events outside the LOB influence investors' moves, and the very invention of successful prediction mechanisms affects investors' behaviour.

 

A simulated LOB should incorporate and exhibit the observed features of the real LOB. Different types of orders are posted in accordance with patterns observed in practice, but other than that they are completely random and independent, at least in our implementation. There are no built-in behavioral assumptions.

The simulated LOB is impartial, so to speak. It is better suited to revealing the relative importance of the quantities contained in the LOB than to inferring investors' motivations. A real LOB is a snapshot of what has happened, while a simulated LOB allows one to fine-tune model parameters to achieve the desired patterns.

 

In Section 2 we describe the standard features of limit order books. In Section 3 we detail the simulations. Section 4 presents the main results. Section 5 contains conclusions.

 

\section{Order types and LOB structure}

In order-driven markets investors can submit three order types: limit orders, cancel orders and market orders. The minimum allowed price increment is called a tick. For simulation purposes the tick can be taken to be 1 without loss of generality.

 

A sell limit order is an order to sell a certain number of shares at a certain price (called ask) or higher. A buy limit order is an order to buy a certain number of shares at a certain price (called bid) or lower. If there is no offsetting order at the same price, a limit order is recorded in the LOB. Limit orders are executed against offsetting incoming orders in the order they (limit orders) were recorded. Limit orders have an expiration date, unless the investor specifies that the order is good until canceled. Order expiration dates are not seen in the LOB investors have access to. For modeling purposes all limit orders are considered as orders with no expiration date.

 

An investor can cancel his/her limit order (or its remaining part) any time. In fact, most limit orders are canceled before their execution.

 

It is useful to imagine the LOB as consisting of two parts, with a vertical price axis. The upper part contains all sell orders, and the lower one contains all buy orders (more precisely, total volumes against each tick). The lowest sell price is called the best ask and the highest buy price is called the best bid. Because offsetting orders are matched, the best ask is always higher than the best bid. The midprice is defined by $\text{midprice}=(\text{best ask}+\text{best bid})/2$. The difference $\text{best ask}-\text{best bid}$ is called the spread. The prices and total volumes at the best ask and bid are called first level quotes, the prices and total volumes one tick away from the best ask and bid are called second level quotes, and so on.
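These definitions can be illustrated with a short Python sketch. It is not part of the Matlab code in the Appendix; it assumes, for illustration only, that each side of the book is stored as an array of total volumes indexed by price in ticks (index increasing with price), and the function name \texttt{best\_quotes} and the toy volumes are hypothetical.

```python
import numpy as np

def best_quotes(ask_vol, bid_vol):
    """Return (best_ask, best_bid, midprice, spread) in ticks.

    ask_vol[p] and bid_vol[p] hold the total resting volume at price p
    (assumed layout: array index = price in ticks, increasing).
    """
    ask_prices = np.flatnonzero(ask_vol)   # prices with resting sell volume
    bid_prices = np.flatnonzero(bid_vol)   # prices with resting buy volume
    if ask_prices.size == 0 or bid_prices.size == 0:
        raise ValueError("one side of the book is empty")
    best_ask = int(ask_prices.min())       # lowest sell price
    best_bid = int(bid_prices.max())       # highest buy price
    if best_ask <= best_bid:
        raise ValueError("crossed book: best ask must exceed best bid")
    return best_ask, best_bid, (best_ask + best_bid) / 2, best_ask - best_bid

# Toy book on 10 ticks: bids at prices 2..4, asks at prices 6..8.
ask_vol = np.array([0, 0, 0, 0, 0, 0, 100, 300, 200, 0])
bid_vol = np.array([0, 0, 150, 250, 100, 0, 0, 0, 0, 0])
best_ask, best_bid, midprice, spread = best_quotes(ask_vol, bid_vol)
```

In this toy book the best ask is 6 and the best bid is 4, so the midprice is 5 and the spread is 2 ticks.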

 

A market sell order is an order to sell a certain number of shares at the best available price, that is, at the best bid. Similarly, a market buy order is an order to buy a certain number of shares at the best available price, that is, at the best ask. When a market sell order arrives, the total volume at the best bid may be smaller than the market order size. In this case the market order consumes all of the volume at the best bid, the best bid moves down and the remaining part of the market order is executed against the limit orders at the new best bid. Some exchanges use a different rule: if, say, the size of a sell market order is larger than the outstanding volume at the best bid, the remaining part of the market order stays in the LOB as a sell limit order. The difference between the first case, when the market order may be executed at several prices, and the second one, when it may be partially executed and the remainder stays as a limit order at the best bid, is that in the first case the best bid moves down (and the spread increases), while in the second case it is the best ask that moves down. In the first case the downward move of the midprice is determined by the size of the market order relative to the liquidity on the bid side. In the second case this downward move depends on the spread, and the midprice right after execution of the market order will be lower than the best bid right before the execution. The midprice is more stable under the first arrangement, which we adopt in our simulations. Stability of market prices is a desirable feature.
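The first execution rule, which we adopt, can be sketched in Python as follows. The sketch is a simplified illustration rather than the Appendix function C\_PostMarketOrder: it assumes the bid side is an array of volumes indexed by price in ticks (index increasing with price), and \texttt{post\_market\_sell} is a hypothetical name.

```python
import numpy as np

def post_market_sell(bid_vol, size):
    """Execute a market sell order of `size` shares against the bids.

    The order consumes the best bid first and then walks down the book.
    Returns the updated volumes and a list of (price, shares) fills.
    Assumed layout: array index = price in ticks.
    """
    bid_vol = bid_vol.copy()
    if size > bid_vol.sum():
        raise ValueError("market order exceeds total bid-side liquidity")
    fills = []
    remaining = size
    price = int(np.flatnonzero(bid_vol).max())   # current best bid
    while remaining > 0:
        traded = min(remaining, int(bid_vol[price]))
        if traded > 0:
            fills.append((price, traded))
            bid_vol[price] -= traded
            remaining -= traded
        if remaining > 0:
            price -= 1                            # move to the next lower price
    return bid_vol, fills

bid_vol = np.array([0, 0, 150, 250, 100, 0, 0, 0, 0, 0])
new_bid_vol, fills = post_market_sell(bid_vol, 300)
new_best_bid = int(np.flatnonzero(new_bid_vol).max())
```

Here a market sell order of 300 shares exhausts the 100 shares at the best bid (price 4) and takes 200 of the 250 shares at price 3, so the best bid moves down by one tick.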

 

Market orders are executed immediately, so in the case of a real LOB one can learn about their arrival and size only from a change in the total volumes of limit orders at the best ask and bid. There can also be errors in the way the LOB is recorded. Problems of this kind do not arise with a simulated LOB. Experiments on a real stock exchange are costly and likely to disrupt its operations; when the rules governing an exchange change, large and technologically advanced players will win at the expense of small investors.

 

All the information above the ask price characterizes the supply, whereas all the information below the bid price characterizes the demand.

\section{Simulation description}

The task of modeling the LOB is complex because the impact of an order on the book depends on the state of the book. Therefore one cannot aggregate the incoming orders and post them to the book at equally spaced moments; one has to generate orders and post them immediately, one by one. This requires a lot of computation, only a small part of which can be sped up by parallel computing. We have not been able to use the CUDA technology from Nvidia because it can handle only specific types of code.

 

Application of the logit model requires measuring depths at equally spaced moments, and the number of such moments should be large enough. With short time intervals (on the order of several milliseconds) the LOB is too sparse. Increasing the length of the time intervals increases the complexity of the calculations.

 

Following the empirical pattern, the distribution of orders is defined in such a way that the range of limit order prices is very wide, $\pm50\%$ of the midprice or more. On both sides of the midprice the distribution declines as a power law up to 100 ticks from the midprice and is zero beyond that. Orders arrive independently, with exponentially distributed interarrival times.

 

Cancel order sizes are given as a fraction of the depth at the corresponding price (10\% in the code of the Appendix).

 

The Matlab code consists of 18 programs. The first character in  the program name indicates its level. The lowest-level programs start with A, next-level programs start with B and so on. The level of a program is determined by the references contained in it. For example, the program C\_AllOrdersTimesAndPrices.m may refer to levels A and B but not higher.

 

The function A\_InDistr creates the initial distribution of orders.

 

The function A\_OrderTimes generates a sequence of order placement times up to a given moment.
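A minimal Python analogue of this step is sketched below, assuming that interarrival times are independent and exponential with mean $1/\text{rate}$. The name \texttt{order\_times} is ours, and the rate 7.46 is the limit order rate set in A\_InDistr; the horizon of 10 time units is an arbitrary illustrative value.

```python
import numpy as np

def order_times(horizon, rate, rng):
    """Arrival times on (0, horizon) with exponential interarrival times.

    Interarrival times are i.i.d. exponential with mean 1/rate.
    Returns an empty array if no order arrives before `horizon`.
    """
    times = []
    t = rng.exponential(1.0 / rate)
    while t < horizon:
        times.append(t)
        t += rng.exponential(1.0 / rate)
    return np.array(times)

rng = np.random.default_rng(0)   # fixed seed for reproducibility
limit_times = order_times(horizon=10.0, rate=7.46, rng=rng)
```

The returned times are strictly increasing and all lie before the horizon; on average about $\text{rate}\times\text{horizon}$ orders are generated.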

 

The function A\_Revert reverses the order of the elements of a column vector; it makes some of the code more convenient to read.

 

A\_NormConstant computes the normalizing constant for the density of incoming limit orders, which follows the empirically observed pattern from \cite{BMP}.
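Reading the formula off the Appendix code of A\_NormConstant (the symbols $p$ and $d$ are our notation, with $d$ the signed distance in ticks from the midprice), the density of incoming limit orders is
\[
p(d)=\frac{c}{\bigl(d_1+|d|+\mathbf{1}\{d=0\}\bigr)^{1+\mu}}\quad\text{for } |d|\le 100,\qquad p(d)=0\quad\text{for } |d|>100,
\]
where the normalizing constant is
\[
c=\Bigl(\sum_{d=-100}^{100}\bigl(d_1+|d|+\mathbf{1}\{d=0\}\bigr)^{-(1+\mu)}\Bigr)^{-1},
\]
so that $\sum_{d=-100}^{100}p(d)=1$. The indicator term makes the value at $d=0$ equal to the value at $|d|=1$.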

 

B\_OrderTimesFixedPrice generates lists of limit, cancel and market order times (for all price ticks from 1 to MaxPrice).

 

B\_AskAndBid finds the best ask (the lowest ask price at which order size is not zero) and the best bid (the highest bid price at which order size is not zero).

 

The function B\_FindCum creates cumulative sums starting from the lower end of B\_T. This is the most important part of the method. The silent crowd should be summarized in such a way that prices close to the midprice have larger weights. At the same time, the weights should not be so heavy that they suppress the tail of the silent crowd.
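The cumulative sums computed by B\_FindCum can be sketched in Python as follows. This is an illustrative translation, not the Appendix code; it assumes B\_T is stored as an array with one row per price and one column per observation time, and \texttt{cum\_from\_lower\_end} is a hypothetical name.

```python
import numpy as np

def cum_from_lower_end(B_T):
    """Column-wise cumulative sums taken from the last row upward.

    Row i of the result holds the sum of rows i, i+1, ..., n-1 of B_T,
    which is what reversing, applying cumsum and reversing again yields.
    """
    return np.cumsum(B_T[::-1], axis=0)[::-1]

# Two snapshots (columns) of a 4-tick bid side.
B_T = np.array([[0, 0],
                [10, 0],
                [20, 5],
                [30, 15]])
B_cum = cum_from_lower_end(B_T)
```

Each entry of the result is the total volume at that row and at all rows below it, so the first row of every column holds the total depth of that snapshot.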

 

B\_LODensity generates sizes of limit orders in the range $(\text{midprice}-\text{dist},\ \text{midprice}+\text{dist})$, currently under the condition $\text{MaxPrice}=4\cdot\text{dist}$.

 

C\_AllOrdersTimesAndPrices puts into one matrix

 

\begin{itemize}

  \item all order times from lists of limit, cancel and market orders, unsorted (first column),

  \item order types (1 for Limit, 2 for Cancel, 3 for Market) (second column),

  \item and corresponding prices, numbered 1 through MaxPrice (third column).

\end{itemize}

This is necessary to create a line of orders that later will be posted to the LOB.
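The construction of this queue, together with the chronological sorting performed later in the main script, can be sketched in Python. The sketch is an illustration under assumed inputs (one list of arrival times per price and order type); \texttt{all\_orders} is a hypothetical name.

```python
import numpy as np

LIMIT, CANCEL, MARKET = 1, 2, 3   # order-type codes used in the text

def all_orders(l_list, c_list, m_list):
    """Stack (time, type, price) rows for every order and sort by time.

    l_list[p-1], c_list[p-1], m_list[p-1] are sequences of arrival times
    of limit, cancel and market orders at price p = 1, ..., MaxPrice.
    """
    rows = []
    for code, lists in ((LIMIT, l_list), (CANCEL, c_list), (MARKET, m_list)):
        for price, times in enumerate(lists, start=1):
            rows.extend((t, code, price) for t in times)
    matrix = np.array(rows, dtype=float).reshape(-1, 3)
    order = np.argsort(matrix[:, 0], kind="stable")   # chronological queue
    return matrix[order]

# Toy input with MaxPrice = 2.
l_list = [[0.4, 1.3], [0.9]]
c_list = [[], [0.2]]
m_list = [[1.1], []]
queue = all_orders(l_list, c_list, m_list)
```

The rows of \texttt{queue} are then processed one by one, each being dispatched to the posting routine that matches its type code.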

 

Next there are three functions that post the three types of orders: C\_PostCancelOrder, C\_PostLimitOrder and C\_PostMarketOrder.

 

E\_Inference\_A\_B collects statistical characteristics of the LOB. It is important that after about 50 orders the simulated LOB stabilizes and its two-humped shape corresponds to what is observed in practice.

 

Next we need to see how informative the prices close to the midprice are, compared with the silent crowd.

 

F\_band\_A\_B finds bands of order sizes of width band (band up from ask in A\_T and band down from bid in B\_T).

 

F\_weight\_A\_B prepares weights for averaging order sizes.

 

Finally, a comparison is made between the contribution of the prices that are close to the midprice (in the band) and the contribution of the silent crowd.
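The comparison rests on splitting each side of the book into a band near the best quote and a single index summarizing everything beyond it. The Python sketch below illustrates the idea for the ask side. The weights produced by F\_weight\_A\_B are not reproduced here: the power-law weight, the parameter \texttt{power} and the name \texttt{band\_and\_crowd} are assumptions made only for illustration.

```python
import numpy as np

def band_and_crowd(ask_vol, best_ask, band, power=0.5):
    """Split the ask side into a near-quote band and a silent-crowd index.

    ask_vol[p] is the resting sell volume at price p (index = price).
    Returns (band_sizes, crowd_index): the first `band` volumes starting
    at the best ask, and a weighted sum of all remaining volumes.
    The weight distance**(-power) is an illustrative assumption only.
    """
    band_sizes = ask_vol[best_ask:best_ask + band]
    tail = ask_vol[best_ask + band:]
    # Level of each tail price, counting the best ask as level 1.
    distance = np.arange(band + 1, band + 1 + tail.size, dtype=float)
    weights = distance ** (-power)    # larger weights closer to the quote
    return band_sizes, float(np.sum(weights * tail))

ask_vol = np.array([0, 0, 0, 100, 300, 200, 400, 0, 900])
band_sizes, crowd_index = band_and_crowd(ask_vol, best_ask=3, band=2)
```

With a slowly decaying weight such as this one, the large volume at the far end of the toy book still contributes visibly to the index, which is the property asked of the weights in the description of B\_FindCum.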

 

\section{Preliminary results}

The density of incoming limit orders is generated according to what is observed in practice. At 200 ticks up and down from the initial midprice the density tapers off, and we set it to zero; see Figure \ref{Fig0}.

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{LODensity.eps}

   \caption{Density of limit orders}

   \label{Fig0}

\end{figure}

 

As mentioned above, after about 50 orders the simulated LOB stabilizes. The midprice falls from the value defined by the initial distribution and afterwards is fairly stable (Figure \ref{Fig1}).

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{midprice.eps}

   \caption{Stabilization of the midprice}

   \label{Fig1}

\end{figure}

 

The standard deviation of the midprice also stabilizes (Figure \ref{Fig2}).

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{midprice_stdev.eps}

   \caption{Stabilization of the standard deviation of the midprice}

   \label{Fig2}

\end{figure}

 

The two-humped shape of the simulated LOB corresponds to what is observed in practice; see Figure \ref{Fig3}. This is a sign that relative order sizes have been chosen correctly (orders do not accumulate to infinity and are not consumed entirely by incoming buy orders).

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{2hump.eps}

   \caption{Two-humped distribution of order sizes}

   \label{Fig3}

\end{figure}

 

Another sign that the LOB is being simulated correctly is that the order sizes in the LOB behave quite irregularly. Figure \ref{Fig4} shows the behavior of the first five ask sizes.

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{asks.eps}

   \caption{Ask sizes at the first 5 prices}

   \label{Fig4}

\end{figure}

 

Figure \ref{Fig5} shows the weight functions that are used for constructing the index of the silent crowd.

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{weight.eps}

   \caption{Weight functions for the index of the silent crowd}

   \label{Fig5}

\end{figure}

 

We use the logit model to predict the price jump. This is done with two sets of predictors: one includes only prices close to the midprice and the other additionally includes the index of the silent crowd.

\begin{figure}[htbp]%

   \centering

   \includegraphics[scale=0.4]{improvement_with_silent_crowd.eps}

   \caption{R squared for two sets of predictors}

   \label{Fig6}

\end{figure}

From Figure \ref{Fig6} it is clear that the silent crowd significantly improves prediction if the number of prices included is low (less than or equal to five). Then its contribution falls and becomes negligible after the number of prices included exceeds eight.
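The comparison of the two predictor sets can be mimicked on synthetic data with the Python sketch below. Everything in it is an assumption made for illustration: the data are random draws rather than the simulated LOB, the logit model is fitted by Newton's method, and fit is measured by McFadden's pseudo-$R^2$, which need not be the $R^2$ reported in Figure \ref{Fig6}.

```python
import numpy as np

def fit_logit(X, y, n_iter=50):
    """Maximum-likelihood logit fit by Newton's method (intercept added).

    Returns (coefficients, log-likelihood).
    """
    Z = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (y - p)                          # score vector
        hess = Z.T @ (Z * (p * (1 - p))[:, None])     # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    p = 1.0 / (1.0 + np.exp(-Z @ beta))
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return beta, loglik

def mcfadden_r2(X, y):
    """McFadden pseudo-R^2: 1 - loglik(model) / loglik(intercept only)."""
    _, ll_model = fit_logit(X, y)
    p0 = y.mean()
    ll_null = len(y) * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))
    return 1.0 - ll_model / ll_null

# Synthetic data (NOT the simulated LOB): three near-quote depths and one
# silent-crowd index, with hypothetical effects on the jump indicator.
rng = np.random.default_rng(1)
n = 2000
near = rng.normal(size=(n, 3))
crowd = rng.normal(size=(n, 1))
logit = 0.8 * near[:, 0] - 0.5 * near[:, 1] + 1.0 * crowd[:, 0]
jump = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

r2_near = mcfadden_r2(near, jump)                       # band only
r2_full = mcfadden_r2(np.hstack([near, crowd]), jump)   # band + silent crowd
```

Because the two models are nested, the pseudo-$R^2$ of the larger one cannot be lower at the maximum likelihood estimates; how large the gain is depends on how much information the extra predictor carries beyond the prices already included.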

\clearpage \setlength{\baselineskip}{12pt}

\begin{thebibliography}{9}

\bibitem[Parlour and Seppi(2008)]{PS} C. Parlour and D. J. Seppi. Limit Order Market: A Survey. Elsevier North-Holland, 2008.

\bibitem[Cont(2011)]{Cont} R. Cont. Statistical modeling of high-frequency financial data. Signal Processing Magazine, IEEE, 28(5):16--25, August 2011.

\bibitem[Cont et al.(2010)]{CST}R. Cont, S. Stoikov, and R. Talreja. A stochastic model for order book dynamics. Operations Research, 58(3):549--563, May 2010.

\bibitem[Huang and Kercheval(2012)]{HeK}H. Huang and A. N. Kercheval. A generalized birth-death stochastic model for high-frequency order book dynamics. Quantitative Finance, 12(4):547--557, April 2012.

\bibitem[Rosu(2009)]{Rosu}I. Rosu. A dynamic model of the limit order book. Review of Financial Studies, 22:4601--4641, June 2009.

\bibitem[Shek(2010)]{Shek}H. H. S. Shek. Modeling high frequency market order dynamics using self-excited point process, 2010.

\bibitem[Bouchaud et al.(2002)]{BMP} J.-P. Bouchaud, M. Mezard, and M. Potters. Statistical properties of stock order books: Empirical results and models. Quantitative Finance, 2(4):251--256, 2002.

\bibitem[Foucault et al.(2005)]{FKK}T. Foucault, O. Kadan, and E. Kandel. Limit order book as a market for liquidity. Review of Financial Studies, 18(4):1171--1217, August 2005.

\bibitem[Jondeau et al.(2005)]{JPR}E. Jondeau, A. Perilla, and G. Rockinger. Optimal Liquidation Strategies in Illiquid Markets, volume 553. Springer Berlin Heidelberg, 2005.

\bibitem[Linnainmaa and Rosu(2008)]{LR}J. T. Linnainmaa and I. Rosu. Weather and time series determinants of liquidity in a limit order market, 2008.

\bibitem[Crammer and Singer(2001)]{CS}K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2:265--292, March 2001.

\bibitem[Tino et al.(2005)]{TNY}P. Tino, N. Nikolaev, and X. Yao. Volatility forecasting with sparse bayesian kernel models. In 4th International Conference on Computational Intelligence in Economics and Finance, 1052--1058, 2005.

\bibitem[Blazejewski and Coggins(2005)]{BC}A. Blazejewski and R. Coggins. A local non-parametric model for trade sign inference. Physica A: Statistical Mechanics and its Applications, 348(1):481--495, 2005.

\bibitem[Zheng et al.(2013)]{ZMA}B. Zheng, E. Moulines, and F. Abergel. Price jump prediction in a limit order book. Journal of Mathematical Finance, 3:242--255, 2013.

\bibitem[Fletcher et al.(2010)]{FHST}T. Fletcher, Z. Hussain, and J. Shawe-Taylor. Multiple kernel learning on the limit order book. Quantitative Finance, 2010.

\bibitem[Kercheval and Zhang(2013)]{KZ}A. N. Kercheval and Y. Zhang. Modeling high-frequency limit order book dynamics with support vector machines, 2013.

\end{thebibliography}

 

 

\section{Appendix. Matlab code}

\ttfamily

\hrule

 

function [A,B,MRate,LRate,Theta]=A\_InDistr(MaxPrice)\% MaxLOrderSize

 

\% This function creates the vectors of asks A and bids B.

 

\% In A nonzero orders are at the top, in B at the bottom

 

\% A and B have dimension MaxPrice which is the number of ticks and must be even;

 

\% half of quotes are asks and the other half are bids.

 

\% The function also sets the rates of order arrivals MRate,LRate,Theta

 

\% Standard order size is assigned here (but not remembered)

 

if round(MaxPrice/2)\textasciitilde=MaxPrice/2

 

    error('MaxPrice must be even')

 

end

 

A=zeros(MaxPrice,1);

 

B=zeros(MaxPrice,1);

 

StOrderSize=100;

 

A(1:MaxPrice/2)=StOrderSize*ones(MaxPrice/2,1);

 

A(MaxPrice/2+1:MaxPrice)=zeros(MaxPrice/2,1);

 

B(1:MaxPrice/2)=zeros(MaxPrice/2,1);

 

B(MaxPrice/2+1:MaxPrice)=StOrderSize*ones(MaxPrice/2,1);

 

\% Market and limit order rates are from Huang and Kercheval

 

MRate=3.16;\% Market order rates are the same for buy and sell

 

LRate=7.46;\% Limit orders rate

 

Theta=0.71;\% Theta is cancellation rate

 

 

 

\% LRate(1,1), LRate(2,1), ... are limit order rates for order arrivals at

 

\% distance 1,2,... from the opposite best quote of orders of size 1

 

\% LRate(:,1)=7.46*ones(MaxPrice,1);

 

\% LRate(1,2), LRate(2,2), ... are limit order rates for order arrivals at

 

\% distance 1,2,... from the opposite best quote of orders of size 2

 

\% LRate(:,2)=0.8*ones(MaxPrice,1);

 

end

 

 

\hrule

 

function y=A\_OrderTimes(time,rate)

 

\% The function generates a sequence of order placement times up to given time

 

\% (order type can be limit, cancel or market)

 

\% time is any positive moment

 

\% rate is the rate of exponential distribution rate*exp(-rate*time)

 

AccumTime=0;

 

NOrders=1;

 

y=zeros(0,1);\% Empty column: returned as is if no order arrives before time

 

while AccumTime\textless time

 

    AccumTime=AccumTime+random(makedist('Exponential','mu',1/rate));

 

    if AccumTime\textless time

 

        OrderTime=AccumTime;

 

        if NOrders==1

 

            y=OrderTime;

 

        else

 

            y=cat(1,y,OrderTime);

 

        end

 

    end

 

    NOrders=NOrders+1;

 

end

 

end

 

\hrule

 

function C=A\_Revert(A)

 

\% This turns the column of asks A upside down to see if really asks are

 

\% higher than bids

 

dim=size(A,1);

 

B=zeros(dim,1);

 

for j=1:dim

 

    B(j)=A(dim+1-j);

 

end

 

C=B;

 

end

 

\hrule

 

function c=A\_NormConstant(d1,mu)

 

\% This function finds the normalizing constant for the density of incoming

 

\% limit orders; the density is assumed nonzero for distances \textless =100; after

 

\% 100 the density is zero (in Bouchaud et al it steeply declines after 100)

 

\% 1/d1 is the value at zero; mu determines the power (both from Bouchaud)

 

d=transpose(-100:100);\% d is the distance from zero.

 

prob=1./(d1+abs(d)+(1-abs(sign(d)))).\textasciicircum(1+mu);

 

c=1/sum(prob);

 

end

 

\hrule

 

function [LList,CList,MList]=B\_OrderTimesFixedPrice(time,MaxPrice,LRate,Theta,MRate)

 

\% This function generates lists of limit, cancel and

 

\% market order times (for all price ticks from 1 to MaxPrice).

 

\% Order sizes must be determined later

 

\% The results are put into lists: Limit, Cancel, Market

 

\% time is any positive time

 

\% MRate is the rate mu of arrivals of market orders (same for sell and buy)

 

\% LRate is the rate for limit orders (better make a matrix of rates), same for sell and buy

 

\% Theta is the cancellation rate

 

\% Sometimes market orders cannot be filled because of low depth. Then

 

\% the remainder should stay as a limit order (make an option ???)

 

 

 

LList=cell(MaxPrice,1);

 

CList=cell(MaxPrice,1);

 

MList=cell(MaxPrice,1);

 

parfor i=1:MaxPrice \% i means price

 

LList(i)={transpose(A\_OrderTimes(time,LRate))};

 

CList(i)={transpose(A\_OrderTimes(time,Theta))};

 

MList(i)={transpose(A\_OrderTimes(time,MRate))};

 

end

 

end

\hrule

 

function [ask,bid]=B\_AskAndBid(A,B)

 

\% This function finds the best ask (the lowest ask price at which

 

\% order size is not zero) and the best bid (the highest bid price at which

 

\% order size is not zero);

 

\% A,B are vectors of volumes of asks and bids at different prices

 

\% In A nonzero asks are at the top (A(1), A(2),...)

 

\% In B nonzero bids are at the bottom (B(MaxPrice), B(MaxPrice-1),...)

 

 

 

B=cumsum(B);\% This makes all components positive starting from the first positive

 

C=(B\textgreater 0);

 

bid=size(B,1)-sum(C)+1;

 

 

 

A=cumsum(A\_Revert(A));\% This upturns A and makes positive all components

 

\% starting from the first positive

 

C=(A\textgreater 0);

 

ask=sum(C);

 

if bid\textless =ask

 

    error('Best bid is higher than best ask');

 

end

 

end

\hrule

 

function B\_cum=B\_FindCum(B\_T)

 

\% This creates B\_cum from B\_T (cumulative sums starting from the lower end

 

\% of B\_T)

 

T=size(B\_T,2);

 

MaxPrice=size(B\_T,1);

 

 

 

B\_cum=zeros(MaxPrice,T);

 

for t=1:T

 

B\_cum(:,t)=A\_Revert(cumsum(A\_Revert(B\_T(:,t))));

 

end

end

\hrule

 

function LODensity=B\_LODensity(d1,mu,c,MaxPrice)

 

\% This function generates sizes of limit orders in the range

 

\% (midprice-dist,midprice+dist), currently under condition MaxPrice=4*dist

 

\% The density of these orders is from Bouchaud et al.

 

\% d1 and mu are parameters that define a power function

 

\% c is a normalizing constant, calculated by A\_NormConstant(d1,mu)

 

 

 

\% If the distance of the order from the midprice is \textgreater dist, then the order

 

\% is zero. For distances \textless =100 the number of orders is random, according to

 

\% the Bouchaud density (can be zero)

 

 

 

spread=201;\% Half of this from midprice determines the distance after which the order is zero

 

dist=(spread-1)/2;\% dist is the distance from midprice

 

if round(MaxPrice/2)\textasciitilde=MaxPrice/2

 

    error('MaxPrice must be even')

 

end

 

if MaxPrice\textasciitilde=4*dist

 

    error('MaxPrice must be 4 times dist')

 

end

 

d=cumsum(ones(spread,1),1);\% Natural numbers from 1 to spread

 

z=cumsum(c*1./(d1+abs(d-101)+(1-abs(sign(d-101)))).\textasciicircum(1+mu));

 

\% This is the cumulative sum of the density sought

 

u=rand(10*MaxPrice,1);

 

LODensity=zeros(MaxPrice,1);

 

LLimit=MaxPrice/2-dist;

 

ULimit=MaxPrice/2+dist;

 

for j=LLimit:ULimit

 

    if j==LLimit

 

        LODensity(j)=sum((u\textless z(j-dist+1)));

 

    else

 

        LODensity(j)=sum(and(z(j-dist)\textless =u,u\textless z(j-dist+1)));

 

    end

 

end

 

end

\hrule

 


 

function Matrix=C\_AllOrdersTimesAndPrices(LList,CList,MList,MaxPrice)

 

\% This function puts into one matrix

 

\% (A) all order times from lists of limit, cancel and market orders, unsorted (first column),

 

\% (B) order types (1 for Limit, 2 for Cancel, 3 for Market) (second column),

 

\% (C) and corresponding prices, numbered 1 through MaxPrice (third column)

 

parfor j=1:MaxPrice

 

    One(j)=size(LList{j},2);

 

    Two(j)=One(j)+size(CList{j},2);

 

    Three(j)=Two(j)+size(MList{j},2);

 

\% Vector of numbers of all orders for all prices

 

end

 

Segment=zeros(MaxPrice,2);\% (Segment(j,1),Segment(j,2)) is the segment for price j

 

Segment(1,1)=1;

 

Segment(1,2)=Three(1);

 

for j=2:MaxPrice

 

        Segment(j,1)=Segment(j-1,2)+1;

 

        Segment(j,2)=Segment(j-1,2)+Three(j);

 

end

 

Matrix=zeros(Segment(MaxPrice,2),3);\% Preallocate one row per order (end of the last segment)

 

for j=1:MaxPrice

 

    \% Putting all info for one price into one matrix

 

    Piece=zeros(Three(j),3);

 

    \% Putting order times in the first column

 

    Piece(1:One(j),1)=LList{j};

 

    Piece(One(j)+1:Two(j),1)=CList{j};

 

    Piece(Two(j)+1:Three(j),1)=MList{j};

 

    \% Putting order types in the second column

 

    Piece(1:One(j),2)=1;

 

    Piece(One(j)+1:Two(j),2)=2;

 

    Piece(Two(j)+1:Three(j),2)=3;

 

    \% Putting price in the third column

 

    Piece(1:Three(j),3)=j;

 

    \% Putting all pieces together

 

    Matrix(Segment(j,1):Segment(j,2),1:3)=Piece;

 

end

 

end

\hrule

 

function [A,B]=C\_PostCancelOrder(A,B,ExecutePrice)

 

\% A single cancel order is posted to LOB at ExecutePrice, which reduces the

 

\% depth by 10\%

 

[ask,bid]=B\_AskAndBid(A,B);\% Find the best ask and bid

 

 

 

if ExecutePrice\textgreater =bid

 

    B(ExecutePrice)=floor(0.9*B(ExecutePrice));

 

elseif ExecutePrice\textless =ask

 

    A(ExecutePrice)=floor(0.9*A(ExecutePrice));

 

else

 

    \% Cancel order at midprice is impossible

 

end

 

end

\hrule

 

function [A,B]=C\_PostLimitOrder(A,B,ExecPrice,LODensity)

 

\% Limit order is posted at ExecPrice in accordance with LODensity

 

[ask,bid]=B\_AskAndBid(A,B);\% Find the best ask and bid

 

midprice=(ask+bid)/2;

 

\% midprice may move up and down; the middle of LODensity moves

 

\% correspondingly

 

if or(ExecPrice\textless midprice-98,98+midprice\textless ExecPrice)

 

    \% Do nothing because LODensity is zero in this range

 

elseif ExecPrice\textgreater midprice

 

    B(ExecPrice)=B(ExecPrice)+LODensity(ExecPrice-round(midprice)+200)*10;\% Order sizes are multiples of 10

elseif ExecPrice\textless midprice

    A(ExecPrice)=A(ExecPrice)+LODensity(ExecPrice-round(midprice)+200)*10;\% Order sizes are multiples of 10

 

elseif ExecPrice==midprice

 

    u1=(rand\textless 0.5);

 

    if u1==1

 

\% In case of equality ExecPrice==midprice the order is executed as a bid

 

\% half of the time and as an ask the other half

 

        B(ExecPrice)=B(ExecPrice)+LODensity(200)*10;

 

    else

 

        A(ExecPrice)=A(ExecPrice)+LODensity(200)*10;

 

    end

 

end

 

end

\hrule

 

function [A,B]=C\_PostMarketOrder(A,B,ExecutePrice,MSize)

 

\% The function posts a market order of given size at given price to LOB

 

\% Find out about sizes of market orders ???

 

 

 

[ask,bid]=B\_AskAndBid(A,B);\% Find the best ask and bid

 

 

 

remainder=MSize;

 

if ExecutePrice==ask

 

    \% Effect of market buy order

 

    if remainder\textgreater =sum(A)\% Equality would empty the ask side and run past the first index

 

        error('The market buy order consumes all sell orders')

 

    end

 

    while remainder\textgreater =A(ask)

 

        remainder=remainder-A(ask);

 

        A(ask)=0;

 

        ask=ask-1;

 

    end

 

    A(ask)=A(ask)-remainder;

 

elseif ExecutePrice==bid

 

    \% Effect of market sell order

 

    if remainder\textgreater =sum(B)\% An exact match would also empty the bid side and run the loop below past the last price

 

        error('The market sell order consumes all buy orders')

 

    end

 

    while remainder\textgreater =B(bid)

 

        remainder=remainder-B(bid);

 

        B(bid)=0;

 

        bid=bid+1;

 

    end

 

    B(bid)=B(bid)-remainder;

 

end

 

end
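How a market buy order walks through the ask side can be traced by hand on a small book. The sketch below repeats the loop from C\_PostMarketOrder on made-up data; the vector A, the starting ask and the order size are assumptions, and the guard is strict so that an order which would exactly empty the side is also rejected.

```matlab
% Toy ask side (assumed values): index = price, value = resting volume.
% As in the listing, deeper ask levels have lower indices.
A = [0 300 200 100 0 0];
ask = 4;             % best ask of the toy book
remainder = 250;     % hypothetical market buy size

assert(remainder < sum(A), 'The market buy order consumes all sell orders');

% Consume whole levels while the order is at least as large as the level.
while remainder >= A(ask)
    remainder = remainder - A(ask);
    A(ask) = 0;
    ask = ask - 1;
end
% Partially fill the level where the order runs out.
A(ask) = A(ask) - remainder;

disp(A)
disp(ask)

assert(isequal(A, [0 300 50 0 0 0]));
assert(ask == 3);
```

The order of 250 removes the 100 units at price 4 and 150 of the 200 units at price 3, so the best ask moves from 4 to 3.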

 

 

 

ticID=tic;

 

\% MSize=2000;\% Have to determine MSize or market order frequency

 

\% d1=100;\% Distance from midprice up to which LODensity is defined

 

\% mu=0.6;\% Empirical parameter in LODensity

 

\% c=A\_NormConstant(d1,mu);\% Normalization constant in LODensity

 

T=1000;

 

A\_T=zeros(MaxPrice,T);

 

B\_T=zeros(MaxPrice,T);

 

A\_T(:,1)=A;

 

B\_T(:,1)=B;

 

for t=2:T

 

[LList,CList,MList]=B\_OrderTimesFixedPrice(time,MaxPrice,LRate,Theta,MRate);

 

Matrix=C\_AllOrdersTimesAndPrices(LList,CList,MList,MaxPrice);

 

[~,QIndex]=sort(Matrix(:,1));

 

NumberOfOrders=size(QIndex,1);

 

ATemp=A\_T(:,t-1);

 

BTemp=B\_T(:,t-1);

 

LODensity=B\_LODensity(d1,mu,c,MaxPrice);

 

    for j=1:NumberOfOrders

 

    OrderPrice=Matrix(QIndex(j),3);\% Price to which this order corresponds

 

    OrderType=Matrix(QIndex(j),2);\% Order type

 

    \% Type of order (1 for Limit, 2 for Cancel, 3 for Market)

 

        if OrderType==1\% Limit order is posted in agreement with LODensity

 

            [ATemp,BTemp]=C\_PostLimitOrder(ATemp,BTemp,OrderPrice,LODensity);

 

        elseif OrderType==2\% Cancel order

 

            [ATemp,BTemp]=C\_PostCancelOrder(ATemp,BTemp,OrderPrice);

 

        elseif OrderType==3\% Market order

 

            [ATemp,BTemp]=C\_PostMarketOrder(ATemp,BTemp,OrderPrice,MSize);

 

        end

 

    end

 

A\_T(:,t)=ATemp;

 

B\_T(:,t)=BTemp;

 

end

 

beep

 

elapsed=toc(ticID);

\hrule

 

function [A\_B\_Mean,A\_mean,B\_mean,A\_B\_StD,A\_B\_Ask\_Bid,A\_B\_Aver,A\_cum,B\_cum]=E\_Inference\_A\_B(A\_T,B\_T)

APlusB=A\_T+B\_T;

T=size(APlusB,2);

MaxPrice=size(APlusB,1);

A\_B\_Mean=mean(APlusB);

A\_B\_StD=std(APlusB);

StandardAB=zeros(MaxPrice,T);

A\_B\_Ask\_Bid=zeros(T,3);

    for t=1:T

    StandardAB(:,t)=(APlusB(:,t)-ones(MaxPrice,1)*A\_B\_Mean(t))/A\_B\_StD(t);

    [A\_B\_Ask\_Bid(t,1),A\_B\_Ask\_Bid(t,2)]=B\_AskAndBid(A\_T(:,t),B\_T(:,t));

    end

A\_B\_Ask\_Bid(:,3)=(A\_B\_Ask\_Bid(:,1)+A\_B\_Ask\_Bid(:,2))/2;

A\_B\_Aver=mean(StandardAB,2);

A\_mean=mean(A\_T);

B\_mean=mean(B\_T);

A\_cum=cumsum(A\_T);

B\_cum=B\_FindCum(B\_T);

figure

plot(A\_B\_Ask\_Bid,'DisplayName','A\_B\_Ask\_Bid')

title('Ask, bid and midprice')

figure

plot(A\_B\_Aver)

title('Average of standardized order distributions')

figure

plot(A\_B\_Mean)

title('Evolution of means')

figure

plot(A\_B\_StD)

title('Evolution of standard deviations')

 

end
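The column-wise standardization inside the loop can be checked against a vectorized form. The following is a minimal sketch on made-up data; the names colMean, colStd, loopZ and vecZ are illustrative, and the vectorized line assumes implicit expansion, which is available in MATLAB R2016b or later.

```matlab
% Toy data (assumed): 3 price levels (rows) by 2 snapshots (columns).
APlusB = [1 2; 3 4; 5 12];
[MaxPrice, T] = size(APlusB);
colMean = mean(APlusB);    % 1-by-T vector of column means
colStd  = std(APlusB);     % 1-by-T vector of column standard deviations

% Loop form, as in the listing.
loopZ = zeros(MaxPrice, T);
for t = 1:T
    loopZ(:,t) = (APlusB(:,t) - ones(MaxPrice,1)*colMean(t))/colStd(t);
end

% Vectorized form; relies on implicit expansion.
vecZ = (APlusB - colMean)./colStd;

assert(max(abs(loopZ(:) - vecZ(:))) < 1e-12);
assert(all(abs(mean(vecZ)) < 1e-12));       % each column has zero mean
assert(all(abs(std(vecZ) - 1) < 1e-12));    % and unit standard deviation
```

Both forms give the same matrix; the loop is kept in the listing because it also collects ask and bid for each snapshot.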

\hrule

 

function [A\_band,B\_band]=F\_band\_A\_B(A\_B\_Ask\_Bid,A\_T,B\_T,band)

\% This function finds bands of order sizes of width band (band up from ask

\% in A\_T and band down from bid in B\_T)

\% Extracting ask and bid

ask=A\_B\_Ask\_Bid(:,1);

bid=A\_B\_Ask\_Bid(:,2);

T=size(A\_T,2);

\% band is the number of steps up (in A\_T) or down (in B\_T) from ask (bid, resp)

A\_band=zeros(band,T);

B\_band=zeros(band,T);

for t=1:T

\% Preparing the quotes from ask-band+1 to ask

A\_band(1:band,t)=A\_Revert(A\_T(ask(t)-band+1:ask(t),t));

\% Preparing the quotes from bid to bid+band-1

B\_band(1:band,t)=B\_T(bid(t):bid(t)+band-1,t);

end

\% Preparing regressors

A\_band=transpose(A\_band);

B\_band=transpose(B\_band);

end

\hrule

 

 

 

\% Extracting ask, bid and mid

 

ask=A\_B\_Ask\_Bid(:,1);

bid=A\_B\_Ask\_Bid(:,2);

mid=A\_B\_Ask\_Bid(:,3);

 

\% move\_up (move\_down) shows when the midprice moved up (down, resp)

 

mid\_shift=mid(1:size(mid,1)-1);

 

mid\_cut=mid(2:size(mid,1));

 

move\_up=cat(1,0,(mid\_cut\textgreater mid\_shift));

 

move\_down=cat(1,0,(mid\_cut\textless mid\_shift));
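The construction of the jump indicators can be verified on a short midprice path. The sketch below uses an assumed five-point path and the same lines as the listing; the first entry of each indicator is 0 because the first observation has no predecessor.

```matlab
% Assumed midprice path (column vector).
mid = [100; 100.5; 100.5; 99.5; 101];

mid_shift = mid(1:size(mid,1)-1);   % values at times 1..T-1
mid_cut   = mid(2:size(mid,1));     % values at times 2..T

move_up   = cat(1, 0, (mid_cut > mid_shift));
move_down = cat(1, 0, (mid_cut < mid_shift));

disp([mid move_up move_down])

assert(isequal(move_up,   [0; 1; 0; 0; 1]));
assert(isequal(move_down, [0; 0; 0; 1; 0]));
% The same indicators follow from first differences.
assert(isequal(move_up,   [0; diff(mid) > 0]));
assert(isequal(move_down, [0; diff(mid) < 0]));
```

An unchanged midprice (the third observation here) gives 0 in both indicators, so move\_up and move\_down are not complements of each other.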

 

 

 

R2Adj=zeros(11,1);\% Eleven models are fitted below

 

mdl\_wA=fitglm(weightA,move\_up,'Distribution','binomial');

 

R2Adj(1)=mdl\_wA.Rsquared.Adjusted;

 

 

 

mdl\_wB=fitglm(weightB,move\_up,'Distribution','binomial');

 

R2Adj(2)=mdl\_wB.Rsquared.Adjusted;\% B band affects move\_up stronger than A band

 

 

 

mdl\_bA=fitglm(A\_band,move\_up,'Distribution','binomial');

 

R2Adj(3)=mdl\_bA.Rsquared.Adjusted;\% This is stronger

 

 

 

mdl\_bB=fitglm(B\_band,move\_up,'Distribution','binomial');

 

R2Adj(4)=mdl\_bB.Rsquared.Adjusted;

 

 

 

mdl\_wA\_bB=fitglm(cat(2,weightA,B\_band),move\_up,'Distribution','binomial');

R2Adj(5)=mdl\_wA\_bB.Rsquared.Adjusted;

mdl\_bA\_wB=fitglm(cat(2,A\_band,weightB),move\_up,'Distribution','binomial');

R2Adj(6)=mdl\_bA\_wB.Rsquared.Adjusted;\% This combination is stronger

\% move\_up is regressed on the average indicator and the band for A\_T

mdl\_wA\_bA=fitglm(cat(2,weightA,A\_band),move\_up,'Distribution','binomial');

R2Adj(7)=mdl\_wA\_bA.Rsquared.Adjusted;

mdl\_wB\_bB=fitglm(cat(2,weightB,B\_band),move\_up,'Distribution','binomial');

R2Adj(8)=mdl\_wB\_bB.Rsquared.Adjusted;

mdl\_wA\_wB=fitglm(cat(2,weightA,weightB),move\_up,'Distribution','binomial');

R2Adj(9)=mdl\_wA\_wB.Rsquared.Adjusted;

mdl\_wA\_wB\_bA\_bB=fitglm(cat(2,weightA,weightB,A\_band,B\_band),move\_up,'Distribution','binomial');

R2Adj(10)=mdl\_wA\_wB\_bA\_bB.Rsquared.Adjusted;\% This is slightly better than the next model

mdl\_bA\_bB=fitglm(cat(2,A\_band,B\_band),move\_up,'Distribution','binomial');

R2Adj(11)=mdl\_bA\_bB.Rsquared.Adjusted;

R\_wA\_wB=corrcoef(cat(2,weightA,weightB));

R\_wA\_bB=corrcoef(cat(2,weightA,B\_band));

R\_bA\_bB=corrcoef(cat(2,A\_band,B\_band));

\hrule

 

function [weightA,weightB]=F\_weight\_A\_B(A\_B\_Ask\_Bid,A\_cum,B\_cum,band,MSize)

\% This function prepares weights for averaging order sizes; checked in Excel

\% band is the number of steps up (in A\_T) or down (in B\_T) from ask (bid, resp)

\% Extracting ask, bid, MaxPrice and T

ask=A\_B\_Ask\_Bid(:,1);

bid=A\_B\_Ask\_Bid(:,2);

MaxPrice=size(A\_cum,1);

T=size(A\_cum,2);

weightA=zeros(T,1);

weightB=zeros(T,1);

CumForB=B\_FindCum(ones(MaxPrice,1));

for t=1:T

\% Preparing average indicator of asks from price 1 to ask-band

weightA(t)=sum(A\_cum(1:ask(t)-band,t)./cumsum(ones(ask(t)-band,1))/MSize);

\% Preparing average indicator of bids from bid+band to MaxPrice

weightB(t)=sum(B\_cum(bid(t)+band:MaxPrice,t)./CumForB(bid(t)+band:MaxPrice,1))/MSize;

 

end

 

end

 

\% This script looks at the contribution of bands and weights for bands from 5

\% to maxBand+4

maxBand=20;\% Named maxBand so that the built-in function max is not shadowed

Summary\_R2=zeros(maxBand,2);

Summary\_pVal=zeros(maxBand,2);

for b=1:maxBand

    [wA,wB]=F\_weight\_A\_B(A\_B\_Ask\_Bid,A\_cum,B\_cum,b+4,MSize);

    [A\_band,B\_band]=F\_band\_A\_B(A\_B\_Ask\_Bid,A\_T,B\_T,b+4);

    mdl\_wA\_wB\_bA\_bB=fitglm(cat(2,wA,wB,A\_band,B\_band),move\_up,'Distribution','binomial');

    Summary\_R2(b,1)=mdl\_wA\_wB\_bA\_bB.Rsquared.Adjusted;

    mdl\_bA\_bB=fitglm(cat(2,A\_band,B\_band),move\_up,'Distribution','binomial');

    Summary\_R2(b,2)=mdl\_bA\_bB.Rsquared.Adjusted;

    \% p-values of the wA and wB coefficients (row 1 of Coefficients is the intercept)

    Summary\_pVal(b,1)=mdl\_wA\_wB\_bA\_bB.Coefficients.pValue(2);

    Summary\_pVal(b,2)=mdl\_wA\_wB\_bA\_bB.Coefficients.pValue(3);

 

end

\rmfamily

\end{document}

 


Comments

    • Nomad
      19.09.2022 07:10

      Hi, glad to hear this. The paper was published here: Kazakh Mathematical Journal, 20:2 (2020) 44-53
