Identification of stock market manipulation using a hybrid ensemble approach

Authors

  • Pearse Quinn, School of Computing, Engineering and Intelligent Systems, Ulster University, United Kingdom
  • Marinus Toman, School of Computing, Engineering and Intelligent Systems, Ulster University, United Kingdom
  • Kevin Curran, School of Computing, Engineering and Intelligent Systems, Ulster University, United Kingdom

DOI:

https://doi.org/10.23917/arstech.v4i2.2576

Keywords:

Anomaly Detection, Deep Learning, Exponential Smoothing, Long Short-Term Memory (LSTM), Market Manipulation

Abstract

Anomaly detection in time series data is a complex data mining problem with many useful real-world applications. Anomalies in a dataset represent deviations from the expected behaviour of a system and can indicate rare but significant events that require intervention. Market manipulation is a serious issue in financial jurisdictions worldwide, and financial regulators such as the SEC constantly try to prevent it and to prosecute those guilty of it. This paper uses state-of-the-art deep learning techniques alongside more classical statistical techniques to detect anomalies in five real-world datasets. Nine individual models, consisting of three models based on LSTM with Dynamic Thresholding, three ARIMA models and three Exponential Smoothing models, were used to generate predictions of anomalies based on daily trading volumes. The individual predictions of these models were then aggregated using two different ensemble methods, namely majority voting and ensemble averaging. The individual models and the ensemble models were evaluated, with the F1-score used to measure performance. While both ensemble methods performed well, majority voting was the superior method in this study, with an average F1-score of 0.494, compared to 0.414 for ensemble averaging.
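
The two aggregation methods and the F1-score named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact voting rule, the averaging threshold, or whether averaging operates on scores or on binary flags, so the strict-majority rule, the 0.5 threshold, the function names and the toy inputs below are all assumptions made for illustration.

```python
import numpy as np


def majority_vote(flags):
    """Combine binary anomaly flags of shape (n_models, n_days).

    Assumed rule: a day is anomalous when strictly more than half
    of the models flag it.
    """
    flags = np.asarray(flags)
    return (flags.sum(axis=0) * 2 > flags.shape[0]).astype(int)


def ensemble_average(scores, threshold=0.5):
    """Average per-model anomaly scores, then threshold the mean.

    The 0.5 threshold is a hypothetical choice, not taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    return (scores.mean(axis=0) >= threshold).astype(int)


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy inputs (made up): three models, six trading days.
flags = [
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],
]
scores = [
    [0.1, 0.9, 0.2, 0.9, 0.1, 0.3],
    [0.2, 0.8, 0.7, 0.4, 0.1, 0.2],
    [0.1, 0.7, 0.3, 0.5, 0.2, 0.9],
]
y_true = [0, 1, 0, 0, 0, 0]  # hypothetical ground-truth labels

voted = majority_vote(flags)
averaged = ensemble_average(scores)
print("majority vote:", voted.tolist(), "F1 =", f1_score(y_true, voted))
print("averaging:    ", averaged.tolist(), "F1 =", f1_score(y_true, averaged))
```

The toy numbers only show how the two rules can disagree on the same day: one very confident model can push the mean score over the threshold even when the other two models do not flag that day. They say nothing about the results reported in the paper.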

Published

2023-11-29

How to Cite

Quinn, P., Toman, M., & Curran, K. (2023). Identification of stock market manipulation using a hybrid ensemble approach. Applied Research and Smart Technology (ARSTech), 4(2), 53–63. https://doi.org/10.23917/arstech.v4i2.2576