Predicting Netflix User Age Groups Using the Random Forest Method Based on Analysis of Viewing Genres and User Behavior

Authors

  • Abelina Stevie Maria Trafin, Universitas Universal
  • Masparudin, Universitas Universal
  • Eka Lia Febrianti, Universitas Universal

DOI:

https://doi.org/10.63643/jodens.v5i2.326

Keywords:

Netflix, Random Forest, SMOTE, Age Classification, User Behavior

Abstract

The accuracy of user demographics on video streaming platforms, particularly age, is often compromised by the widespread practice of account sharing. This study addresses the problem by inferring user age groups (Youth, Young Adult, Adult, Middle-Aged, Senior) from behavioral data alone, including viewing genre frequency, sentiment analysis of reviews, and expenditure patterns. The core methodology is a Random Forest classifier combined with SMOTE (Synthetic Minority Over-sampling Technique) to mitigate the severe class imbalance in the dataset. The initial baseline model performed poorly, achieving only 40.13% accuracy and failing to identify minority classes. After SMOTE and hyperparameter tuning were applied, the final model reached an accuracy of 79.26%. The engineered feature Spend per Person was identified as the most dominant predictor, which supports the use of economic factors to distinguish genuine individual usage. The model also performed well on sensitive age segments, with F1-scores of 0.88 for Youth and 0.75 for Seniors. This research provides a data-driven approach to improving age-based content personalization and parental control features.
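The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation, which the abstract does not show. The function name `smote`, its parameters, and the two-feature rows below are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical feature rows for an under-represented age group (made-up values)
seniors = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_rows = smote(seniors, n_new=6)
balanced = seniors + new_rows  # original rows plus synthetic ones
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the evaluation data.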

Published

2025-12-30