
Value Prediction of FIFA Players with MLP and CNN

Introduction
This project develops an end‑to‑end machine‑learning pipeline to predict professional footballers’ market values using annual FIFA datasets from 2016 to 2022. Player attributes (physical, technical, age) and in‑game valuations were extracted via the community‑maintained Sofifa API and public Kaggle feeds. The aim was to reveal the key drivers of value and to assess the benefit of deep architectures over classical regression.

Methods
Raw CSV/JSON rosters were ingested into a PostgreSQL database, where fields were cast to appropriate types, invalid or missing entries were filtered or median‑imputed, continuous variables normalized, and categorical features one‑hot encoded. After an initial linear regression baseline, we trained two neural models: a two‑layer Multi‑Layer Perceptron (MLP) with dropout and a 1D‑Convolutional Neural Network (CNN) treating technical and physical attribute groups as separate channels. Both were optimized via 5‑fold cross‑validation, early stopping, and grid‑search on learning rate, batch size, and layer depth within PyTorch/TensorFlow.
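
The cleaning steps described above (median imputation, normalization, one‑hot encoding) can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the helper names, the toy `ages` and `positions` columns, and the choice of z‑score scaling are all assumptions, since the source does not say which normalization was used.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores).

    Assumption: the project may have used another scheme such as min-max.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Return the sorted categories and one 0/1 indicator vector per label."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors


# Hypothetical toy roster; column names and values are illustrative only.
ages = [21, None, 30, 27]
positions = ["ST", "GK", "ST", "CB"]

ages_filled = impute_median(ages)
ages_scaled = standardize(ages_filled)
categories, position_vectors = one_hot(positions)
```

In a real pipeline, the median, mean, and standard deviation would be computed on the training portion only and then reused on the test portion, so that no information leaks from the held‑out data.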

Results
On an 80/20 train/test split, the MLP yielded a MAPE of 1.44% and the CNN a MAPE of 1.16%. The CNN was more accurate and more robust on large, varied datasets. The MLP was faster to train but more vulnerable to untidy data.
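
For reference, MAPE (mean absolute percentage error) is the average of |actual − predicted| / |actual| over all test players, expressed as a percentage. The sketch below shows the standard definition. The function name and the toy market values are assumptions made for illustration and are not taken from the project's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Raises ValueError on mismatched lengths, empty input, or a zero
    actual value (the metric is undefined when the true value is zero).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical market values in euros; each prediction is off by 1%.
true_values = [1_000_000, 2_000_000]
model_values = [1_010_000, 1_980_000]
error_percent = mape(true_values, model_values)
```

Because each error is divided by the true value, a fixed absolute miss counts for more on a low‑valued player than on a high‑valued one, which is worth keeping in mind when reading the figures above.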

Discussion
Deep‑learning models outperformed linear regression by capturing non‑linear feature interactions, with the CNN’s localized convolutions offering a slight edge. While FIFA ratings provide a strong proxy for market value, the static game data omit real‑world factors (injuries, contracts, sentiment) and lack mid‑season updates. Future work will integrate live performance metrics and transfer‑fee records, and will explore ensemble or transformer‑based approaches, with deployment via a Flask REST API for real‑time valuation queries.
